[
https://issues.apache.org/jira/browse/SOLR-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688642#action_12688642
]
Shalin Shekhar Mangar commented on SOLR-1082:
---------------------------------------------
Kaktu, also see SOLR-667 and SOLR-665 if you haven't already.
Some of us did look at ehcache's implementation when we were looking for a
better cache for the faceting part. I checked again to see if they have a
better implementation, but I don't think it has changed.
Specifically look at
http://fisheye3.atlassian.com/browse/ehcache/trunk/core/src/main/java/net/sf/ehcache/concurrent/ConcurrentLinkedHashMap.java?r=910
I haven't studied the code completely but from the javadocs:
{quote}
Least Recently Used: An eviction policy based on the observation that entries
that have been used recently will likely be used again soon. This policy
provides a good approximation of an optimal algorithm, but suffers by being
expensive to maintain. The cost of reordering entries on the list during every
access operation reduces the concurrency and performance characteristics of
this policy.
{quote}
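To make the cost described in that javadoc concrete, here is a minimal sketch of a textbook access-ordered LRU built on java.util.LinkedHashMap. It is neither ehcache's ConcurrentLinkedHashMap nor Solr's ConcurrentLRUCache; the class name SimpleLruCache is made up for illustration. The point is only that get() reorders the internal list, so even reads have to go through the same lock as writes.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not code from ehcache or Solr.
class SimpleLruCache<K, V> {
    private final Map<K, V> map;

    SimpleLruCache(final int maxEntries) {
        // accessOrder=true: every get() moves the entry to the tail of the
        // internal linked list, i.e. a read is also a structural update.
        LinkedHashMap<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the cap is exceeded.
                return size() > maxEntries;
            }
        };
        // Because get() mutates the list, reads and writes share one lock.
        // This is the serialization point the javadoc quote refers to.
        this.map = Collections.synchronizedMap(lru);
    }

    V get(K key) {
        return map.get(key);
    }

    void put(K key, V value) {
        map.put(key, value);
    }

    int size() {
        return map.size();
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because the read made "a" the most recently used entry.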
Compare that implementation with Solr's own ConcurrentLRUCache.
http://svn.apache.org/viewvc/lucene/solr/trunk/src/common/org/apache/solr/common/util/ConcurrentLRUCache.java?view=log
This was built from the ground up to be a fast LRU implementation suited for
highly concurrent loads. If somebody can post some benchmarks showing whether
and how ehcache (or some other implementation) improves performance, we will
definitely be interested.
bq. 3. As I see it, the current caching is pretty basic, and does not scale
well to the kind of production-usage scenarios i have in mind.
Don't go on gut feel. I'd highly recommend benchmarking with real data and
queries before you jump to any conclusions. Solr has a SolrCache interface. It
shouldn't be very tough to write an implementation which uses ehcache for
testing.
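As a starting point for such a test, here is a minimal sketch of a concurrent hit/miss harness. Everything in it is an assumption for illustration: TestCache is a made-up two-method stand-in and not Solr's SolrCache interface, so a real test would need an adapter for whichever cache is being compared. The keys are uniform random integers, which is not a substitute for the real data and queries recommended above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical harness; names and structure are illustrative assumptions.
class CacheBench {
    /** Made-up stand-in for a cache under test; NOT Solr's SolrCache. */
    interface TestCache {
        Object get(Object key);
        void put(Object key, Object value);
    }

    static final class Result {
        final long hits;
        final long misses;
        final long nanos;

        Result(long hits, long misses, long nanos) {
            this.hits = hits;
            this.misses = misses;
            this.nanos = nanos;
        }
    }

    /** Unbounded baseline backed by ConcurrentHashMap (no eviction at all). */
    static TestCache unbounded() {
        final Map<Object, Object> map = new ConcurrentHashMap<Object, Object>();
        return new TestCache() {
            public Object get(Object key) { return map.get(key); }
            public void put(Object key, Object value) { map.put(key, value); }
        };
    }

    /** Runs get-then-put-on-miss from several threads and counts outcomes. */
    static Result run(final TestCache cache, int threads,
                      final int opsPerThread, final int keySpace) {
        final AtomicLong hits = new AtomicLong();
        final AtomicLong misses = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                for (int i = 0; i < opsPerThread; i++) {
                    Integer key = rnd.nextInt(keySpace); // synthetic key
                    if (cache.get(key) != null) {
                        hits.incrementAndGet();
                    } else {
                        misses.incrementAndGet();
                        cache.put(key, key);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("benchmark timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
        return new Result(hits.get(), misses.get(), System.nanoTime() - start);
    }
}
```

Calling CacheBench.run(CacheBench.unbounded(), 4, 1000, 50) gives a baseline; running the same call against adapters for the caches being compared would show relative hit counts and elapsed time, though a serious comparison would also need warm-up runs and a realistic key distribution.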
bq. Also, some of the more serious issues (OOM's) I see with the current
behavior when committing index updates while serving requests and having two
searcher instances running concurrently, requiring up to twice the space needed
in terms of cached objects, might very well be addressed with an implementation
such as ehcache that supports Cache/Element-based expiry policies, disk
flushing, and cache event listeners.
Those problems may have nothing to do with the cache implementation itself.
Simply switching to some other cache implementation will not solve them. Some
of these are need-based. A lot of those will be easier when Lucene/Solr can
cache per-segment. I'll leave the more intricate details to somebody who knows
more than I do about these things. But I can tell you that a lot of work is
going on in Lucene/Solr to overcome these difficulties.
> Refactor caching layer to be JCache compliant (jsr-107). In particular,
> consider using ehcache implementation
> -------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-1082
> URL: https://issues.apache.org/jira/browse/SOLR-1082
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.5
> Reporter: Kaktu Chakarabati
>
> Overhaul the caching layer to be compliant
> with the upcoming JCache API (jsr-107).
> Specifically, I've been experimenting with ehcache
> (http://ehcache.sourceforge.net/ , Apache OS license) and it seems to be a
> very comprehensive implementation, as well as fully compliant with the JCache
> API.
> I think the benefits are numerous: ehcache itself seems to
> be a very mature implementation, supporting most classical cache schemes as
> well as some interesting distributed cache options (and of course,
> performance-wise it's very attractive in terms of reported multi-CPU scaling
> and some of the benchmark figures they show).
> Further, abstracting away the caches behind the JCache API would probably
> make the whole caching layer easier to swap for other implementations
> that will probably crop up in the future.