[jira] Commented: (SOLR-1082) Refactor caching layer to be JCache compliant (jsr-107). In particular, consider using ehcache implementation

Shalin Shekhar Mangar (JIRA) Tue, 24 Mar 2009 04:12:22 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688642#action_12688642
 ]


Shalin Shekhar Mangar commented on SOLR-1082:
---------------------------------------------

Kaktu, also see SOLR-667 and SOLR-665 if you haven't already.

Some of us did look at ehcache's implementation when we were looking for a 
better cache for the faceting part. I checked again to see whether they have a 
better implementation now, but I don't think it has changed.

Specifically, look at 
http://fisheye3.atlassian.com/browse/ehcache/trunk/core/src/main/java/net/sf/ehcache/concurrent/ConcurrentLinkedHashMap.java?r=910

I haven't studied the code completely but from the javadocs:
{quote}
Least Recently Used: An eviction policy based on the observation that entries 
that have been used recently will likely be used again soon. This policy 
provides a good approximation of an optimal algorithm, but suffers by being 
expensive to maintain. The cost of reordering entries on the list during every 
access operation reduces the concurrency and performance characteristics of 
this policy.
{quote}
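
To make the cost described in that javadoc concrete, here is a minimal sketch of a classic LRU cache built on the JDK's access-ordered LinkedHashMap. The class name SimpleLruCache is hypothetical; this is neither ehcache's nor Solr's code, only an illustration of why reordering on every access forces reads to take a lock.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration class; not part of Solr or ehcache. */
final class SimpleLruCache<K, V> {
    private final Map<K, V> map;

    SimpleLruCache(int maxSize) {
        // accessOrder=true: every get() moves the entry to the tail of the
        // internal linked list, so the head is always the least recently used.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the cache is over capacity.
                return size() > maxSize;
            }
        };
    }

    // In access-order mode get() is a structural modification, so even
    // reads must be serialized. This is the concurrency cost quoted above.
    synchronized V get(K key) {
        return map.get(key);
    }

    synchronized void put(K key, V value) {
        map.put(key, value);
    }

    synchronized int size() {
        return map.size();
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because the read made "a" the most recently used entry. Every one of those calls, including the read, goes through the same lock.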

Compare that implementation with Solr's own ConcurrentLRUCache.
http://svn.apache.org/viewvc/lucene/solr/trunk/src/common/org/apache/solr/common/util/ConcurrentLRUCache.java?view=log

This was built from the ground up to be a fast LRU implementation suited for 
highly concurrent loads. If somebody can post some benchmarks showing if/how 
ehcache (or some other implementation) improves performance, we will 
definitely be interested.

bq. 3. As I see it, the current caching is pretty basic, and does not scale 
well to the kind of production-usage scenarios i have in mind.

Don't go on gut feel. I'd highly recommend benchmarking with real data and 
queries before you jump to any conclusions. Solr has a SolrCache interface. It 
shouldn't be very tough to write an implementation which uses ehcache for 
testing.
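
As a rough starting point for such a comparison, a small harness along the following lines could drive any cache from several threads and report elapsed time and hit counts. Everything here is an assumption for illustration: the CacheBench class, its method names, and the use of java.util.Map as the cache abstraction are hypothetical, and the keys are uniformly random rather than the real data and queries recommended above. It does not use SolrCache or ehcache directly; a candidate cache would need a thin Map adapter.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical benchmark harness; not part of Solr or ehcache. */
final class CacheBench {

    /** Outcome of one run: wall-clock time plus hit and miss counts. */
    static final class Result {
        final long nanos;
        final long hits;
        final long misses;

        Result(long nanos, long hits, long misses) {
            this.nanos = nanos;
            this.hits = hits;
            this.misses = misses;
        }
    }

    /** Runs a get-then-put-on-miss workload against the given cache from several threads. */
    static Result run(Map<Integer, String> cache, int threads, int opsPerThread, int keySpace)
            throws InterruptedException {
        AtomicLong hits = new AtomicLong();
        AtomicLong misses = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                long localHits = 0;
                long localMisses = 0;
                for (int i = 0; i < opsPerThread; i++) {
                    // Synthetic uniform keys; replace with recorded queries for a real test.
                    Integer key = rnd.nextInt(keySpace);
                    if (cache.get(key) != null) {
                        localHits++;
                    } else {
                        localMisses++;
                        cache.put(key, "value-" + key);
                    }
                }
                hits.addAndGet(localHits);
                misses.addAndGet(localMisses);
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
            throw new IllegalStateException("benchmark did not finish in time");
        }
        return new Result(System.nanoTime() - start, hits.get(), misses.get());
    }

    /** Baseline: a fully synchronized, access-ordered LinkedHashMap LRU. */
    static Map<Integer, String> synchronizedLru(int maxSize) {
        return Collections.synchronizedMap(
            new LinkedHashMap<Integer, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                    return size() > maxSize;
                }
            });
    }
}
```

Calling CacheBench.run(CacheBench.synchronizedLru(1000), 8, 200000, 5000) gives a baseline; passing a Map adapter around the cache under test gives the number to compare against. The sketch does no JIT warm-up and takes a single measurement, so treat its timings as indicative at best.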

bq. Also, some of the more serious issues (OOM's) I see with the current 
behavior when committing index updates while serving requests and having two 
searcher instances running concurrently, requiring up to twice the space needed 
in terms of cached objects, might very well be addressed with an implementation 
such as ehcache that supports Cache/Element-based expiry policies, disk 
flushing, and cache event listeners.

Those may have nothing to do with the cache implementation itself. Simply 
switching to some other cache implementation will not solve the problem. Some 
of these are need based. A lot of those will be easier when Lucene/Solr can 
cache per-segment. I'll leave the more intricate details to somebody who knows 
more than I do about these things. But I can tell you that a lot of work is 
going on in Lucene/Solr to overcome these difficulties.

> Refactor caching layer to be JCache compliant (jsr-107). In particular, 
> consider using ehcache implementation
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1082
>                 URL: https://issues.apache.org/jira/browse/SOLR-1082
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.5
>            Reporter: Kaktu Chakarabati
>
> overhaul the caching layer to be compliant
> with the upcoming Jcache api (jsr-107).
> In specific, I've been experimenting some with ehcache
> (http://ehcache.sourceforge.net/ , Apache OS license) and it seems to be a
> very comprehensive implementation, as well as fully compliant with the jcache 
> API.
> I think the benefits are numerous: in respect to ehcache itself, it seems to
> be a very mature implementation, supporting most classical cache schemes as
> well as some interesting distributed cache options (and of course,
> performance-wise its very lucrative in terms of reported multi-cpu scaling
> performance and  some of the benchmark figures they show).
> Further, abstracting away the caches to use the jcache api would probably
> make it easier in the future to make the whole caching layer more easily
> swappable with some other implementations that will probably crop up.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
