[ 
https://issues.apache.org/jira/browse/LUCENENET-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743371#action_12743371
 ] 

Michael Garski commented on LUCENENET-190:
------------------------------------------

DIGY,

I ran your patch through the same test set I ran 2.4 through earlier, and I 
found that it performed slower than the original port of the SimpleLRUCache, 
but there was less churn in the Gen2 heap.

The indexes I am testing with are searched in parallel using out own custom 
extension of the ParallelMultiSearcher, and perform some fairly complex 
queries, and is our slowest search among all that are backed by Lucene.  There 
are approximately 12 million items in 4 sub-indexes, and the queries are 
randomly generated from a list of terms and phrases that are known to exist in 
the index.  The test server is a dual core server throttled so that only 2 
searches are executed at a time concurrently over a period of 18 hours.  
Shorter test runs did not surface the cache issue as the performance hit was 
only apparent after the cache was full and items were being removed.

The original Java Lucene item this is related to, LUCENE-1195 mentions that 
this patch will have less of an impact on larger indexes, and from what I have 
seen this is true.  I have not performed this test against smaller indexes.  If 
the cache is efficient enough there is no gain on performance for large 
indexes.  That does not mean that it has no effect on the system, as the 
utilization of the Gen2 heap increases due to the cached items.  In my testing 
this lead to large garbage collections about every 6 hours.  I've attached an 
image of the Gen2 utilization charted out.  

I would suggest that being able to disable the caching programmatically or via 
configuration would be ideal.  We're going to continue to disable it in our 
applications due to the effect on Gen2 and no real gain in performance.

I also ran a test after modifying your patch to remove the locking.  As the 
cache is persistent per thread, there is no need for the overhead of the 
locking.  I also modified Doug's original port a bit to use the base class 
hashtable (map field) to perform the Contains call - moving that from O(n) to 
O(1). 

Results:
2.3 : 12.02 searches/sec
2.4 native : 8.90 searches/sec
2.4 w/cache disabled : 12.07 searches/sec
2.4 w/DIGY patch : 5.23 searches/sec

Additional Tests
2.4 w/C5 collections : 11.87 searches/sec
2.4 w/Digy patch & no locking : 12.13 searches/sec
2.4 w/modified native patch : 11.59 searches/sec

Looking further into the cache performance, here is the big O on the insert at 
head/remove of the LRU:

2.4 port as committed (LinkedList)
Contains : O(n)
Insert : O(1)
Remove : O(n)

2.4 DIGY Patch (SortedList) 
Get (using hashtable) : O(1)
Insert : O(log n)
Remove : O(n) 

2.4 using C5 collections
Contains : O(1)
Insert : O(1)
Remove : O(1) 

2.4 DIGY Patch (SortedList w/ no locking) 
Get (using hashtable) : O(1)
Insert : O(log n)
Remove : O(n) 

2.4 modified commited code using base.Map.ContainsKey to avoid walking the 
entire list when determining an item is cached
Contains : O(1)
Insert : O(1)
Remove : O(n)

Michael


> 2.4.0 Performance in TermInfosReader term caching (New implementation of 
> SimpleLRUCache)
> ----------------------------------------------------------------------------------------
>
>                 Key: LUCENENET-190
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-190
>             Project: Lucene.Net
>          Issue Type: Improvement
>         Environment: v2.4.0
>            Reporter: Digy
>            Priority: Minor
>         Attachments: SimpleLRUCache.rar
>
>
> Below is the mail from Michael Garski about the Performance in 
> TermInfosReader term caching. It would be good to have a faster LRUCache 
> implementation in Lucene.Net
> DIGY
> {quote}
> Doug did an amazing job of porting 2.4.0, doing it mostly on his own!  
> Hooray Doug!
> We are using the committed version of 2.4.0 in production and I wanted to 
> share a performance issue we discovered and what we've done to work around 
> it.  From the Java Lucene change log:  "LUCENE-1195: Improve term lookup 
> performance by adding a LRU cache to the TermInfosReader. In performance 
> experiments the speedup was about 25% on average on mid-size indexes with 
> ~500,000 documents for queries with 3 terms and about 7% on larger indexes 
> with ~4.3M documents."
> The Java implementation uses a LinkedHashMap within the class 
> org.apache.lucene.util.cache.SimpleLRUCache, which is very efficient at 
> maintaining the cache.  As there is no equivalent collection in .Net The 
> current 2.4.0 port uses a combination of a LinkedList to maintain LRU state 
> and a HashTable to provide lookups.  While this implementation works, 
> maintaining the LRU state via the LinkedList creates a fair amount of 
> overhead and can result in a significant reduction of performance, most 
> likely attributed to the LinkedList.Remove method being O(n).  As each thread 
> maintains its own cache of 1024 terms, these overhead in performing the 
> removal is a drain on performance.
> At this time we have disabled the cache in the method 
> TermInfosReader.TermInfo Get(Term term, bool useCache) by always setting the 
> useCache parameter to false inside the body of the method.  After doing this 
> we saw performance return back to the 2.3.2 levels.  I have not yet had the 
> opportunity to experiment with other implementations within the 
> SimpleLRUCache to address the performance issue.  One approach that would 
> might solve the issue is to use the HashedLinkedList<T> class provided in the 
> C5 collection library [http://www.itu.dk/research/c5/].
> Michael
> Michael Garski
> Search Architect
> MySpace.com
> www.myspace.com/michaelgarski <http://%27www.myspace.com/mgarski>
> {quote}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to