[ 
https://issues.apache.org/jira/browse/LUCENENET-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743739#action_12743739
 ] 

Digy commented on LUCENENET-190:
--------------------------------

Hi Michael,

How do you think the API should be to disable the cache?
  - A method in IndexWriter passing the boolean value to SegmentReader which in 
turn passes it to TermInfosReader like
        public void TermInfosReaderCacheUsage(bool UseCache) //but it seems 
like diversion from Lucene.Java
        or
  - A static variable in SupportClass like "EnableTermInfosReaderCache" which 
is true by default.
  
DIGY
  

  
  

> 2.4.0 Performance in TermInfosReader term caching (New implementation of 
> SimpleLRUCache)
> ----------------------------------------------------------------------------------------
>
>                 Key: LUCENENET-190
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-190
>             Project: Lucene.Net
>          Issue Type: Improvement
>         Environment: v2.4.0
>            Reporter: Digy
>            Priority: Minor
>         Attachments: cache_Gen2.PNG, SimpleLRUCache.rar
>
>
> Below is the mail from Michael Garski about the Performance in 
> TermInfosReader term caching. It would be good to have a faster LRUCache 
> implementation in Lucene.Net
> DIGY
> {quote}
> Doug did an amazing job of porting 2.4.0, doing it mostly on his own!  
> Hooray Doug!
> We are using the committed version of 2.4.0 in production and I wanted to 
> share a performance issue we discovered and what we've done to work around 
> it.  From the Java Lucene change log:  "LUCENE-1195: Improve term lookup 
> performance by adding a LRU cache to the TermInfosReader. In performance 
> experiments the speedup was about 25% on average on mid-size indexes with 
> ~500,000 documents for queries with 3 terms and about 7% on larger indexes 
> with ~4.3M documents."
> The Java implementation uses a LinkedHashMap within the class 
> org.apache.lucene.util.cache.SimpleLRUCache, which is very efficient at 
> maintaining the cache.  As there is no equivalent collection in .Net The 
> current 2.4.0 port uses a combination of a LinkedList to maintain LRU state 
> and a HashTable to provide lookups.  While this implementation works, 
> maintaining the LRU state via the LinkedList creates a fair amount of 
> overhead and can result in a significant reduction of performance, most 
> likely attributed to the LinkedList.Remove method being O(n).  As each thread 
> maintains its own cache of 1024 terms, these overhead in performing the 
> removal is a drain on performance.
> At this time we have disabled the cache in the method 
> TermInfosReader.TermInfo Get(Term term, bool useCache) by always setting the 
> useCache parameter to false inside the body of the method.  After doing this 
> we saw performance return back to the 2.3.2 levels.  I have not yet had the 
> opportunity to experiment with other implementations within the 
> SimpleLRUCache to address the performance issue.  One approach that would 
> might solve the issue is to use the HashedLinkedList<T> class provided in the 
> C5 collection library [http://www.itu.dk/research/c5/].
> Michael
> Michael Garski
> Search Architect
> MySpace.com
> www.myspace.com/michaelgarski <http://%27www.myspace.com/mgarski>
> {quote}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to