[ https://issues.apache.org/jira/browse/LUCENE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
robert engels updated LUCENE-1195:
----------------------------------

    Attachment: SafeThreadLocal.java

A "safe" ThreadLocal that can be used for more deterministic memory usage.
Probably a bit slower than the JDK ThreadLocal, due to the synchronization.
Offers a purge() method to force the cleanup of stale entries.

Probably most useful in code like this:

    SomeLargeObject slo; // maybe a RAMDirectory?
    try {
        slo = new SomeLargeObject(); // or other creation mechanism
    } catch (OutOfMemoryError e) {
        // release stale thread-local entries, then try again
        SafeThreadLocal.purge();
        slo = new SomeLargeObject(); // or other creation mechanism
    }

> Performance improvement for TermInfosReader
> -------------------------------------------
>
>                 Key: LUCENE-1195
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1195
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: lucene-1195.patch, lucene-1195.patch, lucene-1195.patch,
>                      SafeThreadLocal.java
>
>
> Currently we have a bottleneck for multi-term queries: the dictionary lookup
> is done twice for each term. The first time is in Similarity.idf(), where
> searcher.docFreq() is called. The second time is when the posting list is
> opened (TermDocs or TermPositions).
> The dictionary lookup is not cheap, so a significant performance improvement
> is possible here if we avoid the second lookup. An easy way to do this is to
> add a small LRU cache to TermInfosReader.
> I ran some performance experiments with an LRU cache size of 20 and a
> mid-size index of 500,000 documents from Wikipedia. Here are some test
> results:
>
> 50,000 AND queries with 3 terms each:
>   old: 152 secs
>   new (with LRU cache): 112 secs (26% faster)
>
> 50,000 OR queries with 3 terms each:
>   old: 175 secs
>   new (with LRU cache): 133 secs (24% faster)
>
> For bigger indexes this patch will probably have less impact, for smaller
> ones more.
> I will attach a patch soon.
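
For illustration only: the small LRU cache proposed in the issue description can be sketched with java.util.LinkedHashMap in access order. This is not the code from lucene-1195.patch. The class name SimpleLruCache and its maxSize parameter are hypothetical, and a real TermInfosReader cache would presumably map terms to their dictionary entries rather than use arbitrary keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical minimal LRU cache, assuming a fixed maximum size.
 * Not thread-safe: in access order even get() modifies the map, so
 * callers need external synchronization or one instance per thread.
 */
class SimpleLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxSize;

    SimpleLruCache(int maxSize) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed entry, which is what LRU eviction needs.
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put() after insertion; returning true drops the
        // least recently used entry once the cache is over capacity.
        return size() > maxSize;
    }
}
```

With a cache like this, the lookup done for docFreq() would leave an entry behind, and the later lookup made when the posting list is opened could be served from the cache instead of the dictionary, as long as the entry has not been evicted in between.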