[ https://issues.apache.org/jira/browse/LUCENE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572913#action_12572913 ]
Yonik Seeley commented on LUCENE-1195:
--------------------------------------

{quote}When faceting (iterating all terms), will each term be looked up twice in the dictionary?{quote}

No, but consider the case of one thread iterating over all the terms in a field (due to faceting, a range query, a prefix query, etc.). That would tend to destroy the cache for other threads doing simple queries, unless a way could be found to bypass the cache during enumeration, or to make each cache thread-specific.

> Performance improvement for TermInfosReader
> -------------------------------------------
>
> Key: LUCENE-1195
> URL: https://issues.apache.org/jira/browse/LUCENE-1195
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.4
>
>
> Currently we have a bottleneck for multi-term queries: the dictionary lookup
> is done twice for each term. The first time is in Similarity.idf(), where
> searcher.docFreq() is called. The second time is when the posting list is
> opened (TermDocs or TermPositions).
> The dictionary lookup is not cheap, so a significant performance improvement
> is possible here if we avoid the second lookup. An easy way to do this is to
> add a small LRU cache to TermInfosReader.
> I ran some performance experiments with an LRU cache size of 20 and a
> mid-size index of 500,000 documents from Wikipedia. Here are some test results:
> 50,000 AND queries with 3 terms each:
> old: 152 secs
> new (with LRU cache): 112 secs (26% faster)
> 50,000 OR queries with 3 terms each:
> old: 175 secs
> new (with LRU cache): 133 secs (24% faster)
> For bigger indexes this patch will probably have less impact, for smaller
> ones more.
> I will attach a patch soon.
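
For illustration only, here is a rough sketch of the two ideas discussed in this message: a small LRU cache as proposed in the issue description, and a per-thread variant along the lines of the "thread-specific" suggestion in the comment. This is not the code from the patch. The class names SmallLruCache and PerThreadTermCache are hypothetical, the cached value is a plain Integer standing in for whatever TermInfosReader would actually store, and the sketch assumes Java 8 or later for ThreadLocal.withInitial.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical fixed-size LRU cache; not the class used in the actual patch. */
class SmallLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    SmallLruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict the least recently used entry when over capacity
        return size() > maxEntries;
    }
}

/** Hypothetical wrapper that gives each thread its own small cache. */
class PerThreadTermCache {
    // 20 is the cache size mentioned in the experiments in the issue description
    private static final int CACHE_SIZE = 20;

    private final ThreadLocal<SmallLruCache<String, Integer>> caches =
        ThreadLocal.withInitial(() -> new SmallLruCache<>(CACHE_SIZE));

    /** Returns the cached value for the term, or null if this thread has not cached it. */
    Integer get(String term) {
        return caches.get().get(term);
    }

    /** Stores a value (an assumed stand-in for the real dictionary entry) for this thread only. */
    void put(String term, int value) {
        caches.get().put(term, value);
    }
}
```

With one cache per thread, a thread that enumerates every term in a field only evicts entries from its own cache, so threads running simple queries keep their recently used terms. The trade-off is extra memory per thread and no sharing of hits between threads.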