[ https://issues.apache.org/jira/browse/LUCENE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584319#comment-13584319 ]
Yonik Seeley commented on LUCENE-4791:
--------------------------------------
I did some ad-hoc testing to verify the fix:
randomly distributed dense terms that almost never match: 3.7% perf increase
randomly distributed dense terms that almost always match: 0% perf increase
random(10) on one field matching random(10) on another: 4.1% perf increase
terms grouped in blocks of 10 (i.e. 10 sequential docs have the same value): 67% perf increase
As you can see, the fix matters most when terms are not randomly distributed.
The larger the blocks of terms, the larger the performance increase after
applying the patch. I was able to get it up to 10x, but in theory the gain
is unbounded.
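
To illustrate why block size matters, here is a minimal sketch of scanning
versus skipping during a two-term conjunction. It is not Lucene's code: the
class ConjunctionSketch, the Postings cursor, and the helper methods are all
hypothetical names, a binary search stands in for a real skip list, and each
advance() call is counted as a single step, which understates what a real
skip costs. The data setup (one term in blocks of sequential docs, the other
term sparse) is also an assumption chosen to mimic the blocked test case.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Illustrative sketch only: none of these names come from Lucene.
final class ConjunctionSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical postings cursor over sorted, unique doc IDs. */
    static final class Postings {
        final int[] docs;
        int pos = -1;   // index of the current doc; -1 before the first nextDoc()
        long steps = 0; // number of nextDoc()/advance() calls made

        Postings(int[] docs) { this.docs = docs; }

        int current() { return pos < docs.length ? docs[pos] : NO_MORE_DOCS; }

        int nextDoc() {
            steps++;
            if (pos < docs.length) pos++;
            return current();
        }

        /** Scanning: walk forward one posting at a time until doc >= target. */
        int scanTo(int target) {
            int doc = current();
            while (doc < target) doc = nextDoc();
            return doc;
        }

        /** Skipping: jump to the first doc >= target (binary search stands in for a skip list). */
        int advance(int target) {
            steps++; // counted as one step, which understates a real skip-list lookup
            int idx = Arrays.binarySearch(docs, pos, docs.length, target);
            pos = idx >= 0 ? idx : -idx - 1; // a negative result encodes the insertion point
            return current();
        }
    }

    /** Doc IDs in every even-numbered block of `block` consecutive docs. */
    static int[] blockedDocs(int numDocs, int block) {
        return IntStream.range(0, numDocs).filter(d -> (d / block) % 2 == 0).toArray();
    }

    /** One doc ID at the start of every fourth block (a sparse term). */
    static int[] sparseDocs(int numDocs, int block) {
        return IntStream.range(0, numDocs).filter(d -> d % (4 * block) == 0).toArray();
    }

    /** Leapfrog intersection; returns {matches, total steps across both cursors}. */
    static long[] intersect(int[] docsA, int[] docsB, boolean skip) {
        Postings a = new Postings(docsA);
        Postings b = new Postings(docsB);
        long matches = 0;
        int docA = a.nextDoc();
        int docB = b.nextDoc();
        while (docA != NO_MORE_DOCS && docB != NO_MORE_DOCS) {
            if (docA == docB) {
                matches++;
                docA = a.nextDoc();
                docB = b.nextDoc();
            } else if (docA < docB) {
                docA = skip ? a.advance(docB) : a.scanTo(docB);
            } else {
                docB = skip ? b.advance(docA) : b.scanTo(docA);
            }
        }
        return new long[] { matches, a.steps + b.steps };
    }

    public static void main(String[] args) {
        for (int block : new int[] { 10, 100, 1000 }) {
            int[] a = blockedDocs(100_000, block);
            int[] b = sparseDocs(100_000, block);
            long[] scan = intersect(a, b, false);
            long[] skip = intersect(a, b, true);
            System.out.printf("block=%d matches=%d scanSteps=%d skipSteps=%d%n",
                    block, scan[0], scan[1], skip[1]);
        }
    }
}
```

In this toy setup both variants find the same matches, but the ratio of scan
steps to skip steps grows roughly in proportion to the block size, which is
consistent with the gain having no fixed upper bound.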
> ConjunctionTermScorer scans instead of skips on first scorer
> ------------------------------------------------------------
>
> Key: LUCENE-4791
> URL: https://issues.apache.org/jira/browse/LUCENE-4791
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Yonik Seeley
> Attachments: LUCENE-4791.patch
>
>
> As discovered by John Wang, it looks like a bug has been present since
> ConjunctionTermScorer was first added in 7/2011: it causes scanning
> instead of skipping on the lowest-frequency term.
> http://markmail.org/message/wuukqzbhe7zgkfmf