[jira] [Commented] (LUCENE-4873) corner case in MinShouldMatchSumScorer when there are many terms

Robert Muir (JIRA) Sun, 24 Mar 2013 09:39:16 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13612143#comment-13612143 ]


Robert Muir commented on LUCENE-4873:
-------------------------------------

Thanks Stefan:
{quote}
please remove these expensive operation calls if you think that assertions 
should be fast because many users run with assertions enabled and could be 
irritated.
{quote}

Good call: I disabled the expensive ones, at least for now. I'll commit this 
as is, but I think as a next step, if we can refactor, we can unit test the 
utility heap methods directly and feel a lot better.

{quote}
This just exemplifies that one shouldn't re-implement basic data structures in 
each and every class.
Would it make sense to add heap operations to e.g. ArrayUtil and refactor the 
codebase? Or is it known that this would mean prohibitive performance impact?
{quote}

Yes: I think we should do this. This was my original motivation for having a 
base class between the DisjunctionSum and DisjunctionMax scorers, but this 
sounds like it might be a better way to do it. We can just benchmark it to 
confirm that it doesn't have a performance impact.
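
To make the idea concrete, here is a rough sketch of what shared heap helpers 
could look like. The class name IntHeapSketch and all of its method names are 
hypothetical: nothing like this exists in ArrayUtil today, and the real 
scorers keep sub-scorers in their heaps rather than plain ints. The O(size) 
isMinHeap check is the kind of expensive validation that would move out of 
hot-path assertions and into a direct unit test.

```java
/** Hypothetical helper for illustration only; not an existing Lucene class. */
final class IntHeapSketch {
  private IntHeapSketch() {}

  /** Sifts the element at index i down until heap[0..size) is a min-heap below i. */
  static void downHeap(int[] heap, int size, int i) {
    int value = heap[i];
    while (true) {
      int child = 2 * i + 1;                 // left child
      if (child >= size) break;              // i is a leaf
      if (child + 1 < size && heap[child + 1] < heap[child]) {
        child++;                             // right child is smaller
      }
      if (heap[child] >= value) break;       // heap property holds
      heap[i] = heap[child];
      i = child;
    }
    heap[i] = value;
  }

  /** Sifts the element at index i up towards the root. */
  static void upHeap(int[] heap, int i) {
    int value = heap[i];
    while (i > 0) {
      int parent = (i - 1) >>> 1;
      if (heap[parent] <= value) break;
      heap[i] = heap[parent];
      i = parent;
    }
    heap[i] = value;
  }

  /** Builds a min-heap in place over heap[0..size). */
  static void heapify(int[] heap, int size) {
    for (int i = (size >>> 1) - 1; i >= 0; i--) {
      downHeap(heap, size, i);
    }
  }

  /** O(size) invariant check, intended for unit tests rather than hot-path assertions. */
  static boolean isMinHeap(int[] heap, int size) {
    for (int i = 1; i < size; i++) {
      if (heap[(i - 1) >>> 1] > heap[i]) return false;
    }
    return true;
  }
}
```

A unit test could then call heapify on random arrays of varying size 
(including more than 9 elements), check isMinHeap, and pop elements one by 
one to verify that they come out in non-decreasing order.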

                
> corner case in MinShouldMatchSumScorer when there are many terms
> ----------------------------------------------------------------
>
>                 Key: LUCENE-4873
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4873
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/query/scoring
>    Affects Versions: 4.3
>            Reporter: Robert Muir
>         Attachments: LUCENE-4873.patch
>
>
> I think this bug is some extreme corner case...
> This test currently only uses up to 9 terms. By increasing it to 26 and 
> blasting the test, I was able to uncover a bug.
> Here's the seed:
> ant test -Dtestcase=TestMinShouldMatch2 -Dtests.method=testNextAllTerms -Dtests.seed=E0334C37E6E190D8 -Dtests.slow=true -Dtests.locale=pl_PL -Dtests.timezone=Asia/Thimphu -Dtests.file.encoding=US-ASCII
> Here's the patch to make the test use 26 terms.
> {noformat}
> Index: lucene/core/src/test/org/apache/lucene/search/TestMinShouldMatch2.java
> ===================================================================
> --- lucene/core/src/test/org/apache/lucene/search/TestMinShouldMatch2.java    (revision 1459937)
> +++ lucene/core/src/test/org/apache/lucene/search/TestMinShouldMatch2.java    (working copy)
> @@ -56,7 +56,7 @@
>    static final String alwaysTerms[] = { "a" };
>    static final String commonTerms[] = { "b", "c", "d" };
>    static final String mediumTerms[] = { "e", "f", "g" };
> -  static final String rareTerms[]   = { "h", "i", "j" };
> +  static final String rareTerms[]   = { "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z" };
>    
>    @Override
>    public void setUp() throws Exception {
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
