[jira] [Commented] (LUCENE-7810) false positive equality: distinctly diff join queries return equals()==true

Adrien Grand (JIRA) Tue, 16 May 2017 08:33:21 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012590#comment-16012590
 ]


Adrien Grand commented on LUCENE-7810:
--------------------------------------

I actually meant the BytesRefHash in my previous comment, as comparing only the 
{{from}} query could still lead to false positives on different index readers 
that have segments in common. However, I think Martijn's idea to take the index 
reader context identifier into account in equals/hashCode is better, as it 
would make comparisons faster.
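
To make the reader-context idea concrete, here is a rough sketch in plain Java. 
Everything in it is made up for illustration: {{JoinQueryKey}} is not a class in 
the join module, a String stands in for the {{from}} query, and an opaque Object 
stands in for the index reader context identifier. The identifier is compared by 
identity, which is what keeps the comparison cheap.

```java
import java.util.Objects;

// Hypothetical equality key for a join query; not the real TermsQuery.
final class JoinQueryKey {
    private final String fromQuery;       // stands in for the "from" Query
    private final String joinField;       // field the join terms were collected from
    private final Object readerContextId; // opaque identity of the top-level reader context

    JoinQueryKey(String fromQuery, String joinField, Object readerContextId) {
        this.fromQuery = Objects.requireNonNull(fromQuery);
        this.joinField = Objects.requireNonNull(joinField);
        this.readerContextId = Objects.requireNonNull(readerContextId);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof JoinQueryKey)) {
            return false;
        }
        JoinQueryKey that = (JoinQueryKey) other;
        // The reader context id is compared by identity, not by content,
        // so no term-by-term comparison is needed.
        return fromQuery.equals(that.fromQuery)
            && joinField.equals(that.joinField)
            && readerContextId == that.readerContextId;
    }

    @Override
    public int hashCode() {
        // identityHashCode keeps hashCode consistent with the identity check above.
        return 31 * Objects.hash(fromQuery, joinField)
            + System.identityHashCode(readerContextId);
    }
}
```

With this shape, two keys built from the same {{from}} query and field are equal 
only when they were created against the same reader context.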

These queries should also take the score mode into account for equals/hashCode, 
I think? I read your comment that it is not needed for query caching, but I 
think two queries should only be equal if they match the same docs and give 
them the same scores. If we want to be able to reuse cache entries across 
different score modes, we could rewrite to a TermsQuery in createWeight, 
similarly to how BooleanQuery rewrites all MUST clauses into FILTER clauses 
when needsScores is false?
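
A rough sketch of how these two points could fit together, again in plain Java 
with made-up names ({{Mode}} stands in for the join module's ScoreMode, and 
{{ScoredJoinKey}} and {{withoutScoresIfUnused}} are hypothetical). The score 
mode takes part in equals/hashCode, and a separate step collapses it when scores 
are not needed, so non-scoring uses can still compare equal.

```java
import java.util.Objects;

// Hypothetical stand-in for the join module's ScoreMode enum.
enum Mode { NONE, AVG, MAX, TOTAL, MIN }

// Hypothetical equality key; a String stands in for the "from" Query.
final class ScoredJoinKey {
    private final String fromQuery;
    private final Mode mode;

    ScoredJoinKey(String fromQuery, Mode mode) {
        this.fromQuery = Objects.requireNonNull(fromQuery);
        this.mode = Objects.requireNonNull(mode);
    }

    // Mirrors the "rewrite when needsScores is false" idea: collapse every
    // score mode to NONE so that non-scoring uses compare equal.
    ScoredJoinKey withoutScoresIfUnused(boolean needsScores) {
        if (needsScores || mode == Mode.NONE) {
            return this;
        }
        return new ScoredJoinKey(fromQuery, Mode.NONE);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ScoredJoinKey)) {
            return false;
        }
        ScoredJoinKey that = (ScoredJoinKey) other;
        // Same docs AND same scores: the score mode is part of equality.
        return fromQuery.equals(that.fromQuery) && mode == that.mode;
    }

    @Override
    public int hashCode() {
        return Objects.hash(fromQuery, mode);
    }
}
```

Here an AVG key and a MAX key over the same {{from}} query are not equal, but 
their non-scoring forms are, which is the property a cache would rely on.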

{code}
ScoreMode scoreMode1 = scoreModes.toArray(new ScoreMode[0])[random().nextInt(scoreModes.size())];
{code}

I think you could do {{RandomPicks.randomFrom(random(), scoreModes)}}.
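
For reference, a plain-Java sketch of what such a helper amounts to. The 
{{Picks}} class below is a hypothetical stand-in written for this comment, not 
the test framework's implementation; it only shows that picking by a random 
index does not need the intermediate array.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.Random;

// Hypothetical stand-in for a "pick a random element" test helper.
final class Picks {
    private Picks() {}

    static <T> T randomFrom(Random random, Collection<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("Cannot pick from an empty collection");
        }
        // Draw one index, then walk the iterator to it; no array copy needed.
        int target = random.nextInt(items.size());
        Iterator<T> it = items.iterator();
        for (int i = 0; i < target; i++) {
            it.next();
        }
        return it.next();
    }
}
```

Given the same random seed and the same iteration order, this picks the same 
element as the toArray-based expression.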

Otherwise +1

> false positive equality: distinctly diff join queries return equals()==true
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-7810
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7810
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Hoss Man
>         Attachments: LUCENE_7810.patch, LUCENE-7810.patch
>
>
> While working on SOLR-10583 I was getting some odd test failures that seemed 
> to suggest we were getting false cache hits for Join queries that should have 
> been unique.
> Tracing through the code, the problem seems to be the way {{TermsQuery}} 
> implements {{equals(Object)}}.  This class takes in the {{fromQuery}} (used 
> to identify the set of documents we "join from") and uses it in the equals 
> calculation -- but the information about the join _field_ is never passed 
> directly to {{TermsQuery}} and the BytesRefs that are passed in can't be 
> compared efficiently (AFAICT), so 2 completely different calls to 
> {{JoinUtil.createJoinQuery(...)}} can result in Query objects that think 
> they are {{equals()}} even when they most certainly are not.
> At a brief glance, it appears that similar bugs exist in 
> {{TermsIncludingScoreQuery}} (and possibly {{GlobalOrdinalsWithScoreQuery}}, 
> but I didn't look into that class at all)



