[
https://issues.apache.org/jira/browse/LUCENE-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877816#comment-16877816
]
Andrei commented on LUCENE-8902:
--------------------------------
My apologies. Thank you, and I will post something to the mailing list.
> Index-time join ToParentBlockJoinQuery query produces incorrect result with
> child wildcards
> -------------------------------------------------------------------------------------------
>
> Key: LUCENE-8902
> URL: https://issues.apache.org/jira/browse/LUCENE-8902
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/join
> Affects Versions: 8.1.1
> Reporter: Andrei
> Priority: Major
>
> When I do an index-time join query on certain parent docs with a wildcard
> query for child docs, sometimes I get the wrong answer. Example:
>
> ||Parent Doc||Children||
> |id=id00000| none|
> |id=id00001| # program=P1|
> |id=id00002| # program=P1
> # program=P2|
> |id=id00003| none|
> |id=id00004| # program=P1|
> |id=id00005| # program=P1
> # program=P2|
> So essentially I have 6 parent docs: doc 0 has no children, doc 1 has 1
> child, doc 2 has 2 children, and docs 3 to 5 repeat the same pattern.
> 1. The following query gives the correct results:
> BitSetProducer parentSet = new QueryBitSetProducer(new
> TermInSetQuery("id", toSet("id00000", "id00001", "id00002", "id00003",
> "id00004", "id00005")));
> Query q = new ToParentBlockJoinQuery(new TermInSetQuery("program",
> toSet("P1", "P2")), parentSet, ScoreMode.None);
> Returns the correct result (4 docs: ["id00001", "id00002", "id00004",
> "id00005"]).
>
> 2. This also gives correct result (same as above):
> BitSetProducer parentSet = new QueryBitSetProducer(new
> TermInSetQuery("id", toSet("id00000", "id00001", "id00002", "id00003",
> "id00004", "id00005")));
> Query q = new ToParentBlockJoinQuery(new WildcardQuery(new
> Term("program", "*")), parentSet, ScoreMode.None);
>
> 3. Also correct (same as above)
> BitSetProducer parentSet = new QueryBitSetProducer(new
> WildcardQuery(new Term("id", "*")));
> Query q = new ToParentBlockJoinQuery(new WildcardQuery(new
> Term("program", "*")), parentSet, ScoreMode.None);
> So far so good.
>
> 4. This one gives incorrect result:
> BitSetProducer parentSet = new QueryBitSetProducer(new
> TermInSetQuery("id", toSet("id00000", "id00001", "id00003")));
> Query q = new ToParentBlockJoinQuery(new WildcardQuery(new
> Term("program", "*")), parentSet,
> org.apache.lucene.search.join.ScoreMode.None);
> Returns 2 docs ["id00001", "id00003"]. It should only return "id00001" and
> not "id00003" here. Very strange behavior.
>
> 5. Just asking for "id00003" also incorrectly returns it:
> BitSetProducer parentSet = new QueryBitSetProducer(new TermQuery(new
> Term("id", "id00003")));
> Query q = new ToParentBlockJoinQuery(new WildcardQuery(new
> Term("program", "*")), parentSet,
> org.apache.lucene.search.join.ScoreMode.None);
>
> 6. But as soon as I add "id00002" to the parent query, it works again:
> BitSetProducer parentSet = new QueryBitSetProducer(new
> TermInSetQuery("id", toSet( "id00003", "id00002")));
> Query q = new ToParentBlockJoinQuery(new WildcardQuery(new
> Term("program", "*")), parentSet,
> org.apache.lucene.search.join.ScoreMode.None);
> Gives the correct result ["id00002"]
> ----
> I am attaching the unit test that demonstrates this:
> [https://pastebin.com/aJ1LDLCS]
> I don't know if I am doing something wrong, or if there is an issue.
> Thank you for looking into it.
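
One possible reading of results 4 to 6, offered as a sketch and not as a confirmed diagnosis: as far as I understand the block-join API, the parent filter is expected to mark every parent document in the index, and each matching child is attributed to the next marked document after it. The plain-Java model below has no Lucene dependency and only imitates that lookup with java.util.BitSet. It assumes a single segment in which each parent's children are indexed immediately before it, and that only child docs carry a "program" field. The names BlockJoinSketch, PARENT_ID_AT, parentBits and join are illustrative and are not Lucene API.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class BlockJoinSketch {
    // Assumed doc order (hypothetical): children first, then their parent.
    // null marks a child doc; a string is the id of a parent doc.
    static final String[] PARENT_ID_AT = {
        "id00000",                 // no children
        null, "id00001",           // 1 child
        null, null, "id00002",     // 2 children
        "id00003",                 // no children
        null, "id00004",           // 1 child
        null, null, "id00005"      // 2 children
    };

    // Builds the parent bitset for the given parent ids,
    // standing in for what a parent filter would produce.
    static BitSet parentBits(String... ids) {
        BitSet bits = new BitSet(PARENT_ID_AT.length);
        for (int doc = 0; doc < PARENT_ID_AT.length; doc++) {
            for (String id : ids) {
                if (id.equals(PARENT_ID_AT[doc])) {
                    bits.set(doc);
                }
            }
        }
        return bits;
    }

    // Simplified join: every child doc matches (like program:*), and each
    // child is attributed to the next set bit in the parent bitset.
    static List<String> join(BitSet parents) {
        List<String> hits = new ArrayList<>();
        for (int doc = 0; doc < PARENT_ID_AT.length; doc++) {
            if (PARENT_ID_AT[doc] != null) {
                continue; // parent docs are not matched by the child query
            }
            int parentDoc = parents.nextSetBit(doc);
            if (parentDoc < 0) {
                continue; // no marked parent after this child
            }
            String id = PARENT_ID_AT[parentDoc];
            if (!hits.contains(id)) {
                hits.add(id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(join(parentBits("id00000", "id00001", "id00003"))); // case 4
        System.out.println(join(parentBits("id00003")));                       // case 5
        System.out.println(join(parentBits("id00003", "id00002")));            // case 6
    }
}
```

In this model the children of id00002 fall through to id00003 whenever id00002 is missing from the parent bitset, which reproduces the outputs reported in cases 4, 5 and 6. With all six parents marked, the model returns the four parents from cases 1 to 3.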
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)