[ https://issues.apache.org/jira/browse/LUCENE-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877816#comment-16877816 ]
Andrei commented on LUCENE-8902:
--------------------------------

My apologies. Thank you, and I will post something to the mailing list.

> Index-time join ToParentBlockJoinQuery query produces incorrect result with child wildcards
> -------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8902
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8902
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/join
>    Affects Versions: 8.1.1
>            Reporter: Andrei
>            Priority: Major
>
> When I do an index-time join query on certain parent docs with a wildcard
> query for child docs, sometimes I get the wrong answer. Example:
>
> ||Parent Doc||Children||
> |id=id00000| none|
> |id=id00001| # program=P1|
> |id=id00002| # program=P1
> # program=P2|
> |id=id00003| none|
> |id=id00004| # program=P1|
> |id=id00005| # program=P1
> # program=P2|
>
> So essentially I have 6 parent docs: doc 0 has no children, doc 1 has 1
> child, doc 2 has 2 children, etc.
>
> 1. The following query gives the correct result:
>     BitSetProducer parentSet = new QueryBitSetProducer(
>         new TermInSetQuery("id", toSet("id00000", "id00001", "id00002",
>             "id00003", "id00004", "id00005")));
>     Query q = new ToParentBlockJoinQuery(
>         new TermInSetQuery("program", toSet("P1", "P2")),
>         parentSet, ScoreMode.None);
> Returns the correct result (4 docs: ["id00001", "id00002", "id00004",
> "id00005"]).
>
> 2. This also gives the correct result (same as above):
>     BitSetProducer parentSet = new QueryBitSetProducer(
>         new TermInSetQuery("id", toSet("id00000", "id00001", "id00002",
>             "id00003", "id00004", "id00005")));
>     Query q = new ToParentBlockJoinQuery(
>         new WildcardQuery(new Term("program", "*")),
>         parentSet, ScoreMode.None);
>
> 3. Also correct (same as above):
>     BitSetProducer parentSet = new QueryBitSetProducer(
>         new WildcardQuery(new Term("id", "*")));
>     Query q = new ToParentBlockJoinQuery(
>         new WildcardQuery(new Term("program", "*")),
>         parentSet, ScoreMode.None);
> So far so good.
>
> 4. This one gives an incorrect result:
>     BitSetProducer parentSet = new QueryBitSetProducer(
>         new TermInSetQuery("id", toSet("id00000", "id00001", "id00003")));
>     Query q = new ToParentBlockJoinQuery(
>         new WildcardQuery(new Term("program", "*")),
>         parentSet, org.apache.lucene.search.join.ScoreMode.None);
> Returns 2 docs ["id00001", "id00003"]. It should return only "id00001",
> not "id00003", because "id00003" has no children. Very strange behavior.
>
> 5. Just asking for "id00003" also incorrectly returns it:
>     BitSetProducer parentSet = new QueryBitSetProducer(
>         new TermQuery(new Term("id", "id00003")));
>     Query q = new ToParentBlockJoinQuery(
>         new WildcardQuery(new Term("program", "*")),
>         parentSet, org.apache.lucene.search.join.ScoreMode.None);
>
> 6. But as soon as I add "id00002" to the parent query, it works again:
>     BitSetProducer parentSet = new QueryBitSetProducer(
>         new TermInSetQuery("id", toSet("id00003", "id00002")));
>     Query q = new ToParentBlockJoinQuery(
>         new WildcardQuery(new Term("program", "*")),
>         parentSet, org.apache.lucene.search.join.ScoreMode.None);
> Gives the correct result ["id00002"].
> ----
> I am attaching the unit test that demonstrates this:
> [https://pastebin.com/aJ1LDLCS]
> I don't know if I am doing something wrong, or if there is an issue.
> Thank you for looking into it.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
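
One possible reading of cases 4 to 6 in the report above can be sketched in plain Java, without any Lucene dependency. The sketch rests on an assumption that the issue text does not confirm: that the block join attributes every matching child to the next document marked in the parent bit set, and therefore expects that bit set to mark every parent in the index, not just the parents of interest. The class name BlockJoinSketch, the flattened DOCS layout (children stored immediately before their parent) and the helper joinToParents are hypothetical and exist only for illustration; this is not Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Set;

public class BlockJoinSketch {
    // Hypothetical flattened index: within each block the children come
    // first and the parent comes last. null marks a child document
    // (one that has a "program" field); a string is a parent's id.
    static final String[] DOCS = {
        "id00000",
        null, "id00001",
        null, null, "id00002",
        "id00003",
        null, "id00004",
        null, null, "id00005"
    };

    // Simulates a join of the child query program:* against a parent
    // bit set that marks only the parents listed in parentIds.
    static List<String> joinToParents(Set<String> parentIds) {
        BitSet parentBits = new BitSet(DOCS.length);
        for (int i = 0; i < DOCS.length; i++) {
            if (DOCS[i] != null && parentIds.contains(DOCS[i])) {
                parentBits.set(i);
            }
        }
        List<String> hits = new ArrayList<>();
        for (int doc = 0; doc < DOCS.length; doc++) {
            if (DOCS[doc] != null) {
                continue; // parents have no "program" field, so program:* skips them
            }
            // Assumption: a child belongs to the next marked parent, even if
            // unmarked parents lie in between.
            int parent = parentBits.nextSetBit(doc);
            if (parent >= 0 && !hits.contains(DOCS[parent])) {
                hits.add(DOCS[parent]);
            }
        }
        return hits;
    }
}
```

Under that assumption, joinToParents(Set.of("id00000", "id00001", "id00003")) attributes the two children of id00002 to id00003, which reproduces the result reported in case 4, and the parent sets from cases 5 and 6 reproduce the reported results as well. If the assumption holds for the real query, a common way to avoid the effect would be to let the parent bit set mark all parent documents and to restrict the result to specific parents with a separate clause on the parent id.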