Christoph Kaser created LUCENE-4076:
---------------------------------------
Summary: When doing nested (index-time) joins,
ToParentBlockJoinCollector delivers incomplete information on the grand-children
Key: LUCENE-4076
URL: https://issues.apache.org/jira/browse/LUCENE-4076
Project: Lucene - Java
Issue Type: Bug
Components: modules/join
Affects Versions: 3.6, 3.5, 3.4
Reporter: Christoph Kaser
ToParentBlockJoinCollector.getTopGroups does not return the correct result
when the query contains nested ToParentBlockJoinQuery instances.
Given the following example query:
{code}
// Innermost level: match grandchildren and join them up to their child documents.
Query grandChildQuery = new TermQuery(new Term("color", "red"));
Filter childFilter = new CachingWrapperFilter(
    new RawTermFilter(new Term("type", "child")), DeletesMode.IGNORE);
ToParentBlockJoinQuery grandchildJoinQuery =
    new ToParentBlockJoinQuery(grandChildQuery, childFilter, ScoreMode.Max);

// Middle level: restrict the children, then join them up to their parent documents.
BooleanQuery childQuery = new BooleanQuery();
childQuery.add(grandchildJoinQuery, Occur.MUST);
childQuery.add(new TermQuery(new Term("shape", "round")), Occur.MUST);
Filter parentFilter = new CachingWrapperFilter(
    new RawTermFilter(new Term("type", "parent")), DeletesMode.IGNORE);
ToParentBlockJoinQuery childJoinQuery =
    new ToParentBlockJoinQuery(childQuery, parentFilter, ScoreMode.Max);

// Outermost level: restrict the parents and collect the top 30 of them.
BooleanQuery parentQuery = new BooleanQuery();
parentQuery.add(childJoinQuery, Occur.MUST);
parentQuery.add(new TermQuery(new Term("name", "test")), Occur.MUST);
ToParentBlockJoinCollector parentCollector =
    new ToParentBlockJoinCollector(Sort.RELEVANCE, 30, true, true);
searcher.search(parentQuery, null, parentCollector);
{code}
This produces the correct results:
{code}
TopGroups<Integer> childGroups =
    parentCollector.getTopGroups(childJoinQuery, null, 0, 20, 0, false);
{code}
However, this does not:
{code}
TopGroups<Integer> grandChildGroups =
    parentCollector.getTopGroups(grandchildJoinQuery, null, 0, 20, 0, false);
{code}
The content of grandChildGroups is broken in the following ways:
* The groupValue is not the document id of the child document (which is the
parent of a grandchild document), but the document id of the _previous_
matching parent document
* There are only as many GroupDocs as there are parent documents (not child
documents), and they only contain the children of the last child document (but,
as mentioned before, with the wrong groupValue).
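
To make the expected result concrete, here is a minimal plain-Java sketch (no Lucene classes involved) of the grouping I would expect getTopGroups(grandchildJoinQuery, ...) to return. The class ExpectedGrandChildGroups, the Type enum and the expectedGroups method are hypothetical names made up for this illustration. It assumes the usual block layout, in which grandchildren are indexed directly before their child and children directly before their parent, and it only models the "color:red" clause, ignoring the "shape" and "name" restrictions of the full query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one index block; not part of Lucene.
class ExpectedGrandChildGroups {

    /** Document kinds, mirroring the "type" field used in the report. */
    enum Type { GRANDCHILD, CHILD, PARENT }

    /**
     * Groups matching grandchildren by the doc id of the child that follows them.
     * types[i] is the kind of doc i; matches[i] tells whether doc i matched the
     * grandchild query. Children without matching grandchildren yield no group.
     */
    static Map<Integer, List<Integer>> expectedGroups(Type[] types, boolean[] matches) {
        Map<Integer, List<Integer>> groups = new LinkedHashMap<Integer, List<Integer>>();
        List<Integer> pending = new ArrayList<Integer>();
        for (int doc = 0; doc < types.length; doc++) {
            switch (types[doc]) {
                case GRANDCHILD:
                    if (matches[doc]) {
                        pending.add(doc);
                    }
                    break;
                case CHILD:
                    // The child closes its sub-block: its doc id is the group value.
                    if (!pending.isEmpty()) {
                        groups.put(doc, new ArrayList<Integer>(pending));
                    }
                    pending.clear();
                    break;
                case PARENT:
                    // The parent closes the whole block; nothing should be pending here.
                    pending.clear();
                    break;
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        Type g = Type.GRANDCHILD, c = Type.CHILD, p = Type.PARENT;
        // Docs 0-1 belong to child 2, docs 3-4 to child 5, doc 6 is the parent.
        Type[] types = { g, g, c, g, g, c, p };
        boolean[] red = { true, true, false, false, true, false, false };
        System.out.println(expectedGroups(types, red)); // prints {2=[0, 1], 5=[4]}
    }
}
```

In this model there is one group per matching child (doc ids 2 and 5), each keyed by the child's own doc id. What the collector currently returns instead is one group per parent, keyed by the doc id of the previous matching parent.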