Hi developers,

I've recently found a few bugs in the advanced features of Lucene-core 4.6 (which is perfectly normal, as those features are less likely to be used and tested). The most serious one has rendered my ToParentBlockJoinCollector close to useless.

In the scorer generation stage, the ToParentBlockJoinCollector automatically rewrites all the associated ToParentBlockJoinQuery instances (and their subqueries) and saves them in its in-memory lookup table, joinQueryID (see the enroll() method for details).

Unfortunately, the ToParentBlockJoinQuery passed as a parameter to getTopGroups() is not rewritten (at least, users are not expected to rewrite it themselves). Because rewrite() can change hashCode(), looking up this unrewritten query in a table keyed by the rewritten queries always fails (the lookup result is stored in _slot), and getTopGroups() eventually ends up with a topGroups collection consisting only of empty groups (their hitCounts are guaranteed to be zero).

I'm not sure whether rewrite() is supposed to preserve a Query's hash code; I've already found many counterexamples. If it is not required to, then this problem can be solved by rewriting the original BlockJoinQuery before invoking getTopGroups(). Since users are not expected to do that themselves, I would suggest submitting a hotfix that adds the described rewrite step. If rewrite() must preserve the hash code, then the problem lies in the various rewrite() implementations, and the fix would be much harder.

This bug has caused widespread panic in my company and I would like to see it fixed ASAP. Please give me some suggestions so I know which hotfix I should be working on.

All the best,
Yours, Peng
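
P.S. To make the failure mode concrete, here is a minimal, self-contained sketch of the lookup problem described above. It is a toy model only, not Lucene code: ToyQuery and ToyCollector are hypothetical stand-ins I made up for illustration, and the only names borrowed from Lucene are enroll and joinQueryID. It assumes that rewrite() returns an object that is not equal to the original, which matches the counterexamples I mentioned.

```java
import java.util.HashMap;
import java.util.Map;

public class RewriteLookupDemo {

    /** Hypothetical stand-in for a query; NOT a Lucene class. */
    static final class ToyQuery {
        final String text;
        final boolean rewritten;

        ToyQuery(String text, boolean rewritten) {
            this.text = text;
            this.rewritten = rewritten;
        }

        /** Returns a rewritten copy that is deliberately not equal to the original. */
        ToyQuery rewrite() {
            return rewritten ? this : new ToyQuery(text, true);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ToyQuery)) {
                return false;
            }
            ToyQuery other = (ToyQuery) o;
            return text.equals(other.text) && rewritten == other.rewritten;
        }

        @Override
        public int hashCode() {
            return 31 * text.hashCode() + (rewritten ? 1 : 0);
        }
    }

    /** Hypothetical stand-in for the collector and its joinQueryID table. */
    static final class ToyCollector {
        private final Map<ToyQuery, Integer> joinQueryID = new HashMap<ToyQuery, Integer>();

        /** Registers the rewritten form of the query, mimicking scorer setup. */
        void enroll(ToyQuery original) {
            ToyQuery key = original.rewrite();
            if (!joinQueryID.containsKey(key)) {
                joinQueryID.put(key, joinQueryID.size());
            }
        }

        /** Returns the slot for the query, or null if no equal key was enrolled. */
        Integer slotOf(ToyQuery query) {
            return joinQueryID.get(query);
        }
    }

    public static void main(String[] args) {
        ToyCollector collector = new ToyCollector();
        ToyQuery original = new ToyQuery("child:foo", false);
        collector.enroll(original);

        // Looking up the original, unrewritten query misses the table.
        System.out.println("unrewritten lookup: " + collector.slotOf(original));
        // Rewriting first produces a key equal to the enrolled one.
        System.out.println("rewritten lookup:   " + collector.slotOf(original.rewrite()));
    }
}
```

In real Lucene the analogous workaround would presumably be to rewrite the query yourself (for example through the searcher) and pass the rewritten ToParentBlockJoinQuery to getTopGroups(), but I have not verified that this is the intended usage.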
