Do you suggest that I open a JIRA ticket about it? I think it's a bug, considering common interface standards (rewrite should not be exposed to the end user), documentation, and runtime efficiency (as you said, rewrite is slow).
On Tue, Jan 14, 2014 at 4:38 AM, Peng Cheng <[email protected]> wrote:

> I see, perhaps the best solution is to put the un-rewritten
> blockJoinQueries into the joinQueryID? The result will be the same. Right
> now the code has very strange behavior if no rewrite is called beforehand:
> it gives empty groups or correct results at random.
>
> It's a great pleasure to read your reply; I never expected someone to
> respond that fast.
>
> Yours Peng
>
> On Tue, Jan 14, 2014 at 2:33 AM, Uwe Schindler <[email protected]> wrote:
>
>> Hi Peng,
>>
>> rewrite() returns a different query that will definitely not preserve the
>> hashCode() or be equals() to the original one or any other rewritten one.
>> The reason for this is: a rewritten query is a new query that contains
>> information about the index it will be executed on (e.g., it references
>> terms from that index), so it **cannot** be equal to the original one.
>> If it cannot be equal, the hashCode should also be different. If you
>> execute the query at a later stage, you have to rewrite the original query
>> again, because the index may have changed. And take care: this rewrite may
>> produce a completely different query (with a new hashCode again) if the
>> index changed in the meantime.
>>
>> There is a workaround (to me it looks like the code is missing
>> documentation): you can manually rewrite the query before invoking
>> getTopGroups() using Searcher#rewrite(query). Why is a hotfix needed?
>>
>> Also, rewriting the query on every call of getTopGroups is a major
>> overhead (most queries' rewrites are very expensive and take as long as the
>> execution of the query, e.g. MultiTermQueries), so it should only be done
>> once, not on every call. Maybe that's the reason why it was left out, but
>> it was not documented.
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: [email protected]
>>
>> *From:* Peng Cheng [mailto:[email protected]]
>> *Sent:* Tuesday, January 14, 2014 3:59 AM
>> *To:* [email protected]; [email protected]
>> *Subject:* (Lucene-core) Is Query's rewrite method mandated to preserve
>> original Query's hashcode?
>>
>> Hi developers,
>>
>> I've recently found a few bugs in advanced features of Lucene-core 4.6
>> (which is perfectly normal, as those features are less likely to be used
>> and tested). The most serious one has rendered my
>> ToParentBlockJoinCollector close to useless:
>>
>> In the scorer generation stage, the ToParentBlockJoinCollector
>> automatically rewrites all the associated ToParentBlockJoinQuery instances
>> (and their subqueries) and saves them into its in-memory lookup table,
>> namely joinQueryID (see the enroll() method for details). Unfortunately, in
>> the getTopGroups method, the new ToParentBlockJoinQuery parameter is not
>> rewritten (at least, users are not expected to do so). When the new query
>> is looked up in the old table (considering the impact of rewrite() on
>> hashCode()), the lookup (namely _slot) always fails, and the call ends up
>> with a topGroup collection consisting of only empty groups (their
>> hitCounts are guaranteed to be zero).
>>
>> I'm not positive about whether rewrite() should preserve the Query's
>> hashcode, as I've found many counterexamples already. If it need not, then
>> this problem can be solved by rewriting the original BlockJoinQuery before
>> invoking the getTopGroups method. Nevertheless, users are not expected to
>> do so, therefore I would suggest submitting a hotfix that adds the
>> described rewrite step.
>>
>> If rewrite() must preserve the hashcode, then this is a problem of the
>> various rewrite() implementations, and a fix should be much harder.
>>
>> This bug has caused widespread panic in my company and I would like to
>> see it fixed ASAP. Please give me some suggestions so I know which hotfix I
>> should be working on.
>>
>> All the best,
>>
>> Yours Peng
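
For readers following the thread, the lookup failure described in the original message can be reproduced in miniature without Lucene. The sketch below is only an illustration under stated assumptions: SketchQuery, SketchCollector and slotFor are hypothetical names, and although joinQueryID and enroll borrow their names from the thread, none of this is Lucene's actual implementation. It assumes, as Uwe describes, that a rewritten query is never equal to its original.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class RewriteLookupSketch {

    /** Hypothetical stand-in for a query; not a Lucene class. */
    static final class SketchQuery {
        final String text;
        final boolean rewritten;

        SketchQuery(String text, boolean rewritten) {
            this.text = text;
            this.rewritten = rewritten;
        }

        /** Assumption: rewriting yields a new object that is not equal to the original. */
        SketchQuery rewrite() {
            return new SketchQuery(text, true);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SketchQuery)) {
                return false;
            }
            SketchQuery other = (SketchQuery) o;
            return text.equals(other.text) && rewritten == other.rewritten;
        }

        @Override
        public int hashCode() {
            return Objects.hash(text, rewritten);
        }
    }

    /** Hypothetical collector that stores only the rewritten form as the key. */
    static final class SketchCollector {
        private final Map<SketchQuery, Integer> joinQueryID = new HashMap<>();

        /** Rewrites the query and assigns the next free slot to the rewritten form. */
        void enroll(SketchQuery original) {
            joinQueryID.putIfAbsent(original.rewrite(), joinQueryID.size());
        }

        /** Returns the slot for the given query, or null if it was never enrolled. */
        Integer slotFor(SketchQuery query) {
            return joinQueryID.get(query);
        }
    }

    public static void main(String[] args) {
        SketchQuery original = new SketchQuery("parent:true", false);
        SketchCollector collector = new SketchCollector();
        collector.enroll(original);

        // Looking up the un-rewritten query misses: the table holds the rewritten key.
        System.out.println(collector.slotFor(original));           // prints "null"

        // The workaround from the thread: rewrite first, then look up.
        System.out.println(collector.slotFor(original.rewrite())); // prints "0"
    }
}
```

The sketch always misses on the un-rewritten query. In the real collector, a query whose rewrite returns itself would still be found, which would be consistent with the "empty groups or correct results at random" behavior reported above, though that is an inference rather than something verified against the Lucene source.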
