Would you suggest that I open a JIRA ticket about it? I think it's a bug,
considering common interface conventions (rewrite should not be exposed to
the end user), the documentation, and runtime efficiency (as you said,
rewrite is slow).
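
For anyone following the thread, here is a minimal sketch of the failure
mode as I understand it. It does not use Lucene at all: ToyQuery and its
indexVersion field are hypothetical stand-ins invented for illustration, and
only the name joinQueryID is borrowed from the collector. The assumption is
simply that a rewritten query is not equals() to the original, so a map keyed
by the rewritten query cannot be hit with the un-rewritten one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class RewriteLookupSketch {

    /** Toy stand-in for a query; this is NOT a Lucene class. */
    static final class ToyQuery {
        final String text;
        final long indexVersion; // -1 means "not rewritten yet" (assumption of this sketch)

        ToyQuery(String text, long indexVersion) {
            this.text = text;
            this.indexVersion = indexVersion;
        }

        /** Returns a new query bound to the given (hypothetical) index version. */
        ToyQuery rewrite(long currentIndexVersion) {
            return new ToyQuery(text, currentIndexVersion);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ToyQuery)) {
                return false;
            }
            ToyQuery other = (ToyQuery) o;
            return text.equals(other.text) && indexVersion == other.indexVersion;
        }

        @Override
        public int hashCode() {
            return Objects.hash(text, indexVersion);
        }
    }

    public static void main(String[] args) {
        ToyQuery original = new ToyQuery("child:foo", -1L);

        // The collector side: the query is rewritten first, then stored with its slot.
        Map<ToyQuery, Integer> joinQueryID = new HashMap<>();
        joinQueryID.put(original.rewrite(1L), 0);

        // Looking up the un-rewritten query misses, because equals() is false.
        System.out.println(joinQueryID.get(original));             // prints "null"

        // Rewriting against the same index version first finds the slot.
        System.out.println(joinQueryID.get(original.rewrite(1L))); // prints "0"
    }
}
```

In this toy the second lookup only succeeds because both rewrites see the
same index version, which matches the warning below that a rewrite against a
changed index may produce a different query again.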


On Tue, Jan 14, 2014 at 4:38 AM, Peng Cheng <[email protected]> wrote:

> I see. Perhaps the best solution is to put the un-rewritten
> blockJoinQueries into the joinQueryID? The result will be the same. Right
> now the code has very strange behavior if no rewrite is called beforehand:
> it gives empty groups or correct results at random.
>
> It's a great pleasure to read your reply; I never expected someone to
> respond that fast.
>
> Yours Peng
>
>
>
> On Tue, Jan 14, 2014 at 2:33 AM, Uwe Schindler <[email protected]> wrote:
>
>> Hi Peng,
>>
>>
>>
>> rewrite() returns a different query that will definitely not preserve the
>> hashCode() or be equals() to the original one or any other rewritten one.
>> The reason for this is: A rewritten query is a new query that contains
>> information about the index it will be executed on (e.g., it references
>> terms from that index), so it **cannot** be equal to the original one.
>> If it cannot be equal, the hashCode should be different as well. If you
>> execute the query at a later stage, you have to rewrite the original query
>> again, because the index may have changed. And take care: this rewrite may
>> produce a completely different query (with a new hashCode again) if the
>> index changed in the meantime.
>>
>>
>>
>> There is a workaround (to me it looks like the code is just missing
>> documentation): you can manually rewrite the query before invoking
>> getTopGroups() using Searcher#rewrite(query). Why is a hotfix needed?
>>
>>
>>
>> Also, rewriting the query on every call of getTopGroups is a major
>> overhead (most queries' rewrites are very expensive and take as long as
>> the execution of the query, e.g. MultiTermQueries), so it should only be
>> done once, not on every call. Maybe that's the reason why it was left out,
>> but it was not documented.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>
>> http://www.thetaphi.de
>>
>> eMail: [email protected]
>>
>>
>>
>> *From:* Peng Cheng [mailto:[email protected]]
>> *Sent:* Tuesday, January 14, 2014 3:59 AM
>> *To:* [email protected]; [email protected]
>>
>> *Subject:* (Lucene-core) Is Query's rewrite method mandated to preserver
>> original Query's hashcode?
>>
>>
>>
>> Hi developers,
>>
>>
>>
>> I've recently found a few bugs in advanced features of Lucene-core 4.6
>> (which is perfectly normal, as those features are less likely to be used
>> and tested). The most serious one has rendered my
>> ToParentBlockJoinCollector close to useless:
>>
>>
>>
>> In the scorer generation stage, the ToParentBlockJoinCollector will
>> automatically rewrite all the associated ToParentBlockJoinQuery instances
>> (and their subqueries) and save them into its in-memory lookup table,
>> namely joinQueryID (see the enroll() method for details). Unfortunately,
>> in the getTopGroups method, the new ToParentBlockJoinQuery parameter is
>> not rewritten (at least users are not expected to do so). When the new one
>> is searched for in the old lookup table (considering the impact of
>> rewrite() on hashCode()), the lookup (namely _slot) will always fail, and
>> the result is a topGroup collection consisting of only empty groups (their
>> hitCounts are guaranteed to be zero).
>>
>>
>>
>> I'm not sure whether rewrite() is supposed to preserve the Query's
>> hashcode, as I've found many counterexamples already. If it is not, then
>> this problem can be solved by rewriting the original BlockJoinQuery
>> before invoking the getTopGroups method. Nevertheless, users are not
>> expected to do so, therefore I would suggest submitting a hotfix that adds
>> the described rewrite step.
>>
>>
>>
>> If rewrite() must preserve the hashcode, then this is a problem in the
>> various rewrite() implementations, and the fix would be much harder.
>>
>>
>>
>> This bug has caused widespread panic in my company and I would like to
>> see it fixed ASAP. Please give me some suggestions so I know which hotfix
>> I should be working on.
>>
>>
>>
>> All the best,
>>
>>
>>
>> Yours Peng
>>
>
>
