msfroh commented on PR #16240:
URL: https://github.com/apache/lucene/pull/16240#issuecomment-4733772590

   > it is prohibitively expensive to build the Scorer in the first place -
   > once it is built the cost of iterating is likely to be much lower.
   
   Right -- a challenge is that some of these Scorers are actually very cheap 
to build, but we have no good way of knowing that without trying. So we 
assume the Scorer is expensive and make bad decisions as a consequence: 
either choosing doc-values-based (DV) filtering when the `MultiTermQuery` (MTQ) 
actually matches only a small number of docs, or skipping caching of the MTQ 
and of any `BooleanQuery` that contains it.
   
   @romseygeek, I really like your suggestion of moving `estimateCost` into 
`MultiTermQuery`. I think we need more polymorphism here, since shoving all the 
logic into `AbstractMultiTermQueryConstantScoreWrapper` is already getting 
messy. In particular, the logic should be a lot simpler for some queries (e.g. 
`PrefixQuery`) than others (e.g. `RegexpQuery`).
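
   To make the polymorphism idea concrete, here is a rough sketch of the shape 
I have in mind. Everything in it is a hypothetical stand-in: the `Toy*` classes 
are not Lucene classes, the `estimateCost` signature is invented for 
illustration, and the term dictionary is modeled as a plain sorted map from 
term to doc frequency. The only point is that each query type owns its own 
estimate, so the prefix case can be a cheap contiguous scan while the regexp 
case falls back to a pessimistic bound.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a multi-term query; NOT Lucene's MultiTermQuery.
abstract class ToyMultiTermQuery {
    // Upper bound on the postings this query may visit, given a toy
    // term dictionary (term -> doc frequency). Signature is an assumption.
    abstract long estimateCost(NavigableMap<String, Integer> termDocFreqs);
}

// Prefix case: matching terms are contiguous in sorted order, so the
// estimate is a short scan starting at the prefix.
final class ToyPrefixQuery extends ToyMultiTermQuery {
    private final String prefix;

    ToyPrefixQuery(String prefix) {
        this.prefix = prefix;
    }

    @Override
    long estimateCost(NavigableMap<String, Integer> termDocFreqs) {
        long cost = 0;
        for (Map.Entry<String, Integer> e : termDocFreqs.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the contiguous run of matching terms
            }
            cost += e.getValue();
        }
        return cost;
    }
}

// Regexp case: without walking the automaton we cannot tell which terms
// match, so this toy version returns the pessimistic bound over all terms.
final class ToyRegexpQuery extends ToyMultiTermQuery {
    @Override
    long estimateCost(NavigableMap<String, Integer> termDocFreqs) {
        long cost = 0;
        for (int docFreq : termDocFreqs.values()) {
            cost += docFreq;
        }
        return cost;
    }
}

// Small fixture for trying the two estimates side by side.
final class ToyCostDemo {
    static NavigableMap<String, Integer> sampleTerms() {
        NavigableMap<String, Integer> terms = new TreeMap<>();
        terms.put("apple", 3);
        terms.put("apply", 2);
        terms.put("banana", 10);
        terms.put("band", 4);
        return terms;
    }
}
```

   With the sample terms, `new ToyPrefixQuery("app")` only counts the postings 
for `apple` and `apply`, while `ToyRegexpQuery` has to assume it may touch 
every term. A real implementation would of course work against the actual 
terms dictionary rather than a map.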
   
   It'll probably take me a few days to implement that, but I don't want to 
hold up the 10.5 release. IMO, the improvement that @txwei made is probably 
more significant than the hit that we take from not caching "easy" clauses. In 
10.6, we can have both.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
