itschrispeck opened a new pull request, #13199:
URL: https://github.com/apache/pinot/pull/13199

   Problem: https://github.com/apache/pinot/pull/11185 added proper support for 
null handling. One side effect is that the order of execution changed, which 
hurts performance for queries where a bitmap-based filter operator would 
otherwise reduce the number of documents on which an expensive expression such 
as `regexp_like` is evaluated. In summary, `NOT (a AND b)` is now executed as 
`NOT a OR NOT b`.
   
   For example, affected queries could look like:
   ```
   NOT (text_match(col, '...') AND regexp_like(col, '...'))
   ```
   
   In this case, the PR changed `NotFilterOperator` to use 
`AndFilterOperator.getFalses()` which [builds an `OrDocIdSet` from the false 
DocIdSets](https://github.com/apache/pinot/pull/11185/files#diff-13077035b35ccc6f9b73625c2315d9571a80fd9fedcec2d75f689f9b335e22aaR59-R68)
 instead of using a `NotDocIdIterator` built from the `AndDocIdSet` as was done 
in the old implementation.
   
   This PR restores the old approach (a `NotDocIdIterator` built from the 
`AndDocIdSet`), while still handling nulls properly. 
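
To illustrate why the evaluation order matters, here is a minimal toy sketch. It is not Pinot code: `NotAndSketch`, `notOfAnd`, `orOfNots`, and the call counter are hypothetical names, `java.util.BitSet` stands in for the bitmap-based filter, an `IntPredicate` stands in for an expensive scan-based expression such as `regexp_like`, and nulls are ignored entirely. It assumes that a naive union has to evaluate the expensive predicate on every document.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

public class NotAndSketch {
    // Counts calls to the expensive predicate (stand-in for regexp_like).
    static int expensiveCalls = 0;

    // NOT (a AND b): intersect first, then complement.
    // The expensive predicate only runs on docs that the bitmap already matched.
    static BitSet notOfAnd(BitSet a, IntPredicate b, int numDocs) {
        BitSet matches = new BitSet(numDocs);
        for (int doc = a.nextSetBit(0); doc >= 0; doc = a.nextSetBit(doc + 1)) {
            expensiveCalls++;
            if (b.test(doc)) {
                matches.set(doc);
            }
        }
        matches.flip(0, numDocs); // complement within [0, numDocs)
        return matches;
    }

    // NOT a OR NOT b: complement each side, then union.
    // Assumption: computing NOT b means testing b on every doc.
    static BitSet orOfNots(BitSet a, IntPredicate b, int numDocs) {
        BitSet result = (BitSet) a.clone();
        result.flip(0, numDocs); // NOT a, cheap on a bitmap
        for (int doc = 0; doc < numDocs; doc++) {
            expensiveCalls++;
            if (!b.test(doc)) {
                result.set(doc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        BitSet a = new BitSet(numDocs);
        for (int doc = 0; doc < numDocs; doc += 100) {
            a.set(doc); // the bitmap filter matches 10 docs
        }
        IntPredicate b = doc -> doc % 200 == 0;

        expensiveCalls = 0;
        BitSet r1 = notOfAnd(a, b, numDocs);
        int callsAndThenNot = expensiveCalls;

        expensiveCalls = 0;
        BitSet r2 = orOfNots(a, b, numDocs);
        int callsOrOfNots = expensiveCalls;

        System.out.println("same result: " + r1.equals(r2));
        System.out.println("expensive calls, NOT (a AND b): " + callsAndThenNot);
        System.out.println("expensive calls, NOT a OR NOT b: " + callsOrOfNots);
    }
}
```

Both strategies return the same doc ids, but in this toy setup the first one calls the expensive predicate once per doc matched by the bitmap (10 calls), while the second calls it once per doc in the segment (1000 calls).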
   
   Open question: The behavior of `NOT (a OR b)` was also changed so that it is 
executed as `NOT a AND NOT b`. I'm not sure whether it's better to leave this 
as is, even though the order of execution is implicitly changed, since the 
change probably benefits most queries. One option is to ensure the 
implementation matches the query as written, and then use an optimizer if we 
think this case should be executed differently. 
   
   tags: `bugfix` `performance`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

