clintropolis opened a new pull request, #15551:
URL: https://github.com/apache/druid/pull/15551

   ### Description
   Adds the enhancements from #13977 to traditional string columns, improving 
filter processing performance by allowing obviously expensive bitmap 
operations to be skipped. I haven't run the benchmarks, but I expect string 
columns that are not created with the 'auto' indexer to see an improvement 
similar to the one seen in #13977. Previously this performance enhancement was 
only available for 'auto' and 'json' columns.
   
   I haven't documented the configs behind this in this PR yet because I'm 
still thinking about how to frame what they do in an understandable way, 
without requiring deep knowledge of how query processing actually works. For 
the most part users should probably not tweak these settings, but I will try 
to think of something and add docs in a follow-up PR.
   
   #### Release note
   String columns created with the 'string' column indexer now have an 
enhancement for filter match processing that was previously only available to 
columns created by the 'auto' indexer: Druid automatically skips obviously 
expensive index computation for filters that would require a very large number 
of bitmap operations. This should improve performance, particularly when filter 
clauses contain a mix of simple and complex filters, since it allows the complex 
filters to skip index computation. Previously Druid would always utilize 
indexes if they were available; that behavior can be restored by setting 
`druid.processing.skipValuePredicateIndexScale` and 
`druid.processing.skipValueRangeIndexScale` to `1.0`.
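
The role of the two scale settings can be illustrated with a minimal sketch. It assumes, for illustration only, that an index is skipped when the number of bitmap operations a filter would need exceeds the scale multiplied by the number of rows in the segment. The class and method names (`IndexSkipSketch`, `shouldSkipIndex`) are hypothetical and are not taken from the Druid code in this PR.

```java
// Hypothetical sketch of a scale-based "skip the index" check.
// The names and the exact formula are assumptions, not Druid's actual API.
public final class IndexSkipSketch {
    private IndexSkipSketch() {}

    /**
     * Returns true when using the index would need more bitmap operations
     * than scale * numRows (rounded up). In that case the filter would be
     * evaluated by matching values on rows instead of by combining bitmaps.
     */
    public static boolean shouldSkipIndex(double scale, long numRows, long bitmapOperations) {
        long threshold = (long) Math.ceil(scale * numRows);
        return bitmapOperations > threshold;
    }

    public static void main(String[] args) {
        long numRows = 1_000L;
        // A filter touching 600 bitmaps, checked against an assumed scale of 0.5.
        System.out.println(shouldSkipIndex(0.5, numRows, 600L));
        // The same filter with the scale set to 1.0.
        System.out.println(shouldSkipIndex(1.0, numRows, 600L));
    }
}
```

Under this assumed formula, a scale of `1.0` puts the threshold at the row count, so the index is kept whenever a filter needs no more bitmap operations than there are rows.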
   
   <hr>
   
   This PR has:
   
   - [ ] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [x] a release note entry in the PR description.
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for 
[code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) 
is met.
   - [x] been tested in a test Druid cluster.
   

