jon-wei commented on a change in pull request #9647: document useFilterCNF query context parameter
URL: https://github.com/apache/druid/pull/9647#discussion_r405972067
##########
File path: docs/querying/query-context.md
##########
@@ -45,6 +45,7 @@ The query context is used for various query configuration parameters. The follow
 |parallelMergeParallelism|`druid.processing.merge.pool.parallelism`|Maximum number of parallel threads to use for parallel result merging on the Broker. See [Broker configuration](../configuration/index.html#broker) for more details.|
 |parallelMergeInitialYieldRows|`druid.processing.merge.task.initialYieldNumRows`|Number of rows to yield per ForkJoinPool merge task for parallel result merging on the Broker, before forking off a new task to continue merging sequences. See [Broker configuration](../configuration/index.html#broker) for more details.|
 |parallelMergeSmallBatchRows|`druid.processing.merge.task.smallBatchNumRows`|Size of result batches to operate on in ForkJoinPool merge tasks for parallel result merging on the Broker. See [Broker configuration](../configuration/index.html#broker) for more details.|
+|useFilterCNF|`false`| If true, Druid will attempt to convert the query filter to Conjunctive Normal Form (CNF). During query processing, columns can be pre-filtered by intersecting the bitmap indexes of all values that match the eligible filters, often greatly reducing the raw number of rows which need to be scanned. But this effect only happens for the top level filter, or individual clauses of a top level 'and' filter. As such, filters in CNF potentially have a higher chance to utilize a large amount of bitmap indexes on string columns during pre-filtering. However, this setting should be used with great caution, as it can sometimes have a negative effect on performance, and in some cases, the act of computing CNF of a filter can be expensive. We recommend hand tuning your filters to produce an optimal form if possible, or at least verifying through experimentation that using this parameter actually improves your query performance with no ill-effects.|

Review comment:

> But this effect only happens for the top level filter, or individual clauses of a top level 'and' filter.

Suggest providing a few examples to clarify:

- For an OR filter `A || B` where `A` can be resolved using bitmap indexes but `B` cannot, the whole OR filter is excluded from pre-filtering.
- If it were `A && B` instead, `A` would be considered for pre-filtering but `B` would not.
- If it were `A && (C || D)`, where `C` and `D` can be resolved using bitmap indexes, then the whole filter can be considered for pre-filtering.
- If it were `A && (B || C)`, with `B` still not resolvable using bitmap indexes, only `A` would be considered for pre-filtering.
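
The four cases in the review comment can be checked with a small toy model in Python. This is only a sketch of the rule as described in the comment, not Druid's implementation: the `Leaf`, `And`, `Or`, `supports_bitmap`, and `prefilter_clauses` names are hypothetical, and the `bitmap` flag is an assumed stand-in for "this filter can be resolved using bitmap indexes".

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    name: str
    bitmap: bool  # assumption: True if this filter can use bitmap indexes


@dataclass(frozen=True)
class And:
    children: Tuple["Filter", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Filter", ...]


Filter = Union[Leaf, And, Or]


def supports_bitmap(f: Filter) -> bool:
    """A compound filter is bitmap-resolvable only if every child is."""
    if isinstance(f, Leaf):
        return f.bitmap
    return all(supports_bitmap(c) for c in f.children)


def show(f: Filter) -> str:
    """Render a filter using the notation from the review comment."""
    if isinstance(f, Leaf):
        return f.name
    sep = " && " if isinstance(f, And) else " || "
    return "(" + sep.join(show(c) for c in f.children) + ")"


def prefilter_clauses(f: Filter) -> list:
    """Return the top-level clauses that qualify for pre-filtering.

    Only the top-level filter, or the individual clauses of a
    top-level AND, are candidates.
    """
    clauses = f.children if isinstance(f, And) else (f,)
    return [show(c) for c in clauses if supports_bitmap(c)]


A = Leaf("A", bitmap=True)
B = Leaf("B", bitmap=False)
C = Leaf("C", bitmap=True)
D = Leaf("D", bitmap=True)

print(prefilter_clauses(Or((A, B))))           # A || B
print(prefilter_clauses(And((A, B))))          # A && B
print(prefilter_clauses(And((A, Or((C, D))))))  # A && (C || D)
print(prefilter_clauses(And((A, Or((B, C))))))  # A && (B || C)
```

In this model an OR clause qualifies only when all of its children do, which is why a single non-bitmap child such as `B` removes the whole OR from consideration.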
