+1.
This will help in range query as only one time filter will be applied and
range of binary search will decrease as we can do incremental binary
search. Currently in carbon range filter for dictionary column is slow as
number of filter applied can be more and it degrade the filter query
performance, above optimisation will improve the dictionary column also.
-Regards
Kumar Vishal
On Tue, Mar 21, 2017 at 9:14 AM, sounak wrote:
> Hi,
>
> Currently I am working in a RANGE Filter Optimization. Previously if in
> case there are both Greater Than and Less than predicates on a same column,
> then they are evaluated spearately where multiple times scanning of the
> blocks are involved which is undoubtedly costly.
>
> To optimize this, if "Greater and Less than" predicates are there the a
> same column and joined by AND expression, then these two expression can be
> merged into a single expression and can form Range Expression. Rather than
> evaluating the expressions separately only one time the block will be
> scanned and if the values lies within Range Min and Max then will be
> selected.
>
> Current the filter expression tree is transformed as shown below. The below
> tranformation helps us to carry the Range MIN MAX value so that prunning
> optimization is applied.
>
> //AND AND
> // | |
> /// \ / \
> // / \ / \
> //Less Greater => TRUE Range
> // / \/ \ / \
> /// \ / \ / \
> // a10 a 5 Less greater
> // /\
> /\
> // / \
> / \
> // a 10 a
> 5
>
>
> Presence of a Range in a single expression gives few more optimization
> during block prunning. Comparing Filter Min and Max and Block Min and Max
> values few decisions can be taken as shown below.
>
> Case A) In case the Filter Range lies completely outside of Block range =>
> Then no scanning is required and the block can be completely skipped.
>
>---
>
> | BLOCK |
>
> ---
> -
> Min MAX
>
> | FILTER |
> or |
>FILTER |
>--
>
>
> Min Max
> Min
> Max
>
> Case B ) The Filter Completely Covers the Block => No scanning is required.
> Select all Rows of the Block. i.e. turn all bits of block to true.
>
> --
>
>| BLOCK |
>
>--
>
> --
>|
> FILTER |
>
> --
>
> Case C) The Filter Overlaps the Block, But completely donot cover it. =>
> Choose the Block Scan Start or End based on filter overlaping, no need to
> do a binary search to find the Start and End Index of Block.
>
>
> -
> --
> | BLOCK
> ||
> BLOCK|
>
> -
> -
>
> OR
> -
>| FILTER|
>
> | FILTER|
>---
>
>-
> Start of Block will be default
> StartIndex Value