[
https://issues.apache.org/jira/browse/IGNITE-21277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Orlov updated IGNITE-21277:
--------------------------------------
Epic Link: IGNITE-21312
> Sql. Partition pruning. Port partitionExtractor from AI2.
> ---------------------------------------------------------
>
> Key: IGNITE-21277
> URL: https://issues.apache.org/jira/browse/IGNITE-21277
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Affects Versions: 3.0.0-beta2
> Reporter: Maksim Zhuravkov
> Priority: Major
> Labels: ignite-3
>
> In order to prune unnecessary partitions we need to obtain information that
> includes possible "values" of colocation key columns from filter expressions
> for every scan operators prior to statement execution. This can be
> accomplished by traversing an expression tree of scan's filter and collecting
> expressions with colocation key columns (this data is called partition
> pruning metadata for simplicity).
> 1. Implement a component that takes a physical plan and analyses filter
> expressions of every scan operator and creates (if possible) an expression
> that includes all colocated columns. (The PartitionExtractor from patch
> (https://github.com/apache/ignite/pull/10928/files) can be used a reference
> implementation).
> Expression types to analyze:
> * AND
> * EQUALS
> * IS_FALSE
> * IS_NOT_DISTINCT_FROM
> * IS_NOT_FALSE
> * IS_NOT_TRUE
> * IS_TRUE
> * NOT
> * OR
> * SEARCH (operation that tests whether a value is included in a certain
> range)
> 2. Update QueryPlan to include partition pruning metadata for every scan
> operator (source_id = <partition pruning metadata>).
> Basic examples:
> {code:java}
> Statement:
> SELECT * FROM t WHERE pk = 7 OR pk = 42
> Partition metadata:
> t's source_id = [pk=7 || pk = 42] // Assuming colocation key is equal to
> primary key, || denotes OR operation
> Statement:
> SELECT * FROM t WHERE pk = 7 OR col1 = 1
> Partition metadata: [] // Empty, because col1 is not part of a colocation key.
> Statement:
> SELECT * FROM t_colo_key1_colo_key2 WHERE colo_key1= 42
> Partition metadata: [] // Empty, because colo_key2 is missing
> {code}
> —
> *Additional examples - partition pruning is possible*
> Dynamic parameters:
> {code:java}
> SELECT * FROM t WHERE pk = ?1
> Partition pruning metadata: t = [ pk = ?1 ]
> {code}
> Colocation columns reside inside a nested expression:
> {code:java}
> SELECT * FROM t WHERE col1 = col2 AND (col2 = 100 AND pk = 2)
> Partition pruning metadata: t = [ pk = 2 ]
> {code}
> Multiple keys:
> {code:java}
> SELECT * FROM t WHERE col_c1 = 1 AND col_c2 = 2
> Partition pruning metadata: t = [ (col_c1 = 1, col_c2 = 2) ]
> {code}
> Complex expression with multiple keys:
> {code:java}
> SELECT * FROM t WHERE (col_col1 = 100 and col_col2 = 4) OR (col_col1 = 4 and
> col_col2 = 100)
> Partition pruning metadata: t = [ (col_col1 = 100, col_col2 = 4) || (col_col1
> = 4, col_col2 = 100) ]
> {code}
> Multiple tables, assuming that filter b_id = 42 is pushed into scan b,
> because a_id = b_id:
> {code:java}
> SELECT * FROM a JOIN b WHERE a_id = b_id AND a_id = 42
> Partition pruning metadata: a= [ a_id=42 ], b=[ b_id=42 ]
> {code}
> ---
> *Additional examples - partition pruning is not possible*
> Columns named col* are not part of colocation key:
> {code:java}
> SELECT * FROM t WHERE col1 = 10
> // Filter does not use colocation key columns.
> Partition pruning metadata: []
> {code}
> {code:java}
> SELECT * FROM t WHERE col1 = col2 OR pk = 42
> // We need to scan all partitions to figure out which tuples have ‘col1 =
> col2’
> Partition pruning metadata: []
> {code}
> {code:java}
> SELECT * FROM t WHERE col_col1 = 10 AND col_col2 OR col_col1 = 42
> // Although first expression uses all colocation key columns the second one
> only uses some.
> Partition pruning metadata: []
> {code}
> {code:java}
> SELECT * FROM t WHERE col_c1 = 1 OR col_c2 = 2
> // We need to scan all partitions to figure out which tuples have col_c1 = 1
> OR col_c2 = 2.
> Partition pruning metadata: []
> {code}
> {code:java}
> SELECT * FROM t WHERE col_col1 = col_col2 OR col_col2 = 42
> // We need to scan all partitions to figure out which tuples have ‘col_col1 =
> col_col2’
> Partition pruning metadata: []
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)