[
https://issues.apache.org/jira/browse/IGNITE-21277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maksim Zhuravkov updated IGNITE-21277:
--------------------------------------
Affects Version/s: 3.0.0-beta2
> Sql. Partition pruning. Extract partition pruning information from scan
> operations
> ----------------------------------------------------------------------------------
>
> Key: IGNITE-21277
> URL: https://issues.apache.org/jira/browse/IGNITE-21277
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Affects Versions: 3.0.0-beta2
> Reporter: Maksim Zhuravkov
> Priority: Major
> Labels: ignite-3
>
> In order to prune unnecessary partitions we need to obtain information that
> includes possible "values" of colocation key columns from filter expressions
> for every scan operators (such as IgniteTableScan, IgniteIndexScan and
> IgniteSystemViewScan - or simply subclasses of all
> ProjectableFilterableTableScan) prior to statement execution. This can be
> accomplished by traversing an expression tree of scan's filter and collecting
> expressions with colocation key columns (This data is called partition
> pruning metadata for simplicity).
> 1. Implement a component that takes a physical plan and analyses filter
> expressions of every scan operator and creates (if possible) an expression
> that includes all colocated columns. (The PartitionExtractor from patch can
> be used a reference implementation).
> Basic example:
>
> {code:java}
> Statement:
> SELECT * FROM t WHERE pk = 7 OR pk = 42
> Partition metadata:
> t's source_id = [pk=7 || pk = 42] // Assuming colocation key is equal to
> primary key, || denotes OR operation
> {code}
>
> If some colocation key columns are missing from then filter, then partition
> pruning is not possible for such operation.
> Expression types to analyze:
> * AND
> * EQUALS
> * IS_FALSE
> * IS_NOT_DISTINCT_FROM
> * IS_NOT_FALSE
> * IS_NOT_TRUE
> * IS_TRUE
> * NOT
> * OR
> * SEARCH (operation that tests whether a value is included in a certain
> range)
> 2. Update QueryPlan to include partition pruning metadata for every scan
> operator (source_id = <expression>).
> —
> *Additional examples - partition pruning is possible*
> Dynamic parameters:
> {code:java}
> SELECT * FROM t WHERE pk = ?1
> Partition pruning metadata: t = [ pk = ?1 ]
> {code}
> Colocation columns reside inside a nested expression:
> {code:java}
> SELECT * FROM t WHERE col1 = col2 AND (col2 = 100 AND pk = 2)
> Partition pruning metadata: t = [ pk = 2 ]
> {code}
> Multiple keys:
> {code:java}
> SELECT * FROM t WHERE col_c1 = 1 AND col_c2 = 2
> Partition pruning metadata: t = [ (col_c1 = 1, col_c2 = 2) ]
> {code}
> Complex expression with multiple keys:
> {code:java}
> SELECT * FROM t WHERE (col_col1 = 100 and col_col2 = 4) OR (col_col1 = 4 and
> col_col2 = 100)
> Partition pruning metadata: t = [ (col_col1 = 100, col_col2 = 4) || (col_col1
> = 4, col_col2 = 100) ]
> {code}
> Multiple tables, assuming that filter b_id = 42 is pushed into scan b,
> because a_id = b_id:
> {code:java}
> SELECT * FROM a JOIN b WHERE a_id = b_id AND a_id = 42
> Partition pruning metadata: a= [ a_id=42 ], b=[ b_id=42 ]
> {code}
> ---
> *Partition pruning is not possible*
> Columns named col* are not part of colocation key:
> {code:java}
> SELECT * FROM t WHERE col1 = 10
> Partition pruning metadata: [] // (empty) because filter does not use
> colocation key columns.
> {code}
> {code:java}
> SELECT * FROM t WHERE col1 = col2 OR pk = 42
> // Pruning is not possible because we need to scan all partitions to figure
> out which tuples have ‘col1 = col2’
> Partition pruning metadata: []
> {code}
> {code:java}
> SELECT * FROM t WHERE col_col1 = 10 AND col_col2 OR col_col1 = 42
> // Although first expression uses all colocation key columns the second one
> only uses some.
> Partition pruning metadata: []
> {code}
> {code:java}
> SELECT * FROM t WHERE col_c1 = 1 OR col_c2 = 2
> // Empty partition pruning metadata: need to scan all partitions to figure
> out which tuples have col_c1 = 1 OR col_c2 = 2.
> Partition pruning metadata: []
> {code}
> {code:java}
> SELECT * FROM t WHERE col_col1 = col_col2 OR col_col2 = 42
> // Empty partition pruning metadata: need to scan all partitions to figure
> out which tuples have ‘col_col1 = col_col2’
> Partition pruning metadata: []
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)