Zoltán Borók-Nagy created IMPALA-15040:
------------------------------------------
Summary: Predicate pull-up for Iceberg tables
Key: IMPALA-15040
URL: https://issues.apache.org/jira/browse/IMPALA-15040
Project: IMPALA
Issue Type: Improvement
Reporter: Zoltán Borók-Nagy
Impala uses the following plan to scan Iceberg tables with position deletes /
Deletion Vectors:
{noformat}
UNION ALL
/ \
/ \
/ \
SCAN all IcebergDeleteNode
datafiles / \
without / \
deletes SCAN SCAN
datafiles deletes
with
deletes
{noformat}
IcebergDeleteNode is highly efficient in eliminating deleted rows. Therefore,
we could pull up expensive conjuncts from "SCAN datafiles with deletes" to
IcebergDeleteNode, so we'd only evaluate them on the active rows.
{*}There are some caveats{*}:
* If the conjuncts are used in late materialization in the SCANs, and have
high selectivity, then pulling them up might make performance worse
* We probably shouldn't pull up predicates that can be evaluated by
min/max/dictionary filters
--
This message was sent by Atlassian Jira
(v8.20.10#820010)