[
https://issues.apache.org/jira/browse/IMPALA-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525694#comment-17525694
]
Riza Suminto commented on IMPALA-10453:
---------------------------------------
Since IMPALA-11147 went in, we no longer create column readers for
identity-partitioned columns, so we lose some opportunities to do min/max and
dictionary filtering at the row-group level.
We will need IMPALA-10453 to compensate for that.
> Support file/partition pruning via runtime filters on Iceberg
> -------------------------------------------------------------
>
> Key: IMPALA-10453
> URL: https://issues.apache.org/jira/browse/IMPALA-10453
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Tamas Mate
> Priority: Major
> Labels: iceberg, impala-iceberg, performance
>
> This is a placeholder to figure out what we'd need to do to support dynamic
> file-level pruning in Iceberg using runtime filters, i.e. have parity for
> partition pruning.
> * If there is a single partition value per file, then applying bloom filters
> to the row group stats would be effective at pruning files.
> * If there are partition transforms, e.g. hash-based, then I think we
> probably need to track the partition that the file is associated with and
> then have some custom logic in the parquet scanner to do partition pruning.
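
The two cases in the description can be sketched roughly as below. This is an
illustration only, not Impala's scanner code: DataFile, identity, make_bucket
and prune_files are hypothetical names, the runtime filter is modelled as an
exact set of build-side keys (a bloom filter cannot be enumerated like this),
and the modulo bucket is a stand-in for a real hash-based transform.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file with one partition value."""
    path: str
    partition_value: int  # already-transformed value recorded for the file


def identity(value: int) -> int:
    # Identity partitioning: the partition value is the column value itself.
    return value


def make_bucket(num_buckets: int) -> Callable[[int], int]:
    # Simplified stand-in for a hash-based transform (assumption: plain modulo,
    # not the hash function any real table format uses).
    return lambda value: value % num_buckets


def prune_files(
    files: Iterable[DataFile],
    filter_keys: Iterable[int],
    transform: Callable[[int], int],
) -> List[DataFile]:
    """Keep only files whose partition value can match a runtime-filter key."""
    # Map each filter key into partition space, then compare per file.
    wanted = {transform(key) for key in filter_keys}
    return [f for f in files if f.partition_value in wanted]


# Case 1: a single identity partition value per file.
identity_files = [DataFile("a.parq", 1), DataFile("b.parq", 2), DataFile("c.parq", 3)]
kept_identity = prune_files(identity_files, {2}, identity)

# Case 2: a hash-style transform; each file records its bucket number.
bucket_files = [DataFile(f"bucket{i}.parq", i) for i in range(4)]
kept_bucket = prune_files(bucket_files, {5, 9}, make_bucket(4))
```

In the second case the filter keys have to be pushed through the same
transform before they can be compared with a file's partition value, which is
the kind of per-file bookkeeping the second bullet refers to.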