[
https://issues.apache.org/jira/browse/HUDI-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933165#comment-17933165
]
Y Ethan Guo edited comment on HUDI-9088 at 3/7/25 1:13 AM:
-----------------------------------------------------------
Note that there is no partition predicate in the MERGE INTO statement so
(regular) partition pruning does not apply here. If the source table has data
for only one partition and we want to leverage that information during query
planning and execution, dynamic partition pruning (DPP) can be applied here.
Currently it looks like MERGE INTO does not apply DPP on the Spark side, but
Hudi internally uses indexing to read only the partitions modified by the input
data. I'll verify whether that's the case. Note that Hudi's index-based
filtering for modified partitions does not show up in the logical plan, so it
can lead to the false belief that no pruning is applied.
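
To illustrate the idea of reading only the partitions touched by the input
data, here is a toy sketch in plain Python. It is not Hudi's indexing code and
not Spark's DPP; the table layout and the names partitions_to_read and merge
are hypothetical and exist only for this illustration.

```python
# Toy target table: partition value -> list of (key, value) rows.
# Hypothetical data, not taken from the reproduction gist.
target = {
    "2025-03-01": [(1, "a"), (2, "b")],
    "2025-03-02": [(3, "c")],
    "2025-03-03": [(4, "d")],
}

# Toy source batch of (partition, key, value) rows.
# All rows belong to a single partition.
source = [("2025-03-02", 3, "c2"), ("2025-03-02", 5, "e")]


def partitions_to_read(source_rows):
    """Derive the set of target partitions from the source rows alone."""
    return {partition for partition, _, _ in source_rows}


def merge(target_table, source_rows):
    """Upsert source rows, scanning only partitions present in the source.

    Returns the list of partitions that were actually read.
    """
    scanned = []
    for partition in sorted(partitions_to_read(source_rows)):
        scanned.append(partition)
        # Load only this partition's rows, keyed by record key.
        rows = dict(target_table.get(partition, []))
        for row_partition, key, value in source_rows:
            if row_partition == partition:
                rows[key] = value  # update on match, insert otherwise
        target_table[partition] = sorted(rows.items())
    return scanned


scanned = merge(target, source)
print(scanned)
```

In this sketch the merge statement itself carries no partition predicate. The
set of partitions to read comes entirely from the source rows, which is why
such filtering would not be visible as a pruning step in a query plan.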
was (Author: JIRAUSER280684):
Note that there is no partition predicate in the MERGE INTO statement so
(regular) partition pruning does not apply here. If the source table has data
for only one partition and we want to leverage that information during query
planning and execution, dynamic partition pruning (DPP) can be applied here.
Currently it looks like MERGE INTO does not apply DPP support and I'm looking
into how to add such DPP support.
> MIT not doing partition pruning when using partition columns
> ------------------------------------------------------------
>
> Key: HUDI-9088
> URL: https://issues.apache.org/jira/browse/HUDI-9088
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: spark-sql
> Reporter: Aditya Goenka
> Assignee: Y Ethan Guo
> Priority: Critical
> Fix For: 0.16.0, 1.0.2
>
> Original Estimate: 6h
> Remaining Estimate: 6h
>
> MIT is not doing partition pruning. Reproducible code:
> https://gist.github.com/ad1happy2go/584e0ce3731ab8be5093bbc2c86a002d
--
This message was sent by Atlassian Jira
(v8.20.10#820010)