[jira] [Comment Edited] (HUDI-9088) MIT not doing partition pruning when using partition columns

Y Ethan Guo (Jira) Thu, 06 Mar 2025 17:14:06 -0800


    [ 
https://issues.apache.org/jira/browse/HUDI-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933165#comment-17933165
 ]


Y Ethan Guo edited comment on HUDI-9088 at 3/7/25 1:13 AM:
-----------------------------------------------------------

Note that there is no partition predicate in the MERGE INTO statement, so 
(regular) partition pruning does not apply here.  If the source table has data 
for only one partition and we want to leverage that information during query 
planning and execution, dynamic partition pruning (DPP) could be applied.  
Currently it looks like MERGE INTO does not get DPP from the Spark side, but 
Hudi internally uses indexing to read only the modified partitions, based on 
the input data.  I'll verify if that's the case.  Note that Hudi's index-based 
filtering for modified partitions does not show up in the logical plan, so it 
can lead to the false belief that no pruning is applied.
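
To illustrate the idea only, here is a minimal pure-Python sketch of pruning 
driven by the input data: the set of partitions to read is derived from the 
source rows rather than from a predicate in the statement.  This is not Hudi's 
or Spark's code; the function names, the dict-based "table" and the `dt` 
partition column are all hypothetical.

```python
def partitions_touched(source_rows, partition_col):
    """Distinct partition values present in the incoming (source) rows."""
    return {row[partition_col] for row in source_rows}


def merge_with_pruning(target, source_rows, key_col, partition_col):
    """Upsert source rows into `target`, a dict of partition -> {key: row}.

    Only partitions that appear in the source are accessed. Returns the
    sorted list of those partitions so the pruning is observable.
    """
    scanned = sorted(partitions_touched(source_rows, partition_col))
    for row in source_rows:
        # Look up (or create) only the partition this row belongs to.
        part = target.setdefault(row[partition_col], {})
        part[row[key_col]] = row  # update if the key exists, insert otherwise
    return scanned


# Hypothetical target table with three partitions on `dt`.
target = {
    "2025-03-01": {1: {"id": 1, "dt": "2025-03-01", "v": "a"}},
    "2025-03-02": {2: {"id": 2, "dt": "2025-03-02", "v": "b"}},
    "2025-03-03": {3: {"id": 3, "dt": "2025-03-03", "v": "c"}},
}
# Source data covers a single partition, as in the scenario described above.
source = [
    {"id": 2, "dt": "2025-03-02", "v": "b2"},
    {"id": 4, "dt": "2025-03-02", "v": "d"},
]

scanned = merge_with_pruning(target, source, "id", "dt")
print(scanned)
```

In this sketch nothing about the pruning is visible in the "statement" itself 
(the call has no partition filter); it only shows up in which partitions are 
actually touched, which is analogous to why it would not appear in a logical 
plan.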


was (Author: JIRAUSER280684):
Note that there is no partition predicate in the MERGE INTO statement so 
(regular) partition pruning does not apply here.  If the source table has data 
for only one partition and we want to leverage that information during query 
planning and execution, dynamic partition pruning (DPP) can be applied here.  
Currently it looks like MERGE INTO does not apply DPP support and I'm looking 
into how to add such DPP support.

> MIT not doing partition pruning when using partition columns
> ------------------------------------------------------------
>
>                 Key: HUDI-9088
>                 URL: https://issues.apache.org/jira/browse/HUDI-9088
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: spark-sql
>            Reporter: Aditya Goenka
>            Assignee: Y Ethan Guo
>            Priority: Critical
>             Fix For: 0.16.0, 1.0.2
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> MERGE INTO (MIT) is not doing partition pruning. Reproducible code: 
> https://gist.github.com/ad1happy2go/584e0ce3731ab8be5093bbc2c86a002d



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
