[ 
https://issues.apache.org/jira/browse/HUDI-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-8928:
------------------------------
    Description: 
Based on our analysis, drop partition support is broken in 0.15.0 for tables 
with multiple partition fields.

 

For nested fields, a field with the same name as the partition field but a 
different path is being swapped with the partition value.
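
A toy illustration of the symptom described above, not Hudi's reader code: 
matching a field by leaf name alone also overwrites a nested field with the 
same name, while matching by full path does not. The record layout and the 
names {{region}} and {{address}} are hypothetical.

```python
def overwrite_by_leaf_name(record, name, value):
    """Problematic pattern: replace every field whose leaf name matches."""
    out = {}
    for key, val in record.items():
        if isinstance(val, dict):
            # Recurse into nested structs, so nested matches are replaced too.
            out[key] = overwrite_by_leaf_name(val, name, value)
        elif key == name:
            out[key] = value
        else:
            out[key] = val
    return out


def overwrite_by_path(record, path, value):
    """Safer pattern: replace only the field at the exact path."""
    head, *rest = path
    out = dict(record)
    if rest:
        out[head] = overwrite_by_path(record.get(head, {}), rest, value)
    else:
        out[head] = value
    return out


# Hypothetical record: top-level partition field plus a nested namesake.
record = {"region": "from-file", "address": {"region": "nested", "city": "x"}}

by_name = overwrite_by_leaf_name(record, "region", "us")
by_path = overwrite_by_path(record, ["region"], "us")
```

With name-only matching, {{address.region}} is overwritten as well; with path 
matching, only the top-level {{region}} changes.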

For the timestamp issue, the field is replaced with the partition value instead 
of the value in the file (for example: 
{{"timestamp_micros_nullable_field":"2025-01-25T00:00:00.000Z"}}).

We are also seeing a regression on drop partition, where the dropped partition 
is still being read.

The replace commit is not being written correctly in 0.15.0: 
{{partitionToReplaceFileIds}} contains a map with an empty list instead of the 
file group IDs for the partition.
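
A minimal sketch for spotting this symptom, assuming the replace commit 
metadata has already been loaded as a JSON object containing a 
{{partitionToReplaceFileIds}} map. The helper name, partition paths, and file 
group IDs below are hypothetical.

```python
import json
from typing import Dict, List


def partitions_with_empty_replace_lists(commit_metadata: Dict) -> List[str]:
    """Return partitions whose replaced file group ID list is empty."""
    mapping = commit_metadata.get("partitionToReplaceFileIds") or {}
    return sorted(part for part, ids in mapping.items() if not ids)


# Hypothetical metadata: one partition with the symptom, one without.
sample = json.loads(
    '{"partitionToReplaceFileIds": '
    '{"ts=2025-01-25": [], "ts=2025-01-26": ["fg-1"]}}'
)
suspect = partitions_with_empty_replace_lists(sample)
```

Any partition returned here was recorded as replaced without naming the file 
groups that were dropped.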

 

We need a fix for 0.15.0.  

1.0 works fine. See the test script 
https://gist.github.com/codope/dfb87d35112ddbb7f207ad3f52320071

  was:
Based on our analysis, drop partition support is broken in 0.15.0 for tables 
with multiple partition fields.

 

For nested fields, a field with the same name as the partition field but a 
different path is being swapped with the partition value.

For the timestamp issue, the field is replaced with the partition value instead 
of the value in the file (for example: 
{{"timestamp_micros_nullable_field":"2025-01-25T00:00:00.000Z"}}).

We are also seeing a regression on drop partition, where the dropped partition 
is still being read.

The replace commit is not being written correctly in 0.15.0: 
{{partitionToReplaceFileIds}} contains a map with an empty list instead of the 
file group IDs for the partition.

 

We need a fix for 0.15.0.  

1.0 is yet to be tried; not sure if it's broken.


> Fix timestamp based partitioning and drop partition support with 0.15.0
> -----------------------------------------------------------------------
>
>                 Key: HUDI-8928
>                 URL: https://issues.apache.org/jira/browse/HUDI-8928
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: reader-core
>            Reporter: sivabalan narayanan
>            Assignee: Sagar Sumit
>            Priority: Blocker
>             Fix For: 0.15.1
>
>         Attachments: Screenshot 2025-01-28 at 4.12.49 PM.png



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
