[
https://issues.apache.org/jira/browse/HUDI-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sagar Sumit updated HUDI-8928:
------------------------------
Description:
Based on our analysis, drop partition support is broken in 0.15.0 for multiple
partition fields.
For nested fields, a field with the same name but a different path is swapped
with the partition value.
For the timestamp issue, the field gets replaced with the partition value
instead of the value in the file (for example:
{{"timestamp_micros_nullable_field":"2025-01-25T00:00:00.000Z"}}).
We are also seeing a regression on drop partition where the dropped partition
is still being read.
The replace commit is not being written correctly in 0.15.0:
{{partitionToReplaceFileIds}} contains a map with an empty list instead of the
file group IDs for the partition.
We need a fix for 0.15.0.
1.0 works fine. See the test script:
https://gist.github.com/codope/dfb87d35112ddbb7f207ad3f52320071
was:
Based on our analysis, drop partition support is broken in 0.15.0 for multiple
partition fields.
For nested fields, a field with the same name but a different path is swapped
with the partition value.
For the timestamp issue, the field gets replaced with the partition value
instead of the value in the file (for example:
{{"timestamp_micros_nullable_field":"2025-01-25T00:00:00.000Z"}}).
We are also seeing a regression on drop partition where the dropped partition
is still being read.
The replace commit is not being written correctly in 0.15.0:
{{partitionToReplaceFileIds}} contains a map with an empty list instead of the
file group IDs for the partition.
We need a fix for 0.15.0.
1.0 is yet to be tried; not sure if it is broken.
> Fix timestamp based partitioning and drop partition support with 0.15.0
> -----------------------------------------------------------------------
>
> Key: HUDI-8928
> URL: https://issues.apache.org/jira/browse/HUDI-8928
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: reader-core
> Reporter: sivabalan narayanan
> Assignee: Sagar Sumit
> Priority: Blocker
> Fix For: 0.15.1
>
> Attachments: Screenshot 2025-01-28 at 4.12.49 PM.png
>
>
> Based on our analysis, drop partition support is broken in 0.15.0 for
> multiple partition fields.
>
> For nested fields, a field with the same name but a different path is swapped
> with the partition value.
> For the timestamp issue, the field gets replaced with the partition value
> instead of the value in the file (for example:
> {{"timestamp_micros_nullable_field":"2025-01-25T00:00:00.000Z"}}).
> We are also seeing a regression on drop partition where the dropped partition
> is still being read.
> The replace commit is not being written correctly in 0.15.0:
> {{partitionToReplaceFileIds}} contains a map with an empty list instead of
> the file group IDs for the partition.
>
> We need a fix for 0.15.0.
> 1.0 works fine. See the test script:
> https://gist.github.com/codope/dfb87d35112ddbb7f207ad3f52320071
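
The empty-list symptom described for {{partitionToReplaceFileIds}} could be
checked with something like the sketch below. It is only an illustration and
assumes the replace commit metadata is available as a JSON string with a
top-level partitionToReplaceFileIds map, as named in the description. The
helper name, the partition paths, and the file group IDs are made up for the
example and do not come from the issue or the test script.

```python
import json
from typing import List


def partitions_with_empty_replacements(replace_commit_json: str) -> List[str]:
    """Return partitions whose list of replaced file group IDs is empty.

    Assumes (hypothetically) that the metadata is a JSON object with a
    top-level "partitionToReplaceFileIds" map of partition path -> list of IDs.
    """
    metadata = json.loads(replace_commit_json)
    replaced = metadata.get("partitionToReplaceFileIds") or {}
    # An empty or missing list means no file groups were marked as replaced.
    return sorted(partition for partition, ids in replaced.items() if not ids)


# Hypothetical sample metadata: one broken entry, one healthy entry.
sample = json.dumps({
    "partitionToReplaceFileIds": {
        "2025/01/25": [],
        "2025/01/26": ["fg-1", "fg-2"],
    }
})

print(partitions_with_empty_replacements(sample))
```

A partition reported by this helper would match the behaviour described above:
the drop is recorded, but no file groups are marked as replaced, so readers
may still pick up the partition's data.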