[
https://issues.apache.org/jira/browse/HUDI-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sagar Sumit closed HUDI-4453.
-----------------------------
Fix Version/s: 0.12.1
(was: 0.13.0)
Resolution: Fixed
> Support partition pruning for tables Bootstrapped from Source Hive Style
> partitioned tables
> -------------------------------------------------------------------------------------------
>
> Key: HUDI-4453
> URL: https://issues.apache.org/jira/browse/HUDI-4453
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Udit Mehrotra
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.12.1
>
>
> Currently, the *Bootstrap* feature determines the source schema by reading it
> from the source Parquet files:
> [https://github.com/apache/hudi/blob/master/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/bootstrap/ParquetBootstrapMetadataHandler.java#L61]
> This does not account for Parquet tables that are Hive-style partitioned,
> where partition column values are typically encoded in the directory paths
> rather than stored in the data files. As a result, the partition columns are
> missing from the source schema and are not written to the target Hudi table.
> Partition pruning also does not work, because source partitions cannot be
> pruned. We should improve this logic to derive the partition schema from the
> partition paths for Hive-style partitioned tables and to write the partition
> column values correctly to the target Hudi table.
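
For illustration only, here is a minimal Python sketch of the idea described above: recovering partition columns and values from Hive-style partition paths (`column=value` directory segments) and using them to prune partitions. It is not Hudi code. The function names `parse_hive_partition_path`, `partition_columns`, and `prune_partitions` are hypothetical, the sample paths are made up, and the sketch assumes values are percent-encoded in paths and treats every value as a string.

```python
from urllib.parse import unquote


def parse_hive_partition_path(relative_path):
    """Return (column, value) pairs encoded in a Hive-style partition path."""
    pairs = []
    for segment in relative_path.strip("/").split("/"):
        # Only 'column=value' segments carry partition information;
        # anything else (e.g. a data file name) is skipped.
        column, sep, value = segment.partition("=")
        if sep and column:
            # Assumption: special characters in values are percent-encoded.
            pairs.append((column, unquote(value)))
    return pairs


def partition_columns(partition_paths):
    """Collect partition column names, in path order, across all paths."""
    columns = []
    for path in partition_paths:
        for column, _ in parse_hive_partition_path(path):
            if column not in columns:
                columns.append(column)
    return columns


def prune_partitions(partition_paths, column, wanted_value):
    """Keep only the paths whose partition column equals wanted_value."""
    return [
        path for path in partition_paths
        if dict(parse_hive_partition_path(path)).get(column) == wanted_value
    ]


if __name__ == "__main__":
    paths = ["year=2022/month=07", "year=2022/month=08"]  # made-up sample paths
    print(partition_columns(paths))
    print(prune_partitions(paths, "month", "08"))
```

Because the column values come from the paths alone, a filter on a partition column can be evaluated without opening any Parquet file, which is what makes pruning possible in this sketch.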