[
https://issues.apache.org/jira/browse/HUDI-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-5694:
--------------------------------------
Description:
It looks like we parse all partitions and files from the file system to
initialize the metadata table for a new data table. Immediately afterwards, we
update it anyway with a new delta commit containing exactly the same set of
files. So, we should optimize this code path.
This impacts CTAS (CREATE TABLE AS SELECT) specifically.
was:
It looks like we parse all partitions and files from the file system to
initialize the metadata table for a new data table. In fact, at the end of it,
we discard all of them since they are not committed. So, we should optimize
this code path.
This impacts CTAS specifically.
> Avoid unnecessary file system parsing to initialize a metadata table for a
> new data table
> -----------------------------------------------------------------------------------
>
> Key: HUDI-5694
> URL: https://issues.apache.org/jira/browse/HUDI-5694
> Project: Apache Hudi
> Issue Type: Improvement
> Components: metadata
> Reporter: sivabalan narayanan
> Assignee: sivabalan narayanan
> Priority: Blocker
> Fix For: 0.13.0
>
>
> It looks like we parse all partitions and files from the file system to
> initialize the metadata table for a new data table. Immediately afterwards,
> we update it anyway with a new delta commit containing exactly the same set
> of files. So, we should optimize this code path.
> This impacts CTAS (CREATE TABLE AS SELECT) specifically.
>
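To make the proposed optimization concrete, here is a minimal sketch of the idea described above: skip the file system listing when the data table is brand new, since the first delta commit will record the same files anyway. This is only an illustration under stated assumptions. The class `MetadataBootstrapSketch`, the method `filesForInitialization`, and both of its parameters are hypothetical names and are not Hudi APIs; the actual change in Hudi may be structured differently.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from the Hudi code base.
final class MetadataBootstrapSketch {

    private MetadataBootstrapSketch() {}

    /**
     * Returns the files used to seed the metadata table.
     *
     * @param hasCompletedCommits whether the data table already has at least
     *                            one completed commit (assumed to be known cheaply)
     * @param fileSystemLister    the expensive listing of all partitions and
     *                            files; invoked only when it is actually needed
     */
    static List<String> filesForInitialization(boolean hasCompletedCommits,
                                               Supplier<List<String>> fileSystemLister) {
        if (!hasCompletedCommits) {
            // New data table (for example, CTAS): nothing is committed yet, so
            // skip the listing and let the first delta commit record the files.
            return List.of();
        }
        // Existing table: fall back to the full file system listing.
        return fileSystemLister.get();
    }
}
```

The listing is passed as a `Supplier` so that the expensive call is deferred and never runs for a new table, which is the cost this issue aims to avoid.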