[ https://issues.apache.org/jira/browse/HUDI-8463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Y Ethan Guo reassigned HUDI-8463:
---------------------------------
Assignee: Y Ethan Guo
> Revisit snapshot query planning performance regarding completion time
> ---------------------------------------------------------------------
>
> Key: HUDI-8463
> URL: https://issues.apache.org/jira/browse/HUDI-8463
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Y Ethan Guo
> Assignee: Y Ethan Guo
> Priority: Blocker
> Fix For: 1.0.0
>
> Original Estimate: 20h
> Remaining Estimate: 20h
>
> When a snapshot query is planned, there are cases where the completion time
> must be looked up from the instant time. This lookup can become a performance
> bottleneck, especially when there are a huge number of files and a large
> number of instants to look up across both the archived and active timelines.
> We should see whether this can be improved by storing the completion time of
> each file in the FILES partition of the metadata table (MDT), avoiding the
> expensive lookup every time. Once the completion time of each file is stored
> in the FILES partition, planning only needs to filter based on the
> information from the MDT.
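
The difference between the two planning paths described above can be sketched roughly as follows. This is an illustrative model only, not Hudi code: `FileEntry`, its `completion_time` field, the `timeline` dictionary, and both planning functions are hypothetical names, and the filter rule (keep a file when its completion time is at or before the query time) is a simplifying assumption. Timestamps are assumed to be fixed-width strings, so they compare correctly as text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical stand-in for one file record in the FILES partition."""
    path: str
    instant_time: str                      # instant that wrote the file
    completion_time: Optional[str] = None  # assumed new field; None if not stored


def plan_with_timeline_lookup(files: List[FileEntry],
                              timeline: Dict[str, str],
                              as_of: str) -> List[str]:
    """Resolve each file's completion time through a timeline lookup."""
    selected = []
    for f in files:
        # The dict lookup stands in for the costly active/archived timeline lookup.
        completed = timeline.get(f.instant_time)
        if completed is not None and completed <= as_of:
            selected.append(f.path)
    return selected


def plan_with_stored_completion_time(files: List[FileEntry],
                                     as_of: str) -> List[str]:
    """Filter using only the completion time stored with each file entry."""
    return [f.path for f in files
            if f.completion_time is not None and f.completion_time <= as_of]


# Hypothetical data: instant time -> completion time.
timeline = {"001": "003", "002": "006", "004": "005"}

files_without_ct = [
    FileEntry("f1.parquet", "001"),
    FileEntry("f2.parquet", "002"),
    FileEntry("f3.parquet", "004"),
]
# The same files, with the completion time recorded alongside each entry.
files_with_ct = [
    FileEntry(f.path, f.instant_time, timeline[f.instant_time])
    for f in files_without_ct
]

via_lookup = plan_with_timeline_lookup(files_without_ct, timeline, as_of="005")
via_stored = plan_with_stored_completion_time(files_with_ct, as_of="005")
```

In this sketch both paths select the same files, but the second never consults the timeline during planning, which is the saving the proposal is after. How the completion time would actually be encoded in the FILES partition is left open here.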