[ https://issues.apache.org/jira/browse/HUDI-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Y Ethan Guo updated HUDI-7101:
------------------------------
Parent: HUDI-9176
Issue Type: Sub-task (was: Improvement)
> File slice instantiation for MDT file groups
> --------------------------------------------
>
> Key: HUDI-7101
> URL: https://issues.apache.org/jira/browse/HUDI-7101
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: metadata
> Reporter: sivabalan narayanan
> Priority: Major
> Fix For: 1.1.0
>
>
> Here is what a typical file group instantiation for an MDT (metadata table)
> partition looks like:
> t10: create a dummy commit with base commit time "0000000".
> This creates a log file containing a dummy delete block.
> Immediately after this, the bulk_insert runs and creates a new file slice,
> but with the same commit time:
> base_file_00000.parquet.
> In theory, the log file and the base file belong to different file slices,
> and when the latest snapshot is read, only the latest base file should be
> read. As of now, however, the log file is also treated as part of the latest
> file slice and is read. Since it contains only a dummy delete block, there
> is no correctness issue here.
>
> Only some code cleanup is required.
>
> This is an issue only with a fresh table.
>
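
To make the description above concrete, here is a minimal sketch of how the dummy log file could end up in the latest file slice. It assumes a simplified model in which the files of one file group are grouped into slices purely by base instant time. The names StoredFile, FileSlice, build_file_slices and latest_slice, and the log file name, are hypothetical and are not Hudi's actual classes or file naming.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StoredFile:
    name: str
    base_instant: str   # instant time the file is associated with
    is_base_file: bool  # True for a base (parquet) file, False for a log file


@dataclass
class FileSlice:
    base_instant: str
    base_file: Optional[StoredFile] = None
    log_files: List[StoredFile] = field(default_factory=list)


def build_file_slices(files: List[StoredFile]) -> Dict[str, FileSlice]:
    """Group the files of one file group into slices keyed by base instant time."""
    slices: Dict[str, FileSlice] = {}
    for f in files:
        file_slice = slices.setdefault(f.base_instant, FileSlice(f.base_instant))
        if f.is_base_file:
            file_slice.base_file = f
        else:
            file_slice.log_files.append(f)
    return slices


def latest_slice(slices: Dict[str, FileSlice]) -> FileSlice:
    """Return the slice with the greatest base instant time."""
    return slices[max(slices)]


# Fresh table: the dummy commit's log file and the bulk_insert base file
# share the same instant time (the log file name is made up for illustration).
files = [
    StoredFile("dummy_delete_block.log", "0000000", is_base_file=False),
    StoredFile("base_file_00000.parquet", "0000000", is_base_file=True),
]

slices = build_file_slices(files)
latest = latest_slice(slices)
files_read = [latest.base_file.name] + [f.name for f in latest.log_files]
print(files_read)  # ['base_file_00000.parquet', 'dummy_delete_block.log']
```

Under this assumption both files fall into a single slice, so a snapshot read picks up the log file along with the base file. Because the log file holds only a dummy delete block, merging it changes nothing, which matches the "no correctness issue" note in the description.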
--
This message was sent by Atlassian Jira
(v8.20.10#820010)