[ 
https://issues.apache.org/jira/browse/HUDI-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Y Ethan Guo updated HUDI-7101:
------------------------------
        Parent: HUDI-9176
    Issue Type: Sub-task  (was: Improvement)

> File slice instantiation for MDT file groups
> --------------------------------------------
>
>                 Key: HUDI-7101
>                 URL: https://issues.apache.org/jira/browse/HUDI-7101
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: metadata
>            Reporter: sivabalan narayanan
>            Priority: Major
>             Fix For: 1.1.0
>
>
> Here is what a typical file group instantiation for an MDT partition looks like:
> t10: create a dummy commit with base commit time "0000000". 
> This creates a log file with a dummy delete block. 
> Immediately after this, the bulk_insert runs and creates a new 
> file slice with the same commit time, containing the base file 
> base_file_00000.parquet. 
> In theory, the two belong to different file slices, and when the latest 
> snapshot is read, only the latest base file should be read. As of now, 
> however, the log file is also treated as part of the latest file slice and 
> is read. Since it holds only a dummy delete log block, there is no 
> correctness issue here. 
>  
> Only some code cleanup is required. 
>  
> This is an issue only with a fresh table. 
>  
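
The behavior described above can be illustrated with a minimal sketch. It is
a simplified model and not Hudi code: StoredFile, group_into_slices,
latest_slice, read_snapshot and the log file name "dummy_delete.log" are all
hypothetical, and the sketch assumes that files are grouped into slices purely
by base instant time and that the dummy delete block carries no keys.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical, simplified model of one MDT file group (not Hudi classes).
@dataclass(frozen=True)
class StoredFile:
    name: str
    base_instant: str  # base commit time associated with the file
    kind: str          # "base" or "log"


def group_into_slices(files):
    """Group files into slices keyed only by base instant time (assumption)."""
    slices = defaultdict(list)
    for f in files:
        slices[f.base_instant].append(f)
    return dict(slices)


def latest_slice(files):
    """Return the files of the slice with the greatest base instant time."""
    slices = group_into_slices(files)
    return slices[max(slices)]


def read_snapshot(base_records, deleted_keys):
    """Apply a delete block to base records; an empty block changes nothing."""
    return {k: v for k, v in base_records.items() if k not in deleted_keys}


# Fresh table: the dummy commit and the bulk_insert share "0000000".
files = [
    StoredFile("dummy_delete.log", "0000000", "log"),          # hypothetical name
    StoredFile("base_file_00000.parquet", "0000000", "base"),
]

latest = latest_slice(files)
kinds_in_latest = sorted(f.kind for f in latest)
print(kinds_in_latest)  # the log file lands in the same slice as the base file

base_records = {"key1": "value1"}
dummy_delete_keys = set()  # assumed: the dummy delete block has no keys
snapshot = read_snapshot(base_records, dummy_delete_keys)
print(snapshot == base_records)  # reading the extra log file is harmless
```

Under these assumptions the log file and the base file end up in one slice
because they share the same instant, and the extra log read leaves the
snapshot unchanged, which matches the "no correctness issue" observation.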



--
This message was sent by Atlassian Jira
(v8.20.10#820010)