[ 
https://issues.apache.org/jira/browse/HUDI-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17915917#comment-17915917
 ] 

Y Ethan Guo commented on HUDI-8632:
-----------------------------------

Found another issue: HUDI-8896.  Right now, reading partition columns for a 
bootstrapped table (bootstrap skeleton + data file) relies on engine-specific 
handling rather than on the file group reader.  As a result, compaction and 
clustering that use the file group reader to read records from a metadata-only 
bootstrapped table can encounter nulls in the partition columns, so the new 
base file is written with nulls in those columns.
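
To illustrate the failure mode described above, here is a minimal, 
language-agnostic sketch in Python.  It is not Hudi code and does not reflect 
the actual fix tracked in HUDI-8896.  The function names, the dict-based record 
shape, and the assumption of Hive-style partition paths (name=value segments) 
are all hypothetical.  The idea is only that a reader which knows the partition 
path could fill in partition columns that come back null from the data file.

```python
from typing import Any, Dict, List


def parse_partition_values(partition_path: str) -> Dict[str, str]:
    """Parse a Hive-style path such as 'dt=2024-01-01/region=us'.

    Hypothetical helper: assumes every non-empty segment is 'name=value'.
    """
    values: Dict[str, str] = {}
    for segment in partition_path.strip("/").split("/"):
        if not segment:
            continue  # tolerate an empty path or doubled slashes
        if "=" not in segment:
            raise ValueError(f"not a Hive-style segment: {segment!r}")
        name, value = segment.split("=", 1)
        values[name] = value
    return values


def fill_partition_columns(
    records: List[Dict[str, Any]], partition_path: str
) -> List[Dict[str, Any]]:
    """Return copies of the records with null or missing partition columns
    replaced by the values parsed from the partition path.

    Values stay as strings here; a real reader would cast them to the
    column types declared in the table schema.
    """
    partition_values = parse_partition_values(partition_path)
    filled = []
    for record in records:
        patched = dict(record)  # do not mutate the caller's record
        for name, value in partition_values.items():
            if patched.get(name) is None:
                patched[name] = value
        filled.append(patched)
    return filled
```

For example, calling fill_partition_columns with records that carry dt=None 
and the path "dt=2024-01-01/region=us" yields records with both dt and region 
populated.  Without a step like this in the shared read path, the nulls would 
be carried into the rewritten base file.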

> Support bootstrap files in file group reader-based compaction
> -------------------------------------------------------------
>
>                 Key: HUDI-8632
>                 URL: https://issues.apache.org/jira/browse/HUDI-8632
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Y Ethan Guo
>            Assignee: Y Ethan Guo
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.0.1
>
>   Original Estimate: 4h
>          Time Spent: 8h
>  Remaining Estimate: 2h
>
> testMetadataBootstrapMORPartitionedInlineCompactionOn fails with file group 
> reader-based compaction (HoodieSparkMergeHandleV2).  Currently, if the 
> compaction plan and operations contain bootstrap files, the compaction goes 
> through the old flow using the regular HoodieMergeHandle.  We should support 
> bootstrap files in file group reader-based compaction in 
> HoodieSparkMergeHandleV2.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
