[ 
https://issues.apache.org/jira/browse/HUDI-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6962:
---------------------------------
    Labels: pull-request-available  (was: )

> Correct the behavior of bulk insert for NB-CC 
> ----------------------------------------------
>
>                 Key: HUDI-6962
>                 URL: https://issues.apache.org/jira/browse/HUDI-6962
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Jing Zhang
>            Assignee: Jing Zhang
>            Priority: Major
>              Labels: pull-request-available
>
> How should we handle the case where one of the multiple writers is a job 
> running a bulk insert operation?
> 1. Generated file group ID: Generate a fixed file group ID, because other 
> jobs use a fixed file group ID suffix instead of a random UUID suffix. The 
> behavior needs to be consistent to prevent later writer jobs from writing 
> records with the same primary key to different file groups.
> 2. Transaction handling: Conflict resolution for bulk insert cannot be 
> deferred to the compaction phase. Because bulk insert writers flush data 
> directly into base files, multiple bulk insert jobs may leave multiple base 
> files in the same bucket.
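
The first point can be illustrated with a minimal sketch. It is not Hudi's implementation: the function names, the ID layout, the bucket count, and the hash choice are all assumptions made for illustration. It only shows why a file group ID derived solely from the bucket number lets independent writers agree on the target file group, while a random UUID suffix does not.

```python
import uuid
import zlib

NUM_BUCKETS = 4  # hypothetical bucket count, for illustration only


def bucket_of(primary_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    # Use a stable hash so every writer maps a given key to the same bucket.
    return zlib.crc32(primary_key.encode("utf-8")) % num_buckets


def fixed_file_group_id(bucket: int) -> str:
    # Hypothetical layout: zero-padded bucket number plus a constant suffix.
    # Any writer computing this for the same bucket gets the same ID.
    return f"{bucket:08d}-0000-0000-0000-000000000000"


def random_file_group_id(bucket: int) -> str:
    # Hypothetical layout: zero-padded bucket number plus a random UUID.
    # Two writers (or two calls) produce different IDs for the same bucket.
    return f"{bucket:08d}-{uuid.uuid4()}"
```

Under these assumptions, two jobs that both receive a record with key `"user-42"` compute the same bucket and therefore the same fixed file group ID, whereas with the random variant each job would create its own file group for that bucket.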



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
