[
https://issues.apache.org/jira/browse/HUDI-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-6962:
---------------------------------
Labels: pull-request-available (was: )
> Correct the behavior of bulk insert for NB-CC
> ----------------------------------------------
>
> Key: HUDI-6962
> URL: https://issues.apache.org/jira/browse/HUDI-6962
> Project: Apache Hudi
> Issue Type: New Feature
> Reporter: Jing Zhang
> Assignee: Jing Zhang
> Priority: Major
> Labels: pull-request-available
>
> How should we handle the case where one of the multiple writers is a job
> using the bulk insert operation?
> 1. Generated file group ID: Generate a fixed file group ID, because other jobs
> use a fixed file group ID suffix instead of a random UUID suffix. The
> behavior needs to be consistent to prevent later writer jobs from writing
> records with the same primary key to different file groups.
> 2. Transaction handling: Conflict resolution for bulk insert cannot be
> deferred to the compaction phase. Because bulk insert writers flush data into
> base files, multiple bulk insert jobs might produce multiple base files in
> the same bucket.
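
To illustrate point 1, here is a minimal sketch of the difference between a fixed and a random file group ID suffix. It is not Hudi's actual implementation: the class `FileGroupIds`, its methods, the 8-digit bucket prefix, and the all-zero suffix are assumptions made for the example.

```java
import java.util.Locale;
import java.util.UUID;

// Hypothetical helper for illustration only; not part of Hudi's API.
final class FileGroupIds {
    private FileGroupIds() {}

    // Assumed fixed suffix that every writer job would share.
    static final String FIXED_SUFFIX = "-0000-0000-0000-000000000000";

    // Renders the bucket number as a zero-padded 8-digit prefix (assumed layout).
    static String bucketPrefix(int bucketId) {
        return String.format(Locale.ROOT, "%08d", bucketId);
    }

    // Deterministic ID: the same bucket maps to the same file group ID in
    // every job, so records with the same key land in the same file group.
    static String fixedFileGroupId(int bucketId) {
        return bucketPrefix(bucketId) + FIXED_SUFFIX;
    }

    // Random-suffix variant: two jobs writing to the same bucket would
    // almost certainly create two different file groups.
    static String randomFileGroupId(int bucketId) {
        // UUID strings are 36 characters; dropping the first 8 keeps "-xxxx-xxxx-xxxx-xxxxxxxxxxxx".
        return bucketPrefix(bucketId) + UUID.randomUUID().toString().substring(8);
    }
}
```

With this sketch, two independent jobs calling `fixedFileGroupId(3)` agree on the same ID, whereas two calls to `randomFileGroupId(3)` share only the bucket prefix. That disagreement is the inconsistency a bulk insert job would introduce if it kept the random suffix.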
--
This message was sent by Atlassian Jira
(v8.20.10#820010)