TheR1sing3un commented on code in PR #12627:
URL: https://github.com/apache/hudi/pull/12627#discussion_r1914123666


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/SparkBucketIndexPartitioner.java:
##########
@@ -127,10 +131,13 @@ public BucketInfo getBucketInfo(int bucketNumber) {
     if (fileIdOption.isPresent()) {
       return new BucketInfo(BucketType.UPDATE, fileIdOption.get(), partitionPath);
     } else {
-      // Always write into log file instead of base file if using NB-CC
-      BucketType bucketType = isNonBlockingConcurrencyControl ? BucketType.UPDATE : BucketType.INSERT;
       String fileIdPrefix = BucketIdentifier.newBucketFileIdPrefix(bucketId, isNonBlockingConcurrencyControl);
-      return new BucketInfo(bucketType, fileIdPrefix, partitionPath);
+      // Always write into log file instead of base file if using NB-CC
+      if (isNonBlockingConcurrencyControl) {
+        String fileId = FSUtils.createNewFileId(fileIdPrefix, 0);

Review Comment:
   > Can you make sure whether Flink writer has the 0 as suffix.
   
   I am sorry that I did not notice the rules on the Flink side at first. I just checked, and it seems there is no `-0` suffix when writing on the Flink side. I think we should make this engine-agnostic and completely fix the file id generation rules for NB-CC, so that we don't run into problems when stream writes and batch writes are processed together.
   In my next PR, I will sort out and unify the file id rules for all writers in NB-CC mode, and add more tests to verify it.
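   To illustrate the discrepancy being discussed, here is a minimal, self-contained sketch that mimics (rather than imports) the two Hudi helpers involved. The padding width, the fixed tail standing in for the UUID part, and the exact suffix format are assumptions for illustration only, not the real `BucketIdentifier`/`FSUtils` implementations:

   ```java
   public class BucketFileIdSketch {
     // Approximates BucketIdentifier.newBucketFileIdPrefix: a zero-padded
     // bucket id plus a tail; a fixed tail keeps this example deterministic.
     static String newBucketFileIdPrefix(int bucketId) {
       return String.format("%08d", bucketId) + "-fixed-uuid-tail";
     }

     // Approximates FSUtils.createNewFileId: appends a numeric suffix.
     static String createNewFileId(String prefix, int id) {
       return String.format("%s-%d", prefix, id);
     }

     public static void main(String[] args) {
       String prefix = newBucketFileIdPrefix(5);
       // Batch path in this PR: explicit "-0" suffix appended.
       System.out.println(createNewFileId(prefix, 0)); // 00000005-fixed-uuid-tail-0
       // Flink streaming writer (per the comment above): bare prefix, no "-0".
       System.out.println(prefix); // 00000005-fixed-uuid-tail
     }
   }
   ```

   Two writers producing `...-0` and the bare prefix for the same bucket would yield different file ids, which is the mismatch the follow-up PR aims to unify.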



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
