stream2000 commented on code in PR #9199:
URL: https://github.com/apache/hudi/pull/9199#discussion_r1280497250
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/RDDConsistentBucketBulkInsertPartitioner.java:
##########
@@ -144,9 +152,20 @@ private Map<String, Map<String, Integer>> generateFileIdPfx(Map<String, Consiste
 }
 partitionToFileIdPfxIdxMap.put(identifier.getMetadata().getPartitionPath(), fileIdPfxToIdx);
 }
-
 ValidationUtils.checkState(fileIdPfxList.size() == partitionToIdentifier.values().stream().mapToInt(ConsistentBucketIdentifier::getNumBuckets).sum(),
 "Error state after constructing fileId & idx mapping");
 return partitionToFileIdPfxIdxMap;
 }
+
+ @Override
+ public Option<WriteHandleFactory> getWriteHandleFactory(int idx) {
+ return super.getWriteHandleFactory(idx).map(writeHandleFactory -> new WriteHandleFactory() {
+ @Override
+ public HoodieWriteHandle create(HoodieWriteConfig config, String commitTime, HoodieTable hoodieTable, String partitionPath, String fileIdPrefix, TaskContextSupplier taskContextSupplier) {
+ // Ensure we do not create append handle for consistent hashing bulk_insert, align with `ConsistentBucketBulkInsertDataInternalWriterHelper`
Review Comment:
For reviewers: when we bulk insert twice into a consistent hashing bucket
index table, the second bulk insert has to write logs to the existing file
groups, whereas a normal bloom filter index table always creates new base
files on bulk insert. However, the bulk insert row writer path does not
currently support writing logs, so I added a check here to prevent users from
bulk inserting twice into a consistent hashing bucket index table. After the
first bulk insert, upsert should be used for consistent hashing bucket index
tables.
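
To illustrate the kind of check described above, here is a minimal, self-contained sketch. It assumes the guard works by wrapping the factory and rejecting any append handle it produces; the diff is cut off before the actual check, so this may not match the PR. All types here (`WriteHandle`, `CreateHandle`, `AppendHandle`, `HandleFactory`, `delegateFor`) are hypothetical stand-ins, not Hudi's real classes or signatures.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical, simplified stand-ins; these are NOT Hudi's real classes.
class BulkInsertGuardSketch {

    interface WriteHandle {}

    /** Stand-in for a handle that writes a new base file. */
    static final class CreateHandle implements WriteHandle {}

    /** Stand-in for a handle that appends logs to an existing file group. */
    static final class AppendHandle implements WriteHandle {}

    /** Stand-in for a write handle factory. */
    interface HandleFactory {
        WriteHandle create(String partitionPath, String fileIdPrefix);
    }

    /** Wraps a factory so that any append handle it produces is rejected. */
    static HandleFactory rejectAppendHandles(HandleFactory delegate) {
        return (partitionPath, fileIdPrefix) -> {
            WriteHandle handle = delegate.create(partitionPath, fileIdPrefix);
            if (handle instanceof AppendHandle) {
                // Assumed behavior: fail fast instead of silently writing logs.
                throw new IllegalStateException(
                    "bulk_insert into an existing file group is not supported; use upsert");
            }
            return handle;
        };
    }

    /** Hypothetical delegate: appends if the file group exists, else creates. */
    static HandleFactory delegateFor(Set<String> existingFileGroups) {
        return (partitionPath, fileIdPrefix) ->
            existingFileGroups.contains(fileIdPrefix) ? new AppendHandle() : new CreateHandle();
    }

    public static void main(String[] args) {
        HandleFactory guarded = rejectAppendHandles(delegateFor(Collections.singleton("fg-1")));
        guarded.create("2023/08/01", "fg-2"); // new file group: allowed
        try {
            guarded.create("2023/08/01", "fg-1"); // existing file group: rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the first bulk insert succeeds because every file group is new, while a second bulk insert that targets an existing file group fails with a message that points to upsert.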