prashantwason commented on code in PR #8604:
URL: https://github.com/apache/hudi/pull/8604#discussion_r1213467835
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java:
##########
@@ -159,6 +162,13 @@ protected void commit(String instantTime,
Map<MetadataPartitionType, HoodieData<
compactIfNecessary(writeClient, instantTime);
}
+ // It is possible that the given instantTime already exists in metadata table,
Review Comment:
@codope Can you please explain this case?
Initialization of a new partition should use a unique timestamp (with a
suffix) so that it does not conflict with any existing deltacommit.
If two commits in the dataset are attached to the same deltacommit, it may
cause issues with log block reading and with rollback/restore functionality.
Please also check https://github.com/apache/hudi/pull/8684, where enabling a
new partition has been changed to:
1. Use bulkInsert for initial commit
2. Always use a unique timestamp on MDT
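
To make the "unique timestamp with a suffix" idea concrete, here is a rough sketch. It is only an illustration: `UniqueInstantTimeSketch` and `withUniqueSuffix` are hypothetical names, not Hudi APIs, and the three-digit suffix format is an assumption rather than what either PR actually implements.

```java
import java.util.Set;
import java.util.TreeSet;

public class UniqueInstantTimeSketch {

    // Hypothetical helper (not part of Hudi): derive a metadata-table instant
    // time from a dataset instant time by appending the first numeric suffix
    // that does not collide with an existing deltacommit.
    public static String withUniqueSuffix(String instantTime, Set<String> existingDeltaCommits) {
        for (int i = 1; i <= 999; i++) {
            String candidate = instantTime + String.format("%03d", i);
            if (!existingDeltaCommits.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No free suffix for instant " + instantTime);
    }

    public static void main(String[] args) {
        // Assumed state: the suffix 001 is already taken on the metadata table.
        Set<String> existing = new TreeSet<>();
        existing.add("20230601120000000001");

        // prints 20230601120000000002
        System.out.println(withUniqueSuffix("20230601120000000", existing));
    }
}
```

With this kind of scheme each deltacommit on the MDT maps to at most one operation, so a rollback or restore never has to untangle two dataset commits that share one deltacommit.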