TheR1sing3un commented on code in PR #17472:
URL: https://github.com/apache/hudi/pull/17472#discussion_r2591227264
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/versioning/v2/LSMTimelineWriter.java:
##########
@@ -208,16 +209,19 @@ private void createManifestFile(HoodieLSMTimelineManifest manifest, int currentV
int newVersion = currentVersion < 0 ? 1 : currentVersion + 1;
// create manifest file
final StoragePath manifestFilePath = LSMTimeline.getManifestFilePath(newVersion, archivePath);
- metaClient.getStorage().createImmutableFileInPath(manifestFilePath, Option.of(HoodieInstantWriter.convertByteArrayToWriter(content)));
+ // create the manifest file with overwrite semantics, to handle the case like that:
+ // writer_1 creates the manifest file successfully, but fails before updating the version file,
+ // so we need to allow writer_2 to overwrite the manifest file wth the same version number.
+ FileIOUtils.createFileInPath(metaClient.getStorage(), manifestFilePath, Option.of(HoodieInstantWriter.convertByteArrayToWriter(content)));
Review Comment:
Thank you for your detailed explanation. :)
> the corrupt files you mentioned would be cleaned in the follow-up successful write.
However, I don't quite understand this sentence.
1. Suppose the current version is 1.
2. An archiver writes the file manifest_2 but fails before updating the version file from 1 to 2.
3. The next archiver run also tries to create manifest_2, because the latest version is still 1. This is expected.
4. However, a file named manifest_2 already exists on the file system, so the creation fails.
In our current usage we hit this situation often, and so far the only fix has been to manually delete the orphan manifest file and retry.
Does the code need to handle this case? Or, if you have run into it, how did you handle it? Looking forward to your reply. Thanks!
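To make the four steps above concrete, here is a minimal sketch that reproduces them on a local file system with plain `java.nio.file`. It does not use any Hudi classes: `createImmutable` and `createOrOverwrite` are hypothetical stand-ins for the "fail if the file exists" and "overwrite" behaviours discussed in the diff, and the class name, file name, and payloads are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ManifestRetrySketch {

  // Hypothetical stand-in for an immutable create: throws if the file already exists.
  static void createImmutable(Path file, String content) throws IOException {
    Files.write(file, content.getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
  }

  // Hypothetical stand-in for a create with overwrite semantics.
  static void createOrOverwrite(Path file, String content) throws IOException {
    Files.write(file, content.getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
        StandardOpenOption.WRITE);
  }

  // Steps 1-4 with immutable creates: returns true if the retry is blocked by the orphan file.
  static boolean retryBlockedByOrphan(Path archiveDir) throws IOException {
    Path manifest2 = archiveDir.resolve("manifest_2");
    // writer_1 writes manifest_2, then "fails" before moving the version from 1 to 2.
    createImmutable(manifest2, "from writer_1");
    try {
      // writer_2 still sees version 1, so it also targets manifest_2.
      createImmutable(manifest2, "from writer_2");
      return false;
    } catch (FileAlreadyExistsException e) {
      return true;
    }
  }

  // Same scenario, but the retry overwrites: returns the final content of manifest_2.
  static String retryWithOverwrite(Path archiveDir) throws IOException {
    Path manifest2 = archiveDir.resolve("manifest_2");
    createImmutable(manifest2, "from writer_1");
    createOrOverwrite(manifest2, "from writer_2");
    return new String(Files.readAllBytes(manifest2), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    Path dirA = Files.createTempDirectory("archive_a");
    System.out.println("immutable retry blocked: " + retryBlockedByOrphan(dirA));
    Path dirB = Files.createTempDirectory("archive_b");
    System.out.println("content after overwrite retry: " + retryWithOverwrite(dirB));
  }
}
```

With the immutable create, the second writer is stuck until someone removes the orphan file; with overwrite semantics the retry goes through. The sketch only covers a local file system and says nothing about how a particular Hudi storage implementation behaves.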