prashantwason commented on PR #18058: URL: https://github.com/apache/hudi/pull/18058#issuecomment-3949745347
Thanks @danny0405 for the review! **Regarding `createImmutableFileInPath`:** I looked at this method and it does provide atomic write with temp file + rename, which is exactly what this PR implements manually. However, `createImmutableFileInPath` is designed for creating new immutable files (files that are never rewritten), while this PR deals with modifying an existing hoodie.properties file. The method signature requires `HoodieInstantWriter` for content, which doesn't directly work with the Properties object we're writing here. I could adapt the code to use it, but it would require some refactoring. **Regarding whether this fix is necessary with checksum validation:** You raise a valid point. The checksum validation from PR #13740 and #14148 should indeed catch zero-byte files since they wouldn't have a valid checksum. Let me think about this more carefully: 1. **Zero-byte file detection**: The checksum validation would fail for zero-byte files since they can't contain a valid checksum. ✓ 2. **Atomic write**: The current code writes directly to the file, which could leave a partial/corrupted file if the write fails mid-way. The checksum would catch this, but having atomic writes prevents the partial state entirely. The main value of this PR is: 1. **Defense in depth** - atomic writes + checksum validation 2. **Better error messages** - reporting property count mismatch for easier debugging 3. **Explicit empty check** - failing fast before checksum comparison Would you prefer I: A) Simplify to just the enhanced error messages/logging since checksum covers atomicity B) Keep the atomic write pattern as defense in depth C) Refactor to use `createImmutableFileInPath` by adapting it for Properties files Please let me know your preference and I'll update accordingly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
