prashantwason commented on PR #18058:
URL: https://github.com/apache/hudi/pull/18058#issuecomment-3949745347

   Thanks @danny0405 for the review!
   
   **Regarding `createImmutableFileInPath`:**
   I looked at this method and it does provide atomic write with temp file + 
rename, which is exactly what this PR implements manually. However, 
`createImmutableFileInPath` is designed for creating new immutable files (files 
that are never rewritten), while this PR deals with modifying an existing 
hoodie.properties file.
   
   The method signature requires `HoodieInstantWriter` for content, which 
doesn't directly work with the Properties object we're writing here. I could 
adapt the code to use it, but it would require some refactoring.
   
   **Regarding whether this fix is necessary with checksum validation:**
   You raise a valid point. The checksum validation from PR #13740 and #14148 
should indeed catch zero-byte files since they wouldn't have a valid checksum. 
Let me think about this more carefully:
   
   1. **Zero-byte file detection**: The checksum validation would fail for 
zero-byte files since they can't contain a valid checksum. ✓
   2. **Atomic write**: The current code writes directly to the file, which 
could leave a partial/corrupted file if the write fails mid-way. The checksum 
would catch this, but having atomic writes prevents the partial state entirely.
   
   The main value of this PR is:
   1. **Defense in depth** - atomic writes + checksum validation
   2. **Better error messages** - reporting property count mismatch for easier 
debugging
   3. **Explicit empty check** - failing fast before checksum comparison
   
   Would you prefer I:
   A) Simplify to just the enhanced error messages/logging since checksum 
covers atomicity
   B) Keep the atomic write pattern as defense in depth
   C) Refactor to use `createImmutableFileInPath` by adapting it for Properties 
files
   
   Please let me know your preference and I'll update accordingly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to