prashantwason opened a new pull request, #18058:
URL: https://github.com/apache/hudi/pull/18058

   ### Describe the issue this Pull Request addresses
   
   When an HDFS cluster is overloaded, updates to the hoodie.properties file may 
fail silently, resulting in zero-byte or corrupted properties files. This PR 
addresses this reliability issue by:
   1. Using an atomic write pattern (temp file + rename) instead of direct writes
   2. Adding enhanced verification to detect an empty properties file or a 
property count mismatch
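
   The temp file + rename pattern in item 1 can be sketched roughly as follows. 
This is a minimal illustration, not the code in this PR: it uses `java.nio.file` 
on a local file system as a stand-in for the HDFS storage layer, the class and 
method names are hypothetical, and the `.tmp` value of `TEMP_SUFFIX` is an 
assumption (the PR names the constant but this description does not show its value).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class AtomicPropertiesWriter {
    // Assumed value; the real constant's value is not shown in the PR description.
    private static final String TEMP_SUFFIX = ".tmp";

    // Writes props to a sibling temp file, then renames it over the target.
    static void writeAtomically(Path target, Properties props) throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + TEMP_SUFFIX);
        try (OutputStream out = Files.newOutputStream(temp)) {
            props.store(out, "written via temp file");
        }
        try {
            // ATOMIC_MOVE maps to rename() on POSIX file systems, which replaces
            // an existing target; other platforms may behave differently.
            Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            // Do not leave a stray temp file behind if the rename fails.
            Files.deleteIfExists(temp);
            throw e;
        }
    }

    // Reads the properties back, e.g. for post-write verification.
    static Properties read(Path path) throws IOException {
        Properties loaded = new Properties();
        try (InputStream in = Files.newInputStream(path)) {
            loaded.load(in);
        }
        return loaded;
    }
}
```

   With this shape, a reader of the target path sees either the old contents or 
the complete new contents, never a partially written file, provided the 
underlying rename is atomic.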
   
   ### Summary and Changelog
   
   **Summary:** Improves reliability of hoodie.properties file updates under 
HDFS cluster load.
   
   **Changelog:**
   - Added `TEMP_SUFFIX` private constant for temp file naming
   - Modified the `modify()` method to write to a temporary file first, then 
atomically rename it to the target path
   - Enhanced verification logic to check for:
     - Empty properties file (`verifyProps.isEmpty()`)
     - Property count mismatch (`verifyProps.size() != props.size()`)
   - Improved the error message to include a property count comparison for 
easier debugging
   - Added logging with path information after successful modification
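
   The two verification checks listed above might look roughly like this. It is 
a sketch under assumptions: only the conditions `verifyProps.isEmpty()` and 
`verifyProps.size() != props.size()` come from the changelog; the class name, 
method name, exception type, and message wording are hypothetical.

```java
import java.io.IOException;
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class PropertiesVerifier {
    // props: what was intended to be written; verifyProps: what was read back.
    static void verify(Properties props, Properties verifyProps) throws IOException {
        if (verifyProps.isEmpty()) {
            throw new IOException("Properties file is empty after update; expected "
                + props.size() + " properties");
        }
        if (verifyProps.size() != props.size()) {
            // Include both counts so the failure is easier to debug.
            throw new IOException("Property count mismatch: expected "
                + props.size() + " but read back " + verifyProps.size());
        }
    }
}
```

   A count comparison is cheap but does not detect altered values, so it 
complements any existing content check rather than replacing it.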
   
   ### Impact
   
   No public API changes. This is an internal reliability improvement for the 
properties file update mechanism.
   
   ### Risk Level
   
   low - The changes follow the standard atomic write pattern and add further 
safety checks. All existing tests pass (93 tests in TestHoodieTableConfig).
   
   ### Documentation Update
   
   none - No new configs or user-facing changes.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   - [x] Adequate tests were added if applicable


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.