prashantwason opened a new pull request, #18058:
URL: https://github.com/apache/hudi/pull/18058
### Describe the issue this Pull Request addresses
When the HDFS cluster is overloaded, updates to the hoodie.properties file may
fail silently, leaving zero-byte or corrupted properties files. This PR
addresses this reliability issue by:
1. Using an atomic write pattern (temp file + rename) instead of direct writes
2. Adding enhanced verification to detect empty properties and property
count mismatches
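
For readers unfamiliar with the temp file + rename pattern, here is a minimal sketch of the idea. It is not the code from this PR: it uses `java.nio.file` on a local filesystem rather than Hudi's storage abstraction, and the class name `AtomicPropertiesWriter`, the method `write`, and the `.tmp` suffix value are assumptions made for illustration only.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

// Hypothetical helper, not part of Hudi: illustrates temp file + rename.
class AtomicPropertiesWriter {
  // Assumed value; the PR description does not state what TEMP_SUFFIX is.
  private static final String TEMP_SUFFIX = ".tmp";

  static void write(Path target, Properties props) throws IOException {
    // The temp file lives next to the target so the rename stays on one filesystem.
    Path temp = target.resolveSibling(target.getFileName() + TEMP_SUFFIX);
    try {
      // Step 1: write the full contents to the temp file and close it.
      try (OutputStream out = Files.newOutputStream(temp)) {
        props.store(out, "written via temp file + rename");
      }
      // Step 2: rename onto the target. Readers see either the old file or
      // the complete new one, never a partially written file.
      Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
    } finally {
      // No-op after a successful move; removes the leftover temp file on failure.
      Files.deleteIfExists(temp);
    }
  }
}
```

The point of the pattern is that a failed or interrupted write only ever damages the temp file, so the existing properties file stays intact until the rename succeeds.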
### Summary and Changelog
**Summary:** Improves reliability of hoodie.properties file updates under
HDFS cluster load.
**Changelog:**
- Added `TEMP_SUFFIX` private constant for temp file naming
- Modified `modify()` method to write to a temporary file first, then
atomically rename to the target path
- Enhanced verification logic to check for:
  - Empty properties file (`verifyProps.isEmpty()`)
  - Property count mismatch (`verifyProps.size() != props.size()`)
- Improved error message to include property count comparison for easier
debugging
- Added logging with path information after successful modification
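
The verification checks listed above could look roughly like the following sketch. It reuses the `verifyProps` and `props` names from the changelog, but the class `PropertiesVerifier`, the method `verify`, the use of `java.nio.file`, and the exact error message wording are assumptions, not the PR's actual code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper, not part of Hudi: re-reads a properties file after a
// write and checks it against the properties that were meant to be stored.
class PropertiesVerifier {
  static void verify(Path path, Properties props) throws IOException {
    Properties verifyProps = new Properties();
    try (InputStream in = Files.newInputStream(path)) {
      verifyProps.load(in);
    }
    // An empty result points to a zero-byte file; a size mismatch points to
    // a truncated or otherwise corrupted one.
    if (verifyProps.isEmpty() || verifyProps.size() != props.size()) {
      throw new IOException("Verification failed for " + path
          + ": expected " + props.size() + " properties, found "
          + verifyProps.size());
    }
  }
}
```

Including both counts in the message makes it possible to tell a zero-byte file (found 0) from a partially written one directly from the log.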
### Impact
No public API changes. This is an internal reliability improvement for the
properties file update mechanism.
### Risk Level
low - The changes follow standard atomic write patterns and add additional
safety checks. All existing tests pass (93 tests in TestHoodieTableConfig).
### Documentation Update
none - No new configs or user-facing changes.
### Contributor's checklist
- [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Enough context is provided in the sections above
- [x] Adequate tests were added if applicable