nada-attia opened a new pull request, #18143:
URL: https://github.com/apache/hudi/pull/18143

   ### Describe the issue this Pull Request addresses
   
   This PR adds a new API to fetch log files created on or before a given 
instant time, which is useful for metadata table consistency checks and log 
file validation.
   
   ### Summary and Changelog
   
   Added the `getAllLogFilesWithMaxCommit` API to `LogReaderUtils`, which:
   - Filters the timeline to commits modified on or before the max 
instant
   - Gets all file slices in the given partitions as of the max commit instant
   - Gets all log files from each file slice
   - For each log file, returns the list of commit instant times for blocks 
created on or before the max commit instant time
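
The four steps above can be sketched in plain Java. This is a minimal illustration, not the code in this PR: the class `MaxCommitLogFileSketch`, its method names, and the map-based inputs are all hypothetical stand-ins for the Hudi timeline and file-slice types. It assumes instant times are equal-length timestamp strings that order lexicographically, and it assumes (the description does not say) that log files with no qualifying blocks are left out of the result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from Hudi.
final class MaxCommitLogFileSketch {

  // Step 1: keep only the completed instants at or before maxInstant.
  // Instant times are assumed to be equal-length timestamp strings,
  // so String.compareTo gives chronological order.
  static Set<String> filterTimeline(Set<String> completedInstants, String maxInstant) {
    Set<String> filtered = new TreeSet<>();
    for (String instant : completedInstants) {
      if (instant.compareTo(maxInstant) <= 0) {
        filtered.add(instant);
      }
    }
    return filtered;
  }

  // Steps 2-4: for each log file (keyed by path), keep the instant times of
  // the blocks that belong to the filtered timeline.
  // Assumption: files with no qualifying blocks are omitted from the result.
  static Map<String, List<String>> logFilesWithMaxCommit(
      Map<String, List<String>> blockInstantsByLogFile,
      Set<String> completedInstants,
      String maxInstant) {
    Set<String> timeline = filterTimeline(completedInstants, maxInstant);
    Map<String, List<String>> result = new LinkedHashMap<>();
    for (Map.Entry<String, List<String>> entry : blockInstantsByLogFile.entrySet()) {
      List<String> kept = new ArrayList<>();
      for (String blockInstant : entry.getValue()) {
        if (timeline.contains(blockInstant)) {
          kept.add(blockInstant);
        }
      }
      if (!kept.isEmpty()) {
        result.put(entry.getKey(), kept);
      }
    }
    return result;
  }
}
```

Calling `logFilesWithMaxCommit` with a max instant that falls between two commits returns only the blocks written by the earlier commits, which is the property a metadata table consistency check would rely on.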
   
   This API uses the modern Storage API instead of the deprecated FileSystem 
API.
   
   Also added supporting methods for encoding/decoding record positions using 
Roaring64NavigableMap:
   - `encodePositions(Set<Long>)` - Encodes a set of positions
   - `encodePositions(Roaring64NavigableMap)` - Encodes a bitmap
   - `decodeRecordPositionsHeader(String)` - Decodes positions from header
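
To illustrate the encode/decode round trip, here is a hedged sketch that runs on the JDK alone. It deliberately does not use `Roaring64NavigableMap`: a sorted list of longs written into a `ByteBuffer` stands in for the compressed bitmap. Base64 as the string encoding is also an assumption, since the description only says the header is a `String`. The class `PositionHeaderSketch` and both method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in: a count followed by sorted longs replaces the
// Roaring64NavigableMap serialization used by the real methods.
final class PositionHeaderSketch {

  // Encodes record positions as a Base64 string suitable for a text header.
  static String encodePositions(Set<Long> positions) {
    TreeSet<Long> sorted = new TreeSet<>(positions);
    ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + Long.BYTES * sorted.size());
    buffer.putInt(sorted.size());
    for (long position : sorted) {
      buffer.putLong(position);
    }
    return Base64.getEncoder().encodeToString(buffer.array());
  }

  // Decodes a header produced by encodePositions back into the position set.
  static Set<Long> decodePositions(String header) {
    ByteBuffer buffer = ByteBuffer.wrap(Base64.getDecoder().decode(header));
    int count = buffer.getInt();
    Set<Long> positions = new TreeSet<>();
    for (int i = 0; i < count; i++) {
      positions.add(buffer.getLong());
    }
    return positions;
  }
}
```

A compressed bitmap is far more compact than this stand-in when positions are dense or clustered, which is presumably why the PR uses one; the sketch only shows the shape of the API.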
   
   ### Impact
   
   - **New API**: `getAllLogFilesWithMaxCommit` fetches the log files 
created on or before a given instant time
   - **No breaking changes**: Existing methods in LogReaderUtils are preserved
   
   ### Risk Level
   
   **low**
   
   This is a new API addition with no changes to existing functionality. The 
implementation follows existing patterns in the codebase.
   
   ### Documentation Update
   
   none
   
   The new API follows existing patterns and includes comprehensive JavaDoc 
comments.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   - [x] Adequate tests were added if applicable
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)
   
   Co-Authored-By: Claude <[email protected]>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.