[PR] feat(metadata): Fix partitioned RLI lookup to use full record key [hudi]

via GitHub Tue, 16 Jun 2026 19:18:05 -0700


cshuo opened a new pull request, #19026:
URL: https://github.com/apache/hudi/pull/19026


   ### Describe the issue this Pull Request addresses
   
   Partitioned record-level index (RLI) lookups in metadata should still match 
against the full record key within the selected data partition. The previous 
lookup path narrowed to a single shard and then filtered records by prefix 
matching, so a lookup key that is only a prefix of an existing record key 
could incorrectly match that record.
   
   This PR fixes the partitioned RLI metadata read path to use full-key 
matching and adds a Spark functional test that verifies valid keys are returned 
while non-existent prefix-only keys do not match.
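
   To illustrate the difference between the two matching modes, here is a 
minimal, self-contained sketch. It is not the Hudi implementation: the class 
`FullKeyLookupSketch`, the in-memory `SHARD` map, and the sample keys are all 
hypothetical stand-ins for a single record-index shard.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names exist in Hudi.
class FullKeyLookupSketch {
    // Stand-in for one record-index shard: record key -> file id (sorted by key).
    static final Map<String, String> SHARD = new TreeMap<>(Map.of(
        "key_100", "file-a",
        "key_101", "file-b",
        "key_2", "file-c"));

    // Prefix matching: returns every stored key that starts with the lookup key.
    static List<String> prefixLookup(String lookupKey) {
        return SHARD.keySet().stream()
            .filter(k -> k.startsWith(lookupKey))
            .collect(Collectors.toList());
    }

    // Full-key matching: returns the key only if it is stored exactly as given.
    static List<String> fullKeyLookup(String lookupKey) {
        return SHARD.containsKey(lookupKey) ? List.of(lookupKey) : List.of();
    }

    public static void main(String[] args) {
        // "key_10" is not a stored key, but it is a prefix of two stored keys.
        System.out.println(prefixLookup("key_10"));   // [key_100, key_101]
        System.out.println(fullKeyLookup("key_10"));  // []
        System.out.println(fullKeyLookup("key_100")); // [key_100]
    }
}
```

   The negative case in the new test corresponds to the second call above: a 
key that exists only as a prefix of real record keys should return no matches.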
   
   ### Summary and Changelog
   
   - Update `HoodieBackedTableMetadata` to read partitioned record index 
entries with full-key filtering when resolving a single file slice.
   - Add `testPartitionedRecordLevelIndexLookupUsesFullKey` in 
`TestRecordLevelIndex.scala` to cover partitioned RLI reads with 
`GLOBAL_RECORD_LEVEL_INDEX_ENABLE=false` and `RECORD_LEVEL_INDEX_ENABLE=true`.
   - Verify both the positive path for real record keys and the negative path 
for a prefix-only key that should return no matches.
   
   ### Impact
   
   - **Functional impact**: Fixes incorrect prefix-based matches during 
partitioned record-level index lookup in metadata.
   - **Maintainability**: Tightens the lookup contract in the metadata reader 
and documents the expected behavior with a targeted regression test.
   - **Extensibility**: Reduces ambiguity for future changes around partitioned 
RLI sharding and lookup semantics by anchoring behavior in test coverage.
   
   ### Risk Level
   
   Low. The production change is a one-line behavioral fix in the partitioned 
RLI metadata lookup path, and it is covered by a focused functional regression 
test in the Spark datasource test suite.
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Enough context is provided in the sections above
   - [ ] Adequate tests were added if applicable


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
