yugan95 opened a new pull request, #8211:
URL: https://github.com/apache/paimon/pull/8211

     ### Purpose
   
     `readAllDeletionVectors(IndexFileMeta)` reads deletion vectors 
sequentially without seeking to the offset recorded in each 
`DeletionVectorMeta`, assuming the `dvRanges` iteration order always matches 
the physical storage order in the index file.
   
     When this assumption does not hold (e.g. after compaction merges `dvRanges` 
from multiple data files into a single index file), the stream reads from the 
wrong positions and fails with:
   
   ```
     java.lang.RuntimeException: Size not match, actual size: 6690, expected size: 311612
   ```
   
     The other read methods — `readDeletionVector(Map<String, DeletionFile>)` 
and `readDeletionVector(DeletionFile)` — already seek to the correct offset 
before each read.
   
     #### Changes
   
     Fix `readAllDeletionVectors` by calling 
`inputStream.seek(deletionVectorMeta.offset())` before each read, consistent 
with the existing read paths.
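
For reviewers who want to see the failure mode in isolation, here is a minimal sketch of why the seek matters. It does not use Paimon classes: `DvSeekSketch`, `Meta`, `write`, `readOne`, `readAll`, and `reversedRanges` are hypothetical names, and the `[4-byte length][payload]` layout in a `ByteBuffer` is a simplified stand-in for the real index file format. The only point is that iterating the ranges in an order different from the physical order breaks a purely sequential read, while seeking to the recorded offset does not.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, not Paimon code: a ByteBuffer plays the role of the index file.
class DvSeekSketch {

    /** Hypothetical analogue of a per-vector meta entry: start offset and payload length. */
    static final class Meta {
        final int offset;
        final int length;

        Meta(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Appends [4-byte length][payload] and returns the offset where the record starts. */
    static int write(ByteBuffer file, byte[] payload) {
        int offset = file.position();
        file.putInt(payload.length).put(payload);
        return offset;
    }

    /** Reads one record at the current position and checks its size against the meta. */
    static byte[] readOne(ByteBuffer file, Meta meta) {
        int actual = file.getInt();
        if (actual != meta.length) {
            throw new RuntimeException(
                    "Size not match, actual size: " + actual + ", expected size: " + meta.length);
        }
        byte[] payload = new byte[actual];
        file.get(payload);
        return payload;
    }

    /** Reads all records in map iteration order; seeks to each offset only if seek is true. */
    static Map<String, String> readAll(ByteBuffer file, Map<String, Meta> ranges, boolean seek) {
        Map<String, String> out = new LinkedHashMap<>();
        file.position(0);
        for (Map.Entry<String, Meta> e : ranges.entrySet()) {
            if (seek) {
                file.position(e.getValue().offset); // the step the sequential read skips
            }
            byte[] payload = readOne(file, e.getValue());
            out.put(e.getKey(), new String(payload, StandardCharsets.UTF_8));
        }
        return out;
    }

    /** Writes two records (a, then b) and returns metas whose iteration order is b, then a. */
    static Map<String, Meta> reversedRanges(ByteBuffer file) {
        Meta a = new Meta(write(file, "AAAA".getBytes(StandardCharsets.UTF_8)), 4);
        Meta b = new Meta(write(file, "BB".getBytes(StandardCharsets.UTF_8)), 2);
        Map<String, Meta> ranges = new LinkedHashMap<>();
        ranges.put("b.parquet", b);
        ranges.put("a.parquet", a);
        return ranges;
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.allocate(64);
        Map<String, Meta> ranges = reversedRanges(file);
        System.out.println(readAll(file, ranges, true)); // seek first: both records read back
        try {
            readAll(file, ranges, false); // sequential: the first read lands on the wrong record
        } catch (RuntimeException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

In this toy layout the sequential path fails on the very first entry, because the length prefix at position 0 belongs to a different record than the one the meta describes. That is the same shape as the size mismatch reported above.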
   
     ### Tests
   
     - `testReadAllDeletionVectorsWithOutOfOrderDvRanges` — writes multiple DVs 
to an index file, then constructs an `IndexFileMeta` with reversed `dvRanges` 
iteration order (simulating the compaction-merged scenario), and verifies 
`readAllDeletionVectors` still reads each DV correctly. Parameterized for both 
bitmap32 and bitmap64.
   
     ### API and Format
   
     N/A — no public API or format changes.
   
     ### Documentation
   
     N/A

