yugan95 opened a new pull request, #8211:
URL: https://github.com/apache/paimon/pull/8211
### Purpose
`readAllDeletionVectors(IndexFileMeta)` reads deletion vectors
sequentially without seeking to the offset recorded in each
`DeletionVectorMeta`, assuming the `dvRanges` iteration order always matches
the physical storage order in the index file.
When this assumption does not hold (e.g. after compaction merges `dvRanges`
from multiple data files into a single index file), the stream reads from the
wrong positions and fails with:
```
java.lang.RuntimeException: Size not match, actual size: 6690, expected size: 311612
```
The other read methods — `readDeletionVector(Map<String, DeletionFile>)`
and `readDeletionVector(DeletionFile)` — already seek to the correct offset
before each read.
#### Changes
Fix by calling `inputStream.seek(deletionVectorMeta.offset())` before each
read, consistent with the existing read paths.
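
The failure mode and the fix can be reproduced in isolation. The sketch below is a simplified illustration and does not use Paimon's classes: `Range`, `sampleFile`, `readSequential`, `readWithSeek`, and the size-prefixed entry layout are all hypothetical stand-ins, and `ByteBuffer.position(...)` plays the role of `inputStream.seek(...)`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: none of these names are Paimon APIs.
class SeekBeforeReadSketch {

    /** Hypothetical stand-in for the offset and length recorded per deletion vector. */
    static final class Range {
        final int offset;
        final int length;

        Range(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Assumed layout: each entry is a 4-byte size prefix followed by its payload. */
    static byte[] sampleFile() {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putInt(3).put("AAA".getBytes(StandardCharsets.UTF_8));   // entry "a" starts at offset 0
        buf.putInt(5).put("BBBBB".getBytes(StandardCharsets.UTF_8)); // entry "b" starts at offset 7
        return buf.array();
    }

    /** Ranges whose iteration order is the reverse of the physical order. */
    static Map<String, Range> reversedRanges() {
        Map<String, Range> ranges = new LinkedHashMap<>();
        ranges.put("b", new Range(7, 5));
        ranges.put("a", new Range(0, 3));
        return ranges;
    }

    /** Reads the size prefix at the current position, validates it, then reads the payload. */
    static String readEntry(ByteBuffer in, int expectedLength) {
        int actual = in.getInt();
        if (actual != expectedLength) {
            throw new RuntimeException(
                    "Size not match, actual size: " + actual + ", expected size: " + expectedLength);
        }
        byte[] payload = new byte[actual];
        in.get(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    /** Buggy variant: trusts that iteration order equals physical order. */
    static Map<String, String> readSequential(byte[] file, Map<String, Range> ranges) {
        ByteBuffer in = ByteBuffer.wrap(file);
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Range> e : ranges.entrySet()) {
            out.put(e.getKey(), readEntry(in, e.getValue().length));
        }
        return out;
    }

    /** Fixed variant: jumps to the recorded offset before every read. */
    static Map<String, String> readWithSeek(byte[] file, Map<String, Range> ranges) {
        ByteBuffer in = ByteBuffer.wrap(file);
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Range> e : ranges.entrySet()) {
            in.position(e.getValue().offset); // the "seek"
            out.put(e.getKey(), readEntry(in, e.getValue().length));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] file = sampleFile();
        Map<String, Range> ranges = reversedRanges();
        System.out.println(readWithSeek(file, ranges));
        try {
            readSequential(file, ranges);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch, the sequential variant fails on the first entry because it reads the size prefix of a different entry than the one it expects, while the seeking variant returns the right payload for every key regardless of iteration order.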
### Tests
- `testReadAllDeletionVectorsWithOutOfOrderDvRanges` — writes multiple DVs
to an index file, then constructs an `IndexFileMeta` with reversed `dvRanges`
iteration order (simulating the compaction-merged scenario), and verifies
`readAllDeletionVectors` still reads each DV correctly. Parameterized for both
bitmap32 and bitmap64.
### API and Format
N/A — no public API or format changes.
### Documentation
N/A