JingsongLi opened a new pull request, #6:
URL: https://github.com/apache/paimon-vector-index/pull/6
## Summary
- **io**: Binary serialization format for IVF-PQ indexes — delta-varint ID
compression (typically 3-5x smaller than raw int64) + transposed code layout
for cache-friendly SIMD scan
- **Reader-based search**: `search_with_reader`,
`search_with_reader_filter`, `search_batch_reader` — lazy-loading reader
(header-only open), reads inverted lists on demand, parallel per-list scanning
- **Batch search optimization**: `search_batch_reader` groups queries by
probed list, reading each unique list once instead of nq*nprobe I/O ops
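
The grouping idea behind `search_batch_reader` can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `group_queries_by_list` and the `probes` input (one vector of probed list IDs per query) are hypothetical names, and it assumes each query probes a given list at most once.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: invert per-query probe lists into a map from
/// inverted-list ID to the indices of the queries that probe that list.
fn group_queries_by_list(probes: &[Vec<usize>]) -> BTreeMap<usize, Vec<usize>> {
    let mut groups: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for (query_idx, lists) in probes.iter().enumerate() {
        for &list_id in lists {
            groups.entry(list_id).or_default().push(query_idx);
        }
    }
    groups
}

fn main() {
    // Three queries with nprobe = 2 (illustrative values only).
    let probes = vec![vec![4, 7], vec![7, 2], vec![4, 7]];
    let groups = group_queries_by_list(&probes);

    // Without grouping, every (query, list) pair would trigger its own read.
    let naive_reads: usize = probes.iter().map(Vec::len).sum();

    for (list_id, queries) in &groups {
        // A real reader would load the list once here and scan it for each query.
        println!("list {list_id}: read once, scanned for queries {queries:?}");
    }
    println!("{} list reads instead of {}", groups.len(), naive_reads);
}
```

With this shape, the number of reads is bounded by the number of distinct probed lists rather than by nq*nprobe, which is where the I/O saving comes from when queries overlap.
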

~1260 lines of new code, all clippy-clean and tested (43 tests pass, 7 new).
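
For readers unfamiliar with the delta-varint ID compression mentioned in the **io** item, here is a minimal sketch of the general technique. It is not the on-disk format from this PR: the function names, the leading count prefix, and the LEB128-style byte layout are assumptions made for illustration, and it assumes IDs are sorted ascending and non-negative.

```rust
/// Append `value` as an unsigned LEB128-style varint (7 payload bits per byte).
fn write_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push(((value as u8) & 0x7f) | 0x80); // set continuation bit
        value >>= 7;
    }
    out.push(value as u8);
}

/// Read one varint at `*pos`, advancing `*pos`.
/// Returns `None` on truncated or overlong input.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None;
        }
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

/// Encode sorted IDs as a count followed by varint-encoded gaps.
fn encode_ids(ids: &[i64]) -> Vec<u8> {
    let mut out = Vec::new();
    write_varint(ids.len() as u64, &mut out);
    let mut prev = 0i64;
    for &id in ids {
        // Small gaps between neighbouring IDs encode in one or two bytes.
        write_varint(id.wrapping_sub(prev) as u64, &mut out);
        prev = id;
    }
    out
}

/// Inverse of `encode_ids`: rebuild IDs by accumulating the gaps.
fn decode_ids(buf: &[u8]) -> Option<Vec<i64>> {
    let mut pos = 0;
    let count = read_varint(buf, &mut pos)? as usize;
    let mut ids = Vec::with_capacity(count.min(buf.len()));
    let mut prev = 0i64;
    for _ in 0..count {
        prev = prev.wrapping_add(read_varint(buf, &mut pos)? as i64);
        ids.push(prev);
    }
    Some(ids)
}

fn main() {
    let ids: Vec<i64> = vec![1_000, 1_003, 1_010, 1_500, 70_000];
    let bytes = encode_ids(&ids);
    assert_eq!(decode_ids(&bytes), Some(ids.clone()));
    println!("{} bytes encoded vs {} bytes raw", bytes.len(), ids.len() * 8);
}
```

The saving depends on how dense the IDs are: tightly clustered sorted IDs yield small gaps and short varints, while sparse or unsorted IDs compress much less.
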
## Test plan
- [x] `cargo fmt --all -- --check` passes
- [x] `cargo clippy --all-targets --workspace -- -D warnings` passes
- [x] `cargo test --all` passes (43 tests, 7 new)
  - io: `test_varint_roundtrip`, `test_delta_varint_ids_roundtrip`
  - io: `test_write_read_roundtrip_delta_ids` (sorted IDs, delta encoding)
  - io: `test_space_savings` (100K vectors, verifies >10% compression)
  - ivfpq: `test_write_read_search` (serialize → deserialize → search)
  - ivfpq: `test_write_read_search_with_filter` (reader + ID filter)
  - ivfpq: `test_big_batch_search` (batch reader, 20 queries)
🤖 Generated with [Claude Code](https://claude.com/claude-code)