JingsongLi opened a new pull request, #25: URL: https://github.com/apache/paimon-vector-index/pull/25
## Summary

Add a Paimon-only ANN-style benchmark for the core vector indexes. The benchmark builds IVF-PQ, IVF-HNSW-FLAT, and IVF-HNSW-SQ indexes on the same synthetic clustered dataset and reports build time, reader open/load time, first-query latency, batch query throughput, and serialized index size.

## Changes

- Add `core/benches/ann_bench.rs` with configurable synthetic data generation and CSV output.
- Register the new benchmark in `core/Cargo.toml`.
- Document how to run the benchmark and configure dataset/index parameters in `README.md`.

## Testing

- `cargo fmt --check`
- `cargo check -p paimon-vindex-core --bench ann_bench`
- `ANN_N=512 ANN_NQ=8 ANN_D=32 ANN_K=5 ANN_NLIST=16 ANN_NPROBE=4 ANN_PQ_M=4 ANN_HNSW_EF_CONSTRUCTION=32 ANN_HNSW_EF_SEARCH=24 cargo bench -p paimon-vindex-core --bench ann_bench -- --nocapture`
- `cargo test -p paimon-vindex-core`

## Notes

The benchmark intentionally avoids external vector database dependencies and reports `disk_scope=index_bytes` for serialized Paimon index files.
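For readers who want a feel for the environment-driven configuration and the synthetic clustered data described above, here is a minimal sketch. It is not the code in `core/benches/ann_bench.rs`. The helper names (`env_usize`, `XorShift64`, `clustered_dataset`), the fallback defaults, and the centroid/noise scheme are all assumptions made for illustration; only the `ANN_N`, `ANN_D`, and `ANN_NLIST` variable names come from the test command in this PR.

```rust
use std::env;

/// Hypothetical helper: read a `usize` from an environment variable,
/// falling back to `default` when the variable is unset or unparsable.
fn env_usize(name: &str, default: usize) -> usize {
    env::var(name)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

/// Small deterministic xorshift64 generator, used here only so the sketch
/// needs no external crates. The real benchmark may use something else.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Uniform `f32` in [0, 1), built from the top 24 bits.
    fn next_f32(&mut self) -> f32 {
        (self.next_u64() >> 40) as f32 / (1u64 << 24) as f32
    }
}

/// Generate `n` row-major vectors of dimension `d`, each placed near one of
/// `clusters` random centroids (assumed scheme: centroid + uniform noise).
fn clustered_dataset(n: usize, d: usize, clusters: usize, seed: u64) -> Vec<f32> {
    // xorshift must not start from a zero state.
    let mut rng = XorShift64(seed.max(1));
    let centroids: Vec<f32> = (0..clusters * d)
        .map(|_| rng.next_f32() * 10.0)
        .collect();

    let mut data = Vec::with_capacity(n * d);
    for i in 0..n {
        let c = i % clusters; // round-robin cluster assignment
        for j in 0..d {
            let noise = rng.next_f32() - 0.5; // noise in [-0.5, 0.5)
            data.push(centroids[c * d + j] + noise);
        }
    }
    data
}

/// Build a dataset from the environment, using hypothetical defaults.
fn dataset_from_env() -> (usize, usize, Vec<f32>) {
    let n = env_usize("ANN_N", 512);
    let d = env_usize("ANN_D", 32);
    let nlist = env_usize("ANN_NLIST", 16).max(1);
    (n, d, clustered_dataset(n, d, nlist, 42))
}
```

Calling `dataset_from_env()` returns the vector count, the dimension, and a flat buffer of `n * d` floats. The fixed seed keeps runs repeatable, which matters when comparing IVF-PQ, IVF-HNSW-FLAT, and IVF-HNSW-SQ on the same data.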
