dracarys09 opened a new pull request, #86:
URL: https://github.com/apache/cassandra-easy-stress/pull/86
Summary of the changes
- Add VectorSearch workload for benchmarking Cassandra 5.0+ vector search
  (ANN) capabilities
- Support both synthetic random vectors and realistic datasets via HDF5
  files (SIFT, GloVe, etc.)
- Implement recall@K calculation with ground truth comparison for
  measuring search quality
- Add configurable similarity functions (COSINE, EUCLIDEAN, DOT_PRODUCT)
  and vector dimensions
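
For readers unfamiliar with the metric, here is a minimal sketch of how recall@K against an exact ground truth can be computed, together with the textbook definitions of the three similarity measures named above. It is not the code from this pull request: the function names (`cosine`, `dot_product`, `euclidean_distance`, `exact_top_k`, `recall_at_k`) and the toy data are hypothetical, and the way Cassandra itself scores or normalizes similarities may differ.

```python
import math


def dot_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm


def euclidean_distance(a, b):
    """Euclidean (L2) distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_top_k(query, dataset, k):
    """Brute-force ground truth: ids of the k vectors nearest to the query.

    `dataset` is a hypothetical dict mapping an id to its vector.
    """
    ranked = sorted(dataset, key=lambda vid: euclidean_distance(query, dataset[vid]))
    return ranked[:k]


def recall_at_k(returned_ids, ground_truth_ids, k):
    """Fraction of the true top-k neighbours present in the returned top-k."""
    truth = set(ground_truth_ids[:k])
    if not truth:
        return 0.0
    return len(set(returned_ids[:k]) & truth) / len(truth)


if __name__ == "__main__":
    # Toy data; a real run would use query results and dataset ground truth.
    dataset = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 2.0], "d": [5.0, 5.0]}
    truth = exact_top_k([0.0, 0.0], dataset, k=3)
    ann_result = ["a", "c", "d"]  # pretend the ANN query missed "b"
    print(recall_at_k(ann_result, truth, k=3))
```

In this toy case the approximate result contains two of the three true nearest neighbours, so recall@3 is 2/3. Datasets such as sift-128-euclidean.hdf5 typically ship precomputed ground-truth neighbours, which would stand in for the brute-force step.
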
Testing
- Run ./gradlew test --tests "org.apache.cassandra.easystress.workloads.VectorSearchTest"
- Verify workload runs against a Cassandra 5.0+ cluster with random vectors
- Test with an HDF5 dataset (e.g., sift-128-euclidean.hdf5) and verify
  recall metrics are logged
- Confirm ktlint passes: ./gradlew ktlintCheck