[
https://issues.apache.org/jira/browse/CASSANDRA-21126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Abhijeet Dubey updated CASSANDRA-21126:
---------------------------------------
Test and Documentation Plan:
Summary of the changes
* Add VectorSearch workload for benchmarking Cassandra 5.0+ vector search
(ANN) capabilities
* Support both synthetic random vectors and realistic datasets via HDF5 files
({{SIFT}}, {{GloVe}}, etc.)
* Implement recall@K calculation with ground truth comparison for measuring
search quality
* Add configurable similarity functions ({{COSINE}}, {{EUCLIDEAN}},
{{DOT_PRODUCT}}) and vector dimensions
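For reference, the three similarity functions named above correspond to the following textbook quantities. This is a minimal Python sketch for illustration only, not the workload's code; the function names are hypothetical, and the values are the raw mathematical ones, which may differ from whatever normalized score the database reports.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Raw dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b, in [-1, 1]."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / norm


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line (L2) distance; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Note that cosine and dot product rank higher values as closer, while Euclidean distance ranks lower values as closer, so the choice of function changes which neighbours count as the ground truth.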
Testing
* Run {{./gradlew test --tests
"org.apache.cassandra.easystress.workloads.VectorSearchTest"}}
* Verify workload runs against a Cassandra 5.0+ cluster with random vectors
* Test with an HDF5 dataset (e.g., sift-128-euclidean.hdf5) and verify recall
metrics are logged
* Confirm ktlint passes: {{./gradlew ktlintCheck}}
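When checking the logged recall metrics by hand, recall@K can be recomputed from the returned row ids and the dataset's ground-truth neighbours. The Python sketch below uses the common definition (fraction of the true top-K neighbours that appear in the returned top-K); the function names and the id-list inputs are assumptions for illustration, and the patch's own calculation may differ in detail.

```python
from typing import Hashable, Sequence


def recall_at_k(
    returned_ids: Sequence[Hashable],
    ground_truth_ids: Sequence[Hashable],
    k: int,
) -> float:
    """Fraction of the true top-k neighbours found in the returned top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(ground_truth_ids[:k])
    if not truth:
        raise ValueError("ground truth is empty")
    hits = len(truth & set(returned_ids[:k]))
    # Dividing by len(truth) equals dividing by k whenever the ground
    # truth holds at least k distinct ids.
    return hits / len(truth)


def mean_recall_at_k(
    all_returned: Sequence[Sequence[Hashable]],
    all_truth: Sequence[Sequence[Hashable]],
    k: int,
) -> float:
    """Average recall@k over a batch of queries."""
    if len(all_returned) != len(all_truth) or not all_returned:
        raise ValueError("need one ground-truth list per query")
    scores = [recall_at_k(r, t, k) for r, t in zip(all_returned, all_truth)]
    return sum(scores) / len(scores)
```

For example, if a query returns ids `[1, 2, 3, 4]` and the ground truth is `[1, 2, 5, 6]`, two of the four true neighbours were found, so recall@4 is 0.5.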
Status: Patch Available (was: In Progress)
> Vector Search support in cassandra-easy-stress
> ----------------------------------------------
>
> Key: CASSANDRA-21126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21126
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Feature/Vector Search
> Reporter: Abhijeet Dubey
> Assignee: Abhijeet Dubey
> Priority: Normal
> Labels: pull-request-available
> Fix For: 5.x
>
>
> The goal is for {{cassandra-easy-stress}} to provide the following:
> # A vector search workload with configurable dimensions and similarity
> function
> # A way to load real embeddings (for realistic benchmarks)
> # Support for simply piping the embeddings into C* via easy-cass-stress
>
> Reference Slack Thread:
> [https://the-asf.slack.com/archives/C018YGVCHMZ/p1768941198240049]