jtibshirani edited a comment on issue #1314: LUCENE-9136: Coarse quantization 
that reuses existing formats.
URL: https://github.com/apache/lucene-solr/pull/1314#issuecomment-608645326
 
 
   **Benchmarks**
   In these benchmarks, we find the k=10 nearest vectors and record the recall 
and queries per second (QPS). For the number of centroids, we use the heuristic 
num centroids = sqrt(dataset size).
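
As a rough illustration of the two quantities above, here is a minimal Python sketch. The function names `num_centroids` and `recall_at_k` are hypothetical and are not taken from the PR or from ann-benchmarks. Rounding the square root to the nearest integer is an assumption, and recall is computed here as plain id overlap with the exact top k, which may differ from how the benchmark tool defines it.

```python
import math


def num_centroids(dataset_size: int) -> int:
    """Heuristic from the comment: num centroids = sqrt(dataset size)."""
    # Assumption: round to the nearest integer and keep at least one centroid.
    return max(1, round(math.sqrt(dataset_size)))


def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact k nearest neighbors found by the approximate search."""
    true_top_k = set(exact_ids[:k])
    found = sum(1 for doc_id in approx_ids[:k] if doc_id in true_top_k)
    return found / k


# A dataset of 1 million vectors gives 1000 centroids under this heuristic.
print(num_centroids(1_000_000))

# Toy example: the approximate search misses two of the ten true neighbors.
exact = [3, 7, 1, 9, 4, 8, 2, 6, 0, 5]
approx = [3, 7, 1, 9, 4, 8, 2, 6, 11, 12]
print(recall_at_k(approx, exact))
```

Under this heuristic and assuming roughly balanced clusters, each probed centroid covers about sqrt(dataset size) vectors, so n_probes controls what fraction of the dataset a query scans.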
   
   sift-128-euclidean: a dataset of 1 million SIFT descriptors with 128 dims.
   ```
   APPROACH                          RECALL         QPS
   LuceneExact()                     1.000        6.425
   LuceneCluster(n_probes=5)         0.756      604.133
   LuceneCluster(n_probes=10)        0.874      323.791
   LuceneCluster(n_probes=20)        0.951      166.580
   LuceneCluster(n_probes=50)        0.993       68.465
   LuceneCluster(n_probes=100)       0.999       35.139
   ```
   
   glove-100-angular: a dataset of ~1.2 million GloVe word vectors with 100 dims.
   ```
   APPROACH                          RECALL         QPS
   LuceneExact()                     1.000        6.764
   LuceneCluster(n_probes=5)         0.681      642.247
   LuceneCluster(n_probes=10)        0.768      343.067
   LuceneCluster(n_probes=20)        0.836      177.037
   LuceneCluster(n_probes=50)        0.908       73.256
   LuceneCluster(n_probes=100)       0.951       37.302
   ```
   
   These benchmarks were performed using the [ann-benchmarks 
repo](https://github.com/erikbern/ann-benchmarks). The branch and instructions 
for benchmarking can be found here: 
https://github.com/jtibshirani/ann-benchmarks/pull/2.
