Regarding your "second" question about suboptimal results: I think Nils Reimers
explains quite nicely why this might happen; see, for example,
https://www.youtube.com/watch?v=Abh3YCahyqU
HTH
Michael
On 30.01.24 at 15:48, Moll, Dr. Andreas wrote:
Hi,
the HNSW documentation for the Lucene HnswGraph and the Solr vector search is
not very verbose, especially with regard to the parameters hnswMaxConn and
hnswBeamWidth.
I find it hard to come up with sensible values for these parameters just by
reading the paper from 2018.
Does anyone have experience with how these parameters influence the results?
As far as I understand the code, the graph is built at indexing time, so
finding the optimal values for a specific use case by trial and error would be
time-intensive, right?
We have a Solr index with roughly 100 million embeddings, and in a synthetic
randomized benchmark around 14% of requests return a suboptimal answer (based
on cosine vector similarity).
I expected this "error" rate to be much smaller. I would love to hear about
your experiences.
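
For reference, a minimal Python sketch of how such an "error" rate could be
scored: each approximate answer is compared against an exhaustive cosine
search, and a request counts as suboptimal when the returned vector is less
similar to the query than the true best match. This is not our actual
benchmark code. The function names and the toy "approximate" search (which
simply scans half of the vectors) are hypothetical stand-ins for the real
Solr kNN query.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top1(query, vectors):
    """Index of the most similar vector, found by exhaustive search."""
    return max(range(len(vectors)), key=lambda i: cosine(query, vectors[i]))


def suboptimal_rate(queries, vectors, approx_top1, tol=1e-9):
    """Fraction of queries whose approximate answer is worse than the exact one.

    approx_top1 is any callable mapping a query to a vector index; here it is
    a hypothetical placeholder for the real approximate (HNSW) search.
    Similarities are compared rather than indices, so ties are not counted
    as errors.
    """
    misses = 0
    for q in queries:
        best = cosine(q, vectors[exact_top1(q, vectors)])
        got = cosine(q, vectors[approx_top1(q)])
        if got < best - tol:
            misses += 1
    return misses / len(queries)


# Toy data: random Gaussian vectors (an assumption, not real embeddings).
random.seed(0)
dim, n_vectors, n_queries = 8, 200, 50
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_vectors)]
queries = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_queries)]


def toy_approx_top1(query):
    # Deliberately imperfect stand-in: only looks at the first half of the
    # vectors, so it misses the true best match whenever that match lies in
    # the second half.
    return exact_top1(query, vectors[: n_vectors // 2])


rate = suboptimal_rate(queries, vectors, toy_approx_top1)
print(f"suboptimal answers: {rate:.0%}")
```

In a real setup, toy_approx_top1 would be replaced by a call to the search
engine, and the exhaustive search would typically be run on a sample of the
index only, since it is linear in the number of vectors per query.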
Best regards
Andreas Moll