mikemccand commented on issue #15773:
URL: https://github.com/apache/lucene/issues/15773#issuecomment-4045603043

   Since merging/indexing essentially runs a `KnnFloatVectorQuery` / 
`KnnByteVectorQuery` and then inserts the returned top K as the connections for 
that node ... could we use re-ranking (which needs fewer full-precision vectors) 
during that query?  But, in aggregate, it seems likely to need nearly all of 
those full-precision vectors anyway (I have the same concern about re-ranking at 
search time -- does it really reduce the transient hot RAM needed?).
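
   To make the "in aggregate" concern concrete, here is a rough sketch (not 
Lucene code) that picks candidates with 1-bit codes, re-ranks them with 
full-precision dot products, and records which full-precision vectors were 
touched while every vector takes a turn as the inserted node. It assumes a 
brute-force scan instead of an HNSW traversal, sign-bit quantization, and 
random Gaussian data; all function names and parameters are made up for 
illustration.

```python
import random

def sign_bits(vec):
    """1-bit quantization: keep only the sign of each component."""
    return [1 if x >= 0.0 else 0 for x in vec]

def hamming(a, b):
    """Number of positions where two bit codes differ."""
    return sum(x != y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rerank_search(query, vectors, codes, k, oversample, touched, exclude=None):
    """Pick k * oversample candidates by Hamming distance on the 1-bit codes,
    then rescore only those candidates with full-precision dot products."""
    q_code = sign_bits(query)
    order = sorted(
        (i for i in range(len(vectors)) if i != exclude),
        key=lambda i: hamming(q_code, codes[i]),
    )
    candidates = order[: k * oversample]
    # These are the full-precision vectors that had to be read for re-ranking.
    touched.update(candidates)
    candidates.sort(key=lambda i: dot(query, vectors[i]), reverse=True)
    return candidates[:k]

def touched_fraction(num_vectors=200, dim=16, k=5, oversample=3, seed=0):
    """Fraction of full-precision vectors rescored at least once when every
    vector is inserted in turn (the query vectors themselves are not counted)."""
    rng = random.Random(seed)
    vectors = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_vectors)]
    codes = [sign_bits(v) for v in vectors]
    touched = set()
    for i, v in enumerate(vectors):
        rerank_search(v, vectors, codes, k, oversample, touched, exclude=i)
    return len(touched) / num_vectors

if __name__ == "__main__":
    print(f"fraction of full-precision vectors touched: {touched_fraction():.2f}")
```

   A single call reads at most `k * oversample` full-precision vectors, but 
the union over all insertions is what determines how much has to stay hot 
during a merge; the sketch only measures that union on toy data and says 
nothing about real access patterns.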
   
   Or, maybe when quantizing to 1 or 2 bits, we could also quantize to 8 bits, 
and use those 8-bit vectors for re-ranking.
   
   > There's some acknowledgement of this in the quantized vectors format for 1 
and 2 bits because we also store a 4 bit representation for the "query" during 
indexing to better approximate the actual score.
   
   Oh, hmm, what does this mean exactly?  Is it asymmetric quantization used 
for the query vector (the vector being inserted) during its HNSW search against 
the 1 or 2 bit indexed vectors?
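
   For reference, the general idea of asymmetric quantization can be sketched 
as follows: the stored vector keeps 1 bit per component while the query keeps 
4 bits, so the score retains some of the query's magnitude information. This 
is only an illustration of the concept under simple assumptions (sign bits for 
the stored vector, min/max scalar quantization for the query); it is not the 
scheme Lucene's quantized vectors format actually uses, and every name here is 
hypothetical.

```python
def quantize_1bit(vec):
    """Stored-vector side: one sign bit per component."""
    return [1 if x >= 0.0 else 0 for x in vec]

def quantize_4bit(vec):
    """Query side: map each component to an integer level in [0, 15].
    Returns the levels plus (lo, step) needed to dequantize them."""
    lo, hi = min(vec), max(vec)
    step = (hi - lo) / 15 or 1.0  # avoid a zero step for constant vectors
    return [round((x - lo) / step) for x in vec], lo, step

def asymmetric_score(query, doc_bits):
    """Approximate dot(query, sign(doc)) using a 4-bit query and 1-bit doc."""
    levels, lo, step = quantize_4bit(query)
    return sum(
        (lo + step * q) * (1.0 if b else -1.0) for q, b in zip(levels, doc_bits)
    )

def symmetric_score(query, doc_bits):
    """Baseline: both sides reduced to 1 bit, so query magnitudes are lost."""
    q_bits = quantize_1bit(query)
    return sum(
        (1.0 if q else -1.0) * (1.0 if b else -1.0) for q, b in zip(q_bits, doc_bits)
    )

if __name__ == "__main__":
    doc_bits = quantize_1bit([0.7, -0.3, 0.2, -0.9])
    query = [0.9, -0.1, 0.4, -0.8]
    print("asymmetric:", asymmetric_score(query, doc_bits))
    print("symmetric: ", symmetric_score(query, doc_bits))
```

   With 1 bit on both sides, any two queries with the same sign pattern get 
the same score against a given stored vector; keeping 4 bits on the query side 
lets the score distinguish them.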
   
   It's curious that we use full-precision scores on segment birth (newly 
indexed vectors), but quantized scores for merging.  It means that, depending on 
your RAM buffer settings, number of threads, etc., you can "see" the effect as 
different HNSW graphs -- normally such settings do not (should not?) alter the 
semantics of the index, but HNSW is probabilistic anyway, so this is probably 
just in the noise.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

