mikemccand commented on issue #15773: URL: https://github.com/apache/lucene/issues/15773#issuecomment-4045603043
Since merging/indexing is essentially running `KnnFloatVectorQuery`/`KnnByteVectorQuery` and then inserting the returned top K as the connections for that node ... could we use re-ranking (needs fewer full precision vectors) during that query? But, in aggregate, it seems likely to need all-ish of those full precision vectors (I have the same concern about reranking at search time -- does it really reduce transient hot RAM needed?). Or, maybe when quantizing to 1 or 2 bits, we also quantize to 8 bits, and use those 8 bit vectors for re-ranking.

> There's some acknowledgement of this in the quantized vectors format for 1 and 2 bits because we also store a 4 bit representation for the "query" during indexing to better approximate the actual score.

Oh, hmm, what does this mean exactly? Is it asymmetric quantization used for the query vector (the vector being inserted) during its HNSW search against the 1 or 2 bit indexed vectors?

It's curious that we do use full precision scores on segment birth (newly indexed vectors), but quantized scores for merging. It means that, depending on your RAM buffer settings, number of threads, etc., you can "see" the effect as different HNSW graphs -- normally such settings do not (should not?) alter the semantics of the index, but HNSW is probabilistic anyway, so this is probably just in the noise.
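To make the "also quantize to 8 bits and re-rank with those" idea concrete, here is a rough, self-contained sketch. Everything in it is hypothetical and is not Lucene's API: the class `TwoStageRerankSketch`, its method names, the sign-bit 1-bit quantizer, and the assumption that vector components lie in [-1, 1] are all made up for illustration, and a brute-force scan stands in for the HNSW graph search. It only shows the shape of the two stages: pick an oversampled candidate set with cheap 1-bit Hamming distance, then re-order just those candidates with 8-bit dot products, so no full precision vectors are touched.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of coarse search plus 8-bit re-ranking; not Lucene code. */
final class TwoStageRerankSketch {

  /** 1-bit quantization (assumed scheme): keep only the sign of each dimension. */
  static long[] quantize1Bit(float[] v) {
    long[] bits = new long[(v.length + 63) / 64];
    for (int i = 0; i < v.length; i++) {
      if (v[i] > 0f) {
        bits[i >> 6] |= 1L << (i & 63);
      }
    }
    return bits;
  }

  /** 8-bit quantization (assumed scheme): clamp to [-1, 1], scale to [-127, 127]. */
  static byte[] quantize8Bit(float[] v) {
    byte[] q = new byte[v.length];
    for (int i = 0; i < v.length; i++) {
      float clamped = Math.max(-1f, Math.min(1f, v[i]));
      q[i] = (byte) Math.round(clamped * 127f);
    }
    return q;
  }

  /** Coarse distance: Hamming distance between sign bits (lower is closer). */
  static int hamming(long[] a, long[] b) {
    int d = 0;
    for (int i = 0; i < a.length; i++) {
      d += Long.bitCount(a[i] ^ b[i]);
    }
    return d;
  }

  /** Finer score: integer dot product of 8-bit vectors (higher is closer). */
  static int dot8(byte[] a, byte[] b) {
    int sum = 0;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /**
   * Stage 1 keeps the `oversample` closest ids by 1-bit Hamming distance
   * (brute force here, standing in for the graph search). Stage 2 re-ranks
   * only those ids by 8-bit dot product and returns the best `topK`.
   */
  static List<Integer> search(
      float[] query, long[][] oneBit, byte[][] eightBit, int oversample, int topK) {
    long[] q1 = quantize1Bit(query);
    byte[] q8 = quantize8Bit(query);

    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < oneBit.length; i++) {
      ids.add(i);
    }
    // List.sort is stable, so ties in Hamming distance keep id order.
    ids.sort((a, b) -> Integer.compare(hamming(q1, oneBit[a]), hamming(q1, oneBit[b])));
    List<Integer> candidates =
        new ArrayList<>(ids.subList(0, Math.min(oversample, ids.size())));

    // Only the candidates' 8-bit vectors are read in this stage.
    candidates.sort((a, b) -> Integer.compare(dot8(q8, eightBit[b]), dot8(q8, eightBit[a])));
    return new ArrayList<>(candidates.subList(0, Math.min(topK, candidates.size())));
  }
}
```

With `oversample == topK` the second stage can only reorder what the 1-bit stage already chose, so any benefit comes from oversampling. The RAM question still stands, though: across all the nodes inserted during a merge, the union of candidate sets may well cover most of the 8-bit vectors anyway.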
