Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2499978963 I think there are still 2 issues to address: - Prevent quantized vectors from being swapped out: Loading full-precision vectors are costly and can cause the quantized vectors to be swapped out if the OS is under memory pressure. Maybe we can use something similar to `mlock` if the system supports it. But I guess it can be done by the developers instead having it built-in support in the re-ranking Query. - The latency could be better. I'm still running a thorough benchmark with KnnGraphTester, but preliminary results show the re-ranking step adds quite some latency. Maybe we can execute the re-ranking per segment in parallel, or apply some optimization. Another thing is that we are running the rewrite phase and createRewrittenQuery twice: once for the main search phase and one for the re-ranking phase. Not sure how much overhead it will introduce. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2492678908 Thanks @benwtrent , I have published a PR here: https://github.com/apache/lucene/pull/14009 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
benwtrent commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2491348818 @dungba88 feel free to open a PR against the main repo. I can do a fuller review once its against Lucene. Utilizing a query that wraps things via dependency injection is way better than inheritance and better fits the API. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2490332029 Thanks @benwtrent for the feedback! I managed to change the abstraction to composition in the next rev. We need to create the DocAndScoreQuery twice (once during the original Query rewrite, and once during the re-ranking Query rewrite) but now we only need to rerank r * k documents instead of r * k * num_segments. Let me know what you think. If it (almost) looks good, I can switch to create PR on the main repo, probably adding more tests. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
benwtrent commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2488523095 @dungba88 I left a comment on your POC. It seems to me the best abstraction is a new query that doesn't inherit from AbstractKnnVectorQuery. Instead its just a new query that requires a KNN query as its parameter. Then you can gather the quantized scored knn docs during rewrite. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2487581544 Relatedly, I got to [this paper](https://arxiv.org/abs/2411.12229) today. It mainly talks about integrating RaBitQ with graph search, but the part about reducing random access intrigues me. The author designs a way to avoid explicit re-ranking step (and thus random access) by only using approximate distance for graph search while using exact distance (with raw vectors) for nearest neighbors heap. It also stores the raw vector compactly along with the quantized vectors to avoid the random access (we are using seek + readFloats currently). I haven't looked deeply into the paper though. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2481270142 > Hey @dungba88 - not intrusive at all! I ended up not having more time to work more on this. I had a [simple PoC](https://github.com/dungba88/lucene/pull/29/) to demonstrate this idea (test has passed as well). I created a separate Query for sake of simplicity, but also proves that Lucene already enable this idea with (almost) no change needed in core. The idea is to add a `float oversample` parameter in constructor, which will sample more matches and run the second pass for FP reranking. Although the second pass currently has to be done at each segment level, as I still don't know how to get the raw FP vectors without the LeafReaderContext, so I can't compute vector similarity with just the TopDocs alone. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
jmazanec15 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2479340671 > Hi all, I was looking at this idea for some experimentation ideas (not mean to be intrusive to ongoing effort). Hey @dungba88 - not intrusive at all! I ended up not having time to work more on this. > So, the collector will likely need to be told "Hey, these are estimations", and then a second pass over the top docs there would be possible via something like if(collector.scoresAreEstimates) I like this approach. I think it is a good way to maintain abstraction between knn codec formats and queries, where the query does not need to know any specifics about quantization done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2477832716 I also saw where my misunderstanding comes from now: `getFloatVectorValues()` returns QuantizedVectorValues, which has 2 methods: `vectorValue(int ord)` returns the raw vector while `scorer(float[] target)` uses the quantized vector for scoring. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-241816 > Apply the brute-force reranking (by passing the "exactSearch" path as that uses quantized values). Maybe I misunderstood something, but I thought the idea is to re-rank with *full precision vectors* over the oversampled matches? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2477616738 Hi @benwtrent Thanks for the reply. It seems I wasn't clear enough in my previous question. Regarding `exactSearch`, it was based on this jmazanec15@ comment. `getFloatVectorValues()` is used in exactSearch(), thus I thought that it would use full-precision vectors, but based on your comment, it seems it also uses quantized vectors? As the re-ranking step needs full-precision vectors, how can I access them? > Didnt realize that the full-precision vectors for quantized index are exposed via getFloatVectorValues -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
benwtrent commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2476505715 > then it seems we can just call exactSearch over the results of approximateSearch, Exact search will hopefully just be using the quantized values as well. > However I don't know how to get the full-sized vectors without LeafContextReader. Do you have an idea how we can achieve that? I don't see why that would be a problem. How I would understand this is you would need to iterate the segments twice regardless. - Once to get the top k as desired with the oversampling - Apply the brute-force reranking (by passing the "exactSearch" path as that uses quantized values). However, the query shouldn't bother attempting to rerank results that are considered "accurate". So, the collector will likely need to be told "Hey, these are estimations", and then a second pass over the top docs there would be possible via something like `if(collector.scoresAreEstimates)` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
dungba88 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2475738069 Hi all, I was looking at this idea for some experimentation ideas (not mean to be intrusive to ongoing effort). If the full sized vectors are exposed through `getFloatVectorValues` then it seems we can just call exactSearch over the results of approximateSearch, but this would inefficiently re-rank `k * oversample * num_segments` instead of just `k * oversample` (with some improve in recall perhaps). However I don't know how to get the full-sized vectors without LeafContextReader. Do you have an idea how we can achieve that? Or maybe we can just reduce oversample to account for the number of segments, e.g: `oversample = oversample / leafReaderContexts.size()` (which maybe differs)? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
jmazanec15 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2237010997 Makes sense thanks @benwtrent . Im working on PoC and some experiments. Didnt realize that the full-precision vectors for quantized index are exposed via getFloatVectorValues. That should make things simpler. I was also looking at this comment for DiskANN from @kevindrosendahl (https://github.com/apache/lucene/issues/12615#issuecomment-1868095892) and noticed that mlock'ing parts of the index in memory became really important in lower mem environments for performance. I dont think this is currently supported in Lucene, but will take a look. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
benwtrent commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2223362650 @jmazanec15 > I think we would need to refine not the top k but the top r*k and then reduce to k 100%, I wasn't clear. Yes, over all segments, we gather some approximate scoring that is `r*k` total when collapsed together, and then refine & rescore the total `r*k`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
jmazanec15 commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2223317735 > I think the API would be tricky, but I am all for this idea Yes agree, Ill think on this a little bit. Ill start with a PoC and go from there. > Whatever the design, it would be most efficient to first gather the nearest vectors from ALL segments with an approximate score, and then do a second pass over all segments with to refine the top k. > > Rescoring per segment would be needlessly in-efficient. Yes I agree. However, with this, I think we would need to refine not the top k but the top r*k and then reduce to k. Otherwise, I dont think that recall would actually be improved - just ordering might be better. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Add refinement of quantized vector scores with fp distance calculations [lucene]
benwtrent commented on issue #13564: URL: https://github.com/apache/lucene/issues/13564#issuecomment-2223238648 I think the API would be tricky, but I am all for this idea. The ability to "approximately score" and then do second pass to "exact score" is useful for all sorts of levels of quantization. Whatever the design, it would be most efficient to first gather the nearest vectors from ALL segments with an approximate score, and then do a second pass over all segments with to refine the top k. Rescoring per segment would be needlessly in-efficient. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org