jtibshirani commented on a change in pull request #656:
URL: https://github.com/apache/lucene/pull/656#discussion_r801192735



##########
File path: lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java
##########
@@ -227,16 +231,36 @@ public TopDocs search(String field, float[] target, int k, Bits acceptDocs) thro
 
     // bound k by total number of vectors to prevent oversizing data structures
     k = Math.min(k, fieldEntry.size());
-
     OffHeapVectorValues vectorValues = getOffHeapVectorValues(fieldEntry);
+
+    DocIdSetIterator acceptIterator = null;
+    int visitedLimit = Integer.MAX_VALUE;
+
+    if (acceptDocs instanceof BitSet acceptBitSet) {

Review comment:
       This is a temporary hack, since I wasn't sure about the right design. I 
can see a couple of possibilities:
   1. Add a new `BitSet filter` parameter to `searchNearestVectors`, keeping 
the fallback logic within the HNSW classes. 
   2. Add a new `int visitedLimit` parameter to 
`LeafReader#searchNearestVectors`. Pull the "exact search" logic up into 
`KnnVectorQuery`.
   
   Which option is better probably depends on how other algorithms would handle 
filtering (which I am not sure about), and on whether we think `visitedLimit` 
is useful in other contexts.
   
   I also played around with having `searchNearestVectors` take a `Collector` 
and using `CollectionTerminatedException`, but I couldn't really see how that 
would make sense.
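
   For illustration only, the control flow behind option 2 might be sketched 
roughly as follows. This is not Lucene code: `VisitedLimitSketch`, 
`limitedSearch`, `exactSearch`, and `topK` are hypothetical names, the filter 
is a `java.util.BitSet` rather than Lucene's `BitSet`, the "limited" search is 
a plain linear scan standing in for a graph search, and deriving 
`visitedLimit` from the filter's cardinality is an assumption rather than 
something stated in this thread.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: the caller passes a visitedLimit to the vector search
// and falls back to an exact scan of the filtered docs when it is exceeded.
class VisitedLimitSketch {

  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  // Sorts candidate doc ids by distance to the target and keeps the closest k.
  static List<Integer> topK(List<Integer> candidates, float[][] vectors, float[] target, int k) {
    List<Integer> sorted = new ArrayList<>(candidates);
    sorted.sort(
        Comparator.comparingDouble((Integer doc) -> squaredDistance(vectors[doc], target)));
    return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
  }

  // Stand-in for the approximate search (NOT HNSW): it walks the vectors in
  // doc order, counts every vector it touches whether or not the filter
  // accepts it, and gives up once it has touched more than visitedLimit.
  static Optional<List<Integer>> limitedSearch(
      float[][] vectors, float[] target, int k, BitSet filter, int visitedLimit) {
    List<Integer> accepted = new ArrayList<>();
    int visited = 0;
    for (int doc = 0; doc < vectors.length; doc++) {
      if (++visited > visitedLimit) {
        return Optional.empty(); // limit exceeded: let the caller decide what to do
      }
      if (filter.get(doc)) {
        accepted.add(doc);
      }
    }
    return Optional.of(topK(accepted, vectors, target, k));
  }

  // Exact search: touches only the docs the filter accepts.
  // Assumes every set bit in the filter is a valid index into vectors.
  static List<Integer> exactSearch(float[][] vectors, float[] target, int k, BitSet filter) {
    List<Integer> accepted = new ArrayList<>();
    for (int doc = filter.nextSetBit(0); doc >= 0; doc = filter.nextSetBit(doc + 1)) {
      accepted.add(doc);
    }
    return topK(accepted, vectors, target, k);
  }

  // Query-level logic: try the limited search first, then fall back.
  // Using the filter's cardinality as the limit is an assumption.
  static List<Integer> search(float[][] vectors, float[] target, int k, BitSet filter) {
    int visitedLimit = filter.cardinality();
    return limitedSearch(vectors, target, k, filter, visitedLimit)
        .orElseGet(() -> exactSearch(vectors, target, k, filter));
  }

  public static void main(String[] args) {
    float[][] vectors = {{0f}, {1f}, {2f}, {3f}, {4f}};
    BitSet filter = new BitSet();
    filter.set(0);
    filter.set(3);
    filter.set(4);
    System.out.println(search(vectors, new float[] {2.2f}, 2, filter));
  }
}
```

   In this shape the vector search only needs to report that it gave up; the 
decision to run an exact scan lives with the caller. Option 1 would instead 
keep both branches behind the search method itself.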




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


