jtibshirani commented on a change in pull request #656: URL: https://github.com/apache/lucene/pull/656#discussion_r801192735
##########
File path: lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java
##########
@@ -227,16 +231,36 @@ public TopDocs search(String field, float[] target, int k, Bits acceptDocs) thro
     // bound k by total number of vectors to prevent oversizing data structures
     k = Math.min(k, fieldEntry.size());
-    OffHeapVectorValues vectorValues = getOffHeapVectorValues(fieldEntry);
+
+    DocIdSetIterator acceptIterator = null;
+    int visitedLimit = Integer.MAX_VALUE;
+
+    if (acceptDocs instanceof BitSet acceptBitSet) {

Review comment:
This is a temporary hack since I wasn't sure about the right design. I could see a couple possibilities:
1. Add a new `BitSet filter` parameter to `searchNearestVectors`, keeping the fallback logic within the HNSW classes.
2. Add a new `int visitedLimit` parameter to `LeafReader#searchNearestVectors`. Pull the "exact search" logic up into `KnnVectorQuery`.

Which option is better probably depends on how other algorithms would handle filtering (which I am not sure about), and also if we think `visitedLimit` is useful in other contexts.

I also played around with having `searchNearestVectors` take a `Collector` and using `CollectionTerminatedException`... but couldn't really see how this made sense.
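
For readers following the discussion, here is a minimal, self-contained sketch of the control flow behind the second option in the review comment: a search bounded by `visitedLimit`, with the caller falling back to an exact search over the accepted documents when the bound is hit. It is not Lucene code. Every name in it (`FilteredKnnSketch`, `limitedSearch`, `exactSearch`, `search`) is hypothetical, the "graph search" is a plain stand-in that visits ordinals nearest-first, and setting `visitedLimit` to the filter's cardinality is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; none of these names are Lucene APIs. */
class FilteredKnnSketch {

  /** Squared Euclidean distance between two vectors of equal length. */
  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** Returns up to k ordinals from candidates, nearest to target first. */
  static int[] topK(List<Integer> candidates, float[][] vectors, float[] target, int k) {
    return candidates.stream()
        .sorted(Comparator.<Integer>comparingDouble(ord -> squaredDistance(vectors[ord], target)))
        .limit(k)
        .mapToInt(Integer::intValue)
        .toArray();
  }

  /**
   * Stand-in for a graph search. It visits ordinals nearest-first, counts every
   * visit (accepted or not), and returns null once visitedLimit is exceeded so
   * that the caller can decide how to fall back.
   */
  static int[] limitedSearch(
      float[][] vectors, float[] target, int k, BitSet filter, int visitedLimit) {
    List<Integer> all = new ArrayList<>();
    for (int ord = 0; ord < vectors.length; ord++) {
      all.add(ord);
    }
    int[] visitOrder = topK(all, vectors, target, vectors.length);

    List<Integer> hits = new ArrayList<>();
    int visited = 0;
    for (int ord : visitOrder) {
      if (hits.size() == k) {
        break; // enough accepted results collected
      }
      if (++visited > visitedLimit) {
        return null; // bound exceeded: result is incomplete
      }
      if (filter.get(ord)) {
        hits.add(ord);
      }
    }
    return hits.stream().mapToInt(Integer::intValue).toArray();
  }

  /** Exact search: scores only the ordinals accepted by the filter. */
  static int[] exactSearch(float[][] vectors, float[] target, int k, BitSet filter) {
    List<Integer> accepted = new ArrayList<>();
    for (int ord = filter.nextSetBit(0); ord >= 0; ord = filter.nextSetBit(ord + 1)) {
      accepted.add(ord);
    }
    return topK(accepted, vectors, target, k);
  }

  /** Caller-side fallback, as option 2 would place it outside the HNSW classes. */
  static int[] search(float[][] vectors, float[] target, int k, BitSet filter) {
    // Assumption: visiting more nodes than the filter accepts costs more than
    // simply scoring every accepted document.
    int visitedLimit = filter.cardinality();
    int[] result = limitedSearch(vectors, target, k, filter, visitedLimit);
    return result != null ? result : exactSearch(vectors, target, k, filter);
  }

  public static void main(String[] args) {
    float[][] vectors = {{0f}, {1f}, {2f}, {3f}, {4f}, {5f}};
    BitSet narrow = new BitSet();
    narrow.set(4, 6); // accept only ordinals 4 and 5, which are far from the target
    System.out.println(java.util.Arrays.toString(search(vectors, new float[] {0f}, 2, narrow)));
  }
}
```

Under option 1 the same fallback would instead live inside the reader, behind a `BitSet filter` parameter, so the caller would never see the `null` result. Which placement is preferable is exactly the open question raised in the comment.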