kaivalnp commented on code in PR #15784:
URL: https://github.com/apache/lucene/pull/15784#discussion_r2880425102


##########
lucene/core/src/java/org/apache/lucene/search/VectorSimilarityCollector.java:
##########
@@ -25,44 +25,39 @@
  * @lucene.experimental
  */
 class VectorSimilarityCollector extends AbstractKnnCollector {
-  private final float traversalSimilarity, resultSimilarity;
-  private float maxSimilarity;
+  private final float resultSimilarity;
   private final List<ScoreDoc> scoreDocList;
+  private float minCompetitiveSimilarity;
 
   /**
-   * Perform a similarity-based graph search. The graph is traversed till 
better scoring nodes are
-   * available, or the best candidate is below {@link #traversalSimilarity}. 
All traversed nodes
-   * above {@link #resultSimilarity} are collected.
+   * Perform a similarity-based graph search. The buffer for graph traversal 
is adaptive: starts
+   * with a high value, and exponentially decays towards scores of nodes 
traversed, but not
+   * collected during graph search.
    *
-   * @param traversalSimilarity (lower) similarity score for graph traversal.
-   * @param resultSimilarity (higher) similarity score for result collection.
+   * @param resultSimilarity similarity score for result collection.
    * @param visitLimit limit on number of nodes to visit.
    */
-  public VectorSimilarityCollector(
-      float traversalSimilarity, float resultSimilarity, long visitLimit) {
+  public VectorSimilarityCollector(float resultSimilarity, long visitLimit) {
     // TODO: add search strategy support
     super(1, visitLimit, AbstractVectorSimilarityQuery.DEFAULT_STRATEGY);
-    if (traversalSimilarity > resultSimilarity) {
-      throw new IllegalArgumentException("traversalSimilarity should be <= 
resultSimilarity");
-    }
-    this.traversalSimilarity = traversalSimilarity;
     this.resultSimilarity = resultSimilarity;
-    this.maxSimilarity = Float.NEGATIVE_INFINITY;
     this.scoreDocList = new ArrayList<>();
+    this.minCompetitiveSimilarity = -Float.MAX_VALUE;
   }
 
   @Override
   public boolean collect(int docId, float similarity) {
-    maxSimilarity = Math.max(maxSimilarity, similarity);
     if (similarity >= resultSimilarity) {
       scoreDocList.add(new ScoreDoc(docId, similarity));
+      return false; // do not update minCompetitiveSimilarity

Review Comment:
   The return value of this function is used to update 
`minCompetitiveSimilarity` in `HnswGraphSearcher` (see 
[here](https://github.com/apache/lucene/blob/02014a6245d329ae93898f281b52eb3203d3b343/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java#L350)).
   
   Returning `true` when a result is collected is more natural for a KNN 
search, where "collecting" a result == replacing an existing result in the 
result queue (of size `topK`) == possibility of `minCompetitiveSimilarity` 
(least scoring result) being updated.
   
   However for a similarity-based vector search, we want to increase 
`minCompetitiveSimilarity` (i.e. reduce buffer of traversal) on encountering 
nodes _outside_ `resultSimilarity` to make it adaptive.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to