irvingzhang opened a new pull request #1295: Lucene-9004: bug fix for searching 
the nearest one neighbor in higher layers
URL: https://github.com/apache/lucene-solr/pull/1295
 
 
   `if (dist < f.distance() || results.size() < ef) {
      Neighbor n = new ImmutableNeighbor(e.docId(), dist);
      candidates.add(n);
      results.insertWithOverflow(n);
      f = results.top();
   }`
   
   If (dist < f.distance()) but results.size() >= ef, the "Neighbor n" is still 
added to "results" ("results" is a subtype of PriorityQueue). Because 
insertWithOverflow only evicts an element once the queue has reached its own 
maximum size, the actual size of "results" can end up anywhere between "ef" and 
the queue's maximum size, while its expected size is "ef". 
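
To make the overgrowth concrete, here is a minimal, self-contained sketch. It does not use the Lucene classes: `BoundedFurthestQueue` and `OverflowDemo` are hypothetical stand-ins that only imitate the relevant behaviour (a bounded queue whose top is the furthest neighbor, and the guarded insert quoted above), and the distances are made-up values.

```java
import java.util.Collections;
import java.util.PriorityQueue;

class OverflowDemo {
  // Replays the guarded insert from the snippet above over a stream of distances
  // and returns the final size of "results".
  static int sizeAfterSearch(int maxSize, int ef, float entryDist, float[] dists) {
    BoundedFurthestQueue results = new BoundedFurthestQueue(maxSize);
    results.insertWithOverflow(entryDist); // the entry point is already in the queue
    float f = results.top();
    for (float dist : dists) {
      if (dist < f || results.size() < ef) {
        results.insertWithOverflow(dist);
        f = results.top();
      }
    }
    return results.size();
  }

  public static void main(String[] args) {
    // Queue capacity 4, but the caller asks for ef = 1.
    System.out.println(sizeAfterSearch(4, 1, 0.9f, new float[] {0.8f, 0.7f, 0.6f}));
  }
}

// Hypothetical stand-in for the bounded "results" queue; not Lucene's actual class.
class BoundedFurthestQueue {
  private final int maxSize;
  // Max-heap on distance, so peek() returns the furthest neighbor kept so far.
  private final PriorityQueue<Float> heap = new PriorityQueue<>(Collections.reverseOrder());

  BoundedFurthestQueue(int maxSize) {
    this.maxSize = maxSize;
  }

  // Evicts the furthest entry only when the queue is already at maxSize.
  void insertWithOverflow(float dist) {
    if (heap.size() < maxSize) {
      heap.add(dist);
    } else if (dist < heap.peek()) {
      heap.poll();
      heap.add(dist);
    }
  }

  float top() {
    return heap.peek();
  }

  int size() {
    return heap.size();
  }
}
```

With a capacity of 4 and ef = 1, every closer neighbor is appended without an eviction, so the sketch ends with 4 entries instead of 1. When the capacity equals ef, the same loop keeps the size at ef.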
   
   Consider the following situation:
   
   `FurthestNeighbors neighbors = new FurthestNeighbors(ef, ep);
     for (int l = hnsw.topLevel(); l > 0; l--) {
       visitedCount += hnsw.searchLayer(query, neighbors, 1, l, vectorValues);
     }
     visitedCount += hnsw.searchLayer(query, neighbors, ef, 0, vectorValues);`
   
   where the max size of "neighbors" ("neighbors" is also a subtype of 
PriorityQueue) is ef (assume ef > 1). When searching a non-zero layer, we want 
to find only the single nearest neighbor via `hnsw.searchLayer(query, neighbors, 
1, l, vectorValues);`, where l is the layer and l > 0. However, the actual size 
of "neighbors" may grow larger than 1.
   
   Assuming that "results.size() <= ef" holds, I think calling "results.pop();" 
when "results.size() == ef", before the new neighbor is inserted, can solve 
this problem.
   
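One way to read the proposed fix is sketched below. This is an illustration under assumptions, not the actual patch: `CappedInsertSketch` is a hypothetical class, "results" is modelled with `java.util.PriorityQueue` as a max-heap on distance (furthest on top), a plain `add` stands in for insertWithOverflow, and the queue is assumed to hold at most ef entries on entry, with ef no larger than its capacity.

```java
import java.util.Collections;
import java.util.PriorityQueue;

class CappedInsertSketch {
  // Returns the distances kept after a search limited to ef results.
  static PriorityQueue<Float> search(int ef, float entryDist, float[] dists) {
    // Max-heap on distance: peek() is the furthest neighbor kept so far.
    PriorityQueue<Float> results = new PriorityQueue<>(Collections.reverseOrder());
    results.add(entryDist); // entry point, so size() <= ef holds from the start
    float f = results.peek();
    for (float dist : dists) {
      if (dist < f || results.size() < ef) {
        if (results.size() == ef) {
          results.poll(); // proposed fix: drop the furthest before inserting
        }
        results.add(dist); // stand-in for insertWithOverflow; size stays <= ef
        f = results.peek();
      }
    }
    return results;
  }

  public static void main(String[] args) {
    PriorityQueue<Float> kept = search(1, 0.9f, new float[] {0.8f, 0.7f, 0.6f});
    System.out.println(kept.size() + " kept, furthest = " + kept.peek());
  }
}
```

Because the pop happens only when the queue already holds exactly ef entries, the size never exceeds ef regardless of the queue's own capacity, and the remaining entries are the ef closest distances seen so far.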
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

