kaivalnp commented on code in PR #15607:
URL: https://github.com/apache/lucene/pull/15607#discussion_r2743358635


##########
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java:
##########
@@ -470,9 +476,20 @@ static void popToScratch(GraphBuilderKnnCollector candidates, NeighborArray scra
    */
   private boolean diversityCheck(float score, NeighborArray neighbors, RandomVectorScorer scorer)
       throws IOException {
-    for (int i = 0; i < neighbors.size(); i++) {
-      float neighborSimilarity = scorer.score(neighbors.nodes()[i]);
-      if (neighborSimilarity >= score) {
+    final int bulkScoreChunk = Math.min((neighbors.size() + 1) / 2, bulkScoreNodes.length);
+    int scored = 0;
+    for (scored = 0; scored < neighbors.size(); scored += bulkScoreChunk) {
+      int chunkSize = Math.min(bulkScoreChunk, neighbors.size() - scored);
+      System.arraycopy(neighbors.nodes(), scored, bulkScoreNodes, 0, chunkSize);
+      if (scorer.bulkScore(bulkScoreNodes, bulkScores, chunkSize) >= score) {
+        return false;
+      }
+    }
+    // handle a tail
+    if (scored < neighbors.size()) {
+      int chunkSize = neighbors.size() - scored;
+      System.arraycopy(neighbors.nodes(), scored, bulkScoreNodes, 0, chunkSize);
+      if (scorer.bulkScore(bulkScoreNodes, bulkScores, chunkSize) >= score) {
         return false;
       }

Review Comment:
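   I don't think we need this tail: the loop above already computes
   `Math.min(bulkScoreChunk, neighbors.size() - scored)`, so its last iteration
   bulk-scores whatever remains (the second argument is the smaller one once
   fewer than `bulkScoreChunk` neighbors are left). When the loop exits,
   `scored >= neighbors.size()`, so the tail branch can never run.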
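
   To illustrate the point about the loop, here is a minimal stand-alone sketch, not Lucene code. The class and method names are hypothetical; only the chunk arithmetic is copied from the diff. It records the chunk sizes the loop would pass to the bulk scorer, assuming a buffer of length 8.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch, not Lucene code: class and method names are hypothetical.
public class ChunkedLoopSketch {

  // Returns the chunk sizes the loop from the diff would hand to a bulk scorer.
  public static List<Integer> chunkSizes(int neighborCount, int bufferLength) {
    List<Integer> sizes = new ArrayList<>();
    // Same chunk computation as the diff: about half the neighbors, capped by the buffer.
    final int bulkScoreChunk = Math.min((neighborCount + 1) / 2, bufferLength);
    for (int scored = 0; scored < neighborCount; scored += bulkScoreChunk) {
      // On the last iteration the second argument is the smaller one,
      // so a partial tail is already covered inside the loop.
      sizes.add(Math.min(bulkScoreChunk, neighborCount - scored));
    }
    return sizes;
  }

  public static void main(String[] args) {
    for (int n : new int[] {0, 1, 5, 16, 19}) {
      List<Integer> sizes = chunkSizes(n, 8);
      int total = sizes.stream().mapToInt(Integer::intValue).sum();
      System.out.println(n + " neighbors -> chunks " + sizes + ", total " + total);
    }
  }
}
```

   In every case the chunk sizes sum to the neighbor count, which suggests a separate tail block after the loop would have nothing left to score.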



##########
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java:
##########
@@ -156,6 +158,10 @@ protected HnswGraphBuilder(
     this.hnsw = hnsw;
     this.hnswLock = hnswLock;
     this.graphSearcher = graphSearcher;
+    // pick a number that keeps us from scoring TOO much for diversity checking
+    // but enough to take advantage of bulk scoring
+    this.bulkScoreNodes = new int[8];
+    this.bulkScores = new float[8];

Review Comment:
   nit: maybe hoist `8` into a `private static final` constant?
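
   One possible shape for this, as a stand-alone sketch rather than the actual `HnswGraphBuilder`: the constant name `DIVERSITY_BULK_SCORE_SIZE` and the enclosing class are hypothetical, and only the two field names and the value 8 come from the diff.

```java
// Stand-alone sketch, not the real HnswGraphBuilder: only the two field names
// and the value 8 come from the diff; everything else is hypothetical.
public class BulkScoreBuffersSketch {

  // Hypothetical constant name. Small enough to avoid scoring too many
  // neighbors during diversity checks, large enough to benefit from bulk scoring.
  private static final int DIVERSITY_BULK_SCORE_SIZE = 8;

  public final int[] bulkScoreNodes;
  public final float[] bulkScores;

  public BulkScoreBuffersSketch() {
    // Both scratch buffers share one size, so they cannot drift apart.
    this.bulkScoreNodes = new int[DIVERSITY_BULK_SCORE_SIZE];
    this.bulkScores = new float[DIVERSITY_BULK_SCORE_SIZE];
  }
}
```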



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

