jmazanec15 commented on code in PR #12050:
URL: https://github.com/apache/lucene/pull/12050#discussion_r1061916872


##########
lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java:
##########
@@ -94,36 +93,83 @@ public int size() {
   }
 
   /**
-   * Add node on the given level
+   * Add node on the given level. Nodes can be inserted out of order, but it 
requires that the nodes

Review Comment:
   Im not sure on this. For random insertion for the graph, I think a BST would 
be better. 
   
   However, the insertion pattern for merge typically wont be random. It will 
be more like first, nodes [15-73] are inserted, and then nodes [0-14] and then 
nodes [74-100]. This assumes that the MergedVectorValues are a concatenation of 
the segments to be merged vectors. For this, I added the small optimization of 
the "lastAddedPosInLayer" list, which will skip binary search during locally 
ordered insertion.
   
   That being said, I am not sure that the "concatenation" property is 
guaranteed, specifically in the case when the mergeState.needsIndexSort == 
true. Given this case, it seems like insertion pattern might be more random.
   
   All that being said, do you think it would be better to build for the 
concatenation pattern or more random insert pattern?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to