jmazanec15 commented on code in PR #12050: URL: https://github.com/apache/lucene/pull/12050#discussion_r1061916872
########## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ########## @@ -94,36 +93,83 @@ public int size() { } /** - * Add node on the given level + * Add node on the given level. Nodes can be inserted out of order, but it requires that the nodes Review Comment: Im not sure on this. For random insertion for the graph, I think a BST would be better. However, the insertion pattern for merge typically wont be random. It will be more like first, nodes [15-73] are inserted, and then nodes [0-14] and then nodes [74-100]. This assumes that the MergedVectorValues are a concatenation of the segments to be merged vectors. For this, I added the small optimization of the "lastAddedPosInLayer" list, which will skip binary search during locally ordered insertion. That being said, I am not sure that the "concatenation" property is guaranteed, specifically in the case when the mergeState.needsIndexSort == true. Given this case, it seems like insertion pattern might be more random. All that being said, do you think it would be better to build for the concatenation pattern or more random insert pattern? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org