benwtrent commented on code in PR #12651:
URL: https://github.com/apache/lucene/pull/12651#discussion_r1359458954
##########
lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java:
##########
@@ -163,45 +185,66 @@ public NodesIterator getNodesOnLevel(int level) {
if (level == 0) {
return new ArrayNodesIterator(size());
} else {
- return new CollectionNodesIterator(graphUpperLevels.get(level).keySet());
+ generateLevelToNodes();
+ return new CollectionNodesIterator(levelToNodes[level]);
}
}
+ @SuppressWarnings({"unchecked", "rawtypes"})
+ private void generateLevelToNodes() {
Review Comment:
I was also worried about the performance here, but I see this is only used
sparingly in two places (and neither update the graph afterwards).
++
##########
lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java:
##########
@@ -40,31 +41,39 @@ public final class OnHeapHnswGraph extends HnswGraph
implements Accountable {
// to vectors
// added to HnswBuilder, and the node values are the ordinals of those
vectors.
// Thus, on all levels, neighbors expressed as the level 0's nodes' ordinals.
- private final List<NeighborArray> graphLevel0;
- // Represents levels 1-N. Each level is represented with a Map that maps a
levels level 0
- // ordinal to its neighbors on that level. All nodes are in level 0, so we
do not need to maintain
- // it in this list. However, to avoid changing list indexing, we always will
make the first
- // element
- // null.
- private final List<Map<Integer, NeighborArray>> graphUpperLevels;
- private final int nsize;
- private final int nsize0;
+ private NeighborArray[][] graph;
Review Comment:
could you add some comments here around the format. It's non trivial at
first glance and it would be nice to explain it.
Especially why we even need `nonZeroLevelSize` (its only used for tracking
memory from what I can tell)
I really like how its `graph[node_id][level] =neighbors`
I was originally worried about space as not all nodes are on all levels.
But, the simple change in requiring nodes to be added from their top level
first is really nice.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]