[ 
https://issues.apache.org/jira/browse/LUCENE-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Sokolov updated LUCENE-9626:
------------------------------------
    Description: 
I ran some KNN tests constructing an index under the profiler. 

||function  || percent CPU ||
|---------------|-----------------------|
|dotProduct| 28%|
|PriorityQueue<Neighbor>.insertWithOverflow| 13% + 4%|
|  PriorityQueue.lessThan| 10%|
|TreeSet.add| 4% + 4%|
|HashSet.add| 7% (visited list?) + 2%|
|BoundedVectorValues.vectorValue| 6%|
|HnswGraph.getNeighbors| 6%|
|HashSet.init| 3%|

The main cost, as we'd expect, is computing dot products, but we also spend a 
lot of time in the various collections. We do not need a {TreeSet} (used to 
keep a candidate list); a heap is enough for that. We should also be able to 
improve the {PriorityQueue<Neighbor>} times by switching to a native int heap 
({lessThan} will be faster, at least). And I also noticed in the profiler that 
we do a lot of autoboxing of Integers today, which we can start to reduce to 
save on garbage.

The idea of this issue is that instead of maintaining a priority queue of 
Neighbor objects (node id, score) for each node in the graph, we maintain two 
parallel arrays: one for node ids and one for scores. These can be 
pre-allocated to max-connections, or perhaps to half of that and then grown, 
since we see that on average fanout is about half of max-connections.

Then we can reimplement {Neighbors}, which is currently a 
{PriorityQueue<Neighbor>}, as an integer heap, encoding both the score (as a 
half-width float sortable bits), and the index into the parallel arrays of the 
node (as a short) in the same integer value, using the score as the high bits 
so that priority queue sorting is correct.

Future issues can tackle replacing the visited {HashSet<Integer>} with some 
more efficient data structure - perhaps a {SparseBitSet} or native int hash set 
of some sort. 



> Represent HNSW neighbors with primitive arrays instead of Neighbor Objects
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-9626
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9626
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael Sokolov
>            Priority: Major
>
> I ran some KNN tests constructing an index under the profiler. 
> ||function  || percent CPU ||
> |---------------|-----------------------|
> |dotProduct| 28%|
> |PriorityQueue<Neighbor>.insertWithOverflow| 13% + 4%|
> |  PriorityQueue.lessThan| 10%|
> |TreeSet.add| 4% + 4%|
> |HashSet.add| 7% (visited list?) + 2%|
> |BoundedVectorValues.vectorValue| 6%|
> |HnswGraph.getNeighbors| 6%|
> |HashSet.init| 3%|
> The main cost, as we'd expect, is computing dot products, but we also spend a 
> lot of time in the various collections. We do not need a {TreeSet} (used to 
> keep a candidate list); a heap is enough for that. We should also be able to 
> improve the {PriorityQueue<Neighbor>} times by switching to a native int heap 
> ({lessThan} will be faster, at least). And I also noticed in the profiler 
> that we do a lot of autoboxing of Integers today, which we can start to 
> reduce to save on garbage.
> The idea of this issue is that instead of maintaining a priority queue of 
> Neighbor objects (node id, score) for each node in the graph, we maintain two 
> parallel arrays: one for node ids and one for scores. These can be 
> pre-allocated to max-connections, or perhaps to half of that and then grown, 
> since we see that on average fanout is about half of max-connections.
> Then we can reimplement {Neighbors}, which is currently a 
> {PriorityQueue<Neighbor>}, as an integer heap, encoding both the score (as a 
> half-width float sortable bits), and the index into the parallel arrays of 
> the node (as a short) in the same integer value, using the score as the high 
> bits so that priority queue sorting is correct.
> Future issues can tackle replacing the visited {HashSet<Integer>} with some 
> more efficient data structure - perhaps a {SparseBitSet} or native int hash 
> set of some sort. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to