Julie Tibshirani created LUCENE-10318:
-----------------------------------------

             Summary: Reuse HNSW graphs when merging segments?
                 Key: LUCENE-10318
                 URL: https://issues.apache.org/jira/browse/LUCENE-10318
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Julie Tibshirani


Currently when merging segments, the HNSW vectors format rebuilds the entire 
graph from scratch. In general, building these graphs is very expensive, and 
it'd be nice to optimize it in any way we can. I was wondering if during merge, 
we could choose the largest segment with no deletes, and load its HNSW graph 
into heap. Then we'd add vectors from the other segments to this graph through 
the normal build process. Since the vectors already in that graph would not 
need to be re-inserted, this could cut down on the number of operations we 
need to perform when building the graph.
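
To make the idea concrete, here is a minimal sketch of the selection step and 
of the potential savings. It is not Lucene code: SegmentStats, chooseSeed and 
insertsNeeded are hypothetical names used only for illustration, and the 
sketch assumes that the cost of a merge is roughly proportional to the number 
of vectors inserted into the graph.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class MergeSeedSketch {

    /** Hypothetical per-segment summary; not a Lucene class. */
    static final class SegmentStats {
        final String name;
        final int vectorCount;   // all vectors in the segment, including deleted ones
        final int deletedCount;  // vectors belonging to deleted documents

        SegmentStats(String name, int vectorCount, int deletedCount) {
            this.name = name;
            this.vectorCount = vectorCount;
            this.deletedCount = deletedCount;
        }
    }

    /** Picks the largest segment with no deletes, if one exists. */
    static Optional<SegmentStats> chooseSeed(List<SegmentStats> segments) {
        return segments.stream()
            .filter(s -> s.deletedCount == 0)
            .max(Comparator.comparingInt(s -> s.vectorCount));
    }

    /** Counts graph insertions left to do once the seed segment's graph is reused. */
    static long insertsNeeded(List<SegmentStats> segments) {
        long live = segments.stream()
            .mapToLong(s -> (long) s.vectorCount - s.deletedCount)
            .sum();
        long reused = chooseSeed(segments)
            .map(s -> (long) s.vectorCount)
            .orElse(0L); // no eligible seed: rebuild from scratch
        return live - reused;
    }

    public static void main(String[] args) {
        List<SegmentStats> segments = List.of(
            new SegmentStats("_0", 1_000_000, 0),
            new SegmentStats("_1", 200_000, 5_000),
            new SegmentStats("_2", 50_000, 0));
        System.out.println("seed: "
            + chooseSeed(segments).map(s -> s.name).orElse("none"));
        System.out.println("inserts needed: " + insertsNeeded(segments));
    }
}
```

If every segment has deletes, chooseSeed returns an empty Optional and the 
sketch falls back to inserting all live vectors, which corresponds to the 
current rebuild-from-scratch behavior.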

This is just an early idea; I haven't run experiments to see if it would help. 
I'd guess that whether it helps would also depend on details of the MergePolicy.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

