Re: [I] Make HNSW merges faster [lucene]

via GitHub Wed, 25 Feb 2026 09:47:30 -0800


mccullocht commented on issue #12440:
URL: https://github.com/apache/lucene/issues/12440#issuecomment-3960932305


   Looking at the two-graph merge in the paper, it seems the idea is to seed 
the searches performed during merge for most vectors. This is necessary but 
doesn't seem sufficient: seeding avoids navigation (you could probably start 
directly from the l0 graph), but unless the graph is poorly constructed, 
profiles and counting stats suggest that navigation is not a big fraction of 
the cost. Even if the initial candidate list produced by seeding is the correct 
top N, we still have to visit each of those vertices and score all of their 
edges. I suppose that if we process everything belonging to some pivot X at 
once, we could read all the necessary vectors into a buffer and score them 
linearly for every "query", which might be a large win in practice. But what we 
really want is a lower budget or some other early termination for the seeded 
searches.
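
To make the buffering idea concrete, here is a rough sketch of what "read the
vectors for pivot X into a buffer and score linearly" could look like. This is
only an illustration under assumptions: `PivotBatchScorer`, `gather`, and
`scoreAll` are hypothetical names, not Lucene APIs, vectors are plain
`float[]` arrays, and the similarity is a plain dot product.

```java
/** Hypothetical sketch, not Lucene code: batch scoring for one pivot. */
public class PivotBatchScorer {

  /** Copies the vectors named by ords into one contiguous row-major buffer. */
  static float[] gather(float[][] vectors, int[] ords, int dim) {
    float[] buf = new float[ords.length * dim];
    for (int i = 0; i < ords.length; i++) {
      System.arraycopy(vectors[ords[i]], 0, buf, i * dim, dim);
    }
    return buf;
  }

  /** Dot product of query with the dim-length slice of buf starting at offset. */
  static float dot(float[] query, float[] buf, int offset, int dim) {
    float sum = 0f;
    for (int d = 0; d < dim; d++) {
      sum += query[d] * buf[offset + d];
    }
    return sum;
  }

  /**
   * Scores every query assigned to the pivot against every buffered vector.
   * Each query makes one sequential pass over the buffer, so the candidate
   * vectors are fetched once per pivot rather than once per search.
   */
  static float[][] scoreAll(float[][] queries, float[] buf, int count, int dim) {
    float[][] scores = new float[queries.length][count];
    for (int q = 0; q < queries.length; q++) {
      for (int i = 0; i < count; i++) {
        scores[q][i] = dot(queries[q], buf, i * dim, dim);
      }
    }
    return scores;
  }
}
```

Note that this only changes the memory access pattern. It does not reduce the
number of scores computed, which is why a lower budget or early termination
still matters.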
   
   (Haven't looked at the graph merge order yet but thanks for the share Ben).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

