mccullocht commented on issue #12440: URL: https://github.com/apache/lucene/issues/12440#issuecomment-3960932305
Looking at the two-graph merge in the paper, it seems the idea is to seed the searches performed during merge for most vectors. This is necessary but doesn't seem sufficient: you avoid navigation (you could probably start from the level-0 graph), but unless the graph is poorly constructed, profiles and counting stats suggest that navigation is not a big fraction of the cost. Even if the initial candidate list produced by seeding is the correct top N, we still have to visit each of those vertices and score all of their edges.

I suppose if we processed everything belonging to some pivot X at once, we could read all the necessary vectors into a buffer and score them linearly for every "query", which might be a large win in practice. But what we really want is a lower budget or some other form of early termination for the seeded searches.

(Haven't looked at the graph merge order yet, but thanks for the share, Ben.)
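To make the "read everything for pivot X into a buffer and score linearly" idea concrete, here is a minimal sketch in plain Java. It is not Lucene code and does not use any Lucene API: the class `PivotBatchScoring`, its methods, and the choice of dot product as the similarity are all assumptions made for illustration. It only shows the memory-access pattern (gather once, then scan the buffer sequentially for each query), not the seeded search or any early termination.

```java
import java.util.Arrays;

/** Hypothetical sketch of batched scoring for one pivot; not part of Lucene. */
final class PivotBatchScoring {

  /** Copies the vectors for the given ordinals into one contiguous row-major buffer. */
  static float[] gather(float[][] vectors, int[] ords, int dim) {
    float[] buffer = new float[ords.length * dim];
    for (int i = 0; i < ords.length; i++) {
      System.arraycopy(vectors[ords[i]], 0, buffer, i * dim, dim);
    }
    return buffer;
  }

  /** Dot product of the query against row {@code row} of the buffer (assumed similarity). */
  static float dot(float[] query, float[] buffer, int row, int dim) {
    float sum = 0f;
    int base = row * dim;
    for (int d = 0; d < dim; d++) {
      sum += query[d] * buffer[base + d];
    }
    return sum;
  }

  /** Scores every query against every buffered vector; result[q][i] is query q vs. row i. */
  static float[][] scoreAll(float[][] queries, float[] buffer, int count, int dim) {
    float[][] scores = new float[queries.length][count];
    for (int q = 0; q < queries.length; q++) {
      // Sequential scan over the buffer: no per-vector random reads after the gather.
      for (int i = 0; i < count; i++) {
        scores[q][i] = dot(queries[q], buffer, i, dim);
      }
    }
    return scores;
  }

  public static void main(String[] args) {
    int dim = 2;
    float[][] vectors = {{1f, 0f}, {0f, 1f}, {1f, 1f}, {2f, 3f}};
    int[] candidateOrds = {3, 1}; // ordinals that the searches for this pivot need to score
    float[][] queries = {{1f, 1f}, {0f, 2f}}; // vectors assigned to the same pivot
    float[] buffer = gather(vectors, candidateOrds, dim);
    float[][] scores = scoreAll(queries, buffer, candidateOrds.length, dim);
    System.out.println(Arrays.deepToString(scores));
  }
}
```

The gather cost is paid once per pivot rather than once per search, which is where the win would come from if many "queries" share candidates. Every buffered vector is still scored for every query, so this does not address the budget concern above.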
