atris commented on PR #15613:
URL: https://github.com/apache/lucene/pull/15613#issuecomment-3843331582

> IIUC this doesn't attempt to balance the size of the clusters at all, which is an important property of SPANN because it means that selecting N centroids results in a ~fixed amount of work.

Yeah, the current optimization relies on weighted reclustering to handle distribution changes, but you're right: it doesn't explicitly enforce balancing, so we could see some drift if segments are heavily skewed. I'd prefer to address that in a follow-up if it becomes a real issue (e.g., by adding a splitting pass), rather than touching the merge logic right now.
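
For illustration only, here is a rough sketch of what a "splitting pass" could look like. It is not code from this PR or from Lucene: the class `ClusterSplitSketch`, the method `splitOversized`, and the `maxSize` cap are all hypothetical names. It also swaps in a simple median split along the highest-variance dimension rather than a real k-means style split, purely to keep the example short and deterministic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch; not part of Lucene or of this PR. */
public class ClusterSplitSketch {

  /**
   * Repeatedly halves any cluster holding more than maxSize vectors,
   * so every returned cluster has at most maxSize members.
   */
  public static List<List<float[]>> splitOversized(List<List<float[]>> clusters, int maxSize) {
    if (maxSize < 1) {
      throw new IllegalArgumentException("maxSize must be >= 1");
    }
    List<List<float[]>> result = new ArrayList<>();
    Deque<List<float[]>> pending = new ArrayDeque<>(clusters);
    while (!pending.isEmpty()) {
      List<float[]> cluster = pending.pop();
      if (cluster.size() <= maxSize) {
        result.add(cluster); // already within the cap, keep as-is
        continue;
      }
      // Assumption: a median cut on the widest dimension stands in for
      // a proper 2-means split. Both halves are non-empty because
      // cluster.size() > maxSize >= 1 implies at least two vectors.
      int dim = widestDimension(cluster);
      List<float[]> sorted = new ArrayList<>(cluster);
      sorted.sort(Comparator.comparingDouble(v -> v[dim]));
      int mid = sorted.size() / 2;
      pending.push(new ArrayList<>(sorted.subList(mid, sorted.size())));
      pending.push(new ArrayList<>(sorted.subList(0, mid)));
    }
    return result;
  }

  /** Returns the index of the dimension with the largest variance. */
  static int widestDimension(List<float[]> cluster) {
    int dims = cluster.get(0).length;
    int best = 0;
    double bestVar = -1;
    for (int d = 0; d < dims; d++) {
      double sum = 0;
      double sumSq = 0;
      for (float[] v : cluster) {
        sum += v[d];
        sumSq += (double) v[d] * v[d];
      }
      double mean = sum / cluster.size();
      double var = sumSq / cluster.size() - mean * mean;
      if (var > bestVar) {
        bestVar = var;
        best = d;
      }
    }
    return best;
  }
}
```

A cap like this would bound the work per selected centroid, which is the property the quoted comment is asking about. Where such a pass would run relative to the merge logic is left open here.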
