iprithv commented on PR #15982: URL: https://github.com/apache/lucene/pull/15982#issuecomment-4493679638
> @iprithv Thanks for iterating here and looks ready. I think I understand now what was my confusion around double accounting(which we are not doing apparently). I added a small comment above to mention that in a inline code comment. > > Could you also run [luceneutil benchmarks](https://github.com/mikemccand/luceneutil/) with you PR to see how this impacts the vector indexing or merging etc? sure, done. added the comment. this change is just fixing ram accounting, no changes to actual indexing or merge logic. for float vectors, nothing changes. for byte vectors, we were overcounting memory before (around 4x), so now it just reports the correct usage. this mainly helps avoid early flushes from IndexWriter. so indexing/merging performance shouldn’t really change. I anyhow ran KnnGraphTester with 50k cohere vectors (1024d, 8 threads, hnsw): index time main → 5.31 sec this PR → 5.09 sec no regression. both runs behave the same (same segments, same merges). as expected since this is only a ram accounting fix. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
