Re: [PR] Fix undercounting of RAM used by vectors buffered in in-memory segments [lucene]

via GitHub Tue, 19 May 2026 18:27:45 -0700


iprithv commented on PR #15982:
URL: https://github.com/apache/lucene/pull/15982#issuecomment-4493679638


   > @iprithv Thanks for iterating here and looks ready. I think I understand 
now what was my confusion around double accounting(which we are not doing 
apparently). I added a small comment above to mention that in a inline code 
comment.
   > 
   > Could you also run [luceneutil 
benchmarks](https://github.com/mikemccand/luceneutil/) with you PR to see how 
this impacts the vector indexing or merging etc?
   
   sure, done. added the comment.
   
   this change is just fixing ram accounting, no changes to actual indexing or 
merge logic. for float vectors, nothing changes. for byte vectors, we were 
overcounting memory before (around 4x), so now it just reports the correct 
usage. this mainly helps avoid early flushes from IndexWriter. so 
indexing/merging performance shouldn’t really change.
   
   I anyhow ran KnnGraphTester with 50k cohere vectors (1024d, 8 threads, hnsw):
   index time
   main → 5.31 sec
   this PR → 5.09 sec
   
   no regression. both runs behave the same (same segments, same merges). as 
expected since this is only a ram accounting fix. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Fix undercounting of RAM used by vectors buffered in in-memory segments [lucene]

Reply via email to