Re: [PR] Improve BytesRefHash.add performance by optimize rehash operation [lucene]

via GitHub Wed, 04 Mar 2026 04:43:35 -0800


mikemccand commented on PR #15779:
URL: https://github.com/apache/lucene/pull/15779#issuecomment-3997319873


   > If we change it to "store the low bits for fingerprint," the first k bits
   > overlap with the bucket location, essentially wasting k bits of information.
   
   Wait -- would we actually duplicate hash bits in this approach?  The bucket
   location is the lowest k bits of the hash, and the fingerprint stored in the
   high unused bits of ids would be the next m bits above them, not overlapping
   the k bucket bits.  Then we do not lose any hash bits (up to 32-k bits are
   still available for the fingerprint), and I think we can avoid recomputing
   the hash of the keys during rehash.
   
   Really, during rehash, we just need one more bit of each hash: the lowest
   bit of the fingerprint.  It tells us whether the bucket location in the new
   table is the same spot in the bottom half of the new table (bit is 0) or the
   same spot in the "top half" (bit is 1: spot + hashTableSize/2, where
   hashTableSize is the size of the new table).
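
A minimal sketch of the arithmetic behind that last point, assuming a power-of-two table that doubles on rehash and a fingerprint made of the hash bits directly above the k bucket bits. The class and method names are hypothetical and are not taken from BytesRefHash. The sketch only covers the home bucket of each hash and ignores collisions and probing.

```java
// Hypothetical illustration only: not the BytesRefHash implementation.
final class RehashBitSketch {

    // Home bucket in a power-of-two table: the lowest k bits of the hash.
    static int bucket(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }

    // Fingerprint: the hash bits directly above the k bucket bits,
    // so it shares no bits with the bucket location.
    static int fingerprint(int hash, int tableSize) {
        int k = Integer.numberOfTrailingZeros(tableSize);
        return hash >>> k;
    }

    // Home bucket after the table doubles, derived from the old bucket and
    // the lowest fingerprint bit alone, without touching the key or its hash.
    static int bucketAfterDoubling(int oldBucket, int fingerprint, int oldTableSize) {
        return oldBucket + (fingerprint & 1) * oldTableSize;
    }

    public static void main(String[] args) {
        int oldSize = 16;
        int[] hashes = {0x12345678, 0xCAFEBABE, -1, 0, 0x7FFFFFFF};
        for (int h : hashes) {
            int fromBit = bucketAfterDoubling(
                bucket(h, oldSize), fingerprint(h, oldSize), oldSize);
            int recomputed = bucket(h, 2 * oldSize);
            System.out.println(Integer.toHexString(h) + " -> " + fromBit
                + " (recomputed from full hash: " + recomputed + ")");
        }
    }
}
```

Here `oldTableSize` equals hashTableSize/2 when hashTableSize is the size of the new table, so the two formulations agree.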


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


