tyronecai commented on PR #15779:
URL: https://github.com/apache/lucene/pull/15779#issuecomment-3997398070

   > > If we change it to "store the low bits for fingerprint," the first k 
bits overlap with the bucket location, essentially wasting k bits of 
information.
   > 
   > Wait -- we would not duplicate the hash bits in this approach? Bucket 
location is lower k bits, then store the next m lower bits (not overlapping 
with the k bits) in the high unused bits of ids (fingerprint)? Then we do not 
lose any hash bits (still 32-k bits used for fingerprint) and I think we can 
avoid recomputing hash of keys during rehash.
   > 
   > Really, during rehash, we just need one more bit (the lowest bit of the 
fingerprint) of each hash. It tells us whether bucket location in the new table 
is the same spot (0 bit) in bottom half of the new table, or the same spot in 
the "top half" (spot + hashTableSize/2).
   
   Let me make sure I understand what you're saying.
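
As I read the quoted proposal, the bit arithmetic could be sketched roughly as below. This is only an illustration, not code from the PR: the class and method names are hypothetical, `k` and `m` are the bucket and fingerprint widths from the discussion (with `k + m <= 32` and `m < 32`), and it assumes every entry sits in its home bucket, so displacement caused by probing is ignored.

```java
// Hypothetical sketch: none of these names exist in Lucene.
class FingerprintRehashSketch {

  /** Home bucket: the low k bits of the hash (table size is 1 << k). */
  static int slot(int hash, int k) {
    return hash & ((1 << k) - 1);
  }

  /** Fingerprint: the m bits directly above the k bucket bits, so the two never overlap. */
  static int fingerprint(int hash, int k, int m) {
    return (hash >>> k) & ((1 << m) - 1);
  }

  /**
   * Bucket after the table doubles from (1 << k) to (1 << (k + 1)) slots.
   * Only the lowest fingerprint bit is needed: 0 keeps the same spot,
   * 1 moves the entry to the same spot in the top half (oldSlot + oldSize).
   */
  static int slotAfterDoubling(int oldSlot, int fingerprint, int k) {
    return oldSlot | ((fingerprint & 1) << k);
  }

  public static void main(String[] args) {
    int k = 4;
    int m = 8;
    int[] hashes = {0x12345678, 0xCAFEBABE, -1, 0, 0x7FFFFFFF};
    for (int hash : hashes) {
      int oldSlot = slot(hash, k);
      int fp = fingerprint(hash, k, m);
      int newSlot = slotAfterDoubling(oldSlot, fp, k);
      // Cross-check against recomputing the bucket from the full hash.
      if (newSlot != slot(hash, k + 1)) {
        throw new AssertionError("mismatch for hash " + Integer.toHexString(hash));
      }
      System.out.println(Integer.toHexString(hash) + ": " + oldSlot + " -> " + newSlot);
    }
  }
}
```

After a doubling, the consumed bit would be shifted out of the stored fingerprint (`fp >>> 1`), so each resize shortens the fingerprint by one bit unless the key is rehashed.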
   
   > Whereas if you only stepped through the pages directly that's a single 
sequential read stream. But, it's an added 1 or 2 byte `vInt` decode, yet, that 
`if` should be trivial for CPU (almost always 1 byte, keys < 128 length vast 
majority of time).
   
   Iterating directly over the data in the pool from within BytesRefHash doesn't 
feel quite right, even though it does avoid one access to bytesStart.
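
For reference, the sequential walk being discussed could look something like the sketch below. It is a simplified illustration, not the actual pool code: the class and method names are hypothetical, the pool is assumed to be a single flat `byte[]` rather than a list of pages, and each key is assumed to carry a 1- or 2-byte length prefix (1 byte when the length is below 128), in the spirit of the `vInt` decode mentioned above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a flat byte[] stands in for the paged byte pool.
class SequentialPoolWalkSketch {

  /** Appends one key with a 1- or 2-byte length prefix and returns the new write offset. */
  static int append(byte[] pool, int offset, byte[] key) {
    int len = key.length; // assumed to be at most 32767 in this sketch
    if (len < 128) {
      pool[offset++] = (byte) len;
    } else {
      pool[offset++] = (byte) (0x80 | (len & 0x7f)); // high bit marks a 2-byte length
      pool[offset++] = (byte) ((len >>> 7) & 0xff);
    }
    System.arraycopy(key, 0, pool, offset, len);
    return offset + len;
  }

  /** Reads every key in one sequential pass, without any per-key offset lookup. */
  static List<String> walk(byte[] pool, int end) {
    List<String> keys = new ArrayList<>();
    int pos = 0;
    while (pos < end) {
      int len = pool[pos++];
      if (len < 0) { // high bit set: a second length byte follows
        len = (len & 0x7f) | ((pool[pos++] & 0xff) << 7);
      }
      keys.add(new String(pool, pos, len, StandardCharsets.UTF_8));
      pos += len;
    }
    return keys;
  }

  public static void main(String[] args) {
    byte[] pool = new byte[64];
    int end = 0;
    end = append(pool, end, "lucene".getBytes(StandardCharsets.UTF_8));
    end = append(pool, end, "hash".getBytes(StandardCharsets.UTF_8));
    System.out.println(walk(pool, end));
  }
}
```

The trade-off is the one raised in the quote: the walk needs an extra branch per key to decode the length, but it touches the pool strictly in order and never consults a separate offsets array.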
   
   I still need to test the effect of this change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

