bkirwi commented on PR #6953:
URL: https://github.com/apache/arrow-rs/pull/6953#issuecomment-2941133915

   In particular, this also fixes behaviour for negative and positive zero in 
floating point. Since these _do_ compare equal but have different bit 
representations, IIUC the old code would combine them into a single dictionary 
entry... but _only_ if they happened to hash to the same bucket. Since the hash 
function is randomly initialized, this was very rare, but I have observed it. 
(Normally the bits will hash to different buckets and they'll have separate 
entries in the dictionary.)
   
   Comparing the bit representation solves this issue as well: the two zeros are 
not bitwise identical, so they reliably end up with different dict entries.
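
   For anyone who wants to see the distinction concretely, here is a minimal 
sketch of bitwise keying. It is not the arrow-rs implementation: 
`intern_by_bits` is a hypothetical helper, and a plain `HashMap` keyed on 
`f64::to_bits` stands in for whatever the real interner does. It only shows 
that `0.0` and `-0.0` compare equal yet get separate entries once the key is 
the bit pattern.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of arrow-rs): assigns a dictionary index to
/// each distinct f64 *bit pattern* and returns (keys, dictionary values).
fn intern_by_bits(values: &[f64]) -> (Vec<usize>, Vec<f64>) {
    // Keyed on the raw bits, so lookup never relies on float `==`.
    let mut index_of: HashMap<u64, usize> = HashMap::new();
    let mut dict: Vec<f64> = Vec::new();
    let mut keys = Vec::with_capacity(values.len());

    for &v in values {
        let next = dict.len();
        let idx = *index_of.entry(v.to_bits()).or_insert(next);
        if idx == next {
            // First time this bit pattern is seen: add a new dictionary entry.
            dict.push(v);
        }
        keys.push(idx);
    }
    (keys, dict)
}
```

   Calling `intern_by_bits(&[0.0, -0.0, 0.0, 1.5])` under these assumptions 
gives three dictionary entries, with `0.0` and `-0.0` kept apart even though 
`0.0 == -0.0` is true.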
   
   Mentioning here in case anyone else is searching for this issue. Thanks for 
the fix!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
