bkirwi commented on PR #6953: URL: https://github.com/apache/arrow-rs/pull/6953#issuecomment-2941133915
In particular, this also fixes the behaviour for negative and positive zero in floating point. Since these _do_ compare equal but have different bit representations, IIUC the old code would combine them into a single dictionary entry... but _only_ if they happened to hash to the same bucket. Since the hash function is randomly initialized, this was very rare, but I have observed it. (Normally the two bit patterns hash to different buckets and get separate entries in the dictionary.) Comparing the bit representation solves this issue as well: the two zeros are not bitwise identical, so they reliably end up with different dict entries. Mentioning it here in case anyone else is searching for this issue. Thanks for the fix!
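For anyone landing here from a search, the zero behaviour described above can be reproduced with plain `f64` values. The sketch below is only an illustration and is not the arrow-rs code: `dedup_by_bits` is a hypothetical helper that builds a tiny dictionary keyed on `f64::to_bits`, assuming that bitwise comparison is the intended rule.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of arrow-rs): dictionary-encode `values`,
/// treating two floats as the same entry only if their bit patterns match.
/// Returns the dictionary values and, for each input, its dictionary index.
fn dedup_by_bits(values: &[f64]) -> (Vec<f64>, Vec<usize>) {
    // Keying on the raw bits avoids relying on `==`, under which 0.0 == -0.0.
    let mut index_of: HashMap<u64, usize> = HashMap::new();
    let mut dict: Vec<f64> = Vec::new();
    let mut keys: Vec<usize> = Vec::with_capacity(values.len());

    for &v in values {
        let next = dict.len();
        let idx = *index_of.entry(v.to_bits()).or_insert_with(|| {
            // First time this bit pattern is seen: append a new entry.
            dict.push(v);
            next
        });
        keys.push(idx);
    }
    (dict, keys)
}
```

With the input `[0.0, -0.0, 0.0, 1.5]`, the two zeros compare equal under `==` but differ in their sign bit, so this sketch always gives them separate entries, regardless of how the hash map is seeded.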
