felipecrv commented on issue #35360:
URL: https://github.com/apache/arrow/issues/35360#issuecomment-1567693454

   Hashing anything other than the validity bitmap buffer can be super tricky,
as the slots under nulls can hold arbitrary values in the value buffers. It would
also require inferring the bit width of each type, so I kept the hashing
simple by hashing only the validity buffers, as it does now.
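
   To illustrate the first point, here is a minimal Python sketch (not Arrow code; the buffers and the `hash_values_naively` helper are made up for the example). Two int8 arrays that are both logically `[1, null, 3]` share the same validity bitmap, but a naive hash over the raw value bytes differs because the null slot holds different leftover bytes.

```python
import hashlib

def hash_values_naively(values: bytes) -> str:
    # Hypothetical helper: hashes the raw value buffer and ignores validity.
    return hashlib.sha256(values).hexdigest()

# Both arrays are logically [1, null, 3]; bit i of the bitmap marks slot i valid.
validity = bytes([0b00000101])
values_a = bytes([1, 0, 3])    # null slot happens to hold 0
values_b = bytes([1, 99, 3])   # null slot holds leftover garbage

# Logically equal arrays, yet the naive value hashes disagree.
naive_hashes_differ = hash_values_naively(values_a) != hash_values_naively(values_b)
```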
   
   The part where I might have overdone things a bit is that I figured out a way to
make a hash function for bitmaps that takes bit offsets into account. It didn't
have to be a rolling hash: it only involves shifts and rotations before the data
is fed into the hash mixing, plus careful handling of leading/trailing bits.
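
   As a rough sketch of the general idea (not the actual implementation from this issue), the Python below realigns each chunk of up to 64 logical bits to bit 0 with a shift, masks off the leading and trailing bits that fall outside the slice, and only then feeds the word into the mixing step. The names `read_word`, `hash_bitmap` and `pack_bits` are hypothetical, the FNV-1a-style mixing is just a stand-in, and the bitmap is assumed to be LSB-first as in Arrow.

```python
MASK64 = (1 << 64) - 1

def pack_bits(bits):
    """Pack a list of 0/1 values into an LSB-first bitmap."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i >> 3] |= 1 << (i & 7)
    return bytes(out)

def read_word(bitmap: bytes, bit_offset: int, nbits: int) -> int:
    """Return nbits (<= 64) bits starting at bit_offset, realigned to bit 0."""
    first = bit_offset >> 3
    last = (bit_offset + nbits + 7) >> 3
    raw = int.from_bytes(bitmap[first:last], "little")
    # The shift drops leading bits before the offset; the mask drops trailing bits.
    return (raw >> (bit_offset & 7)) & ((1 << nbits) - 1)

def hash_bitmap(bitmap: bytes, bit_offset: int, length: int) -> int:
    """Hash `length` logical bits starting at `bit_offset`."""
    h = 0xCBF29CE484222325 ^ length          # seed with the logical length
    pos = 0
    while pos < length:
        n = min(64, length - pos)
        word = read_word(bitmap, bit_offset + pos, n)
        h ^= word
        h = (h * 0x100000001B3) & MASK64     # stand-in mixing, FNV-1a style
        h ^= h >> 29
        pos += n
    return h

logical = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
aligned = pack_bits(logical)
shifted = pack_bits([1, 1, 0] + logical + [1, 1])   # extra bits before and after
same_hash = hash_bitmap(aligned, 0, len(logical)) == hash_bitmap(shifted, 3, len(logical))
```

   Because only the realigned logical bits reach the mixing step, `same_hash` holds for any offset and for any content in the surrounding bits.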


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

