wesm opened a new pull request #7521:
URL: https://github.com/apache/arrow/pull/7521


   This significantly speeds up processing of mostly-not-null or mostly-null 
data, while adding almost no overhead in the other scenarios, where a 
word-sized run of all-not-null or all-null data is rare. For data with 
null_count 0, values are processed in blocks of INT16_MAX at a time, so 
this adds no meaningful overhead in that case either. 
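
   To illustrate the idea, here is a minimal sketch of block-wise validity 
bitmap processing. It is not the code in this PR: `SumValid` and `BlockStats` 
are hypothetical names made up for the example, the bitmap is assumed to be 
LSB-first (bit i set means value i is not null), and a null bitmap pointer is 
assumed to mean null_count 0. 

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical bookkeeping so the block classification can be observed.
struct BlockStats {
  int64_t all_valid_blocks = 0;  // processed without per-bit checks
  int64_t all_null_blocks = 0;   // skipped wholesale
  int64_t mixed_blocks = 0;      // needed per-bit checks
};

// Hypothetical helper (not Arrow's API): sums the non-null entries of
// `values`. `validity` is an LSB-first bitmap covering values.size() bits,
// or nullptr when there are no nulls.
int64_t SumValid(const std::vector<int32_t>& values, const uint8_t* validity,
                 BlockStats* stats) {
  assert(stats != nullptr);
  const int64_t length = static_cast<int64_t>(values.size());
  int64_t sum = 0;
  int64_t pos = 0;

  if (validity == nullptr) {
    // No nulls: walk large blocks and never inspect a bitmap.
    constexpr int64_t kMaxBlock = INT16_MAX;
    while (pos < length) {
      const int64_t n = std::min(kMaxBlock, length - pos);
      for (int64_t i = 0; i < n; ++i) sum += values[pos + i];
      pos += n;
      ++stats->all_valid_blocks;
    }
    return sum;
  }

  // Helper to test a single validity bit.
  auto is_valid = [validity](int64_t i) {
    return ((validity[i / 8] >> (i % 8)) & 1) != 0;
  };

  // Examine one 64-bit word of the bitmap at a time.
  while (length - pos >= 64) {
    uint64_t word;
    std::memcpy(&word, validity + pos / 8, sizeof(word));
    if (word == ~uint64_t{0}) {
      // All 64 values are valid: tight loop with no branching on bits.
      for (int64_t i = 0; i < 64; ++i) sum += values[pos + i];
      ++stats->all_valid_blocks;
    } else if (word == 0) {
      // All 64 values are null: nothing to do.
      ++stats->all_null_blocks;
    } else {
      // Mixed word: fall back to checking each bit.
      for (int64_t i = 0; i < 64; ++i) {
        if (is_valid(pos + i)) sum += values[pos + i];
      }
      ++stats->mixed_blocks;
    }
    pos += 64;
  }

  // Tail shorter than one word: check bit by bit.
  for (; pos < length; ++pos) {
    if (is_valid(pos)) sum += values[pos];
  }
  return sum;
}
```

   In this sketch the only extra cost on mixed data is two word comparisons 
per 64 values, which is consistent with the low overhead described above. 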
   
   I modified the hash benchmarks where this code is used to exhibit both the 
cases that benefit from this optimization and the ones that don't. 

