emkornfield commented on pull request #7143:
URL: https://github.com/apache/arrow/pull/7143#issuecomment-628384930


   @wesm OK, I did a bit more in-depth sampling. It looks like this
new algorithm is a win for 0-5% nulls, then a regression until somewhere
between 45% and 50% nulls, then likely a win again at higher null percentages.
I'll add a special case that estimates which algorithm to use (this one or
one-by-one) based on the percentage of nulls, sampling the first N elements of
the bitmap vector.
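
   For illustration only, the selection heuristic described above could be
sketched roughly as follows. This is not the code in the PR: the function
names, the 1024-element sample size, and the exact 5% / 50% cutoffs are all
assumptions made for the example (the upper cutoff is only known to lie
somewhere between 45% and 50%). It assumes an Arrow-style validity bitmap,
LSB-ordered with a set bit meaning "valid".

```python
def estimate_null_fraction(validity_bitmap: bytes, length: int,
                           sample_size: int = 1024) -> float:
    """Estimate the null fraction from the first `sample_size` bitmap bits.

    Assumes an LSB-ordered validity bitmap where a set bit means "valid".
    `sample_size` is a hypothetical tuning knob, not a value from the PR.
    """
    n = min(length, sample_size)
    if n == 0:
        return 0.0
    full_bytes, rem_bits = divmod(n, 8)
    # Count set (valid) bits in the whole bytes covered by the sample.
    valid = sum(bin(b).count("1") for b in validity_bitmap[:full_bytes])
    if rem_bits:
        # Only the low `rem_bits` bits of the trailing byte are in the sample.
        mask = (1 << rem_bits) - 1
        valid += bin(validity_bitmap[full_bytes] & mask).count("1")
    return 1.0 - valid / n


def choose_algorithm(validity_bitmap: bytes, length: int,
                     low: float = 0.05, high: float = 0.50,
                     sample_size: int = 1024) -> str:
    """Pick "batched" (the new algorithm) or "one_by_one" from the sample.

    The `low` and `high` cutoffs are assumed values for this sketch.
    """
    frac = estimate_null_fraction(validity_bitmap, length, sample_size)
    if frac <= low or frac >= high:
        return "batched"
    return "one_by_one"
```

   Sampling only a prefix keeps the check cheap, at the cost of misjudging
inputs whose nulls are clustered later in the bitmap.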
   
   
![image](https://user-images.githubusercontent.com/17869838/81894035-bdfd0c80-9563-11ea-90f9-e334a9a1666e.png)
   



