iChauster commented on PR #13366:
URL: https://github.com/apache/arrow/pull/13366#issuecomment-1162283321

   > For a filtering operation I think there is an extra parameter which is the selectivity (what percentage of rows are kept). I think it would be valuable to add that as a parameter but it would make test data generation more complicated.
   
   @westonpace one idea I had for benchmarking selectivity is to reuse the
existing 'null_percent/proportion' generators we already have and then use
`is_null` as the filter predicate. The null proportion of the generated column
would then directly control the fraction of rows kept. Let me know if you think
that would be the right approach.
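
   To make the idea concrete, here is a minimal sketch in plain Python (standard library only, not Arrow code). The helper names `make_column`, `filter_is_null`, and `measured_selectivity` are hypothetical and only stand in for the existing null-proportion generator and an `is_null` filter; the point is just that the null proportion passed to the generator ends up being the approximate selectivity of the filter.

```python
import random


def make_column(num_rows, null_proportion, seed=42):
    """Hypothetical stand-in for a null-proportion data generator.

    Each slot is None with probability `null_proportion`,
    otherwise a random integer.
    """
    rng = random.Random(seed)
    return [
        None if rng.random() < null_proportion else rng.randint(0, 1_000_000)
        for _ in range(num_rows)
    ]


def filter_is_null(column):
    """Return the indices of rows kept by an is_null-style predicate."""
    return [i for i, value in enumerate(column) if value is None]


def measured_selectivity(num_rows, null_proportion, seed=42):
    """Fraction of rows that survive the is_null filter."""
    column = make_column(num_rows, null_proportion, seed)
    return len(filter_is_null(column)) / num_rows


# Sweep the null proportion as if it were the benchmark's selectivity parameter.
for proportion in (0.0, 0.1, 0.5, 1.0):
    print(proportion, measured_selectivity(100_000, proportion))
```

   Because the nulls are drawn at random, the measured selectivity only approximates the requested proportion, except at the 0.0 and 1.0 endpoints where it is exact.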
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
