iChauster commented on PR #13366: URL: https://github.com/apache/arrow/pull/13366#issuecomment-1162283321
> For a filtering operation I think there is an extra parameter which is the selectivity (what percentage of rows are kept). I think it would be valuable to add that as a parameter but it would make test data generation more complicated.

@westonpace One idea I had for benchmarking selectivity is to use the existing 'null_percent/proportion' generators we already have, and then use `is_null` as the filter. Let me know if you think that would be the right approach.
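To illustrate the idea, here is a minimal sketch in plain Python (standard library only, not Arrow's actual generators or benchmark code). The names `make_column`, `filter_is_null`, and `selectivity` are hypothetical. It assumes the generator nulls out a fixed share of slots, so a filter that keeps only null rows has a selectivity equal to the null proportion.

```python
import random


def make_column(num_rows, null_proportion, seed=0):
    """Hypothetical generator: integer column where a fixed share of slots is None."""
    rng = random.Random(seed)
    null_count = round(num_rows * null_proportion)
    # Choose exactly `null_count` distinct positions to null out.
    null_slots = set(rng.sample(range(num_rows), null_count))
    return [None if i in null_slots else rng.randint(0, 100) for i in range(num_rows)]


def filter_is_null(column):
    """Return the indices of rows kept by an `is_null`-style predicate."""
    return [i for i, value in enumerate(column) if value is None]


def selectivity(num_rows, null_proportion):
    """Fraction of rows that survive the null filter."""
    column = make_column(num_rows, null_proportion)
    return len(filter_is_null(column)) / num_rows
```

Sweeping `null_proportion` over values such as 0.0, 0.1, 0.5, and 1.0 would then sweep the filter's selectivity over the same values, without needing a separate data generator for the predicate. A generator that nulls each slot independently with a given probability would hit the target selectivity only approximately.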