andygrove commented on issue #808: URL: https://github.com/apache/datafusion-comet/issues/808#issuecomment-2284140087
The `FilterExec` in the above example is even more expensive than the `HashJoinExec`. Evaluating the predicate is cheap, but copying data to the filtered batch takes 99% of the time. We could potentially avoid this copy by using a selection vector approach instead.

```
Time to compute filter mask on batch of 32768 rows is: 581ns
Time to filter batch is: 252.194µs
```
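
To make the difference concrete, here is a minimal sketch of the two strategies using only the Rust standard library. It is not Arrow, DataFusion, or Comet code: `Batch`, `compute_mask`, `filter_copy`, `selection_vector`, and `sum_selected` are hypothetical names invented for illustration, and the columns are plain `Vec<i64>` rather than Arrow arrays. The copying filter touches every column of the batch, whereas the selection vector only records the row indices that passed and leaves the data in place for downstream operators to read through.

```rust
/// A toy columnar batch (hypothetical; stands in for an Arrow RecordBatch).
/// Every column holds the same number of rows.
struct Batch {
    columns: Vec<Vec<i64>>,
}

/// Evaluate the predicate `value > threshold` on one column.
/// This is the cheap step: one pass over a single column.
fn compute_mask(column: &[i64], threshold: i64) -> Vec<bool> {
    column.iter().map(|&v| v > threshold).collect()
}

/// Copying filter: materialize a new batch holding only the selected rows.
/// The cost grows with the number of columns, since each one is copied.
fn filter_copy(batch: &Batch, mask: &[bool]) -> Batch {
    let columns: Vec<Vec<i64>> = batch
        .columns
        .iter()
        .map(|col| {
            col.iter()
                .zip(mask)
                .filter(|&(_, &keep)| keep)
                .map(|(&v, _)| v)
                .collect::<Vec<i64>>()
        })
        .collect();
    Batch { columns }
}

/// Selection vector: record the indices of the selected rows and leave
/// the column data untouched.
fn selection_vector(mask: &[bool]) -> Vec<u32> {
    mask.iter()
        .enumerate()
        .filter(|&(_, &keep)| keep)
        .map(|(i, _)| i as u32)
        .collect()
}

/// A downstream operator that reads the original column through the
/// selection vector instead of requiring a filtered copy.
fn sum_selected(column: &[i64], selection: &[u32]) -> i64 {
    selection.iter().map(|&i| column[i as usize]).sum()
}

fn main() {
    let batch = Batch {
        columns: vec![
            vec![1, 5, 3, 8, 2, 9, 4, 7],
            vec![10, 20, 30, 40, 50, 60, 70, 80],
        ],
    };

    // Predicate on the first column: keep rows where the value exceeds 4.
    let mask = compute_mask(&batch.columns[0], 4);

    // Strategy 1: copy the surviving rows of every column.
    let copied = filter_copy(&batch, &mask);
    let sum_from_copy: i64 = copied.columns[1].iter().sum();

    // Strategy 2: keep the batch as-is and pass along row indices.
    let selection = selection_vector(&mask);
    let sum_from_selection = sum_selected(&batch.columns[1], &selection);

    // Both strategies must agree on the result.
    assert_eq!(sum_from_copy, sum_from_selection);
    println!("selected rows: {:?}", selection);
    println!("sum of second column over selected rows: {}", sum_from_selection);
}
```

The trade-off in this sketch is that every downstream operator has to understand the selection vector, and reads through it are no longer contiguous, so whether it wins in practice would depend on filter selectivity and on how many operators follow the filter.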
