rluvaton commented on PR #18873:
URL: https://github.com/apache/datafusion/pull/18873#issuecomment-3599124674
You piqued my interest about why this is slow.
A couple of questions:
1. What are the sizes of the left and right boolean buffers? Maybe they are
very large and each copy is expensive.
2. Who produces the selection masks?
Ideas:
1. You could try to reuse the same buffer when combining selection masks and
thus avoid a copy every time.
3. Keep track of an estimate of how many true values exist in the selection
mask for each `and_then`.
When there are many true values and the right selection mask is large, you
should work in chunks rather than bit by bit.
4. Keep some kind of data structure that lets you track whether it is better to
do
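
To make ideas 1 and 3 a bit more concrete, here is a rough sketch using only the standard library. Everything in it is hypothetical: `Mask`, `and_then_in_place` and `read_word` are names I made up for illustration, not arrow or DataFusion APIs. It also assumes that `and_then` means "the right mask has one bit per true bit of the left mask and decides which of those stay selected", which may not match what the PR does. The left mask is refined in place (no new buffer per call), a cached true count gives cheap fast paths, and fully set 64-bit words are handled as a whole chunk instead of bit by bit.

```rust
/// Hypothetical selection mask: bits packed into u64 words plus a cached
/// count of set bits. Padding bits in the last word are always zero.
#[derive(Debug, Clone, PartialEq)]
struct Mask {
    words: Vec<u64>,
    len: usize,        // number of valid bits
    true_count: usize, // cached number of set bits
}

impl Mask {
    fn from_bools(bits: &[bool]) -> Self {
        let mut words = vec![0u64; (bits.len() + 63) / 64];
        let mut true_count = 0;
        for (i, &b) in bits.iter().enumerate() {
            if b {
                words[i / 64] |= 1u64 << (i % 64);
                true_count += 1;
            }
        }
        Mask { words, len: bits.len(), true_count }
    }

    fn get(&self, i: usize) -> bool {
        (self.words[i / 64] >> (i % 64)) & 1 == 1
    }

    /// Read 64 bits starting at bit offset `start` (may be unaligned).
    /// The caller guarantees that `start + 64 <= self.len`.
    fn read_word(&self, start: usize) -> u64 {
        let (w, s) = (start / 64, start % 64);
        if s == 0 {
            self.words[w]
        } else {
            (self.words[w] >> s) | (self.words[w + 1] << (64 - s))
        }
    }

    /// Assumed `and_then` semantics: the k-th set bit of `self` stays set
    /// only if bit k of `right` is set. Reuses `self`'s buffer.
    fn and_then_in_place(&mut self, right: &Mask) {
        assert_eq!(right.len, self.true_count, "right must have one bit per selected row");
        // Fast paths driven by the cached counts.
        if right.true_count == right.len {
            return; // everything selected stays selected
        }
        if right.true_count == 0 {
            self.words.fill(0);
            self.true_count = 0;
            return;
        }
        let mut k = 0; // next bit of `right` to consume
        for word in self.words.iter_mut() {
            if *word == u64::MAX {
                // Chunk path: 64 selected rows map to 64 consecutive right bits.
                *word = right.read_word(k);
                k += 64;
                continue;
            }
            // Bit path: visit only the set bits of this word.
            let mut remaining = *word;
            while remaining != 0 {
                let bit = remaining.trailing_zeros();
                if !right.get(k) {
                    *word &= !(1u64 << bit);
                }
                k += 1;
                remaining &= remaining - 1; // clear the lowest set bit
            }
        }
        self.true_count = right.true_count;
    }
}
```

For example, refining a left mask of `[true, false, true, true, false]` with a right mask of `[true, false, true]` leaves `[true, false, false, true, false]` in the same buffer. Whether the chunk path pays off depends on how dense the left mask is, which is exactly what the cached count could help decide.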
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]