rluvaton commented on PR #18873:
URL: https://github.com/apache/datafusion/pull/18873#issuecomment-3599124674

   You piqued my interest in why this is slow.
   
   A couple of questions:
   1. What are the sizes of the left and right boolean buffers? Maybe they are 
very large and each copy is expensive.
   2. Who produces the selection masks?
   
   
   Ideas:
   1. You could try to reuse the same buffer when combining selection masks and 
thus avoid copying every time.
   2. Keep track of an estimate of how many true values exist in the selection 
mask for each `and_then`.
        For a large number of true values and a large right selection mask, you 
should work in chunks rather than bits.
   3. Keep some kind of data structure that lets you track whether it is better to 
do 
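
To make the first two ideas concrete, here is a minimal sketch, not the PR's code. It assumes both masks cover the same rows and are combined with a plain bitwise AND, which is simpler than whatever `and_then` actually does in the PR. `SelectionMask`, `from_bools`, `and_in_place`, `true_count`, and `get` are hypothetical names made up for illustration. The combine step reuses the left buffer instead of allocating a new one, works on 64-bit words rather than single bits, and refreshes a cached true count in the same pass.

```rust
/// Hypothetical selection mask: one bit per row, packed into 64-bit words.
/// Unused padding bits in the last word are always zero.
#[derive(Debug, Clone, PartialEq)]
pub struct SelectionMask {
    words: Vec<u64>,
    len: usize,        // number of valid bits (rows)
    true_count: usize, // cached number of set bits
}

impl SelectionMask {
    /// Build a mask from a slice of booleans (convenience for the example).
    pub fn from_bools(bits: &[bool]) -> Self {
        let mut words = vec![0u64; (bits.len() + 63) / 64];
        let mut true_count = 0;
        for (i, &b) in bits.iter().enumerate() {
            if b {
                words[i / 64] |= 1u64 << (i % 64);
                true_count += 1;
            }
        }
        Self { words, len: bits.len(), true_count }
    }

    /// AND `other` into `self` without allocating: the existing buffer is
    /// overwritten one 64-bit word at a time, and the cached true count
    /// is recomputed in the same pass with a popcount per word.
    pub fn and_in_place(&mut self, other: &SelectionMask) {
        assert_eq!(self.len, other.len, "masks must cover the same rows");
        let mut count = 0usize;
        for (dst, src) in self.words.iter_mut().zip(&other.words) {
            *dst &= *src;
            count += dst.count_ones() as usize;
        }
        self.true_count = count;
    }

    /// Number of selected rows, available without rescanning the buffer.
    pub fn true_count(&self) -> usize {
        self.true_count
    }

    /// Read a single bit (row `i`).
    pub fn get(&self, i: usize) -> bool {
        assert!(i < self.len, "row index out of bounds");
        (self.words[i / 64] >> (i % 64)) & 1 == 1
    }
}
```

With the count cached, a caller can cheaply decide between strategies (for example, a dense word-wise path versus a sparse per-index path) before combining the next mask. The threshold for that decision is not specified here and would need to be measured.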


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

