korowa commented on issue #2628:
URL: 
https://github.com/apache/arrow-datafusion/issues/2628#issuecomment-1141733468

   I've tried a POC that constructs an intermediate batch and applies the
filter to it in `freeze_buffered_join_streamed`, which seems to be the only
place where filtering is required. While doing so I noticed (please correct me
if I'm mistaken) that, because of how the `freeze_*` functions work, output
ordering can be broken for outer joins: during freezing, joined and non-joined
records from the outer table are appended as separate batches, and no merge or
re-sort happens afterwards, only batch concatenation.
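
   To make the ordering concern concrete, here is a minimal Rust sketch. It is
not DataFusion's code: the `Row` type and the functions `join_separate_batches`,
`join_interleaved`, and `is_sorted_by_key` are hypothetical, and the lookup is a
plain linear scan rather than a real merge. It only models how the output is
assembled for a left outer join on an integer key.

```rust
/// One output row of a left outer join on an integer key (hypothetical model):
/// `(key, Some(payload))` for a match, `(key, None)` for a null-padded row.
type Row = (i32, Option<&'static str>);

/// Assumed simplification of the behaviour described above: matched rows and
/// unmatched rows are collected as two separate batches, then concatenated.
fn join_separate_batches(streamed: &[i32], buffered: &[(i32, &'static str)]) -> Vec<Row> {
    let mut joined: Vec<Row> = Vec::new();
    let mut unmatched: Vec<Row> = Vec::new();
    for &key in streamed {
        let mut found = false;
        for &(bkey, payload) in buffered {
            if bkey == key {
                joined.push((key, Some(payload)));
                found = true;
            }
        }
        if !found {
            unmatched.push((key, None));
        }
    }
    // Plain concatenation: no merge or re-sort between the two batches.
    joined.extend(unmatched);
    joined
}

/// Alternative that emits each streamed row (matched or not) in input order.
fn join_interleaved(streamed: &[i32], buffered: &[(i32, &'static str)]) -> Vec<Row> {
    let mut out: Vec<Row> = Vec::new();
    for &key in streamed {
        let before = out.len();
        for &(bkey, payload) in buffered {
            if bkey == key {
                out.push((key, Some(payload)));
            }
        }
        // Nothing was pushed for this key, so emit the null-padded row here.
        if out.len() == before {
            out.push((key, None));
        }
    }
    out
}

/// True if the rows are in non-decreasing key order.
fn is_sorted_by_key(rows: &[Row]) -> bool {
    rows.windows(2).all(|w| w[0].0 <= w[1].0)
}
```

   With streamed keys `[1, 2, 3, 4]` and buffered rows `[(1, "a"), (3, "c")]`,
the first function yields keys in the order 1, 3, 2, 4, while the second keeps
1, 2, 3, 4.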
   
   I suppose output ordering is quite important for planning (e.g. if we had a
merge/stream/pipe aggregate operator, it could be planned over merge join
output), so I wonder: shouldn't this be fixed before adding the MJ filter? I
guess this fix could significantly change the MJ logic in the places where
filtering is required 🤔
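
   As an illustration of why a downstream operator would care, below is a
small Rust sketch of a streaming count-per-key aggregate. The function
`streaming_count` is hypothetical and does not correspond to any existing
DataFusion operator. It assumes equal keys arrive adjacent to each other, which
holds for sorted input.

```rust
/// Hypothetical streaming aggregate: counts rows per key while keeping only
/// the current group open. It is correct only if equal keys are adjacent.
fn streaming_count(keys: &[i32]) -> Vec<(i32, usize)> {
    let mut out: Vec<(i32, usize)> = Vec::new();
    for &k in keys {
        // Extend the open group if the key is unchanged.
        if let Some(last) = out.last_mut() {
            if last.0 == k {
                last.1 += 1;
                continue;
            }
        }
        // Key changed (or first row): close the previous group, open a new one.
        out.push((k, 1));
    }
    out
}
```

   On sorted keys `[1, 1, 2, 2]` this produces one group per key, but on
`[1, 2, 1, 2]` each key is split into two groups, so a plan that relies on
sorted join output would return wrong results if the ordering is not preserved.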
   
   @yjshen, @richox what do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]