korowa commented on issue #2628: URL: https://github.com/apache/arrow-datafusion/issues/2628#issuecomment-1141733468
I've tried a POC that constructs an intermediate batch and applies the filter to it inside `freeze_buffered_join_streamed`, which seems to be the only place where filtering is required. While doing so I noticed (please correct me if I'm mistaken) that, because of the logic in the `freeze_*` functions, output ordering can be broken for outer joins: while freezing, joined and non-joined records from the outer table are appended as separate batches, and after that no merge or re-sort happens, just batch concatenation.

I suppose output ordering is quite important for planning (e.g. if we had a merge/stream/pipe aggregate operator, it could be planned over the merge join output), so I wonder: shouldn't this be fixed before adding the MJ filter? I guess such a fix could significantly change the MJ logic in the places where filtering is required 🤔

@yjshen, @richox what do you think?
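To make the ordering concern concrete, here is a minimal sketch in plain Rust. It is a simplified model and not DataFusion's actual code: the row type and the function names `concat_separate_batches`, `interleaved`, and `is_sorted_by_key` are hypothetical, and the "separate batches, then concatenate" behaviour is only assumed from the description above.

```rust
// Simplified model, NOT DataFusion's implementation.
// A row is (streamed_key, Option<buffered_key>); None stands for the
// null-padded side of a left outer join.
type Row = (i32, Option<i32>);

/// Assumed "freeze"-style output: matched rows and unmatched rows are
/// collected into separate batches and then concatenated without a merge.
fn concat_separate_batches(streamed: &[i32], buffered: &[i32]) -> Vec<Row> {
    let mut joined: Vec<Row> = Vec::new();
    let mut unmatched: Vec<Row> = Vec::new();
    for &s in streamed {
        let mut found = false;
        for &b in buffered.iter().filter(|&&b| b == s) {
            joined.push((s, Some(b)));
            found = true;
        }
        if !found {
            unmatched.push((s, None));
        }
    }
    // Plain concatenation: all joined rows first, then all unmatched rows.
    joined.extend(unmatched);
    joined
}

/// Order-preserving alternative: emit rows in streamed-key order, writing
/// the null-padded row in place when a key has no match.
fn interleaved(streamed: &[i32], buffered: &[i32]) -> Vec<Row> {
    let mut out: Vec<Row> = Vec::new();
    for &s in streamed {
        let before = out.len();
        out.extend(buffered.iter().filter(|&&b| b == s).map(|&b| (s, Some(b))));
        if out.len() == before {
            out.push((s, None));
        }
    }
    out
}

/// True when rows are in non-decreasing order of the streamed key.
fn is_sorted_by_key(rows: &[Row]) -> bool {
    rows.windows(2).all(|w| w[0].0 <= w[1].0)
}
```

With streamed keys `[1, 2, 3, 4]` and buffered keys `[2, 4]`, `concat_separate_batches` yields keys in the order 2, 4, 1, 3, while `interleaved` keeps 1, 2, 3, 4. A downstream operator that relies on sorted input could only be planned over the second form.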
