comphead commented on code in PR #12082:
URL: https://github.com/apache/datafusion/pull/12082#discussion_r1733404538


##########
datafusion/physical-plan/src/joins/sort_merge_join.rs:
##########
@@ -1356,16 +1392,82 @@ impl SMJStream {
                 pre_mask.clone()
             };
 
+            // Try to calculate if the buffered batch we scan is the last one for specific stream row and join key
+            // for Batchsize == 1 self.buffered_data.scanning_finished() works well
+            // For other scenarios its an attempt to figure out there is no more rows matching the same join key
+            let last_batch = if self.batch_size == 1 {

Review Comment:
   Thanks @korowa for the directions. This week I will try to find out whether such an approach works for us. Alternatively, I'm planning to play with the pair `scanning_batch().range.start` and `self.scanning_offset`; perhaps it can give a hint on how to identify the last joined buffered-side batch for the streaming row.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: github-h...@datafusion.apache.org