alamb commented on a change in pull request #8503: URL: https://github.com/apache/arrow/pull/8503#discussion_r510252959
########## File path: rust/datafusion/src/physical_plan/merge.rs ########## @@ -103,37 +105,56 @@ impl ExecutionPlan for MergeExec { self.input.execute(0).await } _ => { - let tasks = (0..input_partitions).map(|part_i| { + let (sender, receiver) = mpsc::unbounded::<ArrowResult<RecordBatch>>(); Review comment: What experiment failed? removing the calls to `join_all`? > is that we want to ensure the the threads finish I wonder why we care about the thread "finishing" on some particular schedule. It seems what matter The situation described in https://stackoverflow.com/q/38957741/931303 I think is about expecting to see a side-effect of thread execution that happens *after* all the results are retrieved from them. So in this case, the threads will all hang around, sending results via the channel as long as they have something to send and the channel hasn't been closed. Once the channel is closed, there is some period of time during which the threads might be running (effectively after completing ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org