[GitHub] [arrow-datafusion] alamb commented on a change in pull request #521: Return errors properly from RepartitionExec
alamb commented on a change in pull request #521: URL: https://github.com/apache/arrow-datafusion/pull/521#discussion_r649260106 ## File path: datafusion/src/physical_plan/repartition.rs ## @@ -308,6 +310,45 @@ impl RepartitionExec { send_time_nanos: SQLMetric::time_nanos(), }) } + +/// Waits for `input_task` which is consuming one of the inputs to Review comment: in https://github.com/apache/arrow-datafusion/pull/538 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on a change in pull request #521: Return errors properly from RepartitionExec
alamb commented on a change in pull request #521: URL: https://github.com/apache/arrow-datafusion/pull/521#discussion_r646646671 ## File path: datafusion/src/physical_plan/repartition.rs ## @@ -308,6 +310,45 @@ impl RepartitionExec { send_time_nanos: SQLMetric::time_nanos(), }) } + +/// Waits for `input_task` which is consuming one of the inputs to Review comment: I agree the approach you describe would be clearer (and avoid needing a separate task) The reason I did not pull the main body out into its own function was mostly "trying to keep the diff small" (or perhaps my own laziness wanting to avoid having to figure out all the types of the arguments that got captured), Perhaps that would be a good follow on PR (there is a lot of messiness / duplication for updating counters which I would also kind of like to fix too) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on a change in pull request #521: Return errors properly from RepartitionExec
alamb commented on a change in pull request #521: URL: https://github.com/apache/arrow-datafusion/pull/521#discussion_r646597513 ## File path: datafusion/src/physical_plan/repartition.rs ## @@ -249,13 +252,12 @@ impl ExecutionPlan for RepartitionExec { counter += 1; } -// notify each output partition that this input partition has no more data -for (_, tx) in txs { -tx.send(None) -.map_err(|e| DataFusionError::Execution(e.to_string()))?; -} Ok(()) }); + +// In a separate task, wait for each input to be done Review comment: This is the actual code change (to check for return value in another task). Otherwise the rest of this PR is tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org