xudong963 commented on a change in pull request #1776:
URL: https://github.com/apache/arrow-datafusion/pull/1776#discussion_r801726569
##########
File path: datafusion/src/physical_optimizer/repartition.rs
##########
@@ -36,33 +128,70 @@ impl Repartition {
}
}
+/// Recursively visits all `plan`s puts and then optionally adds a
+/// `RepartitionExec` at the output of `plan` to match
+/// `target_partitions`
+///
+/// if `can_reorder` is false, means that the output of this node
+/// can not be reordered as as something upstream is relying on that order
+///
+/// If 'would_benefit` is false, the upstream operator doesn't
+/// benefit from additional reordering
+///
fn optimize_partitions(
target_partitions: usize,
plan: Arc<dyn ExecutionPlan>,
- should_repartition: bool,
+ can_reorder: bool,
+ would_benefit: bool,
Review comment:
Not very understand the variable, If 'would_benefit` is false, the
upstream operator doesn't benefit from additional reordering, but wouldn't
produce wrong results? So it's ok to repartition to benefit from concurrency?
If so, I think the variable is needless.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]