Dandandan commented on code in PR #8731:
URL: https://github.com/apache/arrow-datafusion/pull/8731#discussion_r1441012064
##########
datafusion/core/src/physical_optimizer/enforce_distribution.rs:
##########
@@ -1198,32 +1198,33 @@ fn ensure_distribution(
)
.map(
|(mut child, requirement, required_input_ordering, would_benefit, maintains)| {
- // Don't need to apply when the returned row count is not greater than 1:
+ // Don't need to apply when the returned row count is not greater than batch size
let num_rows = child.plan.statistics()?.num_rows;
let repartition_beneficial_stats = if num_rows.is_exact().unwrap_or(false) {
num_rows
.get_value()
.map(|value| value > &batch_size)
- .unwrap_or(true)
+ .unwrap() // safe to unwrap since is_exact() is true
} else {
true
};
+ // When `repartition_file_scans` is set, attempt to increase
+ // parallelism at the source.
+ if repartition_file_scans {
Review Comment:
Should we also check `repartition_beneficial_stats` here?
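
A minimal sketch of the idea, not the actual DataFusion code: `RowCount` is a hypothetical stand-in for the row-count statistic, and `should_repartition_file_scan` is an illustrative helper name that does not exist in the PR. It assumes the file-scan branch would be gated on both the config flag and the statistics check from the hunk above.

```rust
/// Hypothetical stand-in for the row-count statistic used in the diff.
/// This is NOT DataFusion's type; it only models "exact", "inexact", and "unknown".
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RowCount {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Mirrors the logic in the hunk: repartitioning is considered beneficial
/// unless the row count is exact and does not exceed `batch_size`.
pub fn repartition_beneficial_stats(num_rows: RowCount, batch_size: usize) -> bool {
    match num_rows {
        RowCount::Exact(rows) => rows > batch_size,
        // Without an exact count we cannot rule out a benefit.
        RowCount::Inexact(_) | RowCount::Absent => true,
    }
}

/// Illustrative guard (assumed name): only repartition the file scan when the
/// flag is set AND the statistics do not rule out a benefit.
pub fn should_repartition_file_scan(
    repartition_file_scans: bool,
    num_rows: RowCount,
    batch_size: usize,
) -> bool {
    repartition_file_scans && repartition_beneficial_stats(num_rows, batch_size)
}
```

With a batch size of 8192, an exact count of 100 rows would skip the file-scan repartitioning even when `repartition_file_scans` is set, while an unknown or inexact count would still allow it.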