ozankabak commented on code in PR #5074:
URL: https://github.com/apache/arrow-datafusion/pull/5074#discussion_r1090939100
##########
datafusion/core/tests/sql/explain_analyze.rs:
##########
@@ -654,13 +654,13 @@ async fn test_physical_plan_display_indent_multi_children() {
" HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]",
" CoalesceBatchesExec: target_batch_size=4096",
" RepartitionExec: partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000), input_partitions=9000",
- " ProjectionExec: expr=[c1@0 as c1]",
- " RepartitionExec: partitioning=RoundRobinBatch(9000), input_partitions=1",
+ " RepartitionExec: partitioning=RoundRobinBatch(9000), input_partitions=1",
Review Comment:
I think @andygrove came across this behavior recently, and @Dandandan had a
good explanation of why it happens. IIRC, this surprising-looking repartition
is actually not redundant: hash repartitioning can benefit from
parallelization, which the round-robin (RR) repartition supplies by fanning
the single input partition out to many.
The net effect of this PR is simply to move the RR from below the projection
to above it.
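
To make the reasoning concrete, here is a toy sketch in plain Rust. It is not DataFusion code: `round_robin`, `hash_partition`, and `parallel_hash_partition` are hypothetical helpers, and OS threads stand in for partitions. It only illustrates that dealing batches out round robin first lets the hashing work run on several workers, while each key still ends up in the same hash partition.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;

/// Hypothetical helper: deal batches out to `n` workers in turn (round robin).
fn round_robin(batches: Vec<Vec<u64>>, n: usize) -> Vec<Vec<Vec<u64>>> {
    let mut out = vec![Vec::new(); n];
    for (i, batch) in batches.into_iter().enumerate() {
        out[i % n].push(batch);
    }
    out
}

/// Hypothetical helper: route every key to one of `p` hash partitions.
fn hash_partition(batches: &[Vec<u64>], p: usize) -> Vec<Vec<u64>> {
    let mut parts = vec![Vec::new(); p];
    for key in batches.iter().flatten() {
        // DefaultHasher::new() uses fixed keys, so every worker agrees
        // on which partition a given key belongs to.
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        parts[(h.finish() as usize) % p].push(*key);
    }
    parts
}

/// Round robin first, then hash each worker's share on its own thread
/// and merge the per-worker outputs partition by partition.
fn parallel_hash_partition(batches: Vec<Vec<u64>>, workers: usize, p: usize) -> Vec<Vec<u64>> {
    let handles: Vec<_> = round_robin(batches, workers)
        .into_iter()
        .map(|share| thread::spawn(move || hash_partition(&share, p)))
        .collect();
    let mut merged = vec![Vec::new(); p];
    for handle in handles {
        for (i, part) in handle.join().unwrap().into_iter().enumerate() {
            merged[i].extend(part);
        }
    }
    merged
}
```

With a single input partition and no RR step, only one worker would do all of the hashing. The RR step changes how the work is spread, not which keys land in which hash partition (row order within a partition may differ).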