Kontinuation commented on issue #1589:
URL:
https://github.com/apache/datafusion-comet/issues/1589#issuecomment-2771986460
I took a deeper look and found that CometBroadcastHashJoin is not the
root cause of the AQE failure. I enabled the plan change log and found that
CometScanRule and CometExecRule have not kicked in yet at this point: the
broadcast hash join node is not a Comet operator but a Spark
`BroadcastHashJoin` operator.
The immediate child of `BroadcastHashJoin` became a `BroadcastExchange`
instead of `BroadcastQueryStage` after the `EnsureRequirements` transformation.
This violates the constraint that `BroadcastQueryStage` must be the immediate
child of `BroadcastHashJoin`.
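To make the constraint concrete, here is a minimal sketch of the check on a toy plan tree. It is not Spark's or Comet's code: `Node`, `find_violations` and `stage` are hypothetical names, and the build side is modelled simply as the last child (matching `BuildRight` in the plans here).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy plan node: an operator name plus children (not Spark's SparkPlan)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def find_violations(plan: Node) -> List[str]:
    """Report every BroadcastHashJoin whose build-side child (assumed to be
    the last child, i.e. BuildRight) is not a BroadcastQueryStage."""
    problems = []
    if plan.name == "BroadcastHashJoin" and plan.children:
        build_side = plan.children[-1]
        if build_side.name != "BroadcastQueryStage":
            problems.append(
                f"BroadcastHashJoin has {build_side.name} as build-side child"
            )
    for child in plan.children:
        problems.extend(find_violations(child))
    return problems


def stage() -> Node:
    """A materialized broadcast stage wrapping a Comet exchange."""
    return Node("BroadcastQueryStage", [Node("CometBroadcastExchange")])


# Shape before EnsureRequirements: the stage sits directly under the join.
before = Node("BroadcastHashJoin", [Node("Project"), stage()])
# Shape after EnsureRequirements: an extra BroadcastExchange is in between.
after = Node(
    "BroadcastHashJoin",
    [Node("Project"), Node("BroadcastExchange", [stage()])],
)

print(find_violations(before))  # []
print(find_violations(after))
# ['BroadcastHashJoin has BroadcastExchange as build-side child']
```

The first shape passes and the second one is flagged, which is the situation described in this comment.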
Part of the original plan:
```
BroadcastHashJoin [c_nationkey#3L], [n_nationkey#127L], Inner, BuildRight, (((n_name#49 = GERMANY) AND (n_name#128 = IRAQ)) OR ((n_name#49 = IRAQ) AND (n_name#128 = GERMANY))), false
:- Project [l_extendedprice#21, l_discount#22, l_shipdate#26, c_nationkey#3L, n_name#49]
:  +- BroadcastHashJoin [s_nationkey#111L], [n_nationkey#48L], Inner, BuildRight, false
:     :- Project [s_nationkey#111L, l_extendedprice#21, l_discount#22, l_shipdate#26, c_nationkey#3L]
:     :  +- <omitted>
:     +- BroadcastQueryStage 4
:        +- CometBroadcastExchange [n_nationkey#48L, n_name#49]
:           +- CometFilter [n_nationkey#48L, n_name#49], (isnotnull(n_nationkey#48L) AND ((n_name#49 = GERMANY) OR (n_name#49 = IRAQ)))
:              +- CometScan parquet [n_nationkey#48L,n_name#49] Batched: true, ...
+- BroadcastQueryStage 5
   +- ReusedExchange [n_nationkey#127L, n_name#128], CometBroadcastExchange [n_nationkey#48L, n_name#49]
```
was transformed to:
```
BroadcastHashJoin [c_nationkey#3L], [n_nationkey#127L], Inner, BuildRight, (((n_name#49 = GERMANY) AND (n_name#128 = IRAQ)) OR ((n_name#49 = IRAQ) AND (n_name#128 = GERMANY))), false
:- Project [l_extendedprice#21, l_discount#22, l_shipdate#26, c_nationkey#3L, n_name#49]
:  +- BroadcastHashJoin [s_nationkey#111L], [n_nationkey#48L], Inner, BuildRight, false
:     :- Project [s_nationkey#111L, l_extendedprice#21, l_discount#22, l_shipdate#26, c_nationkey#3L]
:     :  +- <omitted>
:     +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]),false), [plan_id=916]
:        +- BroadcastQueryStage 4
:           +- CometBroadcastExchange [n_nationkey#48L, n_name#49]
:              +- CometFilter [n_nationkey#48L, n_name#49], (isnotnull(n_nationkey#48L) AND ((n_name#49 = GERMANY) OR (n_name#49 = IRAQ)))
:                 +- CometScan parquet [n_nationkey#48L,n_name#49] Batched: true, ...
+- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]),false), [plan_id=920]
   +- BroadcastQueryStage 5
      +- ReusedExchange [n_nationkey#127L, n_name#128], CometBroadcastExchange [n_nationkey#48L, n_name#49]
```
When Comet is not in use, this transformation does not introduce a new
`BroadcastExchange` node.
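For anyone who wants to compare the two runs, the following is a rough sketch of how the `spark-submit` options for such a comparison could be assembled. It assumes Spark 3.1 or later, where `spark.sql.planChangeLog.level` is available, and assumes `spark.comet.enabled` is the switch used to turn Comet on and off. The exact settings used for this investigation are not stated here, and `plan_change_log_args` is a hypothetical helper.

```python
from typing import Dict, List


def plan_change_log_args(comet_enabled: bool, level: str = "WARN") -> List[str]:
    """Build --conf arguments that turn on AQE and the plan change log.

    Assumption: spark.comet.enabled toggles Comet; any other Comet setup
    (plugin jar, shuffle manager, ...) is deliberately left out of this sketch.
    """
    conf: Dict[str, str] = {
        "spark.sql.adaptive.enabled": "true",
        # Log each rule that changes the plan at this log level.
        "spark.sql.planChangeLog.level": level,
        "spark.comet.enabled": str(comet_enabled).lower(),
    }
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# One run with Comet and one without, to diff the logged plan changes.
print(" ".join(plan_change_log_args(comet_enabled=True)))
print(" ".join(plan_change_log_args(comet_enabled=False)))
```

Diffing the logged output of `EnsureRequirements` between the two runs should show whether the extra `BroadcastExchange` appears only in the Comet run.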