Ruhui Wang created SPARK-20295:
----------------------------------
Summary: when spark.sql.adaptive.enabled is enabled, it conflicts
with Exchange Reuse
Key: SPARK-20295
URL: https://issues.apache.org/jira/browse/SPARK-20295
Project: Spark
Issue Type: Bug
Components: Shuffle, SQL
Affects Versions: 2.1.0
Reporter: Ruhui Wang
When spark.sql.exchange.reuse is enabled and a query with a self join (such
as tpcds-q95) is run, the physical plan intermittently takes the form below:
WholeStageCodegen
:  +- Project [id#0L]
:     +- BroadcastHashJoin [id#0L], [id#2L], Inner, BuildRight, None
:        :- Project [id#0L]
:        :  +- BroadcastHashJoin [id#0L], [id#1L], Inner, BuildRight, None
:        :     :- Range 0, 1, 4, 1024, [id#0L]
:        :     +- INPUT
:        +- INPUT
:- BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
:  +- WholeStageCodegen
:     :  +- Range 0, 1, 4, 1024, [id#1L]
+- ReusedExchange [id#2L], BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
If spark.sql.adaptive.enabled = true, the call path is:
ShuffleExchange#doExecute --> postShuffleRDD --> doEstimationIfNecessary.
In doEstimationIfNecessary, assert(exchanges.length == numExchanges) fails,
because exchanges contains only one element while numExchanges is 2.
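
The mismatch described above can be illustrated with a small toy model. This is only a sketch under assumptions, not Spark's actual code: ToyCoordinator, ToyExchange and run are hypothetical names, and the model assumes that each exchange object registers with its coordinator at most once, so that a reused exchange appearing twice in the plan still produces a single registration.

```python
class ToyCoordinator:
    """Hypothetical stand-in for a coordinator that expects a fixed number of exchanges."""

    def __init__(self, num_exchanges):
        self.num_exchanges = num_exchanges  # expected count, fixed at planning time
        self.exchanges = []                 # exchanges that actually registered

    def register(self, exchange):
        self.exchanges.append(exchange)

    def do_estimation_if_necessary(self):
        # Mirrors the shape of the check quoted in the report:
        # assert(exchanges.length == numExchanges)
        if len(self.exchanges) != self.num_exchanges:
            raise AssertionError(
                f"registered {len(self.exchanges)} exchange(s), "
                f"expected {self.num_exchanges}"
            )


class ToyExchange:
    """Hypothetical exchange that registers itself once, on first prepare()."""

    def __init__(self, name, coordinator):
        self.name = name
        self.coordinator = coordinator
        self._prepared = False

    def prepare(self):
        # Assumption: preparation is idempotent per object, so a reused
        # exchange referenced twice in the plan registers only once.
        if not self._prepared:
            self._prepared = True
            self.coordinator.register(self)


def run(plan_nodes, coordinator):
    """Prepare every node in the plan, then run the coordinator's check."""
    for node in plan_nodes:
        node.prepare()
    coordinator.do_estimation_if_necessary()
    return len(coordinator.exchanges)


if __name__ == "__main__":
    # Without reuse: two distinct exchanges register, the check passes.
    c1 = ToyCoordinator(2)
    print(run([ToyExchange("left", c1), ToyExchange("right", c1)], c1))

    # With reuse: both sides of the self join point at the same exchange,
    # so only one registration happens and the check fails.
    c2 = ToyCoordinator(2)
    shared = ToyExchange("left", c2)
    try:
        run([shared, shared], c2)
    except AssertionError as err:
        print("assertion failed:", err)
```

In this model the expected count is fixed before reuse collapses the two sides of the self join into one exchange, which is one plausible reading of why the left side of the assertion is 1 while the right side is 2.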
Is this a bug in how spark.sql.adaptive.enabled interacts with exchange reuse?