[
https://issues.apache.org/jira/browse/SPARK-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ruhui Wang updated SPARK-20295:
-------------------------------
Description:
When spark.sql.exchange.reuse is enabled and a query with a self join (such
as TPC-DS q95) is run, the physical plan nondeterministically comes out as follows:
WholeStageCodegen
:  +- Project [id#0L]
:     +- BroadcastHashJoin [id#0L], [id#2L], Inner, BuildRight, None
:        :- Project [id#0L]
:        :  +- BroadcastHashJoin [id#0L], [id#1L], Inner, BuildRight, None
:        :     :- Range 0, 1, 4, 1024, [id#0L]
:        :     +- INPUT
:        +- INPUT
:- BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
:  +- WholeStageCodegen
:     :  +- Range 0, 1, 4, 1024, [id#1L]
+- ReusedExchange [id#2L], BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
If spark.sql.adaptive.enabled = true, the call path is:
ShuffleExchange#doExecute --> postShuffleRDD --> doEstimationIfNecessary.
In doEstimationIfNecessary, assert(exchanges.length == numExchanges) fails,
because the left side has only one element while the right side equals 2.
Is this a bug in the interaction between spark.sql.adaptive.enabled and
exchange reuse?
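
The failing check can be illustrated with a small toy model in plain Python. This is only a sketch under stated assumptions: ToyExchangeCoordinator, plan_self_join and estimation_succeeds are hypothetical names, not Spark classes, and the explanation that the reused side never registers itself is an assumption drawn from the counts quoted above (one registered exchange, two expected), not something confirmed in Spark's source here.

```python
# Toy model only: every name below is a hypothetical stand-in, not a Spark API.


class ToyExchangeCoordinator:
    def __init__(self, num_exchanges):
        # Number of shuffle exchanges the planner expects to be registered.
        self.num_exchanges = num_exchanges
        self.exchanges = []
        self.estimated = False

    def register_exchange(self, exchange):
        self.exchanges.append(exchange)

    def do_estimation_if_necessary(self):
        if not self.estimated:
            # Mirrors the check quoted in the report:
            # assert(exchanges.length == numExchanges)
            if len(self.exchanges) != self.num_exchanges:
                raise AssertionError(
                    "expected %d exchanges, got %d"
                    % (self.num_exchanges, len(self.exchanges))
                )
            self.estimated = True


def plan_self_join(reuse_exchange):
    """Build a coordinator for a two-sided self join."""
    coordinator = ToyExchangeCoordinator(num_exchanges=2)
    coordinator.register_exchange("exchange-left")
    if not reuse_exchange:
        coordinator.register_exchange("exchange-right")
    # Assumption: with reuse, the right side points at the left exchange
    # and therefore never registers with the coordinator.
    return coordinator


def estimation_succeeds(reuse_exchange):
    try:
        plan_self_join(reuse_exchange).do_estimation_if_necessary()
        return True
    except AssertionError:
        return False
```

In this model, estimation_succeeds(False) returns True, while estimation_succeeds(True) returns False because only one exchange is registered against an expected count of 2.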
> When spark.sql.adaptive.enabled is true, it conflicts with Exchange Reuse
> -------------------------------------------------------------------------
>
> Key: SPARK-20295
> URL: https://issues.apache.org/jira/browse/SPARK-20295
> Project: Spark
> Issue Type: Bug
> Components: Shuffle, SQL
> Affects Versions: 2.1.0
> Reporter: Ruhui Wang