Chao Sun created SPARK-44660:
--------------------------------
Summary: Relax constraint for columnar shuffle check in AQE
Key: SPARK-44660
URL: https://issues.apache.org/jira/browse/SPARK-44660
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.1
Reporter: Chao Sun
Currently in AQE, after applying the columnar rules, Spark checks whether the
top operator of the stage is still a shuffle operator, and throws an exception
if it is not.
{code}
val optimized = e.withNewChildren(
  Seq(optimizeQueryStage(e.child, isFinalStage = false)))
val newPlan = applyPhysicalRules(
  optimized,
  postStageCreationRules(outputsColumnar = plan.supportsColumnar),
  Some((planChangeLogger, "AQE Post Stage Creation")))
if (e.isInstanceOf[ShuffleExchangeLike]) {
  if (!newPlan.isInstanceOf[ShuffleExchangeLike]) {
    throw SparkException.internalError(
      "Custom columnar rules cannot transform shuffle node to something else.")
  }
  // ... (rest of the branch omitted in this excerpt)
}
{code}
However, once a shuffle operator is transformed into a custom columnar shuffle
operator, {{supportsColumnar}} on the new shuffle operator returns true, so the
columnar rules insert a {{ColumnarToRow}} on top of it. As a result,
{{newPlan}} is likely no longer a {{ShuffleExchangeLike}} but a
{{ColumnarToRow}}, and the exception is thrown even though the use case is
valid.
This JIRA proposes to relax the check by allowing the above case.
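
One possible shape of the relaxed check is sketched below. This is only an
illustration under stated assumptions, not the actual patch: the plan classes
are toy stand-ins rather than Spark's real operators, and the names
{{ShuffleLike}}, {{RowShuffle}}, {{CustomColumnarShuffle}}, {{Project}} and
{{isValidShuffleStagePlan}} are hypothetical. The idea is to accept either a
shuffle operator directly or a shuffle operator wrapped in a single
{{ColumnarToRow}}.

```scala
// Toy stand-ins for physical plan nodes. These are NOT Spark's classes;
// every name here is hypothetical and exists only for illustration.
sealed trait Plan { def supportsColumnar: Boolean }

// Plays the role of ShuffleExchangeLike in this sketch.
trait ShuffleLike extends Plan

final case class RowShuffle() extends ShuffleLike {
  val supportsColumnar = false
}

// A custom shuffle that produces columnar output.
final case class CustomColumnarShuffle() extends ShuffleLike {
  val supportsColumnar = true
}

// Row-based transition node placed on top of a columnar child.
final case class ColumnarToRow(child: Plan) extends Plan {
  val supportsColumnar = false
}

// Any other operator, used to show a case that is still rejected.
final case class Project(child: Plan) extends Plan {
  val supportsColumnar = false
}

object RelaxedShuffleCheck {
  // Accept a shuffle at the top of the plan, or a shuffle wrapped in
  // exactly one ColumnarToRow. Everything else is rejected.
  def isValidShuffleStagePlan(plan: Plan): Boolean = plan match {
    case _: ShuffleLike                => true
    case ColumnarToRow(_: ShuffleLike) => true
    case _                             => false
  }
}
```

With this predicate, {{ColumnarToRow(CustomColumnarShuffle())}} passes the
check, while a plan such as {{Project(RowShuffle())}} is still rejected, so
rules that replace the shuffle with an unrelated operator keep failing fast.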