[
https://issues.apache.org/jira/browse/SPARK-44660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750881#comment-17750881
]
Chao Sun commented on SPARK-44660:
----------------------------------
In fact the check is necessary, but it seems
{code}
postStageCreationRules(outputsColumnar = plan.supportsColumnar)
{code}
can be relaxed: if the new shuffle operator supports columnar output, we
probably shouldn't insert {{ColumnarToRow}} into this stage. This assumes the
following stage knows the shuffle output is columnar and adds its own
{{ColumnarToRow}} where necessary.
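
To make the idea concrete, here is a minimal toy model in plain Scala, not Spark code. The types {{Plan}}, {{Scan}}, {{Shuffle}} and {{ColumnarToRow}}, and the helpers {{customColumnarRule}}, {{insertTransitions}}, {{current}} and {{relaxed}}, are all hypothetical stand-ins that only mimic the shape of the logic. It assumes the transition step wraps a columnar plan in {{ColumnarToRow}} whenever row output is expected.

```scala
// Toy stand-ins for physical plan nodes. These are NOT Spark's real classes.
sealed trait Plan { def supportsColumnar: Boolean }
case class Scan(supportsColumnar: Boolean) extends Plan
// supportsColumnar = true models a custom columnar shuffle operator.
case class Shuffle(child: Plan, supportsColumnar: Boolean) extends Plan
case class ColumnarToRow(child: Plan) extends Plan { val supportsColumnar = false }

object RelaxedShuffleCheck {
  // Hypothetical custom columnar rule: swaps a row shuffle for a columnar one.
  def customColumnarRule(plan: Plan): Plan = plan match {
    case Shuffle(child, _) => Shuffle(child, supportsColumnar = true)
    case other             => other
  }

  // Simplified transition step (an assumption about the real behaviour):
  // wrap a columnar plan in ColumnarToRow only when row output is expected.
  def insertTransitions(plan: Plan, outputsColumnar: Boolean): Plan =
    if (plan.supportsColumnar && !outputsColumnar) ColumnarToRow(plan) else plan

  // Current behaviour: outputsColumnar comes from the ORIGINAL (row) plan.
  def current(original: Plan): Plan =
    insertTransitions(customColumnarRule(original), original.supportsColumnar)

  // Relaxed behaviour: outputsColumnar comes from the REPLACED shuffle.
  def relaxed(original: Plan): Plan = {
    val replaced = customColumnarRule(original)
    insertTransitions(replaced, replaced.supportsColumnar)
  }

  def isShuffle(plan: Plan): Boolean = plan.isInstanceOf[Shuffle]
}
```

In this model, {{current(Shuffle(Scan(false), false))}} ends up with {{ColumnarToRow}} on top, so a "top node must be a shuffle" check would fail. {{relaxed}} on the same input leaves the columnar shuffle on top, and any row conversion is left to the next stage.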
> Relax constraint for columnar shuffle check in AQE
> --------------------------------------------------
>
> Key: SPARK-44660
> URL: https://issues.apache.org/jira/browse/SPARK-44660
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.1
> Reporter: Chao Sun
> Priority: Major
>
> Currently in AQE, after evaluating the columnar rules, Spark checks whether
> the top operator of the stage is still a shuffle operator, and throws an
> exception if it isn't.
> {code}
> val optimized = e.withNewChildren(Seq(optimizeQueryStage(e.child, isFinalStage = false)))
> val newPlan = applyPhysicalRules(
>   optimized,
>   postStageCreationRules(outputsColumnar = plan.supportsColumnar),
>   Some((planChangeLogger, "AQE Post Stage Creation")))
> if (e.isInstanceOf[ShuffleExchangeLike]) {
>   if (!newPlan.isInstanceOf[ShuffleExchangeLike]) {
>     throw SparkException.internalError(
>       "Custom columnar rules cannot transform shuffle node to something else.")
>   }
> {code}
> However, once a shuffle operator is transformed into a custom columnar
> shuffle operator, the {{supportsColumnar}} of the new shuffle operator will
> return true, and therefore the columnar rules will insert {{ColumnarToRow}}
> on top of it. This means the {{newPlan}} is likely no longer a
> {{ShuffleExchangeLike}} but a {{ColumnarToRow}}, and an exception is
> thrown even though the use case is valid.
> This JIRA proposes to relax the check by allowing the above case.