This is an automated email from the ASF dual-hosted git repository. yao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push: new 722d02ccbda7 [SPARK-52484][SQL] Skip child.supportsColumnar assertion from driver side in ColumnarToRowExec 722d02ccbda7 is described below commit 722d02ccbda7157226a389c8e6fcceef4978c1e6 Author: Hongze Zhang <hongze.zzz...@gmail.com> AuthorDate: Wed Jun 18 11:33:02 2025 +0800 [SPARK-52484][SQL] Skip child.supportsColumnar assertion from driver side in ColumnarToRowExec https://issues.apache.org/jira/browse/SPARK-52484 ### What changes were proposed in this pull request? The PR removes the unnecessary assertion in `ColumnarToRowExec` introduced by https://github.com/apache/spark/pull/25264 to guarantee some flexibilities for 3rd Spark plugins. Especially in Apache Gluten, the assertion blocks some of our effort in query optimization because we needed an intermediate state of the query plan which Spark may see as illegal. Moreover, some typical reasons this intermediate state is needed in Gluten are: 1. Gluten has a cost evaluator API to evaluate the cost of a `transition rule` (which adds a unary node on top of an input plan). In the case Gluten will need a fake leaf to let the rule apply on it for cost evaluation. This leaf node has to be made a columnar one to bypass this assertion, which is a bit hacky. 2. Gluten has a cascades-style query optimizer (RAS) which could set a leaf, dummy, row-based plan node to hide up a child-tree of a brach query plan node, during which this leaf is to represent a so-called cascades 'group'. Although this pattern (C2R on a row-based plan) is illegal, it could still be used as the input of an optimizer rule to potentially be matched on and then to be converted into a valid query plan. This PR is to remove the assertion to ensure some flexibilities to the 3rd plugins. This should be no harm for the upstream Apache Spark, because the query execution will still be failed by [this error](https://github.com/apache/spark/blob/5d0b2f41794bf4dd25b3ce19bc4f634082b40876/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala#L343-L351) without this assertion on an illegal query plan. Some workarounds used by Gluten for bypassing this assertion: 1. https://github.com/apache/incubator-gluten/blob/0a1b5c28678653242ab0fd7b28ebba1dca43ccb1/gluten-core/src/main/scala/org/apache/gluten/extension/columnar/transition/package.scala#L83 2. https://github.com/apache/incubator-gluten/blob/0a1b5c28678653242ab0fd7b28ebba1dca43ccb1/gluten-core/src/main/scala/org/apache/gluten/extension/columnar/enumerated/planner/plan/GlutenPlanModel.scala#L51-L55 Once the assertion is removed, Gluten will be able to remove these workarounds to simply code. ### Does this PR introduce _any_ user-facing change? Basically no. An assertion error in plan-building time will be replaced by an exception in execution time (still from the driver side) when an illegal query plan is generated. ### How was this patch tested? Existing UTs. Closes #51183 from zhztheplayer/wip-rm-c2r-check. Authored-by: Hongze Zhang <hongze.zzz...@gmail.com> Signed-off-by: Kent Yao <y...@apache.org> --- sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala | 4 ---- 1 file changed, 4 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala index 4c9ae155ec17..85ed1a24fc6e 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala @@ -32,7 +32,6 @@ import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics} import org.apache.spark.sql.execution.vectorized.WritableColumnVector import org.apache.spark.sql.types._ import org.apache.spark.sql.vectorized.{ColumnarBatch, ColumnVector} -import org.apache.spark.util.Utils /** * Holds a user defined rule that can be used to inject columnar implementations of various @@ -66,9 +65,6 @@ trait ColumnarToRowTransition extends UnaryExecNode * [[MapPartitionsInRWithArrowExec]]. Eventually this should replace those implementations. */ case class ColumnarToRowExec(child: SparkPlan) extends ColumnarToRowTransition with CodegenSupport { - // supportsColumnar requires to be only called on driver side, see also SPARK-37779. - assert(Utils.isInRunningSparkTask || child.supportsColumnar) - override def output: Seq[Attribute] = child.output override def outputPartitioning: Partitioning = child.outputPartitioning --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org