[ https://issues.apache.org/jira/browse/SPARK-39729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
xiangxiang Shen updated SPARK-39729:
------------------------------------
    Environment:     (was: Whole-stage codegen gives better performance in many cases, but it should not be used for a single operator. Below is a simple experiment.
{code:java}
test("range/filter should be combined") {
  val df = spark.range(10).filter("id = 1").selectExpr("id + 1")
  val plan = df.queryExecution.executedPlan
  assert(plan.find(_.isInstanceOf[WholeStageCodegenExec]).isDefined)
  assert(df.collect() === Array(Row(2)))
  df.explain(false)
  df.queryExecution.debug.codegen
}{code}
If I add
{code:java}
override def supportCodegen: Boolean = false{code}
to FilterExec, the physical plan is
{code:java}
== Physical Plan ==
*(2) Project [(id#0L + 1) AS (id + 1)#4L]
+- Filter (id#0L = 1)
   +- *(1) Range (0, 10, step=1, splits=2){code}
Performance is not good in this case, because Project and Range are each wrapped in their own single-operator codegen stage. How can I disable whole-stage codegen in these cases?
)

> Why generate WholeStagecodegen for single operator?
> ---------------------------------------------------
>
>                 Key: SPARK-39729
>                 URL: https://issues.apache.org/jira/browse/SPARK-39729
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: xiangxiang Shen
>            Priority: Major
>