[ https://issues.apache.org/jira/browse/SPARK-39729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xiangxiang Shen updated SPARK-39729:
------------------------------------
    Environment:     (was: WholeStageCodegen gives better performance in many cases, but it should not be used for a single operator.


Below is a simple experiment.
{code:java}
test("range/filter should be combined") {
  val df = spark.range(10).filter("id = 1").selectExpr("id + 1")
  val plan = df.queryExecution.executedPlan
  // The executed plan should contain at least one whole-stage codegen stage.
  assert(plan.find(_.isInstanceOf[WholeStageCodegenExec]).isDefined)
  assert(df.collect() === Array(Row(2)))
  // Print the physical plan and the generated code for inspection.
  df.explain(false)
  df.queryExecution.debug.codegen
}{code}
 


If the following override is added to FilterExec:
{code:java}
override def supportCodegen: Boolean = false{code}
 
the physical plan becomes:
{code:java}
== Physical Plan ==
*(2) Project [(id#0L + 1) AS (id + 1)#4L]
+- Filter (id#0L = 1)
   +- *(1) Range (0, 10, step=1, splits=2){code}
  
Here Range and Project are each compiled as a separate single-operator codegen stage, and performance is poor.

How can WholeStageCodegen be disabled in these cases?

 )

> Why generate WholeStagecodegen for single operator?
> ---------------------------------------------------
>
>                 Key: SPARK-39729
>                 URL: https://issues.apache.org/jira/browse/SPARK-39729
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: xiangxiang Shen
>            Priority: Major
>



