Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20224#discussion_r163449646
  
    --- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
    @@ -365,6 +388,26 @@ case class WholeStageCodegenExec(child: SparkPlan) 
extends UnaryExecNode with Co
         "pipelineTime" -> SQLMetrics.createTimingMetric(sparkContext,
           WholeStageCodegenExec.PIPELINE_DURATION_METRIC))
     
    +  /**
    +   * ID for codegen stages within a query plan.
    +   * It does not affect equality, nor does it participate in destructuring 
pattern matching.
    +   *
    +   * Within a query, a codegen stage in a plan starts counting from 1, in 
"insertion order".
    +   * WholeStageCodegenExec operators are inserted into a plan in 
depth-first post-order.
    +   * See CollapseCodegenStages.insertWholeStageCodegen for the definition 
of insertion order.
    +   *
    +   * 0 is reserved as a special ID value to indicate a temporary 
WholeStageCodegenExec object
    +   * is created, e.g. for special fallback handling when an existing 
WholeStageCodegenExec
    +   * failed to generate/compile code.
    +   */
    +  val codegenStageId = WholeStageCodegenId.getNextStageId()
    --- End diff --
    
    my major concern is, what if we transform the physical plan tree after 
adding `WholeStageCodegenExec`? e.g. during transformation we may copy a plan 
code, then a copied `WholeStageCodegenExec` will have a different stage id.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to