[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

rednaxelafx Mon, 29 Jan 2018 11:33:33 -0800

Github user rednaxelafx commented on the issue:

    https://github.com/apache/spark/pull/20419
  
    @kiszk SGTM and LGTM. Let's ship it!
    
    One more question on the side: with `forceComment = true`, are we fully 
sure that this won't affect the equality of `CodeAndComment`?
    The whole point of this PR is to embed the ID into a non-executable part 
of the generated code so that it doesn't affect codegen cache hits, right?
    
    Could you please post an example of either the generated code or the 
metrics, e.g. for a `SortMergeJoin` on two identical `spark.range(3)`s, and 
confirm that even when the two `range()`s are codegen'd into stages with 
different `codegenStageId`s, the two stages still hit the codegen cache with 
`spark.sql.codegen.useIdInClassName` turned off? Basically, this is to verify 
that the example I gave in 
https://github.com/apache/spark/pull/20224#issuecomment-357091842 still hits 
the codegen cache when `spark.sql.codegen.useIdInClassName=false`.
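
To make the property in question concrete, here is a toy model in Python. It is not Spark's implementation. It assumes that a `CodeAndComment`-like value is compared on its code body only, and that comment text lives in a side map referenced from the body through a placeholder. The names `CodeAndCommentModel`, `make_stage`, `compile_cached` and the `/*c1*/` placeholder are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class CodeAndCommentModel:
    """Toy stand-in for a generated-code value (hypothetical, not Spark's class)."""
    body: str                                    # code text, comments as placeholders
    comment: dict = field(default_factory=dict)  # placeholder -> comment text

    # Assumption: equality and hashing look at the body only, never at comments.
    def __eq__(self, other):
        return isinstance(other, CodeAndCommentModel) and other.body == self.body

    def __hash__(self):
        return hash(self.body)


def make_stage(stage_id):
    # The stage ID appears only in the comment map; the body is identical
    # for every stage because it carries just the placeholder.
    body = "/*c1*/\nfinal class GeneratedIterator { /* ... */ }"
    return CodeAndCommentModel(body, {"c1": f"// codegenStageId={stage_id}"})


cache = {}


def compile_cached(code):
    """Return True on a cache hit; otherwise 'compile' and store the result."""
    hit = code in cache
    if not hit:
        cache[code] = object()  # stands in for the compiled class
    return hit


first_hit = compile_cached(make_stage(1))   # first stage: compiled and cached
second_hit = compile_cached(make_stage(2))  # different ID, same body
print(first_hit, second_hit, len(cache))
```

Under these assumptions the second stage reuses the first stage's cache entry. If the ID were instead spliced into `body` itself, the two values would compare unequal and the second lookup would miss, which is the behavior the question above asks to rule out.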



