Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/19752
@gatorsmile I added a test case to check that the execution plan is
`WholeStageCodegenExec` as expected. I also ran a performance test using
roughly the same code, i.e.:
```scala
val N = 30
val nRows = 1000000
var expr1 = when($"id" === lit(0), 0)
var expr2 = when($"id" === lit(0), 10)
(1 to N).foreach { i =>
  expr1 = expr1.when($"id" === lit(i), -i)
  expr2 = expr2.when($"id" === lit(i + 10), i)
}
time {
  spark.range(nRows)
    .select(expr1.as("c1"), expr2.otherwise(0).as("c2"))
    .sort("c1")
    .show
}
```
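(The `time` helper above is not part of Spark; a minimal sketch of such a wall-clock timer, with the name `time` assumed from the snippet, could look like this:)

```scala
// Hypothetical `time` helper: runs a block once, prints its wall-clock
// duration in milliseconds, and returns the block's result.
def time[T](block: => T): T = {
  val start = System.nanoTime()   // monotonic clock, suitable for elapsed time
  val result = block
  val elapsedMs = (System.nanoTime() - start) / 1e6
  println(f"Elapsed: $elapsedMs%.3f ms")
  result
}
```

A single run like this is only a rough measurement; JIT warm-up and caching can skew the first iterations, which is why the averages below are over repeated runs.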
Before this PR, it takes 1091.690996 ms on average; after the PR, it takes
106.894443 ns on average.
There is actually a problem which is fixed in #18641 but not here: when the
code contains deeply nested expressions, the 64KB limit exception can still
occur. This should be handled in a more generic way in #19813, though.
@kiszk What do you think?