[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19860 Thanks for your work! A late LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19860 LGTM, merging to master!

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-03 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19860 @kiszk @viirya I made the following performance test:
```
val a = (1 to 10).map(x => 1).toDS
val filtered = a.where($"value".isin((1 to 10): _*))
(1 to
```

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-02 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19860 I am also interested in how much this PR can improve performance.

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-02 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19860 OK. I see the intention here now. I'm not sure it has a considerable impact, especially since smaller functions will be inlined, IIUC. If the impact is not negligible, it should be worth doing.

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-02 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19860 @viirya sorry, I don't understand your question. In Coalesce, we need to find the first non-null element. As soon as we find one, we don't need to evaluate anything else. Previously, the code
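
A minimal Java sketch of the idea discussed in this thread, not Spark's actual generated code. It assumes three child expressions, each evaluated by its own split function, and compares two variants: one where every function is called and guards its own body with an isNull check, and one that wraps the calls in do { ... } while (false) and skips the remaining calls once a non-null value is found. The class CoalesceCallSketch and all of its member names are hypothetical.

```java
// Hypothetical sketch: names and structure are illustrative, not taken from Spark.
final class CoalesceCallSketch {
    private final Integer[] inputs; // the three child values; null models a null result
    private boolean isNull = true;  // plays the role of ev.isNull
    private int value = 0;          // plays the role of ev.value
    int calls = 0;                  // number of split-function invocations

    CoalesceCallSketch(Integer... inputs) {
        if (inputs.length != 3) {
            throw new IllegalArgumentException("this sketch assumes exactly three children");
        }
        this.inputs = inputs;
    }

    private void reset() {
        isNull = true;
        value = 0;
        calls = 0;
    }

    // Stands in for one split function. It is always entered when called,
    // but only does work while no non-null value has been found yet.
    private void eval(int i) {
        calls++;
        if (isNull && inputs[i] != null) {
            isNull = false;
            value = inputs[i];
        }
    }

    // Variant 1: the null check lives inside each function, so all three are called.
    Integer coalesceGuarded() {
        reset();
        eval(0);
        eval(1);
        eval(2);
        return isNull ? null : value;
    }

    // Variant 2: inside do { } while (false), `continue` jumps to the loop
    // condition, which is false, so the remaining calls are skipped.
    Integer coalesceShortCircuit() {
        reset();
        do {
            eval(0);
            if (!isNull) continue;
            eval(1);
            if (!isNull) continue;
            eval(2);
        } while (false);
        return isNull ? null : value;
    }

    public static void main(String[] args) {
        CoalesceCallSketch sketch = new CoalesceCallSketch(null, 7, 9);
        System.out.println(sketch.coalesceGuarded() + " after " + sketch.calls + " calls");
        System.out.println(sketch.coalesceShortCircuit() + " after " + sketch.calls + " calls");
    }
}
```

With the inputs (null, 7, 9), both variants return 7, but the guarded variant makes three calls while the short-circuit variant makes two.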

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-02 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19860 I'm not sure I follow this correctly. For example, in Coalesce, don't the conditions on ev.isNull already guarantee that the later functions won't be called? Why do we need to apply this do loop?

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-02 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19860 @kiszk of course it depends on each specific case, but on average, after this PR we make only 50% of the function calls. Thus, on average, the overhead caused by the many function calls is reduced by 50%.
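
A rough Java sketch of the arithmetic behind the 50% figure. It assumes, purely for illustration, that the first non-null child is equally likely to sit at any of the n positions; that assumption and the name AverageCallsSketch are not from the thread. Under it, short-circuiting makes (n + 1) / 2 calls on average, which is roughly half of the n calls made without it.

```java
// Hypothetical sketch: the uniform-position model is an assumption, not a measurement.
final class AverageCallsSketch {
    // Average number of calls when the first non-null child is equally
    // likely to be at each position 1..n and evaluation stops there.
    static double averageCalls(int n) {
        long total = 0;
        for (int firstNonNull = 1; firstNonNull <= n; firstNonNull++) {
            total += firstNonNull; // calls made before short-circuiting stops
        }
        return (double) total / n;
    }

    public static void main(String[] args) {
        int n = 10;
        double avg = averageCalls(n);
        System.out.printf("n=%d, average calls=%.1f, fraction of n=%.2f%n", n, avg, avg / n);
    }
}
```

For n = 10 this gives 5.5 calls on average, or 55% of the calls made when every function is invoked.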

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-02 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19860 > this is a not negligible overhead which can be avoided. How much can this PR reduce this overhead?

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-02 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19860 cc @gatorsmile @kiszk @cloud-fan @viirya

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19860 Merged build finished. Test PASSed.

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19860 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84376/

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19860 **[Test build #84376 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84376/testReport)** for PR 19860 at commit

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19860 **[Test build #84376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84376/testReport)** for PR 19860 at commit