Re: How to tune the performance of Tpch query5 within Spark

2017-07-16 Thread 163
I changed the UDF, but the performance still seems slow. What else can I do? > On 14 Jul 2017, at 8:34 PM, Wenchen Fan wrote: > > Try to replace your UDF with Spark built-in expressions, it should be as > simple as `$"x" * (lit(1) - $"y")`. > >> On 14 Jul 2017, at 5:46 PM, 163 >

Re: [SQL] Syntax "case when" doesn't be supported in JOIN

2017-07-16 Thread Xiao Li
If the join condition is non-deterministic, pushing it down to the underlying project will change the semantics. Thus, we are unable to do it in PullOutNondeterministic. Users can do it manually if they do not care about the difference in semantics. Thanks, Xiao 2017-07-16 20:07 GMT-07:00 Chang Chen :
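
To illustrate the point about semantics, here is a toy model in plain Python. It is not Spark code and not how Catalyst is implemented; the function names `predicate_in_join` and `predicate_in_project` are made up for this sketch, and a seeded random draw stands in for any non-deterministic expression. The assumption being modelled is that a condition kept in the join is evaluated once per candidate pair of rows, while the same expression moved into a project under one child is evaluated once per row of that child and then reused.

```python
import random

def predicate_in_join(left, right, seed=0):
    """Toy model: the non-deterministic predicate stays in the join,
    so it is evaluated once for every (left, right) pair."""
    rng = random.Random(seed)
    evaluations = 0
    pairs = []
    for l in left:
        for r in right:
            evaluations += 1
            if rng.random() < 0.5:
                pairs.append((l, r))
    return pairs, evaluations

def predicate_in_project(left, right, seed=0):
    """Toy model: the expression is moved into a projection over the
    left input, so it is evaluated once per left row and reused."""
    rng = random.Random(seed)
    # The projection attaches one draw to each left row.
    projected = [(l, rng.random()) for l in left]
    evaluations = len(projected)
    # The join then only reads the precomputed column.
    pairs = [(l, r) for l, draw in projected if draw < 0.5 for r in right]
    return pairs, evaluations

left, right = ["a", "b", "c"], [1, 2, 3, 4]
_, n_join = predicate_in_join(left, right)
pairs_proj, n_proj = predicate_in_project(left, right)
print(n_join, n_proj)  # prints "12 3"
```

In the projected variant each left row matches either every right row or none of them, whereas in the join variant every pair is decided independently. That difference in behaviour is what a user accepts when applying the workaround by hand.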

Re: [SQL] Syntax "case when" doesn't be supported in JOIN

2017-07-16 Thread Chang Chen
It is tedious since we have lots of Hive SQL being migrated to Spark. And this workaround is equivalent to inserting a Project between the Join operator and its child. Why not do it in PullOutNondeterministic? Thanks Chang On Fri, Jul 14, 2017 at 5:29 PM, Liang-Chi Hsieh wrote: > > A possible worka

Re: 2.2.0 under Unreleased Versions in JIRA?

2017-07-16 Thread Jacek Laskowski
Confirmed. Thanks a lot, Sean. Pozdrawiam, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski On Sun, Jul 16, 2017 at 3:02 PM, Sean Owen wrote: > Done, it just needed to be marke

Re: 2.2.0 under Unreleased Versions in JIRA?

2017-07-16 Thread Sean Owen
Done, it just needed to be marked as released. On Sun, Jul 16, 2017 at 12:03 PM Jacek Laskowski wrote: > Hi, > > Just noticed that 2.2.0 label is under Unreleased Versions in JIRA. > Since it's out, I think 2.2.1 and 2.3.0 are valid only. Correct? > > Pozdrawiam, > Jacek Laskowski > > https

2.2.0 under Unreleased Versions in JIRA?

2017-07-16 Thread Jacek Laskowski
Hi, Just noticed that 2.2.0 label is under Unreleased Versions in JIRA. Since it's out, I think 2.2.1 and 2.3.0 are valid only. Correct? Pozdrawiam, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark Follow me at https://twitter