RE: [SQL] Is it worth it (and advisable) to implement native UDFs?

2020-02-04 Thread email
Is there any documentation or sample about this besides the pull requests merged into Spark core? It seems that I need to create my custom functions under the package org.apache.spark.sql.* in order to access some of the internal classes I saw in [1], such as Column [2]. Could you

Re: More publicly documenting the options under spark.sql.*

2020-02-04 Thread Hyukjin Kwon
FYI, a PR was opened at https://github.com/apache/spark/pull/27459. Thanks, Nicholas. I hope folks find some time to take a look. On Tue, Jan 28, 2020 at 8:15 AM, Nicholas Chammas wrote: > I am! Thanks for the reference. > > On Thu, Jan 16, 2020 at 9:53 PM Hyukjin Kwon wrote: > >> Nicholas, are you interested

Re: Spark 3.0 branch cut and code freeze on Jan 31?

2020-02-04 Thread Xiao Li
Thank you, Shane! Xiao On Tue, Feb 4, 2020 at 2:16 PM Dongjoon Hyun wrote: > Thank you, Shane! :D > > Bests, > Dongjoon > > On Tue, Feb 4, 2020 at 13:28 shane knapp ☠ wrote: > >> all the 3.0 builds have been created and are currently churning away! >> >> (the failed builds were due to a silly bug

Re: Spark 3.0 branch cut and code freeze on Jan 31?

2020-02-04 Thread Dongjoon Hyun
Thank you, Shane! :D Bests, Dongjoon On Tue, Feb 4, 2020 at 13:28 shane knapp ☠ wrote: > all the 3.0 builds have been created and are currently churning away! > > (the failed builds were due to a silly bug in the build scripts sneaking its > way back in, but that's resolved now) > > shane > > On

Initial Decom PR for Spark 3?

2020-02-04 Thread Holden Karau
Hi Y’all, I’ve got a K8s graceful decom PR ( https://github.com/apache/spark/pull/26440 ) I’d love to try and get in for Spark 3, but I don’t want to push on it if folks don’t think it’s worth it. I’ve been working on it since 2017 and it was really close in November but then I had the crash and

Re: [VOTE] Release Apache Spark 2.4.5 (RC2)

2020-02-04 Thread Sean Owen
+1 from me too. Same outcome as in RC1 for me. On Sun, Feb 2, 2020 at 9:31 PM Dongjoon Hyun wrote: > > Please vote on releasing the following candidate as Apache Spark version > 2.4.5. > > The vote is open until February 5th 11PM PST and passes if a majority +1 PMC > votes are cast, with a

unify benchmarks in 2.4 and regenerate results

2020-02-04 Thread Maxim Gekk
Hi All, Currently, most of the benchmark results are embedded in the benchmark source code in Spark 2.4.x. This makes comparing results between the 2.4 releases and master pretty inconvenient. I would like to propose unifying the benchmarks in branch-2.4 by backporting the changes made in master:

Re: [VOTE] Release Apache Spark 2.4.5 (RC2)

2020-02-04 Thread Maxim Gekk
+1 I re-ran some of the existing benchmarks in branch-2.4 on Linux/macOS, and haven't found any regressions compared to 2.4.4. Maxim Gekk On Tue, Feb 4, 2020 at 11:07 AM Takeshi Yamamuro wrote: > +1; > I ran the tests with > `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes >

Re: [VOTE] Release Apache Spark 2.4.5 (RC2)

2020-02-04 Thread Takeshi Yamamuro
+1; I ran the tests with `-Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pkubernetes -Psparkr` on macOS (Java 8). Everything looks fine in my environment. Bests, Takeshi On Tue, Feb 4, 2020 at 12:35 PM Hyukjin Kwon wrote: > +1 from me too. > > On Tue, Feb 4, 2020 at 12:26 PM, Wenchen Fan wrote: