Re: [VOTE] Release Apache Spark 2.0.1 (RC4)

2016-09-30 Thread Shixiong(Ryan) Zhu
Hey Mark, I can reproduce the failure locally using your command. There were a lot of OutOfMemoryErrors in the unit test log. I increased the heap size from 3g to 4g at https://github.com/apache/spark/blob/v2.0.1-rc4/pom.xml#L2029 and the tests passed. I think the patch you mentioned increased the

regression: no longer able to use HDFS wasbs:// path for additional python files on LIVY batch submit

2016-09-30 Thread Kevin Grealish
I'm seeing a regression when submitting a batch PySpark program with additional files using LIVY. This is YARN cluster mode. The program files are placed into the mounted Azure Storage before making the call to LIVY. This is happening from an application which has credentials for the storage

Re: Issues in compiling spark 2.0.0 code using scala-maven-plugin

2016-09-30 Thread satyajit vegesna
> I am trying to compile code using Maven, which was working with Spark 1.6.2, but when I try with Spark 2.0.0 I get the error below: org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (default) on

Re: Catalyst - ObjectType for Encoders

2016-09-30 Thread Michael Armbrust
I'd be okay removing that modifier, with one caveat. The code in org.apache.spark.sql.catalyst.* is purposefully excluded from published documentation and does not have the same compatibility guarantees as the rest of Spark's public APIs. We leave most of it not "private" so that advanced

Re: Restful WS for Spark

2016-09-30 Thread Mahendra Kutare
Try Cloudera Livy (https://github.com/cloudera/livy). It may be helpful for your requirement. Cheers, Mahendra about.me/mahendrakutare

Re: [VOTE] Release Apache Spark 2.0.1 (RC4)

2016-09-30 Thread akchin
+1 (non-binding) Tested with the following: -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Psparkr CentOS 7.2 / openjdk version "1.8.0_101" - IBM Spark Technology Center

code questions, sql.functions.scala

2016-09-30 Thread Peter Figliozzi
Taking isnan as a simple example, I'd like to understand what happens downstream of sql.functions.scala. 1. The withExpr { } construct. How does that work? I see it refers to the IsNan case class, which gets passed not the column e but e.expr, which is an Expression. I also see that withExpr

Re: [VOTE] Release Apache Spark 2.0.1 (RC4)

2016-09-30 Thread Mark Hamstra
0 RC4 is causing a build regression for me on at least one of my machines. RC3 built and ran tests successfully, but the tests consistently fail with RC4 unless I revert 9e91a1009e6f916245b4d4018de1664ea3decfe7, "[SPARK-15703][SCHEDULER][CORE][WEBUI] Make ListenerBus event queue size configurable

Catalyst - ObjectType for Encoders

2016-09-30 Thread Aleksander Eskilson
Hi there, Currently Catalyst supports encoding custom classes represented as Java Beans (among others). This Java Bean implementation depends internally on Catalyst’s ObjectType extension of DataType. Currently, this class is private to the sql package [1], which is sensible, as it is only

Re: [VOTE] Release Apache Spark 2.0.1 (RC4)

2016-09-30 Thread Tom Graves
+1 Tom On Wednesday, September 28, 2016 9:15 PM, Reynold Xin wrote: Please vote on releasing the following candidate as Apache Spark version 2.0.1. The vote is open until Sat, Oct 1, 2016 at 20:00 PDT and passes if a majority of at least 3 +1 PMC votes are cast. [

Re: java.util.NoSuchElementException when serializing Map with default value

2016-09-30 Thread Maciej Szymkiewicz
Thanks guys. This is not a big issue in general. It is more of an annoyance, and it can be rather confusing when encountered for the first time. On 09/29/2016 02:05 AM, Jakob Odersky wrote: > I agree with Sean's answer, you can check out the relevant serializer > here >

Re: IllegalArgumentException: spark.sql.execution.id is already set

2016-09-30 Thread Marcin Tustin
The solution is to strip it out in a hook on your threadpool, by overriding beforeExecute. See: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html On Fri, Sep 30, 2016 at 7:08 AM, Grant Digby wrote: > Thanks for the link. Yeah if there's no
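The beforeExecute approach described in this reply can be sketched roughly as follows. This is a minimal, self-contained illustration and not Spark code: LOCAL_PROPERTIES is a hypothetical stand-in for the inheritable thread-local properties that SparkContext keeps, and the class and method names are made up for the example. In a real application the hook would presumably clear the property through the SparkContext instead.

```java
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class StripInheritedProperty {
    static final String EXECUTION_ID_KEY = "spark.sql.execution.id";

    // Hypothetical stand-in for SparkContext's local properties (assumption):
    // each child thread receives a copy of its parent's properties.
    static final InheritableThreadLocal<Properties> LOCAL_PROPERTIES =
        new InheritableThreadLocal<Properties>() {
            @Override
            protected Properties childValue(Properties parent) {
                Properties copy = new Properties();
                copy.putAll(parent);
                return copy;
            }

            @Override
            protected Properties initialValue() {
                return new Properties();
            }
        };

    // Thread pool that removes the inherited execution id before every task.
    static class StrippingExecutor extends ThreadPoolExecutor {
        StrippingExecutor(int threads) {
            super(threads, threads, 0L, TimeUnit.MILLISECONDS,
                  new LinkedBlockingQueue<Runnable>());
        }

        @Override
        protected void beforeExecute(Thread t, Runnable r) {
            // beforeExecute is invoked in the worker thread t, so this only
            // touches the worker's own copy of the properties.
            LOCAL_PROPERTIES.get().remove(EXECUTION_ID_KEY);
            super.beforeExecute(t, r);
        }
    }

    // Sets the id in the calling thread, then reports what a pool worker sees.
    static String executionIdSeenByWorker() {
        LOCAL_PROPERTIES.get().setProperty(EXECUTION_ID_KEY, "42");
        ExecutorService pool = new StrippingExecutor(1);
        try {
            Callable<String> task =
                () -> LOCAL_PROPERTIES.get().getProperty(EXECUTION_ID_KEY);
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("execution id seen by worker: " + executionIdSeenByWorker());
    }
}
```

The worker thread is created while the submitting thread still holds the property, so it inherits a copy; the hook then drops that copy before the task body runs, and the submitting thread's own value is left untouched.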

Re: IllegalArgumentException: spark.sql.execution.id is already set

2016-09-30 Thread Grant Digby
Thanks for the link. Yeah if there's no need to copy execution.id from parent to child then I agree, you could strip it out, presumably in this part of the code using some kind of configuration as to which properties shouldn't go across SparkContext: protected[spark] val localProperties = new

Re: Using Spark as a Maven dependency but with Hadoop 2.6

2016-09-30 Thread Steve Loughran
On 29 Sep 2016, at 10:37, Olivier Girardot wrote: I know that the code itself would not be the same, but it would be useful to at least have the pom/build.sbt transitive dependencies different when fetching the artifact

Re: [VOTE] Release Apache Spark 2.0.1 (RC4)

2016-09-30 Thread Maciej Bryński
+1 2016-09-30 7:01 GMT+02:00 vaquar khan : > +1 (non-binding) > Regards, > Vaquar khan > > On 29 Sep 2016 23:00, "Denny Lee" wrote: > >> +1 (non-binding) >> >> On Thu, Sep 29, 2016 at 9:43 PM Jeff Zhang wrote: >> >>> +1 >>> >>>