Re: [VOTE] Release Apache Spark 2.0.2 (RC3)

2016-11-11 Thread Adam Roberts
+1 (non-binding) Build: mvn -T 1C -Psparkr -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package Test: mvn -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Dtest.exclude.tags=org.apache.spark.tags.DockerTest -fn test Test options: -Xss2048k -Dspark.buffer.pageSize=1048576

Reduce the memory usage if we do sample first in GradientBoostedTrees if subsamplingRate < 1.0

2016-11-11 Thread WangJianfei
When we train the model, we use the data with a subSampleRate, so if subSampleRate < 1.0, we can sample first to reduce memory usage. See the code below in GradientBoostedTrees.boost(): while (m < numIterations && !doneLearning) { // Update data with pseudo-residuals (residual error)

spark sql query of nested json lists data

2016-11-11 Thread robert
I am new to Spark SQL development. I have a JSON file with nested arrays. I can extract and query these arrays. However, when I add an ORDER BY clause, I get exceptions. Here are the steps: 1) val a = sparkSession.sql("SELECT Tables.TableName, Tables.TableType, Tables.TableExecOrder, Tables.Columns

Re: [VOTE] Release Apache Spark 2.0.2 (RC3)

2016-11-11 Thread Reynold Xin
The vote has passed with the following +1s and no -1. I will work on packaging the release. +1: Reynold Xin* Herman van Hövell tot Westerflier Ricardo Almeida Shixiong (Ryan) Zhu Sean Owen* Michael Armbrust* Dongjoon Hyun Jagadeesan As Liwei Lin Weiqing Yang Vaquar Khan Denny Lee Yin Huai* Ryan

Re: [VOTE] Release Apache Spark 2.0.2 (RC3)

2016-11-11 Thread Dongjoon Hyun
Hi. Now, do we have Apache Spark 2.0.2? :) Bests, Dongjoon. On 2016-11-07 22:09 (-0800), Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.0.2. The vote is open until Thu, Nov 10, 2016 at 22:00 PDT and passes if > a majority

withExpr private method duplication in Column and functions objects?

2016-11-11 Thread Jacek Laskowski
Hi, Any reason for withExpr duplication in Column [1] and functions [2] objects? It looks like it could be less private and be at least private[sql]? private def withExpr(newExpr: Expression): Column = new Column(newExpr) [1]

Re: withExpr private method duplication in Column and functions objects?

2016-11-11 Thread Reynold Xin
private[sql] has no impact in Java, and these functions are literally one line of code. It's overkill to think about code duplication for functions that simple. On Fri, Nov 11, 2016 at 1:12 PM, Jacek Laskowski wrote: > Hi, > > Any reason for withExpr duplication in Column [1]