+1 (non-binding)
Build: mvn -T 1C -Psparkr -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver
-DskipTests clean package
Test: mvn -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver
-Dtest.exclude.tags=org.apache.spark.tags.DockerTest -fn test
Test options: -Xss2048k -Dspark.buffer.pageSize=1048576
When we train the model, we use the data with a subSampleRate, so if
subSampleRate < 1.0, we can sample first to reduce memory usage.
See the code below in GradientBoostedTrees.boost():
while (m < numIterations && !doneLearning) {
// Update data with pseudo-residuals (residual error)
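
The proposal above could be sketched without Spark as follows. This is only an illustration on plain Scala collections, not the code in GradientBoostedTrees: `LabeledRow` and `subsampleOnce` are hypothetical names. Note also that sampling once up front is not the same as drawing a fresh subsample on every iteration, so it trades some of the algorithm's randomness for the smaller footprint.

```scala
import scala.util.Random

object SubsampleSketch {
  // Hypothetical stand-in for a labeled training point.
  final case class LabeledRow(label: Double, features: Vector[Double])

  // Bernoulli sampling without replacement: keep each row with probability `rate`.
  // When rate >= 1.0 the input is returned untouched, so no copy is made.
  def subsampleOnce(data: Seq[LabeledRow], rate: Double, seed: Long): Seq[LabeledRow] = {
    require(rate > 0.0, "rate must be positive")
    if (rate >= 1.0) data
    else {
      val rng = new Random(seed)
      data.filter(_ => rng.nextDouble() < rate)
    }
  }

  def main(args: Array[String]): Unit = {
    val data = (1 to 1000).map(i => LabeledRow((i % 2).toDouble, Vector(i.toDouble)))
    val sampled = subsampleOnce(data, rate = 0.5, seed = 42L)
    println(s"kept ${sampled.size} of ${data.size} rows")
  }
}
```

A fixed seed makes the subsample reproducible, which helps when comparing memory usage between runs.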
I am new to Spark SQL development. I have a JSON file with nested arrays.
I can extract/query these arrays. However, when I add an ORDER BY clause, I get
exceptions. Here are the steps:
1) val a = sparkSession.sql("SELECT Tables.TableName, Tables.TableType,
Tables.TableExecOrder, Tables.Columns
The vote has passed with the following +1s and no -1. I will work on
packaging the release.
+1:
Reynold Xin*
Herman van Hövell tot Westerflier
Ricardo Almeida
Shixiong (Ryan) Zhu
Sean Owen*
Michael Armbrust*
Dongjoon Hyun
Jagadeesan As
Liwei Lin
Weiqing Yang
Vaquar Khan
Denny Lee
Yin Huai*
Ryan
Hi.
Now, do we have Apache Spark 2.0.2? :)
Bests,
Dongjoon.
On 2016-11-07 22:09 (-0800), Reynold Xin wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.2. The vote is open until Thu, Nov 10, 2016 at 22:00 PDT and passes if
> a majority
Hi,
Any reason for withExpr duplication in Column [1] and functions [2]
objects? It looks like it could be less private and be at least
private[sql]?
private def withExpr(newExpr: Expression): Column = new Column(newExpr)
[1]
private[sql] has no impact in Java, and these functions are literally one
line of code. It's overkill to think about code duplication for functions
that simple.
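
For reference, the two options being discussed look roughly like this. It is a self-contained sketch, not Spark's source: `Expression` and `Column` here are hypothetical stand-ins, `ColumnOps`, `upper`, `lit`, and `reuse` are made up for illustration, and an enclosing object named `sql` plays the role of the package so the snippet compiles on its own.

```scala
object sql {
  // Hypothetical stand-ins for Catalyst's Expression and Spark's Column.
  final case class Expression(text: String)
  final class Column(val expr: Expression)

  object ColumnOps {
    // Option A: a private one-liner, repeated wherever it is needed.
    private def withExpr(newExpr: Expression): Column = new Column(newExpr)
    def upper(c: Column): Column = withExpr(Expression(s"upper(${c.expr.text})"))
  }

  object functions {
    // Option B: the same one-liner widened to the enclosing `sql` scope for reuse.
    private[sql] def withExpr(newExpr: Expression): Column = new Column(newExpr)
    def lit(value: Any): Column = withExpr(Expression(value.toString))
  }

  // Compiles only because `functions.withExpr` is visible throughout `sql`.
  def reuse(e: Expression): Column = functions.withExpr(e)
}
```

Either way the helper stays a single line, so the second option saves one line at the cost of a wider visibility.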
On Fri, Nov 11, 2016 at 1:12 PM, Jacek Laskowski wrote:
> Hi,
>
> Any reason for withExpr duplication in Column [1]