spark git commit: [SPARK-12198][SPARKR] SparkR support read.parquet and deprecate parquetFile

2015-12-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/master db5165246 -> eeb58722a [SPARK-12198][SPARKR] SparkR support read.parquet and deprecate parquetFile SparkR supports ```read.parquet``` and deprecates ```parquetFile```. This change is similar to #10145 for ```jsonFile```. Author: Yanbo

spark git commit: [SPARK-11602][MLLIB] Refine visibility for 1.6 scala API audit

2015-12-10 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master eeb58722a -> 9fba9c800 [SPARK-11602][MLLIB] Refine visibility for 1.6 scala API audit jira: https://issues.apache.org/jira/browse/SPARK-11602 Made a pass on the API change of 1.6. Open the PR for efficient discussion. Author: Yuhao Yang

spark git commit: [SPARK-12250][SQL] Allow users to define a UDAF without providing details of its inputSchema

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d9d354ed4 -> bc5f56aa6 [SPARK-12250][SQL] Allow users to define a UDAF without providing details of its inputSchema https://issues.apache.org/jira/browse/SPARK-12250 Author: Yin Huai Closes #10236 from

spark git commit: [SPARK-12250][SQL] Allow users to define a UDAF without providing details of its inputSchema

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 e541f703d -> 594fafc61 [SPARK-12250][SQL] Allow users to define a UDAF without providing details of its inputSchema https://issues.apache.org/jira/browse/SPARK-12250 Author: Yin Huai Closes #10236 from

spark git commit: [SPARK-12228][SQL] Try to run execution hive's derby in memory.

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master bc5f56aa6 -> ec5f9ed5d [SPARK-12228][SQL] Try to run execution hive's derby in memory. This PR tries to make execution hive's derby run in memory since it is a fake metastore and every time we create a HiveContext, we will switch to a new

[1/2] spark git commit: [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation.

2015-12-10 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master ec5f9ed5d -> 2ecbe02d5 http://git-wip-us.apache.org/repos/asf/spark/blob/2ecbe02d/docs/mllib-clustering.md -- diff --git a/docs/mllib-clustering.md

[2/2] spark git commit: [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation.

2015-12-10 Thread jkbradley
[SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation. Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. It should clarify for new users the difference between

[2/2] spark git commit: [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation.

2015-12-10 Thread jkbradley
[SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation. Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. It should clarify for new users the difference between

[1/2] spark git commit: [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation.

2015-12-10 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.6 594fafc61 -> d0307deaa http://git-wip-us.apache.org/repos/asf/spark/blob/d0307dea/docs/mllib-clustering.md -- diff --git a/docs/mllib-clustering.md

spark git commit: [SPARK-11563][CORE][REPL] Use RpcEnv to transfer REPL-generated classes.

2015-12-10 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 2ecbe02d5 -> 4a46b8859 [SPARK-11563][CORE][REPL] Use RpcEnv to transfer REPL-generated classes. This avoids bringing up yet another HTTP server on the driver, and instead reuses the file server already managed by the driver's RpcEnv. As a

spark git commit: [SPARK-11713] [PYSPARK] [STREAMING] Initial RDD updateStateByKey for PySpark

2015-12-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 4a46b8859 -> 6a6c1fc5c [SPARK-11713] [PYSPARK] [STREAMING] Initial RDD updateStateByKey for PySpark Adding ability to define an initial state RDD for use with updateStateByKey PySpark. Added unit test and changed

spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

2015-12-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.4 c7c99857d -> 43f02e41e [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc With the merge of [SPARK-8337](https://issues.apache.org/jira/browse/SPARK-8337), now the Python API has the same functionalities

spark git commit: [SPARK-12258][SQL] passing null into ScalaUDF

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 5d3722f8e -> d09af2cb4 [SPARK-12258][SQL] passing null into ScalaUDF Check nullability and pass it into ScalaUDF. Closes #10249 Author: Davies Liu Closes #10259 from davies/udf_null. (cherry picked from

spark git commit: [SPARK-12155][SPARK-12253] Fix executor OOM in unified memory management

2015-12-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.6 9870e5c7a -> c247b6a65 [SPARK-12155][SPARK-12253] Fix executor OOM in unified memory management **Problem.** In unified memory management, acquiring execution memory may lead to eviction of storage memory. However, the space freed

spark git commit: [SPARK-12155][SPARK-12253] Fix executor OOM in unified memory management

2015-12-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 23a9e62ba -> 5030923ea [SPARK-12155][SPARK-12253] Fix executor OOM in unified memory management **Problem.** In unified memory management, acquiring execution memory may lead to eviction of storage memory. However, the space freed from

spark git commit: [SPARK-12251] Document and improve off-heap memory configurations

2015-12-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.6 d0307deaa -> 9870e5c7a [SPARK-12251] Document and improve off-heap memory configurations This patch adds documentation for Spark configurations that affect off-heap memory and makes some naming and validation improvements for those

spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

2015-12-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.5 4b99f72f7 -> cb0246c93 [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc With the merge of [SPARK-8337](https://issues.apache.org/jira/browse/SPARK-8337), now the Python API has the same functionalities

spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

2015-12-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5030923ea -> 24d3357d6 [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc With the merge of [SPARK-8337](https://issues.apache.org/jira/browse/SPARK-8337), now the Python API has the same functionalities compared

spark git commit: [SPARK-11602][MLLIB] Refine visibility for 1.6 scala API audit

2015-12-10 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.6 b7b9f7727 -> e65c88536 [SPARK-11602][MLLIB] Refine visibility for 1.6 scala API audit jira: https://issues.apache.org/jira/browse/SPARK-11602 Made a pass on the API change of 1.6. Open the PR for efficient discussion. Author: Yuhao

spark git commit: [SPARK-12234][SPARKR] Fix ```subset``` function error when only set ```select``` argument

2015-12-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.6 e65c88536 -> 93ef24638 [SPARK-12234][SPARKR] Fix ```subset``` function error when only set ```select``` argument Fix ```subset``` function error when only set ```select``` argument. Please refer to the

spark git commit: [SPARK-12198][SPARKR] SparkR support read.parquet and deprecate parquetFile

2015-12-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.6 f939c71b1 -> b7b9f7727 [SPARK-12198][SPARKR] SparkR support read.parquet and deprecate parquetFile SparkR supports ```read.parquet``` and deprecates ```parquetFile```. This change is similar to #10145 for ```jsonFile```. Author:

spark git commit: [SPARK-12136][STREAMING] rddToFileName does not properly handle prefix and suffix parameters

2015-12-10 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 f6d866173 -> b5e5812f9 [SPARK-12136][STREAMING] rddToFileName does not properly handle prefix and suffix parameters The original code does not properly handle the cases where the prefix is null, but suffix is not null - the suffix

spark git commit: [SPARK-12136][STREAMING] rddToFileName does not properly handle prefix and suffix parameters

2015-12-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master d8ec081c9 -> e29704f90 [SPARK-12136][STREAMING] rddToFileName does not properly handle prefix and suffix parameters The original code does not properly handle the cases where the prefix is null, but suffix is not null - the suffix should

spark git commit: [SPARK-11530][MLLIB] Return eigenvalues with PCA model

2015-12-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master e29704f90 -> 21b3d2a75 [SPARK-11530][MLLIB] Return eigenvalues with PCA model Add `computePrincipalComponentsAndVariance` to also compute PCA's explained variance. CC mengxr Author: Sean Owen Closes #9736 from

spark git commit: [SPARK-12234][SPARKR] Fix ```subset``` function error when only set ```select``` argument

2015-12-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 9fba9c800 -> d9d354ed4 [SPARK-12234][SPARKR] Fix ```subset``` function error when only set ```select``` argument Fix ```subset``` function error when only set ```select``` argument. Please refer to the

spark git commit: [SPARK-12012][SQL][BRANCH-1.6] Show more comprehensive PhysicalRDD metadata when visualizing SQL query plan

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 93ef24638 -> e541f703d [SPARK-12012][SQL][BRANCH-1.6] Show more comprehensive PhysicalRDD metadata when visualizing SQL query plan This PR backports PR #10004 to branch-1.6 It adds a private[sql] method metadata to SparkPlan, which

spark git commit: [SPARK-12242][SQL] Add DataFrame.transform method

2015-12-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 b5e5812f9 -> f939c71b1 [SPARK-12242][SQL] Add DataFrame.transform method Author: Reynold Xin Closes #10226 from rxin/df-transform. (cherry picked from commit 76540b6df5370b463277d3498097b2cc2d2e97a8)

spark git commit: [SPARK-12242][SQL] Add DataFrame.transform method

2015-12-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 21b3d2a75 -> 76540b6df [SPARK-12242][SQL] Add DataFrame.transform method Author: Reynold Xin Closes #10226 from rxin/df-transform. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11832][CORE] Process arguments in spark-shell for Scala 2.11

2015-12-10 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 76540b6df -> db5165246 [SPARK-11832][CORE] Process arguments in spark-shell for Scala 2.11 Process arguments passed to the spark-shell. Fixes running the spark-shell from within a build environment. Author: Jakob Odersky