spark git commit: [SPARK-16266][SQL][STREAMING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 153c2f9ac -> f454a7f9f [SPARK-16266][SQL][STREAMING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming ## What changes were proposed in this pull request? - Moved DataStreamReader/Writer from pyspark.sql to

spark git commit: [SPARK-16271][SQL] Implement Hive's UDFXPathUtil

2016-06-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0df5ce1bc -> 153c2f9ac [SPARK-16271][SQL] Implement Hive's UDFXPathUtil ## What changes were proposed in this pull request? This patch ports Hive's UDFXPathUtil over to Spark, which can be used to implement xpath functionality in Spark in
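
Hive's UDFXPathUtil evaluates XPath expressions against XML strings (backing UDFs such as `xpath_string`). A rough plain-Python analogue of that idea, using the standard library's limited XPath support, looks like this; it is an illustration only, not the ported Spark code:

```python
# Illustrative only: extract the text of all nodes matching an XPath-style
# path from an XML string, using the stdlib's limited XPath support.
import xml.etree.ElementTree as ET

def xpath_strings(xml, path):
    """Return the text of every node in `xml` matching `path`."""
    root = ET.fromstring(xml)
    return [el.text for el in root.findall(path)]

print(xpath_strings("<a><b>b1</b><b>b2</b></a>", "b"))  # ['b1', 'b2']
```

The actual Hive utility supports the full XPath 1.0 language; `ElementTree` only handles a small subset.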

spark git commit: [SPARK-16245][ML] model loading backward compatibility for ml.feature.PCA

2016-06-28 Thread meng
Repository: spark Updated Branches: refs/heads/branch-2.0 dd70a115c -> 22b4072e7 [SPARK-16245][ML] model loading backward compatibility for ml.feature.PCA ## What changes were proposed in this pull request? model loading backward compatibility for ml.feature.PCA. ## How was this patch

spark git commit: [SPARK-16245][ML] model loading backward compatibility for ml.feature.PCA

2016-06-28 Thread meng
Repository: spark Updated Branches: refs/heads/master 363bcedee -> 0df5ce1bc [SPARK-16245][ML] model loading backward compatibility for ml.feature.PCA ## What changes were proposed in this pull request? model loading backward compatibility for ml.feature.PCA. ## How was this patch tested?

spark git commit: [SPARK-16248][SQL] Whitelist the list of Hive fallback functions

2016-06-28 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 835c5a3bd -> dd70a115c [SPARK-16248][SQL] Whitelist the list of Hive fallback functions ## What changes were proposed in this pull request? This patch removes the blind fallback into Hive for functions. Instead, it creates a whitelist

spark git commit: [SPARK-16248][SQL] Whitelist the list of Hive fallback functions

2016-06-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5bf8881b3 -> 363bcedee [SPARK-16248][SQL] Whitelist the list of Hive fallback functions ## What changes were proposed in this pull request? This patch removes the blind fallback into Hive for functions. Instead, it creates a whitelist and
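
The shape of this change, sketched in plain Python (not Spark's code, and the whitelist entries below are hypothetical): built-in functions resolve first, a fixed whitelist may delegate to Hive, and everything else fails fast instead of blindly falling back:

```python
# Sketch of whitelist-based fallback resolution (illustrative only).
HIVE_FALLBACK_WHITELIST = {"histogram_numeric", "percentile"}  # hypothetical entries

def resolve_function(name, builtins):
    if name in builtins:
        return builtins[name]              # native implementation wins
    if name in HIVE_FALLBACK_WHITELIST:
        return f"hive:{name}"              # explicit, known-safe delegation
    raise ValueError(f"Undefined function: {name}")  # no blind fallback

builtins = {"upper": str.upper}
print(resolve_function("upper", builtins) is str.upper)  # True
print(resolve_function("percentile", builtins))          # hive:percentile
```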

spark git commit: [SPARK-16268][PYSPARK] SQLContext should import DataStreamReader

2016-06-28 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.0 d7a59f1f4 -> 835c5a3bd [SPARK-16268][PYSPARK] SQLContext should import DataStreamReader ## What changes were proposed in this pull request? Fixed the following error: ``` >>> sqlContext.readStream Traceback (most recent call last):

spark git commit: [SPARK-16268][PYSPARK] SQLContext should import DataStreamReader

2016-06-28 Thread tdas
Repository: spark Updated Branches: refs/heads/master 823518c2b -> 5bf8881b3 [SPARK-16268][PYSPARK] SQLContext should import DataStreamReader ## What changes were proposed in this pull request? Fixed the following error: ``` >>> sqlContext.readStream Traceback (most recent call last): File
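
The failure mode here is a classic one: a property body references a class name that the module never imported, so the `NameError` only surfaces when the property is first accessed. A minimal illustration (not Spark's code):

```python
# Minimal illustration of the bug class fixed by this change: the missing
# import is only detected lazily, on first attribute access.
class FakeSQLContext:
    @property
    def readStream(self):
        # DataStreamReader was never imported in this module, so accessing
        # the property raises NameError at runtime.
        return DataStreamReader(self)

ctx = FakeSQLContext()
try:
    ctx.readStream
except NameError as e:
    print("NameError:", e)
```

Adding the import at module level makes the error impossible, which is exactly why such fixes are usually paired with a unit test that touches the property.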

spark git commit: [SPARKR] add csv tests

2016-06-28 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 52c9d69f7 -> d7a59f1f4 [SPARKR] add csv tests ## What changes were proposed in this pull request? Add unit tests for csv data for SPARKR ## How was this patch tested? unit tests Author: Felix Cheung

spark git commit: [SPARKR] add csv tests

2016-06-28 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 5545b7910 -> 823518c2b [SPARKR] add csv tests ## What changes were proposed in this pull request? Add unit tests for csv data for SPARKR ## How was this patch tested? unit tests Author: Felix Cheung Closes

spark git commit: [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter`

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 5fb7804e5 -> 52c9d69f7 [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter` ## What changes were proposed in this pull request? Fixes a couple old references to

spark git commit: [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter`

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 3554713a1 -> 5545b7910 [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter` ## What changes were proposed in this pull request? Fixes a couple old references to `DataFrameWriter.startStream`

spark git commit: [SPARK-16114][SQL] structured streaming network word count examples

2016-06-28 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.0 d73c38ed0 -> 5fb7804e5 [SPARK-16114][SQL] structured streaming network word count examples ## What changes were proposed in this pull request? Network word count example for structured streaming ## How was this patch tested? Run

spark git commit: [SPARK-16114][SQL] structured streaming network word count examples

2016-06-28 Thread tdas
Repository: spark Updated Branches: refs/heads/master 8a977b065 -> 3554713a1 [SPARK-16114][SQL] structured streaming network word count examples ## What changes were proposed in this pull request? Network word count example for structured streaming ## How was this patch tested? Run locally
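
What the example computes can be sketched in plain Python (the real example reads an unbounded stream from Spark's socket source and runs `groupBy`/`count` in complete output mode; this is only the logic, not PySpark code):

```python
# Running word count over a stream of lines, sketched without Spark.
from collections import Counter

def running_word_count(lines):
    counts = Counter()
    for line in lines:          # each micro-batch delivers new lines
        counts.update(line.split())
        yield dict(counts)      # "complete" mode: emit the full counts each batch

batches = list(running_word_count(["apache spark", "spark streaming"]))
print(batches[-1])  # {'apache': 1, 'spark': 2, 'streaming': 1}
```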

spark git commit: [SPARK-16100][SQL] fix bug when using Map as the buffer type of Aggregator

2016-06-28 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 5626a0af5 -> d73c38ed0 [SPARK-16100][SQL] fix bug when using Map as the buffer type of Aggregator ## What changes were proposed in this pull request? The root cause is in `MapObjects`. Its parameter `loopVar` is not declared as child,

spark git commit: [SPARK-16100][SQL] fix bug when using Map as the buffer type of Aggregator

2016-06-28 Thread lian
Repository: spark Updated Branches: refs/heads/master 25520e976 -> 8a977b065 [SPARK-16100][SQL] fix bug when using Map as the buffer type of Aggregator ## What changes were proposed in this pull request? The root cause is in `MapObjects`. Its parameter `loopVar` is not declared as child, but

spark git commit: [SPARK-16236][SQL] Add Path Option back to Load API in DataFrameReader

2016-06-28 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 43bd612f3 -> 5626a0af5 [SPARK-16236][SQL] Add Path Option back to Load API in DataFrameReader ## What changes were proposed in this pull request? koertkuipers identified the PR https://github.com/apache/spark/pull/13727/ changed the

spark git commit: [SPARK-16236][SQL] Add Path Option back to Load API in DataFrameReader

2016-06-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 35438fb0a -> 25520e976 [SPARK-16236][SQL] Add Path Option back to Load API in DataFrameReader ## What changes were proposed in this pull request? koertkuipers identified the PR https://github.com/apache/spark/pull/13727/ changed the

spark git commit: [SPARK-16175] [PYSPARK] handle None for UDT

2016-06-28 Thread davies
Repository: spark Updated Branches: refs/heads/master 1aad8c6e5 -> 35438fb0a [SPARK-16175] [PYSPARK] handle None for UDT ## What changes were proposed in this pull request? Scala UDTs bypass all nulls and do not pass them into serialize() and deserialize(); this PR updates

spark git commit: [SPARK-16175] [PYSPARK] handle None for UDT

2016-06-28 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 5c9555e11 -> 43bd612f3 [SPARK-16175] [PYSPARK] handle None for UDT ## What changes were proposed in this pull request? Scala UDTs bypass all nulls and do not pass them into serialize() and deserialize(); this PR
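
The behavior being matched on the Python side can be sketched in plain Python (illustrative only; the UDT and its storage format below are hypothetical, not PySpark's actual classes): nulls should bypass serialize()/deserialize() entirely rather than being handed to user code.

```python
# Sketch: make a UDT's conversion functions null-safe by short-circuiting None.
def null_safe(f):
    def wrapper(value):
        return None if value is None else f(value)
    return wrapper

@null_safe
def serialize(point):
    # Hypothetical UDT that stores a 2-D point as a list.
    return list(point)

print(serialize((1.0, 2.0)))  # [1.0, 2.0]
print(serialize(None))        # None  (serialize body never runs)
```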

spark git commit: [SPARK-16259][PYSPARK] cleanup options in DataFrame read/write API

2016-06-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master ae14f3623 -> 1aad8c6e5 [SPARK-16259][PYSPARK] cleanup options in DataFrame read/write API ## What changes were proposed in this pull request? There is some duplicated code for options in the DataFrame reader/writer API; this PR cleans them

spark git commit: [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c86d29b2e -> 5c9555e11 [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID ## What changes were proposed in this pull request? Previously, the TaskLocation implementation would not allow for executor ids

spark git commit: [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d59ba8e30 -> ae14f3623 [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID ## What changes were proposed in this pull request? Previously, the TaskLocation implementation would not allow for executor ids
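
The parsing issue is easy to see in a plain-Python sketch (illustrative, not Spark's Scala `TaskLocation` code): an executor location is encoded roughly as `executor_<host>_<executorId>`, and splitting on every underscore breaks when the executor id itself contains underscores (as YARN container ids do). Splitting only once, after the prefix, does not:

```python
# Sketch: parse "executor_<host>_<executorId>" without choking on
# underscores inside the executor id.
def parse_executor_location(s):
    assert s.startswith("executor_"), "not an executor location"
    # Split only on the first underscore after the prefix; the remainder is
    # the executor id, underscores and all.
    host, executor_id = s[len("executor_"):].split("_", 1)
    return host, executor_id

print(parse_executor_location("executor_host1_container_e04_1234"))
# ('host1', 'container_e04_1234')
```

This relies on hostnames not containing underscores, which DNS hostnames do not.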

spark git commit: [MINOR][SPARKR] update sparkR DataFrame.R comment

2016-06-28 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 a1d04cc03 -> c86d29b2e [MINOR][SPARKR] update sparkR DataFrame.R comment ## What changes were proposed in this pull request? update sparkR DataFrame.R comment SQLContext ==> SparkSession ## How was this patch tested? N/A Author:

spark git commit: [MINOR][SPARKR] update sparkR DataFrame.R comment

2016-06-28 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 26252f706 -> d59ba8e30 [MINOR][SPARKR] update sparkR DataFrame.R comment ## What changes were proposed in this pull request? update sparkR DataFrame.R comment SQLContext ==> SparkSession ## How was this patch tested? N/A Author:

spark git commit: [SPARK-15643][DOC][ML] Update spark.ml and spark.mllib migration guide from 1.6 to 2.0

2016-06-28 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 1f2776df6 -> 26252f706 [SPARK-15643][DOC][ML] Update spark.ml and spark.mllib migration guide from 1.6 to 2.0 ## What changes were proposed in this pull request? Update ```spark.ml``` and ```spark.mllib``` migration guide from 1.6 to 2.0.

spark git commit: [SPARK-15643][DOC][ML] Update spark.ml and spark.mllib migration guide from 1.6 to 2.0

2016-06-28 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-2.0 e68872f2e -> a1d04cc03 [SPARK-15643][DOC][ML] Update spark.ml and spark.mllib migration guide from 1.6 to 2.0 ## What changes were proposed in this pull request? Update ```spark.ml``` and ```spark.mllib``` migration guide from 1.6 to

spark git commit: [SPARK-16181][SQL] outer join with isNull filter may return wrong result

2016-06-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 4c5e16f58 -> e68872f2e [SPARK-16181][SQL] outer join with isNull filter may return wrong result ## What changes were proposed in this pull request? The root cause is: the output attributes of outer join are derived from its children,

spark git commit: [SPARK-16181][SQL] outer join with isNull filter may return wrong result

2016-06-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 0923c4f56 -> 1f2776df6 [SPARK-16181][SQL] outer join with isNull filter may return wrong result ## What changes were proposed in this pull request? The root cause is: the output attributes of outer join are derived from its children,

spark git commit: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's configs need to be set to the existing Scala SparkContext's SparkConf

2016-06-28 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 b349237e4 -> 4c5e16f58 [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's configs need to be set to the existing Scala SparkContext's SparkConf ## What changes were proposed in this pull request? When we create a SparkSession at the

spark git commit: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's configs need to be set to the existing Scala SparkContext's SparkConf

2016-06-28 Thread davies
Repository: spark Updated Branches: refs/heads/master e158478a9 -> 0923c4f56 [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's configs need to be set to the existing Scala SparkContext's SparkConf ## What changes were proposed in this pull request? When we create a SparkSession at the
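
The intent of the fix, sketched in plain Python with stand-in classes (not the PySpark implementation): when the session builder finds an already-running context, its accumulated options must be copied into that context's existing conf rather than silently dropped.

```python
# Sketch: propagate builder options into an existing context's conf.
class FakeContext:
    def __init__(self):
        self.conf = {}  # stand-in for the JVM SparkConf

def get_or_create_session(existing_context, builder_options):
    # The builder cannot create a fresh conf here; it must mutate the one
    # the live context is already using.
    existing_context.conf.update(builder_options)
    return existing_context

ctx = FakeContext()
get_or_create_session(ctx, {"spark.sql.shuffle.partitions": "4"})
print(ctx.conf)  # {'spark.sql.shuffle.partitions': '4'}
```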

spark git commit: [SPARK-16242][MLLIB][PYSPARK] Conversion between old/new matrix columns in a DataFrame (Python)

2016-06-28 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 af70ad028 -> b349237e4 [SPARK-16242][MLLIB][PYSPARK] Conversion between old/new matrix columns in a DataFrame (Python) ## What changes were proposed in this pull request? This PR implements python wrappers for #13888 to convert

spark git commit: [SPARK-16242][MLLIB][PYSPARK] Conversion between old/new matrix columns in a DataFrame (Python)

2016-06-28 Thread yliang
Repository: spark Updated Branches: refs/heads/master f6b497fcd -> e158478a9 [SPARK-16242][MLLIB][PYSPARK] Conversion between old/new matrix columns in a DataFrame (Python) ## What changes were proposed in this pull request? This PR implements python wrappers for #13888 to convert old/new

spark git commit: [SPARK-16128][SQL] Allow setting length of characters to be truncated to, in Dataset.show function.

2016-06-28 Thread prashant
Repository: spark Updated Branches: refs/heads/master 4cbf611c1 -> f6b497fcd [SPARK-16128][SQL] Allow setting length of characters to be truncated to, in Dataset.show function. ## What changes were proposed in this pull request? Allowing truncation to a specific number of characters is
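
The truncation rule itself is simple; a hypothetical helper (not Spark's code) showing one plausible behavior, where cells longer than `n` characters are cut down and end in `...`:

```python
# Sketch of per-cell truncation to a configurable width (illustrative only).
def truncate_cell(s, n):
    # Keep short cells intact; cut long ones so the result is exactly n
    # characters, ending in "...". Assumes n > 3.
    return s if len(s) <= n else s[: n - 3] + "..."

print(truncate_cell("1234567890", 5))  # '12...'
print(truncate_cell("123", 5))         # '123'
```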

spark git commit: [SPARK-16202][SQL][DOC] Correct The Description of CreatableRelationProvider's createRelation

2016-06-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master dd6b7dbe7 -> 4cbf611c1 [SPARK-16202][SQL][DOC] Correct The Description of CreatableRelationProvider's createRelation ## What changes were proposed in this pull request? The API description of `createRelation` in