spark git commit: [SPARK-15585][SQL] Fix NULL handling along with a spark-csv behaivour

2016-06-06 Thread rxin
Repository: spark Updated Branches: refs/heads/master 79268aa46 -> b7e8d1cb3 [SPARK-15585][SQL] Fix NULL handling along with a spark-csv behaivour ## What changes were proposed in this pull request? This pr fixes the behaviour of `format("csv").option("quote", null)` along with one of

spark git commit: [SPARK-15585][SQL] Fix NULL handling along with a spark-csv behaivour

2016-06-06 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 790de600b -> 9e7e2f916 [SPARK-15585][SQL] Fix NULL handling along with a spark-csv behaivour ## What changes were proposed in this pull request? This pr fixes the behaviour of `format("csv").option("quote", null)` along with one of

spark git commit: [SPARK-15748][SQL] Replace inefficient foldLeft() call with flatMap() in PartitionStatistics

2016-06-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 38a626a54 -> d8370ef11 [SPARK-15748][SQL] Replace inefficient foldLeft() call with flatMap() in PartitionStatistics `PartitionStatistics` uses `foldLeft` and list concatenation (`++`) to flatten an iterator of lists, but this is

spark git commit: [SPARK-15748][SQL] Replace inefficient foldLeft() call with flatMap() in PartitionStatistics

2016-06-05 Thread rxin
Repository: spark Updated Branches: refs/heads/master 30c4774f3 -> 26c1089c3 [SPARK-15748][SQL] Replace inefficient foldLeft() call with flatMap() in PartitionStatistics `PartitionStatistics` uses `foldLeft` and list concatenation (`++`) to flatten an iterator of lists, but this is
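
The snippet above is cut off before it explains why `foldLeft` with `++` is inefficient, so the following is only an illustrative sketch, not the code from the patch. It uses Python analogues (`functools.reduce` with list `+` standing in for `foldLeft` with `++`, and `itertools.chain.from_iterable` standing in for `flatMap`); the function names and sample data are hypothetical.

```python
from functools import reduce
from itertools import chain


def flatten_with_fold(lists):
    # Analogue of foldLeft with `++`: every step builds a new list by copying
    # the accumulator, so total copying grows quadratically with the output.
    return reduce(lambda acc, xs: acc + xs, lists, [])


def flatten_with_flat_map(lists):
    # Analogue of flatMap: each element is visited exactly once.
    return list(chain.from_iterable(lists))


# Hypothetical sample input: an iterator of lists, as in the commit description.
batches = [[1, 2], [3], [], [4, 5, 6]]
assert flatten_with_fold(iter(batches)) == flatten_with_flat_map(iter(batches))
```

Both functions return the same flattened list; only the amount of intermediate copying differs.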

spark git commit: [SPARK-15770][ML] Annotation audit for Experimental and DeveloperApi

2016-06-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 8c0ec85e6 -> 1ece135b9 [SPARK-15770][ML] Annotation audit for Experimental and DeveloperApi ## What changes were proposed in this pull request? 1, remove comments `:: Experimental ::` for non-experimental API 2, add comments `::

spark git commit: [SPARK-15770][ML] Annotation audit for Experimental and DeveloperApi

2016-06-05 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4e767d0f9 -> 372fa61f5 [SPARK-15770][ML] Annotation audit for Experimental and DeveloperApi ## What changes were proposed in this pull request? 1, remove comments `:: Experimental ::` for non-experimental API 2, add comments `::

spark git commit: [SPARK-15756][SQL] Support command 'create table stored as orcfile/parquetfile/avrofile'

2016-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 61d729abd -> 2ca563cc4 [SPARK-15756][SQL] Support command 'create table stored as orcfile/parquetfile/avrofile' ## What changes were proposed in this pull request? Now Spark SQL can support 'create table src stored as orc/parquet/avro'

spark git commit: [SPARK-15756][SQL] Support command 'create table stored as orcfile/parquetfile/avrofile'

2016-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 a2540b936 -> cf8782116 [SPARK-15756][SQL] Support command 'create table stored as orcfile/parquetfile/avrofile' ## What changes were proposed in this pull request? Now Spark SQL can support 'create table src stored as

spark git commit: [SPARK-15744][SQL] Rename two TungstenAggregation*Suites and update codgen/error messages/comments

2016-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 306601282 -> 3a9ee549c [SPARK-15744][SQL] Rename two TungstenAggregation*Suites and update codgen/error messages/comments ## What changes were proposed in this pull request? For consistency, this PR updates some remaining

spark git commit: [SPARK-15744][SQL] Rename two TungstenAggregation*Suites and update codgen/error messages/comments

2016-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master f7288e166 -> b9fcfb3bd [SPARK-15744][SQL] Rename two TungstenAggregation*Suites and update codgen/error messages/comments ## What changes were proposed in this pull request? For consistency, this PR updates some remaining

spark git commit: [SPARK-15745][SQL] Use classloader's getResource() for reading resource files in HiveTests

2016-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1e13d09c5 -> 306601282 [SPARK-15745][SQL] Use classloader's getResource() for reading resource files in HiveTests ## What changes were proposed in this pull request? This is a cleaner approach in general but my motivation behind this

spark git commit: [SPARK-15745][SQL] Use classloader's getResource() for reading resource files in HiveTests

2016-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 76aa45d35 -> f7288e166 [SPARK-15745][SQL] Use classloader's getResource() for reading resource files in HiveTests ## What changes were proposed in this pull request? This is a cleaner approach in general but my motivation behind this

[1/2] spark git commit: [SPARK-15728][SQL] Rename aggregate operators: HashAggregate and SortAggregate

2016-06-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 841523cdc -> cd7bf4b8e http://git-wip-us.apache.org/repos/asf/spark/blob/cd7bf4b8/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala

[2/2] spark git commit: [SPARK-15728][SQL] Rename aggregate operators: HashAggregate and SortAggregate

2016-06-02 Thread rxin
. This patch renames them HashAggregate and SortAggregate. ## How was this patch tested? Updated test cases. Author: Reynold Xin <r...@databricks.com> Closes #13465 from rxin/SPARK-15728. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/rep

[2/2] spark git commit: [SPARK-15728][SQL] Rename aggregate operators: HashAggregate and SortAggregate

2016-06-02 Thread rxin
. This patch renames them HashAggregate and SortAggregate. ## How was this patch tested? Updated test cases. Author: Reynold Xin <r...@databricks.com> Closes #13465 from rxin/SPARK-15728. (cherry picked from commit 8900c8d8ff1614b5ec5a2ce213832fa13462b4d4) Signed-off-by: Reynold

spark git commit: [SPARK-14752][SQL] Explicitly implement KryoSerialization for LazilyGenerateOrdering

2016-06-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 18d613a4d -> 841523cdc [SPARK-14752][SQL] Explicitly implement KryoSerialization for LazilyGenerateOrdering ## What changes were proposed in this pull request? This patch fixes a number of `com.esotericsoftware.kryo.KryoException:

spark git commit: [SPARK-14752][SQL] Explicitly implement KryoSerialization for LazilyGenerateOrdering

2016-06-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7c07d176f -> 09b3c56c9 [SPARK-14752][SQL] Explicitly implement KryoSerialization for LazilyGenerateOrdering ## What changes were proposed in this pull request? This patch fixes a number of `com.esotericsoftware.kryo.KryoException:

spark git commit: [SPARK-15702][DOCUMENTATION] Update document programming-guide accumulator section

2016-06-01 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 beb4ea0b4 -> 47902d4bc [SPARK-15702][DOCUMENTATION] Update document programming-guide accumulator section ## What changes were proposed in this pull request? Update document programming-guide accumulator section (scala language) java

spark git commit: [SPARK-15702][DOCUMENTATION] Update document programming-guide accumulator section

2016-06-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 07a98ca4c -> 2402b9146 [SPARK-15702][DOCUMENTATION] Update document programming-guide accumulator section ## What changes were proposed in this pull request? Update document programming-guide accumulator section (scala language) java and

spark git commit: [SPARK-15680][SQL] Disable comments in generated code in order to avoid perf. issues

2016-05-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 223f1d58c -> 8ca01a6fe [SPARK-15680][SQL] Disable comments in generated code in order to avoid perf. issues ## What changes were proposed in this pull request? In benchmarks involving tables with very wide and complex schemas (thousands

spark git commit: [SPARK-15680][SQL] Disable comments in generated code in order to avoid perf. issues

2016-05-31 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 978f54e76 -> f0e8738c1 [SPARK-15680][SQL] Disable comments in generated code in order to avoid perf. issues ## What changes were proposed in this pull request? In benchmarks involving tables with very wide and complex schemas

spark git commit: [SPARK-15649][SQL] Avoid to serialize MetastoreRelation in HiveTableScanExec

2016-05-31 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 2e3ead20c -> e11046457 [SPARK-15649][SQL] Avoid to serialize MetastoreRelation in HiveTableScanExec ## What changes were proposed in this pull request? In HiveTableScanExec, schema is lazy and is related to relation.attributeMap. So

spark git commit: [SPARK-15649][SQL] Avoid to serialize MetastoreRelation in HiveTableScanExec

2016-05-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 95db8a44f -> 2bfc4f152 [SPARK-15649][SQL] Avoid to serialize MetastoreRelation in HiveTableScanExec ## What changes were proposed in this pull request? In HiveTableScanExec, schema is lazy and is related to relation.attributeMap. So it

spark git commit: [SPARK-15638][SQL] Audit Dataset, SparkSession, and SQLContext

2016-05-30 Thread rxin
ion, and SQLContext. The patch audits the categorization of experimental APIs, function groups, and deprecations. For the detailed list of changes, please see the diff. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13370 from rxin/SPARK-15638. Project: http:

spark git commit: [SPARK-15553][SQL] Dataset.createTempView should use CreateViewCommand

2016-05-27 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 ada319844 -> 36045106d [SPARK-15553][SQL] Dataset.createTempView should use CreateViewCommand ## What changes were proposed in this pull request? Let `Dataset.createTempView` and `Dataset.createOrReplaceTempView` use

spark git commit: [SPARK-15553][SQL] Dataset.createTempView should use CreateViewCommand

2016-05-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master 73178c755 -> f1b220eee [SPARK-15553][SQL] Dataset.createTempView should use CreateViewCommand ## What changes were proposed in this pull request? Let `Dataset.createTempView` and `Dataset.createOrReplaceTempView` use `CreateViewCommand`,

[1/2] spark git commit: [SPARK-15633][MINOR] Make package name for Java tests consistent

2016-05-27 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 3801fb4f3 -> ada319844 http://git-wip-us.apache.org/repos/asf/spark/blob/ada31984/external/java8-tests/src/test/java/test/org/apache/spark/java8/dstream/Java8APISuite.java

[2/2] spark git commit: [SPARK-15633][MINOR] Make package name for Java tests consistent

2016-05-27 Thread rxin
added "java8" as the package name so we can easily run all the tests related to Java 8. ## How was this patch tested? This is a test only change. Author: Reynold Xin <r...@databricks.com> Closes #13364 from rxin/SPARK-15633. Project: http://git-wip-us.apache.org/repos/asf/s

[1/2] spark git commit: [SPARK-15633][MINOR] Make package name for Java tests consistent

2016-05-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9893dc975 -> 73178c755 http://git-wip-us.apache.org/repos/asf/spark/blob/73178c75/external/java8-tests/src/test/java/test/org/apache/spark/java8/dstream/Java8APISuite.java

spark git commit: [SPARK-14400][SQL] ScriptTransformation does not fail the job for bad user command

2016-05-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master b376a4eab -> a96e4151a [SPARK-14400][SQL] ScriptTransformation does not fail the job for bad user command ## What changes were proposed in this pull request? - Refer to the Jira for the problem: jira :

spark git commit: [SPARK-14400][SQL] ScriptTransformation does not fail the job for bad user command

2016-05-27 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 5ea58898c -> d76e066d3 [SPARK-14400][SQL] ScriptTransformation does not fail the job for bad user command ## What changes were proposed in this pull request? - Refer to the Jira for the problem: jira :

[1/2] spark git commit: [SPARK-15529][SQL] Replace SQLContext and HiveContext with SparkSession in Test

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6b1a6180e -> d5911d117 http://git-wip-us.apache.org/repos/asf/spark/blob/d5911d11/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala -- diff --git

[1/2] spark git commit: [SPARK-15529][SQL] Replace SQLContext and HiveContext with SparkSession in Test

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 9c7e6ad28 -> b3845fede http://git-wip-us.apache.org/repos/asf/spark/blob/b3845fed/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala -- diff --git

[2/2] spark git commit: [SPARK-15529][SQL] Replace SQLContext and HiveContext with SparkSession in Test

2016-05-26 Thread rxin
[SPARK-15529][SQL] Replace SQLContext and HiveContext with SparkSession in Test What changes were proposed in this pull request? This PR uses the new entry point `SparkSession` to replace the existing `SQLContext` and `HiveContext` in SQL test suites. No change is made in the following

[2/2] spark git commit: [SPARK-15529][SQL] Replace SQLContext and HiveContext with SparkSession in Test

2016-05-26 Thread rxin
[SPARK-15529][SQL] Replace SQLContext and HiveContext with SparkSession in Test What changes were proposed in this pull request? This PR uses the new entry point `SparkSession` to replace the existing `SQLContext` and `HiveContext` in SQL test suites. No change is made in the following

spark git commit: [MINOR] Fix Typos 'a -> an'

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master ee3609a2e -> 6b1a6180e [MINOR] Fix Typos 'a -> an' ## What changes were proposed in this pull request? `a` -> `an` I use regex to generate potential error lines: `grep -in ' a [aeiou]' mllib/src/main/scala/org/apache/spark/ml/*/*scala`

spark git commit: [MINOR] Fix Typos 'a -> an'

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 64d477cd4 -> 9c7e6ad28 [MINOR] Fix Typos 'a -> an' ## What changes were proposed in this pull request? `a` -> `an` I use regex to generate potential error lines: `grep -in ' a [aeiou]'

spark git commit: [MINOR][CORE] Fixed doc for Accumulator2.add

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master c82883239 -> ee3609a2e [MINOR][CORE] Fixed doc for Accumulator2.add ## What changes were proposed in this pull request? Scala doc used outdated ```+=```. Replaced with ```add```. ## How was this patch tested? N/A Author: Joseph K.

spark git commit: [MINOR][CORE] Fixed doc for Accumulator2.add

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 c1468447e -> 64d477cd4 [MINOR][CORE] Fixed doc for Accumulator2.add ## What changes were proposed in this pull request? Scala doc used outdated ```+=```. Replaced with ```add```. ## How was this patch tested? N/A Author: Joseph K.

spark git commit: [BUILD][1.6] Fix compilation

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 0b8bdf793 -> c53c83ce8 [BUILD][1.6] Fix compilation ## What changes were proposed in this pull request? Makes `UnsafeSortDataFormat` and `RecordPointerAndKeyPrefix` public. These are already public in 2.0 and are used in an

spark git commit: [SPARK-8428][SPARK-13850] Fix integer overflows in TimSort

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 5cc1e2cec -> 0b8bdf793 [SPARK-8428][SPARK-13850] Fix integer overflows in TimSort This patch fixes a few integer overflows in `UnsafeSortDataFormat.copyRange()` and `ShuffleSortDataFormat.copyRange()` that seem to be the most likely

spark git commit: [SPARK-8428][SPARK-13850] Fix integer overflows in TimSort

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 7393449db -> 29681cca1 [SPARK-8428][SPARK-13850] Fix integer overflows in TimSort ## What changes were proposed in this pull request? This patch fixes a few integer overflows in `UnsafeSortDataFormat.copyRange()` and

spark git commit: [SPARK-8428][SPARK-13850] Fix integer overflows in TimSort

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master b5859e0bb -> fe6de16f7 [SPARK-8428][SPARK-13850] Fix integer overflows in TimSort ## What changes were proposed in this pull request? This patch fixes a few integer overflows in `UnsafeSortDataFormat.copyRange()` and
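
The snippets above do not show the arithmetic that was fixed, so this is only a sketch of the general failure mode: a byte offset computed in 32-bit integer arithmetic wraps around once it exceeds the signed range. It is not the actual `copyRange()` code. The helper names and the 16-byte record size are assumptions made for illustration, and the 32-bit wraparound is simulated because Python integers do not overflow.

```python
def to_int32(value):
    # Wrap an arbitrary Python int into the signed 32-bit range, mimicking
    # what fixed-width 32-bit integer arithmetic would produce.
    return (value + 2**31) % 2**32 - 2**31


def offset_int32(pos, bytes_per_record):
    # Offset computed entirely in (simulated) 32-bit arithmetic: may wrap.
    return to_int32(pos * bytes_per_record)


def offset_wide(pos, bytes_per_record):
    # Offset computed in a wider integer type: no wraparound.
    return pos * bytes_per_record


# Hypothetical values: 200 million records of 16 bytes each.
pos = 200_000_000
assert offset_wide(pos, 16) == 3_200_000_000
assert offset_int32(pos, 16) < 0  # wrapped to a negative offset
```

Widening the operands before the multiplication, rather than widening the result afterwards, is the usual way to avoid this class of bug.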

spark git commit: [SPARK-13445][SQL] Improves error message and add test coverage for Window function

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master b0a03feef -> b5859e0bb [SPARK-13445][SQL] Improves error message and add test coverage for Window function ## What changes were proposed in this pull request? Add a more verbose error message when the order by clause is missing when using

spark git commit: [SPARK-13445][SQL] Improves error message and add test coverage for Window function

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 216e39505 -> 7393449db [SPARK-13445][SQL] Improves error message and add test coverage for Window function ## What changes were proposed in this pull request? Add a more verbose error message when the order by clause is missing when using

spark git commit: [SPARK-15537][SQL] fix dir delete issue

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 b3ee53b84 -> 36acd53e8 [SPARK-15537][SQL] fix dir delete issue ## What changes were proposed in this pull request? Some of the test cases, e.g. `OrcSourceSuite`, create temp folders and temp files inside them. But after

spark git commit: [SPARK-15537][SQL] fix dir delete issue

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 361ebc282 -> 53d4abe9e [SPARK-15537][SQL] fix dir delete issue ## What changes were proposed in this pull request? Some of the test cases, e.g. `OrcSourceSuite`, create temp folders and temp files inside them. But after tests

[1/4] spark git commit: [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 bcad1d13f -> b3ee53b84 http://git-wip-us.apache.org/repos/asf/spark/blob/b3ee53b8/sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala -- diff --git

[3/4] spark git commit: [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing

2016-05-26 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/361ebc28/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala -- diff --git

[1/4] spark git commit: [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing

2016-05-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master dfc9fc02c -> 361ebc282 http://git-wip-us.apache.org/repos/asf/spark/blob/361ebc28/sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala -- diff --git

[2/4] spark git commit: [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing

2016-05-26 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/b3ee53b8/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/DefaultSource.scala -- diff --git

[4/4] spark git commit: [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing

2016-05-26 Thread rxin
old Xin <r...@databricks.com> Closes #13311 from rxin/SPARK-15543. (cherry picked from commit 361ebc282b2d09dc6dcf21419a53c5c617b1b6bd) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/

[3/4] spark git commit: [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing

2016-05-26 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/b3ee53b8/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala -- diff --git

[4/4] spark git commit: [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing

2016-05-26 Thread rxin
old Xin <r...@databricks.com> Closes #13311 from rxin/SPARK-15543. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/361ebc28 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/361ebc28 Diff: http://git-wip-us.apache.org

[2/4] spark git commit: [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing

2016-05-26 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/361ebc28/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/DefaultSource.scala -- diff --git

spark git commit: [SPARK-15533][SQL] Deprecate Dataset.explode

2016-05-25 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 733cb44e3 -> 15a2dba66 [SPARK-15533][SQL] Deprecate Dataset.explode ## What changes were proposed in this pull request? This patch deprecates `Dataset.explode` and documents appropriate workarounds to use `flatMap()` or

spark git commit: [SPARK-15533][SQL] Deprecate Dataset.explode

2016-05-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 527499b62 -> 06ed1fa3e [SPARK-15533][SQL] Deprecate Dataset.explode ## What changes were proposed in this pull request? This patch deprecates `Dataset.explode` and documents appropriate workarounds to use `flatMap()` or

spark git commit: [SPARK-15525][SQL][BUILD] Upgrade ANTLR4 SBT plugin

2016-05-25 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 347acc4ea -> 733cb44e3 [SPARK-15525][SQL][BUILD] Upgrade ANTLR4 SBT plugin ## What changes were proposed in this pull request? The ANTLR4 SBT plugin has been moved from its own repo to one on bintray. The version was also changed from

spark git commit: [SPARK-15525][SQL][BUILD] Upgrade ANTLR4 SBT plugin

2016-05-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master ee682fe29 -> 527499b62 [SPARK-15525][SQL][BUILD] Upgrade ANTLR4 SBT plugin ## What changes were proposed in this pull request? The ANTLR4 SBT plugin has been moved from its own repo to one on bintray. The version was also changed from

spark git commit: [SPARK-15493][SQL] default QuoteEscapingEnabled flag to true when writing CSV

2016-05-25 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 20cc2eb1b -> 8629537cc [SPARK-15493][SQL] default QuoteEscapingEnabled flag to true when writing CSV ## What changes were proposed in this pull request? Default QuoteEscapingEnabled flag to true when writing CSV and add an

spark git commit: [SPARK-15493][SQL] default QuoteEscapingEnabled flag to true when writing CSV

2016-05-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4b8806741 -> c875d81a3 [SPARK-15493][SQL] default QuoteEscapingEnabled flag to true when writing CSV ## What changes were proposed in this pull request? Default QuoteEscapingEnabled flag to true when writing CSV and add an escapeQuotes

svn commit: r1745433 - in /spark: downloads.md site/downloads.html site/js/downloads.js

2016-05-25 Thread rxin
Author: rxin Date: Wed May 25 06:34:51 2016 New Revision: 1745433 URL: http://svn.apache.org/viewvc?rev=1745433&view=rev Log: spark-2.0.0-preview Modified: spark/downloads.md spark/site/downloads.html spark/site/js/downloads.js Modified: spark/downloads.md URL: http://svn.apache.org

svn commit: r13787 - in /dev/spark/spark-2.0.0-preview: ./ spark-2.0.0-preview-bin-hadoop2.3.tgz

2016-05-25 Thread rxin
Author: rxin Date: Wed May 25 06:24:15 2016 New Revision: 13787 Log: add missing spark-2.0.0-preview-bin-hadoop2.3.tgz Added: dev/spark/spark-2.0.0-preview/ dev/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz (with props) Added: dev/spark/spark-2.0.0-preview/spark

svn commit: r13782 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.4.tgz.asc spark-2.0.0-preview-bin-hadoop2.4.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:55:59 2016 New Revision: 13782 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.4.tgz.asc - copied unchanged from r13781, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.4.tgz.asc.txt Removed

svn commit: r13786 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview.tgz.asc spark-2.0.0-preview.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:56:45 2016 New Revision: 13786 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview.tgz.asc - copied unchanged from r13785, release/spark/spark-2.0.0-preview/spark-2.0.0-preview.tgz.asc.txt Removed: release/spark/spark-2.0.0

svn commit: r13783 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.6.tgz.asc spark-2.0.0-preview-bin-hadoop2.6.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:56:13 2016 New Revision: 13783 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.6.tgz.asc - copied unchanged from r13782, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.6.tgz.asc.txt Removed

svn commit: r13784 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.7.tgz.asc spark-2.0.0-preview-bin-hadoop2.7.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:56:23 2016 New Revision: 13784 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.7.tgz.asc - copied unchanged from r13783, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.7.tgz.asc.txt Removed

svn commit: r13785 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-without-hadoop.tgz.asc spark-2.0.0-preview-bin-without-hadoop.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:56:33 2016 New Revision: 13785 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-without-hadoop.tgz.asc - copied unchanged from r13784, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-without-hadoop.tgz.asc.txt

svn commit: r13780 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.3.tgz.asc spark-2.0.0-preview-bin-hadoop2.3.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:55:29 2016 New Revision: 13780 Log: (empty) Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz.asc - copied unchanged from r13779, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz.asc.txt Removed

svn commit: r13781 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.4-without-hive.tgz.asc spark-2.0.0-preview-bin-hadoop2.4-without-hive.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:55:43 2016 New Revision: 13781 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.4-without-hive.tgz.asc - copied unchanged from r13780, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.4-without

svn commit: r13779 - /dev/spark/spark-2.0.0-preview/ /release/spark/spark-2.0.0-preview/

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:04:13 2016 New Revision: 13779 Log: spark-2.0.0-preview Added: release/spark/spark-2.0.0-preview/ - copied from r13778, dev/spark/spark-2.0.0-preview/ Removed: dev/spark/spark-2.0.0-preview

svn commit: r13778 - /dev/spark/spark-2.0.0-preview/

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 04:54:42 2016 New Revision: 13778 Log: Add spark-2.0.0-preview Added: dev/spark/spark-2.0.0-preview/ dev/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz.asc.txt dev/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz.md5

spark git commit: [SPARK-15365][SQL] When table size statistics are not available from metastore, we should fallback to HDFS

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 14494da87 -> 4acababca [SPARK-15365][SQL] When table size statistics are not available from metastore, we should fallback to HDFS ## What changes were proposed in this pull request? Currently if a table is used in join operation we rely

spark git commit: [SPARK-15365][SQL] When table size statistics are not available from metastore, we should fallback to HDFS

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 5504f60e8 -> e13cfd6d2 [SPARK-15365][SQL] When table size statistics are not available from metastore, we should fallback to HDFS ## What changes were proposed in this pull request? Currently if a table is used in join operation we

[2/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/5504f60e/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala -- diff --git

[1/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1de3446d9 -> 5504f60e8 http://git-wip-us.apache.org/repos/asf/spark/blob/5504f60e/core/src/test/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackendSuite.scala

[2/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/14494da8/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala -- diff --git

[3/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
. Author: Reynold Xin <r...@databricks.com> Closes #13288 from rxin/SPARK-15518. (cherry picked from commit 14494da87bdf057d2d2f796b962a4d8bc4747d31) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip

[1/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master f08bf587b -> 14494da87 http://git-wip-us.apache.org/repos/asf/spark/blob/14494da8/core/src/test/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackendSuite.scala

[3/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
. Author: Reynold Xin <r...@databricks.com> Closes #13288 from rxin/SPARK-15518. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/14494da8 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/14494da8 Diff: http://git

spark git commit: [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master e631b819f -> f08bf587b [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException ## What changes were proposed in this pull request? Previously, SPARK-8893 added the constraints on positive number of partitions for

spark git commit: [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1fb7b3a0a -> 1de3446d9 [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException ## What changes were proposed in this pull request? Previously, SPARK-8893 added the constraints on positive number of partitions for

spark git commit: [SPARK-15485][SQL][DOCS] Spark SQL Configuration

2016-05-23 Thread rxin
ttp://spark.apache.org/docs/latest/configuration.html For Spark users, the information and default values of these public configuration parameters are very useful. This PR is to add this missing section to the configuration.html. rxin yhuai marmbrus How was this patch tested? Be

spark git commit: [SPARK-15485][SQL][DOCS] Spark SQL Configuration

2016-05-23 Thread rxin
ttp://spark.apache.org/docs/latest/configuration.html For Spark users, the information and default values of these public configuration parameters are very useful. This PR is to add this missing section to the configuration.html. rxin yhuai marmbrus How was this patch tested? Below is the genera

spark git commit: [SPARK-15425][SQL] Disallow cross joins by default

2016-05-23 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 ddac9f262 -> 4462da707 [SPARK-15425][SQL] Disallow cross joins by default ## What changes were proposed in this pull request? In order to prevent users from inadvertently writing queries with cartesian joins, this patch introduces a

spark git commit: [SPARK-15425][SQL] Disallow cross joins by default

2016-05-23 Thread rxin
Repository: spark Updated Branches: refs/heads/master fc44b694b -> dafcb05c2 [SPARK-15425][SQL] Disallow cross joins by default ## What changes were proposed in this pull request? In order to prevent users from inadvertently writing queries with cartesian joins, this patch introduces a new

spark git commit: [SPARK-15459][SQL] Make Range logical and physical explain consistent

2016-05-22 Thread rxin
nge (0, 100, step=2, splits=2) ``` ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13239 from rxin/SPARK-15459. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/845e447f Tree: http:

spark git commit: [SPARK-15459][SQL] Make Range logical and physical explain consistent

2016-05-22 Thread rxin
lan == *Range (0, 100, step=2, splits=2) ``` ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13239 from rxin/SPARK-15459. (cherry picked from commit 845e447fa03bf0a53ed79fa7e240af94dc152d2c) Signed-off-by: Reynold Xin <r...@databricks.com> Project:

spark git commit: Small documentation and style fix.

2016-05-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6cb8f836d -> 6d0bfb960 Small documentation and style fix. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6d0bfb96 Tree:

spark git commit: [SPARK-15396][SQL][DOC] It can't connect hive metastore database

2016-05-22 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 fd7e83119 -> da5d2300e [SPARK-15396][SQL][DOC] It can't connect hive metastore database What changes were proposed in this pull request? The `hive.metastore.warehouse.dir` property in hive-site.xml is deprecated since Spark

spark git commit: [SPARK-15396][SQL][DOC] It can't connect hive metastore database

2016-05-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master 223f63390 -> 6cb8f836d [SPARK-15396][SQL][DOC] It can't connect hive metastore database What changes were proposed in this pull request? The `hive.metastore.warehouse.dir` property in hive-site.xml is deprecated since Spark 2.0.0.

spark git commit: [SPARK-15415][SQL] Fix BroadcastHint when autoBroadcastJoinThreshold is 0 or -1

2016-05-22 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 9a08c9f1c -> fd7e83119 [SPARK-15415][SQL] Fix BroadcastHint when autoBroadcastJoinThreshold is 0 or -1 ## What changes were proposed in this pull request? This PR makes BroadcastHint more deterministic by using a special

spark git commit: [SPARK-15415][SQL] Fix BroadcastHint when autoBroadcastJoinThreshold is 0 or -1

2016-05-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master df9adb5ec -> 223f63390 [SPARK-15415][SQL] Fix BroadcastHint when autoBroadcastJoinThreshold is 0 or -1 ## What changes were proposed in this pull request? This PR makes BroadcastHint more deterministic by using a special isBroadcastable

spark git commit: [SPARK-15330][SQL] Implement Reset Command

2016-05-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 6871deb93 -> 9c20c7a33 [SPARK-15330][SQL] Implement Reset Command What changes were proposed in this pull request? Like `Set` Command in Hive, `Reset` is also supported by Hive. See the link:

spark git commit: [SPARK-15330][SQL] Implement Reset Command

2016-05-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master c18fa464f -> 8f0a3d5bc [SPARK-15330][SQL] Implement Reset Command What changes were proposed in this pull request? Like `Set` Command in Hive, `Reset` is also supported by Hive. See the link:

spark git commit: [SPARK-15452][SQL] Mark aggregator API as experimental

2016-05-21 Thread rxin
ked as experimental in 2.0. ## How was this patch tested? N/A - annotation only change. Author: Reynold Xin <r...@databricks.com> Closes #13226 from rxin/SPARK-15452. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/201a51f3 T

spark git commit: [SPARK-15452][SQL] Mark aggregator API as experimental

2016-05-21 Thread rxin
ked as experimental in 2.0. ## How was this patch tested? N/A - annotation only change. Author: Reynold Xin <r...@databricks.com> Closes #13226 from rxin/SPARK-15452. (cherry picked from commit 201a51f36682726d78d9d2fe2c388093bb860ee0) Signed-off-by: Reynold Xin <r...@databricks.com>

[3/3] spark git commit: [SPARK-15078] [SQL] Add all TPCDS 1.4 benchmark queries for SparkSQL

2016-05-21 Thread rxin
[SPARK-15078] [SQL] Add all TPCDS 1.4 benchmark queries for SparkSQL Now that SparkSQL supports all TPC-DS queries, this patch adds all 99 benchmark queries inside SparkSQL. Benchmark only Author: Sameer Agarwal Closes #13188 from sameeragarwal/tpcds-all. (cherry

[2/3] spark git commit: [SPARK-15078] [SQL] Add all TPCDS 1.4 benchmark queries for SparkSQL

2016-05-21 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/d7bf318a/sql/core/src/test/resources/tpcds/q49.sql -- diff --git a/sql/core/src/test/resources/tpcds/q49.sql b/sql/core/src/test/resources/tpcds/q49.sql new file mode 100755 index

[1/3] spark git commit: [SPARK-15078] [SQL] Add all TPCDS 1.4 benchmark queries for SparkSQL

2016-05-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 9a8df0c9a -> d7bf318a0 http://git-wip-us.apache.org/repos/asf/spark/blob/d7bf318a/sql/core/src/test/resources/tpcds/q85.sql -- diff --git
