spark git commit: [SPARK-11500][SQL] Not deterministic order of columns when using merging schemas.

2015-11-11 Thread lian
Repository: spark Updated Branches: refs/heads/master 99f5f9886 -> 1bc41125e [SPARK-11500][SQL] Not deterministic order of columns when using merging schemas. https://issues.apache.org/jira/browse/SPARK-11500 As filed in SPARK-11500, if merging schemas is enabled, the order of files to

spark git commit: [SPARK-11500][SQL] Not deterministic order of columns when using merging schemas.

2015-11-11 Thread lian
Repository: spark Updated Branches: refs/heads/branch-1.6 daa74be6f -> 7de8abd6f [SPARK-11500][SQL] Not deterministic order of columns when using merging schemas. https://issues.apache.org/jira/browse/SPARK-11500 As filed in SPARK-11500, if merging schemas is enabled, the order of files to

spark git commit: [SQL][MINOR] rename present to finish in Aggregator

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 297048ff7 -> a83ce04f5 [SQL][MINOR] rename present to finish in Aggregator Author: Wenchen Fan Closes #9617 from cloud-fan/tmp. (cherry picked from commit c964fc101585171aee76996981fe2c9fdafc614e)

spark git commit: [SPARK-11646] WholeTextFileRDD should return Text rather than String

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 27524a3a9 -> 95daff645 [SPARK-11646] WholeTextFileRDD should return Text rather than String If it returns Text, we can reuse this in Spark SQL to provide a WholeTextFile data source and directly convert the Text into UTF8String without

spark git commit: [SPARK-11626][ML] ml.feature.Word2Vec.transform() function very slow

2015-11-11 Thread meng
Repository: spark Updated Branches: refs/heads/master 1510c527b -> 27524a3a9 [SPARK-11626][ML] ml.feature.Word2Vec.transform() function very slow org.apache.spark.ml.feature.Word2Vec.transform() is very slow. We should not read the broadcast for every sentence. Author: Yuming Wang
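The point about not reading the broadcast for every sentence can be illustrated with a minimal plain-Python sketch. This is not the Spark patch: `make_sentence_transformer` and `read_vectors` are hypothetical names, and `read_vectors` only stands in for an expensive lookup such as dereferencing a broadcast value. The sketch assumes the transform averages the vectors of known words over the sentence length.

```python
def make_sentence_transformer(read_vectors):
    """Return a per-sentence transform that reads the word vectors only once.

    `read_vectors` is a hypothetical stand-in for an expensive read
    (for example, dereferencing a broadcast variable).
    """
    vectors = read_vectors()  # hoisted: executed once, not once per sentence
    dim = len(next(iter(vectors.values())))

    def transform(sentence):
        total = [0.0] * dim
        for word in sentence:
            vec = vectors.get(word)
            if vec is not None:  # unknown words contribute nothing
                for i, x in enumerate(vec):
                    total[i] += x
        n = len(sentence)
        # Average over the sentence length; an empty sentence maps to zeros.
        return [x / n for x in total] if n else total

    return transform


# Count how often the expensive read happens.
calls = {"n": 0}

def read_vectors():
    calls["n"] += 1
    return {"a": [1.0, 0.0], "b": [0.0, 1.0]}

transform = make_sentence_transformer(read_vectors)
out = [transform(s) for s in [["a", "b"], ["a", "a"], []]]
```

Here `read_vectors` runs once no matter how many sentences are transformed, which is the effect the commit message asks for.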

spark git commit: [SPARK-11656][SQL] support typed aggregate in project list

2015-11-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c964fc101 -> 9c57bc0ef [SPARK-11656][SQL] support typed aggregate in project list insert `aEncoder` like we do in `agg` Author: Wenchen Fan Closes #9630 from cloud-fan/select. Project:

spark git commit: [SPARK-11626][ML] ml.feature.Word2Vec.transform() function very slow

2015-11-11 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 0d637571d -> 9dda83efe [SPARK-11626][ML] ml.feature.Word2Vec.transform() function very slow org.apache.spark.ml.feature.Word2Vec.transform() is very slow. We should not read the broadcast for every sentence. Author: Yuming Wang

spark git commit: [SPARK-10371][SQL][FOLLOW-UP] fix code style

2015-11-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 7de8abd6f -> 0d637571d [SPARK-10371][SQL][FOLLOW-UP] fix code style Author: Wenchen Fan Closes #9627 from cloud-fan/follow. (cherry picked from commit 1510c527b4f5ee0953ae42313ef9e16d2f5864c4) Signed-off-by:

spark git commit: [SPARK-10371][SQL][FOLLOW-UP] fix code style

2015-11-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 1bc41125e -> 1510c527b [SPARK-10371][SQL][FOLLOW-UP] fix code style Author: Wenchen Fan Closes #9627 from cloud-fan/follow. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11598] [SQL] enable tests for ShuffledHashOuterJoin

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 5940fc71d -> 78674d3cc [SPARK-11598] [SQL] enable tests for ShuffledHashOuterJoin Author: Davies Liu Closes #9573 from davies/join_condition. (cherry picked from commit

spark git commit: [MINOR] Fix typo in AggregationQuerySuite.scala

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 78674d3cc -> 0c50ac516 [MINOR] Fix typo in AggregationQuerySuite.scala Author: Forest Fang Closes #9357 from saurfang/patch-1. (cherry picked from commit 12c7635dc025239d3b69b9adef2f4eebb28edf48)

spark git commit: [SPARK-11396] [SQL] add native implementation of datetime function to_unix_timestamp

2015-11-11 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 990f8ce39 -> e9031478f [SPARK-11396] [SQL] add native implementation of datetime function to_unix_timestamp `to_unix_timestamp` is the deterministic version of `unix_timestamp`, as it accepts at least one parameter. Since the

spark git commit: [SPARK-11396] [SQL] add native implementation of datetime function to_unix_timestamp

2015-11-11 Thread davies
Repository: spark Updated Branches: refs/heads/master e49e72339 -> 39b1e36fb [SPARK-11396] [SQL] add native implementation of datetime function to_unix_timestamp `to_unix_timestamp` is the deterministic version of `unix_timestamp`, as it accepts at least one parameter. Since the behavior

spark git commit: [SPARK-11674][ML] add private val after @transient in Word2VecModel

2015-11-11 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 e9031478f -> 57f281c1a [SPARK-11674][ML] add private val after @transient in Word2VecModel This causes compile failure with Scala 2.11. See https://issues.scala-lang.org/browse/SI-8813. (Jenkins won't test Scala 2.11. I tested

spark git commit: [SPARK-11675][SQL] Remove shuffle hash joins.

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 282350464 -> 990f8ce39 [SPARK-11675][SQL] Remove shuffle hash joins. Author: Reynold Xin Closes #9645 from rxin/SPARK-11675. (cherry picked from commit e49e723392b8a64d30bd90944a748eb6f5ef3a8a) Signed-off-by:

spark git commit: [SPARK-11675][SQL] Remove shuffle hash joins.

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master b8ff6888e -> e49e72339 [SPARK-11675][SQL] Remove shuffle hash joins. Author: Reynold Xin Closes #9645 from rxin/SPARK-11675. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11674][ML] add private val after @transient in Word2VecModel

2015-11-11 Thread meng
Repository: spark Updated Branches: refs/heads/master 39b1e36fb -> e2957bc08 [SPARK-11674][ML] add private val after @transient in Word2VecModel This causes compile failure with Scala 2.11. See https://issues.scala-lang.org/browse/SI-8813. (Jenkins won't test Scala 2.11. I tested compile

spark git commit: [SPARK-10827] replace volatile with Atomic* in AppClient.scala.

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2d76e44b1 -> e1bcf6af9 [SPARK-10827] replace volatile with Atomic* in AppClient.scala. This is a followup for #9317 to replace volatile fields with AtomicBoolean and AtomicReference. Author: Reynold Xin Closes

spark git commit: [SPARK-10827] replace volatile with Atomic* in AppClient.scala.

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 e046c6855 -> fdb777444 [SPARK-10827] replace volatile with Atomic* in AppClient.scala. This is a followup for #9317 to replace volatile fields with AtomicBoolean and AtomicReference. Author: Reynold Xin Closes

spark git commit: [SPARK-11672][ML] disable spark.ml read/write tests

2015-11-11 Thread meng
Repository: spark Updated Branches: refs/heads/master e1bcf6af9 -> 1a21be15f [SPARK-11672][ML] disable spark.ml read/write tests Saw several failures on Jenkins, e.g.,

spark git commit: [SPARK-11672][ML] disable spark.ml read/write tests

2015-11-11 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 fdb777444 -> 4151afbf5 [SPARK-11672][ML] disable spark.ml read/write tests Saw several failures on Jenkins, e.g.,

spark git commit: [SPARK-8992][SQL] Add pivot to dataframe api

2015-11-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 1a21be15f -> b8ff6888e [SPARK-8992][SQL] Add pivot to dataframe api This adds a pivot method to the dataframe api. Following the lead of cube and rollup this adds a Pivot operator that is translated into an Aggregate by the analyzer.
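The idea of translating a pivot into an aggregate can be sketched in plain Python. This is only an illustration under stated assumptions, not the analyzer rule from the patch: `pivot_sum` and the sample rows are hypothetical, the aggregate is fixed to a sum, and missing combinations come out as 0 here rather than as nulls.

```python
from collections import defaultdict


def pivot_sum(rows, group_key, pivot_key, value_key, pivot_values):
    """Pivot expressed as one grouped aggregation.

    For each group, keep one conditional sum per pivot value: a row adds
    its value only to the column matching its pivot value.
    """
    acc = defaultdict(lambda: {v: 0 for v in pivot_values})
    for row in rows:
        bucket = acc[row[group_key]]
        if row[pivot_key] in bucket:  # ignore pivot values not requested
            bucket[row[pivot_key]] += row[value_key]
    return {group: dict(cols) for group, cols in acc.items()}


# Hypothetical sample data.
rows = [
    {"year": 2012, "course": "dotNET", "earnings": 10000},
    {"year": 2012, "course": "Java", "earnings": 20000},
    {"year": 2012, "course": "dotNET", "earnings": 5000},
    {"year": 2013, "course": "Java", "earnings": 30000},
]
result = pivot_sum(rows, "year", "course", "earnings", ["dotNET", "Java"])
```

Each requested pivot value becomes one output column, so the whole pivot is a single pass of grouped aggregation over the input rows.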

spark git commit: [SPARK-8992][SQL] Add pivot to dataframe api

2015-11-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 4151afbf5 -> 5940fc71d [SPARK-8992][SQL] Add pivot to dataframe api This adds a pivot method to the dataframe api. Following the lead of cube and rollup this adds a Pivot operator that is translated into an Aggregate by the analyzer.

spark git commit: [SPARK-11564][SQL][FOLLOW-UP] clean up java tuple encoder

2015-11-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 f9aeb961e -> 9bf988555 [SPARK-11564][SQL][FOLLOW-UP] clean up java tuple encoder We need to support custom classes like java beans and combine them into tuple, and it's very hard to do it with the TypeTag-based approach. We should

spark git commit: [SPARK-11564][SQL][FOLLOW-UP] clean up java tuple encoder

2015-11-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 9c57bc0ef -> ec2b80721 [SPARK-11564][SQL][FOLLOW-UP] clean up java tuple encoder We need to support custom classes like java beans and combine them into tuple, and it's very hard to do it with the TypeTag-based approach. We should keep

spark git commit: [SQL][MINOR] remove newLongEncoder in functions

2015-11-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ec2b80721 -> e71ba5658 [SQL][MINOR] remove newLongEncoder in functions It may shadow the one from implicits in some cases. Author: Wenchen Fan Closes #9629 from cloud-fan/minor. Project:

spark git commit: [SPARK-11639][STREAMING][FLAKY-TEST] Implement BlockingWriteAheadLog for testing the BatchedWriteAheadLog

2015-11-11 Thread tdas
Repository: spark Updated Branches: refs/heads/master 529a1d338 -> 27029bc8f [SPARK-11639][STREAMING][FLAKY-TEST] Implement BlockingWriteAheadLog for testing the BatchedWriteAheadLog Several elements could be drained if the main thread is not fast enough. zsxwing warned me about a similar

spark git commit: [SQL][MINOR] remove newLongEncoder in functions

2015-11-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 9bf988555 -> 47cc1fe06 [SQL][MINOR] remove newLongEncoder in functions It may shadow the one from implicits in some cases. Author: Wenchen Fan Closes #9629 from cloud-fan/minor. (cherry picked from commit

spark git commit: [SPARK-6152] Use shaded ASM5 to support closure cleaning of Java 8 compiled classes

2015-11-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master e71ba5658 -> 529a1d338 [SPARK-6152] Use shaded ASM5 to support closure cleaning of Java 8 compiled classes This patch modifies Spark's closure cleaner (and a few other places) to use ASM 5, which is necessary in order to support cleaning

spark git commit: [SPARK-6152] Use shaded ASM5 to support closure cleaning of Java 8 compiled classes

2015-11-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 47cc1fe06 -> 1fbfc1b48 [SPARK-6152] Use shaded ASM5 to support closure cleaning of Java 8 compiled classes This patch modifies Spark's closure cleaner (and a few other places) to use ASM 5, which is necessary in order to support

spark git commit: [MINOR] License header formatting fix

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 0c50ac516 -> 282350464 [MINOR] License header formatting fix The header wasn't indented properly. Author: Marc Prud'hommeaux Closes #9312 from mprudhom/patch-1. (cherry picked from commit

spark git commit: [SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD

2015-11-11 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.6 d6d31815f -> f7c6c95f9 [SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD tdas koeninger This updates the Spark Streaming + Kafka Integration Guide doc with a working method to

spark git commit: [SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD

2015-11-11 Thread tdas
Repository: spark Updated Branches: refs/heads/master a9a6b80c7 -> dd77e278b [SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD tdas koeninger This updates the Spark Streaming + Kafka Integration Guide doc with a working method to access

spark git commit: [SPARK-11645][SQL] Remove OpenHashSet for the old aggregate.

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 c227180f1 -> d6d31815f [SPARK-11645][SQL] Remove OpenHashSet for the old aggregate. Author: Reynold Xin Closes #9621 from rxin/SPARK-11645. (cherry picked from commit a9a6b80c718008aac7c411dfe46355efe58dee2e)

spark git commit: [SPARK-11645][SQL] Remove OpenHashSet for the old aggregate.

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master df97df2b3 -> a9a6b80c7 [SPARK-11645][SQL] Remove OpenHashSet for the old aggregate. Author: Reynold Xin Closes #9621 from rxin/SPARK-11645. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11644][SQL] Remove the option to turn off unsafe and codegen.

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 aa7194de3 -> c227180f1 [SPARK-11644][SQL] Remove the option to turn off unsafe and codegen. Author: Reynold Xin Closes #9618 from rxin/SPARK-11644. (cherry picked from commit

spark git commit: [SPARK-11644][SQL] Remove the option to turn off unsafe and codegen.

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 27029bc8f -> df97df2b3 [SPARK-11644][SQL] Remove the option to turn off unsafe and codegen. Author: Reynold Xin Closes #9618 from rxin/SPARK-11644. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11647] Attempt to reduce time/flakiness of Thriftserver CLI and SparkSubmit tests

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master dd77e278b -> 2d76e44b1 [SPARK-11647] Attempt to reduce time/flakiness of Thriftserver CLI and SparkSubmit tests This patch aims to reduce the test time and flakiness of HiveSparkSubmitSuite, SparkSubmitSuite, and CliSuite. Key changes:

spark git commit: [SPARK-11647] Attempt to reduce time/flakiness of Thriftserver CLI and SparkSubmit tests

2015-11-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 f7c6c95f9 -> e046c6855 [SPARK-11647] Attempt to reduce time/flakiness of Thriftserver CLI and SparkSubmit tests This patch aims to reduce the test time and flakiness of HiveSparkSubmitSuite, SparkSubmitSuite, and CliSuite. Key