svn commit: r28987 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_27_20_01-8198ea5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-27 Thread pwendell
Author: pwendell Date: Tue Aug 28 03:16:01 2018 New Revision: 28987 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_27_20_01-8198ea5 docs [This commit notification would consist of 1478 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]

spark git commit: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSourceStrategy

2018-08-27 Thread wenchen
Repository: spark Updated Branches: refs/heads/master dac099d08 -> 8198ea501 [SPARK-24721][SQL] Exclude Python UDFs filters in FileSourceStrategy ## What changes were proposed in this pull request? The PR excludes Python UDFs filters in FileSourceStrategy so that they don't ExtractPythonUDF

svn commit: r28985 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_27_18_01-8db935f-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-27 Thread pwendell
Author: pwendell Date: Tue Aug 28 01:15:10 2018 New Revision: 28985 Log: Apache Spark 2.3.3-SNAPSHOT-2018_08_27_18_01-8db935f docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]

spark git commit: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL][BACKPORT-2.1] Shuffle+Repartition on a DataFrame could lead to incorrect answers

2018-08-27 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.1 09f70f5fd -> 4d2d3d47e [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL][BACKPORT-2.1] Shuffle+Repartition on a DataFrame could lead to incorrect answers ## What changes were proposed in this pull request? Backport of

svn commit: r28984 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_27_16_01-dac099d-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-27 Thread pwendell
Author: pwendell Date: Mon Aug 27 23:15:51 2018 New Revision: 28984 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_27_16_01-dac099d docs [This commit notification would consist of 1478 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]

spark git commit: [SPARK-25164][SQL] Avoid rebuilding column and path list for each column in parquet reader

2018-08-27 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 d7c3aae20 -> af41dedc6 [SPARK-25164][SQL] Avoid rebuilding column and path list for each column in parquet reader ## What changes were proposed in this pull request? VectorizedParquetRecordReader::initializeInternal rebuilds the

spark git commit: [SPARK-25164][SQL] Avoid rebuilding column and path list for each column in parquet reader

2018-08-27 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.3 f5983823e -> 8db935f97 [SPARK-25164][SQL] Avoid rebuilding column and path list for each column in parquet reader ## What changes were proposed in this pull request? VectorizedParquetRecordReader::initializeInternal rebuilds the

spark git commit: [SPARK-24090][K8S] Update running-on-kubernetes.md

2018-08-27 Thread srowen
Repository: spark Updated Branches: refs/heads/master c3f285c93 -> dac099d08 [SPARK-24090][K8S] Update running-on-kubernetes.md ## What changes were proposed in this pull request? Updated documentation for Spark on Kubernetes for the upcoming 2.4.0. Please review

spark git commit: [SPARK-24149][YARN][FOLLOW-UP] Only get the delegation tokens of the filesystem explicitly specified by the user

2018-08-27 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 810d59ce4 -> c3f285c93 [SPARK-24149][YARN][FOLLOW-UP] Only get the delegation tokens of the filesystem explicitly specified by the user ## What changes were proposed in this pull request? Our HDFS cluster is configured with 5 nameservices:

svn commit: r28983 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_27_12_02-810d59c-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-27 Thread pwendell
Author: pwendell Date: Mon Aug 27 19:16:20 2018 New Revision: 28983 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_27_12_02-810d59c docs [This commit notification would consist of 1478 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]

spark git commit: [SPARK-24882][FOLLOWUP] Fix flaky synchronization in Kafka tests.

2018-08-27 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 381a967a7 -> 810d59ce4 [SPARK-24882][FOLLOWUP] Fix flaky synchronization in Kafka tests. ## What changes were proposed in this pull request? Fix flaky synchronization in Kafka tests - we need to use the scan config that was persisted

spark git commit: [SPARK-25249][CORE][TEST] add a unit test for OpenHashMap

2018-08-27 Thread srowen
Repository: spark Updated Branches: refs/heads/master 6193a202a -> 381a967a7 [SPARK-25249][CORE][TEST] add a unit test for OpenHashMap ## What changes were proposed in this pull request? This PR adds a unit test for OpenHashMap; this can help developers to distinguish between the 0/0.0/0L

svn commit: r28980 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_27_07_23-6193a20-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-27 Thread pwendell
Author: pwendell Date: Mon Aug 27 14:39:07 2018 New Revision: 28980 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_27_07_23-6193a20 docs [This commit notification would consist of 1478 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]

spark git commit: [SPARK-24978][SQL] Add spark.sql.fast.hash.aggregate.row.max.capacity to configure the capacity of fast aggregation.

2018-08-27 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 5c27b0d4f -> 6193a202a [SPARK-24978][SQL] Add spark.sql.fast.hash.aggregate.row.max.capacity to configure the capacity of fast aggregation. ## What changes were proposed in this pull request? This PR adds a configuration parameter to

svn commit: r28969 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_27_00_01-5c27b0d-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-27 Thread pwendell
Author: pwendell Date: Mon Aug 27 07:16:20 2018 New Revision: 28969 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_27_00_01-5c27b0d docs [This commit notification would consist of 1478 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]

spark git commit: [SPARK-19355][SQL][FOLLOWUP] Remove the child.outputOrdering check in global limit

2018-08-27 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 5cdb8a23d -> 5c27b0d4f [SPARK-19355][SQL][FOLLOWUP] Remove the child.outputOrdering check in global limit ## What changes were proposed in this pull request? This is based on the discussion