spark git commit: [SPARK-22371][CORE] Return None instead of throwing an exception when an accumulator is garbage collected.

2018-05-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.3 28973e152 -> 895c95e5b [SPARK-22371][CORE] Return None instead of throwing an exception when an accumulator is garbage collected. ## What changes were proposed in this pull request? There's a period of time when an accumulator has
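The behavior this fix describes, a lookup that yields None for a garbage-collected accumulator instead of raising, can be sketched with Python weak references (the registry class and method names below are illustrative, not Spark's actual internals):

```python
import gc
import weakref

class AccumulatorRegistry:
    """Illustrative driver-side registry: it holds accumulators only weakly,
    so looking up a collected accumulator returns None rather than raising."""

    def __init__(self):
        self._originals = {}  # accumulator id -> weakref to the accumulator

    def register(self, acc_id, accumulator):
        self._originals[acc_id] = weakref.ref(accumulator)

    def get(self, acc_id):
        ref = self._originals.get(acc_id)
        if ref is None:
            return None   # never registered
        return ref()      # None if the accumulator has been collected

class Accumulator:
    def __init__(self, value=0):
        self.value = value

registry = AccumulatorRegistry()
acc = Accumulator()
registry.register(1, acc)
assert registry.get(1) is acc    # still alive: returned as-is

del acc
gc.collect()
assert registry.get(1) is None   # collected: None, not an exception
```

The design point is the same as in the fix: between an accumulator becoming unreachable and its entry being cleaned up, a lookup should degrade gracefully to None.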

svn commit: r26983 - in /dev/spark/2.4.0-SNAPSHOT-2018_05_17_20_01-d4a0895-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-05-17 Thread pwendell
Author: pwendell Date: Fri May 18 03:15:37 2018 New Revision: 26983 Log: Apache Spark 2.4.0-SNAPSHOT-2018_05_17_20_01-d4a0895 docs [This commit notification would consist of 1462 parts, which exceeds the limit of 50, so it was shortened to this summary.]

svn commit: r26986 - in /dev/spark/2.3.2-SNAPSHOT-2018_05_17_22_01-895c95e-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-05-17 Thread pwendell
Author: pwendell Date: Fri May 18 05:16:43 2018 New Revision: 26986 Log: Apache Spark 2.3.2-SNAPSHOT-2018_05_17_22_01-895c95e docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-23922][SQL] Add arrays_overlap function

2018-05-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 6ec05826d -> 69350aa2f [SPARK-23922][SQL] Add arrays_overlap function ## What changes were proposed in this pull request? The PR adds the function `arrays_overlap`. This function returns `true` if the input arrays contain a non-null
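The semantics described (true when the inputs share a non-null element) can be sketched in plain Python; the three-valued null handling below is my reading of the Spark 2.4 function's documented behavior, not its implementation:

```python
def arrays_overlap(a, b):
    """Sketch of arrays_overlap semantics: True if the arrays share at least
    one non-null element; None (SQL NULL) if they share none but both are
    non-empty and either contains a null; otherwise False. Illustrative only."""
    a_non_null = {x for x in a if x is not None}
    b_non_null = {x for x in b if x is not None}
    if a_non_null & b_non_null:
        return True
    if a and b and (None in a or None in b):
        return None  # a null could hide the common element: result is unknown
    return False

assert arrays_overlap([1, 2, 3], [3, 4]) is True
assert arrays_overlap([1, None], [4, 5]) is None   # unknown: null present
assert arrays_overlap([1, 2], [4, 5]) is False
```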

svn commit: r26966 - in /dev/spark/2.4.0-SNAPSHOT-2018_05_17_04_02-6c35865-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-05-17 Thread pwendell
Author: pwendell Date: Thu May 17 11:21:02 2018 New Revision: 26966 Log: Apache Spark 2.4.0-SNAPSHOT-2018_05_17_04_02-6c35865 docs [This commit notification would consist of 1462 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-24002][SQL][BACKPORT-2.3] Task not serializable caused by org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes

2018-05-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.3 d4a892af5 -> 1708de27e [SPARK-24002][SQL][BACKPORT-2.3] Task not serializable caused by org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes This PR is to backport https://github.com/apache/spark/pull/21086 to Apache

spark git commit: [SPARK-24107][CORE][FOLLOWUP] ChunkedByteBuffer.writeFully method has not reset the limit value

2018-05-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 6c35865d9 -> 6ec05826d [SPARK-24107][CORE][FOLLOWUP] ChunkedByteBuffer.writeFully method has not reset the limit value ## What changes were proposed in this pull request? According to the discussion in
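The pattern being fixed, writing a buffer in bounded chunks by temporarily lowering its limit and then restoring it, can be sketched with a minimal ByteBuffer stand-in (the class and chunk size are illustrative, not Spark's actual code):

```python
import io

class ByteBuffer:
    """Minimal stand-in for java.nio.ByteBuffer: just data, position, limit."""
    def __init__(self, data):
        self.data = data
        self.position = 0
        self.limit = len(data)

def write_fully(dst, buf, chunk_size):
    """Write the whole buffer in bounded chunks by temporarily lowering its
    limit, then restore the original limit so the buffer stays reusable."""
    original_limit = buf.limit
    try:
        while buf.position < original_limit:
            buf.limit = min(buf.position + chunk_size, original_limit)  # bound this write
            dst.write(buf.data[buf.position:buf.limit])
            buf.position = buf.limit
    finally:
        buf.limit = original_limit  # the reset the follow-up adds

dst = io.BytesIO()
buf = ByteBuffer(b"abcdefghij")
write_fully(dst, buf, chunk_size=4)
assert dst.getvalue() == b"abcdefghij"
assert buf.limit == 10  # limit restored, not left at the last chunk boundary
```

Without the `finally` reset, the buffer's limit would be left at the last chunk boundary, which is the kind of leftover state the follow-up addresses.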

svn commit: r26969 - in /dev/spark/2.4.0-SNAPSHOT-2018_05_17_08_01-8a837bf-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-05-17 Thread pwendell
Author: pwendell Date: Thu May 17 15:16:46 2018 New Revision: 26969 Log: Apache Spark 2.4.0-SNAPSHOT-2018_05_17_08_01-8a837bf docs [This commit notification would consist of 1462 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-24193] create TakeOrderedAndProjectExec only when the limit number is below spark.sql.execution.topKSortFallbackThreshold.

2018-05-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 69350aa2f -> 8a837bf4f [SPARK-24193] create TakeOrderedAndProjectExec only when the limit number is below spark.sql.execution.topKSortFallbackThreshold. ## What changes were proposed in this pull request? Physical plan of `select colA
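The planning rule named in the title, a bounded top-K operator only when the limit is under `spark.sql.execution.topKSortFallbackThreshold`, otherwise a full sort plus limit, can be sketched as follows (the threshold value and function are illustrative):

```python
import heapq

TOP_K_SORT_FALLBACK_THRESHOLD = 10_000  # illustrative stand-in for the config value

def sorted_limit(rows, limit, key, threshold=TOP_K_SORT_FALLBACK_THRESHOLD):
    """Sketch of the rule: a small limit permits a bounded-memory top-K
    (the role TakeOrderedAndProjectExec plays); a large limit falls back
    to a full sort followed by a limit."""
    if limit < threshold:
        return heapq.nsmallest(limit, rows, key=key)  # keeps only `limit` rows in memory
    return sorted(rows, key=key)[:limit]              # full sort, then truncate

rows = [5, 1, 9, 3, 7]
assert sorted_limit(rows, 2, key=lambda r: r) == [1, 3]
```

The motivation is the same either way the result is computed: a top-K operator holding a huge `limit` worth of rows can exhaust memory, so past the threshold a full sort is the safer plan.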

svn commit: r26970 - in /dev/spark/2.3.2-SNAPSHOT-2018_05_17_10_01-1708de2-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-05-17 Thread pwendell
Author: pwendell Date: Thu May 17 17:15:27 2018 New Revision: 26970 Log: Apache Spark 2.3.2-SNAPSHOT-2018_05_17_10_01-1708de2 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-24115] Have logging pass through instrumentation class.

2018-05-17 Thread meng
Repository: spark Updated Branches: refs/heads/master 8a837bf4f -> a7a9b1837 [SPARK-24115] Have logging pass through instrumentation class. ## What changes were proposed in this pull request? Fixes to tuning instrumentation. ## How was this patch tested? Existing tests. Please review

svn commit: r26977 - in /dev/spark/2.4.0-SNAPSHOT-2018_05_17_12_01-a7a9b18-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-05-17 Thread pwendell
Author: pwendell Date: Thu May 17 19:16:06 2018 New Revision: 26977 Log: Apache Spark 2.4.0-SNAPSHOT-2018_05_17_12_01-a7a9b18 docs [This commit notification would consist of 1462 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-21945][YARN][PYTHON] Make --py-files work with PySpark shell in Yarn client mode

2018-05-17 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-2.3 1708de27e -> 28973e152 [SPARK-21945][YARN][PYTHON] Make --py-files work with PySpark shell in Yarn client mode When we run the _PySpark shell in Yarn client mode_, the files specified with `--py-files` are not recognized on the _driver side_. Here are
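The gist of the fix is that in client mode the driver process itself must also be able to import the comma-separated `--py-files` entries. A minimal sketch of that idea (the function name and details are illustrative, not Spark's code):

```python
import os
import sys

def add_py_files_to_driver_path(py_files):
    """Illustrative: make each comma-separated --py-files entry importable
    on the driver by prepending it to sys.path (zip/egg paths work because
    Python treats them as importable path entries)."""
    for path in py_files.split(","):
        path = os.path.abspath(path.strip())
        if path and path not in sys.path:
            sys.path.insert(0, path)

add_py_files_to_driver_path("dep1.zip, dep2.egg")
assert os.path.abspath("dep1.zip") in sys.path
```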

spark git commit: [SPARK-24114] Add instrumentation to FPGrowth.

2018-05-17 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master a7a9b1837 -> 439c69511 [SPARK-24114] Add instrumentation to FPGrowth. ## What changes were proposed in this pull request? Have FPGrowth keep track of model training using the Instrumentation class. ## How was this patch tested? manually

svn commit: r26981 - in /dev/spark/2.4.0-SNAPSHOT-2018_05_17_16_04-439c695-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-05-17 Thread pwendell
Author: pwendell Date: Thu May 17 23:17:50 2018 New Revision: 26981 Log: Apache Spark 2.4.0-SNAPSHOT-2018_05_17_16_04-439c695 docs [This commit notification would consist of 1462 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-22884][ML] ML tests for StructuredStreaming: spark.ml.clustering

2018-05-17 Thread meng
Repository: spark Updated Branches: refs/heads/master 439c69511 -> d4a0895c6 [SPARK-22884][ML] ML tests for StructuredStreaming: spark.ml.clustering ## What changes were proposed in this pull request? Converting clustering tests to also check code with structured streaming, using the ML

svn commit: r26982 - in /dev/spark/2.3.2-SNAPSHOT-2018_05_17_18_01-28973e1-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-05-17 Thread pwendell
Author: pwendell Date: Fri May 18 01:15:56 2018 New Revision: 26982 Log: Apache Spark 2.3.2-SNAPSHOT-2018_05_17_18_01-28973e1 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50, so it was shortened to this summary.]