spark git commit: [SPARK-23815][CORE] Spark writer dynamic partition overwrite mode may fail to write output on multi level partition
Repository: spark
Updated Branches: refs/heads/branch-2.3 2995b79d6 -> dfdf1bb9b

[SPARK-23815][CORE] Spark writer dynamic partition overwrite mode may fail to write output on multi level partition

## What changes were proposed in this pull request?

Spark introduced a new writer mode to overwrite only the related partitions in SPARK-20236. While using this feature in our production cluster, we found a bug when writing multi-level partitions on HDFS. A simple test case to reproduce this issue:

    val df = Seq(("1","2","3")).toDF("col1", "col2", "col3")
    df.write.partitionBy("col1","col2").mode("overwrite").save("/my/hdfs/location")

If the HDFS location "/my/hdfs/location" does not exist, there will be no output.

This seems to be caused by the job-commit change that SPARK-20236 made in HadoopMapReduceCommitProtocol. During job commit, the output has been written into the staging dir /my/hdfs/location/.spark-staging.xxx/col1=1/col2=2, and the code then calls fs.rename to rename /my/hdfs/location/.spark-staging.xxx/col1=1/col2=2 to /my/hdfs/location/col1=1/col2=2. However, in our case the operation fails on HDFS because /my/hdfs/location/col1=1 does not exist: HDFS rename cannot create more than one level of missing parent directories. This does not happen in the new unit test added with SPARK-20236, which uses the local file system.

We are proposing a fix. When cleaning the current partition dir /my/hdfs/location/col1=1/col2=2 before the rename op, if the delete op fails (because /my/hdfs/location/col1=1/col2=2 may not exist), we call mkdirs to create the parent dir /my/hdfs/location/col1=1 (if it does not exist) so the following rename op can succeed.

Reference: in the official Hadoop FileSystem documentation (https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html), the rename operation has the precondition "dest must be root, or have a parent that exists".

## How was this patch tested?

We have tested this patch on our production cluster and it fixed the problem.

Author: Fangshi Li

Closes #20931 from fangshil/master.
(cherry picked from commit 4b07036799b01894826b47c73142fe282c607a57)
Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dfdf1bb9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dfdf1bb9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dfdf1bb9

Branch: refs/heads/branch-2.3
Commit: dfdf1bb9be19bd31e398f97310391b391fabfcfd
Parents: 2995b79
Author: Fangshi Li
Authored: Fri Apr 13 13:46:34 2018 +0800
Committer: Wenchen Fan
Committed: Fri Apr 13 13:47:31 2018 +0800

--
 .../internal/io/HadoopMapReduceCommitProtocol.scala | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/dfdf1bb9/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala

diff --git a/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala b/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
index 6d20ef1..3e60c50 100644
--- a/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
+++ b/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
@@ -186,7 +186,17 @@ class HadoopMapReduceCommitProtocol(
       logDebug(s"Clean up default partition directories for overwriting: $partitionPaths")
       for (part <- partitionPaths) {
         val finalPartPath = new Path(path, part)
-        fs.delete(finalPartPath, true)
+        if (!fs.delete(finalPartPath, true) && !fs.exists(finalPartPath.getParent)) {
+          // According to the official hadoop FileSystem API spec, delete op should assume
+          // the destination is no longer present regardless of return value, thus we do not
+          // need to double check if finalPartPath exists before rename.
+          // Also in our case, based on the spec, delete returns false only when finalPartPath
+          // does not exist. When this happens, we need to take action if the parent of
+          // finalPartPath also does not exist (e.g. the scenario described in SPARK-23815),
+          // because the FileSystem API spec on the rename op says the rename dest
+          // (finalPartPath) must have a parent that exists, otherwise we may get an
+          // unexpected result on the rename.
+          fs.mkdirs(finalPartPath.getParent)
+        }
         fs.rename(new Path(stagingDir, part), finalPartPath)
       }
     }
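To make the failure mode concrete, here is a minimal standalone sketch of the delete/mkdirs/rename sequence the patch relies on. The object name, `main` wrapper, and paths are hypothetical illustrations rather than Spark code; only `FileSystem.delete/exists/mkdirs/rename` are the real Hadoop API calls:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Hypothetical sketch of the commit-time pattern established by this patch.
    object RenameWithParentCheck {
      def main(args: Array[String]): Unit = {
        val fs = FileSystem.get(new Configuration())
        // Paths mirror the SPARK-23815 scenario; ".spark-staging.xxx" stands in
        // for the real staging directory name.
        val stagingPart = new Path("/my/hdfs/location/.spark-staging.xxx/col1=1/col2=2")
        val finalPartPath = new Path("/my/hdfs/location/col1=1/col2=2")

        // Per the FileSystem spec, delete returns false here only if finalPartPath did
        // not exist. In that case its parent may be missing too, which violates rename's
        // precondition that "dest must be root, or have a parent that exists" -- so
        // create the parent before renaming.
        if (!fs.delete(finalPartPath, true) && !fs.exists(finalPartPath.getParent)) {
          fs.mkdirs(finalPartPath.getParent)
        }
        fs.rename(stagingPart, finalPartPath)
      }
    }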
spark git commit: [SPARK-23815][CORE] Spark writer dynamic partition overwrite mode may fail to write output on multi level partition
Repository: spark
Updated Branches: refs/heads/master 1018be44d -> 4b0703679

[SPARK-23815][CORE] Spark writer dynamic partition overwrite mode may fail to write output on multi level partition

## What changes were proposed in this pull request?

Spark introduced a new writer mode to overwrite only the related partitions in SPARK-20236. While using this feature in our production cluster, we found a bug when writing multi-level partitions on HDFS. A simple test case to reproduce this issue:

    val df = Seq(("1","2","3")).toDF("col1", "col2", "col3")
    df.write.partitionBy("col1","col2").mode("overwrite").save("/my/hdfs/location")

If the HDFS location "/my/hdfs/location" does not exist, there will be no output.

This seems to be caused by the job-commit change that SPARK-20236 made in HadoopMapReduceCommitProtocol. During job commit, the output has been written into the staging dir /my/hdfs/location/.spark-staging.xxx/col1=1/col2=2, and the code then calls fs.rename to rename /my/hdfs/location/.spark-staging.xxx/col1=1/col2=2 to /my/hdfs/location/col1=1/col2=2. However, in our case the operation fails on HDFS because /my/hdfs/location/col1=1 does not exist: HDFS rename cannot create more than one level of missing parent directories. This does not happen in the new unit test added with SPARK-20236, which uses the local file system.

We are proposing a fix. When cleaning the current partition dir /my/hdfs/location/col1=1/col2=2 before the rename op, if the delete op fails (because /my/hdfs/location/col1=1/col2=2 may not exist), we call mkdirs to create the parent dir /my/hdfs/location/col1=1 (if it does not exist) so the following rename op can succeed.

Reference: in the official Hadoop FileSystem documentation (https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html), the rename operation has the precondition "dest must be root, or have a parent that exists".

## How was this patch tested?

We have tested this patch on our production cluster and it fixed the problem.

Author: Fangshi Li

Closes #20931 from fangshil/master.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4b070367
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4b070367
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4b070367

Branch: refs/heads/master
Commit: 4b07036799b01894826b47c73142fe282c607a57
Parents: 1018be4
Author: Fangshi Li
Authored: Fri Apr 13 13:46:34 2018 +0800
Committer: Wenchen Fan
Committed: Fri Apr 13 13:46:34 2018 +0800

--
 .../internal/io/HadoopMapReduceCommitProtocol.scala | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/4b070367/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala

diff --git a/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala b/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
index 6d20ef1..3e60c50 100644
--- a/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
+++ b/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
@@ -186,7 +186,17 @@ class HadoopMapReduceCommitProtocol(
       logDebug(s"Clean up default partition directories for overwriting: $partitionPaths")
       for (part <- partitionPaths) {
         val finalPartPath = new Path(path, part)
-        fs.delete(finalPartPath, true)
+        if (!fs.delete(finalPartPath, true) && !fs.exists(finalPartPath.getParent)) {
+          // According to the official hadoop FileSystem API spec, delete op should assume
+          // the destination is no longer present regardless of return value, thus we do not
+          // need to double check if finalPartPath exists before rename.
+          // Also in our case, based on the spec, delete returns false only when finalPartPath
+          // does not exist. When this happens, we need to take action if the parent of
+          // finalPartPath also does not exist (e.g. the scenario described in SPARK-23815),
+          // because the FileSystem API spec on the rename op says the rename dest
+          // (finalPartPath) must have a parent that exists, otherwise we may get an
+          // unexpected result on the rename.
+          fs.mkdirs(finalPartPath.getParent)
+        }
         fs.rename(new Path(stagingDir, part), finalPartPath)
       }
     }
spark git commit: [SPARK-23971] Should not leak Spark sessions across test suites
Repository: spark
Updated Branches: refs/heads/master ab7b961a4 -> 1018be44d

[SPARK-23971] Should not leak Spark sessions across test suites

## What changes were proposed in this pull request?

Many suites currently leak Spark sessions (sometimes with stopped SparkContexts) via the thread-local active Spark session and default Spark session. We should attempt to clean these up and detect when this happens, to improve the reproducibility of tests.

## How was this patch tested?

Existing tests.

Author: Eric Liang

Closes #21058 from ericl/clear-session.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1018be44
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1018be44
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1018be44

Branch: refs/heads/master
Commit: 1018be44d6c52cf18e14d84160850063f0e60a1d
Parents: ab7b961
Author: Eric Liang
Authored: Thu Apr 12 22:30:59 2018 -0700
Committer: gatorsmile
Committed: Thu Apr 12 22:30:59 2018 -0700

--
 .../org/apache/spark/SharedSparkSession.java    |  9 ++--
 .../org/apache/spark/sql/SparkSession.scala     | 23 ++--
 .../apache/spark/sql/SessionStateSuite.scala    |  2 ++
 .../spark/sql/test/SharedSparkSession.scala     | 22 ++-
 4 files changed, 47 insertions(+), 9 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/1018be44/mllib/src/test/java/org/apache/spark/SharedSparkSession.java

diff --git a/mllib/src/test/java/org/apache/spark/SharedSparkSession.java b/mllib/src/test/java/org/apache/spark/SharedSparkSession.java
index 4377987..35a2509 100644
--- a/mllib/src/test/java/org/apache/spark/SharedSparkSession.java
+++ b/mllib/src/test/java/org/apache/spark/SharedSparkSession.java
@@ -42,7 +42,12 @@ public abstract class SharedSparkSession implements Serializable {

   @After
   public void tearDown() {
-    spark.stop();
-    spark = null;
+    try {
+      spark.stop();
+      spark = null;
+    } finally {
+      SparkSession.clearDefaultSession();
+      SparkSession.clearActiveSession();
+    }
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/1018be44/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
index b107492..c502e58 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
@@ -44,7 +44,7 @@ import org.apache.spark.sql.sources.BaseRelation
 import org.apache.spark.sql.streaming._
 import org.apache.spark.sql.types.{DataType, StructType}
 import org.apache.spark.sql.util.ExecutionListenerManager
-import org.apache.spark.util.Utils
+import org.apache.spark.util.{CallSite, Utils}

 /**
@@ -81,6 +81,9 @@ class SparkSession private(
     @transient private[sql] val extensions: SparkSessionExtensions)
   extends Serializable with Closeable with Logging { self =>

+  // The call site where this SparkSession was constructed.
+  private val creationSite: CallSite = Utils.getCallSite()
+
   private[sql] def this(sc: SparkContext) {
     this(sc, None, None, new SparkSessionExtensions)
   }

@@ -763,7 +766,7 @@ class SparkSession private(

 @InterfaceStability.Stable
-object SparkSession {
+object SparkSession extends Logging {

   /**
    * Builder for [[SparkSession]].
@@ -1090,4 +1093,20 @@ object SparkSession {
       }
     }
   }
+
+  private[spark] def cleanupAnyExistingSession(): Unit = {
+    val session = getActiveSession.orElse(getDefaultSession)
+    if (session.isDefined) {
+      logWarning(
+        s"""An existing Spark session exists as the active or default session.
+           |This probably means another suite leaked it. Attempting to stop it before continuing.
+           |This existing Spark session was created at:
+           |
+           |${session.get.creationSite.longForm}
+           |
+         """.stripMargin)
+      session.get.stop()
+      SparkSession.clearActiveSession()
+      SparkSession.clearDefaultSession()
+    }
+  }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/1018be44/sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala
index 4efae4c..7d13660 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala
+++
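For suites outside Spark itself, a minimal sketch of the same teardown hygiene (the suite name and ScalaTest wiring are hypothetical; `clearActiveSession`/`clearDefaultSession` are the real public API used by this patch):

    import org.apache.spark.sql.SparkSession
    import org.scalatest.{BeforeAndAfterAll, FunSuite}

    // Hypothetical suite showing the stop-then-clear pattern this patch applies.
    class NoLeakExampleSuite extends FunSuite with BeforeAndAfterAll {

      @transient private var spark: SparkSession = _

      override def beforeAll(): Unit = {
        super.beforeAll()
        spark = SparkSession.builder().master("local[2]").appName("no-leak").getOrCreate()
      }

      override def afterAll(): Unit = {
        try {
          if (spark != null) spark.stop()
          spark = null
        } finally {
          // Clear the thread-local active session and the default session so the
          // next suite starts from a clean slate even if stop() throws.
          SparkSession.clearActiveSession()
          SparkSession.clearDefaultSession()
          super.afterAll()
        }
      }

      test("session is usable within this suite") {
        assert(spark.range(10).count() === 10)
      }
    }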
svn commit: r26318 - in /dev/spark/2.3.1-SNAPSHOT-2018_04_12_22_02-2995b79-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Fri Apr 13 05:17:16 2018
New Revision: 26318

Log:
Apache Spark 2.3.1-SNAPSHOT-2018_04_12_22_02-2995b79 docs

[This commit notification would consist of 1443 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-23748][SS] Fix SS continuous process doesn't support SubqueryAlias issue
Repository: spark
Updated Branches: refs/heads/branch-2.3 908c681c6 -> 2995b79d6

[SPARK-23748][SS] Fix SS continuous process doesn't support SubqueryAlias issue

## What changes were proposed in this pull request?

Currently SS continuous processing doesn't support queries on a temp table or `df.as("xxx")`; SS will throw an exception that the LogicalPlan is not supported, with details described [here](https://issues.apache.org/jira/browse/SPARK-23748). So here we propose to add this support.

## How was this patch tested?

New UT.

Author: jerryshao

Closes #21017 from jerryshao/SPARK-23748.

(cherry picked from commit 14291b061b9b40eadbf4ed442f9a5021b8e09597)
Signed-off-by: Tathagata Das

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2995b79d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2995b79d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2995b79d

Branch: refs/heads/branch-2.3
Commit: 2995b79d6a78bf632aa4c1c99bebfc213fb31c54
Parents: 908c681
Author: jerryshao
Authored: Thu Apr 12 20:00:25 2018 -0700
Committer: Tathagata Das
Committed: Thu Apr 12 20:00:40 2018 -0700

--
 .../analysis/UnsupportedOperationChecker.scala  |  2 +-
 .../streaming/continuous/ContinuousSuite.scala  | 19 +++
 2 files changed, 20 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/2995b79d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala
index b55043c..ff9d6d7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala
@@ -345,7 +345,7 @@ object UnsupportedOperationChecker {
     plan.foreachUp { implicit subPlan =>
       subPlan match {
         case (_: Project | _: Filter | _: MapElements | _: MapPartitions |
-          _: DeserializeToObject | _: SerializeFromObject) =>
+          _: DeserializeToObject | _: SerializeFromObject | _: SubqueryAlias) =>
         case node if node.nodeName == "StreamingRelationV2" =>
         case node =>
           throwError(s"Continuous processing does not support ${node.nodeName} operations.")

http://git-wip-us.apache.org/repos/asf/spark/blob/2995b79d/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala
index 4b4ed82..95406b3 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala
@@ -174,6 +174,25 @@ class ContinuousSuite extends ContinuousSuiteBase {
         "Continuous processing does not support current time operations."))
   }

+  test("subquery alias") {
+    val df = spark.readStream
+      .format("rate")
+      .option("numPartitions", "5")
+      .option("rowsPerSecond", "5")
+      .load()
+      .createOrReplaceTempView("rate")
+    val test = spark.sql("select value from rate where value > 5")
+
+    testStream(test, useV2Sink = true)(
+      StartStream(longContinuousTrigger),
+      AwaitEpoch(0),
+      Execute(waitForRateSourceTriggers(_, 2)),
+      IncrementEpoch(),
+      Execute(waitForRateSourceTriggers(_, 4)),
+      IncrementEpoch(),
+      CheckAnswerRowsContains(scala.Range(6, 20).map(Row(_))))
+  }
+
   test("repeatedly restart") {
     val df = spark.readStream
       .format("rate")
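For context, a minimal sketch of the kind of query that hit the unsupported-operation error before this fix (checkpoint path and sink choice are hypothetical; registering a temp view, like calling `df.as("xxx")`, introduces a `SubqueryAlias` node into the analyzed plan):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.streaming.Trigger

    object ContinuousSubqueryAliasRepro {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[4]")
          .appName("SPARK-23748-repro")
          .getOrCreate()

        val rate = spark.readStream
          .format("rate")
          .option("rowsPerSecond", "5")
          .load()

        // Registering a temp view (or calling .as("xxx")) adds a SubqueryAlias to the
        // plan; before this patch the continuous-mode checker rejected that node.
        rate.createOrReplaceTempView("rate")
        val filtered = spark.sql("select value from rate where value > 5")

        val query = filtered.writeStream
          .format("console")
          .option("checkpointLocation", "/tmp/spark-23748-checkpoint") // hypothetical path
          .trigger(Trigger.Continuous("1 second"))
          .start()
        query.awaitTermination()
      }
    }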
spark git commit: [SPARK-23748][SS] Fix SS continuous process doesn't support SubqueryAlias issue
Repository: spark
Updated Branches: refs/heads/master 682002b6d -> 14291b061

[SPARK-23748][SS] Fix SS continuous process doesn't support SubqueryAlias issue

## What changes were proposed in this pull request?

Currently SS continuous processing doesn't support queries on a temp table or `df.as("xxx")`; SS will throw an exception that the LogicalPlan is not supported, with details described [here](https://issues.apache.org/jira/browse/SPARK-23748). So here we propose to add this support.

## How was this patch tested?

New UT.

Author: jerryshao

Closes #21017 from jerryshao/SPARK-23748.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/14291b06
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/14291b06
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/14291b06

Branch: refs/heads/master
Commit: 14291b061b9b40eadbf4ed442f9a5021b8e09597
Parents: 682002b
Author: jerryshao
Authored: Thu Apr 12 20:00:25 2018 -0700
Committer: Tathagata Das
Committed: Thu Apr 12 20:00:25 2018 -0700

--
 .../analysis/UnsupportedOperationChecker.scala  |  2 +-
 .../streaming/continuous/ContinuousSuite.scala  | 19 +++
 2 files changed, 20 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/14291b06/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala
index b55043c..ff9d6d7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala
@@ -345,7 +345,7 @@ object UnsupportedOperationChecker {
     plan.foreachUp { implicit subPlan =>
       subPlan match {
         case (_: Project | _: Filter | _: MapElements | _: MapPartitions |
-          _: DeserializeToObject | _: SerializeFromObject) =>
+          _: DeserializeToObject | _: SerializeFromObject | _: SubqueryAlias) =>
         case node if node.nodeName == "StreamingRelationV2" =>
         case node =>
           throwError(s"Continuous processing does not support ${node.nodeName} operations.")

http://git-wip-us.apache.org/repos/asf/spark/blob/14291b06/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala
index f5884b9..ef74efe 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala
@@ -171,6 +171,25 @@ class ContinuousSuite extends ContinuousSuiteBase {
         "Continuous processing does not support current time operations."))
   }

+  test("subquery alias") {
+    val df = spark.readStream
+      .format("rate")
+      .option("numPartitions", "5")
+      .option("rowsPerSecond", "5")
+      .load()
+      .createOrReplaceTempView("rate")
+    val test = spark.sql("select value from rate where value > 5")
+
+    testStream(test, useV2Sink = true)(
+      StartStream(longContinuousTrigger),
+      AwaitEpoch(0),
+      Execute(waitForRateSourceTriggers(_, 2)),
+      IncrementEpoch(),
+      Execute(waitForRateSourceTriggers(_, 4)),
+      IncrementEpoch(),
+      CheckAnswerRowsContains(scala.Range(6, 20).map(Row(_))))
+  }
+
   test("repeatedly restart") {
     val df = spark.readStream
       .format("rate")
spark git commit: [SPARK-23867][SCHEDULER] use droppedCount in logWarning
Repository: spark
Updated Branches: refs/heads/branch-2.3 571269519 -> 908c681c6

[SPARK-23867][SCHEDULER] use droppedCount in logWarning

## What changes were proposed in this pull request?

Get the count of dropped events for output in the log message.

## How was this patch tested?

The fix is pretty trivial, but `./dev/run-tests` was run successfully.

vanzin cloud-fan The contribution is my original work and I license the work to the project under the project's open source license.

Author: Patrick Pisciuneri

Closes #20977 from phpisciuneri/fix-log-warning.

(cherry picked from commit 682002b6da844ed11324ee5ff4d00fc0294c0b31)
Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/908c681c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/908c681c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/908c681c

Branch: refs/heads/branch-2.3
Commit: 908c681c6786ef0d772a43508285cb8891fc524a
Parents: 5712695
Author: Patrick Pisciuneri
Authored: Fri Apr 13 09:45:27 2018 +0800
Committer: Wenchen Fan
Committed: Fri Apr 13 09:45:45 2018 +0800

--
 .../main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/908c681c/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala

diff --git a/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala b/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
index 7e14938..c1fedd6 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
@@ -166,7 +166,7 @@ private class AsyncEventQueue(val name: String, conf: SparkConf, metrics: LiveLi
       val prevLastReportTimestamp = lastReportTimestamp
       lastReportTimestamp = System.currentTimeMillis()
       val previous = new java.util.Date(prevLastReportTimestamp)
-      logWarning(s"Dropped $droppedEvents events from $name since $previous.")
+      logWarning(s"Dropped $droppedCount events from $name since $previous.")
     }
   }
 }
spark git commit: [SPARK-23867][SCHEDULER] use droppedCount in logWarning
Repository: spark
Updated Branches: refs/heads/master 0f93b91a7 -> 682002b6d

[SPARK-23867][SCHEDULER] use droppedCount in logWarning

## What changes were proposed in this pull request?

Get the count of dropped events for output in the log message.

## How was this patch tested?

The fix is pretty trivial, but `./dev/run-tests` was run successfully.

vanzin cloud-fan The contribution is my original work and I license the work to the project under the project's open source license.

Author: Patrick Pisciuneri

Closes #20977 from phpisciuneri/fix-log-warning.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/682002b6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/682002b6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/682002b6

Branch: refs/heads/master
Commit: 682002b6da844ed11324ee5ff4d00fc0294c0b31
Parents: 0f93b91
Author: Patrick Pisciuneri
Authored: Fri Apr 13 09:45:27 2018 +0800
Committer: Wenchen Fan
Committed: Fri Apr 13 09:45:27 2018 +0800

--
 .../main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/682002b6/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala

diff --git a/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala b/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
index 7e14938..c1fedd6 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
@@ -166,7 +166,7 @@ private class AsyncEventQueue(val name: String, conf: SparkConf, metrics: LiveLi
       val prevLastReportTimestamp = lastReportTimestamp
       lastReportTimestamp = System.currentTimeMillis()
       val previous = new java.util.Date(prevLastReportTimestamp)
-      logWarning(s"Dropped $droppedEvents events from $name since $previous.")
+      logWarning(s"Dropped $droppedCount events from $name since $previous.")
     }
   }
 }
spark git commit: [SPARK-23751][FOLLOW-UP] fix build for scala-2.12
Repository: spark
Updated Branches: refs/heads/master 0b19122d4 -> 0f93b91a7

[SPARK-23751][FOLLOW-UP] fix build for scala-2.12

## What changes were proposed in this pull request?

Fix the build for scala-2.12.

## How was this patch tested?

Manual.

Author: WeichenXu

Closes #21051 from WeichenXu123/fix_build212.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0f93b91a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0f93b91a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0f93b91a

Branch: refs/heads/master
Commit: 0f93b91a71444a1a938acfd8ea2191c54fb0187c
Parents: 0b19122
Author: WeichenXu
Authored: Thu Apr 12 15:47:42 2018 -0600
Committer: Joseph K. Bradley
Committed: Thu Apr 12 15:47:42 2018 -0600

--
 .../scala/org/apache/spark/ml/stat/KolmogorovSmirnovTest.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/0f93b91a/mllib/src/main/scala/org/apache/spark/ml/stat/KolmogorovSmirnovTest.scala

diff --git a/mllib/src/main/scala/org/apache/spark/ml/stat/KolmogorovSmirnovTest.scala b/mllib/src/main/scala/org/apache/spark/ml/stat/KolmogorovSmirnovTest.scala
index af8ff64..adf8145 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/stat/KolmogorovSmirnovTest.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/stat/KolmogorovSmirnovTest.scala
@@ -85,7 +85,7 @@ object KolmogorovSmirnovTest {
       dataset: Dataset[_],
       sampleCol: String,
       cdf: Function[java.lang.Double, java.lang.Double]): DataFrame = {
-    test(dataset, sampleCol, (x: Double) => cdf.call(x))
+    test(dataset, sampleCol, (x: Double) => cdf.call(x).toDouble)
   }

   /**
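A self-contained sketch of the boxing issue behind this one-character fix (the `JFunction` trait below is a hypothetical stand-in for the Java functional interface in the real API): `cdf.call(x)` yields a boxed `java.lang.Double`, and the explicit `.toDouble` pins the closure's result type to the primitive `Double`, so it typechecks the same way under Scala 2.12's changed lambda/SAM handling as it did under 2.11:

    // Hypothetical stand-in for the Java functional interface used by the API.
    trait JFunction[T, R] { def call(t: T): R }

    object BoxedDoubleSketch {
      def main(args: Array[String]): Unit = {
        val cdf = new JFunction[java.lang.Double, java.lang.Double] {
          override def call(x: java.lang.Double): java.lang.Double = x * 0.5
        }

        // cdf.call(x) is a boxed java.lang.Double; .toDouble converts it to the
        // primitive scala.Double, so `f` is unambiguously typed Double => Double.
        val f: Double => Double = (x: Double) => cdf.call(x).toDouble
        println(f(2.0)) // 1.0
      }
    }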
[1/2] spark-website git commit: Update text/wording to more "modern" Spark and more consistent.
Repository: spark-website
Updated Branches: refs/heads/asf-site 91b561749 -> 658467248

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/news/strata-exercises-now-available-online.html

diff --git a/site/news/strata-exercises-now-available-online.html b/site/news/strata-exercises-now-available-online.html
index 916f242..4f250a3 100644
--- a/site/news/strata-exercises-now-available-online.html
+++ b/site/news/strata-exercises-now-available-online.html
@@ -66,7 +66,7 @@

-  Lightning-fast cluster computing
+  Lightning-fast unified analytics engine

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/news/submit-talks-to-spark-summit-2014.html

diff --git a/site/news/submit-talks-to-spark-summit-2014.html b/site/news/submit-talks-to-spark-summit-2014.html
index 4f43c23..18f2642 100644
--- a/site/news/submit-talks-to-spark-summit-2014.html
+++ b/site/news/submit-talks-to-spark-summit-2014.html
@@ -66,7 +66,7 @@

-  Lightning-fast cluster computing
+  Lightning-fast unified analytics engine

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/news/submit-talks-to-spark-summit-2016.html

diff --git a/site/news/submit-talks-to-spark-summit-2016.html b/site/news/submit-talks-to-spark-summit-2016.html
index 3163bab..3766932 100644
--- a/site/news/submit-talks-to-spark-summit-2016.html
+++ b/site/news/submit-talks-to-spark-summit-2016.html
@@ -66,7 +66,7 @@

-  Lightning-fast cluster computing
+  Lightning-fast unified analytics engine

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/news/submit-talks-to-spark-summit-east-2016.html

diff --git a/site/news/submit-talks-to-spark-summit-east-2016.html b/site/news/submit-talks-to-spark-summit-east-2016.html
index 1984db7..b4a51a7 100644
--- a/site/news/submit-talks-to-spark-summit-east-2016.html
+++ b/site/news/submit-talks-to-spark-summit-east-2016.html
@@ -66,7 +66,7 @@

-  Lightning-fast cluster computing
+  Lightning-fast unified analytics engine

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/news/submit-talks-to-spark-summit-eu-2016.html

diff --git a/site/news/submit-talks-to-spark-summit-eu-2016.html b/site/news/submit-talks-to-spark-summit-eu-2016.html
index 8e33a17..940bc6f 100644
--- a/site/news/submit-talks-to-spark-summit-eu-2016.html
+++ b/site/news/submit-talks-to-spark-summit-eu-2016.html
@@ -66,7 +66,7 @@

-  Lightning-fast cluster computing
+  Lightning-fast unified analytics engine

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/news/two-weeks-to-spark-summit-2014.html

diff --git a/site/news/two-weeks-to-spark-summit-2014.html b/site/news/two-weeks-to-spark-summit-2014.html
index 3863298..d4e993a 100644
--- a/site/news/two-weeks-to-spark-summit-2014.html
+++ b/site/news/two-weeks-to-spark-summit-2014.html
@@ -66,7 +66,7 @@

-  Lightning-fast cluster computing
+  Lightning-fast unified analytics engine

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/news/video-from-first-spark-development-meetup.html

diff --git a/site/news/video-from-first-spark-development-meetup.html b/site/news/video-from-first-spark-development-meetup.html
index 2be7f50..04151a8 100644
--- a/site/news/video-from-first-spark-development-meetup.html
+++ b/site/news/video-from-first-spark-development-meetup.html
@@ -66,7 +66,7 @@

-  Lightning-fast cluster computing
+  Lightning-fast unified analytics engine

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/powered-by.html

diff --git a/site/powered-by.html b/site/powered-by.html
index 3449782..b303df0 100644
--- a/site/powered-by.html
+++ b/site/powered-by.html
@@ -66,7 +66,7 @@

-  Lightning-fast cluster computing
+  Lightning-fast unified analytics engine

http://git-wip-us.apache.org/repos/asf/spark-website/blob/65846724/site/release-process.html

diff --git a/site/release-process.html b/site/release-process.html
index
[2/2] spark-website git commit: Update text/wording to more "modern" Spark and more consistent.
Update text/wording to more "modern" Spark and more consistent.

1. Use DataFrame examples.
2. Reduce explicit comparison with MapReduce, since the topic does not really come up.
3. More focus on analytics rather than "cluster compute".
4. Update committer affiliation.
5. Make it more clear Spark runs in diverse environments (especially on the MLlib page).

There is a lot more that needs to be done that I don't have time for today, e.g. referring to Structured Streaming.

Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/65846724
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/65846724
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/65846724

Branch: refs/heads/asf-site
Commit: 658467248b278b109bc3d2594b0ef08ff0c727cb
Parents: 91b5617
Author: Reynold Xin
Authored: Thu Apr 12 12:56:05 2018 -0700
Committer: Reynold Xin
Committed: Thu Apr 12 12:56:05 2018 -0700

--
 _layouts/global.html                            |  2 +-
 committers.md                                   | 22 +-
 index.md                                        | 34 +--
 mllib/index.md                                  | 18 +-
 site/committers.html                            | 24 +-
 site/community.html                             |  2 +-
 site/contributing.html                          |  2 +-
 site/developer-tools.html                       |  2 +-
 site/documentation.html                         |  2 +-
 site/downloads.html                             |  2 +-
 site/examples.html                              |  2 +-
 site/faq.html                                   |  2 +-
 site/history.html                               |  2 +-
 site/improvement-proposals.html                 |  2 +-
 site/index.html                                 | 36 +--
 site/mailing-lists.html                         |  4 +-
 site/mllib/index.html                           | 18 +-
 site/news/amp-camp-2013-registration-ope.html   |  2 +-
 site/news/announcing-the-first-spark-summit.html |  2 +-
 site/news/fourth-spark-screencast-published.html |  2 +-
 site/news/index.html                            |  2 +-
 site/news/nsdi-paper.html                       |  2 +-
 site/news/one-month-to-spark-summit-2015.html   |  2 +-
 site/news/proposals-open-for-spark-summit-east.html |  2 +-
 site/news/registration-open-for-spark-summit-east.html |  2 +-
 site/news/run-spark-and-shark-on-amazon-emr.html |  2 +-
 site/news/spark-0-6-1-and-0-5-2-released.html   |  2 +-
 site/news/spark-0-6-2-released.html             |  2 +-
 site/news/spark-0-7-0-released.html             |  2 +-
 site/news/spark-0-7-2-released.html             |  2 +-
 site/news/spark-0-7-3-released.html             |  2 +-
 site/news/spark-0-8-0-released.html             |  2 +-
 site/news/spark-0-8-1-released.html             |  2 +-
 site/news/spark-0-9-0-released.html             |  2 +-
 site/news/spark-0-9-1-released.html             |  2 +-
 site/news/spark-0-9-2-released.html             |  2 +-
 site/news/spark-1-0-0-released.html             |  2 +-
 site/news/spark-1-0-1-released.html             |  2 +-
 site/news/spark-1-0-2-released.html             |  2 +-
 site/news/spark-1-1-0-released.html             |  2 +-
 site/news/spark-1-1-1-released.html             |  2 +-
 site/news/spark-1-2-0-released.html             |  2 +-
 site/news/spark-1-2-1-released.html             |  2 +-
 site/news/spark-1-2-2-released.html             |  2 +-
 site/news/spark-1-3-0-released.html             |  2 +-
 site/news/spark-1-4-0-released.html             |  2 +-
 site/news/spark-1-4-1-released.html             |  2 +-
 site/news/spark-1-5-0-released.html             |  2 +-
 site/news/spark-1-5-1-released.html             |  2 +-
 site/news/spark-1-5-2-released.html             |  2 +-
 site/news/spark-1-6-0-released.html             |  2 +-
 site/news/spark-1-6-1-released.html             |  2 +-
 site/news/spark-1-6-2-released.html             |  2 +-
 site/news/spark-1-6-3-released.html             |  2 +-
 site/news/spark-2-0-0-released.html             |  2 +-
 site/news/spark-2-0-1-released.html             |  2 +-
 site/news/spark-2-0-2-released.html             |  2 +-
 site/news/spark-2-1-0-released.html             |  2 +-
 site/news/spark-2-1-1-released.html             |  2 +-
 site/news/spark-2-1-2-released.html             |  2 +-
 site/news/spark-2-2-0-released.html             |  2 +-
 site/news/spark-2-2-1-released.html             |  2 +-
 site/news/spark-2-3-0-released.html             |  2 +-
 site/news/spark-2.0.0-preview.html              |  2 +-
 .../spark-accepted-into-apache-incubator.html   |  2 +-
 site/news/spark-and-shark-in-the-news.html      |  2 +-
 site/news/spark-becomes-tlp.html                |  2 +-
svn commit: r26307 - in /dev/spark/2.4.0-SNAPSHOT-2018_04_12_08_02-0b19122-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Thu Apr 12 15:24:31 2018
New Revision: 26307

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_04_12_08_02-0b19122 docs

[This commit notification would consist of 1458 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-23762][SQL] UTF8StringBuffer uses MemoryBlock
Repository: spark
Updated Branches: refs/heads/master 6a2289ecf -> 0b19122d4

[SPARK-23762][SQL] UTF8StringBuffer uses MemoryBlock

## What changes were proposed in this pull request?

This PR tries to use `MemoryBlock` in `UTF8StringBuilder`. In general, there are two advantages to using `MemoryBlock`:

1. Cleaner API calls than using a Java array or `PlatformMemory`.
2. Better runtime performance of memory access than going through `Object`.

## How was this patch tested?

Added `UTF8StringBufferSuite`.

Author: Kazuaki Ishizaki

Closes #20871 from kiszk/SPARK-23762.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0b19122d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0b19122d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0b19122d

Branch: refs/heads/master
Commit: 0b19122d434e39eb117ccc3174a0688c9c874d48
Parents: 6a2289e
Author: Kazuaki Ishizaki
Authored: Thu Apr 12 22:21:30 2018 +0800
Committer: Wenchen Fan
Committed: Thu Apr 12 22:21:30 2018 +0800

--
 .../expressions/codegen/UTF8StringBuilder.java  | 35 +++-
 .../codegen/UTF8StringBuilderSuite.scala        | 42 
 2 files changed, 56 insertions(+), 21 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/0b19122d/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UTF8StringBuilder.java

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UTF8StringBuilder.java b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UTF8StringBuilder.java
index f0f66ba..f8000d7 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UTF8StringBuilder.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UTF8StringBuilder.java
@@ -19,6 +19,8 @@ package org.apache.spark.sql.catalyst.expressions.codegen;

 import org.apache.spark.unsafe.Platform;
 import org.apache.spark.unsafe.array.ByteArrayMethods;
+import org.apache.spark.unsafe.memory.ByteArrayMemoryBlock;
+import org.apache.spark.unsafe.memory.MemoryBlock;
 import org.apache.spark.unsafe.types.UTF8String;

 /**
@@ -29,43 +31,34 @@ public class UTF8StringBuilder {

   private static final int ARRAY_MAX = ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH;

-  private byte[] buffer;
-  private int cursor = Platform.BYTE_ARRAY_OFFSET;
+  private ByteArrayMemoryBlock buffer;
+  private int length = 0;

   public UTF8StringBuilder() {
     // Since initial buffer size is 16 in `StringBuilder`, we set the same size here
-    this.buffer = new byte[16];
+    this.buffer = new ByteArrayMemoryBlock(16);
   }

   // Grows the buffer by at least `neededSize`
   private void grow(int neededSize) {
-    if (neededSize > ARRAY_MAX - totalSize()) {
+    if (neededSize > ARRAY_MAX - length) {
       throw new UnsupportedOperationException(
         "Cannot grow internal buffer by size " + neededSize + " because the size after growing " +
           "exceeds size limitation " + ARRAY_MAX);
     }
-    final int length = totalSize() + neededSize;
-    if (buffer.length < length) {
-      int newLength = length < ARRAY_MAX / 2 ? length * 2 : ARRAY_MAX;
-      final byte[] tmp = new byte[newLength];
-      Platform.copyMemory(
-        buffer,
-        Platform.BYTE_ARRAY_OFFSET,
-        tmp,
-        Platform.BYTE_ARRAY_OFFSET,
-        totalSize());
+    final int requestedSize = length + neededSize;
+    if (buffer.size() < requestedSize) {
+      int newLength = requestedSize < ARRAY_MAX / 2 ? requestedSize * 2 : ARRAY_MAX;
+      final ByteArrayMemoryBlock tmp = new ByteArrayMemoryBlock(newLength);
+      MemoryBlock.copyMemory(buffer, tmp, length);
       buffer = tmp;
     }
   }

-  private int totalSize() {
-    return cursor - Platform.BYTE_ARRAY_OFFSET;
-  }
-
   public void append(UTF8String value) {
     grow(value.numBytes());
-    value.writeToMemory(buffer, cursor);
-    cursor += value.numBytes();
+    value.writeToMemory(buffer.getByteArray(), length + Platform.BYTE_ARRAY_OFFSET);
+    length += value.numBytes();
   }

   public void append(String value) {
@@ -73,6 +66,6 @@ public class UTF8StringBuilder {
   }

   public UTF8String build() {
-    return UTF8String.fromBytes(buffer, 0, totalSize());
+    return UTF8String.fromBytes(buffer.getByteArray(), 0, length);
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/0b19122d/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/UTF8StringBuilderSuite.scala

diff --git
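To see what the builder is doing at a high level, here is a simplified, self-contained sketch of the same grow-by-doubling append logic using a plain byte array (the class name is a hypothetical stand-in; the real builder stores bytes in a `ByteArrayMemoryBlock`, as the diff above shows):

    import java.nio.charset.StandardCharsets
    import java.util.Arrays

    // Simplified stand-in for the grow-by-doubling logic in UTF8StringBuilder.
    class GrowableUtf8Buffer {
      private val ArrayMax = Integer.MAX_VALUE - 15
      private var buffer = new Array[Byte](16) // same initial size as StringBuilder
      private var length = 0

      // Grow the backing array by at least `neededSize`, doubling when possible.
      private def grow(neededSize: Int): Unit = {
        if (neededSize > ArrayMax - length) {
          throw new UnsupportedOperationException(
            "Cannot grow buffer: size would exceed limitation " + ArrayMax)
        }
        val requestedSize = length + neededSize
        if (buffer.length < requestedSize) {
          val newLength = if (requestedSize < ArrayMax / 2) requestedSize * 2 else ArrayMax
          buffer = Arrays.copyOf(buffer, newLength)
        }
      }

      def append(value: String): Unit = {
        val bytes = value.getBytes(StandardCharsets.UTF_8)
        grow(bytes.length)
        System.arraycopy(bytes, 0, buffer, length, bytes.length)
        length += bytes.length
      }

      def build(): String = new String(buffer, 0, length, StandardCharsets.UTF_8)
    }

    object GrowableUtf8Buffer {
      def main(args: Array[String]): Unit = {
        val buf = new GrowableUtf8Buffer
        buf.append("Hello, ")
        buf.append("world")
        println(buf.build()) // Hello, world
      }
    }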
svn commit: r26305 - in /dev/spark/2.4.0-SNAPSHOT-2018_04_12_04_02-6a2289e-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Thu Apr 12 11:23:41 2018
New Revision: 26305

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_04_12_04_02-6a2289e docs

[This commit notification would consist of 1458 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r26303 - in /dev/spark/2.3.1-SNAPSHOT-2018_04_12_02_01-5712695-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Thu Apr 12 09:15:55 2018
New Revision: 26303

Log:
Apache Spark 2.3.1-SNAPSHOT-2018_04_12_02_01-5712695 docs

[This commit notification would consist of 1443 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-23962][SQL][TEST] Fix race in currentExecutionIds().
Repository: spark
Updated Branches: refs/heads/master e904dfaf0 -> 6a2289ecf

[SPARK-23962][SQL][TEST] Fix race in currentExecutionIds().

SQLMetricsTestUtils.currentExecutionIds() was racing with the listener bus, which led to some flaky tests. We should wait until the listener bus is empty.

I tested by adding some Thread.sleep()s in SQLAppStatusListener, which reproduced the exceptions I saw on Jenkins. With this change, they went away.

Author: Imran Rashid

Closes #21041 from squito/SPARK-23962.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6a2289ec
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6a2289ec
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6a2289ec

Branch: refs/heads/master
Commit: 6a2289ecf020a99cd9b3bcea7da5e78fb4e0303a
Parents: e904dfa
Author: Imran Rashid
Authored: Thu Apr 12 15:58:04 2018 +0800
Committer: Wenchen Fan
Committed: Thu Apr 12 15:58:04 2018 +0800

--
 .../org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala | 1 +
 1 file changed, 1 insertion(+)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/6a2289ec/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala
index 534d8bb..dcc540f 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala
@@ -34,6 +34,7 @@ trait SQLMetricsTestUtils extends SQLTestUtils {
   import testImplicits._

   protected def currentExecutionIds(): Set[Long] = {
+    spark.sparkContext.listenerBus.waitUntilEmpty(10000)
     statusStore.executionsList.map(_.executionId).toSet
   }
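The general pattern, as a short hedged sketch (the helper name is hypothetical; note that `SparkContext.listenerBus` is `private[spark]`, so code like this must live under the `org.apache.spark` package, as the SQL test utils do):

    package org.apache.spark.sql.test

    import org.apache.spark.sql.SparkSession

    // Hypothetical helper: drain the async listener bus before reading any state
    // that listeners populate, so assertions do not race with in-flight events.
    object ListenerDrain {
      def afterListenerEvents[T](spark: SparkSession)(read: => T): T = {
        // Block until every queued event has been delivered (up to 10 seconds).
        spark.sparkContext.listenerBus.waitUntilEmpty(10000)
        read
      }
    }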
spark git commit: [SPARK-23962][SQL][TEST] Fix race in currentExecutionIds().
Repository: spark
Updated Branches: refs/heads/branch-2.3 03a4dfd69 -> 571269519

[SPARK-23962][SQL][TEST] Fix race in currentExecutionIds().

SQLMetricsTestUtils.currentExecutionIds() was racing with the listener bus, which led to some flaky tests. We should wait until the listener bus is empty.

I tested by adding some Thread.sleep()s in SQLAppStatusListener, which reproduced the exceptions I saw on Jenkins. With this change, they went away.

Author: Imran Rashid

Closes #21041 from squito/SPARK-23962.

(cherry picked from commit 6a2289ecf020a99cd9b3bcea7da5e78fb4e0303a)
Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/57126951
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/57126951
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/57126951

Branch: refs/heads/branch-2.3
Commit: 571269519a49e0651575de18de81b788b6548afd
Parents: 03a4dfd
Author: Imran Rashid
Authored: Thu Apr 12 15:58:04 2018 +0800
Committer: Wenchen Fan
Committed: Thu Apr 12 15:58:36 2018 +0800

--
 .../org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala | 1 +
 1 file changed, 1 insertion(+)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/57126951/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala
index 122d287..e22bbb6 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsTestUtils.scala
@@ -34,6 +34,7 @@ trait SQLMetricsTestUtils extends SQLTestUtils {
   import testImplicits._

   protected def currentExecutionIds(): Set[Long] = {
+    spark.sparkContext.listenerBus.waitUntilEmpty(10000)
     statusStore.executionsList.map(_.executionId).toSet
   }