spark git commit: [SPARK-23279][SS] Avoid triggering distributed job for Console sink

jshao Tue, 30 Jan 2018 22:00:05 -0800

Repository: spark
Updated Branches:
  refs/heads/branch-2.3 b8778321b -> ab5a51055



[SPARK-23279][SS] Avoid triggering distributed job for Console sink

## What changes were proposed in this pull request?

The console sink redistributes data that has already been collected to the
driver, which triggers a distributed job in every batch. This is unnecessary,
so this change builds the DataFrame from the local rows and runs a local job.
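
The difference can be sketched outside the sink as follows. This is a minimal illustration, not the sink's actual code: the object name `ConsoleSinkSketch`, the helpers `distributedFrame` and `localFrame`, and the two-column schema are hypothetical, and it assumes Spark SQL is on the classpath with a local master.

```scala
import scala.collection.JavaConverters._

import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical helper object, for illustration only.
object ConsoleSinkSketch {
  // Assumed example schema; the real sink receives its schema from the query.
  val schema: StructType = StructType(Seq(
    StructField("id", IntegerType, nullable = false),
    StructField("word", StringType, nullable = true)))

  // Old approach: parallelize turns the driver-side rows into an RDD,
  // so showing the result has to schedule a Spark job.
  def distributedFrame(spark: SparkSession, rows: Seq[Row]): DataFrame =
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

  // New approach: passing a java.util.List keeps the rows as local data
  // on the driver, so showing them does not need a distributed job.
  def localFrame(spark: SparkSession, rows: Seq[Row]): DataFrame =
    spark.createDataFrame(rows.toList.asJava, schema)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("console-sink-sketch")
      .getOrCreate()

    // Rows that are already on the driver, as in a console sink batch.
    val rows = Seq(Row(1, "a"), Row(2, "b"))

    distributedFrame(spark, rows).show()
    localFrame(spark, rows).show()

    spark.stop()
  }
}
```

Both calls print the same table; only the way the rows reach the DataFrame differs, which is why the change is safe for a sink whose rows are already collected.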

## How was this patch tested?

Existing unit tests and manual verification.

Author: jerryshao <ss...@hortonworks.com>

Closes #20447 from jerryshao/console-minor.

(cherry picked from commit 8c6a9c90a36a938372f28ee8be72178192fbc313)
Signed-off-by: jerryshao <ss...@hortonworks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ab5a5105
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ab5a5105
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ab5a5105

Branch: refs/heads/branch-2.3
Commit: ab5a5105502c545bed951538f0ce9409cfbde154
Parents: b877832
Author: jerryshao <ss...@hortonworks.com>
Authored: Wed Jan 31 13:59:21 2018 +0800
Committer: jerryshao <ss...@hortonworks.com>
Committed: Wed Jan 31 13:59:36 2018 +0800

----------------------------------------------------------------------
 .../spark/sql/execution/streaming/sources/ConsoleWriter.scala    | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ab5a5105/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/ConsoleWriter.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/ConsoleWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/ConsoleWriter.scala
index d46f4d7..c57bdc4 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/ConsoleWriter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/ConsoleWriter.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.sql.execution.streaming.sources
 
+import scala.collection.JavaConverters._
+
 import org.apache.spark.internal.Logging
 import org.apache.spark.sql.{Row, SparkSession}
 import org.apache.spark.sql.sources.v2.DataSourceOptions
@@ -61,7 +63,7 @@ class ConsoleWriter(schema: StructType, options: DataSourceOptions)
     println("-------------------------------------------")
     // scalastyle:off println
     spark
-      .createDataFrame(spark.sparkContext.parallelize(rows), schema)
+      .createDataFrame(rows.toList.asJava, schema)
       .show(numRowsToShow, isTruncated)
   }
 


