brkyvz commented on a change in pull request #26225: [SPARK-29568][SS] Stop existing running streams when a new stream is launched
URL: https://github.com/apache/spark/pull/26225#discussion_r338072359
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala
 ##########
 @@ -355,11 +355,22 @@ class StreamingQueryManager private[sql] (sparkSession: SparkSession) extends Lo
       // Make sure no other query with same id is active across all sessions
       val activeOption =
         Option(sparkSession.sharedState.activeStreamingQueries.putIfAbsent(query.id, this))
-      if (activeOption.isDefined || activeQueries.values.exists(_.id == query.id)) {
+
+      val streamAlreadyActive =
+        activeOption.isDefined || activeQueries.values.exists(_.id == query.id)
+      val turnOffOldStream =
+        sparkSession.sessionState.conf.getConf(SQLConf.STOP_RUNNING_DUPLICATE_STREAM)
+      if (streamAlreadyActive && turnOffOldStream) {
+        val queryManager = activeOption.getOrElse(this)
+        logInfo(s"Stopping existing streaming query [id=${query.id}], as a new run is being " +
+          "started.")
+        queryManager.get(query.id).stop()
 
 Review comment:
   Great question. I can add some safeguards against this, but in most cases,
when we call a stream a "zombie" we mean that we have lost all references to
it, not that it is uninterruptible.
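
   The safeguard is not spelled out in this thread. One possible shape for it, as a minimal sketch only, is to bound how long the caller waits for `stop()` to return. The `Stoppable` trait and the `stopWithTimeout` helper below are hypothetical names made up for illustration; they are not part of Spark or of this pull request.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in for the single method of a streaming query used here.
trait Stoppable {
  def stop(): Unit
}

object StopSafeguard {
  /**
   * Runs stop() on another thread and waits at most `timeout` for it.
   * Returns true if stop() completed in time and false if the wait timed out.
   * An exception thrown by stop() itself is rethrown to the caller.
   * On a timeout the background call is NOT cancelled; it keeps running.
   */
  def stopWithTimeout(query: Stoppable, timeout: FiniteDuration): Boolean = {
    val stopping = Future(query.stop())
    try {
      Await.result(stopping, timeout)
      true
    } catch {
      case _: TimeoutException => false
    }
  }
}
```

   With a helper like this, the caller could log a warning or fail the new run when the result is `false`, instead of blocking indefinitely on a query that never stops.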

