brkyvz commented on a change in pull request #26225: [SPARK-29568][SS] Stop 
existing running streams when a new stream is launched
URL: https://github.com/apache/spark/pull/26225#discussion_r342829903
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
 ##########
 @@ -1087,6 +1087,14 @@ object SQLConf {
       .checkValue(v => Set(1, 2).contains(v), "Valid versions are 1 and 2")
       .createWithDefault(2)
 
+  val STOP_RUNNING_DUPLICATE_STREAM = buildConf("spark.sql.streaming.stopExistingDuplicateStream")
+    .doc("Running two streams using the same checkpoint location concurrently is not supported. " +
+      "In the case where multiple streams are started on different SparkSessions, access to the " +
+      "older stream's SparkSession may not be possible, and the stream may have turned into a " +
+      "zombie stream. When this flag is true, we will stop the old stream to start the new one.")
+    .booleanConf
+    .createWithDefault(true)
 
 Review comment:
   Great question. Here's my argument for why we should change it:
    1. This change is going into Spark 3.0, a release where we can actually
       break existing behavior (unless it is critical behavior that people
       depend on).
    2. Under the existing behavior, any new start of a stream would fail
       because an existing stream was already running. Hitting that failure
       is a programming error on the user's part.
    3. However, there are legitimate cases where a user would like to start
       a new instance of the stream (because they upgraded the code, for
       instance) but has no way of stopping the existing stream, because it
       has turned into a zombie.
   
   I would argue that 3 is more common than 2, and given 1, this is the
   release where we can change the behavior and mention it in the release
   notes.
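
To make the two behaviors concrete, here is a minimal toy model of the flag's semantics. It is only a sketch and not Spark's implementation: `ToyStream` and `ToyStreamRegistry` are hypothetical names invented for illustration, and the registry is assumed to track at most one active stream per checkpoint location.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a running streaming query; not a Spark class.
final class ToyStream(val name: String, val checkpoint: String) {
  private var active = true
  def isActive: Boolean = active
  def stop(): Unit = { active = false }
}

// Hypothetical registry keyed by checkpoint location.
// `stopExistingDuplicate` plays the role of the proposed flag.
final class ToyStreamRegistry(stopExistingDuplicate: Boolean) {
  private val running = mutable.Map.empty[String, ToyStream]

  def start(name: String, checkpoint: String): ToyStream = synchronized {
    running.get(checkpoint).filter(_.isActive) match {
      case Some(old) if stopExistingDuplicate =>
        // Flag on: stop the old (possibly zombie) stream, then proceed.
        old.stop()
      case Some(old) =>
        // Flag off: the new start fails, as in the pre-3.0 behavior.
        throw new IllegalStateException(
          s"Stream '${old.name}' is already running on checkpoint $checkpoint")
      case None =>
        // Nothing is active on this checkpoint; proceed.
    }
    val stream = new ToyStream(name, checkpoint)
    running(checkpoint) = stream
    stream
  }
}
```

With the flag on, starting a second stream on the same checkpoint stops the first and succeeds, which covers case 3. With the flag off, the second start throws, which is the failure described in case 2.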
   

