brkyvz opened a new pull request #26225: [SPARK-29568][SS] Stop existing 
running streams when a new stream is launched
URL: https://github.com/apache/spark/pull/26225
 
 
   #26157

   ### What changes were proposed in this pull request?
   
   This PR adds a SQL conf: `spark.sql.streaming.stopExistingDuplicateStream`. 
When this conf is `true` (the default), an already running stream is stopped 
if a new copy of it is launched on the same checkpoint location.
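   A minimal usage sketch is shown below. The source, sink, and checkpoint path 
are illustrative assumptions and not taken from the PR; only the conf name comes 
from this change.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical usage sketch for the conf added by this PR.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("stop-duplicate-stream-sketch")
  .getOrCreate()

// Conf name comes from this PR; `true` is described as the default.
spark.conf.set("spark.sql.streaming.stopExistingDuplicateStream", "true")

// Start a stream whose checkpoint location may already be in use by an
// older copy of the same query (e.g. launched from another SparkSession).
val query = spark.readStream
  .format("rate")                  // illustrative source
  .load()
  .writeStream
  .format("console")               // illustrative sink
  .option("checkpointLocation", "/tmp/checkpoints/shared-query")
  .start()

// With the conf enabled, this start is expected to stop the already running
// copy on that checkpoint location instead of failing.
```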
   
   ### Why are the changes needed?
   
   In multi-tenant environments with multiple SparkSessions, you can 
accidentally start several copies of the same stream (i.e. streams using the 
same checkpoint location). Today this causes every new instantiation of the 
stream to fail. However, sometimes you actually want the old stream stopped, 
because it may have turned into a zombie: you no longer have access to its 
query handle or SparkSession. A zombie scenario is sketched after this section.
   
   It would be useful to have a SQL flag that allows stopping the old stream 
in such zombie cases.
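
   A sketch of the zombie case under illustrative assumptions (two sessions in 
one application, a `rate` source, a `console` sink, and a shared checkpoint 
path; none of these specifics come from the PR):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQuery

val sessionA = SparkSession.builder().master("local[2]").getOrCreate()
val sessionB = sessionA.newSession()   // e.g. another tenant or notebook

// Both tenants start "the same" stream: identical query, identical checkpoint.
def startCopy(spark: SparkSession): StreamingQuery =
  spark.readStream.format("rate").load()
    .writeStream.format("console")
    .option("checkpointLocation", "/tmp/checkpoints/shared-query")
    .start()

val zombie = startCopy(sessionA)  // handle later lost; the query keeps running

// Without this change, this second start on the same checkpoint location fails.
// With spark.sql.streaming.stopExistingDuplicateStream=true, the running
// zombie copy is expected to be stopped so the new copy can take over.
val fresh = startCopy(sessionB)
```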
   
   ### Does this PR introduce any user-facing change?
   
   Yes. Now, by default, if you launch a new copy of an already running stream 
(e.g. on a multi-tenant cluster), the existing stream will be stopped.
   
   ### How was this patch tested?
   
   Unit tests in `StreamingQueryManagerSuite`.
