Hello Spark devs, I built stopstreaming, a small library that allows you to stop Spark Structured Streaming jobs through signals by adding a single extension method to the StreamingQuery. The library extends StreamingQuery with a single method: query.awaitExternalTermination(config).
Github: https://github.com/a-biratsis/stopstreaming The goal is to stop Spark Structured Streaming jobs through external signals, without relying on calling StreamingQuery.stop() directly, by leveraging Spark's existing API (StreamingQueryListener is used internally). Currently two watchers are available: - *REST*: embeds a lightweight HTTP server on the Spark driver. User sends a POST /stop/<query-id> from any orchestrator, notebook, or curl command. - *FileSystem*: creates a marker file at startup. Delete it (locally or from DBFS) and the query stops. Works in Databricks with dbutils.fs.rm(...) or in any other FS using rm. Soon I will add more watchers such as Kafka and Azure Service Bus, and of course I am open to new ideas/suggestions. Please let me know if you find the above useful and it could enhance or extend the current Spark API. Best, Alex
