pengfei zhao created SPARK-47794:
------------------------------------

             Summary: After executing the reboot command on the host where the Driver node is located, the Spark Streaming application ends in a SUCCEEDED state
                 Key: SPARK-47794
                 URL: https://issues.apache.org/jira/browse/SPARK-47794
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.3.4, 2.4.8
            Reporter: pengfei zhao
         Attachments: image-2024-04-10-16-03-29-393.png
While running a Spark Streaming application on YARN in cluster mode, a reboot/shutdown of the node hosting the ApplicationMaster stops the SparkContext and marks the application as SUCCEEDED. The reboot/shutdown command sends a graceful stop signal ("kill -15", i.e. SIGTERM) to every process on the machine, and from the Spark Streaming code this signal makes the application believe it has ended normally. The log is as follows:

!image-2024-04-10-16-00-22-450.png!

In most cases, however, a reboot/shutdown happens because of operator error, because another service needs to be restarted, or because the operating system itself must be restarted. Is it appropriate for Spark to report a SUCCEEDED status in that situation? Many scheduling systems decide whether to restart a Spark job based on its final status (for example, restarting only on FAILED), so a SUCCEEDED status from a killed streaming job is difficult for the scheduling system to handle. Moreover, Spark Streaming jobs are long-running, so reporting a SUCCEEDED status is also very ambiguous.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
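The core of the problem can be reproduced outside Spark. The following is a minimal sketch (not Spark code, just an illustration of the mechanism): a long-running child process installs a SIGTERM handler that performs a "graceful" stop and exits with status 0, analogous to the driver's shutdown hook stopping the SparkContext cleanly. To a supervisor such as YARN, this exit is indistinguishable from a normal, successful completion.

```python
import signal
import subprocess
import sys
import textwrap
import time

# Child script mimicking a long-running driver process: its SIGTERM
# handler does a "graceful" stop and exits with status 0, so the
# parent cannot tell a reboot-induced kill from genuine success.
CHILD = textwrap.dedent("""
    import signal, sys, time

    def graceful_stop(signum, frame):
        # Analogous to a shutdown hook stopping the SparkContext cleanly.
        sys.exit(0)

    signal.signal(signal.SIGTERM, graceful_stop)
    while True:
        time.sleep(0.1)
""")

def run_and_terminate():
    proc = subprocess.Popen([sys.executable, "-c", CHILD])
    time.sleep(0.5)                   # give the child time to install its handler
    proc.send_signal(signal.SIGTERM)  # what `reboot`/`shutdown` sends (kill -15)
    proc.wait(timeout=5)
    return proc.returncode

if __name__ == "__main__":
    print(run_and_terminate())  # 0: looks like a normal, successful exit
```

This is why the scheduling systems mentioned above see SUCCEEDED: the exit code alone carries no information about whether the stop was requested by the application or forced by the host going down.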