Zhanghao Chen created FLINK-35145:
-------------------------------------

             Summary: Add timeout for cluster termination
                 Key: FLINK-35145
                 URL: https://issues.apache.org/jira/browse/FLINK-35145
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Coordination
    Affects Versions: 1.20.0
            Reporter: Zhanghao Chen
             Fix For: 1.20.0


Currently, cluster termination may block forever because it has no timeout.
For example, for an Application cluster with ZK HA enabled, when the ZK
cluster is down, the cluster will reach termination status, but the termination
process will block while trying to clean up HA data on ZK. A similar
phenomenon can be observed when an HDFS/S3 outage occurs.

I propose adding a timeout for the cluster termination process in the
ClusterEntryPoint#shutDownAsync method.
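
To illustrate the idea, here is a minimal sketch in plain Java. It is not
Flink code and does not use Flink's actual shutdown path. The class
TerminationTimeoutSketch, the Outcome enum, and the withTimeout helper are
hypothetical names introduced only for this example, and the cleanup future
stands in for whatever HA/blob cleanup the real shutdown waits on. It relies
only on CompletableFuture#orTimeout from the JDK (Java 9+).

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TerminationTimeoutSketch {

    /** Hypothetical result type: did cleanup finish, or did we give up waiting? */
    public enum Outcome { CLEAN, TIMED_OUT }

    /**
     * Bounds the wait on a cleanup future. If cleanup does not finish within
     * the timeout, the returned future completes with TIMED_OUT instead of
     * hanging. Other cleanup failures are propagated unchanged.
     * Note: this only stops waiting; it does not cancel the cleanup itself.
     */
    public static CompletableFuture<Outcome> withTimeout(
            CompletableFuture<Void> cleanup, Duration timeout) {
        return cleanup
                .thenApply(ignored -> Outcome.CLEAN)
                .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS)
                .handle((outcome, error) -> {
                    if (error == null) {
                        return outcome;
                    }
                    // Failures from earlier stages arrive wrapped; unwrap them.
                    Throwable cause = (error instanceof CompletionException
                            && error.getCause() != null) ? error.getCause() : error;
                    if (cause instanceof TimeoutException) {
                        return Outcome.TIMED_OUT;
                    }
                    throw new CompletionException(cause);
                });
    }

    public static void main(String[] args) {
        // Simulates cleanup against an unreachable ZK/HDFS: never completes.
        CompletableFuture<Void> stuckCleanup = new CompletableFuture<>();
        System.out.println(withTimeout(stuckCleanup, Duration.ofMillis(200)).join());

        // Simulates a healthy cleanup that has already finished.
        CompletableFuture<Void> doneCleanup = CompletableFuture.completedFuture(null);
        System.out.println(withTimeout(doneCleanup, Duration.ofMillis(200)).join());
    }
}
```

With a bound like this, the process can log the timeout and exit rather than
wait indefinitely. What to do with leftover HA data after a timeout is a
separate design question that this sketch does not address.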



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
