Till Rohrmann created FLINK-3184:
------------------------------------

             Summary: Decrease Akka timeouts on cluster side to make system more responsive
                 Key: FLINK-3184
                 URL: https://issues.apache.org/jira/browse/FLINK-3184
             Project: Flink
          Issue Type: Improvement
    Affects Versions: 1.0.0
            Reporter: Till Rohrmann
            Assignee: Till Rohrmann
            Priority: Minor

Currently, the default timeout for futures is set to 100 s. This is also the time used to wait between restart attempts if no other value has been explicitly specified. Especially in the streaming case, it is often necessary to detect and react to failures in less than 100 s. Therefore, I propose to decrease the default timeout to 10 s.

Additionally, I propose to introduce a higher timeout for the client side (e.g. 60 s). The reason is that in case of a {{JobManager}} failure the client has to wait until the cluster has recovered. Recovery via ZooKeeper can take longer than 10 s. With a 10 s client timeout, such a recovery could be falsely recognized as a lost connection.
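
To illustrate why the client may need a longer timeout than the cluster, here is a minimal, self-contained sketch. It is not Flink or Akka code: the class {{TimeoutSketch}}, the method {{awaitReply}}, and the 30 s recovery time are hypothetical, and durations are scaled down (1 s is represented by 10 ms) so the example runs quickly. It only shows that a wait shorter than the recovery time reports a timeout even though the reply eventually arrives.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutSketch {

    /**
     * Hypothetical helper: simulates a cluster that answers only after
     * recoveryMillis, and a client that waits at most timeoutMillis.
     * Returns true if the reply arrived in time, false if the wait timed out.
     */
    static boolean awaitReply(long recoveryMillis, long timeoutMillis) {
        CompletableFuture<String> reply = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(recoveryMillis); // stands in for the recovery phase
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "recovered";
        });
        try {
            reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The client would treat this as a lost connection.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long recovery = 300; // assumed 30 s recovery, scaled to 300 ms

        // Proposed cluster-side default (10 s, scaled to 100 ms): too short for this recovery.
        System.out.println("10 s timeout, reply in time: " + awaitReply(recovery, 100));

        // Proposed client-side timeout (60 s, scaled to 600 ms): long enough.
        System.out.println("60 s timeout, reply in time: " + awaitReply(recovery, 600));
    }
}
```

Under these assumptions the first wait times out and the second succeeds, which is the false "lost connection" case described above.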