Till Rohrmann created FLINK-3184:
------------------------------------

             Summary: Decrease Akka timeouts on cluster side to make system 
more responsive
                 Key: FLINK-3184
                 URL: https://issues.apache.org/jira/browse/FLINK-3184
             Project: Flink
          Issue Type: Improvement
    Affects Versions: 1.0.0
            Reporter: Till Rohrmann
            Assignee: Till Rohrmann
            Priority: Minor


Currently, the default timeout for futures is set to 100 s. This is also the 
time used to wait between restart attempts if no other value has been 
explicitly specified. Especially in the streaming case, it is often necessary 
to detect and react to failures in a shorter period than 100 s. Therefore, I 
propose to decrease the default timeout to 10 s.

Additionally, I propose to introduce a higher timeout for the client side 
(e.g. 60 s). The reason is that in case of a {{JobManager}} failure the client 
has to wait until the cluster has recovered. Recovering via ZooKeeper can take 
longer than 10 s, so with a 10 s client timeout an ongoing recovery could be 
falsely recognized as a lost connection.
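
To illustrate the reasoning for the separate client timeout, here is a minimal 
sketch. It is not Flink code: the names {{Timeouts}} and {{client_outcome}} are 
hypothetical, and the 25 s recovery time is an assumed value chosen only for 
the example. It shows how the same recovery is classified differently under 
the proposed 10 s and 60 s timeouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timeouts:
    """Hypothetical container for the two timeouts proposed in this issue."""
    cluster_s: float = 10.0  # proposed default for cluster-side futures
    client_s: float = 60.0   # proposed, more lenient client-side timeout


def client_outcome(recovery_s: float, client_timeout_s: float) -> str:
    """Classify what a waiting client would observe.

    While the JobManager recovers, the client receives no answer, so a
    recovery that outlasts the client timeout looks like a lost connection.
    """
    if recovery_s <= client_timeout_s:
        return "recovered"
    return "reported as lost connection"


timeouts = Timeouts()
recovery_s = 25.0  # assumed recovery time, for illustration only

# Reusing the 10 s cluster-side value on the client misreads the recovery.
print(client_outcome(recovery_s, timeouts.cluster_s))
# The higher client-side value lets the client wait the recovery out.
print(client_outcome(recovery_s, timeouts.client_s))
```

With these assumed numbers, the first call reports a lost connection and the 
second reports a successful recovery.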



