[ https://issues.apache.org/jira/browse/FLINK-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433305#comment-15433305 ]
Till Rohrmann commented on FLINK-4449:
--------------------------------------

Can we create a generic {{HeartbeatManager}} that can be used for the heartbeats between RM <=> TM, RM <=> JM and JM <=> TM? I think it should be possible, similar to the {{RetryingRegistration}}.

I think we should create a dedicated issue for the implementation. There we should also flesh out the details of the implementation a little more. For example, should the heartbeat response be delivered as the result of a future, or should the sending side also be an RPC endpoint that is told about the heartbeat response via a tell operation?

> Heartbeat Manager between ResourceManager and TaskExecutor
> ----------------------------------------------------------
>
>                 Key: FLINK-4449
>                 URL: https://issues.apache.org/jira/browse/FLINK-4449
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>            Reporter: zhangjing
>            Assignee: zhangjing
>
> HeartbeatManager is responsible for the heartbeat between the ResourceManager and
> the TaskExecutors.
> 1. Register TaskExecutors
>     Register heartbeat targets. If the heartbeat response for a target is
> not reported in time, mark the target as failed and notify the ResourceManager.
> 2. Trigger heartbeat
>     Trigger the heartbeat from the ResourceManager to the TaskExecutors periodically.
>     Each TaskExecutor reports its slot allocation in the heartbeat response.
>     The ResourceManager syncs its own slot allocation with the heartbeat response.
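
To make the two responsibilities in the description concrete (register targets, then periodically check for missing heartbeat responses), here is a minimal sketch of a generic manager keyed by an arbitrary target ID, so the same class could serve RM <=> TM, RM <=> JM and JM <=> TM. Everything in it is an assumption for illustration: the class name {{SimpleHeartbeatManager}}, its method names, and the explicit {{nowMillis}} parameter are hypothetical and are not Flink's API. It deliberately leaves open how the response is delivered (future vs. tell) and omits the slot report payload.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch, not Flink's HeartbeatManager. Time is passed in
// explicitly so the logic is deterministic; a real implementation would
// likely drive checkTimeouts from a scheduled executor.
final class SimpleHeartbeatManager<K> {
    private final long timeoutMillis;
    // Callback invoked for each failed target, e.g. the ResourceManager's handler.
    private final Consumer<K> timeoutListener;
    // Timestamp of the last heartbeat response seen per registered target.
    private final Map<K, Long> lastSeen = new HashMap<>();

    SimpleHeartbeatManager(long timeoutMillis, Consumer<K> timeoutListener) {
        this.timeoutMillis = timeoutMillis;
        this.timeoutListener = timeoutListener;
    }

    // Step 1: register a heartbeat target (e.g. a TaskExecutor).
    void monitorTarget(K targetId, long nowMillis) {
        lastSeen.put(targetId, nowMillis);
    }

    // Record a heartbeat response; returns false if the target is not registered.
    boolean receiveHeartbeat(K targetId, long nowMillis) {
        if (!lastSeen.containsKey(targetId)) {
            return false;
        }
        lastSeen.put(targetId, nowMillis);
        return true;
    }

    boolean isMonitored(K targetId) {
        return lastSeen.containsKey(targetId);
    }

    // Step 2: called periodically. Targets whose last response is older than
    // the timeout are unregistered, reported to the listener, and returned.
    List<K> checkTimeouts(long nowMillis) {
        List<K> failed = new ArrayList<>();
        Iterator<Map.Entry<K, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                failed.add(entry.getKey());
                it.remove();
            }
        }
        failed.forEach(timeoutListener);
        return failed;
    }
}
```

With a 100 ms timeout, a target registered at time 0 that never responds is reported as failed by a check at time 150, while a target that responded at time 80 stays monitored. Keeping the manager generic in the key type is one way to reuse it across the three component pairs; whether the listener is a plain callback or an RPC endpoint is exactly the open question raised in the comment.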