[ 
https://issues.apache.org/jira/browse/FLINK-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433305#comment-15433305
 ] 

Till Rohrmann commented on FLINK-4449:
--------------------------------------

Can we create a generic {{HeartbeatManager}} that can be used for the 
heartbeats between RM <=> TM, RM <=> JM and JM <=> TM? I think this should be 
possible, similar to the {{RetryingRegistration}}. I think we should create a 
dedicated issue for the implementation, where we should also flesh out the 
implementation details a bit more. For example, should the heartbeat response 
be delivered as the result of a future, or should the sending side also be an 
RPC endpoint that is told about the heartbeat response via a tell operation?
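
To make the two delivery options concrete, here is a minimal sketch in plain 
Java. All interface and method names are hypothetical and are not taken from 
the Flink code base. The payload type {{P}} is generic so that the same 
contract could serve RM <=> TM, RM <=> JM and JM <=> TM, and an {{Integer}} 
slot count stands in for a real payload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class HeartbeatDeliverySketch {

    /** Option A (hypothetical): the response is delivered as the result of a future. */
    interface FutureHeartbeatTarget<P> {
        CompletableFuture<P> requestHeartbeat(String senderId);
    }

    /** Option B (hypothetical): the sender is itself an endpoint that is told the response. */
    interface HeartbeatSender<P> {
        void receiveHeartbeat(String targetId, P payload);
    }

    /** Option B target: it replies by calling back into the sender ("tell"). */
    interface TellHeartbeatTarget<P> {
        void requestHeartbeat(String senderId, HeartbeatSender<P> replyTo);
    }

    /** Runs both options against a stub target and records what the sender observed. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();

        // Stub target that answers immediately with 4 as its payload (assumed value).
        FutureHeartbeatTarget<Integer> futureTarget =
                senderId -> CompletableFuture.completedFuture(4);
        futureTarget.requestHeartbeat("rm")
                .thenAccept(payload -> events.add("future: " + payload));

        // Same stub, but the response is pushed to the sender via a tell-style call.
        TellHeartbeatTarget<Integer> tellTarget =
                (senderId, replyTo) -> replyTo.receiveHeartbeat("tm-1", 4);
        tellTarget.requestHeartbeat("rm",
                (targetId, payload) -> events.add("tell: " + targetId + " " + payload));

        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

With the future variant the caller owns the timeout handling for each 
request. With the tell variant the sender has to match incoming responses to 
registered targets itself, which it needs to do anyway to track timeouts.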

> Heartbeat Manager between ResourceManager and TaskExecutor
> ----------------------------------------------------------
>
>                 Key: FLINK-4449
>                 URL: https://issues.apache.org/jira/browse/FLINK-4449
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>            Reporter: zhangjing
>            Assignee: zhangjing
>
> The HeartbeatManager is responsible for the heartbeats between the 
> ResourceManager and the TaskExecutors.
> 1. Register TaskExecutors
> Register the heartbeat targets. If the heartbeat response for a target is 
> not reported in time, mark the target as failed and notify the ResourceManager.
> 2. Trigger heartbeats
> Periodically trigger a heartbeat from the ResourceManager to each TaskExecutor.
> Each TaskExecutor reports its slot allocation in the heartbeat response.
> The ResourceManager syncs its own slot allocation with the heartbeat response.
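
The two responsibilities in the description could be sketched roughly as 
follows. This is an illustration under assumptions, not the proposed 
implementation: the class, the {{Listener}} callback and the integer slot 
count are all hypothetical, and time is passed in explicitly so the example is 
deterministic. In practice the periodic trigger and the timeout check would be 
driven by a scheduler.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HeartbeatMonitorSketch {

    /** Hypothetical callback that the ResourceManager would implement. */
    interface Listener {
        void onTargetFailed(String targetId);
        void onSlotReport(String targetId, int allocatedSlots);
    }

    private final long timeoutMillis;
    private final Listener listener;
    // Timestamp of the last heartbeat response (or registration) per target.
    private final Map<String, Long> lastSeen = new LinkedHashMap<>();

    HeartbeatMonitorSketch(long timeoutMillis, Listener listener) {
        this.timeoutMillis = timeoutMillis;
        this.listener = listener;
    }

    /** Step 1: register a heartbeat target. */
    void register(String targetId, long nowMillis) {
        lastSeen.put(targetId, nowMillis);
    }

    /** Step 2: handle a heartbeat response that carries the target's slot allocation. */
    void receiveResponse(String targetId, int allocatedSlots, long nowMillis) {
        if (lastSeen.containsKey(targetId)) { // ignore unregistered or already failed targets
            lastSeen.put(targetId, nowMillis);
            listener.onSlotReport(targetId, allocatedSlots);
        }
    }

    /** Marks every target whose last response is older than the timeout as failed. */
    void checkTimeouts(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                String failedId = entry.getKey();
                it.remove();
                listener.onTargetFailed(failedId);
            }
        }
    }

    /** Two targets register; only tm-1 responds before the 1000 ms timeout. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        HeartbeatMonitorSketch monitor = new HeartbeatMonitorSketch(1000L, new Listener() {
            @Override
            public void onTargetFailed(String targetId) {
                events.add("failed " + targetId);
            }

            @Override
            public void onSlotReport(String targetId, int allocatedSlots) {
                events.add("slots " + targetId + "=" + allocatedSlots);
            }
        });
        monitor.register("tm-1", 0L);
        monitor.register("tm-2", 0L);
        monitor.receiveResponse("tm-1", 3, 500L);
        monitor.checkTimeouts(1200L); // tm-1 was seen 700 ms ago, tm-2 1200 ms ago
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch a failed target is removed from the registry, so a late 
response from it is ignored until it registers again. Whether that is the 
desired behaviour is one of the details still to be decided.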



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
