[
https://issues.apache.org/jira/browse/FLINK-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhangjing updated FLINK-4537:
-----------------------------
Description:
The ResourceManager keeps track of all JobManagers that execute jobs. When a
new JobManager registers, its leadership status is checked through the
HighAvailabilityServices. It is then registered at the ResourceManager
using the {{JobID}} provided with the initial registration message.
The ResourceManager should use the JobID and the LeaderSessionID (notified by
the HighAvailabilityServices) to identify a session with a JobMaster.
When a JobManager registers at the ResourceManager, the registration takes the
following two input parameters:
1. resourceManagerLeaderId: the fencing token for the ResourceManager leader,
as held by the JobMaster that sends the registration
2. JobMasterRegistration: contains the JobMaster address and the JobID
The ResourceManager needs to process the registration in the following
steps:
1. Check whether the given resourceManagerLeaderId matches the current
leadershipSessionId of the ResourceManager. If it does not, two or more
ResourceManagers may exist at the same time and this ResourceManager is not
the intended one, so it rejects or ignores the registration.
2. Check whether a valid JobMaster exists at the given address by connecting
to that address. Reject registrations from invalid addresses. (This check is
hidden in the connect logic.)
3. Keep the mapping from JobID to JobMasterGateway.
4. Start a JobMasterLeaderListener for the given JobID to listen to the
leadership of the specified JobMaster.
5. Send a registration-successful acknowledgement to the JobMaster.
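The five steps above can be sketched roughly as follows. This is a minimal,
self-contained illustration and not Flink's actual implementation: the class
ResourceManagerSketch, the Outcome enum, the connector function, and the
simplified JobMasterGateway interface are all hypothetical names introduced
here, and the leader listener of step 4 is reduced to recording which JobIDs
are being monitored.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;
import java.util.function.Function;

// Hypothetical sketch; none of these types are Flink's real classes.
class ResourceManagerSketch {

    /** Simplified stand-in for a connection to a JobMaster. */
    interface JobMasterGateway {
        String address();
    }

    /** Possible results of a registration attempt (assumed names). */
    enum Outcome { SUCCESS, DECLINED_WRONG_LEADER, DECLINED_UNREACHABLE }

    private final UUID leaderSessionId;
    // Assumed connect logic: yields a gateway only if a JobMaster answers at the address.
    private final Function<String, Optional<JobMasterGateway>> connector;
    private final Map<String, JobMasterGateway> jobMasterGateways = new HashMap<>();
    private final Set<String> monitoredJobs = new HashSet<>();

    ResourceManagerSketch(UUID leaderSessionId,
                          Function<String, Optional<JobMasterGateway>> connector) {
        this.leaderSessionId = leaderSessionId;
        this.connector = connector;
    }

    Outcome registerJobMaster(UUID resourceManagerLeaderId, String jobMasterAddress, String jobId) {
        // Step 1: the fencing token must match this ResourceManager's current leader session.
        if (!leaderSessionId.equals(resourceManagerLeaderId)) {
            return Outcome.DECLINED_WRONG_LEADER;
        }
        // Step 2: validate the address by trying to connect to it.
        Optional<JobMasterGateway> gateway = connector.apply(jobMasterAddress);
        if (!gateway.isPresent()) {
            return Outcome.DECLINED_UNREACHABLE;
        }
        // Step 3: keep the JobID -> JobMasterGateway mapping.
        jobMasterGateways.put(jobId, gateway.get());
        // Step 4: placeholder for starting a leader listener for this JobID.
        monitoredJobs.add(jobId);
        // Step 5: acknowledge the successful registration to the caller.
        return Outcome.SUCCESS;
    }

    boolean isRegistered(String jobId) {
        return jobMasterGateways.containsKey(jobId);
    }

    boolean isMonitored(String jobId) {
        return monitoredJobs.contains(jobId);
    }
}
```

In this sketch a registration carrying a stale resourceManagerLeaderId is
declined before any connection attempt is made, so an outdated ResourceManager
never records a JobMaster it should not own.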
was: The ResourceManager keeps track of all JobManagers that execute jobs.
When a new JobManager registers, its leadership status is checked through the
HighAvailabilityServices. It is then registered at the ResourceManager
using the {{JobID}} provided with the initial registration message.
> ResourceManager registration with JobManager
> --------------------------------------------
>
> Key: FLINK-4537
> URL: https://issues.apache.org/jira/browse/FLINK-4537
> Project: Flink
> Issue Type: Sub-task
> Components: Cluster Management
> Reporter: Maximilian Michels
> Assignee: zhangjing