To add to my post: instead of using the pod IP for the `jobmanager.rpc.address` configuration, we start each JM pod with its fully qualified name, `--host <pod-name>.<stateful-set-name>.ns.svc:8081`, and this address gets persisted to the ConfigMaps. In some scenarios, the leader addresses recorded in the different ConfigMaps might not match each other.
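To make "might not match" concrete, here is a minimal sketch of a consistency check over the leader addresses. It assumes the addresses have already been read out of the ConfigMaps by some other means and are in the `host:port` form used in this post; the component names and the helper `find_disagreement` are hypothetical, not part of Flink or Kubernetes.

```python
def find_disagreement(addresses):
    """Group components by the host of their recorded leader address.

    `addresses` maps a component name to a "host:port" string.
    Returns {} when every component points at the same host,
    otherwise a dict of host -> list of components pointing at it.
    """
    by_host = {}
    for component, address in addresses.items():
        # Compare hosts only; the port is dropped before grouping.
        host = address.rsplit(":", 1)[0]
        by_host.setdefault(host, []).append(component)
    return by_host if len(by_host) > 1 else {}


# Hypothetical snapshot, using the addresses from this post.
observed = {
    "restserver": "jm-0.jm-statefulset.ns.svc:8081",
    "dispatcher": "jm-1.jm-statefulset.ns.svc:8081",
    "resourcemanager": "jm-0.jm-statefulset.ns.svc:8081",
}

for host, components in find_disagreement(observed).items():
    print(host, "<-", ", ".join(components))
```

An empty result means all components agree on one leader host; anything else lists which components point at which JM.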
For example, let's assume I have 3 JMs:

jm-0.jm-statefulset.ns.svc:8081 <-- Leader
jm-1.jm-statefulset.ns.svc:8081
jm-2.jm-statefulset.ns.svc:8081

I have seen the ConfigMaps in the following state:

RestServer ConfigMap Address: jm-0.jm-statefulset.ns.svc:8081
DispatchServer ConfigMap Address: jm-1.jm-statefulset.ns.svc:8081
ResourceManager ConfigMap Address: jm-0.jm-statefulset.ns.svc:8081

Is this the correct behaviour?

In this state, I have then seen the TM pods fail to connect with:

```
java.util.concurrent.CompletionException: org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token not set: Ignoring message RemoteFencedMessage(b870874c1c590d593178811f052a42c9, RemoteRpcInvocation(registerTaskExecutor(TaskExecutorRegistration, Time))) sent to akka.tcp://fl...@jm-1.jm-statefulset.ns.svc:6123/user/rpc/resourcemanager_0 because the fencing token is null.
```

Till explains this error in his comment on FLINK-18367:
https://issues.apache.org/jira/browse/FLINK-18367?focusedCommentId=17141070&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17141070

Has anyone else seen this?

Thanks!

Enrique