Hi Gary, Thanks for your support.
I use Flink 1.7.0. I will try to test without that -n. Below are the JM log (on server .82) and the TM log (on server .88). I'm sorry that I missed the TM log before asking; I thought it would not be relevant.

I just fixed the issue with the connection to ZooKeeper, and the problem was solved.

I have another question: when the JM cannot start/connect to the TM on .88, why didn't it try on .82, where resources are still available?

Thanks and regards,
Averell

Here is the JM log (from /mnt/var/log/hadoop-yarn/.../jobmanager.log on .82). It seems irrelevant: the earlier NoResourceAvailable message was shown in the GUI, but I could not find it in the jobmanager.log file.

2019-01-23 04:15:01.869 [main] WARN org.apache.flink.configuration.Configuration - Config uses deprecated configuration key 'web.port' instead of proper key 'rest.port'
2019-01-23 04:15:03.483 [main] WARN org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Upload directory /tmp/flink-web-08279f45-0244-4c5c-bc9b-299ac59b4068/flink-web-upload does not exist, or has been deleted externally. Previously uploaded files are no longer available.

And here is the TM log:

2019-01-23 11:07:07.479 [main] ERROR o.a.flink.shaded.curator.org.apache.curator.ConnectionState - Connection timed out for connection string (localhost:2181) and timeout (15000) / elapsed (56538)
org.apache.flink.shaded.curator.org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:225)
    at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:94)
    at org.apache.flink.shaded.curator.org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:117)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.NamespaceImpl$1.call(NamespaceImpl.java:90)
    at org.apache.flink.shaded.curator.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.NamespaceImpl.fixForNamespace(NamespaceImpl.java:83)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl.fixForNamespace(CuratorFrameworkImpl.java:594)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:158)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:32)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.cache.NodeCache.reset(NodeCache.java:242)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.cache.NodeCache.start(NodeCache.java:175)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.cache.NodeCache.start(NodeCache.java:154)
    at org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService.start(ZooKeeperLeaderRetrievalService.java:107)
    at org.apache.flink.runtime.taskexecutor.TaskExecutor.start(TaskExecutor.java:277)
    at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.start(TaskManagerRunner.java:168)
    at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:332)
    at org.apache.flink.yarn.YarnTaskExecutorRunner.lambda$run$0(YarnTaskExecutorRunner.java:142)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.YarnTaskExecutorRunner.run(YarnTaskExecutorRunner.java:141)
    at org.apache.flink.yarn.YarnTaskExecutorRunner.main(YarnTaskExecutorRunner.java:75)
2019-01-23 11:07:08.224 [main-SendThread(localhost:2181)] WARN o.a.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
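
Note on the TM log above: the Curator client on the TaskManager host is using the connection string localhost:2181 and gets "Connection refused", which suggests that no ZooKeeper server is listening on that host at that port. A quick way to confirm this kind of problem is to run a small reachability check on each node. The following is only a minimal sketch, not part of Flink: the function names are made up for illustration, it only tests that a TCP connection can be opened (not that the server is a healthy ZooKeeper), and it assumes the usual "host:port,host:port/chroot" connection string form with 2181 as the default port. IPv6 literals are not handled.

```python
import socket

DEFAULT_ZK_PORT = 2181  # assumed default when an entry has no explicit port


def parse_connect_string(connect_string):
    """Split 'host1:2181,host2:2181/chroot' into a list of (host, port) pairs."""
    hosts_part = connect_string.split("/", 1)[0]  # drop an optional chroot suffix
    endpoints = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if sep:
            endpoints.append((host, int(port)))
        else:
            endpoints.append((entry, DEFAULT_ZK_PORT))
    return endpoints


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_quorum(connect_string, timeout=3.0):
    """Map each 'host:port' in the connection string to its reachability."""
    return {
        f"{host}:{port}": is_reachable(host, port, timeout)
        for host, port in parse_connect_string(connect_string)
    }


if __name__ == "__main__":
    # The connection string taken from the TM log above.
    for endpoint, ok in check_quorum("localhost:2181").items():
        print(endpoint, "reachable" if ok else "NOT reachable")
```

Run on the TM host (.88), a "NOT reachable" result for localhost:2181 would match the log. In that case the connection string (normally set through high-availability.zookeeper.quorum in flink-conf.yaml) probably needs to name the host where ZooKeeper actually runs rather than localhost.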