Hi Gary,

Thanks for your support.

I am using Flink 1.7.0. I will try testing without the -n option.
Below are the JM log (from server .82) and the TM log (from server .88). I'm
sorry that I left out the TM log when I first asked; I thought it would
not be relevant. I have just fixed the ZooKeeper connection issue, and that
solved the problem.

That leaves me with another question: when the JM could not start/connect to
the TM on .88, why didn't it try .82, where resources were still available?

Thanks and regards,
Averell

Here is the JM log (from /mnt/var/log/hadoop-yarn/.../jobmanager.log on .82).
It seems irrelevant: the earlier NoResourceAvailable message was shown in the
GUI, but it does not appear in the jobmanager.log file:

2019-01-23 04:15:01.869 [main] WARN  org.apache.flink.configuration.Configuration  - Config uses deprecated configuration key 'web.port' instead of proper key 'rest.port'
2019-01-23 04:15:03.483 [main] WARN  org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint  - Upload directory /tmp/flink-web-08279f45-0244-4c5c-bc9b-299ac59b4068/flink-web-upload does not exist, or has been deleted externally. Previously uploaded files are no longer available.

And here is the TM log:
2019-01-23 11:07:07.479 [main] ERROR o.a.flink.shaded.curator.org.apache.curator.ConnectionState  - Connection timed out for connection string (localhost:2181) and timeout (15000) / elapsed (56538)
org.apache.flink.shaded.curator.org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
        at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:225)
        at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:94)
        at org.apache.flink.shaded.curator.org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:117)
        at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.NamespaceImpl$1.call(NamespaceImpl.java:90)
        at org.apache.flink.shaded.curator.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
        at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.NamespaceImpl.fixForNamespace(NamespaceImpl.java:83)
        at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl.fixForNamespace(CuratorFrameworkImpl.java:594)
        at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:158)
        at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:32)
        at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.cache.NodeCache.reset(NodeCache.java:242)
        at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.cache.NodeCache.start(NodeCache.java:175)
        at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.cache.NodeCache.start(NodeCache.java:154)
        at org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService.start(ZooKeeperLeaderRetrievalService.java:107)
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.start(TaskExecutor.java:277)
        at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.start(TaskManagerRunner.java:168)
        at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:332)
        at org.apache.flink.yarn.YarnTaskExecutorRunner.lambda$run$0(YarnTaskExecutorRunner.java:142)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.YarnTaskExecutorRunner.run(YarnTaskExecutorRunner.java:141)
        at org.apache.flink.yarn.YarnTaskExecutorRunner.main(YarnTaskExecutorRunner.java:75)
2019-01-23 11:07:08.224 [main-SendThread(localhost:2181)] WARN  o.a.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
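
A note for anyone hitting the same error: the connection string in the TM log
is localhost:2181, so the TaskManager was looking for ZooKeeper on its own
host, and the connection was refused. One quick way to check whether a
ZooKeeper address is reachable from a given machine is a plain TCP connect.
The Python sketch below is only an illustration under that assumption; the
function name can_connect and the 3-second timeout are made up for the
example, and a successful connect only shows that something is listening on
the port, not that it is a healthy ZooKeeper.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # localhost:2181 is the connection string that appears in the TM log above.
    print(can_connect("localhost", 2181))
```

Run on the TaskManager host, this would be expected to print False while
nothing is listening on that address. In Flink the ZooKeeper address is
normally taken from high-availability.zookeeper.quorum in flink-conf.yaml, so
that is probably the setting to compare against.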
 



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
