Hi folks,

I've just upgraded our Samza app to v1.5.1. Everything is building and running correctly, and the command to load it into our YARN cluster is initially succeeding. However, as the application is starting in YARN, this message is produced repeatedly until it hits a retry threshold:

"logger_name": "org.apache.hadoop.ipc.Client","message":"Retrying connect to server: localhost/127.0.0.1:8030. Already tried # time(s)"

As I said, this is coming from an application that is running on the YARN server itself, so we seem to be able to use the local yarn-site.xml accurately. When it starts on yarn, though, it seems to be resolving the resource manager to localhost. The final exception information is at the end of this email.

Any ideas?

Cheers,
Malcolm McFarland
Cavulus

Failed to connect to server: localhost/127.0.0.1:8030: retries get failed due to exceeded maximum allowed retries number: 10
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:685)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788)
    at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550)
    at org.apache.hadoop.ipc.Client.call(Client.java:1381)
    at org.apache.hadoop.ipc.Client.call(Client.java:1345)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy48.registerApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
    at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
    at com.sun.proxy.$Proxy49.registerApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:223)
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:214)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:139)
    at org.apache.samza.job.yarn.SamzaYarnAppMasterLifecycle.onInit(SamzaYarnAppMasterLifecycle.scala:42)
    at org.apache.samza.job.yarn.YarnClusterResourceManager.start(YarnClusterResourceManager.java:218)
    at org.apache.samza.clustermanager.ContainerProcessManager.start(ContainerProcessManager.java:230)
    at org.apache.samza.clustermanager.ClusterBasedJobCoordinator.run(ClusterBasedJobCoordinator.java:289)
    at org.apache.samza.clustermanager.JobCoordinatorLaunchUtil.run(JobCoordinatorLaunchUtil.java:83)
    at org.apache.samza.clustermanager.ClusterBasedJobCoordinator.runClusterBasedJobCoordinator(ClusterBasedJobCoordinator.java:529)
    at org.apache.samza.clustermanager.ClusterBasedJobCoordinator.main(ClusterBasedJobCoordinator.java:473)
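
For reference, 8030 is the default port of yarn.resourcemanager.scheduler.address in yarn-default.xml; that property defaults to ${yarn.resourcemanager.hostname}:8030, and yarn.resourcemanager.hostname itself defaults to 0.0.0.0. The script below is a minimal sketch (Python, standard library only) of how one might check which scheduler address a given yarn-site.xml would yield once those two defaults are applied. It is not how Hadoop itself resolves configuration: it ignores ResourceManager HA settings and every other property, and the function names and the hostname in SAMPLE_SITE_XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The two relevant defaults from Hadoop's yarn-default.xml.
DEFAULTS = {
    "yarn.resourcemanager.hostname": "0.0.0.0",
    "yarn.resourcemanager.scheduler.address": "${yarn.resourcemanager.hostname}:8030",
}

# Hypothetical yarn-site.xml content; rm.example.com is a placeholder hostname.
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
</configuration>
""".strip()


def load_properties(xml_text):
    """Parse Hadoop-style <configuration><property> XML into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def scheduler_address(site_props):
    """Return the scheduler address after layering site values over the defaults.

    Simplification: only ${yarn.resourcemanager.hostname} is substituted.
    """
    merged = {**DEFAULTS, **site_props}
    address = merged["yarn.resourcemanager.scheduler.address"]
    return address.replace(
        "${yarn.resourcemanager.hostname}",
        merged["yarn.resourcemanager.hostname"],
    )


if __name__ == "__main__":
    # An empty configuration falls back to the defaults.
    print(scheduler_address(load_properties("<configuration/>")))  # 0.0.0.0:8030
    print(scheduler_address(load_properties(SAMPLE_SITE_XML)))     # rm.example.com:8030
```

Pointing load_properties at the contents of the yarn-site.xml that is actually visible inside the application master container, rather than the copy on the submitting host, might show whether the container is falling back to the default address.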