Harpreet Sawhney created FLINK-4426:
---------------------------------------

             Summary: Unable to create proxy to the ResourceManager
                 Key: FLINK-4426
                 URL: https://issues.apache.org/jira/browse/FLINK-4426
             Project: Flink
          Issue Type: Bug
          Components: ResourceManager
    Affects Versions: 1.0.3
         Environment: Flink 1.0.3 built with MapR (2.7.0-mapr-1602)
            Reporter: Harpreet Sawhney

We have a MapR cluster on which I am trying to run a single Flink job (from the examples) on YARN.

Running the example (./bin/flink run -m yarn-cluster -yn 4 ./examples/batch/WordCount.jar) fails with an "Unable to create proxy to the ResourceManager null" error.

More detailed logs from the flink run are below (server addresses removed):

=========================================================
2016-08-18 23:02:32,249 DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of successful kerberos logins and latency (milliseconds)], valueName=Time, about=, type=DEFAULT, always=false, sampleName=Ops)
2016-08-18 23:02:32,261 DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of failed kerberos logins and latency (milliseconds)], valueName=Time, about=, type=DEFAULT, always=false, sampleName=Ops)
2016-08-18 23:02:32,261 DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[GetGroups], valueName=Time, about=, type=DEFAULT, always=false, sampleName=Ops)
2016-08-18 23:02:32,263 DEBUG org.apache.hadoop.metrics2.impl.MetricsSystemImpl - UgiMetrics, User and group related metrics
2016-08-18 23:02:33,777 DEBUG com.mapr.baseutils.cldbutils.CLDBRpcCommonUtils - init
2016-08-18 23:02:33,793 DEBUG com.mapr.baseutils.JVMProperties - Setting JVM property zookeeper.saslprovider to com.mapr.security.simplesasl.SimpleSaslProvider
2016-08-18 23:02:33,794 DEBUG com.mapr.baseutils.JVMProperties - Setting JVM property zookeeper.sasl.clientconfig to Client_simple
2016-08-18 23:02:33,794 DEBUG com.mapr.baseutils.JVMProperties - Setting JVM property java.security.auth.login.config to /opt/mapr/conf/mapr.login.conf
2016-08-18 23:02:33,797 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.conf.CoreDefaultProperties
2016-08-18 23:02:33,805 DEBUG org.apache.hadoop.security.UserGroupInformation - HADOOP_SECURITY_AUTHENTICATION is set to: SIMPLE
2016-08-18 23:02:33,805 DEBUG org.apache.hadoop.security.UserGroupInformation - Login configuration entry is hadoop_simple
2016-08-18 23:02:33,806 DEBUG org.apache.hadoop.security.UserGroupInformation - authenticationMethod from JAAS configuration:SIMPLE
2016-08-18 23:02:33,867 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.conf.CoreDefaultProperties
2016-08-18 23:02:33,875 DEBUG org.apache.hadoop.security.Groups - Creating new Groups object
2016-08-18 23:02:33,878 DEBUG org.apache.hadoop.util.PerformanceAdvisory - Falling back to shell based
2016-08-18 23:02:33,879 DEBUG org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
2016-08-18 23:02:33,934 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.conf.CoreDefaultProperties
2016-08-18 23:02:34,002 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.yarn.conf.YarnDefaultProperties
2016-08-18 23:02:34,021 DEBUG org.apache.hadoop.util.Shell - setsid exited with exit code 0
2016-08-18 23:02:34,047 DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
2016-08-18 23:02:34,058 DEBUG org.apache.hadoop.security.login.HadoopLoginModule - Priority principal search list is [class javax.security.auth.kerberos.KerberosPrincipal]
2016-08-18 23:02:34,058 DEBUG org.apache.hadoop.security.login.HadoopLoginModule - Additional principal search list is [class com.sun.security.auth.UnixPrincipal]
2016-08-18 23:02:34,058 DEBUG org.apache.hadoop.security.login.HadoopLoginModule - hadoop login
2016-08-18 23:02:34,059 DEBUG org.apache.hadoop.security.login.HadoopLoginModule - hadoop login commit
2016-08-18 23:02:34,098 DEBUG org.apache.hadoop.security.rpcauth.RpcAuthRegistry - Added SIMPLE to registry.
2016-08-18 23:02:34,098 DEBUG org.apache.hadoop.security.rpcauth.RpcAuthRegistry - Added KERBEROS to registry.
2016-08-18 23:02:34,098 DEBUG org.apache.hadoop.security.rpcauth.RpcAuthRegistry - Added TOKEN to registry.
2016-08-18 23:02:34,098 DEBUG org.apache.hadoop.security.rpcauth.RpcAuthRegistry - Added FAKE to registry.
2016-08-18 23:02:34,099 DEBUG org.apache.hadoop.security.UserGroupInformation - Found no authentication principals in subject. Simple?
2016-08-18 23:02:34,099 DEBUG org.apache.hadoop.security.UserGroupInformation - UGI loginUser:hsawhney (auth:SIMPLE)
2016-08-18 23:02:34,100 INFO org.apache.flink.client.CliFrontend - --------------------------------------------------------------------------------
2016-08-18 23:02:34,100 INFO org.apache.flink.client.CliFrontend - Starting Command Line Client (Version: 1.0.3, Rev:<unknown>, Date:<unknown>)
2016-08-18 23:02:34,100 INFO org.apache.flink.client.CliFrontend - Current user: hsawhney
2016-08-18 23:02:34,100 INFO org.apache.flink.client.CliFrontend - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.7/24.95-b01
2016-08-18 23:02:34,100 INFO org.apache.flink.client.CliFrontend - Maximum heap size: 27159 MiBytes
2016-08-18 23:02:34,100 INFO org.apache.flink.client.CliFrontend - JAVA_HOME: (not set)
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - Hadoop version: 2.7.0-mapr-1602
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - JVM Options:
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - -Dlog.file=/home/hsawhney/flink-1.0.3/log/flink-hsawhney-client.log
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - -Dlog4j.configuration=file:/home/hsawhney/flink-1.0.3/conf/log4j-cli.properties
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - -Dlogback.configurationFile=file:/home/hsawhney/flink-1.0.3/conf/logback.xml
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - Program Arguments:
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - run
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - -m
2016-08-18 23:02:34,102 INFO org.apache.flink.client.CliFrontend - yarn-cluster
2016-08-18 23:02:34,103 INFO org.apache.flink.client.CliFrontend - -yn
2016-08-18 23:02:34,103 INFO org.apache.flink.client.CliFrontend - 4
2016-08-18 23:02:34,103 INFO org.apache.flink.client.CliFrontend - ./examples/batch/WordCount.jar
2016-08-18 23:02:34,103 INFO org.apache.flink.client.CliFrontend - Classpath: /home/hsawhney/flink-1.0.3/lib/flink-dist_2.11-1.0.3.jar:/home/hsawhney/flink-1.0.3/lib/flink-python_2.11-1.0.3.jar:/home/hsawhney/flink-1.0.3/lib/log4j-1.2.17.jar:/home/hsawhney/flink-1.0.3/lib/slf4j-log4j12-1.7.7.jar::/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop:/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop:
2016-08-18 23:02:34,103 INFO org.apache.flink.client.CliFrontend - --------------------------------------------------------------------------------
2016-08-18 23:02:34,103 INFO org.apache.flink.client.CliFrontend - Using configuration directory /home/hsawhney/flink-1.0.3/conf
2016-08-18 23:02:34,103 INFO org.apache.flink.client.CliFrontend - Trying to load configuration file
2016-08-18 23:02:34,122 DEBUG org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2016-08-18 23:02:34,122 DEBUG org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2016-08-18 23:02:34,122 DEBUG org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 256
2016-08-18 23:02:34,122 DEBUG org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 512
2016-08-18 23:02:34,122 DEBUG org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2016-08-18 23:02:34,122 DEBUG org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.preallocate, false
2016-08-18 23:02:34,122 DEBUG org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2016-08-18 23:02:34,123 DEBUG org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.web.port, 8081
2016-08-18 23:02:34,336 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.conf.CoreDefaultProperties
2016-08-18 23:02:34,379 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.yarn.conf.YarnDefaultProperties
2016-08-18 23:02:34,382 DEBUG org.apache.hadoop.security.UserGroupInformation - HADOOP_SECURITY_AUTHENTICATION is set to: SIMPLE
2016-08-18 23:02:34,383 DEBUG org.apache.hadoop.security.UserGroupInformation - Login configuration entry is hadoop_simple
2016-08-18 23:02:34,383 DEBUG org.apache.hadoop.security.UserGroupInformation - authenticationMethod from JAAS configuration:SIMPLE
2016-08-18 23:02:34,383 INFO org.apache.flink.client.CliFrontend - Running 'run' command.
2016-08-18 23:02:34,391 INFO org.apache.flink.client.CliFrontend - Building program from JAR file
2016-08-18 23:02:34,404 DEBUG org.apache.flink.client.CliFrontend - User parallelism is set to -1
2016-08-18 23:02:34,404 INFO org.apache.flink.client.CliFrontend - YARN cluster mode detected. Switching Log4j output to console
2016-08-18 23:02:34,419 DEBUG org.apache.hadoop.service.AbstractService - Service: org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state INITED
2016-08-18 23:02:34,438 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.conf.CoreDefaultProperties
2016-08-18 23:02:34,471 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.yarn.conf.YarnDefaultProperties
2016-08-18 23:02:34,536 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.conf.CoreDefaultProperties
2016-08-18 23:02:34,562 DEBUG org.apache.hadoop.conf.Configuration - Loaded org.apache.hadoop.yarn.conf.YarnDefaultProperties
2016-08-18 23:02:34,594 DEBUG com.mapr.login.client.MapRLoginHttpsClient - Entering authenticate if needed.
2016-08-18 23:02:34,594 DEBUG com.mapr.login.client.MapRLoginHttpsClient - Kerberos not configured for this cluster.
2016-08-18 23:02:34,594 DEBUG com.mapr.login.client.MapRLoginHttpsClient - security appears to be off
2016-08-18 23:02:38,613 DEBUG com.mapr.fs.MapRFileSystem - User Info object initialized for user hsawhney with user ID 10031
2016-08-18 23:02:38,632 INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2016-08-18 23:02:38,632 INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=edge-1.users.skynet.quantium.com.au
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.7.0_95
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Oracle Corporation
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/home/hsawhney/flink-1.0.3/lib/flink-dist_2.11-1.0.3.jar:/home/hsawhney/flink-1.0.3/lib/flink-python_2.11-1.0.3.jar:/home/hsawhney/flink-1.0.3/lib/log4j-1.2.17.jar:/home/hsawhney/flink-1.0.3/lib/slf4j-log4j12-1.7.7.jar::/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop:/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop:
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:os.version=3.19.0-56-generic
2016-08-18 23:02:38,633 INFO org.apache.zookeeper.ZooKeeper - Client environment:user.name=hsawhney
2016-08-18 23:02:38,634 INFO org.apache.zookeeper.ZooKeeper - Client environment:user.home=/home/hsawhney
2016-08-18 23:02:38,634 INFO org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/home/hsawhney/flink-1.0.3
2016-08-18 23:02:38,635 INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=[ZOOKEEPERS] sessionTimeout=30000 watcher=com.mapr.util.zookeeper.ZKDataRetrieval@545c1b34
2016-08-18 23:02:38,641 DEBUG org.apache.zookeeper.ClientCnxn - zookeeper.disableAutoWatchReset is false
2016-08-18 23:02:38,668 DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - JAAS loginContext is: Client_simple
2016-08-18 23:02:38,673 INFO org.apache.zookeeper.Login - successfully logged in.
2016-08-18 23:02:38,685 INFO org.apache.zookeeper.client.ZooKeeperSaslClient - Client will use GSSAPI as SASL mechanism.
2016-08-18 23:02:38,685 DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - creating sasl client: client=hsawhney;service=zookeeper;serviceHostname=[ZOOKEEPER]
2016-08-18 23:02:38,700 ERROR org.apache.zookeeper.client.ZooKeeperSaslClient - Exception while trying to create SASL client
java.security.PrivilegedActionException: javax.security.sasl.SaslException: Failure to initialize security context [Caused by GSSException: Invalid name provided (Mechanism level: KrbException: Cannot locate default realm)]
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslClient(ZooKeeperSaslClient.java:283)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.<init>(ZooKeeperSaslClient.java:131)
    at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:949)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1003)
Caused by: javax.security.sasl.SaslException: Failure to initialize security context [Caused by GSSException: Invalid name provided (Mechanism level: KrbException: Cannot locate default realm)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.<init>(GssKrb5Client.java:150)
    at com.sun.security.sasl.gsskerb.FactoryImpl.createSaslClient(FactoryImpl.java:63)
    at javax.security.sasl.Sasl.createSaslClient(Sasl.java:372)
    at org.apache.zookeeper.client.ZooKeeperSaslClient$1.run(ZooKeeperSaslClient.java:288)
    at org.apache.zookeeper.client.ZooKeeperSaslClient$1.run(ZooKeeperSaslClient.java:283)
    ... 6 more
Caused by: GSSException: Invalid name provided (Mechanism level: KrbException: Cannot locate default realm)
    at sun.security.jgss.krb5.Krb5NameElement.getInstance(Krb5NameElement.java:129)
    at sun.security.jgss.krb5.Krb5MechFactory.getNameElement(Krb5MechFactory.java:95)
    at sun.security.jgss.GSSManagerImpl.getNameElement(GSSManagerImpl.java:202)
    at sun.security.jgss.GSSNameImpl.getElement(GSSNameImpl.java:476)
    at sun.security.jgss.GSSNameImpl.init(GSSNameImpl.java:201)
    at sun.security.jgss.GSSNameImpl.<init>(GSSNameImpl.java:170)
    at sun.security.jgss.GSSManagerImpl.createName(GSSManagerImpl.java:137)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.<init>(GssKrb5Client.java:108)
    ... 10 more
2016-08-18 23:02:38,705 INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 192.168.81.101/192.168.81.101:5181. Will attempt to SASL-authenticate using Login Context section 'Client_simple'
2016-08-18 23:02:38,716 INFO org.apache.zookeeper.ClientCnxn - Socket connection established to [ZOOKEEPER], initiating session
2016-08-18 23:02:38,718 DEBUG org.apache.zookeeper.ClientCnxn - Session establishment request sent on [ZOOKEEPER]
2016-08-18 23:02:38,737 INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 192.168.81.101/192.168.81.101:5181, sessionid = 0x55d89dc108fb21, negotiated timeout = 30000
2016-08-18 23:02:38,739 ERROR org.apache.zookeeper.ClientCnxn - SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: saslClient failed to initialize properly: it's null.
2016-08-18 23:02:38,739 INFO com.mapr.util.zookeeper.ZKDataRetrieval - Process path: null. Event state: SyncConnected. Event type: None
2016-08-18 23:02:38,740 INFO com.mapr.util.zookeeper.ZKDataRetrieval - Connected to ZK: [ZOOKEEPERS]
2016-08-18 23:02:38,740 INFO com.mapr.util.zookeeper.ZKDataRetrieval - Process path: null. Event state: null. Event type: None
2016-08-18 23:02:38,741 INFO com.mapr.util.zookeeper.ZKDataRetrieval - Getting serviceData for master node of resourcemanager
2016-08-18 23:02:38,754 ERROR com.mapr.util.zookeeper.ZKDataRetrieval - Can not get children of /services/resourcemanager/master with error: KeeperErrorCode = AuthFailed for /services/resourcemanager/master
2016-08-18 23:02:38,754 ERROR org.apache.hadoop.yarn.client.MapRZKRMFinderUtils - Unable to determine ResourceManager service address from Zookeeper at [ZOOKEEPERS]
2016-08-18 23:02:38,755 ERROR org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider - Unable to create proxy to the ResourceManager null
2016-08-18 23:02:38,755 DEBUG org.apache.hadoop.service.AbstractService - noteFailure java.lang.RuntimeException: Unable to create proxy to the ResourceManager null
2016-08-18 23:02:38,755 INFO org.apache.hadoop.service.AbstractService - Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state STARTED; cause: java.lang.RuntimeException: Unable to create proxy to the ResourceManager null
java.lang.RuntimeException: Unable to create proxy to the ResourceManager null
    at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.getProxy(MapRZKBasedRMFailoverProxyProvider.java:135)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:73)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:64)
    at org.apache.hadoop.io.retry.RetryProxy.create(RetryProxy.java:58)
    at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:95)
    at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:73)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:193)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.flink.yarn.FlinkYarnClientBase.<init>(FlinkYarnClientBase.java:158)
    at org.apache.flink.yarn.FlinkYarnClient.<init>(FlinkYarnClient.java:23)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at java.lang.Class.newInstance(Class.java:383)
    at org.apache.flink.util.InstantiationUtil.instantiate(InstantiationUtil.java:143)
    at org.apache.flink.util.InstantiationUtil.instantiate(InstantiationUtil.java:122)
    at org.apache.flink.client.FlinkYarnSessionCli.getFlinkYarnClient(FlinkYarnSessionCli.java:273)
    at org.apache.flink.client.FlinkYarnSessionCli.createFlinkYarnClient(FlinkYarnSessionCli.java:115)
    at org.apache.flink.client.CliFrontend.getClient(CliFrontend.java:1016)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:315)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1192)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1243)
Caused by: java.lang.RuntimeException: Unable to determine ResourceManager service address from Zookeeper at [ZOOKEEPERS]
    at org.apache.hadoop.yarn.client.MapRZKRMFinderUtils.mapRZkBasedRMFinder(MapRZKRMFinderUtils.java:114)
    at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.updateCurrentRMAddress(MapRZKBasedRMFailoverProxyProvider.java:64)
    at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.getProxy(MapRZKBasedRMFailoverProxyProvider.java:131)
    ... 22 more
2016-08-18 23:02:38,757 DEBUG org.apache.hadoop.service.AbstractService - Service: org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state STOPPED
2016-08-18 23:02:38,758 ERROR org.apache.flink.client.CliFrontend - Error while running the command.
java.lang.RuntimeException: Could not instantiate type 'org.apache.flink.yarn.FlinkYarnClient' Most likely the constructor (or a member variable initialization) threw an exception: Unable to create proxy to the ResourceManager null
    at org.apache.flink.util.InstantiationUtil.instantiate(InstantiationUtil.java:156)
    at org.apache.flink.util.InstantiationUtil.instantiate(InstantiationUtil.java:122)
    at org.apache.flink.client.FlinkYarnSessionCli.getFlinkYarnClient(FlinkYarnSessionCli.java:273)
    at org.apache.flink.client.FlinkYarnSessionCli.createFlinkYarnClient(FlinkYarnSessionCli.java:115)
    at org.apache.flink.client.CliFrontend.getClient(CliFrontend.java:1016)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:315)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1192)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1243)
Caused by: java.lang.RuntimeException: Unable to create proxy to the ResourceManager null
    at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.getProxy(MapRZKBasedRMFailoverProxyProvider.java:135)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:73)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:64)
    at org.apache.hadoop.io.retry.RetryProxy.create(RetryProxy.java:58)
    at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:95)
    at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:73)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:193)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.flink.yarn.FlinkYarnClientBase.<init>(FlinkYarnClientBase.java:158)
    at org.apache.flink.yarn.FlinkYarnClient.<init>(FlinkYarnClient.java:23)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at java.lang.Class.newInstance(Class.java:383)
    at org.apache.flink.util.InstantiationUtil.instantiate(InstantiationUtil.java:143)
    ... 7 more
Caused by: java.lang.RuntimeException: Unable to determine ResourceManager service address from Zookeeper at [ZOOKEEPERS]
    at org.apache.hadoop.yarn.client.MapRZKRMFinderUtils.mapRZkBasedRMFinder(MapRZKRMFinderUtils.java:114)
    at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.updateCurrentRMAddress(MapRZKBasedRMFailoverProxyProvider.java:64)
    at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.getProxy(MapRZKBasedRMFailoverProxyProvider.java:131)
    ... 22 more
=========================================================

On the ZooKeeper side, these are the logs I see:

2016-08-19 09:02:38,724 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:5181:ZooKeeperServer@839] - Client attempting to establish new session at /172.31.166.10:58389
2016-08-19 09:02:38,738 [myid:0] - INFO [CommitProcessor:0:ZooKeeperServer@595] - Established session 0x55d89dc108fb21 with negotiated timeout 30000 for client /172.31.166.10:58389
2016-08-19 09:02:38,752 [myid:0] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:5181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x55d89dc108fb21, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:745)
2016-08-19 09:02:38,753 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:5181:NIOServerCnxn@1001] - Closed socket connection for client /172.31.166.10:58389 which had sessionid 0x55d89dc108fb21

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)