Istvan Vajnorak created YARN-7550:
-------------------------------------

             Summary: Allow YARN HA to be fault tolerant on missing DNS entries
                 Key: YARN-7550
                 URL: https://issues.apache.org/jira/browse/YARN-7550
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: client
    Affects Versions: 2.6.5
            Reporter: Istvan Vajnorak


If one of the ResourceManager hosts is missing from DNS for some reason, the 
HA configuration of ClientRMProxy is not fault tolerant enough to survive 
this.

Even in the face of DNS resolution issues, the token service call 
(getTokenService) should succeed as long as at least one of the RMs can be 
resolved. The relevant code is at: 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java#L153

We can safely assume that if one of the RMs is missing from DNS, it can't be 
the active one anyway, so client jobs can still be submitted while the DNS 
issues are being fixed.
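
To illustrate the proposed behavior, here is a minimal standalone sketch. It is not the actual ClientRMProxy code and has no Hadoop dependencies; the class name TolerantTokenService, the Resolver interface and the dnsResolve helper are all hypothetical names introduced for this example. The idea is to skip RM addresses that fail to resolve and to fail only when none of them resolve.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds a comma-separated token service string from
// the configured RM addresses, tolerating hosts that are missing from DNS.
class TolerantTokenService {

    // Hypothetical abstraction over name resolution so it can be stubbed.
    interface Resolver {
        String resolve(String host) throws UnknownHostException;
    }

    // Default resolver backed by the JVM's normal DNS lookup.
    static String dnsResolve(String host) throws UnknownHostException {
        return InetAddress.getByName(host).getHostAddress();
    }

    static String buildTokenService(List<String> rmHostPorts, Resolver resolver) {
        List<String> services = new ArrayList<>();
        List<String> unresolved = new ArrayList<>();
        for (String hostPort : rmHostPorts) {
            int idx = hostPort.lastIndexOf(':');
            if (idx < 0) {
                throw new IllegalArgumentException("Expected host:port, got " + hostPort);
            }
            String host = hostPort.substring(0, idx);
            String port = hostPort.substring(idx + 1);
            try {
                services.add(resolver.resolve(host) + ":" + port);
            } catch (UnknownHostException e) {
                // Assumption: an RM that cannot be resolved cannot be the
                // active one, so it is skipped instead of failing the call.
                unresolved.add(host);
            }
        }
        if (services.isEmpty()) {
            // Nothing resolved at all: keep the current failure behavior.
            throw new IllegalArgumentException(
                "No ResourceManager address could be resolved: " + unresolved);
        }
        return String.join(",", services);
    }
}
```

A caller would pass the configured RM addresses together with TolerantTokenService::dnsResolve. A real patch would also want to log the skipped hosts so the DNS problem does not go unnoticed.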

A sample exception when one of the entries is missing:

{code}
17/11/02 18:20:35 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl failed in state STARTED; cause: java.lang.IllegalArgumentException: java.net.UnknownHostException: some.dns.entry
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
    at org.apache.hadoop.yarn.client.ClientRMProxy.getTokenService(ClientRMProxy.java:153)
    at org.apache.hadoop.yarn.client.ClientRMProxy.getAMRMTokenService(ClientRMProxy.java:138)
    at org.apache.hadoop.yarn.client.ClientRMProxy.setAMRMTokenService(ClientRMProxy.java:80)
    at org.apache.hadoop.yarn.client.ClientRMProxy.getRMAddress(ClientRMProxy.java:99)
    at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.getProxyInternal(ConfiguredRMFailoverProxyProvider.java:76)
    at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.getProxy(ConfiguredRMFailoverProxyProvider.java:90)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:75)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:66)
    at org.apache.hadoop.io.retry.RetryProxy.create(RetryProxy.java:58)
    at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:95)
    at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.serviceStart(AMRMClientImpl.java:186)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.spark.deploy.yarn.YarnRMClient.register(YarnRMClient.scala:65)
    at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:359)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:435)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:256)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:774)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:772)
    at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:795)
    at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.net.UnknownHostException: some.dns.entry
    ... 28 more
{code}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
