[jira] [Commented] (YARN-7319) java.net.UnknownHostException when trying contact node by hostname

2019-04-11 Thread Liu Shaohui (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815232#comment-16815232
 ] 

Liu Shaohui commented on YARN-7319:
---

+1 for encountering the same problem when deploying hadoop on k8s for elastic 
scaling.

I think we should add a option to make hadoop running on env without dns

> java.net.UnknownHostException when trying contact node by hostname
> --
>
> Key: YARN-7319
> URL: https://issues.apache.org/jira/browse/YARN-7319
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Evgeny Makarov
>Priority: Major
>
> I'm trying to setup Hadoop on Kubernetes cluster with following setup:
> Hadoop master is k8s pod
> Each hadoop slave is additional k8s pod
> All communication is being processed on IP based manned. In HDFS I have 
> setting of dfs.namenode.datanode.registration.ip-hostname-check set to false 
> and all works fine, however same option missing for YARN manager. 
> Here part of hadoop-master log when trying to submit simple word-count job:
> 2017-10-12 09:00:25,005 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  Error trying to assign container token and NM token to an allocated 
> container container_1507798393049_0001_01_01
> java.lang.IllegalArgumentException: java.net.UnknownHostException: 
> hadoop-slave-743067341-hqrbk
> at 
> org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
> at 
> org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:258)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:220)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens(SchedulerApplicationAttempt.java:454)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getAllocation(FiCaSchedulerApp.java:269)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:988)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:971)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:964)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:789)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:105)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:795)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:776)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.UnknownHostException: hadoop-slave-743067341-hqrbk
> ... 19 more
> As can be seen, host hadoop-slave-743067341-hqrbk is unreachable. Adding 
> record to /ets/hosts of master will solve the problem, however its not an 
> option in Kubernetes environment. There is should be a way to resolve nodes 
> by IP address



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7319) java.net.UnknownHostException when trying contact node by hostname

2017-10-12 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202708#comment-16202708
 ] 

Daryn Sharp commented on YARN-7319:
---

bq. java.lang.IllegalArgumentException: java.net.UnknownHostException: 
hadoop-slave-743067341-hqrbk

I'm a bit confused.  Why is the node resolving itself as 
"hadoop-slave-743067341-hqrbk"?  I believe that's the hostname self-reported 
during registration.  If this is truly an ip-only environment, presumably that 
means the junk hostname is only in that node's /etc/hosts, but not in 
/etc/hosts of the other nodes?  I understand not having reverse dns.  However 
not having forward dns but assigning a private hostname is a bit obtuse, might 
as well not let the host resolve itself if nobody else can resolve it...

Did you try setting {{hadoop.security.token.service.use_ip=false}} per the 
javadocs on buildTokenService?  That will get you past the exception while 
generating the container token.  It's likely the client won't be able to locate 
the token though – ie. token will have a host, but if the env is ip-only, the 
client must use an ip to connect and won't be able to match the ip with the 
hostname in the token.



> java.net.UnknownHostException when trying contact node by hostname
> --
>
> Key: YARN-7319
> URL: https://issues.apache.org/jira/browse/YARN-7319
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Evgeny Makarov
>
> I'm trying to setup Hadoop on Kubernetes cluster with following setup:
> Hadoop master is k8s pod
> Each hadoop slave is additional k8s pod
> All communication is being processed on IP based manned. In HDFS I have 
> setting of dfs.namenode.datanode.registration.ip-hostname-check set to false 
> and all works fine, however same option missing for YARN manager. 
> Here part of hadoop-master log when trying to submit simple word-count job:
> 2017-10-12 09:00:25,005 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  Error trying to assign container token and NM token to an allocated 
> container container_1507798393049_0001_01_01
> java.lang.IllegalArgumentException: java.net.UnknownHostException: 
> hadoop-slave-743067341-hqrbk
> at 
> org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
> at 
> org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:258)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:220)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens(SchedulerApplicationAttempt.java:454)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getAllocation(FiCaSchedulerApp.java:269)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:988)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:971)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:964)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:789)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:105)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:795)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:776)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.UnknownHostException: hadoop-slave-743067341-hqrbk
> ... 19 more
> As can be seen, host hadoop-slave-743067341-hqrbk is unreachable. Adding 

[jira] [Commented] (YARN-7319) java.net.UnknownHostException when trying contact node by hostname

2017-10-12 Thread Elek, Marton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202642#comment-16202642
 ] 

Elek, Marton commented on YARN-7319:


I think it's a valid report even if it's not a bug but rather a 
improvements/feature request. I understand that the current Yarn couldn't be 
start without dns system, but it's a valid request to make it usable (at least 
without kerberos) in an ip only environment (such as kubernetes without 
statefulset). It's not just about kubernetes but there are cloud providers 
where the dns (or at least the reverse dns) is not guaranteed. Therefor a 
setting to use ip only yarn without dns/reversedns would help to use Hadoop in 
these environments (even if there are limitations of this approach).  

> java.net.UnknownHostException when trying contact node by hostname
> --
>
> Key: YARN-7319
> URL: https://issues.apache.org/jira/browse/YARN-7319
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Evgeny Makarov
>
> I'm trying to setup Hadoop on Kubernetes cluster with following setup:
> Hadoop master is k8s pod
> Each hadoop slave is additional k8s pod
> All communication is being processed on IP based manned. In HDFS I have 
> setting of dfs.namenode.datanode.registration.ip-hostname-check set to false 
> and all works fine, however same option missing for YARN manager. 
> Here part of hadoop-master log when trying to submit simple word-count job:
> 2017-10-12 09:00:25,005 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  Error trying to assign container token and NM token to an allocated 
> container container_1507798393049_0001_01_01
> java.lang.IllegalArgumentException: java.net.UnknownHostException: 
> hadoop-slave-743067341-hqrbk
> at 
> org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
> at 
> org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:258)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:220)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens(SchedulerApplicationAttempt.java:454)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getAllocation(FiCaSchedulerApp.java:269)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:988)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:971)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:964)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:789)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:105)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:795)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:776)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.UnknownHostException: hadoop-slave-743067341-hqrbk
> ... 19 more
> As can be seen, host hadoop-slave-743067341-hqrbk is unreachable. Adding 
> record to /ets/hosts of master will solve the problem, however its not an 
> option in Kubernetes environment. There is should be a way to resolve nodes 
> by IP address



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: