Data node not able to contact the resource manager

Daniel Santos Mon, 05 Aug 2019 07:15:44 -0700

Hello,

I have a cluster with one machine holding the name nodes (primary and 
secondary), a YARN node (resource manager), and four data nodes.
I am running Hadoop 2.7.0.


When I submit a job to the cluster, I can see it on the scheduler web page. If I 
go to the container page and check the logs, the syslog file ends with the 
following:

2019-08-05 14:58:05,962 INFO [main] org.apache.hadoop.ipc.Client: Retrying 
connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2019-08-05 14:58:06,962 INFO [main] org.apache.hadoop.ipc.Client: Retrying 
connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2019-08-05 14:58:07,963 INFO [main] org.apache.hadoop.ipc.Client: Retrying 
connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2019-08-05 14:58:08,965 INFO [main] org.apache.hadoop.ipc.Client: Retrying 
connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2019-08-05 14:58:09,966 INFO [main] org.apache.hadoop.ipc.Client: Retrying 
connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2019-08-05 14:58:10,967 INFO [main] org.apache.hadoop.ipc.Client: Retrying 
connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2019-08-05 14:58:11,968 INFO [main] org.apache.hadoop.ipc.Client: Retrying 
connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2019-08-05 14:58:12,969 INFO [main] org.apache.hadoop.ipc.Client: Retrying 
connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)

I have checked the configuration of the resource manager and of the data node 
where the application is running, and both show the property 
yarn.resourcemanager.hostname that I set in yarn-site.xml.
I have disabled IPv6 on the YARN machine, as some posts on the internet 
suggested. The configuration files are the same on every node of the 
cluster.
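
For illustration, a yarn-site.xml entry of the kind described above would look 
roughly like the sketch below. The hostname rm-host.example.com is a 
placeholder and not a value taken from this cluster. As background, port 8030 
is the default port of yarn.resourcemanager.scheduler.address, whose default 
value is ${yarn.resourcemanager.hostname}:8030, and 
yarn.resourcemanager.hostname itself defaults to 0.0.0.0. The address 
0.0.0.0:8030 in the log is therefore consistent with the container process 
resolving the default value rather than the configured hostname, although the 
log alone does not prove this.

```xml
<!-- yarn-site.xml: illustrative fragment only, not the actual file from this cluster -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <!-- Placeholder hostname (assumption); use a name every node can resolve -->
    <value>rm-host.example.com</value>
  </property>
</configuration>
```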

Still, I am getting these errors, and the application ends with a timeout.

What am I doing wrong?

Thanks
Regards
