Look for errors in your DataNode log file. It's in $HADOOP_HOME/logs by
default.
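For example, something like this on datanode1 usually points at the cause (the log file name is assumed to match the .out name shown further down in your output):

# run on datanode1 as the hadoop user
grep -iE 'error|fatal|exception' $HADOOP_HOME/logs/hadoop-hadoop-datanode-datanode1.log | tail -n 20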



On Feb 23, 2018, at 12:55 AM, Butler, RD, Mnr <17647...@sun.ac.za> wrote:

To whom it may concern

I have two computers, neither in a VM environment: the one I work on (CentOS
installed) and a second computer (also CentOS Server) that acts as the
datanode. I want to create a multi-node cluster with these computers. I have
connected the computers together directly to test for possible network issues
(ports etc.) and found that not to be the issue. I also followed the guides at
https://tecadmin.net/set-up-hadoop-multi-node-cluster-on-centos-redhat/# and
https://dwbi.org/etl/bigdata/183-setup-hadoop-cluster. I have created a
'hadoop' user on both machines, with the necessary permissions, and
established password-less SSH access between them.
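(A quick way to confirm the password-less SSH from the master is something like:)

# run on master as the hadoop user; should print "datanode1" without a password prompt
ssh datanode1 hostname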




The hostnames of the computers are:
1. NameNode (main computer): master
2. DataNode (the server): datanode1

My /etc/hosts file is as follows (showing 'computerIP' in place of the actual IPs):
computerIP master
computerIP datanode1
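(For the network test mentioned above, the kind of check I mean by 'ports etc.' is roughly the following; nc is assumed to be installed on the DataNode:)

# run on datanode1: confirm the master resolves and the NameNode RPC port is reachable once the NameNode is up
ping -c 1 master
nc -vz master 8020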

My .xml file configurations on the NameNode are:
1. core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:8020/</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
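(A quick way to confirm that both nodes actually pick up this value is:)

# run on master and on datanode1; both should print hdfs://master:8020/
hdfs getconf -confKey fs.defaultFS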
2. hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/volume/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/volume/datanode</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:/opt/volume/namesecondary</value>
</property>
<property>
<name>dfs.replication</name>
<!-- value assumed here; with a single DataNode this is typically 1 -->
<value>1</value>
</property>
</configuration>
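(I also check that the data directory exists on the DataNode and is owned by the hadoop user, roughly like this:)

# run from master; the directory must exist and be writable by the hadoop user
ssh datanode1 'ls -ld /opt/volume/datanode'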
3. mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user/app</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Djava.security.egd=file:/dev/../dev/urandom</value>
</property>
</configuration>
4. yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.resourcemanager.bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>yarn.nodemanager.bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>file:/opt/volume/local</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>file:/opt/volume/yarn/log</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>hdfs://master:8020/var/log/hadoop-yarn/apps</value>
</property>
</configuration>
5. JAVA_HOME (Where java is located):
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
6. Slaves file:
datanode1
7. Masters file:
master
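(Both files are in $HADOOP_HOME/etc/hadoop; a quick sanity check that start-dfs.sh will start a DataNode on the right host is:)

# run on master; should print exactly "datanode1"
cat $HADOOP_HOME/etc/hadoop/slaves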


My .bashrc file is as follows:
export JAVA_HOME=/usr/lib/java-1.8.0
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/opt/hadoop/hadoop-2.8.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export CLASSPATH=$CLASSPATH:/usr/local/hadoop/lib/*:.

export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.egd=file:/dev/../dev/urandom"

The permissions are as follows on both machines (from terminal):

[hadoop@master hadoop]$ ls -al /opt
total 0
drwxr-xr-x. 5 hadoop hadoop 44 Feb 15 16:05 .
dr-xr-xr-x. 17 root root 242 Feb 21 11:38 ..
drwxr-xr-x. 3 hadoop hadoop 53 Feb 15 16:00 hadoop
drwxr-xr-x. 2 hadoop hadoop 6 Sep 7 01:11 rh
drwxr-xr-x. 7 hadoop hadoop 84 Feb 20 11:27 volume
For the DataNode:
[hadoop@datanode1 ~]$ ls -al /opt
total 0
drwxrwxrwx. 4 hadoop hadoop 34 Feb 20 11:06 .
dr-xr-xr-x. 17 root root 242 Feb 19 16:13 ..
drwxr-xr-x. 3 hadoop hadoop 53 Feb 20 11:07 hadoop
drwxrwxrwx. 5 hadoop hadoop 59 Feb 21 09:53 volume


So when I format the NameNode with 'hdfs namenode -format', the output reports
that the NameNode has been formatted on 'master'.
I then start HDFS with $HADOOP_HOME/sbin/start-dfs.sh and get the following
output:

[hadoop@master hadoop]$ $HADOOP_HOME/sbin/start-dfs.sh
Starting namenodes on [master]
master: starting namenode, logging to 
/opt/hadoop/hadoop-2.8.3/logs/hadoop-hadoop-namenode-master.out
datanode1: starting datanode, logging to 
/opt/hadoop/hadoop-2.8.3/logs/hadoop-hadoop-datanode-datanode1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to 
/opt/hadoop/hadoop-2.8.3/logs/hadoop-hadoop-secondarynamenode-master.out
This shows the DataNode being started, yet when I go to the NameNode web UI on
port 50070 I find that no DataNode storage is configured. I then stop
everything with $HADOOP_HOME/sbin/stop-dfs.sh, only to find that the DataNode
did not actually start in the first place:
[hadoop@master hadoop]$ $HADOOP_HOME/sbin/stop-dfs.sh
Stopping namenodes on [master]
master: stopping namenode
datanode1: no datanode to stop
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
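(A process listing on datanode1 at this point, along these lines, is another way to check whether a DataNode JVM is running:)

# run on master; jps is part of the JDK, adjust the path if it is not on the non-interactive PATH
ssh datanode1 jps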
This happens even when the computers are directly connected. I have no idea
why the DataNode is not starting, and I hope someone can help. I need this for
my master's thesis.
Thanks!

Regards
Rhett

