Hi All,

I have suddenly started facing an issue on my Hadoop cluster. It seems that HTTP 
requests to port 50070 on the NameNode are not working properly.
The cluster has been operating for several days. Recently we have also been 
unable to see the dfshealth.jsp page from the web console.

Problems:
1. http://<Hostname>:50070/dfshealth.jsp shows the following error:

HTTP ERROR: 404
Problem accessing /. Reason:
NOT_FOUND
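
For reference, here is a small self-contained Python sketch (it does not touch the cluster; the toy server is only a stand-in) of why this symptom means "the listener is up, but nothing is deployed at that path": any HTTP server answers 404 when no handler is registered for the requested resource.

```python
import http.client
import http.server
import threading

def http_status(host, port, path):
    """Return the HTTP status code for a GET of path on host:port."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()

# Toy stand-in for the NameNode's Jetty: the listener is up, but
# nothing is served at /dfshealth.jsp, so the request comes back 404.
server = http.server.HTTPServer(("127.0.0.1", 0),
                                http.server.SimpleHTTPRequestHandler)
port = server.server_address[1]
threading.Thread(target=server.handle_request, daemon=True).start()

status = http_status("127.0.0.1", port, "/dfshealth.jsp")
print(status)  # 404 -- the port answers, but the page is not there
```

So the 404 would be consistent with Jetty binding port 50070 successfully (as the NameNode logs below show) while its web application failed to load; I have seen a missing or wrong webapps directory on the classpath mentioned as one possible cause, but I am not sure that is the case here.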

2. The SecondaryNameNode (SNN) is not able to roll edits:
ERROR in the SecondaryNameNode log:
java.io.FileNotFoundException: http://HOSTNAME:50070/getimage?getimage=1
       at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1401)
       at 
org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:160)
       at 
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$3.run(SecondaryNameNode.java:347)
       at 
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$3.run(SecondaryNameNode.java:336)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:416)
       at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
       at 
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.downloadCheckpointFiles(SecondaryNameNode.java:336)
       at 
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:411)
       at 
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:312)
       at 
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:275)

ERROR in the NameNode log:
2014-01-31 18:15:12,046 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
10.139.9.231
2014-01-31 18:15:12,046 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log, 
edits.new files already exists in all healthy directories:
  /usr/lib/hadoop/storage/dfs/nn/current/edits.new
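
If I understand the message correctly, the roll is refused because a leftover edits.new from an earlier failed checkpoint is still present. A rough sketch of that condition (not Hadoop's actual code, just an illustration):

```python
import os
import tempfile

def can_roll_edits(name_dir):
    """Rough sketch of the check behind the WARN above: a roll creates
    current/edits.new, so a leftover copy from a failed checkpoint
    blocks every subsequent roll attempt."""
    return not os.path.exists(os.path.join(name_dir, "current", "edits.new"))

# Simulate a name dir with a stale edits.new left behind.
nn = tempfile.mkdtemp()
os.makedirs(os.path.join(nn, "current"))
open(os.path.join(nn, "current", "edits.new"), "w").close()

print(can_roll_edits(nn))  # False -- matches "Cannot roll edit log"
```

Which would explain the loop I am in: the SNN checkpoint fails (its getimage call 404s against the broken web UI), leaves edits.new behind, and every later roll is then blocked.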



NameNode log lines suggesting that the web server did start successfully on port 50070:
2014-01-31 14:42:35,208 INFO org.apache.hadoop.http.HttpServer: Port returned 
by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the 
listener on 50070
2014-01-31 14:42:35,209 INFO org.apache.hadoop.http.HttpServer: 
listener.getLocalPort() returned 50070 
webServer.getConnectors()[0].getLocalPort() returned 50070
2014-01-31 14:42:35,209 INFO org.apache.hadoop.http.HttpServer: Jetty bound to 
port 50070
2014-01-31 14:42:35,378 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
Web-server up at: HOSTNAME:50070


hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>

    <property>
        <name>dfs.name.dir</name>
        <value>/usr/lib/hadoop/storage/dfs/nn</value>
    </property>

    <property>
        <name>dfs.data.dir</name>
        <value>/usr/lib/hadoop/storage/dfs/dn</value>
    </property>

    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>dfs.http.address</name>
        <value>HOSTNAME:50070</value>
    </property>

    <property>
        <name>dfs.secondary.http.address</name>
        <value>HOSTNAME:50090</value>
    </property>

    <property>
        <name>fs.checkpoint.dir</name>
        <value>/usr/lib/hadoop/storage/dfs/snn</value>
    </property>

</configuration>
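
For completeness, a small hypothetical helper showing how the web UI address is taken from this config (the HDFS_SITE string is just the relevant fragment from my hdfs-site.xml above): the host part of dfs.http.address is what the web server binds to, so HOSTNAME must resolve to the right interface.

```python
import xml.etree.ElementTree as ET

# Fragment of the hdfs-site.xml above; HOSTNAME is the placeholder
# used throughout this mail.
HDFS_SITE = """<configuration>
  <property><name>dfs.http.address</name><value>HOSTNAME:50070</value></property>
</configuration>"""

def get_conf(xml_text, key):
    """Return the <value> for the given <name> in a Hadoop *-site.xml."""
    for prop in ET.fromstring(xml_text).iter("property"):
        if prop.findtext("name") == key:
            return prop.findtext("value")
    return None

host, _, port = get_conf(HDFS_SITE, "dfs.http.address").partition(":")
print(host, port)  # HOSTNAME 50070
```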


/etc/hosts (note: I have also tried commenting out the 127.0.0.1 entry in the 
hosts file, but the issue was not resolved):

127.0.0.1       localhost

IP1    Hostname1         # Namenode- vm01 - itself
IP2    Hostname2         # DataNode- vm02
........

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
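
To double-check the loopback concern, a generic sketch along these lines can tell whether a name resolves to 127.x (which would make the web UI reachable only from the NameNode itself). It is run with localhost here purely as a demo; on the cluster one would pass Hostname1.

```python
import socket

def resolves_to_loopback(hostname):
    """True if hostname resolves to a loopback address; if the NameNode
    host does, the 50070 web UI binds to 127.0.0.1 and is unreachable
    from other machines."""
    return socket.gethostbyname(hostname).startswith("127.")

# On the cluster you would pass Hostname1; localhost is just a demo.
print(resolves_to_loopback("localhost"))  # True
```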


Note: all Hadoop daemons are running fine and jobs are completing properly.

How can I resolve this issue? I have tried many options suggested on different 
forums but am still facing the same problem.
I believe this could become a major problem later, since my edits are not being 
rolled into the fsimage; that could mean data loss in case of a failure.

Please suggest

Thanks
Stuti





