Further info, from the log of the secondary NN:

INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Posted
URL
AAA.xxx.yyy.zzz:50070putimage=1&port=50090&machine=0.0.0.0&token=-32:1245372967:0:1343222881000:1343222711330

ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode:
Exception in doCheckpoint:

ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode:
java.io.FileNotFoundException:
http://AAA.xxx.yyy.zzz:50070/getimage?putimage=1&port=50090&machine=0.0.0.0&token=-32:1245372967:0:1343222881000:1343222711330

Where Primary NN is AAA.xxx.yyy.zzz.

The 0.0.0.0 embedded in the URL looks suspicious, but I have no idea
what file it is telling me it is missing.
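
A minimal sketch for pulling the callback address out of the posted URL, assuming (this is a reading of the logs above, not something they confirm) that the machine= and port= parameters name the address the NN connects back to in order to fetch the image. If that reading is right, a wildcard such as 0.0.0.0 would send the NN to itself on port 50090, which would fit the "Connection refused" messages. In Hadoop 1.x this value appears to come from dfs.secondary.http.address, whose default is 0.0.0.0:50090. The helper name callback_endpoint is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Addresses that mean "all interfaces" and are not reachable as a remote target.
WILDCARD_HOSTS = {"0.0.0.0", "::", ""}


def callback_endpoint(posted_url):
    """Return (machine, port, is_wildcard) parsed from a putimage URL.

    Hypothetical helper: it only inspects the query string and makes
    no network connection.
    """
    query = parse_qs(urlsplit(posted_url).query, keep_blank_values=True)
    machine = query.get("machine", [""])[0]
    port = int(query["port"][0])
    return machine, port, machine in WILDCARD_HOSTS


# URL copied from the FileNotFoundException in the secondary NN log.
url = (
    "http://AAA.xxx.yyy.zzz:50070/getimage?putimage=1&port=50090"
    "&machine=0.0.0.0&token=-32:1245372967:0:1343222881000:1343222711330"
)

machine, port, is_wildcard = callback_endpoint(url)
print(machine, port, is_wildcard)  # prints: 0.0.0.0 50090 True
```

If is_wildcard comes back True, one thing to try is setting dfs.secondary.http.address on the secondary to its real hostname or IP and restarting it, then checking whether the posted URL changes.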


On 08/15/2012 12:39 PM, Terry Healy wrote:
> This situation continues, and the error is logged every 10 minutes. I even
> tried moving the secondary NN function to a different node. I have run
> tcpdump but cannot isolate the "Connection refused" issue. Am I correct in
> assuming that the NN will try to connect to the SNN on port 50090?
> 
> Any way out? Current partial dump below.
> 
> 2012-08-15 12:36:09,422 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from
> xxx.yyy.254.254
> 
> 2012-08-15 12:36:09,423 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit
> log, edits.new files already exists in all healthy directories:
>   /home/thealy/hdfs/name/current/edits.new
> 
> 2012-08-15 12:36:09,879 ERROR
> org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:thealy cause:java.net.ConnectException:
> Connection refused
> 
> 2012-08-15 12:36:09,879 ERROR
> org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:thealy cause:java.net.ConnectException:
> Connection refused
> 
> 2012-08-15 12:36:09,881 WARN org.mortbay.log: /getimage:
> java.io.IOException: GetImage failed. java.net.ConnectException:
> Connection refused
> 
> 
> 
> On 07/09/2012 01:23 PM, Terry Healy wrote:
>> Any suggestions on how to clear the error? stop-all / start-all had no
>> effect.
>>
>> On 07/03/2012 04:22 PM, Brandon Li wrote:
>>> With 1.0.2, only one checkpoint process is executed at a time. When the
>>> namenode gets an overlapping checkpointing request, it checks edits.new
>>> in its storage directories. If all of them have this file, the namenode
>>> concludes the previous checkpoint process is not done yet and prints the
>>> warning message you've seen.
>>>
>>> Brandon
>>>
>>> On Tue, Jul 3, 2012 at 10:56 AM, Terry Healy <the...@bnl.gov> wrote:
>>>
>>>     Running Apache Hadoop 1.0.2.
>>>
>>>     The NN log is reporting that it cannot "roll the edit log" from the
>>>     secondary NN. The SecondaryNameNode is running on the system referred to
>>>     as xxx.yyy.254.238 in the log snippet below.
>>>
>>>     From the NN, I can connect to the Secondary via ssh as the user. Any
>>>     suggestions as to what I have got wrong here?
>>>
>>>     thanks,
>>>
>>>     Terry
>>>
>>>     INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log
>>>     from xxx.yyy.254.238
>>>     WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll
>>>     edit log, edits.new files already exists in all healthy directories:
>>>       /home/[user]/hdfs/name/current/edits.new
>>>
>>>     ERROR org.apache.hadoop.security.UserGroupInformation:
>>>     PriviledgedActionException as:[user] cause:java.net.ConnectException:
>>>     Connection refused
>>>     ERROR org.apache.hadoop.security.UserGroupInformation:
>>>     PriviledgedActionException as:[user] cause:java.net.ConnectException:
>>>     Connection refused
>>>
>>>     WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed.
>>>     java.net.ConnectException: Connection refused
>>>             at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>             at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>>>     ....
>>>
>>>
>>>     --
>>>     Terry Healy / the...@bnl.gov
>>>     Cyber Security Operations
>>>     Brookhaven National Laboratory
>>>     Building 515, Upton N.Y. 11973
>>>
>>>
>>
> 

-- 
Terry Healy / the...@bnl.gov
Cyber Security Operations
Brookhaven National Laboratory
Building 515, Upton N.Y. 11973
