Further info:

From the log of the secondary NN:

INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Posted URL AAA.xxx.yyy.zzz:50070putimage=1&port=50090&machine=0.0.0.0&token=-32:1245372967:0:1343222881000:1343222711330

ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.FileNotFoundException: http://AAA.xxx.yyy.zzz:50070/getimage?putimage=1&port=50090&machine=0.0.0.0&token=-32:1245372967:0:1343222881000:1343222711330

Where the primary NN is AAA.xxx.yyy.zzz. The 0.0.0.0 embedded in the URL looks suspicious, but I have no idea which file it is telling me is missing.

On 08/15/2012 12:39 PM, Terry Healy wrote:
> This situation continues and reports every 10 minutes. I even tried
> moving the secondary NN function to a different node. I have run tcpdump
> but cannot isolate the "Connection refused" issue. Am I correct in
> assuming that the NN will try to connect to the SNN on port 50090?
>
> Any way out? Current partial dump below.
>
> 2012-08-15 12:36:09,422 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from
> xxx.yyy.254.254
>
> 2012-08-15 12:36:09,423 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit
> log, edits.new files already exists in all healthy directories:
> /home/thealy/hdfs/name/current/edits.new
>
> 2012-08-15 12:36:09,879 ERROR
> org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:thealy cause:java.net.ConnectException:
> Connection refused
>
> 2012-08-15 12:36:09,879 ERROR
> org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:thealy cause:java.net.ConnectException:
> Connection refused
>
> 2012-08-15 12:36:09,881 WARN org.mortbay.log: /getimage:
> java.io.IOException: GetImage failed. java.net.ConnectException:
> Connection refused
>
>
>
> On 07/09/2012 01:23 PM, Terry Healy wrote:
>> Any suggestions on how to clear the error? stop-all / start-all had no
>> effect.
>>
>> On 07/03/2012 04:22 PM, Brandon Li wrote:
>>> With 1.0.2, only one checkpoint process is executed at a time. When the
>>> namenode gets an overlapping checkpointing request, it checks for edits.new
>>> in its storage directories. If all of them have this file, the namenode
>>> concludes the previous checkpoint process is not done yet and prints the
>>> warning message you've seen.
>>>
>>> Brandon
>>>
>>> On Tue, Jul 3, 2012 at 10:56 AM, Terry Healy <the...@bnl.gov> wrote:
>>>
>>> Running Apache Hadoop 1.0.2.
>>>
>>> The NN log is reporting that it cannot "roll the edit log" from the
>>> secondary NN. The SecondaryNameNode is running on the system referred to
>>> as xxx.yyy.254.238 in the log snippet below.
>>>
>>> From the NN, I can connect to the Secondary via ssh as the user. Any
>>> suggestions as to what I have got wrong here?
>>>
>>> thanks,
>>>
>>> Terry
>>>
>>> INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log
>>> from xxx.yyy.254.238
>>> WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll
>>> edit log, edits.new files already exists in all healthy directories:
>>> /home/[user]/hdfs/name/current/edits.new
>>>
>>> ERROR org.apache.hadoop.security.UserGroupInformation:
>>> PriviledgedActionException as:[user] cause:java.net.ConnectException:
>>> Connection refused
>>> ERROR org.apache.hadoop.security.UserGroupInformation:
>>> PriviledgedActionException as:[user] cause:java.net.ConnectException:
>>> Connection refused
>>>
>>> WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed.
>>> java.net.ConnectException: Connection refused
>>> at java.net.PlainSocketImpl.socketConnect(Native Method)
>>> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>>> ....
>>>
>>>
>>> --
>>> Terry Healy / the...@bnl.gov
>>> Cyber Security Operations
>>> Brookhaven National Laboratory
>>> Building 515, Upton N.Y. 11973
>>>
>>>
>>
>

--
Terry Healy / the...@bnl.gov
Cyber Security Operations
Brookhaven National Laboratory
Building 515, Upton N.Y. 11973
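
A note on the 0.0.0.0 in the posted URL. Nothing in this thread confirms it, but the port and machine parameters look like the address the NN is told to connect back to in order to fetch the merged image, which would match the /getimage "Connection refused" on the NN side. If that reading is right, 0.0.0.0 is a wildcard bind address rather than a reachable host, and the SNN's configured HTTP address (dfs.secondary.http.address in hdfs-site.xml, as far as I know) would be the thing to check. A minimal Python sketch for pulling those two parameters out of a logged URL and flagging a wildcard address; the function names and the wildcard set are my own, not anything from Hadoop:

```python
from urllib.parse import urlsplit, parse_qs

# Addresses that mean "all interfaces" and are not usable as a connect target.
# This set is an assumption made for illustration.
WILDCARD_ADDRESSES = {"0.0.0.0", "::"}


def checkpoint_callback(url):
    """Return the (machine, port) pair carried in a putimage URL's query string."""
    query = parse_qs(urlsplit(url).query)
    machine = query["machine"][0]
    port = int(query["port"][0])
    return machine, port


def is_wildcard(machine):
    """True if the advertised machine is a wildcard bind address."""
    return machine in WILDCARD_ADDRESSES


# URL copied from the FileNotFoundException line in the SNN log above.
url = ("http://AAA.xxx.yyy.zzz:50070/getimage?putimage=1&port=50090"
       "&machine=0.0.0.0&token=-32:1245372967:0:1343222881000:1343222711330")

machine, port = checkpoint_callback(url)
print(machine, port, is_wildcard(machine))
```

Run against the URL from the log, this reports machine 0.0.0.0 on port 50090 and flags it as a wildcard, which is consistent with the suspicion above but does not by itself prove the cause.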