The problem persists, and the error is logged every 10 minutes. I even tried moving the secondary NN function to a different node. I have run tcpdump but cannot isolate the source of the "Connection refused" error. Am I correct in assuming that the NN will try to connect to the SNN on port 50090?
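As a cross-check alongside tcpdump, a plain TCP connect from the NN host to the SNN's HTTP port would show whether anything is listening there at all. The following is only a minimal sketch: the port 50090 is the assumption from the question above (the value actually in effect is whatever dfs.secondary.http.address is set to), and the host name in the usage comment is a hypothetical placeholder.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds, else False."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name-resolution failures.
        return False


# Example use, run on the NameNode host (host name is a placeholder):
#   can_connect("snn.example.org", 50090)
```

A False result from the NN host combined with a True result when run on the SNN host itself would point at a bind address or firewall issue rather than at the checkpoint logic.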
Any way out? Current partial dump below.

2012-08-15 12:36:09,422 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from xxx.yyy.254.254
2012-08-15 12:36:09,423 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log, edits.new files already exists in all healthy directories: /home/thealy/hdfs/name/current/edits.new
2012-08-15 12:36:09,879 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:thealy cause:java.net.ConnectException: Connection refused
2012-08-15 12:36:09,879 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:thealy cause:java.net.ConnectException: Connection refused
2012-08-15 12:36:09,881 WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed. java.net.ConnectException: Connection refused

On 07/09/2012 01:23 PM, Terry Healy wrote:
> Any suggestions on how to clear the error? stop-all / start-all had no
> effect.
>
> On 07/03/2012 04:22 PM, Brandon Li wrote:
>> With 1.0.2, only one checkpoint process is executed at a time. When the
>> namenode gets an overlapping checkpointing request, it checks edits.new
>> in its storage directories. If all of them have this file, namenode
>> concludes the previous checkpoint process is not done yet and prints the
>> warning message you've seen.
>>
>> Brandon
>>
>> On Tue, Jul 3, 2012 at 10:56 AM, Terry Healy <the...@bnl.gov> wrote:
>>
>>     Running Apache Hadoop 1.0.2.
>>
>>     The NN log is reporting that it cannot "roll the edit log" from the
>>     secondary NN. The SecondaryNameNode is running on the system referred to
>>     as xxx.yyy.254.238 in the log snippet below.
>>
>>     From the NN, I can connect to the Secondary via ssh as the user. Any
>>     suggestions as to what I have got wrong here?
>>
>>     thanks,
>>
>>     Terry
>>
>>     INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log
>>     from xxx.yyy.254.238
>>     WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll
>>     edit log, edits.new files already exists in all healthy directories:
>>     /home/[user]/hdfs/name/current/edits.new
>>
>>     ERROR org.apache.hadoop.security.UserGroupInformation:
>>     PriviledgedActionException as:[user] cause:java.net.ConnectException:
>>     Connection refused
>>     ERROR org.apache.hadoop.security.UserGroupInformation:
>>     PriviledgedActionException as:[user] cause:java.net.ConnectException:
>>     Connection refused
>>
>>     WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed.
>>     java.net.ConnectException: Connection refused
>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>>     ....
>>
>>
>>     --
>>     Terry Healy / the...@bnl.gov
>>     Cyber Security Operations
>>     Brookhaven National Laboratory
>>     Building 515, Upton N.Y. 11973
>>
>>
>

-- 
Terry Healy / the...@bnl.gov
Cyber Security Operations
Brookhaven National Laboratory
Building 515, Upton N.Y. 11973
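
The condition Brandon describes in the quoted reply (the namenode refuses to roll the edit log when every storage directory already holds an edits.new file) can be illustrated with a short check. This is only a sketch of that description, not the NameNode's actual code; the function name and the name_dirs parameter are hypothetical, and the current/edits.new layout is taken from the path in the log above.

```python
import os


def checkpoint_in_progress(name_dirs):
    """Return True if every given storage directory holds current/edits.new.

    name_dirs is a hypothetical list of NameNode storage directories,
    e.g. the values configured for dfs.name.dir.
    """
    paths = [os.path.join(d, "current", "edits.new") for d in name_dirs]
    # An empty list is treated as "no checkpoint in progress".
    return bool(paths) and all(os.path.exists(p) for p in paths)


# Example use (directory taken from the log, user part left as logged):
#   checkpoint_in_progress(["/home/[user]/hdfs/name"])
```

If this returns True while no checkpoint is actually running, the edits.new files are presumably left over from an earlier checkpoint that never completed, which would match the repeated warning.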