Is the 2NN reachable at http://10.1.1.5:50090? This is the address the NN is being told to fetch the merged image from. VIPs and similar setups can cause problems if this address is not reachable from the NN.
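If it helps, here is a minimal sketch of a reachability check you could run from the NN host. It assumes Python 3 is available there; `probe_http` is a hypothetical helper name of my own, not anything shipped with Hadoop. It requests `/` rather than `/getimage` so that it does not kick off an image transfer, and it only tells you whether something answers HTTP on that host and port, not whether a checkpoint would succeed.

```python
import http.client


def probe_http(host, port, path="/", timeout=5.0):
    """Return the HTTP status code from host:port, or None if no
    HTTP response could be obtained (refused, timed out, no route, ...).

    Hypothetical helper for illustration only; not part of Hadoop.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # OSError covers connection refused, timeouts and unreachable hosts.
        return None
    finally:
        conn.close()
```

For example, `probe_http("10.1.1.5", 50090)` returning `None` would suggest the NN cannot reach that address at all, while any status code (including a 503) means something is answering on that port and the problem is more likely on the serving side or in whatever sits in front of it.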

On Thu, Jan 6, 2011 at 12:57 PM, Tyler Coffin <tcof...@rim.com> wrote:
> 2011-01-06 15:52:00,814 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.10 cmd=open src=/logs/remote.log.01.20110105.2045.lzo dst=null perm=null
> 2011-01-06 15:52:00,881 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.12 cmd=open src=/logs/remote.log.01.20110104.1815.lzo dst=null perm=null
> 2011-01-06 15:52:00,891 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.9 cmd=open src=/logs/remote.log.01.20110105.0015.lzo dst=null perm=null
> 2011-01-06 15:52:01,253 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.7 cmd=open src=/logs/remote.log.01.20110105.2035.lzo dst=null perm=null
> 2011-01-06 15:52:01,612 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.11 cmd=open src=/logs/remote.log.01.20110105.2145.lzo dst=null perm=null
> 2011-01-06 15:52:01,715 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.8 cmd=open src=/logs/remote.log.01.20110104.1915.lzo dst=null perm=null
> 2011-01-06 15:52:02,030 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.13 cmd=open src=/logs/remote.log.01.20110105.1235.lzo dst=null perm=null
> 2011-01-06 15:52:02,701 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.11.138.52:50010 to delete blk_5975970115065306335_564427 blk_-4283643305091895835_564439
> 2011-01-06 15:52:02,701 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.90.37.8:50010 to delete blk_7719330829554745618_564439
> 2011-01-06 15:52:02,701 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.11.138.73:50010 to delete blk_5975970115065306335_564427
> 2011-01-06 15:52:02,701 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.11.138.156:50010 to delete blk_-4283643305091895835_564439 blk_7719330829554745618_564439
> 2011-01-06 15:52:04,256 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.90.37.13
> 2011-01-06 15:52:04,494 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.12 cmd=open src=/logs/remote.log.01.20110105.1120.lzo dst=null perm=null
> 2011-01-06 15:52:04,529 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.7 cmd=open src=/logs/remote.log.01.20110104.1950.lzo dst=null perm=null
> 2011-01-06 15:52:04,563 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.9 cmd=open src=/logs/remote.log.01.20110105.0215.lzo dst=null perm=null
> 2011-01-06 15:52:05,255 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.11 cmd=open src=/logs/remote.log.01.20110105.2245.lzo dst=null perm=null
> 2011-01-06 15:52:05,261 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.10 cmd=open src=/logs/remote.log.01.20110105.0835.lzo dst=null perm=null
> 2011-01-06 15:52:06,960 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.8 cmd=open src=/logs/remote.log.01.20110104.0855.lzo dst=null perm=null
> 2011-01-06 15:52:07,147 WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed. java.io.IOException: Server returned HTTP response code: 503 for URL: http://10.90.37.13:50090/getimage?getimage=1
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
>     at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:151)
>     at org.apache.hadoop.hdfs.server.namenode.GetImageServlet.doGet(GetImageServlet.java:58)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
>     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
>     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
>     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>     at org.mortbay.jetty.Server.handle(Server.java:324)
>     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
>     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
>     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
>     at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
>
> 2011-01-06 15:52:07,522 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=admin,users,admin ip=/10.90.37.10 cmd=open src=/logs/remote.log.01.20110104.2125.lzo dst=null perm=null
>
> The errors from the 2NN repeat frequently. I grabbed a snippet of the NN log from the most recent occurrence (i.e. it's not the same occurrence as my original post).
>
> From: suresh srinivas [mailto:srini30...@gmail.com]
> Sent: January 6, 2011 15:27
> To: hdfs-user@hadoop.apache.org
> Subject: Re: Secondary Namenode doCheckpoint, FileNotFoundException
>
> Can you add namenode log around this time?