[ https://issues.apache.org/jira/browse/HADOOP-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721640#action_12721640 ]
Hemanth Yamijala commented on HADOOP-6035:
------------------------------------------

The namenode goes into safemode if you are starting on an existing installation of HDFS. This is the time when it reconstructs state from the transaction logs it has written. It also waits until it has information from the datanodes about the blocks they hold. It is not a problem for the namenode to go into safemode at startup; if it doesn't automatically come out of it, that could be a problem. So, is the namenode coming out of safemode? And if you don't want to start on an existing installation, you must format the filesystem before you start. All of this is covered in the Forrest documentation here: http://hadoop.apache.org/core/docs/r0.20.0/hdfs_user_guide.html

So, let us step back and look at the problem statement: you are starting a cluster. The namenode goes into safemode and comes out of it (perhaps because you force it out using the dfsadmin command). Then the jobtracker goes down. This happens only if you use the capacity scheduler, not if you use the fairshare scheduler. Is this right so far?

Now that we've fixed the configuration, I think we could look at the logs. Can you please upload the jobtracker log to begin with?
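For reference, the commands involved look roughly like this (a sketch assuming the usual bin/ layout of a 0.20 tarball, run from the Hadoop installation directory):

    # check whether the namenode is currently in safemode
    bin/hadoop dfsadmin -safemode get

    # block until the namenode leaves safemode on its own
    bin/hadoop dfsadmin -safemode wait

    # force the namenode out of safemode (use with care)
    bin/hadoop dfsadmin -safemode leave

    # fresh installations only: format HDFS before the first start
    # (this destroys any existing filesystem metadata)
    bin/hadoop namenode -format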
> jobtracker stops when namenode goes out of safemode running capacity scheduler
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-6035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6035
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>         Environment: Fedora 10
>            Reporter: Anjali M
>            Priority: Minor
>         Attachments: capacity-scheduler.xml, hadoop-site.xml
>
>
> I am facing a problem running the capacity scheduler in hadoop-0.20.0.
> The jobtracker lists the queues while the namenode is in safemode.
> Once the namenode goes out of safemode, the jobtracker stops working.
> On accessing the job queue details page, it shows the following error:
>
> HTTP ERROR: 500 INTERNAL_SERVER_ERROR
> RequestURI=/jobqueue_details.jsp
> Caused by:
> java.lang.NullPointerException
>         at org.apache.hadoop.mapred.JobQueuesManager.getRunningJobQueue(JobQueuesManager.java:156)
>         at org.apache.hadoop.mapred.CapacityTaskScheduler.getJobs(CapacityTaskScheduler.java:1495)
>         at org.apache.hadoop.mapred.jobqueue_005fdetails_jsp._jspService(jobqueue_005fdetails_jsp.java:64)
>         at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>         at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
>         at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
>         at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>         at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>         at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>         at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
>         at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>         at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>         at org.mortbay.jetty.Server.handle(Server.java:324)
>         at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>         at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
>         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
>         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
>         at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>         at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
>
> Is it because some of the configuration in capacity-scheduler.xml is incorrect?
> I tried forcing the namenode out of safemode with bin/hadoop
> dfsadmin, but it still does not work.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.