[ https://issues.apache.org/jira/browse/HADOOP-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721640#action_12721640 ]

Hemanth Yamijala commented on HADOOP-6035:
------------------------------------------

The namenode goes into safemode if you are starting on an existing 
installation of HDFS. This is the time when it reconstructs state from the 
transaction logs it has written. It also waits until it has information from 
the datanodes about the blocks they hold. It is not a problem for the 
namenode to go into safemode at startup; if it doesn't automatically come out 
of it, that could be a problem. So, is the namenode coming out of safemode? 
And if you don't want to start on an existing installation, you must format 
the filesystem before you start. All of this information is available in the 
Forrest documentation here: 
http://hadoop.apache.org/core/docs/r0.20.0/hdfs_user_guide.html
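
For reference, assuming you run them from the Hadoop install directory, the 
relevant commands are:

    bin/hadoop dfsadmin -safemode get     # report whether the namenode is in safemode
    bin/hadoop dfsadmin -safemode wait    # block until the namenode leaves safemode on its own
    bin/hadoop dfsadmin -safemode leave   # force the namenode out of safemode
    bin/hadoop namenode -format           # format a fresh filesystem (erases existing HDFS data)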

So, let us step back and restate the problem: you are starting a cluster. The 
namenode goes into safemode and comes out of it (perhaps because you force it 
out using the dfsadmin command). Then the jobtracker goes down. This happens 
only if you use the capacity scheduler, but not if you use the fairshare 
scheduler. Is this right so far?

Now that we've fixed the configuration, I think we should look at the logs. 
Can you please upload the jobtracker log to begin with?
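
If you are unsure where to find it: assuming the default HADOOP_LOG_DIR (the 
logs directory under the install), something like

    ls logs/hadoop-*-jobtracker-*.log   # default name pattern: hadoop-<user>-jobtracker-<hostname>.log

should locate it.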

> jobtracker stops when namenode goes out of safemode running capacity scheduler
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-6035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6035
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>         Environment: Fedora 10
>            Reporter: Anjali M
>            Priority: Minor
>         Attachments: capacity-scheduler.xml, hadoop-site.xml
>
>
> I am facing a problem running the capacity scheduler in hadoop-0.20.0.
> The jobtracker lists the queues while the namenode is in safemode.
> Once the namenode goes out of safemode, the jobtracker stops working.
> Accessing the jobqueue details page shows the following error:
> HTTP ERROR: 500
> INTERNAL_SERVER_ERROR
> RequestURI=/jobqueue_details.jsp
> Caused by:
> java.lang.NullPointerException
>        at org.apache.hadoop.mapred.JobQueuesManager.getRunningJobQueue(JobQueuesManager.java:156)
>        at org.apache.hadoop.mapred.CapacityTaskScheduler.getJobs(CapacityTaskScheduler.java:1495)
>        at org.apache.hadoop.mapred.jobqueue_005fdetails_jsp._jspService(jobqueue_005fdetails_jsp.java:64)
>        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
>        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
>        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
>        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
>        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>        at org.mortbay.jetty.Server.handle(Server.java:324)
>        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
>        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
>        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
>        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
> Is it because any of the configuration in capacity-scheduler.xml is
> incorrect? I tried forcing the namenode out of safemode with bin/hadoop
> dfsadmin, but it still does not work.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
