Hi Guodong,

This is likely a bug on our side.

Could you paste the output of the client when you issue the kill? The
output of the jobtracker (during the submission and kill of the job) would
also be helpful.

Thanks,



-- Vinod


On Thu, Apr 11, 2013 at 10:43 PM, 王国栋 <[email protected]> wrote:

> Hi Ben,
>
> I am sorry for my mistake about point 2. The jobtracker jetty server works
> fine. Yesterday, the execution time of my test job was too short, so it
> finished before the jetty server could show the job status in the running
> list. Today, I tried a bigger job, and the status is displayed correctly.
>
> More information about point 1. I can reproduce this with the following steps.
> 1. Set the resources needed for each slot (memory, cpu) in mapred-site.xml.
> Make sure each slot needs a lot of resources.
> 2. Provide only one mesos slave whose resources are not enough for one mapper
> slot and one reduce slot.
> 3. Run the jobtracker and submit a job which needs 1 mapper and 1 reducer.
>
> Then I can see that the job is pending due to insufficient resources. I
> can use ctrl-c to stop the hadoop client, but the job is still pending.
>
> I tried to kill the job with "hadoop job -kill ", but I cannot kill the
> pending job.
> Before killing the job, the job status is
> *Job Setup:* <http://localhost:50030/jobtasks.jsp?jobid=job_201304121059_0001&type=setup&pagenum=1&state=killed> Pending
> *Job Cleanup:* Pending
>
> After I try to kill the job, the job status is
> *Job Setup:* failed
> *Job Cleanup:* Pending
>
> Then the job hangs, and I can never stop it.
>
> Is it possible that the mesos-scheduler misses some job-killed event?
>
>
>
>
> Guodong
>
>
> On Fri, Apr 12, 2013 at 2:59 AM, Benjamin Mahler
> <[email protected]> wrote:
>
> > 'Pending Map Tasks': The number of pending map tasks in the Hadoop
> > JobTracker.
> > 'Pending Reduce Tasks': The number of pending reduce tasks in the Hadoop
> > JobTracker.
> >
> > Did you successfully kill the job? If so, did you allow some time for the
> > JobTracker to detect that the job was killed?
> >
> > Our scheduler (MesosScheduler.java) simply introspects on the JobTracker
> > state to determine the number of pending map/reduce tasks in the system. I
> > would expect this to go to 0 sometime after you kill your job; if that's
> > not the case, I'll need more information to figure out what's going on.
> >
> > Can you elaborate more on your point 2? It almost sounds like you're
> > talking to a different JobTracker? Note that the MesosScheduler runs
> > *inside* the JobTracker.
> >
> > Also, note that we recently committed a deadlock fix for the Hadoop patch:
> > https://reviews.apache.org/r/10352/
> >
> >
> > On Thu, Apr 11, 2013 at 2:31 AM, 王国栋 <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I am trying to run hadoop over mesos, using the code in the trunk, but
> > > I ran into some problems. My hadoop version is cdh3u3.
> > >
> > > *1. When a job is pending because there are not enough resources, I use
> > > ctrl-c to stop the job client. But I can see the pending mapper and pending
> > > reducer are still in the job tracker. Then I try to use "hadoop job -kill
> > > jobid" to kill this job, but nothing happens in the jobtracker; the mapper
> > > and reducer are still pending. The log in the jobtracker is as follows.*
> > >
> > > 13/04/11 17:21:07 INFO mapred.MesosScheduler: JobTracker Status
> > >       Pending Map Tasks: 1
> > >    Pending Reduce Tasks: 1
> > >          Idle Map Slots: 0
> > >       Idle Reduce Slots: 0
> > >      Inactive Map Slots: 0 (launched but no hearbeat yet)
> > >   Inactive Reduce Slots: 0 (launched but no hearbeat yet)
> > >        Needed Map Slots: 1
> > >     Needed Reduce Slots: 1
> > > 13/04/11 17:21:07 INFO mapred.MesosScheduler: Declining offer with
> > > insufficient resources for a TaskTracker:
> > >   cpus: offered 4.0 needed 1.800000011920929
> > >   mem : offered 2731.0 needed 6432.0
> > >   disk: offered 70651.0 needed 4096.0
> > >   ports:  at least 2 (sufficient)
> > > [name: "cpus"
> > > type: SCALAR
> > > scalar {
> > >   value: 4.0
> > > }
> > > , name: "mem"
> > > type: SCALAR
> > > scalar {
> > >   value: 2731.0
> > > }
> > > , name: "ports"
> > > type: RANGES
> > > ranges {
> > >   range {
> > >     begin: 31000
> > >     end: 32000
> > >   }
> > > }
> > > , name: "disk"
> > > type: SCALAR
> > > scalar {
> > >   value: 70651.0
> > > }
> > > ]
> > > 13/04/11 17:21:07 INFO mapred.MesosScheduler: Unable to fully satisfy
> > > needed map/reduce slots: 1 map slots 1 reduce slots remaining
> > >
> > > *2. When we submit the job to the jobtracker, I cannot find any running
> > > job on the jobtracker web interface (http://localhost:50030/jobtracker.jsp).
> > > But when the job is finished, I can see the job info under retired jobs in
> > > the jobtracker.*
> > >
> > > Any ideas about this? Thanks a lot.
> > >
> > > Guodong
> > >
> >
>
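
For readers following the "Declining offer with insufficient resources"
entry in the quoted jobtracker log, here is a minimal sketch of the
comparison that entry reports. It is not the code in MesosScheduler.java:
the names OFFERED, NEEDED, shortfalls and is_sufficient are hypothetical,
the numbers are copied from the log, and the ports check is left out.

```python
# Illustrative only: values copied from the quoted "Declining offer" log entry.
# Names below are hypothetical and do not come from MesosScheduler.java.
OFFERED = {"cpus": 4.0, "mem": 2731.0, "disk": 70651.0}
NEEDED = {"cpus": 1.800000011920929, "mem": 6432.0, "disk": 4096.0}


def shortfalls(offered, needed):
    """Return {resource: missing amount} for each resource the offer lacks."""
    return {
        name: amount - offered.get(name, 0.0)
        for name, amount in needed.items()
        if offered.get(name, 0.0) < amount
    }


def is_sufficient(offered, needed):
    """An offer is usable only if no needed resource falls short."""
    return not shortfalls(offered, needed)


if __name__ == "__main__":
    # With the logged values, memory is the only resource that falls short.
    print(shortfalls(OFFERED, NEEDED))
    print(is_sufficient(OFFERED, NEEDED))
```

With these numbers only memory is short (2731.0 offered against 6432.0
needed), which matches the job staying pending on the single slave in the
reproduction steps.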
