Ok great! Thanks for the report Guodong! Also, I see you've been helping
out on the mailing list, much appreciated! :)

On Wed, Apr 17, 2013 at 11:26 PM, 王国栋 <[email protected]> wrote:

> Hi Ben,
>
> I am pretty sure that if there are not enough resources for mesos to start a
> tasktracker, then after issuing the kill command the map and reduce tasks
> are still pending, because at this point the jobtracker does not remove any
> tasks of the killed job.
>
> As you mentioned, as long as there are enough resources, a tasktracker will
> be launched by mesos, and the job status will then self-correct as expected.
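>
> In case it helps anyone reproduce the check, this is roughly what I run from
> the hadoop build directory to confirm the job stays in PREP (the job id is
> just the one from my earlier run; substitute your own):
>
>   # job stays in PREP and at 0% even after the kill is "accepted"
>   ./bin/hadoop job -status job_201304121621_0001
>   ./bin/hadoop job -kill   job_201304121621_0001
>   ./bin/hadoop job -status job_201304121621_0001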
>
> So I don't think it is a bug.
>
> Guodong
>
>
> On Thu, Apr 18, 2013 at 2:00 AM, Benjamin Mahler
> <[email protected]>wrote:
>
> > As long as the MesosScheduler thinks there are pending jobs that need
> > TaskTrackers launched, the PREP->FAILED transition will normally get a
> > chance to run, so this will self-correct as expected once we are able to
> > launch TaskTrackers.
> >
> > Guodong, I see this in your original email, but can you confirm again that
> > the MesosScheduler thinks there are pending tasks after you issue the kill?
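> >
> > (One easy way to confirm this, assuming the JobTracker output goes to a log
> > file such as logs/hadoop-*-jobtracker-*.log, or just watching the console,
> > is to look for the periodic status block the MesosScheduler prints:
> >
> >   grep -A 8 "JobTracker Status" logs/hadoop-*-jobtracker-*.log | tail -n 18
> >
> > If "Pending Map Tasks" / "Pending Reduce Tasks" stay non-zero after the
> > kill, the scheduler still thinks there is pending work.)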
> >
> >
> > On Wed, Apr 17, 2013 at 8:43 AM, Vinod Kone <[email protected]> wrote:
> >
> > > I see. So it looks like Hadoop doesn't differentiate between not having
> > > launched the setup tasks and having launched the setup tasks but still
> > > waiting to get to the RUNNING state? That's kinda unfortunate.
> > >
> > > Anyway, as long as this is the expected behavior for Hadoop users, I'm
> > > fine for now.
> > >
> > > @vinodkone
> > > Sent from my mobile
> > >
> > > On Apr 17, 2013, at 8:04 AM, 王国栋 <[email protected]> wrote:
> > >
> > > > Hi Vinod,
> > > >
> > > > I think it is reasonable that hadoop runs the job cleanup task, since
> > > > every job status change is triggered by the completion of some task.
> > > > For example, the job status changes from PREP to RUNNING once the job
> > > > setup task is completed (if the job has not been killed yet).
> > > >
> > > > For any job, once it is killed, regardless of whether the previous
> > > > status was RUNNING or PREP, the job cleanup task will be scheduled to
> > > > a tasktracker. When the job cleanup task finishes, the job status will
> > > > be set to KILLED or FAILED. So the logic for changing the job status
> > > > is the same regardless of the previous job status.
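> > > >
> > > > You can watch this transition from the client side as well. Roughly
> > > > (the State column of "job -list all" is the numeric JobStatus run
> > > > state; if I remember the constants correctly, PREP is 4 and KILLED
> > > > is 5 in this version):
> > > >
> > > >   ./bin/hadoop job -list all    # job listed with State 4 (PREP)
> > > >   ./bin/hadoop job -kill <job-id>
> > > >   ./bin/hadoop job -list all    # still 4 until the cleanup task runs,
> > > >                                 # then it flips to 5 (KILLED)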
> > > >
> > > > Anyway, it makes sense to me, so I don't think this is a bug :)
> > > >
> > > > Guodong
> > > >
> > > >
> > > > On Wed, Apr 17, 2013 at 10:52 PM, Vinod Kone <[email protected]>
> > > wrote:
> > > >
> > > >> Thanks for digging through the code. Your analysis makes sense. Though
> > > >> it is strange that Hadoop wants to run a cleanup task for a job that
> > > >> was never launched. Maybe you could file a bug for Hadoop?
> > > >>
> > > >> @vinodkone
> > > >> Sent from my mobile
> > > >>
> > > >> On Apr 17, 2013, at 2:30 AM, 王国栋 <[email protected]> wrote:
> > > >>
> > > >>> Hi Ben,
> > > >>>
> > > >>> I have read some of the jobtracker and tasktracker code over the last
> > > >>> two days.
> > > >>>
> > > >>> I don't think this scenario is a bug, neither in the mesos scheduler
> > > >>> nor in the jobtracker. The following is my opinion.
> > > >>>
> > > >>> We kill the job while the job status is PREP (which means the setup
> > > >>> task has not been launched yet). When we kill the job, the jobtracker
> > > >>> just marks two boolean flags in JobInProgress as true, meaning the
> > > >>> job is killed and failed. But killing the job does not trigger any
> > > >>> job status change until the job cleanup task is scheduled to a
> > > >>> tasktracker.
> > > >>>
> > > >>> The job status stays in PREP only because mesos cannot launch a
> > > >>> tasktracker due to insufficient resources, which is a little bit
> > > >>> awkward. When there are enough resources for mesos to launch a
> > > >>> tasktracker (say, some other mesos-slave offers enough to start one),
> > > >>> mesos will launch a tasktracker and the job cleanup task will be
> > > >>> scheduled to it. Finally, after the cleanup task completes, the job
> > > >>> status will be updated to *KILLED* and the pending map and reduce
> > > >>> slots will be released as well.
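> > > >>>
> > > >>> (For what it's worth, one way to watch this self-correction is to
> > > >>> bring up an extra mesos-slave that is big enough for a tasktracker.
> > > >>> The flags below are just a sketch from memory, so adjust them to
> > > >>> your setup:
> > > >>>
> > > >>>   # from the mesos build directory: a slave with enough memory for
> > > >>>   # one tasktracker, given the large mapred.mesos.slot.mem setting
> > > >>>   ./bin/mesos-slave.sh --master=127.0.0.1:5050 \
> > > >>>       --resources="cpus:4;mem:16384"
> > > >>>
> > > >>> Once its offer is accepted, the cleanup task runs there and the job
> > > >>> finally moves from PREP to KILLED.)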
> > > >>>
> > > >>> If anything is wrong, please correct me. :)
> > > >>>
> > > >>>
> > > >>> Guodong
> > > >>>
> > > >>>
> > > >>> On Tue, Apr 16, 2013 at 1:02 AM, Benjamin Mahler
> > > >>> <[email protected]>wrote:
> > > >>>
> > > >>>> Thanks! It would be helpful to me if you could provide the exact
> > > >>>> commands you're running for steps 8, 9 and 10 as well. It would save
> > > >>>> me some time as my knowledge of Hadoop commands is limited.
> > > >>>>
> > > >>>>
> > > >>>> On Sun, Apr 14, 2013 at 9:53 PM, 王国栋 <[email protected]> wrote:
> > > >>>>
> > > >>>>> Hi Ben,
> > > >>>>>
> > > >>>>> I've updated to the latest code in trunk, and the problem is still
> > > >>>>> here.
> > > >>>>>
> > > >>>>> Please follow these steps to reproduce it:
> > > >>>>> 1. check out the trunk code
> > > >>>>> 2. run bootstrap
> > > >>>>> 3. mkdir build (for an out-of-source build)
> > > >>>>> 4. cd build && ../configure
> > > >>>>> 5. make
> > > >>>>> 6. cd hadoop && make hadoop-0.20.2-cdh3u3
> > > >>>>> 7. modify conf/mapred-site.xml and set mapred.mesos.slot.mem=10240
> > > >>>>>    (make sure the resources are not enough)
> > > >>>>> 8. start mesos and the jobtracker
> > > >>>>> 9. submit a wordcount job to the jobtracker (at this point the job
> > > >>>>>    is pending due to insufficient resources)
> > > >>>>> 10. kill the job (the job cannot be killed); example commands for
> > > >>>>>     steps 8-10 are sketched below
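> > > >>>>>
> > > >>>>> Roughly, the commands I use for steps 8-10 look like this (paths
> > > >>>>> and the examples jar name are from my local build, so treat this
> > > >>>>> as a sketch rather than the exact invocation):
> > > >>>>>
> > > >>>>>   # step 8: from the mesos build directory
> > > >>>>>   ./bin/mesos-master.sh &
> > > >>>>>   ./bin/mesos-slave.sh --master=127.0.0.1:5050 &
> > > >>>>>   # ...and the jobtracker, from build/hadoop/hadoop-0.20.2-cdh3u3
> > > >>>>>   ./bin/hadoop jobtracker &
> > > >>>>>
> > > >>>>>   # step 9: submit a wordcount job (it stays pending)
> > > >>>>>   ./bin/hadoop jar hadoop-examples-0.20.2-cdh3u3.jar wordcount \
> > > >>>>>       /tmp/wc-input /tmp/wc-output
> > > >>>>>
> > > >>>>>   # step 10: kill it (the client reports "Killed job ...", but the
> > > >>>>>   # jobtracker keeps the job in PREP)
> > > >>>>>   ./bin/hadoop job -list            # note the job id
> > > >>>>>   ./bin/hadoop job -kill <job-id>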
> > > >>>>>
> > > >>>>>
> > > >>>>> Guodong
> > > >>>>>
> > > >>>>>
> > > >>>>> On Mon, Apr 15, 2013 at 10:52 AM, 王国栋 <[email protected]>
> wrote:
> > > >>>>>
> > > >>>>>> I am building the latest code in trunk. I will keep you updated;
> > > >>>>>> if the problem is still there, I will give you the exact steps to
> > > >>>>>> reproduce it and make sure you can debug it.
> > > >>>>>>
> > > >>>>>> I also think this may be a bug in jobtracker. :)
> > > >>>>>>
> > > >>>>>> Thanks.
> > > >>>>>>
> > > >>>>>> Guodong
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Mon, Apr 15, 2013 at 10:20 AM, Benjamin Mahler <
> > > >>>>>> [email protected]> wrote:
> > > >>>>>>
> > > >>>>>>> On April 9th we submitted a deadlock fix; please update to make
> > > >>>>>>> sure you have it:
> > > >>>>>>> https://reviews.apache.org/r/10352/
> > > >>>>>>>
> > > >>>>>>> Can you provide the commands to reproduce from a mesos build
> > > >>>>>>> directory? I'd like to be able to reproduce this locally on my
> > > >>>>>>> laptop, so the exact commands I need to run from within my mesos
> > > >>>>>>> build directory would be useful :)
> > > >>>>>>>
> > > >>>>>>> The fact that the job gets stuck in the JobTracker as PREP makes
> > > >>>>>>> me think there's a bug in the JobTracker. I imagine people don't
> > > >>>>>>> often run JobTrackers without any TaskTrackers, which is not the
> > > >>>>>>> case here.
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Sun, Apr 14, 2013 at 7:04 PM, 王国栋 <[email protected]>
> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Ben,
> > > >>>>>>>>
> > > >>>>>>>> I put my ideas inline. Please check.
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Apr 15, 2013 at 8:13 AM, Benjamin Mahler
> > > >>>>>>>> <[email protected]>wrote:
> > > >>>>>>>>
> > > >>>>>>>>> So I'm still a little confused here. From what you showed, it
> > > >>>>>>>>> looks like the 'job -kill' command you posted succeeded on the
> > > >>>>>>>>> client side?
> > > >>>>>>>>>
> > > >>>>>>>>> [trunk ?]$ ./bin/hadoop job -kill job_201304121621_0001
> > > >>>>>>>>> 13/04/12 16:27:16 INFO security.UserGroupInformation: JAAS
> > > >>>>>>>>> Configuration already set up for Hadoop, not re-installing.
> > > >>>>>>>>> Killed job job_201304121621_0001
> > > >>>>>>>> Yes, from the client side, killing the job is successful.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> I see that the JobTracker still thinks the job is pending.
> > > >>>>>>>>> What happens when you re-issue that kill command?
> > > >>>>>>>>> when you re-issue that kill command?
> > > >>>>>>>> The Jobtracker still think the status of job is *PREP*, when I
> > > >>>> reissue
> > > >>>>>>> kill
> > > >>>>>>>> cmd, it seems the same as the first time I issued the cmd.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> I'm confused as to why it's still pending when the JobTracker
> > > >>>>>>>>> has removed the job:
> > > >>>>>>>>>
> > > >>>>>>>>> 13/04/12 16:27:16 INFO mapred.JobTracker: Killing job
> > > >>>>>>>>> job_201304121621_0001
> > > >>>>>>>>> 13/04/12 16:27:16 INFO mapred.JobInProgress: Killing job
> > > >>>>>>>>> 'job_201304121621_0001'
> > > >>>>>>>>>
> > > >>>>>>>>> Looking at the JobTracker code, it seems like this indeed
> > > >>>>>>>>> kills the job:
> > > >>>>>>>>>
> > > >>>>>>>>> The code that prints the log line in JobTracker:
> > > >>>>>>>>> private synchronized void killJob(JobInProgress job) {
> > > >>>>>>>>>   LOG.info("Killing job " + job.getJobID());
> > > >>>>>>>>>   JobStatus prevStatus = (JobStatus)job.getStatus().clone();
> > > >>>>>>>>>   job.kill();
> > > >>>>>>>>>
> > > >>>>>>>>>   // Inform the listeners if the job is killed
> > > >>>>>>>>>   // Note :
> > > >>>>>>>>>   //   If the job is killed in the PREP state then the
> > > >>>>>>>>>   //   listeners will be invoked
> > > >>>>>>>>>   //   If the job is killed in the RUNNING state then cleanup
> > > >>>>>>>>>   //   tasks will be launched and the updateTaskStatuses()
> > > >>>>>>>>>   //   will take care of it
> > > >>>>>>>>>   JobStatus newStatus = (JobStatus)job.getStatus().clone();
> > > >>>>>>>>>   if (prevStatus.getRunState() != newStatus.getRunState()
> > > >>>>>>>>>       && newStatus.getRunState() == JobStatus.KILLED) {
> > > >>>>>>>>>     JobStatusChangeEvent event =
> > > >>>>>>>>>       new JobStatusChangeEvent(job, EventType.RUN_STATE_CHANGED,
> > > >>>>>>>>>           prevStatus, newStatus);
> > > >>>>>>>>>     updateJobInProgressListeners(event);
> > > >>>>>>>>>   }
> > > >>>>>>>>> }
> > > >>>>>>>>>
> > > >>>>>>>>> Then JobInProgress.kill():
> > > >>>>>>>>> /**
> > > >>>>>>>>>  * Kill the job and all its component tasks. This method
> > > >>>>>>>>>  * should be called from jobtracker and should return fast
> > > >>>>>>>>>  * as it locks the jobtracker.
> > > >>>>>>>>>  */
> > > >>>>>>>>> public void kill() {
> > > >>>>>>>>>   boolean killNow = false;
> > > >>>>>>>>>   synchronized(jobInitKillStatus) {
> > > >>>>>>>>>     jobInitKillStatus.killed = true;
> > > >>>>>>>>>     //if not in middle of init, terminate it now
> > > >>>>>>>>>     if(!jobInitKillStatus.initStarted || jobInitKillStatus.initDone) {
> > > >>>>>>>>>       //avoiding nested locking by setting flag
> > > >>>>>>>>>       killNow = true;
> > > >>>>>>>>>     }
> > > >>>>>>>>>   }
> > > >>>>>>>>>   if(killNow) {
> > > >>>>>>>>>     terminate(JobStatus.KILLED);
> > > >>>>>>>>>   }
> > > >>>>>>>>> }
> > > >>>>>>>>>
> > > >>>>>>>>> I don't really see the issue at this point, so any further
> > > >>>>>>>>> information / debugging on your end that reveals this bug
> > > >>>>>>>>> would be very valuable to us.
> > > >>>>>>>> I am trying to debug this on my laptop. I found that when I
> > > >>>>>>>> issue the kill command to the jobtracker, the job status never
> > > >>>>>>>> changes. It is always *PREP*.
> > > >>>>>>>> Can you reproduce this on your machine when you follow the
> > > >>>>>>>> steps I mentioned in the previous mail?
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> Lastly, what version of the code are you running? If you're
> > > >>>>>>>>> running off trunk, when did you last update it?
> > > >>>>>>>>
> > > >>>>>>>> I am running the code in trunk; it was last updated last week.
> > > >>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> Ben
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Fri, Apr 12, 2013 at 1:32 AM, 王国栋 <[email protected]>
> > wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi Vinod,
> > > >>>>>>>>>>
> > > >>>>>>>>>> When I submit the job, the jobtracker log is as follows:
> > > >>>>>>>>>> ----------------------------------
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.MesosScheduler: Added job
> > > >>>>>>>>>> job_201304121621_0001
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobTracker: Job
> > > >>>>> job_201304121621_0001
> > > >>>>>>>> added
> > > >>>>>>>>>> successfully for user 'guodong' to queue 'default'
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobTracker: Initializing
> > > >>>>>>>>>> job_201304121621_0001
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress: Initializing
> > > >>>>>>>>>> job_201304121621_0001
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.AuditLogger: USER=guodong
> > > >>>>> IP=127.0.0.1
> > > >>>>>>>>>> OPERATION=SUBMIT_JOB TARGET=job_201304121621_0001
> > RESULT=SUCCESS
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress: jobToken
> > generated
> > > >>>>> and
> > > >>>>>>>>> stored
> > > >>>>>>>>>> with users keys in
> > > >>>>>>>>>>
> > /tmp/hadoop-guodong/mapred/system/job_201304121621_0001/jobToken
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress: Input size for
> > job
> > > >>>>>>>>>> job_201304121621_0001 = 89502988. Number of splits = 3
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO net.NetworkTopology: Adding a new
> node:
> > > >>>>>>>>>> /default-rack/localhost
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress:
> > > >>>>>>>>>> tip:task_201304121621_0001_m_000000 has split on
> > > >>>>>>>>>> node:/default-rack/localhost
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress:
> > > >>>>>>>>>> tip:task_201304121621_0001_m_000001 has split on
> > > >>>>>>>>>> node:/default-rack/localhost
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress:
> > > >>>>>>>>>> tip:task_201304121621_0001_m_000002 has split on
> > > >>>>>>>>>> node:/default-rack/localhost
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress:
> > > >>>> job_201304121621_0001
> > > >>>>>>>>>> LOCALITY_WAIT_FACTOR=1.0
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress: Job
> > > >>>>>>> job_201304121621_0001
> > > >>>>>>>>>> initialized successfully with 3 map tasks and 1 reduce
> tasks.
> > > >>>>>>>>>> 13/04/12 16:22:39 INFO mapred.MesosScheduler: JobTracker
> > Status
> > > >>>>>>>>>>     Pending Map Tasks: 3
> > > >>>>>>>>>>  Pending Reduce Tasks: 1
> > > >>>>>>>>>>        Idle Map Slots: 0
> > > >>>>>>>>>>     Idle Reduce Slots: 0
> > > >>>>>>>>>>    Inactive Map Slots: 0 (launched but no hearbeat yet)
> > > >>>>>>>>>> Inactive Reduce Slots: 0 (launched but no hearbeat yet)
> > > >>>>>>>>>>      Needed Map Slots: 3
> > > >>>>>>>>>>   Needed Reduce Slots: 1
> > > >>>>>>>>>> 13/04/12 16:22:39 INFO mapred.MesosScheduler: Declining
> offer
> > > >>>> with
> > > >>>>>>>>>> insufficient resources for a TaskTracker:
> > > >>>>>>>>>> cpus: offered 4.0 needed 1.800000011920929
> > > >>>>>>>>>> mem : offered 2731.0 needed 6400.0
> > > >>>>>>>>>> disk: offered 75120.0 needed 4096.0
> > > >>>>>>>>>> ports:  at least 2 (sufficient)
> > > >>>>>>>>>> [name: "cpus"
> > > >>>>>>>>>> type: SCALAR
> > > >>>>>>>>>> scalar {
> > > >>>>>>>>>> value: 4.0
> > > >>>>>>>>>> }
> > > >>>>>>>>>> , name: "mem"
> > > >>>>>>>>>> type: SCALAR
> > > >>>>>>>>>> scalar {
> > > >>>>>>>>>> value: 2731.0
> > > >>>>>>>>>> }
> > > >>>>>>>>>> , name: "ports"
> > > >>>>>>>>>> type: RANGES
> > > >>>>>>>>>> ranges {
> > > >>>>>>>>>> range {
> > > >>>>>>>>>>   begin: 31000
> > > >>>>>>>>>>   end: 32000
> > > >>>>>>>>>> }
> > > >>>>>>>>>> }
> > > >>>>>>>>>> , name: "disk"
> > > >>>>>>>>>> type: SCALAR
> > > >>>>>>>>>> scalar {
> > > >>>>>>>>>> value: 75120.0
> > > >>>>>>>>>> }
> > > >>>>>>>>>> ]
> > > >>>>>>>>>> 13/04/12 16:22:39 INFO mapred.MesosScheduler: Unable to
> fully
> > > >>>>>>> satisfy
> > > >>>>>>>>>> needed map/reduce slots: 3 map slots 1 reduce slots
> remaining
> > > >>>>>>>>>> 13/04/12 16:22:45 INFO mapred.MesosScheduler: JobTracker
> > Status
> > > >>>>>>>>>>     Pending Map Tasks: 3
> > > >>>>>>>>>>  Pending Reduce Tasks: 1
> > > >>>>>>>>>>        Idle Map Slots: 0
> > > >>>>>>>>>>     Idle Reduce Slots: 0
> > > >>>>>>>>>>    Inactive Map Slots: 0 (launched but no hearbeat yet)
> > > >>>>>>>>>> Inactive Reduce Slots: 0 (launched but no hearbeat yet)
> > > >>>>>>>>>>      Needed Map Slots: 3
> > > >>>>>>>>>>   Needed Reduce Slots: 1
> > > >>>>>>>>>> 13/04/12 16:22:45 INFO mapred.MesosScheduler: Declining
> offer
> > > >>>> with
> > > >>>>>>>>>> insufficient resources for a TaskTracker:
> > > >>>>>>>>>> cpus: offered 4.0 needed 1.800000011920929
> > > >>>>>>>>>> mem : offered 2731.0 needed 6400.0
> > > >>>>>>>>>> disk: offered 75120.0 needed 4096.0
> > > >>>>>>>>>> ports:  at least 2 (sufficient)
> > > >>>>>>>>>> [name: "cpus"
> > > >>>>>>>>>> type: SCALAR
> > > >>>>>>>>>> scalar {
> > > >>>>>>>>>> value: 4.0
> > > >>>>>>>>>> }
> > > >>>>>>>>>> , name: "mem"
> > > >>>>>>>>>> type: SCALAR
> > > >>>>>>>>>> scalar {
> > > >>>>>>>>>> value: 2731.0
> > > >>>>>>>>>> }
> > > >>>>>>>>>> , name: "ports"
> > > >>>>>>>>>> type: RANGES
> > > >>>>>>>>>> ranges {
> > > >>>>>>>>>> range {
> > > >>>>>>>>>>   begin: 31000
> > > >>>>>>>>>>   end: 32000
> > > >>>>>>>>>> }
> > > >>>>>>>>>> }
> > > >>>>>>>>>> , name: "disk"
> > > >>>>>>>>>> type: SCALAR
> > > >>>>>>>>>> scalar {
> > > >>>>>>>>>> value: 75120.0
> > > >>>>>>>>>> }
> > > >>>>>>>>>> ]
> > > >>>>>>>>>> 13/04/12 16:22:45 INFO mapred.MesosScheduler: Unable to
> fully
> > > >>>>>>> satisfy
> > > >>>>>>>>>> needed map/reduce slots: 3 map slots 1 reduce slots
> remaining
> > > >>>>>>>>>>
> > > >>>>>>>>>> ----------------------------------
> > > >>>>>>>>>>
> > > >>>>>>>>>> the hadoop client log is
> > > >>>>>>>>>> --------------------------------------------
> > > >>>>>>>>>> 13/04/12 16:22:36 INFO security.UserGroupInformation: JAAS
> > > >>>>>>>> Configuration
> > > >>>>>>>>>> already set up for Hadoop, not re-installing.
> > > >>>>>>>>>> 13/04/12 16:22:36 INFO util.NativeCodeLoader: Loaded the
> > > >>>>>>> native-hadoop
> > > >>>>>>>>>> library
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO input.FileInputFormat: Total input
> > paths
> > > >>>> to
> > > >>>>>>>>> process
> > > >>>>>>>>>> : 1
> > > >>>>>>>>>> 13/04/12 16:22:37 WARN snappy.LoadSnappy: Snappy native
> > library
> > > >>>> is
> > > >>>>>>>>>> available
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO snappy.LoadSnappy: Snappy native
> > library
> > > >>>>>>> loaded
> > > >>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobClient: Running job:
> > > >>>>>>>>> job_201304121621_0001
> > > >>>>>>>>>> 13/04/12 16:22:38 INFO mapred.JobClient:  map 0% reduce 0%
> > > >>>>>>>>>>
> > > >>>>>>>>>> --------------------------------------------
> > > >>>>>>>>>>
> > > >>>>>>>>>> Since the client hangs, I use ctrl-c to stop it, and then use
> > > >>>>>>>>>> job -status to check the job status.
> > > >>>>>>>>>>
> > > >>>>>>>>>> guodong@guodong-Vostro-3400
> > > >>>>>>>>>> :~/workspace/mesos-trunk/build/hadoop/hadoop-0.20.2-cdh3u3
> > > >>>>>>>>>> [trunk ?]$ ./bin/hadoop job -status job_201304121621_0001
> > > >>>>>>>>>> 13/04/12 16:26:22 INFO security.UserGroupInformation: JAAS
> > > >>>>>>>> Configuration
> > > >>>>>>>>>> already set up for Hadoop, not re-installing.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Job: job_201304121621_0001
> > > >>>>>>>>>> file: file:/tmp/hadoop-guodong/mapred/staging/guodong/.staging/job_201304121621_0001/job.xml
> > > >>>>>>>>>> tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201304121621_0001
> > > >>>>>>>>>> map() completion: 0.0
> > > >>>>>>>>>> reduce() completion: 0.0
> > > >>>>>>>>>> Counters: 0
> > > >>>>>>>>>>
> > > >>>>>>>>>> Then try to kill the job
> > > >>>>>>>>>> guodong@guodong-Vostro-3400
> > > >>>>>>>>>> :~/workspace/mesos-trunk/build/hadoop/hadoop-0.20.2-cdh3u3
> > > >>>>>>>>>> [trunk ?]$ ./bin/hadoop job -kill job_201304121621_0001
> > > >>>>>>>>>> 13/04/12 16:27:16 INFO security.UserGroupInformation: JAAS
> > > >>>>>>>>>> Configuration already set up for Hadoop, not re-installing.
> > > >>>>>>>>>> Killed job job_201304121621_0001
> > > >>>>>>>>>>
> > > >>>>>>>>>> When I kill the job, I can see this in the jobtracker log:
> > > >>>>>>>>>> 13/04/12 16:27:13 INFO mapred.MesosScheduler: Unable to fully
> > > >>>>>>>>>> satisfy needed map/reduce slots: 3 map slots 1 reduce slots
> > > >>>>>>>>>> remaining
> > > >>>>>>>>>> 13/04/12 16:27:16 INFO mapred.JobTracker: Killing job
> > > >>>>>>>>>> job_201304121621_0001
