As long as the MesosScheduler thinks there are pending jobs that need TaskTrackers launched, we will normally allow the PREP->FAILED transition to run, so this will self-correct as expected once we are able to launch TaskTrackers.

Guodong, I can see this from your original email, but can you confirm again that the MesosScheduler thinks there are pending tasks after you issue the kill?
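For example (only a sketch; the log path below is an assumption, and if the JobTracker runs in the foreground the same lines simply appear on its console):

  # Does the MesosScheduler still report pending/needed slots for the killed job?
  # These are the periodic "JobTracker Status" lines it writes to the JobTracker log.
  grep -A 8 "JobTracker Status" logs/hadoop-*-jobtracker-*.log | tail -n 20

  # And what run state does the JobTracker itself report for the job?
  ./bin/hadoop job -list all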
On Wed, Apr 17, 2013 at 8:43 AM, Vinod Kone <[email protected]> wrote:

I see. So it looks like Hadoop doesn't differentiate between not launching setup tasks and launching setup tasks but waiting to get to the RUNNING state? That's kind of unfortunate.

Anyway, as long as this is the expected behavior for Hadoop users I'm fine with it for now.

@vinodkone
Sent from my mobile

On Apr 17, 2013, at 8:04 AM, 王国栋 <[email protected]> wrote:

Hi Vinod,

I think it is reasonable that Hadoop runs the job cleanup task, since every job status change is triggered by the completion of some task; for example, the job status changes from PREP to RUNNING once the job setup task completes (if the job has not been killed yet).

For any job, once it is killed, no matter whether its previous status was RUNNING or PREP, the job cleanup task will be scheduled to a TaskTracker. When the job cleanup task finishes, the job status is set to KILLED or FAILED. So the logic for changing the job status is the same regardless of the previous job status.

Anyway, it makes sense to me, so I don't think this is a bug :)

Guodong

On Wed, Apr 17, 2013 at 10:52 PM, Vinod Kone <[email protected]> wrote:

Thanks for digging through the code. Your analysis makes sense, though it is strange that Hadoop wants to run a cleanup task for a job that was never launched. Maybe you could file a bug for Hadoop?

@vinodkone
Sent from my mobile

On Apr 17, 2013, at 2:30 AM, 王国栋 <[email protected]> wrote:

Hi Ben,

I have read some of the JobTracker and TaskTracker code over the last two days.

I don't think this scenario is a bug, neither in the Mesos scheduler nor in the JobTracker. The following is my opinion.

We kill the job while its status is PREP (which means the setup task has not been launched yet). When we kill the job, the JobTracker just marks two boolean values in JobInProgress as true, meaning the job is killed and failed. But killing a job does not trigger any job status change until the job cleanup task is scheduled to a TaskTracker.

The job status stays in PREP only because Mesos cannot launch a TaskTracker due to insufficient resources, which is a little embarrassing. When there is enough resource for Mesos to launch a TaskTracker (say, some other mesos-slave offers enough to start one), Mesos will launch a TaskTracker and the job cleanup task will be scheduled to it. Finally, after the cleanup task completes, the job status will be updated to *KILLED*, and the pending map and reduce slots will be released as well.

If anything is wrong, please correct me. :)

Guodong

On Tue, Apr 16, 2013 at 1:02 AM, Benjamin Mahler <[email protected]> wrote:

Thanks! It would be helpful if you could also provide the exact commands you're running for steps 8, 9 and 10; it would save me some time, as my knowledge of Hadoop commands is limited.
On Sun, Apr 14, 2013 at 9:53 PM, 王国栋 <[email protected]> wrote:

Hi Ben,

I've updated to the latest code in trunk, and the problem is still there.

Please follow these steps to reproduce it:
1. check out the trunk code
2. bootstrap
3. mkdir build (for an out-of-source build)
4. cd build && ../configure
5. make
6. cd hadoop && make hadoop-0.20.2-cdh3u3
7. modify conf/mapred-site.xml, set mapred.mesos.slot.mem=10240 (make sure the resource is not enough)
8. start mesos and the jobtracker
9. submit a wordcount job to the jobtracker (at this point the job is pending due to insufficient resources)
10. kill the job (the job cannot be killed)

Guodong
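For steps 7-10, run from the Mesos build directory, the commands would look roughly like this. This is only a sketch: the wrapper-script names, flags, example jar name, and input/output paths are assumptions on my part rather than commands taken from the thread (only the final kill appears verbatim in the logs below).

  # step 7: in hadoop/hadoop-0.20.2-cdh3u3/conf/mapred-site.xml, inside <configuration>,
  # set the per-slot memory so high that no offer can satisfy it:
  #   <property><name>mapred.mesos.slot.mem</name><value>10240</value></property>

  # step 8: start a master and a slave from the build directory, then the JobTracker
  ./bin/mesos-master.sh --ip=127.0.0.1 &
  ./bin/mesos-slave.sh --master=127.0.0.1:5050 &
  cd hadoop/hadoop-0.20.2-cdh3u3
  ./bin/hadoop jobtracker &

  # step 9: submit a wordcount job; it stays pending because no TaskTracker can be launched
  mkdir -p /tmp/wc-input && cp conf/*.xml /tmp/wc-input
  ./bin/hadoop jar hadoop-examples-0.20.2-cdh3u3.jar wordcount /tmp/wc-input /tmp/wc-output

  # step 10: from another terminal, kill the job using the id printed by the client
  ./bin/hadoop job -kill job_201304121621_0001

With the slot memory set far above what the local slave offers, the scheduler keeps declining offers, which is exactly the "Declining offer with insufficient resources for a TaskTracker" pattern in the JobTracker log quoted further down.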
On Mon, Apr 15, 2013 at 10:52 AM, 王国栋 <[email protected]> wrote:

I am building the latest code in the trunk. I will keep you updated; if the problem is still there, I will give you the exact reproduction steps and make sure you can debug it.

I also think this may be a bug in the jobtracker. :)

Thanks.

Guodong

On Mon, Apr 15, 2013 at 10:20 AM, Benjamin Mahler <[email protected]> wrote:

On April 9th we submitted a deadlock fix; please update to make sure you have it: https://reviews.apache.org/r/10352/

Can you provide the commands to reproduce from a mesos build directory? I'd like to be able to reproduce this locally on my laptop, so the exact commands I need to run from within my mesos build directory would be useful :)

The fact that the job gets stuck in the JobTracker as PREP makes me think there's a bug in the JobTracker; I imagine people don't often run a JobTracker without any TaskTrackers, which is not the case here.

On Sun, Apr 14, 2013 at 7:04 PM, 王国栋 <[email protected]> wrote:

Hi Ben,

I put my ideas inline. Please check.

On Mon, Apr 15, 2013 at 8:13 AM, Benjamin Mahler <[email protected]> wrote:

> So I'm still a little confused here. From what you showed, it looks like the 'job -kill' command you posted succeeded on the client side?
>
>   [trunk ?]$ ./bin/hadoop job -kill job_201304121621_0001
>   13/04/12 16:27:16 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
>   *Killed job job_201304121621_0001*

Yes, from the client side, killing the job is successful.

> I see that the JobTracker still thinks the job is pending. What happens when you re-issue that kill command?

The JobTracker still thinks the status of the job is *PREP*; when I reissue the kill cmd, it seems the same as the first time I issued it.

> I'm confused as to why it's still pending when the JobTracker has removed the job:
>
>   13/04/12 16:27:16 INFO mapred.JobTracker: Killing job job_201304121621_0001
>   13/04/12 16:27:16 INFO mapred.JobInProgress: Killing job 'job_201304121621_0001'
>
> Looking at the JobTracker code, it seems like this indeed kills the job:
>
> The code that prints the log line in JobTracker:
>
>   private synchronized void killJob(JobInProgress job) {
>     LOG.info("Killing job " + job.getJobID());
>     JobStatus prevStatus = (JobStatus)job.getStatus().clone();
>     job.kill();
>
>     // Inform the listeners if the job is killed
>     // Note :
>     //   If the job is killed in the PREP state then the listeners will be invoked
>     //   If the job is killed in the RUNNING state then cleanup tasks will be launched
>     //   and the updateTaskStatuses() will take care of it
>     JobStatus newStatus = (JobStatus)job.getStatus().clone();
>     if (prevStatus.getRunState() != newStatus.getRunState()
>         && newStatus.getRunState() == JobStatus.KILLED) {
>       JobStatusChangeEvent event =
>         new JobStatusChangeEvent(job, EventType.RUN_STATE_CHANGED, prevStatus, newStatus);
>       updateJobInProgressListeners(event);
>     }
>   }
>
> Then JobInProgress.kill():
>
>   /**
>    * Kill the job and all its component tasks. This method should be called from
>    * jobtracker and should return fast as it locks the jobtracker.
>    */
>   public void kill() {
>     boolean killNow = false;
>     synchronized(jobInitKillStatus) {
>       jobInitKillStatus.killed = true;
>       // if not in middle of init, terminate it now
>       if(!jobInitKillStatus.initStarted || jobInitKillStatus.initDone) {
>         // avoiding nested locking by setting flag
>         killNow = true;
>       }
>     }
>     if(killNow) {
>       terminate(JobStatus.KILLED);
>     }
>   }
>
> I don't really see the issue at this point, so any further information / debugging on your end that reveals this bug would be very valuable to us.

I am trying to debug this on my laptop. I found that when I issue the kill cmd to the jobtracker, the job status never changes; it is always *PREP*. Can you reproduce this on your machine by following the steps I mentioned in the previous mail?

> Lastly, what version of the code are you running? If you're running off trunk, when did you last update it?

I am running with the code in the trunk; it was last updated last week.
> Ben

On Fri, Apr 12, 2013 at 1:32 AM, 王国栋 <[email protected]> wrote:

Hi Vinod,

When I submit the job, the jobtracker log is as follows:
----------------------------------
13/04/12 16:22:37 INFO mapred.MesosScheduler: Added job job_201304121621_0001
13/04/12 16:22:37 INFO mapred.JobTracker: Job job_201304121621_0001 added successfully for user 'guodong' to queue 'default'
13/04/12 16:22:37 INFO mapred.JobTracker: Initializing job_201304121621_0001
13/04/12 16:22:37 INFO mapred.JobInProgress: Initializing job_201304121621_0001
13/04/12 16:22:37 INFO mapred.AuditLogger: USER=guodong IP=127.0.0.1 OPERATION=SUBMIT_JOB TARGET=job_201304121621_0001 RESULT=SUCCESS
13/04/12 16:22:37 INFO mapred.JobInProgress: jobToken generated and stored with users keys in /tmp/hadoop-guodong/mapred/system/job_201304121621_0001/jobToken
13/04/12 16:22:37 INFO mapred.JobInProgress: Input size for job job_201304121621_0001 = 89502988. Number of splits = 3
13/04/12 16:22:37 INFO net.NetworkTopology: Adding a new node: /default-rack/localhost
13/04/12 16:22:37 INFO mapred.JobInProgress: tip:task_201304121621_0001_m_000000 has split on node:/default-rack/localhost
13/04/12 16:22:37 INFO mapred.JobInProgress: tip:task_201304121621_0001_m_000001 has split on node:/default-rack/localhost
13/04/12 16:22:37 INFO mapred.JobInProgress: tip:task_201304121621_0001_m_000002 has split on node:/default-rack/localhost
13/04/12 16:22:37 INFO mapred.JobInProgress: job_201304121621_0001 LOCALITY_WAIT_FACTOR=1.0
13/04/12 16:22:37 INFO mapred.JobInProgress: Job job_201304121621_0001 initialized successfully with 3 map tasks and 1 reduce tasks.
13/04/12 16:22:39 INFO mapred.MesosScheduler: JobTracker Status
      Pending Map Tasks: 3
   Pending Reduce Tasks: 1
         Idle Map Slots: 0
      Idle Reduce Slots: 0
     Inactive Map Slots: 0 (launched but no hearbeat yet)
  Inactive Reduce Slots: 0 (launched but no hearbeat yet)
       Needed Map Slots: 3
    Needed Reduce Slots: 1
13/04/12 16:22:39 INFO mapred.MesosScheduler: Declining offer with insufficient resources for a TaskTracker:
  cpus: offered 4.0 needed 1.800000011920929
  mem : offered 2731.0 needed 6400.0
  disk: offered 75120.0 needed 4096.0
  ports:  at least 2 (sufficient)
[name: "cpus"
type: SCALAR
scalar {
  value: 4.0
}
, name: "mem"
type: SCALAR
scalar {
  value: 2731.0
}
, name: "ports"
type: RANGES
ranges {
  range {
    begin: 31000
    end: 32000
  }
}
, name: "disk"
type: SCALAR
scalar {
  value: 75120.0
}
]
13/04/12 16:22:39 INFO mapred.MesosScheduler: Unable to fully satisfy needed map/reduce slots: 3 map slots 1 reduce slots remaining
13/04/12 16:22:45 INFO mapred.MesosScheduler: JobTracker Status
      Pending Map Tasks: 3
   Pending Reduce Tasks: 1
         Idle Map Slots: 0
      Idle Reduce Slots: 0
     Inactive Map Slots: 0 (launched but no hearbeat yet)
  Inactive Reduce Slots: 0 (launched but no hearbeat yet)
       Needed Map Slots: 3
    Needed Reduce Slots: 1
13/04/12 16:22:45 INFO mapred.MesosScheduler: Declining offer with insufficient resources for a TaskTracker:
  cpus: offered 4.0 needed 1.800000011920929
  mem : offered 2731.0 needed 6400.0
  disk: offered 75120.0 needed 4096.0
  ports:  at least 2 (sufficient)
[name: "cpus"
type: SCALAR
scalar {
  value: 4.0
}
, name: "mem"
type: SCALAR
scalar {
  value: 2731.0
}
, name: "ports"
type: RANGES
ranges {
  range {
    begin: 31000
    end: 32000
  }
}
, name: "disk"
type: SCALAR
scalar {
  value: 75120.0
}
]
13/04/12 16:22:45 INFO mapred.MesosScheduler: Unable to fully satisfy needed map/reduce slots: 3 map slots 1 reduce slots remaining
----------------------------------

The hadoop client log is:
--------------------------------------------
13/04/12 16:22:36 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/12 16:22:36 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/12 16:22:37 INFO input.FileInputFormat: Total input paths to process : 1
13/04/12 16:22:37 WARN snappy.LoadSnappy: Snappy native library is available
13/04/12 16:22:37 INFO snappy.LoadSnappy: Snappy native library loaded
13/04/12 16:22:37 INFO mapred.JobClient: Running job: job_201304121621_0001
13/04/12 16:22:38 INFO mapred.JobClient:  map 0% reduce 0%
--------------------------------------------

Since the client hangs, I use ctrl-c to stop it, then use job -status to check the job status:

guodong@guodong-Vostro-3400:~/workspace/mesos-trunk/build/hadoop/hadoop-0.20.2-cdh3u3
[trunk ?]$ ./bin/hadoop job -status job_201304121621_0001
13/04/12 16:26:22 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.

Job: job_201304121621_0001
file: file:/tmp/hadoop-guodong/mapred/staging/guodong/.staging/job_201304121621_0001/job.xml
tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201304121621_0001
map() completion: 0.0
reduce() completion: 0.0
Counters: 0

Then I try to kill the job:

guodong@guodong-Vostro-3400:~/workspace/mesos-trunk/build/hadoop/hadoop-0.20.2-cdh3u3
[trunk ?]$ ./bin/hadoop job -kill job_201304121621_0001
13/04/12 16:27:16 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
Killed job job_201304121621_0001

When I kill the job, I can see this in the jobtracker log:

*13/04/12 16:27:13 INFO mapred.MesosScheduler: Unable to fully satisfy needed map/reduce slots: 3 map slots 1 reduce slots remaining*
*13/04/12 16:27:16 INFO mapred.JobTracker: Killing job job_201304121621_0001*
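Based on Guodong's analysis above, the stuck job should self-correct as soon as some slave makes an offer large enough for a TaskTracker, because that finally lets the queued job cleanup task run. A rough way to force that locally is sketched below; the --resources flag values are illustrative assumptions, not taken from the thread:

  # Restart the mesos-slave so it advertises enough memory for one TaskTracker
  # (the value just needs to exceed whatever the scheduler logs as "needed").
  ./bin/mesos-slave.sh --master=127.0.0.1:5050 \
    --resources='cpus:4;mem:16384;disk:20480;ports:[31000-32000]'

  # Once a TaskTracker registers and the cleanup task completes, the JobTracker
  # should move the job from PREP to KILLED and release its pending slots.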
