I see. So it looks like Hadoop doesn't differentiate between not launching
setup tasks and launching setup tasks but waiting to get to the RUNNING state?
That's kind of unfortunate.

Anyway, as long as this is the expected behavior for Hadoop users, I'm fine for
now.

@vinodkone
Sent from my mobile 

On Apr 17, 2013, at 8:04 AM, 王国栋 <[email protected]> wrote:

> Hi Vinod,
> 
> I think it is reasonable that Hadoop runs the job cleanup task, since every
> job status change is triggered by the completion of some task; e.g. the job
> status changes from PREP to RUNNING once the job setup task is completed (if
> the job has not been killed yet).
> 
> For any job, once it is killed, no matter whether the previous status was
> RUNNING or PREP, the job cleanup task will be scheduled to a TaskTracker.
> When the job cleanup task finishes, the job status will be set to KILLED or
> FAILED. So the logic for changing the job status is the same regardless of
> the previous job status.
> 
> Anyway, it makes sense to me, so I don't think this is a bug :)
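> 
> To make that concrete, here is a minimal, hypothetical sketch (not the real
> JobTracker/JobInProgress code; all names below are made up) of a job state
> machine in which every visible transition is driven by a task completion
> report:
> 
> // Hypothetical sketch only: kill() merely records a request, while the
> // externally visible state changes only when a task reports completion.
> enum SketchJobState { PREP, RUNNING, KILLED, SUCCEEDED }
> 
> class SketchJob {
>   private SketchJobState state = SketchJobState.PREP;
>   private boolean killRequested = false;
> 
>   // kill() records the request; the visible state is not touched here.
>   void kill() { killRequested = true; }
> 
>   // Reported by a TaskTracker when the setup task finishes.
>   void setupTaskCompleted() {
>     if (state == SketchJobState.PREP && !killRequested) {
>       state = SketchJobState.RUNNING;
>     }
>   }
> 
>   // Reported by a TaskTracker when the cleanup task finishes. Only now does
>   // a killed job reach a terminal state, whether it was PREP or RUNNING.
>   void cleanupTaskCompleted() {
>     if (killRequested) {
>       state = SketchJobState.KILLED;
>     }
>   }
> 
>   SketchJobState state() { return state; }
> 
>   public static void main(String[] args) {
>     SketchJob job = new SketchJob();
>     job.kill();
>     System.out.println(job.state()); // still PREP: no cleanup task ran yet
>     job.cleanupTaskCompleted();
>     System.out.println(job.state()); // KILLED
>   }
> }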
> 
> Guodong
> 
> 
> On Wed, Apr 17, 2013 at 10:52 PM, Vinod Kone <[email protected]> wrote:
> 
>> Thanks for digging through the code. Your analysis makes sense. Though it
>> is strange that Hadoop wants to run a cleanup task for a job that was never
>> launched. Maybe you could file a bug for Hadoop?
>> 
>> @vinodkone
>> Sent from my mobile
>> 
>> On Apr 17, 2013, at 2:30 AM, 王国栋 <[email protected]> wrote:
>> 
>>> Hi Ben,
>>> 
>>> I have read some of the JobTracker and TaskTracker code over the last two
>>> days.
>>> 
>>> I don't think this scenario is a bug, neither in the Mesos scheduler nor in
>>> the JobTracker. The following is my opinion.
>>> 
>>> We kill the job while the job status is PREP (which means the setup task has
>>> not been launched yet). When we kill the job, the JobTracker just marks two
>>> boolean values in JobInProgress as true, meaning this job is killed and
>>> failed. But killing a job will not trigger any job status change until the
>>> job cleanup task is scheduled to a TaskTracker.
>>> 
>>> The job status stays in PREP just because Mesos cannot launch a TaskTracker
>>> due to insufficient resources, which is a bit awkward. When there are enough
>>> resources for Mesos to launch a TaskTracker (say, some other mesos-slave
>>> offers enough to start one), Mesos will launch a TaskTracker and the job
>>> cleanup task will be scheduled to it. Finally, after the cleanup task
>>> completes, the job status will be updated to *KILLED* and the pending map
>>> and reduce slots will be released too.
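>>> 
>>> As a rough illustration of the stuck case (hypothetical code, not the real
>>> scheduler; the numbers are taken from the MesosScheduler "Declining offer"
>>> log quoted later in this thread), the cleanup task just sits in a pending
>>> queue until an offer is big enough to launch a TaskTracker:
>>> 
>>> import java.util.ArrayDeque;
>>> import java.util.Queue;
>>> 
>>> // Hypothetical sketch: the kill flags are already set, but the cleanup task
>>> // can only run once a TaskTracker exists, and no offer is large enough.
>>> class StuckCleanupSketch {
>>>   public static void main(String[] args) {
>>>     double offeredMem = 2731.0;  // from the "Declining offer" log
>>>     double neededMem = 6400.0;   // memory the TaskTracker would need
>>> 
>>>     Queue<String> pending = new ArrayDeque<String>();
>>>     pending.add("job_201304121621_0001 cleanup task"); // runnable once the job is flagged as killed
>>> 
>>>     boolean canLaunchTaskTracker = offeredMem >= neededMem;
>>>     if (!canLaunchTaskTracker) {
>>>       // No TaskTracker ever heartbeats, the cleanup task is never handed out,
>>>       // so the job status reported to clients stays PREP.
>>>       System.out.println("still pending: " + pending.peek());
>>>     }
>>>   }
>>> }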
>>> 
>>> If anything is wrong, please correct me. :)
>>> 
>>> 
>>> Guodong
>>> 
>>> 
>>> On Tue, Apr 16, 2013 at 1:02 AM, Benjamin Mahler
>>> <[email protected]>wrote:
>>> 
>>>> Thanks! It would be helpful to me if you could provide the exact commands
>>>> you're running for steps 8, 9 and 10 as well. It would save me some time, as
>>>> my knowledge of Hadoop commands is limited.
>>>> 
>>>> 
>>>> On Sun, Apr 14, 2013 at 9:53 PM, 王国栋 <[email protected]> wrote:
>>>> 
>>>>> Hi Ben,
>>>>> 
>>>>> I've updated to the latest code in trunk, and the problem is still there.
>>>>> 
>>>>> Please follow these steps to reproduce it.
>>>>> 1. check out the trunk code
>>>>> 2. bootstrap
>>>>> 3. mkdir build for an out-of-source build
>>>>> 4. cd build && ../configure
>>>>> 5. make
>>>>> 6. cd hadoop && make hadoop-0.20.2-cdh3u3
>>>>> 7. modify conf/mapred-site.xml, set mapred.mesos.slot.mem=10240 (make sure
>>>>> the resource is not enough)
>>>>> 8. start mesos and the jobtracker
>>>>> 9. submit a wordcount job to the jobtracker (at this point the job is
>>>>> pending due to insufficient resources)
>>>>> 10. kill the job (the job cannot be killed; see the sketch below)
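>>>>> 
>>>>> For step 10, this is roughly what "./bin/hadoop job -kill <job-id>" does
>>>>> through the client API. This is a hedged sketch (KillPendingJob is not part
>>>>> of Hadoop, and the job id is just the example one from this thread); it
>>>>> assumes the standard Hadoop 0.20 mapred client classes:
>>>>> 
>>>>> import java.io.IOException;
>>>>> import org.apache.hadoop.mapred.JobClient;
>>>>> import org.apache.hadoop.mapred.JobConf;
>>>>> import org.apache.hadoop.mapred.JobID;
>>>>> import org.apache.hadoop.mapred.JobStatus;
>>>>> import org.apache.hadoop.mapred.RunningJob;
>>>>> 
>>>>> public class KillPendingJob {
>>>>>   public static void main(String[] args) throws IOException {
>>>>>     JobConf conf = new JobConf();            // picks up conf/mapred-site.xml
>>>>>     JobClient client = new JobClient(conf);  // connects to the JobTracker
>>>>>     RunningJob job = client.getJob(JobID.forName("job_201304121621_0001"));
>>>>> 
>>>>>     System.out.println("before kill, PREP? "
>>>>>         + (job.getJobState() == JobStatus.PREP));
>>>>>     job.killJob();  // the same client call that "job -kill" issues
>>>>> 
>>>>>     // With no TaskTracker available, the cleanup task never runs, so the
>>>>>     // state reported here stays PREP instead of moving to KILLED.
>>>>>     System.out.println("after kill, PREP? "
>>>>>         + (job.getJobState() == JobStatus.PREP));
>>>>>   }
>>>>> }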
>>>>> 
>>>>> 
>>>>> Guodong
>>>>> 
>>>>> 
>>>>> On Mon, Apr 15, 2013 at 10:52 AM, 王国栋 <[email protected]> wrote:
>>>>> 
>>>>>> I am building the latest code in the trunk. I will keep you updated; if
>>>>>> the problem is still there, I will give you the exact reproduction process
>>>>>> and make sure you can debug it.
>>>>>> 
>>>>>> I also think this may be a bug in jobtracker. :)
>>>>>> 
>>>>>> Thanks.
>>>>>> 
>>>>>> Guodong
>>>>>> 
>>>>>> 
>>>>>> On Mon, Apr 15, 2013 at 10:20 AM, Benjamin Mahler <
>>>>>> [email protected]> wrote:
>>>>>> 
>>>>>>> On April 9th we submitted a deadlock fix; please update to make sure you
>>>>>>> have the fix:
>>>>>>> https://reviews.apache.org/r/10352/
>>>>>>> 
>>>>>>> Can you provide the commands to reproduce from a mesos build directory?
>>>>>>> I'd like to be able to reproduce this locally on my laptop, so the exact
>>>>>>> commands I need to run from within my mesos build directory would be
>>>>>>> useful :)
>>>>>>> 
>>>>>>> The fact that the job gets stuck in the JobTracker as PREP makes me think
>>>>>>> there's a bug in the JobTracker; I imagine people don't often run
>>>>>>> JobTrackers without any TaskTrackers, which is the situation here.
>>>>>>> 
>>>>>>> 
>>>>>>> On Sun, Apr 14, 2013 at 7:04 PM, 王国栋 <[email protected]> wrote:
>>>>>>> 
>>>>>>>> Hi Ben,
>>>>>>>> 
>>>>>>>> I put my ideas inline. Please check.
>>>>>>>> 
>>>>>>>> On Mon, Apr 15, 2013 at 8:13 AM, Benjamin Mahler
>>>>>>>> <[email protected]>wrote:
>>>>>>>> 
>>>>>>>>> So I'm still a little confused here. From what you showed, it looks like
>>>>>>>>> the 'job -kill' command you posted succeeded on the client side?
>>>>>>>>> 
>>>>>>>>> [trunk ?]$ ./bin/hadoop job -kill job_201304121621_0001
>>>>>>>>> 13/04/12 16:27:16 INFO security.UserGroupInformation: JAAS Configuration
>>>>>>>>> already set up for Hadoop, not re-installing.
>>>>>>>>> *Killed job job_201304121621_0001*
>>>>>>>> Yes, from the client side, killing the job is successful.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I see that the JobTracker still thinks the job is pending. What happens
>>>>>>>>> when you re-issue that kill command?
>>>>>>>> The JobTracker still thinks the job status is *PREP*; when I reissue the
>>>>>>>> kill command, it behaves the same as the first time I issued it.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I'm confused as to why it's still pending when the JobTracker has
>>>>>>>>> removed the job:
>>>>>>>>> 
>>>>>>>>> 13/04/12 16:27:16 INFO mapred.JobTracker: Killing job job_201304121621_0001
>>>>>>>>> 13/04/12 16:27:16 INFO mapred.JobInProgress: Killing job
>>>>>>>>> 'job_201304121621_0001'
>>>>>>>>> 
>>>>>>>>> Looking at the JobTracker code, it seems like this indeed kills the job:
>>>>>>>>> 
>>>>>>>>> The code that prints the log line in JobTracker:
>>>>>>>>> private synchronized void killJob(JobInProgress job) {
>>>>>>>>>   LOG.info("Killing job " + job.getJobID());
>>>>>>>>>   JobStatus prevStatus = (JobStatus)job.getStatus().clone();
>>>>>>>>>   job.kill();
>>>>>>>>> 
>>>>>>>>>   // Inform the listeners if the job is killed
>>>>>>>>>   // Note :
>>>>>>>>>   //   If the job is killed in the PREP state then the listeners will be
>>>>>>>>>   //   invoked
>>>>>>>>>   //   If the job is killed in the RUNNING state then cleanup tasks will be
>>>>>>>>>   //   launched and the updateTaskStatuses() will take care of it
>>>>>>>>>   JobStatus newStatus = (JobStatus)job.getStatus().clone();
>>>>>>>>>   if (prevStatus.getRunState() != newStatus.getRunState()
>>>>>>>>>       && newStatus.getRunState() == JobStatus.KILLED) {
>>>>>>>>>     JobStatusChangeEvent event =
>>>>>>>>>       new JobStatusChangeEvent(job, EventType.RUN_STATE_CHANGED, prevStatus,
>>>>>>>>>           newStatus);
>>>>>>>>>     updateJobInProgressListeners(event);
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> Then JobInProgress.kill():
>>>>>>>>> /**
>>>>>>>>>  * Kill the job and all its component tasks. This method should be called
>>>>>>>>>  * from jobtracker and should return fast as it locks the jobtracker.
>>>>>>>>>  */
>>>>>>>>> public void kill() {
>>>>>>>>>   boolean killNow = false;
>>>>>>>>>   synchronized(jobInitKillStatus) {
>>>>>>>>>     jobInitKillStatus.killed = true;
>>>>>>>>>     //if not in middle of init, terminate it now
>>>>>>>>>     if(!jobInitKillStatus.initStarted || jobInitKillStatus.initDone) {
>>>>>>>>>       //avoiding nested locking by setting flag
>>>>>>>>>       killNow = true;
>>>>>>>>>     }
>>>>>>>>>   }
>>>>>>>>>   if(killNow) {
>>>>>>>>>     terminate(JobStatus.KILLED);
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> I don't really see the issue at this point, so any further information /
>>>>>>>>> debugging on your end that reveals this bug would be very valuable to us.
>>>>>>>> I am trying to debug this on my laptop. I found that when I issue the kill
>>>>>>>> command to the JobTracker, the job status never changes; it is always *PREP*.
>>>>>>>> Can you reproduce this on your machine if you follow the steps I mentioned
>>>>>>>> in the previous mail?
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Lastly, what version of the code are you running? If you're running off
>>>>>>>>> trunk, when did you last update it?
>>>>>>>> 
>>>>>>>> I am running the code from trunk; I last updated it last week.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Ben
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Apr 12, 2013 at 1:32 AM, 王国栋 <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Vinod,
>>>>>>>>>> 
>>>>>>>>>> When I submit the job, the JobTracker log is as follows:
>>>>>>>>>> ----------------------------------
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.MesosScheduler: Added job
>>>>>>>>>> job_201304121621_0001
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobTracker: Job
>>>>> job_201304121621_0001
>>>>>>>> added
>>>>>>>>>> successfully for user 'guodong' to queue 'default'
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobTracker: Initializing
>>>>>>>>>> job_201304121621_0001
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress: Initializing
>>>>>>>>>> job_201304121621_0001
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.AuditLogger: USER=guodong
>>>>> IP=127.0.0.1
>>>>>>>>>> OPERATION=SUBMIT_JOB TARGET=job_201304121621_0001 RESULT=SUCCESS
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress: jobToken generated
>>>>> and
>>>>>>>>> stored
>>>>>>>>>> with users keys in
>>>>>>>>>> /tmp/hadoop-guodong/mapred/system/job_201304121621_0001/jobToken
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress: Input size for job
>>>>>>>>>> job_201304121621_0001 = 89502988. Number of splits = 3
>>>>>>>>>> 13/04/12 16:22:37 INFO net.NetworkTopology: Adding a new node:
>>>>>>>>>> /default-rack/localhost
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress:
>>>>>>>>>> tip:task_201304121621_0001_m_000000 has split on
>>>>>>>>>> node:/default-rack/localhost
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress:
>>>>>>>>>> tip:task_201304121621_0001_m_000001 has split on
>>>>>>>>>> node:/default-rack/localhost
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress:
>>>>>>>>>> tip:task_201304121621_0001_m_000002 has split on
>>>>>>>>>> node:/default-rack/localhost
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress:
>>>> job_201304121621_0001
>>>>>>>>>> LOCALITY_WAIT_FACTOR=1.0
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobInProgress: Job
>>>>>>> job_201304121621_0001
>>>>>>>>>> initialized successfully with 3 map tasks and 1 reduce tasks.
>>>>>>>>>> 13/04/12 16:22:39 INFO mapred.MesosScheduler: JobTracker Status
>>>>>>>>>>     Pending Map Tasks: 3
>>>>>>>>>>  Pending Reduce Tasks: 1
>>>>>>>>>>        Idle Map Slots: 0
>>>>>>>>>>     Idle Reduce Slots: 0
>>>>>>>>>>    Inactive Map Slots: 0 (launched but no hearbeat yet)
>>>>>>>>>> Inactive Reduce Slots: 0 (launched but no hearbeat yet)
>>>>>>>>>>      Needed Map Slots: 3
>>>>>>>>>>   Needed Reduce Slots: 1
>>>>>>>>>> 13/04/12 16:22:39 INFO mapred.MesosScheduler: Declining offer
>>>> with
>>>>>>>>>> insufficient resources for a TaskTracker:
>>>>>>>>>> cpus: offered 4.0 needed 1.800000011920929
>>>>>>>>>> mem : offered 2731.0 needed 6400.0
>>>>>>>>>> disk: offered 75120.0 needed 4096.0
>>>>>>>>>> ports:  at least 2 (sufficient)
>>>>>>>>>> [name: "cpus"
>>>>>>>>>> type: SCALAR
>>>>>>>>>> scalar {
>>>>>>>>>> value: 4.0
>>>>>>>>>> }
>>>>>>>>>> , name: "mem"
>>>>>>>>>> type: SCALAR
>>>>>>>>>> scalar {
>>>>>>>>>> value: 2731.0
>>>>>>>>>> }
>>>>>>>>>> , name: "ports"
>>>>>>>>>> type: RANGES
>>>>>>>>>> ranges {
>>>>>>>>>> range {
>>>>>>>>>>   begin: 31000
>>>>>>>>>>   end: 32000
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>> , name: "disk"
>>>>>>>>>> type: SCALAR
>>>>>>>>>> scalar {
>>>>>>>>>> value: 75120.0
>>>>>>>>>> }
>>>>>>>>>> ]
>>>>>>>>>> 13/04/12 16:22:39 INFO mapred.MesosScheduler: Unable to fully
>>>>>>> satisfy
>>>>>>>>>> needed map/reduce slots: 3 map slots 1 reduce slots remaining
>>>>>>>>>> 13/04/12 16:22:45 INFO mapred.MesosScheduler: JobTracker Status
>>>>>>>>>>     Pending Map Tasks: 3
>>>>>>>>>>  Pending Reduce Tasks: 1
>>>>>>>>>>        Idle Map Slots: 0
>>>>>>>>>>     Idle Reduce Slots: 0
>>>>>>>>>>    Inactive Map Slots: 0 (launched but no hearbeat yet)
>>>>>>>>>> Inactive Reduce Slots: 0 (launched but no hearbeat yet)
>>>>>>>>>>      Needed Map Slots: 3
>>>>>>>>>>   Needed Reduce Slots: 1
>>>>>>>>>> 13/04/12 16:22:45 INFO mapred.MesosScheduler: Declining offer
>>>> with
>>>>>>>>>> insufficient resources for a TaskTracker:
>>>>>>>>>> cpus: offered 4.0 needed 1.800000011920929
>>>>>>>>>> mem : offered 2731.0 needed 6400.0
>>>>>>>>>> disk: offered 75120.0 needed 4096.0
>>>>>>>>>> ports:  at least 2 (sufficient)
>>>>>>>>>> [name: "cpus"
>>>>>>>>>> type: SCALAR
>>>>>>>>>> scalar {
>>>>>>>>>> value: 4.0
>>>>>>>>>> }
>>>>>>>>>> , name: "mem"
>>>>>>>>>> type: SCALAR
>>>>>>>>>> scalar {
>>>>>>>>>> value: 2731.0
>>>>>>>>>> }
>>>>>>>>>> , name: "ports"
>>>>>>>>>> type: RANGES
>>>>>>>>>> ranges {
>>>>>>>>>> range {
>>>>>>>>>>   begin: 31000
>>>>>>>>>>   end: 32000
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>> , name: "disk"
>>>>>>>>>> type: SCALAR
>>>>>>>>>> scalar {
>>>>>>>>>> value: 75120.0
>>>>>>>>>> }
>>>>>>>>>> ]
>>>>>>>>>> 13/04/12 16:22:45 INFO mapred.MesosScheduler: Unable to fully
>>>>>>> satisfy
>>>>>>>>>> needed map/reduce slots: 3 map slots 1 reduce slots remaining
>>>>>>>>>> 
>>>>>>>>>> ----------------------------------
>>>>>>>>>> 
>>>>>>>>>> The Hadoop client log is:
>>>>>>>>>> --------------------------------------------
>>>>>>>>>> 13/04/12 16:22:36 INFO security.UserGroupInformation: JAAS Configuration
>>>>>>>>>> already set up for Hadoop, not re-installing.
>>>>>>>>>> 13/04/12 16:22:36 INFO util.NativeCodeLoader: Loaded the native-hadoop
>>>>>>>>>> library
>>>>>>>>>> 13/04/12 16:22:37 INFO input.FileInputFormat: Total input paths to
>>>>>>>>>> process : 1
>>>>>>>>>> 13/04/12 16:22:37 WARN snappy.LoadSnappy: Snappy native library is
>>>>>>>>>> available
>>>>>>>>>> 13/04/12 16:22:37 INFO snappy.LoadSnappy: Snappy native library loaded
>>>>>>>>>> 13/04/12 16:22:37 INFO mapred.JobClient: Running job: job_201304121621_0001
>>>>>>>>>> 13/04/12 16:22:38 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>>>>>> 
>>>>>>>>>> --------------------------------------------
>>>>>>>>>> 
>>>>>>>>>> Since the client is hung, I use Ctrl-C to stop the client, then use
>>>>>>>>>> 'job -status' to check the job status.
>>>>>>>>>> 
>>>>>>>>>> guodong@guodong-Vostro-3400:~/workspace/mesos-trunk/build/hadoop/hadoop-0.20.2-cdh3u3
>>>>>>>>>> [trunk ?]$ ./bin/hadoop job -status job_201304121621_0001
>>>>>>>>>> 13/04/12 16:26:22 INFO security.UserGroupInformation: JAAS Configuration
>>>>>>>>>> already set up for Hadoop, not re-installing.
>>>>>>>>>> 
>>>>>>>>>> Job: job_201304121621_0001
>>>>>>>>>> file: file:/tmp/hadoop-guodong/mapred/staging/guodong/.staging/job_201304121621_0001/job.xml
>>>>>>>>>> tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201304121621_0001
>>>>>>>>>> map() completion: 0.0
>>>>>>>>>> reduce() completion: 0.0
>>>>>>>>>> Counters: 0
>>>>>>>>>> 
>>>>>>>>>> Then try to kill the job:
>>>>>>>>>> guodong@guodong-Vostro-3400:~/workspace/mesos-trunk/build/hadoop/hadoop-0.20.2-cdh3u3
>>>>>>>>>> [trunk ?]$ ./bin/hadoop job -kill job_201304121621_0001
>>>>>>>>>> 13/04/12 16:27:16 INFO security.UserGroupInformation: JAAS Configuration
>>>>>>>>>> already set up for Hadoop, not re-installing.
>>>>>>>>>> Killed job job_201304121621_0001
>>>>>>>>>> 
>>>>>>>>>> When I kill the job, I can see this log on the JobTracker:
>>>>>>>>>> *13/04/12 16:27:13 INFO mapred.MesosScheduler: Unable to fully satisfy
>>>>>>>>>> needed map/reduce slots: 3 map slots 1 reduce slots remaining*
>>>>>>>>>> *13/04/12 16:27:16 INFO mapred.JobTracker: Killing job
>>>>>>>>>> job_201304121621_0001*
>> 
