This is the specific information I referred to in my post.

http://hadoop.apache.org/common/docs/r0.20.0/fair_scheduler.html

mapred.fairscheduler.loadmanager: An extensibility point that lets you
specify a class that determines how many maps and reduces can run on a given
TaskTracker. This class should implement the LoadManager interface. By
default the task caps in the Hadoop config file are used, but this option
could be used, for example, to base the load on available memory and CPU
utilization.
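
For illustration, here is a rough sketch of what such a plug-in could look
like. It assumes the 0.20-era fair scheduler contrib, where LoadManager is
(as far as I recall) an abstract class with canAssignMap/canAssignReduce
hooks; the class name and drain-file path are made up, and the exact method
names should be checked against the fair scheduler source in your release:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    import org.apache.hadoop.mapred.LoadManager;
    import org.apache.hadoop.mapred.TaskTrackerStatus;

    // Hypothetical example, not part of any Hadoop release: a LoadManager
    // that refuses new tasks on TaskTrackers listed in a "drain file" and
    // otherwise falls back to the per-tracker slot caps.
    public class DrainAwareLoadManager extends LoadManager {

      // One hostname per line, written by the batch job shortly before its
      // walltime expires (path is made up).
      private static final String DRAIN_FILE =
          "/lustre/shared/hadoop/draining-trackers";

      @Override
      public boolean canAssignMap(TaskTrackerStatus tracker,
          int totalRunnableMaps, int totalMapSlots) {
        if (isDraining(tracker.getHost())) {
          return false;               // no new maps on a draining tracker
        }
        return tracker.countMapTasks() < tracker.getMaxMapTasks();
      }

      @Override
      public boolean canAssignReduce(TaskTrackerStatus tracker,
          int totalRunnableReduces, int totalReduceSlots) {
        if (isDraining(tracker.getHost())) {
          return false;               // and no new reduces either
        }
        return tracker.countReduceTasks() < tracker.getMaxReduceTasks();
      }

      // Re-reads the drain file on every call; fine for a sketch, but a
      // real implementation would cache it.
      private boolean isDraining(String host) {
        BufferedReader in = null;
        try {
          in = new BufferedReader(new FileReader(DRAIN_FILE));
          String line;
          while ((line = in.readLine()) != null) {
            if (line.trim().equals(host)) {
              return true;
            }
          }
        } catch (IOException e) {
          // No drain file (or unreadable): treat the tracker as schedulable.
        } finally {
          if (in != null) {
            try { in.close(); } catch (IOException ignored) {}
          }
        }
        return false;
      }
    }

The class would then go on the JobTracker's classpath and be named in
mapred.fairscheduler.loadmanager so the fair scheduler picks it up.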

-Phil





On Sat, Jan 29, 2011 at 8:37 AM, rishi pathak <[email protected]> wrote:

> Hi,
>     Here is a description of what we are trying to achieve (whether it is
> possible or not is still not clear):
> We have large computing clusters used mostly for MPI jobs. We use
> PBS/Torque and Maui for resource allocation and scheduling.
> Utilization is very high most of the time, except for small resource
> pockets of, say, 16 cores for 2-5 hours. We are trying to establish the
> feasibility of using these small (but fixed-size) resource pockets for
> Nutch crawls. Our configuration is:
>
> # Hadoop 0.20.2 (packaged with Nutch)
> # Lustre parallel filesystem for data storage
> # No HDFS
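>
> For reference, a minimal sketch of how a job could be pointed at the
> shared filesystem instead of HDFS (the host name and Lustre paths below
> are made up; the property names are the 0.20 ones):
>
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.mapred.JobConf;
>
>     // Builds a JobConf for a cluster that runs MapReduce directly on a
>     // shared POSIX filesystem (Lustre) with no HDFS.
>     public class SharedFsJobConf {
>       public static JobConf create() {
>         Configuration conf = new Configuration();
>         // No HDFS: use the local/POSIX filesystem as the default FS.
>         conf.set("fs.default.name", "file:///");
>         // JobTracker runs on a login node (hypothetical host:port).
>         conf.set("mapred.job.tracker", "login01.example.org:9001");
>         // The MapReduce system dir must be visible to every TT, so it
>         // lives on the Lustre mount (hypothetical path).
>         conf.set("mapred.system.dir", "/lustre/scratch/hadoop/mapred/system");
>         // Intermediate data can stay on node-local scratch space.
>         conf.set("mapred.local.dir", "/tmp/hadoop-mapred/local");
>         return new JobConf(conf);
>       }
>     }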
>
> We have the JT running on one of the login nodes at all times.
> A resource request (nodes=16, walltime=05 hrs.) is made through the batch
> system, and TTs are provisioned as part of the job. The problem is that
> when a job expires, user processes are cleaned up and the TT gets killed.
> With that, the completed and running map/reduce tasks of the Nutch job are
> lost and have to be rescheduled. Possible solutions, as we see them:
>
> 1. As the filesystem is shared (and persistent), restart tasks on another
> TT and make the intermediate task data available there, i.e. a sort of
> checkpointing.
> 2. TT draining: based on an estimated time for task completion, a TT whose
> walltime is nearing expiry goes into draining mode, i.e. no new tasks are
> scheduled on it.
>
> Option '1' seems very far-fetched to us (we are no Hadoop experts);
> '2' seems to be the more sensible approach.
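>
> To make '2' concrete, the draining check itself could be as simple as the
> following sketch (all names here are hypothetical; the estimate of task
> duration would have to come from job history or be configured):
>
>     // Decide whether a TaskTracker should stop accepting new tasks, given
>     // the walltime expiry of its batch job and a conservative estimate of
>     // how long a newly started task would run.
>     public final class DrainPolicy {
>       private DrainPolicy() {}
>
>       public static boolean shouldDrain(long nowMillis,
>                                         long walltimeExpiryMillis,
>                                         long estimatedTaskMillis,
>                                         long safetyMarginMillis) {
>         long remaining = walltimeExpiryMillis - nowMillis;
>         // Refuse new work if a freshly started task could not finish
>         // (plus a safety margin) before the batch allocation expires.
>         return remaining < estimatedTaskMillis + safetyMarginMillis;
>       }
>     }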
>
> Using an exclude list for TTs will not help, as Koji has already mentioned.
> We looked into the capacity scheduler but didn't find any pointers. Phil,
> which versions of Hadoop have these hooks in the scheduler?
>
> On Sat, Jan 29, 2011 at 3:34 AM, phil young <[email protected]> wrote:
>
>> There are also some hooks available in the schedulers that could be
>> useful. I think they were intended to let you schedule tasks based on the
>> load average on the host, but I'd expect you can customize them for your
>> purpose.
>>
>>
>> On Fri, Jan 28, 2011 at 6:46 AM, Harsh J <[email protected]> wrote:
>>
>> > Moving discussion to the MapReduce-User list:
>> > [email protected]
>> >
>> > Reply inline:
>> >
>> > On Fri, Jan 28, 2011 at 2:39 PM, rishi pathak
>> > <[email protected]> wrote:
>> > > Hi,
>> > >        Is there a way to drain a TaskTracker? What we require is not
>> > > to schedule any more map/reduce tasks onto a TaskTracker (mark it
>> > > offline), while the tasks already running on it are not affected.
>> >
>> > You could simply shut the TT down. MapReduce was designed with faults
>> > in mind, and thus tasks that are running on a particular TaskTracker
>> > can be re-run elsewhere if they fail. Is this not usable in your
>> > case?
>> >
>> > --
>> > Harsh J
>> > www.harshj.com
>> >
>>
>
>
>
> --
> ---
> Rishi Pathak
> National PARAM Supercomputing Facility
> C-DAC, Pune, India
>
>
>
