Re: Mapreduce: scheduling and task placement work

Arun C Murthy Tue, 16 Nov 2010 09:52:26 -0800

Abhishek,

 Welcome.

There isn't a concerted effort afaik, given the complexity of the task at hand - as I'm sure you will appreciate.


 There are several pieces slowly falling in place:

# The CapacityScheduler (CS) already allows for memory-based scheduling (i.e. support for 'High RAM' jobs). This is a subtle change: rather than looking at abstract 'slots', the CS looks at every machine as being made up of real memory slots.
# The TaskTracker in trunk (hadoop-0.22) already reports per-task CPU and memory usage.
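
To make the 'memory slot' idea concrete, here is a minimal sketch, not the actual CapacityScheduler code. It assumes each slot stands for a fixed amount of memory and that a 'High RAM' task occupies as many slots as it takes to cover its memory request. The function names and the sizes in MB are hypothetical.

```python
import math


def slots_needed(task_memory_mb: int, slot_memory_mb: int) -> int:
    """Number of memory slots a task would occupy (hypothetical helper).

    Assumption: a slot represents slot_memory_mb of real memory, and a task
    asking for more than one slot's worth is charged for several slots.
    """
    if task_memory_mb <= 0 or slot_memory_mb <= 0:
        raise ValueError("memory sizes must be positive")
    return math.ceil(task_memory_mb / slot_memory_mb)


def can_schedule(task_memory_mb: int, slot_memory_mb: int, free_slots: int) -> bool:
    """True if the machine has enough free slots to cover the task's memory."""
    return slots_needed(task_memory_mb, slot_memory_mb) <= free_slots
```

With 2048 MB slots, a 4096 MB task would count as two slots under this model, so a machine with only one free slot could not take it.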

Clearly it will take some effort to move from here to truly dynamic slot-less scheduling, since the notion of slots is fairly deeply entrenched in the framework (JobTracker, TaskTracker etc.).
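
For contrast, a slot-less placement check might look roughly like the sketch below. It is purely illustrative and assumes the scheduler sees a node's reported memory and CPU usage plus an estimate of what the task needs. The NodeReport and TaskEstimate types, the fits function, and the 10% headroom are all hypothetical and do not correspond to any existing Hadoop API.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    """Resource usage a node might report (hypothetical structure)."""
    total_memory_mb: int
    used_memory_mb: int
    total_cpu_pct: float  # e.g. 400.0 for a 4-core machine
    used_cpu_pct: float


@dataclass
class TaskEstimate:
    """Estimated resource needs of one task (hypothetical structure)."""
    memory_mb: int
    cpu_pct: float


def fits(node: NodeReport, task: TaskEstimate, headroom: float = 0.1) -> bool:
    """Decide placement from measured load instead of a slot count.

    Assumption: a fraction `headroom` of each resource is kept free so the
    node is never packed completely full.
    """
    mem_limit = node.total_memory_mb * (1 - headroom)
    cpu_limit = node.total_cpu_pct * (1 - headroom)
    return (node.used_memory_mb + task.memory_mb <= mem_limit
            and node.used_cpu_pct + task.cpu_pct <= cpu_limit)
```

The point of the sketch is that nothing in it refers to a slot count: the decision depends only on measured load and an estimate, which is what would have to replace the slot bookkeeping in the JobTracker and TaskTracker.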


 Of course, I don't mean to discourage you!

Feel free to start opening jiras and jotting your thoughts down, make some proposals and get involved!


Arun

On Nov 16, 2010, at 4:51 AM, abhishek sharma wrote:

Hi,

In his e-mail on the Hadoop Common mailing list, Steve Loughran
mentioned the following:

"There's work underway to be more aware of system load when scheduling
things, rather than have a fairly simplistic "slot" model, look more
at system load and memory load as a way of measuring how idle machines
are. If you were to be really devious, you'd look at io load, network,
machine temperature, etc. If you find this an interesting problem to
get involved in, the mapreduce-dev mailing list is the place to get
involved."

I would like to get involved.

I recently finished my PhD in computer science from the Univ. of
Southern California. In one of my projects, I modified the MapReduce
scheduler to implement a particular job priority scheme.

Thanks,
Abhishek
