[jira] Commented: (MAPREDUCE-1603) Add a plugin class for the TaskTracker to determine available slots

Steve Loughran (JIRA) Wed, 17 Mar 2010 10:39:49 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846491#action_12846491
 ]


Steve Loughran commented on MAPREDUCE-1603:
-------------------------------------------

That would imply passing up machine metadata: CPU family/version, OS, etc. No 
reason why that couldn't be done, though you'd have to decide whether it is 
something you'd republish every heartbeat or just when the TT first registers. 
Of course, unless the JT makes decisions on where to route work based on 
those features, it's wasted effort. That implies you also need some plugin 
support for deciding where to run Mappers and Reducers; right now it's fairly 
straightforward: do it close to the data. 
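
To make the placement point concrete, here is a rough sketch of what a pluggable 
placement decision could look like. Everything in it is hypothetical: NodeInfo, 
PlacementPolicy and both policy classes are invented for illustration and are 
not existing JobTracker APIs. It assumes the metadata is sent once at 
registration and that the caller already knows which nodes hold the input data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical view of a TaskTracker as the JobTracker might see it.
final class NodeInfo {
    final String host;
    final Map<String, String> attributes; // e.g. "os" -> "linux"; assumed sent at registration
    final boolean holdsInputData;         // assumed precomputed for the task being placed

    NodeInfo(String host, Map<String, String> attributes, boolean holdsInputData) {
        this.host = host;
        this.attributes = attributes;
        this.holdsInputData = holdsInputData;
    }
}

// Hypothetical plugin point for deciding where a task runs.
interface PlacementPolicy {
    Optional<NodeInfo> choose(List<NodeInfo> candidates);
}

// Approximates the "do it close to the data" rule: prefer a node that
// holds the input, otherwise fall back to the first candidate.
final class DataLocalPolicy implements PlacementPolicy {
    @Override
    public Optional<NodeInfo> choose(List<NodeInfo> candidates) {
        for (NodeInfo node : candidates) {
            if (node.holdsInputData) {
                return Optional.of(node);
            }
        }
        return candidates.isEmpty() ? Optional.empty() : Optional.of(candidates.get(0));
    }
}

// Filters candidates on one reported attribute, then delegates the final choice.
final class RequiredAttributePolicy implements PlacementPolicy {
    private final String key;
    private final String value;
    private final PlacementPolicy delegate;

    RequiredAttributePolicy(String key, String value, PlacementPolicy delegate) {
        this.key = key;
        this.value = value;
        this.delegate = delegate;
    }

    @Override
    public Optional<NodeInfo> choose(List<NodeInfo> candidates) {
        List<NodeInfo> matching = new ArrayList<>();
        for (NodeInfo node : candidates) {
            if (value.equals(node.attributes.get(key))) {
                matching.add(node);
            }
        }
        return delegate.choose(matching);
    }
}
```

Wrapping the data-local rule rather than replacing it keeps today's behaviour 
as the default; the reported metadata only narrows the candidate set.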

> Add a plugin class for the TaskTracker to determine available slots
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1603
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1603
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: tasktracker
>    Affects Versions: 0.22.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> Currently the number of available map and reduce slots is determined by the 
> configuration. MAPREDUCE-922 has proposed working things out automatically, 
> but that is going to depend a lot on the specific tasks, which makes it hard 
> to get right for everyone.
> There is a Hadoop cluster near me that would like to use CPU time from other 
> machines in the room, machines which cannot offer storage, but which will 
> have spare CPU time when they aren't running code scheduled with a grid 
> scheduler. The nodes could run a TT which would report a dynamic number of 
> slots, the number depending upon the current grid workload. 
> I propose we add a plugin point here, so that different people can develop 
> plugin classes that determine the number of available slots based on 
> workload, RAM, CPU, power budget, thermal parameters, etc. Lots of space for 
> customisation and improvement. And by having it as a plugin, people get to 
> integrate with whatever datacentre schedulers they have without Hadoop itself 
> needing to be altered. The base implementation would behave as today: 
> subtract the number of active map and reduce slots from the configured 
> values, and push that out. 
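
As an illustration only, the proposed plugin point might look something like 
the sketch below. The names SlotCalculator, ConfiguredSlotCalculator and 
GridAwareSlotCalculator are made up for this example and do not exist in 
Hadoop; the grid load source is assumed to be a number between 0.0 (idle) and 
1.0 (fully busy) supplied by whatever scheduler is in use.

```java
import java.util.function.DoubleSupplier;

// Hypothetical plugin interface; not an existing TaskTracker API.
interface SlotCalculator {
    /** Map slots to advertise, given the number of map tasks currently running. */
    int availableMapSlots(int activeMapTasks);

    /** Reduce slots to advertise, given the number of reduce tasks currently running. */
    int availableReduceSlots(int activeReduceTasks);
}

// Base behaviour as described in the issue: configured value minus active tasks.
final class ConfiguredSlotCalculator implements SlotCalculator {
    private final int configuredMapSlots;
    private final int configuredReduceSlots;

    ConfiguredSlotCalculator(int configuredMapSlots, int configuredReduceSlots) {
        this.configuredMapSlots = configuredMapSlots;
        this.configuredReduceSlots = configuredReduceSlots;
    }

    @Override
    public int availableMapSlots(int activeMapTasks) {
        return Math.max(0, configuredMapSlots - activeMapTasks); // never report negative slots
    }

    @Override
    public int availableReduceSlots(int activeReduceTasks) {
        return Math.max(0, configuredReduceSlots - activeReduceTasks);
    }
}

// One possible dynamic policy: scale the base answer down by the external grid load.
final class GridAwareSlotCalculator implements SlotCalculator {
    private final SlotCalculator base;
    private final DoubleSupplier gridLoad; // assumed range: 0.0 (idle) to 1.0 (busy)

    GridAwareSlotCalculator(SlotCalculator base, DoubleSupplier gridLoad) {
        this.base = base;
        this.gridLoad = gridLoad;
    }

    private int scale(int slots) {
        // Clamp the reported load so a misbehaving source cannot inflate the slot count.
        double load = Math.min(1.0, Math.max(0.0, gridLoad.getAsDouble()));
        return (int) Math.floor(slots * (1.0 - load));
    }

    @Override
    public int availableMapSlots(int activeMapTasks) {
        return scale(base.availableMapSlots(activeMapTasks));
    }

    @Override
    public int availableReduceSlots(int activeReduceTasks) {
        return scale(base.availableReduceSlots(activeReduceTasks));
    }
}
```

A TaskTracker with 4 configured map slots and 1 running map task would report 
3 free slots through the base class, and fewer through the grid-aware wrapper 
as the grid gets busier.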

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
