[
https://issues.apache.org/jira/browse/MAPREDUCE-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784884#action_12784884
]
Zhaoning Zhang commented on MAPREDUCE-1226:
-------------------------------------------
I think the basic goal of this JIRA is to run jobs on a heterogeneous cluster
just as they would run on a homogeneous one.
The jobs would then have the illusion of a homogeneous cluster, so a
higher-level scheduler can schedule tasks or jobs without considering the
heterogeneity.
> Granularity Variable Task Pre-Scheduler in Heterogeneous Environment
> ---------------------------------------------------------------------
>
> Key: MAPREDUCE-1226
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1226
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobtracker, task, tasktracker
> Environment: Heterogeneous Cluster
> Reporter: Zhaoning Zhang
> Priority: Minor
>
> When we deploy the LATE scheduler from the OSDI '08 paper on some of our
> cluster environments, some slow nodes are repeatedly assigned tasks that run
> slowly, get re-executed elsewhere, and are then killed, so these nodes end up
> doing no useful work and waste their assigned task slots.
> Under the LATE mechanism some tasks are speculatively re-executed, so they run
> on different nodes twice or more, which wastes computing resources.
> We could simply remove these nodes from the cluster, or split the cluster into
> two or more clusters. But I think it is useful and significant to design a
> mechanism that lets low-utility nodes still contribute effectively.
>
> We want to pre-schedule tasks according to a node utility derived from
> historical logs, assigning larger tasks to the faster nodes. The Hadoop task
> scheduler assigns map tasks over input splits of 64 MB by default; some
> deployments use 128 MB, but all splits within a job share the same
> granularity. I want to change this mechanism to a variable-granularity one.
> As we know, map task granularity is determined by the DFS input split size,
> while reduce task granularity is determined by the Partitioner that divides
> the intermediate results, so I think a variable-granularity mechanism is
> feasible.
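> For example, a minimal sketch of the split-size calculation (illustrative
> only; the utility scores and the scaling rule are assumptions, not existing
> Hadoop APIs) could look like this:
> {code:java}
> import java.util.LinkedHashMap;
> import java.util.Map;
>
> /**
>  * Illustrative sketch: scale the map input split size per node in proportion
>  * to a utility score derived from historical logs. The names and numbers here
>  * are assumptions for the example, not existing Hadoop APIs.
>  */
> public class VariableSplitSizer {
>
>     private static final long BASE_SPLIT_BYTES = 64L * 1024 * 1024; // default 64 MB split
>
>     /** Split size for a node: faster nodes (utility > 1.0) get larger splits. */
>     public static long splitSizeFor(double nodeUtility) {
>         return (long) (BASE_SPLIT_BYTES * nodeUtility);
>     }
>
>     public static void main(String[] args) {
>         // Hypothetical utility scores: 1.0 = baseline node speed.
>         Map<String, Double> utility = new LinkedHashMap<String, Double>();
>         utility.put("fast-node", 2.0);
>         utility.put("baseline-node", 1.0);
>         utility.put("slow-node", 0.5);
>
>         for (Map.Entry<String, Double> e : utility.entrySet()) {
>             System.out.printf("%s -> split size %d MB%n",
>                     e.getKey(), splitSizeFor(e.getValue()) / (1024 * 1024));
>         }
>     }
> }
> {code}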
> With the pre-scheduling model we can expect all tasks to start at nearly the
> same time and finish at nearly the same time, so the job fills a well-defined
> time slot.
> History-Log-Based Node Utility Description
> This is the fundamental description of the nodes for the pre-scheduler. In a
> heterogeneous environment the cluster can be split into sub-clusters such that
> nodes within a sub-cluster are homogeneous while nodes in different
> sub-clusters are heterogeneous.
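> One possible way to derive such a utility score (just a sketch; the formula
> "cluster mean duration / node mean duration" is an assumption made for this
> example, not part of Hadoop) is:
> {code:java}
> import java.util.Arrays;
> import java.util.LinkedHashMap;
> import java.util.List;
> import java.util.Map;
>
> /**
>  * Illustrative sketch: derive a per-node utility score from historical task
>  * durations. The formula (cluster mean duration / node mean duration) is an
>  * assumption chosen for the example, not part of Hadoop.
>  */
> public class HistoryLogUtility {
>
>     /** Mean of the recorded task durations (seconds) for one node. */
>     static double meanDuration(List<Double> durations) {
>         double sum = 0.0;
>         for (double d : durations) {
>             sum += d;
>         }
>         return durations.isEmpty() ? 0.0 : sum / durations.size();
>     }
>
>     public static void main(String[] args) {
>         // Hypothetical history: task durations in seconds, per node.
>         Map<String, List<Double>> history = new LinkedHashMap<String, List<Double>>();
>         history.put("node-a", Arrays.asList(40.0, 42.0, 38.0));
>         history.put("node-b", Arrays.asList(80.0, 85.0, 78.0));
>
>         double clusterMean = 0.0;
>         for (List<Double> durations : history.values()) {
>             clusterMean += meanDuration(durations);
>         }
>         clusterMean /= history.size();
>
>         // Utility > 1.0 means the node is faster than the cluster average.
>         for (Map.Entry<String, List<Double>> e : history.entrySet()) {
>             System.out.printf("%s utility = %.2f%n",
>                     e.getKey(), clusterMean / meanDuration(e.getValue()));
>         }
>     }
> }
> {code}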
> Node Utility Stability
> This matters because the pre-scheduler depends on the stability of the nodes.
> We could pick out nodes with poor stability and treat them differently, but we
> do not yet have a good method for handling this.
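> One possible stability measure (just a sketch; the coefficient-of-variation
> metric and the 0.25 threshold are arbitrary assumptions for illustration) is:
> {code:java}
> import java.util.Arrays;
>
> /**
>  * Illustrative sketch: flag a node as unstable when the coefficient of
>  * variation (stddev / mean) of its historical task durations exceeds a
>  * threshold. Both the metric and the threshold are assumptions.
>  */
> public class NodeStability {
>
>     static boolean isUnstable(double[] taskDurations, double cvThreshold) {
>         double mean = Arrays.stream(taskDurations).average().orElse(0.0);
>         double variance = Arrays.stream(taskDurations)
>                 .map(d -> (d - mean) * (d - mean)).average().orElse(0.0);
>         return mean > 0 && Math.sqrt(variance) / mean > cvThreshold;
>     }
>
>     public static void main(String[] args) {
>         double[] steadyNode = {40, 42, 39, 41};
>         double[] erraticNode = {30, 90, 25, 120};
>         System.out.println("steady node unstable?  " + isUnstable(steadyNode, 0.25));
>         System.out.println("erratic node unstable? " + isUnstable(erraticNode, 0.25));
>     }
> }
> {code}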
> Error Tolerance
> The original scheduler for a homogeneous cluster is designed to handle failing
> nodes: if a node hits exceptions, the JobTracker re-executes its tasks and
> handles the failures dynamically.
> If we use the pre-scheduler, we must face the same problem of exceptions.
> I propose that when a task fails, we split it into more than one part and
> execute the parts on several different nodes; the expected finish time of the
> re-executed work is then shortened, and the total job response time will not
> grow too much.
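> A minimal sketch of that re-splitting step (the class and method names are
> hypothetical, not part of Hadoop) might look like:
> {code:java}
> import java.util.ArrayList;
> import java.util.Arrays;
> import java.util.List;
>
> /**
>  * Illustrative sketch: when a pre-scheduled task fails, split its input byte
>  * range into smaller pieces and hand them to different nodes. All names here
>  * are hypothetical, not part of Hadoop.
>  */
> public class FailureResplitter {
>
>     static class SubTask {
>         final String node;
>         final long start;
>         final long length;
>
>         SubTask(String node, long start, long length) {
>             this.node = node;
>             this.start = start;
>             this.length = length;
>         }
>
>         @Override
>         public String toString() {
>             return node + " gets bytes [" + start + ", " + (start + length) + ")";
>         }
>     }
>
>     /** Split the failed task's byte range evenly across the given nodes. */
>     static List<SubTask> resplit(long start, long length, List<String> nodes) {
>         List<SubTask> parts = new ArrayList<SubTask>();
>         long chunk = length / nodes.size();
>         for (int i = 0; i < nodes.size(); i++) {
>             long partStart = start + i * chunk;
>             long partLength = (i == nodes.size() - 1) ? length - i * chunk : chunk;
>             parts.add(new SubTask(nodes.get(i), partStart, partLength));
>         }
>         return parts;
>     }
>
>     public static void main(String[] args) {
>         // A hypothetical 128 MB split that failed on one node, re-split three ways.
>         List<String> nodes = Arrays.asList("node-a", "node-b", "node-c");
>         for (SubTask t : resplit(0L, 128L * 1024 * 1024, nodes)) {
>             System.out.println(t);
>         }
>     }
> }
> {code}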
> Job Priorities
> With this pre-scheduler a single job fills its time slot, so if other
> high-priority jobs arrive they will have to wait. I do not yet have an
> effective method to solve this.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.