[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784884#action_12784884
 ] 

Zhaoning Zhang commented on MAPREDUCE-1226:
-------------------------------------------

I think the basic goal of this JIRA is to run jobs on a heterogeneous cluster 
just as they run on a homogeneous one. The jobs would then have the illusion 
of a homogeneous cluster, so a higher-level scheduler can schedule tasks or 
jobs without considering the heterogeneity.

> Granularity Variable Task Pre-Scheduler in Heterogeneous Environment 
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1226
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1226
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, task, tasktracker
>         Environment: Heterogeneous Cluster
>            Reporter: Zhaoning Zhang
>            Priority: Minor
>
> When we deploy the LATE scheduler from the OSDI '08 paper on some of our 
> cluster environments, slow nodes keep being assigned tasks that run slowly, 
> get speculatively re-executed elsewhere, and are then killed. We found these 
> nodes end up doing no useful work and waste their assigned task slots.
> Under the LATE mechanism, some tasks are re-executed, so they run on 
> different nodes twice or more, which wastes computing resources.
> We could simply remove these nodes from the cluster, or split the cluster 
> into two or more parts. But I think it is useful and significant to design 
> a mechanism that makes low-utility nodes effective.
>  
> We want to pre-schedule tasks using a node-utility measure derived from 
> historical logs, and assign larger tasks to the fast nodes. The Hadoop task 
> scheduler assigns map tasks over input splits of 64 MB by default; some 
> deployments use 128 MB, but all tasks in a job share the same granularity. 
> So I want to change this mechanism into a variable-granularity one.
> As we know, map task granularity depends on the DFS split size, while 
> reduce task granularity depends on how the Partitioner divides the 
> intermediate results. So I think a variable-granularity mechanism is feasible.
> With the pre-schedule model, we can expect all tasks to start at nearly the 
> same time and finish at nearly the same time, so the job fills a specific 
> time slot.
> History-log-based node utility description
> This is the fundamental node description used by the pre-scheduler. In a 
> heterogeneous environment, the cluster can be divided into sub-clusters such 
> that nodes within a sub-cluster are homogeneous while nodes across 
> sub-clusters are heterogeneous.
> Node utility stability
> This is important because the pre-scheduler depends on the stability of the 
> nodes' utility. We could identify unstable nodes and treat them differently, 
> but we do not yet have a good method for handling this.
> Error tolerance
> The original scheduler for homogeneous clusters is designed to handle 
> failing nodes: if a node hits an exception, the JobTracker re-executes its 
> tasks and handles the failure dynamically.
> If we use the pre-scheduler, we must also face this problem of exceptions.
> I propose that when a task fails with an exception, we split it into 
> several parts and execute them on several different nodes; the expected 
> finish time is then shortened, and the total job response time will not 
> grow too much.
> Job priorities
> With this pre-scheduler, a single job fills the time slot, so if 
> higher-priority jobs arrive they will have to wait. I do not yet have an 
> effective method to solve this.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
