Mumak: Map-Reduce Simulator
---------------------------

                 Key: MAPREDUCE-728
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
             Fix For: 0.21.0


h3. Vision:

We want to build a Simulator to simulate large-scale Hadoop clusters, 
applications and workloads. This would be invaluable in furthering Hadoop by 
providing a tool for researchers and developers to prototype features (e.g. 
pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
their behaviour and performance with a reasonable amount of confidence, thereby 
aiding rapid innovation.

----

h3. First Cut: Simulator for the Map-Reduce Scheduler

The Map-Reduce Scheduler is a fertile area of interest, with at least four 
schedulers currently in existence, each with its own set of features: the 
Default Scheduler, the Capacity Scheduler, the Fairshare Scheduler and the 
Priority Scheduler.

Each scheduler's decisions are driven by many factors, such as fairness, 
capacity guarantees, resource availability, data locality, etc.

Given that, it is non-trivial to pick the right scheduler (or set of scheduler 
features) for a given workload. Hence a simulator that can predict how well a 
particular scheduler works for a specific workload, by quickly iterating over 
schedulers and/or scheduler features, would be quite useful.

So, the first cut is to implement a simulator for the Map-Reduce scheduler 
which takes as input a job trace derived from a production workload and a 
cluster definition, and simulates the execution of the jobs defined in the 
trace on this virtual cluster. As output, the detailed job execution trace 
(recorded in terms of virtual, simulated time) could then be analyzed to 
understand various traits of individual schedulers (individual job turnaround 
times, throughput, fairness, capacity guarantees, etc.). To support this, we 
would need a simulator which could accurately model the conditions of the 
actual system that affect a scheduler's decisions. These include very 
large-scale clusters (thousands of nodes), the detailed characteristics of the 
workload thrown at the clusters, job or task failures, data locality, and 
cluster hardware (cpu, memory, disk i/o, network i/o, network topology), etc.
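
To make the intended input/output flow concrete, here is a minimal sketch of 
such a replay loop: a tiny hard-coded job trace is run against a one-number 
"cluster definition" on a virtual clock, and the resulting execution trace is 
printed. All class names and the plain FIFO queue standing in for the 
pluggable scheduler are hypothetical illustrations for this issue, not the 
actual Mumak design or API.

{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

public class SchedulerSimulatorSketch {

  /** One job from the production job trace (hypothetical shape). */
  static class TraceJob {
    final String jobId;
    final long submitTime; // virtual ms at which the job is submitted
    final long runTime;    // virtual ms the job holds a slot, per the trace

    TraceJob(String jobId, long submitTime, long runTime) {
      this.jobId = jobId;
      this.submitTime = submitTime;
      this.runTime = runTime;
    }
  }

  /** A discrete event on the virtual clock: a submission or a completion. */
  static class Event implements Comparable<Event> {
    final long time;
    final TraceJob job;
    final boolean isSubmission;

    Event(long time, TraceJob job, boolean isSubmission) {
      this.time = time;
      this.job = job;
      this.isSubmission = isSubmission;
    }

    @Override
    public int compareTo(Event other) {
      return Long.compare(this.time, other.time);
    }
  }

  public static void main(String[] args) {
    // "Cluster definition": a single slot count stands in for nodes, racks,
    // cpu/memory/disk/network and topology, which a real simulator would model.
    int freeSlots = 2;

    // Job trace derived from a production workload (hard-coded here).
    List<TraceJob> trace = new ArrayList<>();
    trace.add(new TraceJob("job_0001", 0, 5_000));
    trace.add(new TraceJob("job_0002", 1_000, 3_000));
    trace.add(new TraceJob("job_0003", 2_000, 8_000));

    // Event queue ordered by virtual time; the clock jumps from event to
    // event instead of advancing in real time.
    PriorityQueue<Event> events = new PriorityQueue<>();
    for (TraceJob job : trace) {
      events.add(new Event(job.submitTime, job, true));
    }

    // Pending queue: a plain FIFO here, standing in for the pluggable
    // scheduler (Default, Capacity, Fairshare, Priority) under study.
    Deque<TraceJob> pending = new ArrayDeque<>();

    while (!events.isEmpty()) {
      Event event = events.poll();
      long now = event.time;

      if (event.isSubmission) {
        pending.addLast(event.job);
      } else {
        // A job finished: record it in the output trace and free its slot.
        System.out.printf("%d\tFINISHED\t%s%n", now, event.job.jobId);
        freeSlots++;
      }

      // Let the "scheduler" launch as many pending jobs as there are slots.
      while (freeSlots > 0 && !pending.isEmpty()) {
        TraceJob next = pending.removeFirst();
        freeSlots--;
        System.out.printf("%d\tSTARTED \t%s%n", now, next.jobId);
        events.add(new Event(now + next.runTime, next, false));
      }
    }
  }
}
{code}

The printed start/finish lines are the "detailed job execution trace" referred 
to above; swapping the FIFO queue for a different policy and re-running over 
the same trace is what would let per-scheduler turnaround, throughput and 
fairness be compared.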
