[jira] [Commented] (YARN-2381) CLONE - Yarn Scheduler Load Simulator

2014-08-05 Thread JiankunLiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14087134#comment-14087134
 ] 

JiankunLiu commented on YARN-2381:
--

Wei Yan, I'm sorry. I clicked the button clone by mistake, but I don't know 
how to delete it.

 CLONE - Yarn Scheduler Load Simulator
 -

 Key: YARN-2381
 URL: https://issues.apache.org/jira/browse/YARN-2381
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: JiankunLiu
 Fix For: 2.3.0


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., Fifo, Capacity and Fair  schedulers. Meanwhile, 
 several optimizations are also made to improve scheduler performance for 
 different scenarios and workload. Each scheduler algorithm has its own set of 
 features, and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantee, resource availability, etc. It is very important to 
 evaluate a scheduler algorithm very well before we deploy it in a production 
 cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling 
 algorithm. Evaluating in a real cluster is always time and cost consuming, 
 and it is also very hard to find a large-enough cluster. Hence, a simulator 
 which can predict how well a scheduler algorithm for some specific workload 
 would be quite useful.
 We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
 clusters and application loads in a single machine. This would be invaluable 
 in furthering Yarn by providing a tool for researchers and developers to 
 prototype new scheduler features and predict their behavior and performance 
 with reasonable amount of confidence, there-by aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager removing the 
 network factor by simulating NodeManagers and ApplicationMasters via handling 
 and dispatching NM/AMs heartbeat events from within the same JVM.
 To keep tracking of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real time metrics while executing, including:
 * Resource usages for whole cluster and each queue, which can be utilized to 
 configure cluster and queue's capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the  scheduler behavior 
 (individual jobs turn around time, throughput, fairness, capacity guarantee, 
 etc).
 * Several key metrics of scheduler algorithm, such as time cost of each 
 scheduler operation (allocate, handle, etc), which can be utilized by Hadoop 
 developers to find the code spots and scalability limits.
 The simulator will provide real time charts showing the behavior of the 
 scheduler and its performance.
 A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing 
 how to use simulator to simulate Fair Scheduler and Capacity Scheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2381) CLONE - Yarn Scheduler Load Simulator

2014-08-04 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084804#comment-14084804
 ] 

Wei Yan commented on YARN-2381:
---

[~Jackliu91], purpose of this jira?

 CLONE - Yarn Scheduler Load Simulator
 -

 Key: YARN-2381
 URL: https://issues.apache.org/jira/browse/YARN-2381
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: JiankunLiu
 Fix For: 2.3.0


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., Fifo, Capacity and Fair  schedulers. Meanwhile, 
 several optimizations are also made to improve scheduler performance for 
 different scenarios and workload. Each scheduler algorithm has its own set of 
 features, and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantee, resource availability, etc. It is very important to 
 evaluate a scheduler algorithm very well before we deploy it in a production 
 cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling 
 algorithm. Evaluating in a real cluster is always time and cost consuming, 
 and it is also very hard to find a large-enough cluster. Hence, a simulator 
 which can predict how well a scheduler algorithm for some specific workload 
 would be quite useful.
 We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
 clusters and application loads in a single machine. This would be invaluable 
 in furthering Yarn by providing a tool for researchers and developers to 
 prototype new scheduler features and predict their behavior and performance 
 with reasonable amount of confidence, there-by aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager removing the 
 network factor by simulating NodeManagers and ApplicationMasters via handling 
 and dispatching NM/AMs heartbeat events from within the same JVM.
 To keep tracking of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real time metrics while executing, including:
 * Resource usages for whole cluster and each queue, which can be utilized to 
 configure cluster and queue's capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the  scheduler behavior 
 (individual jobs turn around time, throughput, fairness, capacity guarantee, 
 etc).
 * Several key metrics of scheduler algorithm, such as time cost of each 
 scheduler operation (allocate, handle, etc), which can be utilized by Hadoop 
 developers to find the code spots and scalability limits.
 The simulator will provide real time charts showing the behavior of the 
 scheduler and its performance.
 A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing 
 how to use simulator to simulate Fair Scheduler and Capacity Scheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)