[Hadoop Wiki] Update of "BristolHadoopWorkshopSpring2010 " by SteveLoughran

Apache Wiki Thu, 01 Apr 2010 11:33:37 -0700

The "BristolHadoopWorkshopSpring2010" page has been changed by SteveLoughran.
The comment on this change is: dynamic priority scheduler.
http://wiki.apache.org/hadoop/BristolHadoopWorkshopSpring2010?action=diff&rev1=5&rev2=6

--------------------------------------------------

  
  Sanders talk triggered an interesting discussion on whether the Grid model 
had delivered on what it had promised, or not. The answer: some stuff got 
addressed, but some things (storage) had been ignored, and turned out to be 
rather important.
  
+ == Thomas Sandholm: Economic Scheduling of Hadoop Jobs ==
+ 
+ 
[[http://www.slideshare.net/steve_l/economic-scheduling-of-hadoop-jobs|Slides]]
+ 
+ Thomas Sandholm joined us via videoconference to talk about the 
scheduler that he and Kevin Lai wrote.
+ 
+  * The two main schedulers are optimised for the Yahoo! and Facebook 
workloads. Although they are converging, they are tuned differently; both 
teams are nervous about changes that would reduce their throughput, as the 
cost would be significant.
+  * Hadoop 0.21 adds a plugin API to add your own scheduler more easily.
+  * The DynamicPriorityScheduler is designed for multiple users competing for 
time on a shared cluster. 
+  * You bid for time; the scheduler gives priority to those who bid the most.
+  * You can bid $0; you will still get time if nobody else bids more than you.
+  * Running Map or Reduce jobs will get killed if higher priority work comes 
in. The scheduler tries to be clever here and leaves work that has been 
running for a while alone (on the expectation that it will finish soon). The 
benefit of killing processes shows up when people can schedule long-running 
jobs.
+  * It keeps no history, which makes it scalable and means there is no need 
to worry about persistence.
+  * If your bid doesn't get through, you don't get billed.
+  * To use: give every user/team their own queue.
+ 
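The bidding rules above can be illustrated with a minimal sketch. This is not the DynamicPriorityScheduler's code or its actual allocation algorithm; it is a toy model that assumes one bid per queue per allocation round, hands slots to the highest bidders first, and bills only for slots actually granted. The names `Bid`, `rate`, `wanted` and `allocate` are hypothetical and do not come from Hadoop.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    queue: str    # one queue per user/team, as the notes suggest
    rate: float   # amount offered per slot for this round (hypothetical unit)
    wanted: int   # number of slots the queue could use right now


def allocate(bids, total_slots):
    """Return {queue: (slots_granted, charge)} for a single round.

    No state is carried between rounds, mirroring the "no history" point.
    """
    grants = {}
    free = total_slots
    # Highest bid first; sorted() is stable, so equal bids keep input order.
    for bid in sorted(bids, key=lambda b: b.rate, reverse=True):
        slots = min(bid.wanted, free)
        free -= slots
        # A queue that receives no slots is charged nothing.
        grants[bid.queue] = (slots, slots * bid.rate)
    return grants
```

With ten slots and bids of $2, $1 and $0, the $0 bidder is squeezed out and pays nothing; on an otherwise idle cluster the same $0 bid still receives slots.
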
+ The scheduler is in the contrib directory for Hadoop 0.21; it's not easy to 
backport as it uses the scheduler plugin API.
+
