The "BristolHadoopWorkshopSpring2010" page has been changed by SteveLoughran.
The comment on this change is: dynamic priority scheduler.
http://wiki.apache.org/hadoop/BristolHadoopWorkshopSpring2010?action=diff&rev1=5&rev2=6

--------------------------------------------------

  Sanders talk triggered an interesting discussion on whether the Grid model had delivered on what it had promised, or not. The answer: some stuff got addressed, but some things (storage) had been ignored, and turned out to be rather important.
+ == Thomas Sandholm: Economic Scheduling of Hadoop Jobs ==
+
+ [[http://www.slideshare.net/steve_l/economic-scheduling-of-hadoop-jobs|Slides]]
+
+ Thomas Sandholm joined us via videoconference to talk about the scheduler that he and Kevin Lai wrote.
+
+  * The two main schedulers are optimised for the Yahoo! and Facebook workloads. Although converging, they are tuned differently; both teams are nervous about changes that would reduce their throughput, as the cost would be significant.
+  * Hadoop 0.21 adds a plugin API that makes it easier to add your own scheduler.
+  * The DynamicPriorityScheduler is designed for multiple users competing for time on a shared cluster.
+  * You bid for time; the scheduler gives priority to those who bid the most.
+  * You can bid $0; you will still get time if nobody else bids more than you.
+  * Running Map or Reduce jobs will get killed if higher-priority work comes in. The scheduler tries to be clever here and leaves alone work that has been running for a while (on the expectation that it will finish soon). The benefit of killing processes shows up when people can schedule long-running jobs.
+  * It keeps no history of any kind, which makes it scalable: there is no need to worry about persistence.
+  * If your bid doesn't get through, you don't get billed.
+  * To use: give every user/team their own queue.
+
+ The scheduler is in the contrib directory for Hadoop 0.21; it's not easy to backport as it uses the scheduler plugin API.
+
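The bidding rules in the notes above can be illustrated with a toy model. This is only a sketch of the idea as described in the talk summary, not the DynamicPriorityScheduler code: the `Queue` class, the `allocate` function, the per-slot billing rule and the queue names are all assumptions made for illustration. It hands out slots to the highest bidder first, lets a $0 bidder use whatever is left over, and charges a queue only for slots it actually receives. It keeps no state between calls, in line with the "no history" point.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str     # one queue per user/team, as the notes suggest
    bid: float    # price offered per slot (hypothetical unit)
    demand: int   # number of slots the queue would like to use


def allocate(queues, total_slots):
    """Grant slots to the highest bidders first; bill only for slots granted.

    Assumption: billing is bid * slots granted. The real scheduler's
    accounting may differ.
    """
    remaining = total_slots
    result = {}
    # sorted() is stable, so queues with equal bids keep their input order.
    for q in sorted(queues, key=lambda q: q.bid, reverse=True):
        granted = min(q.demand, remaining)
        remaining -= granted
        result[q.name] = {"slots": granted, "charge": granted * q.bid}
    return result


# Hypothetical queues: one team bids nothing and gets only leftover capacity.
queues = [
    Queue("analytics", bid=0.0, demand=4),
    Queue("search", bid=2.0, demand=6),
    Queue("ads", bid=5.0, demand=3),
]
allocation = allocate(queues, total_slots=10)
for name, outcome in allocation.items():
    print(name, outcome)
```

With ten slots, the two paying queues are served in full and the $0 queue receives the single remaining slot at no charge. The sketch deliberately leaves out preemption of running work.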
