Hi,

On Thu, Aug 12, 2010 at 3:35 AM, Bobby Dennett <bdennett+softw...@gmail.com> wrote:
> From what I've read/seen, it appears that, if not the "default"
> scheduler, most installations are using Hadoop's Fair Scheduler. Based
> on features and our requirements, we're leaning towards using the
> Capacity Scheduler; however, there is some concern that it may not be
> as "stable" as there doesn't appear to be as much talk about it,
> compared to the Fair Scheduler.
>
> Has anyone hit any nasty issues with regards to the Capacity Scheduler
> and, in general, are there any "gotchas" to look out for with either
> scheduler?
>
> We're ramping up the number of users on our Hadoop clusters,
> particularly in regards to Hive. Our goal is to ensure that production
> processes continue to run with a majority of the cluster during peak
> usage times, while personal users share the remaining capacity. The
> Capacity Scheduler's support of queues and for memory-intensive jobs
> is appealing but we are curious about drawbacks and/or potential
> issues.

FWIW, Yahoo! has been running the Capacity Scheduler for a reasonably long time now. However, many patches have been applied to the Capacity Scheduler on top of the base Hadoop 0.20.2 version to make it 'stable' and able to work effectively at large scale. Looking at the change log of the Yahoo! Hadoop distribution could give you an idea of which patches are worth picking up and applying to an older version. The good news is that most of these patches have 0.20 versions available on JIRA, and they should apply reasonably cleanly.

>
> Thanks in advance,
> -Bobby
>