Thanks a lot!

On Thu, Dec 27, 2012 at 8:11 PM, Vinod Kumar Vavilapalli <[email protected]> wrote:

> On top of that, the message indicates that you need to have your scheduler
> class in the mapred package.
>
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
>
> On Dec 27, 2012, at 7:38 AM, Hemanth Yamijala wrote:
>
> Hi,
>
> Firstly, I am talking about Hadoop 1.0. Please note that in Hadoop 2.x and
> trunk, the MapReduce framework is completely revamped to YARN (
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)
> and you may need to look at different interfaces for building your own
> scheduler.
>
> In 1.0, the primary function of the TaskScheduler is the assignTasks
> method. Given a TaskTracker object as input, this method figures out how
> many free map and reduce slots exist in that particular tasktracker and
> selects one or more tasks that can be scheduled on it. Since task selection
> is the primary responsibility and the granularity is at a task level, the
> class is called TaskScheduler.
>
> The method of choosing a job and then a task within the job is customised
> by the different schedulers already present in Hadoop. Also, the core logic
> of selecting a map task with data locality optimizations is not implemented
> in the schedulers per se; they rely on the JobInProgress object in the
> MapReduce framework for that.
>
> To implement your own scheduler, it may be best to look at the sources of
> existing schedulers: JobQueueTaskScheduler, CapacityTaskScheduler or
> FairScheduler. In particular, the last two are in the contrib modules of
> mapreduce, and hence will be fairly self-contained and easy to follow.
> Their build files will also tell you how to resolve any compile problems
> like the one you are facing.
>
> Thanks
> Hemanth
>
> On Thu, Dec 27, 2012 at 4:10 PM, Yaron Gonen <[email protected]> wrote:
>
>> Hi,
>> If I understand correctly, the job scheduler (why is the class called
>> TaskScheduler?) is responsible for assigning the task whose split is as
>> close as possible to the tasktracker.
>> Meaning that the job scheduler is responsible for two things:
>>
>> 1. Selecting a job.
>> 2. Once a job is selected, assigning the closest task to the tasktracker
>>    that sent the heartbeat.
>>
>> Is this correct?
>>
>> I want to write my own job scheduler to change the logic above, but it
>> says "The type TaskScheduler is not visible".
>> How can I write my own scheduler?
>>
>> thanks
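
For the archive, the flow Hemanth describes (walk the jobs, and let each job pick a task for the tasktracker that sent the heartbeat) can be sketched as a toy model. This is only a sketch under stated assumptions: ToyTask, ToyJob, ToyTracker and this assignTasks are hypothetical names invented for illustration, not Hadoop classes, and the strict FIFO job order is just one assumed policy. A real Hadoop 1.0 scheduler would instead extend TaskScheduler from a class placed in the mapred package, as Vinod notes above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every type here is hypothetical and is NOT Hadoop's API.
class ToySchedulerDemo {

    // A pending map task and the host that stores its input split.
    static final class ToyTask {
        final String id;
        final String splitHost;
        ToyTask(String id, String splitHost) { this.id = id; this.splitHost = splitHost; }
    }

    // A job owns its pending tasks and chooses one for a given host. This mirrors
    // the point that locality is decided by the job object, not by the scheduler.
    static final class ToyJob {
        final String name;
        final List<ToyTask> pending = new ArrayList<>();
        ToyJob(String name, ToyTask... tasks) {
            this.name = name;
            pending.addAll(Arrays.asList(tasks));
        }

        // Prefer a task whose split lives on the tracker's host; else take any task.
        ToyTask obtainTask(String trackerHost) {
            for (ToyTask t : pending) {
                if (t.splitHost.equals(trackerHost)) {
                    pending.remove(t);   // safe: we return before iterating again
                    return t;
                }
            }
            return pending.isEmpty() ? null : pending.remove(0);
        }
    }

    // A tracker reports its host name and how many map slots are free.
    static final class ToyTracker {
        final String host;
        final int freeMapSlots;
        ToyTracker(String host, int freeMapSlots) { this.host = host; this.freeMapSlots = freeMapSlots; }
    }

    // Assumed FIFO policy: visit jobs in submission order and fill the free slots.
    static List<ToyTask> assignTasks(ToyTracker tracker, List<ToyJob> jobsInFifoOrder) {
        List<ToyTask> assigned = new ArrayList<>();
        for (ToyJob job : jobsInFifoOrder) {
            while (assigned.size() < tracker.freeMapSlots) {
                ToyTask task = job.obtainTask(tracker.host);
                if (task == null) break;   // this job has nothing pending; try the next job
                assigned.add(task);
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        ToyJob first = new ToyJob("job-1", new ToyTask("j1-m0", "hostA"), new ToyTask("j1-m1", "hostB"));
        ToyJob second = new ToyJob("job-2", new ToyTask("j2-m0", "hostB"));
        ToyTracker tracker = new ToyTracker("hostB", 2);
        for (ToyTask t : assignTasks(tracker, Arrays.asList(first, second))) {
            System.out.println(t.id + " (split on " + t.splitHost + ")");
        }
    }
}
```

In this toy run the tracker on hostB first receives j1-m1, whose split is local, and then j1-m0 from the same job even though job-2 has a local task waiting. Changing that trade-off between job order and locality is the kind of logic a custom scheduler would replace.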
