On top of that, the error message ("The type TaskScheduler is not visible") indicates that your scheduler class needs to be in the same mapred package as TaskScheduler.
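In case it helps, here is a rough, self-contained sketch of the assignTasks flow that Hemanth describes in the quoted mail: pick a job, then let the job pick a task, preferring one whose split is on the tracker's host. None of these types are Hadoop classes. ToyTask, ToyJob, ToyTracker and ToyFifoScheduler are made-up stand-ins, and the FIFO job order and the "local first, otherwise first pending" rule are assumptions for illustration only. A real 1.0 scheduler would instead extend TaskScheduler from inside the mapred package.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a map task; NOT a Hadoop class.
final class ToyTask {
    final String id;
    final String splitHost; // host that holds this task's input split

    ToyTask(String id, String splitHost) {
        this.id = id;
        this.splitHost = splitHost;
    }
}

// Hypothetical stand-in for a job. The locality choice lives here, loosely
// mirroring how the real schedulers lean on JobInProgress for it.
final class ToyJob {
    final String name;
    final List<ToyTask> pending = new ArrayList<>();

    ToyJob(String name) {
        this.name = name;
    }

    ToyJob add(String taskId, String splitHost) {
        pending.add(new ToyTask(taskId, splitHost));
        return this;
    }

    // Prefer a task whose split is on the tracker's host; otherwise fall
    // back to the first pending task. Returns null when nothing is pending.
    ToyTask obtainTask(String trackerHost) {
        for (Iterator<ToyTask> it = pending.iterator(); it.hasNext();) {
            ToyTask task = it.next();
            if (task.splitHost.equals(trackerHost)) {
                it.remove();
                return task;
            }
        }
        return pending.isEmpty() ? null : pending.remove(0);
    }
}

// Hypothetical stand-in for the tasktracker that sent the heartbeat.
final class ToyTracker {
    final String host;
    final int freeMapSlots;

    ToyTracker(String host, int freeMapSlots) {
        this.host = host;
        this.freeMapSlots = freeMapSlots;
    }
}

// Toy scheduler: walks jobs in submission (FIFO) order and fills the
// tracker's free map slots with tasks chosen by each job.
final class ToyFifoScheduler {
    private final List<ToyJob> jobs = new ArrayList<>();

    void submit(ToyJob job) {
        jobs.add(job);
    }

    List<ToyTask> assignTasks(ToyTracker tracker) {
        List<ToyTask> assigned = new ArrayList<>();
        for (ToyJob job : jobs) {
            while (assigned.size() < tracker.freeMapSlots) {
                ToyTask task = job.obtainTask(tracker.host);
                if (task == null) {
                    break; // this job has nothing left; try the next job
                }
                assigned.add(task);
            }
            if (assigned.size() == tracker.freeMapSlots) {
                break; // all free slots on this tracker are filled
            }
        }
        return assigned;
    }
}
```

Changing the scheduling logic in this toy would mean changing the job order in assignTasks or the preference rule in obtainTask, which corresponds to the two responsibilities asked about in the original question.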
Thanks,
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Dec 27, 2012, at 7:38 AM, Hemanth Yamijala wrote:

> Hi,
>
> Firstly, I am talking about Hadoop 1.0. Please note that in Hadoop 2.x and
> trunk, the MapReduce framework is completely revamped to YARN
> (http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)
> and you may need to look at different interfaces for building your own
> scheduler.
>
> In 1.0, the primary function of the TaskScheduler is the assignTasks method.
> Given a TaskTracker object as input, this method figures out how many free
> map and reduce slots exist in that particular tasktracker and selects one or
> more tasks that can be scheduled on it. Since task selection is the primary
> responsibility and the granularity is at a task level, the class is called
> TaskScheduler.
>
> The method of choosing a job and then a task within the job is customised by
> the different schedulers already present in Hadoop. Also, the core logic of
> selecting a map task with data locality optimizations is not implemented in
> the schedulers per se, but they rely on the JobInProgress object in the
> MapReduce framework for achieving the same.
>
> To implement your own scheduler, it may be best to look at the sources of
> existing schedulers: JobQueueTaskScheduler, CapacityTaskScheduler or
> FairScheduler. In particular, the last two are in the contrib modules of
> mapreduce, and hence will be fairly independent to follow. Their build files
> will also tell you how to resolve any compile problems like the one you are
> facing.
>
> Thanks
> Hemanth
>
>
> On Thu, Dec 27, 2012 at 4:10 PM, Yaron Gonen <[email protected]> wrote:
> Hi,
> If I understand correctly, the job scheduler (why is the class called
> TaskScheduler?) is responsible for assigning the task whose split is as close
> as possible to the tasktracker.
> Meaning that the job scheduler is responsible for two things:
> 1. Selecting a job.
> 2. Once a job is selected, assigning the closest task to the tasktracker that
>    sent the heartbeat.
> Is this correct?
>
> I want to write my own job scheduler to change the logic above, but it says
> "The type TaskScheduler is not visible".
> How can I write my own scheduler?
>
> thanks
