On top of that, the message indicates that your scheduler class needs to live 
in the org.apache.hadoop.mapred package, because TaskScheduler is 
package-private there.

Thanks,
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Dec 27, 2012, at 7:38 AM, Hemanth Yamijala wrote:

> Hi,
> 
> Firstly, I am talking about Hadoop 1.0. Please note that in Hadoop 2.x and 
> trunk, the MapReduce framework has been completely reworked on top of YARN 
> (http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)
> and you may need to look at different interfaces for building your own 
> scheduler.
> 
> In 1.0, the primary function of the TaskScheduler is the assignTasks method. 
> Given a TaskTracker object as input, this method figures out how many free 
> map and reduce slots exist in that particular tasktracker and selects one or 
> more tasks that can be scheduled on it. Since task selection is the primary 
> responsibility and the granularity is at a task level, the class is called 
> TaskScheduler.
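
To make the slot arithmetic described above concrete, here is a toy sketch. It 
does not use the Hadoop classes at all: TrackerStatus, its fields, and this 
assignTasks signature are hypothetical stand-ins I made up for illustration, 
and tasks are just strings. The real method works on Hadoop's own types and 
handles far more cases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: nothing here is a Hadoop class. All names are hypothetical.
class SlotSketch {

    /** Hypothetical snapshot of the slot usage a tasktracker reports. */
    static final class TrackerStatus {
        final int maxMapSlots, runningMaps, maxReduceSlots, runningReduces;

        TrackerStatus(int maxMapSlots, int runningMaps,
                      int maxReduceSlots, int runningReduces) {
            this.maxMapSlots = maxMapSlots;
            this.runningMaps = runningMaps;
            this.maxReduceSlots = maxReduceSlots;
            this.runningReduces = runningReduces;
        }

        int freeMapSlots()    { return Math.max(0, maxMapSlots - runningMaps); }
        int freeReduceSlots() { return Math.max(0, maxReduceSlots - runningReduces); }
    }

    /**
     * Returns the tasks to hand to this tracker: at most one task per free
     * slot, maps first, then reduces. The pending lists are not modified.
     */
    static List<String> assignTasks(TrackerStatus tracker,
                                    List<String> pendingMaps,
                                    List<String> pendingReduces) {
        List<String> assigned = new ArrayList<>();
        int maps = Math.min(tracker.freeMapSlots(), pendingMaps.size());
        assigned.addAll(pendingMaps.subList(0, maps));
        int reduces = Math.min(tracker.freeReduceSlots(), pendingReduces.size());
        assigned.addAll(pendingReduces.subList(0, reduces));
        return assigned;
    }

    public static void main(String[] args) {
        List<String> maps = Arrays.asList("m1", "m2");
        List<String> reduces = Arrays.asList("r1");
        // One free map slot, no free reduce slot.
        System.out.println(assignTasks(new TrackerStatus(2, 1, 2, 2), maps, reduces));
        // Two free map slots, one free reduce slot.
        System.out.println(assignTasks(new TrackerStatus(2, 0, 1, 0), maps, reduces));
    }
}
```

The point is only that the scheduler is driven by one tracker at a time and 
never hands out more tasks than that tracker has free slots.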
> 
> The method of choosing a job and then a task within the job is customised by 
> the different schedulers already present in Hadoop. Also, the core logic of 
> selecting a map task with data locality optimizations is not implemented in 
> the schedulers per se; instead, they rely on the JobInProgress object in the 
> MapReduce framework for that.
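
The split of responsibilities described above (the scheduler picks a job, the 
job picks a task with locality in mind) can be sketched as a toy model. Job, 
MapTask, obtainMapTask and pickTask are hypothetical names for this example 
only, the job order is plain FIFO, and locality is reduced to "is the split on 
this host or not". The real JobInProgress logic is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy model only: Job and MapTask are hypothetical stand-ins, not Hadoop classes.
class LocalitySketch {

    /** A pending map task and the hosts that hold a replica of its split. */
    static final class MapTask {
        final String id;
        final List<String> splitHosts;

        MapTask(String id, String... splitHosts) {
            this.id = id;
            this.splitHosts = Arrays.asList(splitHosts);
        }
    }

    /** The job, not the scheduler, knows which of its tasks suits a host. */
    static final class Job {
        final String name;
        final List<MapTask> pendingMaps = new ArrayList<>();

        Job(String name, MapTask... tasks) {
            this.name = name;
            pendingMaps.addAll(Arrays.asList(tasks));
        }

        /** Prefers a task whose split is on trackerHost, else any pending task. */
        MapTask obtainMapTask(String trackerHost) {
            for (Iterator<MapTask> it = pendingMaps.iterator(); it.hasNext();) {
                MapTask task = it.next();
                if (task.splitHosts.contains(trackerHost)) {
                    it.remove();
                    return task;
                }
            }
            return pendingMaps.isEmpty() ? null : pendingMaps.remove(0);
        }
    }

    /** Scheduler policy (FIFO here): the first job with a pending task wins. */
    static MapTask pickTask(List<Job> jobsInSubmitOrder, String trackerHost) {
        for (Job job : jobsInSubmitOrder) {
            MapTask task = job.obtainMapTask(trackerHost);
            if (task != null) {
                return task;
            }
        }
        return null; // nothing left to run
    }

    public static void main(String[] args) {
        Job a = new Job("jobA", new MapTask("a1", "host1"), new MapTask("a2", "host2"));
        Job b = new Job("jobB", new MapTask("b1", "host2"));
        List<Job> queue = Arrays.asList(a, b);
        System.out.println(pickTask(queue, "host2").id); // a2: data-local task in the first job
        System.out.println(pickTask(queue, "host2").id); // a1: non-local, but still jobA
        System.out.println(pickTask(queue, "host2").id); // b1: jobA is drained
    }
}
```

In this model, changing the scheduling policy means changing pickTask (which 
job goes first), while the locality preference stays inside the job object.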
> 
> To implement your own Scheduler, it may be best to look at the sources of 
> existing schedulers: JobQueueTaskScheduler, CapacityTaskScheduler or 
> FairScheduler. In particular, the last two are in the contrib modules of 
> mapreduce, and hence are fairly self-contained and easy to follow. Their build files 
> will also tell you how to resolve any compile problems like the one you are 
> facing.
> 
> Thanks
> Hemanth
> 
> On Thu, Dec 27, 2012 at 4:10 PM, Yaron Gonen <[email protected]> wrote:
> Hi,
> If I understand correctly, the job scheduler (why is the class called 
> TaskScheduler?) is responsible for assigning the task whose split is as close 
> as possible to the tasktracker.
> This means the job scheduler is responsible for two things:
> 1. Selecting a job.
> 2. Once a job is selected, assigning the closest task to the tasktracker that 
>    sent the heartbeat.
> Is this correct?
> 
> I want to write my own job scheduler to change the logic above, but it says 
> "The type TaskScheduler is not visible".
> How can I write my own scheduler?
> 
> thanks
> 
