Hi,

As the name suggests, the fair scheduler allocates slots fairly among jobs. Say you have 10 map slots in your cluster, all occupied by job-1, which needs 30 map slots to finish. Meanwhile another job, job-2, needs only 2 map slots to finish. In that case, as job-1's map tasks complete and free their slots, slots will be given to job-2 so that it finishes quickly, while job-1 keeps running on the remaining slots.
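To make the 10-slot example concrete, here is a minimal sketch of an idealized max-min fair split. It is only an illustration, not the scheduler's actual code: the function name `fair_shares` and its arguments are made up for this mail, and it ignores pools, weights, minimum shares and preemption.

```python
def fair_shares(total_slots, demands):
    """Split total_slots among jobs max-min fairly, capped by each job's demand.

    demands maps a job name to the number of slots it could use right now.
    Hypothetical helper for illustration; not part of Hadoop.
    """
    shares = {job: 0 for job in demands}
    remaining = total_slots
    # Jobs that still want more slots than they currently hold.
    active = {job for job, need in demands.items() if need > 0}

    while remaining > 0 and active:
        # Offer every unsatisfied job an equal slice of what is left.
        per_job = max(remaining // len(active), 1)
        # Serve the smallest unmet demand first so small jobs are satisfied early.
        for job in sorted(active, key=lambda j: (demands[j] - shares[j], j)):
            if remaining == 0:
                break
            grant = min(per_job, demands[job] - shares[job], remaining)
            shares[job] += grant
            remaining -= grant
        # Satisfied jobs drop out; their unused slice is redistributed next round.
        active = {j for j in active if shares[j] < demands[j]}

    return shares


print(fair_shares(10, {"job-1": 30, "job-2": 2}))  # {'job-1': 8, 'job-2': 2}
```

With the numbers above, job-2 gets the 2 slots it needs and job-1 keeps the other 8. In a real cluster the slots move over as job-1's tasks finish, rather than instantly.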
On Tue, May 14, 2013 at 12:02 AM, Rahul Bhattacharjee <[email protected]> wrote:

> Any pointer to my question.
>
> There is another question , kind-of dumb , but just wanted to clarify.
>
> Say in a FIFO scheduler or a capacity scheduler , if there are slots
> available and the first job doesn't need all of the available slots , then
> the job next in the queue is scheduled for execution or that still waits
> for the first job to finish?
>

- Jobs don't wait for all the slots to be freed. A job starts executing as soon as it gets a slot. However, Hadoop does its best to assign a slot where the task can achieve data locality.

> Thanks,
> Rahul
>
>
> On Sat, May 11, 2013 at 8:31 PM, Rahul Bhattacharjee <
> [email protected]> wrote:
>
>> Hi,
>>
>> I was going through the job schedulers of Hadoop and could not see any
>> major operational difference between the capacity scheduler and the fair
>> share scheduler apart from the fact that fair share scheduler supports
>> preemption and capacity scheduler doesn't.
>>
>> Another thing is the former creates logical pools based on certain
>> attribute like username , user group etc and the later has a notion of job
>> queues. Can someone point me to any other major differences between these
>> two types of schedulers.
>>
>> Another question in this regard is the capacity scheduler uses a FIFO
>> queue.So its still possible that a high priority long running job using all
>> the capacity allocated to the queue to block all the other jobs after it in
>> the queue.I think this is the expected behavior , but wanted to confirm.
>>
>> Thanks,
>> Rahul
>>
>>
>>
>

Thanks
--
Alok
