Well, I have just one rack, which holds all the datanodes, so yes, rack-local. The data is scattered across all the datanodes, though, so I guess not "data-local". Is there a way to force it to use slots from all the tasktrackers? Or I could define 4 queues, each of which uses 1 slot, but even then it is not guaranteed that it would use slots from different TTs.

On Tue, Aug 23, 2011 at 4:07 PM, Arun C Murthy <a...@hortonworks.com> wrote:

> Were they all 'data-local' or 'rack-local' tasks? If so, it's expected.
>
> Arun
>
> On Aug 23, 2011, at 3:51 PM, Sulabh Choudhury wrote:
>
> > Hi,
> >
> > So I just started using capacity scheduler for M/R jobs. I have 4 task
> > trackers each with 4 map/reduce slots.
> > Configured a queue so that it uses 25% (4 slots) of the available slots. I
> > was expecting that it would distribute the job and use slots from each of
> > the 4 tasktrackers but it actually uses all 4 slots from a single TT.
> > Is there a configuration I am missing or this is the expected behavior ?

--
Thanks and Regards,
Sulabh Choudhury
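
To illustrate the behavior discussed in this thread, here is a toy model in Python. It is not Hadoop code. It assumes, as a simplification, that the scheduler hands a heartbeating tasktracker as many tasks as fit into its free slots, up to the queue's limit. The function name `assign_tasks`, its parameters, and the tracker names are hypothetical and exist only for this sketch.

```python
from collections import Counter


def assign_tasks(trackers, slots_per_tracker, queue_limit, fill_on_heartbeat=True):
    """Toy model of slot assignment; returns {tracker: tasks placed}.

    Assumption (not taken from Hadoop source): trackers heartbeat in list
    order, and the scheduler assigns tasks only to the heartbeating tracker.
    """
    placed = Counter()
    remaining = queue_limit  # slots the queue is still allowed to use
    while remaining > 0:
        progressed = False
        for tt in trackers:  # one heartbeat per tracker per round
            free = slots_per_tracker - placed[tt]
            if free <= 0 or remaining <= 0:
                continue
            # Greedy: fill every free slot; otherwise one task per heartbeat.
            n = min(free, remaining) if fill_on_heartbeat else 1
            placed[tt] += n
            remaining -= n
            progressed = True
        if not progressed:
            break  # no tracker has a free slot left
    return dict(placed)


trackers = ["tt1", "tt2", "tt3", "tt4"]
greedy = assign_tasks(trackers, slots_per_tracker=4, queue_limit=4)
spread = assign_tasks(trackers, slots_per_tracker=4, queue_limit=4,
                      fill_on_heartbeat=False)
print(greedy)
print(spread)
```

Under this assumption, the greedy variant places all 4 tasks on the first tracker that heartbeats, which matches the observation in the original question, while the one-task-per-heartbeat variant spreads them across the 4 trackers. Whether a given Hadoop version can be made to behave like the second variant is not something this sketch establishes.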