I think setting tasktracker.reduce.tasks.maximum to 1 may meet your requirement.
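
For example, a minimal sketch of the entry, assuming Hadoop 1.x, where I believe the full property name is mapred.tasktracker.reduce.tasks.maximum and it lives in mapred-site.xml on each TaskTracker node:

```xml
<!-- mapred-site.xml on each TaskTracker node (assumed Hadoop 1.x layout) -->
<configuration>
  <property>
    <!-- Maximum number of reduce tasks this TaskTracker runs at once -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
```

As far as I know, this is a TaskTracker-level setting that is read at daemon startup, so it needs a TaskTracker restart and applies to every job running on that node, not only to the one job.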

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University

On Friday, 8 February, 2013 at 10:54 PM, David Parks wrote:

> I have a cluster of boxes with 3 reducers per node. I want to limit a
> particular job to only run 1 reducer per node.
>
> This job is network IO bound, gathering images from a set of webservers.
>
> My job has certain parameters set to meet “web politeness” standards (e.g.
> limit connects and connection frequency).
>
> If this job runs from multiple reducers on the same node, those per-host
> limits will be violated. Also, this is a shared environment and I don’t want
> long running network bound jobs uselessly taking up all reduce slots.