I haven't used AWS MR before. If your instances are configured with 3 reducer slots, it means that 3 reducers can run at the same time on each node.

What do you mean by "this property is already set to 1 on my cluster"? This value can actually be node-specific. If the AWS MR instances allow it, you can modify mapred-site.xml to change it from 3 to 1.

Best,

--
Nan Zhu
School of Computer Science,
McGill University

On Friday, 8 February, 2013 at 11:24 PM, David Parks wrote:

> Hmm, odd, I’m using AWS Mapreduce, and this property is already set to 1 on
> my cluster by default (using 15 m1.xlarge boxes which come with 3 reducer
> slots configured by default).
>
> From: Nan Zhu [mailto:[email protected]]
> Sent: Saturday, February 09, 2013 10:59 AM
> To: [email protected]
> Subject: Re: How can I limit reducers to one-per-node?
>
> I think setting tasktracker.reduce.tasks.maximum to 1 may meet your
> requirement.
>
> Best,
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
>
> On Friday, 8 February, 2013 at 10:54 PM, David Parks wrote:
>
> > I have a cluster of boxes with 3 reducers per node. I want to limit a
> > particular job to only run 1 reducer per node.
> >
> > This job is network IO bound, gathering images from a set of webservers.
> >
> > My job has certain parameters set to meet “web politeness” standards (e.g.
> > limit connects and connection frequency).
> >
> > If this job runs from multiple reducers on the same node, those per-host
> > limits will be violated. Also, this is a shared environment and I don’t
> > want long running network bound jobs uselessly taking up all reduce slots.
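
For reference, the mapred-site.xml change suggested in the reply at the top of this thread might look roughly like the fragment below. This is only a sketch: it assumes Hadoop 1.x, where the full property name is mapred.tasktracker.reduce.tasks.maximum (the thread abbreviates it), and it assumes the AWS instances let you edit the file at all, which nobody in the thread has confirmed.

```xml
<!-- mapred-site.xml on each TaskTracker node (Hadoop 1.x property naming assumed) -->
<configuration>
  <property>
    <!-- Maximum number of reduce tasks this TaskTracker runs at the same time -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <!-- 1 reducer slot per node instead of the 3 mentioned in the thread -->
    <value>1</value>
  </property>
</configuration>
```

Note that this is a TaskTracker-level setting read when the daemon starts, so it would need a TaskTracker restart and would apply to every job on that node, not only to the one network-bound job.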
