Setting mapred.reduce.tasks won't work because Pig overrides it. See http://pig.apache.org/docs/r0.10.0/perf.html#parallel for info on how to set the number of reducers in Pig.
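The advice in that link boils down to two knobs: a script-wide SET default_parallel, and a per-operator PARALLEL clause. A minimal sketch (the input file, schema, and relation names here are made up for illustration):

```pig
-- Script-wide default number of reducers for all reduce-side operators
SET default_parallel 20;

-- Hypothetical input: tab-separated (user, bytes) records
logs = LOAD 'logs.txt' AS (user:chararray, bytes:long);

-- Per-operator override: this GROUP runs with 10 reducers
grp = GROUP logs BY user PARALLEL 10;

total = FOREACH grp GENERATE group, SUM(logs.bytes);
STORE total INTO 'output';
```

Note that PARALLEL and default_parallel only affect reduce-side operators (GROUP, JOIN, ORDER, DISTINCT, etc.); the number of map tasks is still determined by the input splits.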
Alan.

On Feb 1, 2013, at 4:53 PM, Mohit Anchlia wrote:

> Just a slightly different problem: I tried setting SET mapred.reduce.tasks to
> 200 in pig but still more tasks were launched for that job. Is there any
> other way to set the parameter?
>
> On Fri, Feb 1, 2013 at 3:15 PM, Harsha <[email protected]> wrote:
>
>> It's the total number of reducers, not active reducers.
>> If you specify a lower number, each reducer gets more data to process.
>> --
>> Harsha
>>
>> On Friday, February 1, 2013 at 2:54 PM, Mohit Anchlia wrote:
>>
>>> Thanks! Is there a downside to reducing the number of reducers? I am
>>> trying to alleviate high CPU.
>>>
>>> With fewer reducers, does the PARALLEL clause mean that more data is
>>> processed by each reducer, or does it control how many reducers can be
>>> active at one time?
>>>
>>> On Fri, Feb 1, 2013 at 2:44 PM, Harsha <[email protected]> wrote:
>>>
>>>> Mohit,
>>>> You can use the PARALLEL clause to specify reduce tasks. More info here:
>>>> http://pig.apache.org/docs/r0.8.1/cookbook.html#Use+the+Parallel+Features
>>>>
>>>> --
>>>> Harsha
>>>>
>>>> On Friday, February 1, 2013 at 2:42 PM, Mohit Anchlia wrote:
>>>>
>>>>> Is there a way to specify the max number of reduce tasks that a job
>>>>> should spawn in a pig script without having to restart the cluster?
