Sorry, my question was about mapred.map.tasks; I mistakenly specified the wrong parameter. In Pig I am setting mapred.map.tasks to 200, but more map tasks than that are being executed.
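For reference, here is a minimal sketch of what I am running (the paths and alias names below are placeholders, not my actual script):

    SET mapred.map.tasks 200;                         -- trying to cap map tasks at 200
    raw = LOAD '/data/input' USING PigStorage('\t');  -- placeholder input path
    grp = GROUP raw BY $0;
    cnt = FOREACH grp GENERATE group, COUNT(raw);
    STORE cnt INTO '/data/output';                    -- placeholder output path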
On Fri, Feb 1, 2013 at 5:04 PM, Alan Gates <[email protected]> wrote:

> Setting mapred.reduce.tasks won't work, as Pig overrides it. See
> http://pig.apache.org/docs/r0.10.0/perf.html#parallel for info on how to
> set the number of reducers in Pig.
>
> Alan.
>
> On Feb 1, 2013, at 4:53 PM, Mohit Anchlia wrote:
>
> > A slightly different problem: I tried setting SET mapred.reduce.tasks to
> > 200 in Pig, but more tasks were still launched for that job. Is there
> > any other way to set the parameter?
> >
> > On Fri, Feb 1, 2013 at 3:15 PM, Harsha <[email protected]> wrote:
> >
> > > It's the total number of reducers, not the number of active reducers.
> > > If you specify a lower number, each reducer gets more data to process.
> > > --
> > > Harsha
> > >
> > > On Friday, February 1, 2013 at 2:54 PM, Mohit Anchlia wrote:
> > >
> > > > Thanks! Is there a downside to reducing the number of reducers? I am
> > > > trying to alleviate high CPU.
> > > >
> > > > With fewer reducers under the PARALLEL clause, does each reducer
> > > > process more data, or does it limit how many reducers can be active
> > > > at one time?
> > > >
> > > > On Fri, Feb 1, 2013 at 2:44 PM, Harsha <[email protected]> wrote:
> > > >
> > > > > Mohit,
> > > > > you can use the PARALLEL clause to specify reduce tasks. More info
> > > > > here:
> > > > > http://pig.apache.org/docs/r0.8.1/cookbook.html#Use+the+Parallel+Features
> > > > >
> > > > > --
> > > > > Harsha
> > > > >
> > > > > On Friday, February 1, 2013 at 2:42 PM, Mohit Anchlia wrote:
> > > > >
> > > > > > Is there a way to specify the max number of reduce tasks that a
> > > > > > job should spawn in a Pig script without having to restart the
> > > > > > cluster?
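Following up for the archives: per Alan's and Harsha's replies above, the reducer count is controlled from the script itself rather than through mapred.reduce.tasks. A minimal sketch of both options described in the linked docs (alias names are made up):

    SET default_parallel 200;             -- script-wide default for all reduce stages
    grp = GROUP raw BY $0 PARALLEL 200;   -- per-statement override for this operator only

As Harsha noted, these set the total number of reducers for the job, not how many run concurrently; concurrency is still bounded by the cluster's available reduce slots.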
