Pankaj,

What version of Pig are you using? Later versions of Pig have some logic for setting parallelism automatically (though these heuristics will sometimes be wrong).
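
For a rough idea of what that automatic logic does: as far as I recall from the Pig documentation, when no PARALLEL clause or default_parallel is set, Pig 0.8 and later estimate the reducer count from the total input size, using the properties pig.exec.reducers.bytes.per.reducer (about 1 GB by default) and pig.exec.reducers.max (999 by default). The Python below is only a sketch of that arithmetic under those assumed defaults, not Pig's actual code. The function name estimate_reducers is made up for illustration, and the exact rounding may differ between versions.

```python
import math

# Assumed defaults, taken from memory of the Pig docs; verify for your version.
BYTES_PER_REDUCER = 1_000_000_000  # pig.exec.reducers.bytes.per.reducer
MAX_REDUCERS = 999                 # pig.exec.reducers.max


def estimate_reducers(total_input_bytes,
                      bytes_per_reducer=BYTES_PER_REDUCER,
                      max_reducers=MAX_REDUCERS):
    """Hypothetical helper: one reducer per bytes_per_reducer of input,
    never fewer than 1 and never more than max_reducers."""
    wanted = math.ceil(total_input_bytes / bytes_per_reducer)
    return max(1, min(max_reducers, wanted))


if __name__ == "__main__":
    # Illustrative input sizes only, not measurements from any real job.
    for size in (0, 5_500_000_000, 10**13):
        print(size, "bytes ->", estimate_reducers(size), "reducers")
```

Note that the estimate looks only at input size, so a job whose data grows or shrinks a lot between the map and reduce phases can still end up with a poor reducer count, which is one way these heuristics go wrong.
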
There are also some operations which will force you to use 1 reducer. It depends on what your script is doing.

2012/6/1 Pankaj Gupta <[email protected]>

> Hi,
>
> I just realized that one of my large scale pig jobs that has 100K map jobs
> actually only has one reduce task. Reading the documentation I see that the
> number of reduce tasks is defined by the PARALLEL clause whose default
> value is 1. I have a few questions around this:
>
> # Why is the default value of reduce tasks 1?
> # (Related to first question) Why aren't reduce tasks parallelized
> automatically in Pig?
> # How do I choose a good value of reduce tasks for my pig jobs?
>
> Thanks in Advance,
> Pankaj
