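The number of map tasks is determined by the input splits, not by mapred.map.tasks, which is only a hint. If your dataset is larger than one block (and not gzipped, since gzip files cannot be split), one map is created for each split, and by default a split is the size of an HDFS block.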
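To make the split arithmetic above concrete, here is a rough Python sketch. The function name, the 64 MB block size, and the file size in the usage line are assumptions for illustration only; they are not taken from your cluster, and the real InputFormat logic has extra edge cases that this ignores.

```python
def estimate_map_tasks(file_sizes, split_size=64 * 1024 * 1024, splittable=True):
    """Rough estimate: one map per split for splittable files, one per file otherwise.

    file_sizes -- sizes in bytes of the (non-empty) input files
    split_size -- assumed split size in bytes (64 MB here, an assumption)
    splittable -- False for formats such as gzip that cannot be split
    """
    total = 0
    for size in file_sizes:
        if splittable:
            # Integer ceiling division: a partial last block still needs a map.
            total += -(-size // split_size)
        else:
            # A non-splittable file is read by a single map, whatever its size.
            total += 1
    return total


# Hypothetical input: one uncompressed file spanning exactly 214 blocks of 64 MB.
maps = estimate_map_tasks([214 * 64 * 1024 * 1024])
```

With those assumed numbers the estimate comes out at 214 maps, regardless of what mapred.map.tasks is set to. The same file gzipped would be read by a single map.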
On Fri, Mar 9, 2012 at 4:39 PM, Mohit Anchlia <[email protected]> wrote:

> I have "set mapred.map.tasks 5" in the pig job and still I am seeing
> around 214 map tasks and around 30 actively running jobs. I was expecting
> only 5 map tasks.
>
> My cluster has 5 nodes.
