What version of Pig are you using? Starting in 0.8, Pig will combine
small blocks into a single map. This prevents jobs that are actually
reading small amounts of data from taking up a lot of slots on the
cluster. You can turn this off by adding
-Dpig.noSplitCombination=true to your command line.
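For example, something like this (myscript.pig here is just a
placeholder for your own script):

    # disable Pig's split combining for this run
    pig -Dpig.noSplitCombination=true myscript.pig

Setting the same property in your pig.properties file should also
work if you want it off for every run.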
Alan.
On Mar 23, 2011, at 5:45 PM, Dexin Wang wrote:
And the nodes are pretty lightly loaded (~1.0) and there's plenty of
free memory. Now I'm seeing 2 mappers per node. Very much
under-utilized.
On Wed, Mar 23, 2011 at 1:39 PM, Dexin Wang <[email protected]> wrote:
Hi,
We've seen a strange problem where some Pig jobs run fewer mappers
concurrently than the cluster's mapper capacity. Specifically, we
have a 10-node cluster and each node is configured for 12 mappers,
so normally we have 120 mappers running. But some Pig jobs will only
have 10 mappers running (while nothing else is running on the
cluster), which appears to be 1 mapper per node.

We have not noticed the same problem with other, non-Pig Hadoop
jobs. Has anyone experienced the same thing, and do you have any
explanation or remedy?
Thanks!
Dexin