Hi,

If mapred.max.split.size is not set (and it is not set by default), then
CombineFileInputFormat only takes racks into account when grouping blocks.
If you set this property, it will also take block placement on individual
machines into account, and you should get multiple mappers.
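
To illustrate why the property matters, here is a rough, dependency-free Java
sketch of the grouping idea. It is a simplified model and not Hadoop's actual
code: the class SplitGroupingSketch, the Block type, and the countSplits and
demoBlocks helpers are all hypothetical names, each block is assumed to live
on exactly one node (no replication), and minimum split sizes are ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of rack/node grouping; NOT the real CombineFileInputFormat.
class SplitGroupingSketch {

    /** One block: its size in bytes plus the (hypothetical) node and rack holding it. */
    static final class Block {
        final long size;
        final String node;
        final String rack;

        Block(long size, String node, String rack) {
            this.size = size;
            this.node = node;
            this.rack = rack;
        }
    }

    /** Groups blocks by node name (byNode == true) or by rack name. */
    static Map<String, List<Block>> groupBy(List<Block> blocks, boolean byNode) {
        Map<String, List<Block>> groups = new LinkedHashMap<>();
        for (Block b : blocks) {
            String key = byNode ? b.node : b.rack;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(b);
        }
        return groups;
    }

    /** Counts combined splits (one mapper per split); maxSplitSize == 0 means "not set". */
    static int countSplits(List<Block> blocks, long maxSplitSize) {
        int splits = 0;
        List<Block> remaining = new ArrayList<>(blocks);

        if (maxSplitSize > 0) {
            // Node pass: only happens when a maximum split size is configured.
            remaining.clear();
            for (List<Block> nodeBlocks : groupBy(blocks, true).values()) {
                long current = 0;
                List<Block> pending = new ArrayList<>();
                for (Block b : nodeBlocks) {
                    pending.add(b);
                    current += b.size;
                    if (current >= maxSplitSize) {
                        splits++;          // emit a node-local split
                        pending.clear();
                        current = 0;
                    }
                }
                remaining.addAll(pending); // leftovers fall through to the rack pass
            }
        }

        // Rack pass: with no maximum, everything in a rack ends up in one split.
        for (List<Block> rackBlocks : groupBy(remaining, false).values()) {
            long current = 0;
            for (Block b : rackBlocks) {
                current += b.size;
                if (maxSplitSize > 0 && current >= maxSplitSize) {
                    splits++;
                    current = 0;
                }
            }
            if (current > 0) {
                splits++;                  // remainder of the rack becomes one split
            }
        }
        return splits;
    }

    /** Eight 64 MB blocks, two per node on four nodes, all in a single rack. */
    static List<Block> demoBlocks() {
        List<Block> blocks = new ArrayList<>();
        long blockSize = 64L * 1024 * 1024;
        for (int n = 0; n < 4; n++) {
            blocks.add(new Block(blockSize, "node" + n, "/default-rack"));
            blocks.add(new Block(blockSize, "node" + n, "/default-rack"));
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<Block> blocks = demoBlocks();
        System.out.println("max split size unset:  " + countSplits(blocks, 0L));
        System.out.println("max split size 128 MB: " + countSplits(blocks, 128L * 1024 * 1024));
    }
}
```

In this toy setup all blocks sit in one rack, so leaving the maximum unset
collapses everything into a single split, while a 128 MB maximum yields one
split per node. The real input format has more cases (replicas, minimum sizes
per node and rack), so treat this only as an intuition aid.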

Hope this helps,
Aleksandar Stupar.




________________________________
From: Keith Wiley <kwi...@keithwiley.com>
To: common-user@hadoop.apache.org
Sent: Thu, April 29, 2010 11:23:35 PM
Subject: CombineFileInputFormat not producing multiple mappers

I am using CombineFileInputFormat and CombineFileSplit to group small input 
files before they are fed to the mappers.  The job runs properly and the output 
is correct, but I get only one mapper task, so I lose all my parallelization in 
the map stage.

I realize I'm not providing much detail yet because I'm not sure what to say.  
Feel free to ask questions for clarification.

What might cause this problem and how might I diagnose -- much less fix -- it?

Thank you.

________________________________________________________________________________
Keith Wiley              kwi...@keithwiley.com              www.keithwiley.com

"And what if we picked the wrong religion?  Every week, we're just making God
madder and madder!"
  -- Homer Simpson
________________________________________________________________________________


      
