Jonathan,

On Fri, Nov 16, 2007 at 12:00:21PM -0600, jonathan doklovic wrote:
>Hi,
>
>We've finally got our hadoop cluster up, some data to crunch and a
>map/reduce job.
>
>After running a few configurations, i'm not sure about our performance
>and would like to get some advice....
>
>We have a 20 node ec2 cluster.
>We have 750MB of data.
>currently our job seems to be doing 1%/min on the cluster.
>Using a much smaller subset of data and running locally, the job takes a
>matter of seconds.
>
>
>Here's our hadoop-site.xml
>
>
><property>
> <name>mapred.tasktracker.tasks.maximum</name>
> <value>20</value>
></property>
>

That is very high; you are basically letting each TaskTracker spawn 40 child JVMs. I wouldn't go above 3 or 4 for that config.

Hopefully http://lucene.apache.org/hadoop/cluster_setup.html is useful...

Arun
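
P.S. A rough sketch of what the adjusted hadoop-site.xml entry might look like. The value 4 is only an illustrative pick from the 3 or 4 range above, not something measured on your cluster. Going by the 40 figure, the limit seems to be counted once for maps and once for reduces, so this would allow roughly 8 child JVMs per node instead of 40:

```xml
<property>
  <!-- Same property as in the quoted config. The value 4 is an
       illustrative assumption; tune it to the cores and memory
       actually available on each node. -->
  <name>mapred.tasktracker.tasks.maximum</name>
  <value>4</value>
</property>
```

The TaskTrackers would most likely need a restart to pick up the new value.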
