Are you the Esteban I know?
On 07/07/2011, at 15:53, Esteban Gutierrez <[email protected]> wrote:

> Hi Pony,
>
> There is a good chance that your boxes are doing some heavy swapping, and
> that is a killer for Hadoop. Have you tried
> mapred.job.reuse.jvm.num.tasks=-1 and limiting the heap as much as
> possible on those boxes?
>
> Cheers,
> Esteban.
>
> --
> Get Hadoop! http://www.cloudera.com/downloads/
>
>
> On Thu, Jul 7, 2011 at 1:29 PM, Juan P. <[email protected]> wrote:
>
>> Hi guys!
>>
>> I'd like some help fine-tuning my cluster. I currently have 20 boxes,
>> exactly alike: single-core machines with 600MB of RAM. No chance of
>> upgrading the hardware.
>>
>> My cluster is made up of 1 NameNode/JobTracker box and 19
>> DataNode/TaskTracker boxes.
>>
>> All my config is default, except I've set the following in my
>> mapred-site.xml in an effort to prevent choking my boxes:
>>
>> <property>
>>   <name>mapred.tasktracker.map.tasks.maximum</name>
>>   <value>1</value>
>> </property>
>>
>> I'm running a MapReduce job which reads a proxy server log file (2GB),
>> maps a host to each record, and then in the reduce task accumulates the
>> number of bytes received from each host.
>>
>> Currently it's producing about 65000 keys.
>>
>> The whole job takes forever to complete, especially the reduce part.
>> I've tried different tuning configs, but I can't bring it down under
>> 20 mins.
>>
>> Any ideas?
>>
>> Thanks for your help!
>> Pony
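For reference, Esteban's suggestion would translate to roughly the following mapred-site.xml snippet. The mapred.child.java.opts property and the -Xmx value are assumptions on my part (the thread only names the JVM-reuse setting); the heap cap would need to be sized for these 600MB boxes.

<!-- A minimal sketch of the suggested settings; the -Xmx value is an
     assumption and would need tuning for 600MB machines. -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>-1</value> <!-- reuse task JVMs for an unlimited number of tasks -->
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx200m</value> <!-- cap the per-task heap to reduce swapping -->
</property>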
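For context, the job Pony describes would look roughly like the sketch below (new mapreduce API). The log layout (host in field 0, bytes in field 4) and the class names are assumptions for illustration only.

// Rough sketch: map each proxy log record to (host, bytes) and sum the
// bytes per host in the reducer. Field positions are assumed.
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class BytesPerHost {

  public static class HostMapper
      extends Mapper<LongWritable, Text, Text, LongWritable> {
    private final Text host = new Text();
    private final LongWritable bytes = new LongWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split("\\s+");
      // Assumed layout: host in field 0, byte count in field 4.
      if (fields.length > 4 && fields[4].matches("\\d+")) {
        host.set(fields[0]);
        bytes.set(Long.parseLong(fields[4]));
        context.write(host, bytes);
      }
    }
  }

  public static class SumReducer
      extends Reducer<Text, LongWritable, Text, LongWritable> {
    private final LongWritable total = new LongWritable();

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values,
        Context context) throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable v : values) {
        sum += v.get();
      }
      total.set(sum);
      context.write(key, total);
    }
  }
}

Since the reduce is a plain sum, the same class could probably also be registered as a combiner (job.setCombinerClass(SumReducer.class)), which would shrink the shuffle for 65000 keys over 2GB of input.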
