I left out a key piece of information: the servers are *Amazon EC2 M1 Medium* instances.

2013/1/18 yaotian <[email protected]>

> Hi,
>
> *=> My machine environment:*
> 1 master: 1 CPU core, 2 GHz, 1 GB memory
> 2 slaves (datanodes): 1 CPU core, 2 GHz, 4 GB memory
> hadoop: hadoop-0.20.205.0
>
> *=> My data:*
> User GPS trace analysis. Each user has many GPS location records, and we
> want to analyze them.
>
> *=> My questions:*
> 1. We have 2 datanodes, but Hadoop used only 1 server. Is that not
> efficient?
>
> 2. When I run the job on 200 MB of data, it succeeds. But when I run it on
> 30 GB of data, it always reports "Task attempt_201301171429_0013_r_000000_0
> failed to report status for 600 seconds. Killing!"
>
> *=> My map-reduce config:*
> <configuration>
>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>master:9001</value>
>   </property>
>
>   <property>
>     <name>mapred.reduce.parallel.copies</name>
>     <value>50</value>
>   </property>
>
>   <property>
>     <name>mapred.compress.map.output</name>
>     <value>true</value>
>   </property>
>
>   <property>
>     <name>mapred.job.shuffle.merge.percent</name>
>     <value>0.75</value>
>   </property>
>
>   <property>
>     <name>mapred.job.tracker.http.address</name>
>     <value>0.0.0.0:9003</value>
>   </property>
>
>   <property>
>     <name>mapreduce.reduce.memory.mb</name>
>     <value>4000</value>
>   </property>
>
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value>-Xmx2000m</value>
>   </property>
>
>   <property>
>     <name>mapreduce.reduce.java.opts</name>
>     <value>-Xmx2000m</value>
>   </property>
>
>   <property>
>     <name>mapred.reduce.tasks</name>
>     <value>AutoReduce</value>
>   </property>
>
>   <property>
>     <name>io.sort.factor</name>
>     <value>12</value>
>   </property>
>
>   <property>
>     <name>io.sort.mb</name>
>     <value>300</value>
>   </property>
>
>   <property>
>     <name>io.file.buffer.size</name>
>     <value>65536</value>
>   </property>
>
>   <property>
>     <name>dfs.datanode.handler.count</name>
>     <value>8</value>
>   </property>
> </configuration>
