Hello Hadoopers,

I am running the RandomWriter example on an 8-node cluster. The default settings create 1 GB per mapper with 10 mappers per host, i.e. 10 GB per host; with 3-way replication, that is essentially 30 GB written per host. Since each node in the cluster has at most 30 GB of disk, the cluster fills up and cannot execute any further commands.

I created a new application configuration file specifying 1 GB per mapper, but the job still seems to create 30 GB of data and still runs each node out of disk. Is this the right way to generate less than 10 GB of data with RandomWriter? Below are the command and the application configuration file I used.
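To spell out the arithmetic above (a rough calculation; the replication factor of 3 is the HDFS default and an assumption on my part):

```python
# Back-of-envelope check of why the defaults fill a 30 GB node:
# 1 GB per mapper x 10 mappers per host x 3-way replication.
bytes_per_map = 1 << 30   # RandomWriter default: 1 GB per mapper
maps_per_host = 10        # RandomWriter default
replication = 3           # HDFS default replication factor (assumption)
hosts = 8

per_host = bytes_per_map * maps_per_host * replication
print(per_host / 2**30)           # 30.0 GB per host -> fills a 30 GB disk
print(per_host * hosts / 2**30)   # 240.0 GB across the 8-node cluster
```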
bin/hadoop jar hadoop-0.17.0-examples.jar randomwriter rand -conf randConfig.xml

Below is the application configuration file I am using:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>test.randomwriter.maps_per_host</name>
    <value>10</value>
    <description></description>
  </property>
  <property>
    <name>test.randomwrite.bytes_per_map</name>
    <value>103741824</value>
    <description></description>
  </property>
  <property>
    <name>test.randomwrite.min_key</name>
    <value>10</value>
    <description></description>
  </property>
  <property>
    <name>test.randomwrite.max_key</name>
    <value>1000</value>
    <description></description>
  </property>
  <property>
    <name>test.randomwrite.min_value</name>
    <value>0</value>
    <description>Default block replication.</description>
  </property>
  <property>
    <name>test.randomwrite.max_value</name>
    <value>20000</value>
    <description>Default block replication.</description>
  </property>
</configuration>
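As a sanity check (a rough calculation, again assuming the default replication factor of 3), if the -conf file above were actually being picked up, it should produce far less than 30 GB per host. Note that the value 103741824 works out to about 99 MB, not 1 GB:

```python
# Rough check of the data volume the posted randConfig.xml should produce,
# assuming an 8-node cluster and the default replication factor of 3.
hosts = 8
maps_per_host = 10            # test.randomwriter.maps_per_host
bytes_per_map = 103741824     # test.randomwrite.bytes_per_map (~99 MB, not 1 GB)
replication = 3               # HDFS default (assumption)

per_host = maps_per_host * bytes_per_map * replication
print(per_host / 2**30)           # ~2.9 GB per host if the config is applied
print(per_host * hosts / 2**30)   # ~23.2 GB across the cluster
```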