Dear all,

I am setting up and configuring a small Hadoop cluster of 11 nodes for teaching purposes. All machines are identical and have the following specs:
- 4-core Intel(R) Xeon(R) CPU E3-1270 (3.5 GHz)
- 16 GB of RAM
- Debian Squeeze

I use a version of Hadoop 0.20.2 packaged by Cloudera (hadoop-0.20.2-cdh3u5). The significant configuration options I changed are:

- mapred.tasktracker.map.tasks.maximum : 4
- mapred.tasktracker.reduce.tasks.maximum : 2
- mapred.child.java.opts : -Xmx1500m
- mapred.child.ulimit : 4500000
- io.sort.mb : 200
- io.sort.factor : 64
- io.file.buffer.size : 65536
- mapred.jobtracker.taskScheduler : org.apache.hadoop.mapred.FairScheduler
- mapred.reduce.tasks : 10
- mapred.reduce.parallel.copies : 10
- mapred.reduce.slowstart.completed.maps : 0.8

Most of these values were taken from the "Hadoop Operations" book.

My problem is the following: when running jobs on the cluster, I often get the following errors in my mappers:

  java.lang.Throwable: Child Error
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:250)
  Caused by: java.io.IOException: Task process exit with nonzero status of 1.
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:237)

  Error occurred during initialization of VM
  Could not reserve enough space for object heap

I initially set mapred.child.ulimit to 3000000 and then increased it to 4500000, with no change.

I don't understand why I get these memory errors. As I understand it, each node should use 1 + 1 + 4*1.5 + 2*1.5 = 11 GB of RAM at most, leaving plenty of margin (the first 2 GB are for the TaskTracker and DataNode processes). No other software is running on these machines. The JobTracker and NameNode are on two separate machines that are not part of these 11 workers.

Do any of you have advice on how I could prevent these errors? All jobs complete successfully, but the failed task attempts slow things down a bit and leave me with the impression that I got something wrong. Are there any issues with my configuration options, given the hardware specs of my machines?

Thanks in advance for any help or pointers!

Cheers,
Vincent
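
P.S. For reference, the memory estimate above can be written out as a short Python sketch. It is only a rough check, not a diagnosis: it counts JVM heap (-Xmx) alone and ignores non-heap JVM overhead, it takes 1 GB each for the TaskTracker and DataNode as stated in the message, and it uses decimal units (1 GB = 1000 MB) to match the 11 GB figure. The function name node_budget_mb is made up for illustration. The last line compares mapred.child.ulimit, which Hadoop interprets as a virtual-memory limit in KB, against the child heap size.

```python
# Per-node memory budget, using the figures quoted in the message above.
# Assumption: only JVM heap (-Xmx) is counted; real JVM processes use more.

MAP_SLOTS = 4              # mapred.tasktracker.map.tasks.maximum
REDUCE_SLOTS = 2           # mapred.tasktracker.reduce.tasks.maximum
CHILD_HEAP_MB = 1500       # mapred.child.java.opts: -Xmx1500m
DAEMON_MB = 1000           # TaskTracker and DataNode, 1 GB each
NODE_RAM_MB = 16 * 1000    # 16 GB of RAM, decimal units
CHILD_ULIMIT_KB = 4500000  # mapred.child.ulimit (virtual memory, in KB)


def node_budget_mb(map_slots, reduce_slots, child_heap_mb, daemon_mb):
    """Worst-case heap total on one worker when every task slot is busy."""
    return 2 * daemon_mb + (map_slots + reduce_slots) * child_heap_mb


total_mb = node_budget_mb(MAP_SLOTS, REDUCE_SLOTS, CHILD_HEAP_MB, DAEMON_MB)
ulimit_mb = CHILD_ULIMIT_KB / 1024

print(f"worst-case heap total: {total_mb / 1000:.1f} GB")
print(f"headroom on the node:  {(NODE_RAM_MB - total_mb) / 1000:.1f} GB")
print(f"ulimit per child:      {ulimit_mb:.0f} MB for a {CHILD_HEAP_MB} MB heap")
```

The heap total matches the 11 GB computed by hand, so the per-node sum itself does not explain the failures. The ulimit comparison is included because that limit applies to each child's virtual address space rather than to resident memory.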
