Also, I had one other question: is the default HADOOP_HEAPSIZE (of 1000m) sufficient, or is increasing it recommended?
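(For reference, HADOOP_HEAPSIZE is set in conf/hadoop-env.sh and controls the heap of the Hadoop daemons themselves (TaskTracker, DataNode, etc.), not the child task JVMs. A minimal sketch of raising it, assuming the stock hadoop-env.sh layout; the 2000 is purely illustrative, not a recommendation:

  # conf/hadoop-env.sh -- value is in MB; 1000 is the shipped default
  export HADOOP_HEAPSIZE=2000
)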
On Sep 3, 2011, at 6:41 PM, Bryan Keller wrote:

> Is there any rule-of-thumb for setting the maximum number of mappers and
> reducers per task tracker, via the mapred.tasktracker.xxx.tasks.maximum
> properties? I have data nodes with 24 cores (4 CPUs w/ 6 cores) and 24 GB
> RAM. I have the child processes using -Xmx1024m, so 1 GB each.
>
> I currently have the maximums set to 16. This will potentially result in 32
> processes (16 mappers and 16 reducers), so more processes than cores and more
> potential memory use than physical memory. However, it also potentially
> leaves resources unused if I am running a map-only job, in which case only 16
> mapper processes will be used, so 8 cores and 8 GB aren't doing much.
>
> What have others been setting these values to, and for what hardware?
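For concreteness, here is a sketch of the mapred-site.xml settings being described above (the 16/16 task maximums and the 1 GB child heap). The values simply mirror the setup in the quoted message and are not a recommendation:

  <!-- mapred-site.xml (Hadoop 0.20.x property names) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>16</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>16</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>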
