Hey guys,
I am running Hive and trying to join two tables (2.2GB and 136MB) on a
cluster of 9 nodes (replication = 3).
Hadoop version - 0.20.2
Each data node memory - 2GB
HADOOP_HEAPSIZE - 1000MB
Other heap settings are at their defaults. Hive launches 40 map tasks, and
every task fails with the same error:
2011-09-19 18:37:17,110 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 300
2011-09-19 18:37:17,223 FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:781)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
It looks like I need to tweak some of the heap settings so the TaskTrackers
(TTs) can handle memory efficiently. I can't work out which variables to
modify (there are too many related to heap sizes).
Is there anything specific I should look at?
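For what it's worth, here is a rough sanity check of the numbers. The 300 comes from the io.sort.mb line in the log above. The 200 is only my assumption: I believe the stock mapred.child.java.opts in 0.20 is -Xmx200m, but I have not confirmed what my tasks actually run with. The function name is just something I made up for illustration.

```python
# Rough check: can the map-side sort buffer fit inside the child JVM heap?
# ASSUMPTION: child heap is the stock -Xmx200m (not verified on my cluster).

def sort_buffer_fits(io_sort_mb: int, child_heap_mb: int) -> bool:
    """Return True if a sort buffer of io_sort_mb MB is smaller than the heap."""
    # The buffer is allocated up front when the map output buffer is built,
    # so it has to be smaller than the heap (the task needs room beyond it too).
    return io_sort_mb < child_heap_mb


io_sort_mb = 300             # taken from the MapTask log line above
assumed_child_heap_mb = 200  # assumed default, see note above

print(sort_buffer_fits(io_sort_mb, assumed_child_heap_mb))  # prints False
```

If that assumption holds, the buffer alone is bigger than the heap, which would fit with the OutOfMemoryError being thrown from the MapOutputBuffer constructor.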
Thanks,
jS