Here are some ideas you might want to try:
1) Limit the thread stack size.
2) Set the heap available to the mapper JVM.
For example, here's a setting that gets 10 GB of heap and uses a smaller
stack (64k) for the threads:
-Dmapred.child.java.opts="-Xms10g -Xmx10g -Xss64k"
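If it's easier, the same property can be passed on the command line when
submitting the job (a sketch; the jar and class names below are
placeholders for your own job, not real artifacts):

```shell
# Hadoop's GenericOptionsParser picks up -D options placed right after
# the main class, before any job-specific arguments.
hadoop jar my-giraph-job.jar org.example.PageRankJob \
    -Dmapred.child.java.opts="-Xms10g -Xmx10g -Xss64k"
```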
Also, you might want to try using EdgeListVertex instead of Vertex
(i.e. GiraphJob.setVertexClass(EdgeListVertex.class)); it is quite a bit
more memory efficient.
Let us know if that helps you. You should also check whether your
Hadoop installation is using a 32-bit or 64-bit JVM. If it's 32-bit, you
will be limited in how much heap you can use.
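If you're not sure which JVM the task trackers launch, here's a quick
self-contained check (a sketch; run it with the same java binary Hadoop
uses, e.g. via `java -version` you'd also see "64-Bit Server VM" in the
banner on a 64-bit HotSpot JVM):

```java
// Prints the JVM's data model (32- vs 64-bit) and the maximum heap it
// will allow, which is what -Xmx is ultimately bounded by.
public class JvmCheck {
    public static void main(String[] args) {
        // sun.arch.data.model is HotSpot-specific ("32" or "64");
        // os.arch is the portable fallback.
        String model = System.getProperty("sun.arch.data.model");
        String arch = System.getProperty("os.arch");
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Data model: " + model + "-bit (os.arch=" + arch + ")");
        System.out.println("Max heap:   " + maxHeapMb + " MB");
    }
}
```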
On 11/17/11 9:38 PM, Yingyi Bu wrote:
I'm running a Giraph PageRank job. I tried it with 8GB of input text
data over 10 nodes (each has 4 cores, 4 disks, and 12GB of physical
memory), that is, 800MB of input data per machine. However, the Giraph
job fails because of high GC costs and an out-of-memory exception.
Should I set anything special in the Hadoop configuration, for
example, the maximum heap size for the map task JVM?