This appears to be the log for a single worker. Most likely your worker
went into GC and never returned. You can try turning on GC logging by
adding something like:
-XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -verbose:gc
You could also try the concurrent mark/sweep collector:
-XX:+UseConcMarkSweepGC
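
For reference, a rough sketch of how these flags might reach the map
tasks, assuming they are passed through the mapred.child.java.opts
property in mapred-site.xml (the property mentioned further down in this
thread). The -Xmx value is only a placeholder, not a recommendation:

```xml
<!-- mapred-site.xml fragment (sketch). -Xmx4g is a placeholder heap size;
     the remaining flags are the GC logging and CMS options listed above. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx4g -verbose:gc -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC</value>
</property>
```

With -verbose:gc the collector output should end up in each task's
stdout log, so long pauses or back-to-back full GCs would be visible
there.
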
Any chance you can use more workers and/or get more memory?
Avery
On 4/3/14, 5:46 PM, Vikesh Khanna wrote:
@Avery,
Thanks for the help. I checked the task logs, and it turns out there
was a "GC overhead limit exceeded" exception, because of which the
benchmark wouldn't even load the vertices. I got around it by
increasing the heap size (mapred.child.java.opts) in mapred-site.xml.
The benchmark is loading vertices now. However, the job is still
getting stuck indefinitely (and is eventually killed). I have attached
the small log for the map task on 1 worker. I would really appreciate
it if you could help me understand the cause.
Thanks,
Vikesh Khanna,
Masters, Computer Science (Class of 2015)
Stanford University
------------------------------------------------------------------------
*From: *"Praveen kumar s.k" <[email protected]>
*To: *[email protected]
*Sent: *Thursday, April 3, 2014 4:40:07 PM
*Subject: *Re: Giraph job hangs indefinitely and is eventually killed by JobTracker
You have given -w 30; make sure that at least that many map tasks are
configured in your cluster.
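
A sketch of the setting this refers to, assuming a Hadoop 0.20.x
TaskTracker, where the number of map slots per node is controlled by
mapred.tasktracker.map.tasks.maximum in mapred-site.xml. As far as I
know, Giraph by default also runs the master in its own map task, so the
value 31 (30 workers plus 1 master) is an assumed minimum for a
single-node setup, not a verified figure:

```xml
<!-- mapred-site.xml fragment (sketch). 31 = 30 workers + 1 master task,
     assuming the master is not co-located with a worker. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>31</value>
</property>
```

The TaskTracker normally has to be restarted before a change to this
property takes effect.
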
On Thu, Apr 3, 2014 at 6:24 PM, Avery Ching <[email protected]> wrote:
> My guess is that you aren't getting your resources. It would be very
> helpful to print the master log. You can find it, while the job is
> running, by looking at the Hadoop counters on the job UI page.
>
> Avery
>
>
> On 4/3/14, 12:49 PM, Vikesh Khanna wrote:
>
> Hi,
>
> I am running the PageRank benchmark under giraph-examples from the
> giraph-1.0.0 release. I am using the following command to run the job
> (as mentioned here):
>
> vikesh@madmax /lfs/madmax/0/vikesh/usr/local/giraph/giraph-examples/src/main/java/org/apache/giraph/examples
> $ $HADOOP_HOME/bin/hadoop jar \
>     $GIRAPH_HOME/giraph-core/target/giraph-1.0.0-for-hadoop-0.20.203.0-jar-with-dependencies.jar \
>     org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 50000000 -w 30
>
>
> However, the job gets stuck at map 9% and is eventually killed by the
> JobTracker on reaching the mapred.task.timeout (default 10 minutes). I
> tried increasing the timeout to a very large value, and the job went on
> for over 8 hours without completion. I also tried the
> ShortestPathsBenchmark, which also fails the same way.
>
>
> Any help is appreciated.
>
>
> ****** ---------------- ***********
>
>
> Machine details:
>
> Linux version 2.6.32-279.14.1.el6.x86_64
> ([email protected]) (gcc version 4.4.6 20120305
> (Red Hat 4.4.6-4) (GCC) ) #1 SMP Tue Nov 6 23:43:09 UTC 2012
>
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Byte Order: Little Endian
> CPU(s): 64
> On-line CPU(s) list: 0-63
> Thread(s) per core: 1
> Core(s) per socket: 8
> CPU socket(s): 8
> NUMA node(s): 8
> Vendor ID: GenuineIntel
> CPU family: 6
> Model: 47
> Stepping: 2
> CPU MHz: 1064.000
> BogoMIPS: 5333.20
> Virtualization: VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 256K
> L3 cache: 24576K
> NUMA node0 CPU(s): 1-8
> NUMA node1 CPU(s): 9-16
> NUMA node2 CPU(s): 17-24
> NUMA node3 CPU(s): 25-32
> NUMA node4 CPU(s): 0,33-39
> NUMA node5 CPU(s): 40-47
> NUMA node6 CPU(s): 48-55
> NUMA node7 CPU(s): 56-63
>
>
> I am using a pseudo-distributed Hadoop cluster on a single machine with
> 64 cores.
>
>
> *****-------------*******
>
>
> Thanks,
> Vikesh Khanna,
> Masters, Computer Science (Class of 2015)
> Stanford University
>
>
>