Hi,

I'm testing a custom PageRank implementation using trunk on Hadoop
1.0.4. I seem to run into a deadlock after the input superstep.

The workers report "finishSuperstep: (all workers done) WORKER_ONLY -
Attempt=0, Superstep=0" and the master reports that all workers are done
with superstep -1.

I reproduced this in a local setup, and it seems to me that the
BspServiceMaster hangs indefinitely in the cleanUpZooKeeper method.

I'm not using a dedicated ZooKeeper instance; I just have Giraph start
one. Any ideas what can be done to fix my problem?

Best,
Sebastian


Excerpt from jstack:

"org.apache.giraph.master.MasterThread" prio=10 tid=0x00007f29fc385000 nid=0x29d1 waiting on condition [0x00007f2a09a5f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000f38967d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2116)
        at org.apache.giraph.zk.PredicateLock.waitMsecs(PredicateLock.java:112)
        at org.apache.giraph.zk.PredicateLock.waitForever(PredicateLock.java:138)
        at org.apache.giraph.master.BspServiceMaster.cleanUpZooKeeper(BspServiceMaster.java:1602)
        at org.apache.giraph.master.BspServiceMaster.cleanup(BspServiceMaster.java:1692)
        at org.apache.giraph.master.MasterThread.run(MasterThread.java:144)
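
For anyone reading the trace: the thread is TIMED_WAITING rather than
WAITING even though it sits in waitForever, which would fit a "wait
forever" that loops over timed waits. Below is a minimal sketch of that
pattern as I understand it. It is not Giraph's code: the class name
SketchPredicateLock, the 100 ms period, and the method bodies are my own
assumptions for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for illustration; not Giraph's PredicateLock. */
public class SketchPredicateLock {
    private final Lock lock = new ReentrantLock();
    private final Condition cond = lock.newCondition();
    private boolean eventOccurred = false;

    /** Mark the event as done and wake all waiters. */
    public void signal() {
        lock.lock();
        try {
            eventOccurred = true;
            cond.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Wait up to msecs; returns true only if the event was signalled. */
    public boolean waitMsecs(long msecs) throws InterruptedException {
        long nanosLeft = TimeUnit.MILLISECONDS.toNanos(msecs);
        lock.lock();
        try {
            while (!eventOccurred) {
                if (nanosLeft <= 0) {
                    return false; // timed out without a signal
                }
                // Timed park: a thread dump taken here shows TIMED_WAITING.
                nanosLeft = cond.awaitNanos(nanosLeft);
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Loop over timed waits; never returns if nobody calls signal(). */
    public void waitForever() throws InterruptedException {
        while (!waitMsecs(100)) {
            // A real implementation might report progress here.
        }
    }

    public static void main(String[] args) throws Exception {
        SketchPredicateLock lock = new SketchPredicateLock();
        System.out.println(lock.waitMsecs(50)); // nobody has signalled yet
        Thread signaller = new Thread(lock::signal);
        signaller.start();
        lock.waitForever(); // returns once the signaller has run
        signaller.join();
        System.out.println("released");
    }
}
```

If the real code follows this shape, the master would stay in that loop
for as long as the event it waits on in cleanUpZooKeeper is never
signalled, which matches what I see.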

