Hi Sebastian,

This seems to be a new issue related to our recent upgrade to multithreading. I have not seen this before. All my other emails related to the array index out of bounds error you found over the weekend.
However, I have had trouble with the local ZK instance for some time now on a number of Giraph profiles, and I pretty much exclusively use a separate ZK instance of my own. Last summer I was running a lot of jobs on a 1.0.x Hadoop cluster with Giraph, and I was told to use the on-cluster dedicated ZK quorum due to "problems" with Giraph's local ZK instantiation. No one got more specific with me than that. I also can't get the local ZK instances to come up on Hadoop-2.0.x, so perhaps this feature of Giraph has had problems for a while. Was it working for you recently?

If you notice any other clues as to the cause, please post them. I'm hoping to do some work around this soon.

On Tue, Jan 22, 2013 at 11:05 AM, Claudio Martella <[email protected]> wrote:

> Hi Sebastian,
>
> I do not know what is happening, I am also having problems of jobs
> blocking while waiting to setup the zookeeper instance.
> We should look into this.
>
> Best,
> Claudio
>
>
> On Mon, Jan 21, 2013 at 1:59 PM, Sebastian Schelter <[email protected]>wrote:
>
>> Hi,
>>
>> I'm testing a custom PageRank implementation using trunk on Hadoop
>> 1.0.4. I seem to run into a deadlock after the input superstep.
>>
>> The workers report "finishSuperstep: (all workers done) WORKER_ONLY -
>> Attempt=0, Superstep=0" and the master reports that all workers are done
>> with superstep -1.
>>
>> I reconstructed this using a local setup and seems to me that the
>> BspServiceMaster hangs in the cleanUpZooKeeper method infinitely.
>>
>> I'm not using a dedicated zk instance, I just have Giraph start one. Any
>> ideas what can be done to fix my problem?
>>
>> Best,
>> Sebastian
>>
>>
>> excerpt from jstack
>>
>> "org.apache.giraph.master.MasterThread" prio=10 tid=0x00007f29fc385000 nid=0x29d1 waiting on condition [0x00007f2a09a5f000]
>>    java.lang.Thread.State: TIMED_WAITING (parking)
>>         at sun.misc.Unsafe.park(Native Method)
>>         - parking to wait for <0x00000000f38967d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>>         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
>>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2116)
>>         at org.apache.giraph.zk.PredicateLock.waitMsecs(PredicateLock.java:112)
>>         at org.apache.giraph.zk.PredicateLock.waitForever(PredicateLock.java:138)
>>         at org.apache.giraph.master.BspServiceMaster.cleanUpZooKeeper(BspServiceMaster.java:1602)
>>         at org.apache.giraph.master.BspServiceMaster.cleanup(BspServiceMaster.java:1692)
>>         at org.apache.giraph.master.MasterThread.run(MasterThread.java:144)
>>
>>
>
>
> --
> Claudio Martella
> [email protected]
>
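
A note on reading the jstack excerpt: the master thread is reported as TIMED_WAITING rather than WAITING, even though the top Giraph frame is waitForever. That is consistent with a "wait forever" built as a loop over bounded waits, which is what the waitForever -> waitMsecs -> await/parkNanos frames suggest. The class below is only a minimal sketch of that pattern under that assumption. PredicateLockSketch, its fields, and the 1000 ms interval are hypothetical names and values chosen for illustration, not Giraph's actual PredicateLock source.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a predicate lock; NOT Giraph's PredicateLock. */
final class PredicateLockSketch {
    private final Lock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private boolean signaled = false; // the predicate being waited on

    /** Sets the predicate and wakes every waiting thread. */
    void signal() {
        lock.lock();
        try {
            signaled = true;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Waits at most msecs; returns true only if the predicate was set. */
    boolean waitMsecs(long msecs) throws InterruptedException {
        long remaining = TimeUnit.MILLISECONDS.toNanos(msecs);
        lock.lock();
        try {
            while (!signaled) {
                if (remaining <= 0) {
                    return false; // timed out without a signal
                }
                // Timed park: a thread dump taken here shows TIMED_WAITING.
                remaining = changed.awaitNanos(remaining);
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Repeats bounded waits until signaled; never gives up on its own. */
    void waitForever() throws InterruptedException {
        while (!waitMsecs(1000)) {
            // Assumed interval; a real implementation could log progress here.
        }
    }
}
```

If nothing ever calls signal(), waitForever() never returns, yet every thread dump still shows a timed park. So the TIMED_WAITING state in the excerpt does not by itself mean the master will eventually time out of cleanUpZooKeeper.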
