It might help to use a different barrier znode for each iteration. That way, if slow processes are still getting around to deleting their nodes under the old barrier, the fast processes creating their nodes under the new barrier will not interfere.
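
As a rough illustration of the idea, here is a toy sketch in Python. It does not talk to ZooKeeper: `FakeZnodeTree` is a hypothetical in-memory stand-in for the children list of a znode, and `barrier_path`, the `/bsp/barrier` root and the `superstep-NNNNNNNNNN` naming are assumptions made up for this example, not anything from Hama or the recipes page. It only replays the interleaving described in the report (a fast process re-entering before the slow one has finished leaving) and checks whether the slow process would see an empty children list.

```python
def barrier_path(root: str, superstep: int) -> str:
    """Return a distinct barrier path per superstep (hypothetical naming scheme)."""
    return f"{root}/superstep-{superstep:010d}"


class FakeZnodeTree:
    """In-memory stand-in for znode children lists; NOT a real ZooKeeper client."""

    def __init__(self):
        self.children = {}

    def create(self, parent, name):
        self.children.setdefault(parent, set()).add(name)

    def delete(self, parent, name):
        self.children.get(parent, set()).discard(name)

    def get_children(self, parent):
        return sorted(self.children.get(parent, set()))


def slow_peer_can_exit(tree, path_for):
    """Replay the race; path_for maps a superstep number to its barrier path."""
    # Both processes have entered the barrier for superstep 0.
    tree.create(path_for(0), "fast")
    tree.create(path_for(0), "slow")
    # The fast process leaves superstep 0 and immediately enters superstep 1.
    tree.delete(path_for(0), "fast")
    tree.create(path_for(1), "fast")
    # The slow process deletes its own node, then checks "if no children, exit".
    tree.delete(path_for(0), "slow")
    return tree.get_children(path_for(0)) == []


# One shared barrier znode reused for every superstep.
shared = slow_peer_can_exit(FakeZnodeTree(), lambda step: "/bsp/barrier")
# One barrier znode per superstep.
per_step = slow_peer_can_exit(
    FakeZnodeTree(), lambda step: barrier_path("/bsp/barrier", step)
)
print(shared, per_step)
```

With the shared path, the slow process still sees the fast process's new node and so never reaches the "no children" exit; with a path per superstep, the old barrier is empty once everyone has left it. In a real deployment the per-superstep parent znodes would presumably also have to be created before use and cleaned up afterwards, which this sketch ignores.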

On Wed, Sep 7, 2011 at 7:01 PM, Edward J. Yoon <[email protected]> wrote:

> Hi,
>
> I'm using Zookeeper for global barrier synchronization of Hama BSP
> computing engine. Current implementation is based on 'ZooKeeper
> Recipes and Solutions'[1] but there's a problem.
>
> The problem is that, before the last process leaving the barrier
> completely, other processors are starting to create their node[2]. So,
> that last process hangs forever at "2. if no children, exit" step.
> This problem intermittently occurs on high-performance environments.
>
> Can anyone advise me?
>
> 1. http://zookeeper.apache.org/doc/trunk/recipes.html
> 2. https://issues.apache.org/jira/browse/HAMA-387?focusedCommentId=13037785&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13037785
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
