From the code I have read, it seems that the znodes created are named by peer
name only (in the form of `host:port'). As a result, processes at different
supersteps share one flat namespace. While iterating over supersteps, a
process that has already moved on to the next superstep cannot be
distinguished from one still in the previous superstep, which results in the
hang. Adding the superstep value to each created znode name and filtering out
the znodes that belong to the next superstep might solve the problem.

I haven't tested this, though, so I may be wrong if I have misunderstood the
code.
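
Roughly what I have in mind, as an untested sketch. The class name
SuperstepZnodes, its methods, and the `-<superstep>' suffix format are all
made up for illustration; none of this is existing Hama code. It only shows
the naming and filtering idea, with no ZooKeeper calls:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Hama): tags barrier znode names with a superstep. */
final class SuperstepZnodes {
    private SuperstepZnodes() {}

    /** Builds a child name such as "host1:61000-5" from a peer name and its superstep. */
    static String nodeName(String peerName, long superstep) {
        return peerName + "-" + superstep;
    }

    /** Returns the superstep encoded in a child name, or -1 if it has no numeric suffix. */
    static long superstepOf(String nodeName) {
        int idx = nodeName.lastIndexOf('-');
        if (idx < 0 || idx == nodeName.length() - 1) {
            return -1;
        }
        try {
            return Long.parseLong(nodeName.substring(idx + 1));
        } catch (NumberFormatException e) {
            // e.g. an untagged name like "node-1:61000"
            return -1;
        }
    }

    /** Keeps only the children created for the given superstep and returns their peer names. */
    static List<String> peersAt(List<String> children, long superstep) {
        List<String> peers = new ArrayList<String>();
        for (String child : children) {
            if (superstepOf(child) == superstep) {
                peers.add(child.substring(0, child.lastIndexOf('-')));
            }
        }
        return peers;
    }
}
```

The idea would be that the loop in leaveBarrier looks only at
peersAt(children, currentSuperstep), where children is whatever
zk.getChildren() returned, so a lock file already created for the next
superstep no longer keeps the old barrier from looking empty.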

-----Original message-----
From:Edward J. Yoon <[email protected]>
To:[email protected]
Date:Tue, 21 Jun 2011 17:20:21 +0900
Subject:Re: Lock and Barrier Synchronization

This can be especially problematic when locking a large number of BSPPeers.

On Tue, Jun 21, 2011 at 5:13 PM, Edward J. Yoon <[email protected]> wrote:
> Hi all,
>
> I've recently been looking at HAMA-387.
>
> There is a problem with lock and barrier synchronization. As soon as
> the last lock file is deleted (before all peers have completely
> exited the while loop in the leaveBarrier method), the others begin
> to create their lock files. So it sometimes causes a hang.
>
> My temporary solution is 'Thread.sleep(200);'. It works, but it is not
> perfect. If the zk.getChildren() response takes longer than 200
> milliseconds, the process will hang.
>
> Is there any other idea?
>
> Thanks.
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon


--
ChiaHung Lin
Department of Information Management
National University of Kaohsiung
Taiwan
