This is correct, but shouldn't we provide a mechanism so that the sync service can be reimplemented? In Hadoop, for example, you can implement your own scheduler.
GoldenOrb has a similar mechanism:
https://github.com/raveldata/goldenorb/blob/master/src/main/java/org/goldenorb/zookeeper/Barrier.java
https://github.com/raveldata/goldenorb/blob/master/src/main/java/org/goldenorb/zookeeper/OrbFastBarrier.java

BTW their implementation seems much clearer to me.
I'll open an issue to add ZooKeeper to YARN and use the plain BSPPeerImpl.

2011/10/17 Edward J. Yoon <[email protected]>

> IMO, it should be designed as a common component, and we don't need to
> compete with the ZooKeeper team to implement a distributed lock management
> system.
>
> Here are my thoughts:
>
> I'm skeptical of the benefits you mentioned, e.g., performance and simple
> code.
>
> First, the cost of lock operations is not a large part of whole job
> performance. In a large cluster, reliability will be more important.
> ZooKeeper can be used not only as a distributed locking service but also
> for master election and event management in the future. And we can
> just contribute the code to ZooKeeper if needed. Are you sure that we
> can manage the complexity of our own sync server?
>
> On Fri, Oct 14, 2011 at 11:06 PM, Thomas Jungblut
> <[email protected]> wrote:
> > Hey,
> >
> > as you may have already heard, I used an RPC sync service which I wrote
> > on my own. It works, but it may not be as good as ZooKeeper.
> > My idea:
> > We can make an "AbstractBSPPeer" class which has the following methods:
> > abstract enterBarrier();
> > abstract leaveBarrier();
> > abstract getAllPeerNames();
> >
> > These are obviously things that belong to our specific synchronization
> > daemon.
> > Now we could add a ZooKeeperBSPPeer which implements the ZooKeeper way
> > of barrier sync, and an RPC one.
> >
> > Or, to push it even further, take up Edward's idea of a common
> > synchronization service which abstracts the use of ZooKeeper or an RPC
> > service.
> > My goal for the RPC service is to keep our code simple and build an
> > overhead-free service which provides additional features, e.g.
> > deregistering a task from a barrier.
> > It would be great if we could benchmark them both to get a gist of which
> > is the best in terms of performance and reliability.
> > So I would be +1 for Edward's idea. Maybe you can clarify this a bit,
> > @Edward. [1]
> > Edward's idea would help us to share common code between YARN and the
> > normal infrastructure.
> >
> > [1] my thoughts: we need some kind of factory which launches a specific
> > sync daemon, based on a given configuration.
> >
> > It would be great if you can share your opinion :)
> > Thanks!
> >
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon

--
Thomas Jungblut
Berlin <[email protected]>
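
For illustration only, the "AbstractBSPPeer" proposal and the factory idea from note [1] quoted above could be sketched roughly as follows. This is not code from the project: LocalBSPPeer, SyncPeerFactory, the configuration key "bsp.sync.impl" and the value "local" are all hypothetical names, and the in-JVM CyclicBarrier only stands in for a real ZooKeeper or RPC sync daemon.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Base class with the three methods named in the thread.
// Return types and the lack of checked exceptions are assumptions.
abstract class AbstractBSPPeer {
    abstract void enterBarrier();
    abstract void leaveBarrier();
    abstract List<String> getAllPeerNames();
}

// Hypothetical stand-in for a ZooKeeper or RPC peer:
// it only synchronizes threads inside a single JVM.
final class LocalBSPPeer extends AbstractBSPPeer {
    private final CyclicBarrier barrier;
    private final List<String> peerNames;

    LocalBSPPeer(CyclicBarrier barrier, List<String> peerNames) {
        this.barrier = barrier;
        this.peerNames = List.copyOf(peerNames);
    }

    // Blocks until every party of the shared barrier has arrived.
    private void awaitAll() {
        try {
            barrier.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (BrokenBarrierException e) {
            throw new IllegalStateException("barrier broken", e);
        }
    }

    @Override void enterBarrier() { awaitAll(); }
    @Override void leaveBarrier() { awaitAll(); }
    @Override List<String> getAllPeerNames() { return peerNames; }
}

// Factory that picks the sync implementation from configuration.
final class SyncPeerFactory {
    static final String SYNC_KEY = "bsp.sync.impl"; // hypothetical key

    static AbstractBSPPeer create(Map<String, String> conf,
                                  List<String> peerNames,
                                  CyclicBarrier shared) {
        String impl = conf.getOrDefault(SYNC_KEY, "local");
        switch (impl) {
            case "local":
                return new LocalBSPPeer(shared, peerNames);
            // A ZooKeeper- or RPC-backed peer would be registered here.
            default:
                throw new IllegalArgumentException(
                        "Unknown sync implementation: " + impl);
        }
    }
}

final class SyncSketchDemo {
    public static void main(String[] args) throws InterruptedException {
        List<String> names = List.of("peer0", "peer1", "peer2");
        CyclicBarrier shared = new CyclicBarrier(names.size());
        Map<String, String> conf = Map.of(SyncPeerFactory.SYNC_KEY, "local");

        List<Thread> threads = new ArrayList<>();
        for (String name : names) {
            AbstractBSPPeer peer = SyncPeerFactory.create(conf, names, shared);
            Thread t = new Thread(() -> {
                peer.enterBarrier();   // wait for all peers
                System.out.println(name + " passed the barrier");
                peer.leaveBarrier();   // wait again before moving on
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
    }
}
```

With this shape, swapping the sync backend would only mean adding another case to the factory and changing one configuration value; the peers calling enterBarrier() and leaveBarrier() would stay untouched.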
