Decoupling ZAB is a good idea and, as you all mentioned, it could be used for 
other things, such as a key-value store.
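
To make the key-value idea concrete, here is a minimal sketch of a data model layered on top of an ordered log. This is not ZooKeeper or Zab code: `InMemoryLog`, `KeyValueReplica`, and their methods are hypothetical names, and the log is an in-process stand-in with no networking, quorum, or leader election. It only shows the separation being discussed: the log orders commands, and the application decides what they mean.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A command is (operation, key, value); value is None for deletes.
Command = Tuple[str, str, Optional[str]]


class InMemoryLog:
    """Hypothetical stand-in for a Zab-like replicated log.

    It only assigns a total order to commands and delivers each one, in
    order, to every subscriber. A real implementation would replicate the
    entries to a quorum before delivering them.
    """

    def __init__(self) -> None:
        self._entries: List[Command] = []
        self._subscribers: List[Callable[[int, Command], None]] = []

    def subscribe(self, deliver: Callable[[int, Command], None]) -> None:
        # Replay existing entries so a late joiner catches up first.
        for index, command in enumerate(self._entries):
            deliver(index, command)
        self._subscribers.append(deliver)

    def broadcast(self, command: Command) -> int:
        # Append the command, then deliver it to all subscribers in order.
        index = len(self._entries)
        self._entries.append(command)
        for deliver in self._subscribers:
            deliver(index, command)
        return index


class KeyValueReplica:
    """Application-defined data model: a flat key-value map."""

    def __init__(self, log: InMemoryLog) -> None:
        self.data: Dict[str, str] = {}
        self.last_applied = -1
        self._log = log
        log.subscribe(self._apply)

    def _apply(self, index: int, command: Command) -> None:
        # State changes happen only here, driven by the log order.
        op, key, value = command
        if op == "put" and value is not None:
            self.data[key] = value
        elif op == "delete":
            self.data.pop(key, None)
        self.last_applied = index

    def put(self, key: str, value: str) -> None:
        self._log.broadcast(("put", key, value))

    def delete(self, key: str) -> None:
        self._log.broadcast(("delete", key, None))


log = InMemoryLog()
a = KeyValueReplica(log)
a.put("x", "1")
b = KeyValueReplica(log)  # joins late and catches up by replay
b.put("y", "2")
a.delete("x")
```

Because both replicas apply the same commands in the same order, `a.data` and `b.data` end up identical, whichever replica issued the write.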

I've come across one such case in HDFS, where they solved the problem 
their own way. As far as I know, the approach taken in that design is based 
on the well-known ZAB and Paxos protocols. 
So I hope there is space for such libraries in the real world.

References :
https://issues.apache.org/jira/browse/HDFS-3077
https://issues.apache.org/jira/secure/attachment/12547598/qjournal-design.pdf

-Rakesh

-----Original Message-----
From: mutsuz...@gmail.com [mailto:mutsuz...@gmail.com] On Behalf Of Michi 
Mutsuzaki
Sent: 02 June 2014 08:30
To: Alexander Shraer
Cc: dev@zookeeper.apache.org; Flavio Junqueira; Ivan Kelly
Subject: Re: intern project idea: decouple zab from zookeeper

Thank you for the pointer Alex.

I agree that the reconfiguration is a responsibility of the atomic broadcast. I 
feel that session management might need to rely on the atomic broadcast 
exposing additional primitives. For example, right now ZooKeeper forwards 
session information to the leader by piggybacking it in the quorum ping packets.

Let me know if you know good open source libraries for references. So far I've 
looked at ZooKeeper and goraft.

Thanks!
--Michi

On Sun, Jun 1, 2014 at 6:36 PM, Alexander Shraer <shra...@gmail.com> wrote:
> an interesting read if you haven't seen it. fig 1 is similar to Michi's 
> proposal.
> http://research.google.com/archive/paxos_made_live.html
>
> I think that reconfig should be the responsibility of the atomic 
> broadcast / replicated log implementation (if supported by the specific 
> implementation).
> Client management and sessions seem application-dependent.
>
> I'd also suggest checking out existing open source paxos libraries as 
> an API reference.
>
>
> On Sun, Jun 1, 2014 at 6:11 PM, Michi Mutsuzaki 
> <mi...@cs.stanford.edu>
> wrote:
>>
>> Thank you for the clarifications Flavio. I guess 'heavyweight' is a 
>> relative term. A typical use case I deal with is replicating a small 
>> amount of data (<1GB) among 3 ~ 5 servers, and having access to zab 
>> would be very useful.
>>
>> I didn't mean to suggest separating zab in the zookeeper code base. 
>> I referred to ZOOKEEPER-30 to highlight the usefulness of having a 
>> common interface for replication protocols.
>>
>> Thanks!
>> --Michi
>>
>>
>> On Sun, Jun 1, 2014 at 2:52 PM, Flavio Junqueira 
>> <fpjunque...@yahoo.com>
>> wrote:
>> > I'm not sure it is worth transforming this discussion into a bk vs.
>> > zk/zab. I think the space they target is different, although they 
>> > both deal with replication. It does sound worth having a separate 
>> > zab implementation, but it isn't clear that it is worth separating 
>> > zab in the zookeeper code base.
>> >
>> > There seem to be some misconceptions here, so here are some
>> > clarifications:
>> >
>> > - Zab itself doesn't deal with snapshots, it essentially replicates 
>> > a log. The use of snapshots is an optimization to speed up 
>> > recovery, and sure, it fits well into the framework of the protocol.
>> > - BookKeeper indeed relies on zk because it requires a component 
>> > for configuration and metadata of ledgers. By relying on a separate 
>> > configuration component, the pool of bookies can grow and shrink 
>> > arbitrarily, and such changes do not affect write performance as they do with zk.
>> > The configuration component, however, needs the properties of a 
>> > protocol like zab, so we still need something like zab.
>> > - Calling BK heavyweight is a bit of a stretch. Bookies + zk makes 
>> > only two components! These are not production numbers, but I don't 
>> > see a deployment with fewer than 10 machines (5 for ZK + 5 bookies) 
>> > being very interesting. If that's a significant fraction of your 
>> > overall server footprint, then sure, it is heavy for you.
>> >
>> > -Flavio
>> >
>> > On 01 Jun 2014, at 19:22, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:
>> >
>> >> Hi Ivan,
>> >>
>> >> The use case this project is going after is to durably replicate 
>> >> in-memory state. I think this project can differentiate itself 
>> >> from BookKeeper.
>> >>
>> >> 1. BookKeeper is pretty heavyweight, as you need to deploy 
>> >> ZooKeeper and bookies. I think there are use cases where you don't 
>> >> need the horizontal scalability BookKeeper provides, and you 
>> >> prefer to have a light-weight library for replicating state. 
>> >> ZooKeeper is one such example :)
>> >> 2. Please correct me if I'm wrong, but BookKeeper is not designed 
>> >> for maintaining multiple in-memory replicas. A ledger can't be 
>> >> opened for reading if it's already open for writing, and you need 
>> >> to recover by restoring from a snapshot and replaying log entries 
>> >> if the writer goes down.
>> >> 3. ZOOKEEPER-30, which I wasn't initially aware of, is another 
>> >> motivation. I think there is value in having a common interface 
>> >> for consensus algorithms so that services can plug in different 
>> >> implementations. This makes it easier to benchmark and test 
>> >> correctness of various implementations.
>> >>
>> >>
>> >> On Sun, Jun 1, 2014 at 3:05 AM, Ivan Kelly <iv...@apache.org> wrote:
>> >>> On Sat, May 31, 2014 at 02:29:34PM -0700, Michi Mutsuzaki wrote:
>> >>>> I'm hosting an intern this summer. One project I've been 
>> >>>> thinking about is to decouple zab from zookeeper. There are many 
>> >>>> use cases where you need quorum-based replication, but the 
>> >>>> hierarchical data model doesn't work well. A smallish (~1GB?) 
>> >>>> replicated key-value store with millions of entries is one such 
>> >>>> example. The goal of the project is to decouple the consensus 
>> >>>> algorithm (zab) from the data model (zookeeper) more cleanly so 
>> >>>> that users can define their own data models and use zab to 
>> >>>> replicate the data.
>> >>> So you want a replicated log which gives you the guarantees of 
>> >>> zab. How would this be very different from BookKeeper?
>> >>>
>> >>> -Ivan
>> >
>
>
