I have a few reasons for suggesting a separate project:

- I don't see a reason for tying the releases of an independent implementation of Zab to ZooKeeper
- The set of developers (and committers) interested in an independent implementation of Zab might be different compared to ZooKeeper; it could really be a separate community
- It really feels like parallel efforts along the lines of Curator and BookKeeper, so I see it following similar steps

Regarding the effort of an intern, I guess it depends how far you want the initial stretch to go. An initial implementation to contribute to Apache followed by community activity might get it going.

-Flavio

> -----Original Message-----
> From: Ivan Kelly [mailto:iv...@apache.org]
> Sent: 02 June 2014 15:58
> To: dev@zookeeper.apache.org; mi...@cs.stanford.edu
> Cc: Ivan Kelly
> Subject: Re: intern project idea: decouple zab from zookeeper
>
> > 1. BookKeeper is pretty heavyweight, as you need to deploy ZooKeeper
> > and bookies. I think there are use cases where you don't need the
> > horizontal scalability BookKeeper provides, and you prefer to have a
> > light-weight library for replicating state. ZooKeeper is one such
> > example :)
> I was thinking from the point of view that if you want to provide ZAB as a
> library, then the library will have to provide an RPC mechanism for talking to
> other members of the quorum, and a means to persist updates to disk
> before responding, and _then_ provide a ZAB implementation somewhere
> in between. This doesn't seem much lighter than BK.
>
> I think it's a worthwhile thing to pursue, but I disagree that a separate project
> is a better way of doing it. If this is an intern project, expecting them to
> reimplement ZAB might be a bit of a large ask (depending on the internship
> length and the intern themselves). An investigation into splitting the user
> interface layer of zookeeper and ZAB seems itself to be a nice chunk to work
> on, and it has the advantage that even if the changes don't get merged into
> trunk, there will be a clearer picture as to why they can't be split.
>
> > 2. Please correct me if I'm wrong, but BookKeeper is not designed for
> > maintaining multiple in-memory replicas. A ledger can't be opened for
> > reading if it's already open for writing, and you need to recover by
> > restoring from a snapshot and replaying log entries if the writer goes
> > down.
> You can read from a ledger while it is being written to, but right now it's
> polling. Twitter are working on some changes to make it more notification
> like to reduce latency between the primary writing and the secondary
> reading.
>
> -Ivan