Proposal

BookKeeper is a distributed write-ahead logging (WAL) service. It is built on top of ZooKeeper and is used for distributed recovery and reliability. Like ZooKeeper, BookKeeper is a distributed tool for reliability, but unlike ZooKeeper it stores large amounts of application data in the form of byte streams, which we call ledgers. It is made up of Bookies, which store the data, and a client library. All other metadata is stored in ZooKeeper.
The BookKeeper subproject also includes Hedwig, a pub/sub system built on both BookKeeper and ZooKeeper. Its coupling with BookKeeper is tight, and many of BookKeeper's performance features were added in response to Hedwig's requirements. Hedwig is made up of a rather thin client library and stateless Brokers that cache and distribute messages.

Background

BookKeeper was developed as a WAL for the Hadoop NameNode and was also used to build the Hedwig pub/sub system. Both are currently contribs to ZooKeeper. The work on the hooks needed to integrate BookKeeper with the NameNode is almost complete (HDFS-1580).

Rationale

We have contributors whom we would like to make committers to BookKeeper and Hedwig. It would be nice to allow a development community to grow around BookKeeper. Also, Hudson does not run against contrib, so making BookKeeper its own subproject would allow us to better QA our changes.

We would also like to decouple BookKeeper releases from ZooKeeper releases. ZooKeeper is quite mature and has relatively long release cycles. We would like shorter release cycles for BookKeeper.

In theory we could make two projects, BookKeeper and Hedwig, but doing so would double the project management and release overhead. The BookKeeper and Hedwig development communities overlap heavily, so we would be increasing the burden on the same group of contributors.

Because of the developer community overlap with ZooKeeper and the fact that BookKeeper is in line with the general mission of ZooKeeper, we think BookKeeper should be a subproject of ZooKeeper.

Call for vote

I propose that BookKeeper become a ZooKeeper subproject, subject to the ZooKeeper PMC and Bylaws. I, Benjamin Reed, will champion the proposal.

BookKeeper will have the following initial committers:

Dhruba Borthakur (Facebook)
Flavio Junqueira (Yahoo)
Ivan Kelly (Yahoo)
Benjamin Reed (Yahoo)
Utkarsh Srivastava (Twitter)
