It is very interesting! Thank you.
I will look into it soon.

Enrico

On Wed, Feb 7, 2018, 15:24 Sijie Guo <guosi...@gmail.com> wrote:

> Hi all,
>
> I have started a proposal to contribute a table (aka key/value) service
> component as a contrib module to the BookKeeper community. This BP,
> together with the other BPs I sent last week, forms the idea of how we
> can improve metadata management in BookKeeper (I will say a bit more
> about this at the end of this email).
>
> **why it was developed**
>
> Two main categories of use cases drove the need for a key/value-like
> service.
>
> One is metadata storage. BookKeeper needs a key/value-like store
> (currently ZooKeeper) to hold ledger metadata, and systems built on top
> of BookKeeper, such as DistributedLog and Pulsar, follow the same
> pattern. They all need a key/value-like store for their metadata. We all
> know ZooKeeper is the scalability bottleneck. It is also a source of
> issues in production systems (based on my biased production experience).
>
> The other is state storage in real-time/streaming analytics and
> computation. Streaming computation jobs process streaming data. They
> usually need to store some state of the computation operators in a
> storage system and serve that state as final results for queries. That
> state is usually represented in key/value form and usually backed by a
> WAL. BookKeeper has been used in this area via DistributedLog/Pulsar for
> storing and serving log/streaming data. It would be ideal for BookKeeper
> to also be able to store and serve state data, for the sake of
> unification and simplification, and to reduce the complexity of
> deployment and operations.
>
> Hence we prototyped and developed a table service component as an add-on
> to BookKeeper. We'd like to contribute this as a contrib module to
> BookKeeper and continue the development, integration and evaluation in
> the BookKeeper community.
>
> We hope this can follow the path BookKeeper took in ZooKeeper:
> BookKeeper was a contrib module in ZooKeeper, was developed in that
> community, and grew into what it is now.
>
> **how it is aligned with metadata storage**
>
> BP-28, BP-29 and BP-30 are related to some extent.
>
> BP-28 is more of a cleanup proposal that carries on Jia's work (on the
> service discovery interfaces). Its goal is to produce a clean metadata
> API module, define a clean dependency between the BookKeeper
> implementation and the metadata service, and allow us to really plug in
> different metadata services without touching the BookKeeper
> implementation.
>
> BP-29 and BP-30 can be thought of as two different metadata service
> implementations based on the metadata API contract defined in BP-28.
>
> BP-29 is to use Etcd as the metadata service, while BP-30 is to have a
> built-in key/value service as the metadata service. Both BP-29 and BP-30
> have pros and cons. However, they are not opposed to each other.
> Allowing the two approaches to proceed concurrently will help us
> understand more about metadata management in BookKeeper and its
> ecosystem (e.g. dlog, Pulsar), which will lead the project in a healthy
> direction.
>
> **Proposed Changes**
>
> This proposal is to add the table service as a contrib module under the
> `stream` directory, just as we handle `dlog`. We can mark it as
> "preview"/"alpha" in 4.7 and continue developing the module in the
> BookKeeper community.
>
> The details of the proposal can be found in the Google doc linked below:
>
> https://docs.google.com/document/d/155xAwWv5IdOitHh1NVMEwCMGgB28M3FyMiQSxEpjE-Y/edit#heading=h.56rbh52koe3f
>
> Please take a look. Comments are welcome.
>
> - Sijie
>

--
-- Enrico Olivelli