👍 Thanks a lot for the suggestions and feedback.

On Thu, Sep 7, 2017 at 4:24 AM, Sijie Guo <guosi...@gmail.com> wrote:

> On Wed, Sep 6, 2017 at 1:07 PM, Enrico Olivelli <eolive...@gmail.com>
> wrote:
>
> > Off topic curiosity... Jia and Sijie, do you think we are going to drop ZK
> > from DL too?
> >
>
> Yes. That's the goal: 1) for large deployments, we are trying to overcome
> the limitations of zookeeper; 2) for smaller deployments, it will make
> deployment much easier, since you just need to deploy a cluster of bookies.
> Once it is done, you can use the ledger API or the log stream API to access
> the bookkeeper cluster.
>
> Both DL and BK have pluggable metadata storage. They have very clear
> interfaces defining the metadata operations, so it is straightforward to use
> a different metadata storage.
>
>
> > Enrico
> >
> > On mer 6 set 2017, 19:51 Enrico Olivelli <eolive...@gmail.com> wrote:
> >
> > >
> > >
> > > On mer 6 set 2017, 18:25 Sijie Guo <guosi...@gmail.com> wrote:
> > >
> > >> On Sep 6, 2017 4:57 AM, "Enrico Olivelli" <eolive...@gmail.com> wrote:
> > >>
> > >> Thank you Sijie and Jia for your comments and explanations,
> > >> answers inline
> > >>
> > >> 2017-09-06 2:23 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> > >>
> > >> > Thanks a lot Enrico and Sijie for your comments and information on this.
> > >> >
> > >> > On Tue, Sep 5, 2017 at 9:31 PM, Enrico Olivelli <eolive...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Great to see you working on this!
> > >> > > It would be great to have such a feature, as it is the first step to a
> > >> > > 'standalone' BookKeeper mode
> > >> > >
> > >> > > Some complementary ideas/first look questions:
> > >> > > - the document does not talk about security, IMHO we have at least to cover
> > >> > > authentication and TLS, it would be great to leverage existing AuthPlugins,
> > >> > > as they are based on exchanging byte[] (as SASL wants)
> > >> > >
> > >> > [Jia] It is a good idea. We left the security part out for now for a few
> > >> > reasons: 1) to keep this BP focused on removing the zookeeper dependency
> > >> > from the client; 2) it is introduced as a separate implementation of the
> > >> > existing interfaces, so it won’t impact the existing security story. And for
> > >> > sure, we will add the security part later, after this.
> > >> >
> > >>
> > >>
> > >> I am fine with that. I am only afraid that we won't be able to support it in
> > >> the (near) future.
> > >> Maybe you could just mention the security story and add some reference to
> > >> how we would deal with it in the future.
> > >>
> > >>
> > >> The new ledger manager will first be marked as experimental, until it is
> > >> stable and has security features.
> > >>
> > >> How does that sound?
> > >>
> > >
> > > Ok
> > >
> > >>
> > >>
> > >>
> > >> >
> > >> > > - do we have some kind of "bootstrap servers list" configuration option?
> > >> > > Should the list be complete or just a subset of bookies? At connection the
> > >> > > client could discover the list of other bookies
> > >> > >
> > >> > [Jia] Yes, we will have a `clientBootstrapBookies` setting in the server
> > >> > set. It can be a list of bookies or just a DNS name that resolves to the
> > >> > bookies. Will add this to the BP.
> > >> >
> > >> > > - will the client connect to only one bookie at a time? How will we deal
> > >> > > with errors?
> > >> > >
> > >> > [Jia] It will connect to the list of bootstrap servers. gRPC will load
> > >> > balance the requests and manage the connection errors.
> > >> >
> > >> > > - should the bookie write its gRPC endpoint info to the ZK metadata? (this
> > >> > > will be useful for a bookie to tell the connected clients about other
> > >> > > bookies)
> > >> > >
> > >> > [Jia] No, it won’t. We don’t see a strong reason to add it, especially
> > >> > since we may eventually eliminate zookeeper completely.
> > >> > It can be a fixed port `3281`, or in a scheduler-based environment, it is
> > >> > very easy to have a load balancer sitting in front of those bookies.
> > >> >
> > >>
> > >> I think a fixed port is not a good way.
> > >> You will not be able to run more than one bookie on a single host.
> > >>
> > >> We should support:
> > >> - configurable port
> > >> - ephemeral port for tests
> > >>
> > >>
> > >> I think what Jia means is a configurable port, but one that is relatively
> > >> fixed: the client doesn't discover this port from zookeeper.
> > >>
> > >
> > > Very good
> > >
> > >>
> > >>
> > >> Ideally I would like to have the local transport option, in order to have a
> > >> single JVM, but this is not a blocker. As we are running gRPC on
> > >> netty it should be feasible, or we can create some kind of short-circuit
> > >> between the client and the Bookie
> > >>
> > >>
> > >> gRPC supports an in-process channel, so you don't need to use the low-level
> > >> netty settings.
> > >>
> > >
> > > Great
> > >
> > > So it all sounds good to me, thanks
> > >
> > > Enrico
> > >
> > >
> > >>
> > >> I am OK with not writing this to the bookie metadata, leaving it up to the
> > >> client to have a configured list of bookies enabled for metadata operations
> > >>
> > >>
> > >>
> > >>
> > >> >
> > >> > > - the bookie will be somehow a proxy for zookeeper. I think that the
> > >> > > 'watch' part is the most complex: we will have to deal with reconnections,
> > >> > > errors... maybe it is worth writing more detail about this
> > >> > >
> > >> > [Jia] The `watch` API uses the `streaming` rpc in gRPC. It is a
> > >> > straightforward proxy behavior: if a connection is broken, the client will
> > >> > simply retry watching again.
> > >> >
> > >> >
> > >> > > Minor issues:
> > >> > > - Maybe you can consider using ledgerId and not ledger_id, like in
> > >> > > LedgerMetadataFormat we are using lastEntryId
> > >> > >
> > >> > [Jia] Thanks. It is the protobuf style; protobuf will convert `ledger_id`
> > >> > to `ledgerId`, so we don’t need to worry about this.
> > >> >
> > >>
> > >> got it, thanks
> > >>
> > >>
> > >> >
> > >> >
> > >> > > - In the "motivation" part you write that having more clients
> > >> > > than the number of bookies would be a problem for zookeeper; actually
> > >> > > zookeeper is very good at dealing with a huge number of clients.
> > >> > > I am always running clusters with 3-5 bookies and 10-100 writing
> > >> > > clients and this has never given trouble
> > >> >
> > >> > [Jia] :) Seems “10-100 writing clients” is not “a huge number of clients”.
> > >> >
> > >>
> > >> OK, I agree with you and Sijie, I have no experience with larger clusters
> > >>
> > >>
> > >> >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > > Future:
> > >> > > - as bookies will be proxies maybe we should take care not to overwhelm a
> > >> > > bookie with too many clients
> > >> > >
> > >> > [Jia] First, gRPC is based on Netty and the protocol is HTTP/2, so the
> > >> > connection is multiplexed. We don’t need to worry about connection count.
> > >> > Second, all the bookies are treated equally for the metadata operations;
> > >> > gRPC will load balance the requests across the bookies. We don’t need to
> > >> > worry about some bookies being overwhelmed.
> > >> >
> > >>
> > >> gRPC sounds great
> > >>
> > >>
> > >> >
> > >> >
> > >> > > - iteration on ledgers: sometimes the client enumerates ledgers but it is
> > >> > > not interested in having all of them. As we are using the bookie as a proxy,
> > >> > > maybe some kind of "filter" (at least on custom metadata) would be great
> > >> > > to limit the number of returned items. Another point: I don't know gRPC, but
> > >> > > it does not seem to be very clear how to 'stop' the iteration
> > >> > >
> > >> > [Jia] Thanks, we can add it later. For now, we would like to focus on
> > >> > adding the features the ledger manager needs.
> > >> >
> > >>
> > >> Yup
> > >>
> > >> -- Enrico
> > >>
> > >>
> > >> >
> > >> > >
> > >> > > -- Enrico
> > >> > >
> > >> > >
> > >> > > 2017-09-05 15:10 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> > >> > >
> > >> > > > Hi all,
> > >> > > >
> > >> > > > I have just posted a proposal to remove the zookeeper dependency from
> > >> > > > the bookkeeper client, to make the bookkeeper client a thin client:
> > >> > > >
> > >> > > > https://cwiki.apache.org/confluence/display/BOOKKEEPER/BP-16%3A+remove+zookeeper+dependency+from+bookkeeper+client
> > >> > > >
> > >> > > >
> > >> > > > BookKeeper uses zookeeper for service discovery (discovering the
> > >> > > > available bookies in the cluster) and metadata management (storing all
> > >> > > > the metadata for ledgers). However, it exposes the metadata storage
> > >> > > > directly to the clients, making the bookkeeper client a very thick
> > >> > > > client. It also exposes some problems.
> > >> > > >
> > >> > > > This BP explores the possibility of eliminating zookeeper completely
> > >> > > > from the client side, to produce a thin bookkeeper client.
> > >> > > >
> > >> > > > I will send a patch as soon as we agree on the proposal.
> > >> > > >
> > >> > > >
> > >> > > > Thanks.
> > >> > > >
> > >> > > > -Jia
> > >> > > >
> > >> > >
> > >> >
> > >>
> > > --
> > >
> > >
> > > -- Enrico Olivelli
> > >
> > --
> >
> >
> > -- Enrico Olivelli
> >
>
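
To make the `clientBootstrapBookies` idea from the quoted thread concrete, here is a minimal sketch of how a client might turn such a setting into addresses. The comma-separated `host:port` format, the class name and the fallback to port 3281 are assumptions for illustration only; the BP does not define them. IPv6 literals are not handled.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

final class BootstrapBookies {
    // Assumed fallback port; the thread mentions 3281 only as a possible fixed port.
    static final int DEFAULT_PORT = 3281;

    /** Parses "host1:port1,host2" into unresolved socket addresses (hypothetical format). */
    static List<InetSocketAddress> parse(String setting) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : setting.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_PORT
                    : Integer.parseInt(trimmed.substring(colon + 1));
            // Unresolved: DNS lookup is left to the transport layer.
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    public static void main(String[] args) {
        // A single DNS name covering all bookies is just a one-element list.
        for (InetSocketAddress a : parse("bookie1:4181, bookie2:4182, bookies.example.internal")) {
            System.out.println(a.getHostString() + ":" + a.getPort());
        }
    }
}
```

Keeping the port in the setting rather than hard-coding it also covers Enrico's point about running more than one bookie per host.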

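The "client will simply retry watching" behavior described for the `watch` API in the quoted thread could look roughly like the sketch below. `WatchStream` is a hypothetical stand-in for a server-streaming call, not a gRPC or BookKeeper type, and the attempt limit and linear backoff are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class RetryingWatch {
    /** Hypothetical streaming call: returns when the stream ends, throws when the connection breaks. */
    interface WatchStream<T> {
        void run(Consumer<T> onEvent) throws Exception;
    }

    /** Re-issues the watch after each failure; returns the number of attempts used. */
    static <T> int watch(WatchStream<T> stream, Consumer<T> onEvent,
                         int maxAttempts, long backoffMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                stream.run(onEvent);
                return attempt; // stream completed normally
            } catch (Exception e) {
                if (attempt == maxAttempts) {
                    break; // out of attempts
                }
                try {
                    Thread.sleep(backoffMillis * attempt); // simple linear backoff
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw new IllegalStateException("watch failed after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<String> seen = new ArrayList<>();
        // Fake stream that breaks twice before delivering one event.
        WatchStream<String> flaky = onEvent -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("connection broken");
            }
            onEvent.accept("ledger-metadata-changed");
        };
        int attempts = watch(flaky, seen::add, 5, 1L);
        System.out.println(attempts + " attempts, events=" + seen);
    }
}
```

A real client would probably also re-read the current state after re-establishing the watch, since changes made while the stream was down are not replayed by a plain retry.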