2017-09-13 10:10 GMT+02:00 Sijie Guo <guosi...@gmail.com>:

> On Wed, Sep 13, 2017 at 12:16 AM, Enrico Olivelli <eolive...@gmail.com>
> wrote:
>
> > I think that this is a good direction to go.
> >
> > I believe the reasons given about ZK in huge systems, even if that is not
> > my case, so I cannot add comments on this use case.
> >
> > I am fine with the direction as long as we are still going to support
> > ZooKeeper.
> > BookKeeper is in the Hadoop / ZooKeeper ecosystem and several products
> > rely on ZK too. For instance, in my systems it is usual to have
> > BookKeeper/Kafka/HBase/Majordodo..., so I am not going to live without
> > ZooKeeper in the short/mid term.
> >
> > I am really OK with dropping ZK because, for "simple" systems where you
> > need only BK, the burden of setting up a ZooKeeper server is awkward
> > for customers. I usually redistribute BK + ZK with my applications, and we
> > are talking about small clusters of up to 10 machines.
> >
>
> Just to clarify: we are not dropping ZK here. We are just proposing to
> have a ledger manager implementation that doesn't depend on ZooKeeper
> directly.
> We are not modifying any existing ledger manager implementation.
>


Yep, we are on the same page.
For this proposal the bookie will be a sort of "proxy" between the client
and the actual ledger manager implementation, which will "live" inside the
bookie.
It is only a new ledger manager to be used in clients; this ledger manager
will issue RPCs (or some kind of "streaming" RPCs) to a list of bookies.
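
A rough sketch of that shape, in Python for brevity. Nothing here is taken
from the BP: `BookieStub`, `ProxyLedgerManager`, `read_ledger_metadata` and
the addresses are hypothetical names, the stub stands in for a real RPC
channel, and the shared dict stands in for whatever metadata store the
ledger manager inside the bookies actually uses.

```python
from dataclasses import dataclass
from typing import Dict, List


class BookieUnavailable(Exception):
    """Raised by a stub when the (simulated) RPC cannot be delivered."""


@dataclass
class BookieStub:
    """Hypothetical stand-in for an RPC stub pointing at one bookie."""
    address: str
    store: Dict[int, bytes]  # metadata store behind the bookie (assumption)
    up: bool = True

    def read_ledger_metadata(self, ledger_id: int) -> bytes:
        if not self.up:
            raise BookieUnavailable(self.address)
        return self.store[ledger_id]


class ProxyLedgerManager:
    """Client-side ledger manager that only ever talks to bookies."""

    def __init__(self, bookies: List[BookieStub]):
        if not bookies:
            raise ValueError("need at least one bookie")
        self._bookies = bookies
        self._next = 0  # round-robin cursor

    def read_ledger_metadata(self, ledger_id: int) -> bytes:
        # Try each bookie at most once, starting from the cursor.
        for _ in range(len(self._bookies)):
            bookie = self._bookies[self._next]
            self._next = (self._next + 1) % len(self._bookies)
            try:
                return bookie.read_ledger_metadata(ledger_id)
            except BookieUnavailable:
                continue
        raise BookieUnavailable("no bookie reachable")
```

The client never sees the metadata store itself; it only needs the list of
bookies, and a bookie that is down is skipped in favour of the next one.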


>
>
> >
> > The direction of this proposal is OK for me and it is very similar to the
> > work I was starting on "standalone mode".
>
>
> > I think it will be very easy to support the case of a single bookie
> > with this approach, or even client + bookie in the same JVM.
> > Having multiple bookies will require us to add some other coordination
> > facility between bookies. I would like to know if there is already some
> > idea about this: are we going to use another product like etcd or JGroups,
> > or implement our own coordination protocol?
>
>
> We are not replacing A with B, even with ZooKeeper. The ledger management
> is already abstracted behind interfaces.
> Users can use whatever system they prefer as the metadata store.
>
> Our direction is to provide an option to store metadata as well as data in
> bookies, so with this option no external metadata storage is needed.
>

Sorry, maybe my question was not clear.
If you have multiple bookies and each bookie holds its own version of the
metadata, how do you coordinate them? Which will be the source of truth?
Maybe we should start a new email thread in the future to talk about
"alternative distributed metadata storages".

Anyway, the meaning and the scope of the proposal are clear to me and I am
really OK with it. I hope it will be approved soon.

-- Enrico


>
>
> > ZK is simple but it is very effective.
>
> Maybe we could help the ZK community to move forward and resolve
> > the problems we are bringing to light
> >
> >
> > Enrico
> >
> >
> > 2017-09-13 3:15 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> >
> > > Any thoughts or comments
> > > :)
> > >
> > > Thanks a lot.
> > > -Jia
> > >
> > > On Tue, Sep 12, 2017 at 4:30 PM, Jia Zhai <zhaiji...@gmail.com> wrote:
> > >
> > > > > This blog post also touches briefly on the limitations of ZooKeeper
> > > > > in BookKeeper:
> > > > > https://bitworks.software/blog/en/2017-07-12-replicated-scalable-commitlog-with-apachebookkeeper.html
> > > >
> > > > On Thu, Sep 7, 2017 at 9:45 AM, Jia Zhai <zhaiji...@gmail.com>
> wrote:
> > > >
> > > >> đź‘Ť. Thanks a lot for the suggestions and feed back.
> > > >>
> > > >> On Thu, Sep 7, 2017 at 4:24 AM, Sijie Guo <guosi...@gmail.com>
> wrote:
> > > >>
> > > >>> On Wed, Sep 6, 2017 at 1:07 PM, Enrico Olivelli <
> eolive...@gmail.com
> > >
> > > >>> wrote:
> > > >>>
> > > >>> > Off topic curiosity... Jia and Sijie, do you think we are going
> to
> > > >>> drop ZK
> > > >>> > from DL too?
> > > >>> >
> > > >>>
> > > >>> Yes. That's the goal - 1) for large deployment, we are trying to
> > > overcome
> > > >>> the limitation of zookeeper; 2) for smaller deployments, it will
> make
> > > >>> deployment much easier, you just need to deploy a cluster of
> bookies.
> > > >>> once
> > > >>> it is done, you can use ledger api or log stream api to access the
> > > >>> bookkeeper cluster.
> > > >>>
> > > > >>> Both DL and BK have pluggable metadata storage. They have very clear
> > > >>> interfaces on defining metadata operations. So it is
> straightforward
> > to
> > > >>> use
> > > >>> a different metadata storage.
> > > >>>
> > > >>>
> > > >>> > Enrico
> > > >>> >
> > > >>> > On mer 6 set 2017, 19:51 Enrico Olivelli <eolive...@gmail.com>
> > > wrote:
> > > >>> >
> > > >>> > >
> > > >>> > >
> > > >>> > > On mer 6 set 2017, 18:25 Sijie Guo <guosi...@gmail.com> wrote:
> > > >>> > >
> > > >>> > >> On Sep 6, 2017 4:57 AM, "Enrico Olivelli" <
> eolive...@gmail.com>
> > > >>> wrote:
> > > >>> > >>
> > > >>> > >> Thank you Sijie and Jia for your comments and explanations,
> > > >>> > >> answers inline
> > > >>> > >>
> > > >>> > >> 2017-09-06 2:23 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> > > >>> > >>
> > > >>> > >> > Thanks a lot Enrico and Sijie for your comments and
> > information
> > > on
> > > >>> > this.
> > > >>> > >> >
> > > >>> > >> > On Tue, Sep 5, 2017 at 9:31 PM, Enrico Olivelli <
> > > >>> eolive...@gmail.com>
> > > >>> > >> > wrote:
> > > >>> > >> >
> > > >>> > >> > > Great to see you working on this !
> > > > >>> > >> > > It would be great to have such a feature, as it is the first
> > step
> > > >>> to a
> > > >>> > >> > > 'standalone' BookKeeper mode
> > > >>> > >> > >
> > > >>> > >> > > Some complementary ideas/first look questions:
> > > >>> > >> > > - the document does not talk about security, IMHO we have
> at
> > > >>> least
> > > >>> > to
> > > >>> > >> > cover
> > > >>> > >> > > authentication and TLS, it would be great to leverage
> > existing
> > > >>> > >> > AuthPlugins,
> > > >>> > >> > > as they are based on exchanging byte[] (as SASL wants)
> > > >>> > >> > >
> > > >>> > >> > [Jia] It is a good idea. We left the security part for now
> > for a
> > > >>> few
> > > > >>> > >> > reasons. 1) Keep this BP focused on removing zookeeper
> > > >>> dependencies
> > > >>> > >> from
> > > > >>> > >> > client. 2) It is introduced as a separate implementation of
> > > >>> existing
> > > >>> > >> > interfaces. So it won’t impact existing security story.
>  And
> > > for
> > > >>> > sure,
> > > >>> > >> We
> > > >>> > >> > will add the security part later after this.
> > > >>> > >> >
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> I am fine, I am only afraid that we won't be able to support
> it
> > in
> > > >>> the
> > > >>> > >> (near) future,
> > > >>> > >> maybe you could just only cite the security story and add some
> > > >>> reference
> > > >>> > >> to
> > > >>> > >> how we would deal with it in future
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> The new ledger manager will be first marked as experimental,
> > until
> > > >>> it is
> > > > >>> > >> stable and has security features.
> > > >>> > >>
> > > >>> > >> How does that sound?
> > > >>> > >>
> > > >>> > >
> > > >>> > > Ok
> > > >>> > >
> > > >>> > >>
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> >
> > > >>> > >> > - do we have some kind of "bootstrap servers list"
> > configuration
> > > >>> > option
> > > >>> > >> ?
> > > >>> > >> > > the list should be complete or just a subset of bookies ?
> at
> > > >>> > >> connection
> > > >>> > >> > the
> > > >>> > >> > > client could discover the list of other bookies
> > > >>> > >> > >
> > > >>> > >> > [Jia] Yes, we will have a `clientBootstrapBookies` settings
> in
> > > the
> > > >>> > >> server
> > > > >>> > >> > set. It can be a list of bookies or simply a DNS name over
> the
> > > >>> > bookies.
> > > >>> > >> > Will add this to the BP
> > > >>> > >> >
> > > >>> > >> > - will the client connect to only one bookie at a time ? how
> > we
> > > >>> will
> > > >>> > >> deal
> > > >>> > >> > > with errors ?
> > > >>> > >> > >
> > > > >>> > >> > [Jia] It will connect to the list of bootstrap servers.
> gRPC
> > > will
> > > >>> > load
> > > >>> > >> > balance the requests and manage the connection errors.
> > > >>> > >> >
> > > >>> > >> > - should the bookie write on ZK metadata its gRPC endpoint
> > info
> > > ?
> > > >>> > (this
> > > >>> > >> > > will be useful for a bookie to tell about other bookies to
> > the
> > > >>> > >> connected
> > > >>> > >> > > clients)
> > > >>> > >> > >
> > > >>> > >> > [Jia]No, it won’t. We don’t see a strong reason to add it.
> > > >>> Especially
> > > >>> > >> > eventually we may eliminate zookeeper completely.
> > > >>> > >> > It can be a fixed port `3281`, or in a scheduler-based
> > > >>> environment, it
> > > >>> > >> is
> > > >>> > >> > very easy to have a load balancer sitting in front of those
> > > >>> bookies.
> > > >>> > >> >
> > > >>> > >>
> > > >>> > >> I think a fixed port is not a good way.
> > > >>> > >> You will not be able to run more than one bookie on a single
> > host.
> > > >>> > >>
> > > >>> > >> We should support:
> > > >>> > >> - configurable port
> > > >>> > >> - ephemeral port for tests
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> I think what Jia means is a configurable port, but it is a
> > > >>> relatively
> > > >>> > >> fixed
> > > >>> > >> port, which client doesn't discover this port from zookeeper.
> > > >>> > >>
> > > >>> > >
> > > >>> > > Very good
> > > >>> > >
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> Ideally I would like to have the local transport option, in
> > order
> > > to
> > > >>> > have
> > > >>> > >> a
> > > >>> > >> single JVM, but this is not a blocker problem, as we are
> running
> > > >>> gRPC on
> > > >>> > >> netty it should be feasible or we can create some kind of
> > > > >>> short-circuit
> > > >>> > >> between the client and the Bookie
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> GRPC supports inprocess channel. So you don't need to use the
> > low
> > > >>> level
> > > >>> > >> netty settings.
> > > >>> > >>
> > > >>> > >
> > > >>> > > Great
> > > >>> > >
> > > >>> > > So it sounds all good to me thanks
> > > >>> > >
> > > >>> > > Enrico
> > > >>> > >
> > > >>> > >
> > > >>> > >>
> > > >>> > >> I am OK for not writing this to the bookie metadata, leaving
> up
> > to
> > > >>> the
> > > >>> > >> client have a configured list of bookies enabled to metadata
> > > >>> operations
> > > >>> > >>
> > > >>> > >>
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> >
> > > >>> > >> > - the bookie will be somehow a proxy for zookeeper, I think
> > that
> > > >>> the
> > > >>> > >> > > 'watch' part is the more complex, we will have to deal
> with
> > > >>> > >> > reconnections,
> > > > >>> > >> > > errors... maybe it is worth writing more detail about
> this
> > > >>> > >> > >
> > > >>> > >> > [Jia] The `watch` API is using the `streaming` rpc in gRPC.
> It
> > > is
> > > >>> a
> > > >>> > >> > straightforward proxy behavior, if a connection is broken,
> the
> > > >>> client
> > > >>> > >> will
> > > >>> > >> > simply retry on watching again.
> > > >>> > >> >
> > > >>> > >> >
> > > >>> > >> > > Minor issues:
> > > >>> > >> > > - Maybe you can consider using ledgerId and not ledger_id,
> > > like
> > > >>> in
> > > >>> > >> > > LedgerMetadataFormat we are using lastEntryId
> > > >>> > >> > >
> > > >>> > >> > [Jia] Thanks, It is a protobuf style. The protobuf will
> > convert
> > > >>> > >> `ledger_id`
> > > >>> > >> > to `ledgerId`. We don’t need to worry about this.
> > > >>> > >> >
> > > >>> > >>
> > > >>> > >> got it, thanks
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> >
> > > >>> > >> >
> > > >>> > >> > > -In the "motivation" part you write that the fact the
> having
> > > >>> more
> > > >>> > >> clients
> > > >>> > >> > > than the number of bookies would be a problem for
> zookeeper,
> > > >>> > actually
> > > >>> > >> > > zookeeper is very good at dealing with a huge number of
> > > clients.
> > > >>> > >> > Actually I
> > > >>> > >> > > am always running clusters with 3-5 bookies and 10-100
> > writing
> > > >>> > clients
> > > >>> > >> > and
> > > >>> > >> > > this has never given troubles
> > > >>> > >> >
> > > >>> > >> > [Jia] :) Seems “10-100 writing clients” is not “a huge
> number
> > of
> > > >>> > >> clients”.
> > > >>> > >> >
> > > >>> > >>
> > > > >>> > >> OK, I agree with you and Sijie, I have no experience of larger
> > > >>> clusters
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> >
> > > >>> > >> > >
> > > >>> > >> >
> > > >>> > >> >
> > > >>> > >> >
> > > >>> > >> > > Future:
> > > >>> > >> > > - as bookies will be proxies maybe we should take care not
> > to
> > > >>> > >> overwhelm
> > > >>> > >> a
> > > >>> > >> > > bookie with too many clients
> > > >>> > >> > >
> > > >>> > >> > [Jia] First, gRPC is based on Netty, the protocol is http2,
> so
> > > the
> > > >>> > >> > connection is multiplexed. We don’t need to worry about
> > > connection
> > > >>> > >> count.
> > > >>> > >> > Second, all the bookies are treated equally for the metadata
> > > >>> > operations,
> > > > >>> > >> > gRPC will load-balance the requests across the bookies. We
> > > don’t
> > > >>> > need
> > > >>> > >> to
> > > > >>> > >> > worry about some bookies being overwhelmed.
> > > >>> > >> >
> > > >>> > >>
> > > >>> > >> gRPC sounds great
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> >
> > > >>> > >> >
> > > >>> > >> > > - iteration on ledgers, sometimes the clients enumerates
> > > >>> ledgers but
> > > >>> > >> it
> > > >>> > >> > is
> > > >>> > >> > > not interested in having all of them, as we are using the
> > > >>> bookie as
> > > >>> > >> proxy
> > > >>> > >> > > maybe some kind of "filter" (at least on custom metadata)
> > > would
> > > >>> be
> > > > >>> > >> created
> > > >>> > >> > > to limit the number of returned items. Other point I don't
> > > know
> > > >>> gRPC
> > > >>> > >> but
> > > >>> > >> > it
> > > >>> > >> > > does not seems to be very clear how to 'stop' the
> iteration
> > > >>> > >> > >
> > > >>> > >> > [Jia] Thanks, We can add it later. For now, we would like to
> > > >>> focus on
> > > >>> > >> > adding the features the ledger manager needs.
> > > >>> > >> >
> > > >>> > >>
> > > >>> > >> Yup
> > > >>> > >>
> > > >>> > >> -- Enrico
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> >
> > > >>> > >> > >
> > > >>> > >> > > -- Enrico
> > > >>> > >> > >
> > > >>> > >> > >
> > > >>> > >> > > 2017-09-05 15:10 GMT+02:00 Jia Zhai <zhaiji...@gmail.com
> >:
> > > >>> > >> > >
> > > >>> > >> > > > Hi all,
> > > >>> > >> > > >
> > > >>> > >> > > > I have just posted a proposal to remove zookeeper
> > dependency
> > > >>> from
> > > >>> > >> > > > bookkeeper client, to make bookkeeper client a thin
> > client:
> > > >>> > >> > > >
> > > >>> > >> > > > https://cwiki.apache.org/confluence/display/BOOKKEEPER/
> > > >>> > >> > > > BP-16%3A+remove+zookeeper+dependency+from+bookkeeper+
> > client
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > > BookKeeper uses zookeeper for service discovery
> > (discovering
> > > >>> the
> > > >>> > >> > > available
> > > >>> > >> > > > bookies in the cluster), metadata management (storing
> all
> > > the
> > > >>> > >> metadata
> > > >>> > >> > > for
> > > >>> > >> > > > ledgers). However it exposes the metadata storage
> directly
> > > to
> > > >>> the
> > > >>> > >> > > clients,
> > > >>> > >> > > > making bookkeeper client a very thick client. It also
> > > exposes
> > > >>> some
> > > >>> > >> > > > problems.
> > > >>> > >> > > >
> > > >>> > >> > > > This BP explores the possibility of eliminating
> zookeeper
> > > >>> > completely
> > > >>> > >> > from
> > > >>> > >> > > > client side, to produce a thin bookkeeper client.
> > > >>> > >> > > >
> > > >>> > >> > > > I will send a patch as soon as we agree on the proposal.
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > > Thanks.
> > > >>> > >> > > >
> > > >>> > >> > > > -Jia
> > > >>> > >> > > >
> > > >>> > >> > >
> > > >>> > >> >
> > > >>> > >>
> > > >>> > > --
> > > >>> > >
> > > >>> > >
> > > >>> > > -- Enrico Olivelli
> > > >>> > >
> > > >>> > --
> > > >>> >
> > > >>> >
> > > >>> > -- Enrico Olivelli
> > > >>> >
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> >
>
