On Sep 6, 2017 4:57 AM, "Enrico Olivelli" <eolive...@gmail.com> wrote:

Thank you Sijie and Jia for your comments and explanations,
answers inline

2017-09-06 2:23 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:

> Thanks a lot Enrico and Sijie for your comments and information on this.
>
> On Tue, Sep 5, 2017 at 9:31 PM, Enrico Olivelli <eolive...@gmail.com>
> wrote:
>
> > Great to see you working on this!
> > It would be great to have such a feature, as it is the first step to a
> > 'standalone' BookKeeper mode
> >
> > Some complementary ideas/first look questions:
> > - the document does not talk about security; IMHO we have to cover at
> > least authentication and TLS. It would be great to leverage the existing
> > AuthPlugins, as they are based on exchanging byte[] (as SASL wants)
> >
> [Jia] It is a good idea. We left the security part out for now for a few
> reasons. 1) To keep this BP focused on removing zookeeper dependencies from
> the client. 2) It is introduced as a separate implementation of existing
> interfaces, so it won’t impact the existing security story. And for sure, we
> will add the security part later, after this.
>


I am fine with that. I am only afraid that we won't be able to support it in
the (near) future; maybe you could just mention the security story and add a
reference to how we would deal with it in the future


The new ledger manager will first be marked as experimental, until it is
stable and has the security features.

How does that sound?



>
> > - do we have some kind of "bootstrap servers list" configuration option?
> > Should the list be complete or just a subset of bookies? At connection
> > time the client could discover the list of other bookies
> >
> [Jia] Yes, we will have a `clientBootstrapBookies` setting in the server
> set. It can be a list of bookies or simply a DNS name for the bookies.
> Will add this to the BP
>
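A minimal sketch of how a client might read such a setting, using only the Python standard library. Everything about the format is an assumption made for illustration: the thread does not define the syntax of `clientBootstrapBookies`, the function name `parse_bootstrap_bookies` is hypothetical, and the default port is simply the `3281` mentioned elsewhere in this thread. A single DNS name is treated as a one-entry list.

```python
def parse_bootstrap_bookies(value, default_port=3281):
    """Parse a comma-separated list of `host` or `host:port` entries.

    The format is assumed for illustration; the proposal does not define it.
    IPv6 literals are not handled.
    """
    endpoints = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and blank entries
        host, sep, port = entry.rpartition(":")
        if sep:
            endpoints.append((host, int(port)))
        else:
            # No port given: fall back to the assumed default.
            endpoints.append((entry, default_port))
    return endpoints


if __name__ == "__main__":
    print(parse_bootstrap_bookies("bookie1:3281, bookie2:4000"))
    print(parse_bootstrap_bookies("bookies.example.com"))
```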
> > - will the client connect to only one bookie at a time? How will we deal
> > with errors?
> >
> [Jia] It will connect to the list of bootstrap servers. gRPC will load
> balance the requests and manage the connection errors.
>
> > - should the bookie write its gRPC endpoint info to the ZK metadata? (this
> > will be useful for a bookie to tell the connected clients about other
> > bookies)
> >
> [Jia] No, it won’t. We don’t see a strong reason to add it, especially
> since we may eventually eliminate zookeeper completely.
> It can be a fixed port `3281`, or in a scheduler-based environment, it is
> very easy to have a load balancer sitting in front of those bookies.
>

I don't think a fixed port is a good approach.
You would not be able to run more than one bookie on a single host.

We should support:
- configurable port
- ephemeral port for tests


I think what Jia means is a configurable port, but one that is relatively
fixed, in the sense that the client doesn't discover it from zookeeper.


Ideally I would like to have the local transport option, in order to have a
single JVM, but this is not a blocker. As we are running gRPC on netty it
should be feasible, or we can create some kind of short-circuit between the
client and the Bookie


gRPC supports an in-process channel, so you don't need to use the low-level
netty settings.


I am OK with not writing this to the bookie metadata, leaving it up to the
client to have a configured list of bookies enabled for metadata operations




>
> > - the bookie will somehow be a proxy for zookeeper. I think that the
> > 'watch' part is the most complex: we will have to deal with
> > reconnections, errors... maybe it is worth writing more detail about this
> >
> [Jia] The `watch` API uses the `streaming` rpc in gRPC. It is
> straightforward proxy behavior: if a connection is broken, the client will
> simply retry the watch.
>
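The retry behaviour described above can be sketched as a small loop. This is plain Python with no gRPC dependency: `open_stream` is a hypothetical callable standing in for a server-streaming RPC, and the retry limit and backoff are assumptions rather than anything specified in the proposal.

```python
import time


def watch_with_retry(open_stream, on_event, max_retries=5, backoff_seconds=0.0):
    """Consume events from a stream, reopening it when the connection breaks.

    open_stream: hypothetical callable returning an iterator of events; it
    stands in for a streaming RPC and may raise ConnectionError.
    """
    failures = 0
    while True:
        try:
            for event in open_stream():
                on_event(event)
            return  # the stream ended normally
        except ConnectionError:
            failures += 1
            if failures > max_retries:
                raise  # give up and surface the error to the caller
            time.sleep(backoff_seconds * failures)
```

The sketch does not address events that occur while the client is disconnected; a real watch would need some way to resynchronise after reconnecting.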
>
> > Minor issues:
> > - Maybe you can consider using ledgerId and not ledger_id, like in
> > LedgerMetadataFormat we are using lastEntryId
> >
> [Jia] Thanks. It is protobuf style: protobuf will convert `ledger_id`
> to `ledgerId`. We don’t need to worry about this.
>

got it, thanks
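The naming conversion Jia refers to can be approximated with a few lines of Python. This is a simplified illustration of the snake_case to lowerCamelCase rule, not the actual logic of the protobuf compiler, and the function name `to_lower_camel` is hypothetical.

```python
def to_lower_camel(field_name):
    """Convert a snake_case field name to lowerCamelCase.

    Simplified approximation of the naming rule, not protoc's actual code.
    """
    head, *rest = field_name.split("_")
    # Upper-case only the first character of each later part.
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


if __name__ == "__main__":
    print(to_lower_camel("ledger_id"))
    print(to_lower_camel("last_entry_id"))
```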


>
>
> > - In the "motivation" part you write that having more clients than the
> > number of bookies would be a problem for zookeeper. Actually, zookeeper
> > is very good at dealing with a huge number of clients. I am always
> > running clusters with 3-5 bookies and 10-100 writing clients and this
> > has never caused trouble
>
> [Jia] :) Seems “10-100 writing clients” is not “a huge number of clients”.
>

OK, I agree with you and Sijie, I have no experience with larger clusters


>
> >
>
>
>
> > Future:
> > - as bookies will be proxies maybe we should take care not to overwhelm
> > a bookie with too many clients
> >
> [Jia] First, gRPC is based on Netty and the protocol is HTTP/2, so the
> connection is multiplexed. We don’t need to worry about connection count.
> Second, all the bookies are treated equally for the metadata operations;
> gRPC will load balance the requests across the bookies. We don’t need to
> worry about some bookies being overwhelmed.
>

gRPC sounds great


>
>
> > - iteration on ledgers: sometimes the client enumerates ledgers but is
> > not interested in having all of them. As we are using the bookie as a
> > proxy, maybe some kind of "filter" (at least on custom metadata) would
> > be great to limit the number of returned items. Another point: I don't
> > know gRPC, but it does not seem very clear how to 'stop' the iteration
> >
> [Jia] Thanks, we can add it later. For now, we would like to focus on
> adding the features the ledger manager needs.
>

Yup
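The filter-and-stop idea raised above can be sketched with lazy iteration. This shows only the shape seen from the caller's side, under stated assumptions: the ledger records are plain dicts, `pages` stands in for batches returned by a hypothetical listing call, and the predicate runs in the caller, whereas the suggestion in this thread is for the bookie to apply it.

```python
from itertools import islice


def iter_ledgers(pages):
    """Lazily yield ledger records from an iterable of pages (lists of dicts)."""
    for page in pages:
        yield from page


def find_ledgers(pages, predicate, limit):
    """Return up to `limit` ledgers matching `predicate`, then stop iterating."""
    return list(islice(filter(predicate, iter_ledgers(pages)), limit))
```

Because every stage is lazy, no further pages are requested once `limit` matches have been collected, which is one way of "stopping" the iteration.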

-- Enrico


>
> >
> > -- Enrico
> >
> >
> > 2017-09-05 15:10 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> >
> > > Hi all,
> > >
> > > I have just posted a proposal to remove zookeeper dependency from
> > > bookkeeper client, to make bookkeeper client a thin client:
> > >
> > > https://cwiki.apache.org/confluence/display/BOOKKEEPER/BP-16%3A+remove+zookeeper+dependency+from+bookkeeper+client
> > >
> > >
> > > BookKeeper uses zookeeper for service discovery (discovering the
> > > available bookies in the cluster) and metadata management (storing all
> > > the metadata for ledgers). However, it exposes the metadata storage
> > > directly to the clients, making the bookkeeper client a very thick
> > > client. It also exposes some problems.
> > >
> > > This BP explores the possibility of eliminating zookeeper completely
> > > from the client side, to produce a thin bookkeeper client.
> > >
> > > I will send a patch as soon as we agree on the proposal.
> > >
> > >
> > > Thanks.
> > >
> > > -Jia
> > >
> >
>
