Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

Colin McCabe Thu, 09 Feb 2017 10:47:01 -0800

On Wed, Feb 8, 2017, at 18:39, Dong Lin wrote:
> Hey Colin,
> 
> Thanks for updating the KIP. I have two followup questions:
> 
> - It seems that setCreationConfig(...) is a bit redundant given that most
> arguments (e.g. topic name, partition num) are already passed to
> TopicsContext.create(...) when user creates topic. Should we pass
> the creationConfig as a parameter to TopicsContext.create(..)?


That makes sense.  I'll add it as a parameter to create.

> 
> - I am wondering if we should also specify the constructor of the
> AdminClient in the KIP. Previously we agreed that AdminClient should have
> its own thread to poll NetworkClient to send/receive messages. Should we
> also allow AdminClient to use an existing NetworkClient that is provided
> to
> the constructor? This would allow AdminClient to share NetworkClient with
> producer or consumer in order to reduce the total number of open sockets
> on
> both client and server.

Hmm.  NetworkClient isn't thread safe.  There is no safe way for the
AdminClient thread to share a NetworkClient instance with any other
thread.  I also think library users shouldn't be accessing NetworkClient
since it's part of the implementation, not the API.

It would be interesting to think about extending NetworkClient so that
it could support being shared between consumer, producer, and
adminclient.  You would probably also need to support having multiple
requests in flight to a particular node, which it can't do right now. 
We should probably do that in a separate KIP, though.

best,
Colin

> 
> Thanks,
> Dong
> 
> On Wed, Feb 8, 2017 at 5:00 PM, Colin McCabe <[email protected]> wrote:
> 
> > Hi all,
> >
> > I made some major revisions to the proposal on the wiki, so please check
> > it out.
> >
> > The new API is based on Ismael's suggestion of grouping related APIs.
> > There is only one layer of grouping.  I think that it's actually pretty
> > intuitive.  It's also based on the idea of using Futures, which several
> > people commented that they'd like to see.
> >
> > Here's a simple example:
> >
> >  > AdminClient client = new AdminClientImpl(myConfig);
> >  > try {
> >  >   client.topics().create("foo", 3, (short) 2, false).get();
> >  >   Collection<String> topicNames = client.topics().list(false).get();
> >  >   log.info("Found topics: {}", Utils.mkString(topicNames, ", "));
> >  >   Collection<Node> nodes = client.nodes().list().get();
> >  >   log.info("Found cluster nodes: {}", Utils.mkString(nodes, ", "));
> >  > } finally {
> >  >   client.close();
> >  > }
> >
> > The good thing is, there is no Try, no 'get' prefixes, no messing with
> > batch APIs.  If there is an error, then Future#get() throws an
> > ExecutionException which wraps the relevant exception in the standard
> > Java way.
> >
> > Here's a slightly less simple example:
> >
> > > AdminClient client = new AdminClientImpl(myConfig);
> > > try {
> > >   List<Future<Void>> futures = new LinkedList<>();
> > >   for (String topicName: myNewTopicNames) {
> > >     creations.add(client.topics().
> > >         setClientTimeout(30000).setCreationConfig(myTopicConfig).
> > >           create(topicName, 3, (short) 2, false));
> > >   }
> > >   Futures.waitForAll(futures);
> > > } finally {
> > >   client.close();
> > > }
> >
> > I went with Futures because I feel like ought to have some option for
> > doing async.  It's a style of programming that has become a lot more
> > popular with the rise of Node.js, Twisted python, etc. etc.  Also, as
> > Ismael commented, Java 8 CompletableFuture is going to make Java's
> > support for fluent async programming a lot stronger by allowing call
> > chaining and much more.
> >
> > If we are going to support async, the simplest thing is just to make
> > everything return a future and let people call get() if they want to run
> > synchronously.  Having a mix of async and sync APIs is just going to be
> > confusing and redundant.
> >
> > I think we should try to avoid creating single functions that start
> > multiple requests if we can.  It makes things much uglier.  It means
> > that you have to have some kind of request class that wraps up the
> > request the user is trying to create, so that you can handle an array of
> > those requests.  The return value has to be something like Map<Node,
> > Try<Value>> to represent which nodes failed and succeeded.  This is the
> > kind of stuff that, in my opinion, makes people scratch their heads.
> >
> > If we need to, we can still get some of the efficiency benefits of batch
> > APIs by waiting for a millisecond or two before sending out a topic
> > create() request to see if other create() requests arrive.  If so, we
> > can coalesce them.  It might be better to figure out if this is an
> > actual performance issue before implementing it, though.
> >
> > I think it would be good to get something out there, annotate it as
> > @Unstable, and get feedback from people building against trunk and using
> > it.  We have removed or changed @Unstable APIs in streams before, so I
> > don't think we should worry that it will get set in stone prematurely.
> > The AdminClient API should get much less developer use than anything in
> > streams, so changing an unstable API should be much easier.
> >
> > best,
> > Colin
> >
> >
> > On Wed, Feb 8, 2017, at 07:49, Ismael Juma wrote:
> > > Thanks for elaborating Jay. I totally agree that we have to be very
> > > careful
> > > in how we use our complexity budget. Easier said than done when people
> > > don't agree on what is complex and what is simple. :) For example, I
> > > think
> > > batch APIs are a significant source of complexity as you have to do a
> > > bunch
> > > of ceremony to group things before sending the request and error handling
> > > becomes more complex due to partial failures (things like `Try` or other
> > > mechanisms that serve a similar role are then needed).
> > >
> > > Maybe a way forward is to write API usage examples to help validate that
> > > the suggested API is indeed easy to use.
> > >
> > > Ismael
> > >
> > > On Wed, Feb 8, 2017 at 4:40 AM, Jay Kreps <[email protected]> wrote:
> > >
> > > > Totally agree on CompletableFuture. Also agree with some of the rough
> > edges
> > > > on the Consumer.
> > > >
> > > > I don't have much of a leg to stand on with the splitting vs not
> > splitting
> > > > thing, really hard to argue one or the other is better. I guess the one
> > > > observation in watching us try to make good public apis over the years
> > is I
> > > > am kind of in favor of a particular kind of simple. In particular I
> > think
> > > > since the bar is sooo high in support and docs and the community of
> > users
> > > > so broad in the range of their capabilities, it makes it so there is a
> > lot
> > > > of value in dead simple interfaces that don't have a lot of conceptual
> > > > weight, don't introduce a lot of new classes or concepts or general
> > > > patterns that must be understood to use them correctly. So things like
> > > > nesting, or the Try class, or async apis, or even just a complex set of
> > > > classes representing arguments or return values kind of all stack
> > > > conceptual burdens on the user to figure out correct usage. So like,
> > for
> > > > example, the Try class is very elegant and represents a whole
> > generalized
> > > > class of possibly completed actions, but the flip side is maybe I'm
> > just a
> > > > working guy who needs to list his kafka topics but doesn't know Rust,
> > take
> > > > pity on me! :-)
> > > >
> > > > Nit picking aside, super excited to see us progress on this.
> > > >
> > > > -Jay
> > > >
> > > > On Tue, Feb 7, 2017 at 3:46 PM Ismael Juma <[email protected]> wrote:
> > > >
> > > > > Hi Jay,
> > > > >
> > > > > Thanks for the feedback. Comments inline.
> > > > >
> > > > > On Tue, Feb 7, 2017 at 8:18 PM, Jay Kreps <[email protected]> wrote:
> > > > > >
> > > > > >    - I think it would be good to not use "get" as the prefix for
> > things
> > > > > >    making remote calls. We've tried to avoid the java getter
> > convention
> > > > > >    entirely (see code style guide), but for remote calls in
> > particular
> > > > it
> > > > > > kind
> > > > > >    of blurs the line between field access and remote RPC in a way
> > that
> > > > > > leads
> > > > > >    people to trouble. What about, e.g., fetchAllGroups() vs
> > > > > getAllGroups().
> > > > > >
> > > > >
> > > > > Agreed that we should avoid the `get` prefix for remote calls. There
> > are
> > > > a
> > > > > few possible prefixes for the read operations: list, fetch, describe.
> > > > >
> > > > >
> > > > > >    - I think futures and callbacks are a bit of a pain to use. I'd
> > > > second
> > > > > >    Becket's comment: let's ensure there a common use case
> > motivating
> > > > > these
> > > > > >    that wouldn't be just as easily satisfied with batch operations
> > > > (which
> > > > > > we
> > > > > >    seem to have at least for some things). In terms of flexibility
> > > > > > Callbacks >
> > > > > >    Futures > Batch Ops but I think in terms of usability it is the
> > > > exact
> > > > > >    opposite so let's make sure we have worked out how the API will
> > be
> > > > > used
> > > > > >    before deciding. In particular I think java Futures are often an
> > > > > >    uncomfortable half-way point since calling get() and blocking
> > the
> > > > > > thread is
> > > > > >    often not what you want for chaining sequences of operations in
> > a
> > > > > truly
> > > > > >    async way, so 99% of people just use the future as a way to
> > batch
> > > > > calls.
> > > > > >
> > > > >
> > > > > We should definitely figure out how the APIs are going to be used
> > before
> > > > > deciding. I agree that callbacks are definitely painful and there's
> > > > little
> > > > > reason to expose them in a modern API unless it's meant to be very
> > low
> > > > > level. When it comes to Futures, I think it's important to
> > distinguish
> > > > what
> > > > > is available in Java 7 and below versus what is available from Java 8
> > > > > onwards. CompletableFuture makes it much easier to compose/chain
> > > > operations
> > > > > (in a similar vein to java.util.Stream, our own Streams API, etc.)
> > and it
> > > > > gives you the ability to register callbacks if you really want to
> > > > (avoiding
> > > > > the somewhat odd situation in the Producer where we return a Future
> > _and_
> > > > > allow you to pass a callback).
> > > > >
> > > > >
> > > > > >    - Personally I don't think splitting the admin methods up
> > actually
> > > > > makes
> > > > > >    things more usable. It just makes you have to dig through our
> > > > > > hierarchy. I
> > > > > >    think a flat class with a bunch of operations (like the consumer
> > > > api)
> > > > > is
> > > > > >    probably the easiest for people to grok and find things on. I am
> > > > kind
> > > > > > of a
> > > > > >    dumb PHP programmer at heart, though.
> > > > > >
> > > > >
> > > > > I am not sure it's fair to compare the AdminClient with the
> > Consumer. The
> > > > > former has APIs for a bunch of unrelated APIs (topics, ACLs, configs,
> > > > > consumer groups, delegation tokens, preferred leader election,
> > partition
> > > > > reassignment, etc.) where the latter is pretty specialised. For each
> > of
> > > > the
> > > > > resources, you may have 3-4 operations, it will get confusing fast.
> > Also,
> > > > > do you really think an API that has one level of grouping will mean
> > that
> > > > > users have to "dig through our hierarchy"? Or are you concerned that
> > once
> > > > > we go in that direction, there is a danger of making the hierarchy
> > more
> > > > > complicated?
> > > > >
> > > > > Finally, I am not sure I would use the consumer as an example of
> > > > something
> > > > > that is easy to grok. :) The fact that methods behave pretty
> > differently
> > > > > (some are blocking while others only have an effect after poll) with
> > no
> > > > > indication from the type signature or naming convention makes it
> > harder,
> > > > > not easier, to understand.
> > > > >
> > > > > Ismael
> > > > >
> > > >
> >

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

Reply via email to