Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

Dong Lin Wed, 08 Feb 2017 18:44:29 -0800

BTW, the idea to share NetworkClient is suggested by Radai and I like this
idea.


On Wed, Feb 8, 2017 at 6:39 PM, Dong Lin <lindon...@gmail.com> wrote:

> Hey Colin,
>
> Thanks for updating the KIP. I have two followup questions:
>
> - It seems that setCreationConfig(...) is a bit redundant given that most
> arguments (e.g. topic name, partition num) are already passed to
> TopicsContext.create(...) when user creates topic. Should we pass
> the creationConfig as a parameter to TopicsContext.create(..)?
>
> - I am wondering if we should also specify the constructor of the
> AdminClient in the KIP. Previously we agreed that AdminClient should have
> its own thread to poll NetworkClient to send/receive messages. Should we
> also allow AdminClient to use an existing NetworkClient that is provided to
> the constructor? This would allow AdminClient to share NetworkClient with
> producer or consumer in order to reduce the total number of open sockets on
> both client and server.
>
> Thanks,
> Dong
>
> On Wed, Feb 8, 2017 at 5:00 PM, Colin McCabe <cmcc...@apache.org> wrote:
>
>> Hi all,
>>
>> I made some major revisions to the proposal on the wiki, so please check
>> it out.
>>
>> The new API is based on Ismael's suggestion of grouping related APIs.
>> There is only one layer of grouping.  I think that it's actually pretty
>> intuitive.  It's also based on the idea of using Futures, which several
>> people commented that they'd like to see.
>>
>> Here's a simple example:
>>
>>  > AdminClient client = new AdminClientImpl(myConfig);
>>  > try {
>>  >   client.topics().create("foo", 3, (short) 2, false).get();
>>  >   Collection<String> topicNames = client.topics().list(false).get();
>>  >   log.info("Found topics: {}", Utils.mkString(topicNames, ", "));
>>  >   Collection<Node> nodes = client.nodes().list().get();
>>  >   log.info("Found cluster nodes: {}", Utils.mkString(nodes, ", "));
>>  > } finally {
>>  >   client.close();
>>  > }
>>
>> The good thing is, there is no Try, no 'get' prefixes, no messing with
>> batch APIs.  If there is an error, then Future#get() throws an
>> ExecutionException which wraps the relevant exception in the standard
>> Java way.
>>
>> Here's a slightly less simple example:
>>
>> > AdminClient client = new AdminClientImpl(myConfig);
>> > try {
>> >   List<Future<Void>> futures = new LinkedList<>();
>> >   for (String topicName: myNewTopicNames) {
>> >     creations.add(client.topics().
>> >         setClientTimeout(30000).setCreationConfig(myTopicConfig).
>> >           create(topicName, 3, (short) 2, false));
>> >   }
>> >   Futures.waitForAll(futures);
>> > } finally {
>> >   client.close();
>> > }
>>
>> I went with Futures because I feel like ought to have some option for
>> doing async.  It's a style of programming that has become a lot more
>> popular with the rise of Node.js, Twisted python, etc. etc.  Also, as
>> Ismael commented, Java 8 CompletableFuture is going to make Java's
>> support for fluent async programming a lot stronger by allowing call
>> chaining and much more.
>>
>> If we are going to support async, the simplest thing is just to make
>> everything return a future and let people call get() if they want to run
>> synchronously.  Having a mix of async and sync APIs is just going to be
>> confusing and redundant.
>>
>> I think we should try to avoid creating single functions that start
>> multiple requests if we can.  It makes things much uglier.  It means
>> that you have to have some kind of request class that wraps up the
>> request the user is trying to create, so that you can handle an array of
>> those requests.  The return value has to be something like Map<Node,
>> Try<Value>> to represent which nodes failed and succeeded.  This is the
>> kind of stuff that, in my opinion, makes people scratch their heads.
>>
>> If we need to, we can still get some of the efficiency benefits of batch
>> APIs by waiting for a millisecond or two before sending out a topic
>> create() request to see if other create() requests arrive.  If so, we
>> can coalesce them.  It might be better to figure out if this is an
>> actual performance issue before implementing it, though.
>>
>> I think it would be good to get something out there, annotate it as
>> @Unstable, and get feedback from people building against trunk and using
>> it.  We have removed or changed @Unstable APIs in streams before, so I
>> don't think we should worry that it will get set in stone prematurely.
>> The AdminClient API should get much less developer use than anything in
>> streams, so changing an unstable API should be much easier.
>>
>> best,
>> Colin
>>
>>
>> On Wed, Feb 8, 2017, at 07:49, Ismael Juma wrote:
>> > Thanks for elaborating Jay. I totally agree that we have to be very
>> > careful
>> > in how we use our complexity budget. Easier said than done when people
>> > don't agree on what is complex and what is simple. :) For example, I
>> > think
>> > batch APIs are a significant source of complexity as you have to do a
>> > bunch
>> > of ceremony to group things before sending the request and error
>> handling
>> > becomes more complex due to partial failures (things like `Try` or other
>> > mechanisms that serve a similar role are then needed).
>> >
>> > Maybe a way forward is to write API usage examples to help validate that
>> > the suggested API is indeed easy to use.
>> >
>> > Ismael
>> >
>> > On Wed, Feb 8, 2017 at 4:40 AM, Jay Kreps <j...@confluent.io> wrote:
>> >
>> > > Totally agree on CompletableFuture. Also agree with some of the rough
>> edges
>> > > on the Consumer.
>> > >
>> > > I don't have much of a leg to stand on with the splitting vs not
>> splitting
>> > > thing, really hard to argue one or the other is better. I guess the
>> one
>> > > observation in watching us try to make good public apis over the
>> years is I
>> > > am kind of in favor of a particular kind of simple. In particular I
>> think
>> > > since the bar is sooo high in support and docs and the community of
>> users
>> > > so broad in the range of their capabilities, it makes it so there is
>> a lot
>> > > of value in dead simple interfaces that don't have a lot of conceptual
>> > > weight, don't introduce a lot of new classes or concepts or general
>> > > patterns that must be understood to use them correctly. So things like
>> > > nesting, or the Try class, or async apis, or even just a complex set
>> of
>> > > classes representing arguments or return values kind of all stack
>> > > conceptual burdens on the user to figure out correct usage. So like,
>> for
>> > > example, the Try class is very elegant and represents a whole
>> generalized
>> > > class of possibly completed actions, but the flip side is maybe I'm
>> just a
>> > > working guy who needs to list his kafka topics but doesn't know Rust,
>> take
>> > > pity on me! :-)
>> > >
>> > > Nit picking aside, super excited to see us progress on this.
>> > >
>> > > -Jay
>> > >
>> > > On Tue, Feb 7, 2017 at 3:46 PM Ismael Juma <ism...@juma.me.uk> wrote:
>> > >
>> > > > Hi Jay,
>> > > >
>> > > > Thanks for the feedback. Comments inline.
>> > > >
>> > > > On Tue, Feb 7, 2017 at 8:18 PM, Jay Kreps <j...@confluent.io> wrote:
>> > > > >
>> > > > >    - I think it would be good to not use "get" as the prefix for
>> things
>> > > > >    making remote calls. We've tried to avoid the java getter
>> convention
>> > > > >    entirely (see code style guide), but for remote calls in
>> particular
>> > > it
>> > > > > kind
>> > > > >    of blurs the line between field access and remote RPC in a way
>> that
>> > > > > leads
>> > > > >    people to trouble. What about, e.g., fetchAllGroups() vs
>> > > > getAllGroups().
>> > > > >
>> > > >
>> > > > Agreed that we should avoid the `get` prefix for remote calls.
>> There are
>> > > a
>> > > > few possible prefixes for the read operations: list, fetch,
>> describe.
>> > > >
>> > > >
>> > > > >    - I think futures and callbacks are a bit of a pain to use. I'd
>> > > second
>> > > > >    Becket's comment: let's ensure there a common use case
>> motivating
>> > > > these
>> > > > >    that wouldn't be just as easily satisfied with batch operations
>> > > (which
>> > > > > we
>> > > > >    seem to have at least for some things). In terms of flexibility
>> > > > > Callbacks >
>> > > > >    Futures > Batch Ops but I think in terms of usability it is the
>> > > exact
>> > > > >    opposite so let's make sure we have worked out how the API
>> will be
>> > > > used
>> > > > >    before deciding. In particular I think java Futures are often
>> an
>> > > > >    uncomfortable half-way point since calling get() and blocking
>> the
>> > > > > thread is
>> > > > >    often not what you want for chaining sequences of operations
>> in a
>> > > > truly
>> > > > >    async way, so 99% of people just use the future as a way to
>> batch
>> > > > calls.
>> > > > >
>> > > >
>> > > > We should definitely figure out how the APIs are going to be used
>> before
>> > > > deciding. I agree that callbacks are definitely painful and there's
>> > > little
>> > > > reason to expose them in a modern API unless it's meant to be very
>> low
>> > > > level. When it comes to Futures, I think it's important to
>> distinguish
>> > > what
>> > > > is available in Java 7 and below versus what is available from Java
>> 8
>> > > > onwards. CompletableFuture makes it much easier to compose/chain
>> > > operations
>> > > > (in a similar vein to java.util.Stream, our own Streams API, etc.)
>> and it
>> > > > gives you the ability to register callbacks if you really want to
>> > > (avoiding
>> > > > the somewhat odd situation in the Producer where we return a Future
>> _and_
>> > > > allow you to pass a callback).
>> > > >
>> > > >
>> > > > >    - Personally I don't think splitting the admin methods up
>> actually
>> > > > makes
>> > > > >    things more usable. It just makes you have to dig through our
>> > > > > hierarchy. I
>> > > > >    think a flat class with a bunch of operations (like the
>> consumer
>> > > api)
>> > > > is
>> > > > >    probably the easiest for people to grok and find things on. I
>> am
>> > > kind
>> > > > > of a
>> > > > >    dumb PHP programmer at heart, though.
>> > > > >
>> > > >
>> > > > I am not sure it's fair to compare the AdminClient with the
>> Consumer. The
>> > > > former has APIs for a bunch of unrelated APIs (topics, ACLs,
>> configs,
>> > > > consumer groups, delegation tokens, preferred leader election,
>> partition
>> > > > reassignment, etc.) where the latter is pretty specialised. For
>> each of
>> > > the
>> > > > resources, you may have 3-4 operations, it will get confusing fast.
>> Also,
>> > > > do you really think an API that has one level of grouping will mean
>> that
>> > > > users have to "dig through our hierarchy"? Or are you concerned
>> that once
>> > > > we go in that direction, there is a danger of making the hierarchy
>> more
>> > > > complicated?
>> > > >
>> > > > Finally, I am not sure I would use the consumer as an example of
>> > > something
>> > > > that is easy to grok. :) The fact that methods behave pretty
>> differently
>> > > > (some are blocking while others only have an effect after poll)
>> with no
>> > > > indication from the type signature or naming convention makes it
>> harder,
>> > > > not easier, to understand.
>> > > >
>> > > > Ismael
>> > > >
>> > >
>>
>
>

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

Reply via email to