Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

Jeff Widman Mon, 06 Feb 2017 13:04:52 -0800

For individual consumer groups, it would be nice if the admin client made
it possible to fetch consumer offsets for the entire consumer group. Then
we don't have to manually assemble this outside of the admin client
interface.


On Feb 6, 2017 11:41 AM, "Colin McCabe" <cmcc...@apache.org> wrote:

> On Fri, Feb 3, 2017, at 16:57, Dong Lin wrote:
> > Thanks for the reply, Colin. I have some comments inline.
>
> Hi Dong L.,
>
> >
> > In addition, I also have some comments regarding the Future() in response
> > to your latest email. As Ismael mentioned, we have added
> > purgeDataBefore()
> > API in AdminClient. This API returns Future() that allows user to purge
> > data in either syn or async manner. And we have presented use-case for
> > both
> > syn and async usage of this API in the discussion thread of KIP-107. I
> > think we should at least return a Future() object in this case, right?
>
> Hmm.  I didn't see any discussion in the KIP-107 [DISCUSS] thread of why
> a Future<> based API was proposed.  It seemed like the discussion
> focused on other issues (or maybe I missed the an email thread?)
>
> >
> > As you mentioned, we can transform a blocking API into a Futures-based
> > API
> > by using a thread pool. But thread pool seems inconvenient as compared to
> > using future().get() which transform a Future-based API into a blocking
> > API. Would it be a good reason to return Future() for all those API where
> > we need both syn and async mode?
>
> That's a good point.  It's easier to build a sync API on top of Futures
> than the reverse.
>
> >
> >
> > On Fri, Feb 3, 2017 at 10:20 AM, Colin McCabe <cmcc...@apache.org>
> wrote:
> >
> > > On Thu, Feb 2, 2017, at 17:54, Dong Lin wrote:
> > > > Hey Colin,
> > > >
> > > > Thanks for the KIP. I have a few comments below:
> > > >
> > > > - I share similar view with Ismael that a Future-based API is better.
> > > > PurgeDataFrom() is an example API that user may want to do it
> > > > asynchronously even though there is only one request in flight at a
> time.
> > > > In the future we may also have some admin operation that allows user
> to
> > > > move replica from one broker to another, which also needs to work in
> both
> > > > sync and async style. It seems more generic to return Future<?> for
> any
> > > > API
> > > > that requires both mode.
> > > >
> > > > - I am not sure if it is the cleanest way to have enum classes
> > > > CreateTopicsFlags and DeleteTopicsFlags. Are we going to create such
> > > > class
> > > > for every future API that requires configuration? It may be more
> generic
> > > > to
> > > > provide Map<String, ?> to any admin API that operates on multiple
> > > > entries.
> > > > For example, deleteTopic(Map<String, Boolean>). And it can be
> Map<String,
> > > > Properties> for those API that requires multiple configs per entry.
> And
> > > > we
> > > > can provide default value, doc, config name for those API as we do
> > > > with AbstractConfig.
> > >
> > > Thanks for the comments, Dong L.
> > >
> > > EnumSet<CreateTopicsFlags>, EnumSet<DeleteTopicsFlags>, and so forth
> are
> > > type-safe ways for the user to pass a set of boolean flags to the
> > > function.  It is basically a replacement for having an api like
> > > createTopics(...., boolean nonblocking, boolean validateOnly).  It is
> > > preferrable to having the boolean flags because it's immediately clear
> > > when reading the code what createTopics(...,
> > > CreateTopicsFlags.VALIDATE_ONLY) means, whereas it is not immediately
> > > clear what createTopics(..., false, true) means.  It also prevents
> > > having lots of function overloads over time, which becomes confusing
> for
> > > users.  The EnumSet is not intended as a replacement for all possible
> > > future arguments we might add, but just an easy way to add more boolean
> > > arguments later without adding messy function overloads or type
> > > signatures with many booleans.
> > >
> > > Map<String, ?> is not type-safe, and I don't think we should use it in
> > > our APIs.  Instead, we should just add function overloads if necessary.
> > >
> >
> > I agree that using EnumSet is type safe.
> >
> >
> > >
> > > >
> > > > - I not sure if "Try" is very intuitive to Java developer. Is there
> any
> > > > counterpart of scala's "Try" in java
> > >
> > > Unfortunately, there is no equivalent to Try in the standard Java
> > > library.  That is the first place I checked, and I spent a while
> > > searching.  The closest is probably Java 8's Optional.  However,
> > > Optional just allows us to express Some(thing) or None, so callers
> would
> > > not be able to determine what the error was.
> >
> >
> > > > We actually have quite a few
> > > > existing
> > > > classes in Kafka that address the same problem, such as
> > > > ProduceRequestResult, LogAppendResult etc. Maybe we can follow the
> same
> > > > conversion and use *Result as this class name.
> > >
> > > Hmm.  ProduceRequestResult and LogAppendResult just store an exception
> > > alongside the data, and make the caller responsible for checking
> whether
> > > the exception is null (or None) before looking at the data.  I don't
> > > think this is a good solution, because if the user forgets to do this,
> > > they will interpret whatever is in the data fields as valid, even when
> > > it is not.  ProduceRequestResult and LogAppendResult are also internal
> > > classes, not user-facing APIs, so we did not spend time thinking about
> > > how to make them easy to use for end-users.
> > >
> >
> > I am not actually suggesting to use LogAppendResult directly. I am
> > wondering if it would be more intuitive for developer to name this class
> > something like OperationResult instead of Try. The OperationResult can
> > store whatever we want to store with current "Try" class you proposed in
> > the KIP. It is just my personal opinion that "Try" doesn't directly tell
> > me
> > what the class actually does. But I am not too worried about it if you
> > think "Try" is better.
>
> I guess part of the reason I used "Try" is that scala.util.Try [
> http://www.scala-lang.org/api/2.9.3/scala/util/Try.html ] is very
> well-known to most Scala developers and fills the same function.  I'm
> open to other names... "Result" might be better than "OperationResult",
> since it's shorter.
>
> >
> >
> > >
> > > >
> > > > - How are we going to allow user to specify timeout for blocking APIs
> > > > such
> > > > as deleteTopic? Is this configured per AdminClient, or should it be
> > > > specified in the API parameter?
> > >
> > > Right now, it is specified by the configuration of the AdminClient.
> > >
> >
> > User probably wants different client-side timeout for different API,
> > right?
> > For example, user may want to block for a long time for topic creation
> > API
> > because he needs topic to be created before doing the next thing. But the
> > user may want to call purgeDataBefore() API periodically and
> > automatically
> > without waiting for purge operation to complete. So if we don't provide
> > Future() to user, we should probably allow user to specify client-side
> > timeout in the API.
>
> Keep in mind "async on the client" (futures) is different than "async on
> the server" (setting very short or 0 timeouts in the RPC).  We could
> have a future-based API and still have long server timeouts.  It just
> means that the futures will wait longer before returning failure or
> success.  I agree that it would be nice to be able to set per-operation
> timeouts, though.
>
> cheers,
> Colin
>
> >
> >
> > >
> > > >
> > > > - Are we going to have this class initiate its own thread, as we do
> with
> > > > Sender class, to send/receive requests? If yes, it will be useful to
> have
> > > > have a class that extends AbstractConfig and specifies config and
> their
> > > > default values.
> > >
> > > Yes, I agree.  I will add this to the KIP.
> > >
> > > best,
> > > Colin
> > >
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > > >
> > > >
> > > > On Thu, Feb 2, 2017 at 3:02 PM, Ismael Juma <ism...@juma.me.uk>
> wrote:
> > > >
> > > > > Hi Colin,
> > > > >
> > > > > Thanks for the KIP, great to make progress on this. I have some
> initial
> > > > > comments, will post more later:
> > > > >
> > > > > 1. We have KafkaProducer that implements the Producer interface and
> > > > > KafkaConsumer that implements the Consumer interface. Maybe we
> could
> > > have
> > > > > KafkaAdminClient that implements the AdminClient interface? Or
> maybe
> > > just
> > > > > KafkaAdmin. Not sure, some ideas for consideration. Also, I don't
> > > think we
> > > > > should worry about a name clash with the internal AdminClient
> written
> > > in
> > > > > Scala. That will go away soon enough and choosing a good name for
> the
> > > > > public class is what matters.
> > > > >
> > > > > 2. We should include the proposed package name in the KIP
> > > > > (probably org.apache.kafka.clients.admin?).
> > > > >
> > > > > 3. It would be good to list the supported configs.
> > > > >
> > > > > 4. KIP-107, which passed the vote, specifies the introduction of a
> > > method
> > > > > to AdminClient with the following signature. We should figure out
> how
> > > it
> > > > > would look given this proposal.
> > > > >
> > > > > Future<Map<TopicPartition, PurgeDataResult>>
> > > > > purgeDataBefore(Map<TopicPartition, Long> offsetForPartition)
> > > > >
> > > > > 5. I am not sure about rejecting the Futures-based API. I think I
> would
> > > > > prefer that, personally. Grant had an interesting idea of not
> exposing
> > > the
> > > > > batch methods in the AdminClient to start with to keep it simple
> and
> > > > > relying on a Future based API to make it easy for users to run
> things
> > > > > concurrently. This is consistent with the producer and Java 8 makes
> > > things
> > > > > a lot nicer with CompletableFuture (similar to Scala Futures). I
> will
> > > think
> > > > > more about this and other details of the proposal and send a
> follow-up.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Thu, Feb 2, 2017 at 6:54 PM, Colin McCabe <cmcc...@apache.org>
> > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I wrote up a Kafka improvement proposal for adding an
> > > > > > AdministrativeClient interface.  This is a continuation of the
> work
> > > on
> > > > > > KIP-4 towards centralized administrative operations.  Please
> check
> > > out
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-117%
> > > > > 3A+Add+a+public+
> > > > > > AdministrativeClient+API+for+Kafka+admin+operations
> > > > > >
> > > > > > regards,
> > > > > > Colin
> > > > > >
> > > > >
> > >
>

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

Reply via email to