On Thu, Feb 2, 2017, at 15:02, Ismael Juma wrote:
> Hi Colin,
>
> Thanks for the KIP, great to make progress on this. I have some initial
> comments, will post more later:
>
> 1. We have KafkaProducer that implements the Producer interface and
> KafkaConsumer that implements the Consumer interface. Maybe we could have
> KafkaAdminClient that implements the AdminClient interface? Or maybe just
> KafkaAdmin. Not sure, some ideas for consideration. Also, I don't think we
> should worry about a name clash with the internal AdminClient written in
> Scala. That will go away soon enough and choosing a good name for the
> public class is what matters.

Hi Ismael,

Thanks for taking a look.

I guess my thought process was that users might find it confusing if the
public API and the old private API had the same name. "What do you mean,
I have to upgrade to release X to get AdminClient, I have it right
here?" I do have a slight preference for the shorter name, though, so if
this isn't a worry, we can change it to AdminClient.

>
> 2. We should include the proposed package name in the KIP
> (probably org.apache.kafka.clients.admin?).

Good idea. I will add the package name to the KIP.

>
> 3. It would be good to list the supported configs.

OK

>
> 4. KIP-107, which passed the vote, specifies the introduction of a method
> to AdminClient with the following signature. We should figure out how it
> would look given this proposal.
>
> Future<Map<TopicPartition, PurgeDataResult>>
> purgeDataBefore(Map<TopicPartition, Long> offsetForPartition)
>
> 5. I am not sure about rejecting the Futures-based API. I think I would
> prefer that, personally. Grant had an interesting idea of not exposing the
> batch methods in the AdminClient to start with to keep it simple and
> relying on a Future based API to make it easy for users to run things
> concurrently. This is consistent with the producer...

So, there are two ways that an operation can be "async" here which are
very separate.

There is "async on the server." This basically means that we tell the
server to do something and don't wait for a confirmation that it
succeeded. For example, in the current proposal, users can call
createTopic(new Topic(...), CreateTopicFlags.NONBLOCKING). The call will
wait for the server to get the request, which will go into purgatory.
Later on, the request may succeed or fail, but the client won't know
either way. In RPC terms, this means we set the timeout value to 0.

Then there is "async on the client." This just means that the client
thread doesn't block-- instead, it gets back a Future (or similar
object).
What this boils down to in terms of implementation is that a message
gets put on some queue somewhere and the client thread continues
running.

"async on the client" tends to be good when you want to churn out a ton
of requests without using lots of threads. However, it is a more
confusing mental model for most programmers.

You can easily translate a Futures-based API into a blocking API by
having blocking shims that just create the Future and call get().
Similarly, you can transform a blocking API into a Futures-based API by
using a thread pool. Thread pools use resources, though, whereas having
function shims does not.

I haven't seen any discussion here about what we gain by using a
Futures-based API. It makes sense to use Futures in the Producer, since
they're more flexible, and users are potentially creating lots and lots
of messages. I'm not sure if users would want to do lots and lots of
admin operations with a single thread.

I'd be curious to hear a little more from potential end-users about what
API would be most flexible and usable for them. I'm open to ideas.

best,
Colin

>
> Ismael
>
> On Thu, Feb 2, 2017 at 6:54 PM, Colin McCabe <cmcc...@apache.org> wrote:
>
> > Hi all,
> >
> > I wrote up a Kafka improvement proposal for adding an
> > AdministrativeClient interface. This is a continuation of the work on
> > KIP-4 towards centralized administrative operations. Please check out
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-117%3A+Add+a+public+AdministrativeClient+API+for+Kafka+admin+operations
> >
> > regards,
> > Colin
> >
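
For reference, here is a minimal sketch of the two translations discussed in the reply above: a blocking shim over a Futures-based API, and a Futures-based wrapper over a blocking API using a thread pool. All names here (AdminApiSketch, AsyncAdmin, BlockingAdmin, and a createTopic that returns a String) are hypothetical and chosen only for illustration; this is not the API proposed in the KIP.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical names throughout; none of this is the KIP's actual API.
class AdminApiSketch {

    /** Hypothetical Futures-based ("async on the client") admin interface. */
    interface AsyncAdmin {
        Future<String> createTopic(String name);
    }

    /** Hypothetical blocking admin interface. */
    interface BlockingAdmin {
        String createTopic(String name) throws Exception;
    }

    /** Futures -> blocking: a shim that creates the Future and calls get(). */
    static BlockingAdmin blockingShim(AsyncAdmin async) {
        return name -> async.createTopic(name).get();
    }

    /**
     * Blocking -> Futures: hand each call to a thread pool.
     * Every in-flight call occupies a pool thread until it returns,
     * which is the resource cost the shim direction does not have.
     */
    static AsyncAdmin futuresViaPool(BlockingAdmin blocking, ExecutorService pool) {
        return name -> {
            Callable<String> task = () -> blocking.createTopic(name);
            return pool.submit(task);
        };
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real client: completes immediately, no broker involved.
        AsyncAdmin fakeAsync =
            name -> CompletableFuture.completedFuture("created " + name);

        // Futures-based API used through a blocking shim.
        BlockingAdmin blocking = blockingShim(fakeAsync);
        System.out.println(blocking.createTopic("foo"));

        // Blocking API exposed as Futures via a thread pool.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            AsyncAdmin async = futuresViaPool(blocking, pool);
            Future<String> result = async.createTopic("bar");
            System.out.println(result.get());
        } finally {
            pool.shutdown();
        }
    }
}
```

The shim direction adds only a function call per operation, while the thread-pool direction needs threads sized for the number of concurrent calls.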