Guozhang,

Perhaps we can discuss this in our KIP hangout next week?

Thanks,

Jun

On Tue, Jun 9, 2015 at 1:12 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> This email is to kick-off some discussion around the changes we want to
> make on the new consumer APIs as well as their semantics. Here are a
> not-comprehensive list of items in my mind:
>
> 1. Poll(timeout): current definition of timeout states "The time, in
> milliseconds, spent waiting in poll if data is not available. If 0, waits
> indefinitely." While in the current implementation, we have different
> semantics as stated, for example:
>
> a) poll(timeout) can return before "timeout" elapsed with empty consumed
> data.
> b) poll(timeout) can return after more than "timeout" elapsed due to
> blocking event like join-group, coordinator discovery, etc.
>
> We should think a bit more on what semantics we really want to provide and
> how to provide it in implementation.
>
> 2. Thread safeness: currently we have a coarsen-grained locking mechanism
> that provides thread safeness but blocks commit / position / etc calls
> while poll() is in process. We are considering to remove the
> coarsen-grained locking with an additional Consumer.wakeup() call to break
> the polling, and instead suggest users to have one consumer client per
> thread, which aligns with the design of a single-threaded consumer
> (KAFKA-2123).
>
> 3. Commit(): we want to improve the async commit calls to add a callback
> handler upon commit completes, and guarantee ordering of commit calls with
> retry policies (KAFKA-2168). In addition, we want to extend the API to
> expose attaching / fetching offset metadata stored in the Kafka offset
> manager.
>
> 4. OffsetFetchRequest: currently for handling OffsetCommitRequest we check
> the generation id and the assigned partitions before accepting the request
> if the group is using Kafka for partition management, but for
> OffsetFetchRequest we cannot do this checking since it does not include
> groupId / consumerId information. Do people think this is OK or we should
> add this as we did in OffsetCommitRequest?
>
> 5. New APIs: there are some other requests to add:
>
> a) offsetsBeforeTime(timestamp): or would seekToEnd and seekToBeginning
> sufficient?
>
> b) listTopics(): or should we just enforce users to use AdminUtils for such
> operations?
>
> There may be other issues that I have missed here, so folks just bring it
> up if you thought about anything else.
>
> -- Guozhang
>

Reply via email to