Then, you may be interested in our consume client redesign:

https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design

Thanks,

Jun

On Fri, Nov 2, 2012 at 7:17 AM, Guy Peleg <guy.pe...@gmail.com> wrote:

> Jun,
>
> On that note, waiting to consumer's acknowledgment can be with configurable
> timeout ranging from blocking to not wait at all
>
> Thanks,
>
> Guy
>
>
>
> On Fri, Nov 2, 2012 at 12:38 PM, Guy Peleg <guy.pe...@gmail.com> wrote:
>
> > Jun,
> >
> > I'm not sure that's enough.
> >
> > A callback may not be enough since we can't be sure that there are no
> > events from that partition being processed while the new consumer starts
> > processing events from that partition.
> >
> > I think that a consumer should be handed a partition only after we're
> sure
> > there is no other consumer that is reading or *processing *events from
> > that partition.
> >
> > The only way to achieve that is, I think, by some kind of acknowledgment
> > from the consumer side that it is ready to give up the partition (e.g.
> > after gracefully stopped working internally with those events)
> >
> > I know that means that there is a need to consider the extreme cases
> here,
> > but still I think we can't do without the consumer's 'acknoledment'
> without
> > being subject to race scenarios.
> >
> > Thanks,
> >
> > Guy
> >
> >
> >
> >
> > On Thu, Nov 1, 2012 at 4:56 PM, Jun Rao <jun...@gmail.com> wrote:
> >
> >> Guy,
> >>
> >> Yes, this is possible. One solution that we have been thinking about is
> >> that if a rebalance happens, each consumer can somehow get a callback
> that
> >> indicates the set of partitions being consumed may have changed. Will
> this
> >> address your concern?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Thu, Nov 1, 2012 at 12:10 AM, Guy Peleg <guy.pe...@gmail.com> wrote:
> >>
> >> > One more possible race might happen when the partition number is fixed
> >> but
> >> > consumer(s) are added/removed
> >> > For example: If I have a consumer reading data from two partitions
> >> > (partition one and partition two), and a new consumer is added, the
> >> result
> >> > will be that each consumer will consume from one partition
> >> > let's say that the 'old' consumer will continue with partition one
> while
> >> > the new consumer will process the data from partition two
> >> >
> >> > but, suppose that partition two held events that belong to event id
> 'x',
> >> > and that partition is now consumed by the new consumer,
> >> > Since consumers might reside on different machines and they are
> possibly
> >> > multithreaded processes, there might be a situation that other event
> ids
> >> > 'x' are already 'in the internal queues' and are being processed
> >> > by the first consumer (events that were read/entered the first
> consumer
> >> > before the new consumer appeared but are being processed or wait to
> >> > processed within the 'old' consumer) and that means that there is a
> >> > possibility that those events are being processed simultaneously by
> the
> >> two
> >> > consumers (since the new consumer will start reading events that might
> >> be
> >> > of id 'x' and that might be then processed in parallel with event ids
> >> 'x'
> >> > in the old consumer)
> >> >
> >> > If that is a possible scenario then when a new consumer is starting
> >> there
> >> > should be some kind of 'consumers sync'
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Oct 31, 2012 at 4:57 PM, Jun Rao <jun...@gmail.com> wrote:
> >> >
> >> > > Guy,
> >> > >
> >> > > This is really an issue with changing # of partitions. If # of
> >> partitions
> >> > > changes for a topic, in the transition phase, messages used to be
> >> > delivered
> >> > > to the same partition could be delivered to different partitions and
> >> > their
> >> > > consumption ordering is non-deterministic (since ordered consumption
> >> is
> >> > > only guaranteed within a partition).
> >> > >
> >> > > In 0.7, # of partitions increases as new brokers are added. In 0.8,
> #
> >> of
> >> > > partitions is set at topic creation time and will stay the same when
> >> new
> >> > > brokers are added.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Jun
> >> > >
> >> > > On Wed, Oct 31, 2012 at 4:12 AM, Guy Peleg <guy.pe...@gmail.com>
> >> wrote:
> >> > >
> >> > > > Hi,
> >> > > >
> >> > > > As I learn and plan to use Kafka, I'm concirned about possible
> race
> >> > > > condition when brokers/consumers are added or removed.
> >> > > >
> >> > > > Say I have a topic that is devide into two partitions, where
> >> consumers
> >> > > are
> >> > > > deviding the mssages between those two partitions by ,say, modulo
> >> > > event-id,
> >> > > > where events with the same event ids should be processed by the
> >> order
> >> > of
> >> > > > their arrival, that will work since as I said, I will devide the
> >> > incoming
> >> > > > events by their event-id % number_of_partitions
> >> > > >
> >> > > > Now, when a new paratition is added, there might be situations
> where
> >> > > events
> >> > > > with event-id 'x', will still be in the first broker, while new
> >> ones,
> >> > > with
> >> > > > event-id 'x', are added to the new paratition
> >> > > > which may result in those events being processed in parallel, what
> >> am i
> >> > > > missing?
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > Guy
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

Reply via email to