Jun,

On that note, waiting for the consumer's acknowledgment could use a
configurable timeout, ranging from blocking indefinitely to not waiting at all.
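
To make the idea concrete, here is a rough sketch in Python of a handoff that waits for the old consumer's acknowledgment with a configurable timeout. It is purely illustrative: `PartitionHandoff` and its method names are hypothetical and not part of Kafka, and the sketch only assumes that the old consumer can signal when its in-flight events are drained.

```python
import threading
from typing import Optional


class PartitionHandoff:
    """Hypothetical helper (not a Kafka API): tracks whether the old
    owner of a partition has acknowledged giving it up."""

    def __init__(self) -> None:
        self._released = threading.Event()

    def acknowledge_release(self) -> None:
        # Called by the old consumer once it has stopped working
        # internally with events from the partition.
        self._released.set()

    def wait_for_release(self, timeout: Optional[float]) -> bool:
        # timeout=None blocks until the acknowledgment arrives;
        # timeout=0 does not wait at all and just reports the current state.
        # Returns True only if the acknowledgment has been received.
        return self._released.wait(timeout)
```

The new consumer would call `wait_for_release(timeout)` before reading from the partition; what to do when it returns False (retry, fail, or take over anyway) is one of the extreme cases that would still need to be decided.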

Thanks,

Guy



On Fri, Nov 2, 2012 at 12:38 PM, Guy Peleg <guy.pe...@gmail.com> wrote:

> Jun,
>
> I'm not sure that's enough.
>
> A callback may not be enough since we can't be sure that there are no
> events from that partition being processed while the new consumer starts
> processing events from that partition.
>
> I think that a consumer should be handed a partition only after we're sure
> there is no other consumer that is reading or *processing* events from
> that partition.
>
> The only way to achieve that is, I think, by some kind of acknowledgment
> from the consumer side that it is ready to give up the partition (e.g.
> after it has gracefully stopped working internally with those events).
>
> I know that means there is a need to consider the extreme cases here,
> but I still think that without the consumer's acknowledgment we will be
> subject to race scenarios.
>
> Thanks,
>
> Guy
>
>
>
>
> On Thu, Nov 1, 2012 at 4:56 PM, Jun Rao <jun...@gmail.com> wrote:
>
>> Guy,
>>
>> Yes, this is possible. One solution that we have been thinking about is
>> that if a rebalance happens, each consumer can somehow get a callback that
>> indicates the set of partitions being consumed may have changed. Will this
>> address your concern?
>>
>> Thanks,
>>
>> Jun
>>
>> On Thu, Nov 1, 2012 at 12:10 AM, Guy Peleg <guy.pe...@gmail.com> wrote:
>>
>> > One more possible race might happen when the number of partitions is
>> > fixed but consumers are added or removed.
>> >
>> > For example: if I have a consumer reading data from two partitions
>> > (partition one and partition two) and a new consumer is added, the
>> > result will be that each consumer consumes from one partition. Let's say
>> > that the 'old' consumer continues with partition one while the new
>> > consumer processes the data from partition two.
>> >
>> > But suppose that partition two held events that belong to event id 'x',
>> > and that partition is now consumed by the new consumer. Since consumers
>> > might reside on different machines and are possibly multithreaded
>> > processes, there might be a situation where other events with id 'x' are
>> > already in the internal queues of the first consumer (events that were
>> > read by the first consumer before the new consumer appeared, but are
>> > still being processed or waiting to be processed there). That means
>> > those events may be processed simultaneously by the two consumers, since
>> > the new consumer will start reading events that might also have id 'x',
>> > and those might then be processed in parallel with the events with id
>> > 'x' in the old consumer.
>> >
>> > If that is a possible scenario, then when a new consumer starts there
>> > should be some kind of 'consumer sync'.
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Oct 31, 2012 at 4:57 PM, Jun Rao <jun...@gmail.com> wrote:
>> >
>> > > Guy,
>> > >
>> > > This is really an issue with changing the # of partitions. If the # of
>> > > partitions changes for a topic, then in the transition phase, messages
>> > > that used to be delivered to the same partition could be delivered to
>> > > different partitions, and their consumption ordering is
>> > > non-deterministic (since ordered consumption is only guaranteed within
>> > > a partition).
>> > >
>> > > In 0.7, the # of partitions increases as new brokers are added. In 0.8,
>> > > the # of partitions is set at topic creation time and will stay the
>> > > same when new brokers are added.
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > > On Wed, Oct 31, 2012 at 4:12 AM, Guy Peleg <guy.pe...@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > As I learn and plan to use Kafka, I'm concerned about a possible race
>> > > > condition when brokers/consumers are added or removed.
>> > > >
>> > > > Say I have a topic that is divided into two partitions, where the
>> > > > messages are divided between those two partitions by, say, event-id
>> > > > modulo the number of partitions, and where events with the same
>> > > > event id should be processed in the order of their arrival. That will
>> > > > work since, as I said, I will divide the incoming events by their
>> > > > event-id % number_of_partitions.
>> > > >
>> > > > Now, when a new partition is added, there might be situations where
>> > > > events with event-id 'x' will still be in the first broker, while new
>> > > > ones with event-id 'x' are added to the new partition, which may
>> > > > result in those events being processed in parallel. What am I missing?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Guy
>> > > >
>> > >
>> >
>>
>
>
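
For reference, the remapping behind the first message in this thread can be checked with a few lines of Python. This is only an illustrative sketch: `partition_for` and `remapped_ids` are made-up helper names, not Kafka APIs, and the only assumption is the event-id % number_of_partitions rule described in that message.

```python
from typing import Iterable, List


def partition_for(event_id: int, num_partitions: int) -> int:
    # The scheme from the original question: event-id % number_of_partitions.
    return event_id % num_partitions


def remapped_ids(event_ids: Iterable[int], old_count: int, new_count: int) -> List[int]:
    # Event ids whose target partition changes when the partition count
    # goes from old_count to new_count.
    return [
        e for e in event_ids
        if partition_for(e, old_count) != partition_for(e, new_count)
    ]


if __name__ == "__main__":
    # Going from two partitions to three moves most event ids.
    print(remapped_ids(range(6), 2, 3))  # prints [2, 3, 4, 5]
```

Any event id in that list can have older events still sitting in its previous partition while newer ones land in a different one, which is the ordering problem discussed above.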
