Evan,

Please look at autocommit.enable at
http://incubator.apache.org/kafka/configuration.html
If it is false, you can control the offset storage via the commitOffsets
API call.
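
For concreteness, a consumer config with autocommitting disabled might look like the fragment below. This is a sketch: apart from autocommit.enable (from the configuration page above), the property names are the usual 0.7-era ones and the group/ZooKeeper values are placeholders.

```properties
# Hypothetical consumer.properties fragment -- values are placeholders
groupid=my-consumer-group
zk.connect=localhost:2181
# Disable periodic offset commits so the app controls them via commitOffsets()
autocommit.enable=false
```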

>> So, commit the offset when you have an ack, however that is defined;
>> rollback to an earlier offset when you don't get acks,
>> and de-dup as necessary.

Sounds like you can use commitOffsets() right after getting an ack.
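
To make the pattern concrete, here is a tiny self-contained sketch of "commit after ack, rollback otherwise, de-dup on replay". This is plain Python, not Kafka's API: `FakeConsumer`, `deliver`, and the ack callback are hypothetical stand-ins for a real consumer and downstream acknowledgement.

```python
# Sketch of "exactly once == at least once + de-duplication".
# FakeConsumer stands in for a real Kafka consumer; commit()/rollback()
# mimic checkpointing an offset after an ack and rewinding after a failure.

class FakeConsumer:
    def __init__(self, log):
        self.log = log            # the partition's message log
        self.committed = 0        # last committed offset
        self.position = 0         # current fetch position

    def fetch(self):
        msg = self.log[self.position]
        self.position += 1
        return msg

    def commit(self):
        # Like commitOffsets(): checkpoint progress up to the current position.
        self.committed = self.position

    def rollback(self):
        # Rewind to the last checkpoint; un-acked messages get re-fetched.
        self.position = self.committed


def deliver(consumer, n, ack, seen):
    """Attempt n fetches; commit only on ack, de-dup replays via `seen`."""
    out = []
    for _ in range(n):
        msg = consumer.fetch()
        if msg in seen:           # duplicate from an earlier replay: drop it
            consumer.commit()
            continue
        if ack(msg):
            out.append(msg)
            seen.add(msg)
            consumer.commit()     # ack received: safe to advance the offset
        else:
            consumer.rollback()   # no ack: rewind; msg will be re-fetched
    return out
```

Committing after every message is expensive against ZooKeeper (as discussed below); batching the commits trades a larger duplicate window for fewer writes, with the `seen` check absorbing the replays.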

Thanks,
Neha

On Thu, Dec 8, 2011 at 12:44 PM, Evan Chan <e...@ooyala.com> wrote:

> What you mean is that we need to modify (have our own modified copy of) the
> high level consumer (specifically the ConsumerConnector) so that instead of
> it automatically calling commitOffsets(), we can call commitOffsets() at
> our own discretion, when we know that the messages have gotten to their
> destination.
>
> I am planning to do this BTW for a similar use case.
> Exactly once == at least once + de-duplication.
> So, commit the offset when you have an ack, however that is defined;
> Rollback to an earlier offset when you don't get acks,
> and de-dup as necessary.
>
> -Evan
>
>
> On Thu, Dec 8, 2011 at 10:03 AM, Jun Rao <jun...@gmail.com> wrote:
>
> > Neha is right. It's possible to achieve exactly-once delivery even in the
> > high level consumer. What you have to do is make sure all consumed
> > messages are really consumed and then call commitOffsets(). When you call
> > commitOffsets(), all messages returned to the apps should have been fully
> > consumed or put in a safe place.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Dec 8, 2011 at 9:52 AM, Neha Narkhede <neha.narkh...@gmail.com>
> > wrote:
> >
> > > Mark,
> > >
> > > >> Is that correct? Did you mean SimpleConsumer or HighLevelConsumer?
> > > >> What are the differences?
> > >
> > > The high level consumer checkpoints the offsets in zookeeper, either
> > > periodically or based on an API call (look at commitOffsets()).
> > >
> > > If you want to checkpoint each and every message offset, exactly-once
> > > semantics will be expensive. But if you are willing to tolerate a small
> > > window of duplicates, you could buffer and write the offsets in batches.
> > > If you choose the former, the commitOffsets() approach is expensive,
> > > since that can lead to too many writes on zookeeper. If you choose the
> > > latter, it could be fine, and you can use the high level consumer itself.
> > >
> > > On the other hand, if your consumer is writing the messages to some
> > > database or persistent storage, you might be better off using
> > > SimpleConsumer. There was another discussion about making the offset
> > > storage of the high level consumer pluggable, but we don't have that
> > > feature yet.
> > >
> > > Thanks,
> > > Neha
> > >
> > >
> > > On Thu, Dec 8, 2011 at 9:32 AM, Jun Rao <jun...@gmail.com> wrote:
> > >
> > > > Currently, the high level consumer (with ZK integration) doesn't
> > > > expose offsets to the consumer. Only SimpleConsumer does.
> > > >
> > > > Jun
> > > >
> > > > On Thu, Dec 8, 2011 at 9:15 AM, Mark <static.void....@gmail.com>
> > > > wrote:
> > > >
> > > > > "This is only possible through SimpleConsumer right now."
> > > > >
> > > > >
> > > > > Is that correct? Did you mean SimpleConsumer or HighLevelConsumer?
> > > > > What are the differences?
> > > > >
> > > > >
> > > > > On 12/8/11 8:53 AM, Jun Rao wrote:
> > > > >
> > > > >> Mark,
> > > > >>
> > > > >> Today, this is mostly the responsibility of the consumer, by
> > > > >> managing the offsets properly. For example, if the consumer
> > > > >> periodically flushes messages to disk, it has to checkpoint to
> > > > >> disk the offset corresponding to the last flush. On failure, the
> > > > >> consumer has to rewind the consumption from the last checkpointed
> > > > >> offset. This is only possible through SimpleConsumer right now.
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> Jun
> > > > >>
> > > > >> On Thu, Dec 8, 2011 at 8:18 AM, Mark <static.void....@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>> How can one guarantee exactly-once semantics when using Kafka as
> > > > >>> a traditional queue? Is this guarantee the responsibility of the
> > > > >>> consumer?
> > > > >>>
> > > >
> > >
> >
>
>
>
> --
> *Evan Chan*
> Senior Software Engineer |
> e...@ooyala.com | (650) 996-4600
> www.ooyala.com | blog <http://www.ooyala.com/blog> |
> @ooyala<http://www.twitter.com/ooyala>
>
