Storm processes many records from a partition at once. If a record fails,
all tuples from that record's offset onward are retried. So in your
case, if a record before the duplicated one failed, it's possible that a
subsequent record that had already been processed successfully was replayed.
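To make the replay behavior concrete, here is a minimal sketch (plain Python, not actual Storm or KafkaSpout code; the function name and structure are illustrative) of a batch where one tuple fails and the spout rewinds to the failed offset, re-emitting records that had already succeeded:

```python
def replay_demo(records, failed_offset):
    """Simulate one batch from a partition: every record except the one at
    failed_offset succeeds on the first pass; the spout then rewinds to the
    failed offset and replays everything from there onward."""
    attempts = []  # every (offset, record) processing attempt, duplicates included
    # First pass: all tuples succeed except the one that failed.
    for off, rec in enumerate(records):
        if off == failed_offset:
            continue  # this tuple failed and was not acked
        attempts.append((off, rec))
    # Retry pass: replay from the failed offset onward.
    for off in range(failed_offset, len(records)):
        attempts.append((off, records[off]))
    return attempts

attempts = replay_demo(["a", "b", "c", "d"], failed_offset=1)
offsets_seen = [off for off, _ in attempts]
# Offsets 2 and 3 were processed successfully before the failure,
# yet they appear a second time after the rewind to offset 1.
```

Records at offsets greater than the failed one end up processed twice, which matches the duplicate you observed.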

This is true of Storm's spout/bolt API, which can only guarantee
at-least-once semantics under failure scenarios. If you require
exactly-once semantics, I would recommend looking into Trident.
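Short of Trident, one common workaround under at-least-once delivery is to make the downstream write idempotent, keyed on the tuple's (topic, partition, offset), so a replayed tuple overwrites the same row instead of creating a duplicate. A sketch (the `idempotent_store` helper and dict-backed store are hypothetical, not a Storm or Trident API):

```python
def idempotent_store(store, topic, partition, offset, value):
    """Write keyed on (topic, partition, offset): replaying the same tuple
    overwrites the existing entry rather than inserting a duplicate."""
    store[(topic, partition, offset)] = value

store = {}
idempotent_store(store, "orders", 0, 2, "record-a")
idempotent_store(store, "orders", 0, 2, "record-a")  # replayed tuple, same key
# The store still holds a single entry for that offset.
```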

On Fri, Sep 19, 2014 at 7:03 AM, Kushan Maskey <
[email protected]> wrote:

> Is it possible that KafkaSpout would process the same record twice? I had
> the impression that once a record is read and processed successfully by
> Storm, it will never go back and read the same message at the same offset
> again. I found an instance where a record was read and stored twice with
> exactly the same data; even the created timestamp is the same.
>
> Kushan Maskey




-- 
Twitter: @nathanmarz
http://nathanmarz.com
