Storm processes many records from a partition at once. If a record fails, all tuples from that offset onwards have to be retried. So in your case, if a record before that one failed, it's possible that a subsequent record that had already been processed successfully was retried.
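To make the replay behavior concrete, here is a minimal sketch (plain Python, not actual Storm or KafkaSpout code) of how rewinding to a failed offset re-delivers records that had already succeeded; the function name and message values are illustrative only:

```python
def deliveries_with_replay(records, fail_offset):
    """Simulate at-least-once delivery: the spout emits every record once,
    then the tuple at fail_offset is reported as failed, so the spout
    rewinds to that offset and re-emits everything from there onward."""
    first_pass = records[:]            # all records emitted on the first pass
    replay = records[fail_offset:]     # rewind to the failed offset and re-emit
    return first_pass + replay

msgs = ["a", "b", "c", "d"]
# Suppose the tuple at offset 1 ("b") fails after "c" and "d" were
# already processed successfully: "c" and "d" are delivered again.
print(deliveries_with_replay(msgs, 1))
# → ['a', 'b', 'c', 'd', 'b', 'c', 'd']
```

Duplicates like the second "c" and "d" here are exactly the at-least-once behavior described above: downstream code has to either tolerate reprocessing or deduplicate.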
This is true for Storm's spout and bolt API, which can only guarantee at-least-once semantics under failure scenarios. If you require exactly-once semantics, I would recommend looking into Trident.

On Fri, Sep 19, 2014 at 7:03 AM, Kushan Maskey <[email protected]> wrote:
> Is it possible that KafkaSpout would process the same record twice? I had
> the impression that once a record is read and processed successfully by
> Storm, it would never go back and read the same message at the same offset
> again. I found an instance where a record was read and stored twice,
> exactly the same; even the created timestamp is the same.
>
> Kushan Maskey

--
Twitter: @nathanmarz
http://nathanmarz.com
