There's nothing built into Flume to deal with duplicates; it only provides at-least-once delivery semantics.
You'll have to handle it in your data processing applications or add an ETL step to deal with duplicates before making data available for other queries.

-Joey

On Wed, Dec 3, 2014 at 5:46 AM, Guillermo Ortiz <[email protected]> wrote:
> Hi,
>
> I would like to know if there's an easy way to deal with data
> duplication when an agent crashes and it resends the same data again.
>
> Is there any mechanism to deal with it in Flume?

-- Joey Echeverria
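
For illustration, here is a minimal sketch of what a dedup step in a downstream processing job could look like. It is not part of Flume. It assumes every event was given a unique identifier before it entered Flume, stored here under a hypothetical `event_id` field, so a resent copy carries the same identifier as the original.

```python
from typing import Dict, Iterable, Iterator, Set


def deduplicate(
    events: Iterable[Dict[str, str]], key: str = "event_id"
) -> Iterator[Dict[str, str]]:
    """Yield each event once, dropping later events whose key was already seen.

    `event_id` is a hypothetical field name; use whatever unique
    identifier your events actually carry.
    """
    seen: Set[str] = set()
    for event in events:
        event_key = event[key]
        if event_key in seen:
            # Same identifier seen before: treat it as a resent duplicate.
            continue
        seen.add(event_key)
        yield event


# Example batch in which the event with id "b" was delivered twice.
batch = [
    {"event_id": "a", "body": "first"},
    {"event_id": "b", "body": "second"},
    {"event_id": "b", "body": "second"},
    {"event_id": "c", "body": "third"},
]
unique = list(deduplicate(batch))
```

The in-memory set only covers a single batch. For a long-running job you would likely need to keep the seen identifiers in a persistent store, or make the final write idempotent by keying it on the identifier.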
