Hi,

In the absence of failures, with a single source, single channel and single sink, you will see ordering preserved. However, I believe that when there is a failure, the file channel can change the ordering on rollback.
If strict ordering is required, it's advisable to assign sequence numbers upstream and then re-order the data with either an MR job or an Impala query once it lands in HDFS.

Brock

On Wed, Feb 12, 2014 at 12:02 AM, Christopher Shannon <[email protected]> wrote:

> Interesting question.
>
> I can't answer it, but I would like to know what strategies others have
> pursued if they have had a need to order their data after it gets to the
> end of the Flume pipeline.
>
> - C.
>
>
> On Tue, Feb 11, 2014 at 11:52 PM, Chris Schneider <
> [email protected]> wrote:
>
>> I've seen a fair number of resources on the web that describe the loose
>> ordering guarantees that flume offers for messages in the face of
>> degradation or failures. But I can't tell what applies to flume-og, and
>> flume-ng. Hopefully somebody can help clear up the situation.
>>
>> In the case of a single agent topology, (source -> FileSystem Channel ->
>> sink), can messages become out of order? What situations cause that?
>>
>> In a multi agent topology, does that answer change?
>>
>> (Agent 1 Source -> FilesystemChannel -> Avro To Collector)
>> (Agent 2 Source -> FilesystemChannel -> Avro To Collector)
>> (Collector Avro from agents -> FilesystemChannel -> final Sink)
>>
>> And perhaps in an even more complicated setup, with multiple collectors,
>> does that answer change further?

--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
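
P.S. A rough sketch of the sequence-number idea, in case it helps. This is plain Python, not Flume code: `stamp`, `reorder` and the `source`/`seq` keys are names I made up for illustration, and the shuffle only stands in for whatever reordering a rollback and retry might cause. The real version would stamp events before they enter the first agent and do the sort in the MR job or query.

```python
import itertools
import random


def stamp(source_id, bodies):
    """Attach a per-source sequence number to each event before it enters the pipeline.

    The 'source' and 'seq' keys are hypothetical header names, not a Flume convention.
    """
    counter = itertools.count()
    return [{"source": source_id, "seq": next(counter), "body": b} for b in bodies]


def reorder(events):
    """Restore per-source order after delivery by sorting on (source, seq)."""
    return sorted(events, key=lambda e: (e["source"], e["seq"]))


# Two upstream sources stamp their own events independently.
delivered = stamp("agent1", ["a", "b", "c"]) + stamp("agent2", ["x", "y"])

# Stand-in for out-of-order delivery at the end of the pipeline.
random.Random(0).shuffle(delivered)

# Downstream step: sort once the data has landed.
restored = reorder(delivered)
```

Note this only recovers order within each source. If you need a total order across agents, the sequence numbers would have to come from something shared upstream, which is a separate problem.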
