Hi,

In the case of no failures, with a single source, a single channel, and a
single sink, you will see events in order. However, I believe that when
there is a failure, the file channel can change the ordering on rollback.

If strict ordering is required, it's advisable to assign sequence numbers
upstream and then re-order the data with either an MR job or an Impala
query once it lands in HDFS.
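
To make the idea concrete, here is a rough sketch in plain Python. It does
not use any Flume API. The names stamp() and restore_order(), and the
"source"/"seq"/"body" fields, are made up for illustration. It assumes each
source stamps its own counter and that a retry may redeliver an event, so
the re-ordering step also drops duplicates.

```python
import itertools


def stamp(source_id, bodies, start=0):
    """Attach a per-source, monotonically increasing sequence number.

    Hypothetical upstream step: in practice this would happen wherever
    the events are produced, before they enter the pipeline.
    """
    counter = itertools.count(start)
    return [
        {"source": source_id, "seq": next(counter), "body": body}
        for body in bodies
    ]


def restore_order(events):
    """Sort landed events by (source, seq) and drop redelivered duplicates."""
    seen = set()
    ordered = []
    for event in sorted(events, key=lambda e: (e["source"], e["seq"])):
        key = (event["source"], event["seq"])
        if key not in seen:
            seen.add(key)
            ordered.append(event)
    return ordered


if __name__ == "__main__":
    sent = stamp("agent1", ["a", "b", "c", "d"])
    # Simulate out-of-order arrival plus one redelivered event.
    landed = list(reversed(sent)) + [sent[1]]
    print([e["body"] for e in restore_order(landed)])
```

The same sort-by-(source, seq) logic is what the MR job or Impala query
would apply at scale. Note that this only recovers order within each
source; ordering across sources would need some other shared key.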

Brock


On Wed, Feb 12, 2014 at 12:02 AM, Christopher Shannon <[email protected]> wrote:

> Interesting question.
>
> I can't answer it, but I would like to know what strategies others have
> pursued if they have had a need to order their data after it gets to the
> end of the Flume pipeline.
>
> - C.
>
>
> On Tue, Feb 11, 2014 at 11:52 PM, Chris Schneider <
> [email protected]> wrote:
>
>> I've seen a fair number of resources on the web that describe the loose
>> ordering guarantees that flume offers for messages in the face of
>> degradation or failures.  But I can't tell what applies to flume-og, and
>> flume-ng.  Hopefully somebody can help clear up the situation.
>>
>> In the case of a single agent topology, (source -> FileSystem Channel ->
>> sink), can messages become out of order?  What situations cause that?
>>
>> In a multi agent topology, does that answer change?
>>
>> (Agent 1   Source -> FilesystemChannel -> Avro To Collector)
>> (Agent 2   Source -> FilesystemChannel -> Avro To Collector)
>> (Collector Avro from agents -> FilesystemChannel -> final Sink)
>>
>> And perhaps in an even more complicated setup, with multiple collectors,
>> does that answer change further?
>>
>>
>>
>>
>


-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
