Hi All,

Is there any documentation on the circumstances under which Flume NG will 
either drop events or send events twice, resulting in duplicates?

I can reproduce both situations in a test setup under high contention, using 
agent1[syslog source --> file channel --> avro sink] --> 
agent2[avro source --> file channel --> hdfs sink]. Events are dropped when I 
use the default values for the timeouts on the file channels and let agent1 
become unavailable for some period of time (causing rsyslog to build up a 
queue). The same situation with higher timeouts leads to a number of duplicate 
events (about 500 after 2.5M events).
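
For reference, a minimal sketch of that topology as Flume NG properties. The
component names, hosts, ports, and paths are illustrative placeholders, not the
exact configuration used in the test, and keep-alive is assumed to be the file
channel timeout in question:

```properties
# agent1: syslog source --> file channel --> avro sink
agent1.sources = syslog
agent1.channels = fc
agent1.sinks = avro

agent1.sources.syslog.type = syslogtcp
agent1.sources.syslog.host = 0.0.0.0
# placeholder port that rsyslog forwards to
agent1.sources.syslog.port = 5140
agent1.sources.syslog.channels = fc

agent1.channels.fc.type = file
# placeholder directories
agent1.channels.fc.checkpointDir = /var/flume/agent1/checkpoint
agent1.channels.fc.dataDirs = /var/flume/agent1/data
# assumed timeout knob: seconds to wait for a put/take; value is illustrative
agent1.channels.fc.keep-alive = 30

agent1.sinks.avro.type = avro
# placeholder host and port of agent2
agent1.sinks.avro.hostname = agent2.example.com
agent1.sinks.avro.port = 4141
agent1.sinks.avro.channel = fc

# agent2: avro source --> file channel --> hdfs sink
agent2.sources = avro
agent2.channels = fc
agent2.sinks = hdfs

agent2.sources.avro.type = avro
agent2.sources.avro.bind = 0.0.0.0
agent2.sources.avro.port = 4141
agent2.sources.avro.channels = fc

agent2.channels.fc.type = file
agent2.channels.fc.checkpointDir = /var/flume/agent2/checkpoint
agent2.channels.fc.dataDirs = /var/flume/agent2/data
agent2.channels.fc.keep-alive = 30

agent2.sinks.hdfs.type = hdfs
# placeholder HDFS path
agent2.sinks.hdfs.hdfs.path = hdfs://namenode.example.com/flume/events
agent2.sinks.hdfs.channel = fc
```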

(BTW: is there an official ascii art notation for flume setups?)


Thanks for any pointers,
Friso
