Hi All,

My query is about Flume's failover sink processor. I have implemented a 
failover data flow to transfer events downstream in case of agent failure.

I have configured Flume's sinkgroups for sink failover: if sink1 fails, sink2 
takes charge of storing data into HDFS.
I tested the sinkgroups' failover functionality successfully. When sink1 
fails, sink2 takes over and transfers the data to the destination.
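
For illustration, the failover sink group is set up roughly like this (a 
minimal sketch; the agent and component names are placeholders, and the 
priority and maxpenalty values are just examples):

    agent1.sinkgroups = sg1
    agent1.sinkgroups.sg1.sinks = sink1 sink2
    agent1.sinkgroups.sg1.processor.type = failover
    # The higher-priority sink is used first; sink2 takes over when sink1 fails.
    agent1.sinkgroups.sg1.processor.priority.sink1 = 10
    agent1.sinkgroups.sg1.processor.priority.sink2 = 5
    # Maximum backoff (in ms) before a failed sink is retried.
    agent1.sinkgroups.sg1.processor.maxpenalty = 10000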

Here I am transferring events by tailing a file that contains 100 records. I 
shut down sink1 after it had processed 40 records, after which all data was 
transferred by sink2. However, the file saved in HDFS contains 140 records. It 
seems that both downstream agents hold the events, so if one fails, the other 
starts transferring the events from the beginning, as it holds all the data in 
its own channel.
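
For context, the tailing side of the test is along these lines (again a 
sketch; the exec source shown is one common way to tail a file, and the path 
is a placeholder):

    agent1.sources = src1
    agent1.sources.src1.type = exec
    # Tail the test file that contains the 100 records.
    agent1.sources.src1.command = tail -F /path/to/test-file.log
    agent1.sources.src1.channels = ch1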

Is this the intended behavior of Flume's sinkgroups in case of failover? How 
can I avoid the duplicate events? Is there any solution as part of Flume's 
sink processor to deal with duplicates?

Thanks & Regards,
Ashutosh



