Hi Noel,

Maybe you are specifying both sinkgroups and sinks.
Can you try removing the sinks?

#agent.sinks = hdfsSink-1 hdfsSink-2

Yogi

On Tue, Feb 19, 2013 at 1:32 PM, Noel Duffy <[email protected]> wrote:

> I have a Flume agent that pulls events from RabbitMQ and pushes them into
> HDFS. So far so good, but now I want to have a second Flume agent on a
> different host acting as a hot backup for the first agent such that the
> loss of the first host running Flume would not cause any events to be lost.
> In the testing I've done I've gotten two Flume agents on separate hosts to
> read the same events from the RabbitMQ queue, but it's not clear to me how
> to configure the sinks such that only one of the sinks actually does
> something and the other does nothing.
>
> From reading the documentation, I supposed that a sinkgroup configured for
> failover was what I needed, but the documentation examples only cover the
> case where the sinks in a failover group are all on the same agent on the
> same host. I've seen messages online which seem to say that sinks in a
> sinkgroup can be on different hosts, but I can find no clear explanation of
> how to configure such a sinkgroup. How would sinks on different hosts
> communicate with one another? Would the sinks in the sinkgroup have to use
> a JDBC channel? Would the sinks have to be non-terminal sinks, like Avro?
>
> In my testing I set up two agents on different hosts and configured a
> sinkgroup containing two sinks, both HDFS sinks.
>
> agent.sinkgroups = sinkgroup1
> agent.sinkgroups.sinkgroup1.sinks = hdfsSink-1 hdfsSink-2
> agent.sinkgroups.sinkgroup1.processor.priority.hdfsSink-1 = 5
> agent.sinkgroups.sinkgroup1.processor.priority.hdfsSink-2 = 10
> agent.sinkgroups.sinkgroup1.processor.type=failover
>
> agent.sinks = hdfsSink-1 hdfsSink-2
> agent.sinks.hdfsSink-1.type = hdfs
> agent.sinks.hdfsSink-1.bind = 10.20.30.81
> agent.sinks.hdfsSink-1.channel = fileChannel-1
> agent.sinks.hdfsSink-1.hdfs.path = /flume/localbrain-events
> agent.sinks.hdfsSink-1.hdfs.filePrefix = lb-events
> agent.sinks.hdfsSink-1.hdfs.round = false
> agent.sinks.hdfsSink-1.hdfs.rollCount=50
> agent.sinks.hdfsSink-1.hdfs.fileType=SequenceFile
> agent.sinks.hdfsSink-1.hdfs.writeFormat=Text
> agent.sinks.hdfsSink-1.hdfs.codeC = lzo
> agent.sinks.hdfsSink-1.hdfs.rollInterval=30
> agent.sinks.hdfsSink-1.hdfs.rollSize=0
> agent.sinks.hdfsSink-1.hdfs.batchSize=1
>
> agent.sinks.hdfsSink-2.bind = 10.20.30.119
> agent.sinks.hdfsSink-2.type = hdfs
> agent.sinks.hdfsSink-2.channel = fileChannel-1
> agent.sinks.hdfsSink-2.hdfs.path = /flume/localbrain-events
> agent.sinks.hdfsSink-2.hdfs.filePrefix = lb-events
> agent.sinks.hdfsSink-2.hdfs.round = false
> agent.sinks.hdfsSink-2.hdfs.rollCount=50
> agent.sinks.hdfsSink-2.hdfs.fileType=SequenceFile
> agent.sinks.hdfsSink-2.hdfs.writeFormat=Text
> agent.sinks.hdfsSink-2.hdfs.codeC = lzo
> agent.sinks.hdfsSink-2.hdfs.rollInterval=30
> agent.sinks.hdfsSink-2.hdfs.rollSize=0
> agent.sinks.hdfsSink-2.hdfs.batchSize=1
>
> However, this does not achieve the failover I hoped for. The sink
> hdfsSink-2 on both agents writes the events to HDFS. The agents are not
> communicating, so the binding of the sink to an ip address is not doing
> anything.
