1) The source is an Avro client. The events are lost. The intent of the
question is: is there a way I can provide failover for the very first Flume
source without getting duplicate events?
2) Here is the configuration for the failover sinks, along with the logs.
ag1.sources=s1
ag1.channels=c1
ag1.sinks=k1 k2
ag1.sources.s1.type=netcat
ag1.sources.s1.channels=c1
ag1.sources.s1.bind=0.0.0.0
ag1.sources.s1.port=12345
ag1.channels.c1.type=memory
ag1.sinkgroups=sg1
ag1.sinkgroups.sg1.sinks=k1 k2
ag1.sinkgroups.sg1.processor.type=failover
ag1.sinkgroups.sg1.processor.priority.k1=10
ag1.sinkgroups.sg1.processor.priority.k2=20
ag1.sinks.k1.type=logger
ag1.sinks.k1.channel=c1
ag1.sinks.k2.type=hdfs
ag1.sinks.k2.hdfs.path=flume/data/%{directory}
ag1.sinks.k2.hdfs.fileSuffix=.log
ag1.sinks.k2.hdfs.rollInterval=0
ag1.sinks.k2.hdfs.rollCount=10
ag1.sinks.k2.hdfs.rollSize=0
ag1.sinks.k2.hdfs.inUsePrefix=.
ag1.sinks.k2.hdfs.inUseSuffix=
ag1.sinks.k2.hdfs.fileType=DataStream
ag1.sinks.k2.channel=c1
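One note on this setup: c1 is a memory channel, so any events buffered in
it are lost if the agent process dies. A minimal sketch of a durable file
channel in its place (the directory paths are hypothetical):

# file channel persists events to disk across agent restarts
ag1.channels.c1.type=file
ag1.channels.c1.checkpointDir=/var/flume/checkpoint
ag1.channels.c1.dataDirs=/var/flume/data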
*Logs*
at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:202)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1243)
at org.apache.hadoop.ipc.Client.call(Client.java:1087)
... 15 more
14/02/07 15:01:40 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:41 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:42 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:42 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: DFSOutputStream is closed
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3879)
at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:117)
at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:356)
at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:353)
at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:536)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:160)
at org.apache.flume.sink.hdfs.BucketWriter.access$1000(BucketWriter.java:56)
at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:533)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
14/02/07 15:01:43 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:44 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/07 15:01:45 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)