----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3750/#review4804 -----------------------------------------------------------
Thanks for the patch Juhani, the change seems to be in the right direction. Some high-level feedback: 1. This needs to be implemented on top of FLUME-949 which deals with removing the notion of a PollableSink altogether. As a result, the SinkRunner will become a concrete implementation that can then allow different sink handling policies - such as either a failover policy (needed for this issue), or load balancing policy (not needed for this issue). Hence the policy part needs to be pluggable rather than the sink runner itself. An example of such a construct is the ChannelSelector and ChannelProcessor implementations. 2. The most important aspect of this change is the ability to configure the sink runner with relative ease using the configuration mechanism. The properties file configuration provider implementation will then have to be modified in order to wire the runner correctly. The implementation should be such that it remains intuitive to the user reading the configuration file and powerful enough to group multiple sinks with different policies within the agent. Some discussion regarding these choices is definitely in order before we implement anything. That said, I suggest we discuss the #2 item over the JIRA while waiting for FLUME-949 to be committed. Thanks - Arvind On 2012-02-03 08:53:38, Juhani Connolly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/3750/ > ----------------------------------------------------------- > > (Updated 2012-02-03 08:53:38) > > > Review request for Flume. > > > Summary > ------- > > This includes the changes from FLUME-945 and is thus subject to change with > it. I have a separate patch for the diff from 945->865 if someone wants it. > > A failover sink runner is added, along with a SinkRunnerFactory. The > configuration reading is modified so as to read the runner configuration and > pass it to the SinkRunnerFactory. > A failover sink is used by setting the runner type to failover for every > participating sink. In addition each sink assigned to a runner must set the > same runner.name and a unique runner.priority > The runner will poll only the highest priority live sink. Should it fail by > throwing EventDeliveryException, it will throw it into a pile of dead sinks. > Only once all sinks are exhausted are the dead sinks revived. > > Some issues that are outstanding > - Not all sinks throw EventDeliveryException, or have a clear time when they > should be considered "dead". Coupling such sinks with the failover runner > will not result in failover > - Sinks do not have a clear mechanism to test for their liveliness, other > than perhaps polling LifecycleState. However for most sinks lifecyclestate > tends to remain in START even if the sink is unable to do anything(e.g. avro > sink failing to connect) > > > This addresses bug FLUME-865. > https://issues.apache.org/jira/browse/FLUME-865 > > > Diffs > ----- > > flume-ng-core/src/main/java/org/apache/flume/PollableSink.java e86cc59 > flume-ng-core/src/main/java/org/apache/flume/Sink.java ab9b63c > flume-ng-core/src/main/java/org/apache/flume/SinkRunner.java 7f5cb17 > flume-ng-core/src/main/java/org/apache/flume/sink/AvroSink.java 90b8a86 > flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkRunner.java > PRE-CREATION > flume-ng-core/src/main/java/org/apache/flume/sink/LoggerSink.java 7c87bf7 > flume-ng-core/src/main/java/org/apache/flume/sink/NullSink.java 5722cd1 > flume-ng-core/src/main/java/org/apache/flume/sink/PollableSinkRunner.java > afcf1c3 > flume-ng-core/src/main/java/org/apache/flume/sink/RollingFileSink.java > 47addd1 > flume-ng-core/src/main/java/org/apache/flume/sink/SinkRunnerFactory.java > PRE-CREATION > flume-ng-core/src/test/java/org/apache/flume/sink/TestAvroSink.java 9718491 > > flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkRunner.java > PRE-CREATION > > flume-ng-node/src/main/java/org/apache/flume/conf/properties/FlumeConfiguration.java > 1ee1f8e > > flume-ng-node/src/main/java/org/apache/flume/conf/properties/PropertiesFileConfigurationProvider.java > bea0a3c > > flume-ng-node/src/test/java/org/apache/flume/node/TestAbstractLogicalNodeManager.java > f1e23bf > > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java > 716bbf5 > > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java > ff20beb > > flume-ng-sinks/flume-irc-sink/src/main/java/org/apache/flume/sink/irc/IRCSink.java > 5fc3943 > > Diff: https://reviews.apache.org/r/3750/diff > > > Testing > ------- > > All unit tests pass. > A new unit test was added. It creates a memory channel and a failover runner > with 3 sinks that consume a preset number of events then send > EventDeliveryException. Events are fed to the channel and assertions are made > that the runner is failing over to the correct sinks. This test also passes. > > I also tried to run this on a real cluster, with one master source using a > failover runner to three avro sinks which would each fed into separate agents > which I would kill off to test the failover. Unfortunately because of > AvroSink not throwing EventDeliveryException, this test could not be > completed succesfully, and I felt it was beyond the scope of this ticket to > modify AvroSink > > > Thanks, > > Juhani > >
