-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3750/#review5039
-----------------------------------------------------------


Thanks for the patch, Juhani. The functionality seems to be progressing well. 
Some comments/feedback to consider regarding configuration:

The current implementation of the configuration mechanism allows you to 
specify sinks per agent. Each sink is then associated with a channel. 
For example, for an agent named "host1", the relevant configuration to tie a 
sink named "sink1" to a channel named "channel1" would be as follows:

   host1.sinks = sink1 ...
   host1.channels = channel1 ...

   host1.sinks.sink1.type = avro
   host1.sinks.sink1.channel = channel1

If we want to have a failover sink for sink1, say we call it "sink2", then the 
key problem is to be able to associate this new sink with the previously 
configured sink1. Moreover, this change should be backward compatible so that 
existing configuration files continue to work. Given these requirements, the 
following is a possible solution:

   host1.sinks = sink1 sink2 ...
   host1.channels = channel1 ...

   host1.sinks.sink1.type = avro
   host1.sinks.sink1.channel = channel1
   ...
   host1.sinks.sink2.type = avro
   host1.sinks.sink2.channel = channel1
   ...

   host1.sinks.groups = group1
   host1.sinks.groups.group1.sinks = sink1 sink2
   host1.sinks.groups.group1.policy.type = failover

This introduces the notion of sink groups, where a group is a named entity that 
can define its own policy. Internally, this can then translate to the 
SinkRunner picking a policy implementation, backed by an interface, to 
propagate the event; for example, SinkPolicy.processEvent(Event e).
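To make the idea concrete, here is a minimal sketch of what such a policy 
interface and its implementations could look like. All names here (SinkPolicy, 
DefaultSinkPolicy, FailoverSinkPolicy) are illustrative, not existing Flume 
classes, and the Event/Sink types are stubbed for brevity:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal stand-ins for Flume's Event and Sink types, stubbed for brevity.
class Event {}
class EventDeliveryException extends Exception {}

interface Sink {
  void process(Event e) throws EventDeliveryException;
}

// Hypothetical interface the SinkRunner would delegate each event to.
interface SinkPolicy {
  void processEvent(Event e) throws EventDeliveryException;
}

// Default policy: wraps exactly one sink; used when no group is configured.
class DefaultSinkPolicy implements SinkPolicy {
  private final Sink sink;
  DefaultSinkPolicy(Sink sink) { this.sink = sink; }
  public void processEvent(Event e) throws EventDeliveryException {
    sink.process(e);
  }
}

// Failover policy: sinks are tried in priority order (insertion order here).
// A sink that throws is moved to a "dead" pile; the pile is revived only
// once every sink in the group has failed.
class FailoverSinkPolicy implements SinkPolicy {
  private final Deque<Sink> live = new ArrayDeque<>();
  private final Deque<Sink> dead = new ArrayDeque<>();

  FailoverSinkPolicy(Sink... sinks) {
    for (Sink s : sinks) live.add(s);
  }

  public void processEvent(Event e) throws EventDeliveryException {
    while (!live.isEmpty()) {
      try {
        live.peekFirst().process(e);
        return;                      // delivered by highest-priority live sink
      } catch (EventDeliveryException ex) {
        dead.add(live.pollFirst());  // demote the failed sink
      }
    }
    live.addAll(dead);               // all sinks exhausted: revive the pile
    dead.clear();
    throw new EventDeliveryException();
  }
}
```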

If no groups are specified, then each sink gets the default policy 
implementation, which can handle only a single sink at a time. This makes the 
change backward compatible and gives enough flexibility to express different 
policies. Further, policy implementations that require more configuration can 
easily pick up extra settings from within the group configuration, much like 
the way a selector gets its configuration.
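For example, the selection logic could fall back to the default policy 
whenever no group claims a sink. The snippet below is only a sketch of such a 
lookup over the property names used above; PolicyResolver is a hypothetical 
helper, not part of the patch:

```java
import java.util.Properties;

// Sketch of a lookup that resolves the policy type for a given sink.
// If a configured group contains the sink, the group's policy.type wins;
// otherwise the sink falls back to the single-sink "default" policy.
class PolicyResolver {
  static String policyFor(Properties conf, String agent, String sink) {
    String groups = conf.getProperty(agent + ".sinks.groups", "").trim();
    if (!groups.isEmpty()) {
      for (String g : groups.split("\\s+")) {
        String members =
            conf.getProperty(agent + ".sinks.groups." + g + ".sinks", "").trim();
        for (String m : members.split("\\s+")) {
          if (m.equals(sink)) {
            return conf.getProperty(
                agent + ".sinks.groups." + g + ".policy.type", "default");
          }
        }
      }
    }
    return "default"; // no group mentions this sink
  }
}
```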

Your thoughts?
   
- Arvind


On 2012-02-06 01:12:23, Juhani Connolly wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/3750/
> -----------------------------------------------------------
> 
> (Updated 2012-02-06 01:12:23)
> 
> 
> Review request for Flume.
> 
> 
> Summary
> -------
> 
> This includes the changes from FLUME-945 and is thus subject to change with 
> it. I have a separate patch for the diff from 945->865 if someone wants it.
> 
> A failover sink runner is added, along with a SinkRunnerFactory. The 
> configuration reading is modified so as to read the runner configuration and 
> pass it to the SinkRunnerFactory.
> A failover sink is used by setting the runner type to failover for every 
> participating sink. In addition, each sink assigned to a runner must set 
> the same runner.name and a unique runner.priority.
> The runner will poll only the highest-priority live sink. Should that sink 
> fail by throwing EventDeliveryException, it is moved to a pile of dead 
> sinks. Only once all live sinks are exhausted are the dead sinks revived.
> 
> Some outstanding issues:
> - Not all sinks throw EventDeliveryException, or have a clear point at 
> which they should be considered "dead". Coupling such sinks with the 
> failover runner will not result in failover.
> - Sinks do not have a clear mechanism to test for their liveness, other 
> than perhaps polling LifecycleState. However, for most sinks the lifecycle 
> state tends to remain in START even if the sink is unable to do anything 
> (e.g. an avro sink failing to connect).
> 
> 
> This addresses bug FLUME-865.
>     https://issues.apache.org/jira/browse/FLUME-865
> 
> 
> Diffs
> -----
> 
>   flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkRunner.java 
> PRE-CREATION 
>   flume-ng-core/src/main/java/org/apache/flume/sink/SinkRunnerFactory.java 
> PRE-CREATION 
>   
> flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkRunner.java 
> PRE-CREATION 
>   
> flume-ng-node/src/main/java/org/apache/flume/conf/properties/FlumeConfiguration.java
>  1ee1f8e 
>   
> flume-ng-node/src/main/java/org/apache/flume/conf/properties/PropertiesFileConfigurationProvider.java
>  3d0e366 
>   
> flume-ng-node/src/test/java/org/apache/flume/node/TestAbstractLogicalNodeManager.java
>  34d6010 
> 
> Diff: https://reviews.apache.org/r/3750/diff
> 
> 
> Testing
> -------
> 
> All unit tests pass.
> A new unit test was added. It creates a memory channel and a failover runner 
> with 3 sinks that consume a preset number of events then send 
> EventDeliveryException. Events are fed to the channel and assertions are made 
> that the runner is failing over to the correct sinks. This test also passes.
> 
> I also tried to run this on a real cluster, with one master source using a 
> failover runner with three avro sinks, each feeding into a separate agent 
> that I would kill off to test the failover. Unfortunately, because AvroSink 
> does not throw EventDeliveryException, this test could not be completed 
> successfully, and I felt it was beyond the scope of this ticket to modify 
> AvroSink.
> 
> 
> Thanks,
> 
> Juhani
> 
>
