[
https://issues.apache.org/jira/browse/FLUME-952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205294#comment-13205294
]
Juhani Connolly commented on FLUME-952:
---------------------------------------
bq - I have an implicit way of doing so in the design. If
SinkRunner.chooseSink(int try) is executed with parameter try > 1, it
means that the previously returned sink has failed and we need another working
sink. Consider the following example. The failover sink keeps track of the
active sink. It will return this active sink for the call chooseSink(1).
However, calling chooseSink(2) means that the active sink is not working and we
need to move to another active sink. That said, I don't mind adding an explicit
method for marking a sink as dead. It was just an idea.
This is workable but feels unwieldy; there is no need for the extra state imo.
I think it is more transparent to give selector developers a function in the
interface that they need to implement.
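
As a rough illustration of what I mean by an explicit function, here is a minimal sketch. None of these types exist in Flume: NamedSink, SinkSelector, FailoverSelector and markFailed are hypothetical names made up for this example, and the failover policy shown (drop a failed sink, prefer the earliest remaining one) is only one assumption about how a selector might behave.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a sink; only identity matters in this sketch.
interface NamedSink {
    String name();
}

// Hypothetical selector interface: the runner reports a failure explicitly
// instead of encoding it in an increasing "try" counter.
interface SinkSelector {
    NamedSink chooseSink();           // returns null when no sink is usable
    void markFailed(NamedSink sink);  // called by the runner after a failed delivery
}

// Assumed failover policy: prefer the earliest sink not marked as failed.
class FailoverSelector implements SinkSelector {
    private final List<NamedSink> live;

    FailoverSelector(List<NamedSink> sinksInPriorityOrder) {
        this.live = new ArrayList<>(sinksInPriorityOrder);
    }

    @Override
    public NamedSink chooseSink() {
        return live.isEmpty() ? null : live.get(0);
    }

    @Override
    public void markFailed(NamedSink sink) {
        live.remove(sink);
    }
}
```

With this shape the selector holds no per-call counter: the runner calls chooseSink(), and only when delivery fails does it call markFailed() and ask again.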
bq - I also found out that most sinks do not expose their internal state, so
you have to try them to find their actual state. That is actually the reason
why I suggested putting a loop into SinkRunner.PollingRunner.run that keeps
executing SinkRunner.chooseSink(int try) with an increasing try parameter until
it returns null. This way the selector can simply force the runner to try some
previously dead sink and verify that it is still dead.
I'm not sure we can just keep beating on dead sinks every time we want to
see if they're back yet. Some of them block for a short while trying to send
a message, and I don't think we can keep hammering them on every failed
message. What I would have liked to do is keep a list of dead sinks and
start up another thread that periodically polls them for recovery. One
possibility is that failed sinks change their lifecycle state to stopped,
and that all sinks are required to make some kind of liveness check when
starting. Right now, even if a sink returns from start() without a
problem, many of them can fail on the very first process() call.
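
To make the dead-sink list idea concrete, here is a minimal sketch of a monitor that re-checks failed sinks on a background thread. It assumes sinks expose some liveness check, which they do not today: ProbeableSink, isAlive() and DeadSinkMonitor are hypothetical names for this example, not Flume APIs.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical liveness check; Flume sinks have no such method at present.
interface ProbeableSink {
    boolean isAlive();
}

class DeadSinkMonitor {
    private final Set<ProbeableSink> dead = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    void markDead(ProbeableSink sink) {
        dead.add(sink);
    }

    boolean isDead(ProbeableSink sink) {
        return dead.contains(sink);
    }

    // One recovery pass: sinks that now report alive leave the dead set.
    void probeOnce() {
        for (ProbeableSink sink : dead) {
            try {
                if (sink.isAlive()) {
                    dead.remove(sink);
                }
            } catch (RuntimeException e) {
                // A probe that throws is treated as "still dead".
            }
        }
    }

    // Polls off the delivery path, so a blocking probe never stalls event processing.
    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(
                this::probeOnce, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

The point of the design is that the runner only ever consults isDead(); the potentially slow probing happens on the monitor's own thread at a fixed interval rather than on every failed message.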
One thing I am sure of is that right now sink implementations are inconsistent
with one another, and there is no unified way of knowing when they have
died (some of them never throw EventDeliveryException) or when they are working
properly. I think any implementation of the selector will have to make some
assumptions about their behavior, and that behavior will then need to be
enforced. For me, right now, those assumptions could be:
- EventDeliveryException getting thrown signals failure
- New status flag for sinks, or poll function, or return value on start
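
A minimal sketch of what the first assumption would let a runner do, if it were enforced. The EventDeliveryException class below is a local stand-in for the Flume class of the same name so that the example compiles on its own; ProcessableSink and FirstWorkingSink are hypothetical names invented for this example.

```java
import java.util.List;

// Local stand-in for org.apache.flume.EventDeliveryException.
class EventDeliveryException extends Exception {
    EventDeliveryException(String message) {
        super(message);
    }
}

// Hypothetical minimal sink contract: throwing signals failure.
interface ProcessableSink {
    void process() throws EventDeliveryException;
}

class FirstWorkingSink {
    // Tries sinks in priority order and returns the index of the first sink
    // that processed without throwing, or -1 if every sink failed.
    static int deliver(List<ProcessableSink> sinks) {
        for (int i = 0; i < sinks.size(); i++) {
            try {
                sinks.get(i).process();
                return i;
            } catch (EventDeliveryException e) {
                // Assumed contract: this exception means the sink has failed,
                // so fall through to the next candidate.
            }
        }
        return -1;
    }
}
```

This only works if every sink actually throws on failure, which is exactly the behavior that is not consistent across implementations right now.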
> Modifying SinkRunner to be pluggable to allow for failover/replication.
> -----------------------------------------------------------------------
>
> Key: FLUME-952
> URL: https://issues.apache.org/jira/browse/FLUME-952
> Project: Flume
> Issue Type: Brainstorming
> Components: Sinks+Sources
> Reporter: Juhani Connolly
> Fix For: v1.1.0
>
>
> While implementing the failover sink runner, the following was suggested:
> 1. This needs to be implemented on top of FLUME-949 which deals with removing
> the notion of a PollableSink altogether. As a result, the SinkRunner will
> become a concrete implementation that can then allow different sink handling
> policies - such as either a failover policy (needed for this issue), or load
> balancing policy (not needed for this issue). Hence the policy part needs to
> be pluggable rather than the sink runner itself. An example of such a
> construct is the ChannelSelector and ChannelProcessor implementations.
> In FLUME-865 I have implemented FailoverSinkRunner as a separate runner, but
> I am open to the idea of making it pluggable if it makes the code more
> maintainable.
> As is, there are many differences between the requirements for Failover and a
> normal Sink runner, including configuration, initialisation, shutdown, error
> handling and event processing. If we were to make this pluggable, many hooks
> would be needed and I don't think there is that much common behavior that
> warrants using a pluggable system rather than just a solid base class.
> - Adding a new sink to a runner, with configuration variables (such as
> priority or weight)
> - Policy for handling process: should this just return a list of sinks to
> process like ChannelSelector and hand off the processing to Process? I think
> that the specific failover policy for each type of runner will be different
> so this feels awkward. I would personally prefer to just pass the process
> call to the pluggable component and let it be responsible for calling process
> on the correct sinks, as well as handling errors.
> Right now I am not convinced of the need to make SinkRunner pluggable, but I
> would be interested to hear other people's opinions.