[
https://issues.apache.org/jira/browse/FLUME-865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207467#comment-13207467
]
Arvind Prabhakar commented on FLUME-865:
----------------------------------------
Thanks for the explanation, Juhani. I agree that this will remain backward
compatible with the current implementation, since the runner configuration is
not used at the moment. Hence, anything added to the <sink>.runner namespace is
effectively new configuration.
However, I do have the following concerns:
1. As you pointed out, the runner configuration has some repetition and is not
centrally defined. While in this example this is not a big problem considering
its simplicity, it may become a problem with runners that require complex
configuration. Since every sink that names the same runner is associated
with the same logical runner instance, this raises the question of which sink's
configuration takes precedence for the runner namespace. One
reason we have the namespaced hierarchical configuration is to make it as
easy to understand as possible. If we follow this idea, we end up defining the
configuration of one logical entity across multiple disjoint
namespaces.
2. There is a bigger issue with the runner concept itself, which is that it
does not merit substitution. As you can tell, a key goal of the flume-728 branch
is to ensure simplicity of implementation. To this effect we have decided to
remove the concept of a substitutable runner altogether and make sure that all
agents have the *exact same running semantic*. Over time we expect
this semantic to evolve and encompass fairly complex requirements. For instance,
polling, backing off, and aggressive draining are all
common concepts that apply to any agent regardless of which sink it is using
and what policy the sink follows (failover, load-balancing, etc.). If at this
stage we incorporate the notion of substitutable runners based on an
abstract class, it will lead to custom runners and related maintenance problems
for the broader set of users.
3. Even if we do follow the notion of a fixed runner, its configuration is
redundant since it is static in nature. For example,
host1.sinks.avro1.runner.processor = failover could just as well be
host1.sinks.avro1.processor = failover. While this is doable, it still poses
the same issue as #1 above: it spreads the configuration of a single
logical component over unrelated namespaces. Besides, it also implies that the
sink controls the processor, as opposed to the processor controlling the sink.
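To make concerns 1 and 3 concrete, here is a rough sketch of the two layouts.
Only the host1.sinks.avro1.runner.processor key comes from this discussion;
the avro2 sink, the runner.name key, the loadbalance value, and the whole
sinkprocessors namespace are hypothetical names chosen for illustration and
are not part of any existing Flume configuration.

```properties
# Layout A (per-sink runner namespace). All keys except
# "runner.processor" are hypothetical.
# Both sinks name the same logical runner "r1", yet each carries its own
# copy of the runner settings, and the two copies disagree. It is unclear
# which one should take precedence.
host1.sinks.avro1.runner.name = r1
host1.sinks.avro1.runner.processor = failover
host1.sinks.avro2.runner.name = r1
host1.sinks.avro2.runner.processor = loadbalance

# Layout B (processor as its own component). Entirely hypothetical.
# The processor is configured once, in one namespace, and lists the sinks
# it controls, so there is nothing to reconcile between sinks.
host1.sinkprocessors = p1
host1.sinkprocessors.p1.type = failover
host1.sinkprocessors.p1.sinks = avro1 avro2
```

In layout B the direction of control also matches the intent: the processor
owns the sinks rather than each sink declaring its processor.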
Having expressed these concerns, I also want to say that your patch is very
valuable to this project and I want to extend all my support to help you get it
committed. To that end, do you think it would be reasonable to create a
sub-task of this Jira that refactors the existing code to introduce the notion
of a sink processor as a pluggable component with its own associated
configuration? Once that is committed, you can refactor your implementation to
align with it.
> Implement failover sink
> ------------------------
>
> Key: FLUME-865
> URL: https://issues.apache.org/jira/browse/FLUME-865
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Affects Versions: NG alpha 2
> Reporter: Jarek Jarcec Cecho
> Assignee: Juhani Connolly
> Fix For: v1.1.0
>
> Attachments: FLUME-865.patch
>
>
> It would be nice if the flume-ng would have ability to failover to different
> sink in case that the active one is not responding (e.g. before failing the
> transaction).