Hi Saikat,
Please see my thoughts inline. This is how I think about this stuff; others
may think about it differently.

On Mon, Jun 27, 2016 at 8:45 PM, Saikat Kanjilal <[email protected]>
wrote:

> Exactly right, I'm proposing we create a graph sink for flume while
> keeping the flume core intact.


As you are probably aware, sources and sinks don't have to be part of the
main Apache Flume source tree to be used with Flume. The plugins.d
mechanism described in [1] makes building and integrating separate plugins
into Flume an easy thing to do at deployment time.
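
As a rough sketch of what that looks like in practice (the plugin directory
name, jar names, class name, and agent/component names below are made up for
illustration; only the lib/libext/native layout and the convention of using
the fully qualified class name as the component type come from the user
guide [1]):

```properties
# Assumed layout under $FLUME_HOME/plugins.d (all names are hypothetical):
#
#   plugins.d/graph-sink/lib/flume-graph-sink.jar   <- the plugin jar itself
#   plugins.d/graph-sink/libext/graph-driver.jar    <- the plugin's dependency jars
#   plugins.d/graph-sink/native/                    <- any required native libraries
#
# Agent configuration wiring the custom sink to a channel.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Placeholder source so the agent has something to read from.
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

# A custom sink is referenced by its fully qualified class name.
# com.example.flume.GraphSink is a hypothetical class, not an existing one.
a1.sinks.k1.type = com.example.flume.GraphSink
a1.sinks.k1.channel = c1
```

Nothing in the Flume core has to change for this to work; the agent picks up
the jars from plugins.d at startup.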

In another project I work on, Apache Kudu (incubating), we have a Flume
Kudu sink committed in the main source tree [2]. We may at some point
propose to move it into the Flume source tree, but for now (for testing and
API stability reasons) it's easier to keep it in the Kudu source tree.

Likewise, you could implement a Flume Neo4j sink and post it up on GitHub
(or maybe in the Neo4j tree?). Donating it to the Apache Flume project once
it's in decent shape may make sense at some point, especially if the
dependencies are easy to share and integrate into the Flume project.
However, I wouldn't say that it's a foregone conclusion that it really
needs to be part of the Flume source tree. Assuming you need the sink, and
are going to implement it anyway, then maybe we can defer the discussion of
whether to include it in the Flume source tree until later. One of the
things I try to keep in mind when integrating new plugin code is whether
the project will be able to support the maintenance burden of the new code.

> In reading from a graph db we need a mechanism to stream data from the
> graph store into flume.
>

Yes, I'd say it could potentially make sense to create a Flume Neo4j source
as well. I think the same logic as above would still apply.

Regards,
Mike

[1]
https://flume.apache.org/FlumeUserGuide.html#installing-third-party-plugins
[2]
https://github.com/apache/incubator-kudu/tree/master/java/kudu-flume-sink
