[
https://issues.apache.org/jira/browse/SPARK-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986275#comment-13986275
]
Hari Shreedharan commented on SPARK-1645:
-----------------------------------------
Yes, it is better to add new methods rather than reusing the old ones and
confusing existing users.
In fact, I think we should add a new receiver and, for the time being, only
deprecate the old one; we can remove the old one in a later release.
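The migration pattern being proposed can be sketched roughly as follows: keep the old push-based entry point, mark it deprecated, and add a separate method for the new polling-based receiver. This is a minimal illustrative sketch, not the actual Spark Streaming Flume API; the names (FlumeUtils, createStream, createPollingStream) and return values here are assumptions for illustration only.

```scala
// Illustrative sketch of "add a new method, deprecate the old one".
// Names and signatures are hypothetical, not the real Spark API.
object FlumeUtils {
  // Old push-based receiver: kept working, but marked deprecated so
  // existing users get a warning rather than a breaking change.
  @deprecated("superseded by the polling-based receiver", "next release")
  def createStream(host: String, port: Int): String =
    s"push-receiver:$host:$port"

  // New polling-based receiver: added alongside, so users can migrate
  // at their own pace before the old method is removed.
  def createPollingStream(host: String, port: Int): String =
    s"polling-receiver:$host:$port"
}
```

Deprecating rather than reusing the old name means existing deployments keep working unchanged, while the compiler warning nudges users toward the new receiver before the old one is dropped.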
> Improve Spark Streaming compatibility with Flume
> ------------------------------------------------
>
> Key: SPARK-1645
> URL: https://issues.apache.org/jira/browse/SPARK-1645
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Reporter: Hari Shreedharan
>
> Currently the following issues affect Spark Streaming and Flume compatibility:
> * If a Spark worker goes down, it must be restarted on the same node, or
> Flume cannot send data to it. We can fix this by adding a Flume receiver
> that polls Flume, and a Flume sink that supports polling.
> * The receiver sends acks to Flume before the driver knows about the data.
> The new receiver should also handle this case.
> * Data loss when the driver goes down - this is true for any streaming
> ingest, not just Flume. I will file a separate jira for this and we should
> work on it there. This is a longer-term project and requires considerable
> development work.
> I intend to start working on these soon. Any input is appreciated. (It'd be
> great if someone can add me as a contributor on jira, so I can assign the
> jira to myself).
--
This message was sent by Atlassian JIRA
(v6.2#6252)