[
https://issues.apache.org/jira/browse/SPARK-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054442#comment-14054442
]
sunshangchun edited comment on SPARK-2201 at 7/8/14 11:12 AM:
--------------------------------------------------------------
I don't think it's a problem.
1. It's an external module and has no effect on the Spark core module.
2. The Spark core module already uses ZooKeeper to elect the primary master.
3. The change is backward compatible: setting a host and port for the flume
receiver still works.
Thanks
was (Author: joyyoj):
I don't think it's a problem.
It's an external module and has no effect on the Spark core module.
Again, the Spark core module already uses ZooKeeper to elect the leader master.
Thanks
> Improve FlumeInputDStream's stability and make it scalable
> ----------------------------------------------------------
>
> Key: SPARK-2201
> URL: https://issues.apache.org/jira/browse/SPARK-2201
> Project: Spark
> Issue Type: Improvement
> Reporter: sunshangchun
>
> Currently:
> FlumeUtils.createStream(ssc, "localhost", port);
> This means that only one flume receiver can work with a FlumeInputDStream,
> so the solution is not scalable.
> I use ZooKeeper to solve this problem.
> Spark flume receivers register themselves under a ZooKeeper path when
> started, and a flume agent gets the physical hosts and pushes events to them.
> Some work needs to be done here:
> 1. Receivers create ephemeral nodes in ZooKeeper; listeners just watch those
> ephemeral nodes.
> 2. When Spark FlumeReceivers start, each acquires a physical host (the local
> host's IP and an idle port) and registers itself with ZooKeeper.
> 3. A new flume sink: in its appendEvents method, it gets the physical hosts
> and pushes data to them in a round-robin manner.
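Step 3 above can be sketched in isolation. The following is a minimal, hypothetical illustration of the round-robin selection a custom flume sink might perform; the receiver host list would really come from the ZooKeeper ephemeral nodes described in steps 1 and 2, but is stubbed here as a plain list, and the class name `RoundRobinRouter` is invented for this sketch.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the round-robin routing described in step 3.
// In the real design the host list would be read from ZooKeeper; here it
// is passed in directly so the selection logic can be shown on its own.
public class RoundRobinRouter {
    private final List<String> hosts;          // receiver endpoints, e.g. "host:port"
    private final AtomicLong counter = new AtomicLong();

    public RoundRobinRouter(List<String> hosts) {
        this.hosts = hosts;
    }

    // Pick the next receiver in round-robin order (thread-safe).
    public String next() {
        int i = (int) (counter.getAndIncrement() % hosts.size());
        return hosts.get(i);
    }

    public static void main(String[] args) {
        RoundRobinRouter router = new RoundRobinRouter(
                List.of("10.0.0.1:41414", "10.0.0.2:41414"));
        // Cycles through the two receivers in turn.
        for (int j = 0; j < 4; j++) {
            System.out.println(router.next());
        }
    }
}
```

An atomic counter modulo the host count keeps the sink thread-safe without locking, which matters because appendEvents may be called from multiple Flume sink runner threads.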
--
This message was sent by Atlassian JIRA
(v6.2#6252)