Re: About FlumeUtils.createStream

Tathagata Das Mon, 23 Feb 2015 20:28:44 -0800

Akhil, that is incorrect.

Spark will listen on the given port for Flume to push data into it.
When in local mode, it will listen on localhost:9999.
When running on a cluster, instead of localhost you will have to give
the hostname of the cluster node where you want Flume to forward the data.
Spark will launch the Flume receiver on that node (assuming the hostname
matching is correct), and listen on port 9999 to receive data from Flume.
So only the configured machine will listen on port 9999.
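
As a rough sketch of the cluster case (not a definitive setup): the
hostname "worker-node-1" below is a made-up placeholder for whichever
worker you pick, and the 2 second batch interval and app name are
arbitrary. It assumes the spark-streaming-flume artifact is on the
classpath and that the Flume agent's Avro sink points at the same
host and port.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

object FlumePushSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FlumePushSketch")
    val ssc = new StreamingContext(conf, Seconds(2))

    // "worker-node-1" is a hypothetical hostname. It should match a worker
    // node so that the receiver is scheduled there and binds port 9999.
    // In local mode this would be "localhost" instead.
    val stream = FlumeUtils.createStream(ssc, "worker-node-1", 9999)

    // Print how many Flume events arrived in each batch.
    stream.count().map(n => s"Received $n Flume events").print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With this, Flume has to be configured to send to worker-node-1:9999,
and no other machine in the cluster opens that port.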


I suggest trying the other stream, FlumeUtils.createPollingStream. More
details here:
http://spark.apache.org/docs/latest/streaming-flume-integration.html
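
A minimal sketch of the polling variant, under similar assumptions: here
the host and port are those of the Flume agent running the Spark sink
(org.apache.spark.streaming.flume.sink.SparkSink, as described in the
guide above), not a Spark worker. The hostname "flume-agent-host", the
port, and the batch interval are placeholders.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

object FlumePollingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FlumePollingSketch")
    val ssc = new StreamingContext(conf, Seconds(2))

    // "flume-agent-host" is a hypothetical hostname for the machine where
    // the Flume agent with the Spark sink runs. Spark connects out to it
    // and pulls events, so the receiver does not have to listen on a port
    // on one specific worker.
    val stream = FlumeUtils.createPollingStream(ssc, "flume-agent-host", 9999)

    // Print how many Flume events were pulled in each batch.
    stream.count().map(n => s"Pulled $n Flume events").print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Since the receiver connects to Flume rather than the other way around,
it does not matter which worker the receiver ends up on.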



On Sat, Feb 21, 2015 at 12:17 AM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Spark won't listen on 9999 mate, It basically means you have a flume
> source running at port 9999 of your localhost. And when you submit your
> application in standalone mode, workers will consume data from that port.
>
> Thanks
> Best Regards
>
> On Sat, Feb 21, 2015 at 9:22 AM, bit1...@163.com <bit1...@163.com> wrote:
>
>>
>> Hi,
>> In the spark streaming application, I write the code, 
>> FlumeUtils.createStream(ssc,"localhost",9999),which
>> means spark will listen on the 9999 port, and wait for Flume Sink to write
>> to it.
>> My question is:  when I submit the application to the Spark Standalone
>> cluster, will 9999 be opened only on the Driver Machine or all the workers
>> will also open the 9999 port and wait for the Flume data?
>>
>> ------------------------------
>>
>>
>
