Hi Eric,

> does the socket only get opened on the master node and then the stream is
partitioned out to the worker nodes?
No, the sockets are opened on the workers. In the socket example, Flink
starts a source task on a worker to ingest the data.
What's more, you can use the setParallelism() method to control the source
parallelism and ingest multiple streams. However, the default socket source
function (provided by Flink) is not a ParallelSourceFunction, so its maximum
parallelism cannot be greater than 1. You can implement a user-defined
source function if you need more.
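As a rough sketch of what such a user-defined source could look like (the class name, host list, and one-socket-per-subtask layout are just assumptions for illustration, not a Flink-provided source):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

// Hypothetical parallel socket source: each parallel subtask opens its
// own socket on the worker it runs on, so no single node owns the stream.
public class MultiSocketSource extends RichParallelSourceFunction<String> {

    private final String[] hosts; // one host per subtask (assumed setup)
    private final int port;
    private volatile boolean running = true;

    public MultiSocketSource(String[] hosts, int port) {
        this.hosts = hosts;
        this.port = port;
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        // Pick this subtask's socket based on its parallel index.
        int subtask = getRuntimeContext().getIndexOfThisSubtask();
        try (Socket socket = new Socket(hosts[subtask], port);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            String line;
            while (running && (line = reader.readLine()) != null) {
                ctx.collect(line);
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}
```

You could then do something like env.addSource(new MultiSocketSource(hosts, 9999)).setParallelism(hosts.length), so each worker ingests its own stream.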

> Is it possible to have each worker node be a sensor that receives a
stream of data, where each stream is of the same type (e.g. a series of
tuples)?
Yes, nearly all source connectors[1] work this way.
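For example, with the Kafka connector each parallel subtask reads its own share of the topic's partitions on its worker. A minimal sketch (the topic name, broker address, and parallelism of 4 are placeholders for your setup):

```java
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class KafkaIngestSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder

        // The Kafka source is a parallel source: each of the 4 subtasks
        // runs on a worker and consumes its assigned partitions.
        DataStream<String> stream = env
                .addSource(new FlinkKafkaConsumer011<>(
                        "sensor-topic", new SimpleStringSchema(), props))
                .setParallelism(4);

        stream.print();
        env.execute("kafka ingest sketch");
    }
}
```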

I'm not sure why you think the socket is opened on the master. Maybe this
document[2] will be helpful.

Best, Hequn

[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/
[2]
https://ci.apache.org/projects/flink/flink-docs-master/concepts/programming-model.html#programs-and-dataflows

On Sat, Sep 1, 2018 at 11:47 AM Eric L Goodman <[email protected]>
wrote:

> If I have a standalone cluster running flink, what is the best way to
> ingest multiple streams of the same type of data?
>
> For example, if I open a socket text stream, does the socket only get
> opened on the master node and then the stream is partitioned out to the
> worker nodes?
>  DataStream<String> text = env.socketTextStream("localhost", port, "\n");
>
> Is it possible to have each worker node be a sensor that receives a stream
> of data, where each stream is of the same type (e.g. a series of tuples)?
>
> Thanks
>
>
