Re: Parallelism question

Maximilian Michels Tue, 14 Apr 2015 02:59:45 -0700

Hi Giacomo,

If I understand you correctly, you want your Flink job to execute with a
parallelism of 5. To do that, call setDegreeOfParallelism(5) on your
ExecutionEnvironment. All operations will then run with 5 parallel
instances where possible. This also applies to the DataSink, which will
write 5 files, one per parallel instance.
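
To illustrate the idea, here is a rough plain-Java sketch. It is not Flink
code and not how Flink works internally: it only mimics the behaviour
described above with a fixed thread pool. The class name ParallelismSketch,
the round-robin split and the doubling "operation" are all made up for the
example. In the real job, the only change needed is the
setDegreeOfParallelism(5) call mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: simulates "5 parallel instances, 5 outputs".
public class ParallelismSketch {

    // Split the input round-robin into `parallelism` blocks (assumed strategy).
    static <T> List<List<T>> partition(List<T> data, int parallelism) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < data.size(); i++) {
            parts.get(i % parallelism).add(data.get(i));
        }
        return parts;
    }

    // Run one task per block on a fixed pool. Each task produces its own
    // output list, standing in for the one file each parallel sink instance writes.
    static List<List<String>> run(List<Integer> data, int parallelism) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<List<Integer>> parts = partition(data, parallelism);
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int p = 0; p < parallelism; p++) {
                final int instance = p + 1;
                final List<Integer> part = parts.get(p);
                futures.add(pool.submit(() -> {
                    List<String> out = new ArrayList<>();
                    for (int value : part) {
                        // Made-up operation: double each element.
                        out.add("instance " + instance + ": " + (value * 2));
                    }
                    return out;
                }));
            }
            // Collect results in instance order, regardless of which task finished first.
            List<List<String>> outputs = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                outputs.add(future.get());
            }
            return outputs;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        for (List<String> output : run(data, 5)) {
            System.out.println(output);
        }
    }
}
```

In this sketch the outputs are gathered in the order the tasks were
submitted, so they come back grouped by instance number even if the
instances finish at different times. That holds for the sketch only; it
says nothing about how Flink orders the files it writes.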


Best,
Max


On Tue, Apr 14, 2015 at 10:38 AM, Giacomo Licari <giacomo.lic...@gmail.com>
wrote:

> Hi guys,
> I have a question about how parallelism works.
>
> If I have a large dataset and want to divide it into 5 blocks, can I pass
> each block of data to a fixed parallel process (for example, if I set up
> 5 processes)?
>
> And if the result data from each process does not arrive at the output in
> order, can I order it? For example:
>
> data from process 1
> data from process 2
> and so on
>
> Thank you guys!
>
