Hi Giacomo,

If I understand you correctly, you want your Flink job to execute with a parallelism of 5. Just call setDegreeOfParallelism(5) on your ExecutionEnvironment. All operations will then run with 5 parallel instances wherever possible. This also applies to the DataSink, which will produce 5 files, one per parallel instance, each containing that instance's output data.
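
On the ordering part of your question, here is a minimal sketch in plain Java (not a Flink API) of reading the output back as "data from process 1, data from process 2, and so on". It assumes the sink writes its files into a single directory and names each file after the number of the parallel instance that produced it (1 to 5), so please check what your output directory actually contains. The class name PartFileMerger and the method mergeInOrder are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink: concatenates per-instance output
// files in instance order.
public class PartFileMerger {

    // ASSUMPTION: each parallel instance wrote one file into outputDir and the
    // file name is just the instance number ("1", "2", ...). Files with any
    // other name are ignored.
    static List<String> mergeInOrder(Path outputDir) throws IOException {
        List<Path> parts;
        try (Stream<Path> files = Files.list(outputDir)) {
            parts = files
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().matches("\\d{1,9}"))
                // Sort numerically so that "10" comes after "2".
                .sorted(Comparator.comparingInt(
                    (Path p) -> Integer.parseInt(p.getFileName().toString())))
                .collect(Collectors.toList());
        }

        // Append the lines of each file, lowest instance number first.
        List<String> merged = new ArrayList<>();
        for (Path part : parts) {
            merged.addAll(Files.readAllLines(part));
        }
        return merged;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PartFileMerger <output-dir>");
            return;
        }
        for (String line : mergeInOrder(Paths.get(args[0]))) {
            System.out.println(line);
        }
    }
}
```

This only fixes the order between the files. The lines inside each file stay in whatever order the corresponding instance wrote them.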

Best,
Max

On Tue, Apr 14, 2015 at 10:38 AM, Giacomo Licari <giacomo.lic...@gmail.com> wrote:

> Hi guys,
> I have a question about how parallelism works.
>
> If I have a large dataset and I would divide it into 5 blocks, can I pass
> each block of data to a fixed parallel process (for example I set up 5
> process) ?
>
> And if the results data from each process arrive to the output not in an
> ordered way, can I order them? For example:
>
> data from process 1
> data from process 2
> and so on
>
> Thank you guys!
>