Hi Kostas,

Thanks for the quick reply.

> If T_1 must be processed before T_i, i>1, then you cannot parallelize the 
> algorithm.

What would be the best way to process the data sequentially anyway?

DataSet.collect() -> loop over List -> env.fromCollection(...) ?
Or with a parallelism of 1 and a .map(...) ?

However, this approach would collect all data on a single node and wouldn't
scale, correct?
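
To make the first option concrete, the driver-side loop might look roughly
like the plain-Java sketch below. It uses no Flink API: SequentialSketch,
step and processInOrder are hypothetical names, the running sum only stands
in for whatever the real dependency on the previous element is, and the input
list stands in for the result of DataSet.collect().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SequentialSketch {

    // Hypothetical step: the result for T_i depends on the result for T_(i-1).
    // A running sum is used here purely as a placeholder dependency.
    static int step(int previousResult, int element) {
        return previousResult + element;
    }

    // Stand-in for looping over the List returned by DataSet.collect().
    // Everything runs in one thread, in list order.
    static List<Integer> processInOrder(List<Integer> collected) {
        List<Integer> results = new ArrayList<>(collected.size());
        int state = 0;
        for (int element : collected) {
            state = step(state, element);
            results.add(state);
        }
        return results;
    }

    public static void main(String[] args) {
        // The results could then be handed back via env.fromCollection(...).
        System.out.println(processInOrder(Arrays.asList(1, 2, 3, 4)));
    }
}
```

Since each step needs the previous result, the whole list has to sit in the
memory of one JVM, which is exactly the scaling concern.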

Regards,
Sebastian
