Flink has support for spillable intermediate results. Currently they
are only used when necessary to avoid pipeline deadlocks.

You can force this via

env.getConfig().setExecutionMode(ExecutionMode.BATCH);

This will write shuffles to disk, but you don't get the fine-grained
control you probably need for your use case.
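
If you need that control, one option is to do the spilling by hand inside
the function. Below is a minimal sketch in plain Java, not a Flink class:
SpillSketch, process and subtaskIndex are hypothetical names, and records
are assumed to be strings without line breaks (real tuples would need a
proper serializer). It emits matching records right away, writes the rest
to a uniquely named temp file, and replays that file afterwards.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical helper, not part of Flink: defers non-matching records via a temp file.
class SpillSketch {

    // subtaskIndex is an assumed parameter; it only makes the file name easier to read.
    static void process(Iterable<String> records, Predicate<String> condition,
                        Consumer<String> out, int subtaskIndex) throws IOException {
        // createTempFile picks a unique name, so parallel instances do not collide.
        Path spill = Files.createTempFile("spill-" + subtaskIndex + "-", ".tmp");
        try {
            // First pass: emit matching records, write the others to disk.
            try (BufferedWriter writer =
                     Files.newBufferedWriter(spill, StandardCharsets.UTF_8)) {
                for (String record : records) {
                    if (condition.test(record)) {
                        out.accept(record);
                    } else {
                        writer.write(record);
                        writer.newLine();
                    }
                }
            }
            // Second pass: read the spilled records back and emit them.
            try (BufferedReader reader =
                     Files.newBufferedReader(spill, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    out.accept(line);
                }
            }
        } finally {
            // Remove the spill file even if one of the passes failed.
            Files.deleteIfExists(spill);
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> out = new ArrayList<>();
        process(Arrays.asList("a", "bb", "c", "dd"), s -> s.length() == 1, out::add, 0);
        System.out.println(out); // prints [a, c, bb, dd]
    }
}
```

Inside a RichMapPartitionFunction you could probably pass
getRuntimeContext().getIndexOfThisSubtask() as the index, although the
temp-file name is already unique without it.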

– Ufuk

On Thu, May 5, 2016 at 3:29 PM, Paschek, Robert
<robert.pasc...@tu-berlin.de> wrote:
> Hi Mailing List,
>
> I want to write intermediate results to disk and read them back.
>
> The following pseudocode snippet may illustrate my intention:
>
> public void mapPartition(Iterable<T> tuples, Collector<T> out) {
>     for (T tuple : tuples) {
>         if (condition)
>             out.collect(tuple);
>         else
>             writeTupleToDisk(tuple);
>     }
>     while (tupleOnDisk())
>         out.collect(readNextTupleFromDisk());
> }
>
> I am wondering whether Flink provides a built-in class for this purpose. I
> also need to identify the files holding the intermediates precisely,
> because mapPartition runs in parallel.
>
> Thank you in advance!
>
> Robert
