Hi Ovidiu,

At the moment, Flink's batch fault tolerance restarts the whole job when a
failure occurs, so the second of your two options applies. However, parts
of the logic needed for partial backtracking, such as intermediate result
partitions and the backtracking algorithm, are already implemented or exist
as a PR [1], so we hope to complete partial backtracking soon.

[1] https://github.com/apache/flink/pull/640
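
To make the difference between the two behaviours concrete, here is a toy
model in Python. It is only a sketch and not Flink code: the function names
are hypothetical, and it assumes a linear chain of stages in which task t of
a stage reads only partition t of the previous stage. It counts how many
tasks would have to run again under each strategy.

```python
def tasks_rerun_full_restart(num_stages, parallelism):
    """Whole-job restart: every task of every stage runs again."""
    return num_stages * parallelism


def tasks_rerun_backtracking(failed_stage, failed_task, available):
    """Walk upstream from the failed task until a surviving
    intermediate result partition (or the source) is reached.

    available[s][t] is True if the output partition of task t in
    stage s can still be read. Toy assumption: task t of stage s
    reads only partition t of stage s - 1.
    """
    rerun = 1  # the failed task itself
    stage = failed_stage - 1
    while stage >= 0 and not available[stage][failed_task]:
        rerun += 1  # the producer's output is gone, so recompute it too
        stage -= 1
    return rerun


# Three stages with parallelism 4. Task 2 of the last stage fails, and
# the partition it reads from stage 1 was lost together with the node.
available = [
    [True, True, True, True],   # stage 0 outputs
    [True, True, False, True],  # stage 1 outputs, partition 2 lost
]
print(tasks_rerun_full_restart(3, 4))                # 12
print(tasks_rerun_backtracking(2, 2, available))     # 2
```

In this model a whole-job restart reruns all 12 tasks, while backtracking
reruns only the failed task and the one upstream producer whose output was
lost.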

Cheers,
Till

On Mon, Feb 22, 2016 at 6:00 PM, Ovidiu-Cristian MARCU <
ovidiu-cristian.ma...@inria.fr> wrote:

> Hi
>
> If a node fails, what does 'Fault tolerance for programs in the
> *DataSet API* works by retrying failed executions' [1] mean?
> - work already done by the other nodes is not lost, only the work of the
>   lost node is recomputed, and job execution continues
> or
> - the entire job execution is retried
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/fault_tolerance.html
>
> Best,
> Ovidiu
>
