[
https://issues.apache.org/jira/browse/FLINK-13169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrey Zagrebin updated FLINK-13169:
------------------------------------
Description: The BatchFineGrainedRecoveryITCase can be extended with an
additional test failure strategy which abruptly shuts down the task executor.
This leads to the loss of all previously completed mapper result partitions as
well as the in-progress one. The fail-over strategy should restart the current
in-progress mapper, which will get a PartitionNotFoundException because the
previous result has become unavailable, so the previous mapper has to be
restarted as well. The same should then happen with all previous mappers. When
the source is recomputed, all mappers have to be restarted again to recalculate
their lost results. (was: The BatchFineGrainedRecoveryITCase can be extended
with an additional test failure strategy which abruptly shuts down the task
executor. This leads to the loss of all previously completed and the
in-progress mapper result partitions. The fail-over strategy should
subsequently restart the current in-progress mapper and all previous mappers
because the previous result is unavailable. When the source is recomputed, all
mappers has to be restarted again to recalculate their lost results.)
> IT test for fine-grained recovery (task executor failures)
> ----------------------------------------------------------
>
> Key: FLINK-13169
> URL: https://issues.apache.org/jira/browse/FLINK-13169
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Andrey Zagrebin
> Assignee: Andrey Zagrebin
> Priority: Major
> Fix For: 1.9.0
>
>
> The BatchFineGrainedRecoveryITCase can be extended with an additional test
> failure strategy which abruptly shuts down the task executor. This leads to
> the loss of all previously completed mapper result partitions as well as the
> in-progress one. The fail-over strategy should restart the current in-progress
> mapper, which will get a PartitionNotFoundException because the previous
> result has become unavailable, so the previous mapper has to be restarted as
> well. The same should then happen with all previous mappers. When the
> source is recomputed, all mappers have to be restarted again to recalculate
> their lost results.
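
The expected restart order described above can be illustrated with a small,
self-contained simulation. This is only a sketch and does not use any Flink
API: the class BacktrackingRecoverySketch, the MissingPartitionException
stand-in for PartitionNotFoundException, and the linear chain of one source
(task 0) followed by mappers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation, not Flink code: task 0 is the source, tasks 1..n are mappers.
class BacktrackingRecoverySketch {

    /** Stand-in for Flink's PartitionNotFoundException (assumed name, simplified). */
    static class MissingPartitionException extends RuntimeException {
        final int producer;

        MissingPartitionException(int producer) {
            super("result partition of task " + producer + " not found");
            this.producer = producer;
        }
    }

    /** Runs one task; it fails if the upstream result partition is unavailable. */
    static void runTask(int task, boolean[] available) {
        if (task > 0 && !available[task - 1]) {
            throw new MissingPartitionException(task - 1);
        }
        available[task] = true; // the task finished and produced its partition
    }

    /**
     * Restarts the in-progress task and walks back through its producers whenever
     * an input partition is missing, then recomputes forward again.
     */
    static List<String> recover(int inProgress, boolean[] available) {
        List<String> log = new ArrayList<>();
        log.add("restart " + inProgress);
        int task = inProgress;
        while (task <= inProgress) {
            try {
                runTask(task, available);
                log.add("finished " + task);
                task++; // move on to the consumer of this result
            } catch (MissingPartitionException e) {
                log.add("restart " + e.producer);
                task = e.producer; // the producer has to be restarted first
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // The task executor was shut down: every partition is lost.
        boolean[] available = new boolean[4];
        recover(3, available).forEach(System.out::println);
    }
}
```

With three mappers and every partition lost, the log shows restarts walking
back from mapper 3 to the source, followed by each task finishing in forward
order, which is the behaviour the new test failure strategy is meant to check.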