[ 
https://issues.apache.org/jira/browse/FLINK-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-14439:
----------------------------
    Description: 
With the current region failover in DefaultScheduler, the states of most input 
result partitions are unknown. Even when the failure cause is a 
PartitionException, only one unhealthy partition can be identified.

This may lead to multiple unsuccessful failovers before all the unhealthy but 
needed partitions are identified and their producers are included in the 
failover as well. (An unsuccessful failover here means that the recovered tasks 
fail again soon afterwards because some input partitions are missing.)

Using the partition states tracked on the JM side lets the region failover 
identify unhealthy (missing) partitions earlier, which helps in this case.
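
A minimal sketch of the idea, in Java. It is a simplified model and not Flink's 
actual classes: RegionFailoverSketch, Input, and regionsToRestart are 
hypothetical names, regions and partitions are plain strings, and the restart 
of downstream consumer regions is left out. It only shows how an availability 
check lets a single failover pull in the producers of every missing input 
partition.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical, simplified model for illustration; not Flink's real types.
final class RegionFailoverSketch {

    /** A consumed result partition and the region that produces it. */
    static final class Input {
        final String partitionId;
        final String producerRegion;

        Input(String partitionId, String producerRegion) {
            this.partitionId = partitionId;
            this.producerRegion = producerRegion;
        }
    }

    /**
     * Returns the failed region plus, transitively, the producer regions of
     * every consumed partition that the checker reports as unavailable.
     */
    static Set<String> regionsToRestart(
            String failedRegion,
            Map<String, List<Input>> inputsByRegion,
            Predicate<String> isPartitionAvailable) {
        Set<String> toRestart = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        toRestart.add(failedRegion);
        queue.add(failedRegion);
        while (!queue.isEmpty()) {
            String region = queue.remove();
            List<Input> inputs =
                    inputsByRegion.getOrDefault(region, Collections.<Input>emptyList());
            for (Input input : inputs) {
                // Only enqueue a producer region the first time it is added.
                if (!isPartitionAvailable.test(input.partitionId)
                        && toRestart.add(input.producerRegion)) {
                    queue.add(input.producerRegion);
                }
            }
        }
        return toRestart;
    }
}
```

If the checker only knows about the one partition named in the 
PartitionException, the other producers are found one failover at a time. With 
a checker backed by tracked partition states, they are all included in the 
first failover.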

To achieve this, I propose the following:
1. Add an interface method setResultPartitionAvailabilityChecker to 
FailoverStrategy.Factory.
2. Invoke that method in the DefaultScheduler constructor to pass in 
ExecutionGraph#resultPartitionAvailabilityChecker.
3. Change RestartPipelinedRegionStrategy.Factory to construct 
RestartPipelinedRegionStrategy with the given checker.
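
The three steps could be wired together roughly as follows. This is only a 
sketch with simplified stand-in types: the class and method names follow the 
proposal, but the signatures, the String partition IDs, the SchedulerSketch 
class, and the "everything is available" default are assumptions made for 
illustration and do not reflect the real Flink code.

```java
import java.util.Objects;

// Simplified stand-in; the real Flink interface may have a different signature.
interface ResultPartitionAvailabilityChecker {
    boolean isAvailable(String partitionId);
}

final class RestartPipelinedRegionStrategy {
    private final ResultPartitionAvailabilityChecker checker;

    RestartPipelinedRegionStrategy(ResultPartitionAvailabilityChecker checker) {
        this.checker = Objects.requireNonNull(checker);
    }

    boolean isPartitionAvailable(String partitionId) {
        return checker.isAvailable(partitionId);
    }

    static final class Factory {
        // Assumed default for this sketch: treat every partition as available
        // until a checker is injected.
        private ResultPartitionAvailabilityChecker checker = partitionId -> true;

        // Step 1: setter through which the scheduler passes in the checker.
        void setResultPartitionAvailabilityChecker(
                ResultPartitionAvailabilityChecker checker) {
            this.checker = Objects.requireNonNull(checker);
        }

        // Step 3: construct the strategy with the given checker.
        RestartPipelinedRegionStrategy create() {
            return new RestartPipelinedRegionStrategy(checker);
        }
    }
}

// Hypothetical stand-in for the DefaultScheduler constructor (step 2).
final class SchedulerSketch {
    final RestartPipelinedRegionStrategy strategy;

    SchedulerSketch(
            RestartPipelinedRegionStrategy.Factory factory,
            ResultPartitionAvailabilityChecker executionGraphChecker) {
        // The checker must be set before the strategy is created.
        factory.setResultPartitionAvailabilityChecker(executionGraphChecker);
        this.strategy = factory.create();
    }
}
```

Using a setter keeps the factory's creation method unchanged, at the cost of 
an ordering requirement: the checker has to be set before the strategy is 
constructed.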

This issue also causes BatchFineGrainedRecoveryITCase to fail due to 
unexpected failover counts. This is because the legacy scheduler already has a 
similar optimization from FLINK-13055.

  was:
In current region failover when using DefaultScheduler, most of the input 
result partition states are unknown. Even though the failure cause is a 
PartitionException, only one unhealthy partition can be identified.

This may lead to multiple unsuccessful failovers before all the unhealthy but 
needed partitions are identified and their producers are involved in the 
failover as well. (unsuccessful failover here means the recovered tasks get 
failed again soon due to some missing input partitions.)

Using JM side tracked partition states to help the region failover to identify 
unhealthy(missing) partitions earlier can help with this case.

It also fails BatchFineGrainedRecoveryITCase due to unexpected failover counts. 
This is because the legacy scheduler already has similar optimization in 
FLINK-13055.


> RestartPipelinedRegionStrategy leverage tracked partition availability for 
> better failover experience in DefaultScheduler 
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-14439
>                 URL: https://issues.apache.org/jira/browse/FLINK-14439
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Zhu Zhu
>            Priority: Major
>             Fix For: 1.10.0
>
>
> With the current region failover in DefaultScheduler, the states of most 
> input result partitions are unknown. Even when the failure cause is a 
> PartitionException, only one unhealthy partition can be identified.
> This may lead to multiple unsuccessful failovers before all the unhealthy but 
> needed partitions are identified and their producers are included in the 
> failover as well. (An unsuccessful failover here means that the recovered 
> tasks fail again soon afterwards because some input partitions are missing.)
> Using the partition states tracked on the JM side lets the region failover 
> identify unhealthy (missing) partitions earlier, which helps in this case.
> To achieve this, I propose the following:
> 1. Add an interface method setResultPartitionAvailabilityChecker to 
> FailoverStrategy.Factory.
> 2. Invoke that method in the DefaultScheduler constructor to pass in 
> ExecutionGraph#resultPartitionAvailabilityChecker.
> 3. Change RestartPipelinedRegionStrategy.Factory to construct 
> RestartPipelinedRegionStrategy with the given checker.
> This issue also causes BatchFineGrainedRecoveryITCase to fail due to 
> unexpected failover counts. This is because the legacy scheduler already has 
> a similar optimization from FLINK-13055.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
