[ 
https://issues.apache.org/jira/browse/FLINK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-14606:
----------------------------
    Fix Version/s:     (was: 1.10.0)
                   1.11.0

> Simplify params of Execution#processFail
> ----------------------------------------
>
>                 Key: FLINK-14606
>                 URL: https://issues.apache.org/jira/browse/FLINK-14606
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Zhu Zhu
>            Priority: Major
>             Fix For: 1.11.0
>
>
> The 3 params fromSchedulerNg/releasePartitions/isCallback of 
> Execution#processFail are quite a mess, although they seem to be correlated. 
> I'd propose to simplify the params of processFail by using a single 
> {{isInternalFailure}} to replace those 3 params. {{isInternalFailure}} is true 
> iff the failure is from the TM (strictly speaking, notified from SchedulerBase). 
> This also hardens the handling of cases where a task is successfully deployed 
> but the JM does not realize it (see #3 below).
> Here's why these 3 params can be simplified:
> 1. {{fromSchedulerNg}}, true iff the failure is from the TM and 
> isLegacyScheduling==false.
>     It's only used like this: {{if (!fromSchedulerNg && 
> !isLegacyScheduling())}}. So using {{!isInternalFailure}} in its place is 
> equivalent.
> 2. {{releasePartitions}}, true iff the failure is from the TM.
>   Its value is always exactly the same as {{isInternalFailure}}, so we can 
> drop it and use {{isInternalFailure}} instead.
> 3. {{isCallback}}, true iff the failure is from TM or the task is not 
> deployed.
>     It's only used like this: {{(!isCallback && (current == RUNNING || 
> current == DEPLOYING))}}.
>     So using {{!isInternalFailure}} to replace it would be enough. This is a 
> bit different in the case where a task deployment to a task manager fails, 
> which previously set {{isCallback}} to true. However, it would be safer to 
> signal a cancel call, in case the deployment actually succeeded but the 
> response was lost on the network.
> cc [~GJL]
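
The equivalences claimed in points 1 to 3 can be checked by brute force. The sketch below is not Flink code: the class name, the helper methods and the boolean inputs fromTm/legacy/deployed are hypothetical, and the three parameters are modeled only from their "true iff" descriptions in the issue. The state check (current == RUNNING || current == DEPLOYING) in guard #3 is left out because it is the same before and after the change.

```java
// Hypothetical model of the issue description, not the actual Execution class.
public class ProcessFailParamsSketch {

    // Values the three existing parameters take, per the "true iff" statements.
    static boolean fromSchedulerNg(boolean fromTm, boolean legacy) {
        return fromTm && !legacy;
    }

    static boolean releasePartitions(boolean fromTm) {
        return fromTm;
    }

    static boolean isCallback(boolean fromTm, boolean deployed) {
        return fromTm || !deployed;
    }

    // The single proposed flag.
    static boolean isInternalFailure(boolean fromTm) {
        return fromTm;
    }

    // Guard from point 1, before and after the replacement.
    static boolean guard1Old(boolean fromTm, boolean legacy) {
        return !fromSchedulerNg(fromTm, legacy) && !legacy;
    }

    static boolean guard1New(boolean fromTm, boolean legacy) {
        return !isInternalFailure(fromTm) && !legacy;
    }

    // First operand of the guard from point 3, before and after the replacement.
    static boolean cancelOld(boolean fromTm, boolean deployed) {
        return !isCallback(fromTm, deployed);
    }

    static boolean cancelNew(boolean fromTm) {
        return !isInternalFailure(fromTm);
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean fromTm : values) {
            for (boolean other : values) {
                // Points 1 and 2 must agree for every combination.
                if (guard1Old(fromTm, other) != guard1New(fromTm, other)) {
                    throw new AssertionError("guard #1 differs");
                }
                if (releasePartitions(fromTm) != isInternalFailure(fromTm)) {
                    throw new AssertionError("releasePartitions differs");
                }
                // Point 3 is allowed to differ; report where it does.
                boolean deployed = other;
                if (cancelOld(fromTm, deployed) != cancelNew(fromTm)) {
                    System.out.println(
                        "cancel differs: fromTm=" + fromTm + ", deployed=" + deployed);
                }
            }
        }
    }
}
```

In this model the only combination where the cancel decision changes is a failure that does not come from the TM while the task is not deployed, which is the failed-deployment case described in point 3.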



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
