[
https://issues.apache.org/jira/browse/FLINK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhu Zhu closed FLINK-14606.
---------------------------
Resolution: Won't Do
Closed because the values of {{isCallback}} and {{releasePartitions}} are not
exactly aligned.
{{isCallback}} only needs to be true if the task is already deployed and is
failed by the JM. However, even if a task is not deployed, {{releasePartitions}}
still needs to be true, since the partition may already have been created in
external shuffle services.
{{fromSchedulerNg}} will be removed together with the legacy scheduler, so we
do not need to change it here right now.
> Simplify params of Execution#processFail
> ----------------------------------------
>
> Key: FLINK-14606
> URL: https://issues.apache.org/jira/browse/FLINK-14606
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.10.0
> Reporter: Zhu Zhu
> Priority: Major
>
> The 3 params fromSchedulerNg/releasePartitions/isCallback of
> Execution#processFail are quite a mess, even though they seem to be correlated.
> I'd propose to simplify the params of processFail by using a
> {{isInternalFailure}} to replace those 3 params. {{isInternalFailure}} is true
> iff the failure is from the TM (strictly speaking, notified from SchedulerBase).
> This also hardens the handling of cases where a task is successfully deployed
> but the JM does not realize it (see #3 below).
> Here's why these 3 params can be simplified:
> 1. {{fromSchedulerNg}}, true iff the failure is from TM and
> isLegacyScheduling==false.
> It's only used like this: {{if (!fromSchedulerNg &&
> !isLegacyScheduling())}}. So replacing {{!fromSchedulerNg}} with
> {{!isInternalFailure}} is equivalent.
> 2. {{releasePartitions}}, true iff the failure is from TM.
> Its value is now exactly the same as {{isInternalFailure}}, so we can drop it
> and use {{isInternalFailure}} instead.
> 3. {{isCallback}}, true iff the failure is from TM or the task is not
> deployed.
> It's only used like this: {{(!isCallback && (current == RUNNING ||
> current == DEPLOYING))}}.
> So replacing {{!isCallback}} with {{!isInternalFailure}} would be enough. The
> behavior differs slightly when deploying a task to a task manager fails,
> which previously set {{isCallback}} to true. However, it would be safer to
> signal a cancel call anyway, in case the deployment actually succeeded but the
> response was lost on the network.
> cc [~GJL]
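The boolean reasoning in items 1 and 3 of the quoted description can be checked by brute force. The following is only a sketch under the definitions stated in the description: the class and helper names (ProcessFailGuards, currentGuard, proposedGuard, currentSendsCancel, proposedSendsCancel) are hypothetical and are not Flink APIs, and the task-state check (RUNNING or DEPLOYING) is assumed to have already passed.

```java
// Sketch only: all names here are hypothetical helpers, not Flink code.
final class ProcessFailGuards {

    // Item 1: fromSchedulerNg is described as "from TM and not legacy scheduling".
    static boolean fromSchedulerNg(boolean failureFromTm, boolean legacyScheduling) {
        return failureFromTm && !legacyScheduling;
    }

    // Guard as quoted in item 1.
    static boolean currentGuard(boolean failureFromTm, boolean legacyScheduling) {
        return !fromSchedulerNg(failureFromTm, legacyScheduling) && !legacyScheduling;
    }

    // Same guard using the proposed flag (true iff the failure is from the TM).
    static boolean proposedGuard(boolean isInternalFailure, boolean legacyScheduling) {
        return !isInternalFailure && !legacyScheduling;
    }

    // Returns true if both item 1 guards agree for every input combination.
    static boolean guardsAgreeEverywhere() {
        boolean[] values = {false, true};
        for (boolean fromTm : values) {
            for (boolean legacy : values) {
                if (currentGuard(fromTm, legacy) != proposedGuard(fromTm, legacy)) {
                    return false;
                }
            }
        }
        return true;
    }

    // Item 3: isCallback is described as "from TM or the task is not deployed".
    static boolean isCallback(boolean failureFromTm, boolean deployed) {
        return failureFromTm || !deployed;
    }

    // Whether a cancel call is signalled today (state check assumed to pass).
    static boolean currentSendsCancel(boolean failureFromTm, boolean deployed) {
        return !isCallback(failureFromTm, deployed);
    }

    // Whether a cancel call would be signalled under the proposal.
    static boolean proposedSendsCancel(boolean isInternalFailure) {
        return !isInternalFailure;
    }

    public static void main(String[] args) {
        System.out.println("item 1 guards agree: " + guardsAgreeEverywhere());
        boolean[] values = {false, true};
        for (boolean fromTm : values) {
            for (boolean deployed : values) {
                System.out.println("fromTm=" + fromTm + " deployed=" + deployed
                        + " current=" + currentSendsCancel(fromTm, deployed)
                        + " proposed=" + proposedSendsCancel(fromTm));
            }
        }
    }
}
```

Under these assumptions the item 1 guards agree on all inputs, and the item 3 guards differ only when the failure is not from the TM and the task is not deployed, which is the failed-deployment case where the proposal would additionally signal a cancel call.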