[ 
https://issues.apache.org/jira/browse/FLINK-18012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-18012:
-----------------------------------
    Labels: pull-request-available  (was: )

> Deactivate slot timeout if TaskSlotTable.tryMarkSlotActive is called
> --------------------------------------------------------------------
>
>                 Key: FLINK-18012
>                 URL: https://issues.apache.org/jira/browse/FLINK-18012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.9.3, 1.10.1, 1.11.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.11.0, 1.10.2, 1.9.4
>
>
> With FLINK-9932 we loosened the slot allocation protocol so that the 
> {{JobMaster}} can deploy {{Tasks}} into a slot which has not been 
> {{ACTIVATED}} but only {{ALLOCATED}} for a given job. This made it possible 
> to better handle the case where the {{JobMasterGateway#offerSlots}} response 
> was late and therefore timed out. The solution was to offer a 
> {{TaskSlotTable#tryMarkSlotActive}} method which, in contrast to 
> {{TaskSlotTable#markSlotActive}}, would not fail if the requested slot was 
> not available.
> However, the problem is that {{TaskSlotTable#tryMarkSlotActive}} does not 
> deactivate the slot timeout. Hence, if the {{offerSlots}} response never 
> arrives at the {{TaskExecutor}}, an {{ACTIVATED}} slot can time out.
> In order to fix the problem, we should also deactivate the slot timeout when 
> {{TaskSlotTable#tryMarkSlotActive}} is called.
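
The proposed behaviour can be illustrated with a minimal sketch. This is a simplified, hypothetical model and not Flink's actual {{TaskSlotTable}}: the class name {{SlotTableSketch}}, the integer slot ids, the {{allocate}}, {{onTimeout}} and {{isActive}} helpers, and the set standing in for the timer service are all assumptions made for illustration. It only shows the idea that both activation paths should cancel the pending slot timeout.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Simplified, hypothetical model of a slot table; not Flink's real implementation. */
class SlotTableSketch {
    enum State { ALLOCATED, ACTIVE }

    private final Map<Integer, State> slots = new HashMap<>();
    // Stand-in for a timer service: slots that still have a pending timeout.
    private final Set<Integer> pendingTimeouts = new HashSet<>();

    /** Allocates a slot and registers its timeout. */
    void allocate(int slotId) {
        slots.put(slotId, State.ALLOCATED);
        pendingTimeouts.add(slotId);
    }

    /** Strict variant: fails if the requested slot is not available. */
    void markSlotActive(int slotId) {
        if (!slots.containsKey(slotId)) {
            throw new IllegalStateException("Unknown slot " + slotId);
        }
        activate(slotId);
    }

    /** Lenient variant: returns false instead of failing. */
    boolean tryMarkSlotActive(int slotId) {
        if (!slots.containsKey(slotId)) {
            return false;
        }
        activate(slotId);
        return true;
    }

    /** Shared activation path, so both variants deactivate the timeout. */
    private void activate(int slotId) {
        slots.put(slotId, State.ACTIVE);
        pendingTimeouts.remove(slotId); // the step this issue asks for
    }

    /** Simulates the timeout firing; frees the slot only if the timeout is still pending. */
    boolean onTimeout(int slotId) {
        if (pendingTimeouts.remove(slotId)) {
            slots.remove(slotId);
            return true;
        }
        return false;
    }

    boolean isActive(int slotId) {
        return slots.get(slotId) == State.ACTIVE;
    }

    public static void main(String[] args) {
        SlotTableSketch table = new SlotTableSketch();
        table.allocate(0);
        table.tryMarkSlotActive(0); // the offerSlots response never arrived
        System.out.println(table.onTimeout(0)); // prints "false": timeout was cancelled
        System.out.println(table.isActive(0));  // prints "true": slot stays active
    }
}
```

Routing both variants through one private activation method is one way to keep the strict and lenient paths from diverging again. Whether the actual fix takes this shape is not stated in the issue.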



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
