[ 
https://issues.apache.org/jira/browse/FLINK-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Young updated FLINK-4427:
------------------------------
    Description: 
Currently we only have allocation logic in SlotManager / ResourceManager. For 
some batch jobs, slots whose tasks have already finished can be released, which 
should in turn trigger container release in the different cluster modes.
This should also address a problem in the new architecture: with a BLOCKING 
result partition type, the data is actually held by the TaskManager, not the 
slot. When the producer task finishes, we mark the task as finished and try to 
release the slot. In Yarn or Mesos mode, releasing the slot may trigger release 
of the container, so the TaskManager is terminated and the result data is lost. 
We should introduce a mechanism to prevent this from happening.

  was:Currently we only have allocation logic in SlotManager / 
ResourceManager. For some batch jobs, slots whose tasks have already finished 
can be released, which should in turn trigger container release in the 
different cluster modes.


> Add slot / container releasing logic to SlotManager (Standalone / Yarn / 
> Mesos)
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-4427
>                 URL: https://issues.apache.org/jira/browse/FLINK-4427
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>            Reporter: Kurt Young
>
> Currently we only have allocation logic in SlotManager / ResourceManager. 
> For some batch jobs, slots whose tasks have already finished can be released, 
> which should in turn trigger container release in the different cluster modes.
> This should also address a problem in the new architecture: with a BLOCKING 
> result partition type, the data is actually held by the TaskManager, not the 
> slot. When the producer task finishes, we mark the task as finished and try 
> to release the slot. In Yarn or Mesos mode, releasing the slot may trigger 
> release of the container, so the TaskManager is terminated and the result 
> data is lost. We should introduce a mechanism to prevent this from happening.
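
To make the failure mode concrete, here is a minimal sketch of one possible guard, not an actual Flink implementation. It assumes a hypothetical ContainerReleaseGuard that tracks, per TaskManager, both the slots still in use and the BLOCKING result partitions not yet consumed. All class and method names below are invented for illustration and do not exist in Flink.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch (not Flink API): a container may only be released once
 * its TaskManager has no slots in use AND holds no unconsumed blocking
 * result partitions.
 */
final class ContainerReleaseGuard {
    // Number of slots currently in use, keyed by an assumed TaskManager id.
    private final Map<String, Integer> activeSlots = new HashMap<>();
    // Blocking partitions produced on a TaskManager but not yet consumed.
    private final Map<String, Set<String>> unconsumedPartitions = new HashMap<>();

    void slotAllocated(String taskManagerId) {
        activeSlots.merge(taskManagerId, 1, Integer::sum);
    }

    void blockingPartitionProduced(String taskManagerId, String partitionId) {
        unconsumedPartitions
                .computeIfAbsent(taskManagerId, id -> new HashSet<>())
                .add(partitionId);
    }

    void partitionConsumed(String taskManagerId, String partitionId) {
        Set<String> partitions = unconsumedPartitions.get(taskManagerId);
        if (partitions != null) {
            partitions.remove(partitionId);
            if (partitions.isEmpty()) {
                unconsumedPartitions.remove(taskManagerId);
            }
        }
    }

    /** Frees one slot and reports whether the container may now be released. */
    boolean slotReleased(String taskManagerId) {
        // Decrement the count; returning null drops the entry at zero.
        activeSlots.computeIfPresent(taskManagerId, (id, n) -> n > 1 ? n - 1 : null);
        return canReleaseContainer(taskManagerId);
    }

    boolean canReleaseContainer(String taskManagerId) {
        return !activeSlots.containsKey(taskManagerId)
                && !unconsumedPartitions.containsKey(taskManagerId);
    }
}
```

With this sketch, releasing the producer task's slot returns false while its blocking partition is still unconsumed, so the Yarn or Mesos container would be kept alive. Only after partitionConsumed is called for the last partition does canReleaseContainer return true. Whether this bookkeeping belongs in the SlotManager, the ResourceManager, or elsewhere is left open here.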



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
