[
https://issues.apache.org/jira/browse/FLINK-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997961#comment-15997961
]
Till Rohrmann commented on FLINK-6434:
--------------------------------------
Thanks for reporting the issue [~tiemsn]. This sounds like a bug and should be
fixed.
I think we could solve it in the following way: we generate the {{AllocationID}}
in {{ProviderAndOwner#allocateSlot}} and pass it to
{{SlotPoolGateway#allocateSlot}}. On the returned future we register an
exception handler which calls {{SlotPoolGateway#failAllocation}} with the
generated {{AllocationID}}. That way we should be able to deal with timeouts on
the {{Execution}} side. What do you think?
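
A minimal sketch of this idea, to make the flow concrete. The nested classes are hypothetical stand-ins that only borrow the names used above; they are not Flink's actual {{SlotPool}} code, the slot is represented as a plain String, and it is an assumption of the sketch that {{failAllocation}} removes the pending request for the given id.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only, not Flink's real classes.
class SlotAllocationSketch {

    /** Stand-in for AllocationID: an opaque id, compared by identity here. */
    static final class AllocationID {
        private final UUID id = UUID.randomUUID();

        @Override
        public String toString() {
            return id.toString();
        }
    }

    /** Stand-in for the slot pool side (SlotPoolGateway in the comment above). */
    static final class SlotPool {
        // Pending requests, keyed by the id supplied by the caller.
        final Map<AllocationID, CompletableFuture<String>> pendingRequests =
                new ConcurrentHashMap<>();

        CompletableFuture<String> allocateSlot(AllocationID allocationId) {
            CompletableFuture<String> future = new CompletableFuture<>();
            pendingRequests.put(allocationId, future);
            return future;
        }

        // Assumed behaviour: drop the pending request so that a slot
        // registering later cannot fulfill it.
        void failAllocation(AllocationID allocationId, Throwable cause) {
            CompletableFuture<String> pending = pendingRequests.remove(allocationId);
            if (pending != null) {
                // No-op if the future has already completed.
                pending.completeExceptionally(cause);
            }
        }
    }

    /** Stand-in for ProviderAndOwner: creates the id and wires up the handler. */
    static final class ProviderAndOwner {
        private final SlotPool slotPool;

        ProviderAndOwner(SlotPool slotPool) {
            this.slotPool = slotPool;
        }

        CompletableFuture<String> allocateSlot() {
            // The id is generated on the caller side, so it is known even if
            // the request itself fails or times out.
            AllocationID allocationId = new AllocationID();
            CompletableFuture<String> slotFuture = slotPool.allocateSlot(allocationId);
            // On any failure (for example a timeout), tell the pool to fail
            // the allocation with the same id.
            slotFuture.whenComplete((slot, failure) -> {
                if (failure != null) {
                    slotPool.failAllocation(allocationId, failure);
                }
            });
            return slotFuture;
        }
    }
}
```

In this sketch a timeout can be simulated by completing the returned future exceptionally with a {{TimeoutException}}; the handler then removes the matching entry from {{pendingRequests}}, so a slot that registers afterwards has no stale request to fulfill.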
> There may be allocatedSlots leak in SlotPool
> --------------------------------------------
>
> Key: FLINK-6434
> URL: https://issues.apache.org/jira/browse/FLINK-6434
> Project: Flink
> Issue Type: Bug
> Components: Cluster Management
> Reporter: shuai.xu
> Assignee: shuai.xu
> Labels: flip-6
>
> If the allocateSlot() call from Execution to SlotPool times out, the job will
> begin to fail over, but the pending request remains in the SlotPool. If a
> new slot then registers with the SlotPool, it may fulfill the outdated pending
> request and be added to allocatedSlots, but it will never be used and will
> never be recycled.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)