Till Rohrmann created FLINK-9908:
------------------------------------
Summary: Inconsistent state of SlotPool after ExecutionGraph cancellation
Key: FLINK-9908
URL: https://issues.apache.org/jira/browse/FLINK-9908
Project: Flink
Issue Type: Bug
Affects Versions: 1.5.1, 1.6.0, 1.7.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Fix For: 1.5.2, 1.6.0, 1.7.0
If the {{ExecutionGraph}} is scheduled and cancelled concurrently, requested
{{Slots}} may not be properly returned to the {{SlotPool}}. This leaves the
{{SlotPool}} in an inconsistent state: it believes that some of its slots are
still occupied even though the respective {{Execution}} has already been
cancelled.
The problem seems to be caused by propagating the cancellation of the overall
scheduling future to the individual scheduling futures. If an individual
scheduling future is cancelled, the callback that produces its value and also
handles the failure case is never called, so a slot that is assigned afterwards
is presumably never handed back.
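The effect described above can be reproduced in isolation. The following is a minimal sketch, assuming the scheduling futures behave like plain {{java.util.concurrent.CompletableFuture}} instances; the names {{slotRequest}}, {{scheduling}} and {{CancelledCallbackDemo}} are hypothetical and are not taken from the Flink code base. It cancels a dependent future before its source completes and then checks whether the {{handle}} callback ran.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration only: not Flink code. All names here are hypothetical.
final class CancelledCallbackDemo {

    // Returns true if the handle() callback ran even though the dependent
    // future had been cancelled before its source completed.
    static boolean callbackRanAfterCancel() {
        // Stands in for a pending slot request.
        CompletableFuture<String> slotRequest = new CompletableFuture<>();
        AtomicBoolean callbackRan = new AtomicBoolean(false);

        // Stands in for an individual scheduling future. Its callback both
        // produces the value and would handle the failure case
        // (for example by returning the slot).
        CompletableFuture<String> scheduling =
                slotRequest.handle((slot, failure) -> {
                    callbackRan.set(true);
                    return slot;
                });

        // Cancellation is propagated to the individual future first ...
        scheduling.cancel(false);

        // ... and only afterwards is the slot request fulfilled.
        slotRequest.complete("slot-1");

        return callbackRan.get();
    }

    public static void main(String[] args) {
        System.out.println("callback ran: " + callbackRanAfterCancel());
    }
}
```

In this sketch the dependent future is already completed (as cancelled) when the slot arrives, so the callback is skipped and nothing gets a chance to return the slot. One possible way around this, again only as an assumption about the eventual fix, is to attach the cleanup logic to the slot request future itself rather than to a future that may be cancelled from outside.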