Till Rohrmann created FLINK-9635:
------------------------------------
Summary: Local recovery scheduling can cause spread out of tasks
Key: FLINK-9635
URL: https://issues.apache.org/jira/browse/FLINK-9635
Project: Flink
Issue Type: Bug
Components: Distributed Coordination
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Fix For: 1.6.0, 1.5.1
In order to make local recovery work, Flink's scheduling was changed such that
each task tries to be rescheduled to its previous location. In order to not
occupy slots which have state of other tasks cached, the strategy will request
a new slot if the old slot identified by the previous allocation id is no
longer present. This also applies to newly allocated slots, because the
strategy does not distinguish between new and already used slots. As a result,
every task can end up being deployed to its own slot, for example if the
{{SlotPool}} has released all slots in the meantime.
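
The behaviour described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions and not Flink's actual scheduler code: {{ToySlotPool}}, {{Task}}, {{schedule}} and the allocation ids such as "alloc-A" are hypothetical names invented for this example. It assumes three tasks that previously shared one slot and a strategy that only accepts the slot matching the previous allocation id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalRecoverySpreadSketch {

    /** Hypothetical task that remembers the allocation id of its previous slot. */
    static final class Task {
        final String name;
        final String previousAllocationId;

        Task(String name, String previousAllocationId) {
            this.name = name;
            this.previousAllocationId = previousAllocationId;
        }
    }

    /** Toy stand-in for a slot pool: allocation id -> names of tasks placed in that slot. */
    static final class ToySlotPool {
        private final Map<String, List<String>> slots = new LinkedHashMap<>();
        private int nextId = 0;

        void addExistingSlot(String allocationId) {
            slots.put(allocationId, new ArrayList<>());
        }

        /** Newly allocated slots get a fresh id, so they never match a previous allocation id. */
        String allocateNewSlot() {
            String id = "new-" + nextId++;
            slots.put(id, new ArrayList<>());
            return id;
        }

        boolean contains(String allocationId) {
            return slots.containsKey(allocationId);
        }

        void assign(String allocationId, String taskName) {
            slots.get(allocationId).add(taskName);
        }

        /** Number of slots that host at least one task. */
        int usedSlots() {
            int used = 0;
            for (List<String> tasks : slots.values()) {
                if (!tasks.isEmpty()) {
                    used++;
                }
            }
            return used;
        }
    }

    /**
     * Strict previous-location strategy: reuse the slot with the previous
     * allocation id if it is still present, otherwise request a new slot.
     * Slots requested by other tasks in the same round are never considered.
     */
    static int schedule(ToySlotPool pool, List<Task> tasks) {
        for (Task task : tasks) {
            String target = pool.contains(task.previousAllocationId)
                    ? task.previousAllocationId
                    : pool.allocateNewSlot();
            pool.assign(target, task.name);
        }
        return pool.usedSlots();
    }

    /** Three tasks that previously shared the (hypothetical) slot "alloc-A". */
    static List<Task> exampleTasks() {
        return Arrays.asList(
                new Task("source-0", "alloc-A"),
                new Task("map-0", "alloc-A"),
                new Task("sink-0", "alloc-A"));
    }

    public static void main(String[] args) {
        ToySlotPool intact = new ToySlotPool();
        intact.addExistingSlot("alloc-A");
        System.out.println("slots used, old slot present: " + schedule(intact, exampleTasks()));

        ToySlotPool released = new ToySlotPool(); // all slots were released in the meantime
        System.out.println("slots used, old slot released: " + schedule(released, exampleTasks()));
    }
}
```

In this model the three tasks share one slot while "alloc-A" is still present, but occupy three separate slots once the pool has released it, because the freshly allocated slots are treated like any other slot that does not match the previous allocation id.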
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)