Stephan Ewen created FLINK-7231:
-----------------------------------

             Summary: SlotSharingGroups are not always released in time for new restarts
                 Key: FLINK-7231
                 URL: https://issues.apache.org/jira/browse/FLINK-7231
             Project: Flink
          Issue Type: Bug
          Components: Distributed Coordination
    Affects Versions: 1.3.1
            Reporter: Stephan Ewen
            Assignee: Stephan Ewen
             Fix For: 1.4.0, 1.3.2

When there are not enough resources to schedule the streaming program, a race condition can lead to a sequence of the following errors:

{code}
java.lang.IllegalStateException: SlotSharingGroup cannot clear task assignment, group still has allocated resources.
{code}

The job eventually recovers, but may go through many fast restart attempts before doing so.

The root cause is that slots are not cleared before the next restart attempt.
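
The ordering problem described above can be illustrated with a small self-contained sketch. This is not Flink code: `SharingGroup`, `restartImmediately`, and `restartAfterRelease` are hypothetical names made up for illustration, and only the exception message is taken from the report. It assumes that slot release is signalled through a future, and shows that clearing the task assignment before that future completes fails, while chaining the clear onto the release succeeds.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

public class SlotReleaseSketch {

    /** Hypothetical stand-in for a slot sharing group; not Flink's actual class. */
    static final class SharingGroup {
        private final Set<String> allocatedSlots = new HashSet<>();

        synchronized void allocate(String slotId) {
            allocatedSlots.add(slotId);
        }

        synchronized void release(String slotId) {
            allocatedSlots.remove(slotId);
        }

        /** Mirrors the failure in the report: clearing is illegal while slots are held. */
        synchronized void clearTaskAssignment() {
            if (!allocatedSlots.isEmpty()) {
                throw new IllegalStateException(
                        "SlotSharingGroup cannot clear task assignment, "
                                + "group still has allocated resources.");
            }
        }
    }

    /** Racy variant: clears right away, whether or not the slots were released. */
    static boolean restartImmediately(SharingGroup group) {
        try {
            group.clearTaskAssignment();
            return true;
        } catch (IllegalStateException e) {
            return false; // in the reported scenario this triggers another fast restart
        }
    }

    /** Ordered variant: the clear only runs once the release future has completed. */
    static CompletableFuture<Boolean> restartAfterRelease(
            SharingGroup group, CompletableFuture<Void> slotsReleased) {
        return slotsReleased.thenApply(ignored -> {
            group.clearTaskAssignment();
            return true;
        });
    }

    public static void main(String[] args) {
        SharingGroup group = new SharingGroup();
        group.allocate("slot-1");
        CompletableFuture<Void> slotsReleased = new CompletableFuture<>();

        // The restart fires while the slot is still allocated.
        System.out.println("immediate restart ok: " + restartImmediately(group));

        // The restart is chained onto the release, so it waits for the slot.
        CompletableFuture<Boolean> restart = restartAfterRelease(group, slotsReleased);
        group.release("slot-1");
        slotsReleased.complete(null);
        System.out.println("ordered restart ok: " + restart.join());
    }
}
```

In this sketch the immediate restart fails and the ordered one succeeds. Whether the actual fix sequences the restart this way is not stated in this report; the sketch only shows why the order of release and clear matters.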