Lijie Wang created FLINK-28144:
----------------------------------

             Summary: Let JobMaster support blocklist mechanism
                 Key: FLINK-28144
                 URL: https://issues.apache.org/jira/browse/FLINK-28144
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.16.0
            Reporter: Lijie Wang

SlotPool should avoid allocating slots that are located on blocked nodes. To do that, the core idea is to maintain the following invariant: the SlotPool never holds a slot that is both free (no task assigned) and located on a blocked node. Details are as follows:

1. When receiving slot offers from task managers located on blocked nodes, all offers should be rejected.
2. When a node is newly blocked, we should release all free (no task assigned) slots on it. We need to find all task managers on blocked nodes and release all free slots on them via SlotPoolService#releaseFreeSlotsOnTaskManager.
3. When a slot's state changes from reserved (task assigned) to free (no task assigned), we should check whether the corresponding task manager is blocked. If so, the slot should be released.
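
The three rules above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not Flink's actual SlotPool: the class ToySlotPool, its string ids, and all of its methods are hypothetical, and releaseFreeSlotsOnTaskManager here merely borrows the name of the SlotPoolService method without reproducing its real signature.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of a slot pool. All names are hypothetical, not Flink classes. */
class ToySlotPool {
    /** Slot id -> id of the task manager hosting the slot. */
    private final Map<String, String> slotToTaskManager = new HashMap<>();
    /** Task manager id -> id of the node it runs on. */
    private final Map<String, String> taskManagerToNode = new HashMap<>();
    /** Slots that currently have a task assigned. */
    private final Set<String> reservedSlots = new HashSet<>();
    private final Set<String> blockedNodes = new HashSet<>();

    void registerTaskManager(String tmId, String nodeId) {
        taskManagerToNode.put(tmId, nodeId);
    }

    /** A task manager counts as blocked if its node is blocked. */
    boolean isBlocked(String tmId) {
        return blockedNodes.contains(taskManagerToNode.get(tmId));
    }

    /** Rule 1: reject offers from task managers on blocked nodes. */
    boolean offerSlot(String slotId, String tmId) {
        if (isBlocked(tmId)) {
            return false;
        }
        slotToTaskManager.put(slotId, tmId);
        return true;
    }

    /** Rule 2: when a node is newly blocked, release the free slots on it. */
    void blockNode(String nodeId) {
        blockedNodes.add(nodeId);
        for (Map.Entry<String, String> e : taskManagerToNode.entrySet()) {
            if (e.getValue().equals(nodeId)) {
                releaseFreeSlotsOnTaskManager(e.getKey());
            }
        }
    }

    /** Drops every slot on the task manager that has no task assigned. */
    void releaseFreeSlotsOnTaskManager(String tmId) {
        slotToTaskManager.entrySet().removeIf(
                e -> e.getValue().equals(tmId) && !reservedSlots.contains(e.getKey()));
    }

    /** Marks a known slot as reserved (task assigned). */
    void reserveSlot(String slotId) {
        if (slotToTaskManager.containsKey(slotId)) {
            reservedSlots.add(slotId);
        }
    }

    /** Rule 3: a slot going from reserved to free is released if its task manager is blocked. */
    void freeReservedSlot(String slotId) {
        if (!reservedSlots.remove(slotId)) {
            return; // the slot was not reserved, nothing changes
        }
        if (isBlocked(slotToTaskManager.get(slotId))) {
            slotToTaskManager.remove(slotId);
        }
    }

    boolean containsSlot(String slotId) {
        return slotToTaskManager.containsKey(slotId);
    }
}
```

In this model, blocking a node removes only its free slots. A reserved slot on that node stays in the pool until its task finishes, at which point rule 3 releases it, so the pool never ends up holding a free slot on a blocked node.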