[ https://issues.apache.org/jira/browse/FLINK-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weijie Guo updated FLINK-20865:
-------------------------------
    Affects Version/s: 2.1.0

> Prevent potential resource deadlock in fine-grained resource management
> -----------------------------------------------------------------------
>
>                 Key: FLINK-20865
>                 URL: https://issues.apache.org/jira/browse/FLINK-20865
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 2.1.0
>            Reporter: Yangze Guo
>            Priority: Minor
>              Labels: auto-deprioritized-major
>             Fix For: 2.0.0
>
>         Attachments: 屏幕快照 2021-01-06 下午2.32.57.png
>
>
> !屏幕快照 2021-01-06 下午2.32.57.png|width=954,height=288!
> The above figure demonstrates a potential deadlock caused by a scheduling
> dependency. For the given topology, the scheduler initially requests 4
> slots, for A, B, C and D. Assume only 2 slots are available. If both slots
> are assigned to Pipeline Region 0 (as shown on the left), A and B finish
> execution first, then C and D are executed, and finally E is executed.
> However, if the 2 slots are initially assigned to A and C (as shown on the
> right), then neither A nor C can finish execution, because B and D, which
> consume the data they produce, have no slots to run in.
> Currently, with coarse-grained resource management, the scheduler guarantees
> that it always finishes fulfilling the requirements of one pipeline region
> before it starts fulfilling the requirements of another. That means the
> deadlock case shown on the right of the above figure can never happen.
> However, there’s no such guarantee in fine-grained resource management.
> Since resource requirements for slot sharing groups (SSGs) can differ,
> there’s no control over which requirements are fulfilled first when there
> are not enough resources to fulfill all of them. Therefore, it’s not always
> possible to fulfill one pipeline region prior to another.
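The deadlock condition described above can be checked with a small model. This is only a sketch, not Flink code: the topology is inferred from the text (the figure is not reproduced here), with pipelined edges A -> B and C -> D assumed, and the names `PIPELINE_REGIONS`, `can_make_progress` and `deadlocking_assignments` are hypothetical. It also assumes a pipeline region can only run once every task in it holds a slot.

```python
from itertools import combinations

# Assumed topology, inferred from the issue text: A and B form one pipeline
# region, C and D form another. E is omitted because it is only requested
# after both upstream regions have finished.
PIPELINE_REGIONS = {
    "region0": frozenset({"A", "B"}),
    "region1": frozenset({"C", "D"}),
}


def can_make_progress(assigned_tasks):
    """Return True if at least one pipeline region has a slot for every task.

    Simplifying assumption: a region runs only when all of its tasks are
    deployed, since pipelined producers need their consumers to be running.
    """
    assigned = set(assigned_tasks)
    return any(tasks <= assigned for tasks in PIPELINE_REGIONS.values())


def deadlocking_assignments(tasks, available_slots):
    """List every way to hand out the slots that leaves no region runnable."""
    return [
        set(choice)
        for choice in combinations(sorted(tasks), available_slots)
        if not can_make_progress(choice)
    ]


if __name__ == "__main__":
    print(can_make_progress({"A", "B"}))  # left-hand case in the figure
    print(can_make_progress({"A", "C"}))  # right-hand case in the figure
    print(deadlocking_assignments("ABCD", 2))
```

Under these assumptions, only the assignments that cover a whole region ({A, B} or {C, D}) can run; every assignment that splits the 2 slots across regions stalls.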
> To solve this problem, for fine-grained resource management we can make the
> scheduler defer requesting slots for other SSGs until the requirements of
> the current SSG are fulfilled, at the price of longer scheduling time.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)