[jira] [Commented] (FLINK-34588) FineGrainedSlotManager checks whether resources need to reconcile but doesn't act on the result
[ https://issues.apache.org/jira/browse/FLINK-34588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825261#comment-17825261 ]

Matthias Pohl commented on FLINK-34588:
---

Ok, thanks for clarification. I might add this information as comments to my FLINK-34427 PR. (y)

> FineGrainedSlotManager checks whether resources need to reconcile but doesn't act on the result
> ---
>
> Key: FLINK-34588
> URL: https://issues.apache.org/jira/browse/FLINK-34588
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.19.0, 1.18.1, 1.20.0
> Reporter: Matthias Pohl
> Priority: Major
>
> There are a few locations in {{FineGrainedSlotManager}} where we check whether resources can/need to be reconciled but don't care about the result and just trigger the resource update (e.g. in [FineGrainedSlotManager:626|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java#L626] and [FineGrainedSlotManager:682|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java#L682]). Looks like we could reduce the calls to the backend here.
> It's not having a major impact because this feature is only used in the {{ActiveResourceManager}} which triggers [checkResourceDeclarations|https://github.com/apache/flink/blob/c678244a3890273145a786b9e1bf1a4f96f6dcfd/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/active/ActiveResourceManager.java#L331] and reevaluates the {{resourceDeclarations}}. Not sure whether I missed something here and there's actually a bigger issue with it. But considering that nobody complained about it in the past, I'd assume that it's not a severe issue.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34588) FineGrainedSlotManager checks whether resources need to reconcile but doesn't act on the result
[ https://issues.apache.org/jira/browse/FLINK-34588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824616#comment-17824616 ]

Weihua Hu commented on FLINK-34588:
---

Thanks [~mapohl] for reporting this.

Originally, the function `checkResourcesNeedReconcile` was called `checkTaskManagerReleasable` and was only responsible for releasing idle task managers. So we only cared about the result of `checkTaskManagerReleasable` in the release path ([Line 816|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java#L816]).

In [FLINK-32880|https://issues.apache.org/jira/browse/FLINK-32880], we renamed it to `checkResourcesNeedReconcile` and let it also check whether redundant task managers need to be allocated.

There are now two functions that allocate/release task managers:
`checkResourcesNeedReconcile`: allocates redundant task managers and releases idle task managers
`checkResourceRequirements`: allocates task managers for job requirements

So, in the periodic check in `checkClusterReconciliation`, we take the result of `checkResourcesNeedReconcile` into account because we don't try to fulfill the job requirements there. In the other places we ignore the result of `checkResourcesNeedReconcile` because `checkResourceRequirements` may also allocate/release task managers.
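To make the distinction in Weihua Hu's comment above easier to follow, here is a minimal sketch of the two call paths. It is a toy model written for this thread, not Flink's actual `FineGrainedSlotManager`: the class `ReconcileSketch`, its counter fields, the `backendCalls` list, and the methods `declareNeededResources` and `onRequirementsChanged` are all hypothetical stand-ins. Only the names `checkResourcesNeedReconcile`, `checkResourceRequirements` and `checkClusterReconciliation` come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the two paths described in the comment; NOT Flink's real code. */
final class ReconcileSketch {
    // Hypothetical counters standing in for the slot manager's real bookkeeping.
    int idleTaskManagers;
    int missingRedundantTaskManagers;
    int unfulfilledJobRequirements;

    // Hypothetical log of resource declarations sent to the backend.
    final List<String> backendCalls = new ArrayList<>();

    /** Handles redundant and idle task managers; returns true if anything changed. */
    boolean checkResourcesNeedReconcile() {
        boolean changed = idleTaskManagers > 0 || missingRedundantTaskManagers > 0;
        idleTaskManagers = 0;
        missingRedundantTaskManagers = 0;
        return changed;
    }

    /** Handles task managers needed for job requirements (modelled as a counter reset). */
    void checkResourceRequirements() {
        unfulfilledJobRequirements = 0;
    }

    /** Hypothetical stand-in for pushing the declared resources to the backend. */
    void declareNeededResources(String caller) {
        backendCalls.add(caller);
    }

    /** Periodic path: job requirements are not evaluated, so the result decides. */
    void checkClusterReconciliation() {
        if (checkResourcesNeedReconcile()) {
            declareNeededResources("periodic");
        }
    }

    /** Requirement path: the declaration happens regardless of the reconcile result. */
    void onRequirementsChanged() {
        checkResourceRequirements();
        checkResourcesNeedReconcile(); // result ignored: the step above may have changed things too
        declareNeededResources("requirements");
    }

    public static void main(String[] args) {
        ReconcileSketch s = new ReconcileSketch();
        s.checkClusterReconciliation();   // nothing to reconcile, no backend call
        s.idleTaskManagers = 1;
        s.checkClusterReconciliation();   // idle task manager released, one backend call
        s.unfulfilledJobRequirements = 2;
        s.onRequirementsChanged();        // always declares
        System.out.println(s.backendCalls); // prints [periodic, requirements]
    }
}
```

In this model the requirement path could only skip the backend call if `checkResourceRequirements` also reported whether it changed anything. As far as this thread shows, that is the reason the result is ignored there.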
[jira] [Commented] (FLINK-34588) FineGrainedSlotManager checks whether resources need to reconcile but doesn't act on the result
[ https://issues.apache.org/jira/browse/FLINK-34588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824278#comment-17824278 ]

Matthias Pohl commented on FLINK-34588:
---

Sorry for that. I updated the links. They should work now.

For the record: This was also just observed in a code review. I'm not aware of any actual issues that arise from this.
[jira] [Commented] (FLINK-34588) FineGrainedSlotManager checks whether resources need to reconcile but doesn't act on the result
[ https://issues.apache.org/jira/browse/FLINK-34588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824082#comment-17824082 ]

Gyula Fora commented on FLINK-34588:

The links in the description don't seem to work :/

> FineGrainedSlotManager checks whether resources need to reconcile but doesn't act on the result
> ---
>
> Key: FLINK-34588
> URL: https://issues.apache.org/jira/browse/FLINK-34588
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.19.0, 1.18.1, 1.20.0
> Reporter: Matthias Pohl
> Priority: Major
>
> There are a few locations in {{FineGrainedSlotManager}} where we check whether resources can/need to be reconciled but don't care about the result and just trigger the resource update (e.g. in [FineGrainedSlotManager:620|https://github.com/apache/flink/blob/c0d3e495f4c2316a80f251de77b05b943b5be1f8/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java#L620] and [FineGrainedSlotManager:676|https://github.com/apache/flink/blob/c0d3e495f4c2316a80f251de77b05b943b5be1f8/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java#L676]). Looks like we could reduce the calls to the backend here.
> It's not having a major impact because this feature is only used in the {{ActiveResourceManager}} which triggers [checkResourceDeclarations|https://github.com/apache/flink/blob/c678244a3890273145a786b9e1bf1a4f96f6dcfd/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/active/ActiveResourceManager.java#L331] and reevaluates the {{resourceDeclarations}}. Not sure whether I missed something here and there's actually a bigger issue with it. But considering that nobody complained about it in the past, I'd assume that it's not a severe issue.
[jira] [Commented] (FLINK-34588) FineGrainedSlotManager checks whether resources need to reconcile but doesn't act on the result
[ https://issues.apache.org/jira/browse/FLINK-34588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824061#comment-17824061 ]

Matthias Pohl commented on FLINK-34588:
---

cc [~huwh]