[
https://issues.apache.org/jira/browse/FLINK-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960495#comment-16960495
]
Zhu Zhu commented on FLINK-14314:
---------------------------------
[~trohrmann] You are right, we can keep the bookkeeping and checking logic.
In that case, for a shared slot, we still allocate the {{SingleTaskSlot}} with
the task {{ResourceProfile}} but allocate the physical slot with the group
{{ResourceProfile}}.
To support this, we can add a {{groupResourceProfile}} to {{SlotProfile}} and
rename the existing {{ResourceProfile}} in it to {{taskResourceProfile}}.
Besides that, I think we can change {{checkOversubscriptionAndReleaseChildren}}
to no longer tolerate partial fulfillment and instead always release the entire
{{MultiTaskSlot}} if it fails to fulfill all of its sub-slots' resource
requests, because partial fulfillment would indicate a bug once FLINK-14314 is
done.
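
To make the idea concrete, here is a rough sketch of the two changes: a slot
profile that carries both resource profiles, and an all-or-nothing
oversubscription check. All class and method names below ({{CpuProfile}},
{{TwoLevelSlotProfile}}, {{SharedSlotSketch}}, {{fulfillOrReleaseAll}}) are
hypothetical, simplified stand-ins and not the actual Flink classes; resources
are reduced to CPU cores only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ResourceProfile; models CPU cores only.
final class CpuProfile {
    final double cpuCores;

    CpuProfile(double cpuCores) {
        this.cpuCores = cpuCores;
    }

    CpuProfile merge(CpuProfile other) {
        return new CpuProfile(cpuCores + other.cpuCores);
    }

    // True if this profile offers at least what is required.
    boolean isMatching(CpuProfile required) {
        return cpuCores >= required.cpuCores;
    }
}

// Hypothetical stand-in for SlotProfile carrying both profiles.
final class TwoLevelSlotProfile {
    final CpuProfile taskResourceProfile;  // used for the single task slot
    final CpuProfile groupResourceProfile; // used for the physical slot

    TwoLevelSlotProfile(CpuProfile task, CpuProfile group) {
        this.taskResourceProfile = task;
        this.groupResourceProfile = group;
    }
}

// Hypothetical stand-in for a multi task slot with bookkeeping kept in place.
final class SharedSlotSketch {
    private final List<CpuProfile> children = new ArrayList<>();
    private boolean released;

    void addChild(TwoLevelSlotProfile profile) {
        children.add(profile.taskResourceProfile);
    }

    int childCount() {
        return children.size();
    }

    boolean isReleased() {
        return released;
    }

    // All-or-nothing check: if the physical slot cannot cover the sum of all
    // child requests, release every child instead of keeping a subset.
    boolean fulfillOrReleaseAll(CpuProfile physicalSlot) {
        CpuProfile requested = new CpuProfile(0.0);
        for (CpuProfile child : children) {
            requested = requested.merge(child);
        }
        if (physicalSlot.isMatching(requested)) {
            return true;
        }
        children.clear();
        released = true;
        return false;
    }
}
```

If the physical slot is always requested with the group profile, the release
branch should never be taken, which is why hitting it would point to a bug.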
> Allocate shared slot resources respecting the resources of all vertices in
> the group
> ------------------------------------------------------------------------------------
>
> Key: FLINK-14314
> URL: https://issues.apache.org/jira/browse/FLINK-14314
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.10.0
> Reporter: Zhu Zhu
> Priority: Major
> Fix For: 1.10.0
>
>
> With FLINK-14058, it is assumed that a shared slot should be large enough to
> be used by one instance of each JobVertex in the group simultaneously.
> To support this, a shared slot's resources should be the sum of all JobVertex
> resources in the group.
> Here's the concrete proposal:
> 1. Add a {{ResourceSpec}} to {{SlotSharingGroup}}. Set it to the merge of the
> resources of all operators in the group when building the {{JobGraph}} in
> {{StreamingJobGraphGenerator}}.
> 2. Remove the resource bookkeeping logic for slot sharing, which was
> introduced in FLINK-12765, so that a shared slot allocates its physical slot
> according to the first resourceProfile it receives and performs no further
> checks on slot allocations that arrive later. The next step guarantees
> that the first resourceProfile is sufficient.
> 3. Change {{ExecutionVertex#getResourceProfile}} to return the
> {{SlotSharingGroup#resourceSpec}} if the vertex is in a slot sharing group,
> and otherwise the {{ExecutionJobVertex#resourceProfile}}.
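
As a rough illustration of steps 1 and 3 of the quoted proposal, the sketch
below merges operator resources into a group-level spec and lets a vertex
report the group spec when it belongs to a group. The names {{SpecSketch}},
{{GroupSketch}}, {{VertexSketch}} and {{effectiveSpec}} are hypothetical
stand-ins rather than the actual Flink classes, and "merge" is assumed here to
be a simple field-wise sum.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for ResourceSpec with two summable fields.
final class SpecSketch {
    final double cpuCores;
    final int heapMemoryMB;

    SpecSketch(double cpuCores, int heapMemoryMB) {
        this.cpuCores = cpuCores;
        this.heapMemoryMB = heapMemoryMB;
    }

    // Assumed merge semantics: field-wise sum.
    SpecSketch merge(SpecSketch other) {
        return new SpecSketch(
                cpuCores + other.cpuCores, heapMemoryMB + other.heapMemoryMB);
    }
}

// Hypothetical stand-in for SlotSharingGroup holding the merged spec (step 1).
final class GroupSketch {
    final SpecSketch resourceSpec;

    GroupSketch(List<SpecSketch> operatorSpecs) {
        SpecSketch merged = new SpecSketch(0.0, 0);
        for (SpecSketch spec : operatorSpecs) {
            merged = merged.merge(spec);
        }
        this.resourceSpec = merged;
    }
}

// Hypothetical stand-in for a vertex choosing which spec to report (step 3).
final class VertexSketch {
    final SpecSketch ownSpec;
    final GroupSketch group; // null if the vertex is not in a sharing group

    VertexSketch(SpecSketch ownSpec, GroupSketch group) {
        this.ownSpec = ownSpec;
        this.group = group;
    }

    // Group spec if the vertex is in a slot sharing group, else its own spec.
    SpecSketch effectiveSpec() {
        return group != null ? group.resourceSpec : ownSpec;
    }
}
```

With operator specs of (1.0 cores, 100 MB) and (2.0 cores, 200 MB) in one
group, every vertex in that group would report the merged spec, while a vertex
outside any group would keep reporting its own.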