[
https://issues.apache.org/jira/browse/FLINK-31757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711886#comment-17711886
]
Rui Fan commented on FLINK-31757:
---------------------------------
Thanks [~RocMarshal] for reporting this and [~huwh] for the feedback.
{quote}If the user allocates resources to TMs: if all TM resources are sized
according to the 5 TMs (each loading 21 subtasks), then the resources of the
remaining TMs will be wasted. If the resources are sized based on the other TMs
(each loading only one subtask), the 5 TMs' resources are insufficient, and
tasks running on them may lag.
{quote}
From this information, Flink users have 2 options:
# Set different slot sharing groups for tasks.
# Size the TM resources according to the most heavily loaded TMs to ensure performance.
Option 1 is not friendly to Flink users, and Flink SQL doesn't support setting
slot sharing groups.
Option 2 will waste some TM resources.
As I understand it, balancing the number of tasks across TMs would bring the
actual resource usage of all TMs closer together. From my side, this should be
valuable for Flink users and the Flink community.
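To make the numbers concrete, here is a rough sketch in plain Python (not Flink code). It assumes one slot per TM and a simplified model of default slot sharing in which subtask i of every vertex lands in slot i. The function names are hypothetical, and the "balanced" result is only an idealized target, not a description of any existing scheduler behavior.

```python
from collections import Counter


def default_slot_sharing(parallelisms):
    """Simplified model: subtask i of every vertex shares slot i."""
    slots = Counter()
    for parallelism in parallelisms:
        for i in range(parallelism):
            slots[i] += 1
    return slots


def balanced(parallelisms, num_tms):
    """Idealized target: spread all subtasks as evenly as possible."""
    base, extra = divmod(sum(parallelisms), num_tms)
    return [base + (1 if i < extra else 0) for i in range(num_tms)]


# Vertex A has parallelism 100; the other 20 vertices have parallelism 5.
parallelisms = [100] + [5] * 20

loads = default_slot_sharing(parallelisms)
histogram = Counter(loads.values())  # subtasks per slot -> number of slots
print(dict(histogram))               # 5 slots with 21 subtasks, 95 with 1

even = balanced(parallelisms, num_tms=100)
print(max(even), min(even))          # every TM would hold 2 subtasks
```

Under these assumptions the job has 200 subtasks in total, so an even spread would put 2 subtasks on each of the 100 TMs instead of 21 on five of them and 1 on the rest.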
Please go ahead and prepare a detailed design doc first, thanks. :)
> Optimize Flink un-balanced tasks scheduling
> -------------------------------------------
>
> Key: FLINK-31757
> URL: https://issues.apache.org/jira/browse/FLINK-31757
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Task
> Reporter: RocMarshal
> Assignee: RocMarshal
> Priority: Major
>
> Suppose we have a job with 21 {{JobVertex}} instances. The parallelism of
> vertex A is 100, and that of the others is 5. If each {{TaskManager}} has only
> one slot, then we need 100 TMs.
> There will be 5 slots with 21 subtasks each, and the other 95 will each hold
> only one subtask of A. Does this mean we have to make a trade-off between
> wasted resources and insufficient resources?
> From a resource utilization point of view, we expect all subtasks to be
> evenly distributed across the TMs.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)