[ 
https://issues.apache.org/jira/browse/FLINK-31757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711932#comment-17711932
 ] 

Rui Fan commented on FLINK-31757:
---------------------------------

Hi [~chesnay], thanks for your reply.
{quote}The obvious solution for the user is to set the parallelism to 100 for 
everything if the described issues are a problem.
{quote}
In some scenarios, setting one global parallelism for everything wastes 
resources, and a low parallelism for some tasks is the better choice. For 
example, a Flink job may have many sources, each with only 5 partitions, so a 
parallelism of 5 is enough for each source.

Alternatively, the business logic may be very complex: the Flink job has dozens 
of tasks, and the user sets a reasonable parallelism for each one according to 
its busy ratio (similar to FLIP-AutoScalar).

In general, it is common for the tasks of a job to have different 
parallelisms. In that case the resource balance is poor when the first few TMs 
run a large number of tasks while the remaining TMs run only a small number of 
tasks.
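
As a rough illustration of that imbalance, here is a small Python sketch using 
the numbers from the issue description (21 vertices, one with parallelism 100 
and twenty with parallelism 5, one slot per TM). It is only a counting model, 
not Flink's scheduler: the helper names are hypothetical, the "current" layout 
assumes that subtask i of every vertex ends up in slot i, and the "balanced" 
layout is just the even split of the total subtask count.

```python
def shared_slot_loads(parallelisms):
    """Subtasks per slot, ASSUMING subtask i of every vertex shares slot i."""
    loads = [0] * max(parallelisms)
    for p in parallelisms:
        for i in range(p):
            loads[i] += 1
    return loads


def even_loads(parallelisms, num_slots):
    """Target load per slot if all subtasks were spread as evenly as possible."""
    base, extra = divmod(sum(parallelisms), num_slots)
    return [base + 1 if i < extra else base for i in range(num_slots)]


# Vertex A has parallelism 100, the other 20 vertices have parallelism 5.
parallelisms = [100] + [5] * 20

current = shared_slot_loads(parallelisms)
ideal = even_loads(parallelisms, num_slots=100)

print(max(current), min(current))  # 21 1
print(max(ideal), min(ideal))      # 2 2
```

Under these assumptions 5 TMs carry 21 subtasks each and the other 95 carry 
one, while an even spread of the same 200 subtasks would put 2 on every TM.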

> Optimize Flink un-balanced tasks scheduling
> -------------------------------------------
>
>                 Key: FLINK-31757
>                 URL: https://issues.apache.org/jira/browse/FLINK-31757
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: RocMarshal
>            Assignee: RocMarshal
>            Priority: Major
>
> Suppose we have a job with 21 {{JobVertex}} instances. The parallelism of 
> vertex A is 100, and that of the others is 5. If each {{TaskManager}} has 
> only one slot, then we need 100 TMs.
> There will be 5 slots with 21 subtasks each, and each of the other slots 
> will hold only one subtask of A. Does this mean we have to make a trade-off 
> between wasted resources and insufficient resources?
> From a resource utilization point of view, we expect all subtasks to be 
> evenly distributed across the TMs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
