[ 
https://issues.apache.org/jira/browse/BEAM-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647630#comment-16647630
 ] 

Maximilian Michels commented on BEAM-5713:
------------------------------------------

{quote}
I added more tasks and they are all squeezed into the same slots (only 8 out of 
144 task slots are used).
{quote}

Tasks that directly depend on each other share the same task slot; that is how 
pipelining in Flink works. Adding more tasks to the pipeline therefore does not 
occupy more slots. AFAIK pipelines can get arbitrarily long.
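
To make the slot arithmetic concrete, here is a minimal sketch in plain Java (no 
Flink dependency). It assumes the documented default behavior that one subtask 
of every operator in a slot sharing group can share a slot, so a group needs as 
many slots as its highest operator parallelism. The class name, the 
requiredSlots helper, and the group name "other" are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SlotEstimate {

    /**
     * Hypothetical helper, not a Flink API. Keys are slot sharing group names,
     * values are the parallelisms of the operators in that group.
     * Each group needs max(parallelism) slots; groups do not share slots.
     */
    static int requiredSlots(Map<String, List<Integer>> parallelismByGroup) {
        int total = 0;
        for (List<Integer> parallelisms : parallelismByGroup.values()) {
            int max = 0;
            for (int p : parallelisms) {
                max = Math.max(max, p);
            }
            total += max;
        }
        return total;
    }

    public static void main(String[] args) {
        // Five operators, all with parallelism 8, all in the default group.
        Map<String, List<Integer>> job = new HashMap<>();
        job.put("default", Arrays.asList(8, 8, 8, 8, 8));
        System.out.println(requiredSlots(job)); // 8

        // Moving one operator into its own group adds that group's slots.
        job.put("default", Arrays.asList(8, 8, 8, 8));
        job.put("other", Arrays.asList(8));
        System.out.println(requiredSlots(job)); // 16
    }
}
```

Under that assumption, a longer pipeline at parallelism 8 still needs only 8 
slots, which matches the 8 out of 144 reported above.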


{quote}
The scheduling of all tasks to the same slot is consistent, distribution over 
hosts isn't. With parallelism 4, different result (multiple hosts).
{quote}

Is that consistent behavior for a parallelism of 4? I find that it depends on 
what the iterator over the task slot HashMap returns. That in turn depends on a 
number of factors, e.g. which jobs you ran before and how the TaskManagers 
registered.
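
As an illustration of why HashMap iteration order can depend on history, here 
is a small sketch using only java.util.HashMap. It is not Flink's scheduler 
code: the Integer keys merely stand in for slot identifiers, and the class and 
method names are invented. It relies on the OpenJDK behavior that a HashMap 
iterates in bucket order and keeps its enlarged table after entries are removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SlotMapOrder {

    /** Iteration order of two entries in a freshly created map. */
    static List<Integer> freshOrder() {
        Map<Integer, String> slots = new HashMap<>();
        slots.put(17, "slot-17");
        slots.put(2, "slot-2");
        return new ArrayList<>(slots.keySet());
    }

    /** Same two entries, but the map held 100 entries earlier. */
    static List<Integer> orderAfterEarlierUse() {
        Map<Integer, String> slots = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            slots.put(i, "slot-" + i); // forces the internal table to grow
        }
        slots.clear();                 // entries gone, table size stays
        slots.put(17, "slot-17");
        slots.put(2, "slot-2");
        return new ArrayList<>(slots.keySet());
    }

    public static void main(String[] args) {
        System.out.println("fresh map:       " + freshOrder());
        System.out.println("after prior use: " + orderAfterEarlierUse());
    }
}
```

Both maps hold the same two keys, inserted in the same order, yet on OpenJDK 
the iteration order should differ because the second map has a larger internal 
table. Anything that picks "the first entry the iterator returns" would 
therefore see different results depending on what happened to the map before.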

> Flink portable runner schedules all tasks of streaming job on same task 
> manager
> -------------------------------------------------------------------------------
>
>                 Key: BEAM-5713
>                 URL: https://issues.apache.org/jira/browse/BEAM-5713
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.8.0
>            Reporter: Thomas Weise
>            Assignee: Maximilian Michels
>            Priority: Major
>              Labels: portability, portability-flink
>         Attachments: Different SlotSharingGroup.png, With 
> RichParallelSourceFunction and parallelism 5.png, 
> image-2018-10-11-11-43-50-333.png, image-2018-10-11-16-20-45-221.png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The cluster has 9 task managers and 144 task slots total. A simple streaming 
> pipeline with parallelism of 8 will get all tasks scheduled on the same task 
> manager, causing the host to be fully booked and the remaining cluster idle.



