Hi Mu,
Regarding your questions:

   - The feature `spread out tasks evenly across task managers` was
   introduced in Flink 1.10.0 and backported to Flink 1.9.2, per the JIRA
   ticket [1]. That means if you configure this option in Flink 1.9.0, it
   will not take any effect. (See the config sketch after this list.)
   - Please be aware that this feature at the moment only works for
   standalone deployments (including standalone Kubernetes deployments).
   For the native Kubernetes, Yarn, and Mesos deployments, it is a known
   issue that this feature does not work as expected.
   - Regarding the scheduling behavior changes, we would need more
   information to explain this. The easiest way to provide it is probably
   to share the jobmanager log files, if you're okay with that. If you
   cannot share the logs, it would help to answer the following questions:
      - Which Flink deployment are you using? (Standalone/K8s/Yarn/Mesos)
      - How many times have you tried with and without
      `cluster.evenly-spread-out-slots`? In other words, can the described
      behaviors before and after setting `cluster.evenly-spread-out-slots`
      be stably reproduced?
      - How many TMs do you have? And how many slots does each TM have?
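
For reference, a minimal flink-conf.yaml sketch of enabling the option
(assuming Flink 1.9.2+ or 1.10.0, where it actually takes effect; the
slot count of 15 matches the setup you describe below):

    cluster.evenly-spread-out-slots: true
    taskmanager.numberOfTaskSlots: 15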


Thank you~

Xintong Song


[1] https://issues.apache.org/jira/browse/FLINK-12122

On Tue, Jul 7, 2020 at 8:33 PM Mu Kong <kong.mu....@gmail.com> wrote:

> Hi, Guo,
>
> Thanks for helping out.
>
> My application has a Kafka source with 60 subtasks (parallelism 60), and
> we have 15 task managers with 15 slots each.
>
> *Before I applied cluster.evenly-spread-out-slots,* meaning it was set to
> the default false, the "kafka source" operator had 11 subtasks allocated
> to one single task manager, while the remaining 49 subtasks of "kafka
> source" were distributed across the remaining 14 task managers.
>
> *After I set cluster.evenly-spread-out-slots to true,* the 60 subtasks of
> "kafka source" were allocated to only 4 task managers, taking all 15
> slots on each of these 4 TMs.
>
> What I thought is that this config would spread the subtasks of one
> operator more evenly among the task managers, but it seems to have packed
> them into the same task managers as much as possible.
>
> The version I'm deploying is 1.9.0.
>
> Best regards,
> Mu
>
> On Tue, Jul 7, 2020 at 7:10 PM Yangze Guo <karma...@gmail.com> wrote:
>
>> Hi, Mu,
>>
>> IIUC, cluster.evenly-spread-out-slots would fulfill your demand. Why do
>> you think it does the opposite of what you want? Do you run your job in
>> active mode? If so, cluster.evenly-spread-out-slots might not work very
>> well because there could be insufficient task managers when slots are
>> requested from the ResourceManager. This has been discussed in
>> https://issues.apache.org/jira/browse/FLINK-12122 .
>>
>>
>> Best,
>> Yangze Guo
>>
>> On Tue, Jul 7, 2020 at 5:44 PM Mu Kong <kong.mu....@gmail.com> wrote:
>> >
>> > Hi community,
>> >
>> > I'm running an application that consumes data from Kafka, processes it,
>> and then writes the data to Druid.
>> > I wonder if there is a way to allocate the source subtasks evenly across
>> the task managers to maximize the usage of their network capacity.
>> >
>> > So, for example, I have 15 task managers and I set the parallelism of
>> the Kafka source to 60, since I have 60 partitions in the Kafka topic.
>> > What I want is for the Flink cluster to put 4 Kafka source subtasks on
>> each task manager.
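>> >
>> > For context, a minimal Java sketch of that source setup (the topic name,
>> schema, and properties below are placeholders, not my exact code):
>> >
>> >     import org.apache.flink.api.common.serialization.SimpleStringSchema;
>> >     import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>> >     import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
>> >     import java.util.Properties;
>> >
>> >     StreamExecutionEnvironment env =
>> >         StreamExecutionEnvironment.getExecutionEnvironment();
>> >     Properties kafkaProps = new Properties();  // broker/group config omitted
>> >     env.addSource(new FlinkKafkaConsumer<>(
>> >             "input-topic", new SimpleStringSchema(), kafkaProps))
>> >        .name("kafka source")
>> >        .setParallelism(60);  // 60 subtasks; ideally 4 on each of the 15 TMs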
>> >
>> > Is that possible? I have gone through the documentation, and the only
>> thing I found is
>> >
>> > cluster.evenly-spread-out-slots
>> >
>> > which does exactly the opposite of what I want. It will put the subtasks
>> of the same operator onto one task manager as much as possible.
>> >
>> > So, is some kind of manual resource allocation available?
>> > Thanks in advance!
>> >
>> >
>> > Best regards,
>> > Mu
>>
>
