Re: Distributing Tasks over Task manager

Jürgen Thomann Tue, 18 Oct 2016 06:09:36 -0700

Hi Robert,

Did you already have a chance to look at it? If you need more information, just let me know.


Regards,
Jürgen

On 12.10.2016 21:12, Jürgen Thomann wrote:

Hi Robert,
Thanks for your suggestions. We are using the DataStream API, and I tried disabling operator chaining completely, but that didn't help.
I attached the plan. To add some context: it starts with a Kafka source followed by a map operation (parallelism 4). The next map is the expensive part, with a parallelism of 18; it produces a Tuple2 which is used for splitting. Starting here, the parallelism is always 2, except for the sink, which has 1. Both resulting streams have two maps, a filter, and one more map, and end with an assignTimestampsAndWatermarks. If there is a small box in the picture, it is a filter operation; otherwise the stream goes directly to a keyBy, timewindow and apply operation followed by a sink.
If one task manager contains more sub tasks of the expensive map than any other task manager, everything later in the stream runs on that same task manager. If two task managers have the same number of sub tasks, the following tasks with a parallelism of 2 are distributed over the two task managers.
It is also interesting that the task managers have 6 task slots configured and the expensive part has 6 sub tasks on one task manager, but still everything later in the flow runs on this task manager. This also happens if operator chaining is disabled.
Best,
Jürgen


On 12.10.2016 17:43, Robert Metzger wrote:
Hi Jürgen,

Are you using the DataStream or the DataSet API?
Maybe the operator chaining is causing too many operations to be "packed" into one task. Check out this documentation page:
https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_api.html#task-chaining-and-resource-groups
You could try to disable chaining completely to see if that resolves the issue (you'll probably pay for this by having more serialization overhead and network traffic).
If my suggestions don't help, can you post a screenshot of your job plan (from the web interface) here, so that we see what operations you are performing?
Regards,
Robert
