Hi Ray,
Regarding your question, "Does that say that each parallel task inside the
TaskManager talk to all parallel tasks inside the same TaskManager or to all
parallel tasks across all task managers?":

Each task talks to all of its parallel upstream and downstream tasks, both
those in the same TaskManager and those in other TaskManagers. The consumer and
producer tasks may be deployed in the same TaskManager or in different
TaskManagers.

For tasks in the same TaskManager, the local data shuffle is done directly by
memory copy, and the number of required buffers is determined by
#slots-per-TM^2.

For tasks in different TaskManagers, the remote data shuffle is done over the
network, and a single TCP connection between two TaskManagers is reused by all
of their internal tasks, so the number of required buffers is determined by
#TMs.

Considering both cases, the formula is #slots-per-TM^2 * #TMs. I hope this
helps.
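
As a rough sketch of the arithmetic (the function names and the example values
of 8 slots and 10 TaskManagers are made up for illustration; none of this is a
Flink API), the estimate above can be compared with the (#slots-per-TM * #TMs)^2
figure from the original question:

```python
def estimate_buffers(slots_per_tm: int, num_tms: int, factor: int = 1) -> int:
    """#slots-per-TM^2 * #TMs, optionally scaled by a constant factor.

    The documentation's version of the formula uses factor=4.
    """
    return slots_per_tm ** 2 * num_tms * factor


def all_to_all_buffers(slots_per_tm: int, num_tms: int) -> int:
    """(#slots-per-TM * #TMs)^2, the alternative raised in the question."""
    return (slots_per_tm * num_tms) ** 2


# Hypothetical cluster: 8 slots per TaskManager, 10 TaskManagers.
slots, tms = 8, 10
print(estimate_buffers(slots, tms))            # 8^2 * 10     = 640
print(estimate_buffers(slots, tms, factor=4))  # 8^2 * 10 * 4 = 2560
print(all_to_all_buffers(slots, tms))          # (8 * 10)^2   = 6400
```

The gap between the two grows with the number of TaskManagers, because the
estimate is linear in #TMs while the all-to-all figure is quadratic in it.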
Cheers,
Zhijiang
------------------------------------------------------------------
From: Ray Ruvinskiy <ray.ruvins...@arcticwolf.com>
Sent: Wednesday, June 7, 2017, 23:59
To: user@flink.apache.org <user@flink.apache.org>
Subject: Question regarding configuring number of network buffers
The documentation provides the formula #slots-per-TM^2 * #TMs * 4 to determine 
the number of network buffers we should configure. The documentation also says, 
“A logical network connection exists for each point-to-point exchange of data 
over the network, which typically happens at repartitioning- or broadcasting 
steps (shuffle phase). In those, each parallel task inside the TaskManager has 
to be able to talk to all other parallel tasks.” Does that say that each 
parallel task inside the TaskManager talk to all parallel tasks inside the same 
TaskManager or to all parallel tasks across all task managers? Intuitively, I 
would assume the latter, but then wouldn’t the formula for determining the 
number of network buffers be more along the lines of (#slots-per-TM * #TMs)^2? 
Thanks,
Ray
