Hi Chris,

I am not sure why you do not want to use local channel. Are there any problems 
for local channel in your case?

The root cause of local channel is determined by scheduler which schedules both 
producer and consumer tasks into the same task manager. So if you want to 
change this behaviour, it is better to change the logic or limit in scheduler 
instead of network stack.  Another simple way for your requirement is setting 
only one slot per task manager, then there would be only one task running in 
each task manager.

Best,
Zhijiang


------------------------------------------------------------------
From:Chris Miller <c...@34s.de>
Send Time:2019年1月14日(星期一) 00:18
To:dev <dev@flink.apache.org>
Subject:Disable local data transportation



Hi all, 

let's have a look at a simple Join with two DataSources and parallelism
p=5. 

The whole Job consists of 3 parts: 

1. DataSource Task 

2. Join Task 

3. DataSink Task 

In the first task, the data is provided and prepared for the Join task.
In particular each DataSource task creates a ResultPartition which is
divided into 5 subpartitions. Since 1/5 of the Join Task will be located
in the same node, one of these subpartitions does not have to be shipped
over the network. 

This one subpartition will be shipped to a LocalInputChannel (not
RemoteInputChannel) and therefore will not get in touch with the
network. 

Now I made some changes in the network part for my research and would
like them to affect all subpartitions. 

Question: 

Is there a feature build into flink to completely disable the local
stuff and send all subpartitions via network even if they have the same
location and destination? 

If not - does anyone have an idea where to tweak this? 

Thanks. 

Chris 


Reply via email to