zhijiangW commented on a change in pull request #11567:
URL: https://github.com/apache/flink/pull/11567#discussion_r416302882
##########
File path: flink-core/src/main/java/org/apache/flink/configuration/NettyShuffleEnvironmentOptions.java
##########
@@ -174,6 +173,20 @@
 			" help relieve back-pressure caused by unbalanced data distribution among the subpartitions. This value should be" +
 			" increased in case of higher round trip times between nodes and/or larger number of machines in the cluster.");
 
+	/**
+	 * The maximum number of buffers that can be used for each output subpartition.
+	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
+	public static final ConfigOption<Integer> NETWORK_MAX_BUFFERS_PER_CHANNEL =
+		key("taskmanager.network.max-buffers-per-channel")
+			.defaultValue(Integer.MAX_VALUE)
Review comment:
In theory, I think the proper default value here should avoid hurting performance while reducing the backlog of a sub-partition as much as possible.
Ideally it would work like this: if the network or local channel can consume x buffers per second, then a max backlog of (x + 1) buffers per second would be enough to satisfy the pipeline.
In other words, the consumer would never have to wait on the backlog and delay the pipeline; the backlog only needs a little headroom beyond what the consumer can drain.
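To put rough numbers on that reasoning, here is a back-of-the-envelope sketch; the figures are assumed purely for illustration, not measured:

    public class BacklogHeadroomSketch {
        public static void main(String[] args) {
            // All numbers are assumed for illustration, not taken from any benchmark.
            int x = 100;                // buffers/second the channel can consume
            int neededBacklog = x + 1;  // just enough headroom to keep the consumer busy
            // The initial default of Integer.MAX_VALUE effectively means no cap, so
            // under back-pressure the backlog can grow far beyond the drain rate,
            // inflating in-flight data without any throughput benefit.
            System.out.println("sufficient backlog ~ " + neededBacklog + " buffers/second");
        }
    }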
I would guess the default of 10 is a conservative value that should not hurt performance, but I am not sure whether we can reduce it further without dedicated experiments. Either way, a default of 10 should already greatly mitigate the problem of excessive in-flight buffers.
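For completeness, here is a minimal sketch of how a user could override this option once the PR is merged; the value 10 simply mirrors the default discussed above, and NETWORK_MAX_BUFFERS_PER_CHANNEL is the constant added in this diff:

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.NettyShuffleEnvironmentOptions;

    public class MaxBuffersPerChannelExample {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Cap each output subpartition at 10 buffers; lower values reduce
            // in-flight data under back-pressure at some risk to throughput.
            conf.setInteger(NettyShuffleEnvironmentOptions.NETWORK_MAX_BUFFERS_PER_CHANNEL, 10);
            System.out.println(conf.getInteger(NettyShuffleEnvironmentOptions.NETWORK_MAX_BUFFERS_PER_CHANNEL));
        }
    }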