Luke Hutchison created FLINK-6057:
-------------------------------------

             Summary: Better default needed for num network buffers
                 Key: FLINK-6057
                 URL: https://issues.apache.org/jira/browse/FLINK-6057
             Project: Flink
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.2.0
            Reporter: Luke Hutchison


Using the default environment,

{code}
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
{code}

my code will sometimes fail with an error that Flink ran out of network 
buffers. To fix this, I have to do:

{code}
int numTasks = Runtime.getRuntime().availableProcessors();
config.setInteger(ConfigConstants.DEFAULT_PARALLELISM_KEY, numTasks);
config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numTasks);
config.setInteger(ConfigConstants.TASK_MANAGER_NETWORK_NUM_BUFFERS_KEY, 
numTasks * 2048);
{code}

The default value of 2048 fails when I increase the degree of parallelism for a 
large Flink pipeline (hence the fix to set the number of buffers to numTasks * 
2048).

This is particularly problematic because a pipeline can work fine on one 
machine, and when you start the pipeline on a machine with more cores, it can 
fail.

The default execution environment should pick a saner default based on the 
level of parallelism (or whatever is needed to ensure that the number of 
network buffers is not going to be exceeded for a given execution environment).




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to