[
https://issues.apache.org/jira/browse/FLINK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967540#comment-15967540
]
ASF GitHub Bot commented on FLINK-4545:
---------------------------------------
GitHub user NicoK opened a pull request:
https://github.com/apache/flink/pull/3721
[FLINK-4545] replace the network buffers parameter
(based on #3708 and #3713)
Instead of the fixed `taskmanager.network.numberOfBuffers` setting, allow
configuring network buffer memory with the following three new (more
flexible) parameters:
* `taskmanager.network.memory.fraction`: fraction of JVM memory to use for
network buffers (default: 0.1)
* `taskmanager.network.memory.min`: minimum memory size for network buffers
(default: 64 MB)
* `taskmanager.network.memory.max`: maximum memory size for network buffers
(default: 1 GB)
Note that I had to adapt two unit tests that would otherwise have been killed
on Travis CI: these defaults result in ~150MB of memory being used for network
buffers, which apparently is too much there.
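
As a rough illustration of how the three parameters above might interact, here is a minimal Python sketch. It assumes the network buffer memory is the configured fraction of the JVM memory, clamped into the [min, max] range. The function name `network_buffer_memory_mb` and the example JVM sizes are hypothetical and not taken from the Flink code base.

```python
def network_buffer_memory_mb(jvm_memory_mb, fraction=0.1, min_mb=64, max_mb=1024):
    """Hypothetical helper: fraction of JVM memory, clamped into [min_mb, max_mb].

    Defaults mirror the values listed in the pull request description
    (fraction 0.1, min 64 MB, max 1 GB); the clamping rule is an assumption.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be in (0, 1)")
    if min_mb > max_mb:
        raise ValueError("min_mb must not exceed max_mb")
    # Take the fraction first, then enforce the lower and upper bounds.
    return min(max_mb, max(min_mb, fraction * jvm_memory_mb))


if __name__ == "__main__":
    # Illustrative JVM sizes only; none of them come from the pull request.
    for jvm_mb in (256, 1536, 20480):
        print(jvm_mb, "MB JVM ->", network_buffer_memory_mb(jvm_mb), "MB network buffers")
```

Under these assumptions, a small JVM falls back to the 64 MB minimum, a very large one is capped at 1 GB, and a 1536 MB JVM lands at roughly 153.6 MB. Passing `max_mb=80` caps the result at 80 MB, which is the kind of limit the last commit in this pull request applies to the affected tests.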
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/NicoK/flink flink-4545
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/3721.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3721
----
commit e61f7bc4debce332c421cb645ff1025b4d03d8d0
Author: Nico Kruber <[email protected]>
Date: 2017-04-11T09:26:29Z
[FLINK-6292] fix transfer.sh upload by using https
Seems the upload via http is not supported anymore.
commit 362ceec0823b179719449d0ed244c591dfcf51f4
Author: Nico Kruber <[email protected]>
Date: 2017-04-12T09:09:03Z
[FLINK-6299] make all IT cases extend from TestLogger
This way, currently executed tests and their failures are properly logged.
commit 973099ef55701fe63951639d37b4f01765b06a01
Author: Nico Kruber <[email protected]>
Date: 2017-04-06T12:41:52Z
[FLINK-4545] replace the network buffers parameter
Instead of the fixed "taskmanager.network.numberOfBuffers" setting, allow
configuring network buffer memory with the following three new (more
flexible) parameters:
* "taskmanager.network.memory.fraction": fraction of JVM memory to use for
network buffers (default: 0.1)
* "taskmanager.network.memory.min": minimum memory size for network
buffers (default: 64 MB)
* "taskmanager.network.memory.max": maximum memory size for network
buffers (default: 1 GB)
commit 09a981189b59ac13bd39000cc77913c0b03289fd
Author: Nico Kruber <[email protected]>
Date: 2017-04-11T12:20:40Z
[hotfix] fix typo in error message
commit 0960a809c8da51b9787f3f726945716933051fc3
Author: Nico Kruber <[email protected]>
Date: 2017-04-11T13:29:41Z
[hotfix] fix typo in taskmanager.sh usage string
commit 298bb69451a1405df774451de11eb5684534c956
Author: Nico Kruber <[email protected]>
Date: 2017-04-06T15:58:14Z
[FLINK-4545] adapt taskmanager.sh to take network buffers memory into
account
commit ea2fb24f4a6eb18cc3f8d3ebd83a49c0f1386a8a
Author: Nico Kruber <[email protected]>
Date: 2017-04-10T09:43:50Z
[FLINK-4545] add configuration checks for the new network buffer memory
config
commit 5133d250c4dba4a5e72baad95c841d2b03cb49ea
Author: Nico Kruber <[email protected]>
Date: 2017-04-10T16:22:10Z
[FLINK-4545] add unit tests using the new network configuration parameters
and methods
commit a24a548e6ff7e36581f7f7457099656362ca3974
Author: Nico Kruber <[email protected]>
Date: 2017-04-11T16:52:56Z
[FLINK-4545] add unit tests for heap size calculation in shell scripts
These verify that the results are the same as in the calculation done by
Java.
commit d55153d559bf110a931b5de849df812038ba4a7a
Author: Nico Kruber <[email protected]>
Date: 2017-04-12T16:11:37Z
[FLINK-4545] update the docs with the changed network buffer parameter
Also update the description of taskmanager.memory.fraction: it is not
relative to the full size of taskmanager.heap.mb, because network buffer
memory is subtracted first.
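
The changed meaning of taskmanager.memory.fraction described above can be sketched as follows. This is a simplified illustration only: the helper name `managed_memory_mb` is hypothetical, the 0.7 default is an assumption that is not stated in this message, and the actual computation in Flink may account for further details.

```python
def managed_memory_mb(heap_mb, network_mb, memory_fraction=0.7):
    """Hypothetical helper: managed memory relative to heap minus network buffers.

    heap_mb stands in for taskmanager.heap.mb, network_mb for the memory
    reserved for network buffers; memory_fraction=0.7 is an assumed default.
    """
    if network_mb >= heap_mb:
        raise ValueError("network buffer memory must be smaller than the heap size")
    if not 0.0 < memory_fraction < 1.0:
        raise ValueError("memory_fraction must be in (0, 1)")
    # Network buffer memory is subtracted before the fraction is applied.
    return memory_fraction * (heap_mb - network_mb)


if __name__ == "__main__":
    # Illustrative numbers: 1024 MB heap, 64 MB of network buffers.
    print(managed_memory_mb(1024, 64, memory_fraction=0.5))
```

With a 1024 MB heap and 64 MB of network buffers, a fraction of 0.5 is applied to the remaining 960 MB rather than to the full 1024 MB.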
commit c48beb0d67e8ef847ef845835e342d4a49127e7d
Author: Nico Kruber <[email protected]>
Date: 2017-04-12T16:25:27Z
[FLINK-4545] fix some tests being killed on Travis CI
Due to the increased defaults for network buffer memory use, some builds on
Travis CI fail with unit tests being killed. This affects
* RocksDbBackendEventTimeWindowCheckpointingITCase and
* HBaseConnectorITCase
We fix this by limiting the maximum amount of network buffer memory to 80MB
(current defaults would yield 150MB, previously 64MB were used).
----
> Flink automatically manages TM network buffer
> ---------------------------------------------
>
> Key: FLINK-4545
> URL: https://issues.apache.org/jira/browse/FLINK-4545
> Project: Flink
> Issue Type: Wish
> Components: Network
> Reporter: Zhenzhong Xu
>
> Currently, the number of network buffers per task manager is preconfigured and
> the memory is pre-allocated through the taskmanager.network.numberOfBuffers
> config. In a job DAG with a shuffle phase, this number can grow very large
> depending on the TM cluster size. The formula for calculating the buffer count
> is documented here
> (https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers).
>
> #slots-per-TM^2 * #TMs * 4
> In a standalone deployment, we may need to control the task manager cluster
> size dynamically and then leverage the upcoming Flink feature that supports
> rescaling job parallelism at runtime.
> If the buffer count config is static at runtime and cannot be changed without
> restarting the task manager process, this may add latency and complexity to
> the scaling process. I am wondering whether there has already been any
> discussion about having Flink manage the network buffers automatically, or at
> least exposing some API to allow them to be reconfigured. Let me know if there
> is any existing JIRA that I should follow.
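
To make the formula quoted in the issue description concrete, here is a small Python sketch that evaluates `#slots-per-TM^2 * #TMs * 4` and converts the result into memory. The function names are hypothetical, and the 32 KB buffer size is an assumption that does not appear in this message.

```python
def required_network_buffers(slots_per_tm, num_tms, factor=4):
    """Evaluate #slots-per-TM^2 * #TMs * 4 from the issue description."""
    if slots_per_tm < 1 or num_tms < 1:
        raise ValueError("slots_per_tm and num_tms must be positive")
    return slots_per_tm ** 2 * num_tms * factor


def buffer_memory_mb(num_buffers, buffer_size_bytes=32 * 1024):
    """Memory needed for num_buffers, assuming 32 KB per buffer."""
    return num_buffers * buffer_size_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Illustrative cluster sizes, not taken from the issue.
    for slots, tms in ((4, 10), (8, 20)):
        buffers = required_network_buffers(slots, tms)
        print(slots, "slots x", tms, "TMs ->", buffers, "buffers,",
              buffer_memory_mb(buffers), "MB")
```

Because the slot count enters quadratically, doubling the slots per task manager quadruples the buffer count, which is why a statically configured value is hard to keep in step with a cluster whose size changes at runtime.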
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)