[
https://issues.apache.org/jira/browse/FLINK-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Till Rohrmann updated FLINK-20863:
----------------------------------
Component/s: Runtime / Coordination
> Exclude network memory from ResourceProfile
> -------------------------------------------
>
> Key: FLINK-20863
> URL: https://issues.apache.org/jira/browse/FLINK-20863
> Project: Flink
> Issue Type: Task
> Components: Runtime / Coordination, Runtime / Network
> Reporter: Yangze Guo
> Priority: Major
> Fix For: 1.13.0
>
>
> Network memory is included in the current ResourceProfile implementation,
> with the expectation that fine-grained resource management will not deploy
> tasks onto a TM that together require more network memory than the TM has.
> However, how much network memory each task needs depends heavily on the
> shuffle service implementation, and may change when switching to another
> shuffle service. Therefore, neither the user nor the Flink runtime can easily
> specify network memory requirements for a task/slot at the moment.
> A concrete solution for controlling network memory is beyond the scope of
> this FLIP. However, we are aware of a few potential directions for solving
> this problem.
> - Make shuffle services adaptively control the amount of memory assigned to
> each task/slot, with respect to the given memory pool size. In this way,
> there should be no need to rely on fine-grained resource management to
> control the network memory consumption.
> - Make shuffle services expose interfaces for calculating network memory
> requirements for given SSGs. In this way, the Flink runtime can specify the
> calculated network memory requirements for slots, without having to
> understand the internal details of different shuffle service implementations.
> For now, we propose to exclude network memory from ResourceProfile, so that
> the fine-grained resource management feature is not blocked by the network
> memory control issue. If needed, it can be added back in the future,
> once there’s a good way to specify the requirement.
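
As an illustration of the second direction quoted above (shuffle services
exposing an interface for calculating network memory requirements), here is a
minimal Java sketch. Every name in it (NetworkMemoryCalculator, TaskShape,
BufferCountingCalculator) is hypothetical and not part of Flink, and the
"buffers per channel times buffer size" formula is only an assumed stand-in
for whatever a real shuffle service would compute.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: none of these types exist in Flink.
class NetworkMemorySketch {

    /** Hypothetical description of one task inside a slot sharing group (SSG). */
    public static final class TaskShape {
        final int inputChannels;
        final int outputSubpartitions;

        public TaskShape(int inputChannels, int outputSubpartitions) {
            this.inputChannels = inputChannels;
            this.outputSubpartitions = outputSubpartitions;
        }
    }

    /** Hypothetical interface that a shuffle service could expose to the runtime. */
    public interface NetworkMemoryCalculator {
        /** Returns the network memory, in bytes, needed by all tasks of one SSG. */
        long requiredBytes(List<TaskShape> slotSharingGroup);
    }

    /**
     * Toy implementation: assumes a fixed number of buffers per input channel
     * and per output subpartition, each buffer having the same size.
     */
    public static final class BufferCountingCalculator implements NetworkMemoryCalculator {
        private final long bufferSizeBytes;
        private final int buffersPerChannel;

        public BufferCountingCalculator(long bufferSizeBytes, int buffersPerChannel) {
            this.bufferSizeBytes = bufferSizeBytes;
            this.buffersPerChannel = buffersPerChannel;
        }

        @Override
        public long requiredBytes(List<TaskShape> slotSharingGroup) {
            long buffers = 0;
            for (TaskShape task : slotSharingGroup) {
                // Count buffers for both the input and the output side of the task.
                buffers += (long) (task.inputChannels + task.outputSubpartitions)
                        * buffersPerChannel;
            }
            return buffers * bufferSizeBytes;
        }
    }

    public static void main(String[] args) {
        NetworkMemoryCalculator calculator = new BufferCountingCalculator(32 * 1024, 2);
        List<TaskShape> ssg = Arrays.asList(new TaskShape(4, 4), new TaskShape(1, 3));
        // The runtime would only see the resulting number, not the formula behind it.
        System.out.println("Required network memory (bytes): " + calculator.requiredBytes(ssg));
    }
}
```

The point of the design is that the runtime only calls requiredBytes and
attaches the result to the slot request, so swapping the shuffle service swaps
the formula without touching the scheduler.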