[ 
https://issues.apache.org/jira/browse/FLINK-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854437#comment-15854437
 ] 

ASF GitHub Bot commented on FLINK-3163:
---------------------------------------

Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/3249
  
    @StephanEwen, from the discussion of FLINK-3163 I also had the idea of 
`taskmanager.compute.fraction`, where the number of slots would be a multiple of 
the number of cores / vcores. Since Flink processes these as opaque strings, the 
only purpose is to help organize the [config 
page](https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html).
    
    I have found YARN-5764, MESOS-5342, and MESOS-314 discussing NUMA support 
for containers, but all are works in progress. I see that Docker supports 
`--cpuset-cpus` and `--cpuset-mems` in [docker 
run](https://docs.docker.com/engine/reference/run/) and in [docker 
compose](https://docs.docker.com/compose/compose-file) file format version 2 (using 
`cpuset`). It's not clear how to dynamically bind Flink to NUMA nodes without 
scripting Flink's docker commands.
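
    To make the "scripting Flink's docker commands" option concrete, here is a 
rough sketch of what such a script might look like. It is only an illustration, 
not something Flink provides: the helper names, the `flink` image name, and the 
`taskmanager` container command are placeholders. It assumes a Linux host where 
the kernel exposes each node's CPU list under `/sys/devices/system/node`.

```python
from pathlib import Path
from typing import List


def read_node_cpulist(node: int, sysfs_root: str = "/sys/devices/system/node") -> str:
    """Return the kernel's CPU list string for one NUMA node (Linux sysfs only)."""
    return Path(sysfs_root, f"node{node}", "cpulist").read_text().strip()


def docker_run_args(node: int, cpulist: str, image: str, command: List[str]) -> List[str]:
    """Build a `docker run` argument list pinned to one NUMA node.

    `image` and `command` are placeholders supplied by the caller; nothing here
    is specific to any particular Flink image.
    """
    return [
        "docker", "run",
        f"--cpuset-cpus={cpulist}",  # restrict the container to this node's CPUs
        f"--cpuset-mems={node}",     # restrict allocations to this node's memory
        image,
        *command,
    ]
```

    A wrapper could call `docker_run_args(n, read_node_cpulist(n), "flink", 
["taskmanager"])` once per node and hand each list to `subprocess.run`. The 
binding is still computed outside of Flink, which is the limitation noted above.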


> Configure Flink for NUMA systems
> --------------------------------
>
>                 Key: FLINK-3163
>                 URL: https://issues.apache.org/jira/browse/FLINK-3163
>             Project: Flink
>          Issue Type: Improvement
>          Components: Startup Shell Scripts
>    Affects Versions: 1.0.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>
> On NUMA systems Flink can be pinned to a single physical processor ("node") 
> using {{numactl --membind=$node --cpunodebind=$node <command>}}. Commonly 
> available NUMA systems include the largest AWS and Google Compute instances.
> For example, on an AWS c4.8xlarge system with 36 hyperthreads the user could 
> configure a single TaskManager with 36 slots or have Flink create two 
> TaskManagers, one bound to each NUMA node, each with 18 slots.
> There may be some extra overhead in transferring network buffers between 
> TaskManagers on the same system, though the fraction of data shuffled in this 
> manner decreases with the size of the cluster. The performance improvement 
> from only accessing local memory looks to be significant though difficult to 
> benchmark.
> The JobManagers may fit into NUMA nodes rather than requiring full systems.
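
The arithmetic and the `numactl` invocation in the description above can be 
sketched roughly as follows. This is only an illustration under stated 
assumptions: the helper names are made up, the wrapped command 
(`bin/taskmanager.sh start`) is a placeholder for whatever starts a 
TaskManager, and how the slot count is passed to each TaskManager is not shown.

```python
from typing import List


def slots_per_taskmanager(hyperthreads: int, numa_nodes: int) -> int:
    """Split hyperthreads evenly across one TaskManager per NUMA node."""
    if numa_nodes <= 0 or hyperthreads % numa_nodes != 0:
        raise ValueError("hyperthreads must divide evenly across NUMA nodes")
    return hyperthreads // numa_nodes


def numactl_command(node: int, command: List[str]) -> List[str]:
    """Prefix a command so its memory and CPUs are bound to one NUMA node."""
    return ["numactl", f"--membind={node}", f"--cpunodebind={node}", *command]


if __name__ == "__main__":
    nodes = 2
    # 36 hyperthreads over 2 nodes, as in the c4.8xlarge example above.
    print("slots per TaskManager:", slots_per_taskmanager(36, nodes))
    for node in range(nodes):
        # The wrapped command is a placeholder for the TaskManager start script.
        print(" ".join(numactl_command(node, ["bin/taskmanager.sh", "start"])))
```

Raising on an uneven split is a design choice for the sketch; a real script 
might instead read each node's CPU count and size the TaskManagers separately.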



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
