[
https://issues.apache.org/jira/browse/FLINK-8431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325364#comment-16325364
]
Dongwon Kim commented on FLINK-8431:
------------------------------------
Eron, I saw a discussion on
[GPU_RESOURCES|https://www.mail-archive.com/[email protected]/msg37571.html]
and [MESOS-7576|https://issues.apache.org/jira/browse/MESOS-7576].
{{GPU_RESOURCES}} is going to be deprecated in favor of the reservation
mechanism ([MESOS-7574|https://issues.apache.org/jira/browse/MESOS-7574]).
Thanks to MESOS-7576, I can launch Flink sessions by starting Mesos agents with
{{--filter_gpu_resources}} set to false. This lets Flink receive resource offers
from GPU nodes even though the current implementation of Flink's Mesos
scheduler does not enable the {{GPU_RESOURCES}} framework capability.
Nevertheless, it seems that we need to enable the {{GPU_RESOURCES}} framework
capability before it is completely deprecated, because many users may still be
running Mesos versions earlier than 1.4.0.
[MESOS-7576|https://issues.apache.org/jira/browse/MESOS-7576] is a relatively
new issue and takes effect from
[Mesos-1.4.0|https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.4.0].
So I plan to enable the {{GPU_RESOURCES}} framework capability when
{{mesos.resourcemanager.tasks.gpus}} is set to a value greater than 0.
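A minimal sketch of that rule, for illustration only. It does not use the Mesos or Flink APIs: the {{Capability}} enum, the class name, and {{capabilitiesFor}} are hypothetical stand-ins, and the only assumption taken from this comment is that the capability is enabled exactly when the configured GPU count is greater than 0.

```java
import java.util.EnumSet;
import java.util.Set;

public class GpuCapabilitySketch {

    // Hypothetical stand-in for the framework capabilities a scheduler registers with.
    enum Capability { GPU_RESOURCES }

    // Returns the capabilities to register, given the value configured for
    // mesos.resourcemanager.tasks.gpus (assumed here to be an integer GPU count).
    static Set<Capability> capabilitiesFor(int gpus) {
        Set<Capability> capabilities = EnumSet.noneOf(Capability.class);
        if (gpus > 0) {
            // Opt in only when TaskManagers actually request GPUs.
            capabilities.add(Capability.GPU_RESOURCES);
        }
        return capabilities;
    }

    public static void main(String[] args) {
        System.out.println(capabilitiesFor(0)); // no GPUs requested: capability stays off
        System.out.println(capabilitiesFor(2)); // GPUs requested: capability is enabled
    }
}
```

Keeping the capability off when no GPUs are requested leaves the behavior of existing non-GPU deployments unchanged.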
> Allow to specify # GPUs for TaskManager in Mesos
> ------------------------------------------------
>
> Key: FLINK-8431
> URL: https://issues.apache.org/jira/browse/FLINK-8431
> Project: Flink
> Issue Type: Improvement
> Components: Cluster Management, Mesos
> Reporter: Dongwon Kim
> Assignee: Dongwon Kim
> Priority: Minor
>
> Mesos provides first-class support for Nvidia GPUs [1], but Flink does not
> exploit it when scheduling TaskManagers. If Mesos agents are configured to
> isolate GPUs as shown in [2], TaskManagers that do not specify to use GPUs
> cannot see GPUs at all.
> We, therefore, need to introduce a new configuration property named
> "mesos.resourcemanager.tasks.gpus" to allow users to specify # of GPUs for
> each TaskManager process in Mesos.
> [1] http://mesos.apache.org/documentation/latest/gpu-support/
> [2] http://mesos.apache.org/documentation/latest/gpu-support/#agent-flags