Martin Bydzovsky created MESOS-9207:
---------------------------------------
Summary: CFS on docker executor tasks doesn't work
Key: MESOS-9207
URL: https://issues.apache.org/jira/browse/MESOS-9207
Project: Mesos
Issue Type: Bug
Components: containerization, docker
Affects Versions: 1.6.1, 1.5.1
Reporter: Martin Bydzovsky
CFS hard limiting on Docker-based tasks doesn't work. The
--cgroups_enable_cfs support added in
[https://github.com/apache/mesos/commit/346cc8dd528a28a6e1f1cbdb4c95b8bdea2f6070]
adds the --cpu-quota parameter, which is nice but has no effect on its own. Hard
limiting must be activated by setting either --cpus or --cpu-period, optionally
together with --cpu-quota to override the default.
(https://docs.docker.com/config/containers/resource_constraints/#configure-the-default-cfs-scheduler)
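For reference, the ceiling that CFS enforces is the ratio of quota to period. The following is a minimal sketch of that arithmetic, not Mesos or Docker code. The function name effective_cpu_limit is hypothetical, and the inputs are assumed to be the values read from the container's cgroup v1 files cpu.cfs_quota_us and cpu.cfs_period_us, where a quota of -1 means no hard limit.

```python
from typing import Optional


def effective_cpu_limit(quota_us: int, period_us: int) -> Optional[float]:
    """Return the CFS hard limit in CPUs, or None if no limit is set.

    quota_us and period_us are assumed to come from the cgroup v1 files
    cpu.cfs_quota_us and cpu.cfs_period_us of the container in question.
    """
    if period_us <= 0:
        raise ValueError("period must be positive")
    if quota_us < 0:
        # A negative quota (-1 in cgroup v1) means the group is not throttled.
        return None
    return quota_us / period_us
```

Comparing the result with the CPUs requested for the task (1.0 in this report) is one way to check whether a hard limit is actually in effect for the container.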
The attached output shows the parameters that the executor adds, which are wrong:
{code:java}
bydga@bydzovskym ~ λ curl http://mesos-slave1:5051/flags | jshon | grep cfs
"cgroups_enable_cfs": "true",
bydga@bydzovskym ~ λ ssh mesos-slave1
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-1060-aws x86_64)
bydzovskym mesos-slave1:us-w2 ~ 🍺 ps aux | grep example-api
root 30414 0.1 0.3 843532 49296 ? Ssl 07:54 0:01 mesos-docker-executor
--cgroups_enable_cfs=true
--container=mesos-6e31d2cb-ac4f-4b1c-ae2b-08cf54acc088 --docker=docker
--docker_socket=/var/run/docker.sock --help=false
--initialize_driver_logging=true --launcher_dir=/usr/libexec/mesos
--logbufsecs=0 --logging_level=INFO --mapped_directory=/mnt/mesos/sandbox
--quiet=false
--sandbox_directory=/srv/mesos/slaves/6b8f88fb-29df-4a35-86c3-a369d1447a53-S0/frameworks/2da5f61c-8400-40e0-8964-3edbd2f24e37-0001/executors/hera_example-api_production_api.b4ff812e-b017-11e8-92cc-06cd01d45cce/runs/6e31d2cb-ac4f-4b1c-ae2b-08cf54acc088
--stop_timeout=30secs
root 30426 0.0 0.1 324744 26644 ? Sl 07:54 0:00 docker -H
unix:///var/run/docker.sock run --cpu-shares 1024 --cpu-quota 100000 --memory
209715200 -e HOST=mesos-slave1.priv -e
MARATHON_APP_DOCKER_IMAGE=awsid.dkr.ecr.us-west-2.amazonaws.com/hera/example-api/production:d07dd097
-e MARATHON_APP_ID=/hera/example-api/production/api -e
MARATHON_APP_RESOURCE_CPUS=1.0 -e MARATHON_APP_RESOURCE_DISK=0.0 -e
MARATHON_APP_RESOURCE_GPUS=0 -e MARATHON_APP_RESOURCE_MEM=200.0 -e
MARATHON_APP_VERSION=2018-09-04T07:54:09.419Z -e
MESOS_CONTAINER_NAME=mesos-6e31d2cb-ac4f-4b1c-ae2b-08cf54acc088 -e
MESOS_SANDBOX=/mnt/mesos/sandbox -e
MESOS_TASK_ID=hera_example-api_production_api.b4ff812e-b017-11e8-92cc-06cd01d45cce
-e PORT=9115 -e PORT0=9115 -e PORTS=9115 -e PORT_9115=9115 -e PORT_PORT0=9115
-v
/srv/mesos/slaves/6b8f88fb-29df-4a35-86c3-a369d1447a53-S0/frameworks/2da5f61c-8400-40e0-8964-3edbd2f24e37-0001/executors/hera_example-api_production_api.b4ff812e-b017-11e8-92cc-06cd01d45cce/runs/6e31d2cb-ac4f-4b1c-ae2b-08cf54acc088:/mnt/mesos/sandbox
--net bridge -p 9115:9115/tcp --name
mesos-6e31d2cb-ac4f-4b1c-ae2b-08cf54acc088
--label=MESOS_TASK_ID=hera_example-api_production_api.b4ff812e-b017-11e8-92cc-06cd01d45cce
awsid.dkr.ecr.us-west-2.amazonaws.com/hera/example-api/production:d07dd097
coffee index.coffee{code}
You can see that the mesos-docker-executor was correctly started with
{code:java}
--cgroups_enable_cfs=true{code}
However, only
{code:java}
--cpu-shares 1024 --cpu-quota 100000{code}
are set in the docker run command; neither --cpus nor --cpu-period is passed.
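To repeat this check on other hosts without reading the ps output by eye, the CPU-related flags can be pulled out of a docker run command line. This is a rough sketch only: the helper name docker_cpu_flags is hypothetical, the command string is a shortened copy of the one shown above, and the scan does not try to tell docker options apart from arguments of the container's own command.

```python
import shlex
from typing import Dict

# CPU-related options of `docker run` that matter for CFS limiting.
CPU_FLAGS = {"--cpus", "--cpu-shares", "--cpu-quota", "--cpu-period"}


def docker_cpu_flags(command: str) -> Dict[str, str]:
    """Collect CPU flags from a command line, accepting both
    `--flag value` and `--flag=value` spellings."""
    tokens = shlex.split(command)
    found: Dict[str, str] = {}
    i = 0
    while i < len(tokens):
        name, sep, value = tokens[i].partition("=")
        if name in CPU_FLAGS:
            if sep:
                found[name] = value
            elif i + 1 < len(tokens):
                found[name] = tokens[i + 1]
                i += 1  # skip the value token we just consumed
        i += 1
    return found


# Shortened copy of the docker run command from the ps output above.
cmd = ("docker -H unix:///var/run/docker.sock run "
       "--cpu-shares 1024 --cpu-quota 100000 --memory 209715200")
flags = docker_cpu_flags(cmd)
print(flags)
```

For the command above, the result contains only --cpu-shares and --cpu-quota.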