Joshua Cohen created AURORA-1632:
------------------------------------
Summary: Investigate executor fixes when Mesos 0.30.0 stops
passing along environment variables
Key: AURORA-1632
URL: https://issues.apache.org/jira/browse/AURORA-1632
Project: Aurora
Issue Type: Task
Components: Executor
Reporter: Joshua Cohen
Priority: Blocker
In the 0.30.0 release, the Mesos Agent will no longer implicitly pass along its
environment variables to executors (see:
http://mail-archives.apache.org/mod_mbox/mesos-dev/201603.mbox/%3CCAK7AWaGB24ALh8eb%2BvKMFgc4%2BjmhxZ6ry79HBcKN%2BBt04Sx43A%40mail.gmail.com%3E).
I tested this in Vagrant by explicitly setting the
{{--executor_environment_variables}} flag on the agent to {{'{}'}} and verified
that the change does affect us. Initially we get a "Permission denied" error
when trying to fork the runner:
{noformat}
I0310 16:36:21.048671 18103 thermos_task_runner.py:275] Forking off runner with
cmdline:
/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/thermos_runner.pex
--setuid=vagrant
--task_id=vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf
--log_to_disk=DEBUG --hostname=192.168.33.7
--thermos_json=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/task.json
--sandbox=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/sandbox
--log_dir=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c
--checkpoint_root=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/checkpoints
--process_logger_destination=file --port=aurora:31248 --port=http:31248
F0310 16:36:21.057298 18103 aurora_executor.py:80] Task initialization failed:
[Errno 13] Permission denied
{noformat}
This error can be addressed with the patch from this pull request:
https://github.com/apache/aurora/pull/21. However, even after applying this
patch, processes still fail to fork (see attached screenshot).
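
For anyone reproducing this outside Vagrant, the general mechanism can be
sketched in a few lines of Python: a child process only sees the variables its
parent hands it, so once the agent stops passing its environment along,
anything we rely on has to be supplied explicitly. This is only an
illustration under assumptions, not the executor's actual code and not the
confirmed cause of the errors above; the helper names {{visible_vars}} and
{{explicit_env}} and the variable {{AURORA_DEMO_VAR}} are hypothetical.

```python
import json
import os
import subprocess
import sys


def visible_vars(env, names):
    """Hypothetical helper (not part of Aurora or Mesos).

    Starts a child Python interpreter with exactly the environment `env`
    and returns the subset of `names` that the child can see.
    """
    probe = (
        "import json, os, sys; "
        "print(json.dumps([n for n in sys.argv[1:] if n in os.environ]))"
    )
    out = subprocess.check_output(
        [sys.executable, "-c", probe] + list(names), env=env
    )
    return json.loads(out.decode("utf-8"))


def explicit_env(names, base=None):
    """Hypothetical helper: build a child environment from an allow-list.

    Copies only the variables listed in `names` from `base` (the current
    process environment by default) instead of relying on inheritance.
    """
    base = os.environ if base is None else base
    return {name: base[name] for name in names if name in base}


def demo():
    # The parent has the variable, but a child started without it does not.
    parent = dict(os.environ, AURORA_DEMO_VAR="1")
    stripped = {k: v for k, v in parent.items() if k != "AURORA_DEMO_VAR"}
    print(visible_vars(parent, ["AURORA_DEMO_VAR"]))
    print(visible_vars(stripped, ["AURORA_DEMO_VAR"]))
```

Calling {{demo()}} runs the probe twice, once with the variable passed through
and once without it. If we go the allow-list route, the set of variables the
runner and its forked processes actually need would still have to be
determined as part of this investigation.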