Re: Review Request 47853: Isolate the executor's filesystem from the task's.

Joshua Cohen Tue, 26 Jul 2016 09:58:10 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/
-----------------------------------------------------------


(Updated July 26, 2016, 4:57 p.m.)


Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.


Changes
-------

Updated to use the new `mesos-containerizer launch` command to isolate the
task's filesystem, rather than replicating all of that logic from Mesos into
Thermos.

A couple of notes:

1) This depends on Mesos 1.0.0, so I won't ship this until that has shipped and
we've upgraded Aurora to depend on it.
2) End-to-end tests are failing for the GPU job, which I'm assuming is related
to the fact that I've upgraded my vagrant image to Mesos 1.0.0 but have not
upgraded Aurora.


Repository: aurora


Description
-------

This changes the approach to launching tasks with filesystem images in the
unified containerizer. Instead of adding an `Image` to the `MesosContainer`, we
add the task filesystem as a `Volume` with an associated image. This image is
mounted in the mesos directory under the `taskfs` path. On startup, the
executor does the following:

1. Creates user/group under the taskfs root.
2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
root as well as mounting procfs.
3. From there, task execution is essentially unchanged minus some slight 
changes to the environment depending on whether we're running in a pivoted root.
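
The start-up steps above can be sketched as a plain ordering of operations.
This is only an illustration, not the code in this diff: `plan_isolation`, the
step names, the ordering of the mounts relative to `pivot_root`, and the choice
to expose the sandbox at the same absolute path inside the new root are all
assumptions. Nothing is actually mounted; the function only computes paths.

```python
import posixpath

# Hypothetical sketch; names and layout are assumptions, not taken from the diff.
TASKFS_DIRNAME = 'taskfs'  # the description says the image is mounted under this path


def plan_isolation(mesos_directory, sandbox_path):
    """Return the ordered isolation steps as tuples, without executing anything."""
    taskfs_root = posixpath.join(mesos_directory, TASKFS_DIRNAME)
    # Assumption: the sandbox appears at the same absolute path inside the new root.
    sandbox_target = posixpath.join(taskfs_root, sandbox_path.lstrip('/'))
    proc_target = posixpath.join(taskfs_root, 'proc')
    return [
        ('create_user_and_group', taskfs_root),        # step 1
        ('bind_mount', sandbox_path, sandbox_target),  # step 2: sandbox under the root
        ('mount_procfs', proc_target),                 # step 2: procfs under the root
        ('pivot_root', taskfs_root),                   # step 2: switch to the new root
        ('run_task', '/'),                             # step 3: task execution as before
    ]
```

Returning the plan as data keeps the ordering easy to inspect without root
privileges or a real filesystem image.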


Diffs (updated)
-----

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
1d66208490aff6ea8af4c737845fa2cf13617529 
  examples/vagrant/upstart/aurora-scheduler.conf 
954ddb48e923e0a2a29c415975c4f69afcad37b5 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
cbbd6be94aa857b02cd7b45bfb2f0216d9a1cec3 
  src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
0ef3856abc0df5403e3443ac35ba8d6940de8938 
  src/main/python/apache/aurora/executor/common/sandbox.py 
6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
  src/main/python/apache/aurora/executor/thermos_task_runner.py 
3896e3841562600379705dbf78a6f62728246348 
  src/main/python/apache/thermos/core/BUILD 
1094664e112cc71af37835f32037e9eb6d047202 
  src/main/python/apache/thermos/core/process.py 
1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
  src/main/python/apache/thermos/core/runner.py 
fe971edaa2448afaf0fc342e11bc370de96ef5e4 
  src/main/python/apache/thermos/runner/thermos_runner.py 
0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
500fd435b4c72b25abd8df7eea6b3850edc96e99 
  src/test/python/apache/aurora/executor/common/test_sandbox.py 
63f46e25bdd6fa387dd64975d7b95ee2659f5874 
  src/test/python/apache/thermos/core/test_process.py 
77f644c09116266ce02479b9a80403aa68767bd6 
  src/test/sh/org/apache/aurora/e2e/Dockerfile 
6fdea3d28760f59235c51c5b6913d2ee0172ef1a 
  src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
  src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
edeafbea288c95c19c82aede09717840b569528d 
  src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
9569eec2a32e0ea5212517c0082fc906036d1e57 
  src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 

Diff: https://reviews.apache.org/r/47853/diff/


Testing
-------

Lots of manual testing, e2e tests, etc.

I didn't add much coverage on the Thermos side of things because it seemed like
this was better served by the e2e tests than by doing a bunch of
`subprocess.check_call` mocking. On the e2e front, I created a new Dockerfile
that sets up a much slimmer filesystem image that explicitly does not include
Python, to ensure that the executor's filesystem is truly isolated from the
task's.
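
For contrast, the kind of `subprocess.check_call` mocking mentioned above would
look roughly like the following. This is a hypothetical sketch: `mount_procfs`
is an invented helper, not a function from Thermos, and the example uses Python
3's `unittest.mock` so that it is self-contained.

```python
import os
import subprocess
from unittest import mock


def mount_procfs(root):
    """Hypothetical helper: mount procfs under the given root."""
    subprocess.check_call(['mount', '-t', 'proc', 'proc', os.path.join(root, 'proc')])


def check_mount_procfs():
    # Patch check_call so no real mount is attempted; only the argv is verified.
    with mock.patch('subprocess.check_call') as check_call:
        mount_procfs('/taskfs')
    check_call.assert_called_once_with(['mount', '-t', 'proc', 'proc', '/taskfs/proc'])
    return check_call.call_count
```

A test of this shape only confirms the argument list that was passed, not that
the resulting filesystem is actually isolated, which is the gap the e2e tests
are meant to cover.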


Thanks,

Joshua Cohen
