Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-03 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review144620
---


Ship it!




Master (b912e17) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Aug. 3, 2016, 3:34 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated Aug. 3, 2016, 3:34 p.m.)
> 
> 
> Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d66208490aff6ea8af4c737845fa2cf13617529 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 954ddb48e923e0a2a29c415975c4f69afcad37b5 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
> c5159314543cba15ac5e525ad0c31dedd398f8bf 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> 6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> fe971edaa2448afaf0fc342e11bc370de96ef5e4 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
> bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> 63f46e25bdd6fa387dd64975d7b95ee2659f5874 
>   src/test/python/apache/thermos/core/test_process.py 
> 77f644c09116266ce02479b9a80403aa68767bd6 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile 
> 1b91f02725c1a6762cda2aaa587456341df4ba5a 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
> bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
> edeafbea288c95c19c82aede09717840b569528d 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
> 9569eec2a32e0ea5212517c0082fc906036d1e57 
>   src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 
> 
> Diff: https://reviews.apache.org/r/47853/diff/
> 
> 
> Testing
> ---
> 
> Lots of manual testing, e2e tests, etc.
> 
> I didn't add much coverage on the thermos side of things because it seemed 
> like this was better served by the e2e tests than by doing a bunch of 
> subprocess.check_call mocking. On the e2e front I created a new Dockerfile 
> that sets up a much slimmer filesystem image that explicitly does not include 
> python to ensure that the executor's filesystem is truly isolated.
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-03 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/
---

(Updated Aug. 3, 2016, 3:34 p.m.)


Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.


Changes
---

Properly handle the case where the gid already exists in the task's filesystem.


Repository: aurora


Description
---

This changes the approach to launching tasks with filesystem images in the 
unified containerizer. Instead of adding an `Image` to the `MesosContainer`, we 
instead add the task filesystem as a `Volume` with an associated image. This 
image is mounted in the mesos directory under the `taskfs` path. The executor, 
on start up does the following:

1. Creates user/group under the taskfs root.
2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
root as well as mounting procfs.
3. From there, task execution is essentially unchanged minus some slight 
changes to the environment depending on whether we're running in a pivoted root.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
1d66208490aff6ea8af4c737845fa2cf13617529 
  examples/vagrant/upstart/aurora-scheduler.conf 
954ddb48e923e0a2a29c415975c4f69afcad37b5 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
  src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
c5159314543cba15ac5e525ad0c31dedd398f8bf 
  src/main/python/apache/aurora/executor/common/sandbox.py 
6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
  src/main/python/apache/aurora/executor/thermos_task_runner.py 
3896e3841562600379705dbf78a6f62728246348 
  src/main/python/apache/thermos/core/BUILD 
1094664e112cc71af37835f32037e9eb6d047202 
  src/main/python/apache/thermos/core/process.py 
1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
  src/main/python/apache/thermos/core/runner.py 
fe971edaa2448afaf0fc342e11bc370de96ef5e4 
  src/main/python/apache/thermos/runner/thermos_runner.py 
0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
  src/test/python/apache/aurora/executor/common/test_sandbox.py 
63f46e25bdd6fa387dd64975d7b95ee2659f5874 
  src/test/python/apache/thermos/core/test_process.py 
77f644c09116266ce02479b9a80403aa68767bd6 
  src/test/sh/org/apache/aurora/e2e/Dockerfile 
1b91f02725c1a6762cda2aaa587456341df4ba5a 
  src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
  src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
edeafbea288c95c19c82aede09717840b569528d 
  src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
9569eec2a32e0ea5212517c0082fc906036d1e57 
  src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 

Diff: https://reviews.apache.org/r/47853/diff/


Testing
---

Lots of manual testing, e2e tests, etc.

I didn't add much coverage on the thermos side of things because it seemed like 
this was better served by the e2e tests than by doing a bunch of 
subprocess.check_call mocking. On the e2e front I created a new Dockerfile that 
sets up a much slimmer filesystem image that explicitly does not include python 
to ensure that the executor's filesystem is truly isolated.


Thanks,

Joshua Cohen



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-02 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review144560
---


Ship it!




Ship It!

- Maxim Khutornenko


On Aug. 2, 2016, 8:55 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated Aug. 2, 2016, 8:55 p.m.)
> 
> 
> Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d66208490aff6ea8af4c737845fa2cf13617529 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 954ddb48e923e0a2a29c415975c4f69afcad37b5 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
> c5159314543cba15ac5e525ad0c31dedd398f8bf 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> 6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> fe971edaa2448afaf0fc342e11bc370de96ef5e4 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
> bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> 63f46e25bdd6fa387dd64975d7b95ee2659f5874 
>   src/test/python/apache/thermos/core/test_process.py 
> 77f644c09116266ce02479b9a80403aa68767bd6 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile 
> 1b91f02725c1a6762cda2aaa587456341df4ba5a 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
> bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
> edeafbea288c95c19c82aede09717840b569528d 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
> 9569eec2a32e0ea5212517c0082fc906036d1e57 
>   src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 
> 
> Diff: https://reviews.apache.org/r/47853/diff/
> 
> 
> Testing
> ---
> 
> Lots of manual testing, e2e tests, etc.
> 
> I didn't add much coverage on the thermos side of things because it seemed 
> like this was better served by the e2e tests than by doing a bunch of 
> subprocess.check_call mocking. On the e2e front I created a new Dockerfile 
> that sets up a much slimmer filesystem image that explicitly does not include 
> python to ensure that the executor's filesystem is truly isolated.
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-02 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review144534
---


Fix it, then Ship it!





src/main/python/apache/aurora/executor/common/sandbox.py (line 193)


As discussed out-of-band, `-f` will silently switch the GID and can 
therefore not be used.


- Stephan Erb


On Aug. 2, 2016, 10:55 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated Aug. 2, 2016, 10:55 p.m.)
> 
> 
> Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d66208490aff6ea8af4c737845fa2cf13617529 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 954ddb48e923e0a2a29c415975c4f69afcad37b5 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
> c5159314543cba15ac5e525ad0c31dedd398f8bf 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> 6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> fe971edaa2448afaf0fc342e11bc370de96ef5e4 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
> bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> 63f46e25bdd6fa387dd64975d7b95ee2659f5874 
>   src/test/python/apache/thermos/core/test_process.py 
> 77f644c09116266ce02479b9a80403aa68767bd6 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile 
> 1b91f02725c1a6762cda2aaa587456341df4ba5a 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
> bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
> edeafbea288c95c19c82aede09717840b569528d 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
> 9569eec2a32e0ea5212517c0082fc906036d1e57 
>   src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 
> 
> Diff: https://reviews.apache.org/r/47853/diff/
> 
> 
> Testing
> ---
> 
> Lots of manual testing, e2e tests, etc.
> 
> I didn't add much coverage on the thermos side of things because it seemed 
> like this was better served by the e2e tests than by doing a bunch of 
> subprocess.check_call mocking. On the e2e front I created a new Dockerfile 
> that sets up a much slimmer filesystem image that explicitly does not include 
> python to ensure that the executor's filesystem is truly isolated.
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-02 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review144528
---


Ship it!




Master (b912e17) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Aug. 2, 2016, 8:55 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated Aug. 2, 2016, 8:55 p.m.)
> 
> 
> Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d66208490aff6ea8af4c737845fa2cf13617529 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 954ddb48e923e0a2a29c415975c4f69afcad37b5 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
> c5159314543cba15ac5e525ad0c31dedd398f8bf 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> 6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> fe971edaa2448afaf0fc342e11bc370de96ef5e4 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
> bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> 63f46e25bdd6fa387dd64975d7b95ee2659f5874 
>   src/test/python/apache/thermos/core/test_process.py 
> 77f644c09116266ce02479b9a80403aa68767bd6 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile 
> 1b91f02725c1a6762cda2aaa587456341df4ba5a 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
> bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
> edeafbea288c95c19c82aede09717840b569528d 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
> 9569eec2a32e0ea5212517c0082fc906036d1e57 
>   src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 
> 
> Diff: https://reviews.apache.org/r/47853/diff/
> 
> 
> Testing
> ---
> 
> Lots of manual testing, e2e tests, etc.
> 
> I didn't add much coverage on the thermos side of things because it seemed 
> like this was better served by the e2e tests than by doing a bunch of 
> subprocess.check_call mocking. On the e2e front I created a new Dockerfile 
> that sets up a much slimmer filesystem image that explicitly does not include 
> python to ensure that the executor's filesystem is truly isolated.
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-02 Thread Joshua Cohen


> On Aug. 1, 2016, 7:05 p.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/common/sandbox.py, lines 200-209
> > 
> >
> > Those two calls have the implicit assumptions:
> > 
> > * the role user does not exist within container.
> > * uid and gid as available on the host do not collide with pre-existing 
> > ones in the container
> > 
> > The first point could be problematic for people migrating from the 
> > existing Docker support in Aurora, as it requires images with the 
> > pre-existing users.

They do have implicit assumptions, I'm ok with that though, I wasn't building 
this functionality under the assumption that a Docker image being used via the 
DockerContainerizer would work out of the box with the new containerizer (at 
the very least you'll no longer require python in your image).

That said, it's fairly ssafe to check for exit code 4 (per the manpage: UID 
exists) from the useradd call and continue in that case. For groupadd we can 
pass the -f flag to simply exit 0 if the group exists.


> On Aug. 1, 2016, 7:05 p.m., Stephan Erb wrote:
> > src/main/python/apache/thermos/core/process.py, lines 408-416
> > 
> >
> > I am slightly confused here.
> > 
> > The way I understand the code `self._sandbox` points to a path of the 
> > host filesystem. So, if we set `cwd` to that path, what does it point to 
> > when we are running within the container? Is this related to the bind mount 
> > that you mention in the description of this patch?

Good catch. This is a leftover from my original impl that bind mounted the 
sandbox path into the task's filesystem. The new mesos-containerizer launch 
subcommand does not have that behavior though, so I'll need to restore that 
bind mount.


> On Aug. 1, 2016, 7:05 p.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/bin/thermos_executor_main.py, lines 
> > 98-102
> > 
> >
> > Looks like Mesos is able to compute a default value for this using some 
> > makefile constants: 
> > https://github.com/apache/mesos/blob/e8ebbe5fe4189ef7ab046da2276a6abee41deeb2/src/slave/flags.cpp#L193-L198
> > 
> > We should at least document that the expected value here is based on 
> > the one passed to Mesos.

Added a note to the help text.


> On Aug. 1, 2016, 7:05 p.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/thermos_task_runner.py, lines 268-270
> > 
> >
> > This code is risky to operator error: If the operator forgets to set 
> > the optional containerizer path, users can launch container images whose 
> > filesystem isolation will not be applied. 
> > 
> > I would prefer if we throw an error (here, or at another appropriate 
> > place) if we are supposed to use a filesystem image without a containerizer 
> > path.

Added some code to fail fast if task is using an image but path to 
mesos-containerizer is not provided.


- Joshua


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review144324
---


On Aug. 2, 2016, 8:55 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated Aug. 2, 2016, 8:55 p.m.)
> 
> 
> Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d66208490aff6ea8af4c737845fa2cf13617529 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 954ddb48e923e0a2a29c415975c4f69afcad37b5 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
>   

Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-02 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/
---

(Updated Aug. 2, 2016, 8:55 p.m.)


Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.


Changes
---

Address Stephan's feedback:

- Add a note to the help output for --mesos_containerizer_path that its value 
should match that of -launcher_dir on the mesos agent.
- Make user/group adding under the task filesystem more robust to already 
existing users/groups.
- Fail fast if the task is using a filesystem image but 
--mesos_containerizer_path was not set.
- restore the bind mount of the mesos directory into the task's filesystem that 
was lost in the switch to mesos-containerizer.


Repository: aurora


Description
---

This changes the approach to launching tasks with filesystem images in the 
unified containerizer. Instead of adding an `Image` to the `MesosContainer`, we 
instead add the task filesystem as a `Volume` with an associated image. This 
image is mounted in the mesos directory under the `taskfs` path. The executor, 
on start up does the following:

1. Creates user/group under the taskfs root.
2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
root as well as mounting procfs.
3. From there, task execution is essentially unchanged minus some slight 
changes to the environment depending on whether we're running in a pivoted root.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
1d66208490aff6ea8af4c737845fa2cf13617529 
  examples/vagrant/upstart/aurora-scheduler.conf 
954ddb48e923e0a2a29c415975c4f69afcad37b5 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
  src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
c5159314543cba15ac5e525ad0c31dedd398f8bf 
  src/main/python/apache/aurora/executor/common/sandbox.py 
6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
  src/main/python/apache/aurora/executor/thermos_task_runner.py 
3896e3841562600379705dbf78a6f62728246348 
  src/main/python/apache/thermos/core/BUILD 
1094664e112cc71af37835f32037e9eb6d047202 
  src/main/python/apache/thermos/core/process.py 
1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
  src/main/python/apache/thermos/core/runner.py 
fe971edaa2448afaf0fc342e11bc370de96ef5e4 
  src/main/python/apache/thermos/runner/thermos_runner.py 
0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
  src/test/python/apache/aurora/executor/common/test_sandbox.py 
63f46e25bdd6fa387dd64975d7b95ee2659f5874 
  src/test/python/apache/thermos/core/test_process.py 
77f644c09116266ce02479b9a80403aa68767bd6 
  src/test/sh/org/apache/aurora/e2e/Dockerfile 
1b91f02725c1a6762cda2aaa587456341df4ba5a 
  src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
  src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
edeafbea288c95c19c82aede09717840b569528d 
  src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
9569eec2a32e0ea5212517c0082fc906036d1e57 
  src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 

Diff: https://reviews.apache.org/r/47853/diff/


Testing
---

Lots of manual testing, e2e tests, etc.

I didn't add much coverage on the thermos side of things because it seemed like 
this was better served by the e2e tests than by doing a bunch of 
subprocess.check_call mocking. On the e2e front I created a new Dockerfile that 
sets up a much slimmer filesystem image that explicitly does not include python 
to ensure that the executor's filesystem is truly isolated.


Thanks,

Joshua Cohen



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-01 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review144324
---



Looks good to me in general. A couple of general remarks below.


src/main/python/apache/aurora/executor/bin/thermos_executor_main.py (lines 98 - 
102)


Looks like Mesos is able to compute a default value for this using some 
makefile constants: 
https://github.com/apache/mesos/blob/e8ebbe5fe4189ef7ab046da2276a6abee41deeb2/src/slave/flags.cpp#L193-L198

We should at least document that the expected value here is based on the 
one passed to Mesos.



src/main/python/apache/aurora/executor/common/sandbox.py (lines 189 - 198)


Those two calls have the implicit assumptions:

* the role user does not exist within container.
* uid and gid as available on the host do not collide with pre-existing 
ones in the container

The first point could be problematic for people migrating from the existing 
Docker support in Aurora, as it requires images with the pre-existing users.



src/main/python/apache/aurora/executor/thermos_task_runner.py (lines 268 - 270)


This code is risky to operator error: If the operator forgets to set the 
optional containerizer path, users can launch container images whose filesystem 
isolation will not be applied. 

I would prefer if we throw an error (here, or at another appropriate place) 
if we are supposed to use a filesystem image without a containerizer path.



src/main/python/apache/thermos/core/process.py (lines 188 - 190)


See my comment above.



src/main/python/apache/thermos/core/process.py (lines 408 - 416)


I am slightly confused here.

The way I understand the code `self._sandbox` points to a path of the host 
filesystem. So, if we set `cwd` to that path, what does it point to when we are 
running within the container? Is this related to the bind mount that you 
mention in the description of this patch?


Additional things that come to mind (but that are out of scope of this 
particular review request) 

* documentation
* `aurora task ssh` could probably be update to use the containerizer launch 
command as well.

- Stephan Erb


On Aug. 1, 2016, 7:22 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated Aug. 1, 2016, 7:22 p.m.)
> 
> 
> Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d66208490aff6ea8af4c737845fa2cf13617529 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 954ddb48e923e0a2a29c415975c4f69afcad37b5 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
> c5159314543cba15ac5e525ad0c31dedd398f8bf 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> 6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> fe971edaa2448afaf0fc342e11bc370de96ef5e4 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
> bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> 63f46e25bdd6fa387dd64975d7b95ee2659f5874 
>   

Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-01 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review144357
---


Ship it!




Master (b912e17) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Aug. 1, 2016, 5:22 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated Aug. 1, 2016, 5:22 p.m.)
> 
> 
> Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d66208490aff6ea8af4c737845fa2cf13617529 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 954ddb48e923e0a2a29c415975c4f69afcad37b5 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
> c5159314543cba15ac5e525ad0c31dedd398f8bf 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> 6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> fe971edaa2448afaf0fc342e11bc370de96ef5e4 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
> bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> 63f46e25bdd6fa387dd64975d7b95ee2659f5874 
>   src/test/python/apache/thermos/core/test_process.py 
> 77f644c09116266ce02479b9a80403aa68767bd6 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile 
> 1b91f02725c1a6762cda2aaa587456341df4ba5a 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
> bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
> edeafbea288c95c19c82aede09717840b569528d 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
> 9569eec2a32e0ea5212517c0082fc906036d1e57 
>   src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 
> 
> Diff: https://reviews.apache.org/r/47853/diff/
> 
> 
> Testing
> ---
> 
> Lots of manual testing, e2e tests, etc.
> 
> I didn't add much coverage on the thermos side of things because it seemed 
> like this was better served by the e2e tests than by doing a bunch of 
> subprocess.check_call mocking. On the e2e front I created a new Dockerfile 
> that sets up a much slimmer filesystem image that explicitly does not include 
> python to ensure that the executor's filesystem is truly isolated.
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-08-01 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/
---

(Updated Aug. 1, 2016, 5:22 p.m.)


Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.


Changes
---

rebase on top of Mesos 1.0.0 upgrade changes, fix broken aurora config.

Maxim, Stephan, these changes now should be good for review.


Repository: aurora


Description
---

This changes the approach to launching tasks with filesystem images in the 
unified containerizer. Instead of adding an `Image` to the `MesosContainer`, we 
instead add the task filesystem as a `Volume` with an associated image. This 
image is mounted in the mesos directory under the `taskfs` path. The executor, 
on start up does the following:

1. Creates user/group under the taskfs root.
2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
root as well as mounting procfs.
3. From there, task execution is essentially unchanged minus some slight 
changes to the environment depending on whether we're running in a pivoted root.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
1d66208490aff6ea8af4c737845fa2cf13617529 
  examples/vagrant/upstart/aurora-scheduler.conf 
954ddb48e923e0a2a29c415975c4f69afcad37b5 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
f9b1c7cf30f93336fb850da09c1f2b7178cbdc17 
  src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
c5159314543cba15ac5e525ad0c31dedd398f8bf 
  src/main/python/apache/aurora/executor/common/sandbox.py 
6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
  src/main/python/apache/aurora/executor/thermos_task_runner.py 
3896e3841562600379705dbf78a6f62728246348 
  src/main/python/apache/thermos/core/BUILD 
1094664e112cc71af37835f32037e9eb6d047202 
  src/main/python/apache/thermos/core/process.py 
1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
  src/main/python/apache/thermos/core/runner.py 
fe971edaa2448afaf0fc342e11bc370de96ef5e4 
  src/main/python/apache/thermos/runner/thermos_runner.py 
0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
bb8a849717a6ca3328ad4638acff5cc44ddd6ac9 
  src/test/python/apache/aurora/executor/common/test_sandbox.py 
63f46e25bdd6fa387dd64975d7b95ee2659f5874 
  src/test/python/apache/thermos/core/test_process.py 
77f644c09116266ce02479b9a80403aa68767bd6 
  src/test/sh/org/apache/aurora/e2e/Dockerfile 
1b91f02725c1a6762cda2aaa587456341df4ba5a 
  src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
  src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
edeafbea288c95c19c82aede09717840b569528d 
  src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
9569eec2a32e0ea5212517c0082fc906036d1e57 
  src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 

Diff: https://reviews.apache.org/r/47853/diff/


Testing
---

Lots of manual testing, e2e tests, etc.

I didn't add much coverage on the thermos side of things because it seemed like 
this was better served by the e2e tests than by doing a bunch of 
subprocess.check_call mocking. On the e2e front I created a new Dockerfile that 
sets up a much slimmer filesystem image that explicitly does not include python 
to ensure that the executor's filesystem is truly isolated.


Thanks,

Joshua Cohen



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-07-26 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/
---

(Updated July 26, 2016, 4:57 p.m.)


Review request for Aurora, Jie Yu, Maxim Khutornenko, and Stephan Erb.


Changes
---

Update to use new mesos-containerizer launch command to isolate the task's 
filesystem rather than replicating all of that logic from Mesos into Thermos.

A couple of notes:

1) this depends on Mesos 1.0.0, so I won't ship this until that has shipped and 
we've upgraded Aurora to depend on it.
2) End to end tests are failing for the GPU job which I'm assuming is related 
to the fact that I've upgraded my vagrant image to Mesos 1.0.0 but have not 
upgraded Aurora.


Repository: aurora


Description
---

This changes the approach to launching tasks with filesystem images in the 
unified containerizer. Instead of adding an `Image` to the `MesosContainer`, we 
instead add the task filesystem as a `Volume` with an associated image. This 
image is mounted in the mesos directory under the `taskfs` path. The executor, 
on start up does the following:

1. Creates user/group under the taskfs root.
2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
root as well as mounting procfs.
3. From there, task execution is essentially unchanged minus some slight 
changes to the environment depending on whether we're running in a pivoted root.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
1d66208490aff6ea8af4c737845fa2cf13617529 
  examples/vagrant/upstart/aurora-scheduler.conf 
954ddb48e923e0a2a29c415975c4f69afcad37b5 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
cbbd6be94aa857b02cd7b45bfb2f0216d9a1cec3 
  src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
0ef3856abc0df5403e3443ac35ba8d6940de8938 
  src/main/python/apache/aurora/executor/common/sandbox.py 
6d8b7f58b639a60cc5a0c0c9ef98dfaaa8a64486 
  src/main/python/apache/aurora/executor/thermos_task_runner.py 
3896e3841562600379705dbf78a6f62728246348 
  src/main/python/apache/thermos/core/BUILD 
1094664e112cc71af37835f32037e9eb6d047202 
  src/main/python/apache/thermos/core/process.py 
1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
  src/main/python/apache/thermos/core/runner.py 
fe971edaa2448afaf0fc342e11bc370de96ef5e4 
  src/main/python/apache/thermos/runner/thermos_runner.py 
0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
500fd435b4c72b25abd8df7eea6b3850edc96e99 
  src/test/python/apache/aurora/executor/common/test_sandbox.py 
63f46e25bdd6fa387dd64975d7b95ee2659f5874 
  src/test/python/apache/thermos/core/test_process.py 
77f644c09116266ce02479b9a80403aa68767bd6 
  src/test/sh/org/apache/aurora/e2e/Dockerfile 
6fdea3d28760f59235c51c5b6913d2ee0172ef1a 
  src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
bf6ef69401782d6c65a2d72e9270e52d1ad9fb9f 
  src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
edeafbea288c95c19c82aede09717840b569528d 
  src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
9569eec2a32e0ea5212517c0082fc906036d1e57 
  src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
47bd94d1ce2aeffaefb67ac8325fc2f1d21c934c 

Diff: https://reviews.apache.org/r/47853/diff/


Testing
---

Lots of manual testing, e2e tests, etc.

I didn't add much coverage on the thermos side of things because it seemed like 
this was better served by the e2e tests than by doing a bunch of 
subprocess.check_call mocking. On the e2e front I created a new Dockerfile that 
sets up a much slimmer filesystem image that explicitly does not include python 
to ensure that the executor's filesystem is truly isolated.


Thanks,

Joshua Cohen



Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-05-27 Thread Joshua Cohen


> On May 26, 2016, 12:17 p.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/common/sandbox.py, lines 208-230
> > 
> >
> > Seeing this spelled out, I feel like we should not have such code in 
> > Thermos. It is rather low-level, hard to get compeltely right, and can 
> > impose security risks if done wrong.
> > 
> > Mesos already has to maintain a version of the `chroot`/`pivot_root` 
> > mechanism. We should try to leverage it. Otherwise, this will be an up-hill 
> > battle that we cannot win in the long run, especially given the limited 
> > number of Aurora committers
> > 
> > I see a couple options:
> > 
> > a) We get the Mesos guys to export their `pivot_root` code so that all 
> > we have to do is call a single Mesos command, just like it could currently 
> > be done for the `mesos-fetcher` or the entire `mesos-containerizer`:
> > 
> > ```
> > vagrant@aurora:~$ /usr/libexec/mesos/mesos-containerizer 
> > Usage: /usr/libexec/mesos/mesos-containerizer  [OPTIONS]
> > 
> > Available subcommands:
> > help
> > launch
> > mount
> > ```
> > 
> > b) We live with the fact that we'll always run Thermos within the 
> > container. This will require Python, but we could drop the dependency on 
> > other Mesos dependencies by migrating to the new executor HTTP API.
> > 
> > 
> > c) We stick to the `MesosContainer.image` for the user container, but 
> > use a `Volume` `image` in order to mount a minimal container that is just 
> > packaging Thermos and its dependencies. We could then launch the executor 
> > from the mounted volume such as `/thermos/bin/thermox_executor.sh` (with 
> > appropriate P`YTHON_PATH` and `LD_LIBRARY_PATH`). The user could would 
> > still be running using the user-image.
> > 
> > 
> > Approaches b) and c) keep the door open for us to run the executor as 
> > an un-priviledged user. This has security advantages and would als make it 
> > easier for users to leverage security features such as the new Mesos Linux 
> > Capabilities 
> > https://docs.google.com/document/d/1YiTift8TQla2vq3upQr7K-riQ_pQ-FKOCOsysQJROGc/edit?pref=2=1

Just to capture offline discussion here: I've reached out to mesos-dev list and 
they're open to the idea of adding a subcommand to mesos-containerizer that 
would be responsible for isolating a task's filesystem.

We hope to work on and contribute this patch next week during Mesoscon.

Given that, I'm going to hold off on this review for the time being.


- Joshua


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review134955
---


On May 25, 2016, 9:18 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated May 25, 2016, 9:18 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> a99889c1f2d9e10825f87ea669532ad78641880f 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 3d9e706de564df5e24cb34265bebc0db1cad11a0 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> 3b01801d929dd61ee989495bf38af8f03e9f5ad4 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> be1deba6219462c9fdaaf07a583851b85fe974bf 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> 3ebf86ebd12ed3b68f543d4b9a45615e4681ba7f 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> 

Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-05-26 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review134955
---



Thanks for looking into this difficult but important problem.


src/main/python/apache/aurora/executor/common/sandbox.py (lines 195 - 217)


Seeing this spelled out, I feel like we should not have such code in 
Thermos. It is rather low-level, hard to get compeltely right, and can impose 
security risks if done wrong.

Mesos already has to maintain a version of the `chroot`/`pivot_root` 
mechanism. We should try to leverage it. Otherwise, this will be an up-hill 
battle that we cannot win in the long run, especially given the limited number 
of Aurora committers

I see a couple options:

a) We get the Mesos guys to export their `pivot_root` code so that all we 
have to do is call a single Mesos command, just like it could currently be done 
for the `mesos-fetcher` or the entire `mesos-containerizer`:

```
vagrant@aurora:~$ /usr/libexec/mesos/mesos-containerizer 
Usage: /usr/libexec/mesos/mesos-containerizer  [OPTIONS]

Available subcommands:
help
launch
mount
```

b) We live with the fact that we'll always run Thermos within the 
container. This will require Python, but we could drop the dependency on other 
Mesos dependencies by migrating to the new executor HTTP API.

c) We stick to the `MesosContainer.image` for the user container, but use a 
`Volume` `image` in order to mount a minimal container that is just packaging 
Thermos and its dependencies. We could then launch the executor from the 
mounted volume such as `/thermos/bin/thermox_executor.sh` (with appropriate 
P`YTHON_PATH` and `LD_LIBRARY_PATH`). The user could would still be running 
using the user-image.

Approaches b) and c) keep the door open for us to run the executor as an 
un-priviledged user. This has security advantages and would als make it easier 
for users to leverage security features such as the new Mesos Linux 
Capabilities 
https://docs.google.com/document/d/1YiTift8TQla2vq3upQr7K-riQ_pQ-FKOCOsysQJROGc/edit?pref=2=1


- Stephan Erb


On May 25, 2016, 11:18 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated May 25, 2016, 11:18 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> a99889c1f2d9e10825f87ea669532ad78641880f 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 3d9e706de564df5e24cb34265bebc0db1cad11a0 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> 3b01801d929dd61ee989495bf38af8f03e9f5ad4 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> be1deba6219462c9fdaaf07a583851b85fe974bf 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> 3ebf86ebd12ed3b68f543d4b9a45615e4681ba7f 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
> 58785bfa37ff214f26e9f94d836e6df40e411c3b 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> e47d9b8822deb36cb9cfa0554ef89d6cda80f3e9 
>   src/test/python/apache/thermos/core/test_process.py 
> 77f644c09116266ce02479b9a80403aa68767bd6 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile 
> 6fdea3d28760f59235c51c5b6913d2ee0172ef1a 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
> 

Re: Review Request 47853: Isolate the executor's filesystem from the task's.

2016-05-25 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47853/#review134845
---


Ship it!




Master (d521dcd) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On May 25, 2016, 9:18 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47853/
> ---
> 
> (Updated May 25, 2016, 9:18 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This changes the approach to launching tasks with filesystem images in the 
> unified containerizer. Instead of adding an `Image` to the `MesosContainer`, 
> we instead add the task filesystem as a `Volume` with an associated image. 
> This image is mounted in the mesos directory under the `taskfs` path. The 
> executor, on start up does the following:
> 
> 1. Creates user/group under the taskfs root.
> 2. `pivot_root`s into the taskfs, while bind mounting the sandbox under that 
> root as well as mounting procfs.
> 3. From there, task execution is essentially unchanged minus some slight 
> changes to the environment depending on whether we're running in a pivoted 
> root.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> a99889c1f2d9e10825f87ea669532ad78641880f 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 3d9e706de564df5e24cb34265bebc0db1cad11a0 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 
> 3b01801d929dd61ee989495bf38af8f03e9f5ad4 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> be1deba6219462c9fdaaf07a583851b85fe974bf 
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 
> 3896e3841562600379705dbf78a6f62728246348 
>   src/main/python/apache/thermos/core/BUILD 
> 1094664e112cc71af37835f32037e9eb6d047202 
>   src/main/python/apache/thermos/core/process.py 
> 1791b5ff9a36eef7470bef9a6ebbafaf0ab05ca3 
>   src/main/python/apache/thermos/core/runner.py 
> 3ebf86ebd12ed3b68f543d4b9a45615e4681ba7f 
>   src/main/python/apache/thermos/runner/thermos_runner.py 
> 0d06e8e2ac78d26ba8f63744853eb5ce3f6aced6 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 
> 58785bfa37ff214f26e9f94d836e6df40e411c3b 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> e47d9b8822deb36cb9cfa0554ef89d6cda80f3e9 
>   src/test/python/apache/thermos/core/test_process.py 
> 77f644c09116266ce02479b9a80403aa68767bd6 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile 
> 6fdea3d28760f59235c51c5b6913d2ee0172ef1a 
>   src/test/sh/org/apache/aurora/e2e/Dockerfile.netcat PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
> 219c40fb94561f0a390cac16e643bf4332c51aad 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 
> 08553e4f48f137e0455ad07287086311171c06bd 
>   src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 
> 8b3a50ba6de992560593987f3db254baa4d29a41 
>   src/test/sh/org/apache/aurora/e2e/run-server.sh PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> abe0ca75c6a2c0ace15fce68ad0e5c9aa98193a4 
> 
> Diff: https://reviews.apache.org/r/47853/diff/
> 
> 
> Testing
> ---
> 
> Lots of manual testing, e2e tests, etc.
> 
> I didn't add much coverage on the thermos side of things because it seemed 
> like this was better served by the e2e tests than by doing a bunch of 
> subprocess.check_call mocking. On the e2e front I created a new Dockerfile 
> that sets up a much slimmer filesystem image that explicitly does not include 
> python to ensure that the executor's filesystem is truly isolated.
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>