Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/
---

(Updated March 22, 2018, 2:02 p.m.)


Review request for Aurora, David McLaughlin, Daniel Knightly, Jordan Ly, 
Santhosh Kumar Shanmugham, and Stephan Erb.


Changes
---

Adding a test for unexpected response format, e.g. a dict instead of a list.
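
A minimal sketch of the kind of check such a test might exercise, assuming the agent's containers endpoint normally returns a JSON list of container objects. The helper name `parse_containers` and the fallback to an empty list are assumptions for illustration, not the patch's actual code.

```python
import json


def parse_containers(body):
    """Return the decoded container list, or [] if body is not a JSON list.

    Hypothetical helper: falling back to an empty list is an assumption.
    """
    try:
        decoded = json.loads(body)
    except ValueError:  # malformed JSON
        return []
    if not isinstance(decoded, list):
        # e.g. the agent answered with a dict instead of a list
        return []
    return decoded


# A well-formed response is passed through unchanged.
assert parse_containers('[{"executor_id": "task-1"}]') == [{"executor_id": "task-1"}]
# A dict (unexpected format) and malformed JSON both yield an empty list.
assert parse_containers('{"error": "unexpected"}') == []
assert parse_containers('not json') == []
```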


Repository: aurora


Description
---

When disk isolation is enabled in a Mesos agent it calculates the disk usage 
for each container. 
Thermos Observer also monitors disk usage using `twitter.common.dirutil`, 
essentially repeating the work already done by the agent. In practice, we see 
that disk monitoring is one of the most expensive resource monitoring tasks. 
For instance, when there are deeply nested directories, the CPU utilization of 
the observer process can easily reach 1.5 CPUs. It would be ideal if we 
delegate the disk monitoring task to the agent and do it only once. With this 
approach, when disk collection has improved in the agent (for instance by 
implementing XFS isolation), we can simply benefit from it without any code 
change. Some more information about the problem is provided in AURORA-1918.

This patch introduces `MesosDiskCollector`, which queries the agent's API 
endpoint to look up `disk_used_bytes`. Note that there is also resource monitoring 
in thermos executor. Currently, I left the disk collector there to use the `du` 
implementation. That can be changed in a later patch.
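
The lookup described above can be sketched roughly as follows. This assumes the agent's response has already been decoded into a list of dicts, each carrying an `executor_id` and a `statistics` mapping. The function name `disk_used_bytes` and the choice to return 0 when nothing is found are hypothetical, not taken from the patch.

```python
def disk_used_bytes(containers, executor_id):
    """Look up disk_used_bytes for one executor in a decoded containers list.

    Assumes each entry looks like
    {"executor_id": ..., "statistics": {"disk_used_bytes": ...}}.
    Returns 0 when the executor or the statistic is missing (an assumption).
    """
    for container in containers:
        if container.get("executor_id") == executor_id:
            return container.get("statistics", {}).get("disk_used_bytes", 0)
    return 0


# Hypothetical sample payload, for illustration only.
sample = [
    {"executor_id": "thermos-a", "statistics": {"disk_used_bytes": 1024}},
    {"executor_id": "thermos-b", "statistics": {"cpus_limit": 1.0}},
]
print(disk_used_bytes(sample, "thermos-a"))  # 1024
print(disk_used_bytes(sample, "thermos-b"))  # 0 (statistic missing)
print(disk_used_bytes(sample, "missing"))    # 0 (executor missing)
```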

I modified some vagrant config files including `aurora-executor.service` and 
`etc_mesos-slave/isolation` for testing. They can be left as is. I included 
them in this patch to show how this would work e2e.


Diffs (updated)
-

  3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
  RELEASE-NOTES.md 51ab6c724694244bf616b29e9beace4a4a3f5252 
  docs/reference/observer-configuration.md 
8a443c94f7f37f9454989781f722101a97c99f15 
  examples/jobs/hello_world.aurora 5401bfebe753b5e53abd08baeac501144ced9b5a 
  examples/vagrant/mesos_config/etc_mesos-slave/isolation 
1a7028ffc70116b104ef3ad22b7388f637707a0f 
  examples/vagrant/systemd/thermos.service 
01925bcd2ae44f100df511f3c3951c3f5a1a72aa 
  src/main/python/apache/aurora/tools/thermos_observer.py 
dd9f0c46ceac9e939b1b763073314161de0ea614 
  src/main/python/apache/thermos/monitoring/BUILD 
65ba7088f65e7baa5d30744736ba456b46a55e86 
  src/main/python/apache/thermos/monitoring/disk.py 
986d33a5000f8d5db15cb639c81f8b1d756ffa05 
  src/main/python/apache/thermos/monitoring/resource.py 
adcdc751c03460dc801a18278faa96d6bd64722b 
  src/main/python/apache/thermos/observer/task_observer.py 
a6870d48bddf2a2ccede7bb68195f2baae1d0e47 
  
src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py
 fe74bd1d3ecd89fca1b5b2251202cbbc0f24 
  src/test/python/apache/thermos/monitoring/BUILD 
8f2b39336dce6c7b580e6ba0009f60afdcb89179 
  src/test/python/apache/thermos/monitoring/test_disk.py 
362393bfd1facf3198e2d438d0596b16700b72b8 
  src/test/python/apache/thermos/monitoring/test_resource.py 
e577e552d4ee1807096a15401851bb9fd95fa426 


Diff: https://reviews.apache.org/r/66103/diff/6/

Changes: https://reviews.apache.org/r/66103/diff/5-6/


Testing
---

- I added unit tests.
- Tested in vagrant and it works as intended.
- I also built and deployed in our test environment. To measure the 
performance improvement, I created jobs with nested folders and noticed a reduction 
in CPU utilization of the Observer process, by at least 60%. (1.5 CPU cores to 
0.4 CPU cores)
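
As a quick check of the figures quoted above, the relative reduction can be worked out directly. The variable names are only for illustration.

```python
before_cores = 1.5  # Observer CPU before the patch (figure quoted above)
after_cores = 0.4   # Observer CPU with the patch (figure quoted above)

# Relative reduction as a percentage of the original usage.
reduction_pct = (before_cores - after_cores) / before_cores * 100
print("%.1f%%" % reduction_pct)  # 73.3%
```

This is consistent with the "at least 60%" statement.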

Here is one specific test setup: On two hosts I created two tasks. Each task 
creates identical nested directory structures and files in them. The overall 
size is 30GB. test_host_1 runs the current version of observer and test_host_2 
runs Observer with this patch and also has mesos_disk_collection enabled. The 
results are as follows:

```
rezam[7]TEST_HOST_1 ~ $ while true; do echo `date`; curl localhost:1338/vars -s 
| grep cpu; sleep 10; done
Thu Mar 22 04:36:17 UTC 2018
observer.observer_cpu 108.9
Thu Mar 22 04:36:27 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:38 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:48 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:58 UTC 2018
observer.observer_cpu 111.0
Thu Mar 22 04:37:08 UTC 2018
observer.observer_cpu 111.0
Thu Mar 22 04:37:18 UTC 2018
observer.observer_cpu 111.0


rezam[7]TEST_HOST_2 ~ $ while true; do echo `date`; curl localhost:1338/vars -s 
| grep cpu; sleep 10; done
Thu Mar 22 04:36:20 UTC 2018
observer.observer_cpu 1.3
Thu Mar 22 04:36:30 UTC 2018
observer.observer_cpu 1.3
Thu Mar 22 04:36:40 UTC 2018
observer.observer_cpu 1.3
Thu Mar 22 04:36:50 UTC 2018
observer.observer_cpu 1.2
Thu Mar 22 04:37:00 UTC 2018
observer.observer_cpu 1.2
Thu Mar 22 04:37:10 UTC 2018
observer.observer_cpu 1.2
Thu Mar 22 04:37:20 UTC 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199763
---


Ship it!




Master (f32086d) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 22, 2018, 7:02 a.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 22, 2018, 7:02 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Jordan Ly, 
> Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> When disk isolation is enabled in a Mesos agent it calculates the disk usage 
> for each container. 
> Thermos Observer also monitors disk usage using `twitter.common.dirutil`, 
> essentially repeating the work already done by the agent. In practice, we see 
> that disk monitoring is one of the most expensive resource monitoring tasks. 
> For instance, when there are deeply nested directories, the CPU utilization 
> of the observer process can easily reach 1.5 CPUs. It would be ideal if we 
> delegate the disk monitoring task to the agent and do it only once. With this 
> approach, when disk collection has improved in the agent (for instance by 
> implementing XFS isolation), we can simply benefit from it without any code 
> change. Some more information about the problem is provided in AURORA-1918.
> 
> This patch introduces `MesosDiskCollector`, which queries the agent's API 
> endpoint to look up `disk_used_bytes`. Note that there is also resource 
> monitoring in thermos executor. Currently, I left the disk collector there to 
> use the `du` implementation. That can be changed in a later patch.
> 
> I modified some vagrant config files including `aurora-executor.service` and 
> `etc_mesos-slave/isolation` for testing. They can be left as is. I included 
> them in this patch to show how this would work e2e.
> 
> 
> Diffs
> -
> 
>   3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
>   RELEASE-NOTES.md 51ab6c724694244bf616b29e9beace4a4a3f5252 
>   docs/reference/observer-configuration.md 
> 8a443c94f7f37f9454989781f722101a97c99f15 
>   examples/jobs/hello_world.aurora 5401bfebe753b5e53abd08baeac501144ced9b5a 
>   examples/vagrant/mesos_config/etc_mesos-slave/isolation 
> 1a7028ffc70116b104ef3ad22b7388f637707a0f 
>   examples/vagrant/systemd/thermos.service 
> 01925bcd2ae44f100df511f3c3951c3f5a1a72aa 
>   src/main/python/apache/aurora/tools/thermos_observer.py 
> dd9f0c46ceac9e939b1b763073314161de0ea614 
>   src/main/python/apache/thermos/monitoring/BUILD 
> 65ba7088f65e7baa5d30744736ba456b46a55e86 
>   src/main/python/apache/thermos/monitoring/disk.py 
> 986d33a5000f8d5db15cb639c81f8b1d756ffa05 
>   src/main/python/apache/thermos/monitoring/resource.py 
> adcdc751c03460dc801a18278faa96d6bd64722b 
>   src/main/python/apache/thermos/observer/task_observer.py 
> a6870d48bddf2a2ccede7bb68195f2baae1d0e47 
>   
> src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py
>  fe74bd1d3ecd89fca1b5b2251202cbbc0f24 
>   src/test/python/apache/thermos/monitoring/BUILD 
> 8f2b39336dce6c7b580e6ba0009f60afdcb89179 
>   src/test/python/apache/thermos/monitoring/test_disk.py 
> 362393bfd1facf3198e2d438d0596b16700b72b8 
>   src/test/python/apache/thermos/monitoring/test_resource.py 
> e577e552d4ee1807096a15401851bb9fd95fa426 
> 
> 
> Diff: https://reviews.apache.org/r/66103/diff/6/
> 
> 
> Testing
> ---
> 
> - I added unit tests.
> - Tested in vagrant and it works as intended.
> - I also built and deployed in our test environment. To measure the 
> performance improvement, I created jobs with nested folders and noticed a 
> reduction in CPU utilization of the Observer process, by at least 60%. (1.5 
> CPU cores to 0.4 CPU cores)
> 
> Here is one specific test setup: On two hosts I created two tasks. Each 
> task creates identical nested directory structures and files in them. The 
> overall size is 30GB. test_host_1 runs the current version of observer and 
> test_host_2 runs Observer with this patch and also has mesos_disk_collection 
> enabled. The results are as follows:
> 
> ```
> rezam[7]TEST_HOST_1 ~ $ while true; do echo `date`; curl localhost:1338/vars 
> -s | grep cpu; sleep 10; done
> Thu Mar 22 04:36:17 UTC 2018
> observer.observer_cpu 108.9
> Thu Mar 22 04:36:27 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:38 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:48 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:58 UTC 2018
> observer.observer_cpu 111.0
> Thu Mar 22 04:37:08 UTC 2018
> 

Re: Review Request 66199: Remove unused LOST_LOCK_MESSAGE variable in JobUpdateControllerImpl

2018-03-22 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66199/#review199771
---


Ship it!




Ship It!

- Stephan Erb


On March 21, 2018, 10:38 p.m., Jordan Ly wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66199/
> ---
> 
> (Updated March 21, 2018, 10:38 p.m.)
> 
> 
> Review request for Aurora, Renan DelValle and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> We no longer use locks for updates (context: 
> https://reviews.apache.org/r/63130/). This was a legacy variable.
> 
> 
> Diffs
> -
> 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  f8be8058f3a80a18b999d2666e2adb33e1e55fef 
> 
> 
> Diff: https://reviews.apache.org/r/66199/diff/1/
> 
> 
> Testing
> ---
> 
> `./gradlew test`
> 
> 
> Thanks,
> 
> Jordan Ly
> 
>



Re: Review Request 66190: Fix 'PreemptorSlotSearchBenchmark', remove 'isProduction' references in benchmark

2018-03-22 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66190/#review199774
---


Ship it!




Ship It!

- Stephan Erb


On March 21, 2018, 9:23 p.m., Jordan Ly wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66190/
> ---
> 
> (Updated March 21, 2018, 9:23 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and 
> Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This benchmark was using the deprecated `production` flag when building the 
> tasks for the cluster. `PendingTaskProcessor` depends on `tier` instead, so 
> this benchmark ended up not testing the correct codepath.
> 
> Removed references to `production` and added `tier` instead. Additionally, 
> removed some unused options.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java 
> ddab2eccb22a93ecb67481f399707d2d82df5db2 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 1f9a5764b502f939f0345ff99fb0fc2830b4c2f0 
>   src/jmh/java/org/apache/aurora/benchmark/Tasks.java 
> 60c62bbf3061650a5dd8654045dc8189293d0190 
> 
> 
> Diff: https://reviews.apache.org/r/66190/diff/4/
> 
> 
> Testing
> ---
> 
> Old:
> ```
> # Run complete. Total time: 00:08:32
> 
> Benchmark                                                                                         (numPendingTasks)   Mode  Cnt         Score         Error  Units
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                                                    1  thrpt   10        57.670 ±      20.451  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate                                     1  thrpt   10       595.374 ±     211.805  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm                                1  thrpt   10  10830342.916 ±     380.919  B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space                            1  thrpt   10       593.530 ±     222.002  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm                       1  thrpt   10  10717947.102 ± 1280229.296  B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space                        1  thrpt   10         0.305 ±       1.264  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm                   1  thrpt   10     13552.434 ±   61403.918  B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count                                          1  thrpt   10        60.000                counts
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time                                           1  thrpt   10       202.000                ms
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·stack                                             1  thrpt                NaN                ---
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                                                   10  thrpt   10        52.161 ±       8.526  ops/s
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate                                    10  thrpt   10       550.771 ±      89.939  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.alloc.rate.norm                               10  thrpt   10  11074211.352 ±     318.376  B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space                           10  thrpt   10       550.125 ±     107.470  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Eden_Space.norm                      10  thrpt   10  11073792.311 ± 1621636.993  B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space                       10  thrpt   10         0.038 ±       0.049  MB/sec
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.churn.PS_Survivor_Space.norm                  10  thrpt   10       737.753 ±     919.460  B/op
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.count                                         10  thrpt   10        55.000                counts
> SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark:·gc.time                                          10  thrpt   10       155.000                ms
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Franck Cuny via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199791
---




src/main/python/apache/thermos/monitoring/disk.py
Lines 132 (patched)


we should specify a default timeout
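
A minimal sketch of what a default timeout could look like. To stay dependency-free it uses the standard library's `urllib.request.urlopen(url, timeout=...)` as a stand-in. With the `requests` library the equivalent would be passing `timeout=` to `requests.get`. The function name, the 5-second default, and the choice to return `None` on failure are assumptions, not the patch's actual code.

```python
import json
import urllib.request

DEFAULT_TIMEOUT_SECS = 5.0  # hypothetical default, not taken from the patch


def fetch_containers(url, timeout_secs=DEFAULT_TIMEOUT_SECS):
    """Fetch and decode JSON from url, returning None if the call fails.

    Connection errors, HTTP errors and timeouts surface as OSError
    subclasses; malformed JSON surfaces as ValueError.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_secs) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Without a timeout, a stuck agent could block the collector forever.
        return None
```

A caller would then treat `None` as "no sample this round" rather than waiting indefinitely on an unresponsive agent.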


- Franck Cuny


On March 22, 2018, 5:16 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 22, 2018, 5:16 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
> Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> When disk isolation is enabled in a Mesos agent it calculates the disk usage 
> for each container. 
> Thermos Observer also monitors disk usage using `twitter.common.dirutil`, 
> essentially repeating the work already done by the agent. In practice, we see 
> that disk monitoring is one of the most expensive resource monitoring tasks. 
> For instance, when there are deeply nested directories, the CPU utilization 
> of the observer process can easily reach 1.5 CPUs. It would be ideal if we 
> delegate the disk monitoring task to the agent and do it only once. With this 
> approach, when disk collection has improved in the agent (for instance by 
> implementing XFS isolation), we can simply benefit from it without any code 
> change. Some more information about the problem is provided in AURORA-1918.
> 
> This patch introduces `MesosDiskCollector`, which queries the agent's API 
> endpoint to look up `disk_used_bytes`. Note that there is also resource 
> monitoring in thermos executor. Currently, I left the disk collector there to 
> use the `du` implementation. That can be changed in a later patch.
> 
> I modified some vagrant config files including `aurora-executor.service` and 
> `etc_mesos-slave/isolation` for testing. They can be left as is. I included 
> them in this patch to show how this would work e2e.
> 
> 
> Diffs
> -
> 
>   3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
>   RELEASE-NOTES.md 51ab6c724694244bf616b29e9beace4a4a3f5252 
>   docs/reference/observer-configuration.md 
> 8a443c94f7f37f9454989781f722101a97c99f15 
>   examples/jobs/hello_world.aurora 5401bfebe753b5e53abd08baeac501144ced9b5a 
>   examples/vagrant/mesos_config/etc_mesos-slave/isolation 
> 1a7028ffc70116b104ef3ad22b7388f637707a0f 
>   examples/vagrant/systemd/thermos.service 
> 01925bcd2ae44f100df511f3c3951c3f5a1a72aa 
>   src/main/python/apache/aurora/tools/thermos_observer.py 
> dd9f0c46ceac9e939b1b763073314161de0ea614 
>   src/main/python/apache/thermos/monitoring/BUILD 
> 65ba7088f65e7baa5d30744736ba456b46a55e86 
>   src/main/python/apache/thermos/monitoring/disk.py 
> 986d33a5000f8d5db15cb639c81f8b1d756ffa05 
>   src/main/python/apache/thermos/monitoring/resource.py 
> adcdc751c03460dc801a18278faa96d6bd64722b 
>   src/main/python/apache/thermos/observer/task_observer.py 
> a6870d48bddf2a2ccede7bb68195f2baae1d0e47 
>   
> src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py
>  fe74bd1d3ecd89fca1b5b2251202cbbc0f24 
>   src/test/python/apache/thermos/monitoring/BUILD 
> 8f2b39336dce6c7b580e6ba0009f60afdcb89179 
>   src/test/python/apache/thermos/monitoring/test_disk.py 
> 362393bfd1facf3198e2d438d0596b16700b72b8 
>   src/test/python/apache/thermos/monitoring/test_resource.py 
> e577e552d4ee1807096a15401851bb9fd95fa426 
> 
> 
> Diff: https://reviews.apache.org/r/66103/diff/6/
> 
> 
> Testing
> ---
> 
> - I added unit tests.
> - Tested in vagrant and it works as intended.
> - I also built and deployed in our test environment. To measure the 
> performance improvement, I created jobs with nested folders and noticed a 
> reduction in CPU utilization of the Observer process, by at least 60%. (1.5 
> CPU cores to 0.4 CPU cores)
> 
> Here is one specific test setup: On two hosts I created two tasks. Each 
> task creates identical nested directory structures and files in them. The 
> overall size is 30GB. test_host_1 runs the current version of observer and 
> test_host_2 runs Observer with this patch and also has mesos_disk_collection 
> enabled. The results are as follows:
> 
> ```
> rezam[7]TEST_HOST_1 ~ $ while true; do echo `date`; curl localhost:1338/vars 
> -s | grep cpu; sleep 10; done
> Thu Mar 22 04:36:17 UTC 2018
> observer.observer_cpu 108.9
> Thu Mar 22 04:36:27 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:38 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:48 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:58 UTC 2018
> observer.observer_cpu 111.0
> Thu Mar 22 04:37:08 UTC 2018
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/
---

(Updated March 22, 2018, 9:52 p.m.)


Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.


Changes
---

- Addressed feedback by adding a timeout to `requests.get`.
- I also removed the `default_settings` class method in `DiskCollectorSettings` 
class since it was essentially calling the constructor. Now instead of 
`DiskCollectorSettings.default_settings()` I just call 
`DiskCollectorSettings()`. I don't think `default_settings` was very pythonic.
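
The simplification in the second item can be illustrated with a small sketch. The field names and default values below are hypothetical. Only the class name `DiskCollectorSettings` comes from the change description.

```python
class DiskCollectorSettings(object):
    """Sketch of a settings holder whose constructor already carries defaults.

    The fields and their default values are hypothetical placeholders.
    """

    def __init__(self,
                 http_api_url="http://localhost:5051/containers",
                 collection_timeout_secs=5.0):
        self.http_api_url = http_api_url
        self.collection_timeout_secs = collection_timeout_secs


# With defaults on the constructor, a separate default_settings() class
# method adds nothing: calling the constructor with no arguments is enough.
defaults = DiskCollectorSettings()
custom = DiskCollectorSettings(collection_timeout_secs=1.0)
print(defaults.collection_timeout_secs, custom.collection_timeout_secs)
```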


Repository: aurora


Description
---

When disk isolation is enabled in a Mesos agent it calculates the disk usage 
for each container. 
Thermos Observer also monitors disk usage using `twitter.common.dirutil`, 
essentially repeating the work already done by the agent. In practice, we see 
that disk monitoring is one of the most expensive resource monitoring tasks. 
For instance, when there are deeply nested directories, the CPU utilization of 
the observer process can easily reach 1.5 CPUs. It would be ideal if we 
delegate the disk monitoring task to the agent and do it only once. With this 
approach, when disk collection has improved in the agent (for instance by 
implementing XFS isolation), we can simply benefit from it without any code 
change. Some more information about the problem is provided in AURORA-1918.

This patch introduces `MesosDiskCollector`, which queries the agent's API 
endpoint to look up `disk_used_bytes`. Note that there is also resource monitoring 
in thermos executor. Currently, I left the disk collector there to use the `du` 
implementation. That can be changed in a later patch.

I modified some vagrant config files including `aurora-executor.service` and 
`etc_mesos-slave/isolation` for testing. They can be left as is. I included 
them in this patch to show how this would work e2e.


Diffs (updated)
-

  3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
  RELEASE-NOTES.md 51ab6c724694244bf616b29e9beace4a4a3f5252 
  docs/reference/observer-configuration.md 
8a443c94f7f37f9454989781f722101a97c99f15 
  examples/jobs/hello_world.aurora 5401bfebe753b5e53abd08baeac501144ced9b5a 
  examples/vagrant/mesos_config/etc_mesos-slave/isolation 
1a7028ffc70116b104ef3ad22b7388f637707a0f 
  examples/vagrant/systemd/thermos.service 
01925bcd2ae44f100df511f3c3951c3f5a1a72aa 
  src/main/python/apache/aurora/tools/thermos_observer.py 
dd9f0c46ceac9e939b1b763073314161de0ea614 
  src/main/python/apache/thermos/monitoring/BUILD 
65ba7088f65e7baa5d30744736ba456b46a55e86 
  src/main/python/apache/thermos/monitoring/disk.py 
986d33a5000f8d5db15cb639c81f8b1d756ffa05 
  src/main/python/apache/thermos/monitoring/resource.py 
adcdc751c03460dc801a18278faa96d6bd64722b 
  src/main/python/apache/thermos/observer/task_observer.py 
a6870d48bddf2a2ccede7bb68195f2baae1d0e47 
  
src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py
 fe74bd1d3ecd89fca1b5b2251202cbbc0f24 
  src/test/python/apache/thermos/monitoring/BUILD 
8f2b39336dce6c7b580e6ba0009f60afdcb89179 
  src/test/python/apache/thermos/monitoring/test_disk.py 
362393bfd1facf3198e2d438d0596b16700b72b8 
  src/test/python/apache/thermos/monitoring/test_resource.py 
e577e552d4ee1807096a15401851bb9fd95fa426 


Diff: https://reviews.apache.org/r/66103/diff/7/

Changes: https://reviews.apache.org/r/66103/diff/6-7/


Testing
---

- I added unit tests.
- Tested in vagrant and it works as intended.
- I also built and deployed in our test environment. To measure the 
performance improvement, I created jobs with nested folders and noticed a reduction 
in CPU utilization of the Observer process, by at least 60%. (1.5 CPU cores to 
0.4 CPU cores)

Here is one specific test setup: On two hosts I created two tasks. Each task 
creates identical nested directory structures and files in them. The overall 
size is 30GB. test_host_1 runs the current version of observer and test_host_2 
runs Observer with this patch and also has mesos_disk_collection enabled. The 
results are as follows:

```
rezam[7]TEST_HOST_1 ~ $ while true; do echo `date`; curl localhost:1338/vars -s 
| grep cpu; sleep 10; done
Thu Mar 22 04:36:17 UTC 2018
observer.observer_cpu 108.9
Thu Mar 22 04:36:27 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:38 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:48 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:58 UTC 2018
observer.observer_cpu 111.0
Thu Mar 22 04:37:08 UTC 2018
observer.observer_cpu 111.0
Thu Mar 22 04:37:18 UTC 2018
observer.observer_cpu 111.0


rezam[7]TEST_HOST_2 ~ $ while true; do echo `date`; curl localhost:1338/vars -s 
| grep cpu; sleep 10; done
Thu Mar 22 04:36:20 UTC 2018
observer.observer_cpu 1.3
Thu Mar 22 04:36:30 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199741
---


Ship it!




Master (f32086d) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 21, 2018, 10:29 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 21, 2018, 10:29 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Jordan Ly, 
> Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> When disk isolation is enabled in a Mesos agent it calculates the disk usage 
> for each container. 
> Thermos Observer also monitors disk usage using `twitter.common.dirutil`, 
> essentially repeating the work already done by the agent. In practice, we see 
> that disk monitoring is one of the most expensive resource monitoring tasks. 
> For instance, when there are deeply nested directories, the CPU utilization 
> of the observer process can easily reach 1.5 CPUs. It would be ideal if we 
> delegate the disk monitoring task to the agent and do it only once. With this 
> approach, when disk collection has improved in the agent (for instance by 
> implementing XFS isolation), we can simply benefit from it without any code 
> change. Some more information about the problem is provided in AURORA-1918.
> 
> This patch introduces `MesosDiskCollector`, which queries the agent's API 
> endpoint to look up `disk_used_bytes`. Note that there is also resource 
> monitoring in thermos executor. Currently, I left the disk collector there to 
> use the `du` implementation. That can be changed in a later patch.
> 
> I modified some vagrant config files including `aurora-executor.service` and 
> `etc_mesos-slave/isolation` for testing. They can be left as is. I included 
> them in this patch to show how this would work e2e.
> 
> 
> Diffs
> -
> 
>   3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11 
>   RELEASE-NOTES.md 51ab6c724694244bf616b29e9beace4a4a3f5252 
>   docs/reference/observer-configuration.md 
> 8a443c94f7f37f9454989781f722101a97c99f15 
>   examples/jobs/hello_world.aurora 5401bfebe753b5e53abd08baeac501144ced9b5a 
>   examples/vagrant/mesos_config/etc_mesos-slave/isolation 
> 1a7028ffc70116b104ef3ad22b7388f637707a0f 
>   examples/vagrant/systemd/thermos.service 
> 01925bcd2ae44f100df511f3c3951c3f5a1a72aa 
>   src/main/python/apache/aurora/tools/thermos_observer.py 
> dd9f0c46ceac9e939b1b763073314161de0ea614 
>   src/main/python/apache/thermos/monitoring/BUILD 
> 65ba7088f65e7baa5d30744736ba456b46a55e86 
>   src/main/python/apache/thermos/monitoring/disk.py 
> 986d33a5000f8d5db15cb639c81f8b1d756ffa05 
>   src/main/python/apache/thermos/monitoring/resource.py 
> adcdc751c03460dc801a18278faa96d6bd64722b 
>   src/main/python/apache/thermos/observer/task_observer.py 
> a6870d48bddf2a2ccede7bb68195f2baae1d0e47 
>   
> src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py
>  fe74bd1d3ecd89fca1b5b2251202cbbc0f24 
>   src/test/python/apache/thermos/monitoring/BUILD 
> 8f2b39336dce6c7b580e6ba0009f60afdcb89179 
>   src/test/python/apache/thermos/monitoring/test_disk.py 
> 362393bfd1facf3198e2d438d0596b16700b72b8 
>   src/test/python/apache/thermos/monitoring/test_resource.py 
> e577e552d4ee1807096a15401851bb9fd95fa426 
> 
> 
> Diff: https://reviews.apache.org/r/66103/diff/5/
> 
> 
> Testing
> ---
> 
> - I added unit tests.
> - Tested in vagrant and it works as intended.
> - I also built and deployed in our test environment. To measure the 
> performance improvement, I created jobs with nested folders and noticed a 
> reduction in CPU utilization of the Observer process, by at least 60%. (1.5 
> CPU cores to 0.4 CPU cores)
> 
> Here is one specific test setup: On two hosts I created two tasks. Each 
> task creates identical nested directory structures and files in them. The 
> overall size is 30GB. test_host_1 runs the current version of observer and 
> test_host_2 runs Observer with this patch and also has mesos_disk_collection 
> enabled. The results are as follows:
> 
> ```
> rezam[7]TEST_HOST_1 ~ $ while true; do echo `date`; curl localhost:1338/vars 
> -s | grep cpu; sleep 10; done
> Thu Mar 22 04:36:17 UTC 2018
> observer.observer_cpu 108.9
> Thu Mar 22 04:36:27 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:38 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:48 UTC 2018
> observer.observer_cpu 123.2
> Thu Mar 22 04:36:58 UTC 2018
> observer.observer_cpu 111.0
> Thu Mar 22 04:37:08 UTC 2018
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/
---

(Updated March 23, 2018, 12:59 a.m.)


Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.


Changes
---

- Update style check comments
- Addressed feedback regarding tests and early compilation of JMESPath.
- I added an additional test for the timeout using dynamic response callback in 
HTTPretty as well.


Repository: aurora


Description
---

When disk isolation is enabled in a Mesos agent it calculates the disk usage 
for each container. 
Thermos Observer also monitors disk usage using `twitter.common.dirutil`, 
essentially repeating the work already done by the agent. In practice, we see 
that disk monitoring is one of the most expensive resource monitoring tasks. 
For instance, when there are deeply nested directories, the CPU utilization of 
the observer process can easily reach 1.5 CPUs. It would be ideal if we 
delegate the disk monitoring task to the agent and do it only once. With this 
approach, when disk collection has improved in the agent (for instance by 
implementing XFS isolation), we can simply benefit from it without any code 
change. Some more information about the problem is provided in AURORA-1918.

This patch introduces `MesosDiskCollector`, which queries the agent's API 
endpoint to look up `disk_used_bytes`. Note that there is also resource monitoring 
in thermos executor. Currently, I left the disk collector there to use the `du` 
implementation. That can be changed in a later patch.

I modified some vagrant config files including `aurora-executor.service` and 
`etc_mesos-slave/isolation` for testing. They can be left as is. I included 
them in this patch to show how this would work e2e.


Diffs (updated)
-

  3rdparty/python/requirements.txt 4ac242cfa2c1c19cb7447816ab86e748839d3d11
  RELEASE-NOTES.md 51ab6c724694244bf616b29e9beace4a4a3f5252
  docs/reference/observer-configuration.md 8a443c94f7f37f9454989781f722101a97c99f15
  examples/jobs/hello_world.aurora 5401bfebe753b5e53abd08baeac501144ced9b5a
  examples/vagrant/mesos_config/etc_mesos-slave/isolation 1a7028ffc70116b104ef3ad22b7388f637707a0f
  examples/vagrant/systemd/thermos.service 01925bcd2ae44f100df511f3c3951c3f5a1a72aa
  src/main/python/apache/aurora/tools/thermos_observer.py dd9f0c46ceac9e939b1b763073314161de0ea614
  src/main/python/apache/thermos/monitoring/BUILD 65ba7088f65e7baa5d30744736ba456b46a55e86
  src/main/python/apache/thermos/monitoring/disk.py 986d33a5000f8d5db15cb639c81f8b1d756ffa05
  src/main/python/apache/thermos/monitoring/resource.py adcdc751c03460dc801a18278faa96d6bd64722b
  src/main/python/apache/thermos/observer/task_observer.py a6870d48bddf2a2ccede7bb68195f2baae1d0e47
  src/test/python/apache/aurora/executor/common/test_resource_manager_integration.py fe74bd1d3ecd89fca1b5b2251202cbbc0f24
  src/test/python/apache/thermos/monitoring/BUILD 8f2b39336dce6c7b580e6ba0009f60afdcb89179
  src/test/python/apache/thermos/monitoring/test_disk.py 362393bfd1facf3198e2d438d0596b16700b72b8
  src/test/python/apache/thermos/monitoring/test_resource.py e577e552d4ee1807096a15401851bb9fd95fa426


Diff: https://reviews.apache.org/r/66103/diff/8/

Changes: https://reviews.apache.org/r/66103/diff/7-8/


Testing
---

- I added unit tests.
- Tested in vagrant and it works as intended.
- I also built and deployed in our test environment. To measure the improved 
performance, I created jobs with nested folders and noticed a reduction in CPU 
utilization of the Observer process of at least 60% (1.5 CPU cores to 0.4 CPU 
cores).
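
As a quick sanity check of the figure above, the relative reduction from 1.5 
to 0.4 CPU cores can be computed directly. The helper name 
`relative_reduction` is only illustrative.

```python
def relative_reduction(before, after):
  """Return the fractional reduction when going from `before` to `after`."""
  return (before - after) / before


# 1.5 CPU cores before the patch, 0.4 CPU cores after it.
reduction = relative_reduction(1.5, 0.4)
print("{:.1%}".format(reduction))  # prints 73.3%
```

This is consistent with the "at least 60%" stated above.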

Here is one specific test setup: on two hosts I created two tasks. Each task 
creates an identical nested directory structure with files in it. The overall 
size is 30GB. test_host_1 runs the current version of Observer and test_host_2 
runs Observer with this patch and also has mesos_disk_collection enabled. The 
results are as follows:

```
rezam[7]TEST_HOST_1 ~ $ while true; do echo `date`; curl localhost:1338/vars -s | grep cpu; sleep 10; done
Thu Mar 22 04:36:17 UTC 2018
observer.observer_cpu 108.9
Thu Mar 22 04:36:27 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:38 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:48 UTC 2018
observer.observer_cpu 123.2
Thu Mar 22 04:36:58 UTC 2018
observer.observer_cpu 111.0
Thu Mar 22 04:37:08 UTC 2018
observer.observer_cpu 111.0
Thu Mar 22 04:37:18 UTC 2018
observer.observer_cpu 111.0


rezam[7]TEST_HOST_2 ~ $ while true; do echo `date`; curl localhost:1338/vars -s | grep cpu; sleep 10; done
Thu Mar 22 04:36:20 UTC 2018
observer.observer_cpu 1.3
Thu Mar 22 04:36:30 UTC 2018
observer.observer_cpu 1.3
Thu Mar 22 04:36:40 UTC 2018
observer.observer_cpu 1.3
Thu Mar 22 04:36:50 UTC 2018
observer.observer_cpu 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199830
---


Ship it!




Master (f32086d) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 23, 2018, 1:11 a.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 23, 2018, 1:11 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
> Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199822
---




src/test/python/apache/thermos/monitoring/BUILD
Lines 21 (patched)


I have built observer with this patch a couple of times on Jenkins boxes as 
well. 

Are you referring to `requests-kerberos` and `requests-mock`? I will 
include them here as well.


- Reza Motamedi


On March 22, 2018, 9:52 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 22, 2018, 9:52 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
> Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199814
---


Fix it, then Ship it!




Implementation LGTM. Some more comments about the tests. Fix it and ship it.


src/test/python/apache/thermos/monitoring/test_disk.py
Lines 97 (patched)


Wherever HTTPretty is used, also assert on `httpretty.last_request()`.

https://github.com/gabrielfalcao/HTTPretty



src/test/python/apache/thermos/monitoring/test_disk.py
Lines 211 (patched)


s/uage/usage/



src/test/python/apache/thermos/monitoring/test_disk.py
Lines 284 (patched)


We can use `dynamic responses through callbacks` and mock a connection 
error with HTTPretty without patching `requests`.


https://github.com/gabrielfalcao/HTTPretty#dynamic-responses-through-callbacks



src/test/python/apache/thermos/monitoring/test_disk.py
Lines 305 (patched)


Assert `request.get` was called indeed.


- Santhosh Kumar Shanmugham


On March 22, 2018, 2:52 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 22, 2018, 2:52 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
> Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Reza Motamedi


> On March 22, 2018, 10:31 p.m., Stephan Erb wrote:
> > src/test/python/apache/thermos/monitoring/BUILD
> > Lines 21 (patched)
> > 
> >
> > Requests has a few dependencies. I believe you need to list those here 
> > as well in order to ensure pants pulls in the correct versions.

I have built observer with this patch a couple of times on Jenkins boxes as 
well. 


Are you referring to requests-kerberos and requests-mock? I will include them 
here as well.


- Reza


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199812
---


On March 22, 2018, 9:52 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 22, 2018, 9:52 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
> Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199838
---


Ship it!




Ship It!

- Santhosh Kumar Shanmugham


On March 22, 2018, 6:11 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 22, 2018, 6:11 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
> Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199812
---



Querying information from Mesos rather than re-doing it in Thermos is a step 
in the right direction. Overall this looks good to me. Below are just a few 
nitpicks.


src/main/python/apache/thermos/monitoring/disk.py
Lines 99-105 (patched)


Please mark the attributes that are only used internally with `_` so that 
it is easier to see which ones belong to the public interface.
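
A minimal sketch of the convention, using a hypothetical class rather than 
the one in the patch. The leading underscore marks state that callers should 
not touch; names without it form the public interface.

```python
class ExampleCollector(object):
  """Hypothetical collector used only to illustrate the naming convention."""

  def __init__(self, root):
    self.root = root        # public: callers may read this
    self._last_value = 0    # internal: implementation detail

  @property
  def value(self):
    """Public, read-only view of the internal state."""
    return self._last_value

  def sample(self, new_value):
    """Public method that updates the internal state."""
    self._last_value = new_value


collector = ExampleCollector("/tmp/sandbox")
collector.sample(5)
```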



src/main/python/apache/thermos/monitoring/disk.py
Lines 112-114 (patched)


The indentation here and in a few other places is slightly off. Aurora 
tends to follow this style guide: 
https://github.com/twitter/commons/blob/master/src/python/twitter/common/styleguide.md#best-practices



src/main/python/apache/thermos/monitoring/disk.py
Lines 174 (patched)


Should we do the jmespath compilation here? If there is an error it will 
throw early and only once.
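
A sketch of the compile-once idea. To keep it runnable with the standard 
library alone, `re.compile` stands in for the jmespath compilation, so this 
is an analogy rather than the patch's code. As far as I know 
`jmespath.compile` follows the same pattern, returning a reusable object and 
raising on an invalid expression. The class name is hypothetical.

```python
import re


class PatternMatcher(object):
  """Hypothetical example: compile in the constructor, reuse on every call."""

  def __init__(self, pattern):
    # An invalid pattern raises re.error here, at construction time, instead
    # of on every later lookup.
    self._compiled = re.compile(pattern)

  def first_match(self, text):
    """Return the first match of the precompiled pattern, or None."""
    match = self._compiled.search(text)
    return match.group(0) if match else None


matcher = PatternMatcher(r"\d+")
found = matcher.first_match("disk_used_bytes 42")
```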



src/test/python/apache/thermos/monitoring/BUILD
Lines 21 (patched)


Requests has a few dependencies. I believe you need to list those here as 
well in order to ensure pants pulls in the correct versions.


- Stephan Erb


On March 22, 2018, 10:52 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66103/
> ---
> 
> (Updated March 22, 2018, 10:52 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Franck Cuny, 
> Jordan Ly, Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 

Re: Review Request 66103: Introduce mesos disk collector

2018-03-22 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66103/#review199815
---



Master (f32086d) is red with this patch.
  ./build-support/jenkins/build.sh

 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[add_instances] <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_hooked_api.py PASSED [ 46%]
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[create_job] <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_hooked_api.py PASSED [ 53%]
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[kill_job] <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_hooked_api.py PASSED [ 60%]
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[restart] <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_hooked_api.py PASSED [ 66%]
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[start_cronjob] <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_hooked_api.py PASSED [ 73%]
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[start_job_update] <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_hooked_api.py PASSED [ 80%]
src/test/python/apache/aurora/client/hooks/test_non_hooked_api.py::TestNonHookedAuroraClientAPI::test_kill_job_discards_config <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_non_hooked_api.py PASSED [ 86%]
src/test/python/apache/aurora/client/hooks/test_non_hooked_api.py::TestNonHookedAuroraClientAPI::test_restart_discards_config <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_non_hooked_api.py PASSED [ 93%]
src/test/python/apache/aurora/client/hooks/test_non_hooked_api.py::TestNonHookedAuroraClientAPI::test_start_cronjob_discards_config <- .pants.d/pyprep/sources/c19e1cfebce41d1e9b9c5fa55be409ab288ab83d/apache/aurora/client/hooks/test_non_hooked_api.py PASSED [100%]
 
  generated xml file: /home/jenkins/jenkins-slave/workspace/AuroraBot/.pants.d/test/pytest/src.test.python.apache.aurora.client.hooks.hooks/junitxml/TEST-src.test.python.apache.aurora.client.hooks.hooks.xml
 
 === 15 passed in 0.31 seconds 
 
   src.test.python.apache.aurora.admin.admin
   .   SUCCESS
   src.test.python.apache.aurora.client.client  
   .   SUCCESS
   src.test.python.apache.aurora.client.api.api 
   .   SUCCESS
   src.test.python.apache.aurora.client.cli.cli 
   .   SUCCESS
   src.test.python.apache.aurora.client.docker.docker   
   .   SUCCESS
   src.test.python.apache.aurora.client.hooks.hooks 
   .   SUCCESS
   src.test.python.apache.aurora.common.common
   .   SUCCESS
   src.test.python.apache.aurora.common.health_check.health_check
   .   SUCCESS
   src.test.python.apache.aurora.config.config  
   .   SUCCESS
   src.test.python.apache.aurora.executor.executor  
   .   FAILURE
   src.test.python.apache.aurora.executor.bin.bin   
   .   SUCCESS
   src.test.python.apache.aurora.executor.common.common 
   .   SUCCESS
   src.test.python.apache.aurora.tools.tools
   .   SUCCESS
   src.test.python.apache.thermos.cli.cli   
   .   SUCCESS
   src.test.python.apache.thermos.cli.commands.commands 
   .   SUCCESS
   src.test.python.apache.thermos.common.common 
   .   SUCCESS
   src.test.python.apache.thermos.config.config 
   .