-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61016/#review181152
-----------------------------------------------------------
Master (8f5a591) is red with this patch.
./build-support/jenkins/build.sh
    def test_runner_state_reconstruction(self):
>     assert self.state ==
self.runner.reconstructed_state
E     assert RunnerState(header=None,
processes=None, statuses=None) == None
E      + where RunnerState(header=None,
processes=None, statuses=None) = <test_finalization.TestRegularFinalizingTask
object at 0x7f88745da090>.state
E      + and None = None
E      + where None =
<apache.thermos.testing.runner.Runner object at
0x7f88745daa10>.reconstructed_state
E      + where
<apache.thermos.testing.runner.Runner object at 0x7f88745daa10> =
<test_finalization.TestRegularFinalizingTask object at
0x7f88745da090>.runner
.pants.d/python-setup/chroots/6108b131782500e43b1f032e7433d264e763b3e9/apache/thermos/testing/runner.py:212:
AssertionError
------------- Captured stderr setup --------------
ERROR:root:Failed to recover from
/tmp/tmpPdZxLt/checkpoints/1500672624993483-runner-base/runner: [Errno 2] No
such file or directory:
'/tmp/tmpPdZxLt/checkpoints/1500672624993483-runner-base/runner'
__ TestRegularFinalizingTask.test_runner_state ___
self = <test_finalization.TestRegularFinalizingTask object
at 0x7f88745b7e10>
    def test_runner_state(self):
>     assert self.state.statuses[-1].state ==
TaskState.SUCCESS
E     TypeError: 'NoneType' object has no
attribute '__getitem__'
src/test/python/apache/thermos/core/test_finalization.py:30: TypeError
TestRegularFinalizingTask.test_runner_process_in_expected_states
self = <test_finalization.TestRegularFinalizingTask object
at 0x7f88745aea50>
    def test_runner_process_in_expected_states(self):
      history = self.state.processes
      for process in ('main', 'finalizer'):
>       assert len(history[process]) == 1
E     TypeError: 'NoneType' object has no
attribute '__getitem__'
src/test/python/apache/thermos/core/test_finalization.py:35: TypeError
generated xml file:
/home/jenkins/jenkins-slave/workspace/AuroraBot/dist/test-results/aaf4d108c31293299a0839bdc404a91802f80937.xml
3 failed, 794 passed, 6 skipped, 1 warnings in 298.43 seconds
FAILURE
21:33:41 05:34 [complete]
FAILURE
I will refresh this build result if you post a review containing "@ReviewBot
retry"
- Aurora ReviewBot
On July 21, 2017, 9:12 p.m., Reza Motamedi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61016/
> -----------------------------------------------------------
>
> (Updated July 21, 2017, 9:12 p.m.)
>
>
> Review request for Aurora, Santhosh Kumar Shanmugham and Stephan Erb.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> # lock psutil's oneshot
>
> TL;DR: psutil's `oneshot` is not thread-safe.
>
> After a lot of testing on busy machines, I realized that psutil's oneshot is
> not thread-safe. I contacted the developer but have not received a
> concrete fix.
>
> Please read https://issues.apache.org/jira/browse/AURORA-1939 and
> https://github.com/giampaolo/psutil/issues/1110 for more information.
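
For readers following along, here is a minimal sketch of what "locking
oneshot" could look like. It is not the diff under review. The names
`_ONESHOT_LOCK`, `locked_oneshot` and `FakeProcess` are hypothetical, and
`FakeProcess` is only a stand-in for `psutil.Process` so the snippet runs
without psutil installed.

```python
import threading
from contextlib import contextmanager

# Hypothetical module-level lock shared by all collector threads.
_ONESHOT_LOCK = threading.Lock()


@contextmanager
def locked_oneshot(process, lock=_ONESHOT_LOCK):
    """Enter process.oneshot() while holding `lock` (one thread at a time)."""
    with lock:
        with process.oneshot():
            yield process


class FakeProcess(object):
    """Hypothetical stand-in for psutil.Process; counts concurrent entries."""

    def __init__(self):
        self.active = 0
        self.max_active = 0

    @contextmanager
    def oneshot(self):
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        try:
            yield
        finally:
            self.active -= 1


def sample(process, rounds=200):
    # Each round enters oneshot() only through the locked wrapper.
    for _ in range(rounds):
        with locked_oneshot(process):
            pass


proc = FakeProcess()
threads = [threading.Thread(target=sample, args=(proc,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A single global lock serializes sampling across all processes, which is the
simplest option but may cost throughput; a per-process lock would be a
possible refinement.
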
>
>
> Diffs
> -----
>
> src/main/python/apache/thermos/monitoring/process_collector_psutil.py
> 3594955c68b45ab65c01426ba0a18ec8a132a27f
>
>
> Diff: https://reviews.apache.org/r/61016/diff/1/
>
>
> Testing
> -------
>
> The following test was run after adding extra logging to the current code:
>
>
> ```
> ...
> cpu_times = process.cpu_times()
> + log.debug("process:{} cpu times {}".format(process, cpu_times))
> user, system = cpu_times.user, cpu_times.system
> memory_info = p
> ...
> ```
>
> ```
> $ grep '36350'
> thermos-observer.XXXX.prod.twttr.net.root.log.DEBUG.20170721-163950.9421
> D0721 16:55:28.242974 9421 process_collector_psutil.py:40]
> process:psutil.Process(pid=36350, name='mesos-slave') cpu times
> pcputimes(user=2500.95, system=4487.06, children_user=0.0,
> children_system=0.0)
> D0721 17:11:21.940462 9421 process_collector_psutil.py:40]
> process:psutil.Process(pid=36350, name='bash') cpu times pcputimes(user=0.0,
> system=0.03, children_user=0.0, children_system=0.0)
> D0721 17:11:22.247414 9421 process_collector_psutil.py:111] Calculated rate
> for pid=34339 and children: -7.32560348996 (old: 6988.040000, new: 0.060000)
> {34339: 1498166704.32, 36350: 1498166720.51} -> {34339: 1498166704.32, 36350:
> 1498166720.51} [{34339: ProcessSample(rate=0.0, user=0.0, system=0.03,
> rss=2777088, vms=11919360, nice=0, status='sleeping', threads=1), 36350:
> ProcessSample(rate=0.0, user=2500.95, system=4487.06, rss=41906176,
> vms=1601019904, nice=0, status='sleeping', threads=20)}] [{34339:
> ProcessSample(rate=0.0, user=0.0, system=0.03, rss=2777088, vms=11919360,
> nice=0, status='sleeping', threads=1), 36350: ProcessSample(rate=0.0,
> user=0.0, system=0.03, rss=41906176, vms=1601019904, nice=0,
> status='sleeping', threads=20)}]
> ```
>
> These inconsistencies disappear after removing oneshot.
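
The negative rate in the log above can be checked by hand. The sketch below
only sums the user and system CPU seconds from the two samples as logged; the
variable names are illustrative and the sampling interval is left out because
the log does not state it directly.

```python
# CPU seconds (user + system) per pid, copied from the two logged samples.
old_sample = {34339: 0.0 + 0.03, 36350: 2500.95 + 4487.06}
new_sample = {34339: 0.0 + 0.03, 36350: 0.0 + 0.03}

old_total = sum(old_sample.values())
new_total = sum(new_sample.values())

# Cumulative CPU time should never decrease, so a negative delta means the
# second sample for pid 36350 picked up another process's counters.
delta = new_total - old_total

print("old=%.2f new=%.2f delta=%.2f" % (old_total, new_total, delta))
```

The totals match the "old: 6988.040000, new: 0.060000" values in the log, and
any positive interval turns that delta into a negative rate.
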
>
>
> Thanks,
>
> Reza Motamedi
>
>