> On June 27, 2017, 9:10 p.m., Stephan Erb wrote: > > src/main/python/apache/thermos/monitoring/resource.py > > Line 54 (original), 53 (patched) > > <https://reviews.apache.org/r/60376/diff/1/?file=1758879#file1758879line54> > > > > It might be worth to rename this to something like > > 'AggregateResourceResult' or 'TaskResourceResult'.
Makes sense. > On June 27, 2017, 9:10 p.m., Stephan Erb wrote: > > src/main/python/apache/thermos/monitoring/resource.py > > Line 160 (original), 166 (patched) > > <https://reviews.apache.org/r/60376/diff/1/?file=1758879#file1758879line167> > > > > The usages of `ts_value_pair` has made it difficult to understand what > > is going on. I believe it becomes a little bit clearer to the casual reader > > if you do something like this: > > > > timestamp, resources = self._history.get(timestamp) > > > > This will replace usages of `ts_value_pair[1].proc_usage.values()` with > > `resources.proc_usage.values()` which makes it slightly clearer. > > > > Calling this `full_resources` instead of `resources` should work as > > well. Cool. The code looks much nicer. > On June 27, 2017, 9:10 p.m., Stephan Erb wrote: > > src/main/python/apache/thermos/monitoring/resource.py > > Lines 190 (patched) > > <https://reviews.apache.org/r/60376/diff/1/?file=1758879#file1758879line191> > > > > in this case you coud do > > > > _, resources = self._history.get(time.time()) > > > > > > as you don't care about the timestamp. Done. > On June 27, 2017, 9:10 p.m., Stephan Erb wrote: > > src/main/python/apache/thermos/monitoring/resource.py > > Line 222 (original), 243-244 (patched) > > <https://reviews.apache.org/r/60376/diff/1/?file=1758879#file1758879line244> > > > > We normally don't do hanging indentation > > https://github.com/twitter/commons/blob/master/src/python/twitter/common/styleguide.md#best-practices Good to know. > On June 27, 2017, 9:10 p.m., Stephan Erb wrote: > > src/test/python/apache/thermos/monitoring/test_resource.py > > Lines 84-103 (patched) > > <https://reviews.apache.org/r/60376/diff/1/?file=1758880#file1758880line86> > > > > I am not a big fan of the mocking here, even though the sibling > > testcases are not much better :). Mocking results in high coupling, making > > the implementation harder to change in the future. > > > > Please give it a try if you can manage to inject a normal, > > pre-populated `ResourceHistory` into the constructor of the > > `TaskResourceMonitor` instead. > > > > (If you don't have any conclusive results after 10-15 minutes feel free > > to give up. There might be some further interdependency that I am not > > spotting right now) Addressed. Please let me know if this is what you had in mind. - Reza ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/60376/#review179043 ----------------------------------------------------------- On June 28, 2017, 7:01 p.m., Reza Motamedi wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/60376/ > ----------------------------------------------------------- > > (Updated June 28, 2017, 7:01 p.m.) > > > Review request for Aurora, David McLaughlin, Joshua Cohen, Jordan Ly, and > Santhosh Kumar Shanmugham. > > > Repository: aurora > > > Description > ------- > > # Observer task page to load consumption info from history > > Resource consumptions of Thermos Processes are periodically calculated by > TaskResourceMonitor threads (one thread per Thermos task). This information > is used to display a (semi) fresh state of the tasks running on a host in the > Observer host page, aka landing page. An aggregate history of the > consumptions is kept at the task level, although TaskResourceMonitor needs to > first collect the resource at the Process level and then aggregate them. > > On the other hand, when an Observer _task page_ is visited, the resources > consumption of Thermos Processes within that task are calculated again and > displayed without being aggregated. This can become very slow since time to > complete resource calculation is affected by the load on the host. > > By applying this patch we take advantage of the periodic work and fulfill > information resource requested in Observer task page from already collected > resource consumptions. > > > Diffs > ----- > > src/main/python/apache/thermos/monitoring/resource.py > 434666696e600a0e6c19edd986c86575539976f2 > src/test/python/apache/aurora/executor/common/test_resource_manager.py > a898e4d81d34d1e30e39db1be1a66bc9e0ab1a35 > src/test/python/apache/thermos/monitoring/test_resource.py > d794a998f1d9fc52ba260cd31ac444aee7f8ed28 > > > Diff: https://reviews.apache.org/r/60376/diff/2/ > > > Testing > ------- > > I stress tested this patch on a host that had a slow Observer page. > Interestingly, I did not need to do much to make the Observer slow. There are > a few points to be made clear first. > - We at Twitter limit the resources allocated to the Observer using > `systemd`. The observer is allowed to use only 20% of a CPU core. The > attached screen shots are from such a setup. > - Having assigned 20% of a cpu core to Observer, starting only 8 `task`s, > each with 3 `process`es is enough to make the Observer slow; 11secs to load > `task page`. > > > File Attachments > ---------------- > > without the patch -- Screen Shot 2017-06-22 at 1.11.12 PM.png > > https://reviews.apache.org/media/uploaded/files/2017/06/22/03968028-a2f5-4a99-ba57-b7a41c471436__without_the_patch_--_Screen_Shot_2017-06-22_at_1.11.12_PM.png > with the patch -- Screen Shot 2017-06-22 at 1.07.41 PM.png > > https://reviews.apache.org/media/uploaded/files/2017/06/22/5962c018-27d3-4463-a277-f6ad48b7f2d7__with_the_patch_--_Screen_Shot_2017-06-22_at_1.07.41_PM.png > > > Thanks, > > Reza Motamedi > >
