> On May 26, 2017, 4:56 p.m., Jiang Yan Xu wrote:
> > For this and the next review, could you summarize how these metrics can be 
> > used to reason about the allocator/sorter's performance?
> > 
> > I agree that conceptually we'd like something that tells us how well the 
> > (dirty or overall) sort performs but it's not immediately clear how to 
> > derive that from the provided metrics because `sort` on each sorter is 
> > called many times during one allocation, multiple times per agent. The 
> > three sorters are of the same implementation, how to interpret the metrics 
> > from each? The amount of work split between each sorter seems to be pretty 
> > dynamic?
> > 
> > Also given the frequency that the timer (relatively expensive) is invoked, 
> > how much overhead would it cost the sort()? This is probably worth 
> > measuring if we add these.

The latency of the role (and quota) level sorts indicates how much of the 
allocation cycle is spent sorting. This can be significant if:

i) A very large number of roles/quotas are being sorted; and
ii) Events that mark the sorter as `dirty` occur frequently, so that a 
significant number of `sort` calls actually re-sort. The number of sorts 
actually performed is reported by `allocator/mesos/roles/sort_runs` and 
`allocator/mesos/quotas/sort_runs`.

These stats are not by themselves indicative of an actual problem, but should 
help in diagnosing such conditions.
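
To make the distinction between `sort` calls and actual sort runs concrete, 
here is a rough sketch. It is a toy, not the DRFSorter code from this patch: 
`ToySorter`, `update`, `sortCalls`, `sortRuns` and `sortTime` are names made up 
for illustration, and plain counters stand in for the real metrics. The only 
assumption carried over is that a sort is skipped unless the sorter is `dirty`.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sorter; NOT the Mesos DRFSorter.
struct ToySorter {
  typedef std::pair<std::string, double> Client; // (name, share)

  std::vector<Client> clients;
  bool dirty = false;

  std::size_t sortCalls = 0;            // Every call to sort().
  std::size_t sortRuns = 0;             // Calls that re-sorted (cf. `sort_runs`).
  std::chrono::nanoseconds sortTime{0}; // Time spent re-sorting (cf. `sort_run`).

  // Any change to a client's share invalidates the cached order.
  void update(const std::string& name, double share) {
    dirty = true;
    for (Client& client : clients) {
      if (client.first == name) {
        client.second = share;
        return;
      }
    }
    clients.push_back(Client(name, share));
  }

  // Returns client names ordered by ascending share. The timer only runs
  // when the order is stale, so clean calls pay no timing overhead.
  std::vector<std::string> sort() {
    ++sortCalls;

    if (dirty) {
      const auto start = std::chrono::steady_clock::now();

      std::stable_sort(
          clients.begin(),
          clients.end(),
          [](const Client& a, const Client& b) { return a.second < b.second; });

      sortTime += std::chrono::duration_cast<std::chrono::nanoseconds>(
          std::chrono::steady_clock::now() - start);

      ++sortRuns;
      dirty = false;
    }

    std::vector<std::string> names;
    for (const Client& client : clients) {
      names.push_back(client.first);
    }
    return names;
  }
};
```

In this sketch, a large gap between `sortCalls` and `sortRuns` means most calls 
hit the cached order, while a high `sortRuns` together with a large `sortTime` 
points at the two conditions listed above.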


- Anindya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53841/#review176110
-----------------------------------------------------------


On May 24, 2017, 4:23 a.m., Anindya Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53841/
> -----------------------------------------------------------
> 
> (Updated May 24, 2017, 4:23 a.m.)
> 
> 
> Review request for mesos, James Peach and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6579
>     https://issues.apache.org/jira/browse/MESOS-6579
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The following metrics have been added:
> a) allocator/mesos/roles/sort_runs: Number of role level sorts.
> b) allocator/mesos/roles/sort_run: Latency in role level sorts.
> c) allocator/mesos/quotas/sort_runs: Number of quota level sorts.
> d) allocator/mesos/quotas/sort_run: Latency in quota level sorts.
> 
> 
> Diffs
> -----
> 
>   docs/monitoring.md cb2833642e7e41c03c98ea92f7300d156a216a2e 
>   src/master/allocator/mesos/hierarchical.hpp 123f97cf495bff0f822838e09df0d88818f04da6 
>   src/master/allocator/sorter/drf/metrics.hpp 61568cb520826ab59d675824b212e0d3deb63764 
>   src/master/allocator/sorter/drf/metrics.cpp ff63fbac5bbcf54e1ae39c3b650c0dafe7ea46d4 
>   src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
>   src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 
>   src/tests/hierarchical_allocator_tests.cpp f911110068a50c822aa90b864329ae87c9b5f8bb 
> 
> 
> Diff: https://reviews.apache.org/r/53841/diff/7/
> 
> 
> Testing
> -------
> 
> All tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>