> On Jan. 9, 2014, 4:02 p.m., Benjamin Hindman wrote:
> > How do you intend for these values to get used? How do they compare with 
> > the names for things exposed on the slaves for statistics?
> 
> Niklas Nielsen wrote:
>     For diagnostics mostly. Even though per-framework statistics are already 
> gathered and exposed, node-wide metrics would be another good indicator of 
> performance issues.
>     
>     The current statistics are mostly internal event counters (XXXX_tasks, 
> XXXX_status_updates, ...) and aggregate resources on masters (mem_total, 
> men_used, ...)
>     This patch and https://reviews.apache.org/r/16631/ adds system metrics 
> (and follow the same naming scheme with "system_XXXX"). I do agree that this 
> is not ideal - I am open to ideas.
> 
> Ben Mahler wrote:
>     Would re-using ResourceStatistics make sense here? If we're going to 
> start exposing system metrics in the master/slave, we should at least make 
> sure the naming is consistent. (I think this is what benh was referring to: 
> how does this compare to the names in ResourceStatistics?).
>     
>     We should think about how we want to expose system metrics in a more 
> general sense (if at all), and how this will tie into third party monitoring 
> tools (e.g. MESOS-780).

Re-using ResourceStatistics by adding mem_free_bytes and average load and model 
that in a /stats.json field? I'd guess we could reuse men_limit_bytes for the 
total. Or just reusing the naming? This was the case where I thought it would 
collide with the resource aggregates on the master stats end-point. We could 
also be add a new endpoint which model ResourceStatistics.

MESOS-780 didn't seem to discuss the scheme for system metrics but concluded 
that we'd stick with a pull/endpoint model.
You are right, it could be a discussion whether to have system metrics exposed 
by Mesos at all - those metrics could be gather by full fledged monitoring 
daemons. However, it seems worthwhile if those metrics could help drive 
framework scheduling decisions.


- Niklas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16634/#review31492
-----------------------------------------------------------


On Jan. 13, 2014, 2:11 p.m., Niklas Nielsen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16634/
> -----------------------------------------------------------
> 
> (Updated Jan. 13, 2014, 2:11 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.
> 
> 
> Bugs: MESOS-581
>     https://issues.apache.org/jira/browse/MESOS-581
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This patch expose the number bytes of total and free memory in the
> master and slave stats endpoints.
> Similar to https://reviews.apache.org/r/16631/, the new fields are
> named system_mem_total and system_mem_free to disambiguate the
> aggregate mem_total, mem_used, ... in the master stats endpoint.
> Again, I am open to another naming scheme.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp d7cd89f 
>   src/slave/http.cpp 1358810 
> 
> Diff: https://reviews.apache.org/r/16634/diff/
> 
> 
> Testing
> -------
> 
> make check and functional testing of /stats.json endpoints.
> 
> 
> Thanks,
> 
> Niklas Nielsen
> 
>

Reply via email to