> On Jan. 9, 2014, 4:02 p.m., Benjamin Hindman wrote:
> > How do you intend for these values to get used? How do they compare with
> > the names for things exposed on the slaves for statistics?
>
> Niklas Nielsen wrote:
>     For diagnostics mostly. Even though per-framework statistics are already
>     gathered and exposed, node-wide metrics would be another good indicator of
>     performance issues.
>
>     The current statistics are mostly internal event counters (XXXX_tasks,
>     XXXX_status_updates, ...) and aggregate resources on masters (mem_total,
>     mem_used, ...).
>     This patch and https://reviews.apache.org/r/16631/ add system metrics
>     (and follow the same naming scheme with "system_XXXX"). I do agree that this
>     is not ideal - I am open to ideas.
>
> Ben Mahler wrote:
>     Would re-using ResourceStatistics make sense here? If we're going to
>     start exposing system metrics in the master/slave, we should at least make
>     sure the naming is consistent. (I think this is what benh was referring to:
>     how does this compare to the names in ResourceStatistics?).
>
>     We should think about how we want to expose system metrics in a more
>     general sense (if at all), and how this will tie into third party monitoring
>     tools (e.g. MESOS-780).

Do you mean re-using ResourceStatistics by adding mem_free_bytes and average load, and modeling that in a /stats.json field? I'd guess we could reuse mem_limit_bytes for the total. Or just re-using the naming? That is the case where I thought it would collide with the resource aggregates on the master stats endpoint. We could also add a new endpoint which models ResourceStatistics.

MESOS-780 didn't seem to discuss the scheme for system metrics but concluded that we'd stick with a pull/endpoint model.

You are right, it could be a discussion whether to have system metrics exposed by Mesos at all - those metrics could be gathered by full-fledged monitoring daemons. However, it seems worthwhile if those metrics could help drive framework scheduling decisions.


- Niklas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16634/#review31492
-----------------------------------------------------------


On Jan. 13, 2014, 2:11 p.m., Niklas Nielsen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16634/
> -----------------------------------------------------------
>
> (Updated Jan. 13, 2014, 2:11 p.m.)
>
>
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.
>
>
> Bugs: MESOS-581
>     https://issues.apache.org/jira/browse/MESOS-581
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> This patch exposes the number of bytes of total and free memory in the
> master and slave stats endpoints.
> Similar to https://reviews.apache.org/r/16631/, the new fields are
> named system_mem_total and system_mem_free to disambiguate them from the
> aggregate mem_total, mem_used, ... in the master stats endpoint.
> Again, I am open to another naming scheme.
>
>
> Diffs
> -----
>
>   src/master/http.cpp d7cd89f
>   src/slave/http.cpp 1358810
>
> Diff: https://reviews.apache.org/r/16634/diff/
>
>
> Testing
> -------
>
> make check and functional testing of /stats.json endpoints.
>
>
> Thanks,
>
> Niklas Nielsen
>
>
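
To make the naming question in this thread concrete, here is a minimal sketch of why the "system_" prefix avoids a collision in a flat stats object. It is plain Python, not Mesos code: merge_stats and all sample values are hypothetical, and only the field names mem_total, mem_used, system_mem_total and system_mem_free come from the review request.

```python
import json


def merge_stats(aggregate, system, prefix="system_"):
    """Return a flat stats dict with node-wide metrics stored under a prefix.

    Hypothetical helper for illustration; it does not mirror the actual
    code in src/master/http.cpp or src/slave/http.cpp.
    """
    merged = dict(aggregate)
    for name, value in system.items():
        key = prefix + name
        if key in merged:
            # A prefixed name that is already taken would silently shadow
            # an existing stat, so treat it as an error.
            raise ValueError("stat name collision: " + key)
        merged[key] = value
    return merged


# Made-up sample values; units are not meant to match the real endpoints.
aggregate = {"mem_total": 4096, "mem_used": 1024}
system = {"mem_total": 8589934592, "mem_free": 2147483648}

stats = merge_stats(aggregate, system)
print(json.dumps(stats, sort_keys=True, indent=2))
```

Without the prefix, the node-wide mem_total would overwrite the aggregate mem_total in the same object, which is the collision mentioned above. A separate endpoint modeled on ResourceStatistics would sidestep the problem without needing a prefix.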
