I imagine we will run into the same need. Our plan right now is to write a quick service that polls the API the dashboard uses to retrieve metric information for each slave, and then pipes that data directly to something like Graphite for logging. I haven’t looked too much into this yet, though.
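A minimal sketch of that polling idea, in Python. It assumes each slave serves per-executor statistics as JSON at `/monitor/statistics.json` (a list of entries with an `executor_id` and a `statistics` object), and that Graphite accepts its plaintext protocol (`path value timestamp`, one metric per line) on TCP port 2003. The endpoint path, the `mesos` prefix, and all function names here are assumptions for illustration, not something we have built.

```python
import json
import re
import socket
import time
import urllib.request


def sanitize(name):
    """Replace characters that would break a Graphite metric path."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", str(name))


def fetch_statistics(slave):
    """Fetch per-executor statistics from one slave ("host:port").

    The endpoint path is an assumption; adjust it to whatever the
    dashboard actually calls in your Mesos version.
    """
    url = "http://%s/monitor/statistics.json" % slave
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def to_graphite_lines(slave, executors, prefix="mesos"):
    """Turn one slave's statistics into Graphite plaintext lines."""
    lines = []
    for entry in executors:
        stats = entry.get("statistics", {})
        # Fall back to the local clock if the sample has no timestamp.
        timestamp = int(stats.get("timestamp", time.time()))
        executor = sanitize(entry.get("executor_id", "unknown"))
        for name, value in sorted(stats.items()):
            if name == "timestamp" or isinstance(value, bool):
                continue
            if not isinstance(value, (int, float)):
                continue  # skip nested or non-numeric fields
            lines.append("%s.%s.%s.%s %s %d" % (
                prefix, sanitize(slave), executor, name, value, timestamp))
    return lines


def send_to_graphite(lines, host, port=2003):
    """Send metric lines to Graphite over its plaintext TCP protocol."""
    if not lines:
        return
    payload = ("\n".join(lines) + "\n").encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


def poll_forever(slaves, graphite_host, interval_secs=60):
    """Poll every slave once per interval and forward the metrics."""
    while True:
        for slave in slaves:
            lines = to_graphite_lines(slave, fetch_statistics(slave))
            send_to_graphite(lines, graphite_host)
        time.sleep(interval_secs)
```

Calling `poll_forever(["slave1.example.com:5051"], "graphite.example.com")` would run the loop; both hostnames are placeholders. A real service would also need error handling for slaves that are unreachable.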
On Dec 18, 2014, at 2:05 PM, Niklas Nielsen <[email protected]> wrote:

Hi Steven,

Alex Rukletsov and I worked on this as a proof-of-concept piece in the mesos-master last week, providing the same kind of graphs as you describe in the dashboard. We have a good idea about how to implement this now, and we can start a discussion on JIRA on how to proceed (I can create the ticket shortly).

My first thought is that this should be pluggable, with something similar to "status update decorators". Alongside hanging key-value pairs of the status update, you can keep track of the lifetime and size of tasks and do the resource math.

There are some interesting problems to solve when it comes to master fail-over, but let's try to enumerate those in the ticket.

Thanks,
Niklas

On Thu, Dec 18, 2014 at 11:56 AM, Steven Schlansker <[email protected]> wrote:

I am running a corporate Mesos cluster, shared by a number of teams and projects. We are looking to get some insight into our usage of precious computing resources. For example, I'd like to be able to present a report breaking down CPU-hour and RAM GB-hour utilization by service, team, or other relevant grouping.

How I'd imagine this works:

* Collect Mesos statistics per task (allocated CPU, CPU utilization, allocated memory, memory utilization, disk utilization) periodically (say, once a minute)
* Collect task metadata from a pluggable source (mapping from Mesos task to service name, team name, any other metadata you wish to use to group tasks)
* Generate dashboard / reports by aggregating task data over axes provided by metadata input

Has anyone started on such a project?

Thanks,
Steven

--
Niklas
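The resource math described in the thread (CPU-hours and RAM GB-hours grouped by service or team) could look roughly like the Python sketch below. Everything in it is hypothetical: the sample fields (`task_id`, `cpus`, `mem_bytes`), the metadata mapping, and the function name are made up for illustration. It assumes one sample per task per collection interval, and counts 1 GB as 2^30 bytes.

```python
from collections import defaultdict


def aggregate_usage(samples, metadata, group_by, interval_secs=60):
    """Sum allocated CPU-hours and RAM GB-hours per metadata group.

    samples:  iterable of dicts with "task_id", "cpus" and "mem_bytes",
              one per task per collection interval (hypothetical shape).
    metadata: dict mapping task_id to a dict such as
              {"service": ..., "team": ...} from a pluggable source.
    group_by: metadata key to aggregate over, e.g. "team".
    """
    totals = defaultdict(lambda: {"cpu_hours": 0.0, "ram_gb_hours": 0.0})
    # Each sample is assumed to cover one full collection interval.
    hours = interval_secs / 3600.0
    for sample in samples:
        # Tasks with no metadata are reported under "unknown".
        group = metadata.get(sample["task_id"], {}).get(group_by, "unknown")
        totals[group]["cpu_hours"] += sample["cpus"] * hours
        totals[group]["ram_gb_hours"] += (
            sample["mem_bytes"] / float(1024 ** 3) * hours)
    return dict(totals)
```

For example, a task allocated 2 CPUs and 1 GB of memory that is sampled once a minute for an hour contributes about 2 CPU-hours and 1 GB-hour to its group. The same function could be fed utilization figures instead of allocations to report actual usage.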

