Hi Steven,

Alex Rukletsov and I worked on this as a proof of concept in the mesos-master last week, providing the same kind of graphs you describe for the dashboard. We now have a good idea of how to implement this, and we can start a discussion on JIRA about how to proceed (I can create the ticket shortly). My first thought is that this should be pluggable, with something similar to "status update decorators". Alongside hanging key-value pairs off the status update, you can keep track of the lifetime and size of tasks and do the resource math.
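To make the "resource math" concrete, here is a rough sketch in Python, not the proof-of-concept code itself. It assumes each finished task has been reduced to a record holding its allocated CPUs, allocated memory, start and end timestamps, and one grouping label taken from the task metadata. The names `TaskRecord`, `group`, and `usage_by_group` are made up for illustration and are not Mesos APIs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Hypothetical per-task record; not a Mesos data structure."""
    task_id: str
    group: str       # assumed metadata label, e.g. a team or service name
    cpus: float      # allocated CPUs
    mem_mb: float    # allocated memory in MB
    started: float   # start time, seconds since the epoch
    finished: float  # end time, seconds since the epoch


def usage_by_group(records):
    """Sum CPU-hours and RAM GB-hours per group label."""
    totals = defaultdict(lambda: {"cpu_hours": 0.0, "mem_gb_hours": 0.0})
    for r in records:
        # Lifetime in hours; clamp to zero if the timestamps are out of order.
        hours = max(r.finished - r.started, 0.0) / 3600.0
        totals[r.group]["cpu_hours"] += r.cpus * hours
        totals[r.group]["mem_gb_hours"] += (r.mem_mb / 1024.0) * hours
    return dict(totals)


if __name__ == "__main__":
    sample = [
        TaskRecord("t1", "search", 2.0, 2048.0, 0.0, 3600.0),
        TaskRecord("t2", "search", 0.5, 1024.0, 0.0, 7200.0),
        TaskRecord("t3", "ads", 1.0, 512.0, 1800.0, 3600.0),
    ]
    for group, usage in sorted(usage_by_group(sample).items()):
        print(group, usage)
```

This bills on allocated resources over the task lifetime only. Measured utilization would need the periodic per-task samples instead of a single start/end pair.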
There are some interesting problems to solve when it comes to master fail-over, but let's try to enumerate those in the ticket.

Thanks,
Niklas

On Thu, Dec 18, 2014 at 11:56 AM, Steven Schlansker <[email protected]> wrote:
>
> I am running a corporate Mesos cluster, shared by a number of teams and
> projects.
> We are looking to get some insight into our usage of precious computing
> resources. For example, I'd like to be able to present a report breaking
> down CPU-hour and RAM GB-hour utilization by service, team, or other
> relevant grouping.
>
> How I'd imagine this works:
>
> * Collect Mesos statistics per task (allocated CPU, CPU utilization,
> allocated memory, memory utilization, disk utilization) periodically (say,
> once a minute)
> * Collect task metadata from a pluggable source (mapping from Mesos task
> to service name, team name, any other metadata you wish to use to group
> tasks)
> * Generate dashboard / reports by aggregating task data over axes provided
> by metadata input
>
> Has anyone started on such a project?
>
> Thanks,
> Steven
>

--
Niklas

