It'd be nice to build the solution in a way that is independent of
Singularity - it seems like the vast majority of the data can be collected
from Mesos, with only a small piece that integrates with Singularity or
Marathon or Chronos or whatnot.  Then all the different framework users
could contribute to the shared accounting system.
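
As a rough sketch of that split (purely illustrative, not an existing tool):
the core below only knows about per-task samples and a small metadata
interface, and each framework would supply its own implementation of that
interface. Every name here (TaskSample, MetadataSource, aggregate_usage) is
hypothetical, and the sample fields are an assumed shape rather than
anything Mesos defines.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Protocol


@dataclass(frozen=True)
class TaskSample:
    """One periodic reading for a task (assumed shape, not a Mesos type)."""
    task_id: str
    cpus_allocated: float     # cores allocated to the task
    mem_allocated_gb: float   # memory allocated to the task, in GB
    interval_seconds: float   # how much wall-clock time this sample covers


class MetadataSource(Protocol):
    """The framework-specific piece: maps a task ID to grouping labels."""

    def labels_for(self, task_id: str) -> Dict[str, str]:
        ...


class StaticMetadataSource:
    """Stand-in for a Singularity/Marathon/Chronos adapter, backed by a dict."""

    def __init__(self, table: Dict[str, Dict[str, str]]) -> None:
        self._table = table

    def labels_for(self, task_id: str) -> Dict[str, str]:
        return self._table.get(task_id, {})


def aggregate_usage(
    samples: Iterable[TaskSample],
    metadata: MetadataSource,
    group_by: str,
) -> Dict[str, Dict[str, float]]:
    """Sum CPU-hours and RAM GB-hours per value of the `group_by` label."""
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"cpu_hours": 0.0, "ram_gb_hours": 0.0}
    )
    for sample in samples:
        # Tasks with no known label are grouped under "unknown".
        group = metadata.labels_for(sample.task_id).get(group_by, "unknown")
        hours = sample.interval_seconds / 3600.0
        totals[group]["cpu_hours"] += sample.cpus_allocated * hours
        totals[group]["ram_gb_hours"] += sample.mem_allocated_gb * hours
    return dict(totals)
```

Calling aggregate_usage(samples, source, "team") would give one row per
team; swapping in a different MetadataSource is the only framework-specific
work. This counts allocated resources only; utilization would need extra
fields on the sample.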


On Dec 18, 2014, at 12:48 PM, Thomas Petr <[email protected]> wrote:

> We (HubSpot) currently have a cron job that enumerates all tasks running on 
> the slaves and pushes resource usage data into OpenTSDB. We then use lead.js 
> to query / visualize this data. The cron job isn't open source, but I could 
> look into releasing it if anyone is interested. I've also thought about 
> adding this functionality into our Singularity framework, but if it was 
> directly supported by the Mesos master (pumping task resource usage into 
> graphite / OpenTSDB), that'd be pretty cool.
> 
> -Tom
> 
> On Thu, Dec 18, 2014 at 3:25 PM, Andrew Ortman 
> <[email protected]> wrote:
> I imagine we will also be running into the same need. Our plan right now was 
> to write a quick service that polls the API the dashboard uses to retrieve 
> metric information for each slave and then pipe that data directly to 
> something like graphite for logging. I haven’t looked too much into this yet 
> though.
> 
> 
>> On Dec 18, 2014, at 2:05 PM, Niklas Nielsen <[email protected]> wrote:
>> 
>> Hi Steven,
>> 
>> Alex Rukletsov and I worked on this as a proof-of-concept piece in the 
>> mesos-master last week, providing the same kind of graphs as you describe in 
>> the dashboard.
>> We have a good idea about how to implement this now and we can start a 
>> discussion on JIRA on how to proceed (I can create it shortly).
>> My first thought is that this should be pluggable, with something similar 
>> to "status update decorators".
>> Alongside hanging key-value pairs on the status update, you can keep track 
>> of the lifetime/size of tasks and do the resource math.
>> 
>> There are some interesting problems to solve when it gets to master 
>> fail-over, but let's try to enumerate those in the ticket.
>> 
>> Thanks,
>> Niklas
>> 
>> On Thu, Dec 18, 2014 at 11:56 AM, Steven Schlansker 
>> <[email protected]> wrote:
>> I am running a corporate Mesos cluster, shared by a number of teams and 
>> projects.
>> We are looking to get some insight into our usage of precious computing 
>> resources.  For example, I'd like to be able to present a report breaking 
>> down CPU-hour and RAM GB-hour utilization by service, team, or other 
>> relevant grouping.
>> 
>> How I'd imagine this works:
>> 
>> * Collect Mesos statistics per task (allocated CPU, CPU utilization, 
>> allocated memory, memory utilization, disk utilization) periodically (say, 
>> once a minute)
>> * Collect task metadata from a pluggable source (mapping from Mesos task to 
>> service name, team name, any other metadata you wish to use to group tasks)
>> * Generate dashboard / reports by aggregating task data over axes provided 
>> by metadata input
>> 
>> Has anyone started on such a project?
>> 
>> Thanks,
>> Steven
>> 
>> 
>> 
>> -- 
>> Niklas
> 
