On Mon, May 16, 2016 at 2:49 PM, Tim Van Steenburgh
<tim.van.steenbu...@canonical.com> wrote:
> Right, but NRPE can be related to any charm too. My point was just that the
> charm doesn't need to explicitly support monitoring.

It totally does, IMO.

Process count, disk and memory usage are all important, and should be
available out of the box.

But alerts (driven by monitoring) are all about specific context.

When I'm alerted, I want as specific info as possible as to what is
wrong and hints as to why.  Generic machine monitoring provides little
context, and if that's all you have, it increases your MTTR while you
go fishing for the cause.

I want detailed, application specific, early alerts that can only be
written by those with application knowledge. These belong in the
charm, and need to be written/maintained by the charm experts.

I've been banging on about this idea for a while, but in my head, it
makes sense to promote the idea of app-specific health checks (a la
snappy) into juju proper, rather than a userspace solution with
layers. Then, you *don't* need specific relation support in your charm
- you just need to write a generic set of health checks/scripts.
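
To make that a bit more concrete, here is a rough sketch of what a
"generic set of health checks" could look like from the charm author's
side. Everything in it is hypothetical: CheckResult, check_queue_depth
and run_checks are names made up for illustration, and none of this is
an existing juju or snappy API. The point is only that the charm ships
plain check functions and knows nothing about any monitoring tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical status values; not taken from any juju API.
OK, WARNING, CRITICAL = "ok", "warning", "critical"


@dataclass
class CheckResult:
    name: str
    status: str
    message: str


def check_queue_depth(depth: int, warn: int = 100, crit: int = 1000) -> CheckResult:
    """Example app-specific check; the thresholds are illustrative only."""
    if depth >= crit:
        return CheckResult("queue", CRITICAL, f"queue depth {depth} >= {crit}")
    if depth >= warn:
        return CheckResult("queue", WARNING, f"queue depth {depth} >= {warn}")
    return CheckResult("queue", OK, f"queue depth {depth}")


def run_checks(checks: Dict[str, Callable[[], CheckResult]]) -> List[CheckResult]:
    """Run every registered check; a check that raises is reported as critical."""
    results = []
    for name, check in checks.items():
        try:
            results.append(check())
        except Exception as exc:
            results.append(CheckResult(name, CRITICAL, f"check raised: {exc}"))
    return results
```

The same run_checks() call could then sit behind an action, juju
status, or whatever else wants to consume the results.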

Then, these checks are available to run as an action (we do this
pre/post each deploy), or show via juju status, or via the GUI[1]. A
monitoring service can just relate to the charm with the default
relation[2], and get a rich app specific set of checks that it can
convert to its own format and process. No need for relations for each
specific monitoring tool you wish to support. That makes monitoring a
first-class juju citizen.
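
As a sketch of the "convert to its own format" step, this is roughly
what a nagios-flavoured monitoring service might do with generic check
results. The input shape (a dict with name, status and message keys) is
an assumption of mine, not an existing juju format; the exit codes
0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN are the standard nagios plugin
return codes.

```python
from typing import Dict, Tuple

# Standard nagios plugin return codes.
NAGIOS_LABELS = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# Mapping from the hypothetical generic statuses to nagios return codes.
STATUS_TO_CODE = {"ok": 0, "warning": 1, "critical": 2}


def to_nagios(check: Dict[str, str]) -> Tuple[int, str]:
    """Convert one generic check result into (exit code, plugin output line)."""
    # Any status we do not recognise is reported as UNKNOWN (3).
    code = STATUS_TO_CODE.get(check.get("status", ""), 3)
    line = f"{NAGIOS_LABELS[code]} - {check['name']}: {check.get('message', '')}"
    return code, line
```

A different monitoring service would swap in its own converter, and the
charm itself would not need to change.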

Juju could totally own this space, and it's a compelling one.
Monitoring is a mess, and needs integrating with everything all the
time. If we do 80% of that integration for our users, I think that
would play very well with operations folks. And I don't think the
tools in the DISCO[3] orchestration space can do this as effectively -
by design, they do not have a central place to consolidate this kind
of integration.


[1] Want a demo that will wow a devops crowd, IMO? Deploy a full demo
system, with monitoring exposed in GUI out of the box. I've said it
before (and been laughed at :), but the GUI could be an amazing
monitoring tool. We might even use it in production ;P

[2] or even more magically, just deploy a monitoring service like
nagios unrelated in the environment, and have it speak with the
model's controller to fetch checks from all machines. Implicit
relations to all, which for monitoring is maybe what you want?

[3] Docker Inspired Slice of COmputer,
https://plus.google.com/+MarkShuttleworthCanonical/posts/W6LScydwS89


-- 
Simon

-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju
