It really depends on the monitoring solution. Usually these metrics are
exported and you can just predicate on them, in whatever language the tool
provides. In my case, I'm using a hosted solution (SignalFx) that gives you
a DaemonSet and sends the metrics to them. You can then predicate on them.
We have alerts
David,
What we do is export the Kubernetes cluster events to Cloud Pub/Sub using
Stackdriver export, and then we have SumoLogic set up to ingest logs from
Pub/Sub. Then we use SumoLogic's scheduled search capability to send alerts
based on certain events.
Punit Agrawal
Site Reliability
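If you're consuming that Pub/Sub stream yourself instead of (or alongside) SumoLogic, the filtering step is simple. A minimal sketch, assuming the exported LogEntry wraps the Kubernetes event in a `jsonPayload` with a `reason` field (which is the shape GKE's event export uses; verify against your own sink's output). The set of alert-worthy reasons here is an illustrative assumption, not an exhaustive list:

```python
# Event reasons worth alerting on; adjust to taste. These are standard
# Kubernetes event reasons, but check what actually shows up in your export.
ALERT_REASONS = {"Killing", "OOMKilling", "Failed", "BackOff", "FailedScheduling"}

def is_alertable(log_entry: dict) -> bool:
    """Return True if an exported cluster-event LogEntry looks alert-worthy.

    Assumes the Stackdriver export puts the Kubernetes event reason at
    jsonPayload.reason; adapt the path if your sink's entries differ.
    """
    payload = log_entry.get("jsonPayload", {})
    return payload.get("reason") in ALERT_REASONS

if __name__ == "__main__":
    sample = {"jsonPayload": {"reason": "BackOff",
                              "message": "Back-off restarting failed container"}}
    print(is_alertable(sample))
```

You'd call this from whatever subscriber pulls the Pub/Sub messages, after JSON-decoding each message body.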
David,
In Datadog events you can see the killed pods.
But if you have containers that need to be killed because they don't die
when receiving a stop signal, you'll see a lot of events like KILLED and
DESTROYED. This is not necessarily an error; it could just be a container
being restarted, so keep that in mind.
Most of what you're asking for is available via the k8s API, if you watch
it.
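For the watch approach, a minimal sketch using the official Python client (`pip install kubernetes`). The reason set and the repeat threshold are assumptions to tune for your workloads; as the Datadog reply above notes, a single kill event is often just a routine restart, so this only flags repeated occurrences:

```python
# Standard Kubernetes event reasons that usually indicate trouble;
# adjust for your environment.
INTERESTING = {"Killing", "OOMKilling", "BackOff", "Evicted"}

def should_alert(reason: str, count: int, threshold: int = 3) -> bool:
    """Alert on interesting reasons, but only after `threshold` repeats,
    so a one-off restart doesn't page anyone."""
    return reason in INTERESTING and count >= threshold

def watch_events():  # needs a live cluster and a kubeconfig
    from kubernetes import client, config, watch
    config.load_kube_config()
    v1 = client.CoreV1Api()
    # stream() yields dicts with the change type and the Event object
    for ev in watch.Watch().stream(v1.list_event_for_all_namespaces):
        obj = ev["object"]
        if should_alert(obj.reason, obj.count or 1):
            print(f"ALERT {obj.involved_object.kind}/{obj.involved_object.name}: "
                  f"{obj.reason} x{obj.count}: {obj.message}")

if __name__ == "__main__":
    watch_events()
```

In production you'd also want to handle watch reconnects (the stream will drop periodically) and resume from the last seen resourceVersion.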
On Wed, Aug 8, 2018 at 12:58 PM David Rosenstrauch wrote:
> As we're getting ready to go to production with our k8s-based system,
> we're trying to pin down exactly how we're going to do all the needed
>
Thanks for the response, Marcio. We've actually recently started using
Datadog already (at least in dev/qa). But DD is a bit of a sea of
metrics, and I'm not clear how we would accomplish one of the specific
tasks I've mentioned - for example, alerting when k8s has killed a
container or
Hi David,
You can use DataDog to achieve this.
On 8/8/18, David Rosenstrauch wrote:
> As we're getting ready to go to production with our k8s-based system,
> we're trying to pin down exactly how we're going to do all the needed
> monitoring/alerting for it. We can easily collect many of
As we're getting ready to go to production with our k8s-based system,
we're trying to pin down exactly how we're going to do all the needed
monitoring/alerting for it. We can easily collect many of the metrics
we need (using kube-state-metrics to feed into prometheus, and/or
Datadog) and
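Since the kube-state-metrics + Prometheus route is already in place, that pipeline can drive exactly this kind of alert. A hedged sketch of two alerting rules; the metric names are real kube-state-metrics series, but check the version you're running, since names and labels have shifted over releases, and tune the thresholds:

```yaml
# Example Prometheus alerting rules built on kube-state-metrics series.
groups:
- name: kubernetes-container-health
  rules:
  - alert: ContainerOOMKilled
    # This series is 1 while the container's termination reason matches
    # the reason label.
    expr: kube_pod_container_status_terminated_reason{reason="OOMKilled"} == 1
    for: 1m
    annotations:
      summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} was OOM-killed"
  - alert: ContainerRestartingOften
    # More than 3 restarts in 15 minutes suggests a crash loop rather
    # than a routine redeploy.
    expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
    annotations:
      summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} is restarting repeatedly"
```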