We currently use it to help review PRs, so I took the approach of fewer 
insights but also fewer false positives, especially since (for example) 
label checks on alert rules are supposed to help new employees write 
correct rules without having to run them past other people.

The next biggish feature I plan is to turn pint into an exporter - run it 
as a sidecar alongside each Prometheus we run and report any missing series 
(used in alerts but not present in Prometheus) via metrics, so we can 
alert if (for example) we upgrade node-exporter to a version that renamed a 
bunch of metrics we rely on, and we stop getting some alerts (a pet peeve 
of mine).
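
As a rough illustration of the exporter idea (this is not pint's actual 
code; the metric name pint_missing_series and every function name below are 
hypothetical), the check boils down to a set difference between the metric 
names used by alerts and the names Prometheus currently knows, rendered in 
the Prometheus text exposition format:

```python
from typing import Dict, List, Set


def missing_series(used: Dict[str, Set[str]], known: Set[str]) -> Dict[str, List[str]]:
    """Map each alert name to the sorted metric names it uses that are absent from `known`."""
    result: Dict[str, List[str]] = {}
    for alert, metrics in used.items():
        absent = sorted(metrics - known)
        if absent:
            result[alert] = absent
    return result


def escape_label(value: str) -> str:
    """Escape backslash, double quote and newline for use in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(missing: Dict[str, List[str]]) -> str:
    """Render one gauge sample per (alert, missing metric) pair."""
    lines = [
        # Hypothetical metric name, chosen only for this sketch.
        "# HELP pint_missing_series Metric used by an alert rule but absent from Prometheus.",
        "# TYPE pint_missing_series gauge",
    ]
    for alert in sorted(missing):
        for metric in missing[alert]:
            lines.append(
                'pint_missing_series{alert="%s",metric="%s"} 1'
                % (escape_label(alert), escape_label(metric))
            )
    return "\n".join(lines) + "\n"
```

For example, if a hypothetical DiskFull alert still queries 
node_filesystem_free while Prometheus only knows node_filesystem_free_bytes, 
the output contains a pint_missing_series sample labelled with that alert 
and metric, which an ordinary alerting rule could then fire on.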

In general, Prometheus configuration and workflow are fairly lax - empty 
query results are either a bug or not, depending on deeper context, and so 
on. We want it to be more strict, so we have more confidence that it all 
works together. Pint aims to give us that, plus some other feedback, like 
raising an early warning when someone adds a recording rule that would 
generate a ton of new series, eating memory as a result.
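
To make the recording rule warning concrete, here is a minimal sketch (an 
assumption about how such a check could work, not a description of pint's 
internals; the function names and the limit are hypothetical). It wraps the 
rule expression in count() and reads the answer from the JSON returned by 
the Prometheus /api/v1/query endpoint:

```python
import json
from urllib.parse import urlencode


def count_query_url(base_url: str, expr: str) -> str:
    """Build an instant-query URL asking how many series `expr` returns."""
    query = urlencode({"query": "count(%s)" % expr})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def series_count(response_body: str) -> int:
    """Extract the series count from an instant-query JSON response body."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error", "unknown error"))
    result = payload["data"]["result"]
    if not result:
        # count() over an empty result yields no sample at all.
        return 0
    # Each sample value is a [timestamp, "string value"] pair.
    return int(float(result[0]["value"][1]))


def too_many_series(count: int, limit: int) -> bool:
    """Return True when a rule would create more series than the chosen limit."""
    return count > limit
```

Fetching the URL is left out so the sketch has no network dependency; any 
HTTP client can do the GET and pass the response body to series_count().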

On Monday, 26 April 2021 at 18:59:37 UTC+1 [email protected] wrote:

> Oh, interesting!
>
> I was always thinking of building something along those lines, but purely 
> live-linting rules loaded into a Prometheus server against the actual data 
> in that server (which you are also partially doing already).
>
> It was going to output warnings:
>
> - ...for any referenced metric name that isn't currently known to the 
> Prometheus server
> - ...for any label name on a metric name that isn't known
> - ...for any common query mistakes, like rate() on a gauge, deriv() on 
> counters, aggregating away the "le" label, etc.
>
> ...and potentially give an idea about which rules load how many time 
> series in their current state.
>
> Any of those could generate false positives, so it could output warnings 
> at max, but could still be very helpful.
>
> It seems like your tool already does most of that and more, but the common 
> query gotchas one might be useful at some point too :)
>
> On Mon, Apr 26, 2021 at 2:24 PM [email protected] <[email protected]> 
> wrote:
>
>> Hi,
>>
>> https://github.com/cloudflare/pint is a small tool we use at Cloudflare 
>> to try to better manage our ever growing collection of recording and 
>> alerting rules.
>> The main motivation for it was to help with pull requests that are adding 
>> or editing rule files where we often would need to check:
>> * how many time series would a new recording rule add
>> * how many times a new alert will trigger based on historical metrics
>> * are all time series used in a rule present in our Prometheus instances 
>> (we have a non-trivial topology)
>> And that's on top of simple conventions we have, for example each alert 
>> should have a set of well known labels and annotations, like severity or a 
>> link to a Grafana dashboard and a runbook. But even those conventions, 
>> while simple themselves, only apply to "production" alerts, rather than 
>> "test" alerts that are present in config, but not yet paging anyone.
>>
>> While the code is fairly fresh it's been used internally for a while with 
>> good results, so I hope this will be useful for others.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/895de3f8-35c0-4fee-9807-9225eb1aa330n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/prometheus-users/895de3f8-35c0-4fee-9807-9225eb1aa330n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
>
> -- 
> Julius Volz
> PromLabs - promlabs.com
>

