To add a little more information about ambiguous/inconsistent matchers, the 
current version of Alertmanager allows the following:

{foo=}} equivalent to {foo="}"}, but I think this should be an error
{{foo=} is an error because of two {{, unlike the above
{foo=~} could be either {foo=~""} or {foo="~"}, it's interpreted in current 
versions as {foo=~""}
{foo=,} is equivalent to {foo=""}, but I think should be an error as a 
comma with no value has a high likelihood of being human error
{foo=,,} is an error, unlike {foo=,} or {foo=}}
{foo= } is equivalent to {foo=""}
{foo= b} and {foo=b } are equivalent to {foo="b"}, but {foo=b b} is 
equivalent to {foo="b b"}

The changes proposed here would also restrict all of the examples above as 
1. trailing commas are rejected and 2. unquoted text must match the regular 
expression [a-zA-Z_:][a-zA-Z0-9_:]*.


On Tuesday, May 9, 2023 at 10:05:49 AM UTC+1 George Robinson wrote:

> Hi, all! 👋
>
> I've opened a pull request for the Alertmanager here 
> <https://github.com/prometheus/alertmanager/pull/3353> that adds support 
> for parsing UTF-8 matchers while also adding restrictions on 
> ambiguous matchers that are permitted in the current version of 
> Alertmanager.
>
> https://github.com/prometheus/alertmanager/pull/3353
>
> *Background*
>
> As explained in #3319 
> <https://github.com/prometheus/alertmanager/issues/3319>, Alertmanager 
> constrains label names to the characters ^[a-zA-Z_][a-zA-Z0-9_]*$ - the 
> same as Prometheus.
>
> However, because Alertmanager is such a well known and popular open source 
> project it makes sense to use it for other kinds of alert generators too, 
> and not just Prometheus. An example of this is Grafana, where Alertmanager 
> is used to manage alerts created from both Prometheus and non-Prometheus 
> like datasources, including Graphite, InfluxDB, SQL, Loki and more. The 
> issue with this though is that these other datasources do not share the 
> same constraints as Prometheus when it comes to label names, and so using 
> them with Alertmanager requires some kind of normalization to ensure label 
> matchers match the set of allowed characters before the alert can be sent 
> to the Alertmanager.
>
> *What this pull request does*
>
> @yuri-tceretian <https://github.com/yuri-tceretian> has proposed adding 
> support for UTF-8 characters in #3321 
> <https://github.com/prometheus/alertmanager/pull/3321> that updates the 
> existing regular expression to support double quoted UTF-8 sequences as 
> label names, while keeping unquoted label names the same.
>
> In this pull request I wanted to propose a different option to where we 
> would add a "simple" LL(1) parser to Alertmanager to parse matchers instead.
>
> *Motivation*
>
> The original motivation for writing this parser was to add support for 
> matching label names containing . and spaces to grafana/grafana 
> <https://github.com/grafana/grafana>. However, about the same time I 
> learned that Prometheus maintainers agreed to add support for UTF-8 
> labels in Alertmanager 
> <https://docs.google.com/document/d/11LC3wJcVk00l8w5P3oLQ-m3Y37iom6INAMEu2ZAGIIE/edit#bookmark=id.wm24eue9w2eg>,
>  
> and so I decided to further the work to see if it could be upstreamed to 
> Alertmanager instead.
>
> The original source code can be found at grobinson-grafana/matchers 
> <https://github.com/grobinson-grafana/matchers>.
>
> *Breaking changes*
>
> *Expressions must start and end with open and closing braces*
>
> All expressions must start and end with { and }, although this can be 
> relaxed if required. For example foo=bar is not valid, it must be 
> {foo=bar}.
>
> *Trailing commas are not permitted*
>
> Trailing commas are not permitted. For example {foo=bar,} is not valid, 
> it must be {foo=bar}.
>
> *All non [a-zA-Z_:][a-zA-Z0-9_:]* values must be double quoted*
>
> The set of unquoted characters is now the same on both sides of the 
> expression. In other words, both label names and label values without 
> double quotes must match the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*. 
> For example {foo=!bar} is not valid, it must be {foo="!bar"}. In current 
> versions of Alertmanager, unquoted label values can contain all UTF-8 code 
> points with the exception of comma, such as {foo=!bar}.
>
> There are two reasons for this:
>
>    1. 
>    
>    It's no longer possible to write ambiguous matchers which I feel is 
>    something Alertmanager should fix. For example is {foo=~} equivalent 
>    to {foo="~"} or {foo=~""}?
>    2. 
>    
>    If we restrict the =, !, ~ characters to double quotes we can keep the 
>    grammar LL(1). Without this restriction lookahead/backtrack is required to 
>    parse matchers such as {foo==~!=!~bar} which are valid in current 
>    versions of Alertmanager.
>    
> *Errors*
>
> One of the goals with this LL(1) parser is to provide better error 
> messages than what is possible using just a regular expression. This work 
> is still in progress as error messages are not as useful as I would like 
> for EOF, but here is an example of what it looks like for non-EOF cases:
> {foo=bar 🙂} 9:13: 🙂: invalid input: expected comma or closing '}' {foo 
> with spaces=bar with spaces} 5:9: unexpected with: expected an operator 
> such as '=', '!=', '=~' or '!~'
> *Benchmarks*
>
> I've also provided a number of benchmarks of both the LL(1) parser and 
> regex parser that supports UTF-8. These can be found at 
> grobinson-grafana/matchers-benchmarks 
> <https://github.com/grobinson-grafana/matchers-benchmarks>. However, to 
> run them go.mod must be updated to use the branch 
> https://github.com/grafana/prometheus-alertmanager/tree/yuri-tceretian/utf-8-label-names
>  here.
> BenchmarkMatchersSimple, BenchmarkPrometheusSimple {foo="bar"} 
> BenchmarkMatchersComplex, BenchmarkPrometheusComplex {foo="bar",bar="foo 
> 🙂","baz"!=qux,qux!="baz 🙂"} BenchmarkMatchersRegexSimple, 
> BenchmarkPrometheusRegexSimple {foo=~"[a-zA-Z_:][a-zA-Z0-9_:]*"} 
> BenchmarkMatchersRegexComplex, BenchmarkPrometheusRegexComplex 
> {foo=~"[a-zA-Z_:][a-zA-Z0-9_:]*",bar=~"[a-zA-Z_:]","baz"!~"[a-zA-Z_:][a-zA-Z0-9_:]*",qux!~"[a-zA-Z_:]"}
>  
>
> go test -bench=. -benchmem goos: darwin goarch: arm64 pkg: 
> github.com/grobinson-grafana/matchers-benchmarks 
> BenchmarkMatchersRegexSimple-8 
> <http://github.com/grobinson-grafana/matchers-benchmarksBenchmarkMatchersRegexSimple-8>
>  
> 488295 2425 ns/op 3248 B/op 49 allocs/op BenchmarkMatchersRegexComplex-8 
> 138081 9074 ns/op 11448 B/op 169 allocs/op BenchmarkPrometheusRegexSimple-8 
> 329244 3496 ns/op 3531 B/op 58 allocs/op BenchmarkPrometheusRegexComplex-8 
> 95188 12554 ns/op 12619 B/op 204 allocs/op BenchmarkMatchersSimple-8 
> 2888340 414.9 ns/op 56 B/op 2 allocs/op BenchmarkMatchersComplex-8 741590 
> 1628 ns/op 248 B/op 7 allocs/op BenchmarkPrometheusSimple-8 1919209 613.9 
> ns/op 233 B/op 8 allocs/op BenchmarkPrometheusComplex-8 425430 2803 ns/op 
> 1015 B/op 31 allocs/op PASS ok 
> github.com/grobinson-grafana/matchers-benchmarks 11.766s
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/97791d92-3c5d-4d54-8b96-f08852839e28n%40googlegroups.com.

Reply via email to