It's odd that the yellow graph appears to be smoothly increasing at 2
minute intervals, whilst the green one has a rate burst. A value of
200Gbps implies that the counter has gone up by 7.5TB between the start and
end of the 5 minute rate window. And interestingly, the absolute value of
the counter shown is 6TB (right-hand axis), which is in the same ballpark.
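As a quick sanity check on that arithmetic (nothing Prometheus-specific, just unit conversion):

```python
# 200 Gbps sustained over a 5-minute rate window, converted to bytes
rate_bps = 200e9                           # bits per second
window_s = 5 * 60                          # rate window in seconds
increase_bytes = rate_bps / 8 * window_s   # 7.5e12 bytes, i.e. 7.5 TB
```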
Is it possible that the counter value goes 6TB -> 0 -> 6TB very quickly,
with the points so closely spaced that the zero value isn't picked up in
the yellow graph? Or is Grafana ignoring zero/null values?
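If that's what's happening, the effect on the rate is easy to simulate. Here's a rough sketch with made-up sample values (a simplified version of the reset handling in Prometheus's rate()/increase() — the real functions also extrapolate to the window boundaries):

```python
def simple_increase(samples):
    """Sum of counter increases over a list of (timestamp, value) pairs,
    treating any drop in value as a counter reset to zero."""
    total = 0.0
    for (_, v0), (_, v1) in zip(samples, samples[1:]):
        # On a reset, the whole new value counts as increase
        total += v1 if v1 < v0 else v1 - v0
    return total

# Hypothetical 5-minute window: counter sits around 6TB, then dips to 0
# and jumps straight back between two closely spaced scrapes.
samples = [(0, 6.0e12), (120, 6.1e12), (180, 0.0), (181, 6.1e12), (300, 6.2e12)]
increase = simple_increase(samples)       # ~6.3e12 bytes
rate_gbps = increase * 8 / 300 / 1e9      # ~168 Gbps -- same ballpark as 200
```

A single dip to zero inside the window gets counted as a full reset, so the computed rate blows up even though the "real" traffic barely moved.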
You can check this with the PromQL browser in the Prometheus web
interface (normally on port 9090), or via the HTTP API, and extract the
raw data from the TSDB. Do an instant query for
wan_ifInOctets{site="foo"}[10m]
and you'll get all the raw data points over that period. Adjust the time
of the query so that it covers the period where things are strange, and
look for values which look out of place.
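For the API route, the same query goes to the /api/v1/query endpoint. A minimal sketch (host, port, and timestamp here are placeholders for your own setup):

```python
from urllib.parse import urlencode

base = "http://localhost:9090/api/v1/query"   # adjust host/port to your server
params = {
    "query": 'wan_ifInOctets{site="foo"}[10m]',
    "time": "2020-07-01T12:00:00Z",           # end of the window you care about
}
url = base + "?" + urlencode(params)
# Fetch this URL with curl or an HTTP client; the JSON response contains
# every raw (timestamp, value) pair in the 10-minute window.
```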
My other question is: how is "wan_ifInOctets" being collected in the first
place? The standard snmp_exporter if_mib module would give you "ifInOctets"
(plus labels for the interfaces). Have you got a recording rule or
something like that?
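For reference, a recording rule that would produce a metric name like that might look something like this — the rule name and label selector here are pure guesswork on my part:

```yaml
groups:
  - name: wan
    rules:
      - record: wan_ifInOctets
        expr: ifInOctets{ifAlias=~"WAN.*"}   # hypothetical selector
```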
The standard ifInOctets is a 32-bit counter, which wraps at ~4GB. I was
going to suggest you use ifHCInOctets (the 64-bit version), but then I see
your "wan_ifInOctets" already has a value of ~6TB, so it can't possibly be
32 bits. It would be good to understand where it comes from.
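The wrap maths is worth spelling out, since it's the whole reason ifHCInOctets exists — at modern link speeds a 32-bit octet counter wraps faster than typical scrape intervals (quick sketch):

```python
def wrap_interval_s(link_bps):
    """Seconds for a fully loaded link to wrap a 32-bit octet counter."""
    return 2**32 / (link_bps / 8)

one_gig = wrap_interval_s(1e9)       # ~34 s at 1 Gbps
two_hundred = wrap_interval_s(200e9) # ~0.17 s at 200 Gbps
```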
--
You received this message because you are subscribed to the Google Groups
"Prometheus Users" group.
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/06864658-06a5-429e-b6a7-195c4dafb6a9o%40googlegroups.com.