I am trying to do something very similar, but I can't tell whether it is the same problem. I have a metric that reports up/down status in its value field: 1 for up, 0 for down.
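
To make the question concrete, here is a toy Python model of what I mean (not PromQL, and not something Prometheus runs). It takes a hypothetical list of `(timestamp, value)` samples for such a metric and returns the seconds elapsed since the latest 0-to-1 change. The function name `seconds_since_last_rise` and the sample data are invented for illustration.

```python
from typing import Optional, Sequence, Tuple

# (unix timestamp in seconds, value 0 or 1); hypothetical sample shape
Sample = Tuple[float, int]


def seconds_since_last_rise(samples: Sequence[Sample], now: float) -> Optional[float]:
    """Return seconds elapsed since the most recent 0 -> 1 transition.

    Returns None if the series is currently down, or if no 0 -> 1
    transition is visible in the given samples.
    """
    if not samples or samples[-1][1] != 1:
        return None  # currently down (or no data): no uptime to report
    # Walk backwards over consecutive pairs to find the latest rise.
    for i in range(len(samples) - 1, 0, -1):
        prev_value = samples[i - 1][1]
        cur_ts, cur_value = samples[i]
        if prev_value == 0 and cur_value == 1:
            return now - cur_ts
    return None  # up for the whole window; the start of uptime is unknown


# Made-up samples, one every 60 seconds: up, down, down, up, up, up.
samples = [(0, 1), (60, 0), (120, 0), (180, 1), (240, 1), (300, 1)]
print(seconds_since_last_rise(samples, now=330))  # 150
```

The `None` cases matter for my use: if the metric is down right now there is no uptime to show, and if it has been up for the whole window the true start is outside the data.
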
I would like to find out when the value last went from 0 to 1, and how long ago that was. The time from that change to the current timestamp should give me the uptime for that metric. I am open to ideas, as I can't seem to get this working. Eventually I would like to present this in Grafana so I can show the uptime of that metric.

On Friday, 3 April 2020 at 10:01:52 UTC+1 [email protected] wrote:

> ANSWERED!
> From Stack Overflow:
>
> Summing up our discussion: the evaluation interval is too big; after 5
> minutes, a metric becomes [stale][1]. This means that when the expression
> is evaluated, the right hand side of your `OR` expression is no longer
> considered by Prometheus and thus is always empty.
>
> Your second issue is that your record rule is adding some labels to the
> original metric and you get some complaint by Prometheus. This is not
> because the labels already exist: in [recording rules][3], labels
> overwrite the existing labels.
>
> The issue is your `OR` expression: it should specify an `ignoring()`
> [matching clause][2] for ignoring the added labels or you will get the
> labels from both sides of the `OR` expression:
>
> > `vector1 or vector2` results in a vector that contains all original
> > elements (label sets + values) of vector1 and additionally all elements of
> > vector2 ***which do not have matching label sets in vector1***.
>
> Since you get both sides of the `OR`, when Prometheus tries to add the
> labels to the left hand side, it conflicts with the right hand side which
> already exists.
>
> Your expression should be something like:
> ```yaml
> expr: |
>   timestamp(changes(metric-name[450s]) > 0)
>   or ignoring(stat,monitor)
>   last-update
> ```
> Or use an `ON(label1,label2,...)` clause on a discriminating label set
> which avoids changing the expression whenever you change the labels.
>
> [1]: https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness
> [2]: https://prometheus.io/docs/prometheus/latest/querying/operators/#one-to-one-vector-matches
> [3]: https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#rule
>
> On Wednesday, April 1, 2020 at 5:41:19 AM UTC-4, Weston Greene wrote:
>>
>> In the Stack Overflow post about this same topic, I was encouraged to
>> reduce my evaluation interval since `last-update` was likely going stale
>> by the default TTL (Time To Live) of 5 minutes.
>>
>> Now I can't get past the `vector contains metrics with the same
>> labelset after applying rule labels` error.
>>
>> I do add labels in the recording rule:
>> ```
>> stat: true
>> monitor: false
>> ```
>>
>> I believe this is because `last-update` already has all the labels that
>> `metric-name` has plus the labels that the recording rule adds, so when the
>> `or` is triggered `last-update` conflicts since it already has the labels.
>>
>> How do I get around this? Thank you again for your creativity!
>>
>> On Monday, March 30, 2020 at 10:23:20 AM UTC-4, Weston Greene wrote:
>>>
>>> This was already partially answered in
>>> https://stackoverflow.com/questions/54148451
>>>
>>> But not sufficiently, so I'm asking here and on Stack Overflow:
>>> https://stackoverflow.com/questions/60928468
>>>
>>> Here is the image of the graph:
>>>
>>> [image: Screen Shot 2020-03-30 at 06.18.07.png]
>>>
>>> On Monday, March 30, 2020 at 10:21:01 AM UTC-4, Weston Greene wrote:
>>>>
>>>> I have the Recording rule pattern:
>>>> ```yaml
>>>> - record: last-update
>>>>   expr: |
>>>>     timestamp(changes(metric-name[450s]) > 0)
>>>>     or
>>>>     last-update
>>>> ```
>>>>
>>>> However, that doesn't work. The `or last-update` part doesn't return a
>>>> value.
>>>>
>>>> I have tried using an offset,
>>>> ` or (last-update offset 450s)`,
>>>> to no avail.
>>>>
>>>> My evaluation interval is 5 minutes (how often Prometheus runs my
>>>> recording rules). I tried the 7.5 minute offset because I theorized
>>>> that the OR was attempting to write last-update as last-update, but
>>>> last-update was null in that second; if the OR were to attempt writing
>>>> last-update as the value it had during its previous evaluation, then it
>>>> should find a value in last-update, but that returned no value as well.
>>>>
>>>> This is what the metric looks like graphed:
>>>>
>>>> [choppy rather than a complete staircase][1] (I don't have enough
>>>> reputation to post pictures...)
>>>>
>>>> Thank you in advance for your help.
>>>>
>>>> Why I care:
>>>> If a time series plateaus for an extended period of time, then I want to
>>>> know, as that may mean it has begun to fail to return accurate data.
>>>>
>>>> [1]: I think the image link is preventing me from posting

