Hi Brian,

Thanks a lot for your reply.

I re-read my original mail and I recognize I should probably have given 
less information and gone straight to the point. That probably created a 
bit of confusion. E.g. I never intended the up metric, or any other metric, 
to be considered a boolean. My bad. I'll try to get straight to the point 
this time.

> This is *not* boolean.  Rather, it takes the vector of timeseries "foo" 
> and matches them up with the vector of timeseries "bar".  All those 
> elements of foo which have exactly matching label sets with bar, are 
> passed through unchanged.  Anything else is dropped.

Right, and my question is the following, mostly to understand the 
underlying behaviour, not because I have any particular problem to resolve.
Assuming the second metric goes missing, how exactly is the binary expression 
evaluated? In the "normal" case, i.e. "foo and bar", we would get no 
results, but in the case of "absent(foo) and bar", from my tests, it 
seems to me the "bar" filtering is simply ignored.
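
Concretely, taking the rule from my first mail (with the "== 1" dropped, as 
you suggested), these are the three pieces I would compare in the expression 
browser at the instant the node was gone. This is only a sketch of the check: 
each expression is meant to be evaluated separately, and the comments describe 
what I would expect to see, not verified output.

```
# Left operand on its own: should return one element with value 1
# while our_metric is missing.
absent(our_metric{environment="pro",service="bar",stack="foo"})

# Right operand on its own: should return nothing if the target
# has disappeared from the scrape job.
up{service="bar",source="app"}

# Whole expression: if "and" really filters on the right operand, this
# should return nothing whenever the second query returns nothing.
absent(our_metric{environment="pro",service="bar",stack="foo"})
  and on(stack, environment)
up{service="bar",source="app"}
```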

My guess is that this happens because "absent" is not really a metric per se, 
and thus we are comparing two empty sets of labels, effectively reducing 
"absent(foo) and bar" to "absent(foo)".
That would sort of make sense, right?
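
To check the "empty sets of labels" part of this guess, one could compare the 
label sets absent() returns with and without matchers. A minimal sketch; the 
results in the comments are what I understand from the PromQL documentation 
on absent(), not something I have confirmed on our setup:

```
# No equality matchers to derive labels from:
# documented result is {} with value 1.
absent(our_metric)

# With equality matchers, the documented result carries them as labels:
# {environment="pro", service="bar", stack="foo"} with value 1.
absent(our_metric{environment="pro",service="bar",stack="foo"})
```

If the second form does carry labels, then the operands of "and" are not both 
empty label sets, and my guess would not explain what I saw.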

Cheers,
F.

On Thursday, 3 March 2022 at 17:01:29 UTC+1 Brian Candler wrote:

> You can use the PromQL browser in the prometheus web UI to debug this, 
> since you can view the value of an expression at any previous point in time.
>
> Try the two halves separately:
>
> absent(our_metric{environment="pro",service="bar",stack="foo"}) 
>
> up{service="bar",source="app"} == 1
>
> Then try the whole expression at that point in time.  Either view the 
> graph, or view the instant query and set the instant time to when there was 
> a problem.
>
> > As the node went missing the second operand of the binary operator could 
> > not be evaluated, simply because it was neither `1`, nor `0`
>
> The expression:
>     up{service="bar",source="app"} == 1
> can only ever have the value 1 or be missing.  metric == constant is a 
> filter, not a boolean.  The value it returns is the value of the LHS, or no 
> value if the filter condition is not met.
>
> Possibly you want to remove the "== 1" entirely:
>
> absent(our_metric{environment="pro",service="bar",stack="foo"}) and 
> on(stack, environment) up{service="bar",source="app"}
>
> "and" expressions behave in a corresponding way:
>
>     foo and bar
>
> This is *not* boolean.  Rather, it takes the vector of timeseries "foo" 
> and matches them up with the vector of timeseries "bar".  All those 
> elements of foo which have exactly matching label sets with bar, are passed 
> through unchanged.  Anything else is dropped.
>
> So it's just a filter: "give me all values of foo, where there is also a 
> value present for bar".  It does not have true/false values either as its 
> input or its output.
>
> > Or, in other words, the following was holding true:
> > 
> > absent(up{service="bar",source="app"}) = 1
>
> How do you know?  The "up" metric is always present for a target, whether 
> or not scraping is successful: it would only not be present if you removed 
> the target from the scrape job.  This could be the case if you are using 
> some dynamic service discovery, and the service went away.  But then your 
> real problem is how to stop services vanishing from service discovery.
>
> Anyway, you can tell for sure by looking at historical values of these 
> queries:
>
> up{service="bar",source="app"}
> absent(up{service="bar",source="app"})
>
>
> On Thursday, 3 March 2022 at 11:12:11 UTC Federico Buti wrote:
>
>> Hi list,
>>
>> For a monitored system we setup a rule as follows:
>>
>> absent(our_metric{environment="pro",service="bar",stack="foo"}) and 
>> on(stack, environment) up{service="bar",source="app"} == 1
>>
>> This is one of the few absence rules we have in our ruleset. This is also 
>> a bit special because the exporter uses the absence of the metric to 
>> indicate a problem - something that is discouraged from guidelines. But 
>> that goes beyond my question anyway.
>>
>> Using a binary AND operator seems to work fine, cutting out the cases in 
>> which the node is not scrapable. However this morning the node went 
>> missing. We had probably a misconfiguration in our provisioning which we 
>> are currently investigating.
>>
>> As the node went missing the second operand of the binary operator could 
>> not be evaluated, simply because it was neither `1`, nor `0`. Or, in other 
>> words, the following was holding true:
>>
>> absent(up{service="bar",source="app"}) = 1
>>
>> I understand an alert can resolve if the related metric goes stale but 
>> I'm not sure how the logic should translate in this case. On the surface I 
>> would not expect the AND expression to fire as we are not able to say the 
>> "up" metric is really 1.
>>
>> But maybe I'm missing the point here?
>>
>> Thanks in advance,
>> F.
>>
>
