Hello,
it's only partly working, I think. If I add the same target several times to 
the same job, Prometheus treats targets with the exact same naming as one.
This results in a single target in Prometheus' web UI target list, and 
tcpdump confirms only one scrape per 60s:

      - targets:
        - pfsense.oberndorf.ca:443        # pfsense webui tcp tls test
        - pfsense.oberndorf.ca:443        # pfsense webui tcp tls test
        - pfsense.oberndorf.ca:443        # pfsense webui tcp tls test
        - pfsense.oberndorf.ca:443        # pfsense webui tcp tls test

If I use the following instead, I have 4 different namings for the same 
target, which results in 4 scrapes. However, this way at most 4 permutations 
are possible, I think, and with plain HTTP only 2:

        scheme: https
      - targets:
        - pfsense.oberndorf.ca:443        # pfsense webui tcp tls test
        - https://pfsense.oberndorf.ca        # pfsense webui tcp tls test
        - https://pfsense.oberndorf.ca:443        # pfsense webui tcp tls test
        - pfsense.oberndorf.ca        # pfsense webui tcp tls test


On top of that, they do not spread out as evenly as I hoped, and in addition 
I now have 4 different instances.
Maybe I could fix this by relabeling the "instance" label, but this sounds 
as wrong as relabeling "job".
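Brian's suggestion of distinct labels might avoid the naming permutations 
entirely. A minimal sketch, assuming a static_configs block in the blackbox 
scrape job ("probe_slot" is a made-up label name, not anything official):

```yaml
static_configs:
  # Sketch (untested): the same target listed four times, made distinct
  # by an extra "probe_slot" label instead of by spelling variations.
  # Prometheus dedupes targets only on identical label sets, so this
  # should yield four targets and four timeseries while keeping the
  # same "instance" value for all of them.
  - targets: ['pfsense.oberndorf.ca:443']
    labels:
      probe_slot: '1'
  - targets: ['pfsense.oberndorf.ca:443']
    labels:
      probe_slot: '2'
  - targets: ['pfsense.oberndorf.ca:443']
    labels:
      probe_slot: '3'
  - targets: ['pfsense.oberndorf.ca:443']
    labels:
      probe_slot: '4'
```

If this works as I expect, it would not be limited to 4 permutations and 
would keep a single "instance" value, so no instance relabeling would be 
needed.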

[image: same_target_4times.JPG]


Back to your question:
"Does it really matter whether it was 20 seconds or 25 seconds?"

I don't know if this is relevant. It's a rare issue, and I am in discussion 
with the vendor of the API/appliance. However, it could give me some more 
indication whether the API would respond after, let's say, 50s or 3 minutes.
If scrape_timeout is reached, the exporter sends a RST if I remember 
correctly, which is good for closing the connection, but it also closes the 
connection to the API, and the API server maybe just writes "client closed 
connection" or something similar to its log.

I don't know if it is really a problem when the answers of two parallel 
probes overlap (timeout longer than the interval), because the connections 
use different source ports, and Prometheus allows "out-of-order" ingestion, 
if I remember correctly.
Perhaps it could lead to many unclosed connections, which need memory. Let's 
say the interval is 1s and the timeout is 60s: there could be up to 60 
connections in parallel.

Maybe a scrape_timeout longer than the scrape_interval could be handled like this:

scrape_interval: 15s
scrape_timeout: 60s

If scrape_timeout is longer than scrape_interval, check whether the probe 
succeeded before scrape_timeout and schedule the next scrape according to 
scrape_interval.
If scrape_duration is longer than scrape_interval but shorter than 
scrape_timeout, skip the next scrape(s) until the timeout is reached or the 
scrape has succeeded.

However, this would not allow parallel scrapes.
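The skipping rule I have in mind could be sketched like this (purely 
hypothetical scheduler logic to illustrate the idea; next_scrape_time and 
everything in it is made up, not actual Prometheus behaviour):

```python
import math

def next_scrape_time(last_start, duration, interval, timeout):
    """Hypothetical rule: a scrape that runs past its interval makes the
    scraper skip interval slots until the scrape finishes (or is aborted
    at the timeout); a fast scrape keeps the normal schedule."""
    if duration <= interval:
        # Normal case: next scrape at the next interval boundary.
        return last_start + interval
    # Long scrape: it ends after 'duration', or is cut off at 'timeout'.
    end = last_start + min(duration, timeout)
    # Skip all interval slots that fall inside the running scrape.
    slots = math.ceil((end - last_start) / interval)
    return last_start + slots * interval

# With scrape_interval: 15s and scrape_timeout: 60s:
# a 5s probe  -> next scrape at t+15 (normal schedule)
# a 40s probe -> slots at t+15 and t+30 are skipped, next at t+45
# a 70s probe -> aborted at t+60, next scrape at t+60
```

The downside stays the same as above: scrapes are strictly sequential, so 
there is never more than one probe in flight.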


Probably this is a rare scenario, and debugging an API with 
blackbox_exporter was only an idea. I just wanted to ask if I'm missing 
something :-)

Thanks for sharing your ideas.





Brian Candler schrieb am Dienstag, 9. Januar 2024 um 15:45:42 UTC+1:

> (Thinks: maybe it's *not* necessary to apply distinct labels? This feels 
> wrong somehow, but I can't pinpoint exactly why it would be bad)
>
> On Tuesday 9 January 2024 at 14:43:51 UTC Brian Candler wrote:
>
>> Unfortunately, the timeout can't be longer than the scrape interval, 
>> firstly because this would require overlapping scrapes, and secondly the 
>> results could be returned out-of-order: e.g.
>>
>> xx:yy:00 scrape 1: takes 25 seconds, gives result at xx:yy:25
>> xx:yy:15 scrape 2: takes 5 seconds, gives result at xx:yy:20
>>
>> > If I run two blackbox_probes in parallel with scrape_interval: 30s and 
>> scrape_timeout: 30s this will work but both probes will start more or less 
>> at the same time.
>>
>> Actually I think you'll find they'd be evenly spread out over the scrape 
>> interval - try it.
>>
>> For example, make a single scrape job with a 60 second scrape interval, 
>> and list the same target 4 times - but make sure you apply some distinct 
>> label to each instance, so that they generate 4 separate timeseries.  You 
>> can then look at the raw timestamps in the database to check the actual 
>> scrape times: easiest way is by using the PromQL web interface and 
>> supplying a range vector query, like probe_success{instance="foo"}[5m].  
>> This has to be in table view, not graph view.  Don't mix any other targets 
>> into that scrape job, because they'll be spread together.
>>
>> Alternatively, KISS: use a 15 second scrape interval, and simply accept 
>> that "scrape failed" = "took longer than 15 seconds". Does it really matter 
>> whether it was 20 seconds or 25 seconds? Can you get that information from 
>> somewhere else if needed, e.g. web server logs?
>>
>> On Tuesday 9 January 2024 at 10:04:42 UTC Alexander Wilke wrote:
>>
>>> Hello,
>>> I want to use blackbox_exporter and http prober to login to an API.
>>>
>>> My goal is to do the login every 15s which could be
>>>
>>> xx:yy:00
>>> xx:yy:15
>>> xx:yy:30
>>> xx:yy:45
>>>
>>> I could solve this with scrape_interval: 15s.
>>> But in addition I want to allow a scrape timeout of 30s, which is longer 
>>> than the scrape_interval.
>>>
>>> If I run two blackbox_probes in parallel with scrape_interval: 30s and 
>>> scrape_timeout: 30s this will work but both probes will start more or less 
>>> at the same time.
>>>
>>> xx:yy:00
>>> xx:yy:30
>>>
>>> The idea behind that is:
>>> In general the API response for login is very fast. For whatever reason 
>>> sometimes it takes 30s or more. I do not want the probe to just fail after 
>>> 15s but want to see and understand how long a login request takes.
>>>
>>> If I abort a long-lasting request or do a parallel login, this may work 
>>> very fast. So it is probably not a problem with the API in general but with 
>>> a specific user session or other unknown circumstances. So I want many scrape 
>>> intervals, but the timeout sometimes needs to be higher, OR I need several 
>>> blackbox_probes which do not start at the same time but are spread equally.
>>>
>>> Any ideas?
>>> Is this possible with prometheus 2.48.1 and blackbox_exporter 0.24.0 ?
>>>
>>>
>>>
>>>
