Hello, it's only working partly, I think. If I add the same target several times to the same job, Prometheus treats targets with the exact same naming as one. This results in one target in Prometheus' web UI target list, and tcpdump confirms only one scrape per 60s:
- targets:
    - pfsense.oberndorf.ca:443 # pfsense webui tcp tls test
    - pfsense.oberndorf.ca:443 # pfsense webui tcp tls test
    - pfsense.oberndorf.ca:443 # pfsense webui tcp tls test
    - pfsense.oberndorf.ca:443 # pfsense webui tcp tls test

If I use the following instead, I have 4 different namings for the same target, which results in 4 scrapes. However, with this at most 4 permutations are possible, I think, and with plain HTTP only 2:

scheme: https
- targets:
    - pfsense.oberndorf.ca:443 # pfsense webui tcp tls test
    - https://pfsense.oberndorf.ca # pfsense webui tcp tls test
    - https://pfsense.oberndorf.ca:443 # pfsense webui tcp tls test
    - pfsense.oberndorf.ca # pfsense webui tcp tls test

And at least the scrapes do not spread as evenly as I hoped, and in addition I now have 4 different instances. Maybe I could fix this by relabeling the "instance" label, but that sounds as wrong as relabeling "job".

[image: same_target_4times.JPG]

Back to your question: "Does it really matter whether it was 20 seconds or 25 seconds?" I don't know if this is relevant. It's a rare issue and I am in discussion with the vendor of the API/appliance. However, it could give me some more indication if I could see that the API responds after, let's say, 50 seconds or 3 minutes. When scrape_timeout is reached, the exporter sends a RST, if I remember correctly, which is good for closing the connection, but it will also close the connection to the API, and the API server will then maybe just write "client closed connection" or something similar to its log.

I don't know if it is really a problem when the responses of two parallel probes overlap (timeout longer than duration), because the connections use different source ports, and Prometheus allows "out-of-order" ingestion, if I remember correctly. Perhaps it could lead to many unclosed connections, which need memory. Let's say the interval is 1s and the timeout is 60s: then there could be up to 60 connections in parallel.
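For reference, the distinct-label variant you suggested could look roughly like this, with the "instance" label rewritten back to a single value so that only the extra label differs. This is only a sketch: the label name "slot", the module name "http_2xx", and the exporter address "blackbox-exporter:9115" are placeholders I made up, not values from my real config.

```yaml
scrape_configs:
  - job_name: 'blackbox-spread'
    scrape_interval: 60s
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      # Same target four times, each with a distinct "slot" label, so that
      # Prometheus sees four separate targets and spreads them over the interval.
      - targets: ['https://pfsense.oberndorf.ca']
        labels: {slot: "0"}
      - targets: ['https://pfsense.oberndorf.ca']
        labels: {slot: "1"}
      - targets: ['https://pfsense.oberndorf.ca']
        labels: {slot: "2"}
      - targets: ['https://pfsense.oberndorf.ca']
        labels: {slot: "3"}
    relabel_configs:
      # Usual blackbox_exporter indirection: probe target goes into the
      # "target" URL parameter, and the scrape goes to the exporter itself.
      - source_labels: [__address__]
        target_label: __param_target
      # All four series get the same "instance" value; only "slot" differs.
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115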
Maybe a timeout longer than scrape_interval could be handled like this:

scrape_interval: 15s
scrape_timeout: 60s

If scrape_timeout is longer than scrape_interval: check whether the probe succeeded before scrape_timeout, and schedule the next scrape according to scrape_interval. If scrape_duration is longer than scrape_interval but shorter than scrape_timeout: skip the next scrape until the timeout is reached or the scrape has succeeded. However, this would not allow parallel scrapes.

Probably this is a rare scenario, and debugging an API with blackbox_exporter was only an idea. I just wanted to ask whether I am missing something :-) Thanks for sharing your ideas.

Brian Candler wrote on Tuesday, 9 January 2024 at 15:45:42 UTC+1:

> (Thinks: maybe it's *not* necessary to apply distinct labels? This feels
> wrong somehow, but I can't pinpoint exactly why it would be bad)
>
> On Tuesday 9 January 2024 at 14:43:51 UTC Brian Candler wrote:
>
>> Unfortunately, the timeout can't be longer than the scrape interval,
>> firstly because this would require overlapping scrapes, and secondly the
>> results could be returned out-of-order: e.g.
>>
>> xx:yy:00 scrape 1: takes 25 seconds, gives result at xx:yy:25
>> xx:yy:15 scrape 2: takes 5 seconds, gives result at xx:yy:20
>>
>> > If I run two blackbox_probes in parallel with scrape_interval: 30s and
>> > scrape_timeout: 30s this will work but both probes will start more or
>> > less at the same time.
>>
>> Actually I think you'll find they'd be evenly spread out over the scrape
>> interval - try it.
>>
>> For example, make a single scrape job with a 60 second scrape interval,
>> and list the same target 4 times - but make sure you apply some distinct
>> label to each instance, so that they generate 4 separate timeseries. You
>> can then look at the raw timestamps in the database to check the actual
>> scrape times: easiest way is by using the PromQL web interface and
>> supplying a range vector query, like probe_success{instance="foo"}[5m].
>> This has to be in table view, not graph view. Don't mix any other targets
>> into that scrape job, because they'll be spread together.
>>
>> Alternatively, KISS: use a 15 second scrape interval, and simply accept
>> that "scrape failed" = "took longer than 15 seconds". Does it really matter
>> whether it was 20 seconds or 25 seconds? Can you get that information from
>> somewhere else if needed, e.g. web server logs?
>>
>> On Tuesday 9 January 2024 at 10:04:42 UTC Alexander Wilke wrote:
>>
>>> Hello,
>>> I want to use blackbox_exporter and the http prober to log in to an API.
>>>
>>> My goal is to do the login every 15s, which could be
>>>
>>> xx:yy:00
>>> xx:yy:15
>>> xx:yy:30
>>> xx:yy:45
>>>
>>> I could solve this with scrape_interval: 15s.
>>> But in addition I want to allow a scrape_timeout of 30s, which is longer
>>> than the scrape_interval.
>>>
>>> If I run two blackbox_probes in parallel with scrape_interval: 30s and
>>> scrape_timeout: 30s this will work, but both probes will start more or
>>> less at the same time.
>>>
>>> xx:yy:00
>>> xx:yy:30
>>>
>>> The idea behind that is:
>>> In general the API response for login is very fast. For whatever reason
>>> it sometimes takes 30s or more. I do not want the probe to just fail
>>> after 15s but want to see and understand how long a login request takes.
>>>
>>> If I abort a long-lasting request or do a parallel login, this may work
>>> very fast. So it is probably not a problem with the API in general but
>>> with a specific user session or other unknown circumstances. So I want
>>> many scrape intervals, but the timeout sometimes needs to be higher, OR
>>> I need several blackbox_probes which do not start at the same time but
>>> are spread equally.
>>>
>>> Any ideas?
>>> Is this possible with prometheus 2.48.1 and blackbox_exporter 0.24.0 ?

-- 
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/033b21ea-9750-49fa-9d7b-d9b916c033a8n%40googlegroups.com.