[prometheus-users] json to prometheus exposition format

2020-11-07 Thread kiran
All, I have custom data (coming from an API request) in JSON format and am trying to push it to VictoriaMetrics without Prometheus, but I am looking for an easy and efficient way of converting it into the Prometheus exposition format or one of the other formats accepted for ingestion by VictoriaMetrics. Any suggestion
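A minimal sketch of the conversion asked about here, assuming a hypothetical JSON shape (a list of objects with "name", "labels" and "value" keys); the thread does not show the real payload, so the field names and the function names are illustrative only. It renders each record as one line of the Prometheus text exposition format and escapes label values.

```python
import json


def escape_label_value(value: str) -> str:
    # Label values in the text format must escape backslash, double quote and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_exposition(payload: str) -> str:
    """Convert a JSON list of {"name", "labels", "value"} records (assumed shape)
    into Prometheus text exposition lines."""
    records = json.loads(payload)
    lines = []
    for rec in records:
        # Sort labels so the output is deterministic.
        labels = ",".join(
            f'{key}="{escape_label_value(str(val))}"'
            for key, val in sorted(rec.get("labels", {}).items())
        )
        name = rec["name"]
        series = f"{name}{{{labels}}}" if labels else name
        lines.append(f"{series} {float(rec['value'])}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = '[{"name": "app_requests_total", "labels": {"app": "billing"}, "value": 42}]'
    print(to_exposition(sample), end="")
```

The resulting text could then be sent to whichever ingestion endpoint the backend offers; which endpoint that is would need to be checked against the VictoriaMetrics documentation.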

[prometheus-users] Recovering after a WAL corruption

2020-11-07 Thread Kirill Elagin
Hi everyone, A week ago we had an incident which resulted in the disk getting full. Yesterday, when investigating another issue, I realised that queries in Prometheus are only returning data from the last 3 to 5 hours. This is Prometheus 2.20.1 with all TSDB configuration set to defaults. I star

Re: [prometheus-users] suspending a VM that is running prometheus?

2020-11-07 Thread Harald Koch
On Thu, Nov 5, 2020, at 03:54, Ben Kochie wrote: > For working hypervisor combinations, running NTP on the guest VM is > unnecessary. For example, with KVM/QEMU, the guest can sync directly with the > hypervisor. Thanks for your reply! I agree, but the state of the art continues to be that

Re: [prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Brian Candler
I don't think it's a false alert. If it's the rule you showed, then the only way you can get an alert is if the metric probe_success has value zero. You should try to understand *why* BBE is returning zero; if necessary use tcpdump or wireshark to capture the HTTP traffic to and from it. But

Re: [prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Brian Candler
On Saturday, 7 November 2020 13:35:47 UTC, Yagyansh S. Kumar wrote: > > Try looking at scrape_duration_seconds{job="Ping-All-Servers"}. Maybe > it's borderline to the scrape interval. > >> That's interesting. Here are the top 20 scrape_duration_seconds maxed > for last 1 hour by instance. Close

Re: [prometheus-users] Re: proxy_url not working in azure service discoverer

2020-11-07 Thread Brett Jacobson
Okay, I'll file a bug report on GitHub. On Saturday, November 7, 2020 at 3:49:32 AM UTC-6 Brian Brazil wrote: > On Sat, 7 Nov 2020 at 08:21, Brian Candler wrote: > >> On Friday, 6 November 2020 21:16:18 UTC, Brett Jacobson wrote: >>> >>> I am trying to set the proxy_url param on a scrape config j

[prometheus-users] Re: promQL: getting data from multiple metrics in single query

2020-11-07 Thread Brian Candler
On Saturday, 7 November 2020 14:12:53 UTC, kiran wrote: > > In my use case I have custom metrics that I will be sending to victoria > metrics and will be using PromQL. > Using remote_write from Prometheus, or your application directly writing to VictoriaMetrics? VM supports import in a bunch o

Re: [prometheus-users] promtool check config prometheus.yml equivalent for blackbox_exporter

2020-11-07 Thread Brian Brazil
On Sat, 7 Nov 2020 at 13:50, 'Evelyn Pereira Souza' via Prometheus Users < prometheus-users@googlegroups.com> wrote: > Hi > > I really like > > # promtool check config prometheus.yml > > because it detects errors in syntax, etc > > Is there an equivalent for blackbox_exporter? Checking the config

[prometheus-users] Re: promQL: getting data from multiple metrics in single query

2020-11-07 Thread kiran
Thank you Brian. In my use case I have custom metrics that I will be sending to VictoriaMetrics and will be using PromQL. The metrics constitute application-level metadata and some high-level metrics for each application. I may have 20-25 such data points for each application. Ultimately I need t

[prometheus-users] promtool check config prometheus.yml equivalent for blackbox_exporter

2020-11-07 Thread 'Evelyn Pereira Souza' via Prometheus Users
Hi I really like # promtool check config prometheus.yml because it detects errors in syntax, etc Is there an equivalent for blackbox_exporter? Checking the config without deploying the changes? regards Evelyn

Re: [prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Yagyansh S. Kumar
Try looking at scrape_duration_seconds{job="Ping-All-Servers"}. Maybe it's borderline to the scrape interval. >> That's interesting. Here are the top 20 scrape_duration_seconds maxed for last 1 hour by instance. Close to 5 seconds. Can this lead to some issue? But again the thing comes why no

Re: [prometheus-users] Re: Rate per min

2020-11-07 Thread Arjun Singri
Thank you Brian. Arjun On Tue, Nov 3, 2020 at 8:43 AM Brian Candler wrote: > On Tuesday, 3 November 2020 15:04:56 UTC, Arjun wrote: >> >> I wanted to understand this in the context of "sum of rate". What is >> "sum" summing up here and across which time period? >> >> sum(rate(http_counter[1m]))
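To illustrate the question quoted above about sum(rate(http_counter[1m])), here is a deliberately simplified sketch: the rate is computed per series over that series' own 1-minute window, and sum then adds those per-series rates together at a single evaluation instant. The helper names and sample data are hypothetical, and the sketch ignores the extrapolation and counter-reset handling that Prometheus itself performs.

```python
def simple_rate(samples):
    """Per-second increase between the first and last (timestamp, value)
    samples in a window. Simplified: no extrapolation, no reset handling."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def sum_of_rates(window_by_series):
    """Add up the per-series rates, which is what the outer sum() does."""
    return sum(simple_rate(samples) for samples in window_by_series.values())


if __name__ == "__main__":
    # Two hypothetical series of the same counter, each sampled over one 60 s window.
    windows = {
        "instance=a": [(0, 100), (60, 160)],  # 60 requests in 60 s
        "instance=b": [(0, 10), (60, 40)],    # 30 requests in 60 s
    }
    print(sum_of_rates(windows))
```

So the time period belongs to rate (the [1m] range), while sum aggregates across series, not across time.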

Re: [prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Brian Candler
Try looking at scrape_duration_seconds{job="Ping-All-Servers"}. Maybe it's borderline to the scrape interval. What does min_over_time(up{job="Ping-All-Servers"}[5m]) show? In other words, is it the scrape to BBE which is failing, or the BBE probe? (I think the latter). Is there a different n

Re: [prometheus-users] Blaxkbox Panic Error.

2020-11-07 Thread yagyans...@gmail.com
Okay, thanks a lot, Brian. Can you please have a look at https://groups.google.com/u/1/g/prometheus-users/c/Cb7lUaqWnbc too? On Saturday, November 7, 2020 at 3:21:55 PM UTC+5:30 Brian Brazil wrote: > On Sat, 7 Nov 2020 at 04:46, yagyans...@gmail.com > wrote: > >> >> Hi. I am running Blackbox

Re: [prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Yagyansh S. Kumar
Yes, both the Prometheus instances are talking to the same BBE indeed. In fact, both have the exact same configuration file and are scraping the exact same targets. Here is the graph for the modified query. Failures are visible for 2.20.1 but none for 2.12.0. 2.12.0 [image: image.png] 2.20.1 [image: imag

Re: [prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Brian Candler
You won't necessarily see all the failures on that graph. With a 5-second scrape interval, a 6 hour window contains 4,320 scrapes - more than the number of points fetched. Many of the points will be skipped over. I suggest you graph this instead: min_over_time(probe_success[5m]) (Otherwise,
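The arithmetic behind the 4,320 figure can be checked with a short sketch. The number of points a graph actually fetches is not stated in the thread, so the value 250 below is purely an assumption for illustration, as are the function names.

```python
def scrapes_in_window(window_seconds: int, scrape_interval_seconds: int) -> int:
    """Number of scrapes that fall inside a graph window."""
    return window_seconds // scrape_interval_seconds


def graph_step_seconds(window_seconds: int, points: int) -> float:
    """Spacing between plotted points when the window is split into `points` steps."""
    return window_seconds / points


if __name__ == "__main__":
    window = 6 * 3600  # 6 hour window, in seconds
    interval = 5       # 5 second scrape interval
    print(scrapes_in_window(window, interval))
    # Assumed resolution of 250 points: each plotted point then stands in
    # for many scrapes, so a single failed scrape can easily be skipped.
    print(graph_step_seconds(window, 250))
```

Whenever the step is larger than the scrape interval, individual samples between plotted points are not shown, which is why wrapping the metric in min_over_time over a range is suggested.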

Re: [prometheus-users] Blaxkbox Panic Error.

2020-11-07 Thread Brian Brazil
On Sat, 7 Nov 2020 at 04:46, yagyans...@gmail.com wrote: > > Hi. I am running Blackbox Exporter v 0.18.0 and quite a few times I am > observing errors like the one below. > > Nov 7 05:03:11 dh4-k1-infra-prometheus-n2 blackbox_exporter: 2020/11/07 > 05:03:11 http: panic serving 172.20.10.98:44106: runtime

Re: [prometheus-users] Re: proxy_url not working in azure service discoverer

2020-11-07 Thread Brian Brazil
On Sat, 7 Nov 2020 at 08:21, Brian Candler wrote: > On Friday, 6 November 2020 21:16:18 UTC, Brett Jacobson wrote: >> >> I am trying to set the proxy_url param on a scrape config job that uses >> azure_sd_configs for azure service discovery. It appears that the azure >> service discovery module

[prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Brian Candler
On Saturday, 7 November 2020 08:49:15 UTC, yagyans...@gmail.com wrote: > > My Blackbox exporter is already running with Debug Log Mode and still, I > don't see any probe failed logs for that period. > But is this the same blackbox exporter which is also showing panics in its logs? https://groups

[prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Brian Candler
The PromQL query probe_success{job=~"Ping-All-Servers"} == 0 is a filter. It returns the set of timeseries where the job label matches "Ping-All-Servers" *and* the value is zero. It cannot return a non-empty set of results unless those conditions are met. What's your rule evaluation interv
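The filter behaviour described above can be modelled in a few lines. This is only an illustrative sketch with made-up series and a hypothetical function name, not how Prometheus is implemented: a series survives only if its job label fully matches the regex and its current value is zero.

```python
import re


def filter_zero(series, job_pattern):
    """Keep only series whose job label fully matches the pattern and whose
    value is 0. PromQL regex matchers are anchored, hence fullmatch."""
    regex = re.compile(job_pattern)
    return {
        labels: value
        for labels, value in series.items()
        if regex.fullmatch(dict(labels).get("job", "")) and value == 0
    }


if __name__ == "__main__":
    # Hypothetical instant values of probe_success, keyed by label pairs.
    current = {
        (("instance", "a"), ("job", "Ping-All-Servers")): 1.0,
        (("instance", "b"), ("job", "Ping-All-Servers")): 0.0,
        (("instance", "c"), ("job", "other")): 0.0,
    }
    for labels in filter_zero(current, "Ping-All-Servers"):
        print(dict(labels))
```

If every matching series has the value 1, the result is empty and the alert rule has nothing to fire on.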

[prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread yagyans...@gmail.com
Hi Brian, My Blackbox exporter is already running with Debug Log Mode and still, I don't see any probe failed logs for that period. Also, I have run the query for some of the instances that I saw in PENDING state, but I do not see any failures there either; probe_success is 1 for them constantly

[prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread Brian Candler
Go into the Prometheus query browser (front page in the web interface, normally port 9090), and enter the query: probe_success{job=~"Ping-All-Servers"} and switch to graph mode. Is the line going up and down? Then probes are failing. If you want to see logs of these failures, then on the bla

[prometheus-users] Re: promQL: getting data from multiple metrics in single query

2020-11-07 Thread Brian Candler
You can do a PromQL query like {__name__=~"foo|bar"} but that's messy if you also want to filter on different labels for metrics foo and bar. If you're only interested in the current values of each metric, then you can query the /federate endpoint where you can provide the match[] parameter mult
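As a sketch of the /federate approach mentioned above, the snippet below builds a URL that repeats the match[] parameter once per selector, so each metric can carry its own label filters. The base URL, the selectors and the function name are placeholders chosen for illustration.

```python
from urllib.parse import urlencode


def federate_url(base_url, selectors):
    """Build a /federate URL with one match[] parameter per selector."""
    # A list of pairs lets the same key appear several times in the query string.
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"


if __name__ == "__main__":
    # Hypothetical server address and selectors.
    print(federate_url("http://localhost:9090", ['foo{job="a"}', 'bar{env="prod"}']))
```

The response is in the text exposition format, so it returns current values only, not history.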

[prometheus-users] Re: Blaxkbox Panic Error.

2020-11-07 Thread yagyans...@gmail.com
Hi Brian, I am running under linux-amd64 and using pre-built binary. [myuser@infra-prometheus ~]# /usr/local/bin/blackbox_exporter --version blackbox_exporter, version 0.18.0 (branch: HEAD, revision: 60c86e6ce5af7958b06ae7a08222bb6ec839) build user: root@53d72328d93f build date:

[prometheus-users] Re: Blaxkbox Panic Error.

2020-11-07 Thread Brian Candler
I've not seen this. What server platform are you running under? (e.g. is it linux-amd64?) Are you using the release binaries, or did you build it yourself from source? (show output of blackbox_exporter --version) Is it only http modules that give this error? Can you show the blackbox module c

[prometheus-users] Re: proxy_url not working in azure service discoverer

2020-11-07 Thread Brian Candler
On Friday, 6 November 2020 21:16:18 UTC, Brett Jacobson wrote: > > I am trying to set the proxy_url param on a scrape config job that uses > azure_sd_configs for azure service discovery. It appears that the azure > service discovery module does not respect this setting from the > scrape_config

[prometheus-users] Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread yagyans...@gmail.com
Hi. I am using Blackbox Exporter v 0.18.0 for generating Host Down Alerts. Below is the configured rule. - alert: HostDown expr: probe_success{job=~"Ping-All-Servers"} == 0 for: 1m labels: severity: "CRITICAL" annotations: summary: "Server is Down - *{{ $labels.insta

[prometheus-users] Re: Discrepancy in Alert Rule Evaluation.

2020-11-07 Thread yagyans...@gmail.com
Prometheus Version - 2.20.1 On Saturday, November 7, 2020 at 1:46:31 PM UTC+5:30 yagyans...@gmail.com wrote: > > Hi. I am using Blackbox Exporter v 0.18.0 for generating Host Down Alerts. > Below is the configured rule. > - alert: HostDown > expr: probe_success{job=~"Ping-All-Servers"} ==