It returns 0, and the Targets page shows "prometheus (0/1)" as well. I have basic auth enabled. Is it possible to keep basic auth enabled and still have Prometheus scrape itself?
Thanks
Paras.

On Mon, Sep 19, 2022 at 5:12 PM Brian Candler <b.cand...@pobox.com> wrote:

> What does
> up{job="prometheus"}
> show?
>
> If it's 0, then you have a problem with prometheus scraping itself. What error do you see in the targets list in the web interface? Maybe you've configured it to listen on a different port, or on a different path (--web.external-url), or with TLS or basic auth.
>
> If it's 1, then it appears you have no alerting rules. Which would explain why you don't get any alerts.
>
> On Monday, 19 September 2022 at 22:55:34 UTC+1 pradha...@gmail.com wrote:
>
>> None of these metrics are recognized. What am I missing?
>>
>> prometheus_rule_evaluations_total
>> prometheus_rule_evaluation_failures_total
>> prometheus_rule_group_iterations_total
>> prometheus_rule_group_iterations_missed_total
>>
>> Thanks
>>
>> On Mon, Sep 19, 2022 at 4:39 PM Paras pradhan <pradha...@gmail.com> wrote:
>>
>>> Yes. This is what I have
>>> scrape_configs:
>>>   - job_name: "prometheus"
>>>     static_configs:
>>>       - targets: ["localhost:9090"]
>>>
>>> On Mon, Sep 19, 2022 at 4:20 PM Brian Candler <b.ca...@pobox.com> wrote:
>>>
>>>> You should be getting results all the time, even when things are working. If you are not, then it means those metrics are missing, which means most likely you are not collecting them.
>>>>
>>>> You'll need a scrape job like the one I posted.
>>>>
>>>> On Monday, 19 September 2022 at 22:14:06 UTC+1 pradha...@gmail.com wrote:
>>>>
>>>>> Getting "Empty Query Results" at this moment. I will check when I notice the problem again.
>>>>>
>>>>> Thanks for your input!
>>>>> Paras.
>>>>>
>>>>> On Mon, Sep 19, 2022 at 4:03 PM Brian Candler <b.ca...@pobox.com> wrote:
>>>>>
>>>>>> Are you collecting prometheus' own metrics?
>>>>>> Something like this:
>>>>>>
>>>>>> - job_name: prometheus
>>>>>>   scrape_interval: 1m
>>>>>>   static_configs:
>>>>>>     - targets: ['localhost:9090']
>>>>>>
>>>>>> If you are, then there are various metrics you should check, including:
>>>>>> prometheus_rule_evaluations_total
>>>>>> prometheus_rule_evaluation_failures_total
>>>>>> prometheus_rule_group_iterations_total
>>>>>> prometheus_rule_group_iterations_missed_total
>>>>>>
>>>>>> For the rule / rule group in question, check which of these are incrementing during the problem period. If the 'failures' or 'missed' are incrementing, that points to a problem. Similarly if the 'evaluations_total' or 'iterations_total' *isn't* incrementing.
>>>>>>
>>>>>> Also, have a look at error output from prometheus while the problem is occurring:
>>>>>> journalctl -fu prometheus
>>>>>>
>>>>>> On Monday, 19 September 2022 at 21:53:46 UTC+1 pradha...@gmail.com wrote:
>>>>>>
>>>>>>> Correct. Restarting prometheus does fix it.
>>>>>>>
>>>>>>> On Mon, Sep 19, 2022 at 3:44 PM Brian Candler <b.ca...@pobox.com> wrote:
>>>>>>>
>>>>>>>> "Restarting prometheus, alertmanager and blackbox-exporter fixes the issue"
>>>>>>>>
>>>>>>>> Which one of these fixes the issue? From what you've said, I am guessing that restarting only prometheus would do it - since you're saying you see no alerts in the Prometheus UI, not even in "pending" state.
>>>>>>>>
>>>>>>>> On Monday, 19 September 2022 at 21:39:11 UTC+1 pradha...@gmail.com wrote:
>>>>>>>>
>>>>>>>>> Prometheus: 2.38.0
>>>>>>>>> Alertmanager: 0.24.0
>>>>>>>>> Blackbox: 0.22.0
>>>>>>>>>
>>>>>>>>> probe_success{job="blackbox_icmp-server"} returns 0. I see it.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Paras.
>>>>>>>>>
>>>>>>>>> On Mon, Sep 19, 2022 at 3:32 PM Brian Candler <b.ca...@pobox.com> wrote:
>>>>>>>>>
>>>>>>>>>> Prometheus version? Alertmanager version?
>>>>>>>>>>
>>>>>>>>>> What if you enter the query
>>>>>>>>>> probe_success{job="blackbox_icmp-server"} == 0
>>>>>>>>>> in the prometheus web interface (PromQL browser) while the problem is happening? Does it show any results?
>>>>>>>>>>
>>>>>>>>>> On Monday, 19 September 2022 at 19:21:29 UTC+1 pradha...@gmail.com wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello Julius
>>>>>>>>>>>
>>>>>>>>>>> * The rule is something like this:
>>>>>>>>>>>
>>>>>>>>>>> - name: ServerDown
>>>>>>>>>>>   rules:
>>>>>>>>>>>     - alert: Server-InstanceDown
>>>>>>>>>>>       expr: probe_success{job="blackbox_icmp-server"} == 0
>>>>>>>>>>>       for: 1m
>>>>>>>>>>>
>>>>>>>>>>> * When alerting is not working, the hosts are down for hours until I restart prometheus and the blackbox exporters. After restarting, everything is normal.
>>>>>>>>>>>
>>>>>>>>>>> * The underlying metric (probe_success) goes to 0 when a host is down, but the alerts don't change to Pending/Firing.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Paras.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 19, 2022 at 2:35 AM Julius Volz <juliu...@promlabs.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Paras,
>>>>>>>>>>>>
>>>>>>>>>>>> Could you share more information about your setup:
>>>>>>>>>>>>
>>>>>>>>>>>> * What's the alerting rule that isn't working as intended?
>>>>>>>>>>>> * For how long were the hosts down without getting alerted on?
>>>>>>>>>>>> * What did the underlying metrics (e.g. "up" for the exporter's own scrape health and "probe_success" for the backend probe health) collected by the Blackbox Exporter look like at the time when the alert should have been firing, but didn't?
>>>>>>>>>>>>
>>>>>>>>>>>> One possibility is that your Blackbox exporter itself couldn't be scraped anymore, in which case its "up" metric would be 0 and the "probe_success" metric would be absent (and thus any alerts based on that metric would never fire).
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Julius
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Sep 15, 2022 at 6:33 PM Paras pradhan <pradha...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> We use prometheus, alertmanager and blackbox-exporter to check whether hosts respond to ICMP. Host counts are 1K+. We noticed that sometimes, at random, the alerts are not generated (prometheus dashboard --> alerts) when the hosts/targets are actually down. Restarting prometheus, alertmanager and blackbox-exporter fixes the issue. I don't see anything that stands out in the logs. How do I troubleshoot, and is there anything like cached data in prometheus that needs to be cleared?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Paras.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
>>>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>>>> https://groups.google.com/d/msgid/prometheus-users/6bfb92dc-2a18-44d9-8fda-d6f84efba0e7n%40googlegroups.com
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Julius Volz
>>>>>>>>>>>> PromLabs - promlabs.com
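
On the basic auth question at the top of the thread: Prometheus can scrape itself with basic auth enabled, as long as the self-scrape job sends credentials. Below is a minimal sketch, assuming basic auth was turned on through the web configuration file (--web.config.file). The username and password are hypothetical placeholders, not values from this thread.

```yaml
scrape_configs:
  - job_name: "prometheus"
    # Credentials the scraper presents to Prometheus' own /metrics endpoint.
    # "admin" and "change-me" are placeholders (assumptions); they should
    # match a user defined under basic_auth_users in the web config file.
    basic_auth:
      username: "admin"
      password: "change-me"
    static_configs:
      - targets: ["localhost:9090"]
```

If this matches the setup, then after a reload up{job="prometheus"} should return 1 and the prometheus_rule_* metrics mentioned earlier should start to appear. Two likely adjustments: password_file can be used instead of password to keep the secret out of the main config, and if TLS is enabled as well, the job would also need scheme: https.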