[prometheus-users] Prometheus operator pod is not running
Hello everyone, my Prometheus pod's readiness probe is failing, and because of this I am unable to open the UI. I have checked the logs but couldn't find any error messages related to Prometheus. Could you please look at the logs below and advise?

LOGS:

level=info ts=2020-06-28T18:03:33.702Z caller=repair.go:59 component=tsdb msg="found healthy block" mint=159190560 maxt=159191280 ulid=01EAJT7KCKCSRT16H7CEWC6X3B
level=info ts=2020-06-28T18:03:33.702Z caller=repair.go:59 component=tsdb msg="found healthy block" mint=159191280 maxt=159192000 ulid=01EAK261BQVKXF14BQMBHKFFH9
level=info ts=2020-06-28T18:03:33.702Z caller=repair.go:59 component=tsdb msg="found healthy block" mint=159192000 maxt=159192720 ulid=01EAKA1RYKVM6WM4K0P1CRZDHJ
level=info ts=2020-06-28T18:03:33.703Z caller=repair.go:59 component=tsdb msg="found healthy block" mint=159185520 maxt=159187680 ulid=01EAKB8V8EPRGJMZQAB8K7FEJV
level=info ts=2020-06-28T18:03:33.703Z caller=repair.go:59 component=tsdb msg="found healthy block" mint=159192720 maxt=159193440 ulid=01EAKGZNYZN22MPA84RWWYBEYP
level=info ts=2020-06-28T18:03:54.940Z caller=head.go:584 component=tsdb msg="replaying WAL, this may take awhile"
level=info ts=2020-06-28T18:04:11.343Z caller=head.go:608 component=tsdb msg="WAL checkpoint loaded"
level=info ts=2020-06-28T18:04:11.344Z caller=head.go:632 component=tsdb msg="WAL segment loaded" segment=32993 maxSegment=34175
level=info ts=2020-06-28T18:04:11.344Z caller=head.go:632 component=tsdb msg="WAL segment loaded" segment=32994 maxSegment=34175
level=info ts=2020-06-28T18:04:11.344Z caller=head.go:632 component=tsdb msg="WAL segment loaded" segment=32995 maxSegment=34175

kubectl events:

Events:
  Type     Reason                  Age                    From                                 Message
  ----     ------                  ---                    ----                                 -------
  Normal   Scheduled               17m                    default-scheduler                    Successfully assigned monitoring/prometheus-prometheus-operator-prometheus-0 to ip-172-16-9-79.ec2.internal
  Normal   SuccessfulAttachVolume  17m                    attachdetach-controller              AttachVolume.Attach succeeded for volume "pvc-c841d132-a29f-11e9-8e27-021d0af0a494"
  Normal   Pulling                 17m                    kubelet, ip-172-11.2.2.ec2.internal  pulling image "quay.io/prometheus/prometheus:v2.15.2"
  Normal   Pulling                 17m                    kubelet, ip-172-11.2.2.ec2.internal  pulling image "quay.io/coreos/configmap-reload:v0.0.1"
  Normal   Created                 17m                    kubelet, ip-172-11.2.2.ec2.internal  Created container
  Normal   Started                 17m                    kubelet, ip-172-11.2.2.ec2.internal  Started container
  Normal   Pulling                 17m                    kubelet, ip-172-11.2.2.ec2.internal  pulling image "quay.io/coreos/prometheus-config-reloader:v0.37.0"
  Normal   Pulled                  17m                    kubelet, ip-172-11.2.2.ec2.internal  Successfully pulled image "quay.io/coreos/prometheus-config-reloader:v0.37.0"
  Normal   Created                 17m                    kubelet, ip-172-11.2.2.ec2.internal  Created container
  Normal   Pulled                  17m                    kubelet, ip-172-11.2.2.ec2.internal  Successfully pulled image "quay.io/prometheus/prometheus:v2.15.2"
  Normal   Started                 17m                    kubelet, ip-172-11.2.2.ec2.internal  Started container
  Normal   Pulled                  17m                    kubelet, ip-172-11.2.2.ec2.internal  Successfully pulled image "quay.io/coreos/configmap-reload:v0.0.1"
  Normal   Created                 17m                    kubelet, ip-172-11.2.2.ec2.internal  Created container
  Normal   Started                 17m                    kubelet, ip-172-11.2.2.ec2.internal  Started container
  Normal   Pulling                 17m                    kubelet, ip-172-11.2.2.ec2.internal  pulling image "thanosio/thanos:v0.7.0"
  Normal   Pulled                  17m                    kubelet, ip-172-11.2.2.ec2.internal  Successfully pulled image "thanosio/thanos:v0.7.0"
  Normal   Created                 17m                    kubelet, ip-172-11.2.2.ec2.internal  Created container
  Normal   Started                 17m                    kubelet, ip-172-11.2.2.ec2.internal  Started container
  Warning  Unhealthy               2m12s (x179 over 17m)  kubelet, ip-172-11.2.2.ec2.internal  Readiness probe failed: HTTP probe failed with statuscode: 503

Regards

-- You received this message because you are subscribed to the Google Groups "Prometheus Users" group. To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscr...@googlegroups.com.
Re: [prometheus-users] Re: not able to see metrics from query browser even though end point is up and showing metrics through curl
The up metric is showing value 1, and in Targets the endpoint is showing as up as well. Is there any time mismatch between the Prometheus server and the exporter? If yes, how could we fix it?

On Sunday, June 28, 2020 at 6:44:57 AM UTC-5 b.ca...@pobox.com wrote:
> tcpdump -i eth0 -nn -s0 -A host x.x.x.x and tcp port y
>
> where x.x.x.x is the IP address of the remote host, and y is the port
> you're running the exporter on. Replace "eth0" with your network interface
> name. If prometheus and the exporter are on the same machine then use "lo"
> (loopback) as the interface name.
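One concrete way a clock mismatch causes this symptom: if the exporter attaches explicit timestamps to its samples and its clock is skewed, the samples land outside Prometheus's 5-minute lookback window and instant queries return nothing even though up is 1. A small sketch that scans exposition text for explicit timestamps far from the scraper's clock (the regex and 5m threshold are assumptions for illustration):

```python
import re

# One exposition line: name{labels} value [timestamp_ms]
LINE = re.compile(r'^(?P<name>[^#\s{]+)(?:\{[^}]*\})?\s+(?P<value>\S+)(?:\s+(?P<ts>\d+))?$')

def stale_timestamps(exposition: str, now_ms: int, max_skew_ms: int = 5 * 60 * 1000):
    """Return metric names whose explicit timestamps are further than
    max_skew_ms from 'now' -- a common reason an endpoint that looks fine
    in curl still shows nothing in the query browser."""
    stale = []
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        m = LINE.match(line)
        if m and m.group('ts') and abs(int(m.group('ts')) - now_ms) > max_skew_ms:
            stale.append(m.group('name'))
    return stale
```

Metrics without explicit timestamps (the usual case) are ingested at scrape time and are immune to this particular problem.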
[prometheus-users] Re: Scrape Notification
Hey Brian,

Thanks for the reply. I do agree that I'm not using Prometheus as intended. The rationale for my question is that I'm interested in working "around" what I would consider "limitations" of Grafana. I want to coordinate the visualization of events alongside the time series data I am collecting. By allowing Prometheus to scrape one set of numbers at specific instances and then posting zeros for the others, I am creating a fake "time series" that really hosts events. Grafana will just ingest that data like any other and then present a beautiful synchronized visualization.

I got it working by getting a callback from the webserver: once Prometheus has scraped the page, the webserver initiates the callback and I zero out the result.

Best regards,
- Steve

P.S. I already send event data to Grafana using annotations. This gives the nice visualization, but doesn't allow the end user to easily download the event data for offline processing.

On Friday, June 26, 2020 at 1:30:49 AM UTC-5, Brian Candler wrote:
> Prometheus is not an event database. If you want to show the results of
> *individual* tests (or the duration of individual tests) then this isn't
> the tool for you. Use a normal SQL database, or a noSQL like
> ElasticSearch, InfluxDB etc.
>
> Prometheus is a metric database, so if you want to show the *number* or
> *rate* of tests (or test failures/successes) then it will do a good job.
> To do that, you should expose a counter of the number of test
> runs/passes/failures, or an accumulated total of the test run times, or a
> histogram of test run times.
>
> Then it doesn't matter how often it's scraped: the data is always
> correct. The graph can show how many tests ran in a certain amount of time
> by looking at how the counter has changed over that time. If you have a
> histogram you can also answer questions like "what percentage of tests
> completed in under 5 seconds". But you are still thinking about tests in
> the aggregate, not as individual events.
>
> Resetting the data on a scrape is not a good idea, because it breaks as
> soon as two prometheus servers scrape the same endpoint - e.g. a
> high-availability pair, or a laptop scraping the data to test a new
> collector.
>
> You mention pushgateway. That's rarely the right solution for any
> application. You could use it here, if the only thing you care about is
> the result and/or timestamp of the *last* test that was run. To accumulate
> counters or totals between test runs, have a look at statsd_exporter
> instead.
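A minimal sketch of the counter approach Brian describes: accumulate per-outcome counters that only ever increase, and render them in the Prometheus text format on every scrape. This uses only the standard library for illustration; in practice you would use the official prometheus_client library, and the metric name here is a made-up example.

```python
# Cumulative counters per outcome; nothing is reset on scrape, so any
# number of Prometheus servers can scrape this safely.
from collections import Counter

class TestCounters:
    def __init__(self):
        self.counts = Counter()

    def record(self, outcome: str) -> None:
        """Call once per test run; counters only ever go up."""
        self.counts[outcome] += 1

    def exposition(self) -> str:
        """Render the Prometheus text exposition format."""
        lines = ["# TYPE test_runs_total counter"]
        for outcome in sorted(self.counts):
            lines.append(f'test_runs_total{{outcome="{outcome}"}} {self.counts[outcome]}')
        return "\n".join(lines) + "\n"
```

A rate or increase() over test_runs_total then answers "how many tests ran in this window" without any scrape-time zeroing.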
[prometheus-users] Alert handling using alertmanager event handler
Hi,

I want to handle alerts, such as a Jenkins process going down, using an alertmanager event handler. But the documentation is not helping me with how to configure it. I really need help on where to download this from: https://github.com/jjneely/am-event-handler and how to get started with it. The documentation is not really enough for me; I tried searching for examples but couldn't find any. Please help me with some examples as well.

Thanks
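At a high level, am-event-handler is an Alertmanager webhook receiver that maps incoming alerts to shell commands. A rough stand-in sketch of that core idea follows; the HANDLERS table, the JenkinsDown alert name, and the restart command are all hypothetical examples, not part of am-event-handler's actual configuration format.

```python
import json

# Hypothetical mapping from alertname to a command to run; am-event-handler
# configures this via its own template file, not a Python dict.
HANDLERS = {
    "JenkinsDown": "systemctl restart jenkins",
}

def commands_for(payload: str):
    """Given Alertmanager's webhook JSON body, return the commands to run
    for each firing alert that has a configured handler."""
    body = json.loads(payload)
    cmds = []
    for alert in body.get("alerts", []):
        if alert.get("status") != "firing":
            continue
        name = alert.get("labels", {}).get("alertname", "")
        if name in HANDLERS:
            cmds.append(HANDLERS[name])
    return cmds
```

On the Alertmanager side you would point a webhook receiver's url at whatever HTTP server wraps this logic.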
Re: [prometheus-users] Re: not able to see metrics from query browser even though end point is up and showing metrics through curl
Hi,

Brian Candler wrote:
> "up" missing - scraping not configured.
> "up" has value 0 - unable to communicate or data is bad.
> "up" has value 1 - maybe partial scrape
> Check logs at prometheus and exporter; check traffic between prometheus and exporter with tcpdump.

Can you show a sample tcpdump syntax for this purpose?

Thanks.
[prometheus-users] Re: not able to see metrics from query browser even though end point is up and showing metrics through curl
Then you're probably not scraping it, or the custom exporter is returning invalid data. Check the value of the "up" metric in the query browser for the given job and instance, i.e. up{job="foo",instance="bar"}

"up" missing - scraping not configured.
"up" has value 0 - unable to communicate or data is bad.
"up" has value 1 - maybe partial scrape.

Check logs at prometheus and exporter; check traffic between prometheus and exporter with tcpdump. Also show an example of the curl output.
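For the record, the three cases above can also be checked programmatically against Prometheus's HTTP API (/api/v1/query). A sketch, assuming the documented JSON response shape; the base URL and label values are placeholders:

```python
import json
from urllib.parse import urlencode

def query_url(base: str, job: str, instance: str) -> str:
    """Build the instant-query URL for up{job=...,instance=...}."""
    expr = f'up{{job="{job}",instance="{instance}"}}'
    return f"{base}/api/v1/query?" + urlencode({"query": expr})

def parse_up(response_body: str):
    """Return None (series missing: not scraped), 0.0 (scrape failing),
    or 1.0 (scrape OK, though possibly a partial/invalid exposition)."""
    result = json.loads(response_body)["data"]["result"]
    if not result:
        return None
    return float(result[0]["value"][1])
```

Fetching query_url("http://localhost:9090", "foo", "bar") and feeding the body to parse_up distinguishes the three cases without opening the UI.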
[prometheus-users] Re: kube-apiserver availability calculation using PromQL
The issue is resolved: https://stackoverflow.com/questions/62516996/availability-calculation-using-promql

On Monday, June 22, 2020 at 6:50:33 PM UTC+4:30, mahmoud shiri varamini wrote:
> Hi,
> I'm trying to monitor my Kubernetes cluster's availability. I'm scraping
> kube-apiserver metrics and calculating availability according to pod
> availability. Sometimes the cluster goes down, the kube-apiserver pods go
> down, and the Prometheus server is not able to scrape at all; at other
> times the kube-apiserver pods are up and running and serving requests, but
> due to network connectivity or some other reason the Prometheus server
> cannot scrape metrics. In that second case I should consider the api-server
> pods available even though Prometheus could not scrape them.
> Is there any way to use PromQL to ignore missing values when I'm trying to
> calculate the availability percentage?
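One way to read "ignore missing values": average only over the intervals where a scrape actually produced data, so gaps (e.g. network loss between Prometheus and the apiserver) count neither as uptime nor downtime. A small offline sketch of that arithmetic, independent of the PromQL in the linked answer:

```python
from typing import Optional, Sequence, Tuple

def availability(samples: Sequence[Tuple[float, Optional[float]]]) -> float:
    """samples: (timestamp, up_value) pairs; up_value is None when the
    scrape produced no data at all. Missing samples are excluded from
    the denominator rather than treated as 0 (down)."""
    observed = [v for _, v in samples if v is not None]
    if not observed:
        return 0.0
    return sum(observed) / len(observed)
```

With five scrape slots, one missing and one down, that yields 3/4 availability: the gap is excluded, the observed failure still counts.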