[prometheus-users] Re: Alertmanager Pod is failing - CrashLoopBackOff

2020-05-19 Thread Brian Candler
What the error is saying is you tried to add a setting "teams" under opsgenie_configs, but opsgenie_configs does not recognise such an option. The set of allowed options is defined here: https://prometheus.io/docs/alerting/configuration/#opsgenie_config Maybe you wanted something like:

Re: [prometheus-users] Re: Pushgateway or StatsD

2020-05-19 Thread 'Albert Aleksandrov' via Prometheus Users
Thanks for the idea.
On Sunday, May 17, 2020 at 0:24:24 UTC+3, Matthias Rampke wrote:
> If you care about the individual event to the extent that you want to see it individually, you are probably better off using an event tracking system like the ELK stack.
>
> Prometheus

[prometheus-users] Monitor specific application process in Linux

2020-05-19 Thread Juan Rosero
Hello, I've been reading a lot on different sites and this User Group as well, but have not come up with a clear answer. I need to monitor a specific application process in Linux and verify if it's running and I've been reading about *--collector.processes* and enabling it on Node Exporter.

[prometheus-users] Expose java metrics to prometheus

2020-05-19 Thread Nidhi Sharma
Hi, I have a web app running on tomcat. There is a hello resource (REST API) to check the health of the app. Output if this resource gives the version of the app. I want output of this resource to be captured as a metric and pulled by Prometheus. Please help on how can I proceed. -- You

[prometheus-users] How to get the count or sum by hour in day or daywise in a month from PromQL in Grafana

2020-05-19 Thread Rajesh Reddy Nachireddi
Hi, how can we get the count or sum by hour within a day, or by day within a month, from PromQL in Grafana? We want to get the following:
1. When we select the daily report, e.g.:
on Monday
12AM - 1AM - 100
1AM - 2AM - 200
2AM - 3AM - 400
on Tuesday
12AM - 1AM - 100
1AM - 2AM - 200
2AM - 3AM - 400

Re: [prometheus-users] derive alert severity from other labels

2020-05-19 Thread Brian Brazil
On Tue, 19 May 2020 at 09:25, Roland Mieslinger wrote:
> Hi,
>
> we are using the same set of alert rules for both, our production and qa environment, with the severity label set to a value based on what is appropriate for production.
> As a consequence, alert severity is too high for most

[prometheus-users] Re: NTP Metrics.

2020-05-19 Thread Brian Candler
The ntp collector is disabled by default; you can turn it on with a command-line flag. However, the timex collector is enabled by default (e.g. node_timex_sync_status, node_timex_estimated_error_seconds). For a rough idea of how

[prometheus-users] Re: Expose java metrics to prometheus

2020-05-19 Thread Nidhi Sharma
Thanks for replying. Can we do this with MBeans and JMX? I can create MBeans and register them in my application, but I don't know how to capture the output in Prometheus time series format. For example: Product_version{product="SomeProduct",version="2.9.1"}
On Tuesday, May 19, 2020 at 2:09:51 PM

Re: [prometheus-users] derive alert severity from other labels

2020-05-19 Thread Roland Mieslinger
On Tuesday, 19 May 2020 at 10:46:32 UTC+2, Brian Brazil wrote:
> On Tue, 19 May 2020 at 09:25, Roland Mieslinger wrote:
>> Hi,
>>
>> we are using the same set of alert rules for both, our production and qa environment, with the severity label set to a value based on what is

[prometheus-users] Re: Expose java metrics to prometheus

2020-05-19 Thread Vu Tuan Dat
Create a metric with the name you want and add labels to it: https://github.com/prometheus/client_java#labels
On Tuesday, May 19, 2020 at 4:17:00 PM UTC+7, Nidhi Sharma wrote:
> Thanks for replying. Can we do this by Mbeans and JMX ? I can create Mbeans and register it in my application but I
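As a language-neutral illustration of the labels approach, the following is a minimal Python sketch of the text exposition line that such a version metric would produce. It is not client_java code: the helper names `escape_label_value` and `info_metric_line` are hypothetical, the lowercase metric name follows the usual naming convention rather than the spelling in the thread, and the constant value 1 is the common "info metric" pattern assumed here.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, newline and double quote for a label value."""
    # Backslashes must be escaped first so later escapes are not doubled.
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def info_metric_line(name: str, labels: dict) -> str:
    """Render one sample line: the labels carry the information, the value is 1."""
    pairs = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{pairs}}} 1"


# Hypothetical product and version values, taken from the example in the thread.
line = info_metric_line(
    "product_version", {"product": "SomeProduct", "version": "2.9.1"}
)
print(line)
```

In a real Java application the client library would render this line itself; the sketch only shows what Prometheus would see when scraping.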

[prometheus-users] Re: derive alert severity from other labels

2020-05-19 Thread Roland Mieslinger
On Tuesday, 19 May 2020 at 10:33:20 UTC+2, Vu Tuan Dat wrote:
> you can try:
> severity: '{{ if eq $labels.environment "qa" }} warn {{ else }} page {{ end }}'
Nice hack, I hadn't thought about (ab)using the templating engine for that.

[prometheus-users] NTP Metrics.

2020-05-19 Thread Yagyansh S. Kumar
Hi. I have my own NTP server configured at x.x.x.x. Now I want to check whether my 10 other servers are synchronized with my NTP server. I have gone through a lot of threads and found different opinions with different answers. Also, I guess node_ntp_drift_seconds is an old metric and

[prometheus-users] derive alert severity from other labels

2020-05-19 Thread Roland Mieslinger
Hi, we are using the same set of alert rules for both our production and qa environments, with the severity label set to a value based on what is appropriate for production. As a consequence, alert severity is too high for most alerts in our qa environment. The environment is available as a

[prometheus-users] Re: derive alert severity from other labels

2020-05-19 Thread Vu Tuan Dat
You can try:
severity: '{{ if eq $labels.environment "qa" }} warn {{ else }} page {{ end }}'
On Tuesday, May 19, 2020 at 3:25:01 PM UTC+7, Roland Mieslinger wrote:
> Hi,
>
> we are using the same set of alert rules for both, our production and qa environment, with the severity label set to

[prometheus-users] Re: Expose java metrics to prometheus

2020-05-19 Thread Vu Tuan Dat
This can help: https://github.com/prometheus/client_java
On Tuesday, May 19, 2020 at 1:32:10 PM UTC+7, Nidhi Sharma wrote:
> Hi, I have a web app running on tomcat. There is a hello resource (REST API) to check the health of the app. Output if this resource gives the version of the app. I

[prometheus-users] Alerts in Alertmanger cannot be cleared

2020-05-19 Thread Vu Tuan Dat
Hello, I got an issue with Alertmanager and Prometheus synchronization. My system has a cluster of two Alertmanager nodes for multiple Prometheus clusters. Yesterday, I updated Prometheus config (targets) using `/-/reload` endpoints then I got several unresolved alerts which were NOT (for

[prometheus-users] Re: Monitor specific application process in Linux

2020-05-19 Thread Vu Tuan Dat
Ideally, write an exporter for your specific process; it's not that hard. Or instrument your app directly.
On Tuesday, May 19, 2020 at 2:11:47 PM UTC+7, Juan Rosero wrote:
> Hello,
>
> I've been reading a lot on different sites and this User Group as well, but have not come up with a

[prometheus-users] Re: Monitor specific application process in Linux

2020-05-19 Thread Brian Candler
You can use a little script to write metrics to a textfile and pick them up with node_exporter's textfile_collector, and run it periodically (e.g. from cron). The textfile_collector also exposes the timestamp when the file was last modified, so you can alert if it stops being updated.
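A minimal sketch of such a script in Python, assuming a Linux host: it looks for the process by scanning /proc, renders a gauge in the text exposition format, and replaces the output file atomically so node_exporter never reads a half-written file. The metric name `app_process_up`, the function names, and the paths in the usage note are hypothetical choices for illustration.

```python
import os
import tempfile


def process_running(name: str, proc_root: str = "/proc") -> bool:
    """Return True if any process has this command name (Linux only).

    Note: the kernel truncates the name in /proc/<pid>/comm to 15 characters.
    """
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    return True
        except OSError:
            continue  # the process exited while we were scanning
    return False


def render_metric(name: str, running: bool) -> str:
    """Render a gauge in the text exposition format.

    Assumes the process name needs no label-value escaping.
    """
    return (
        "# HELP app_process_up Whether the named process is running (1) or not (0).\n"
        "# TYPE app_process_up gauge\n"
        f'app_process_up{{name="{name}"}} {1 if running else 0}\n'
    )


def write_atomically(path: str, text: str) -> None:
    """Write to a temporary file in the same directory, then rename over the target."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; node_exporter must be able to read it
    os.replace(tmp_path, path)
```

From cron, one might call `write_atomically("/var/lib/node_exporter/textfile/myapp.prom", render_metric("myapp", process_running("myapp")))`, with node_exporter started with `--collector.textfile.directory` pointing at that directory. The file name must end in `.prom` to be picked up, and the `node_textfile_mtime_seconds` metric can then be used to alert when the file stops being updated.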

Re: [prometheus-users] derive alert severity from other labels

2020-05-19 Thread Brian Brazil
On Tue, 19 May 2020 at 10:25, Roland Mieslinger wrote:
> On Tuesday, 19 May 2020 at 10:46:32 UTC+2, Brian Brazil wrote:
>> On Tue, 19 May 2020 at 09:25, Roland Mieslinger wrote:
>>> Hi,
>>>
>>> we are using the same set of alert rules for both, our production and qa environment, with

[prometheus-users] Hosted Prometheus with Grafana Cloud

2020-05-19 Thread Colton Conor
We are exploring the option of paying for Grafana Cloud's service. In addition to hosting Grafana, it comes with the ability to store metrics from Prometheus and Graphite. The documentation says:
To send data using Prometheus you need the following:
- A running instance of Prometheus.
- In

[prometheus-users] server returned HTTP status 500 Internal Server Error

2020-05-19 Thread Valliappan RM
Hi, I am trying to monitor a FortiGate firewall and am getting this error: "server returned HTTP status 500 Internal Server Error". My setup is based on this dashboard: https://grafana.com/grafana/dashboards/7567

[prometheus-users] Suggested limit for max number of series per Prometheus instance?

2020-05-19 Thread Al
I'm currently on-boarding many more metrics and hosts to our Prometheus infrastructure and I wanted to know when is it advised to shard out metrics into a separate instance? I have multiple shards (or groups) of Prometheus instances although one of the groups of instances has over 3 million

Re: [prometheus-users] Re: How to optimize High cardinality labels in Prometheus

2020-05-19 Thread Stuart Clark
If you are doing large queries which touch a lot of timeseries you will need lots of memory and CPU. Ideally you would minimise such queries, or use pre-aggregated metrics (created with recording rules) to simplify what is being requested. I'd suggest looking at what you are trying to achieve.

[prometheus-users] Re: How to optimize High cardinality labels in Prometheus

2020-05-19 Thread Dinesh Nithyanandam
Can someone please help here?
On Tuesday, May 19, 2020 at 2:25:01 AM UTC+5:30, Dinesh N wrote:
> Hi Team,
>
> I have been using Thanos-Prometheus stack and running into high cardinality issues where CPU goes till ~80% and then goes down and this happens when firing high cardinality queries

[prometheus-users] How to scale Prometheus in an organization?

2020-05-19 Thread Juergen Etzlstorfer
Hi everyone, I’ve blogified some of our learnings when it comes to scaling Prometheus in an organization. The article should help to understand the most common challenges + give guidance how to overcome them. Plus I am briefly discussing an open-source project called Keptn https://keptn.sh

[prometheus-users] Prometheus setting off Checkpoint firewalls

2020-05-19 Thread Andy Kruta
My apologies if this has been answered already, but I've looked through the configs for a setting that would allow me to define how many targets can be scraped at once and came up empty. Essentially, what I've got going on here is my prometheus is being blocked by my checkpoint firewalls (for

Re: [prometheus-users] How to get data from SYNTAX section of MIB file

2020-05-19 Thread Denis Trunov
It helped with the moduleIDsType metrics. Instead of
moduleIDsType{moduleIDsIndex="1"} 296
with
overrides:
  moduleIDsType:
    type: EnumAsInfo
I can get
moduleIDsType_info{moduleIDsIndex="1",moduleIDsType="moduleDigitalVideo12PortIO"} 1
But in other metrics I still have

Re: [prometheus-users] How to get data from SYNTAX section of MIB file

2020-05-19 Thread Brian Brazil
On Tue, 19 May 2020 at 16:51, Denis Trunov wrote:
> Hi,
> I have a very simple generator.yml file with just walk section - it works fine, I can get metrics look like
>
> moduleConfigsPowerStatus{moduleConfigsIndex="1",moduleIDsType="296"} 2
>
> moduleIDsType is described in MIB

Re: [prometheus-users] Prometheus setting off Checkpoint firewalls

2020-05-19 Thread Brian Brazil
On Tue, 19 May 2020 at 16:02, Andy Kruta wrote:
> My apologies if this has been answered already, but I've looked through the configs for a setting that would allow me to define how many targets can be scraped at once and came up empty. Essentially, what I've got going on here is my

[prometheus-users] How to get data from SYNTAX section of MIB file

2020-05-19 Thread Denis Trunov
Hi, I have a very simple generator.yml file with just a walk section. It works fine, and I get metrics that look like
moduleConfigsPowerStatus{moduleConfigsIndex="1",moduleIDsType="296"} 2
moduleIDsType is described in the MIB:
moduleIDsType OBJECT-TYPE
    SYNTAX Integer32 { moduleUnknown(0),

Re: [prometheus-users] Suggested limit for max number of series per Prometheus instance?

2020-05-19 Thread Brian Brazil
On Tue, 19 May 2020 at 14:23, Al wrote:
> I'm currently on-boarding many more metrics and hosts to our Prometheus infrastructure and I wanted to know when is it advised to shard out metrics into a separate instance? I have multiple shards (or groups) of Prometheus instances although one

[prometheus-users] codahale and jmx instrumentation

2020-05-19 Thread aditya garg
Hello guys, I saw that there are two ways to instrument a Java application: 1) using JMX, in which we need to manage the MBean server; 2) using codahale, where there is a metric registry to do the same work. Am I correct in assuming this? Secondly, how to do

[prometheus-users] Re: Alertmanager Pod is failing - CrashLoopBackOff

2020-05-19 Thread vikram yerneni
Sure Brian... I wasn't sure about the config changes earlier. Let me try it out. Thanks.
On Tuesday, May 19, 2020 at 1:48:26 AM UTC-5, Brian Candler wrote:
> What the error is saying is you tried to add a setting "teams" under opsgenie_configs, but opsgenie_configs does not recognise such an

[prometheus-users] Re: Alertmanager Pod is failing - CrashLoopBackOff

2020-05-19 Thread vikram yerneni
It worked... Thanks a lot Brian...
On Tuesday, May 19, 2020 at 1:48:26 AM UTC-5, Brian Candler wrote:
> What the error is saying is you tried to add a setting "teams" under opsgenie_configs, but opsgenie_configs does not recognise such an option. The set of allowed options is defined

Re: [prometheus-users] Prometheus setting off Checkpoint firewalls

2020-05-19 Thread Andy Kruta
Although I wish it was, unfortunately, it's not an option. The good news is that I don't have to deal with the checkpoints much longer. The bad news is that until I get rid of them, I have to silence the noise.
On Tuesday, May 19, 2020 at 10:53:38 AM UTC-5, Brian Brazil wrote:
> On Tue, 19

[prometheus-users] Diff bw MetricRegistry and Mbean Server

2020-05-19 Thread Thomas Will
Hello fellas, I searched a lot before asking here but didn't find an answer. What is the difference between MetricRegistry and the MBean Server, and in which cases should each of them be used? Have a good day. Thomas Will.

[prometheus-users] Error shutting down ActiveMQ with JMX Exporter

2020-05-19 Thread Brad Pridgeon
I'm testing this JMX Exporter with ActiveMQ to get metrics. Per instructions in this post, I setup the jar as a Java agent. Here are the changes in the activemq/bin/env

Re: [prometheus-users] derive alert severity from other labels

2020-05-19 Thread Christian Hoffmann
Hi Roland,
On 5/19/20 10:25 AM, Roland Mieslinger wrote:
> we are using the same set of alert rules for both, our production and qa environment, with the severity label set to a value based on what is appropriate for production.
> As a consequence, alert severity is too high for most alerts

Re: [prometheus-users] Monitor specific application process in Linux

2020-05-19 Thread Christian Hoffmann
Hi Juan,
On 5/19/20 9:11 AM, Juan Rosero wrote:
> I've been reading a lot on different sites and this User Group as well, but have not come up with a clear answer. I need to monitor a specific application process in Linux and verify if it's running and I've been reading about

[prometheus-users] Re: Monitor specific application process in Linux

2020-05-19 Thread Juan Rosero
Thanks everyone! I'll explore the options suggested and see what works best. Thanks again and have a great day/evening!

Re: [prometheus-users] Error shutting down ActiveMQ with JMX Exporter

2020-05-19 Thread Harald Koch
On Tue, May 19, 2020, at 18:00, Brad Pridgeon wrote:
> I'm testing this JMX Exporter with ActiveMQ to get metrics. Per instructions in this post, I setup the jar as a Java agent.

[prometheus-users] Re: Monitor specific application process in Linux

2020-05-19 Thread Juan Rosero
Thanks everyone! I'll explore the options suggested and see what works best. Thanks again and have a great day/evening!

[prometheus-users] How to generate a GUID or current time for my annotations in Prometheus alerting rule templates

2020-05-19 Thread zichen chuh
I went through the documentation on the Prometheus website and didn't find a clue. From alerting_rules, only 3 variables are available: $labels, $externalLabels, $value. Thanks in advance.

[prometheus-users] Does Prometheus cloudwatch exporter supports multiple AWS regions in a single instance

2020-05-19 Thread Jayaprakash Rangaswamy
Hello Team, I want to export AWS cloudwatch metrics which are running in multiple AWS regions. As per the Prometheus cloudwatch exporter documentation (https://github.com/prometheus/cloudwatch_exporter), I can see that the AWS region must be defined in the YAML file for metrics export. To export from

[prometheus-users] Re: NTP Metrics.

2020-05-19 Thread Yagyansh S. Kumar
Thanks for the response, Brian. I have already enabled the NTP collector on all my servers, but I still cannot see the *node_ntp_drift_seconds* metric producing any output. Apart from that, I have a couple of questions here. Firstly, why are we checking the target clock with Prometheus' server?