Re: [prometheus-users] Cannot get native Prometheus metrics from graph console

2021-02-11 Thread Christian Hoffmann
Hi, On 2021-02-12 00:21, Corey Abma wrote: Thanks for your response Christian. The Prometheus metrics endpoint is down (http://:10090/metrics). I've set up TLS encryption in a separate config file (web-config.yml) that I give as a command line argument for Prometheus. Could there be something

Re: [prometheus-users] Cannot get native Prometheus metrics from graph console

2021-02-11 Thread Christian Hoffmann
On 2021-02-11 22:57, Corey Abma wrote: Yes, I can confirm there's still an explicit scrape job. It even auto-fills in the graphing query console. That might be historical data though. I even get this from going to /metrics (see screenshot). Yeah, this confirms that Prometheus still exposes

Re: [prometheus-users] Cannot get native Prometheus metrics from graph console

2021-02-11 Thread Christian Hoffmann
Hi, On 2021-02-11 20:43, Corey Abma wrote: I recently updated my Prometheus to v2.24.1. I noticed that if I go to my endpoint localhost:10090/metrics, I can see Prometheus metrics (such as prometheus_target_interval_length_seconds) being published just fine. However, if I go to the graphing

Re: [prometheus-users] Encryption support for storing basic auth/bearer token in prometheus.yaml

2021-02-01 Thread Christian Hoffmann
Hi, On 2021-02-01 14:39, vinoth dharmalingam wrote: We see the basic auth/bearer token details are being stored in the prometheus.yaml/password file in a plain text for target scraping. Our cyber process does not allow this plain storage. Are there ways for storing it in the encrypted format

Re: [prometheus-users] inhibit_rules

2021-01-30 Thread Christian Hoffmann
Hi, On 2021-01-30 10:47, Auggie Yang wrote: I am confused by Alertmanager inhibit_rules; details as follows. Example: 8 servers, one of which always has high memory (50%) or disk usage (60%), while the others use few resources; how do you set inhibit_rules for this special but normal node;

Re: [prometheus-users] SFP 10Gb Port

2021-01-17 Thread Christian Hoffmann
Hi, On 2021-01-17 19:58, Hirad Rasoolinejad wrote: Unfortunately, we have lots of services that connect using SFP 10Gb ethernet. We have huge traffic coming to our servers, and we need to process them and analyze network traffic based on transmit and receive. Prometheus and Node Exporter

Re: [prometheus-users] Querying prometheus data/series.

2020-12-23 Thread Christian Hoffmann
On 2020-12-23 09:53, akshay sharma wrote: ONTSTAT{IDF="false",Interval="0",METype="NT",ResourceID="t",instance="172.27.173.103:8999 ",job="prometheus",rxByte="33",time="1608711202557257990",txByte="18"} 3 1) I want to perform some action on the labels of metrics above, how can we achieve

Re: [prometheus-users] combine 2 querues

2020-12-17 Thread Christian Hoffmann
Hi, On 2020-12-17 23:06, Alan Miller wrote: The problem is that my "instance" fields are IP address:port (eg: 10.123.5.5:9182). The best solution would be to fix exactly this. ;) https://www.robustperception.io/controlling-the-instance-label So this query returns the instances and what looks

Re: [prometheus-users] How do I set the maintenance cycle

2020-12-15 Thread Christian Hoffmann
On 2020-12-15 04:39, zhengwei y wrote: Does this mean That I have to maintain a timer task to create and delete the corresponding silence rules every day ? That would be one possibility. This could also be automated using cron or something. We chose a different approach and solve that on

Re: [prometheus-users] overriding alert levels?

2020-12-14 Thread Christian Hoffmann
Hi, On 2020-12-14 15:56, klavs@gmail.com wrote: I would like to adjust alert levels for f.ex. disk space - so hosts that match som tag (like environment: dev) has a different level.. I see I am not the only one with such needs - these guys even implemented their own "extension" to

Re: [prometheus-users] Best practice for gathering metrics from client sites

2020-12-14 Thread Christian Hoffmann
Hi, On 2020-12-14 11:24, Patrick Macdonald wrote: If we have a Prometheus server on site-A and an arbitrary number of client sites, each with hardware we might want monitor, is there best practice for how to achieve this? I'm assuming pushgateway isn't the correct use-case here?  Is the only

Re: [prometheus-users] Prometheus metrics repeat, and cause the promql to be unavailable.

2020-12-11 Thread Christian Hoffmann
Hi, On 2020-12-11 04:36, sowhat7 wrote: I add a label for node 10.193.32.44, and I get repeat metrics from prometheus at a certain moment. Is this a permanent issue or does it affect the time around your change only? I suspect the latter. In that case, you could use some kind of aggregation

Re: [prometheus-users] prom query time range

2020-12-10 Thread Christian Hoffmann
Hi, On 12/10/20 8:57 AM, robu...@gmail.com wrote: I struggle with a prom query, probably not so difficult when you know how to do it ;-) I monitor how many systems are up, with count(up{job="node"}). A couple of days ago this number jumped, now I'd like to find out what systems are

Re: [prometheus-users] Alert descriptions on the edge

2020-12-10 Thread Christian Hoffmann
Hi, On 12/10/20 11:33 AM, deln...@gmail.com wrote:   expr: (node_filesystem_avail_bytes{fstype=~"ext[234]|btrfs|xfs|zfs"} / node_filesystem_size_bytes{fstype=~"ext[234]|btrfs|xfs|zfs"} * 100 < 10 and node_filesystem_readonly{fstype=~"ext[234]|btrfs|xfs|zfs"} == 0)

Re: [prometheus-users] Hunting down the root cause of fluctuating node_filesystem_avail_bytes

2020-12-09 Thread Christian Hoffmann
Hi, On 12/9/20 3:48 PM, iono sphere wrote: I am not sure who could I ask this, but I would like to try here. Currently, I'm seeing something weird in my server. Thanks to Prometheus and Node-Exporter, I have seen that node_filesystem_avail_bytes has been fluctuating up and down for hundreds

Re: [prometheus-users] How do I detect a status code of 301,302 with blackbox-exporter

2020-12-09 Thread Christian Hoffmann
Hi, On 12/8/20 4:01 AM, fun...@gmail.com wrote: someone can help me? If you want to verify that your target always returns one of the two status codes, then define a custom blackbox_exporter http module with valid_status_codes: [301, 302]. I assume you will also have a Location: header in
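The module described above might look roughly like the following blackbox_exporter configuration. This is only a sketch: the module name `http_redirect` is made up for illustration, and the key that disables redirect following depends on the exporter version (older releases use `no_follow_redirects: true`, newer ones use `follow_redirects: false`), so it should be checked against the version actually in use.

```yaml
modules:
  http_redirect:              # hypothetical module name
    prober: http
    timeout: 5s
    http:
      # The probe succeeds only if the target answers with one of these codes.
      valid_status_codes: [301, 302]
      # Do not follow the redirect; otherwise the status of the final page
      # would be checked instead of the 301/302 itself.
      # Older releases use this key; newer ones use `follow_redirects: false`.
      no_follow_redirects: true
```

The scrape job would then select this module via `params: {module: [http_redirect]}`.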

Re: [prometheus-users] Is there any way to update prometheus expr?

2020-12-03 Thread Christian Hoffmann
Hi, On 12/3/20 10:06 AM, lizhihua0925 wrote: What I want to do is limit the rule to eval metrics which have the spcified label. Ah, I understood that you wanted to add labels, but you already said that you wanted to add matchers. So, just to confirm that I understand correctly: If the user

Re: [prometheus-users] Is there any way to update prometheus expr?

2020-12-03 Thread Christian Hoffmann
Hi, On 12/3/20 9:48 AM, Allenzh li wrote: Exactly, I develop a API which accept prometheus rule from web. When user create a new rule, I want to add a fixed matcher(xxId="xxx"), which label name is fixed and label value is various. eg.     cpu_usage / avg_over_time(cpu_usage[5m] offset 24h)

Re: [prometheus-users] Is there any way to update prometheus expr?

2020-12-03 Thread Christian Hoffmann
Hi, On 12/2/20 4:51 AM, Allenzh li wrote: Hi, recently, I want to add a label to all alerting expr, is there any way to acheive that? What exactly are you trying to do? Technically, you can use alert relabeling to add a static (or at least deterministic) label to each outgoing alert.
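For reference, alert relabeling with a static label could be sketched like this in prometheus.yml. The label name `origin`, its value, and the Alertmanager address are assumptions chosen for illustration, not something taken from the thread.

```yaml
# prometheus.yml (fragment)
alerting:
  alert_relabel_configs:
    # No source_labels are given, so the default regex matches and the
    # replacement is written to target_label on every outgoing alert.
    - target_label: origin          # hypothetical label name
      replacement: prometheus-a     # hypothetical static value
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']   # assumed Alertmanager address
```

Note that this only changes the labels on alerts sent to Alertmanager; it does not add matchers to the rule expressions themselves.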

Re: [prometheus-users] Promethus metrics caluclated in MB or Bytes or KB

2020-11-26 Thread Christian Hoffmann
Hi, On 11/27/20 7:18 AM, Bharathwaj Shankar wrote: I can see various metrics in Prometheus; in which unit are they calculated? As per the official documentation, all values should be in base units. That would be bytes: https://prometheus.io/docs/practices/naming/#base-units This

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread Christian Hoffmann
On 11/25/20 9:58 AM, 'Kunal Khandelwal' via Prometheus Users wrote: root@ARL-KUNAL:/home/kunal/Documents/Prometheus/prometheus-2.22.2.linux-amd64# systemctl cat prometheus.service # Warning: prometheus.service changed on disk, the version systemd has loaded is outdated. # This output shows the

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread Christian Hoffmann
s \   --web.console.libraries=/etc/prometheus/console_libraries \   --web.listen-address=0.0.0.0:9090 SyslogIdentifier=prometheus Restart=always [Install] WantedBy=multi-user.target On Wednesday, November 25, 2020 at 2:01:58 PM UTC+5:30 Christian Hoffmann wrote: Hi, On 11/25/20 9

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread Christian Hoffmann
Hi, On 11/25/20 9:19 AM, 'Kunal Khandelwal' via Prometheus Users wrote: I am facing an issue while starting Prometheus Service in Ubuntu., it's throwing the following: In the future, could you please start a new thread? The way you posted makes it appear that your issue is somehow related to

Re: [prometheus-users] Storing log events data

2020-11-23 Thread Christian Hoffmann
Hi, On 11/23/20 10:41 PM, kiran wrote: > I am trying to push log events for lambda functions in Prometheus. > I am trying to see if we can even save this kind of data and if so any > recommended structure in Prometheus. E.g whenever a lambda function is > invoked, AWS puts log events data and

Re: [prometheus-users] Time of day alert

2020-11-23 Thread Christian Hoffmann
Hi, On 11/21/20 2:41 PM, Aleksandar Ilic wrote: I was wondering if there is any way to set an alert only to be triggered at a specific time of day. As I saw for alertmanager there is PR open on GitHub but wondering if there is any workaround for this or any other way. What David suggested

Re: [prometheus-users] Correctly using metric_relabel_configs

2020-11-19 Thread Christian Hoffmann
Hi, On 11/19/20 11:32 PM, Laurent Dumont wrote: > collectd_openstack_nova_gauge{exported_instance="site1-director.potato.com > ",instance="123.123.123.123:9103 >

Re: [prometheus-users] Debugging OOM issue.

2020-11-09 Thread Christian Hoffmann
Hi, On 11/9/20 10:56 AM, yagyans...@gmail.com wrote: > Hi. I am using Promtheus v 2.20.1 and suddenly my Prometheus crashed > because of Memory overshoot. How to pinpoint what caused the Prometheus > to go OOM or which query caused the Prometheus go OOM? Prometheus writes the currently active

Re: [prometheus-users] Case insensitive regex for Alertmanager

2020-10-25 Thread Christian Hoffmann
Hi, On 10/25/20 12:42 PM, Shubham Choudhary wrote: > Can we write Case insensitive regex for Alertmanager? > > For example, label is {team="analytics"} but sometimes it can be > {team="Analytics"} or {team="ANALYTICS"} > > - receiver: opsgenie-ANALYTICS_Prometheus >   match_re: >     chef_env:

Re: [prometheus-users] ssl expiry notification

2020-10-25 Thread Christian Hoffmann
Hi, On 10/23/20 5:54 PM, barnyb...@gmail.com wrote: > Hello my friends. > I'm using ribbybibby/ssl_exporter for checking SSL expiry > for some services. All works fine but I would like to add more > information to

Re: [prometheus-users] Removing old data it not happening if "no space left on device" regardless retention params

2020-10-25 Thread Christian Hoffmann
Hi, On 10/23/20 12:56 AM, Shox wrote: > We are experiencing a catch22, when there is "no space left on device" > on attempt to write to /wal and disk space is not freed. After some > investigation, it looks like removing old data is happening only after > compaction, but compaction can't happen

Re: [prometheus-users] blackbox exporter's probe_ssl_earliest_cert_expiry giving negative values

2020-10-20 Thread Christian Hoffmann
Hi, On 10/20/20 11:25 AM, deln...@gmail.com wrote: > I understand there's an ongoing discussion > on this > issue. How do you prevent false(or true) alerts when one of the > applications is providing multiple certs and one of these has

Re: [prometheus-users] Re: Delta usage issues?

2020-10-17 Thread Christian Hoffmann
On 10/17/20 2:31 AM, li yun wrote: > sum_over_time(isphone{name="qq",exname!~"test|test1"}[5m > ])-sum_over_time(isphone{name="qq",exname!~"test|test1"}[5m])offset 5m Try placing the offset modifier right next to the metric name: sum_over_time(isphone{name="qq",exname!~"test|test1"}[5m]) -
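Since the reply is cut off here, the following is only a reading of the advice: `offset` belongs directly after the range selector, inside the function call, not after the closing parenthesis. It is wrapped in a recording rule purely to show it in a rule file; the group and rule names are hypothetical, and the expression can just as well be pasted into the query console on its own.

```yaml
groups:
  - name: isphone_delta                     # hypothetical group name
    rules:
      - record: isphone:sum_5m:delta_5m     # hypothetical rule name
        # Current 5m window minus the 5m window before it.
        expr: |
          sum_over_time(isphone{name="qq",exname!~"test|test1"}[5m])
          -
          sum_over_time(isphone{name="qq",exname!~"test|test1"}[5m] offset 5m)
```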

Re: [prometheus-users] Re: How to find newly added indicators

2020-10-17 Thread Christian Hoffmann
On 10/17/20 2:28 AM, li yun wrote: > Because my data is collected every 5 minutes If possible, try fixing that. The maximum sane scrape interval is 2m. If nothing else helps, you may consider hiding the problem by using recording rules (with sum_over_time) or changing --query.lookback-delta. >, I

Re: [prometheus-users] Re: How to find newly added indicators

2020-10-14 Thread Christian Hoffmann
Hi, On 10/13/20 12:22 PM, li yun wrote: > For example, the following situation > *isphone{name="user",exname~"13"}* > This exname program will continuously collect a lot of monitoring > indicators, but I want to know which indicators have been added in a > certain period of time > 在2020年10月13日星期二

Re: [prometheus-users] Time drift between my browser and server

2020-10-14 Thread Christian Hoffmann
Hi, On 10/13/20 9:06 AM, HENG KUAN WEE _ wrote: > Therefore, I believe that no data being shown is due to the following > error on the Prometheus UI: "*Warning!* Detected 5508.10 seconds time > difference between your browser and the server. Prometheus relies on > accurate time and time drift

Re: [prometheus-users] Monitor email incoming

2020-09-26 Thread Christian Hoffmann
Hi, On 9/24/20 1:09 PM, Igor pember wrote: > How do I use Prometheus to monitor email system? > For example, I want to monitor gmail, I sent email to a test account > every hour, if gmail system can’t receive messages, the alerts will be > raised. You would need some mechanism for sending the

Re: [prometheus-users] Re: Exemplars: Jump from Grafana to Traces (a dream come true)

2020-09-26 Thread Christian Hoffmann
On 9/25/20 3:59 PM, 'Thomas Güttler' via Prometheus Users wrote: > Is it already possible to let Prometheus export Exemplars (trace-ids)? I think there has been some progress meanwhile, but I haven't seen something externally usable yet. Issues might give a good clue:

Re: [prometheus-users] different alert thresholds per service

2020-09-18 Thread Christian Hoffmann
Hi, On 9/18/20 7:53 AM, Bhupendra kumar wrote: > My question is how can configure Prometheus alert thresholds per service > and as well as server. > > Example: I have two machine 1 is (webserver 1) and 2 is (webserver 2). > > server 1 alert receive after 5 minute > server 2 alert receive after

Re: [prometheus-users] Prettifying and simplifying metrics/visualizations

2020-09-15 Thread Christian Hoffmann
On 9/15/20 10:55 AM, John Dexter wrote: > I'm still finding my feet with Prometheus and one thing that is a bit > awkward is that time-series names are pretty cumbersome. We want a > customer-facing dashboard so let's say I want to monitor network activity: > > rate(windows_net_packets_total[2m])

Re: [prometheus-users] Query when executet with prometheus web interface, but doesn't work with Grafana

2020-09-14 Thread Christian Hoffmann
On 9/14/20 4:20 PM, dykow wrote: >> Maybe promxy, Thanos, Cortex or VictoriaMetrics may be better solutions >> for you. > I've done some research and I don't understand how these tools are > better than prometheus' native federation functionality, except HA and > long retention capabilities. >

Re: [prometheus-users] Query when executet with prometheus web interface, but doesn't work with Grafana

2020-09-14 Thread Christian Hoffmann
Hi, On 9/14/20 1:58 PM, dykow wrote: > I am querying vmware_exporter metrics with the following: > (vmware_vm_mem_usage_average / on(host_name) group_left(instance, > cluster_name, dc_name, monitoring_name) vmware_host_memory_max) * 100 > > The Prometheus web interface is returning metrics, but

Re: [prometheus-users] compression in prometheus

2020-09-01 Thread Christian Hoffmann
Hi, On 9/1/20 2:50 PM, Rodolphe Ghio wrote: > I am currently doing an internship and my tutor asked me to write an > algorithm to compress Prometheus data. What do you think about that, is > it possible? I think there are lots of resources regarding Prometheus' encoding which is already supposed to

Re: [prometheus-users] /metrics endpoint not showing metrics scraped from applications

2020-08-29 Thread Christian Hoffmann
Hi, On 8/28/20 10:08 PM, 'Rounak Salim' via Prometheus Users wrote: > I'm unfamiliar with federation but it seems like it's mostly used for > pulling data from multiple Prometheus instances into a single instance. > Do all the scraped metrics show up on the /federate URL if federation is > setup?

Re: [prometheus-users] Prometheus/Zookeeper version

2020-08-27 Thread Christian Hoffmann
Hi, On 8/27/20 4:49 PM, My Me wrote: > Are you saying that Prometheus doesn't use Zookeeper internally ? I guess you are both right, somehow. :) Prometheus does not require Zookeeper and it's not a core feature. However, Prometheus can use Zookeeper for its service discovery. It therefore

Re: [prometheus-users] Correlating different alerts to produce single alert

2020-08-26 Thread Christian Hoffmann
Hi, On 8/27/20 2:19 AM, radhamani...@gmail.com wrote: > If pods in two different namespaces go down,then we need to send a alert > as an appA is down.. > Can I simply write expr as Kubepoddown_in_namespaceA and > Kubepoddown_in_namespaceB ,and send alert message as "AppA is down"? Yes, that's

Re: [prometheus-users] go_memstats_alloc_bytes

2020-08-26 Thread Christian Hoffmann
Hi, On 8/26/20 1:27 PM, deepak...@gmail.com wrote: > How can i pull the metrics related to a application from the server > using Prometheus? You would usually instrument the application itself (by using one of the language clients such as client_python, client_golang, etc.). If this isn't

Re: [prometheus-users] /metrics endpoint not showing metrics scraped from applications

2020-08-26 Thread Christian Hoffmann
Hi, On 8/27/20 3:16 AM, 'Rounak Salim' via Prometheus Users wrote: > The /metrics endpoint only shows me the metrics for the Prometheus > server and none of the metrics scraped by Prometheus from other > applications are shown. > > How can I get all my metrics to be shown in the /metrics

Re: [prometheus-users] Correlating different alerts to produce single alert

2020-08-26 Thread Christian Hoffmann
Hi, On 8/26/20 11:00 PM, radhamani...@gmail.com wrote: > I want to send alert by  doing some correlation based upon multiple > alerts.For eg: if podA,podB,serviceA are all 100% down in two different > namespaces(namespace1,namespace2),then  I want to send alert like > ApplicationA is down. Is

Re: [prometheus-users] Alerts / Alert Manager

2020-08-26 Thread Christian Hoffmann
Hi, On 8/20/20 12:47 PM, 'azha...@googlemail.com' via Prometheus Users wrote: > I have 2 alerts  > > - The first being to fire if CPU is more then 70% (WMI) > > - The second  to report whether an instance is down > > 100 - (avg by(instance) (rate(wmi_cpu_time_total{mode="idle"}[2m])) * > 100)

Re: [prometheus-users] Wal inclusion in retention.size

2020-08-26 Thread Christian Hoffmann
Hi, On 8/21/20 10:59 AM, Venkata Bhagavatula wrote: > In the https://prometheus.io/docs/prometheus/2.20/storage/ link, > Following is mentioned regarding retention.size > > |--storage.tsdb.retention.size|: [EXPERIMENTAL] This determines the > maximum number of bytes that storage blocks can use

Re: [prometheus-users] blackbox probe : x509: certificate signed by unknown authority even with insecure_skip_verify set to true

2020-08-26 Thread Christian Hoffmann
Hi, On 8/26/20 1:26 PM, Marion Guthmuller wrote: > I'm trying to monitor a website with prometheus and blackbox exporter. > Each of them are running inside a docker (images pulled from official > docker hub https://hub.docker.com/r/prom/prometheus and >

Re: [prometheus-users] Prometheus and metric_relabel_configs

2020-08-16 Thread Christian Hoffmann
Hi, On 8/16/20 8:43 PM, Thomas Berger wrote: > I already searched for it (HTTP API), but found no information about it. > However, under localhost: 9090 (insert metric at cursor) all metrics are > displayed. > Above the Execute button, the input field says Expession, > but i don't know how can

Re: [prometheus-users] Prometheus and metric_relabel_configs

2020-08-16 Thread Christian Hoffmann
Hi, On 8/16/20 12:33 AM, Thomas Berger wrote: > with the exporters i get a large number of metrics. > However, I don't need all of them. > I don't want to save these and remove them from the DB. > This is described in the documentation with metric_relabel_configs. > > Example to remove all go:.

Re: [prometheus-users] Conditional routing to alertmanager

2020-08-13 Thread Christian Hoffmann
Hi, On 8/13/20 7:56 PM, Johny wrote: > i have a requirement which requires routing only alerts for > pagerduty to a specific alert manager mesh due to proxy set up. All > other alerts are sent to default alertmanager. Is it possible to do > this conditional routing in prometheus alerting

Re: [prometheus-users] Grouping of alarms (group_interval, group_wait and repeat_interval)

2020-08-12 Thread Christian Hoffmann
Hi, On 8/12/20 3:41 PM, rosaLux161 wrote: > If alert 1 and alert 2 occur simultaneously or in a very short time, > then only one alert should be sent out. If alert 2 only occurs after > some time, then another alert should be sent. The latter does not work. > If alert 2 occurs, nothing happens.

Re: [prometheus-users] Silenced alerts and web.hook receiver

2020-08-10 Thread Christian Hoffmann
Hi, On 8/9/20 4:47 PM, Anthony Dakhin wrote: > I'm new to Prometheus and currently trying to implement a proxy between > Alertmanager and our monitoring system. I've configured web.hook > receiver based on sample Alertmanager config, so I'm able to receive > POSTs when alert is firing or resolved

Re: [prometheus-users] AlertManager: how the time in message generated?

2020-08-10 Thread Christian Hoffmann
Hi, On 8/10/20 8:51 AM, leiwa...@gmail.com wrote: > A pushed the alert message wo wechat:  > >   [Warning]: > 2020-08-10 03:03:53 > description = zhaoqing : 1081.122236   > > I want to know how the time i  marked as red is generated and what it > represents? > It is  about 40 mins earlier than

Re: [prometheus-users] alert label with variable for pushgateway metrics

2020-08-09 Thread Christian Hoffmann
Hi, On 8/10/20 7:06 AM, Aravind Poojari wrote: > We are facing an issue while writing alert rules for the above jobs & > instances. > We are unable to use a template so we have to write the alert rules for > each and every job and their respective instances. It's kind of hard as > instances keep

Re: [prometheus-users] How can I Send Individual Alert for Different Servers on Different usage Criteria

2020-08-08 Thread Christian Hoffmann
Hi, On 8/5/20 3:45 PM, Pachha Gopi wrote: > Hi @Christian I am facing an issue that my Alertmanager is not sending > the alerts at regular intervals. How can I configure my Alertmanager to > send alerts to my Slack? Are you looking for the repeat_interval option, which defaults to 4h?
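A minimal alertmanager.yml sketch showing where repeat_interval lives, assuming a single Slack receiver. The receiver name, channel, and webhook URL are placeholders, and the timing values are examples rather than recommendations.

```yaml
route:
  receiver: slack-ops            # hypothetical receiver name
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  # How long to wait before re-sending a notification for an alert
  # that is still firing. Defaults to 4h when omitted.
  repeat_interval: 1h
receivers:
  - name: slack-ops
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'   # placeholder webhook URL
        channel: '#alerts'       # hypothetical channel
        send_resolved: true
```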

Re: [prometheus-users] How can I Send Individual Alert for Different Servers on Different usage Criteria

2020-08-05 Thread Christian Hoffmann
Hi, On 8/5/20 2:40 PM, Pachha Gopi wrote: > I am Using Prometheus for my Production Servers,my question is is there > any way that we can configure different alerts for individual server. > for example I have 3 Servers ,Server1 cpu usage is 20% ,Server 2 cpu > usage is 30% and Server 3 cpu usage

Re: [prometheus-users] Transmission of (Database) Table Data to Prometheus

2020-08-05 Thread Christian Hoffmann
On 8/3/20 8:34 AM, 'Píer Bauer' via Prometheus Users wrote: > Due to the fact that my query (in real world) contains several thousand > rows of output, I would like to pursue a generic approach to avoid > setting a separate PowerShell variable for each table cell data... > > > But currently I

Re: [prometheus-users] Alertmanager: resolved message received immediately after warning message even resolve_timeout is 5m

2020-08-05 Thread Christian Hoffmann
On 8/5/20 10:21 AM, leiwa...@gmail.com wrote: > rules.yml > groups: > - name: network-delay >   rules: >   - alert: "network delay" >     expr: probe_duration_seconds * 1000 > 3000 >     for: 1s >     labels: >       severity: warning >       team: ops >     annotations: >       description:

Re: [prometheus-users] Prometheus memory issue

2020-08-05 Thread Christian Hoffmann
Hi, On 8/4/20 12:24 PM, Vinod M V wrote: > >           I am facing Memory usage with Prometheus service and > Maintaining 30 days of data from Node exporter, Process exporter and JMX > exporter for 95 servers in Prometheus Database.  > >          Grafana and Prometheus are running on the same

Re: [prometheus-users] Able to specify bind port, but not address

2020-08-05 Thread Christian Hoffmann
Hi, On 8/4/20 3:23 PM, jumble wrote: > Latest prometheus, on RHEL8. > > Observed behavior: bound to |127.0.0.1:9090| This sounds unexpected. Are you using the official binaries from prometheus.io / github? Can you share the exact logs from your experiments? Is it possible that you've got

Re: [prometheus-users] blackbox dns probe failed

2020-08-05 Thread Christian Hoffmann
Hi, On 8/4/20 10:54 AM, e huang wrote: > ts=2020-08-04T05:41:58.646Z caller=main.go:169 > module=dns_eboss.enmonster.com target=10.208.100.9 level=debug > msg="Error while sending a DNS query" err="read udp4 10.208.100. > 10:36709->10.208.100.9:53: i/o timeout" > ts=2020-08-04T05:41:58.646Z

Re: [prometheus-users] How to prevent sending resolve notification after resolve_timeout?

2020-08-05 Thread Christian Hoffmann
Hi, On 8/4/20 2:21 PM, shiqi chai wrote: > Hey guys,I have a problem with configuration of resolve_timeout. As > it means, a notication of resolved will be send after the timeout. > But actually the issue still be firing, it disturb the correct > resolved notification How can I prevent it? Not

Re: [prometheus-users] Remove orphan alert in Prometheus

2020-07-25 Thread Christian Hoffmann
Hi, On 7/25/20 7:41 AM, jro...@gmail.com wrote: > Somehow I ended with an alert from a Prometheus scrape job that was > removed at some point and now I got this orphan alert that it's been > triggered and being sent to the Alertmanager configured receiver. How > can I remove this alert? Can I

Re: [prometheus-users] How do I to determine if a single alert out of a set of grouped together alerts common label has a certain value?

2020-07-21 Thread Christian Hoffmann
On 7/16/20 11:51 PM, 'a z' via Prometheus Users wrote: > I am unsure how within Prometheus/Alertmanager templating how I can > check if one of the alert labels has a certain value. In Alertmanager templates, you can use Go's template syntax which allows for the "eq" function. You can see an
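As a rough illustration of the "eq" approach, a receiver template can loop over the grouped alerts and test a label on each one. The `severity` label, the `summary` annotation, and the receiver and channel names are assumptions, and the sketch presumes `slack_api_url` is set in the global section.

```yaml
receivers:
  - name: team-chat              # hypothetical receiver name
    slack_configs:
      - channel: '#alerts'       # hypothetical channel
        # Prefix only those alerts in the group whose severity label
        # equals "critical"; the other alerts are listed without a prefix.
        text: >-
          {{ range .Alerts }}{{ if eq .Labels.severity "critical" }}CRITICAL: {{ end }}{{ .Annotations.summary }}
          {{ end }}
```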

Re: [prometheus-users] node exporter process in defunct state, unable to restart

2020-07-21 Thread Christian Hoffmann
Hi, On 7/21/20 10:01 PM, Lakshman Savadamuthu wrote: > just FYI, there are few other hosts in this cluster, where node_exporter > is running just fine without any issues. > We have started the process using systemctl command, here is the service > file: > > # cat

Re: [prometheus-users] node exporter process in defunct state, unable to restart

2020-07-21 Thread Christian Hoffmann
Hi, On 7/21/20 9:34 PM, Lakshman Savadamuthu wrote: > Thanks for the reply Christian. > Looks like the node_exporter is in defunct state, i can't even stop the > process now. > > Here is the version: > > [root@mesosagent13 ~]# /usr/local/bin/node_exporter --version > > node_exporter, version

Re: [prometheus-users] How can I check which specific version of node exporter was installed together with the deb package?

2020-07-21 Thread Christian Hoffmann
Hi, On 7/21/20 8:54 PM, mordowiciel wrote: > I've installed the 0.15.2+ds version of the prometheus-node-exporter deb > package on Ubuntu 18.04. I was expecting that it would contain the > prometheus-node-exporter version 0.15.2 too, but looking at the names of > the exported metrics, I can see

Re: [prometheus-users] node exporter process in defunct state, unable to restart

2020-07-21 Thread Christian Hoffmann
Hi, On 7/21/20 8:58 PM, Lakshman Savadamuthu wrote: [...] > Jul 21 11:51:21 mesosagent13.xstackstage1.infosight.nimblestorage.com > node_exporter[35895]: time="2020-07-21T11:51:21-07:00" level=info msg=" > - diskstats" source="node_exporter.go:104" > > Jul 21 11:51:21

Re: [prometheus-users] How to set the maximum number of alertmanager group_by

2020-07-18 Thread Christian Hoffmann
Hi, On 7/17/20 5:26 AM, long wrote: > When I set up group_by in alertmanager.yml, I have an alert manager with > 25 alerts, but it will be split into 3 messages each with a maximum of > 10 alerts. How do I set the maximum number of alerts in 1 message? I haven't heard of an alertmanager-side

Re: [prometheus-users] Prometheus AlertManager filter.

2020-07-14 Thread Christian Hoffmann
On 7/14/20 8:45 PM, Zhang Zhao wrote: > Hi Christian, > After I updated the config below, seems everything stopped feeding to > ServiceNow even the ones with “inc:servicenow” label.. Any idea? Hrm, your config looks fine to me. Can you show us an example alert definition which is not routed

Re: [prometheus-users] Prometheus AlertManager filter.

2020-07-14 Thread Christian Hoffmann
Hi, On 7/14/20 7:24 PM, Zhang Zhao wrote: >> I added a filter in the alertmanager config so that only alerts that >> contain "inc:servicenow" label are able to be fed to ServiceNow. >> However it didn't work as expected. I still saw events that do not >> contain this label getting fed to

Re: [prometheus-users] Blacbox exporter soap check 500

2020-07-14 Thread Christian Hoffmann
Hi, On 7/14/20 11:17 AM, Yusuf Dönmez wrote: [...] >   - source_labels: [__address__] > target_label: __param_target >   - source_labels: [__param_target] > target_label: instance >   - target_label: __address__ > replacement: [...] > - labels: > module: 

Re: [prometheus-users] Re: Storage Retention 15d for Prometheus

2020-07-13 Thread Christian Hoffmann
On 7/13/20 3:58 PM, Bhupendra kumar wrote: > Yes pls check. The last line looks like the restart was not successful. This might mean that Prometheus is still running with an older cmdline. Can you check the process list? ps aux | grep prometheus or something? I suspect it might still say 15d. If

Re: [prometheus-users] Sent resolved or inactive status (alertmanager)

2020-07-13 Thread Christian Hoffmann
Hi, On 7/13/20 1:27 PM, Dmitry wrote: > Hello! > I have standard rule for prometheus alertmanager: >   rules: >   - alert: Instance_down > expr: up == 0 > for: 1m > # Labels - additional labels to be attached to the alert > labels: >   severity: 'critical' > annotations: >

Re: [prometheus-users] Monitor process in RHEL 6

2020-07-07 Thread Christian Hoffmann
Hi, On 7/7/20 7:49 PM, Krishnan Subramanian wrote: > Hi, am looking at monitoring processes in RHEL 6 via node exporter.  For > RHEL 7 and above i can use the node exporter --collector.systemd.   > > I am looking at similar option in RHEL 6? is there a way possible?   There are multiple

Re: [prometheus-users] Defining Prometheus alerts with different thresholds per node

2020-07-02 Thread Christian Hoffmann
Hi, On 7/2/20 2:17 AM, LabTest Diagnostics wrote: > I've written some alerts for memory usage (for windows nodes) that look > like this: > > | > expr:100*(windows_os_physical_memory_free_bytes)/(windows_cs_physical_memory_bytes)<70 > | > > Currently, any server that exceeds 70% of available mem

Re: [prometheus-users] Re: Is it possible to use REST Service to provide data for prometheus

2020-07-02 Thread Christian Hoffmann
Hi, On 7/2/20 8:15 AM, Thorsten Stork wrote: > another question to this: How/where will the endpoint of my REST service > configured, so prometheus will call ist it and get the values at the > actual time ? That's the scrape configuration.
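A scrape configuration for such an endpoint could look roughly like this. The job name, host, port, and interval are hypothetical; the one firm requirement is that the service answers the request with metrics in the Prometheus text exposition format.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: middleware                  # hypothetical job name
    metrics_path: /metrics                # the default; change it if the service uses another path
    scrape_interval: 30s
    static_configs:
      - targets: ['middleware.example.com:8080']   # hypothetical host:port
```

Prometheus then calls the endpoint on every scrape interval and stores whatever values the service returns at that moment.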

Re: [prometheus-users] Is it possible to use REST Service to provide data for prometheus

2020-07-01 Thread Christian Hoffmann
Hi, On 7/1/20 5:25 PM, Thorsten Stork wrote: > I am really new to Prometheus, but I have to evaluate monitoring > functionality > for a system. > > On this system (middleware) we can provide REST services, HTTP services > or web services which can deliver some metrics at the current time

Re: [prometheus-users] Multiple remote_write

2020-07-01 Thread Christian Hoffmann
Hi, On 7/1/20 4:44 PM, Ramachandra Bhaskar Ayalavarapu wrote: > Is it possible for a single Prometheus to have multiple remote_write adapters > depending on jobs ? > For example job1 should be writing to r1 (cortex) and r2 to another > remote_write instance ? Although I'm not using it: As far

Re: [prometheus-users] Is it possible to extract labels when generating AlertManager alert ?

2020-06-30 Thread Christian Hoffmann
Hi, On 6/25/20 8:55 PM, Sébastien Dionne wrote: > I have few java applications that I'll deploy in my cluster.  I need to > know how can I detect if a instance is up or down with Prometheus.  > > *Alerting with AlertManager* > * > * > I have a alert that check for "instanceDown" and send a alert

Re: [prometheus-users] prometheus is scraping metrics from an instance which has no exporter running on

2020-06-30 Thread Christian Hoffmann
Hi, On 6/23/20 8:32 AM, Yashar Nesabian wrote: > Hi, > A few days ago I realized IPMI exporter is not running on one of our > bare metals but we didn't get any alert from our Prometheus. Although I > cannot get the metrics via curl on the Prometheus server, our Prometheus > is scraping metrics

Re: [prometheus-users] Is there a good grok user group? I need a pattern!

2020-06-30 Thread Christian Hoffmann
Hi, On 6/24/20 11:14 AM, Danny de Waard wrote: > Prometheus users, > > Who of you knows a good grok site/group/knowledge base where i can > figure out my pattern. > I can not figure out how to get my ssl log good in grok. Looks like this is used in Logstash, maybe you can ask there?

Re: [prometheus-users] Issues with group_left to exclude specific label value

2020-06-30 Thread Christian Hoffmann
Hi, On 6/25/20 4:05 PM, Al wrote: > All hosts from which I collect node_exporter metrics each have an > additional node_role metric (added via textfile collector) which > identifies all the Chef roles a given host has.  As an example, say we > have 3 hosts with the following textfile collector

Re: [prometheus-users] Merging too prometheus datasources on the same grafana dashboard

2020-06-30 Thread Christian Hoffmann
Hi, On 6/29/20 11:43 AM, Daly Graty wrote: > I have two Grafana servers: the first one is monitoring Kubernetes, installed on > the master; the second is on a separate VM. Both are pinging! > I need to merge both of them in order to access them with the same URL. > I tried to add the Kubernetes Prometheus (

Re: [prometheus-users] Custom Threshold for a particular instance.

2020-06-30 Thread Christian Hoffmann
Hi, On 6/24/20 8:09 PM, yagyans...@gmail.com wrote: > Hi. Currently I am using a custom threshold in case of my Memory alerts. > I have 2 main labels for my every node exporter target - cluster and > component. > My custom threshold till now has been based on the component as I had to > define

Re: [prometheus-users] Prometheus timeseries and table panel in grafana

2020-06-30 Thread Christian Hoffmann
Hi, On 6/23/20 5:43 PM, neel patel wrote: > I am using prometheus and grafana combo to monitor PostgreSQL database. > > Now prometheus stores the timeseries as below. > > disk_free_space{file_system="/dev/sda1",file_system_type="xfs",mount_point="/boot",server="127.0.0.1:5432"} > 9.5023104e+07

Re: [prometheus-users] disk speed

2020-06-30 Thread Christian Hoffmann
Hi, On 6/23/20 4:45 PM, 'Metrics Searcher' via Prometheus Users wrote: > Does anyone know how to collect the disk speed, like I can do it via > hdparm or dd? I don't know of a standard solution for this. Also, your examples are performance metrics which cannot be collected passively and

Re: [prometheus-users] Job label in file-based SD

2020-06-29 Thread Christian Hoffmann
Hi, On 6/26/20 10:33 AM, Björn Fischer wrote: > I was going through the guide for file-based service discovery [1] and > noticed that they are setting the job label in the targets file. That > doesn't make sense to me. Targets are not strictly job-specific and > Prometheus is setting the job

Re: [prometheus-users] Alert handling using alertmanager even handler .

2020-06-29 Thread Christian Hoffmann
Hi, On 6/30/20 7:51 AM, Pooja Chauhan wrote: > Hi Christian, > Can you please give me the link to the official document you are referring to? This is the official documentation outlining the alert rule syntax: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/ Kind regards,

Re: [prometheus-users] Prometheus Query/alerting rules related to NFS Detach mount using node-exporter mountstats and nfs collector does not work

2020-06-29 Thread Christian Hoffmann
Hi, On 6/26/20 12:50 PM, Satyam Vishnoi wrote: > I should get alert when below given /mapr nfs mount-point get detached . > > > I am using following 2 metrics provided by node-exporter collector > mountstats and nfs . > > > Query-1 absent(node_filesystem_size_bytes { >

Re: [prometheus-users] Alert handling using alertmanager even handler .

2020-06-29 Thread Christian Hoffmann
Hi, On 6/28/20 3:23 PM, Pooja Chauhan wrote: > I want to handle alerts like a Jenkins process being down using the Alertmanager > event handler. But the documentation is not helping me with how to configure > it. I really need help with where to download this from: > https://github.com/jjneely/am-event-handler and

Re: [prometheus-users] Prometheus 2.18 incompatibility with 2.04

2020-06-20 Thread Christian Hoffmann
On 6/20/20 5:31 PM, Johny wrote: > If it is a non-compliant endpoint, the problem should appear in both > versions, shouldn't it? It is affecting more than one series. The setup is > in a corporate org, so I cannot expose endpoints publicly. Maybe you can build a small reproducer: Grab your metrics via

Re: [prometheus-users] Prometheus javamelody

2020-06-19 Thread Christian Hoffmann
Hi, On 6/16/20 12:09 PM, Shivam Soni wrote: > I got an issue in Prometheus configure to java melody. > can anyone solve this? > plz check URL: > > https://github.com/prometheus/prometheus/issues/7404 I'm seeing a small but maybe relevant difference between your Prometheus config and your test

Re: [prometheus-users] Alertmanger "Not Grouped" alerts

2020-06-19 Thread Christian Hoffmann
Hi, On 6/19/20 8:34 AM, Romenyrr wrote: > I've come across this issue where I'm grouping by 'alertname' but > nothing is being grouped except for one odd group. When I click on the > group tab and click on "Enable custom grouping" that seems to sort > everything by 'alertname'.  > > This

Re: [prometheus-users] Error while writing an alert rule in alert.yml file

2020-06-19 Thread Christian Hoffmann
Hi, On 6/19/20 4:20 PM, Isabel Noronha wrote: >  This is just code snippet of my alerts.yml file > - alert: ContainerKilled >     expr: IF absent(((time() - container_last_seen{name=".+"}) < 5)) >     for: 15s >     labels: >       severity: warning >     annotations: >       summary: "Container

Re: [prometheus-users] Prometheus AlertManager Alert Grouping

2020-06-18 Thread Christian Hoffmann
On 6/18/20 3:00 AM, Zhang Zhao wrote: > Hi, I have a question for alert grouping in AlertManager. I integrated > Prometheus Alerts to ServiceNow via Webhook.  I see the events were > captured on ServiceNow side as below. However, inside each of events > below, there were multiple alerts included.
