[prometheus-users] Re: better way to get notified about (true) single scrape failures?

2023-05-12 Thread Christoph Anton Mitterer
Hey Brian On Wednesday, May 10, 2023 at 9:03:36 AM UTC+2 Brian Candler wrote: It depends on the exact semantics of "for". e.g. take a simple case of 1 minute rule evaluation interval. If you apply "for: 1m" then I guess that means the alert must be firing for two successive evaluations

[prometheus-users] How AlertManager webhook notification retry works in case the receiver is down for 1 hour?

2023-05-12 Thread Abdul
Hi All, I have a question regarding AlertManager webhook notifications in scenarios where the receiver is down for between 5 and 60 minutes. Will AlertManager send the alerts that occurred during this time once the receiver is back up? For example, an AlertManager webhook is configured at 10:00 AM and alerts are

[prometheus-users] How to use Jmx exporter JavaAgent as a sidecar container

2023-05-12 Thread tantan hngo
As the title suggests, I want to run the javaagent in the same pod as our Tomcat. How do I run the javaagent commands against our .war file that is in the other container? I tried just running the agent by itself and letting it listen on the port that we've defined in our Tomcat to expose its JMX

[prometheus-users] Prometheus JMX Exporter Help

2023-05-12 Thread tantan hngo
Hello everyone, I'm not quite sure if this is where I should be asking for help, but I'm kind of befuddled about this whole situation. I'm trying to set up https://github.com/prometheus/jmx_exporter for our containerized Java application on our cluster. Specifically the JavaAgent as we are

[prometheus-users] Seeking help with Prometheus empty value problem

2023-05-12 Thread 张星
When the value of query_result_sys_network_track_error is empty, the result is also empty. I want it to default to 0, regardless of whether the minuend or subtrahend is empty. I tried using vector(0), but when used together with sum by, the same problem occurs. My current query is: ```

[prometheus-users] Deleting all segments newer than corrupted segment

2023-05-12 Thread Jérôme Loyet
Hi, this morning we noticed that a Prometheus server with 3.3TB of metrics had stopped returning metrics older than ~2h30. The disk still held 3.3TB of data. When I restarted the Prometheus server, it started to replay the WAL and found a corrupted segment. Then it deleted all segments after