[prometheus-users] Re: Weird node_exporter network metrics behaviour - NIC problem?

2024-01-16 Thread 'Brian Candler' via Prometheus Users
I would suspect it's due to how the counters are incremented and the new values published. Suppose the NIC's API publishes new counter values at some odd interval, like every 0.9 seconds. Your 15-second scrape will sometimes see the results of 16 increments since the previous counter value, and
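A toy simulation of this effect (my construction, not from the thread): assume the NIC publishes counter updates every 0.9 seconds and Prometheus scrapes every 15 seconds. The number of underlying updates visible between consecutive scrapes then alternates, which shows up as jitter in the computed rate even under perfectly constant load.

```python
# Sketch under assumed timings: NIC updates its counter every 0.9 s,
# Prometheus scrapes every 15 s. Count how many updates each scrape sees.
import math

PUBLISH_INTERVAL = 0.9   # assumed NIC counter update period (seconds)
SCRAPE_INTERVAL = 15.0   # scrape interval mentioned in the thread

def updates_published(t):
    """Number of counter updates the NIC has published by wall-clock time t."""
    return math.floor(t / PUBLISH_INTERVAL)

# Updates visible between consecutive scrapes.
deltas = [
    updates_published((i + 1) * SCRAPE_INTERVAL)
    - updates_published(i * SCRAPE_INTERVAL)
    for i in range(10)
]
print(deltas)  # a mix of 16s and 17s: apparent rate jitter at constant load
```

Because 15 / 0.9 is not an integer, no scrape interval lines up cleanly with the publication cadence, so the per-scrape delta can never be constant.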

[prometheus-users] Re: Weird node_exporter network metrics behaviour - NIC problem?

2024-01-16 Thread Dito Windyaksa
You're right - it's related to our irate query. We tried switching to rate() and it gives us a straight line during iperf tests. We've been using irate for years across dozens of servers, but we've only noticed these 'weird drops'/unstable samples on this single server. We don't see any

[prometheus-users] Re: Weird node_exporter network metrics behaviour - NIC problem?

2024-01-15 Thread Bryan Boreham
I would recommend you stop using irate(). With 4 samples per minute, irate(...[1m]) discards half your information. This can lead to artefacts. There is probably some instability in the underlying samples, which is worth investigating. An *instant* query like
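A small numeric illustration of Bryan's point (my own construction, with made-up sample values): with a 15 s scrape interval, a [1m] window holds about 4 samples. irate() computes its result from only the last two of them, while rate() spans the whole window, so per-sample jitter largely averages out.

```python
# Sketch: counter sampled every 15 s, with per-scrape increments jittering
# around a constant true rate. Compare an irate-style estimate (last two
# samples) with a rate-style estimate (whole [1m] window). The rate() maths
# here is simplified: real Prometheus rate() also extrapolates at the
# window edges.
samples = [0, 940, 2060, 2940, 4060, 4940, 6060, 6940, 8060]  # counter values
times = [15 * i for i in range(len(samples))]                 # seconds

def irate_1m(i):
    """Per-second rate from only the LAST TWO samples (irate-style)."""
    return (samples[i] - samples[i - 1]) / (times[i] - times[i - 1])

def rate_1m(i):
    """Per-second rate across the full ~1m window (rate-style, simplified)."""
    return (samples[i] - samples[i - 3]) / (times[i] - times[i - 3])

irates = [irate_1m(i) for i in range(3, len(samples))]
rates = [rate_1m(i) for i in range(3, len(samples))]
for t, ir, r in zip(times[3:], irates, rates):
    print(f"t={t:>3}s  irate={ir:6.2f}  rate={r:6.2f}")
```

In this run the irate-style values swing between roughly 59 and 75 units/s while the rate-style values stay in a much narrower band around the true average, which is exactly the kind of artefact Bryan describes.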

[prometheus-users] Re: Weird node_exporter network metrics behaviour - NIC problem?

2024-01-14 Thread Dito Windyaksa
Yup - both are running under the same scrape interval (15s) and using the same irate query: irate(node_network_transmit_bytes_total{instance="xxx:9100", device="eno1"}[1m])*8 It's an iperf test between the two machines, and no interval argument is set (the default of zero). I wonder if it has something to

[prometheus-users] Re: Weird node_exporter network metrics behaviour - NIC problem?

2024-01-14 Thread Alexander Wilke
Do you have the same scrape_interval for both machines? Are you running irate on both queries, or "rate" on one and "irate" on the other? Are the iperf intervals the same for both tests? Dito Windyaksa wrote on Monday, January 15, 2024 at 00:02:26 UTC+1: > Hi, > > We're migrating to a new