You can use something like
`avg_over_time(node_processes_state{state='D'}[10m])` to smooth over missed
scrapes. Depending on how sensitive you want this to be, you can also use
`max_over_time()`.
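
For example, your rule could look something like this (a sketch only: the
500 threshold and the 10m windows are taken from your original rule, tune
them to taste):

    - alert: Node_Process_In_D_State_Count_Critical
      expr: max_over_time(node_processes_state{state='D'}[10m]) > 500
      for: 10m

Because `max_over_time()` looks back over the last 10 minutes, the
expression keeps returning a value even when individual scrapes are
missed, so the alert doesn't resolve just because a few data points are
missing.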

On Wed, Jun 3, 2020 at 9:49 AM 林浩 <[email protected]> wrote:
>
> We use node_exporter to monitor processes in the D state. When the number
> of D-state processes exceeds 500, it triggers a pager alert rule like
> this:
>
>     - alert: Node_Process_In_D_State_Count_Critical
>       expr: node_processes_state{state='D'} > 500
>       for: 10m
>
> The problem is that when the OS gets into a bad state (too many D-state
> processes), node_exporter also seems to run into trouble and cannot
> report the correct D-state process metric to the Prometheus server. In
> the screenshot below you can see that some data points are missing. This
> causes the alert to flap: when data is missing, the alert gets resolved.
>
> Is there any way to keep the alert from auto-resolving when some data
> points are missed?
>
> [image: Jietu20200603-154527.jpg]
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/f8560b07-9b00-4dfc-9671-667368ddd530%40googlegroups.com

