[prometheus-users] Unit testing

Jimmy the Greek Fri, 23 Oct 2020 14:42:08 -0700

I have been experimenting with the unit test capabilities provided by 
promtool and have run into a few issues/gotchas that I can't seem to 
understand.


example code:

rule_files:
  - ../nodelocal-cache.yaml

evaluation_interval: 1m

tests:
  - interval: 1m
    external_labels:
      cluster: test
    input_series:
    - series: 'coredns_nodecache_setup_errors_total{pod="unit-test", errortype="configmap"}'
      values: '1 2 3 4 5 6 7 8 9 10'
    - series: 'coredns_dns_response_rcode_count_total{job="nodelocal-dns", rcode="SERVFAIL", zone="."}'
      values: '0 60 120 180 240 300 360 420 480 540'
    - series: 'coredns_dns_response_rcode_count_total{job="nodelocal-dns", rcode="NOERROR", zone="."}'
      values: '0 120 240 360 480 600 720 840 960 1080'

    promql_expr_test:
    - expr: rate(coredns_nodecache_setup_errors_total{}[5m])
      eval_time: 5m
      exp_samples:
        - labels: '{pod="unit-test", errortype="configmap"}'
          value: 1.6666666666666666E-02
    - expr: rate(coredns_dns_response_rcode_count_total{}[5m])
      eval_time: 10m
      exp_samples:
        - labels: '{job="nodelocal-dns", rcode="SERVFAIL", zone="."}'
          value: 1
        - labels: '{job="nodelocal-dns", rcode="NOERROR", zone="."}'
          value: 2

    alert_rule_test:
      - eval_time: 6m
        alertname: NodeLocalDNSSetupErrorsHigh
        exp_alerts:
          - exp_labels:
              severity: critical
              alertname: NodeLocalDNSSetupErrorsHigh
              errortype: configmap
              pod: unit-test
            exp_annotations:
              description: test:unit-test There are configmap errors setting up NodeLocalDNS
              summary: NodeLocalDNS setup errors on test:unit-test

----


groups:
- name: NodeLocalDNS
  rules:
  - alert: NodeLocalDNSSetupErrorsHigh
    labels:
      severity: critical
    for: 5m
    expr: |
      rate(coredns_nodecache_setup_errors_total{}[5m]) > 0
    annotations:
      summary: "NodeLocalDNS setup errors on {{ $externalLabels.cluster }}:{{ $labels.pod }}"
      description: "{{ $externalLabels.cluster }}:{{ $labels.pod }} There are {{ $labels.errortype }} errors setting up NodeLocalDNS"

As you can see, the PromQL test for
rate(coredns_nodecache_setup_errors_total{}[5m]) evaluates to about 0.0167
(1.666E-02) at an eval_time of 5m. I would therefore expect
NodeLocalDNSSetupErrorsHigh, which triggers once that value has been above 0
for a 5-minute period, to be firing at 5m. However, the alert test only
passes if I set eval_time to 6m; it fails if I set it to 5m (the alert
doesn't fire).

What is the relationship between the "for" duration in the alert rule itself
and the eval_time in the test?
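
One way to reason about the timing, offered as a rough sketch and not as promtool's actual implementation: the rule is evaluated once per interval starting at 0m, rate() returns nothing until its window holds at least two samples, and the "for" clock starts at the first evaluation where the expression is true. The Python below replays that logic for the setup-errors series. The helper names expr_is_true and alert_state are hypothetical, and the model deliberately ignores staleness handling and rate extrapolation.

```python
from typing import Optional

INTERVAL = 60      # evaluation and sample interval in seconds (1m in the test file)
WINDOW = 5 * 60    # range selector [5m]
FOR = 5 * 60       # 'for: 5m' in the alert rule

# Samples from the test: '1 2 3 ... 10' at t = 0m, 1m, ..., 9m.
samples = {i * INTERVAL: float(i + 1) for i in range(10)}


def expr_is_true(t: int) -> bool:
    """Simplified stand-in for `rate(...[5m]) > 0` evaluated at time t."""
    in_window = [v for ts, v in sorted(samples.items()) if t - WINDOW <= ts <= t]
    # Assumption: rate() needs at least two samples in the window to return a value.
    if len(in_window) < 2:
        return False
    return in_window[-1] - in_window[0] > 0


def alert_state(eval_time: int) -> str:
    """Replay evaluations from t=0 up to eval_time and return the alert state."""
    active_at: Optional[int] = None
    state = "inactive"
    for t in range(0, eval_time + 1, INTERVAL):
        if not expr_is_true(t):
            active_at, state = None, "inactive"
            continue
        if active_at is None:
            active_at = t  # first evaluation where the expression holds
        state = "firing" if t - active_at >= FOR else "pending"
    return state


for minute in (0, 1, 5, 6):
    print(f"eval_time {minute}m: {alert_state(minute * INTERVAL)}")
```

Under these assumptions the expression first holds at 1m (the first evaluation that sees two samples), so the alert is pending through 5m and only fires at 6m, which matches the behaviour described above.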
