Hello,
I recently updated our Prometheus setup from the Helm chart to the 
Prometheus Operator.

*Summary of current setup:*
Platform: Google Cloud
1. Google project used for monitoring:
Prometheus Operator (prometheus, alertmanager, grafana, node-exporter, 
kube-state-metrics, etc.)

2. Multiple other Google projects that now also run the Prometheus Operator 
(node-exporter, kube-state-metrics, etc.), but without Alertmanager/Grafana.

So the main Google project (#1 above) has federate scrape jobs that 
connect to each of the other Google projects' Prometheus instances (#2 above).

Since updating to the Prometheus Operator I'm now seeing these errors in 
the main Prometheus logs:
*msg="Error on ingesting samples with different value but same timestamp"*
and *msg="Error on ingesting out-of-order samples"*.

Below is an example of one of the federate jobs where the errors come 
from.
When both the "vms" job and the "node-exporter" job are enabled, the errors 
occur. If I disable either of those jobs, the errors stop.
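Before changing any configs, it may help to confirm on the source Prometheus (test-abc-123) whether those two jobs actually export colliding series. A query along these lines (using node_cpu_seconds_total as one representative metric; any metric both jobs expose would work) counts series that are identical except for the `job` label:

```promql
# Run on the source Prometheus in project test-abc-123.
# A result > 1 means "vms" and "node-exporter" export the same metric
# with an otherwise identical label set, which can collide once the
# main Prometheus federates both jobs.
count without (job) (node_cpu_seconds_total{job=~"vms|node-exporter"}) > 1
```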

- job_name: 'test-abc-123'
  scrape_interval: 60s
  scrape_timeout: 30s
  honor_labels: true
  metrics_path: '/federate'
  scheme: 'https'
  basic_auth:
    username: '###################'
    password: '###################'
  params:
    'match[]':
      - '{job="vms"} '
      - '{job="node-exporter"} '
      - '{job="postgres"} '
      - '{job="barman"} '
      - '{job="apiserver"} '
      - '{job="kube-state-metrics"} '
  static_configs:
    - targets:
      - 'test-abc-123.com'
      labels:
        project: 'test-abc-123'
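As a way to isolate the problem on the federating side only (a sketch, not a definitive fix), `metric_relabel_configs` could be added to this job to drop one of the two conflicting jobs' series after the scrape but before ingestion:

```yaml
# Hypothetical addition to the federate job above: temporarily drop the
# federated node-exporter series at ingestion time, leaving the source
# Prometheus untouched. Remove once the collision is understood.
  metric_relabel_configs:
    - source_labels: [job]
      regex: 'node-exporter'
      action: drop
```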

Here is the node-exporter ServiceMonitor from project test-abc-123:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 1.1.0
  name: node-exporter
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    port: https
    relabelings:
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: instance
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: node-exporter
      app.kubernetes.io/part-of: kube-prometheus

Here is the "vms" job from project test-abc-123:
      - job_name: 'vms'
        static_configs:
          - targets: ['db-prod-1:9100','db-prod-2:9100','util-1:9100']
            labels:
              project: 'client-vms'
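One thing that can bite with federation and `honor_labels: true`: labels already present on the federated series win over target labels set in the federate job, so the `project: 'client-vms'` label attached here beats the `project: 'test-abc-123'` label in the federate job's static_configs. If the goal is a guaranteed-unique label per source project, the `externalLabels` field on the operator's Prometheus custom resource stamps it onto every series that source Prometheus exposes. A sketch, assuming the operator-managed Prometheus in each project is named `k8s` in the `monitoring` namespace (the label name and value are illustrative):

```yaml
# Sketch: per-project external label on each source Prometheus, set via
# the prometheus-operator Prometheus CR. Pick a unique value per project.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  externalLabels:
    source_project: test-abc-123
```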

I have tried updating labels, but maybe not in the right way. Any 
suggestions or pointers would be appreciated.

Thank you
-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/b574a695-736f-4f12-9f64-c6bcafeb160dn%40googlegroups.com.
