On 22.04.21 20:20, Matthias Rampke wrote:
> Your best starting point is the rules page of the Prometheus UI (:9090/rules). It will show the error. You can also evaluate the rule expression yourself, using the UI, or maybe using PromLens to help debug expression issues.
>
> /MR

:9090/rules shows these 2 errors, both of the form "found duplicate series for the match group". I think we may have a problem with the federation config.

First rule:

    alert: PrometheusRemoteWriteBehind
    expr: (max_over_time(prometheus_remote_storage_highest_timestamp_in_seconds[5m]) - on(job, instance) group_right() max_over_time(prometheus_remote_storage_queue_highest_sent_timestamp_seconds[5m])) > 120
    for: 15m
    labels:
      severity: critical
    annotations:
      description: Prometheus {{$labels.namespace}}/{{$labels.pod}} remote write is {{ printf "%.1f" $value }}s behind for {{ $labels.remote_name}}:{{ $labels.url }}.
      summary: Prometheus remote write is behind.

Error:

    found duplicate series for the match group {instance="prometheus.slash-dir-poc-in.kuber.example.org:9090", job="federate"} on the left hand-side of the operation: [{cluster="poc", endpoint="web", exported_instance="x.x.x.x:9090", exported_job="prometheus-k8s", instance="prometheus.slash-dir-poc-in.kuber.example.org:9090", job="federate", namespace="monitoring", pod="prometheus-k8s-1", prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-0", service="prometheus-k8s", team="MY-TEAM-NAME"}, {cluster="poc", endpoint="web", exported_instance="x.x.x.x:9090", exported_job="prometheus-k8s", instance="prometheus.slash-dir-poc-in.kuber.example.org:9090", job="federate", namespace="monitoring", pod="prometheus-k8s-0", prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-0", service="prometheus-k8s", team="MY-TEAM-NAME"}];many-to-many matching not allowed: matching labels must be unique on one side

Second rule:

    record: node:node_num_cpu:sum
    expr: count by(cluster, node) (sum by(node, cpu) (node_cpu_seconds_total{job="node-exporter"} * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:))

Error:

    found duplicate series for the match group {namespace="monitoring", pod="prometheus-k8s-0"} on the right hand-side of the operation: [{__name__="node_namespace_pod:kube_pod_info:", cluster="preprod", instance="prometheus.ep-preprod-in.kuber.example.org:9090", job="federate", namespace="monitoring", node="4516e9ed-4917-4792-ad49-2158775dc07e", pod="prometheus-k8s-0", prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-1", team="MY-TEAM-NAME"}, {__name__="node_namespace_pod:kube_pod_info:", cluster="poc", instance="prometheus.slash-dir-poc-in.kuber.example.org:9090", job="federate", namespace="monitoring", node="602efe91-2eb5-466f-9350-c4c6ce35119a", pod="prometheus-k8s-0", prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-0", team="MY-TEAM-NAME"}];many-to-many matching not allowed: matching labels must be unique on one side

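One possible reading of the second error: the two conflicting series differ in their cluster label (preprod vs. poc), while the rule matches only on namespace and pod. A rough sketch of a rule that also matches and groups on cluster is shown here as a rule-file fragment. The group name is hypothetical. The sketch assumes every federated series carries a cluster label and that node_cpu_seconds_total still has job="node-exporter" on the federating server; with honor_labels: false the original job may instead sit in exported_job, in which case the selector would need adjusting.

```yaml
groups:
  - name: node-rules-federated   # hypothetical group name, not from the original setup
    rules:
      - record: node:node_num_cpu:sum
        # Same rule as above, with `cluster` added to the matching and
        # grouping labels so identical pod names in different clusters
        # no longer fall into the same match group.
        expr: |
          count by(cluster, node) (
            sum by(cluster, node, cpu) (
              node_cpu_seconds_total{job="node-exporter"}
              * on(cluster, namespace, pod) group_left(node)
              node_namespace_pod:kube_pod_info:
            )
          )
```

Evaluating the inner expression in the Prometheus UI first, as suggested in the quoted reply, would show whether the match groups are actually unique before the rule is changed.
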
Also, this alert fires:

    name: PrometheusOutOfOrderTimestamps
    expr: rate(prometheus_target_scrapes_sample_out_of_order_total[5m]) > 0

We may have a problem with federation: we have an external Prometheus which federates from the Prometheus instances of 4 k8s clusters.

Config:

    - job_name: federate
      scrape_interval: 15s
      scrape_timeout: 15s
      honor_labels: false
      metrics_path: /federate
      scheme: https
      tls_config:
        insecure_skip_verify: true
      params:
        match[]:
        - '{__name__=~".+"}'
      file_sd_configs:
      - files:
        - k8s.yml
      relabel_configs:
      - source_labels:
        - __address__
        regex: (.*)
        replacement: ${1}:9090
        target_label: __address__

k8s.yml:

    - labels:
        cluster: poc
        team: MY-TEAM-NAME
      targets:
      - prometheus.slash-dir-poc-in.kuber.example.org
    - labels:
        cluster: devtest
        team: MY-TEAM-NAME
      targets:
      - prometheus.slash-dir-devtest-in.kuber.example.org
    - labels:
        cluster: preprod
        team: MY-TEAM-NAME
      targets:
      - prometheus.ep-preprod-in.kuber.example.org
    - labels:
        cluster: prod
        team: MY-TEAM-NAME
      targets:
      - prometheus.ep-prod-in.kuber.example.org

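For comparison, the federation example in the Prometheus documentation sets honor_labels: true, so that job and instance keep the values exposed by the source Prometheus instead of being renamed to exported_job and exported_instance (as seen in the errors quoted earlier). Below is a minimal sketch of the same job with only that setting changed. Whether this resolves the duplicate match groups here is an assumption, not something verified against this setup.

```yaml
scrape_configs:
  - job_name: federate
    scrape_interval: 15s
    scrape_timeout: 15s
    # Assumption: keep the labels exposed by the source Prometheus, so the
    # original job/instance are not rewritten to exported_job/exported_instance.
    honor_labels: true
    metrics_path: /federate
    scheme: https
    tls_config:
      insecure_skip_verify: true
    params:
      'match[]':
        - '{__name__=~".+"}'
    file_sd_configs:
      - files:
          - k8s.yml
    relabel_configs:
      - source_labels: [__address__]
        regex: (.*)
        replacement: ${1}:9090
        target_label: __address__
```

With honor_labels: true, target labels from k8s.yml such as cluster and team are only attached where the scraped series do not already carry a label of the same name.
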
kind regards
Evelyn

