Hi!

I have just upgraded from Prometheus 2.3.2 to Prometheus 2.17.1, changing
nothing other than the binaries, and scraping of our Kubernetes clusters
started to fail. We host the Prometheus server on a dedicated EC2 machine
and access the K8s API over the internal network. This had been working
fine for several months until the upgrade.

These errors started appearing in the logs:

level=error ts=2020-04-15T08:35:09.842Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:407: Failed to list *v1.Service: Get http://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com/api/v1/services?limit=500&resourceVersion=0: dial tcp 172.16.67.74:80: connect: connection timed out"
level=error ts=2020-04-15T08:35:09.842Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: Get http://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com/api/v1/services?limit=500&resourceVersion=0: dial tcp 172.16.67.74:80: connect: connection timed out"
level=error ts=2020-04-15T08:35:09.842Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:385: Failed to list *v1.Pod: Get http://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com/api/v1/pods?limit=500&resourceVersion=0: dial tcp 172.16.67.74:80: connect: connection timed out"
level=error ts=2020-04-15T08:35:09.842Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:361: Failed to list *v1.Endpoints: Get http://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 172.16.67.74:80: connect: connection timed out"
level=error ts=2020-04-15T08:35:09.842Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:363: Failed to list *v1.Pod: Get http://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com/api/v1/pods?limit=500&resourceVersion=0: dial tcp 172.16.67.74:80: connect: connection timed out"
level=error ts=2020-04-15T08:35:09.846Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:449: Failed to list *v1.Node: Get http://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.16.67.74:80: connect: connection timed out"

It seems to me to be an issue similar to the one described in
https://github.com/prometheus/prometheus/issues/5108 but in the discovery
phase. Plain HTTP access to the API server, which is what is being
attempted here, has never been allowed and was not causing issues in the
past. A sample of our configuration is at the end of this message. Any
ideas or insights would be much appreciated.

Regards,
Miguel

prometheus.yml:
[...]
  - job_name: 'develop-kubelet' 
    metrics_path: '/metrics' 
    scheme: https 
    tls_config: 
      ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt 
      cert_file: /etc/prometheus/certs/develop.k8s.local.crt 
      key_file: /etc/prometheus/certs/develop.k8s.local.key 
    kubernetes_sd_configs: 
    - api_server: internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
      role: node 
      tls_config: 
        ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt 
        cert_file: /etc/prometheus/certs/develop.k8s.local.crt 
        key_file: /etc/prometheus/certs/develop.k8s.local.key 
    relabel_configs: 
    - action: labelmap 
      regex: __meta_kubernetes_node_label_(.+) 
    - target_label: __address__ 
      replacement: https://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
    - source_labels: [__meta_kubernetes_node_name] 
      regex: (.+) 
      target_label: __metrics_path__ 
      replacement: /api/v1/nodes/${1}/proxy/metrics/ 
    - source_labels: [kubernetes_io_hostname] 
      target_label: node 
 
  - job_name: 'develop-container' 
    metrics_path: '/metrics' 
    scheme: https 
    tls_config: 
      ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt 
      cert_file: /etc/prometheus/certs/develop.k8s.local.crt 
      key_file: /etc/prometheus/certs/develop.k8s.local.key 
    kubernetes_sd_configs: 
    - api_server: internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
      role: node 
      tls_config: 
        ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt 
        cert_file: /etc/prometheus/certs/develop.k8s.local.crt 
        key_file: /etc/prometheus/certs/develop.k8s.local.key 
    relabel_configs: 
    - action: labelmap 
      regex: __meta_kubernetes_node_label_(.+) 
    - target_label: __address__ 
      replacement: https://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
    - source_labels: [__meta_kubernetes_node_name] 
      regex: (.+) 
      target_label: __metrics_path__ 
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor 
    - source_labels: [__meta_kubernetes_namespace] 
      target_label: namespace 
    - source_labels: [__meta_kubernetes_node_name] 
      target_label: node 
    - source_labels: [kubernetes_io_hostname] 
      target_label: node 

  - job_name: 'develop-endpoint' 
    metrics_path: '/metrics' 
    scheme: https 
    tls_config: 
      ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt 
      cert_file: /etc/prometheus/certs/develop.k8s.local.crt 
      key_file: /etc/prometheus/certs/develop.k8s.local.key 
    kubernetes_sd_configs: 
    - api_server: internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
      role: endpoints 
      tls_config: 
        ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt 
        cert_file: /etc/prometheus/certs/develop.k8s.local.crt 
        key_file: /etc/prometheus/certs/develop.k8s.local.key 
    relabel_configs: 
    - target_label: __address__ 
      replacement: https://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
    - source_labels: 
      - __meta_kubernetes_namespace 
      - __meta_kubernetes_service_name 
      - __meta_kubernetes_endpoint_port_name 
      separator: ; 
      regex: default;kubernetes;https 
      replacement: $1 
      action: keep 
 
  - job_name: 'develop-pod' 
    metrics_path: '/metrics' 
    scheme: https 
    tls_config: 
      ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt 
      cert_file: /etc/prometheus/certs/develop.k8s.local.crt 
      key_file: /etc/prometheus/certs/develop.k8s.local.key 
    kubernetes_sd_configs: 
    - api_server: internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
      role: pod 
      tls_config: 
        ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt 
        cert_file: /etc/prometheus/certs/develop.k8s.local.crt 
        key_file: /etc/prometheus/certs/develop.k8s.local.key 
    relabel_configs: 
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep 
      regex: true 
    - target_label: __address__ 
      replacement: https://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
      regex: ^$ 
      replacement: http 
      target_label: __meta_kubernetes_pod_annotation_prometheus_io_scheme 
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path] 
      regex: (.+) 
      replacement: ${1} 
      target_label: __metrics_path__ 
    - source_labels: 
      - __meta_kubernetes_namespace 
      - __meta_kubernetes_pod_annotation_prometheus_io_scheme 
      - __meta_kubernetes_pod_name 
      - __meta_kubernetes_pod_annotation_prometheus_io_port 
      - __metrics_path__ 
      regex: (.+);(.+);(.+);(.+);(.+) 
      action: replace 
      target_label: __metrics_path__ 
      replacement: /api/v1/namespaces/${1}/pods/${2}:${3}:${4}/proxy${5} 
    - action: labelmap 
      regex: __meta_kubernetes_pod_label_(.+) 
    - source_labels: [__meta_kubernetes_namespace] 
      target_label: namespace 
    - source_labels: [__meta_kubernetes_pod_node_name] 
      target_label: node 
    - source_labels: [__meta_kubernetes_pod_name] 
      target_label: service 
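
The failing requests in the log go to http:// on port 80, while api_server
in every job above is given without a scheme. One variation that may be
worth testing is spelling the scheme out in api_server. The fragment below
is only a sketch of that idea for the 'develop-kubelet' job, assuming the
API server accepts HTTPS on the load balancer's default port (443); it is
an untested assumption, not a confirmed fix, and relabel_configs are left
out because they would stay unchanged.

```yaml
scrape_configs:
  - job_name: 'develop-kubelet'
    metrics_path: '/metrics'
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt
      cert_file: /etc/prometheus/certs/develop.k8s.local.crt
      key_file: /etc/prometheus/certs/develop.k8s.local.key
    kubernetes_sd_configs:
    # Only change from the original job: an explicit https:// scheme
    # (assumption: HTTPS on port 443 is what the ELB exposes).
    - api_server: https://internal-api-develop-k8s-local-meks2l-1292322695.us-east-1.elb.amazonaws.com
      role: node
      tls_config:
        ca_file: /etc/prometheus/certs/develop.k8s.local.ca.crt
        cert_file: /etc/prometheus/certs/develop.k8s.local.crt
        key_file: /etc/prometheus/certs/develop.k8s.local.key
```

If discovery then reaches the API server for this job, the same change
would presumably apply to the api_server entries of the other three jobs.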


