I can create an Ubuntu container and verify connectivity to the container metrics endpoint with both curl and openssl:
curl https://10.244.3.10:9102/metrics --cacert /etc/istio-certs/root-cert.pem --cert /etc/istio-certs/cert-chain.pem --key /etc/istio-certs/key.pem --insecure

openssl s_client -connect 10.244.3.10:9102 -cert /etc/istio-certs/cert-chain.pem -key /etc/istio-certs/key.pem -CAfile /etc/istio-certs/root-cert.pem -alpn "istio"

The curl call seems to correctly auto-negotiate the TLS 1.3 comms. The openssl call requires the -alpn "istio" flag to negotiate the protocol at the application layer or it will fail to connect.

*The results of my testing (shown below) make me think it's something in Prometheus or the Go stack causing the problem.* I don't think it's an OS configuration issue in the container or anything like that. However, I'm not sure how to debug the Prometheus/Go side of things.

A more verbose log from curl shows it will default to HTTP/2 (which I recall seeing is disabled in Prometheus at the moment).

root@sleep-5f98748557-s4wh5:/# curl https://10.244.3.10:9102/metrics --cacert /etc/istio-certs/root-cert.pem --cert /etc/istio-certs/cert-chain.pem --key /etc/istio-certs/key.pem --insecure -v
*   Trying 10.244.3.10:9102...
* TCP_NODELAY set
* Connected to 10.244.3.10 (10.244.3.10) port 9102 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/istio-certs/root-cert.pem
    CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: [NONE]
*  start date: Jul 7 20:21:33 2021 GMT
*  expire date: Jul 8 20:21:33 2021 GMT
*  issuer: O=cluster.local
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x564d80d81e10)
> GET /metrics HTTP/2
> Host: 10.244.3.10:9102
> user-agent: curl/7.68.0
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 2147483647)!
< HTTP/2 200

I can add --http1.1 to force HTTP/1.1 and it'll still work:

root@sleep-5f98748557-s4wh5:/# curl https://10.244.3.10:9102/metrics --cacert /etc/istio-certs/root-cert.pem --cert /etc/istio-certs/cert-chain.pem --key /etc/istio-certs/key.pem --insecure -v --http1.1
*   Trying 10.244.3.10:9102...
* TCP_NODELAY set
* Connected to 10.244.3.10 (10.244.3.10) port 9102 (#0)
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/istio-certs/root-cert.pem
    CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: [NONE]
*  start date: Jul 7 20:21:33 2021 GMT
*  expire date: Jul 8 20:21:33 2021 GMT
*  issuer: O=cluster.local
*  SSL certificate verify ok.
> GET /metrics HTTP/1.1
> Host: 10.244.3.10:9102
> User-Agent: curl/7.68.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK

Since that works, I wonder whether the way HTTP/2 is currently disabled breaks the ALPN handling, like maybe it's not negotiating correctly. I'm mostly grasping at straws here.

On Tuesday, July 6, 2021 at 1:24:13 PM UTC-7 Travis Illig wrote:

> It's not the certificate handling. I tried setting GODEBUG as indicated in the docs and that didn't fix anything. I'm starting to wonder if it's an HTTP/2 issue or something similar, but I'm not sure how to determine if that's the problem.
>
> The error message in Prometheus debug logs isn't super helpful; it just seems to indicate a protocol problem.
>
> level=debug ts=2021-07-06T20:00:50.996Z caller=scrape.go:1091 component="scrape manager" scrape_pool=kubernetes-pods-istio-secure target=https://10.244.3.10:9102/metrics msg="Scrape failed" err="Get \"https://10.244.3.10:9102/metrics\": read tcp 10.244.4.85:51794->10.244.3.10:9102: read: connection reset by peer"
>
> On Tuesday, July 6, 2021 at 12:01:08 PM UTC-7 Travis Illig wrote:
>
>> I've verified:
>>
>> - v2.20.1 is the last version where the mTLS scraping works.
>> - It doesn't matter which Docker registry you pull from (Docker Hub or quay.io - I've sometimes seen different "versions" of containers based on registry).
>>
>> Looking at the release notes for v2.21.0 <https://github.com/prometheus/prometheus/releases/tag/v2.21.0>, it appears there's a new version of Go used for compilation, which includes some changes to how certificates are handled <https://golang.org/doc/go1.15#commonname>. Unclear if this is what I'm hitting, but it seems worth looking into.
>>
>> On Tuesday, July 6, 2021 at 11:02:56 AM UTC-7 Travis Illig wrote:
>>
>>> I'm deploying Prometheus using the Helm chart <https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus> and I have it configured to scrape Istio mTLS-secured pods using the TLS settings specified <https://istio.io/latest/docs/ops/integrations/prometheus/#tls-settings> by the Istio team to do so. Basically what this amounts to is:
>>>
>>> - Add the Istio sidecar to the Prometheus instance but disable all traffic proxying - you just want to get the certificates from it.
>>> - Mount the certificates into the Prometheus container.
>>> - Set up your scrape configuration to use the certificates when scraping Istio-enabled pods.
>>>
>>> The YAML for the scrape configuration looks like this:
>>>
>>> - job_name: "kubernetes-pods-istio-secure"
>>>   scheme: https
>>>   tls_config:
>>>     ca_file: /etc/istio-certs/root-cert.pem
>>>     cert_file: /etc/istio-certs/cert-chain.pem
>>>     key_file: /etc/istio-certs/key.pem
>>>     insecure_skip_verify: true
>>>
>>> *This totally works using Prometheus v2.20.1* packaged as `prom/prometheus` from Docker Hub.
>>>
>>> *This fails on Prometheus v2.28.0* packaged as `quay.io/prometheus/prometheus` <http://quay.io/prometheus/prometheus>. Instead of getting a successful scrape, I get "connection reset by peer." I've validated the files are there and properly mounted; they have the expected contents; and there are no Prometheus log messages to indicate anything is amiss.
>>>
>>> I've been rolling back slowly to see where it starts working again. I've tried v2.26.0 and it still fails. I thought I'd drop a note in here to see if anyone knows what's up.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/66c5181e-bd19-4833-ae7f-2f917a1aeea4n%40googlegroups.com.

