I can create an Ubuntu container and verify connectivity to the container 
metrics endpoint with both curl and openssl:

curl https://10.244.3.10:9102/metrics \
  --cacert /etc/istio-certs/root-cert.pem \
  --cert /etc/istio-certs/cert-chain.pem \
  --key /etc/istio-certs/key.pem --insecure

openssl s_client -connect 10.244.3.10:9102 \
  -cert /etc/istio-certs/cert-chain.pem \
  -key /etc/istio-certs/key.pem \
  -CAfile /etc/istio-certs/root-cert.pem -alpn "istio"

The curl call seems to negotiate the TLS 1.3 connection correctly without 
any extra flags. The openssl call needs the -alpn "istio" flag to set the 
application-layer protocol (ALPN); without it, the connection fails.
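
For anyone who wants to repeat the openssl check above without openssl, here 
is a minimal sketch in Python using only the standard library. The function 
names build_context and probe are my own and purely illustrative; the 
certificate paths and the "istio" ALPN value would be the ones from the 
commands above. Verification is turned off to mirror curl's --insecure, so 
this only shows whether the handshake and ALPN negotiation go through.

```python
import socket
import ssl


def build_context(alpn=None, ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context with an optional client cert and ALPN offer."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Mirror curl's --insecure: no hostname check, no chain verification.
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file and key_file:
        # Client certificate and key presented for mutual TLS.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if alpn:
        # If this is never called, no ALPN extension is sent at all.
        ctx.set_alpn_protocols(list(alpn))
    return ctx


def probe(host, port, ctx, timeout=5.0):
    """Do a TLS handshake and return (TLS version, negotiated ALPN or None).

    Raises ssl.SSLError or OSError (for example ConnectionResetError) when
    the handshake fails. With TLS 1.3, a server-side rejection of the client
    certificate may only surface on the first read after the handshake, so
    a clean return is not proof that the client certificate was accepted.
    """
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw) as tls:
            return tls.version(), tls.selected_alpn_protocol()
```

Calling probe("10.244.3.10", 9102, ctx) with a context built from the three 
files under /etc/istio-certs and alpn=["istio"] should correspond to the 
openssl command; leaving alpn unset corresponds to running openssl without 
the -alpn flag.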

*The results of my testing (shown below) make me think it's something in 
Prometheus or the Go stack causing the problem.* I don't think it's an OS 
configuration issue in the container or anything like that. However, I'm 
not sure how to debug the Prometheus/Go side of things.

A more verbose log from curl shows it will default to HTTP/2 (which I 
recall seeing is disabled in Prometheus at the moment).

root@sleep-5f98748557-s4wh5:/# curl https://10.244.3.10:9102/metrics \
  --cacert /etc/istio-certs/root-cert.pem \
  --cert /etc/istio-certs/cert-chain.pem \
  --key /etc/istio-certs/key.pem --insecure -v
*   Trying 10.244.3.10:9102...
* TCP_NODELAY set
* Connected to 10.244.3.10 (10.244.3.10) port 9102 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/istio-certs/root-cert.pem
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: [NONE]
*  start date: Jul  7 20:21:33 2021 GMT
*  expire date: Jul  8 20:21:33 2021 GMT
*  issuer: O=cluster.local
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x564d80d81e10)
> GET /metrics HTTP/2
> Host: 10.244.3.10:9102
> user-agent: curl/7.68.0
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 2147483647)!
< HTTP/2 200

I can add --http1.1 to force HTTP/1.1 and it'll still work:

root@sleep-5f98748557-s4wh5:/# curl https://10.244.3.10:9102/metrics \
  --cacert /etc/istio-certs/root-cert.pem \
  --cert /etc/istio-certs/cert-chain.pem \
  --key /etc/istio-certs/key.pem --insecure -v --http1.1
*   Trying 10.244.3.10:9102...
* TCP_NODELAY set
* Connected to 10.244.3.10 (10.244.3.10) port 9102 (#0)
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/istio-certs/root-cert.pem
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: [NONE]
*  start date: Jul  7 20:21:33 2021 GMT
*  expire date: Jul  8 20:21:33 2021 GMT
*  issuer: O=cluster.local
*  SSL certificate verify ok.
> GET /metrics HTTP/1.1
> Host: 10.244.3.10:9102
> User-Agent: curl/7.68.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK

Since that works, it makes me wonder whether something is wrong with the 
ALPN handling now that HTTP/2 is disabled; maybe the protocol isn't being 
negotiated correctly. I have no idea; I'm mostly grasping at straws.
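
One way to test that guess might be to repeat the handshake with several 
ALPN offers, including none at all, and note which ones the sidecar accepts 
and which it resets. Below is a rough Python sketch of that comparison. The 
name try_offers is hypothetical, and the probe argument is assumed to be any 
callable that performs the mTLS handshake for a given offer and returns the 
negotiated protocol, raising an OSError (which covers ssl.SSLError and 
ConnectionResetError) on failure.

```python
def try_offers(probe, offers):
    """Call probe(offer) for each ALPN offer and record what happened.

    `offers` is a list whose items are either None (send no ALPN extension)
    or a list of protocol names such as ["h2", "http/1.1"].
    Returns a dict mapping a readable label to ("ok", result) or
    ("failed", exception class name).
    """
    results = {}
    for offer in offers:
        label = ",".join(offer) if offer else "(no ALPN)"
        try:
            results[label] = ("ok", probe(offer))
        except OSError as exc:
            # ssl.SSLError and ConnectionResetError are both OSError subclasses.
            results[label] = ("failed", type(exc).__name__)
    return results
```

Running it with offers such as [None, ["http/1.1"], ["h2", "http/1.1"], 
["istio"]] would show whether a client that sends no ALPN extension is 
treated differently from curl, which always offers one.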

On Tuesday, July 6, 2021 at 1:24:13 PM UTC-7 Travis Illig wrote:

> It's not the certificate handling. I tried setting GODEBUG as indicated in 
> the docs and that didn't fix anything. I'm starting to wonder if it's an 
> HTTP/2 issue or something similar but I'm not sure how to determine if 
> that's the problem.
>
> The error message in the Prometheus debug logs isn't very helpful; it just 
> seems to indicate a protocol problem.
>
> level=debug ts=2021-07-06T20:00:50.996Z caller=scrape.go:1091 component="scrape manager" scrape_pool=kubernetes-pods-istio-secure target=https://10.244.3.10:9102/metrics msg="Scrape failed" err="Get \"https://10.244.3.10:9102/metrics\": read tcp 10.244.4.85:51794->10.244.3.10:9102: read: connection reset by peer"
>
> On Tuesday, July 6, 2021 at 12:01:08 PM UTC-7 Travis Illig wrote:
>
>> I've verified:
>>
>>    - v2.20.1 is the last version where the mTLS scraping works.
>>    - It doesn't matter which Docker registry you pull from (Docker Hub 
>>    or quay.io - I've sometimes seen different "versions" of containers 
>>    based on registry).
>>
>> Looking at the release notes for v2.21.0 
>> <https://github.com/prometheus/prometheus/releases/tag/v2.21.0> it 
>> appears there's a new version of Go used for compilation which includes 
>> some changes on how certificates are handled 
>> <https://golang.org/doc/go1.15#commonname>. Unclear if this is what I'm 
>> hitting, but it seems worth looking into.
>>
>> On Tuesday, July 6, 2021 at 11:02:56 AM UTC-7 Travis Illig wrote:
>>
>>> I'm deploying Prometheus using the Helm chart 
>>> <https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus>
>>> and I have it configured to scrape Istio mTLS-secured pods using the 
>>> TLS settings specified by the Istio team 
>>> <https://istio.io/latest/docs/ops/integrations/prometheus/#tls-settings>. 
>>> Basically, what this amounts to is:
>>>
>>>    - Add the Istio sidecar to the Prometheus instance but disable all 
>>>    traffic proxying - you just want to get the certificates from it.
>>>    - Mount the certificates into the Prometheus container.
>>>    - Set up your scrape configuration to use the certificates when 
>>>    scraping Istio-enabled pods.
>>>
>>> The YAML for the scrape configuration looks like this:
>>>
>>> - job_name: "kubernetes-pods-istio-secure"
>>>   scheme: https
>>>   tls_config:
>>>     ca_file: /etc/istio-certs/root-cert.pem
>>>     cert_file: /etc/istio-certs/cert-chain.pem
>>>     key_file: /etc/istio-certs/key.pem
>>>     insecure_skip_verify: true
>>>
>>> *This totally works using Prometheus v2.20.1* packaged as 
>>> `prom/prometheus` from Docker Hub.
>>>
>>> *This fails on Prometheus v2.28.0* packaged as 
>>> `quay.io/prometheus/prometheus`. Instead of getting a successful scrape, 
>>> I get "connection reset by peer." I've validated that the files are there 
>>> and properly mounted and that they have the expected contents, and there 
>>> are no Prometheus log messages to indicate anything is amiss.
>>>
>>> I've been rolling back slowly to see where it starts working again. I've 
>>> tried v2.26.0 and it still fails. I thought I'd drop a note in here to see 
>>> if anyone knows what's up.
>>>
>>
