I've managed to correctly activate the istio-proxy access logs, and this is what I can see.
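For anyone who wants to see the same logs: the usual way is to route the Envoy access logs to stdout via meshConfig.accessLogFile and then tail the sidecar container, roughly as below. Exact flags depend on your Istio version and on how Istio was installed, so take this as a sketch rather than the exact commands I ran:

istioctl install --set meshConfig.accessLogFile=/dev/stdout   # send Envoy access logs to the sidecar's stdout
k logs divolte-dpy-594d8cb676-vgd9l istio-proxy -f            # then tail the istio-proxy container to see entries like the ones below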
This is the log when I wget "inside" the Prometheus container, with success:

[2021-02-23T10:58:55.066Z] "GET /metrics HTTP/1.1" 200 - "-" 0 75771 51 50 "-" "Wget" "4dae0790-1a6a-4750-bc33-4617a6fbaf16" "10.172.22.36:7070" "127.0.0.1:7070" inbound|7070|| 127.0.0.1:42380 10.172.22.36:7070 10.172.23.247:38210 - default

This is the log when the Prometheus scrape fails:

[2021-02-23T10:58:55.536Z] "GET /metrics HTTP/1.1" 503 UC "-" 0 95 53 - "-" "Prometheus/2.11.0" "2c97c597-6a32-44ed-a2fb-c1d37a2644b3" "10.172.22.36:7070" "127.0.0.1:7070" inbound|7070|| 127.0.0.1:42646 10.172.22.36:7070 10.172.23.247:33758 - default

Any clues? Thank you

On Monday, February 22, 2021 at 10:07:49 AM UTC+1 Paolo Filippelli wrote:
> Hi Stuart,
>
> thanks for your answer. I see that the 503 is the actual response of the
> target, which means that Prometheus does reach it while scraping but
> gets this error back.
> What I cannot understand is the difference, in terms of "networking path"
> and "involved pieces", between the scrape (which fails) and the
> "internal wget-ing" (which succeeds).
> I also cannot figure out how to debug it.
>
> The scrape of the "divolte" service fails consistently, and the same is
> true for 3 or 4 other pods.
>
> In the "divolte" logs I cannot see anything interesting, neither in the
> main container nor in the Envoy proxy. Here are the logs:
>
> *Main divolte container:*
> ❯ k logs divolte-dpy-594d8cb676-vgd9l prometheus-jmx-exporter
> DEBUG: Environment variables set/received...
> Service port (metrics): 7070
> Destination host: localhost
> Destination port: 5555
> Rules to appy: divolte
> Local JMX: 7071
>
> CONFIG FILE not found, enabling PREPARE_CONFIG feature
> Preparing configuration based on environment variables
> Configuration preparation completed, final cofiguration dump:
> ############
> ---
> hostPort: localhost:5555
> username:
> password:
> lowercaseOutputName: true
> lowercaseOutputLabelNames: true
> ########
> Starting Service..
>
> *Istio-proxy:*
> ❯ k logs divolte-dpy-594d8cb676-vgd9l istio-proxy -f
> 2021-02-22T07:41:15.450702Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
> 2021-02-22T07:41:15.451182Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
> 2021-02-22T07:41:15.894626Z info xdsproxy Envoy ADS stream established
> 2021-02-22T07:41:15.894837Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012
> 2021-02-22T08:11:25.679886Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
> 2021-02-22T08:11:25.680655Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
> 2021-02-22T08:11:25.936956Z info xdsproxy Envoy ADS stream established
> 2021-02-22T08:11:25.937120Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012
> 2021-02-22T08:39:56.813543Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
> 2021-02-22T08:39:56.814249Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
> 2021-02-22T08:39:57.183354Z info xdsproxy Envoy ADS stream established
> 2021-02-22T08:39:57.183653Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:150
>
> On Fri, Feb 19, 2021 at 10:32 PM Stuart Clark <[email protected]> wrote:
>
>> On 19/02/2021 11:41, Paolo Filippelli wrote:
>>
>> Hi,
>>
>> I just asked the same question on IRC but I don't know which is the best
>> place to get support, so I'll ask here as well :)
>>
>> BTW, this is the IRC link:
>> https://matrix.to/#/!HaYTjhTxVqshXFkNfu:matrix.org/$16137341243277ijEwp:matrix.org?via=matrix.org
>>
>> *The Question*
>>
>> I'm seeing a behaviour that I'd very much like to understand, maybe you
>> can help me... We've got a K8s cluster where the Prometheus operator is
>> installed (v0.35.1). The Prometheus version is v2.11.0.
>>
>> Istio has also been installed in the cluster in the default "PERMISSIVE"
>> mode, which is to say that every Envoy sidecar accepts plain HTTP
>> traffic.
>> Everything is deployed in the default namespace, and every pod BUT
>> prometheus/alertmanager/grafana is managed by Istio (i.e. the monitoring
>> stack is outside the mesh).
>>
>> Prometheus can successfully scrape all of its targets (defined via
>> ServiceMonitors) except for 3 or 4 that it fails to scrape.
>>
>> For example, in the Prometheus logs I can see:
>>
>> level=debug ts=2021-02-19T11:15:55.595Z caller=scrape.go:927 component="scrape manager" scrape_pool=default/divolte/0 target=http://10.172.22.36:7070/metrics msg="Scrape failed" err="server returned HTTP status 503 Service Unavailable"
>>
>> But if I log into the Prometheus pod I can successfully reach the pod
>> that it's failing to scrape:
>>
>> /prometheus $ wget -SqO /dev/null http://10.172.22.36:7070/metrics
>> HTTP/1.1 200 OK
>> date: Fri, 19 Feb 2021 11:27:57 GMT
>> content-type: text/plain; version=0.0.4; charset=utf-8
>> content-length: 75758
>> x-envoy-upstream-service-time: 57
>> server: istio-envoy
>> connection: close
>> x-envoy-decorator-operation: divolte-srv.default.svc.cluster.local:7070/*
>>
>> That error message doesn't indicate that there are any problems with
>> getting to the server. It is saying that the server responded with a 503
>> error code.
>>
>> Are certain targets consistently failing, or do they sometimes work and
>> only sometimes fail?
>>
>> Are there any access or error logs from the Envoy sidecar or target pod
>> that might shed some light on where that error is coming from?
>>
>> --
>> Stuart Clark

