Hi Stuart, thanks for your answer. I see that the 503 is the actual response from the target, which means Prometheus does reach it while scraping but gets this error back. What I cannot understand is what the differences are, in terms of "networking path" and "pieces involved", between the scrape (which fails) and the "internal wget-ing" (which succeeds). I also cannot figure out how to debug it.
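The closest thing I could think of is to mimic the scrape more faithfully from inside the Prometheus pod. This is only a sketch: the extra headers are my assumption about what Prometheus normally sends on a scrape (user agent, gzip, scrape-timeout header), and I'm not certain the busybox wget in the image accepts --header.

# Single request, with headers similar to what a Prometheus scrape would send:
/prometheus $ wget -SqO /dev/null \
    --header 'User-Agent: Prometheus/2.11.0' \
    --header 'Accept-Encoding: gzip' \
    --header 'X-Prometheus-Scrape-Timeout-Seconds: 10' \
    http://10.172.22.36:7070/metrics

# Repeat it every few seconds, since a single request may succeed while the
# periodic scrapes keep failing:
/prometheus $ while true; do wget -SqO /dev/null http://10.172.22.36:7070/metrics 2>&1 | head -n 1; sleep 5; done

If the looped request eventually starts returning 503 as well, that would at least tell me the difference is not wget vs. Prometheus, but something timing- or connection-related.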
The scraping towards the "divolte" service is failing consistently, and the same is true for the other 3/4 failing pods. In the "divolte" logs I cannot see anything interesting, neither in the main container nor in the Envoy proxy (a sketch of the extra Envoy-side checks I plan to try is below, after the quoted message). Here are the logs:

*Main divolte container:*

❯ k logs divolte-dpy-594d8cb676-vgd9l prometheus-jmx-exporter
DEBUG: Environment variables set/received...
Service port (metrics): 7070
Destination host: localhost
Destination port: 5555
Rules to appy: divolte
Local JMX: 7071
CONFIG FILE not found, enabling PREPARE_CONFIG feature
Preparing configuration based on environment variables
Configuration preparation completed, final cofiguration dump:
############
---
hostPort: localhost:5555
username:
password:
lowercaseOutputName: true
lowercaseOutputLabelNames: true
########
Starting Service..

*Istio-proxy*

❯ k logs divolte-dpy-594d8cb676-vgd9l istio-proxy -f
2021-02-22T07:41:15.450702Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
2021-02-22T07:41:15.451182Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
2021-02-22T07:41:15.894626Z info xdsproxy Envoy ADS stream established
2021-02-22T07:41:15.894837Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012
2021-02-22T08:11:25.679886Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
2021-02-22T08:11:25.680655Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
2021-02-22T08:11:25.936956Z info xdsproxy Envoy ADS stream established
2021-02-22T08:11:25.937120Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012
2021-02-22T08:39:56.813543Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
2021-02-22T08:39:56.814249Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
2021-02-22T08:39:57.183354Z info xdsproxy Envoy ADS stream established
2021-02-22T08:39:57.183653Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:150

On Fri, Feb 19, 2021 at 10:32 PM Stuart Clark <[email protected]> wrote:

> On 19/02/2021 11:41, Paolo Filippelli wrote:
>
> Hi,
>
> just asked the same question on IRC but I don't know which is the best
> place to get support, so I'll ask also here :)
>
> BTW, this is the IRC link:
> https://matrix.to/#/!HaYTjhTxVqshXFkNfu:matrix.org/$16137341243277ijEwp:matrix.org?via=matrix.org
>
> *The Question*
>
> I'm seeing a behaviour that I'd very much like to understand, maybe you
> can help me... we've got a K8s cluster where the Prometheus operator is
> installed (v0.35.1). Prometheus version is v2.11.0.
>
> Istio has also been installed in the cluster with the default "PERMISSIVE"
> mode, meaning that every Envoy sidecar accepts plain HTTP traffic.
> Everything is deployed in the default namespace, and every pod BUT
> prometheus/alertmanager/grafana is managed by Istio (i.e. the monitoring
> stack is out of the mesh).
>
> Prometheus can successfully scrape all its targets (defined via
> ServiceMonitors), except for the 3/4 that it fails to scrape.
>
> For example, from the logs of Prometheus I can see:
>
> level=debug ts=2021-02-19T11:15:55.595Z caller=scrape.go:927
> component="scrape manager" scrape_pool=default/divolte/0 target=
> http://10.172.22.36:7070/metrics msg="Scrape failed" err="server returned
> HTTP status 503 Service Unavailable"
>
> But if I log into the Prometheus pod I can successfully reach the pod
> that it is failing to scrape:
>
> /prometheus $ wget -SqO /dev/null http://10.172.22.36:7070/metrics
> HTTP/1.1 200 OK
> date: Fri, 19 Feb 2021 11:27:57 GMT
> content-type: text/plain; version=0.0.4; charset=utf-8
> content-length: 75758
> x-envoy-upstream-service-time: 57
> server: istio-envoy
> connection: close
> x-envoy-decorator-operation: divolte-srv.default.svc.cluster.local:7070/*
>
> That error message doesn't indicate that there are any problems with
> getting to the server. It is saying that the server responded with a 503
> error code.
>
> Are certain targets consistently failing, or do they sometimes work and
> only sometimes fail?
>
> Are there any access or error logs from the Envoy sidecar or target pod
> that might shed some light on where that error is coming from?
>
> --
> Stuart Clark
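For completeness, this is what I plan to try next to get more detail from the Envoy side of the failing pod. It's only a sketch: I'm assuming istioctl is available against this cluster and that mesh-wide access logging can be enabled, and the exact flags may differ between Istio versions.

# Is the sidecar in sync with istiod, and does it have a listener/cluster for port 7070?
❯ istioctl proxy-status
❯ istioctl proxy-config listeners divolte-dpy-594d8cb676-vgd9l.default --port 7070
❯ istioctl proxy-config clusters divolte-dpy-594d8cb676-vgd9l.default --port 7070

# Check Envoy's own counters for 5xx responses around port 7070:
❯ kubectl exec divolte-dpy-594d8cb676-vgd9l -c istio-proxy -- pilot-agent request GET stats | grep -E '7070|_5xx'

# If nothing obvious shows up, enable Envoy access logs mesh-wide
# (meshConfig.accessLogFile: /dev/stdout) and watch the requests for /metrics:
❯ k logs divolte-dpy-594d8cb676-vgd9l istio-proxy -f | grep 'GET /metrics'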

