Hi Stuart, thanks for your answer. I see that the 503 is the actual response from the target, which means Prometheus does reach it while scraping but gets this error back. What I cannot understand is what the differences are, in terms of "networking path" and "pieces involved", between the scrape (which fails) and the "internal wget-ing" (which succeeds). I also cannot figure out how to debug it.
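The closest thing I could think of is to mimic the scrape more faithfully from inside the Prometheus pod. This is only a sketch: the extra headers are my assumption about what Prometheus normally sends on a scrape (user agent, gzip, scrape-timeout header), and I'm not certain the busybox wget in the image accepts --header.

# Single request, with headers similar to what a Prometheus scrape would send:
/prometheus $ wget -SqO /dev/null \
    --header 'User-Agent: Prometheus/2.11.0' \
    --header 'Accept-Encoding: gzip' \
    --header 'X-Prometheus-Scrape-Timeout-Seconds: 10' \
    http://10.172.22.36:7070/metrics

# Repeat it every few seconds, since a single request may succeed while the
# periodic scrapes keep failing:
/prometheus $ while true; do wget -SqO /dev/null http://10.172.22.36:7070/metrics 2>&1 | head -n 1; sleep 5; done

If the looped request eventually starts returning 503 as well, that would at least tell me the difference is not wget vs. Prometheus, but something timing- or connection-related.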
The scraping towards the "divolte" service is failing consistently, and the same is true for the other 3/4 failing pods. In the "divolte" logs I cannot see anything interesting, neither in the main container nor in the Envoy proxy (a sketch of the extra Envoy-side checks I plan to try is below, after the quoted message). Here are the logs:

*Main divolte container:*

❯ k logs divolte-dpy-594d8cb676-vgd9l prometheus-jmx-exporter
DEBUG: Environment variables set/received...
Service port (metrics): 7070
Destination host: localhost
Destination port: 5555
Rules to appy: divolte
Local JMX: 7071
CONFIG FILE not found, enabling PREPARE_CONFIG feature
Preparing configuration based on environment variables
Configuration preparation completed, final cofiguration dump:
############
---
hostPort: localhost:5555
username:
password:
lowercaseOutputName: true
lowercaseOutputLabelNames: true
########
Starting Service..

*Istio-proxy*

❯ k logs divolte-dpy-594d8cb676-vgd9l istio-proxy -f
2021-02-22T07:41:15.450702Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
2021-02-22T07:41:15.451182Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
2021-02-22T07:41:15.894626Z info xdsproxy Envoy ADS stream established
2021-02-22T07:41:15.894837Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012
2021-02-22T08:11:25.679886Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
2021-02-22T08:11:25.680655Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
2021-02-22T08:11:25.936956Z info xdsproxy Envoy ADS stream established
2021-02-22T08:11:25.937120Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012
2021-02-22T08:39:56.813543Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
2021-02-22T08:39:56.814249Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
2021-02-22T08:39:57.183354Z info xdsproxy Envoy ADS stream established
2021-02-22T08:39:57.183653Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:150

On Fri, Feb 19, 2021 at 10:32 PM Stuart Clark <[email protected]> wrote:

> On 19/02/2021 11:41, Paolo Filippelli wrote:
>
> Hi,
>
> just asked the same question on IRC but I don't know which is the best
> place to get support, so I'll ask also here :)
>
> BTW, this is the IRC link:
> https://matrix.to/#/!HaYTjhTxVqshXFkNfu:matrix.org/$16137341243277ijEwp:matrix.org?via=matrix.org
>
> *The Question*
>
> I'm seeing a behaviour that I'd very much like to understand, maybe you
> can help me... we've got a K8s cluster where the Prometheus operator is
> installed (v0.35.1). Prometheus version is v2.11.0.
>
> Istio has also been installed in the cluster with the default "PERMISSIVE"
> mode, meaning that every Envoy sidecar accepts plain HTTP traffic.
> Everything is deployed in the default namespace, and every pod BUT
> prometheus/alertmanager/grafana is managed by Istio (i.e. the monitoring
> stack is out of the mesh).
>
> Prometheus can successfully scrape all its targets (defined via
> ServiceMonitors), except for the 3/4 that it fails to scrape.
>
> For example, from the logs of Prometheus I can see:
>
> level=debug ts=2021-02-19T11:15:55.595Z caller=scrape.go:927
> component="scrape manager" scrape_pool=default/divolte/0 target=
> http://10.172.22.36:7070/metrics msg="Scrape failed" err="server returned
> HTTP status 503 Service Unavailable"
>
> But if I log into the Prometheus pod I can successfully reach the pod
> that it is failing to scrape:
>
> /prometheus $ wget -SqO /dev/null http://10.172.22.36:7070/metrics
> HTTP/1.1 200 OK
> date: Fri, 19 Feb 2021 11:27:57 GMT
> content-type: text/plain; version=0.0.4; charset=utf-8
> content-length: 75758
> x-envoy-upstream-service-time: 57
> server: istio-envoy
> connection: close
> x-envoy-decorator-operation: divolte-srv.default.svc.cluster.local:7070/*
>
> That error message doesn't indicate that there are any problems with
> getting to the server. It is saying that the server responded with a 503
> error code.
>
> Are certain targets consistently failing, or do they sometimes work and
> only sometimes fail?
>
> Are there any access or error logs from the Envoy sidecar or target pod
> that might shed some light on where that error is coming from?
>
> --
> Stuart Clark
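For completeness, this is what I plan to try next to get more detail from the Envoy side of the failing pod. It's only a sketch: I'm assuming istioctl is available against this cluster and that mesh-wide access logging can be enabled, and the exact flags may differ between Istio versions.

# Is the sidecar in sync with istiod, and does it have a listener/cluster for port 7070?
❯ istioctl proxy-status
❯ istioctl proxy-config listeners divolte-dpy-594d8cb676-vgd9l.default --port 7070
❯ istioctl proxy-config clusters divolte-dpy-594d8cb676-vgd9l.default --port 7070

# Check Envoy's own counters for 5xx responses around port 7070:
❯ kubectl exec divolte-dpy-594d8cb676-vgd9l -c istio-proxy -- pilot-agent request GET stats | grep -E '7070|_5xx'

# If nothing obvious shows up, enable Envoy access logs mesh-wide
# (meshConfig.accessLogFile: /dev/stdout) and watch the requests for /metrics:
❯ k logs divolte-dpy-594d8cb676-vgd9l istio-proxy -f | grep 'GET /metrics'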

