I've managed to correctly activate the istio-proxy access logs, and this is what I can see.
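For anyone who wants to see the same logs: the usual way is to route the Envoy access logs to stdout via meshConfig.accessLogFile and then tail the sidecar container, roughly as below. Exact flags depend on your Istio version and on how Istio was installed, so take this as a sketch rather than the exact commands I ran:

istioctl install --set meshConfig.accessLogFile=/dev/stdout   # send Envoy access logs to the sidecar's stdout
k logs divolte-dpy-594d8cb676-vgd9l istio-proxy -f            # then tail the istio-proxy container to see entries like the ones below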
This is the log when I wget "inside" the Prometheus container, with success:

[2021-02-23T10:58:55.066Z] "GET /metrics HTTP/1.1" 200 - "-" 0 75771 51 50 "-" "Wget" "4dae0790-1a6a-4750-bc33-4617a6fbaf16" "10.172.22.36:7070" "127.0.0.1:7070" inbound|7070|| 127.0.0.1:42380 10.172.22.36:7070 10.172.23.247:38210 - default

This is the log when the Prometheus scrape fails:

[2021-02-23T10:58:55.536Z] "GET /metrics HTTP/1.1" 503 UC "-" 0 95 53 - "-" "Prometheus/2.11.0" "2c97c597-6a32-44ed-a2fb-c1d37a2644b3" "10.172.22.36:7070" "127.0.0.1:7070" inbound|7070|| 127.0.0.1:42646 10.172.22.36:7070 10.172.23.247:33758 - default

Any clues? Thank you

On Monday, February 22, 2021 at 10:07:49 AM UTC+1 Paolo Filippelli wrote:
> Hi Stuart,
>
> thanks for your answer. I see that the 503 is the actual response of the
> target, which means that Prometheus does reach it while scraping but
> gets this error back.
> What I cannot understand is the difference, in terms of "networking path"
> and "involved pieces", between the scrape (which fails) and the
> "internal wget-ing" (which succeeds).
> I also cannot figure out how to debug it.
>
> The scrape of the "divolte" service fails consistently, and the same is
> true for 3 or 4 other pods.
>
> In the "divolte" logs I cannot see anything interesting, neither in the
> main container nor in the Envoy proxy. Here are the logs:
>
> *Main divolte container:*
> ❯ k logs divolte-dpy-594d8cb676-vgd9l prometheus-jmx-exporter
> DEBUG: Environment variables set/received...
> Service port (metrics): 7070
> Destination host: localhost
> Destination port: 5555
> Rules to appy: divolte
> Local JMX: 7071
>
> CONFIG FILE not found, enabling PREPARE_CONFIG feature
> Preparing configuration based on environment variables
> Configuration preparation completed, final cofiguration dump:
> ############
> ---
> hostPort: localhost:5555
> username:
> password:
> lowercaseOutputName: true
> lowercaseOutputLabelNames: true
> ########
> Starting Service..
>
> *Istio-proxy:*
> ❯ k logs divolte-dpy-594d8cb676-vgd9l istio-proxy -f
> 2021-02-22T07:41:15.450702Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
> 2021-02-22T07:41:15.451182Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
> 2021-02-22T07:41:15.894626Z info xdsproxy Envoy ADS stream established
> 2021-02-22T07:41:15.894837Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012
> 2021-02-22T08:11:25.679886Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
> 2021-02-22T08:11:25.680655Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
> 2021-02-22T08:11:25.936956Z info xdsproxy Envoy ADS stream established
> 2021-02-22T08:11:25.937120Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012
> 2021-02-22T08:39:56.813543Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012
> 2021-02-22T08:39:56.814249Z warning envoy config StreamAggregatedResources gRPC config stream closed: 0,
> 2021-02-22T08:39:57.183354Z info xdsproxy Envoy ADS stream established
> 2021-02-22T08:39:57.183653Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:150
>
> On Fri, Feb 19, 2021 at 10:32 PM Stuart Clark <[email protected]> wrote:
>
>> On 19/02/2021 11:41, Paolo Filippelli wrote:
>>
>> Hi,
>>
>> I just asked the same question on IRC but I don't know which is the best
>> place to get support, so I'll ask here as well :)
>>
>> BTW, this is the IRC link:
>> https://matrix.to/#/!HaYTjhTxVqshXFkNfu:matrix.org/$16137341243277ijEwp:matrix.org?via=matrix.org
>>
>> *The Question*
>>
>> I'm seeing a behaviour that I'd very much like to understand, maybe you
>> can help me... We've got a K8s cluster where the Prometheus operator is
>> installed (v0.35.1). The Prometheus version is v2.11.0.
>>
>> Istio has also been installed in the cluster in the default "PERMISSIVE"
>> mode, which is to say that every Envoy sidecar accepts plain HTTP
>> traffic.
>> Everything is deployed in the default namespace, and every pod BUT
>> prometheus/alertmanager/grafana is managed by Istio (i.e. the monitoring
>> stack is outside the mesh).
>>
>> Prometheus can successfully scrape all of its targets (defined via
>> ServiceMonitors) except for 3 or 4 that it fails to scrape.
>>
>> For example, in the Prometheus logs I can see:
>>
>> level=debug ts=2021-02-19T11:15:55.595Z caller=scrape.go:927 component="scrape manager" scrape_pool=default/divolte/0 target=http://10.172.22.36:7070/metrics msg="Scrape failed" err="server returned HTTP status 503 Service Unavailable"
>>
>> But if I log into the Prometheus pod I can successfully reach the pod
>> that it's failing to scrape:
>>
>> /prometheus $ wget -SqO /dev/null http://10.172.22.36:7070/metrics
>> HTTP/1.1 200 OK
>> date: Fri, 19 Feb 2021 11:27:57 GMT
>> content-type: text/plain; version=0.0.4; charset=utf-8
>> content-length: 75758
>> x-envoy-upstream-service-time: 57
>> server: istio-envoy
>> connection: close
>> x-envoy-decorator-operation: divolte-srv.default.svc.cluster.local:7070/*
>>
>> That error message doesn't indicate that there are any problems with
>> getting to the server. It is saying that the server responded with a 503
>> error code.
>>
>> Are certain targets consistently failing, or do they sometimes work and
>> only sometimes fail?
>>
>> Are there any access or error logs from the Envoy sidecar or target pod
>> that might shed some light on where that error is coming from?
>>
>> --
>> Stuart Clark

