Happy to report that the issue is fixed: I set a custom DNS policy on the 
blackbox exporter pods so that they skip the cluster DNS and query an 
external DNS server directly.
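
A pod-level DNS override of this kind can be sketched as follows. This is an 
illustration under assumptions, not the exact manifest used here: the 
nameserver address 192.0.2.53 is a placeholder, and the ndots value is an 
assumed choice. The dnsPolicy and dnsConfig fields are standard Kubernetes 
pod spec fields.

```yaml
# Fragment of a pod template (e.g. spec.template.spec of a Deployment).
spec:
  dnsPolicy: "None"        # ignore the cluster DNS settings for this pod
  dnsConfig:
    nameservers:
      - 192.0.2.53         # placeholder: address of the external DNS server
    searches:
      - xyz.com            # optional; search domain taken from this thread
    options:
      - name: ndots
        value: "1"         # assumed value; keeps lookups off the long search list
```

With dnsPolicy set to "None", the pod's /etc/resolv.conf is generated only 
from dnsConfig, so in-cluster names such as *.svc.cluster.local will 
generally no longer resolve from that pod unless the external server knows 
them.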

On Wednesday, November 25, 2020 at 8:29:18 AM UTC-5 Chris Paulraj wrote:

> I tried a different build that includes network tools, but still could 
> not figure out why the lookup fails. I also tried a blackbox-exporter 
> image from Docker Hub, which showed the same issue, although it ran for 
> 8 hours without an error. It does look like an environmental issue with 
> my setup. Could you help me increase the DNS lookup timeout for HTTP 
> probes? Where can I increase the timeout for 
> "probe_dns_lookup_time_seconds"? Thank you.
>
> On Monday, November 23, 2020 at 11:04:12 AM UTC-5 Chris Paulraj wrote:
>
>> I built the image from RHEL 7, and I can see that DNS is delegated to 
>> the OpenShift node hosting this pod. I was also able to run a curl 
>> command successfully from within the pod. But as you point out, the 
>> issue could very well be in the image I built; I will try to gather more 
>> information when it happens again. I updated Prometheus and Alertmanager 
>> to the most recent versions and restarted the pods, keeping my fingers 
>> crossed. Thank you for your help.
>>
>> sh-4.2$ cat /etc/resolv.conf
>> nameserver 10.244.60.18
>> search prometheus-custom.svc.cluster.local svc.cluster.local cluster.local localdomain xyz.com
>> options ndots:5
>> sh-4.2$ 
>>
>> On Monday, November 23, 2020 at 10:32:54 AM UTC-5 [email protected] 
>> wrote:
>>
>>> The OS that the host is running makes no difference; the question is 
>>> what OS the container is built from.  You'll see this in the Dockerfile 
>>> used to build the container.
>>>
>>> If you are using the off-the-shelf docker container for 
>>> blackbox_exporter then it will be this Dockerfile 
>>> <https://github.com/prometheus/blackbox_exporter/blob/master/Dockerfile>, 
>>> which builds from quay.io/prometheus/busybox-linux-amd64:latest. 
>>> This in turn appears to come from here 
>>> <https://github.com/prometheus/busybox>, which in turn is based on 
>>> debian:buster 
>>> <https://github.com/prometheus/busybox/blob/master/uclibc/Dockerfile> 
>>> or debian:buster-slim 
>>> <https://github.com/prometheus/busybox/blob/master/glibc/Dockerfile>.  
>>> I think those are systemd-based.
>>>
>>> I think you should docker exec into the running container, and see if 
>>> systemd-resolved is running, and/or if /etc/resolv.conf points to 
>>> 127.0.0.53.  If so, the systemd bug I pointed to is relevant.
>>>
>>> If not, then you can try resolving host 
>>> arp-executor-sy-shra-arp-p.icl1p.xyz.com yourself to see if it resolves 
>>> or not.  Ultimately, this problem isn't with blackbox-exporter, it's a case 
>>> of debugging why DNS isn't resolving.  Intermittent DNS resolution can also 
>>> be caused by problems with your authoritative DNS.
>>>
>>
