Hi everybody.

Brian asked us to move this issue 
<https://github.com/prometheus/blackbox_exporter/issues/591> here, as this is a 
more appropriate place to discuss it.

We have the following issue with blackbox exporter.
We run blackbox-exporter inside a Docker container. Suddenly, without any 
changes on the machine or in the container,
the ping probe starts failing for one or more targets, while other targets 
remain OK. 

*But when I manually run the ping tool inside the Docker container and on the 
host OS outside the container, both succeed.*

When we restart the Docker container, the issue disappears, but it occurs again 
after some time.

We experienced this behavior for two of our internal IP targets 
simultaneously (both from the same datacenter) and later for other public 
targets: 8.8.8.8 and 1.1.1.1.

I examined the problem with tcpdump, and it shows only request packets (no 
reply packets):

tcpdump -i eth0 -nn -s0 -X icmp and host 8.8.8.8
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:28:48.734661 IP 172.17.0.5 > 8.8.8.8: ICMP echo request, id 33313, seq 41979, length 36
        0x0000:  4500 0038 f40e 4000 4001 8a90 ac11 0005  E..8..@.@.......
        0x0010:  0808 0808 0800 7648 8221 a3fb 5072 6f6d  ......vH.!..Prom
        0x0020:  6574 6865 7573 2042 6c61 636b 626f 7820  etheus.Blackbox.
        0x0030:  4578 706f 7274 6572                      Exporter
15:28:48.977456 IP 172.17.0.5 > 8.8.8.8: ICMP echo request, id 33313, seq 41982, length 36
        0x0000:  4500 0038 f41d 4000 4001 8a81 ac11 0005  E..8..@.@.......
        0x0010:  0808 0808 0800 7645 8221 a3fe 5072 6f6d  ......vE.!..Prom
        0x0020:  6574 6865 7573 2042 6c61 636b 626f 7820  etheus.Blackbox.
        0x0030:  4578 706f 7274 6572                      Exporter



This is the tcpdump output when I start ping manually inside the container, 
alongside blackbox-exporter (the blackbox-exporter probes use ICMP id 33313):

root @ /
 [4] 🐳  →  tcpdump -i eth0 icmp and host 1.1.1.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:48:50.599214 IP a382643a1270 > one.one.one.one: ICMP echo request, id 33313, seq 50421, length 36
14:48:51.384085 IP a382643a1270 > one.one.one.one: ICMP echo request, id 35072, seq 5, length 64
14:48:51.392669 IP one.one.one.one > a382643a1270: ICMP echo reply, id 35072, seq 5, length 64
14:48:51.599289 IP a382643a1270 > one.one.one.one: ICMP echo request, id 33313, seq 50435, length 36
14:48:52.384292 IP a382643a1270 > one.one.one.one: ICMP echo request, id 35072, seq 6, length 64
14:48:52.393031 IP one.one.one.one > a382643a1270: ICMP echo reply, id 35072, seq 6, length 64
14:48:52.599559 IP a382643a1270 > one.one.one.one: ICMP echo request, id 33313, seq 50449, length 36
14:48:53.384517 IP a382643a1270 > one.one.one.one: ICMP echo request, id 35072, seq 7, length 64
14:48:53.396626 IP one.one.one.one > a382643a1270: ICMP echo reply, id 35072, seq 7, length 64
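
To make the pattern in this capture easier to see, here is a minimal Python sketch that counts echo requests and replies per ICMP id from tcpdump text output. The function name `summarize_icmp` and the embedded sample (the first three lines of the capture above) are only illustrative; it assumes the default tcpdump line format shown here and is not part of blackbox-exporter.

```python
import re
from collections import Counter

# Matches the ICMP part of a tcpdump text line, e.g.
# "ICMP echo request, id 33313, seq 50421, length 36".
# \s+ tolerates lines that were wrapped by a mail client.
ICMP_RE = re.compile(r"ICMP echo (request|reply),\s+id (\d+),\s+seq (\d+)")

# Sample input: the first three packets from the capture above.
SAMPLE = """\
14:48:50.599214 IP a382643a1270 > one.one.one.one: ICMP echo request, id 33313, seq 50421, length 36
14:48:51.384085 IP a382643a1270 > one.one.one.one: ICMP echo request, id 35072, seq 5, length 64
14:48:51.392669 IP one.one.one.one > a382643a1270: ICMP echo reply, id 35072, seq 5, length 64
"""


def summarize_icmp(capture: str) -> dict:
    """Return {icmp_id: Counter({'request': n, 'reply': m})} for a capture."""
    counts = {}
    for kind, icmp_id, _seq in ICMP_RE.findall(capture):
        counts.setdefault(int(icmp_id), Counter())[kind] += 1
    return counts


if __name__ == "__main__":
    for icmp_id, per_kind in sorted(summarize_icmp(SAMPLE).items()):
        print(icmp_id, "requests:", per_kind["request"], "replies:", per_kind["reply"])
```

On the sample, id 33313 (blackbox-exporter) has one request and no reply, while id 35072 (manual ping) has one request and one reply.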


I also checked whether the ID field in the IP header is zero-filled, as was 
discussed in a very similar issue, #360, but that is not our case.
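
For anyone who wants to repeat that check, here is a minimal Python sketch that decodes the IPv4 and ICMP headers from the hex of the first packet in the `tcpdump -X` output above. The helper name `decode_ipv4_icmp` is made up for illustration, and the sketch assumes a plain IPv4 packet carrying an ICMP echo message.

```python
import ipaddress
import struct

# First 28 bytes (IPv4 header + ICMP header) of the first captured packet,
# copied from the tcpdump -X hex dump above.
PACKET_HEX = (
    "4500 0038 f40e 4000 4001 8a90 ac11 0005 "
    "0808 0808 0800 7648 8221 a3fb"
)


def decode_ipv4_icmp(hex_dump: str) -> dict:
    """Decode the fixed IPv4 header and the ICMP echo header from a hex dump."""
    raw = bytes.fromhex(hex_dump.replace(" ", ""))
    # IPv4: version/IHL, TOS, total length, identification,
    # flags/fragment offset, TTL, protocol, checksum, source, destination.
    (ver_ihl, _tos, total_len, ip_id, _frag,
     ttl, proto, _csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    # ICMP echo: type, code, checksum, identifier, sequence number.
    icmp_type, icmp_code, _icmp_csum, icmp_id, icmp_seq = struct.unpack(
        "!BBHHH", raw[header_len:header_len + 8]
    )
    return {
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
        "total_len": total_len,
        "ttl": ttl,
        "proto": proto,
        "ip_id": ip_id,
        "icmp_type": icmp_type,
        "icmp_code": icmp_code,
        "icmp_id": icmp_id,
        "icmp_seq": icmp_seq,
    }


if __name__ == "__main__":
    fields = decode_ipv4_icmp(PACKET_HEX)
    print(fields)
    print("IP ID is zero:", fields["ip_id"] == 0)
```

For this packet the IP identification field is 0xf40e, so it is not zero, and the ICMP id and sequence decode to 33313 and 41979, matching the tcpdump summary line.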

The only correlation we found in Grafana is a series of very short connection 
outages from the blackbox-exporter machine to some of our internal DNS servers, 
monitored with the same blackbox-exporter (the spikes occur at the same time as 
the probes start failing).

I would dig deeper into what is going on, but I have no idea where to 
look now. 

Do you have any suggestions on what else to check or how to 
debug it?

Kind regards,
Tomáš Bartek
