
On Mon, Sep 3, 2018 at 12:45 PM Carl Byington <c...@byington.org> wrote:
> On Sun, 2018-09-02 at 21:54 -0400, Alex wrote:
> > Do you have any other ideas on how I can isolate this problem?
> Run tcpdump on the external ethernet connection.
> tcpdump -s0 -vv -i %s -nn -w /tmp/outputfile udp dst port domain

I've captured some traffic that I believe includes the packets related
to the SERVFAIL errors I've been receiving. Now I have to figure out
how to go through it.
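
One way to start going through the capture is to tally which names were
queried most often. The sketch below is only a rough starting point: it
assumes a classic pcap file (what tcpdump -w normally writes), Ethernet
framing, and IPv4/UDP only. The function names read_qnames,
qname_from_frame and top_queries are made up for this example. Since the
capture filter was "udp dst port domain", the file holds packets sent to
port 53, so this counts queries and cannot show response codes such as
SERVFAIL.

```python
import struct
from collections import Counter


def qname_from_frame(frame):
    """Return the DNS question name in an Ethernet/IPv4/UDP frame, or None."""
    if len(frame) < 34 or frame[12:14] != b"\x08\x00":  # not IPv4
        return None
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4          # IPv4 header length in bytes
    if ip[9] != 17:                   # protocol 17 = UDP
        return None
    dns = ip[ihl + 8:]                # skip the 8-byte UDP header
    if len(dns) < 13:                 # 12-byte DNS header plus at least one byte
        return None
    labels, pos = [], 12
    while pos < len(dns):
        length = dns[pos]
        if length == 0:               # root label ends the name
            break
        if length > 63:               # compression pointer or damaged packet
            return None
        labels.append(dns[pos + 1:pos + 1 + length].decode("ascii", "replace"))
        pos += 1 + length
    return ".".join(labels) + "."


def read_qnames(pcap_bytes):
    """Yield the question name of every DNS packet found in a classic pcap."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic (microsecond) pcap file")
    if struct.unpack(endian + "I", pcap_bytes[20:24])[0] != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")
    offset = 24                       # size of the pcap global header
    while offset + 16 <= len(pcap_bytes):
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        name = qname_from_frame(frame)
        if name is not None:
            yield name


def top_queries(pcap_bytes, n=20):
    """Return the n most frequently queried names with their counts."""
    return Counter(read_qnames(pcap_bytes)).most_common(n)
```

With the capture file from the tcpdump command above, something like
top_queries(open("/tmp/outputfile", "rb").read()) would list the busiest
names, which may show whether the RBL lookups dominate the traffic.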

In the meantime, I've configured /etc/resolv.conf to send queries to a
remote system of ours, and the errors have (mostly) stopped.

I also notice some traces take an abnormal amount of time. Ping times
to google.com are less than 20ms, but this trace shows reaching the
root servers takes 104ms:

# dig +trace +nodnssec google.com

; <<>> DiG 9.11.4-P1-RedHat-9.11.4-5.P1.fc28 <<>> +trace +nodnssec google.com
;; global options: +cmd
.                       3451    IN      NS      g.root-servers.net.
.                       3451    IN      NS      k.root-servers.net.
.                       3451    IN      NS      j.root-servers.net.
.                       3451    IN      NS      c.root-servers.net.
.                       3451    IN      NS      i.root-servers.net.
.                       3451    IN      NS      e.root-servers.net.
.                       3451    IN      NS      m.root-servers.net.
.                       3451    IN      NS      l.root-servers.net.
.                       3451    IN      NS      a.root-servers.net.
.                       3451    IN      NS      h.root-servers.net.
.                       3451    IN      NS      b.root-servers.net.
.                       3451    IN      NS      d.root-servers.net.
.                       3451    IN      NS      f.root-servers.net.
;; Received 839 bytes from in 0 ms

com.                    172800  IN      NS      h.gtld-servers.net.
com.                    172800  IN      NS      g.gtld-servers.net.
com.                    172800  IN      NS      b.gtld-servers.net.
com.                    172800  IN      NS      j.gtld-servers.net.
com.                    172800  IN      NS      f.gtld-servers.net.
com.                    172800  IN      NS      m.gtld-servers.net.
com.                    172800  IN      NS      c.gtld-servers.net.
com.                    172800  IN      NS      d.gtld-servers.net.
com.                    172800  IN      NS      k.gtld-servers.net.
com.                    172800  IN      NS      i.gtld-servers.net.
com.                    172800  IN      NS      l.gtld-servers.net.
com.                    172800  IN      NS      a.gtld-servers.net.
com.                    172800  IN      NS      e.gtld-servers.net.
;; Received 835 bytes from in 104 ms

google.com.             172800  IN      NS      ns2.google.com.
google.com.             172800  IN      NS      ns1.google.com.
google.com.             172800  IN      NS      ns3.google.com.
google.com.             172800  IN      NS      ns4.google.com.
;; Received 287 bytes from in 44 ms

;; expected opt record in response
google.com.             300     IN      A
;; Received 44 bytes from in 29 ms

Running the same trace again showed 129ms.

I also located this warning:
06-Sep-2018 12:03:33.304 client: warning: client @0x7f502c1d3d50 (cmail20.com.multi.surbl.org): recursive-clients soft
limit exceeded (901/900/1000), aborting oldest query

I've increased recursive-clients to 2500 but the SERVFAIL errors continue.
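
To check whether the quota warning still shows up after the change, the
log can be scanned for it. A minimal sketch follows, assuming the three
numbers in parentheses are clients in use / soft limit / hard limit,
which fits the 900/1000 pairing in the message above. The name
quota_events is made up for this example.

```python
import re

# \s+ between words tolerates the line wrapping seen in the pasted log.
QUOTA = re.compile(
    r"recursive-clients\s+soft\s+limit\s+exceeded\s+\((\d+)/(\d+)/(\d+)\)")


def quota_events(log_text):
    """Return (in_use, soft_limit, hard_limit) for each quota warning found."""
    return [tuple(int(g) for g in m.groups())
            for m in QUOTA.finditer(log_text)]
```

If entries logged after the restart still report a hard limit of 1000,
the new setting may not have been picked up by the running named.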

There are also a ton of lame-server entries, many of which relate to
one RBL or another used in my postscreen config:
06-Sep-2018 13:16:50.686 lame-servers: info: connection refused
resolving '':
06-Sep-2018 13:16:50.706 lame-servers: info: connection refused
resolving '':
06-Sep-2018 13:16:51.308 lame-servers: info: connection refused
resolving '':
06-Sep-2018 13:16:54.798 lame-servers: info: connection refused
resolving 'e51dd24f684d212a7da1119b23603b0f.generic.ixhash.net/A/IN':
06-Sep-2018 13:16:54.799 lame-servers: info: connection refused
resolving 'f4d997d8949e6dbd30f6a418ad364589.generic.ixhash.net/A/IN':
06-Sep-2018 13:16:55.762 lame-servers: info: connection refused
resolving '':
06-Sep-2018 13:16:55.845 lame-servers: info: connection refused
resolving '':
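
To see which zones account for most of these entries, they can be
grouped by the last two labels of the name being resolved. This is a
rough sketch: lame_by_zone is a made-up name, the pattern assumes the
"lame-servers: info: REASON resolving 'NAME/TYPE/CLASS'" layout shown
above (including the line wrap), and taking two labels is a crude
stand-in for the real zone cut.

```python
import re
from collections import Counter

# REASON is on the first line, the quoted target may be on the next one.
LAME = re.compile(r"lame-servers: info: (.+?)\s+resolving\s+'([^']*)'")


def lame_by_zone(log_text, labels=2):
    """Count lame-server log entries per trailing `labels` labels of the name."""
    counts = Counter()
    for m in LAME.finditer(log_text):
        name = m.group(2).split("/")[0].rstrip(".")
        # Entries with an empty target are kept under a placeholder key.
        zone = ".".join(name.split(".")[-labels:]) if name else "(empty)"
        counts[zone] += 1
    return counts
```

Calling lame_by_zone(text).most_common(10) on the named log would list
the zones with the most refusals first.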

What could cause such a significant delay in reaching the root servers?

bind-users mailing list
