Re: Troubleshooting BIND stops responding

2017-03-30 Thread Alan Clegg
On 3/30/17 6:02 AM, Mark Elkins wrote:
> Stopping right here, Recursive lookup and Authoritative services are
> completely different services - and require different servers
> (preferably, though you could run multiple instances of nameservers on a
> single server - but that can get ugly).

Actually, no.  Running both recursive and authoritative does not require
different servers and does not require running multiple instances of
bind.  It's not recommended, but it's not hard, and it has worked for
lots of people for lots of years.

> Your two recursive servers should remain as recursive servers, only
> giving replies to your customer base. When you start running DNSSEC,
> this becomes even more important: a recursive server running as an
> authoritative server for a zone cannot give a proper DNSSEC reply when
> asked about Zones carried in its config.

Actually, the only thing it doesn't do is the validation.  It gives
responses just fine as long as you aren't validating your own data.
Trusting the "AD" bit is a great concept, but you really want to
validate as close to the end-point as possible.

> Rather keep things simple.
> 
> I would presume that you have multiple authoritative servers for your
> "vtt.net" domain. If you need more redundancy, add in more authoritative
> nameservers or better still an AnyCast instance. Even any of your local
> Authoritative Nameservers should ask your recursive servers when they
> need to look up information that is not part of the Zones they manage.
> Enough of the preaching.

Interesting to go from "keep things simple" to "let's use anycast" in
three sentences.

Too many people are trying to solve problems that don't exist with
additional complexity that causes additional issues elsewhere in the
network stack.  If your nameserver has issues with basic responses, good
luck debugging that while also dealing with routing problems in your
network and wondering which server you should actually be looking at.

Sorry to sound like an old grouch, but I'm really feeling like an old
grouch these days.

> If you were to run IPv6, a number of errors would disappear, otherwise
> force BIND not to do any IPv6. Adding IPv6 though would be preferable.  ;-)

Keep things simple... When your nameserver isn't responding, don't think
about running IPv6, fix the problem at hand.  And "if you run IPv6, a
number of errors disappear".  I'm just shaking my head.

> Don't think though that any of this is causing your problem. You could
> always upgrade your version of BIND. On my Gentoo Laptop, I'm running
> BIND 9.11.0-P3, so you are a bit behind.

And there is the useful nugget.

Yes, OP, see if your problems continue once you upgrade.

AlanC



___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users

Re: Troubleshooting BIND stops responding

2017-03-30 Thread Mark Elkins


On 30/03/2017 06:35, i.chu...@volga.ttk.ru wrote:
> Greetings to everyone!
>
> I'm an engineer at a local ISP and we have to provide 2 DNS servers running
> BIND for our clients. Our logs are full of various BIND errors, but we are
> unable to fully understand the problem. The main problem is that the BIND
> at 213.80.236.18 sometimes stops responding after working fine for about
> a week. BIND then returns no responses at all and we have to restart it.
> We suspect a weak DoS attack (weak because other services keep running
> normally), but I don't know the right way to determine whether that is the
> case. I would be glad if anyone would be so kind as to help us solve this
> issue.
>
> The machines have the IPv4 addresses 217.23.80.4 (BIND version 9.9.4) and
> 213.80.236.18 (BIND version 9.9.5-r3). They have to resolve hostnames only
> for ISP customers (and refuse to resolve for others), BUT we want them to
> resolve our own zones, such as vtt.net, for anybody who asks, in case the
> authoritative nameservers fail.

Stopping right here, Recursive lookup and Authoritative services are
completely different services - and require different servers
(preferably, though you could run multiple instances of nameservers on a
single server - but that can get ugly).

Your two recursive servers should remain as recursive servers, only
giving replies to your customer base. When you start running DNSSEC,
this becomes even more important: a recursive server running as an
authoritative server for a zone cannot give a proper DNSSEC reply when
asked about Zones carried in its config.

Rather keep things simple.

I would presume that you have multiple authoritative servers for your
"vtt.net" domain. If you need more redundancy, add in more authoritative
nameservers or better still an AnyCast instance. Even any of your local
Authoritative Nameservers should ask your recursive servers when they
need to look up information that is not part of the Zones they manage.
Enough of the preaching.

-oOo-

If you were to run IPv6, a number of errors would disappear, otherwise
force BIND not to do any IPv6. Adding IPv6 though would be preferable.  ;-)

Don't think though that any of this is causing your problem. You could
always upgrade your version of BIND. On my Gentoo Laptop, I'm running
BIND 9.11.0-P3, so you are a bit behind.

-- 
Mark James ELKINS  -  Posix Systems - (South) Africa
m...@posix.co.za   Tel: +27.128070590  Cell: +27.826010496
For fast, reliable, low cost Internet in ZA: https://ftth.posix.co.za



Troubleshooting BIND stops responding

2017-03-29 Thread i . chudov
Greetings to everyone!

I'm an engineer at a local ISP and we have to provide 2 DNS servers running
BIND for our clients. Our logs are full of various BIND errors, but we are
unable to fully understand the problem. The main problem is that the BIND
at 213.80.236.18 sometimes stops responding after working fine for about
a week. BIND then returns no responses at all and we have to restart it.
We suspect a weak DoS attack (weak because other services keep running
normally), but I don't know the right way to determine whether that is the
case. I would be glad if anyone would be so kind as to help us solve this
issue.
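
[Illustrative sketch, not part of the original post.] One way to pin down
exactly when named stops answering is to probe it from a monitoring host and
record the time of the first timeout. The Python below is a minimal sketch
under stated assumptions: the function names build_query and probe, the
3-second timeout and the choice of "vtt.net" as the test name are all
hypothetical. It uses only the standard library and hand-builds a plain
RFC 1035 query with no EDNS.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (RFC 1035 wire format) for `name`."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
        if label
    ) + b"\x00"
    # QTYPE (default 1 = A) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server: str, name: str = "vtt.net", timeout: float = 3.0):
    """Return the RCODE of the reply, or None if no usable reply arrived."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # includes timeouts
            return None
    # Ignore replies that are too short or carry a different query ID.
    if len(reply) < 12 or reply[:2] != query[:2]:
        return None
    flags = struct.unpack("!H", reply[2:4])[0]
    return flags & 0x000F  # low four bits of the flags word are the RCODE
```

Calling probe("213.80.236.18") once a minute from cron and logging the result
would give a timestamp to line up against the BIND logs. Any integer result
(0 is NOERROR, 5 is REFUSED) means named is still answering; None means the
query went unanswered within the timeout.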

The machines have the IPv4 addresses 217.23.80.4 (BIND version 9.9.4) and
213.80.236.18 (BIND version 9.9.5-r3). They have to resolve hostnames only
for ISP customers (and refuse to resolve for others), BUT we want them to
resolve our own zones, such as vtt.net, for anybody who asks, in case the
authoritative nameservers fail.

I can post the configuration files inline or as an attachment if that is
appropriate.

And here are log samples from 213.80.236.18:
dns_more.log (configured as "channel enhlog/severity info;"):
30-Mar-2017 08:19:31.001 rate-limit: stop limiting NXDOMAIN responses to 
213.80.210.0/24 for .  ()
30-Mar-2017 08:19:38.822 resolver: DNS format error from 173.245.59.100#53 
resolving 82.51.18.104.in-addr.arpa/PTR for client 188.168.243.125#15693: 
Name 104.in-addr.arpa (SOA) not subdomain of zone 18.104.in-addr.arpa -- 
invalid response
30-Mar-2017 08:19:38.840 resolver: DNS format error from 173.245.58.100#53 
resolving 82.51.18.104.in-addr.arpa/PTR for client 188.168.243.125#15693: 
Name 104.in-addr.arpa (SOA) not subdomain of zone 18.104.in-addr.arpa -- 
invalid response
30-Mar-2017 08:19:51.428 resolver: clients-per-query decreased to 19
30-Mar-2017 08:19:54.725 resolver: DNS format error from 
205.251.192.232#53 resolving now.dolphin.com/ for client 
100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone 
now.dolphin.com -- invalid response
30-Mar-2017 08:19:54.786 resolver: DNS format error from 
205.251.195.198#53 resolving now.dolphin.com/ for client 
100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone 
now.dolphin.com -- invalid response
30-Mar-2017 08:19:54.848 resolver: DNS format error from 
2600:9000:5307:5600::1#53 resolving now.dolphin.com/ for client 
100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone 
now.dolphin.com -- invalid response
30-Mar-2017 08:19:54.925 resolver: DNS format error from 
2600:9000:5304:6600::1#53 resolving now.dolphin.com/ for client 
100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone 
now.dolphin.com -- invalid response
30-Mar-2017 08:19:54.998 resolver: DNS format error from 
2600:9000:5300:e800::1#53 resolving now.dolphin.com/ for client 
100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone 
now.dolphin.com -- invalid response
30-Mar-2017 08:19:55.060 resolver: DNS format error from 
2600:9000:5303:c600::1#53 resolving now.dolphin.com/ for client 
100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone 
now.dolphin.com -- invalid response

process.log (configured as "channel process/severity notice;"):
29-Nov-2016 07:09:28.266 xfer-in: transfer of 'rpz/IN/global' from 
217.23.80.2#53: failed while receiving responses: connection reset
15-Dec-2016 09:56:41.637 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 10:23:37.125 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 10:53:32.581 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 11:20:08.997 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 11:49:11.461 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 12:20:39.845 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 12:48:14.245 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 13:21:37.708 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 13:55:00.133 xfer-in: transfer of './IN/root' from 
2001:500:2f::f#53: failed to connect: timed out
12-Mar-2017 09:25:09.993 xfer-in: transfer of './IN/root' from 
2620:0:2830:202::132#53: failed while receiving responses: end of file

security.log (configured as "channel security/severity info;"):
30-Mar-2017 08:21:57.558 lame-servers: error (unexpected RCODE REFUSED) 
resolving 'echo-nl03.calyptra-soft.net/A/IN': 62.212.78.199#53
30-Mar-2017 08:21:57.630 lame-servers: error (unexpected RCODE REFUSED) 
resolving 'echo-nl03.calyptra-soft.net/A/IN': 83.149.64.123#53
30-Mar-2017 08:21:57.696 lame-servers: error (unexpected RCODE REFUSED) 
resolving '22.178.87.223.in-addr.arpa/PTR/IN':
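
[Illustrative sketch, not part of the original post.] To judge whether a
handful of clients account for most of the noise, the log samples above could
be summarized by category and by client address. The Python below is a rough
sketch. It assumes the on-disk logs hold one entry per line (the samples here
are wrapped by the mail client) in the "date time category: message" layout
shown above. The name tally is hypothetical. It only sees clients that
happen to appear in logged messages, so it is not a substitute for query
logging.

```python
import re
from collections import Counter

# "30-Mar-2017 08:19:51.428 resolver: clients-per-query decreased to 19"
LINE_RE = re.compile(
    r"^(\d{2}-\w{3}-\d{4}) (\d{2}:\d{2}:\d{2}\.\d{3}) ([\w-]+): (.*)$"
)
# "... for client 188.168.243.125#15693: ..." (IPv4 or IPv6 address)
CLIENT_RE = re.compile(r"for client ([0-9A-Fa-f.:]+)#\d+")


def tally(lines):
    """Count log entries per category and per client address."""
    categories = Counter()
    clients = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        categories[match.group(3)] += 1
        client = CLIENT_RE.search(match.group(4))
        if client:
            clients[client.group(1)] += 1
    return categories, clients
```

Feeding it an open log file, for example tally(open("dns_more.log")), and
printing clients.most_common(10) shows whether a few addresses dominate the
resolver errors. That is a hint about abusive clients, not proof of a DoS.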