Re: Troubleshooting BIND stops responding
On 3/30/17 6:02 AM, Mark Elkins wrote:
> Stopping right here, Recursive lookup and Authoritative services are
> completely different services - and require different servers
> (preferably, though you could run multiple instances of nameservers on a
> single server - but that can get ugly).

Actually, no. Running both recursive and authoritative does not require different servers and does not require running multiple instances of bind. It's not recommended, but it's not hard, and it has worked for lots of people for lots of years.

> Your two recursive servers should remain as recursive servers, only
> giving replies to your customer base. When you start running DNSSEC,
> this becomes even more important, a recursive server running as an
> authoritative server for a zone can not give a proper DNSSEC reply when
> asked about Zones carried in its config.

Actually, the only thing it doesn't do is the validation. It gives responses just fine as long as you aren't validating your own data. Trusting the "AD" bit is a great concept, but you really want to validate as close to the end-point as possible.

> Rather keep things simple.
>
> I would presume that you have multiple authoritative servers for your
> "vtt.net" domain. If you need more redundancy, add in more authoritative
> nameservers or better still an AnyCast instance. Even any of your local
> Authoritative Nameservers should ask your recursive servers when they
> need to look up information that is not part of the Zones they manage.
> Enough of the preaching.

Interesting to go from "keep things simple" to "let's use anycast" in three sentences. Too many people are trying to solve problems that don't exist with additional complexity that causes additional issues elsewhere in the network stack. If your nameserver has issues with basic responses, good luck debugging that while also dealing with routing problems in your network and wondering which server you should actually be looking at.

Sorry to sound like an old grouch, but I'm really feeling like an old grouch these days.

> If you were to run IPv6, a number of errors would disappear, otherwise
> force BIND not to do any IPv6. Adding IPv6 though would be preferable. ;-)

Keep things simple... When your nameserver isn't responding, don't think about running IPv6, fix the problem at hand. And "if you run IPv6, a number of errors disappear". I'm just shaking my head.

> Don't think though that any of this is causing your problem. You could
> always upgrade your version of BIND. On my Gentoo Laptop, I'm running
> BIND 9.11.0-P3, so you are a bit behind.

And there is the useful nugget. Yes, OP, see if your problems continue once you upgrade.

AlanC

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
Re: Troubleshooting BIND stops responding
On 30/03/2017 06:35, i.chu...@volga.ttk.ru wrote:
> Greetings to everyone!
>
> I'm an engineer at a local ISP and we have to provide 2 DNS servers running
> BIND for our clients. We have logs full of various BIND errors but are
> unable to gain full understanding of the problem. The main problem is that
> the BIND at 213.80.236.18 sometimes stops responding after working fine
> for about a week. Then BIND just doesn't return any responses and we have
> to restart it. There is a suspicion of a weak (because other services are
> running normally) DoS attack but I don't know the right way to determine
> if it is so or not. I would be glad if anyone would be so kind as to help
> us solve this issue.
>
> The machines have the IPv4 addresses: 217.23.80.4 (BIND version 9.9.4) and
> 213.80.236.18 (BIND version 9.9.5-r3) and have to resolve hostnames only
> for ISP customers (and refuse to resolve for others) BUT we want to be
> able to resolve our specific zones like vtt.net for anybody trying in case
> of authoritative nameserver failures

Stopping right here, Recursive lookup and Authoritative services are completely different services - and require different servers (preferably, though you could run multiple instances of nameservers on a single server - but that can get ugly).

Your two recursive servers should remain as recursive servers, only giving replies to your customer base. When you start running DNSSEC, this becomes even more important, a recursive server running as an authoritative server for a zone can not give a proper DNSSEC reply when asked about Zones carried in its config.

Rather keep things simple.

I would presume that you have multiple authoritative servers for your "vtt.net" domain. If you need more redundancy, add in more authoritative nameservers or better still an AnyCast instance. Even any of your local Authoritative Nameservers should ask your recursive servers when they need to look up information that is not part of the Zones they manage.

Enough of the preaching.

-oOo-

If you were to run IPv6, a number of errors would disappear, otherwise force BIND not to do any IPv6. Adding IPv6 though would be preferable. ;-)

Don't think though that any of this is causing your problem. You could always upgrade your version of BIND. On my Gentoo Laptop, I'm running BIND 9.11.0-P3, so you are a bit behind.

--
Mark James ELKINS - Posix Systems - (South) Africa
m...@posix.co.za
Tel: +27.128070590 Cell: +27.826010496
For fast, reliable, low cost Internet in ZA: https://ftth.posix.co.za
Troubleshooting BIND stops responding
Greetings to everyone!

I'm an engineer at a local ISP and we have to provide 2 DNS servers running BIND for our clients. We have logs full of various BIND errors but are unable to gain full understanding of the problem. The main problem is that the BIND at 213.80.236.18 sometimes stops responding after working fine for about a week. Then BIND just doesn't return any responses and we have to restart it. There is a suspicion of a weak (because other services are running normally) DoS attack but I don't know the right way to determine if it is so or not. I would be glad if anyone would be so kind as to help us solve this issue.

The machines have the IPv4 addresses: 217.23.80.4 (BIND version 9.9.4) and 213.80.236.18 (BIND version 9.9.5-r3) and have to resolve hostnames only for ISP customers (and refuse to resolve for others) BUT we want to be able to resolve our specific zones like vtt.net for anybody trying in case of authoritative nameserver failures. I can post the configuration files as a citation or attachment if it's appropriate.

And here are log samples from 213.80.236.18:

dns_more.log (configured as "channel enhlog/severity info;"):

30-Mar-2017 08:19:31.001 rate-limit: stop limiting NXDOMAIN responses to 213.80.210.0/24 for . ()
30-Mar-2017 08:19:38.822 resolver: DNS format error from 173.245.59.100#53 resolving 82.51.18.104.in-addr.arpa/PTR for client 188.168.243.125#15693: Name 104.in-addr.arpa (SOA) not subdomain of zone 18.104.in-addr.arpa -- invalid response
30-Mar-2017 08:19:38.840 resolver: DNS format error from 173.245.58.100#53 resolving 82.51.18.104.in-addr.arpa/PTR for client 188.168.243.125#15693: Name 104.in-addr.arpa (SOA) not subdomain of zone 18.104.in-addr.arpa -- invalid response
30-Mar-2017 08:19:51.428 resolver: clients-per-query decreased to 19
30-Mar-2017 08:19:54.725 resolver: DNS format error from 205.251.192.232#53 resolving now.dolphin.com/ for client 100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone now.dolphin.com -- invalid response
30-Mar-2017 08:19:54.786 resolver: DNS format error from 205.251.195.198#53 resolving now.dolphin.com/ for client 100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone now.dolphin.com -- invalid response
30-Mar-2017 08:19:54.848 resolver: DNS format error from 2600:9000:5307:5600::1#53 resolving now.dolphin.com/ for client 100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone now.dolphin.com -- invalid response
30-Mar-2017 08:19:54.925 resolver: DNS format error from 2600:9000:5304:6600::1#53 resolving now.dolphin.com/ for client 100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone now.dolphin.com -- invalid response
30-Mar-2017 08:19:54.998 resolver: DNS format error from 2600:9000:5300:e800::1#53 resolving now.dolphin.com/ for client 100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone now.dolphin.com -- invalid response
30-Mar-2017 08:19:55.060 resolver: DNS format error from 2600:9000:5303:c600::1#53 resolving now.dolphin.com/ for client 100.64.36.162#32772: Name dolphin.com (SOA) not subdomain of zone now.dolphin.com -- invalid response

process.log (configured as "channel process/severity notice;"):

29-Nov-2016 07:09:28.266 xfer-in: transfer of 'rpz/IN/global' from 217.23.80.2#53: failed while receiving responses: connection reset
15-Dec-2016 09:56:41.637 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 10:23:37.125 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 10:53:32.581 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 11:20:08.997 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 11:49:11.461 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 12:20:39.845 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 12:48:14.245 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 13:21:37.708 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
15-Dec-2016 13:55:00.133 xfer-in: transfer of './IN/root' from 2001:500:2f::f#53: failed to connect: timed out
12-Mar-2017 09:25:09.993 xfer-in: transfer of './IN/root' from 2620:0:2830:202::132#53: failed while receiving responses: end of file

security.log (configured as "channel security/severity info;"):

30-Mar-2017 08:21:57.558 lame-servers: error (unexpected RCODE REFUSED) resolving 'echo-nl03.calyptra-soft.net/A/IN': 62.212.78.199#53
30-Mar-2017 08:21:57.630 lame-servers: error (unexpected RCODE REFUSED) resolving 'echo-nl03.calyptra-soft.net/A/IN': 83.149.64.123#53
30-Mar-2017 08:21:57.696 lame-servers: error (unexpected RCODE REFUSED) resolving '22.178.87.223.in-addr.arpa/PTR/IN':
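One way to get a first overview of logs like the samples above is to count entries per category and per minute, so that a burst in one category stands out from the background noise. The following is only a minimal sketch in Python: it assumes every entry sits on its own line in the "date time category: message" layout shown in the samples, and the function names tally_categories and tally_per_minute are made up for this illustration. It does not by itself prove or rule out a DoS attack.

```python
import re
from collections import Counter

# Matches entries in the layout of the samples, for example:
# 30-Mar-2017 08:19:51.428 resolver: clients-per-query decreased to 19
LINE_RE = re.compile(
    r"^(?P<date>\d{2}-[A-Za-z]{3}-\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<category>[a-z-]+): (?P<message>.*)$"
)


def tally_categories(lines):
    """Count log entries per category (resolver, lame-servers, ...)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines in any other layout are silently skipped
            counts[match.group("category")] += 1
    return counts


def tally_per_minute(lines, category):
    """Count entries of one category per minute, keyed by 'date HH:MM'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("category") == category:
            minute = match.group("date") + " " + match.group("time")[:5]
            counts[minute] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python tally.py < dns_more.log
    entries = sys.stdin.readlines()
    for name, total in tally_categories(entries).most_common():
        print(name, total)
```

Comparing the per-minute counts shortly before a hang with those from a normal day may show whether the query load or the error rate actually changes before BIND stops answering.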
Re: troubleshooting bind
On 09.04.12 16:55, Marseglia, Michael wrote:
> I'm troubleshooting a DNS issue we recently experienced where records
> were unresolvable, response NXDOMAIN, from the caching DNS server. I
> flushed the cache using rndc flush and I received the host's ip. There
> were no errors in the system log so I'm enabling debug logging should it
> occur again. I'm still not sure what caused the NXDOMAIN response, so I'm
> reviewing my BIND config and taking a look at the default values.

The NXDOMAIN answer was apparently returned by one of the servers that are authoritative for the domain or for the domains above it. Check all servers in the resolution path for the answer.

This can happen with master/slave synchronization problems, multiple masters, or a missing delegation to a subdomain, all of which are quite common.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
They say when you play that M$ CD backward you can hear satanic messages.
That's nothing. If you play it forward it will install Windows.
troubleshooting bind
Hello,

I'm troubleshooting a DNS issue we recently experienced where records were unresolvable, response NXDOMAIN, from the caching DNS server. I flushed the cache using rndc flush and I received the host's ip. There were no errors in the system log so I'm enabling debug logging should it occur again. I'm still not sure what caused the NXDOMAIN response, so I'm reviewing my BIND config and taking a look at the default values.

When configuring BIND for an internal corporate network with a thousand clients should any of the default values be tweaked? I've searched for tuning guidance but I haven't found any yet. I've taken interest in the tcp-clients, max-ncache-ttl, max-cache-ttl, cleaning-interval and max-cache-size values. These are all currently set to default.

I'm guessing in a more volatile network with DHCP and frequent provisioning/deprovisioning of hosts I would want to lower the max-ncache-ttl and max-cache-ttl values. Is this correct?

Regarding the tcp-clients option, where can I find the current connection count and how do I know if I'm coming close to this number? In what type of environment would it be expected to hit the default threshold of 100?

Lastly, if max-cache-size is set to unlimited what happens if BIND consumes all the available memory? Will the linux kernel terminate the process? How can I find the value of the current cache size?

Mike Marseglia
Network Engineer, CharterCARE
p: 401-456-2331
c: 401-248-4867
e: michael.marseg...@chartercare.org
t: @mmars
Re: troubleshooting bind
Hi--

On Apr 9, 2012, at 9:55 AM, Marseglia, Michael wrote:
[ ... ]
> When configuring BIND for an internal corporate network with a thousand
> clients should any of the default values be tweaked? I’ve searched for
> tuning guidance but I haven’t found any yet. I’ve taken interest in the
> tcp-clients, max-ncache-ttl, max-cache-ttl, cleaning-interval and
> max-cache-size values. These are all currently set to default.

These are good things to take a look at, yes, although also look at clients-per-query and max-clients-per-query.

> I’m guessing in a more volatile network with DHCP and frequent
> provisioning/deprovisioning of hosts I would want to lower the
> max-ncache-ttl and max-cache-ttl values. Is this correct?

That depends-- if the volatile domain is your domain, and BIND is authoritative for it, then it will be providing authoritative answers (AA) directly from zone data, rather than caching responses obtained from some other nameserver.

For the most part, it's better for an active domain with frequently changing data to adjust the TTLs for the domain to appropriate values, and let named figure things out from there...but you can only tweak that for the domains you manage.

> Regarding the tcp-clients option, where can I find the current connection
> count and how do I know if I’m coming close to this number? In what type
> of environment would it be expected to hit the default threshold of 100?

You can see what active TCP sessions are open via something like:

netstat -p tcp | grep 53

...and add | wc -l if you want to count them. (You might also want to tweak that a bit to use fgrep .53\ to only match port 53...)

I don't think it's expected that many TCP sessions would be needed, since UDP + EDNS0 works fine for almost all cases, although as DNSSEC becomes more widely adopted it might be the case that more TCP sessions will be used.

> Lastly, if max-cache-size is set to unlimited what happens if BIND
> consumes all the available memory? Will the linux kernel terminate the
> process? How can I find the value of the current cache size?

Most platforms set up a process datasize limit (commonly set to 1GB or so), after which malloc() and friends will fail to get more memory. The kernel will only terminate processes if the entire system runs out of VM, including swap space, but the system will generally be in an unusable state due to heavy paging/swapping before the kernel OOM killer gets invoked.

Regards,
--
-Chuck
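The netstat and grep idea above can also be written as a small filter that matches only the local port column, which avoids counting addresses or foreign ports that merely contain "53". This is a rough sketch in Python, not a definitive tool: it assumes numeric netstat output (for example with -n) whose columns are protocol, Recv-Q, Send-Q, local address, foreign address and state, and it accepts both the "addr.53" and "addr:53" spellings of the port. The names count_dns_tcp_sessions and near_limit, and the 80% warning fraction, are invented for this example; 100 is the tcp-clients default mentioned in the question.

```python
import re

# A local address that ends in port 53, written "addr.53" or "addr:53".
PORT_53_RE = re.compile(r"[.:]53$")


def count_dns_tcp_sessions(netstat_lines):
    """Count TCP connections whose local port is 53, ignoring listeners."""
    count = 0
    for line in netstat_lines:
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] == "LISTEN":
            continue  # a listening socket is not a client session
        if PORT_53_RE.search(fields[3]):
            count += 1
    return count


def near_limit(count, limit=100, fraction=0.8):
    """Return True once the count reaches the given fraction of the limit."""
    return count >= limit * fraction


if __name__ == "__main__":
    import sys

    # Hypothetical usage: netstat -an | python count_tcp53.py
    sessions = count_dns_tcp_sessions(sys.stdin.readlines())
    print(sessions, "TCP sessions on port 53")
    if near_limit(sessions):
        print("warning: close to the tcp-clients limit")
```

Running this periodically and keeping the numbers gives a baseline, so a jump toward the limit is noticed before clients start seeing refused TCP connections.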