Re: Need to improve named performance

Kevin Darcy Sun, 11 Nov 2012 12:49:21 -0800

On 11/10/2012 1:39 PM, Ed LaFrance wrote:

Hello all -
First post to this list, hope I'm in the right place.
Running BIND 9.3.6-P1-RedHat-9.3.6-16.P1.el5 on a quad-core Xeon server (3 GHz) with 2 GB RAM. Named is being used only for rDNS queries against our address space.
The issue is that named is not keeping up with rDNS requests. The nameserver is only doing rDNS, and it's the only public process on the server (no web hosting, monitoring, etc.).
When I check the router above this server I'll see 200 - 500 legitimate connections to this server at any given time. This is what's happening: named is not keeping up with the requests, so the network receive queue fills up - I can see this with netstat:
netstat -tulpn | grep :53
Proto Recv-Q Send-Q Local Address          Foreign Address   PID/Program name
...
udp   110048      0 xxx.xxx.xxx.xxx:53     0.0.0.0:*         3918/named
udp   110048      0 xxx.xxx.xxx.xxx:53     0.0.0.0:*         3918/named

(two different IPs are on this machine to handle rDNS requests)
Once the queue gets near the max value set by sysctl, UDP packets start to drop - this can also be seen in netstat:
 netstat -su
...
Udp:
    5157567 packets received
    9761 packets to unknown port received.
    1164232 packet receive errors
    5157554 packets sent
The errors apparently correspond to drops; they only increase when the queue is full.
Of course by this point DNS queries are timing out. I've tried increasing the queue size with sysctl using this command:
sysctl -w net.core.rmem_max=1048576 net.core.rmem_default=10485
then restarting named; that did eliminate the drops, but the queue grows gigantic and I get pretty much 100% DNS lookup timeouts at that point.
The server load is about 2.0 - busy, but not overwhelmed. I can run a shell or even a GUI session on it with ease, so it's by no means maxed out. Here's the first slice of top output:
top - 09:13:38 up 18:40,  1 user,  load average: 2.09, 2.05, 2.00
Tasks: 175 total,   1 running, 174 sleeping,   0 stopped,   0 zombie
Cpu(s): 0.2%us, 0.2%sy, 0.0%ni, 74.8%id, 24.7%wa, 0.0%hi, 0.2%si, 0.0%st
Mem:   2074984k total,  1743584k used,   331400k free,   166588k buffers
Swap:  4128760k total,       28k used,  4128732k free,  1270032k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 4509 named     24   0 71004 4580 2036 S  1.3  0.2   0:46.74 named
 6877 root      15   0  2428 1064  788 R  0.7  0.1   0:00.04 top
  467 root      10  -5     0    0    0 D  0.3  0.0   2:59.13 kjournald
 2460 root      18   0  1816  584  484 D  0.3  0.0   3:30.35 syslogd
    1 root      15   0  2160  644  556 S  0.0  0.0   0:01.08 init
The bottom line is: I need to improve named performance. Tcpdump only shows about 20 requests per second on average, I would estimate. This should be handled easily, but instead it's gagging on it and the requests are stacking up. If you have any ideas, I welcome your input.
Here's named.conf; the global config is pretty basic, and the data for each zone is stored separately elsewhere:
options {
        directory "/var";
        auth-nxdomain no;
        pid-file "/var/run/named/named.pid";
        allow-recursion {
                localnets;
        };

        allow-transfer {
            "none";
        };
};

key "rndc-key" {
        algorithm hmac-md5;
        secret "xxxxxxxxxxxxxxxxxxxxxx";
};

controls {
        inet 127.0.0.1 port 953
        allow { 127.0.0.1; } keys { "rndc-key"; };
};

zone "." {
        type hint;
        file "named.root";
};

zone "0.0.127.IN-ADDR.ARPA" {
        type master;
        file "localhost.rev";
};

I wouldn't expect a nameserver process on Linux, hosting only a few reverse zones and doing nothing else, to be 71 megabytes in size; I just checked one of ours, serving *all* of our internal zone data, forward and reverse authoritative, plus some cached data for a significant number of zones delegated to business partners, and it's less than 100 MB in size.

Verify from your query logs, or by dumping cache, that it's *only* doing what it is supposed to do, and no more. If you've got a bunch of data in your cache, or a bunch of queries, that's unrelated to serving your reverse DNS, then that's probably the root cause of your problem. Consider turning off recursion, or severely limiting it, in order to enforce that the nameserver is only serving its intended purpose. 2 GB of memory is a little lean for a nameserver serving a *generic* Internet-name-lookup role...
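
As a rough way to do the query-log check, something like the sketch below could split logged queries into "reverse" and "everything else". It assumes query logging is on (for example, toggled with "rndc querylog") and that each log line contains the token "query:" followed by the query name; the function name summarize_queries and the sample line in the comment are made up for illustration, and the exact log format may differ on your build.

```shell
# Count logged queries by whether the name falls under in-addr.arpa.
# Assumption: each query log line contains the token "query:" followed
# by the query name, for example:
#   client 192.0.2.10#4321: query: 5.2.0.192.in-addr.arpa IN PTR +
# Reads the files given as arguments, or standard input if none.
summarize_queries() {
    awk '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "query:") {
                    name = tolower($(i + 1))
                    sub(/\.$/, "", name)    # drop a trailing dot, if any
                    if (name ~ /(^|\.)in-addr\.arpa$/) reverse++
                    else other++
                    break
                }
            }
        }
        END { printf "reverse=%d other=%d\n", reverse, other }
    ' "$@"
}
```

Point it at whatever file your query log ends up in. A large "other" count would suggest the server is answering (and probably recursing for) names that have nothing to do with your reverse zones.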

I guess another possibility is that you've gone crazy with your reverse zones (e.g. using $GENERATE willy-nilly), and thus are using up way more memory than you really need to serve your reverse-resolution needs.
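
To get a feel for how much a zone file's $GENERATE lines expand to, a sketch along these lines could add up the range sizes. It assumes the range is the second field, written as start-stop or start-stop/step, and that each iteration yields one record; count_generated is a hypothetical helper name, and this only counts records, it does not measure memory.

```shell
# Sum the number of records that the $GENERATE lines in a zone file
# would expand to. Assumption: the range is the second field, in the
# form start-stop or start-stop/step, one record per iteration.
# Reads the files given as arguments, or standard input if none.
count_generated() {
    awk '
        toupper($1) == "$GENERATE" {
            n = split($2, r, "[-/]")            # start, stop, optional step
            step = (n >= 3 && r[3] + 0 > 0) ? r[3] + 0 : 1
            if (r[2] + 0 >= r[1] + 0)
                total += int((r[2] - r[1]) / step) + 1
        }
        END { printf "%d\n", total }
    ' "$@"
}
```

Running it over each reverse zone file gives a quick total to compare against the number of addresses you actually need PTR records for.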


                                    - Kevin



bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users