Bind DoS?

Attila Nagy Sat, 03 Sep 2005 16:13:12 -0700

Hello,

I am currently trying to set up two caching nameservers and noticed an interesting behaviour.


The configuration is the following:
two FreeBSD/amd64 6-CURRENT machines, with single Opteron processors.

Bind was compiled from ports, without threading, with gcc34 (from ports), with -O2 -static. It runs in a jail, with nothing more than the config and a nearly empty devfs mount.


Machine A has a simple config of the following:
options {
        directory "/etc/bind";
        tcp-clients 256;
        recursive-clients 8192;
        max-cache-size 600M;
        minimal-responses yes;
        pid-file "/tmp/named.pid";
        forwarders { MACHINE_B_IP; };
};

Machine B has the same bind, but runs as an authoritative NS with a wildcard ("joker") record of:

*       IN      TXT     "256xA"
in the . zone (so it answers 256 "A"s for everything).

The test:
from machine B I start a queryperf, this way:
queryperf -d list -s MACHINE_A_IP

where list has the following:
www.RANDOMNUMBER.hu TXT
[...] repeated 9000000 times, each with a different random number.
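
For reference, a list in this format can be generated with a short script. This is only a minimal sketch, not the script used for the test above: the function names, the range of the random numbers and the optional seed are assumptions made for illustration.

```python
import random


def make_query_list(count, seed=None, max_number=10**9):
    """Yield queryperf input lines of the form 'www.<number>.hu TXT'.

    max_number and seed are illustrative assumptions; the original
    test only says that the label is a random number.
    """
    rng = random.Random(seed)
    for _ in range(count):
        yield f"www.{rng.randrange(max_number)}.hu TXT"


def write_query_list(path, count, seed=None):
    """Write `count` query lines to `path`, one query per line."""
    with open(path, "w") as out:
        for line in make_query_list(count, seed):
            out.write(line + "\n")
```

Calling write_query_list("list", 9000000) would produce a file that can be passed to queryperf with -d, as in the command above.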

During the test, machine A fills its cache up to about 860 MB. Until then, I see this in top:
CPU states: 27.7% user, 0.0% nice, 58.1% system, 14.2% interrupt, 0.0% idle

On machine B, queryperf receives answers within the default timeout (5 seconds).

After bind reaches about 860 MB, it starts to eat CPU, so there is 100% user and nearly 0% system and interrupt usage.


queryperf starts to time out with the following:
[Timeout] Query timed out: msg id 64837
Warning: Received a response with an unexpected (maybe timed out) id: 64837

The server effectively dies: it can answer only a very small number of queries, and with very low performance. If I stop queryperf, bind remains in the CPU-eating state:

76423 bind        1 129    0   861M   862M RUN      8:30 97.71% named

Because the machine has much more RAM, I first tried with 1200M in the config. The server reached its "zombie" state at around 1600 MB of usage and was very unresponsive.

On another (real) server, I noticed similar behaviour this week. Bind started to eat all CPU resources; there were only "recursive quota reached" messages in the logs, but rndc status reported only very low usage (for example 60/1024 on that server).


I can repeat this with and without patch-lib_dns_resolver.c.

If I stop the queries, the server starts to answer queries again in a few minutes, after it has finished its strange "CPU eating" loop.


ktrace shows it doing this many, many times between two successful queries:
 76423 named    CALL  gettimeofday(0x7fffffffe450,0)
 76423 named    RET   gettimeofday 0
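
To put a number on "many, many times", the kdump output can be tallied per system call. The following is a minimal sketch that assumes the trace lines look like the two shown above (pid, command, CALL, then the call name); the regular expression and the function name are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Matches kdump lines such as:
#  76423 named    CALL  gettimeofday(0x7fffffffe450,0)
CALL_RE = re.compile(r"^\s*\d+\s+\S+\s+CALL\s+(\w+)\(")


def count_calls(lines):
    """Return a Counter mapping system call name to number of CALL lines."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of a kdump dump (for example by iterating over sys.stdin) and printing count_calls(...).most_common() would show whether gettimeofday really dominates the trace.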

Any ideas?

Thanks,
--
Attila Nagy                                   e-mail: [EMAIL PROTECTED]
Free Software Network (FSN.HU)           phone @work: +361 371 3536
ISOs: http://www.fsn.hu/?f=download            cell.: +3630 306 6758
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
