Re: How does BIND 9 scale with multithreading?
2010/9/29 Eivind Olsen eiv...@aminor.no:
> Does anyone know if there are any benchmarks out in the public which could give some insight into how well BIND 9 scales with multithreading? I've tried looking on this list, and googling, but haven't found anything yet. To be a bit more specific: I'm not sure what a good option for server hardware would be for a recursive DNS server. On one hand, the Sun (ok, Oracle) Niagara/CoolThreads architecture seems to work nicely enough, but maybe I'd be better off with some generic Intel/AMD based solution with fewer threads/cores but higher GHz per thread?

I did some tests, and Niagara (T1000 / T5240) performs badly (response time and rate) compared to Intel/AMD. Some numbers at 75% CPU:

- T1000, 6 cores / 24 threads: ~10ms, 600 queries/second
- 2-core AMD 1210 1.8GHz: ~0.6ms, 7000 queries/second
- 8-core Intel E5410 2.33GHz: ~0.6ms, 7000 queries/second

-- Fabien
___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
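For anyone wanting to reproduce numbers like these: queryperf (shipped in BIND's contrib/ directory) and dnsperf both take a plain-text file of "name type" pairs. A minimal sketch; the resolver address 192.0.2.1 is a placeholder, and the dnsperf run is only attempted if the tool happens to be installed:

```shell
# Build a small query file: one "name type" pair per line.
cat > queries.txt <<'EOF'
www.example.com A
example.net MX
isc.org NS
EOF

# Sanity-check the file before the run.
wc -l < queries.txt

# Hypothetical run (only if dnsperf is installed): -s server, -d datafile,
# -l run length in seconds. Watch the reported queries/second and latency.
if command -v dnsperf >/dev/null 2>&1; then
    dnsperf -s 192.0.2.1 -d queries.txt -l 60
fi
```

A real benchmark would use a much larger query file captured from production traffic, since cache-hit ratio dominates recursive-server throughput.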
Re: BIND hangs when named reaches 500-600 MB
2010/7/8 khanh rua duonghoahoc_k4...@yahoo.com:
> Hi, I installed BIND as a cache server on Solaris 10, on a Sun SPARC T5140. There is a problem: BIND always hangs when named reaches 500-600 MB (checked with 'prstat'). I have several servers and all have this problem, even when I install BIND in a zone or try a 64-bit version. The T5140 is a powerful server, but BIND can't make use of its power. I'm new to BIND, so I have only tried a few other things, without success. What should I do to track down this problem?

Is this specific to the T5140? Which server type did you use before? Some time ago, I did some simple benchmarks (dnsperf / queryperf) on a T1000 and a T5240 and the results were bad. My numbers (BIND caching server):

- SUN X2100: can serve 7000 queries/s with 0.6-1ms response time
- SUN T1000: can serve 600 queries/s with 10-15ms response time (above 600 queries/s, response time jumps over 100ms)

You should do some benchmarking (and make heavy use of rndc stats) before choosing a new architecture.

-- Fabien
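A rough way to turn the suggested periodic `rndc stats` dumps into a queries-per-second figure is to diff the cumulative "requests received" counter between two named.stats snapshots. A sketch with made-up counter values (in practice the two numbers come from snapshots taken, say, 60 seconds apart):

```shell
# Two successive "IPv4 requests received" counters, 60s apart (made-up values).
prev=437118882
curr=437226882
interval=60

# The counters are cumulative, so the rate is the delta over the interval.
rate=$(( (curr - prev) / interval ))
echo "${rate} queries/second"
# → 1800 queries/second
```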
bind 9.6.2 / solaris 10 intel / gcc 4 / compilation warning
Hi, when compiling BIND, I saw some warnings. My build box:

- Solaris 10 U8 i386
- gcc (GCC) 4.3.4 (from BlastWave.org)

./configure --without-openssl --prefix=/opt/bind-9.6.2 --sysconfdir=/etc --localstatedir=/var --disable-ipv6 --enable-threads

gcc -I/opt/compil/bind-9.6.2 -I./include -I./../pthreads/include -I../include -I./../include -I./.. -D_REENTRANT -D_XPG4_2 -D__EXTENSIONS__ -g -O2 -I/usr/include/libxml2 -W -Wall -Wmissing-prototypes -Wcast-qual -Wwrite-strings -Wformat -Wpointer-arith -fno-strict-aliasing -c net.c

net.c:109: warning: braces around scalar initializer
net.c:109: warning: (near initialization for 'once_ipv6pktinfo.__pthread_once_pad[0]')
net.c:109: warning: excess elements in scalar initializer
net.c:109: warning: (near initialization for 'once_ipv6pktinfo.__pthread_once_pad[0]')
net.c:109: warning: excess elements in scalar initializer
net.c:109: warning: (near initialization for 'once_ipv6pktinfo.__pthread_once_pad[0]')
net.c:109: warning: excess elements in scalar initializer
net.c:109: warning: (near initialization for 'once_ipv6pktinfo.__pthread_once_pad[0]')
net.c:113: (the same block of warnings repeats, near initialization for 'once.__pthread_once_pad[0]')
net.c:370: warning: 'initialize_ipv6pktinfo' defined but not used

The same warnings appear in these files: net.c:113, strerror.c:51, hash.c:103, lib.c:42, mem.c:114, random.c:39, result.c:110, lib.c:57, result.c:55, acl.c:481, db.c:68, dlz.c:83, lib.c:44, name.c:198, result.c:188, dst_lib.c:46, dst_result.c:58, statschannel.c:76, lwresd.c:73

-- Fabien
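When a build spews many copies of the same diagnostic, as here where the pthread_once-pad initializer warning repeats across many files, it helps to deduplicate the log before deciding what can be ignored. A sketch, assuming the build output was saved to build.log (the sample log below is abbreviated for illustration):

```shell
# Sample of a build log (in practice: make 2>&1 | tee build.log).
cat > build.log <<'EOF'
net.c:109: warning: braces around scalar initializer
net.c:109: warning: excess elements in scalar initializer
net.c:113: warning: braces around scalar initializer
strerror.c:51: warning: braces around scalar initializer
EOF

# Count each distinct warning text, most frequent first.
# cut drops the "file:line:" prefix so identical messages collapse together.
grep 'warning:' build.log | cut -d: -f3- | sort | uniq -c | sort -rn
```

Here the "braces around scalar initializer" group comes from Solaris defining PTHREAD_ONCE_INIT with nested braces, which gcc's -Wall flags; it is noisy but harmless.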
Re: bind 9.6.2 with threads hangs
2010/3/19 Chris Thompson c...@cam.ac.uk:
> On Mar 19 2010, David Ford wrote:
>> BIND has long had issues with threading since it started supporting threaded operation. I recommend you simply recompile without thread support. I retry compiling with thread support about twice a year and, as of late last year, BIND still hung soon after restart with threading enabled.
> Experiences seem to differ widely in this respect. We've been running BIND threaded for many years now, on Solaris platforms (currently 9.6.2 under Solaris 10 x86), without encountering this sort of problem.

How many queries does your named answer?

> To the OP: do you specify max-cache-size? If not, what does the memory consumption of BIND look like when it gets into the non-functional state?

Yes, max-cache-size 512M, but the named process takes ~900MB.

-- Fabien
Re: bind 9.6.2 with threads hangs
2010/3/22 Cathy Almond cat...@isc.org:
> Fabien Seisen wrote:
>> yes, max-cache-size 512M but the named process takes ~900MB
> The extra memory is for keeping track of recursive clients (i.e. in-progress client queries).

ok

> This doesn't sound like a hugely loaded server,

Exactly. In my own tests (with real-life queries), the server can handle ~7000 queries/s with ~1ms response time at 70% CPU and no packet loss.

> else it's somewhat throttled (not particularly large cache and probably default limit on recursive clients). What kind of query rates do you have? Do you get any logging that suggests resource problems? If so, you might need to increase some of the limits.

We have a pool of several more or less identical servers behind a load-balancer. On average, each server gets 1800 queries/s, and 4000 at peak. The problem occurs every few weeks, and never on all servers at the same time. The recursive-clients setting is not modified (rndc status: recursive clients: 188/2900/3000) and we have:
- on average: 200 recursive clients
- at peak: 600

> It's intriguing that you're seeing the same issues on two BIND versions and two OSes (and that other people's experience is different from yours)

Only Solaris 10:
- Solaris 10 U6 with BIND 9.5.1-P3 with threads, compiled with SUNWspro 12
- Solaris 10 U6 with BIND 9.6.2 with threads, compiled with gcc

> - it suggests to me that it's specific to your configuration or client base/queries or your environment.

We get real-life queries from customers (evil?). A simple rndc flush revives named. Perhaps a badly formatted packet freezes named or creates a cache deadlock. Can something go wrong in the cache? I am not fluent with core files, but I have got one in my pocket.

> For troubleshooting I'd start by looking at the logging output - if you've got any categories going to null, un-suppress them temporarily; and add query-errors (see the 9.6.2 ARM).
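The `recursive clients: 188/2900/3000` line from rndc status is, as I understand it, current/soft-limit/hard-limit. A quick sketch for pulling those fields apart in a monitoring script; the sample line is copied from the post above:

```shell
line='recursive clients: 188/2900/3000'

# Split current / soft quota / hard quota out of the counter triple,
# treating both spaces and slashes as field separators.
echo "$line" | awk -F'[ /]+' '{ printf "current=%s soft=%s hard=%s\n", $3, $4, $5 }'
# → current=188 soft=2900 hard=3000
```

Graphing the "current" value over time would show whether the hang correlates with the recursive-clients quota filling up.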
> Then perhaps do some sampling of network traffic (perhaps there's a UDP message size/fragmentation issue) to see what's happening (or not).

All categories go to a non-null destination, and we do not use any 9.6.2-specific configuration options. I did not notice any weird log messages (besides the regular: shutting down due to TCP receive error: 202.96.209.6#53: connection reset). Here is our log config:

category client { client.log; };
category config { config.log; default_syslog; };
category database { database.log; default_syslog; };
category default { default.log; default_syslog; };
category delegation-only { delegation-only.log; };
category dispatch { dispatch.log; };
category general { default.log; };
category lame-servers { lamers.log; };
category network { network.log; };
category notify { notify.log; default_syslog; };
category queries { queries.log; };
category resolver { resolver.log; };
category security { security; };
category unmatched { unmatched.log; };
category update { update.log; };
category xfer-in { xfer-in.log; default_syslog; };
category xfer-out { xfer-out.log; default_syslog; };

-- Fabien
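Following Cathy's suggestion, the query-errors category (introduced in BIND 9.6) could be routed to its own file in the same style as the categories above. A hedged named.conf sketch; the channel name, file path, and rotation settings here are illustrative assumptions, not from the original config:

```
// Hypothetical channel for Cathy's suggested query-errors category.
channel query-errors.log {
    file "log/query-errors.log" versions 3 size 20m;  // assumed path/rotation
    severity info;
    print-time yes;
};
category query-errors { query-errors.log; };
```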
bind 9.6.2 with threads hangs
Hi,

We have several recursive cache BIND servers and are experiencing weird things when named is compiled with threads. In 4 steps:

1) everything goes OK
2) for ~1h, named begins to answer more slowly (0.5ms to 100ms), with these symptoms:
- load increases on the server (from 0.3 to 4)
- number of recursive queries increases (+500%)
- number of recursive slots increases (from 200 to 600)
- cache hit rate decreases (from 9X% to …)
- number of cache entries drops from 2M to 0
3) named answers no queries:
- no recursive queries
- 0 entries in cache
- rndc stats/status still works
4) we flush the named cache (rndc flush) and everything goes OK again

We do an rndc stats every minute to get some stats.

Hardware:
- Intel or AMD with a total of 4 or 8 cores
- Solaris 10
- BIND 9.6.2 with threads (gcc) or BIND 9.5.1-P3 with threads (SUNWspro)

Any clue? Some numbers from named.stats:

++ Name Server Statistics ++
437118882 IPv4 requests received
++ Zone Maintenance Statistics ++
++ Resolver Statistics ++
120096973 IPv4 queries sent
29784114 queries with RTT < 10ms
49289542 queries with RTT 10-100ms
33448291 queries with RTT 100-500ms
277957 queries with RTT 500-800ms
105059 queries with RTT 800-1600ms
31079 queries with RTT > 1600ms
[View: _bind]
++ Socket I/O Statistics ++
120075062 UDP/IPv4 sockets opened
35059 TCP/IPv4 sockets opened
120074870 UDP/IPv4 sockets closed
42651 TCP/IPv4 sockets closed
13116 UDP/IPv4 socket bind failures
5513 TCP/IPv4 socket connect failures
120061921 UDP/IPv4 connections established
6901 TCP/IPv4 connections established
7599 TCP/IPv4 connections accepted
276089 UDP/IPv4 recv errors
315 TCP/IPv4 recv errors
++ Cache DB RRsets ++
[View: mire]
[View: abonnes]
885677 A
751488 NS
171869 CNAME
144655 PTR
312051 MX
41667 RRSIG
38816 NSEC
130572 NXDOMAIN

-- Fabien
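The resolver RTT buckets in a named.stats dump can be turned into a quick latency profile. A sketch using the exact counters from the dump above, computing what fraction of outgoing queries completed in under 100ms:

```shell
total=120096973        # IPv4 queries sent
under10=29784114       # queries with RTT < 10ms
b10_100=49289542       # queries with RTT 10-100ms

# Share of upstream queries answered in under 100ms, in percent.
awk -v t="$total" -v a="$under10" -v b="$b10_100" \
    'BEGIN { printf "%.1f%% under 100ms\n", 100 * (a + b) / t }'
# → 65.8% under 100ms
```

Tracking this percentage per stats interval, rather than cumulatively, would make the degradation phase (step 2 above) show up clearly.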