Re: Bind named 9.7.2-P2 segfault and core dump when in debug mode
In message <62426.10.0.66.17.1285784847.squir...@interact.purplecow.org>, Denni s Clarke writes: > > I am trying to track down a bit of strange behavior. Not sure if anyone > else sees this. > > I tend to run named in the foreground and in debug level 2 for a while > after I compile it. If all looks good then I can run it as a service > daemon in the usual way. > > This means I run it like so : > > bash-3.00# /opt/csw/sbin/named -4 -c /etc/opt/csw/named.conf \ > > -d 2 -f -g -n 1 -p 53 -u named > > Note the -d 2 there. > > 29-Sep-2010 17:31:43.715 starting BIND 9.7.2-P2 -4 -c > /etc/opt/csw/named.conf -d 2 -f -g -n 1 -p 53 -u named > . > . > . > > Everything seems to be fine until I saw this after 20 minutes or so : > > 29-Sep-2010 17:40:35.964 error (unexpected RCODE REFUSED) resolving > '243.136.240.111.dun.dnsrbl.net/A/IN': 66.11.124.26#53 > 29-Sep-2010 17:40:35.965 client 66.225.151.243#45979: query failed > (SERVFAIL) for 243.136.240.111.dun.dnsrbl.net/IN/A at query.c:4650 > > >At this point it is just hung. > > > So I started it up again in the exact same way and then went to a client > machine and issued this query via dig : > > $ /opt/csw/bin/dig +qr @ns1.blastwave.org 243.136.240.111.dun.dnsrbl.net > > ; <<>> DiG 9.7.2-P2 <<>> +qr @ns1.blastwave.org > 243.136.240.111.dun.dnsrbl.net > ; (1 server found) > ;; global options: +cmd > ;; Sending: > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 430 > ;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 > > ;; QUESTION SECTION: > ;243.136.240.111.dun.dnsrbl.net.IN A > > ;; Got answer: > ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 430 > ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 > > ;; QUESTION SECTION: > ;243.136.240.111.dun.dnsrbl.net.IN A > > ;; Query time: 235 msec > ;; SERVER: 66.225.151.252#53(66.225.151.252) > ;; WHEN: Wed Sep 29 17:47:25 2010 > ;; MSG SIZE rcvd: 48 > > So that response looks okay but > > QUERY, status: SERVFAIL, id: 430 > > also the named process was hung again : > > 29-Sep-2010 17:47:24.850 createfetch: 243.136.240.111.dun.dnsrbl.net A > 29-Sep-2010 17:47:24.986 error (unexpected RCODE REFUSED) resolving > '243.136.240.111.dun.dnsrbl.net/A/IN': 66.11.124.26#53 > 29-Sep-2010 17:47:25.081 lame server resolving > '243.136.240.111.dun.dnsrbl.net' (in 'dnsrbl.net'?): 66.11.124.30#53 > 29-Sep-2010 17:47:25.082 client 66.225.151.227#24722: query failed > (SERVFAIL) for 243.136.240.111.dun.dnsrbl.net/IN/A at query.c:4650 > Killed > > So I figured I would run this in debug level 4 : > > bash-3.00# /opt/csw/sbin/named -4 -c /etc/opt/csw/named.conf -d 4 -f -g -n > 1 -p 53 -u named > 29-Sep-2010 17:48:56.400 starting BIND 9.7.2-P2 -4 -c > /etc/opt/csw/named.conf -d 4 -f -g -n 1 -p 53 -u named > . > . > . > 29-Sep-2010 17:48:56.455 loading configuration from '/etc/opt/csw/named.conf' > 29-Sep-2010 17:48:56.470 reading built-in trusted keys from file > '/etc/opt/csw/bind.keys' > Segmentation Fault (core dumped) > > bash-3.00# file core > core: ELF 32-bit MSB core file SPARC Version 1, from 'named' > > wow .. that is really not good. > > What failed ? > > bash-3.00# mdb /opt/csw/sbin/sparcv8/named core > Loading modules: [ libc.so.1 ld.so.1 ] > > ::status > debugging core file of named (32-bit) from callistoz > file: /opt/csw/sbin/sparcv8/named > initial argv: > /opt/csw/sbin/named -4 -c /etc/opt/csw/named.conf -d 3 -f -g -n 1 -p 53 -u > name > threading model: multi-threaded > status: process terminated by SIGSEGV (Segmentation Fault) > > > $c > libc.so.1`strlen+0x18(cf73c, 2000, a7cb0, fe94fc00, cf720, fe8c0d60) > libisc.so.62`isc_log_doit+0x794(cf700, fee27ab0, c4f44, 3, 0, 0) > libisc.so.62`isc_log_write+0x60(cf700, fee27ab0, c4f44, 3, a7cb0, a7cd8) > set_limit+0x210(a7cb0, a7cd8, a7cd8, 9, , fffd) > set_limits+0x64(fe94fdf8, d89e0, ff0719c0, fe94fe14, a8028, d89e0) > load_configuration+0x7d4(ffbffe22, dc758, 1, 0, ea758, 5fc28) > run_server+0x420(ea758, e89c8, fe935840, 0, fe7e0200, fe8c21f0) > libisc.so.62`dispatch+0x7d8(d6758, 0, 0, 0, fe7e0200, 1) > libisc.so.62`run+0x14(d6758, fe95, 0, 0, fedde7f0, 0) > libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0) > > > ::stack > libc.so.1`strlen+0x18(cf73c, 2000, a7cb0, fe94fc00, cf720, fe8c0d60) > libisc.so.62`isc_log_doit+0x794(cf700, fee27ab0, c4f44, 3, 0, 0) > libisc.so.62`isc_log_write+0x60(cf700, fee27ab0, c4f44, 3, a7cb0, a7cd8) > set_limit+0x210(a7cb0, a7cd8, a7cd8, 9, , fffd) > set_limits+0x64(fe94fdf8, d89e0, ff0719c0, fe94fe14, a8028, d89e0) > load_configuration+0x7d4(ffbffe22, dc758, 1, 0, ea758, 5fc28) > run_server+0x420(ea758, e89c8, fe935840, 0, fe7e0200, fe8c21f0) > libisc.so.62`dispatch+0x7d8(d6758, 0, 0, 0, fe7e0200, 1) > libisc.so.62`run+0x14(d6758, fe95, 0, 0, fedde7f0, 0) > libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0) > > bash-3.00# pstack core > core 'core' of 1647:/opt/csw/sbin/named -4 -c /etc/opt/csw/named.conf > -d 3 -f -g -n 1 -p 5
Bind named 9.7.2-P2 segfault and core dump when in debug mode
I am trying to track down a bit of strange behavior. Not sure if anyone else sees this. I tend to run named in the foreground and in debug level 2 for a while after I compile it. If all looks good then I can run it as a service daemon in the usual way. This means I run it like so : bash-3.00# /opt/csw/sbin/named -4 -c /etc/opt/csw/named.conf \ > -d 2 -f -g -n 1 -p 53 -u named Note the -d 2 there. 29-Sep-2010 17:31:43.715 starting BIND 9.7.2-P2 -4 -c /etc/opt/csw/named.conf -d 2 -f -g -n 1 -p 53 -u named . . . Everything seems to be fine until I saw this after 20 minutes or so : 29-Sep-2010 17:40:35.964 error (unexpected RCODE REFUSED) resolving '243.136.240.111.dun.dnsrbl.net/A/IN': 66.11.124.26#53 29-Sep-2010 17:40:35.965 client 66.225.151.243#45979: query failed (SERVFAIL) for 243.136.240.111.dun.dnsrbl.net/IN/A at query.c:4650 At this point it is just hung. So I started it up again in the exact same way and then went to a client machine and issued this query via dig : $ /opt/csw/bin/dig +qr @ns1.blastwave.org 243.136.240.111.dun.dnsrbl.net ; <<>> DiG 9.7.2-P2 <<>> +qr @ns1.blastwave.org 243.136.240.111.dun.dnsrbl.net ; (1 server found) ;; global options: +cmd ;; Sending: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 430 ;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 ;; QUESTION SECTION: ;243.136.240.111.dun.dnsrbl.net.IN A ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 430 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 ;; QUESTION SECTION: ;243.136.240.111.dun.dnsrbl.net.IN A ;; Query time: 235 msec ;; SERVER: 66.225.151.252#53(66.225.151.252) ;; WHEN: Wed Sep 29 17:47:25 2010 ;; MSG SIZE rcvd: 48 So that response looks okay but QUERY, status: SERVFAIL, id: 430 also the named process was hung again : 29-Sep-2010 17:47:24.850 createfetch: 243.136.240.111.dun.dnsrbl.net A 29-Sep-2010 17:47:24.986 error (unexpected RCODE REFUSED) resolving '243.136.240.111.dun.dnsrbl.net/A/IN': 66.11.124.26#53 29-Sep-2010 17:47:25.081 lame server resolving '243.136.240.111.dun.dnsrbl.net' (in 'dnsrbl.net'?): 66.11.124.30#53 29-Sep-2010 17:47:25.082 client 66.225.151.227#24722: query failed (SERVFAIL) for 243.136.240.111.dun.dnsrbl.net/IN/A at query.c:4650 Killed So I figured I would run this in debug level 4 : bash-3.00# /opt/csw/sbin/named -4 -c /etc/opt/csw/named.conf -d 4 -f -g -n 1 -p 53 -u named 29-Sep-2010 17:48:56.400 starting BIND 9.7.2-P2 -4 -c /etc/opt/csw/named.conf -d 4 -f -g -n 1 -p 53 -u named . . . 29-Sep-2010 17:48:56.455 loading configuration from '/etc/opt/csw/named.conf' 29-Sep-2010 17:48:56.470 reading built-in trusted keys from file '/etc/opt/csw/bind.keys' Segmentation Fault (core dumped) bash-3.00# file core core: ELF 32-bit MSB core file SPARC Version 1, from 'named' wow .. that is really not good. What failed ? bash-3.00# mdb /opt/csw/sbin/sparcv8/named core Loading modules: [ libc.so.1 ld.so.1 ] > ::status debugging core file of named (32-bit) from callistoz file: /opt/csw/sbin/sparcv8/named initial argv: /opt/csw/sbin/named -4 -c /etc/opt/csw/named.conf -d 3 -f -g -n 1 -p 53 -u name threading model: multi-threaded status: process terminated by SIGSEGV (Segmentation Fault) > $c libc.so.1`strlen+0x18(cf73c, 2000, a7cb0, fe94fc00, cf720, fe8c0d60) libisc.so.62`isc_log_doit+0x794(cf700, fee27ab0, c4f44, 3, 0, 0) libisc.so.62`isc_log_write+0x60(cf700, fee27ab0, c4f44, 3, a7cb0, a7cd8) set_limit+0x210(a7cb0, a7cd8, a7cd8, 9, , fffd) set_limits+0x64(fe94fdf8, d89e0, ff0719c0, fe94fe14, a8028, d89e0) load_configuration+0x7d4(ffbffe22, dc758, 1, 0, ea758, 5fc28) run_server+0x420(ea758, e89c8, fe935840, 0, fe7e0200, fe8c21f0) libisc.so.62`dispatch+0x7d8(d6758, 0, 0, 0, fe7e0200, 1) libisc.so.62`run+0x14(d6758, fe95, 0, 0, fedde7f0, 0) libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0) > ::stack libc.so.1`strlen+0x18(cf73c, 2000, a7cb0, fe94fc00, cf720, fe8c0d60) libisc.so.62`isc_log_doit+0x794(cf700, fee27ab0, c4f44, 3, 0, 0) libisc.so.62`isc_log_write+0x60(cf700, fee27ab0, c4f44, 3, a7cb0, a7cd8) set_limit+0x210(a7cb0, a7cd8, a7cd8, 9, , fffd) set_limits+0x64(fe94fdf8, d89e0, ff0719c0, fe94fe14, a8028, d89e0) load_configuration+0x7d4(ffbffe22, dc758, 1, 0, ea758, 5fc28) run_server+0x420(ea758, e89c8, fe935840, 0, fe7e0200, fe8c21f0) libisc.so.62`dispatch+0x7d8(d6758, 0, 0, 0, fe7e0200, 1) libisc.so.62`run+0x14(d6758, fe95, 0, 0, fedde7f0, 0) libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0) bash-3.00# pstack core core 'core' of 1647:/opt/csw/sbin/named -4 -c /etc/opt/csw/named.conf -d 3 -f -g -n 1 -p 5 - lwp# 1 / thread# 1 fe8cbc3c ___sigtimedwait (ffbffbb0, 0, 0, fefe2a00, 0, 1) + 8 fe8b4158 __posix_sigwait (ffbffbb0, ffbffb2c, 0, 0, fefe2a00, 1) + 18 fede4aac isc__app_ctxrun (fee27fb8, c5170, c4f34, fffe, a5574, a5728) + 45c fede4bd8 isc__app_run (c9be0, a56fc, 0, 6e616d65, 80808080, 1010101) + 28 000458dc main