I don't have the same core dump, but this is from one that happened yesterday:
#0 0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
752 ttl_sooner(void *v1, void *v2) {
(gdb) where
#0 0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
#1 0x0819e708 in isc_heap_delete (heap=0xb0f54068, index=1) at heap.c:218
#2 0x080e039f in free_rdataset (rbtdb=0xb0f4f008, mctx=0x864bea0,
rdataset=0x59375628) at rbtdb.c:1273
#3 0x080e04c3 in clean_stale_headers (rbtdb=0xb0f4f008, mctx=0x864bea0,
top=0x4af6f3e0) at rbtdb.c:1331
#4 0x080e10c4 in decrement_reference (rbtdb=0xb0f4f008, node=0x411b1368,
least_serial=0,
nlock=isc_rwlocktype_read, tlock=isc_rwlocktype_none,
pruning=isc_boolean_false) at rbtdb.c:1348
#5 0x080ea711 in detachnode (db=0xb0f4f008, targetp=0xb42fe2e4) at rbtdb.c:4877
#6 0x080ea9b1 in rdataset_disassociate (rdataset=0xb05c5a48) at rbtdb.c:7173
#7 0x0812e55a in dns_rdataset_disassociate (rdataset=0xb05c5a48) at
rdataset.c:101
#8 0x08132f9e in fctx_destroy (fctx=0xb05c5988) at resolver.c:3081
#9 0x0813548e in fctx_doshutdown (task=0xb0f0ea30, event=0xb05c59e0) at
resolver.c:3246
#10 0x081b9221 in run (uap=0xb7f09008) at task.c:862
#11 0x0094c73b in start_thread () from /lib/libpthread.so.0
#12 0x008a1cfe in clone () from /lib/libc.so.6
This is the output of the "thread apply all bt full" command (it's quite long):
(gdb) thread apply all bt full
Thread 11 (process 11988):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x007f9367 in sigsuspend () from /lib/libc.so.6
No symbol table info available.
#2 0x081bcc74 in isc_app_run () at app.c:534
event = (isc_event_t *) 0x0
next_event = <value optimized out>
task = (isc_task_t *) 0x0
sset = {__val = {0 <repeats 32 times>}}
strbuf =
"č\220a\b\000\020\005\000\000đ˙˙\000\000\003\000\030\021ńˇy\000\000\000\002\000\000\000ü\023ňˇ\000\000\000\000\000\000\003\000P \03...@qđˇž\000\000\000\030\000\000\000ô\037\221\000@1\221\000Pqđˇ\bR˝ż\207˝\203\000\021\000\000\000\024\000\000\000`\000\000\000Đěa\by.\000\000Pqđˇ8R˝ż\bŻ\031\bČs\001\000\b_ňˇÔ.\000\000č\220a\b\024\000\000"
#3 0x08059f7c in main (argc=0, argv=0xbfbd53c4) at ./main.c:932
result = <value optimized out>
Thread 10 (process 11989):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 9 (process 11990):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 8 (process 11991):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 7 (process 11992):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 6 (process 11993):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 5 (process 11994):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 4 (process 11996):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x00953e38 in sendmsg () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081c5962 in doio_send (sock=0xae734710, dev=0xae035788) at socket.c:1630
cc = -1282421656
iov = {{iov_base = 0xafa08cd4, iov_len = 48}, {iov_base = 0xb7f21030,
iov_len = 236}, {iov_base = 0x3c0,
iov_len = 2968577464}, {iov_base = 0xb0f0e5c0, iov_len = 2968577472},
{iov_base = 0x94ea47, iov_len = 0}, {
iov_base = 0x861a0b0, iov_len = 2927195036}, {iov_base = 0x2edc, iov_len =
140611816}, {iov_base = 0xec,
iov_len = 0}}
write_count = 48
msghdr = {msg_name = 0x0, msg_namelen = 0, msg_iov = 0xb38fcc68,
msg_iovlen = 1, msg_control = 0x0,
msg_controllen = 0, msg_flags = 0}
addrbuf = "Ě\217ł[Ź\032\bü\220a\b\020GsŽxÍ\217łňB\034\b\030GsŽ4\000\000\000ťś\037\bě\000\000\000
\214\023\bč\220a\b\bÍ\217ł\021Ř\031\bč\220a\bě\000\000"
attempts = 0
send_errno = 0
strbuf =
"ŘR\"Ž\037y\n\bŕBôˇ\000\000\000\0000\000\000\000P \032\b8Şŕ\202ž\000\000\0000\000\000\000\233\207\...@Ě\217ł\020gsޏĚ\217łţv\033\b@Ě\217ł0\000\000\000Ŕ\000\000\000÷\n\b\000\000\000\000¨$YŽGę\224\000\000\000\000\000°\224a\bXUCüđ\000\000\000P \032\b\210W\003Žž\000\000\000đ\000\000\000\b\020ňˇ0\020ňˇ°\000\000"
#3 0x081c5f41 in socket_send (sock=0xae734710, dev=0xae035788,
task=0xb0f0e5b8, address=0xae79739c, pktinfo=0x0,
flags=0) at socket.c:4291
io_state = <value optimized out>
have_lock = isc_boolean_false
ntask = (isc_task_t *) 0x0
result = <value optimized out>
#4 0x08136ae5 in resquery_send (query=0xafa08c60) at resolver.c:1921
fctx = (fetchctx_t *) 0xad698268
result = <value optimized out>
qname = (dns_name_t *) 0x0
qrdataset = (dns_rdataset_t *) 0x0
r = {base = 0xafa08cd4 "\2076", length = 48}
res = (dns_resolver_t *) 0xb0f0d008
task = (isc_task_t *) 0xb0f0e5b8
socket = (isc_socket_t *) 0xae734710
tcpbuffer = {magic = 0, base = 0x274, length = 2528, used = 135051914,
current = 2909287056,
active = 2909373308, link = {prev = 0x94ea47, next = 0x0}, mctx = 0x88700f8}
address = (isc_sockaddr_t *) 0xae79739c
buffer = (isc_buffer_t *) 0xafa08c98
ipaddr = {family = 2, type = {in = {s_addr = 4232271192}, in6 = {in6_u
= {
u6_addr8 = "XUCü?\000\000\000\b°đˇxĐ\217ł", u6_addr16 = {21848, 64579,
63, 0, 45064, 47088, 53368, 45967},
u6_addr32 = {4232271192, 63, 3086004232, 3012546680}}},
un =
"XUCü?\000\000\000\b°đˇxĐ\217ł)ĺ\031\b\bĆrŽh\203v\000\000\000\000˙˙˙˙\220Đ\217ł\bĆrŽř\001\000\000źĐ\217ł\000\000\000\000\bĆrŽŘĐ\217ł8Ń\217ł(Ń\217ł\000\000\000\0008Ń\217łĽÖ\034\b(Ń\217ł\000\000\000\000\001\000\000\000\004Ń\217ł8ĆrŽ\001\000\000\001"},
zone = 0}
tsigkey = (dns_tsigkey_t *) 0x0
peer = (dns_peer_t *) 0x0
useedns = 2909373032
cctx = {magic = 0, allowed = 0, edns = -1, table = {0x0 <repeats 64
times>}, initialnodes = {{r = {
base = 0xafa03320 "\003www\vcorrupttube\003com", length = 21}, offset =
12, count = 0, labels = 4 '\004',
next = 0x0}, {r = {base = 0xafa03324 "\vcorrupttube\003com", length =
17}, offset = 16, count = 1,
labels = 3 '\003', next = 0x0}, {r = {base = 0xafa03330 "\003com", length
= 5}, offset = 28, count = 2,
labels = 2 '\002', next = 0x0}, {r = {base = 0x0, length = 0}, offset =
0, count = 0, labels = 0 '\0',
next = 0xb38fd044}, {r = {base = 0xad616b08 "NbdanSNDř\220a\023", length
= 23}, offset = 53688,
count = 45967, labels = 244 'ô', next = 0xb38fd168}, {r = {base = 0xad616b0c
"nSNDř\220a\023", length = 28},
offset = 65433, count = 19265, labels = 1 '\001', next = 0x1}, {r = {base
= 0x0, length = 0}, offset = 53144,
count = 45967, labels = 104 'h', next = 0x0}, {r = {base = 0x4b41ff99
"ŰAK\200ŻFy(ç\020\035ćź\n",
length = 0}, offset = 27404, count = 44385, labels = 88 'X', next =
0x0}, {r = {
base = 0xa30f328 "\220\004", length = 9515960}, offset = 21358, count =
17486, labels = 104 'h',
next = 0x1}, {r = {base = 0x1 <Address 0x1 out of bounds>, length = 1},
offset = 53188, count = 45967,
labels = 68 'D', next = 0xffffffff}, {r = {base = 0xffffffff <Address
0xffffffff out of bounds>, length = 0},
offset = 0, count = 0, labels = 0 '\0', next = 0x0}, {r = {base = 0x0,
length = 4104}, offset = 53416,
count = 45967, labels = 8 '\b', next = 0xb7f0c008}, {r = {base = 0xb38fd0a8
"", length = 3012546744},
offset = 8, count = 0, labels = 8 '\b', next = 0xb38fd0b8}, {r = {
base = 0x9538a4 "X=\001đ˙˙s\001Ăč\204˘˙˙\201ÁB\207", length = 8},
offset = 36582, count = 131,
labels = 7 '\a', next = 0xb38fd0a4}, {r = {base = 0x8 <Address 0x8 out of
bounds>, length = 135914793},
offset = 50680, count = 2783, labels = 236 'ě', next = 0x3c0}, {r = {base =
0xb38fd0d8 "ŕ\t",
length = 2910225256}, offset = 50696, count = 44658, labels = 72 'H',
next = 0x81bb002}}, count = 3,
mctx = 0x86190e8}
secure_domain = isc_boolean_false
#5 0x08137579 in fctx_query (fctx=0xad698268, addrinfo=0xae797398, options=0)
at resolver.c:1505
res = (dns_resolver_t *) 0xb0f0d008
task = (isc_task_t *) 0xb0f0e5b8
result = 2927167124
query = (resquery_t *) 0x30
addr = {type = {sa = {sa_family = 61584, sa_data =
"5\000i\003\000\000\220\006yŽ\bka"}, sin = {
sin_family = 61584, sin_port = 53, sin_addr = {s_addr = 873}, sin_zero =
"\220\006yŽ\bka"}, sin6 = {
sin6_family = 61584, sin6_port = 53, sin6_flowinfo = 873, sin6_addr =
{in6_u = {
u6_addr8 = "\220\006yŽ\bkaHŇ\217ł\000\000\000", u6_addr16 = {1680,
44665, 27400, 44385, 53832, 45967, 0,
0}, u6_addr32 = {2927167120, 2908842760, 3012547144, 0}}},
sin6_scope_id = 143405224}, sunix = {
sun_family = 61584,
sun_path =
"5\000i\003\000\000\220\006yŽ\bkaHŇ\217ł\000\000\000\000¨0\214\b\231˙AKÂ\005\000\000-o\020\büŇ\217ł\000\000\000\0008Ó\217ł!É\177\000\200
\221\000Ŕ
\221\000ô\037\221\000|\204i\000\000\000\000\fŇ\217łÖĹ\177\000\b#\221\000\004Ň\217łë?\017\017ô\037\221\000\030Ň\217ł\024Ě\177\000\n\r"}},
length = 3012547128, link = {prev = 0x81af974,
next = 0xb38fd2fc}}
have_addr = isc_boolean_false
srtt = <value optimized out>
#6 0x08137784 in fctx_try (fctx=0xad698268, retrying=isc_boolean_false) at
resolver.c:3013
result = <value optimized out>
addrinfo = (dns_adbaddrinfo_t *) 0x0
#7 0x081b9221 in run (uap=0xb7f09008) at task.c:862
No locals.
#8 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#9 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 3 (process 11997):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x00950d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib/libpthread.so.0
No symbol table info available.
#2 0x081cda66 in isc_condition_waituntil (c=0xb7f0a040, m=0xb7f0a010,
t=0xb7f0a038) at condition.c:59
presult = <value optimized out>
result = 0
ts = {tv_sec = 1262616473, tv_nsec = 971230000}
strbuf =
"4\000\000\000ŕ\000\000\000ůë\031\bôż\225\000\030Ăď˛Gę\224\000\000\000\000\0000\225a\b\000\000\000\000Ý.\000\000č\220a\b(J.Ž\210äăŽ8Ăď˛\002°\033\bŘäăŽxJ.ŽHĂď˛\002°\033\bÉ\001\000\000\b°đˇxĂď˛Lć\031\b\210äăŽ(J.ŽxĂď˛\bç\031\b\210äăŽHť\005Žťś\037\b\210äăŽ\222\003\000"
#3 0x081bb583 in run (uap=0xb7f0a008) at timer.c:722
now = {seconds = 1262616473, nanoseconds = 955861000}
result = <value optimized out>
#4 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 2 (process 11998):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x008a2376 in epoll_wait () from /lib/libc.so.6
No symbol table info available.
#2 0x081ca494 in watcher (uap=0xb7f0c008) at socket.c:3468
manager = <value optimized out>
cc = 1
strbuf = '\0' <repeats 127 times>
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 1 (process 11995):
#0 0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
No locals.
#1 0x0819e708 in isc_heap_delete (heap=0xb0f54068, index=1) at heap.c:218
elt = (void *) 0x0
less = <value optimized out>
#2 0x080e039f in free_rdataset (rbtdb=0xb0f4f008, mctx=0x864bea0,
rdataset=0x59375628) at rbtdb.c:1273
size = <value optimized out>
idx = 3
#3 0x080e04c3 in clean_stale_headers (rbtdb=0xb0f4f008, mctx=0x864bea0,
top=0x4af6f3e0) at rbtdb.c:1331
d = (rdatasetheader_t *) 0x4b726d
down_next = (rdatasetheader_t *) 0x0
#4 0x080e10c4 in decrement_reference (rbtdb=0xb0f4f008, node=0x411b1368,
least_serial=0,
nlock=isc_rwlocktype_read, tlock=isc_rwlocktype_none,
pruning=isc_boolean_false) at rbtdb.c:1348
result = <value optimized out>
write_locked = <value optimized out>
nodelock = (rbtdb_nodelock_t *) 0x88099a4
no_reference = <value optimized out>
#5 0x080ea711 in detachnode (db=0xb0f4f008, targetp=0xb42fe2e4) at rbtdb.c:4877
rbtdb = (dns_rbtdb_t *) 0x0
node = (dns_rbtnode_t *) 0x411b1368
inactive = <value optimized out>
nodelock = (rbtdb_nodelock_t *) 0x88099a4
#6 0x080ea9b1 in rdataset_disassociate (rdataset=0xb05c5a48) at rbtdb.c:7173
db = (dns_db_t *) 0xb0f4f008
node = (dns_dbnode_t *) 0x411b1368
#7 0x0812e55a in dns_rdataset_disassociate (rdataset=0xb05c5a48) at
rdataset.c:101
No locals.
#8 0x08132f9e in fctx_destroy (fctx=0xb05c5988) at resolver.c:3081
res = (dns_resolver_t *) 0xb0f0d008
bucketnum = <value optimized out>
sa = (isc_sockaddr_t *) 0x0
next_sa = (isc_sockaddr_t *) 0x0
#9 0x0813548e in fctx_doshutdown (task=0xb0f0ea30, event=0xb05c59e0) at
resolver.c:3246
fctx = (fetchctx_t *) 0xb05c5988
bucket_empty = <value optimized out>
res = (dns_resolver_t *) 0xb0f0d008
bucketnum = 28
validator = <value optimized out>
#10 0x081b9221 in run (uap=0xb7f09008) at task.c:862
No locals.
#11 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#12 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
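In Thread 1 above, frame #1 shows elt = (void *) 0x0 inside isc_heap_delete and frame #0 shows ttl_sooner being called with v1=0x0. The following is only a minimal sketch, assuming a toy 1-based array heap; it is not BIND's heap.c, and every name in it (toy_heap, ttl_sooner_sketch, toy_heap_delete, DEL_NULL_TAIL) is hypothetical. It illustrates one way a delete routine that moves the tail element into the freed slot could hand a NULL pointer to the comparison callback if the tail slot had already been emptied (for example by another thread), and how a guard would turn that into an error code instead of a crash:

```c
#include <stddef.h>

/* Hypothetical element type; stands in for whatever the real heap stores. */
struct item { int ttl; unsigned int heap_index; };

#define HEAP_CAP 8

/* Toy 1-based array heap: slots 1..last are in use, unused slots are NULL. */
struct toy_heap {
    struct item *array[HEAP_CAP + 1];
    unsigned int last;
};

enum del_result { DEL_OK, DEL_BAD_INDEX, DEL_NULL_TAIL };

/* Comparison callback that dereferences both arguments, so a NULL
 * argument would fault here, as in frame #0 of the trace. */
static int ttl_sooner_sketch(const struct item *a, const struct item *b) {
    return a->ttl < b->ttl;
}

/* Store e in slot i and record the slot in the element. */
static void place(struct toy_heap *h, unsigned int i, struct item *e) {
    h->array[i] = e;
    e->heap_index = i;
}

/* Move e towards the root while it sorts before its parent. */
static void float_up(struct toy_heap *h, unsigned int i, struct item *e) {
    while (i > 1 && ttl_sooner_sketch(e, h->array[i / 2])) {
        place(h, i, h->array[i / 2]);
        i /= 2;
    }
    place(h, i, e);
}

/* Move e towards the leaves while a child sorts before it. */
static void sink_down(struct toy_heap *h, unsigned int i, struct item *e) {
    while (2 * i <= h->last) {
        unsigned int child = 2 * i;
        if (child < h->last &&
            ttl_sooner_sketch(h->array[child + 1], h->array[child]))
            child++;
        if (!ttl_sooner_sketch(h->array[child], e))
            break;
        place(h, i, h->array[child]);
        i = child;
    }
    place(h, i, e);
}

static int toy_heap_insert(struct toy_heap *h, struct item *e) {
    if (h->last >= HEAP_CAP)
        return -1;
    h->last++;
    float_up(h, h->last, e);
    return 0;
}

/* Delete slot `index` by moving the tail element into it. The NULL check
 * on the tail is the guard; without it the comparison below would be
 * called with a NULL first argument. */
static enum del_result toy_heap_delete(struct toy_heap *h, unsigned int index) {
    if (index < 1 || index > h->last)
        return DEL_BAD_INDEX;
    if (index == h->last) {
        h->array[h->last--] = NULL;
        return DEL_OK;
    }
    struct item *elt = h->array[h->last];
    if (elt == NULL)
        return DEL_NULL_TAIL; /* heap state is inconsistent */
    h->array[h->last--] = NULL;
    if (ttl_sooner_sketch(elt, h->array[index]))
        float_up(h, index, elt);
    else
        sink_down(h, index, elt);
    return DEL_OK;
}
```

Whether this is what actually happens in named is not known from the trace alone; the sketch only shows that a NULL tail slot is enough to reproduce the v1=0x0 pattern in a heap of this shape.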
JINMEI Tatuya / 神明達哉 wrote:
At Wed, 30 Dec 2009 10:23:17 +0100,
Dario Miculinic <dario.miculi...@t-com.hr> wrote:
I administer 4 DNS servers running CentOS release 5.4 and Red Hat Enterprise Linux Server release 5.2 with BIND
version 9.6.1-P1. On 3 of them BIND crashed 7 times in the last 10 days. There's nothing in the log files, but we have a core dump
file. I found this in the core dump:
#0 0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
752 ttl_sooner(void *v1, void *v2) {
(gdb) where
#0 0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
What's the result of the following gdb command?
(gdb) thread apply all bt full
We've seen crashes like this one, but we haven't figured out how they
happen. This is very likely an inter-thread race, and it may be
tricky. Given the v1/v2 values in your stack trace, a full
backtrace with information on the other threads may provide a more useful
hint.
If you need an immediate workaround rather than chasing the bug,
rebuilding named with --disable-atomic may help (we cannot be sure
because we don't yet know how this bug happens in the first place).
This will use locks in a more conservative way and may avoid the
tricky race condition at the cost of lower performance (so if you want
to try that you'll also need to watch the server load).
---
JINMEI, Tatuya
Internet Systems Consortium, Inc.
_______________________________________________
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users