Eric Crist wrote:
First, please 'reply-all' as I'm not on the list.
I've got a backup server that, every night, offloads things to a
secondary, USB attached hard disk. We've got two of these disks, which
we rotate so as to have a fairly recent off-site version, in the event
of a disaster. One of the two drives has started to cause the backup
server to core dump and reboot. The other works fine. I tried taking
the problematic drive and repartitioning and reformatting it, but the
problem persists.
Here is what I get from kgdb:
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC-> sudo kgdb kernel.debug
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
Unread portion of the kernel message buffer:
panic: softdep_deallocate_dependencies: dangling deps
cpuid = 0
Physical memory: 1011 MB
Dumping 201 MB: 186 170 154 138 122 106 90 74 58 42 26 10
#0 doadump () at pcpu.h:195
195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
Any insight is appreciated. uname -a is:
FreeBSD hostname 7.0-RELEASE-p3 FreeBSD 7.0-RELEASE-p3 #1: Tue Jul 15
13:53:28 CDT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386
See the developers handbook for more details on how to report panics
(you also need the backtrace, and it may help to catch the problem
earlier if you turn on debugging).
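For reference, the backtrace can be pulled out of the crash dump with
kgdb; the vmcore path below is the FreeBSD default and the kernel path
is taken from the session above, so adjust both if your setup differs:

```shell
# Load the debug kernel together with the most recent crash dump
# (/var/crash is the default dumpdir; the vmcore number will vary).
sudo kgdb /usr/obj/usr/src/sys/GENERIC/kernel.debug /var/crash/vmcore.0

# Then, at the (kgdb) prompt:
#   (kgdb) bt        # full backtrace of the panicking thread
#   (kgdb) list      # source lines around the current frame
```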
However, this kind of panic can happen if the drive is marginal, e.g.
if it loses or corrupts I/O in transit. Try compiling the
/usr/src/tools/regression/fsx tool and running that against the problem
disk for a few days, or even multiple instances on different files at
once to really stress it. It will do lots of I/O to a file and verify
that the file remains consistent throughout. It won't touch the whole
drive though, so if only parts of the disk are bad it won't catch it.
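A minimal sequence for the fsx suggestion above; the source path is the
one given in the message, while /mnt/baddisk is a hypothetical mount
point for the problem drive:

```shell
# Build fsx from the FreeBSD source tree.
cd /usr/src/tools/regression/fsx && make

# Run several instances at once against different files on the suspect
# disk to increase the I/O load (adjust the mount point to your setup).
./fsx /mnt/baddisk/t1 &
./fsx /mnt/baddisk/t2 &
./fsx /mnt/baddisk/t3 &
wait
```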
For that you could try generating a large random file on another disk,
keeping its md5 checksum, then writing copies of it to the bad disk
until it is full or nearly full, then reading each copy back and
comparing its checksum against the original. A small script could run
this in a loop.
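As a sketch of that loop (a hypothetical script; it uses the portable
POSIX cksum command, but on FreeBSD you could substitute md5 -q):

```shell
#!/bin/sh
# Fill a suspect disk with copies of a reference file, then read each
# copy back and compare its checksum against the original.
#   $1 - reference file on a known-good disk
#   $2 - directory on the suspect disk
#   $3 - number of copies to write
fill_and_verify() {
    src=$1; dst=$2; copies=$3
    sum=$(cksum < "$src")            # reading stdin prints "CRC bytes"
    i=1
    while [ "$i" -le "$copies" ]; do
        cp "$src" "$dst/copy.$i"     # write phase: fill the disk
        i=$((i + 1))
    done
    bad=0
    i=1
    while [ "$i" -le "$copies" ]; do
        # read phase: any mismatch means the drive corrupted data
        if [ "$(cksum < "$dst/copy.$i")" != "$sum" ]; then
            echo "checksum mismatch on copy.$i"
            bad=1
        fi
        i=$((i + 1))
    done
    return "$bad"
}
```

Run it in a loop overnight, e.g. after generating the reference file
with dd from /dev/random, and any output indicates corruption.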
Yet another option would be to configure the disk as a geli volume
(with data authentication enabled) or a ZFS pool, since those verify
checksums on each read and will catch data corruption anywhere on the
disk.
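For the ZFS variant, one possible sequence (da1 is a placeholder for
the suspect drive's device node, and creating the pool destroys any
data on it):

```shell
# Create a single-disk pool on the suspect drive (destructive!).
zpool create testpool da1

# Copy test data onto it, then have ZFS re-read and verify the
# checksum of every allocated block.
cp -R /some/test/data /testpool/
zpool scrub testpool

# Any checksum failures appear in the status output.
zpool status -v testpool
```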
I'd validate those things before proceeding with the existing panic.