Re: Kernel Panic help.

2008-08-22 Thread Kris Kennaway

Eric Crist wrote:

Hey folks,

First, please 'reply-all' as I'm not on the list.

I've got a backup server that, every night, offloads things to a 
secondary, USB attached hard disk.  We've got two of these disks, which 
we rotate so as to have a fairly recent off-site version, in the event 
of a disaster.  One of the two drives has started to cause the backup 
server to core dump and reboot.  The other works fine.  I tried taking 
the problematic drive and repartitioning and reformatting it, but the 
problems persist.


Here is what I get from a kgdb:

[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC-> sudo kgdb kernel.debug /var/crash/vmcore.17
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.

Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
panic: softdep_deallocate_dependencies: dangling deps
cpuid = 0
Uptime: 11d20h37m38s
Physical memory: 1011 MB
Dumping 201 MB: 186 170 154 138 122 106 90 74 58 42 26 10

#0  doadump () at pcpu.h:195
195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));


Any insight is appreciated.  uname -a is:

FreeBSD hostname 7.0-RELEASE-p3 FreeBSD 7.0-RELEASE-p3 #1: Tue Jul 15 
13:53:28 CDT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC  i386


See the Developers' Handbook for more details on how to report panics 
(you also need the backtrace, and it may help to catch the problem 
earlier if you turn on debugging).


However, this kind of panic can happen if the drive is marginal, e.g. 
if it loses or corrupts I/O in transit.  Try compiling the fsx tool in 
/usr/src/tools/regression/fsx and running that against the problem 
disk for a few days, or even multiple instances on different files at 
once to really stress it.  It will do lots of I/O to a file and verify 
that the file remains consistent throughout.  It won't touch the whole 
drive though, so if only parts of the disk are bad it may not catch them.
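
To make the idea concrete, here is a much-simplified Python sketch of the 
kind of check described above.  It is not fsx and does not reproduce its 
options or its operation mix; the function name stress_file and all of its 
parameters are made up for illustration.  It writes random data at random 
offsets, keeps an in-memory copy of what the file should contain, and 
counts reads that disagree with that copy.

```python
import os
import random


def stress_file(path, size=1 << 20, ops=1000, max_len=65536, seed=0):
    """Randomly write to and read from `path`, checking against an in-memory copy.

    Returns the number of reads that did not match the expected contents
    (0 means the file stayed consistent for the whole run).
    """
    rng = random.Random(seed)
    shadow = bytearray(size)              # what the file is expected to contain
    mismatches = 0
    with open(path, "w+b") as f:
        f.write(shadow)                   # start from a zero-filled file
        for _ in range(ops):
            offset = rng.randrange(size)
            length = rng.randint(1, min(max_len, size - offset))
            f.seek(offset)
            if rng.random() < 0.5:
                # Write random bytes and remember them in the shadow copy.
                data = rng.getrandbits(8 * length).to_bytes(length, "little")
                f.write(data)
                f.flush()
                os.fsync(f.fileno())      # ask the OS to push the write to the device
                shadow[offset:offset + length] = data
            else:
                # Read a region back and compare it with the shadow copy.
                if f.read(length) != shadow[offset:offset + length]:
                    mismatches += 1
    return mismatches
```

Pointing path at a file on the suspect disk and using a size well above 
the machine's RAM makes it more likely that reads come from the device 
rather than from the buffer cache; with small files most reads will be 
served from memory and prove little.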


For that you could try generating a large random file on another disk, 
keeping its md5 checksum, then writing lots of copies of it to the bad 
disk to fill or almost fill it, then reading each copy back and comparing 
its md5 checksum with the original.  A small script could run this in a loop.
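
A minimal sketch of such a script, in Python, might look like the 
following.  The function names, the copy.NNNN file naming and the chunk 
size are assumptions made for illustration, and the number of copies 
has to be chosen by hand to suit the size of the disk.

```python
import hashlib
import os
import shutil


def md5sum(path, chunk=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def make_reference(path, size, chunk=1 << 20):
    """Write `size` random bytes to `path` (on a known-good disk); return its MD5."""
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n
    return md5sum(path)


def fill_and_verify(reference, target_dir, copies):
    """Copy `reference` into `target_dir` `copies` times.

    Returns the list of copies whose MD5 differs from the reference
    (an empty list means every copy read back intact).
    """
    expected = md5sum(reference)
    written = []
    for i in range(copies):
        dest = os.path.join(target_dir, "copy.%04d" % i)
        shutil.copyfile(reference, dest)
        written.append(dest)
    # Verify only after all copies are written, so early copies are less
    # likely to still be sitting in the buffer cache when they are read.
    return [p for p in written if md5sum(p) != expected]
```

For example, make_reference("/tmp/reference.bin", 1 << 30) followed by 
fill_and_verify("/tmp/reference.bin", "/mnt/usb", 200) would write about 
200 GB; both paths and the count are placeholders.  Unmounting and 
remounting the disk between writing and verifying, or writing far more 
data than the machine has RAM, helps ensure the checksums are computed 
from what is actually on the disk.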


Yet another option would be to configure the disk as a geli volume (with 
data authentication enabled) or a zfs volume, since that will validate 
checksums with each read and will catch data corruption anywhere on the disk.


I'd validate those things before proceeding with the existing panic.

Kris
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Kernel Panic help.

2008-08-22 Thread Eric Crist

Hey folks,

First, please 'reply-all' as I'm not on the list.

I've got a backup server that, every night, offloads things to a  
secondary, USB attached hard disk.  We've got two of these disks,  
which we rotate so as to have a fairly recent off-site version, in the  
event of a disaster.  One of the two drives has started to cause the  
backup server to core dump and reboot.  The other works fine.  I tried  
taking the problematic drive and repartitioning and reformatting it,  
but the problems persist.


Here is what I get from a kgdb:

[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC-> sudo kgdb kernel.debug /var/crash/vmcore.17
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.

Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.

This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
panic: softdep_deallocate_dependencies: dangling deps
cpuid = 0
Uptime: 11d20h37m38s
Physical memory: 1011 MB
Dumping 201 MB: 186 170 154 138 122 106 90 74 58 42 26 10

#0  doadump () at pcpu.h:195
195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));


Any insight is appreciated.  uname -a is:

FreeBSD hostname 7.0-RELEASE-p3 FreeBSD 7.0-RELEASE-p3 #1: Tue Jul 15  
13:53:28 CDT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC  i386


