On Sat, 15 Dec 2007, Elliot Finley wrote:

in the kernel and I'm still unable to obtain a crash dump. Hopefully there is enough info in this email for a hacker to point me in the right direction to debug this.

If you're unable to obtain a crash dump, you should still be able to use interactive console-based debugging with DDB. I find this is easiest to do with a serial console from an adjacent machine, so that I can copy and paste the results into an e-mail rather than hand-transcribe them. You can also use FireWire consoles to the same effect, although I've never done that.
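
For reference, a minimal sketch of the serial console setup, assuming the first serial port at the default 9600 bps and a null-modem cable to the adjacent machine. The device name /dev/cuad0 is an assumption about that second machine (it is the FreeBSD 6.x name for the first serial port); adjust it to whatever your system uses.

```sh
# /boot/loader.conf on the machine being debugged:
# send the kernel console (and hence DDB) to the first serial port.
console="comconsole"

# On the adjacent machine, attach to its serial port (assumed device):
#   cu -l /dev/cuad0 -s 9600
```

Logging the cu session to a file is what makes the copy-and-paste step painless.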

Once the system panics, it will drop into DDB. I usually kick off debugging by doing a backtrace, "bt", and showing the status of the current and then all processors with "show pcpu" and "show allpcpu". Depending on the type of bug, I find output from "ps", "alltrace", "show lockedvnods", "show alllocks", "show uma", and "show malloc" quite useful. The panic below is a NULL pointer dereference in the taskqueue code, but it's likely triggered by a bug in a consumer of the task queue service, rather than the task queue code itself. That means we'll need to identify which consumer that is. That information should become visible by looking at the arguments to the functions in the stack trace in DDB. If not, we may need to work a little harder to get a dump, or set up serial or FireWire kgdb to inspect the live running system with a full debugger.

On the swap / dump question: in order to capture a saved kernel dump, you need sufficient room for the full dump on whatever partition /var/crash is on, and it must be writable. Because dumps are normally written to swap partitions, running fsck before the dump is captured can lead to portions of the dump being overwritten if fsck uses a lot of memory (and hence overflows into swap). As many systems have a separate /var, and /var is often small (so checking it alone should need little memory), it could well be that you can successfully capture the dump by just booting to single-user, manually fscking /var, mounting /var, and running savecore in the /var/crash directory. You can also configure additional partitions as purely dump partitions, rather than swap partitions. One trick I've used previously is to add a disk temporarily just for the purposes of dumping to, and manually do a dumpon for a partition on that disk (but not a swapon).
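
The two approaches above might look roughly like the following. This is only a sketch: both device names are assumptions (amrd0s1b is a guess based on the root device in the dmesg, and da1s1b stands in for a partition on the temporary disk), and the DRY_RUN wrapper is just a convenience so the commands can be reviewed before anything is run.

```sh
#!/bin/sh
# Sketch only. Both device names are assumptions: substitute the
# partition the kernel actually dumped to and your own spare disk.
DUMPDEV=${DUMPDEV:-/dev/amrd0s1b}    # hypothetical swap/dump partition
SPAREDEV=${SPAREDEV:-/dev/da1s1b}    # hypothetical temporary dump disk
CRASHDIR=${CRASHDIR:-/var/crash}

# With DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# From single-user mode: check and mount only /var, then pull the dump
# off the dump device before anything else has a chance to swap.
recover_dump() {
    run fsck -p /var &&
    run mount /var &&
    run savecore "$CRASHDIR" "$DUMPDEV"
}

# On a running system: point future dumps at the spare disk.
# There is deliberately no matching swapon, so nothing pages onto it.
use_spare_dump_disk() {
    run dumpon -v "$SPAREDEV"
}
```

Setting DRY_RUN=1 and calling recover_dump prints the three commands in order without touching the disks, which is a cheap way to check the device names first.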

Robert N M Watson
Computer Laboratory
University of Cambridge


dmesg:

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
       The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RELEASE-p5 #1: Mon Nov 19 11:16:44 MST 2007
   [EMAIL PROTECTED]:/usr/obj/usr/src/sys/DDB-SMP
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2793.20-MHz 686-class CPU)
 Origin = "GenuineIntel"  Id = 0xf4a  Stepping = 10
 Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
 Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,<b14>>
 AMD Features=0x20100000<NX,LM>
 AMD Features2=0x1<LAHF>
 Logical CPUs per core: 2
real memory  = 3220963328 (3071 MB)
avail memory = 3150856192 (3004 MB)
ACPI APIC Table: <DELL   PE BKC  >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0 (BSP): APIC ID:  0
cpu1 (AP): APIC ID:  1
cpu2 (AP): APIC ID:  6
cpu3 (AP): APIC ID:  7
ioapic0: Changing APIC ID to 8
ioapic1: Changing APIC ID to 9
ioapic1: WARNING: intbase 32 != expected base 24
ioapic2: Changing APIC ID to 10
ioapic2: WARNING: intbase 64 != expected base 56
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 32-55 on motherboard
ioapic2 <Version 2.0> irqs 64-87 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: <DELL PE BKC> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
amr0: <LSILogic MegaRAID 1.53> mem 0xd80f0000-0xd80fffff,0xdfdc0000-0xdfdfffff irq 46 at device 14.0 on pci2
amr0: delete logical drives supported by controller
amr0: <LSILogic PERC 4e/Di> Firmware 522A, BIOS H430, 256MB RAM
pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci1
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 5.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 0.0 on pci5
pci6: <ACPI PCI bus> on pcib6
em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0xecc0-0xecff mem 0xdfae0000-0xdfafffff irq 64 at device 7.0 on pci6
em0: Ethernet address: 00:18:8b:34:70:50
pcib7: <ACPI PCI-PCI bridge> at device 0.2 on pci5
pci7: <ACPI PCI bus> on pcib7
em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0xdcc0-0xdcff mem 0xdf8e0000-0xdf8fffff irq 65 at device 8.0 on pci7
em1: Ethernet address: 00:18:8b:34:70:51
pcib8: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci8: <ACPI PCI bus> on pcib8
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xbce0-0xbcff irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xbcc0-0xbcdf irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xbca0-0xbcbf irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0: <Intel 82801EB/R (ICH5) USB 2.0 controller> mem 0xdff00000-0xdff003ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: <Intel 82801EB/R (ICH5) USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
uhub4: vendor 0x413c product 0xa001, class 9/0, rev 2.00/0.00, addr 2
uhub4: multiple transaction translators
uhub4: 2 ports with 2 removable, self powered
pcib9: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci9: <ACPI PCI bus> on pcib9
pci9: <unknown> at device 5.0 (no driver attached)
pci9: <unknown> at device 5.1 (no driver attached)
pci9: <unknown> at device 5.2 (no driver attached)
atapci0: <SiI 0680 UDMA133 controller> port 0xccf0-0xccf7,0xcce4-0xcce7,0xccd8-0xccdf,0xccd0-0xccd3,0xcc70-0xcc7f mem 0xdf5fec00-0xdf5fecff irq 23 at device 6.0 on pci9
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
pci9: <display, VGA> at device 13.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci1: <Intel ICH5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
ata0: <ATA channel 0> on atapci1
ata1: <ATA channel 1> on atapci1
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcb000-0xcbfff,0xcc000-0xccfff,0xec000-0xeffff on isa0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ukbd0: Dell DRAC4, rev 1.10/0.00, addr 2, iclass 3/1
kbd2 at ukbd0
ums0: Dell DRAC4, rev 1.10/0.00, addr 2, iclass 3/1
ums0: X report 0x0002 not supported
device_attach: ums0 attach returned 6
Timecounters tick every 1.000 msec
acd0: CDROM <TEAC CD-ROM CD-224E-N/3.AB> at ata0-master UDMA33
device_attach: afd0 attach returned 6
acd1: CDROM <VIRTUALCDROM DRIVE/> at ata2-slave PIO3
amr0: delete logical drives supported by controller
amrd0: <LSILogic MegaRAID logical drive> on amr0
amrd0: 559600MB (1146060800 sectors) RAID 5 (optimal)
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
Trying to mount root from ufs:/dev/amrd0s1a
fire_saver: the console does not support M_VGA_CG320
module_register_init: MOD_LOAD (fire_saver, 0xc8d50c10, 0) error 19

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
