Re: Assertion in zdb?
Vitalij Satanivskij sa...@ukr.net writes:

> Hello.
>
> System: 10.0-CURRENT FreeBSD 10.0-CURRENT #2 r255173
>
> While trying to get some statistics from zdb
>
> zdb -dd disk1 stat.log
>
> I get an assertion:
>
> Assertion failed: object_count == usedobjs (0x85727 == 0x3aa93d), file /usr/src/cddl/usr.sbin/zdb/../../../cddl/contrib/opensolaris/cmd/zdb/zdb.c, line 1767.
> zsh: abort (core dumped) zdb -dd disk1 stat.log
>
> Does anybody have an idea what this could be, and how big a problem it is (or whether it is a problem at all)?

Probably not a problem unless it happens reliably when you try it multiple times. Since zdb looks at the raw disks, if the filesystem/zpool is active, zdb can easily read bits of the zpool metadata off the disks at different times and thus see an inconsistent state. Hence trying to get stats out of zdb always carries a certain risk of not working.

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
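The "try it multiple times" advice in the zdb reply can be sketched as a small script. This is only a rough illustration in Python, not anything zdb itself provides: the helper names `run_repeatedly`, `fails_reliably` and `zdb_once` are made up for this example, and the command line is simply the one quoted in the message (without any output redirection).

```python
import subprocess
from typing import Callable, List


def run_repeatedly(run_once: Callable[[], int], attempts: int = 5) -> List[int]:
    """Call run_once() `attempts` times and collect the exit statuses."""
    return [run_once() for _ in range(attempts)]


def fails_reliably(statuses: List[int]) -> bool:
    """True only if there was at least one attempt and every attempt failed."""
    return len(statuses) > 0 and all(status != 0 for status in statuses)


def zdb_once() -> int:
    """Run the zdb command from the message once, discarding its output.

    Assumption: a non-zero return code (including death by SIGABRT after a
    failed assertion) counts as a failed attempt.
    """
    result = subprocess.run(
        ["zdb", "-dd", "disk1"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode
```

On the affected machine one would call `fails_reliably(run_repeatedly(zdb_once))`. A result of `False` suggests the assertion came from reading a pool that was changing underneath zdb; `True` suggests it is worth a closer look.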
Re: Problem with firewire disks with recent -CURRENT.
rmt...@servalan.servalan.com writes:

> Tried upgrading one of my machines to -CURRENT yesterday and got the following panic when the sbp code did its probing of all the firewire devices:
>
> panic: mutex sbp not owned at /usr/src/sys/cam/cam_xpt.c:4549
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff81fe6837f0
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xff81fe6838a0
> vpanic() at vpanic+0x126/frame 0xff81fe6838e0
> panic() at panic+0x43/frame 0xff81fe683940
> __mtx_assert() at __mtx_assert+0xc2/frame 0xff81fe683950
> xpt_compile_path() at xpt_compile_path+0xa1/frame 0xff81fe6839a0
> xpt_create_path() at xpt_create_path+0x5b/frame 0xff81fe6839f0
> sbp_do_attach() at sbp_do_attach+0xe8/frame 0xff81fe683a30

I did some further poking around in the source code trying to figure out what went on here. Looks to me like in the current version of xpt_find_target() (called by xpt_compile_path() and hence, indirectly, by xpt_create_path()) the code expects the SIM's mutex to be owned, but apparently the call from sbp_do_attach() happens without the SIM mutex being locked. I tried hacking together the following patch and the resulting kernel comes up and lets the system properly detect the drives and do I/O to them. I don't know enough about the CAM system and its locking to know if this patch is the Right Thing to do here, though.

diff -r 96ce948dd944 sys/dev/firewire/sbp.c
--- a/sys/dev/firewire/sbp.c	Sat May 04 17:23:33 2013 -0500
+++ b/sys/dev/firewire/sbp.c	Tue May 07 19:17:28 2013 -0500
@@ -1085,10 +1085,13 @@
 END_DEBUG
 	sbp_xfer_free(xfer);
 
-	if (sdev->path == NULL)
+	if (sdev->path == NULL) {
+		CAM_SIM_LOCK(target->sbp->sim);
 		xpt_create_path(&sdev->path, NULL,
 			cam_sim_path(target->sbp->sim),
 			target->target_id, sdev->lun_id);
+		CAM_SIM_UNLOCK(target->sbp->sim);
+	}
 
 	/*
 	 * Let CAM scan the bus if we are in the boot process.
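For readers who don't live in the kernel, the shape of the problem (a callee asserting that a lock is held, and a caller that forgot to take it) can be shown with a toy user-space model. This is a sketch in Python, not CAM code: `OwnedMutex`, `create_path`, `attach_unlocked` and `attach_locked` are hypothetical names invented for illustration, and the "fix" merely mirrors the lock/unlock bracket from the patch above without claiming it is the right fix for sbp.

```python
import threading
from typing import Optional, Tuple


class OwnedMutex:
    """A lock that remembers which thread holds it (toy stand-in for a SIM mutex)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._lock = threading.Lock()
        self._owner: Optional[int] = None  # ident of the holding thread, if any

    def acquire(self) -> None:
        self._lock.acquire()
        self._owner = threading.get_ident()

    def release(self) -> None:
        self.assert_owned()
        self._owner = None
        self._lock.release()

    def assert_owned(self) -> None:
        """Fail loudly if the calling thread does not hold the lock."""
        if self._owner != threading.get_ident():
            raise AssertionError(f"mutex {self.name} not owned")


def create_path(sim_lock: OwnedMutex, target_id: int, lun_id: int) -> Tuple[int, int]:
    """Hypothetical callee that expects its caller to already hold sim_lock."""
    sim_lock.assert_owned()
    return (target_id, lun_id)


def attach_unlocked(sim_lock: OwnedMutex, target_id: int, lun_id: int) -> Tuple[int, int]:
    """Caller that forgets the lock: trips the assertion in create_path()."""
    return create_path(sim_lock, target_id, lun_id)


def attach_locked(sim_lock: OwnedMutex, target_id: int, lun_id: int) -> Tuple[int, int]:
    """Caller that brackets the call with acquire/release."""
    sim_lock.acquire()
    try:
        return create_path(sim_lock, target_id, lun_id)
    finally:
        sim_lock.release()
```

With `lock = OwnedMutex("sbp")`, calling `attach_unlocked(lock, 0, 0)` raises `AssertionError: mutex sbp not owned`, while `attach_locked(lock, 0, 0)` returns normally.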
Re: Problem with firewire disks with recent -CURRENT.
> What happens if you re-add the xpt_periph variable to sbp_do_attach() ?
>
> ref: http://svnweb.freebsd.org/base/head/sys/dev/firewire/sbp.c?r1=249468&r2=249467&pathrev=249468&diff_format=f
>
> see line 1089
>
> Sean

Tried that. No change, still get the same

panic: mutex sbp not owned at /usr/src/sys/cam/cam_xpt.c:4549

Richard
Re: Firewire disk/tape access stopped working after recent CAM commit
On Mon, Jan 23, 2012 at 11:16:05AM -0700, Kenneth D. Merry wrote:

> If you can, please try the attached patch and see if it has any impact on the problem. There is a bug in that commit in that we shouldn't be invalidating all LUNs on a target when we get a status of CAM_DEV_NOT_THERE.

Just applied the patch, built a new kernel, and rebooted, and all the FW drives are showing up now. Thanks!

> It may be that we need to do a more thorough audit of how various SIM drivers are using the CAM_DEV_NOT_THERE status.

So I take it the layers for the different hardware (SCSI, FW, USB, ATA/AHCI) are handling this status differently, so that's why this bug only showed up on the Firewire buses but not on ATA/AHCI, USB, or (on my other machine) SCSI buses?

Richard
Firewire disk/tape access stopped working after recent CAM commit
Hi. I tried upgrading my amd64 10-CURRENT box to the most recent -CURRENT code and found that the new kernel couldn't find my two disks and tape drive that are on a Firewire bus. All the USB and AHCI-attached hardware still showed up okay, it's just the Firewire stuff that failed to show up properly on boot.

Spent today doing binary search to find the responsible commit and it looks to be this one:

r23 | ken | 2012-01-11 18:41:48 -0600 (Wed, 11 Jan 2012) | 72 lines

Fix a race condition in CAM peripheral free handling, locking in the CAM XPT bus traversal code, and a number of other periph level issues.

Not sure what in this commit triggers the problem, or why it just hits Firewire and not the rest of the system. I've built kernels both right before and right after the r23 commit, with CAM debugging turned on real high on the firewire bus in question, bus 0 (hardwired to that number in device.hints, if that matters)

options CAMDEBUG
options CAM_DEBUG_BUS=0
options CAM_DEBUG_TARGET=-1
options CAM_DEBUG_LUN=-1
options CAM_DEBUG_FLAGS=CAM_DEBUG_INFO|CAM_DEBUG_TRACE|CAM_DEBUG_CDB

and got dmesgs of both the bad (r23) and good (pre-r23) kernels, which I've put online at http://ln.servalan.com/rmtodd/bug1/dmesg.bad and http://ln.servalan.com/rmtodd/bug1/dmesg.good, respectively. They're a bit lengthy, what with all that debug info.

Grepping out the info for one of the targets (disk 0, sbp0:0:0:0) and just looking at the lines for that one, we see that the good kernel does a lot more with that target, starting with the (noperiph:sbp0:0:0:0): xpt_compile_path bit, that the bad kernel doesn't do, as seen in the diff below. Not sure what's going on here, but if anyone has suggestions on more things I can test/debug code I can add to track this down further, let me know.
--- /tmp/dbad 2012-01-22 19:08:03.0 -0600
+++ /tmp/dgood 2012-01-22 19:08:10.0 -0600
@@ -128,3 +128,1097 @@
 (xpt0:sbp0:0:0:0): xpt_action_default
 (xpt0:sbp0:0:0:0): xpt_free_path
 (xpt0:sbp0:0:0:0): xpt_release_path
+(noperiph:sbp0:0:0:0): xpt_compile_path
+(noperiph:sbp0:0:0:0): xpt_setup_ccb
+(noperiph:sbp0:0:0:0): xpt_action
+(noperiph:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): xpt_compile_path
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): xpt_schedule
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): READ CAPACITY(10). CDB: 25 0 0 0 0 0 0 0 0 0
+(noperiph:sbp0:0:0:0): xpt_release_path
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): camisr
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): daopen: disk=da0 (unit 0)
+(da0:sbp0:0:0:0): entering cdgetccb
+(da0:sbp0:0:0:0): xpt_schedule
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): READ CAPACITY(10). CDB: 25 0 0 0 0 0 0 0 0 0
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): camisr
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): xpt_schedule
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): READ(10). CDB: 28 0 22 ee c1 2f 0 0 1 0
+(noperiph:sbp0:0:0:0): xpt_compile_path
+(noperiph:sbp0:0:0:0): xpt_setup_ccb
+(noperiph:sbp0:0:0:0): xpt_action
+(noperiph:sbp0:0:0:0): xpt_action_default
+(noperiph:sbp0:0:0:0): xpt_release_path
+(noperiph:sbp0:0:0:0): xpt_compile_path
+(noperiph:sbp0:0:0:0): xpt_setup_ccb
+(noperiph:sbp0:0:0:0): xpt_action
+(noperiph:sbp0:0:0:0): xpt_action_default
+(pass0:sbp0:0:0:0): xpt_compile_path
+(pass0:sbp0:0:0:0): xpt_setup_ccb
+(pass0:sbp0:0:0:0): xpt_action
+(pass0:sbp0:0:0:0): xpt_action_default
+(pass0:sbp0:0:0:0): xpt_done
+(pass0:sbp0:0:0:0): xpt_setup_ccb
+(pass0:sbp0:0:0:0): xpt_action
+(pass0:sbp0:0:0:0): xpt_action_default
+(noperiph:sbp0:0:0:0): xpt_release_path
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): camisr
+(da0:sbp0:0:0:0): entering cdgetccb
+(da0:sbp0:0:0:0): xpt_schedule
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
+(da0:sbp0:0:0:0): xpt_action_default
+(da0:sbp0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0
+(da0:sbp0:0:0:0): xpt_done
+(da0:sbp0:0:0:0): camisr
+(da0:sbp0:0:0:0): daopen: disk=da0 (unit 0)
+(da0:sbp0:0:0:0): entering cdgetccb
+(da0:sbp0:0:0:0): xpt_schedule
+(da0:sbp0:0:0:0): xpt_setup_ccb
+(da0:sbp0:0:0:0): xpt_action
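The "grep out one target, then diff bad against good" step described in the message can be reproduced with a few lines of script. The following is a minimal sketch in Python using the standard difflib module; the function names are made up for this example, and the plain substring match on the path string (e.g. `sbp0:0:0:0`) is an assumption about what the original grep did.

```python
import difflib
from typing import Iterable, List


def lines_for_target(dmesg_lines: Iterable[str], target: str) -> List[str]:
    """Keep only the debug lines that mention the given path, e.g. 'sbp0:0:0:0'."""
    return [line.rstrip("\n") for line in dmesg_lines if target in line]


def diff_targets(bad: Iterable[str], good: Iterable[str], target: str) -> List[str]:
    """Unified diff of the per-target lines from the bad and the good boot log."""
    bad_lines = lines_for_target(bad, target)
    good_lines = lines_for_target(good, target)
    return list(
        difflib.unified_diff(bad_lines, good_lines, "dbad", "dgood", lineterm="")
    )
```

Given the two dmesg files opened as `bad` and `good`, `print("\n".join(diff_targets(bad, good, "sbp0:0:0:0")))` prints output in the same shape as the diff above, with the lines that only the good kernel logged marked by `+`.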
Re: new interrupts not working for me
John Baldwin wrote:
> On 06-Nov-2003 Peter Schultz wrote:
>> John Baldwin wrote:
>>> On 05-Nov-2003 Peter Schultz wrote:
>>>> I have a Tyan S1832DL w/dual pii 350s and it's not able to boot. Seems to be having trouble with my adaptec scsi controller, I get a whole bunch of output like this hand transcribed bit, it comes after waiting 15 seconds for scsi devices to settle:
>>>>
>>>> ahc0 timeout SCB already complete interrupts may not be functioning
>>>> Infinite interrupt loop INTSTAT=0
>>>> (probe3:ahc0:0:3:0): SCB 0x6 - timed out
>>>>
>>>> Anyone else seeing this? There are probably 100+ related lines of output, I'll have to configure serial debugging if you need to see it.
>>>
>>> The dmesg output excluding all the ahc0 errors would help figure out why your interrupts aren't working. However, I just committed a patch that might fix your problem.
>>
>> Now the kernel just dies and the machine reboots right in the beginning when it's setting up the ACPI/APIC stuff. Of course, with ACPI off, there's no apparent problem with the kernel.
>
> Ok. Did the old kernel break before with ACPI turned off? It should have. By the way, I've committed a fix for the ACPI breakage.

I've got a similar motherboard to the original poster (a Tyan S1836DLUAN/GX instead of S1832DL), and ran across essentially the same problem -- the interrupts for the ahc controller weren't working -- with the new interrupt code. With the new kernel, booting with ACPI disabled worked okay, but booting with ACPI enabled caused the SCSI device probe to hang up. This is true even for a kernel compiled from current source today. Below I list the dmesg output for a boot with today's kernel with ACPI disabled.
Alas, I don't have a similar file for the ACPI-enabled case (since the OS doesn't ever get up to a point where it can write to its disks, and don't have a machine available for ready serial console-ing), but I can tell you that where the non-ACPI boot said

pcib0: slot 7 INTD routed to irq 19
pcib0: slot 17 INTA routed to irq 19
pcib0: slot 18 INTA routed to irq 16
pcib0: slot 18 INTB routed to irq 16

the booted-with-ACPI kernel said those interrupts were routed to IRQs 11 and 10, respectively, and the later ahc? probes said that ahc[01] were on irq 10 as well.

I'd attach the dump of the ACPI tables as well, but, um,

ichotolot# acpidump -t
acpidump: sysctl machdep.acpi_root does not point to RSDP
ichotolot# sysctl -a | grep acpi_root
machdep.acpi_root: 0
ichotolot#

So you can't dump the ACPI tables for debugging purposes if you didn't boot with ACPI? I don't recall this being the case before...

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #9: Mon Nov 10 21:13:08 CST 2003
[EMAIL PROTECTED]:/usr/src/sys/i386/compile/ICHOTOLOTSMP
Preloaded elf kernel /boot/kernel/kernel at 0xc0b3f000.
MPTable: INTEL440GX Timecounter i8254 frequency 1193182 Hz quality 0 CPU: Pentium II/Pentium II Xeon/Celeron (400.91-MHz 686-class CPU) Origin = GenuineIntel Id = 0x653 Stepping = 3 Features=0x183fbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR real memory = 668991488 (638 MB) avail memory = 640176128 (610 MB) FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0: Assuming intbase of 0 ioapic0 Version 1.1 irqs 0-23 on motherboard Pentium Pro MTRR support enabled npx0: [FAST] npx0: math processor on motherboard npx0: INT 16 interface pcibios: BIOS version 2.10 pcib0: MPTable Host-PCI bridge at pcibus 0 on motherboard pci0: PCI bus on pcib0 pcib0: slot 7 INTD routed to irq 19 pcib0: slot 17 INTA routed to irq 19 pcib0: slot 18 INTA routed to irq 16 pcib0: slot 18 INTB routed to irq 16 agp0: Intel 82443GX host to PCI bridge mem 0xf800-0xfbff at device 0.0 on pci0 pcib1: MPTable PCI-PCI bridge at device 1.0 on pci0 pci1: PCI bus on pcib1 pcib1: slot 0 INTA routed to irq 16 pci1: display, VGA at device 0.0 (no driver attached) isab0: PCI-ISA bridge at device 7.0 on pci0 isa0: ISA bus on isab0 atapci0: Intel PIIX4 UDMA33 controller port 0xffa0-0xffaf at device 7.1 on pci0 ata0: at 0x1f0 irq 14 on atapci0 ata0: [MPSAFE] ata1: at 0x170 irq 15 on atapci0 ata1: [MPSAFE] uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0xef80-0xef9f irq 19 at device 7.2 on pci0 usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0 usb0: USB revision 1.0 uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered ums0: Cypress Sem PS2/USB Browser Combo Mouse, rev 1.00/4.9c, addr 2, iclass 3/1 ums0: 5 buttons and Z dir. piix0: PIIX Timecounter port 0x440-0x44f at device 7.3 on pci0 Timecounter PIIX frequency 3579545 Hz quality 0 pcib2: PCI-PCI bridge at device 16.0 on pci0 pci2: PCI bus on pcib2 fxp0: Intel 82558 Pro/100 Ethernet port 0xef40-0xef5f mem
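Comparing the "routed to irq" lines between an ACPI and a non-ACPI boot, as done by eye in the message above, is easy to mechanize once both dmesg outputs are captured. Here is a rough sketch in Python; the regular expression is only an assumption based on the four routing lines quoted in this message, and `parse_routes` and `routing_changes` are hypothetical helper names.

```python
import re
from typing import Dict, Tuple

# Matches lines such as "pcib0: slot 18 INTA routed to irq 16".
ROUTE_RE = re.compile(r"(pcib\d+): slot (\d+) (INT[A-D]) routed to irq (\d+)")

RouteKey = Tuple[str, int, str]  # (bridge, slot, interrupt pin)


def parse_routes(dmesg_text: str) -> Dict[RouteKey, int]:
    """Map (bridge, slot, pin) to the IRQ reported in the dmesg text."""
    routes: Dict[RouteKey, int] = {}
    for match in ROUTE_RE.finditer(dmesg_text):
        bridge, slot, pin, irq = match.groups()
        routes[(bridge, int(slot), pin)] = int(irq)
    return routes


def routing_changes(
    before: Dict[RouteKey, int], after: Dict[RouteKey, int]
) -> Dict[RouteKey, Tuple[int, int]]:
    """Return {key: (irq_before, irq_after)} for pins whose IRQ differs."""
    return {
        key: (before[key], after[key])
        for key in before
        if key in after and before[key] != after[key]
    }
```

Feeding it the non-ACPI dmesg and (once a serial console is available) the ACPI dmesg would list exactly which slots moved to a different IRQ.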
Panic in scheduler code with SCHED_ULE during boot to multi-user.
Hi. Last night I upgraded to the most recent -current source and rebuilt everything, and decided on building the kernel to try the new SCHED_ULE scheduler (I had been using SCHED_4BSD before). Alas, the experiment did not go well; every time I booted the machine, I got a panic just as the system was about to put up the login prompt. Switching the kernel config back to SCHED_4BSD and building a kernel with the same (last night's) sources gave me a working kernel. This is on a dual-processor PII/400 box. Below I list what I've got from a kernel coredump of the SCHED_ULE kernel; I've added my comments on the gdb listing preceded by # signs. ichotolot# gdb -k ./kernel.debug ./vmcore.45 GNU gdb 5.2.1 (FreeBSD) Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-undermydesk-freebsd... panic: page fault panic messages: --- Fatal trap 12: page fault while in kernel mode cpuid = 1; lapic.id = 0100 fault virtual address = 0x38 fault code = supervisor read, page not present instruction pointer = 0x8:0xc036835d stack pointer = 0x10:0xe1cfbbbc frame pointer = 0x10:0xe1cfbbcc code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 649 (squid) trap number = 12 panic: page fault cpuid = 1; lapic.id = 0100 boot() called on cpu#1 syncing disks, buffers remaining... 
panic: absolutely cannot call smp_ipi_shootdown with interrupts already disabled cpuid = 1; lapic.id = 0100 boot() called on cpu#1 Uptime: 1m12s Dumping 638 MB 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 --- Reading symbols from /usr/src/sys/i386/compile/ICHOTOLOTSMP/modules/usr/src/sys/modules/acpi/acpi.ko.debug...done. Loaded symbols for /usr/src/sys/i386/compile/ICHOTOLOTSMP/modules/usr/src/sys/modules/acpi/acpi.ko.debug Reading symbols from /usr/src/sys/i386/compile/ICHOTOLOTSMP/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug...done. Loaded symbols for /usr/src/sys/i386/compile/ICHOTOLOTSMP/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug Reading symbols from /usr/src/sys/i386/compile/ICHOTOLOTSMP/modules/usr/src/sys/modules/linux/linux.ko.debug...done. Loaded symbols for /usr/src/sys/i386/compile/ICHOTOLOTSMP/modules/usr/src/sys/modules/linux/linux.ko.debug Reading symbols from /boot/kernel/green_saver.ko...done. 
Loaded symbols for /boot/kernel/green_saver.ko #0 doadump () at ../../../kern/kern_shutdown.c:240 240 dumping++; (kgdb) bt #0 doadump () at ../../../kern/kern_shutdown.c:240 #1 0xc03547c0 in boot (howto=260) at ../../../kern/kern_shutdown.c:372 #2 0xc0354ba6 in panic () at ../../../kern/kern_shutdown.c:550 #3 0xc050f9db in smp_tlb_shootdown (vector=0, addr1=0, addr2=0) at ../../../i386/i386/mp_machdep.c:2387 #4 0xc050fc79 in smp_invlpg_range (addr1=0, addr2=0) at ../../../i386/i386/mp_machdep.c:2519 #5 0xc0511df8 in pmap_invalidate_range (pmap=0xc06dc620, sva=3568271360, eva=1) at ../../../i386/i386/pmap.c:719 #6 0xc0512118 in pmap_qenter (sva=3568271360, m=0xe1cfb8c0, count=-1) at ../../../i386/i386/pmap.c:943 #7 0xc03a0448 in vm_hold_load_pages (bp=0xd199d440, from=3568271360, to=3568279552) at ../../../kern/vfs_bio.c:3574 #8 0xc039ea5c in allocbuf (bp=0xd199d440, size=6144) at ../../../kern/vfs_bio.c:2752 #9 0xc039e6fe in geteblk (size=6144) at ../../../kern/vfs_bio.c:2634 #10 0xc039b210 in bwrite (bp=0xd188b8e0) at ../../../kern/vfs_bio.c:818 #11 0xc039bc6c in bawrite (bp=0x0) at ../../../kern/vfs_bio.c:1153 #12 0xc03a4860 in vop_stdfsync (ap=0xe1cfba14) at ../../../kern/vfs_default.c:742 #13 0xc031ba10 in spec_fsync (ap=0xe1cfba14) at ../../../fs/specfs/spec_vnops.c:417 #14 0xc031ae38 in spec_vnoperate (ap=0x0) at ../../../fs/specfs/spec_vnops.c:122 #15 0xc04ad2d7 in ffs_sync (mp=0xc5336400, waitfor=2, cred=0xc1c27e80, td=0xc0669ec0) at vnode_if.h:624 #16 0xc03b0b1b in sync (td=0xc0669ec0, uap=0x0) at ../../../kern/vfs_syscalls.c:142 #17 0xc03542e2 in boot (howto=256) at ../../../kern/kern_shutdown.c:281 #18 0xc0354ba6 in panic () at ../../../kern/kern_shutdown.c:550 #19 0xc0517292 in trap_fatal (frame=0xe1cfbb7c, eva=0) at ../../../i386/i386/trap.c:836 #20 0xc0516863 in trap (frame= {tf_fs = -1067057128, tf_es = 16, tf_ds = -1067974640, tf_edi = -982159056, tf_esi = -1066987296, tf_ebp = -506479668, tf_isp = -506479704, tf_ebx = 0, tf_edx = 2, tf_ecx = 1, 
tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1070169251, tf_cs = 8,
Re: HEADS UP: ACPI CHANGES AFFECTING MOST -CURRENT USERS
In servalan.mailinglist.fbsd-current David Malone writes:

> On Wed, Aug 29, 2001 at 07:58:59PM -0700, Mike Smith wrote:
>> - The PnP BIOS is disabled and onboard peripherals are detected using ACPI, and attach to ACPI and not isa.
>
> With the ACPI module loaded I find that ed0, fdc0 and pca0 are no longer detected (well, fdc0 is detected but gives an error). I have the most recent BIOS installed and it doesn't seem to make any difference if I twiddle BIOS settings. Could this have something to do with hints, or where should I be looking for the problem?

I'm seeing similar behavior, with fdc0 not functioning properly and giving the following stuff in dmesg. Note the 'fdc0: cmd 3 failed at out byte 1 of 3' messages; the kernel never seems to properly detect floppy drive 0. This is on a Tyan Thunder 100GX motherboard. It's not got the most current rev. of the BIOS, but I'm somewhat reluctant to try flashing a newer BIOS unless I'm sure the lossage is in the BIOS and not in the FreeBSD kernel. (Alas, trying the newer BIOS may be the only way to find out for sure.)

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #1: Sat Sep 1 21:43:41 CDT 2001 [EMAIL PROTECTED]:/usr/src/sys/i386/compile/ICHOTOLOTSMP Timecounter i8254 frequency 1193182 Hz CPU: Pentium II/Pentium II Xeon/Celeron (400.91-MHz 686-class CPU) Origin = GenuineIntel Id = 0x653 Stepping = 3 Features=0x183fbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR real memory = 134152192 (131008K bytes) avail memory = 124178432 (121268K bytes) Programming 24 pins in IOAPIC #0 IOAPIC #0 intpin 2 - irq 0 FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee0 cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee0 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec0 Preloaded elf kernel kernel at 0xc0633000. Preloaded elf module acpi.ko at 0xc063309c. Pentium Pro MTRR support enabled WARNING: Driver mistake: destroy_dev on 154/0 npx0: math processor on motherboard npx0: INT 16 interface acpi0: TYANCP TYANTBLE on motherboard acpi0: power button is handled as a fixed feature programming model. 
Timecounter ACPI frequency 3579545 Hz acpi_timer0: 24-bit timer at 3.579545MHz port 0x408-0x40b on acpi0 acpi_cpu0: CPU on acpi0 acpi_cpu1: CPU on acpi0 acpi_tz0: thermal zone on acpi0 acpi_pcib0: Host-PCI bridge port 0xcf8-0xcff on acpi0 IOAPIC #0 intpin 19 - irq 2 IOAPIC #0 intpin 16 - irq 10 pci0: PCI bus on acpi_pcib0 pcib1: PCI-PCI bridge at device 1.0 on pci0 pci1: PCI bus on pcib1 pci1: display, VGA at 0.0 (no driver attached) isab0: PCI-ISA bridge at device 7.0 on pci0 isa0: ISA bus on isab0 atapci0: Intel PIIX4 ATA33 controller port 0xffa0-0xffaf at device 7.1 on pci0 ata0: at 0x1f0 irq 14 on atapci0 ata1: at 0x170 irq 15 on atapci0 uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0xef80-0xef9f irq 2 at device 7.2 on pci0 usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0 usb0: USB revision 1.0 uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered ums0: Cypress Sem PS2/USB Browser Combo Mouse, rev 1.00/4.9c, addr 2, iclass 3/1 ums0: 5 buttons and Z dir. 
Timecounter PIIX frequency 3579545 Hz pci0: bridge, PCI-unknown at 7.3 (no driver attached) pcib2: PCI-PCI bridge at device 16.0 on pci0 pci2: PCI bus on pcib2 fxp0: Intel Pro 10/100B/100+ Ethernet port 0xef40-0xef5f mem 0xfea0-0xfeaf,0xfc4ff000-0xfc4f irq 2 at device 17.0 on pci0 fxp0: Ethernet address 00:e0:81:10:47:b2 inphy0: i82555 10/100 media interface on miibus0 inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto ahc0: Adaptec aic7895 Ultra SCSI adapter port 0xe400-0xe4ff mem 0xfebfe000-0xfebfefff irq 10 at device 18.0 on pci0 aic7895C: Ultra Wide Channel A, SCSI Id=7, 32/255 SCBs ahc1: Adaptec aic7895 Ultra SCSI adapter port 0xe800-0xe8ff mem 0xfebff000-0xfebf irq 10 at device 18.1 on pci0 aic7895C: Ultra Wide Channel B, SCSI Id=7, 32/255 SCBs fdc0: cmd 3 failed at out byte 1 of 3 sio0 port 0x3f8-0x3ff irq 4 on acpi0 sio0: type 16550A sio1 port 0x2f8-0x2ff irq 3 on acpi0 sio1: type 16550A ppc0 port 0x378-0x37f irq 7 on acpi0 ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode plip0: PLIP network interface on ppbus0 lpt0: Printer on ppbus0 lpt0: Interrupt-driven port ppi0: Parallel I/O on ppbus0 ppc1: cannot reserve I/O port range fdc0: cmd 3 failed at out byte 1 of 3 ppc1: cannot reserve I/O port range orm0: Option ROMs at iomem 0xc-0xc87ff,0xcc000-0xd07ff on isa0 atkbdc0: Keyboard controller (i8042) at port 0x60,0x64 on isa0 atkbd0: AT Keyboard flags 0x1 irq 1 on atkbdc0 kbd0 at atkbd0 ppc1: cannot reserve I/O port range sc0: System console at flags 0x100 on isa0 sc0: VGA 16 virtual consoles, flags=0x300 vga0: Generic ISA VGA
Re: Interrupt messages from usb0 on CURRENT
In servalan.mailinglist.fbsd-current you write: I just upgraded to the latest sources (two hours ago) on my VAIO laptop and I'm now getting dozens of messages: Aug 22 15:00:07 sidhe /boot/kernel/kernel: usb0: interrupt, but not for us Aug 22 15:00:51 sidhe last message repeated 8 times Aug 22 15:03:02 sidhe last message repeated 19 times Aug 22 15:12:59 sidhe last message repeated 92 times This is apparently due to a change last night in the uhci and ohci drivers to report interrupts the USB code sees but which don't correspond to any actual USB activity. I saw the same thing last night after I upgraded (to try out jhb's latest fixes, which worked like a charm on the sound problem). I note that on my system the uhci0 and fxp0 are on the same IRQ: uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0xef80-0xef9f irq 2 at device 7.2 on pci0 fxp0: Intel Pro 10/100B/100+ Ethernet port 0xef40-0xef5f mem 0xfea0-0xfeaf,0xfc4ff000-0xfc4f irq 2 at device 17.0 on pci0 I wonder if the interrupts not for us are actually interrupts from the Ethernet that the USB code sees because both the USB and the Ethernet are on the same irq. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
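The shared-IRQ guess above can be illustrated with a toy model: when two devices sit on the same interrupt line, every interrupt runs both handlers, and the handler whose device has nothing pending sees an interrupt that is "not for us". This is a simplified sketch in Python, not how the uhci or fxp drivers are actually written; `Device` and `IrqLine` are invented for the illustration.

```python
from typing import List


class Device:
    """Toy device; `pending` models its own interrupt status bit."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.pending = False
        self.handled = 0   # interrupts that really were ours
        self.spurious = 0  # "interrupt, but not for us"

    def interrupt_handler(self) -> None:
        if self.pending:
            self.pending = False
            self.handled += 1
        else:
            # The line fired, but this device has nothing to service.
            self.spurious += 1


class IrqLine:
    """One interrupt line shared by every attached device."""

    def __init__(self, number: int) -> None:
        self.number = number
        self.devices: List[Device] = []

    def attach(self, device: Device) -> None:
        self.devices.append(device)

    def fire(self, source: Device) -> None:
        """`source` raises the line; every handler on the line gets called."""
        source.pending = True
        for device in self.devices:
            device.interrupt_handler()
```

With `uhci0` and `fxp0` both attached to the same `IrqLine`, firing the line eight times on behalf of `fxp0` leaves `fxp0.handled == 8` and `uhci0.spurious == 8`, which is the pattern the log messages would show if the guess is right.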
Re: Sound broken on -current again...
In servalan.mailinglist.fbsd-current jhb writes: On 19-Aug-01 Richard Todd wrote: In servalan.mailinglist.fbsd-current Maxim Sobolev writes: I found that after reverting the following deltas (jhb's 10 August commit) sound starts working again: [list of deltas deleted] I found much the same thing; specifically, the problematic change is this one: What wait channel is the process (xmms, mpg123, whatever) in? Looking at a core file from a known-buggy kernel that I'd forced to core itself with ddb, I find for the madplay process: (kgdb) proc 855 (kgdb) bt #0 mi_switch () at ../../../kern/kern_synch.c:707 #1 0xc0273645 in msleep (ident=0xc13e0b00, mtx=0xc13d2800, priority=332, wmesg=0xc042bcb4 pcmwr, timo=1) at ../../../kern/kern_synch.c:466 #2 0xc01fcad8 in chn_sleep (c=0xc13d1680, str=0xc042bcb4 pcmwr, timeout=1) at ../../../dev/sound/pcm/channel.c:109 #3 0xc01fcd5c in chn_write (c=0xc13d1680, buf=0xc8f1af00) at ../../../dev/sound/pcm/channel.c:259 #4 0xc01fef40 in dsp_write (i_dev=0xc13e0f00, buf=0xc8f1af00, flag=2359297) at ../../../dev/sound/pcm/dsp.c:381 #5 0xc0243095 in spec_write (ap=0xc8f1ae90) at ../../../fs/specfs/spec_vnops.c:289 #6 0xc0242dc9 in spec_vnoperate (ap=0xc8f1ae90) at ../../../fs/specfs/spec_vnops.c:119 #7 0xc02b7c5f in vn_write (fp=0xc1623ec0, uio=0xc8f1af00, cred=0xc15c2600, flags=0, p=0xc8e54100) at vnode_if.h:303 #8 0xc028c073 in dofilewrite (p=0xc8e54100, fp=0xc1623ec0, fd=3, buf=0xbfbf8b74, nbyte=4608, offset=-1, flags=0) at ../../../sys/file.h:162 #9 0xc028bf26 in write (p=0xc8e54100, uap=0xc8f1af80) at ../../../kern/sys_generic.c:334 #10 0xc03e2fc9 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077965964, tf_esi = 4608, tf_ebp = -1077937536, tf_isp = -923684908, tf_ebx = -1077965964, tf_edx = 1103, tf_ecx = -411, tf_eax = 4, tf_trapno = 0, tf_err = 2, tf_eip = 672022312, tf_cs = 31, tf_eflags = 663, tf_esp = -1077966048, tf_ss = 47}) at ../../../i386/i386/trap.c:1128 #11 0xc03cce0d in syscall_with_err_pushed () so 
apparently it was waiting on 'pcmwr'.
Re: Sound broken on -current again...
In servalan.mailinglist.fbsd-current Maxim Sobolev writes:

> I found that after reverting the following deltas (jhb's 10 August commit) sound starts working again: [list of deltas deleted]

I found much the same thing; specifically, the problematic change is this one:

jhb 2001/08/10 14:08:57 PDT

Modified files:
  sys/kern kern_synch.c
Log:
  Work around a race between msleep() and endtsleep() where it was possible for endtsleep() to be executing when msleep() resumed, for endtsleep() to spin on sched_lock long enough for the other process to loop on msleep() and sleep again resulting in endtsleep() waking up the wrong msleep.

  Obtained from: BSD/OS

Revision  Changes  Path
1.154     +24 -4   src/sys/kern/kern_synch.c

Kernels built from source immediately prior to this change work; kernels built from source immediately after this change have the sound-related problems mentioned in this thread.
Re: Sound broken on -current again...
In servalan.mailinglist.fbsd-current Daniel M. Kurry writes:

> On Wed, Aug 15, 2001 at 07:01:46PM +0200, some SMTP stream spewed forth:
>> One gets the first DMA buffer full, then the process hangs...
>
> Due to the lack of replies, I'll go ahead. I am seeing sound breakage also. My card is a Creative Labs SoundBlaster Live!. xmms will play a short (less than a second) spurt of audio and then stop responding. mpg123 will not play (any audio to the speakers) at all. I ran a buildworld today which apparently broke it. That puts the breakage between today and sometime less than 2 months ago. (I really cannot be more specific.)

I'm seeing much the same thing, on an SMP box with onboard sbc0 (Vibra16X) sound chip. Attempting to play sound with madplay gets about 2 seconds of sound and then silence, with the madplay process in an unkillable kernel wait. Oddly enough, the sbc0 interrupt thread continues to occasionally gather a tick of CPU time, but apparently not enough to do anything useful.

I'm busy doing binary-search on the CVS tree, checking out source from different times and seeing if I can localize the commit that broke it. My current results are that a kernel built from source as of 2001/08/10 00:00 CDT (i.e. 2001/08/09 22:00:00 PDT) works, one built from source as of 2001/08/10 15:52 PDT does not, so the bug is somewhere in between there. I'm now trying to narrow this down further, to a specific commit somewhere in that region.
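The binary search described above (known-good source at one end, known-bad at the other, test the midpoint, repeat) looks roughly like this as code. It is a generic sketch in Python, not a CVS tool: `first_bad` and the `is_bad` callback are hypothetical, and in practice `is_bad` would mean "check out that timestamp, build a kernel, boot it, and see whether sound hangs".

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(candidates: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Find the first bad entry by bisection.

    `candidates` is ordered oldest to newest. The first entry is assumed to be
    known good and the last known bad, so neither endpoint is tested again.
    """
    if len(candidates) < 2:
        raise ValueError("need at least one known-good and one known-bad entry")
    good, bad = 0, len(candidates) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(candidates[mid]):
            bad = mid   # breakage is at or before mid
        else:
            good = mid  # breakage is after mid
    return candidates[bad]
```

Each round halves the remaining range, so 100 candidate checkouts need at most 7 kernel builds. The approach assumes the breakage, once introduced, stays in every later checkout.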
Re: Couple Giant not locked at vm_object.c:261 panics I had to
In fbsd-current John Baldwin writes:

> On 09-Jun-01 Richard Todd wrote:
>> Note that the first panic is somewhat muddled by the fact that, while syncing disks from the vm_object.c panic, it apparently panicked again with Giant locked at i386/trap.c:1153. That probably confuses the issue greatly.
>
> Yes, I need the first traceback, not the second. One question: are you using ktrace? ddb is your friend here, as it can do a traceback when you have the first panic.

Yeah, I am using ktrace, and now that I think of it, yeah, a ktraced process was probably running when those panics occurred. Unfortunately, ddb is not my friend, as I'm usually running X. :-(

>> P.S. Stupid -current question: How does one tell what process was running that triggered a panic? This used to be findable with p *curproc in gdb, but that doesn't seem to work anymore.
>
> You have to look at the list of per-cpu data (look at the gd_allcpu list). In ddb you can use 'show pcpu' to look at per-cpu data. At some point, gdb needs to be taught the notion of a 'current CPU' and be taught a way to access per-cpu data of the current CPU.

Ah. Okay.

>> #10 0xc042c603 in ast (framep=0xc8ce0fa8) at ../../i386/i386/trap.c:1320
>> #11 0xc0417b00 in doreti_ast ()
>
> Ok, this one is the ktrace bogon that was recently brought to my attention.

Cool. Thanks.
Couple Giant not locked at vm_object.c:261 panics I had today....
Backtraces posted here in hopes they might enlighten someone. This is with kernel source from June 6 (specifically, Sticky Date: 2001.06.06.22.16.24 according to cvs status). The machine is a dual PII/400; dmesg follows the backtraces from the two panics. If you want more information from these two core files, please let me know.

Note that the first panic is somewhat muddled by the fact that, while syncing disks from the vm_object.c panic, it apparently panicked again with Giant locked at i386/trap.c:1153. That probably confuses the issue greatly.

P.S. Stupid -current question: How does one tell what process was running that triggered a panic? This used to be findable with p *curproc in gdb, but that doesn't seem to work anymore.

Script started on Sat Jun 9 16:02:27 2001
You have mail.
ichotolot# gdb -k kernel.debug vmcore.19
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
IdlePTD 6516736
initial pcb at 529440
panicstr: witness_restore: lock (sleep mutex) Giant not locked
panic messages:
---
panic: mutex Giant not owned at ../../vm/vm_object.c:261
cpuid = 1; lapic.id = 0100
boot() called on cpu#1

syncing disks...
exclusive (sleep mutex) Giant (0xc0576ca0) locked @ ../../i386/i386/trap.c:1153
exclusive (spin mutex) sched lock (0xc05763e0) locked @ ../../kern/kern_mutex.c:312
panic: witness_restore: lock (sleep mutex) Giant not locked
cpuid = 1; lapic.id = 0100
boot() called on cpu#1
Uptime: 2d2h35m38s

dumping to dev da0s2b, offset 270336
dump 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0 dumpsys () at ../../kern/kern_shutdown.c:478
478 if (dumping++) {
(kgdb) p curproc
No symbol "curproc" in current context.
(kgdb) bt
#0 dumpsys () at ../../kern/kern_shutdown.c:478
#1 0xc026b35f in boot (howto=260) at ../../kern/kern_shutdown.c:321
#2 0xc026b7d1 in panic (fmt=0xc0488ae5 "%s: lock (%s) %s not locked") at ../../kern/kern_shutdown.c:600
#3 0xc02878a5 in witness_restore (lock=0xc0576ca0, file=0xc048bc20 "../../kern/vfs_bio.c", line=1827) at ../../kern/subr_witness.c:1297
#4 0xc0273836 in msleep (ident=0xc054eaec, mtx=0x0, priority=68, wmesg=0xc048c09e "psleep", timo=100) at ../../kern/kern_synch.c:500
#5 0xc02ab0a5 in buf_daemon () at ../../kern/vfs_bio.c:1883
#6 0xc025af78 in fork_exit (callout=0xc02aaf20 <buf_daemon>, arg=0x0, frame=0xc80cbfa8) at ../../kern/kern_fork.c:727
(kgdb) fr 6
#6 0xc025af78 in fork_exit (callout=0xc02aaf20 <buf_daemon>, arg=0x0, frame=0xc80cbfa8) at ../../kern/kern_fork.c:727
727 callout(arg, frame);
(kgdb) l
722  * cpu_set_fork_handler intercepts this function call to
723  * have this call a non-return function to stay in kernel mode.
724  * initproc has its own fork handler, but it does return.
725  */
726 KASSERT(callout != NULL, ("NULL callout in fork_exit"));
727 callout(arg, frame);
728
729 /*
730  * Check if a kernel thread misbehaved and returned from its main
731  * function.
(kgdb) l
732  */
733 PROC_LOCK(p);
734 if (p->p_flag & P_KTHREAD) {
735     PROC_UNLOCK(p);
736     mtx_lock(&Giant);
737     printf("Kernel thread \"%s\" (pid %d) exited prematurely.\n",
738         p->p_comm, p->p_pid);
739     kthread_exit(0);
740 }
741 PROC_UNLOCK(p);
(kgdb) p frame
$1 = (struct trapframe *) 0xc80cbfa8
(kgdb) p frame[0]
$2 = {tf_fs = 0, tf_es = 0, tf_ds = 0, tf_edi = 0, tf_esi = 0, tf_ebp = 0, tf_isp = 0, tf_ebx = 0, tf_edx = 1, tf_ecx = 0, tf_eax = 0, tf_trapno = 0, tf_err = 0, tf_eip = 0, tf_cs = 0, tf_eflags = 0, tf_esp = 0, tf_ss = 0}
(kgdb) fr 5
#5 0xc02ab0a5 in buf_daemon () at ../../kern/vfs_bio.c:1883
1883 tsleep(&bd_request, PVM, "qsleep", hz / 2);
(kgdb) l
1878 /*
1879  * We couldn't find any flushable dirty buffers but
1880  * still have too many dirty buffers, we
1881  * have to sleep and try again. (rare)
1882  */
1883 tsleep(&bd_request, PVM, "qsleep", hz / 2);
1884
Panic I got: mutex sx backing lock recursed at ../../kern/kern_condvar.c:198
I'm running -CURRENT on a dual PII/400 box with 128M of RAM. The kernel I'm running was built from sources current as of last night (i.e. around 9PM CDT Apr 3). Just now, while listening to streaming audio with xmms, the machine crashed. It's done that a couple times before, with recent-ish kernels while doing streaming audio with xmms, but the other times didn't give core dumps with usable backtraces. *This* time I got a decent backtrace.

If I'm reading this backtrace right, the thread handling the sound hardware called selwakeup() (frame #19). This called pfind() (frame #18), which tries to lock allproc. Somewhere in doing this, witness_sleep() (frame #15) decides it wants to printf() a message. printf() calls down into the tty code, which goes into ptsstart() (frame #9) and the pty code (I'm not entirely sure why). This code then tries to do a selwakeup() of its own (frame #7) which calls pfind() which tries (again) to lock allproc, leading to the "mutex recursed" panic.

GDB output and (if it matters) kernel config file below.

Script started on Thu Apr 5 01:12:28 2001
ichotolot# cd /usr/src/sys/compile/ICHOTOLOTSMP
ichotolot# gdb -k kernel.debug /var/crash/vmcore.7
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
IdlePTD 6356992
initial pcb at 513860
panicstr: mutex sx backing lock recursed at ../../kern/kern_condvar.c:198
panic messages:
---
panic: mutex sx backing lock recursed at ../../kern/kern_condvar.c:198
cpuid = 0; lapic.id =
boot() called on cpu#0

syncing disks... 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16
1: dev:da0s2e, flags:21021024, blkno:11469104, lblkno:11469104
2: dev:da0s2e, flags:21021024, blkno:11468864, lblkno:11468864
3: dev:da0s2e, flags:2124, blkno:2048, lblkno:2048
4: dev:da0s2e, flags:21021024, blkno:2752848, lblkno:2752848
5: dev:da0s2e, flags:21021024, blkno:2752736, lblkno:2752736
6: dev:da0s2e, flags:21021024, blkno:11468976, lblkno:11468976
7: dev:da0s2a, flags:21021024, blkno:131152, lblkno:131152
8: dev:da0s2e, flags:21021024, blkno:2294176, lblkno:2294176
9: dev:da0s2e, flags:21021024, blkno:2425120, lblkno:2425120
10: dev:da0s2a, flags:21021024, blkno:131184, lblkno:131184
11: dev:da0s2e, flags:2124, blkno:16, lblkno:16
12: dev:da0s2e, flags:21021024, blkno:2294160, lblkno:2294160
13: dev:da0s2e, flags:21021024, blkno:14221440, lblkno:14221440
14: dev:da0s2e, flags:21021024, blkno:2294192, lblkno:2294192
15: dev:da0s2e, flags:01011024, blkno:11474186, lblkno:0
16: dev:da0s2e, flags:0124, blkno:11468848, lblkno:11468848
giving up on 16 buffers
Uptime: 23h3m37s

dumping to dev da0s2b, offset 270336
dump 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0 dumpsys () at ../../kern/kern_shutdown.c:478
478 if (dumping++) {
(kgdb) bt
#0 dumpsys () at ../../kern/kern_shutdown.c:478
#1 0xc0251547 in boot (howto=256) at ../../kern/kern_shutdown.c:321
#2 0xc0251a09 in panic (fmt=0xc0464a44 "mutex %s recursed at %s:%d") at ../../kern/kern_shutdown.c:592
#3 0xc024aec3 in _mtx_assert (m=0xc054765c, what=9, file=0xc0462932 "../../kern/kern_condvar.c", line=198) at ../../kern/kern_mutex.c:602
#4 0xc02369a2 in cv_wait (cvp=0xc0547698, mp=0xc054765c) at ../../kern/kern_condvar.c:198
#5 0xc0258caa in _sx_slock (sx=0xc0547640, file=0xc0464e23 "../../kern/kern_proc.c", line=143) at ../../kern/kern_sx.c:117
#6 0xc024bf48 in pfind (pid=606) at ../../kern/kern_proc.c:143
#7 0xc026ffe1 in selwakeup (sip=0xc10eea04) at ../../kern/sys_generic.c:1061
#8 0xc027accf in ptcwakeup (tp=0xc10eea20, flag=1) at ../../kern/tty_pty.c:318
#9 0xc027acaa in ptsstart (tp=0xc10eea20) at ../../kern/tty_pty.c:307
#10 0xc0278170 in ttstart (tp=0xc10eea20) at ../../kern/tty.c:1417
#11 0xc027978d in tputchar (c=46, tp=0xc10eea20) at ../../kern/tty.c:2484
#12 0xc0268813 in putchar (c=46, arg=0xc7f12e10) at ../../kern/subr_prf.c:304
#13 0xc0268fb8 in kvprintf (fmt=0xc0468642 ":%d: %s with \"%s\" locked from %s:%d\n", func=0xc02687c4 <putchar>, arg=0xc7f12e10, radix=10, ap=0xc7f12e2c "Ç") at ../../kern/subr_prf.c:637
#14 0xc0268740 in printf (fmt=0xc0468640 "%s:%d: %s with \"%s\" locked from %s:%d\n") at ../../kern/subr_prf.c:260
#15 0xc026cff9 in witness_sleep (check_only=0, lock=0xc054765c,
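The re-entry pattern described in this message (a diagnostic printed while a lock is held ends up taking the same lock again) can be illustrated with a small userland toy. This is a loose model under stated assumptions, not the kernel code: `NonRecursiveLock` is a hypothetical class, the functions only borrow the names `selwakeup`, `pfind` and the pid 606 from the backtrace, and `console_write` stands in for the whole printf/tty/pty path.

```python
import threading


class NonRecursiveLock:
    """Toy lock that raises, rather than deadlocking, if its owner takes it again."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self._owner = None

    def acquire(self):
        if self._owner == threading.get_ident():
            raise RuntimeError(f"mutex {self.name} recursed")
        self._lock.acquire()
        self._owner = threading.get_ident()

    def release(self):
        self._owner = None
        self._lock.release()


allproc_lock = NonRecursiveLock("allproc")


def pfind(pid, diagnostic=None):
    """Look up a pid under allproc_lock; optionally emit a diagnostic while locked."""
    allproc_lock.acquire()
    try:
        if diagnostic is not None:
            diagnostic("sleeping with lock held")  # stands in for the witness message
        return pid
    finally:
        allproc_lock.release()


def selwakeup(pid, diagnostic=None):
    """Waking a selecting process requires finding it first."""
    return pfind(pid, diagnostic)


def console_write(message):
    """Writing to a pty-backed console wakes its reader, re-entering selwakeup()."""
    selwakeup(606)


def demo():
    try:
        selwakeup(100, diagnostic=console_write)
    except RuntimeError as exc:
        return str(exc)
    return "no recursion"
```

In this model `demo()` reports the recursion because the diagnostic callback runs while `allproc_lock` is still held; calling `selwakeup(5)` without a diagnostic completes normally.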
Re: Tracking down problem with booting large kernels (bug in locore.s)
In message [EMAIL PROTECTED], Peter Wemm writes:

Richard Todd wrote:

No crashes as of here
    pushl $begin /* jump to high virtualized address */
    ret
/* now running relocated at KERNBASE where the system is linked to run */
begin:
crashes before it gets here!!!
    /* set up bootstrap stack */
    movl proc0paddr,%eax /* location of in-kernel pages */

I have some suspicions.. Can you do a nm on your kernel?

peter@daintree[8:41pm]~-102 nm /boot/kernel/kernel |grep begin
c0123689 t begin

Sure. A working kernel (the one I'm booted off of now) shows:

55 ichotolot ~[11:49PM] Z% nm /boot/kernel.good5/kernel | grep begin
c0128c79 t begin
c0368b3f t mp_begin

and one that crashes shows:

56 ichotolot ~[11:50PM] Z% nm /boot/kernel.old/kernel | grep begin
c01290a9 t begin
c038d49f t mp_begin
Tracking down problem with booting large kernels (bug in locore.s)
On my system (dual PII/400 running -current), I've noticed for some time that if I build a kernel with too many device drivers in it (where "too many" seems to correspond to text size 3M for the resulting kernel), the system reboots itself immediately upon booting with the new kernel. Other people have noticed this before (see the thread "Recent kernels won't boot" in the mailing list archives at http://www.freebsd.org/mail/archive/2000/freebsd-current/20001015.freebsd-current.html ). However, no fix for or cause of the problem was ever identified, and the problem still exists in -current cvsuped as of today.

I spent some time tonight seeing if I could localize the exact place of the crash, and had some luck finding where it's crashing. The problem is annoyingly hard to track down, as even booting with DDB and boot -d wouldn't catch the bug; the kernel reboots before DDB starts. I had to resort to sticking "hlt" instructions (or calls to cpu_halt()) in various places and seeing if I could get the kernel to hang (telling me that the kernel had gotten as far as where I stuck the halt). I narrowed the crash down to this area of locore.s (note the arrows).

---
/* Now enable paging */
    movl R(IdlePTD), %eax
    movl %eax,%cr3 /* load ptd addr into mmu */
    movl %cr0,%eax /* get control word */
    orl $CR0_PE|CR0_PG,%eax /* enable paging */
    movl %eax,%cr0 /* and let's page NOW! */

#ifdef BDE_DEBUGGER
/*
 * Complete the adjustments for paging so that we can keep tracing through
 * initi386() after the low (physical) addresses for the gdt and idt become
 * invalid.
 */
    call bdb_commit_paging
#endif
No crashes as of here
    pushl $begin /* jump to high virtualized address */
    ret

/* now running relocated at KERNBASE where the system is linked to run */
begin:
crashes before it gets here!!!
    /* set up bootstrap stack */
    movl proc0paddr,%eax /* location of in-kernel pages */
--

The pushl and ret is where the boot code is jumping to "begin:" at its proper virtual address after the page tables are set up. I'm guessing that create_pagetables is somehow losing and creating bogus page tables such that the jump to the kernel virtual address space goes into deep space somewhere, but frankly the details of page tables on the i386 are beyond my expertise. So I'm posting this in hopes that someone on here *does* know enough to figure out what's going wrong when the kernel size is sufficiently large. Any takers?
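For readers following the page-table guess, it may help to see which paging entries the jump to "begin:" depends on. The sketch below assumes classic i386 paging with 4 KB pages and no PSE or PAE, where a 32-bit linear address splits into a 10-bit page directory index, a 10-bit page table index and a 12-bit offset. `split_linear_address` is a hypothetical helper; the addresses are the `begin` and `mp_begin` symbols from the nm output quoted in the follow-up to this message. It does not diagnose the bug, it only shows the arithmetic.

```python
def split_linear_address(addr):
    """Split a 32-bit linear address into (directory index, table index, offset).

    Assumes 4 KB pages without PSE/PAE: bits 31-22 select the page directory
    entry, bits 21-12 the page table entry, bits 11-0 the byte within the page.
    """
    pde_index = (addr >> 22) & 0x3FF
    pte_index = (addr >> 12) & 0x3FF
    offset = addr & 0xFFF
    return pde_index, pte_index, offset


# Symbol addresses from the nm output in the follow-up message.
SYMBOLS = {
    "good begin": 0xC0128C79,
    "good mp_begin": 0xC0368B3F,
    "bad begin": 0xC01290A9,
    "bad mp_begin": 0xC038D49F,
}

for name, addr in SYMBOLS.items():
    pde, pte, off = split_linear_address(addr)
    print(f"{name:14s} {addr:#010x}  pde={pde}  pte={pte}  offset={off:#05x}")
```

Under that assumption each page directory entry covers 4 MB, and all four symbols fall under directory entry 768, which starts at 0xc0000000. So the `begin` label itself sits at nearly the same place in the working and the crashing kernel; whatever differs between them is not visible from these addresses alone.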
Dual probing of PCI-connected hardware (was Re: xl driver
In servalan.mailinglist.fbsd-current you write:

In message [EMAIL PROTECTED] R Joseph Wright writes:

: Sep 3 13:24:26 manatee /kernel: xl0: 3Com 3c900-COMBO Etherlink XL port 0x6c00-0x6c3f irq 11 at device 9.0 on pci0
: Sep 3 13:24:26 manatee /kernel: xl1: 3Com 3c900-COMBO Etherlink XL port 0x6c00-0x6c3f irq 11 at device 9.0 on pci2

Looks like your pci bus is getting probed twice!

Warner

I've been seeing similar oddities too. Nothing crippling, but it is a little disconcerting to see the machine think you have twice as many SCSI controllers and ethernets as you actually have. This is with kernel src grabbed earlier today, i.e. after Peter Wemm's most recent fixes. Note how the kernel thinks it sees an fxp1 and an ahc2/3 as well as an ata2/3, but fails on the allocation of resources/IRQs for those devices, since they're already allocated to the "real" instances of the fxp, ahc0/1, ata0/1 devices.

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #36: Sun Sep 3 18:54:17 CDT 2000
[EMAIL PROTECTED]:/usr/src/sys/compile/ICHOTOLOTSMP
Calibrating clock(s) ... TSC clock: 400853210 Hz, i8254 clock: 1193016 Hz
CLK_USE_I8254_CALIBRATION not specified - using default frequency
Timecounter "i8254" frequency 1193182 Hz
CLK_USE_TSC_CALIBRATION not specified - using old calibration method
CPU: Pentium II/Pentium II Xeon/Celeron (400.91-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x653 Stepping = 3
Features=0x183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
real memory = 134217728 (131072K bytes)
Physical memory chunk(s):
0x1000 - 0x0009efff, 647168 bytes (158 pages)
0x00739000 - 0x07ff7fff, 126611456 bytes (30911 pages)
avail memory = 123305984 (120416K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
SMP: CPU0 apic_initialize():
lint0: 0x0700 lint1: 0x00010400 TPR: 0x0010 SVR: 0x01ff
FreeBSD/SMP: Multiprocessor motherboard
cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee0
cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee0
io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec0
bios32: Found BIOS32 Service Directory header at 0xc00fdb50
bios32: Entry = 0xfdb60 (c00fdb60) Rev = 0 Len = 1
pcibios: PCI BIOS entry at 0xf+0xdb81
pnpbios: Found PnP BIOS data at 0xc00f72e0
pnpbios: Entry = f:6984 Rev = 1.0
Other BIOS signatures found:
Preloaded elf kernel "kernel" at 0xc071d000.
random: entropy source
nulldev: null device, zero device
mem: memory I/O
Pentium Pro MTRR support enabled
Creating DISK md0
md0: Malloc disk
Math emulator present
SMP: CPU0 bsp_apic_configure():
lint0: 0x00010700 lint1: 0x0400 TPR: 0x0010 SVR: 0x01ff
pcib-: pcib0 exists, using next available unit number
npx0: math processor on motherboard
npx0: INT 16 interface
pcib0: Intel 82443GX host to PCI bridge on motherboard
pci0: physical bus=0
found-> vendor=0x8086, dev=0x71a0, revid=0x00
    class=06-00-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    map[10]: type 3, range 32, base f800, size 26, enabled
found-> vendor=0x8086, dev=0x71a1, revid=0x00
    class=06-04-00, hdrtype=0x01, mfdev=0
    subordinatebus=1 secondarybus=1
found-> vendor=0x8086, dev=0x7110, revid=0x02
    class=06-01-00, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
found-> vendor=0x8086, dev=0x7111, revid=0x01
    class=01-01-80, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    map[20]: type 4, range 32, base ffa0, size 4, enabled
Freeing (NOT implemented) redirected PCI irq 11.
found-> vendor=0x8086, dev=0x7112, revid=0x01
    class=0c-03-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    intpin=d, irq=19
    map[20]: type 4, range 32, base ef80, size 5, enabled
found-> vendor=0x8086, dev=0x7113, revid=0x02
    class=06-80-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    map[90]: type 4, range 32, base 0440, size 4, enabled
found-> vendor=0x1011, dev=0x0024, revid=0x03
    class=06-04-00, hdrtype=0x01, mfdev=0
    subordinatebus=2 secondarybus=2
Freeing (NOT implemented) redirected PCI irq 11.
found-> vendor=0x8086, dev=0x1229, revid=0x05
    class=02-00-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    intpin=a, irq=19
    map[10]: type 3, range 32, base fc4ff000, size 12, enabled
    map[14]: type 4, range 32, base ef40, size 5, enabled
    map[18]: type 1, range 32, base fea0, size 20, enabled
Freeing (NOT implemented) redirected PCI irq 10.
found-> vendor=0x9004, dev=0x7895, revid=0x04
    class=01-00-00, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
    intpin=a,
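One way to spot this kind of double probe in a long boot log is to group attach lines by the resources they claim: two unit names reporting the same I/O port range, IRQ and device slot are probably the same piece of hardware seen twice. The sketch below is only an illustration; `ATTACH_RE` and `find_double_probes` are hypothetical names, and the regular expression is written against the two xl lines quoted at the top of this message rather than against every attach format the kernel prints.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   xl0: 3Com 3c900-COMBO Etherlink XL port 0x6c00-0x6c3f irq 11 at device 9.0 on pci0
ATTACH_RE = re.compile(
    r"(?P<unit>[a-z]+\d+): .* port (?P<port>0x[0-9a-f]+-0x[0-9a-f]+) "
    r"irq (?P<irq>\d+) at device (?P<slot>[\d.]+) on (?P<bus>pci\d+)"
)


def find_double_probes(lines):
    """Return {(port, irq, slot): [(unit, bus), ...]} for resources claimed more than once."""
    by_resources = defaultdict(list)
    for line in lines:
        match = ATTACH_RE.search(line)
        if match:
            key = (match["port"], match["irq"], match["slot"])
            by_resources[key].append((match["unit"], match["bus"]))
    return {key: units for key, units in by_resources.items() if len(units) > 1}


SAMPLE = [
    "Sep 3 13:24:26 manatee /kernel: xl0: 3Com 3c900-COMBO Etherlink XL "
    "port 0x6c00-0x6c3f irq 11 at device 9.0 on pci0",
    "Sep 3 13:24:26 manatee /kernel: xl1: 3Com 3c900-COMBO Etherlink XL "
    "port 0x6c00-0x6c3f irq 11 at device 9.0 on pci2",
]
```

Run on `SAMPLE`, the function groups xl0 on pci0 and xl1 on pci2 under the same port range, IRQ and slot, which matches the "probed twice" reading of those two lines.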
Re: (noperiph:ahc0:0:-1:-1): ... error
In servalan.mailinglist.fbsd-current you write:

I am trying to run a recent (as of today) and am seeing the following error when I try to boot:

(noperiph:ahc0:0:-1:-1): SCSI bus reset delivered. 0 SCBs aborted.
panic: Bogus resid sgptr value 0xbd68609

(I copied this from the console after the boot failure, there may be minor mistakes.)

This started happening when I started compiling kernels built from sources cvsuped around Jul 18. I am not sure what is causing these messages. The "noperiph" message appears to come from xpt_print_path in /usr/src/sys/cam/cam_xpt.c while the panic seems to be written by ahc_calc_residual in /usr/src/sys/dev/aic7xxx/aic7xxx.c. From a quick look at the code, the problem is not directly in the code pointed to by the messages.

I have an Adaptec 2940UW. A much older kernel reports it as Adaptec 2940 Ultra SCSI adapter with aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs. The Bios on the board is version 2.20.0. I have 4 drives and a UMAX scanner connected to the bus. More details available if needed.

I saw something similar, but not identical, when trying to boot a -current kernel made last night. I saw the (noperiph...) message you saw. After that, the machine didn't panic, but it didn't work very well, either. It did, after a few seconds, detect the SCSI tape drive I had (sa0), but failed on detecting the SCSI disk and CDROM, repeatedly timing out and resetting the bus. Alas, I didn't have the presence of mind to write down the exact messages; I'll try to do that tonight, assuming the bug is still present in the src I'm cvsupping now.

This was on an SMP box (Tyan Thunder 100GX), with an aic7895 SCSI controller, and the following three SCSI devices:

sa0 at ahc0 bus 0 target 0 lun 0
sa0: SONY SDT-7000 0300 Removable Sequential Access SCSI-2 device
sa0: 10.000MB/s transfers (10.000MHz, offset 15)
Mounting root from ufs:/dev/da0s2a
da0 at ahc0 bus 0 target 6 lun 0
da0: QUANTUM ATLAS IV 9 WLS 0707 Fixed Direct Access SCSI-3 device
da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 8761MB (17942584 512 byte sectors: 255H 63S/T 1116C)
cd0 at ahc0 bus 0 target 1 lun 0
cd0: TOSHIBA CD-ROM XM-6401TA 1009 Removable CD-ROM SCSI-2 device
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present
Re: db 1.85 -- 2.x or 3.x?
In servalan.mailinglist.fbsd-current Brad Knowles writes:

Besides, don't we use gcc as the system-standard compiler, and doesn't this likewise infect everything compiled on FreeBSD with the GPL?

No, because none of the gcc code appears in the resulting binary. The binary does include the "libgcc" code, but that code is specifically exempted from the GPL. Programs that link against the Berkeley DB 2.x library, however, will end up including the DB code, and thus end up including code covered by the 2.x licence.

[Note: of course, if you link against a shared library, the actual code from the library doesn't appear in your binary. It seems to be the general consensus opinion that the courts would treat this the same as the static linking case, i.e. your binary would be covered under the licence "as if" you had statically linked against the relevant library, but I don't know if this has ever been tested in court anywhere. If you're in a situation where the legalities really matter, you should probably ask a real lawyer instead of relying on the semi-informed opinion of people posting to mailing lists.]