Re: hard lock-up writing to tape
On Wednesday 19 November 2003 02:15 pm, Bruce Evans wrote:
>
> Anyway, the stuff to the left of the slash in the above is the list
> of active consoles and the stuff to the right of the slash is the
> list of possible consoles. You have to move stuff from one list to
> the other. I vaguely remember that this is done using '-' to delete
> things from the left hand list and something more direct to add them.

You remember correctly. Thanks for the info. However, I think I'm
going to have to throw in the towel on this. When I swap the console
output using the kern.console sysctl, I can get user application
console output to appear on the remote machine - just nothing from
the kernel. For example, if I 'echo hello > /dev/console', hello
will appear on the remote machine. But I never see any of the
bold face messages, such as the very frequent:

checking stopevent 2 with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc6b6faa8) locked @ /disk2/src/sys/kern/subr_trap.c:260

When I tried to generate a break using ~# from tip to drop into the
debugger, nothing happened, so I don't think the serial console is
fully connected in the kernel, even though the bold-face output
disappeared from the syscons console.

I do have an extra bit of information on my original tape lock-up
problem. At one point, when I thought I had the remote console
working and was reproducing the problem, the tape backup worked
fine until I pinged the machine. I think the machine responded
to the ping and then that was it. It locked up solid.

mike
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: hard lock-up writing to tape
On Wed, 19 Nov 2003, Mike Durian wrote:

> On Tuesday 18 November 2003 08:29 pm, Bruce Evans wrote:
> > - -current has the kern.console sysctl for enabling multiple consoles
> >   (but only 1 sio one). You can boot with a syscons console and then
> >   enable the serial, and the latter should work if it is on a working
> >   port to begin with. Anyway, this sysctl shows which sio port can be
> >   a console, if any.
>
> Is there any documentation on this sysctl? I'm not sure what I
> should set it to. After a normal boot, it reads:

Only in the source code.

> kern.console: consolectl,/ttyd1,consolectl,

Not even the bug that syscons's consolectl device is printed here is
documented (the actual syscons console is on /dev/ttyv0, but this
bogusly shares a tty struct with /dev/consolectl and many things cannot
tell the difference. This bug also messes up the columns in pstat -t,
since consolectl is too wide to fit).

Anyway, the stuff to the left of the slash in the above is the list
of active consoles and the stuff to the right of the slash is the
list of possible consoles. You have to move stuff from one list to
the other. I vaguely remember that this is done using '-' to delete
things from the left hand list and something more direct to add them.

Bruce
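The layout of the kern.console value described above (active consoles left of the slash, possible consoles right of it) can be sketched as a small sh helper. This is only an illustration: the function names are made up here, and the sample value is the one Mike quoted. The commented sysctl lines at the end are an assumption based on Bruce's recollection that a leading '-' deletes a name from the active list; the bare-name form for adding one is likewise assumed and should be checked against the source for your tree.

```shell
#!/bin/sh
# Split a kern.console value such as "consolectl,/ttyd1,consolectl,"
# into its two comma-separated lists.  Function names are hypothetical.

# Print the active consoles (everything left of the slash), one per line.
active_consoles() {
    printf '%s\n' "${1%%/*}" | tr ',' '\n' | sed '/^$/d'
}

# Print the possible consoles (everything right of the slash), one per line.
possible_consoles() {
    printf '%s\n' "${1#*/}" | tr ',' '\n' | sed '/^$/d'
}

value='consolectl,/ttyd1,consolectl,'   # value quoted in this thread

echo "active:"
active_consoles "$value"
echo "possible:"
possible_consoles "$value"

# Assumed usage on the FreeBSD machine itself (not executed here):
#   sysctl kern.console=ttyd1         # add ttyd1 to the active list
#   sysctl kern.console=-consolectl   # remove consolectl from the active list
```

On a live system the value would come from `sysctl -n kern.console` rather than a hard-coded string.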
Re: hard lock-up writing to tape
On Tuesday 18 November 2003 08:29 pm, Bruce Evans wrote:
>
> This could be from a speed mismatch or from kern.consmute somehow getting
> set.

I had wondered about a speed mismatch, but everything I've found
says 9600. I did not know to look at kern.consmute. I'll check that.

> - -current has the kern.console sysctl for enabling multiple consoles
>   (but only 1 sio one). You can boot with a syscons console and then
>   enable the serial, and the latter should work if it is on a working
>   port to begin with. Anyway, this sysctl shows which sio port can be
>   a console, if any.

Is there any documentation on this sysctl? I'm not sure what I
should set it to. After a normal boot, it reads:

kern.console: consolectl,/ttyd1,consolectl,

> - RELENG_4 and -current have the machdep.conspeed sysctl for setting the
>   console speed.

That is the expected 9600.

mike
Re: hard lock-up writing to tape
On Tue, 18 Nov 2003, Mike Durian wrote:

> On Monday 17 November 2003 04:41 pm, Mike Durian wrote:
> >
> > I was finally able to get some partial success by setting flag 0x30
> > for sio1. When I'd boot, I'd get console messages on my remote
> > tip session. However, I'd only receive those messages printed
> > from user-level applications. I would not see any of the bold-face
> > messages from the kernel.
>
> I'm still stumbling with the remote serial console. Can someone
> who does this often test and verify they can use COM2 as the
> serial console - and then tell me what you did.

Moving the 0x10 flag from sio0 to sio1 should be sufficient for the
kernel part. Setting the 0x20 flag for sio1 together with the 0x10
flag should mainly save having to edit the flag for sio0. If the
kernel's serial console is the same as the boot blocks', then it
should use the same speed as the boot blocks set it to. Otherwise
there may be a speed mismatch.

> The best I can manage is described above and then I get neither
> the bold kernel messages nor the debugger prompt.

This could be from a speed mismatch or from kern.consmute somehow getting
set.

Some of this stuff can be configured after booting:
- RELENG_4 has non-broken boot-time configuration which allows changing
  during the boot.
- -current has the kern.console sysctl for enabling multiple consoles
  (but only 1 sio one). You can boot with a syscons console and then
  enable the serial, and the latter should work if it is on a working
  port to begin with. Anyway, this sysctl shows which sio port can be
  a console, if any.
- RELENG_4 and -current have the machdep.conspeed sysctl for setting the
  console speed.

Bruce
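Since several flag values (0x10, 0x20, 0x30, 0x80) come up in this thread, a rough sh sketch of how the console-related bits of an sio flags value decompose may help. The bit labels are an assumption taken from memory of the sio(4) manual page (0x10 potential console, 0x20 forced console, 0x40 reserved for low-level I/O, 0x80 remote kernel debugging port) and should be checked against the man page for your release. The function name is hypothetical.

```shell
#!/bin/sh
# Decode the console-related bits of an sio(4) flags value.
# ASSUMPTION: bit meanings are as recalled from sio(4); verify locally.
decode_sio_flags() {
    flags=$(( $1 ))          # accepts hex such as 0x30
    out=""
    [ $(( flags & 0x10 )) -ne 0 ] && out="$out console-capable"
    [ $(( flags & 0x20 )) -ne 0 ] && out="$out forced-console"
    [ $(( flags & 0x40 )) -ne 0 ] && out="$out reserved-low-level-io"
    [ $(( flags & 0x80 )) -ne 0 ] && out="$out remote-debug-port"
    echo "${out# }"          # strip the leading space
}

# The values mentioned in this thread.
for f in 0x10 0x30 0x80; do
    printf '%s: %s\n' "$f" "$(decode_sio_flags "$f")"
done
```

Under that reading, 0x30 is 0x10 and 0x20 combined, which matches the description above of setting both flags on sio1.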
Re: hard lock-up writing to tape
On Monday 17 November 2003 04:41 pm, Mike Durian wrote:
>
> I was finally able to get some partial success by setting flag 0x30
> for sio1. When I'd boot, I'd get console messages on my remote
> tip session. However, I'd only receive those messages printed
> from user-level applications. I would not see any of the bold-face
> messages from the kernel.

I'm still stumbling with the remote serial console. Can someone
who does this often test and verify they can use COM2 as the
serial console - and then tell me what you did.

The best I can manage is described above and then I get neither
the bold kernel messages nor the debugger prompt.

mike
Re: hard lock-up writing to tape
On Monday 17 November 2003 02:09 pm, Doug White wrote:
>
> Set flag 0x80 on sio1 and take it off of sio0. That's what the kernel uses
> to decide which port to use. The BOOT_COMCONSOLE_PORT is used by loader
> only.

I was finally able to get some partial success by setting flag 0x30
for sio1. When I'd boot, I'd get console messages on my remote
tip session. However, I'd only receive those messages printed
from user-level applications. I would not see any of the bold-face
messages from the kernel.

I tried dropping into the kernel debugger when the machine was not
hung. The machine would immediately become unresponsive, as you'd
expect if it was stopped in the debugger, but I never got any prompt
on the serial console. I couldn't type anything on the serial console
to make anything happen either.

Are there some hard-coded assumptions in the kernel that force
a remote serial console to only work on sio0?

Until I can get this working, I'm not going to be much help providing
the trace backs needed to debug the tape write lock-up.

mike
Re: hard lock-up writing to tape
On Mon, 17 Nov 2003, Mike Durian wrote:

> On Monday 17 November 2003 10:50 am, Doug White wrote:
> >
> > To debug this, you will need to set up a serial console with some special
> > kernel options. Instructions for booting with serial console are in the
> > Handbook, but you will have to compile with the following kernel options:
>
> Is there a trick to setting up a serial console on sio1? My line
> drivers are fried on sio0 and I only have sio1, sio4 and sio5 available
> for use.

Set flag 0x80 on sio1 and take it off of sio0. That's what the kernel uses
to decide which port to use. The BOOT_COMCONSOLE_PORT is used by loader
only.

--
Doug White         | FreeBSD: The Power to Serve
[EMAIL PROTECTED]  | www.FreeBSD.org
Re: hard lock-up writing to tape
On Monday 17 November 2003 10:50 am, Doug White wrote:
>
> To debug this, you will need to set up a serial console with some special
> kernel options. Instructions for booting with serial console are in the
> Handbook, but you will have to compile with the following kernel options:

Is there a trick to setting up a serial console on sio1? My line
drivers are fried on sio0 and I only have sio1, sio4 and sio5 available
for use.

I set BOOT_COMCONSOLE_PORT= 0x2F8 in /etc/make.conf, rebuilt sys/boot
and installed. I put the new boot blocks on disk using
bsdlabel -B /dev/ad0s2. I edited /boot/device.hints and changed
hint.sio.0.flags="0x10" to hint.sio.1.flags="0x10". I also tried
statically compiling the hints into the kernel.

Now when I boot and use -h or set console=comconsole in loader,
the console flips away from the vidconsole as expected, but doesn't
go to sio1. At least not so I can tell. I've got a null-modem
connecting sio1 to a tip session on another machine. I've verified
the connection is good because I can tip between the two machines
manually.

What am I missing?

mike
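When editing hints as described above, it can be handy to list which sio units actually carry a flags hint. The following sh sketch does that. The function name is hypothetical, and it assumes the flags lines are written in the same `hint.sio.N.flags="..."` form quoted in this message.

```shell
#!/bin/sh
# List "sioN FLAGS" for every hint.sio.N.flags line in a hints file.
# ASSUMPTION: lines look like hint.sio.1.flags="0x10", as quoted in the thread.
sio_hint_flags() {
    sed -n 's/^hint\.sio\.\([0-9][0-9]*\)\.flags="\(.*\)"$/sio\1 \2/p' "$1"
}

# Only consult the real file when it exists (i.e. on a FreeBSD machine).
if [ -r /boot/device.hints ]; then
    sio_hint_flags /boot/device.hints
fi
```

After the edit Mike describes, a file containing `hint.sio.1.flags="0x10"` would be reported as `sio1 0x10`.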
Re: hard lock-up writing to tape
On Sun, 16 Nov 2003, Mike Durian wrote:

> I'm using -current cvsup'd as of Nov 15, 2003. When I try to do a
> dump or run the btape (fill command) program from bacula, my machine
> will lock up hard. Doesn't respond to ping. No access to kernel
> debugger. Num lock doesn't come on.

Sounds like a Giant deadlock.

dwhite's Form Letter on Debugging Giant Deadlocks

If you are experiencing problems with CURRENT locking up hard, it may be
due to a deadlock against the Giant mutex, which controls large parts of
the kernel. Symptoms include:

. No response to any input
. System video console
. Network (ping)

To debug this, you will need to set up a serial console with some special
kernel options. Instructions for booting with serial console are in the
Handbook, but you will have to compile with the following kernel options:

options DDB
options BREAK_TO_DEBUGGER
options WITNESS
options INVARIANTS
options INVARIANTS_SUPPORT

Make sure your serial console is capable of sending a Break signal. If
not, use "ALT_BREAK_TO_DEBUGGER" instead of "BREAK_TO_DEBUGGER".

Enable the serial console and boot the system. Turn on terminal logging.
In loader, stop the boot and type "boot -v" at the OK prompt to get
additional info during the boot process.

Once the system is up, trigger the hang. When the system hangs, issue the
Break signal (or if you have used ALT_BREAK_TO_DEBUGGER, press
Enter ~ ^E b (tilde, Ctrl-E, b)). If you get the db> prompt, then your
hang is probably due to a Giant deadlock. If not, then something else may
be at fault.

Once in db>, run the following two commands and capture their output
using your terminal's logging capability:

show locks
tr

Take these and the boot -v output, put them on a webpage, and send a
message to [EMAIL PROTECTED] carefully explaining what you did to
trigger the hang.

Good luck!

>
> I can perform a dump or run the btape fill program when in single
> user mode, but in multi-user the machine will only stay up for
> a short while before locking.
>
> This has been happening since I got the tape system (Sparcstorage
> Library) about 3-4 weeks ago. I don't know how long the problem
> existed before then as I didn't have a tape system to use.
>
> I've tried two types of SCSI cards: Adaptec 2930 and ASUS PCI-SC200
> (sym(4) device). Both behave the same.
>
> I wonder if it could be network or interrupt related. In single
> user mode, the network interface is not up.
>
> Dmesg from my system follows:
> Copyright (c) 1992-2003 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD 5.1-CURRENT #57: Sat Nov 15 15:50:50 MST 2003
> [EMAIL PROTECTED]:/disk2/obj/disk2/src/sys/BOOGIE
> Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a93000.
> Preloaded elf module "/boot/kernel/linux.ko" at 0xc0a931f4.
> Preloaded elf module "/boot/kernel/snd_pcm.ko" at 0xc0a932a0.
> Preloaded elf module "/boot/kernel/snd_via82c686.ko" at 0xc0a9334c.
> Preloaded elf module "/boot/kernel/sym.ko" at 0xc0a93400.
> Preloaded elf module "/boot/kernel/nvidia.ko" at 0xc0a934a8.
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: AMD Athlon(tm) processor (1002.28-MHz 686-class CPU)
> Origin = "AuthenticAMD" Id = 0x642 Stepping = 2
>
> Features=0x183f9ff
> AMD Features=0xc044
> real memory = 1073676288 (1023 MB)
> avail memory = 1033502720 (985 MB)
> Pentium Pro MTRR support enabled
> npx0: [FAST]
> npx0: on motherboard
> npx0: INT 16 interface
> acpi0: on motherboard
> pcibios: BIOS version 2.10
> Using $PIR table, 8 entries at 0xc00fde30
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
> acpi_cpu0: on acpi0
> acpi_button0: on acpi0
> pcib0: port
> 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
> pci0: on pcib0
> pcib0: slot 7 INTD is routed to irq 11
> pcib0: slot 7 INTD is routed to irq 11
> pcib0: slot 7 INTC is routed to irq 10
> pcib0: slot 9 INTA is routed to irq 9
> pcib0: slot 9 INTA is routed to irq 9
> pcib0: slot 9 INTA is routed to irq 9
> pcib0: slot 9 INTA is routed to irq 9
> pcib0: slot 10 INTA is routed to irq 10
> pcib0: slot 11 INTA is routed to irq 11
> pcib0: slot 12 INTA is routed to irq 10
> pcib0: slot 13 INTA is routed to irq 11
> agp0: mem
> 0xd000-0xd7ff at device 0.0 on pci0
> pcib1: at device 1.0 on pci0
> pci1: on pcib1
> pcib0: slot 1 INTA is routed to irq 5
> pcib1: slot 0 INTA is routed to irq 5
> nvidia0: mem
> 0xd800-0xdfff,0xe000-0xe0ff irq 5 at device 0.0 on pci1
> isab0: at device 7.0 on pci0
> isa0: on isab0
> atapci0: port 0xa000-0xa00f at device 7.1 on
> pci0
> atapci0: Correcting VIA config for southbridge data corruption bug
> ata0: at 0x1f0 irq 14 on atapci0
> ata0: [MPSAFE]
> ata1: at 0x170 irq 15 on atapci0
> ata1: [M
hard lock-up writing to tape
I'm using -current cvsup'd as of Nov 15, 2003. When I try to do a
dump or run the btape (fill command) program from bacula, my machine
will lock up hard. Doesn't respond to ping. No access to kernel
debugger. Num lock doesn't come on.

I can perform a dump or run the btape fill program when in single
user mode, but in multi-user the machine will only stay up for
a short while before locking.

This has been happening since I got the tape system (Sparcstorage
Library) about 3-4 weeks ago. I don't know how long the problem
existed before then as I didn't have a tape system to use.

I've tried two types of SCSI cards: Adaptec 2930 and ASUS PCI-SC200
(sym(4) device). Both behave the same.

I wonder if it could be network or interrupt related. In single
user mode, the network interface is not up.

Dmesg from my system follows:
Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #57: Sat Nov 15 15:50:50 MST 2003
    [EMAIL PROTECTED]:/disk2/obj/disk2/src/sys/BOOGIE
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a93000.
Preloaded elf module "/boot/kernel/linux.ko" at 0xc0a931f4.
Preloaded elf module "/boot/kernel/snd_pcm.ko" at 0xc0a932a0.
Preloaded elf module "/boot/kernel/snd_via82c686.ko" at 0xc0a9334c.
Preloaded elf module "/boot/kernel/sym.ko" at 0xc0a93400.
Preloaded elf module "/boot/kernel/nvidia.ko" at 0xc0a934a8.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) processor (1002.28-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x642 Stepping = 2
  Features=0x183f9ff
  AMD Features=0xc044
real memory = 1073676288 (1023 MB)
avail memory = 1033502720 (985 MB)
Pentium Pro MTRR support enabled
npx0: [FAST]
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 8 entries at 0xc00fde30
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
acpi_cpu0: on acpi0
acpi_button0: on acpi0
pcib0: port 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0: on pcib0
pcib0: slot 7 INTD is routed to irq 11
pcib0: slot 7 INTD is routed to irq 11
pcib0: slot 7 INTC is routed to irq 10
pcib0: slot 9 INTA is routed to irq 9
pcib0: slot 9 INTA is routed to irq 9
pcib0: slot 9 INTA is routed to irq 9
pcib0: slot 9 INTA is routed to irq 9
pcib0: slot 10 INTA is routed to irq 10
pcib0: slot 11 INTA is routed to irq 11
pcib0: slot 12 INTA is routed to irq 10
pcib0: slot 13 INTA is routed to irq 11
agp0: mem 0xd000-0xd7ff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pcib0: slot 1 INTA is routed to irq 5
pcib1: slot 0 INTA is routed to irq 5
nvidia0: mem 0xd800-0xdfff,0xe000-0xe0ff irq 5 at device 0.0 on pci1
isab0: at device 7.0 on pci0
isa0: on isab0
atapci0: port 0xa000-0xa00f at device 7.1 on pci0
atapci0: Correcting VIA config for southbridge data corruption bug
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
uhci0: port 0xa400-0xa41f irq 11 at device 7.2 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xa800-0xa81f irq 11 at device 7.3 on pci0
usb1: on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
viapropm0: SMBus I/O base at 0x5000
viapropm0: port 0x5000-0x500f at device 7.4 on pci0
viapropm0: SMBus revision code 0x40
smbus0: on viapropm0
smb0: on smbus0
pcm0: port 0xb400-0xb403,0xb000-0xb003,0xac00-0xacff irq 10 at device 7.5 on pci0
pcm0:
ohci0: mem 0xe3006000-0xe3006fff irq 9 at device 9.0 on pci0
usb2: OHCI version 1.0, legacy support
usb2: on ohci0
usb2: USB revision 1.0
uhub2: (0x11c1) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 1 port with 1 removable, self powered
ohci1: mem 0xe3007000-0xe3007fff irq 9 at device 9.1 on pci0
usb3: OHCI version 1.0, legacy support
usb3: on ohci1
usb3: USB revision 1.0
uhub3: (0x11c1) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 1 port with 1 removable, self powered
ohci2: mem 0xe3004000-0xe3004fff irq 9 at device 9.2 on pci0
usb4: OHCI version 1.0, legacy support
usb4: on ohci2
usb4: USB revision 1.0
uhub4: (0x11c1) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub4: 1 port with 1 removable, self powered
ohci3: mem 0xe3005000-0xe3005fff irq 9 at device 9.3 on pci0
usb5: OHCI version 1.0, legacy support
usb5: on ohci3
usb5: USB revision 1.0
uhub5: (0x11c1) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub5: 1 port with 1 removable, self powered
puc0: port 0xcc00-0xcc0f,0xc800-0xc807,0xc400-0xc407,0xc000-0xc007,0xbc00-0xbc07,0xb800-0xb807 irq 10 at