Re: 6.0 random freezes
i've only used the generic 6.0 kernel

# kgdb kernel.debug /var/crash/vmcore.1
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as i386-marcel-freebsd.

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x10
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0a7cf08
stack pointer = 0x28:0xd56a694c
frame pointer = 0x28:0x0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 29 (swi1: net)
trap number = 12
panic: page fault
Uptime: 1d23h40m51s
Dumping 511 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 511MB (130800 pages) 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) 191 175 159 143 127 111 95 79 63 47 31 15

#0 doadump () at pcpu.h:165
165 pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0 doadump () at pcpu.h:165
#1 0xc0638202 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2 0xc0638498 in panic (fmt=0xc084e5a2 %s) at /usr/src/sys/kern/kern_shutdown.c:555
#3 0xc0807c30 in trap_fatal (frame=0xd56a690c, eva=16) at /usr/src/sys/i386/i386/trap.c:831
#4 0xc080799b in trap_pfault (frame=0xd56a690c, usermode=0, eva=16) at /usr/src/sys/i386/i386/trap.c:742
#5 0xc08075d9 in trap (frame=
      {tf_fs = -1038680056, tf_es = 40, tf_ds = 40, tf_edi = 0, tf_esi = -646886620, tf_ebp = 0, tf_isp = -714446536, tf_ebx = -646862464, tf_edx = 791735, tf_ecx = -1073475471, tf_eax = 1, tf_trapno = 12, tf_err = 0, tf_eip = -1062744312, tf_cs = 32, tf_eflags = 66050, tf_esp = 16798208, tf_ss = 0}) at /usr/src/sys/i386/i386/trap.c:432
#6 0xc07f6dca in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7 0xc0a7cf08 in ?? ()

On 12/13/05, Peter Jeremy [EMAIL PROTECTED] wrote:
> On Tue, 2005-Dec-13 13:43:13 -0400, fredthetree wrote:
> > [/var/crash/vmcore.1] --
> > Unread portion of the kernel message buffer:
> > Fatal trap 12: page fault while in kernel mode
> > fault virtual address = 0x10
>
> That's a NULL pointer de-reference - it Shouldn't Happen(TM). Can you get a backtrace from kgdb (where)?
>
> > [vmcore.0] --
> > Unread portion of the kernel message buffer:
> > ÃwÄ0¿Á0ÂÁÀíÁÀJðÂÄüÂ3ÄÓÂÀíÁÀóÂDþÁÀóÂðCÂÀíÁ1Ä ÃðÚÃÀíÁ´ÂÄBð°ÄÁÀíÁ[EMAIL PROTECTED]@
> > --
>
> The most likely problem is that your vmcore file doesn't match your kernel. Are you running kgdb with the same kernel as was running when the system crashed? (If you don't have that kernel handy, you might as well delete vmcore.0).
> --
> Peter Jeremy

___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: 6.0 random freezes
On Wed, 2005-Dec-14 08:28:26 -0400, fredthetree wrote:
> i've only used the generic 6.0 kernel
>
> # kgdb kernel.debug /var/crash/vmcore.1
> ...
> #6 0xc07f6dca in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #7 0xc0a7cf08 in ?? ()

Unfortunately, it's frame 7 and below that is crucial. Was #7 the last line or did you cut the backtrace off? The top frames are the kernel handling the trap. It looks like the trap occurred in a KLD - in this case, try running:

# cd /usr/obj/usr/src/sys/GENERIC    (or name of kernel config)
# make gdbinit    [this just copies a few config files for kgdb]
# gdb kernel.debug /var/crash/vmcore.1
(kgdb) kldsyms
(kgdb) where

Hopefully this will decode #7 and you can provide a few more frames.
--
Peter Jeremy
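The guess that the trap occurred in a KLD can be cross-checked by comparing the unresolved frame address against the address ranges of the loaded modules. The following is only a sketch: `find_kld` is a hypothetical helper (not part of FreeBSD), and it assumes `kldstat` prints its usual Id/Refs/Address/Size/Name columns with the size in hex and no `0x` prefix.

```shell
#!/bin/sh
# find_kld ADDRESS: read kldstat-style output on stdin and print the name of
# the module whose [Address, Address + Size) range contains ADDRESS.
# Hypothetical helper for illustration; assumes 64-bit shell arithmetic.
find_kld() {
    target=$(($1))
    while read -r id refs addr size name; do
        # Skip the header line and anything that is not a module row.
        case "$addr" in
            0x*) ;;
            *) continue ;;
        esac
        start=$(($addr))
        end=$((start + 0x$size))
        if [ "$target" -ge "$start" ] && [ "$target" -lt "$end" ]; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# Usage (on the machine that crashed, with the same modules loaded):
#   kldstat | find_kld 0xc0a7cf08
```

The result is only meaningful if the same modules are loaded at the same addresses as when the dump was taken; `kldsyms` inside kgdb remains the authoritative way to decode the frame.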
Re: 6.0 random freezes
Peter Jeremy said the following on 12/13/05 02:00:
> Note that PS/2 keyboards aren't hot-pluggable and attempts to do so can have deleterious effects on your keyboard and/or motherboard. In any case, the probe/attach sequence relies on the kernel being in a reasonably sane state (and I'm not sure if it will detect the keyboard as a console device except at boot time).

I agree, but the keyboard is a passive device (with no power source, i.e. mostly harmless), and it's standard practice to have only a few movable consoles for several racks and plug them in only where necessary. It has always worked for us and I don't remember having any hot-plugging accidents in years.

> If the keyboard has been plugged in since the system booted, do you still get the same no response? If so, the kernel has wedged at a fairly low level and I'm not quite sure how to proceed other than by enabling the sanity checks that other people have mentioned (eg WITNESS, INVARIANTS) and hoping they catch something.

I cannot say for sure. When the thing happens I'm usually away, and by the time I get there, the console could have been used by someone. I'm in the process of getting a serial console, so if there's no response there either, I will enable the sanity checks.

> I only mentioned serial consoles on the off-chance that you had one. Whilst it may not help here, serial consoles have a number of advantages when managing remote equipment

Thanks for pointing this out. As I said, I'm in the process of getting one for now, and possibly equipping some dozens of servers with them later.

> > After the downgrade we could eventually set a test bed and start hammering it with requests. The problem would be how to trigger the crash and whether we would be able to reproduce it at all.

I already went the 5.4 downgrade way. Actually I was forced to do so the other night, when one of the machines started hanging every half hour or so. Looks like the background fsck on the slower SATA based RAID5 array helped a lot with that.

Now I have the test bed online. This is the very same server (SCSI based, with the OS drive intact and production data drives moved elsewhere) that was crashing once a day or so. Hopefully tomorrow I will have a serial console attached to it, so we can start pounding it. I hope this machine won't need to go into production during the next month or so and we'll have enough time for tests.

> Depending on your application and the interfaces to it, it might be feasible to either tee live traffic into both systems and just junk the responses from your test bed, or record live traffic and replay it into your test bed.

It runs a fairly complex set of services. It's a shared web hosting server handling some hundreds of websites, and also email (SMTP/POP3/IMAP), databases (MySQL), FTP, DNS, etc. I don't know how easy it would be to implement such traffic gathering and replay it on the test bed. It seems kind of complicated at first sight (though I realize it might be the only way to reproduce the crash). We might need some NAT (via ipfw?), some services might not like their responses being junked, etc.

I was thinking about trying the kernel stress suite first. Or just have something rsync-ing lots of files back and forth (possibly over the network), run apache bench in a loop and point it to some database intensive page, etc.

Regards,
Atanas
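The "run apache bench in a loop" idea could be sketched roughly as below. This is an illustration only: `repeat_rounds` is a hypothetical helper, and the URL and `ab` parameters in the usage comment are made-up placeholders rather than values from this thread.

```shell
#!/bin/sh
# repeat_rounds N COMMAND [ARGS...]: run COMMAND up to N times and stop at
# the first failure, reporting which round failed.
# Hypothetical helper for illustration only.
repeat_rounds() {
    rounds=$1
    shift
    i=1
    while [ "$i" -le "$rounds" ]; do
        if ! "$@" >/dev/null 2>&1; then
            echo "round $i failed: $*"
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $rounds rounds"
}

# Usage (placeholder URL and request counts):
#   repeat_rounds 1000 ab -n 1000 -c 20 http://testbed.example/heavy-page.php
```

Running the loop from a second machine, with its output logged there, means the number of the last completed round survives even if the test bed wedges completely.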
Re: 6.0 random freezes
Atanas said the following on 12/12/05 18:57:
> Peter Jeremy said the following on 12/12/05 13:40:
> > When it hangs, break into DDB (Ctrl-Alt-Esc on the console or BREAK on a serial console).
>
> But if I have no keyboard response I won't be able to save it, right?

(replying to myself)

This is exactly what I was afraid would happen. The SATA based box just hung up again, with all of the kernel debugging options in place:

makeoptions DEBUG=-g
options KDB
options DDB

But I wasn't able to do anything with the keyboard in order to save a crashdump, so I had no choice other than hitting the reset button.

Regards,
Atanas
Re: 6.0 random freezes
On Mon, 2005-Dec-12 18:57:24 -0800, Atanas wrote:
> When I plug a keyboard, there's no response at all - no LEDs, no VTYs, Ctrl-Alt-Esc, etc. You might think of hint.atkbd.0.flags not being set properly, but it's right (i.e. unchanged, it appears to default to that on i386 5.x+) and other machines with identical configuration do accept keyboard.

Note that PS/2 keyboards aren't hot-pluggable and attempts to do so can have deleterious effects on your keyboard and/or motherboard. In any case, the probe/attach sequence relies on the kernel being in a reasonably sane state (and I'm not sure if it will detect the keyboard as a console device except at boot time).

If the keyboard has been plugged in since the system booted, do you still get the same no response? If so, the kernel has wedged at a fairly low level and I'm not quite sure how to proceed other than by enabling the sanity checks that other people have mentioned (eg WITNESS, INVARIANTS) and hoping they catch something.

> the next crash. But if I have no keyboard response I won't be able to save it, right?

True. But DDB has been designed to rely on a fairly minimal subset of kernel functionality and often works, even though the system appears frozen.

> I do not know what a serial console is and would need some time to get along with it. Would I get something in addition to what I can get from the standard console?

I only mentioned serial consoles on the off-chance that you had one. Whilst it may not help here, serial consoles have a number of advantages when managing remote equipment (ie equipment not sitting on your desk) - you can access the console remotely and log all console output on another computer (either cross-connect com1/com2 on pairs of hosts or get a multi-port serial card to manage a number of systems). If your BIOS can handle serial communications, you virtually never need to physically visit your servers. For details, check out:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/serialconsole-setup.html

I personally use ports/comms/conserver-com to manage about 50 assorted Unix/FreeBSD servers and switches at work.

> The dumpdev variable seems to default to AUTO, i.e. trying to use the first swap device if it's bigger than the RAM (in my case yes), so I guess I don't need to touch it.

It seems that my suggestion has been obsoleted by changes to the startup scripts since I checked last.

> After the downgrade we could eventually set a test bed and start hammering it with requests. The problem would be how to trigger the crash and whether we would be able to reproduce it at all.

Depending on your application and the interfaces to it, it might be feasible to either tee live traffic into both systems and just junk the responses from your test bed, or record live traffic and replay it into your test bed.
--
Peter Jeremy
Re: 6.0 random freezes
On 12/13/05, fredthetree [EMAIL PROTECTED] wrote:
On 12/12/05, Peter Jeremy [EMAIL PROTECTED] wrote:
> On Mon, 2005-Dec-12 22:21:52 -0400, fredthetree wrote:
> > I just wanted to chime in and say I've had my 6.0-RELEASE #0 freeze up twice in the past few days. never once had it happen with 5.x. everything locks, no keyboard response, no mouse, and after several minutes it reboots itself, and savecore starts up during boot..
>
> This suggests you've had a panic (or something that develops into one). If you've got a crashdump, you can probably get enough information out of it for people to get an idea of what is wrong. Please have a look at:
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebg-gdb.html
> --
> Peter Jeremy

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-gdb.html
(fixed the link)

does the following help?

[/var/crash/vmcore.1] --
Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x10
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0a7cf08
stack pointer = 0x28:0xd56a694c
frame pointer = 0x28:0x0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 29 (swi1: net)
trap number = 12
panic: page fault
Uptime: 1d23h40m51s
Dumping 511 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 511MB (130800 pages) 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) 191 175 159 143 127 111 95 79 63 47 31 15

#0 doadump () at pcpu.h:165
165 pcpu.h: No such file or directory.
        in pcpu.h
--

[vmcore.0] --
Unread portion of the kernel message buffer:
ÃwÄ0¿Á0ÂÁÀíÁÀJðÂÄüÂ3ÄÓÂÀíÁÀóÂDþÁÀóÂðCÂÀíÁ1Ä ÃðÚÃÀíÁ´ÂÄBð°ÄÁÀíÁ[EMAIL PROTECTED]@
--
(after this text displays, i am unable to view the kgdb prompt... type exit [return] three times and i get back to the shell...)
Re: 6.0 random freezes
On Tue, 13 Dec 2005, Atanas wrote:
> Atanas said the following on 12/12/05 18:57:
> > Peter Jeremy said the following on 12/12/05 13:40:
> > > When it hangs, break into DDB (Ctrl-Alt-Esc on the console or BREAK on a serial console).
> >
> > But if I have no keyboard response I won't be able to save it, right?
>
> (replying to myself)
>
> This is exactly what I was afraid would happen. The SATA based box just hung up again, with all of the kernel debugging options in place:
>
> makeoptions DEBUG=-g
> options KDB
> options DDB
>
> But I wasn't able to do anything with the keyboard in order to save a crashdump, so I got no other choices than hitting the reset button.
>
> Regards,
> Atanas

I posted this same problem recently. My latest attempt to troubleshoot the freezes was to snatch the SATA card out of the box. The machine has been running without any problems since. That was six days, 12 hours ago - the longest uptime I've had since I upgraded the machine to version 6.

The only reason I added the SATA controller to the box was to set up a gmirror to make backups to, and since the machine kept freezing all the time I couldn't make decent backups anyway, so removing the SATA card didn't change the machine's duties at all.

The reason I suspected the card might be the problem is that I installed it at the same time I upgraded to FreeBSD 6 (when the problems started), the card only cost $20, and the drives attached to the card were getting corrupted during crashes even though they weren't in use. The card was a SYBA SD-SATA-4P. I've also got an rl0 ethernet interface on the PCI bus, and I wondered if it might be some kind of bus-mastering conflict or something.

I still don't know if the problem was bad cheap hardware, bad interactions between cheap hardware, or a software problem. I suppose downgrading to 5.4 again might give some clues, but I don't really want to do that right now since the system finally seems to be stable again, albeit without my large disk array.

This may easily have nothing to do with your problem, since we have so little information to go on, but the symptoms sound the same.
Re: 6.0 random freezes
On Tue, 2005-Dec-13 13:43:13 -0400, fredthetree wrote:
> [/var/crash/vmcore.1] --
> Unread portion of the kernel message buffer:
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0x10

That's a NULL pointer de-reference - it Shouldn't Happen(TM). Can you get a backtrace from kgdb (where)?

> [vmcore.0] --
> Unread portion of the kernel message buffer:
> ÃwÄ0¿Á0ÂÁÀíÁÀJðÂÄüÂ3ÄÓÂÀíÁÀóÂDþÁÀóÂðCÂÀíÁ1Ä ÃðÚÃÀíÁ´ÂÄBð°ÄÁÀíÁ[EMAIL PROTECTED]@
> --

The most likely problem is that your vmcore file doesn't match your kernel. Are you running kgdb with the same kernel as was running when the system crashed? (If you don't have that kernel handy, you might as well delete vmcore.0).
--
Peter Jeremy
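Reading a fault address of 0x10 as a NULL pointer dereference rests on the address being a tiny offset from zero, which is what you would expect when a structure member is read through a NULL pointer. A rough sketch of that rule of thumb follows; `looks_like_null_deref` is a hypothetical helper, and the 4096-byte threshold (one i386 page) is an assumption chosen for illustration.

```shell
#!/bin/sh
# looks_like_null_deref ADDRESS: succeed if the fault address falls within
# the first page (0 .. 4095), i.e. a small offset from a NULL base pointer.
# Hypothetical helper; the one-page threshold is an illustrative assumption.
looks_like_null_deref() {
    addr=$(($1))
    [ "$addr" -ge 0 ] && [ "$addr" -lt 4096 ]
}

# Usage:
#   looks_like_null_deref 0x10 && echo "probably a NULL pointer dereference"
```

This is a heuristic only; the backtrace is still needed to see which pointer was NULL.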
Re: 6.0 random freezes
> The load I'm talking about is less than moderate (less than 2.0 with plenty of CPU idle time). The freezing thing also does not appear to happen at peak times (I have rrdtool based CPU load graphs).
>
> Both machines have (almost) identical motherboards:
> Intel SE7520JR2SCSID2 and SE7520JR2ATAD2
> 2 Intel XeonE 3.2GHz 800MHz CPUs
> 4GB DDRII400 RegECC RAM
>
> Both machines boot with ACPI and hyperthreading enabled.

Try disabling HTT in the BIOS. It seldom gives you very much, and sometimes degrades performance. Is it a webserver? If it generates a lot of temporary files you can try adding/changing the following in /etc/sysctl.conf:

kern.ipc.somaxconn=2048
kern.maxfiles=65536
vfs.ufs.dirhash_maxmem=8388608

regards
Claus
Re: 6.0 random freezes
On Mon, 12 Dec 2005 22:15:55 +0100, Atanas [EMAIL PROTECTED] wrote:
> Hi,
>
> I have 3 machines running 6.0-RELEASE, and recently 2 of them started freezing once a day or so. There are no error messages on the console or in the system logs.

What happens if you set one of these sysctl values to 0? (This disables SMP changes from 5.4 to 6.0.)

debug.mpsafevfs: 1
debug.mpsafenet: 1
debug.mpsafevm: 1

And is there a possibility (performance-wise) to build a kernel with the WITNESS and/or INVARIANTS options compiled in? This will give more info about possible locking problems. Your system will run slower, and because of this the problem may not occur anymore, but it is worth a try.

Ronald.
--
Ronald Klop
Amsterdam, The Netherlands
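As far as I can tell, these debug.mpsafe* knobs are boot-time tunables rather than values that can be changed on a running system, so they would be set in /boot/loader.conf and take effect after a reboot; treat that as an assumption to verify on your release. A small sketch of a helper that sets one tunable at a time, so each can be tested in isolation (`set_tunable` is hypothetical, not an existing FreeBSD tool):

```shell
#!/bin/sh
# set_tunable FILE NAME VALUE: replace (or add) a NAME="VALUE" line in a
# loader.conf-style file, leaving all other lines untouched.
# Hypothetical helper for illustration only.
set_tunable() {
    file=$1
    name=$2
    value=$3
    tmp="$file.tmp.$$"
    if [ -f "$file" ]; then
        # Keep every line whose key (text before '=') is not NAME.
        awk -F= -v n="$name" '$1 != n' "$file" > "$tmp"
    else
        : > "$tmp"
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Usage (one knob per test run, then reboot):
#   set_tunable /boot/loader.conf debug.mpsafenet 0
```

Changing a single knob per reboot makes it easier to tell which subsystem, if any, is involved when the freeze stops or persists.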
Re: 6.0 random freezes
On Mon, 2005-Dec-12 13:15:55 -0800, Atanas wrote:
> I have 3 machines running 6.0-RELEASE, and recently 2 of them started freezing once a day or so. There are no error messages on the console or in the system logs.
>
> The first one I put in production about a month ago and it was working flawlessly until it got some load and now it started freezing almost every day. The second one has exactly the same behavior - it was fine when doing nothing (a couple of weeks), and started freezing when loaded.

Define freezing:
Does it respond to pings?
Can you switch VTYs?
Do the num-lock/caps-lock LEDs respond?
Do some processes seem to freeze before others?

I suggest you add the following to your kernel config:

options KDB # Enable kernel debugger support.
options DDB # Support DDB.

When it hangs, break into DDB (Ctrl-Alt-Esc on the console or BREAK on a serial console). As a start, run 'show lockedvnods' and 'ps'. My guess is that you'll see a lock that has a number of waiters - which is probably the culprit. Use 'panic' or 'call doadump' to get a crashdump and then you can use kgdb to rummage around once you reboot - see
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebg-gdb.html

> makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols

I suggest you add this back in. Without it, you can't debug any crash dumps that you manage to get (and add dumpdev to your rc.conf).

> Now the only reasonable option for me (I mean for production and in relatively short term) seems going downward to 5.4 and wait until 6.x get more stable

Whilst I realise that you can't have production machines freezing on schedule, your assistance in providing more information about your problem will help make 6.x more stable.
--
Peter Jeremy
Re: 6.0 random freezes
Claus Guttesen said the following on 12/12/05 13:23:
> > Both machines boot with ACPI and hyperthreading enabled.
>
> Try disabling HTT in the BIOS.

I think that I already achieved that by simply disabling the acpi module from device.hints, and it had no effect on the problem.

> It seldom gives you very much, and sometimes degrades performance. Is it a webserver?

It is a web server, and as such it tends to generate a lot of processes, many of them independent of each other and trying to run simultaneously. Thus more work horses (even less powerful virtual CPUs) make the server perform more smoothly. This is just a practical observation though, and I could be wrong. I would rather go with 2 dual core Opterons, but these are sort of expensive for now.

> If it generates a lot of temporary files you can try adding/changing the following in /etc/sysctl.conf:
>
> kern.ipc.somaxconn=2048
> kern.maxfiles=65536
> vfs.ufs.dirhash_maxmem=8388608

Currently I have the following:

kern.ipc.somaxconn: 1024
kern.maxfiles: 12328
vfs.ufs.dirhash_maxmem: 2097152
kern.openfiles: 1992

Its closest relative (running 5.4-RELEASE on the same hardware) handles about twice as many requests, temporary files, and open files. kern.openfiles there is about 4000, and if something tries to go above the limits, the kernel usually reports that.

I have plenty of other boxes serving at least twice as many requests with less powerful (also hyperthreaded) CPUs running 4.x and 5.x and with no problems. The ones I have problems with are way less loaded, and are supposedly the faster ones.

Thanks for your suggestions!

Regards,
Atanas
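The comparison between the suggested limits and the running values can be automated along these lines. This is a sketch: `check_proposals` and `get_current` are hypothetical names, and the only real command assumed is `sysctl -n NAME`, which prints a value without its name.

```shell
#!/bin/sh
# get_current NAME: print the running value of a sysctl.
# Kept as a separate wrapper so it can be swapped out when experimenting.
get_current() {
    sysctl -n "$1"
}

# check_proposals: read NAME=VALUE lines on stdin and print those whose
# proposed value is larger than the running value.
# Hypothetical helper for illustration only.
check_proposals() {
    while IFS='=' read -r name proposed; do
        [ -n "$name" ] || continue
        current=$(get_current "$name") || continue
        if [ "$current" -lt "$proposed" ]; then
            echo "$name: $current -> $proposed"
        fi
    done
}

# Usage:
#   check_proposals <<EOF
#   kern.ipc.somaxconn=2048
#   kern.maxfiles=65536
#   vfs.ufs.dirhash_maxmem=8388608
#   EOF
```

It only reports differences and changes nothing on the system; whether raising any of these limits helps with the freezes is a separate question.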
Re: 6.0 random freezes
Ronald Klop said the following on 12/12/05 13:27:
> What happens if you set one of these sysctl values to 0? (This disables SMP changes from 5.4 to 6.0.)
>
> debug.mpsafevfs: 1
> debug.mpsafenet: 1
> debug.mpsafevm: 1

Thanks for the suggestion! I just did so and rebooted both machines, so we'll see. I remember unsetting debug.mpsafenet before 5.4 due to some ipfw limitations, but didn't know about the other two.

> And is there a possibility (performance-wise) to build a kernel with WITNESS and/or INVARIANTS options compiled in. This will give more info about possible locking problems. Your system will run slower. And because of this the problem may not occur anymore, but it is worth the try.

Neither machine is heavily loaded, so I could afford to slow them down a bit for a while (I hope it won't be several times slower). I will do that at some point later if the problem still persists.

I hope I won't be forced to downgrade to 5.4, though I'm already working on that (just in case).

Regards,
Atanas
Re: 6.0 random freezes
I just wanted to chime in and say I've had my 6.0-RELEASE #0 freeze up twice in the past few days. never once had it happen with 5.x. everything locks, no keyboard response, no mouse, and after several minutes it reboots itself, and savecore starts up during boot..

and again, it's not during heavy load, both times i was running X, just browsing ye olde internet with firefox, some other apps running as per usual..

FreeBSD atlan.ns.ca 6.0-RELEASE FreeBSD 6.0-RELEASE #0: Thu Nov 3 09:36:13 UTC 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.0-RELEASE #0: Thu Nov 3 09:36:13 UTC 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel Pentium III (701.59-MHz 686-class CPU)
Origin = GenuineIntel Id = 0x683 Stepping = 3
Features=0x387f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory = 536805376 (511 MB)
avail memory = 515956736 (492 MB)
ath_hal: 0.9.14.9 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413)
npx0: [FAST]
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: AWARD AWRDACPI on motherboard
acpi0: Power Button (fixed)
pci_link0: ACPI PCI Link LNKA irq 11 on acpi0
pci_link1: ACPI PCI Link LNKB irq 5 on acpi0
pci_link2: ACPI PCI Link LNKC irq 9 on acpi0
pci_link3: ACPI PCI Link LNKD irq 10 on acpi0
Timecounter ACPI-safe frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x4008-0x400b on acpi0
cpu0: ACPI CPU on acpi0
acpi_throttle0: ACPI CPU Throttling on cpu0
acpi_button0: Power Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff,0x4000-0x4041,0x5000-0x500f on acpi0
pci0: ACPI PCI bus on pcib0
agp0: Intel 82443BX (440 BX) host to PCI bridge mem 0xe000-0xe3ff at device 0.0 on pci0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
pci1: display, VGA at device 0.0 (no driver attached)
isab0: PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel PIIX4 UDMA33 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 7.1 on pci0
ata0: ATA channel 0 on atapci0
ata1: ATA channel 1 on atapci0
uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0x9000-0x901f irq 10 at device 7.2 on pci0
uhci0: [GIANT-LOCKED]
usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
pci0: bridge at device 7.3 (no driver attached)
pcm0: Creative EMU10K1 port 0x9400-0x941f irq 9 at device 10.0 on pci0
pcm0: TriTech TR28602 AC97 Codec
ath0: Atheros 5212 mem 0xe800-0xe800 irq 5 at device 11.0 on pci0
ath0: Ethernet address: 00:0f:3d:50:13:5c
ath0: mac 5.9 phy 4.3 radio 4.6
fdc0: floppy drive controller port 0x3f2-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: 1440-KB 3.5 drive on fdc0 drive 0
sio0: 16550A-compatible COM port port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: 16550A-compatible COM port port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: Standard parallel printer port port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: Parallel port bus on ppc0
plip0: PLIP network interface on ppbus0
lpt0: Printer on ppbus0
lpt0: Interrupt-driven port
ppi0: Parallel I/O on ppbus0
atkbdc0: Keyboard controller (i8042) port 0x60,0x64 irq 1 on acpi0
atkbd0: AT Keyboard irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: PS/2 Mouse irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
pmtimer0 on isa0
orm0: ISA Option ROM at iomem 0xc-0xc on isa0
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x300
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
umass0: Cowon Systems, Inc. iAUDIO M3 Digital Audio Player, rev 2.00/1.00, addr 2
ugen0: EPSON EPSON Scanner, rev 2.00/1.10, addr 3
Timecounter TSC frequency 701594095 Hz quality 800
Timecounters tick every 1.000 msec
ad0: 8063MB FUJITSU MPD3084AT DD-03-47 at ata0-master UDMA33
ad1: 57241MB WDC WD600BB-32CXA0 02.05B02 at ata0-slave UDMA33
acd0: CDRW HL-DT-ST GCE-8525B/1.03 at ata1-master UDMA33
da0 at umass-sim0 bus 0 target 0 lun 0
da0: TOSHIBA MK2004GAL JC10 Fixed Direct Access SCSI-0 device
da0: 1.000MB/s transfers
da0: 19073MB (39063024 512 byte sectors: 255H 63S/T 2431C)
Trying to mount root from ufs:/dev/ad0s1a

On 12/12/05, Atanas [EMAIL PROTECTED] wrote:
> Ronald Klop said the following on 12/12/05 13:27:
> > What happens if you set one of these sysctl values to 0? (This disables SMP changes from 5.4 to 6.0.)
> >
> > debug.mpsafevfs: 1
> > debug.mpsafenet: 1
> > debug.mpsafevm: 1
>
> Thanks for the suggestion! I just did so and rebooted both machines, so we'll see.
Re: 6.0 random freezes
Peter Jeremy said the following on 12/12/05 13:40:
> Define freezing:
> Does it respond to pings?
> Can you switch VTYs?
> Do the num-lock/caps-lock LEDs respond?
> Do some processes seem to freeze before others?

I used the word freeze instead of crash, because the latter often gets associated with some errors reported by the kernel in system logs or on the console. In this case there are absolutely no error messages. I also have remote logging enabled (on another machine over the network), but there's nothing there either.

When the thing happens, the server appears to respond to pings for the first few minutes, but then everything goes down until I go to the data center.

When I plug a keyboard, there's no response at all - no LEDs, no VTYs, Ctrl-Alt-Esc, etc. You might think of hint.atkbd.0.flags not being set properly, but it's right (i.e. unchanged, it appears to default to that on i386 5.x+) and other machines with identical configuration do accept keyboard.

I have no information about processes. The only thing I have is a real-time CPU load graph. I have a script tailing the output of a vmstat cpu 15 and drawing a graph with user/system/idle times, so according to that graph there are no load spikes or unusual variations before the crashes. The usual user/system/idle percentages look like 10/7/83.

> I suggest you add the following to your kernel config:
>
> options KDB # Enable kernel debugger support.
> options DDB # Support DDB.

I just set these along with the DEBUG option below, and got the new kernel (from 6.0-RELEASE sources dated Dec 9) running on both machines, so we'll see.

> When it hangs, break into DDB (Ctrl-Alt-Esc on the console or BREAK on a serial console). As a start, run 'show lockedvnods' and 'ps'. My guess is that you'll see a lock that has a number of waiters - which is probably the culprit. Use 'panic' or 'call doadump' to get a crashdump and then you can use kgdb to rummage around once you reboot - see
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebg-gdb.html

I don't have any experience in chasing kernel bugs, so I'm not sure whether I would be able to get something useful, but I'll try that on the next crash. But if I have no keyboard response I won't be able to save it, right?

I do not know what a serial console is and would need some time to get along with it. Would I get something in addition to what I can get from the standard console?

> > makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
>
> I suggest you add this back in. Without it, you can't debug any crash dumps that you manage to get (and add dumpdev to your rc.conf).

My bad. I realized that it's kind of harmless, but that was weeks after I put the box in production. It's back there now.

The dumpdev variable seems to default to AUTO, i.e. trying to use the first swap device if it's bigger than the RAM (in my case yes), so I guess I don't need to touch it.

> Whilst I realise that you can't have production machines freezing on schedule, your assistance in providing more information about your problem will help make 6.x more stable.

Yes, I know and I will try. Today I already had a couple of crashes (got lucky, no nasty data corruptions this time), and I cannot afford this to continue. I'm already working on the downgrade, but most likely I will have at least one of these 2 machines still running 6.x during the next day or two.

After the downgrade we could eventually set a test bed and start hammering it with requests. The problem would be how to trigger the crash and whether we would be able to reproduce it at all.

Thanks for the prompt reply!

Regards,
Atanas
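The graphing script mentioned above is not shown in the thread, but the step that pulls user/system/idle figures out of vmstat output could look roughly like this. It is a sketch under the assumption that the last three columns of each vmstat data line are the us, sy and id CPU percentages; `cpu_triplet` is a hypothetical name.

```shell
#!/bin/sh
# cpu_triplet: read vmstat output on stdin and print "user/system/idle" for
# every data line, skipping the header lines.
# Hypothetical helper; assumes the last three columns are us, sy and id.
cpu_triplet() {
    awk 'NF >= 3 &&
         $(NF-2) ~ /^[0-9]+$/ && $(NF-1) ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ {
             print $(NF-2) "/" $(NF-1) "/" $NF
         }'
}

# Usage (15-second samples):
#   vmstat 15 | cpu_triplet
```

Logging these triplets on a remote host, rather than on the box itself, would preserve the last samples before a freeze.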
Re: 6.0 random freezes
On Mon, 2005-Dec-12 22:21:52 -0400, fredthetree wrote:
> I just wanted to chime in and say I've had my 6.0-RELEASE #0 freeze up twice in the past few days. never once had it happen with 5.x. everything locks, no keyboard response, no mouse, and after several minutes it reboots itself, and savecore starts up during boot..

This suggests you've had a panic (or something that develops into one). If you've got a crashdump, you can probably get enough information out of it for people to get an idea of what is wrong. Please have a look at:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebg-gdb.html
--
Peter Jeremy
Re: 6.0 random freezes
Atanas said the following on 12/12/05 15:43:
> Ronald Klop said the following on 12/12/05 13:27:
> > What happens if you set one of these sysctl values to 0? (This disables SMP changes from 5.4 to 6.0.)
> >
> > debug.mpsafevfs: 1
> > debug.mpsafenet: 1
> > debug.mpsafevm: 1
>
> Thanks for the suggestion! I just did so and rebooted both machines, so we'll see.

(replying to myself)

... and coincidentally or not, I got the next crash in less than 10 minutes :-(

After the crash it ran for longer, until I rebooted it after rebuilding the kernel with debug hookups. Before the reboot I commented these out (i.e. set them back to 1), and now I'm waiting for a crashdump.

Regards,
Atanas