Re: kernel panics involving NFS+RPCSEC_GSS
On Thu, Aug 18, 2011 at 3:25 PM, Clinton Adams clinton.ad...@gmail.com wrote:
> Hello,
>
> Kernel panics if clients hit the NFS server sufficiently hard. It happens
> repeatedly with 13 clients logging in at approximately the same time,
> using NFSv4-mounted homes. The server is running FreeBSD 8.2-RELEASE-p2;
> the clients are Linux 2.6.38-10.
>
> Running a memtest on the server now to rule out bad memory. The server has
> been used for Samba, and it's only with the attempted switch to NFS that
> this problem has appeared.

Err, wrong paste from another forum. Here's the trace from my server:

Fatal trap 12: page fault while in kernel mode
Fatal trap 12: page fault while in kernel mode
cpuid = 0; cpuid = 2; apic id = 00apic id = 06
fault virtual address = 0x0
fault virtual address = 0x8
fault code = supervisor write data, page not present
fault code = supervisor read data, page not present
instruction pointer = 0x20:0x807db856
instruction pointer = 0x20:0x807dc0d7
stack pointer = 0x28:0xff8096c0d840
stack pointer = 0x28:0xff8096c17860
frame pointer = 0x28:0xff8096c0d860
frame pointer = 0x28:0xff8096c17a80
code segment= base 0x0, limit 0xf, type 0x1b
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= processor eflags= interrupt enabled, interrupt enabled, resume, resume, IOPL = 0IOPL = 0
current process = current process = 765 (nfsd: service)765 (nfsd: service)
trap number = 12
trap number = 12
panic: page fault
cpuid = 1
Uptime: 3h22m48s
Physical memory: 2032 MB
Dumping 406 MB: 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7

Reading symbols from /boot/kernel/linux.ko...done.
Loaded symbols for /boot/kernel/linux.ko
Reading symbols from /boot/kernel/nfscommon.ko...done.
Loaded symbols for /boot/kernel/nfscommon.ko
Reading symbols from /boot/kernel/nfsd.ko...done.
Loaded symbols for /boot/kernel/nfsd.ko
Reading symbols from /boot/kernel/snp.ko...done.
Loaded symbols for /boot/kernel/snp.ko
#0 doadump () at pcpu.h:224
224 __asm("movq %%gs:0,%0" : "=r" (td));

(kgdb) list *0x807db856
0x807db856 is in svc_rpc_gss_forget_client (/usr/src/sys/rpc/rpcsec_gss/svc_rpcsec_gss.c:622).
617 struct svc_rpc_gss_client_list *list;
618
619 list = &svc_rpc_gss_client_hash[client->cl_id.ci_id % CLIENT_HASH_SIZE];
620 sx_xlock(&svc_rpc_gss_lock);
621 TAILQ_REMOVE(list, client, cl_link);
622 TAILQ_REMOVE(&svc_rpc_gss_clients, client, cl_alllink);
623 svc_rpc_gss_client_count--;
624 sx_xunlock(&svc_rpc_gss_lock);
625 svc_rpc_gss_release_client(client);
626 }

(kgdb) backtrace
#0 doadump () at pcpu.h:224
#1 0x805cbabe in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:419
#2 0x805cbed3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:592
#3 0x808d239d in trap_fatal (frame=0xff0004c89460, eva=Variable eva is not available.
) at /usr/src/sys/amd64/amd64/trap.c:783
#4 0x808d275f in trap_pfault (frame=0xff8096c0d790, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:699
#5 0x808d2b5f in trap (frame=0xff8096c0d790) at /usr/src/sys/amd64/amd64/trap.c:449
#6 0x808bada4 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#7 0x807db856 in svc_rpc_gss_forget_client (client=0xff001c015200) at atomic.h:158
#8 0x807dc0e3 in svc_rpc_gss (rqst=0xff0004a24000, msg=0xff8096c0db20) at /usr/src/sys/rpc/rpcsec_gss/svc_rpcsec_gss.c:642
#9 0x807d48f3 in svc_run_internal (pool=0xff0004ca6200, ismaster=0) at /usr/src/sys/rpc/svc.c:837
#10 0x807d50ab in svc_thread_start (arg=Variable arg is not available.
) at /usr/src/sys/rpc/svc.c:1200
#11 0x805a26f8 in fork_exit (callout=0x807d50a0 svc_thread_start, arg=0xff0004ca6200, frame=0xff8096c0dc40) at /usr/src/sys/kern/kern_fork.c:845
#12 0x808bb26e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:565
#13 0x0080 in ?? ()
#14 0x7fffe6e0 in ?? ()
#15 0x002e in ?? ()
#16 0x in ?? ()
#17 0xfef4 in ?? ()
#18 0x in ?? ()
#19 0x009b in ?? ()
#20 0x7fffe6e0 in ?? ()
#21 0x0008 in ?? ()
#22 0x0003 in ?? ()
#23 0x in ?? ()
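For readers following the kgdb listing: the fault lands on a TAILQ_REMOVE of a client structure, and two nfsd threads trapped at nearly the same moment. The following is only a generic sketch of one defensive pattern for that kind of list removal, not a diagnosis of this panic and not the code in svc_rpcsec_gss.c. The struct, the on_list flag, and both function names are made up for illustration. In a kernel the check and the removal would both have to run under the list lock (svc_rpc_gss_lock in the listing above).

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for per-client state; not the real kernel struct. */
struct client {
    int id;
    bool on_list;                 /* true while linked into the list */
    TAILQ_ENTRY(client) link;
};

TAILQ_HEAD(client_list, client);

/* Link a client and remember that it is on the list. */
static void client_insert(struct client_list *list, struct client *c)
{
    TAILQ_INSERT_TAIL(list, c, link);
    c->on_list = true;
}

/*
 * Unlink a client at most once. Returns true only for the call that
 * actually performed the unlink, so a second caller racing to "forget"
 * the same client does not touch the list pointers again.
 * Assumption: the caller holds whatever lock protects the list.
 */
static bool client_forget(struct client_list *list, struct client *c)
{
    if (!c->on_list)
        return false;
    TAILQ_REMOVE(list, c, link);
    c->on_list = false;
    return true;
}
```

With this shape, calling client_forget() twice on the same client unlinks it once and reports false the second time, instead of running TAILQ_REMOVE on an element that is no longer linked.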
Re: Kernel panics?
On 11/04/10 12:35, Richard Morse wrote:
> Hi! I'm having a problem with an 8.1-RELEASE #0 amd64 machine.

Looks like too many different problems all at once. Almost certainly
there's a hardware problem somewhere. Try running memtest86.

> Three weeks ago, it had a kernel panic, which I was too tired to properly
> capture. On reboot, I forgot to run fsck in single user mode; about 12-14
> hours later it crashed complaining that the background file system checks
> were inconsistent.
>
> A day later it crashed with a server double fault; I was unfortunately on
> the way to a meeting, along with all of my technical co-workers, so I
> wasn't able to see the screen, and it was being reported by someone who
> was poorly equipped to give a good report.
>
> A few days later, it had hung (it didn't respond to input), and I needed
> to hard restart. A few days later, the same thing happened.
>
> Last weekend, on Friday evening it complained about the hard disk
> controller disappearing (at least, as far back as I was able to go in the
> screen buffer).
>
> Saturday night, I finally got a kernel panic that I captured; after this,
> I turned on core dumps. However, last night, it crashed again, and tried
> to write out a core, but didn't succeed.
>
> The kernel panic from Saturday night was:
>
> panic: unknown cluster size
> cpuid = 0
> Uptime: 1h49m37s
> Cannot dump. Device not defined or unavailable.
> aac0: shutting down controller...
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 9; apic id = 10
> fault virtual address = 0x1d
> fault code= supervisor write data, page not present
> ...
> current process = 12 (irq256: em0)
> trap number = 12
> done
>
> Last night's:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 9; apic id = 11
> fault virtual address = 0x8098f90e
> fault code= supervisor read data, page not present
> ...
> current process = 97530 (taper)
> trap number = 12
> panic: page fault
> cpuid = 9
> Uptime: 3d21h24m57s
> Physical memory: 12211MB
> Dumping 2942MB:
>
> Note that there was nothing after the "Dumping 2942MB:"; the cursor was
> sitting just after the colon. On reboot, it did not find any cores to save
> to disk (I did have to boot single user and fsck -y; is it possible that
> this interfered with the core dump? If so, how do I fix this?).
>
> I tried, this morning, to run memtest86; however, both 3.5 and 3.4 just
> give loud annoying beeps, not displaying anything on screen (not even a
> menu; once I get past the boot loader from the memtest86 CD, it just
> starts beeping).
>
> Any suggestions?
>
> Thanks,
> Ricky
>
> The information in this e-mail is intended only for the person to whom it
> is addressed. If you believe this e-mail was sent to you in error and the
> e-mail contains patient information, please contact the Partners
> Compliance HelpLine at http://www.partners.org/complianceline . If the
> e-mail was sent to you in error but does not contain patient information,
> please contact the sender and properly dispose of the e-mail.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Kernel panics?
On Thu, Nov 4, 2010 at 12:27 PM, Ivan Voras ivo...@freebsd.org wrote:
> Looks like too many different problems all at once. Almost certainly
> there's a hardware problem somewhere. Try running memtest86.
>
> On 11/04/10 12:35, Richard Morse wrote:
>> Hi! I'm having a problem with an 8.1-RELEASE #0 amd64 machine.

I think Mr. Morse said he ran memtest86 and it produced many loud beeps.
This suggests to me that memtest86 is unable to perform its tests. Have you
tried swapping out the RAM (same type; size and speed don't matter as long
as what goes in matches) and running memtest86 again?

Did you know...
If you play a Windows 2000 CD backwards, you hear satanic messages, but
what's worse is when you play it forward ...it installs Windows 2000
-- Alfred Perlstein on chat at freebsd.org
Re: kernel panics
On Fri, Jul 16, 2010 at 9:03 AM, n dhert ndhert...@gmail.com wrote:
> Where is the best place to report problems with kernel panics in
> FreeBSD 8.0 and to get help?

Have a look at http://www.freebsd.org/send-pr.html

Cheers

> I posted a message in this mailing list freebsd-questions, but it
> actually never was published (?)
RE: Kernel Panics in 6.1 and 6.2 always Exim 4
-----Original Message-----
From: Kris Kennaway [mailto:[EMAIL PROTECTED]
Sent: Friday, September 21, 2007 5:20 PM
To: Wil Hatfield
Cc: freebsd-questions@freebsd.org
Subject: Re: Kernel Panics in 6.1 and 6.2 always Exim 4

Wil Hatfield wrote:
> Well after a year we still haven't tracked down the kernel panic problems
> that are occurring on both our 6.1 and 6.2 machines for those we have had
> time to upgrade. It occurs on 6.1-RC, 6.1-RELEASE, 6.1-STABLE, 6.2, you
> name it. We are noticing that all of the dumps are during Exim 4.6x
> runtime. I am suspicious of PR-97095 but would like others' insights into
> the possibility.

Well, as that PR says, the patch was committed after 6.1-RELEASE, therefore
it is expected that older systems will have the problem. You only provided
a trace from a 6.1 machine, so if you are saying that it still persists on
an up-to-date RELENG_6 kernel, please file a new PR with the details.

Kris

Unfortunately when I upgraded the machine I have from 6.1-RELEASE to 6.2 it
stopped dumping for me. So I have nothing to analyze. However, I still get
the kernel panics I did before. Same frequency and always Exim.

I bumped into a thread somewhere that said setting nmbclusters=0 might be a
good workaround for this bug. Has anybody heard anything about this, or
does it seem logical?

Wil
Re: Kernel Panics in 6.1 and 6.2 always Exim 4
Wil Hatfield wrote:
> -----Original Message-----
> From: Kris Kennaway [mailto:[EMAIL PROTECTED]
> Sent: Friday, September 21, 2007 5:20 PM
> To: Wil Hatfield
> Cc: freebsd-questions@freebsd.org
> Subject: Re: Kernel Panics in 6.1 and 6.2 always Exim 4
>
> Wil Hatfield wrote:
>> Well after a year we still haven't tracked down the kernel panic
>> problems that are occurring on both our 6.1 and 6.2 machines for those
>> we have had time to upgrade. It occurs on 6.1-RC, 6.1-RELEASE,
>> 6.1-STABLE, 6.2, you name it. We are noticing that all of the dumps are
>> during Exim 4.6x runtime. I am suspicious of PR-97095 but would like
>> others' insights into the possibility.
>
> Well, as that PR says, the patch was committed after 6.1-RELEASE,
> therefore it is expected that older systems will have the problem. You
> only provided a trace from a 6.1 machine, so if you are saying that it
> still persists on an up-to-date RELENG_6 kernel, please file a new PR
> with the details.
>
> Kris
>
> Unfortunately when I upgraded the machine I have from 6.1-RELEASE to 6.2
> it stopped dumping for me. So I have nothing to analyze. However, I still
> get the kernel panics I did before. Same frequency and always Exim.

That is quite contrary to expectations, so you should follow up with the
PR. Please try to at least take a photo of the panic traceback from DDB or
something.

Kris
Re: Kernel Panics in 6.1 and 6.2 always Exim 4
On Sep 21, 2007, at 4:11 PM, Wil Hatfield wrote:
> IP Filter: v4.1.8 initialized. Default = block all, Logging = enabled
> ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding
> disabled, default to deny, logging unlimited

Do you really need to run both IPFW and IP Filter at the same time? Can you
nix one of 'em?

> ad0: 76319MB WDC WD800BB-00CAA1 17.07W17 at ata0-master UDMA100
> acd0: CDROM HL-DT-ST CD-ROM GCR-8480B/1.02 at ata1-master UDMA33
> Trying to mount root from ufs:/dev/ad0s1a
> WARNING: / was not properly dismounted
> g_vfs_done():md0[WRITE(offset=23527424, length=131072)]error = 28
> g_vfs_done():md0[WRITE(offset=23805952, length=32768)]error = 28

errno 28 means:

#define ENOSPC 28 /* No space left on device */

...are you using a RAMDISK (md0 implies yes)? Is Exim filling it up? Are
you using a malloc(9) based md, or a swap-based md?

--
-Chuck
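For anyone decoding the "error = 28" in the g_vfs_done() lines themselves, a small userland sketch of the lookup follows. The helper names describe_errno and is_out_of_space are invented for this example; strerror(3) and the ENOSPC constant from errno.h are standard.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format "errno N: text" into buf and return buf. */
static const char *describe_errno(int code, char *buf, size_t len)
{
    snprintf(buf, len, "errno %d: %s", code, strerror(code));
    return buf;
}

/* Hypothetical helper: true when the code is the out-of-space error. */
static int is_out_of_space(int code)
{
    return code == ENOSPC;
}
```

Passing 28 to describe_errno() yields the same text as the comment on the #define quoted above, which is a quick way to check any other error number that shows up in a console message.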
Re: Kernel Panics in 6.1 and 6.2 always Exim 4
Wil Hatfield wrote:
> Well after a year we still haven't tracked down the kernel panic problems
> that are occurring on both our 6.1 and 6.2 machines for those we have had
> time to upgrade. It occurs on 6.1-RC, 6.1-RELEASE, 6.1-STABLE, 6.2, you
> name it. We are noticing that all of the dumps are during Exim 4.6x
> runtime. I am suspicious of PR-97095 but would like others' insights into
> the possibility.

Well, as that PR says, the patch was committed after 6.1-RELEASE, therefore
it is expected that older systems will have the problem. You only provided
a trace from a 6.1 machine, so if you are saying that it still persists on
an up-to-date RELENG_6 kernel, please file a new PR with the details.

Kris
Re: kernel panics, lots of them
[EMAIL PROTECTED] wrote:
> I have a box that has been having problems for months. Originally, there
> were problems that were corrected by replacing the mother board.

What kind of problems did you have? And what hardware? It's quite possible
to damage the CPU or even the power supply if the motherboard fails badly
enough.

> Since then, and I'm not sure when this began, there have been kernel
> panics after several days of uptime. They can be after one day or three
> weeks, but they keep happening.

Probably not a problem with cooling, then. Still sounds like flaky
hardware, though, to me...

> So far, I've replaced an IDE cable and a boot time error disappeared,
> replaced RAM with no benefits, and cvsup'ed/make-world'ed with no
> benefits. I'm not sure what is causing the problems. Any suggestions of
> what I should do next? I still have 14 kernel panic dumps if anyone can
> think of tests that I should be running. Most of the panics appear to be
> page faults, but two of them were lockmgr issues.
>
> I'm considering replacing the mother board and/or the whole computer.
> Unfortunately, this is a fairly major server at my school (staff email,
> assorted web-based apps, web site, intranet, etc.) so I am trying to keep
> outage frequency and duration to a minimum.

There is memtest and cpuburn in the ports; try running those and see
whether you can get the system to crash.

-Chuck
Re: kernel panics, lots of them
On Thu, 27 Mar 2003, Chuck Swiger wrote:
> [EMAIL PROTECTED] wrote:
>> I have a box that has been having problems for months. Originally,
>> there were problems that were corrected by replacing the mother board.
>
> What kind of problems did you have? And what hardware? It's quite
> possible to damage the CPU or even the power supply if the motherboard
> fails badly enough.

At the moment, i.e. with the new mother board, RAM, and cable, it has:

P3 600MHz
256MB RAM (was 1 stick, now 2)
Tyan mother board (Trinity 400)
SCSI PCI card using sym0 driver (can't remember which card)
4 SCSI 18GB hard drives
1 SCSI DDS-4 DAT tape drive
1 EIDE 10GB hard drive
1 ethernet interface using fxp driver (Intel EtherExpress Pro/100)
1 PCI VGA card (can't remember what kind)
1 SCSI cable
1 80-pin EIDE cable

> There is memtest and cpuburn in the ports; try running those and see
> whether you can get the system to crash.

Just to verify before I run these programs in the middle of the work day:
The purpose of these programs is to try to crash the system, right? :)

Thanks,
Jaime
Re: kernel panics, lots of them
[EMAIL PROTECTED] wrote:
[ ... ]
>> There is memtest and cpuburn in the ports; try running those and see
>> whether you can get the system to crash.
>
> Just to verify before I run these programs in the middle of the work day:
> The purpose of these programs is to try to crash the system, right? :)

You should be prepared for the system to crash, yes. :-)

Of course, the point of these tests is that your hardware _should_ be able
to run them for days or weeks without any problems with system stability.
But if the system cooling/memory timing/etc is marginal, these will
probably cause the system to panic within a few hours, which helps confirm
where the problem lies...

-Chuck
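To make the idea behind such a stress test concrete, here is a toy sketch of a pattern test in C. It is an illustration only and makes no claim about how the memtest port is implemented; the function names and the choice of four bit patterns are assumptions for this example. On healthy hardware every pass should report zero mismatches, however long it runs.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Fill `words` 64-bit words with `pattern`, read them back, and count
 * mismatches. volatile keeps the compiler from eliding the read-back.
 */
static size_t pattern_pass(volatile uint64_t *buf, size_t words,
                           uint64_t pattern)
{
    size_t bad = 0;

    for (size_t i = 0; i < words; i++)
        buf[i] = pattern;
    for (size_t i = 0; i < words; i++)
        if (buf[i] != pattern)
            bad++;
    return bad;
}

/*
 * Run all-zeros, all-ones, and two alternating-bit patterns over a
 * freshly allocated buffer. Returns the total mismatch count, or
 * (size_t)-1 if the allocation fails.
 */
static size_t memory_smoke_test(size_t words)
{
    static const uint64_t patterns[] = {
        0x0000000000000000ULL, 0xFFFFFFFFFFFFFFFFULL,
        0xAAAAAAAAAAAAAAAAULL, 0x5555555555555555ULL,
    };
    uint64_t *buf = malloc(words * sizeof *buf);
    size_t bad = 0;

    if (buf == NULL)
        return (size_t)-1;
    for (size_t p = 0; p < sizeof patterns / sizeof patterns[0]; p++)
        bad += pattern_pass(buf, words, patterns[p]);
    free(buf);
    return bad;
}
```

A real tester covers far more memory, uses many more patterns, and loops for hours; a small buffer like this mostly exercises the CPU cache rather than the DIMMs, so it shows the principle rather than replacing the ports.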