Re: kernel panics involving NFS+RPCSEC_GSS

2011-08-18 Thread Clinton Adams
On Thu, Aug 18, 2011 at 3:25 PM, Clinton Adams clinton.ad...@gmail.com wrote:
 Hello,

 The kernel panics if clients hit the NFS server sufficiently hard. It
 happens repeatedly with 13 clients logging in at approximately the same
 time, using NFSv4-mounted home directories.

 The server is running FreeBSD 8.2-RELEASE-p2. The clients are Linux 2.6.38-10.

 Running a memtest on the server now to rule out bad memory. The server
 has been used for Samba, and it's only with the attempted switch to
 NFS that this problem has appeared.


Err, wrong paste from another forum. Here's the trace from my server:

Fatal trap 12: page fault while in kernel mode
Fatal trap 12: page fault while in kernel mode
cpuid = 0;
cpuid = 2; apic id = 00apic id = 06
fault virtual address   = 0x0
fault virtual address   = 0x8
fault code  = supervisor write data, page not present
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x807db856
instruction pointer = 0x20:0x807dc0d7
stack pointer   = 0x28:0xff8096c0d840
stack pointer   = 0x28:0xff8096c17860
frame pointer   = 0x28:0xff8096c0d860
frame pointer   = 0x28:0xff8096c17a80
code segment= base 0x0, limit 0xf, type 0x1b
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags=
processor eflags= interrupt enabled, interrupt enabled,
resume, resume, IOPL = 0IOPL = 0
current process =
current process = 765 (nfsd: service)765 (nfsd: service)
trap number = 12
trap number = 12
panic: page fault

cpuid = 1
Uptime: 3h22m48s
Physical memory: 2032 MB
Dumping 406 MB: 391 375 359 343 327 311 295 279 263 247 231 215
199 183 167 151 135 119 103 87 71 55 39 23 7

Reading symbols from /boot/kernel/linux.ko...done.
Loaded symbols for /boot/kernel/linux.ko
Reading symbols from /boot/kernel/nfscommon.ko...done.
Loaded symbols for /boot/kernel/nfscommon.ko
Reading symbols from /boot/kernel/nfsd.ko...done.
Loaded symbols for /boot/kernel/nfsd.ko
Reading symbols from /boot/kernel/snp.ko...done.
Loaded symbols for /boot/kernel/snp.ko
#0  doadump () at pcpu.h:224
224 __asm("movq %%gs:0,%0" : "=r" (td));
(kgdb) list *0x807db856
0x807db856 is in svc_rpc_gss_forget_client (/usr/src/sys/rpc/rpcsec_gss/svc_rpcsec_gss.c:622).
617 struct svc_rpc_gss_client_list *list;
618
619 list = &svc_rpc_gss_client_hash[client->cl_id.ci_id % CLIENT_HASH_SIZE];
620 sx_xlock(&svc_rpc_gss_lock);
621 TAILQ_REMOVE(list, client, cl_link);
622 TAILQ_REMOVE(&svc_rpc_gss_clients, client, cl_alllink);
623 svc_rpc_gss_client_count--;
624 sx_xunlock(&svc_rpc_gss_lock);
625 svc_rpc_gss_release_client(client);
626 }
(kgdb) backtrace
#0  doadump () at pcpu.h:224
#1  0x805cbabe in boot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:419
#2  0x805cbed3 in panic (fmt=0x0) at
/usr/src/sys/kern/kern_shutdown.c:592
#3  0x808d239d in trap_fatal (frame=0xff0004c89460,
eva=Variable eva is not available.
) at /usr/src/sys/amd64/amd64/trap.c:783
#4  0x808d275f in trap_pfault (frame=0xff8096c0d790,
usermode=0) at /usr/src/sys/amd64/amd64/trap.c:699
#5  0x808d2b5f in trap (frame=0xff8096c0d790) at
/usr/src/sys/amd64/amd64/trap.c:449
#6  0x808bada4 in calltrap () at
/usr/src/sys/amd64/amd64/exception.S:224
#7  0x807db856 in svc_rpc_gss_forget_client
(client=0xff001c015200) at atomic.h:158
#8  0x807dc0e3 in svc_rpc_gss (rqst=0xff0004a24000,
msg=0xff8096c0db20) at
/usr/src/sys/rpc/rpcsec_gss/svc_rpcsec_gss.c:642
#9  0x807d48f3 in svc_run_internal
(pool=0xff0004ca6200, ismaster=0) at /usr/src/sys/rpc/svc.c:837
#10 0x807d50ab in svc_thread_start (arg=Variable arg is
not available.
) at /usr/src/sys/rpc/svc.c:1200
#11 0x805a26f8 in fork_exit (callout=0x807d50a0
<svc_thread_start>, arg=0xff0004ca6200, frame=0xff8096c0dc40)
at /usr/src/sys/kern/kern_fork.c:845
#12 0x808bb26e in fork_trampoline () at
/usr/src/sys/amd64/amd64/exception.S:565
#13 0x0080 in ?? ()
#14 0x7fffe6e0 in ?? ()
#15 0x002e in ?? ()
#16 0x in ?? ()
#17 0xfef4 in ?? ()
#18 0x in ?? ()
#19 0x009b in ?? ()
#20 0x7fffe6e0 in ?? ()
#21 0x0008 in ?? ()
#22 0x0003 in ?? ()
#23 0x in ?? ()
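
A note on the listing above: the faulting line is a TAILQ_REMOVE, and that macro writes through the back-pointer stored in the element, so removing an entry that is no longer linked (or never was) can dereference a bad pointer. Whether that is what happened here is not established in this thread. The user-space sketch below only illustrates a guarded removal that tolerates being called twice on the same entry. All names in it (struct client, client_add, client_forget, the linked flag) are hypothetical and are not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a cached client entry; not the kernel's struct. */
struct client {
	int id;
	bool linked;               /* true while the entry is on the list */
	TAILQ_ENTRY(client) link;
};

TAILQ_HEAD(client_list, client);

/* Append an entry and record that it is now on the list. */
static void
client_add(struct client_list *list, int *count, struct client *c)
{
	assert(c != NULL && !c->linked);
	TAILQ_INSERT_TAIL(list, c, link);
	c->linked = true;
	(*count)++;
}

/*
 * Unlink an entry only if it is still linked.  Returns true when this
 * call performed the removal.  In kernel code the check and the removal
 * would both have to happen while holding the list lock.
 */
static bool
client_forget(struct client_list *list, int *count, struct client *c)
{
	assert(c != NULL);
	if (!c->linked)
		return false;
	TAILQ_REMOVE(list, c, link);
	c->linked = false;
	(*count)--;
	return true;
}
```

With this guard, a second client_forget() on the same entry is a no-op instead of a second TAILQ_REMOVE on stale link pointers.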

Re: Kernel panics?

2010-11-04 Thread Ivan Voras
On 11/04/10 12:35, Richard Morse wrote:
 Hi! I'm having a problem with an 8.1-RELEASE #0 amd64 machine.

Looks like too many different problems all at once. Almost certainly
there's a hardware problem somewhere. Try running memtest86.


 Three weeks ago, it had a kernel panic, which I was too tired to properly 
 capture. On reboot, I forgot to run fsck in single user mode; about 12-14 
 hours later it crashed complaining that the background file system checks 
 were inconsistent.
 
 A day later it crashed with a server double fault; I was unfortunately on 
 the way to a meeting, along with all of my technical co-workers, so I wasn't 
 able to see the screen, and it was being reported by someone who was poorly 
 equipped to give a good report.
 
 A few days later, it had hung (it didn't respond to input), and I needed to 
 hard restart.
 
 A few days later, the same thing happened.
 
 Last weekend, on Friday evening it complained about the hard disk controller 
 disappearing (at least, as far back as I was able to go in the screen buffer).
 
 Saturday night, I finally got a kernel panic that I captured; after this, I 
 turned on core dumps.
 
 However, last night, it crashed again, and tried to write out a core, but 
 didn't succeed.
 
 The kernel panic from Saturday night was:
 
 panic: unknown cluster size
 cpuid = 0
 Uptime: 1h49m37s
 Cannot dump. Device not defined or unavailable.
 aac0: shutting down controller...
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 9; apic id = 10
 fault virtual address = 0x1d
 fault code= supervisor write data, page not present
 ...
 current process   = 12 (irq256: em0)
 trap number   = 12
 done
 
 Last night's:
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 9; apic id = 11
 fault virtual address = 0x8098f90e
 fault code= supervisor read data, page not present
 ...
 current process   = 97530 (taper)
 trap number   = 12
 panic: page fault
 cpuid = 9
 Uptime: 3d21h24m57s
 Physical memory: 12211MB
 Dumping 2942MB:
 
 Note that there was nothing after the "Dumping 2942MB:"; the cursor was 
 sitting just after the colon. On reboot, it did not find any cores to save to 
 disk (I did have to boot single user and fsck -y; is it possible that this 
 interfered with the core dump? if so, how do I fix this?).
 
 I tried, this morning, to run memtest86, however both 3.5 and 3.4 just give 
 loud annoying beeps, not displaying anything on screen (not even a menu; once 
 I get past the boot loader from the memtest86 cd, it just starts beeping).
 
 Any suggestions?
 
 Thanks,
 Ricky
 
 
 

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: Kernel panics?

2010-11-04 Thread Chris Brennan
On Thu, Nov 4, 2010 at 12:27 PM, Ivan Voras ivo...@freebsd.org wrote:
 Looks like too many different problems all at once. Almost certainly
 there's a hardware problem somewhere. Try running memtest86.



On 11/04/10 12:35, Richard Morse wrote:
  Hi! I'm having a problem with an 8.1-RELEASE #0 amd64 machine.


I think Mr. Morse said he ran memtest86 and it produced many loud beeps.
This suggests to me that memtest86 is unable to perform its tests. Have you
tried swapping out the RAM (same type; size and speed don't matter as long as
what goes in matches) and trying memtest86 again?


Re: kernel panics

2010-07-16 Thread Fernando Apesteguía
On Fri, Jul 16, 2010 at 9:03 AM, n dhert ndhert...@gmail.com wrote:
 Where is the best place to report problems with kernel panics in FreeBSD
 8.0 and to get help?

Have a look at

http://www.freebsd.org/send-pr.html

Cheers

 I posted a message to this mailing list, freebsd-questions, but it actually
 was never published (?)


RE: Kernel Panics in 6.1 and 6.2 always Exim 4

2007-09-23 Thread Wil Hatfield

 -----Original Message-----
 From: Kris Kennaway [mailto:[EMAIL PROTECTED]
 Sent: Friday, September 21, 2007 5:20 PM
 To: Wil Hatfield
 Cc: freebsd-questions@freebsd.org
 Subject: Re: Kernel Panics in 6.1 and 6.2 always Exim 4


 Wil Hatfield wrote:
  Well after a year we still haven't tracked down the kernel panic problems
  that are occurring on both our 6.1 and 6.2 machines for those we have had
  time to upgrade.  It occurs on 6.1-RC, 6.1-RELEASE, 6.1-STABLE, 6.2, you name
  it.

  We are noticing that all of the dumps are during Exim 4.6x runtime. I am
  suspicious of PR-97095 but would like others' insights into the possibility.

 Well, as that PR says, the patch was committed after 6.1-RELEASE,
 therefore it is expected that older systems will have the problem.  You
 only provided a trace from a 6.1 machine, so if you are saying that it
 still persists on an up-to-date RELENG_6 kernel, please file a new PR
 with the details.

 Kris

Unfortunately when I upgraded the machine I have from 6.1-RELEASE to 6.2 it
stopped dumping for me. So I have nothing to analyze. However, I still get
the kernel panics I did before. Same frequency and always Exim.

I bumped into a thread somewhere that said something about setting
nmbclusters=0 might be a good workaround for this bug. Anybody heard
anything about this or does it seem logical?

Wil


Re: Kernel Panics in 6.1 and 6.2 always Exim 4

2007-09-23 Thread Kris Kennaway

Wil Hatfield wrote:

 Unfortunately when I upgraded the machine I have from 6.1-RELEASE to 6.2 it
 stopped dumping for me. So I have nothing to analyze. However, I still get
 the kernel panics I did before. Same frequency and always Exim.


That is quite contrary to expectations, so you should follow up with the 
PR.  Please try to at least take a photo of the panic traceback from DDB 
or something.


Kris


Re: Kernel Panics in 6.1 and 6.2 always Exim 4

2007-09-21 Thread Chuck Swiger

On Sep 21, 2007, at 4:11 PM, Wil Hatfield wrote:

 IP Filter: v4.1.8 initialized.  Default = block all, Logging = enabled
 ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled,
 default to deny, logging unlimited

Do you really need to run both IPFW and IP Filter at the same time?
Can you nix one of 'em?



 ad0: 76319MB WDC WD800BB-00CAA1 17.07W17 at ata0-master UDMA100
 acd0: CDROM HL-DT-ST CD-ROM GCR-8480B/1.02 at ata1-master UDMA33
 Trying to mount root from ufs:/dev/ad0s1a
 WARNING: / was not properly dismounted
 g_vfs_done():md0[WRITE(offset=23527424, length=131072)]error = 28
 g_vfs_done():md0[WRITE(offset=23805952, length=32768)]error = 28


errno 28 means:

#define ENOSPC  28  /* No space left on device */

...are you using a RAMDISK (md0 implies yes)?  Is Exim filling it  
up?  Are you using a malloc(9) based md, or a swap-based md?
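
For anyone who wants to decode an errno value from a g_vfs_done() line without digging through the headers, a minimal sketch follows. It relies only on the standard strerror() call; the helper name describe_errno is made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Return the C library's description of an errno value.
 * The returned string must not be modified by the caller.
 */
static const char *
describe_errno(int err)
{
	const char *msg = strerror(err);

	assert(msg != NULL);
	return msg;
}
```

On FreeBSD, describe_errno(28) should give the same text as the header comment quoted above, since ENOSPC is defined as 28 there.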


--
-Chuck


Re: Kernel Panics in 6.1 and 6.2 always Exim 4

2007-09-21 Thread Kris Kennaway

Wil Hatfield wrote:

 Well after a year we still haven't tracked down the kernel panic problems
 that are occurring on both our 6.1 and 6.2 machines for those we have had
 time to upgrade.  It occurs on 6.1-RC, 6.1-RELEASE, 6.1-STABLE, 6.2, you name
 it.

 We are noticing that all of the dumps are during Exim 4.6x runtime. I am
 suspicious of PR-97095 but would like others' insights into the possibility.


Well, as that PR says, the patch was committed after 6.1-RELEASE, 
therefore it is expected that older systems will have the problem.  You 
only provided a trace from a 6.1 machine, so if you are saying that it 
still persists on an up-to-date RELENG_6 kernel, please file a new PR 
with the details.


Kris


Re: kernel panics, lots of them

2003-03-27 Thread Chuck Swiger
[EMAIL PROTECTED] wrote:
 I have a box that has been having problems for months.
 Originally, there were problems that were corrected by replacing the
 mother board.

What kind of problems did you have?  And what hardware?  It's quite
possible to damage the CPU or even the power supply if the motherboard
fails badly enough.

 Since then, and I'm not sure when this began, there have
 been kernel panics after several days of uptime.  They can be after one
 day or three weeks, but they keep happening.

Probably not a problem with cooling, then.
Still sounds like flaky hardware, though, to me...

 So far, I've replaced an IDE cable and a boot time error
 disappeared, replaced RAM with no benefits, and cvsup'ed/make-world'ed
 with no benefits.
 I'm not sure what is causing the problems.  Any suggestions of
 what I should do next?  I still have 14 kernel panic dumps if anyone can
 think of tests that I should be running.  Most of the panics appear to be
 page faults, but two of them were lockmgr issues.  I'm considering
 replacing the mother board and/or the whole computer.  Unfortunately, this
 is a fairly major server at my school (staff email, assorted web-based
 apps, web site, intranet, etc.) so I am trying to keep outage frequency
 and duration to a minimum.

There is memtest and cpuburn in the ports; try running those and see
whether you can get the system to crash.

-Chuck


Re: kernel panics, lots of them

2003-03-27 Thread jaime
On Thu, 27 Mar 2003, Chuck Swiger wrote:
 [EMAIL PROTECTED] wrote:
  I have a box that has been having problems for months.
  Originally, there were problems that were corrected by replacing the
  mother board.

 What kind of problems did you have?  And what hardware?  It's quite
 possible to damage the CPU or even the power supply if the motherboard
 fails badly enough.

At the moment, i.e. with the new motherboard, RAM, and cable, it has:
P3 600MHz
256MB RAM (was 1 stick, now 2)
Tyan mother board (Trinity 400)
SCSI PCI card using sym0 driver (can't remember which card)
4 SCSI 18GB hard drives
1 SCSI DDS-4 DAT tape drive
1 EIDE 10GB hard drive
1 ethernet interface using fxp driver (Intel EtherExpress Pro/100)
1 PCI VGA card (can't remember what kind)
1 SCSI cable
1 80-pin EIDE cable


 There is memtest and cpuburn in the ports; try running those and see
 whether you can get the system to crash.

Just to verify before I run these programs in the middle of the
work day:  The purpose of these programs is to try to crash the system,
right?  :)

Thanks,
Jaime


Re: kernel panics, lots of them

2003-03-27 Thread Chuck Swiger
[EMAIL PROTECTED] wrote:
[ ... ]
  There is memtest and cpuburn in the ports; try running those and see
  whether you can get the system to crash.
 Just to verify before I run these programs in the middle of the
 work day:  The purpose of these programs is to try to crash the system,
 right?  :)

You should be prepared for the system to crash, yes.  :-)

Of course, the point of these tests is that your hardware _should_ be 
able to run them for days or weeks without any problems with system 
stability.  But if the system cooling/memory timing/etc is marginal, 
these will probably cause the system to panic within a few hours, which 
helps confirm where the problem lies...
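
To make the idea concrete, here is a rough sketch of the write-then-read-back check that memory testers are built around. This is not how memtest or cpuburn are actually implemented, and the function name pattern_pass is made up; it only shows why healthy hardware should pass indefinitely while marginal hardware eventually reports mismatches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Write a pattern to every word of a freshly allocated buffer, read it
 * back, and return the number of words that did not match.  Returns
 * (size_t)-1 if the buffer could not be allocated.
 */
static size_t
pattern_pass(size_t nwords, uint64_t pattern)
{
	volatile uint64_t *buf;
	size_t i, bad = 0;

	assert(nwords > 0);
	buf = malloc(nwords * sizeof(*buf));
	if (buf == NULL)
		return (size_t)-1;
	for (i = 0; i < nwords; i++)
		buf[i] = pattern ^ (uint64_t)i;   /* vary the data per word */
	for (i = 0; i < nwords; i++)
		if (buf[i] != (pattern ^ (uint64_t)i))
			bad++;
	free((void *)buf);
	return bad;
}
```

A real tester repeats passes like this for hours with many patterns and over as much memory as it can reach; a user-space sketch such as this one only touches whatever pages malloc() happens to hand back.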

-Chuck
