RE: Problems in VM structure ?

1999-02-18 Thread tcobb
I've adjusted MAXUSERS to 128 on my heavily loaded PIIs and the crashes
have not re-occurred for 24 hours now.  (Had to adjust NMBCLUSTERS up,
though)
The panics were happening every 5-8 hours like clockwork prior to this.

I believe that these crashes are caused by heavy network traffic, not
heavy load values, so a make world may not trigger this.  Actually, I 
couldn't force it to happen when I hit the box hard during testing with
web traffic, so it must be a combination thing.

Another clue is the fact that I can't seem to get a Pentium (P5) to crash
at all, ever, even when running exactly the same kernel config.  
The Pentium IIs fell over like crazy.


-Troy Cobb
 Circle Net, Inc.
 http://www.circle.net


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-current" in the body of the message



Re: Problems in VM structure ?

1999-02-17 Thread Bernd Walter
On Wed, Feb 17, 1999 at 08:46:49AM -0500, tc...@staff.circle.net wrote:
> 
> 
> >   -Original Message-
> >   From: Matthew Dillon [mailto:dil...@apollo.backplane.com]
> >   :What's the chance that our kernel adaptations for PIIs
> >   :are partly at fault?
> >   :
> >   :-Troy Cobb
> >   : Circle Net, Inc.
> >   : http://www.circle.net
> >   
> >   With what config?  Have you tried reducing maxusers to 128?
> >   
> > -Matt
> 
> 
> I've had it at MAXUSERS=256 on both the P5 and the P6.  The P5 stays
> stable, the P6 doesn't.  If I reduce MAXUSERS to 128 then these
> heavily loaded boxen will fall over due to out of MBUFs errors, or
> so I believe.
> 
In my case it was a P-II 400

> I'd love to find some real kernel-tuning documentation out there,
> one of my panics is a "pipeinit:  cannot allocate pipe -- out of kvm"
> and I can't pull a crashdump due to a DSCHECK error because my
> SWAP is > 2GB.
> 
The pipeinit is one of the panics I got very often - but they were not reduced to

> 
> -Troy Cobb
>  Circle Net, Inc.
>  http://www.circle.net
> 

-- 
  B.Walter






Re: Problems in VM structure ?

1999-02-17 Thread Bernd Walter
On Tue, Feb 16, 1999 at 12:19:07AM -0800, Matthew Dillon wrote:
> :maxusers 256
> 
> Try reducing maxusers to 128.  Another person reported similar behavior
> to me and after a bunch of work he tried going back to a basic 
> distribution -- and everything started working again.
> 
To add some public information:
The host in question has 512 MB of RAM, and I have also tested it with 256 MB.
I originally installed a 3.0-CURRENT from mid-December and updated
to a recent version after getting panics.
With MAXUSERS set to 512 I was able to trigger a panic within only a few
minutes of uptime; after reducing it to 256 the host was more stable.
Now it is running without any panics using MAXUSERS 128.

> It turned out that a maxusers value of 256 and 512 were causing his machine
> to go poof, but a maxusers value of 128 worked fine.
> 
> I haven't tracked the problem down yet.  Please try reducing your maxusers
> to 128 and email the results to current.
> 
>   -Matt
> 

-- 
  B.Walter






Re: Problems in VM structure ?

1999-02-17 Thread Gary Palmer
[ CC trimmed ]

tc...@staff.circle.net wrote:
> I've had it at MAXUSERS=256 on both the P5 and the P6.  The P5 stays
> stable, the P6 doesn't.  If I reduce MAXUSERS to 128 then these
> heavily loaded boxen will fall over due to out of MBUFs errors, or
> so I believe.

If you are running out of MBUF clusters, play with

option  "NMBCLUSTERS=x"

directly rather than indirectly through maxusers... 4096 or 8192 is probably a
good starting value. If you look at param.c in /sys/compile/YOUR_KERNEL,
you can get a better idea of what maxusers tweaks, and what you
need to tweak manually. I know it's not a `howto' guide :) Sorry
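
To make the maxusers dependency concrete, here is a rough sketch of how a few
limits scale with it. The three formulas are assumptions recalled from a
param.c of this era, not something stated in this thread, and the function
names are made up for illustration; check the param.c in your own compile
directory before relying on the numbers.

```shell
#!/bin/sh
# Sketch only: the formulas below are ASSUMED defaults, to be verified
# against the param.c in /sys/compile/YOUR_KERNEL.
derived_nproc()       { echo $(( 20 + 16 * $1 )); }                 # assumed NPROC
derived_maxfiles()    { echo $(( 2 * $(derived_nproc "$1") )); }    # assumed MAXFILES = NPROC * 2
derived_nmbclusters() { echo $(( 512 + 16 * $1 )); }                # assumed NMBCLUSTERS default

# Print the derived values for the maxusers settings discussed in the thread.
for mu in 128 256 512; do
    echo "maxusers=$mu nproc=$(derived_nproc "$mu")" \
         "maxfiles=$(derived_maxfiles "$mu")" \
         "nmbclusters=$(derived_nmbclusters "$mu")"
done
```

If those formulas hold, dropping maxusers from 256 to 128 also lowers the
default cluster count from 4608 to 2560, which would be consistent with a busy
network box running out of mbufs unless NMBCLUSTERS is set explicitly.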

Gary
--
Gary Palmer  FreeBSD Core Team Member
FreeBSD: Turning PC's into workstations. See http://www.FreeBSD.ORG/ for info







Re: RE: RE: Problems in VM structure ?

1999-02-17 Thread Matthew Dillon
Try reducing maxusers to 128.  If you have mbuf problems, override 
NMBCLUSTERS ( making it 4096 or 8192 should be sufficient ).  Sometimes
network mbuf problems on heavily loaded machines are due to too-large
default buffer sizes - if net.inet.tcp.sendspace or recvspace is greater
than 16384, try reducing it to 16384.
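
A small sketch of that check, for illustration. The helper names and the
sample values (32768 and 16384) are hypothetical; on a real box the current
value would come from something like `sysctl -n NAME`. The script only prints
the command it would suggest and changes nothing.

```shell
#!/bin/sh
# Cap a TCP buffer size at 16384, the ceiling suggested above.
cap_bufsize() {
    if [ "$1" -gt 16384 ]; then echo 16384; else echo "$1"; fi
}

# Print (do not run) the sysctl command that would apply the cap.
# $1 = sysctl name, $2 = its current value.
suggest_sysctl() {
    new=$(cap_bufsize "$2")
    if [ "$new" -ne "$2" ]; then
        echo "sysctl -w $1=$new"
    fi
}

# Hypothetical current values, for demonstration only.
suggest_sysctl net.inet.tcp.sendspace 32768
suggest_sysctl net.inet.tcp.recvspace 16384
```

With these sample values only the sendspace line is printed, since recvspace
is already at the suggested ceiling.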

If your machine is not using that > 2GB of swap, cap the swap at 1.9GB.

-Matt
Matthew Dillon 


:I've had it at MAXUSERS=256 on both the P5 and the P6.  The P5 stays
:stable, the P6 doesn't.  If I reduce MAXUSERS to 128 then these
:heavily loaded boxen will fall over due to out of MBUFs errors, or
:so I believe.
:
:I'd love to find some real kernel-tuning documentation out there,
:one of my panics is a "pipeinit:  cannot allocate pipe -- out of kvm"
:and I can't pull a crashdump due to a DSCHECK error because my
:SWAP is > 2GB.
:
:
:-Troy Cobb
: Circle Net, Inc.
: http://www.circle.net
:






RE: RE: Problems in VM structure ?

1999-02-17 Thread tcobb


>   -Original Message-
>   From: Matthew Dillon [mailto:dil...@apollo.backplane.com]
>   :What's the chance that our kernel adaptations for PIIs
>   :are partly at fault?
>   :
>   :-Troy Cobb
>   : Circle Net, Inc.
>   : http://www.circle.net
>   
>   With what config?  Have you tried reducing maxusers to 128?
>   
>   -Matt


I've had it at MAXUSERS=256 on both the P5 and the P6.  The P5 stays
stable, the P6 doesn't.  If I reduce MAXUSERS to 128 then these
heavily loaded boxen will fall over due to out of MBUFs errors, or
so I believe.

I'd love to find some real kernel-tuning documentation out there,
one of my panics is a "pipeinit:  cannot allocate pipe -- out of kvm"
and I can't pull a crashdump due to a DSCHECK error because my
SWAP is > 2GB.


-Troy Cobb
 Circle Net, Inc.
 http://www.circle.net





Re: RE: Problems in VM structure ?

1999-02-17 Thread Matthew Dillon
:I'm seeing different responses depending on hardware.
:
:On regular Pentium 166 machines, I almost NEVER get
:a panic.  On brand-new Pentium II 350s, I get a panic
:every 6-9 hours.  This happens when both kernels are
:configured the same for maxusers.  It happens when
:both machines are under the same load level -- the
:P5 stays rock solid, the P6 flakes out.
:
:What's the chance that our kernel adaptations for PIIs
:are partly at fault?
:
:-Troy Cobb
: Circle Net, Inc.
: http://www.circle.net

With what config?  Have you tried reducing maxusers to 128?

-Matt






RE: Problems in VM structure ?

1999-02-16 Thread tcobb
I'm seeing different responses depending on hardware.

On regular Pentium 166 machines, I almost NEVER get
a panic.  On brand-new Pentium II 350s, I get a panic
every 6-9 hours.  This happens when both kernels are
configured the same for maxusers.  It happens when
both machines are under the same load level -- the
P5 stays rock solid, the P6 flakes out.

What's the chance that our kernel adaptations for PIIs
are partly at fault?


-Troy Cobb
 Circle Net, Inc.
 http://www.circle.net

>   -Original Message-
>   From: Brian Feldman [mailto:gr...@unixhelp.org]
>   Sent: Tuesday, February 16, 1999 7:48 AM
>   To: Matthew Dillon
>   Cc: Khetan Gajjar; curr...@freebsd.org
>   Subject: Re: Problems in VM structure ?
>   
>   
>   On Tue, 16 Feb 1999, Matthew Dillon wrote:
>   
>   > :maxusers 256
>   > 
>   > Try reducing maxusers to 128.  Another person reported similar behavior
>   > to me and after a bunch of work he tried going back to a basic
>   > distribution -- and everything started working again.
>   > 
>   > It turned out that a maxusers value of 256 and 512 were causing his machine
>   > to go poof, but a maxusers value of 128 worked fine.
>   > 
>   > I haven't tracked the problem down yet.  Please try reducing your maxusers
>   > to 128 and email the results to current.
>   
>   For what it's worth, my maxusers is 250 and my system is quite stable, even
>   during a make -j25 buildworld.
>   
>   > 
>   >   -Matt
>   
>   Brian Feldman
>   gr...@unixhelp.org
>   http://www.freebsd.org/
>   FreeBSD: The Power to Serve!





Re: Problems in VM structure ?

1999-02-16 Thread John S. Dyson
Matthew Dillon said:
> :maxusers 256
> 
> Try reducing maxusers to 128.  Another person reported similar behavior
> to me and after a bunch of work he tried going back to a basic 
> distribution -- and everything started working again.
> 
> It turned out that a maxusers value of 256 and 512 were causing his machine
> to go poof, but a maxusers value of 128 worked fine.
> 
> I haven't tracked the problem down yet.  Please try reducing your maxusers
> to 128 and email the results to current.
> 
Likely because data structures are getting too big.  The kernel is limited
to (I forget) how big in VA space.

-- 
John  | Never try to teach a pig to sing,
dy...@iquest.net  | it makes one look stupid
jdy...@nc.com | and it irritates the pig.




Re: Problems in VM structure ?

1999-02-16 Thread John Fieber
On Tue, 16 Feb 1999, Matthew Dillon wrote:

> Try reducing maxusers to 128.  Another person reported similar behavior
> to me and after a bunch of work he tried going back to a basic 
> distribution -- and everything started working again.
> 
> It turned out that a maxusers value of 256 and 512 were causing his machine
> to go poof, but a maxusers value of 128 worked fine.

Another datapoint, Sybase goes poof with maxusers set to 64 or
higher.  This has been the case since before 3.0 was released.

-john




Re: Problems in VM structure ?

1999-02-16 Thread Khetan Gajjar
On Tue, 16 Feb 1999, Matthew Dillon wrote:

MD>  Try reducing maxusers to 128.  Another person reported similar behavior
MD>  to me and after a bunch of work he tried going back to a basic 
MD>  distribution -- and everything started working again.

Hmmm, ok.

MD>  It turned out that a maxusers value of 256 and 512 were causing his machine
MD>  to go poof, but a maxusers value of 128 worked fine.

Ok. I'm glad, in a way, that I'm not the only one seeing
this.

The really weird thing though is that since reporting the problem,
it hasn't re-occurred. If it occurs again, I'll mail the results of
the gdb -core /var/crash/blah, a trace and then try reducing the
number of maxusers.

This is the longest uptime I've had in almost two weeks - 14 hours.
Here's hoping :)

MD>  I haven't tracked the problem down yet.  Please try reducing your maxusers
MD>  to 128 and email the results to current.

If the problem re-occurs, I'll do so :)
---
Khetan Gajjar   (!kg1779) * khe...@iafrica.com ; khe...@os.org.za
http://www.os.org.za/~khetan  * Talk/Finger khe...@chain.freebsd.os.org.za
FreeBSD enthusiast* http://www2.za.freebsd.org/
Security-wise, NT is a OS with a "kick me" sign taped to it




Re: Problems in VM structure ?

1999-02-16 Thread Brian Feldman
On Tue, 16 Feb 1999, Matthew Dillon wrote:

> :maxusers 256
> 
> Try reducing maxusers to 128.  Another person reported similar behavior
> to me and after a bunch of work he tried going back to a basic 
> distribution -- and everything started working again.
> 
> It turned out that a maxusers value of 256 and 512 were causing his 
> machine
> to go poof, but a maxusers value of 128 worked fine.
> 
> I haven't tracked the problem down yet.  Please try reducing your maxusers
> to 128 and email the results to current.

For what it's worth, my maxusers is 250 and my system is quite stable, even
during a make -j25 buildworld.

> 
>   -Matt
> 

 Brian Feldman_ __  ___ ___ ___  
 gr...@unixhelp.org   _ __ ___ | _ ) __|   \ 
 http://www.freebsd.org/ _ __ ___  | _ \__ \ |) |
 FreeBSD: The Power to Serve!  _ __ ___  _ |___/___/___/ 




Re: Problems in VM structure ?

1999-02-16 Thread Matthew Dillon
:maxusers   256

Try reducing maxusers to 128.  Another person reported similar behavior
to me and after a bunch of work he tried going back to a basic 
distribution -- and everything started working again.

It turned out that a maxusers value of 256 and 512 were causing his machine
to go poof, but a maxusers value of 128 worked fine.

I haven't tracked the problem down yet.  Please try reducing your maxusers
to 128 and email the results to current.

-Matt




Re: Problems in VM structure ?

1999-02-15 Thread Greg Lehey
On Monday, 15 February 1999 at 18:00:16 -0500, Luoqi Chen wrote:
>> Hi.
>>
>> I saw that my 4-CURRENT box from 8 February dropped to ddb
>> after my last make world. I rebuilt world today, and the
>> same problem is occurring. These problems started occurring
>> after Matt Dillon's changes to the VM system.
>>
>> What is worrying/troubling is that in single user mode,
>> the machine is stable, and manages to build world without
>> a problem. When booted into multi-user mode, it's stable
>> and usable for anywhere from 1 to 3 hours, and then
>> panics. There are no active users on at the time, and the
>> machine is not heavily loaded (0.0-0.2)
>>
>> I suspected a hardware error, so swopped all the RAM from a
>> production machine, and it still produces the same fault.
>>
>> The error is
>> panic: vm_fault: fault on nofault entry, addr : f2572000
>
> This indicates an unmapped struct buf, should be a software bug.
>
>> Debugger ("panic")
>> Stopped at Debuger+0x37: movl $0,in_Debugger
>>
>> When I hit c, I get this :
>>
> Could you type in "bt" next time this happens, and post the result?

It's "t" in ddb, not "bt".  Isn't consistency wonderful?

Khetan, you should also take a dump.  The backtrace is a good start,
but it probably won't be enough to solve the problem.

Greg
--
See complete headers for address, home page and phone numbers
finger g...@lemis.com for PGP public key



Re: Problems in VM structure ?

1999-02-15 Thread Luoqi Chen
> Hi.
> 
> I saw that my 4-CURRENT box from 8 February dropped to ddb
> after my last make world. I rebuilt world today, and the
> same problem is occurring. These problems started occurring
> after Matt Dillon's changes to the VM system.
> 
> What is worrying/troubling is that in single user mode,
> the machine is stable, and manages to build world without
> a problem. When booted into multi-user mode, it's stable
> and usable for anywhere from 1 to 3 hours, and then
> panics. There are no active users on at the time, and the
> machine is not heavily loaded (0.0-0.2)
> 
> I suspected a hardware error, so swopped all the RAM from a 
> production machine, and it still produces the same fault.
> 
> The error is
> panic: vm_fault: fault on nofault entry, addr : f2572000

This indicates an unmapped struct buf, should be a software bug.

> Debugger ("panic")
> Stopped at Debuger+0x37: movl $0,in_Debugger
> 
> When I hit c, I get this :
> 
Could you type in "bt" next time this happens, and post the result?

-lq



Problems in VM structure ?

1999-02-15 Thread Khetan Gajjar
Hi.

I saw that my 4-CURRENT box from 8 February dropped to ddb
after my last make world. I rebuilt world today, and the
same problem is occurring. These problems started occurring
after Matt Dillon's changes to the VM system.

What is worrying/troubling is that in single user mode,
the machine is stable, and manages to build world without
a problem. When booted into multi-user mode, it's stable
and usable for anywhere from 1 to 3 hours, and then
panics. There are no active users on at the time, and the
machine is not heavily loaded (0.0-0.2)

I suspected a hardware error, so swopped all the RAM from a 
production machine, and it still produces the same fault.

The error is
panic: vm_fault: fault on nofault entry, addr : f2572000
Debugger ("panic")
Stopped at Debuger+0x37: movl $0,in_Debugger

When I hit c, I get this :

Syncing disks...

Fatal trap 12 : page fault while in kernel mode
fault virtual address = 0x18
fault code = supervisor read, page not present
instruction pointer = 0x8:0xf0145d98
stack pointer = 0x10:0xf79b97fc
frame pointer = 0x10:0xf79b9810
code segment = base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL=0
current process = 2035 (sendmail)
interrupt mask = net tty bio cam
kernel: type 12 trap, code=0
stopped at softclock+0x48: cmpl %esi,0x8 (%ecx)

Does anyone know what this means ? The machine was idle
at the time - I'm the only user, and it isn't used as a
public access server. The process varies, sometimes httpd,
other times sendmail.

I am using CAM, softupdates and NFS (not heavily though).
I haven't seen anything like this on -current or -hackers,
and searching the mailing lists didn't reveal anything
relevant.

The only "non standard" thing I'm doing is disabling the
vfs reallocblks that was causing the machine to panic
months ago. I re-enabled it, and it still does the same
thing. So, that's not it. I'm doing this by issuing
/sbin/sysctl -w vfs.ffs.doreallocblks=0 
on boot up.

My kernel config and dmesg are listed below.

TIA!

dmesg
-

Copyright (c) 1992-1999 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #0: Mon Feb 15 12:50:24 SAST 1999
r...@chain.freebsd.os.org.za:/usr/src/sys/compile/CHAIN
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 200455997 Hz
CPU: Pentium/P54C (200.46-MHz 586-class CPU)
  Origin = "GenuineIntel"  Id = 0x52c  Stepping=12
  Features=0x1bf
real memory  = 67108864 (65536K bytes)
avail memory = 62050304 (60596K bytes)
Preloaded elf kernel "kernel" at 0xf02c9000.
ccd0-1: Concatenated disk drivers
Probing for devices on PCI bus 0:
chip0:  rev 0x02 on pci0.0.0
ide_pci0:  rev 0xd0 int a irq 14 on pci0.0.1
chip1:  rev 0x01 on pci0.1.0
chip2:  rev 0x00 on pci0.2.0
vga0:  rev 0x53 int a irq 10 on pci0.9.0
de0:  rev 0x11 int a irq 5 on pci0.13.0
de0: SMC 21041 [10Mb/s] pass 1.1
de0: address 00:00:c0:f9:2f:c8
Probing for devices on PCI bus 1:
Probing for PnP devices:
Probing for devices on the ISA bus:
sc0 on isa
sc0: VGA color <16 virtual consoles, flags=0x0>
atkbdc0 at 0x60-0x6f on motherboard
atkbd0 irq 1 on isa
psm0 irq 12 on isa
psm0: model IntelliMouse, device ID 3
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
wdc0 at 0x1f0-0x1f7 irq 14 flags 0xa0ffa0ff on isa
ide_pci: generic_dmainit 01f0:0: warning, IDE controller timing not set
wdc0: unit 0 (wd0): , DMA, 32-bit, multi-block-16
wd0: 1033MB (2116800 sectors), 2100 cyls, 16 heads, 63 S/T, 512 B/S
ide_pci: generic_dmainit 01f0:1: warning, IDE controller timing not set
wdc0: unit 1 (wd1): , DMA, 32-bit, multi-block-8
wd1: 1039MB (2128896 sectors), 2112 cyls, 16 heads, 63 S/T, 512 B/S
wdc1 at 0x170-0x177 irq 15 flags 0xa0ffa0ff on isa
ide_pci: generic_dmainit 0170:0: warning, IDE controller timing not set
wdc1: unit 0 (wd2): , DMA, 32-bit, multi-block-32
wd2: 2015MB (4127760 sectors), 4095 cyls, 16 heads, 63 S/T, 512 B/S
ppc0 at 0x378 irq 7 on isa
ppc0: Winbond chipset (NIBBLE-only) in COMPATIBLE mode
plip0:  on ppbus 0
ppi0:  on ppbus 0
aha0 at 0x330-0x333 irq 11 drq 6 on isa
aha0: AHA-1542CF FW Rev. C.0 (ID=45) SCSI Host Adapter, SCSI ID 7, 16 CCBs
npx0 on motherboard
npx0: INT 16 interface
vga0 at 0x3b0-0x3df maddr 0xa msize 131072 on isa
de0: enabling 10baseT port
Intel Pentium detected, installing workaround for F00F bug
IP packet filtering initialized, divert enabled, rule-based forwarding disabled, logging limited to 100 packets/entry
de0 XXX: driver didn't set ifq_maxlen
lo0 XXX: driver didn't set ifq_maxlen
Waiting 2 seconds for SCSI devices to settle
changing root device to wd1s1a
da0 at aha0 bus 0 target 2 lun 0
da0:  Fixed Direct Access SCSI-2 device 
da0: 3.300MB/s transfers
da0: 516MB (1057616 512 byte sectors: 64H 32S