Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-12-15 Thread Mark Bucciarelli
Hi,

On 2007-05-10 8:40:36 Claudio Jeker wrote:

 With many shortliving connections you have a lot of sockets in TIME_WAIT.
 Because you are testing from one host only, you start to hit these entries
 more and more often; this often results in a retry from the client.

I'm curious what you meant by:

Because you are testing from one host
 only you start to hit these entries more ...

Entries in what?

Why does it matter that the http requests come from the same host?

I'm pushing the stock Apache on 4.2 GENERIC with http_load and can
make the system unresponsive with a rate of 100 new connections/second
(for 20 seconds). For a short period of time (20s?), my ssh console
is unresponsive, and sometimes SSH even times out.  If it comes back,
I can see lots of TCP sockets (1,500+) bound to www in TIME_WAIT.

I'm going to move to lighttpd, but it will have the same issue when
serving lots and lots of small responses.

Do I need to bump kern.somaxconn?

Or are there other avenues I should pursue first?

Or is the test bogus because all connections come from the same IP?
(Host and client are different boxes, connected via the internet.)
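
(For reference, I'm counting those with netstat; the grep pattern is only
illustrative:

# netstat -an -f inet | grep '\.80 ' | grep TIME_WAIT | wc -l

i.e. list all IPv4 sockets, keep the ones on the www port, and count the
ones sitting in TIME_WAIT.)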

Thanks,

m

 START DMESG 
OpenBSD 4.2 (GENERIC) #375: Tue Aug 28 10:38:44 MDT 2007
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Celeron(R) CPU 2.66GHz (GenuineIntel 686-class) 2.67 GHz
cpu0: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID
real mem  = 106412 (1015MB)
avail mem = 1021501440 (974MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 10/05/04, BIOS32 rev. 0 @
0xfd71c, SMBIOS rev. 2.31 @ 0xefa20 (47 entries)
bios0: vendor IBM version 2CKT19AUS date 10/05/2004
bios0: IBM 8085D5U
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
apm0: flags 30102 dobusy 0 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xfd6b0/0x950
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdf00/224 (12 entries)
pcibios0: PCI Interrupt Router at 000:31:0 (Intel 82371FB ISA rev 0x00)
pcibios0: PCI bus #3 is the last bus
bios0: ROM list: 0xc/0xa000! 0xca000/0x1000 0xcb000/0x1000 0xe/0x1!
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 Intel 82865G/PE/P CPU-I/O-1 rev 0x02
vga1 at pci0 dev 2 function 0 Intel 82865G Video rev 0x02: aperture
at 0xf000, size 0x800
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
uhci0 at pci0 dev 29 function 0 Intel 82801EB/ER USB rev 0x02: irq 11
uhci1 at pci0 dev 29 function 1 Intel 82801EB/ER USB rev 0x02: irq 10
uhci2 at pci0 dev 29 function 2 Intel 82801EB/ER USB rev 0x02: irq 5
uhci3 at pci0 dev 29 function 3 Intel 82801EB/ER USB rev 0x02: irq 11
ehci0 at pci0 dev 29 function 7 Intel 82801EB/ER USB2 rev 0x02: irq 3
usb0 at ehci0: USB revision 2.0
uhub0 at usb0: Intel EHCI root hub, rev 2.00/1.00, addr 1
ppb0 at pci0 dev 30 function 0 Intel 82801BA AGP rev 0xc2
pci1 at ppb0 bus 3
vendor Conexant, unknown product 0x2702 (class communications
subclass miscellaneous, rev 0x01) at pci1 dev 0 function 0 not
configured
acx0 at pci1 dev 1 function 0 TI ACX111 rev 0x00: irq 11
acx0: ACX111, radio Radia (0x16), EEPROM ver 5, address 00:0f:b5:4c:91:d7
ATT/Lucent FW322 1394 rev 0x61 at pci1 dev 2 function 0 not configured
fxp0 at pci1 dev 8 function 0 Intel PRO/100 VE rev 0x02, i82562: irq
9, address 00:0d:60:e1:10:85
inphy0 at fxp0 phy 1: i82562ET 10/100 PHY, rev. 0
ichpcib0 at pci0 dev 31 function 0 Intel 82801EB/ER LPC rev 0x02:
24-bit timer at 3579545Hz
pciide0 at pci0 dev 31 function 1 Intel 82801EB/ER IDE rev 0x02:
DMA, channel 0 configured to compatibility, channel 1 configured to
compatibility
wd0 at pciide0 channel 0 drive 0: ST3200822A
wd0: 16-sector PIO, LBA48, 190782MB, 390721968 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: HL-DT-ST, DVDRAM GSA-4082B, A202 SCSI0
5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 4
ichiic0 at pci0 dev 31 function 3 Intel 82801EB/ER SMBus rev 0x02: irq 9
iic0 at ichiic0
adt0 at iic0 addr 0x2e: lm85 rev 0x62
auich0 at pci0 dev 31 function 5 Intel 82801EB/ER AC97 rev 0x02: irq
9, ICH5 AC97
ac97: codec id 0x41445374 (Analog Devices AD1981B)
ac97: codec features headphone, 20 bit DAC, No 3D Stereo
audio0 at auich0
usb1 at uhci0: USB revision 1.0
uhub1 at usb1: Intel UHCI root hub, rev 1.00/1.00, addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2: Intel UHCI root hub, rev 1.00/1.00, addr 1
usb3 at uhci2: USB revision 1.0
uhub3 at usb3: Intel UHCI root hub, rev 1.00/1.00, addr 1
usb4 at uhci3: USB revision 1.0
uhub4 at usb4: Intel UHCI root hub, rev 1.00/1.00, addr 1
isa0 at ichpcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 

Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-12-15 Thread Philip Guenther
On Dec 14, 2007 3:06 PM, Mark Bucciarelli [EMAIL PROTECTED] wrote:
 On 2007-05-10 8:40:36 Claudio Jeker wrote:

  With many shortliving connections you have a lot of sockets in TIME_WAIT.
  Because you are testing from one host only you start to hit these entries
  more and more often this often results in a retry from the client.

 I'm curious what you meant by:

 Because you are testing from one host
  only you start to hit these entries more ...

 Entries in what?

 Why does it matter that the http requests come from the same host?

TCP requires that the four-tuple of remote-IP, remote-port, local-IP,
local-port be unique for each connection.  If your test makes
connections from just one client to just port 80 on one IP of the
server, then three of the four parts of the tuple will all have the
same values.  The only thing that can be different for those
connections will be the port on the client side.  Since the valid port
number range is 1 to 65535 and at least one end of the connection
will have to go through the TIME_WAIT state, that imposes a
theoretical limit of 65535 connections per TIME_WAIT duration.  The
practical rate will be less than that because TCP implementations
generally refuse to use low-numbered ports unless explicitly bound
there.  I believe OpenBSD limits such port assignments via the
net.inet.ip.porthi{first,last} sysctl variables which give you a
default range of only 16384 ports.  Putting that together with the
normal TIME_WAIT period of 2 minutes means that a single OpenBSD
machine connecting to a single port on a server is limited to 136
connections per second on average.
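
To put rough numbers on that (back-of-the-envelope only, using the figures
above): 16384 usable ports / 120 seconds of TIME_WAIT is roughly 136 new
connections per second from one client IP to one server port; even the
full 1-65535 range would only give roughly 546 per second.  On the client
you can inspect and widen whichever ephemeral range it is drawing from
with sysctl(8), e.g. (value illustrative):

# sysctl net.inet.ip.portfirst net.inet.ip.portlast
# sysctl net.inet.ip.porthifirst net.inet.ip.porthilast
# sysctl net.inet.ip.portfirst=10000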


Philip Guenther



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-12-15 Thread Mark Bucciarelli
On 12/15/07, Philip Guenther [EMAIL PROTECTED] wrote:
 On Dec 14, 2007 3:06 PM, Mark Bucciarelli [EMAIL PROTECTED] wrote:
  On 2007-05-10 8:40:36 Claudio Jeker wrote:
 
   With many shortliving connections you have a lot of sockets in TIME_WAIT.
   Because you are testing from one host only you start to hit these entries
   more and more often this often results in a retry from the client.
 
  Why does it matter that the http requests come from the same host?

 I believe OpenBSD limits such port assignments via the
 net.inet.ip.porthi{first,last} sysctl variables which give you a
 default range of only 16384 ports.  Putting that together with the
 normal TIME_WAIT period of 2 minutes means that a single OpenBSD
 machine connecting to a single port on a server is limited to 136
 connections per second on average.

Got it, thanks.  That explains the "operation already in progress"
message from http_load.  :)  I've increased the client port range and
those messages are gone.

I'm noticing that often there are three sockets bound to www port that
end up in a state of CLOSING for nearly ten minutes after running the
test.  (Their send queue is equal to 316.)
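
For what it's worth, I'm just watching them with netstat, roughly like
this (the grep is illustrative):

# netstat -an -f inet | grep CLOSING

which shows the send queue and the local/foreign addresses of each socket
stuck in that state.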

Is it unusual to have such a long timeout?

m



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-12-15 Thread Daniel Ouellet

Mark Bucciarelli wrote:

On 12/15/07, Philip Guenther [EMAIL PROTECTED] wrote:

On Dec 14, 2007 3:06 PM, Mark Bucciarelli [EMAIL PROTECTED] wrote:

On 2007-05-10 8:40:36 Claudio Jeker wrote:


With many shortliving connections you have a lot of sockets in TIME_WAIT.
Because you are testing from one host only you start to hit these entries
more and more often this often results in a retry from the client.

Why does it matter that the http requests come from the same host?

I believe OpenBSD limits such port assignments via the
net.inet.ip.porthi{first,last} sysctl variables which give you a
default range of only 16384 ports.  Putting that together with the
normal TIME_WAIT period of 2 minutes means that a single OpenBSD
machine connecting to a single port on a server is limited to 136
connections per second on average.


Got it, thanks.  That explains the operation already in progress
message from http_load.  :)  I've increased the client port range and
those messages are gone.

I'm noticing that often there are three sockets bound to www port that
end up in a state of CLOSING for nearly ten minutes after running the
test.  (Their send queue is equal to 316.)

Is it unusual to have such a long timeout?

m


Hi Mark,

You're reviving a very old thread here.

Along the way, if you read it carefully, there is a lot of good
information, but I also have to caution you that some of it is plain
wrong and was corrected later in the thread.


Just please make sure not to use everything I put in there blindly.

Depending on your setup, some of it will affect you negatively; some of
what's in that thread was definitely wrong on my part, and some
definitely helps too.


Don't use it blindly without proper testing; I was wrong on some of it.

Best of luck,

Daniel.



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-10 Thread Daniel Ouellet

Ted Unangst wrote:

On 5/9/07, Daniel Ouellet [EMAIL PROTECTED] wrote:

I try to stay safe in my choices and comments are welcome, but I have to
point out as well that ALL the values below needs to be changes to that
new value to get working well. If even only one of them is not at the
level below, the results in the tests start to be affected pretty bad at
times.
net.bpf.bufsize=524288
net.inet.ip.redirect=0


never mind the rest, but these two really make no sense.  none.


Make no sense in the test and improving results, or make no sense in 
setting them as such here?


net.inet.ip.redirect=0

Is to disable ICMP routing redirects. Otherwise, your system could have
its routing table misadjusted by an attacker. Wouldn't it be wise to do
so? Maybe if PF is turned on there is no reason for this, but with PF
on I get drops and need to address that. I haven't pursued that one to
the end yet, however.


As for net.bpf.bufsize, I am looking at my notes and tests again; it's
used by the Berkeley Packet Filter (BPF), which maintains an internal
kernel buffer for storing packets received off the wire.


Yes, in that case it makes sense not to have that here. I redid the tests
with the default value and yes, you are right! This one is wrong here.
Maybe lack of sleep. (; Thanks for correcting me!


I also have to revise my statement on the effect of
net.inet.ip.portfirst=32768. In a series of new tests it doesn't have
the impact noted in the first test runs. So, I would keep it at the
default value as well now. Maybe it was when PF was enabled that it had
more of an impact. But my notes are not clear on that specific one.


Anything else you see that may be questionable in what I sent? I am
doing more tests with different hardware to be sure they are all sane
values in the end.


Otherwise, many thanks for having taken the time to look it over and
give me your feedback on it!


I sure appreciate it big time!

Best

Daniel



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-10 Thread Claudio Jeker
On Thu, May 10, 2007 at 02:31:54AM -0400, Daniel Ouellet wrote:
 Ted Unangst wrote:
 On 5/9/07, Daniel Ouellet [EMAIL PROTECTED] wrote:
 I try to stay safe in my choices and comments are welcome, but I have to
 point out as well that ALL the values below needs to be changes to that
 new value to get working well. If even only one of them is not at the
 level below, the results in the tests start to be affected pretty bad at
 times.
 net.bpf.bufsize=524288
 net.inet.ip.redirect=0
 
 never mind the rest, but these two really make no sense.  none.
 
 Make no sense in the test and improving results, or make no sense in 
 setting them as such here?
 
 net.inet.ip.redirect=0
 
 Is to disable ICMP routing redirects. Otherwise, your system could have 
 its routing table misadjusted by an attacker. Wouldn't be wise to do so? 
 May be if PF is turn on, then there is no reason for this, but with PF 
 ON, I get drop and need to address that. Didn't pursue it yet as dead 
 however.
 

net.inet.ip.redirect has only an effect if you enable
net.inet.ip.forwarding. As you are running a server and not a router I
doubt this is the case. Additionally net.inet.ip.redirect does not modify
the routing table. You are probably looking for net.inet.icmp.rediraccept.

 As for the net.bpf.bufsize, I am looking again in my notes and tests, 
 it's use for Berkeley Packet Filter (BPF), to maintains an internal 
 kernel buffer for storing packets received off the wire.
 
 Yes in that case it make sense not to have that here. I redid the tests 
 with the default value and yes you are right! This one is wrong here. 
 May be lack of sleep. (; Thanks for correcting me!
 
 I also have the revise my statement on the net.inet.ip.portfirst=32768 
 effect. In a series of new tests, it doesn't have the impact noted the 
 first test runs. So, I would keep it as default value as well now. May 
 be it was when PF was enable that I have more of an impact then. But my 
 notes are not clear on that specific one.
 

With many shortliving connections you have a lot of sockets in TIME_WAIT.
Because you are testing from one host only, you start to hit these entries
more and more often; this often results in a retry from the client.
Additionally, by filling all available ports the port allocation algorithm
starts to get slower, but that's a problem that you will only see on
the host :) The accept behaviour of OpenBSD should be fine.

 Anything else you see that may be questionable in what I sent? I am 
 doing more tests with different hardware to be sure it's all sane value 
 in the end.
 
 Other wise many thanks for having taken the time to look it over and 
 give me your feedback on it!
 

I think there are a few knobs that you should reconsider. I will write an
other mail about that.

-- 
:wq Claudio



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-10 Thread Claudio Jeker
On Wed, May 09, 2007 at 06:41:27PM -0400, Daniel Ouellet wrote:
 Hi,
 
 I am passing my finding around for the configuration of sysctl.conf to 
 remove bottleneck I found in httpd as I couldn't get more then 300 httpd 
 process without crapping out badly and above that, the server simply got 
 out of wack.
 

SNIP

 ===
 sysctl.conf changes.
 
 kern.seminfo.semmni=1024
 kern.seminfo.semmns=4096
 kern.shminfo.shmall=16384
 kern.maxclusters=12000

What does netstat -m tell you about the peak usage of clusters? Is it
really that high?

 kern.maxproc=2048   # Increase for the process limits.
 kern.maxfiles=5000
 kern.shminfo.shmmax=67108864

 kern.somaxconn=2048

Is httpd really so slow in accepting sockets that you had to increase this
by a factor of 16? Is httpd actually doing a listen with such a large number?
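
(Side note, from memory: the stock Apache 1.3 caps the backlog it passes
to listen(2) with the ListenBacklog directive, which defaults to 511, so
a kern.somaxconn of 2048 only matters if that directive is raised as well,
e.g. ListenBacklog 2047 in httpd.conf.)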

 net.bpf.bufsize=524288

As tedu@ pointed out this has nothing to do with your setup.

 net.inet.ip.maxqueue=1278

Are you sure you need to tune the IP fragment queue? You are using TCP
which does PMTU discovery and sets the DF flag by default so no IP
fragments should be seen at all unless you borked something else.

 net.inet.ip.portfirst=32768
 net.inet.ip.redirect=0

This has no effect unless you enable forwarding.

 net.inet.tcp.keepinittime=10
 net.inet.tcp.keepidle=30
 net.inet.tcp.keepintvl=30

These values are super aggressive; especially the keepidle and keepintvl
values are doubtful for your test. Is your benchmark using SO_KEEPALIVE? I
doubt that, and so these two values have no effect and are actually
counterproductive (you are sending more packets for idle sessions).

 net.inet.tcp.mssdflt=1452

This is another knob that should not be changed unless you really know
what you are doing. The mss calculation uses this value as a safe default
that is always accepted. Pushing that up to this value may have unpleasant
side effects for people behind IPSec tunnels. The mss used is the max
between mssdflt and the MTU of the route to the host minus the IP and TCP
headers.
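
(Quick arithmetic to illustrate: on a plain Ethernet path with a 1500 byte
MTU the mss works out to 1500 - 20 (IP header) - 20 (TCP header) = 1460,
so mssdflt never comes into play there. 1452 happens to be 1492 - 40, i.e.
a PPPoE-sized MTU; for anything with more overhead, such as an IPSec
tunnel, forcing 1452 as the floor is exactly what causes the trouble
described above.)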

 net.inet.tcp.recvspace=65535
 net.inet.tcp.sendspace=65535
 net.inet.tcp.rstppslimit=400

 net.inet.tcp.synbucketlimit=420
 net.inet.tcp.syncachelimit=20510

If you need to tune the syncache in such extreme ways you should consider
adjusting TCP_SYN_HASH_SIZE and leaving synbucketlimit as is. The
synbucketlimit is there to limit attacks on the hash list by overloading
the bucket list. On your system it may be necessary to traverse 420 nodes
on a lookup. Honestly, the syncachelimit and synbucketlimit knobs are
totally useless. If anything we should allow resizing the hash and
calculate both limits from there.
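
(If I remember right, TCP_SYN_HASH_SIZE lives in sys/netinet/tcp_var.h,
roughly like this; the numbers are quoted from memory, so check your
source tree:

#define TCP_SYN_HASH_SIZE   293
#define TCP_SYN_BUCKET_SIZE 35

The default syncachelimit and synbucketlimit are derived from those two
constants, which is why resizing the hash there and rebuilding the kernel
is the cleaner approach.)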

-- 
:wq Claudio



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-10 Thread Daniel Ouellet
As requested a few times in private to make the results available, here
you go with what works for me. Hope this helps some anyway.


Use what makes sense to you based on your setup, hardware and traffic.

The final values in use after testing are now set as follows for me,
assuming a good amount of memory to allow so many processes to run. I
use a minimum of 2GB; some boxes have 4GB.


Recompile httpd with an upper limit for processes. I put 2048 to allow
more room in the future if needed, but I still want to be safe and limit
the processes lower than that. If php is in use, for example, static
compilation would improve things, but I chose to keep the system as close
to default as possible for many reasons, including maintenance, support
and regular upgrades. Your choice may vary.
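
For OpenBSD's stock Apache 1.3 that presumably means bumping
HARD_SERVER_LIMIT before the build; the path and define are from memory,
so verify against your own tree:

/usr/src/usr.sbin/httpd/src/include/httpd.h:
#define HARD_SERVER_LIMIT 2048

MaxClients in httpd.conf can then be set anywhere up to that value.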


In fstab

A partition for the files used by the sites, mounted with noatime to
avoid updating the last access time for each file. Definitely improves
access times a lot under heavy load!


httpd logs could be on their own partition as well, mounted with softdep
to gain some efficiency in log updates for very busy sites.
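
Something along these lines in /etc/fstab; device names, mount points and
partition layout are purely illustrative:

/dev/wd0g /var/www      ffs rw,nodev,nosuid,noatime 1 2
/dev/wd0h /var/www/logs ffs rw,nodev,nosuid,softdep 1 2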


For httpd.conf
==
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
MinSpareServers 50
MaxSpareServers 100
StartServers 75
MaxClients 768
MaxRequestsPerChild 0


In sysctl.conf
==
# Below are values added to improve performance of httpd after
# testing with http_load under parallel and rate setting.

kern.maxclusters=12000  # The maximum number of mbuf(9) clusters
# that may be allocated.

kern.maxfiles=4096  # The maximum number of open files that
# may be open in the system.

kern.maxproc=2048   # The maximum number of simultaneous
# processes the system will allow.

kern.seminfo.semmni=1024# The maximum number of semaphore
# identifiers allowed.

kern.seminfo.semmns=4096# The maximum number of semaphores
# allowed in the system.

kern.shminfo.shmall=16384   # The maximum amount of total shared
# memory allowed in the system (in
# pages).

kern.shminfo.shmmax=67108864# The maximum shared memory segment size
# (in bytes).

kern.somaxconn=2048 # Upper bound on the number of half-open
# connections a process
# can allow to be associated with a
# socket, using listen(2).

net.inet.ip.maxqueue=1280   # Fragment flood protection. Sets the
# maximum number of
# unassembled IP fragments in the
# fragment queue.

net.inet.tcp.keepidle=30# Time connection must be idle before
# keepalive sent.

net.inet.tcp.keepinittime=10# Used by the syncache to timeout SYN
# request.

net.inet.tcp.keepintvl=30   # Interval between keepalive sent to
# remote machines.

net.inet.tcp.mssdflt=1452   # The maximum segment size that is used
# as default for non-local connections.

net.inet.tcp.recvspace=65535# TCP receive buffer size.

net.inet.tcp.rstppslimit=400# This variable specifies the maximum
# number of outgoing
# TCP RST packets per second.  TCP RST
# packets exceeding
# this value are subject to rate
# limitation and will not go
# out from the node.  A negative value
# disables rate limitation.

net.inet.tcp.sendspace=65535# TCP Send buffer size.

net.inet.tcp.synbucketlimit=420 # The maximum number of entries allowed
# per hash bucket in
# the TCP SYN cache.
net.inet.tcp.syncachelimit=20510# The maximum number of entries
# allowed in the TCP SYN
# cache.
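
Any of these can also be applied to a running system for testing before
being committed to /etc/sysctl.conf, e.g.:

# sysctl kern.somaxconn=2048
# sysctl net.inet.tcp.recvspace=65535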



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-10 Thread Daniel Ouellet

Claudio Jeker wrote:

net.inet.ip.redirect has only an effect if you enable
net.inet.ip.forwarding. As you are running a server and not a router I
doubt this is the case. Additionally net.inet.ip.redirect does not modify
the routing table. Your are probably looking at net.inet.icmp.rediraccept.


More reading in the man pages did the trick on that one and yes, you are
absolutely right. (;


I also have the revise my statement on the net.inet.ip.portfirst=32768 
effect. In a series of new tests, it doesn't have the impact noted the 
first test runs. So, I would keep it as default value as well now. May 
be it was when PF was enable that I have more of an impact then. But my 
notes are not clear on that specific one.




With many shortliving connections you have a lot of sockets in TIME_WAIT.
Because you are testing from one host only you start to hit these entries
more and more often this often results in a retry from the client.
Additionally by filling all available ports the port allocation algorithm
is starting to get slower but that's a problem that you will only see on
the host :) The accept behaviour of OpenBSD should be fine.


I did test it with a few more hosts and as stated, the OpenBSD default 
was right. (; But I appreciate the additional informations! Thanks.


Anything else you see that may be questionable in what I sent? I am 
doing more tests with different hardware to be sure it's all sane value 
in the end.


Other wise many thanks for having taken the time to look it over and 
give me your feedback on it!




I think there are a few knobs that you should reconsider. I will write an
other mail about that.


That sure would be welcome. I would be curious to see what else, or what
differences, you may see. I did lots of tests with different setups, but
I am always happy to see improvements.


I have my somewhat final version done for now and it looks pretty good.
Much better than before, for sure. Now I can enjoy seeing traffic coming
in instead of worrying about complaints. (;


But more improvements and suggestions, with explanations, would be
welcome, as would more understanding on my side anyway.


Many thanks!

Daniel



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-10 Thread Daniel Ouellet

Claudio Jeker wrote:

On Wed, May 09, 2007 at 06:41:27PM -0400, Daniel Ouellet wrote:

Hi,

I am passing my finding around for the configuration of sysctl.conf to 
remove bottleneck I found in httpd as I couldn't get more then 300 httpd 
process without crapping out badly and above that, the server simply got 
out of wack.




SNIP


===
sysctl.conf changes.

kern.seminfo.semmni=1024
kern.seminfo.semmns=4096
kern.shminfo.shmall=16384
kern.maxclusters=12000


What does netstat -m tell you about the peak usage of clusters is it
really that high?


I will do an other series of tests in the next few days and be sure of 
it before putting my foot in my mouth. But at 1, I was getting drops 
in my test setup.



kern.maxproc=2048   # Increase for the process limits.
kern.maxfiles=5000
kern.shminfo.shmmax=67108864



kern.somaxconn=2048


Is httpd really so slow in accepting sockets that you had to increase this
by factor 16? Is httpd actually doing a listen with such a large number?


Yes, I was doing tests using a few clients and pushing the server at 
2000 parallel connections to test with. That was in lab test and in real 
life, I assume that half should be fine. But I wanted to be safe. So, 
place for review on my side.



net.bpf.bufsize=524288


As tedu@ pointed out this has nothing todo with your setup.


Agreed before and was removed after more reading. You are right.


net.inet.ip.maxqueue=1278


Are you sure you need to tune the IP fragment queue? You are using TCP
which does PMTU discovery and sets the DF flag by default so no IP
fragments should be seen at all unless you borked something else.


With smaller queue I was getting slower responses and drop. May be a 
need a better way to verify this situation for a fact.



net.inet.ip.portfirst=32768
net.inet.ip.redirect=0


This has no effect unless you enable forwarding.


Was removed as well.


net.inet.tcp.keepinittime=10
net.inet.tcp.keepidle=30
net.inet.tcp.keepintvl=30


These values are super aggressive especially the keepidle and keepintvl
values are doubtful for your test. Is your benchmark using SO_KEEPALIVE? I
doubt that and so these two values have no effect and are actually
counterproductive (you are sending more packets for idle sessions).


Yes, aggressive I was/am. Keep Alive was/is in use yes. I will have more 
to play with in lab and see if I was to aggressive and look like you 
would think I am. The default value give me not as good results however. 
More tests needed specifically on this and I will do so. May be the 
defaults are fine, I will see if I can find a way to be more objective 
about these values.



net.inet.tcp.mssdflt=1452


This is another knob that should not be changed unless you really know
what you are doing. The mss calculation uses this value as safe default
that is always accepted. Pushing that up to this value may have unpleasant
sideeffects for people behind IPSec tunnels. The used mss is the max
between mssdflt and the MTU of the route to the host minus IP and TCP
header.


I will review and read more on it. I based my changes on results seen 
with the setup under heavy load. There is always place for improvements. 
This gives me more to consider and will do so.



net.inet.tcp.recvspace=65535
net.inet.tcp.sendspace=65535
net.inet.tcp.rstppslimit=400



net.inet.tcp.synbucketlimit=420
net.inet.tcp.syncachelimit=20510


If you need to tune the syncache in such extrem ways you should consider
to adjust TCP_SYN_HASH_SIZE and leave synbucketlimit as is. The
synbucketlimit is here to limit attacks to the hash list by overloading
the bucket list. On your system it may be necessary to traverse 420 nodes
on a lookup. Honestly the syncachelimit and synbucketlimit knob are totaly
useless. If anything we should allow to resize the hash and calculate the
both limits from there.


Interesting! I will retest with that in mind. Didn't see that 
explication in my reading so far. Thanks for this!


You are most helpful and this gives me something to research further; I
sure appreciate your time in passing along the information.


Looks like a few more days of testing needed.

Many thanks!

Daniel



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-10 Thread Daniel Ouellet

Claudio Jeker wrote:

===
sysctl.conf changes.

kern.seminfo.semmni=1024
kern.seminfo.semmns=4096
kern.shminfo.shmall=16384
kern.maxclusters=12000


What does netstat -m tell you about the peak usage of clusters is it
really that high?


You are right again! (;

# netstat -m
14140 mbufs in use:
1098 mbufs allocated to data
12527 mbufs allocated to packet headers
515 mbufs allocated to socket names and addresses
585/694/4096 mbuf clusters in use (current/peak/max)
4976 Kbytes allocated to network (94% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

I was not looking at the right place. Back to default value.

Thanks for the help!

Daniel



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Otto Moerbeek
On Wed, 9 May 2007, Daniel Ouellet wrote:

 Otto Moerbeek wrote:
   Where are the OS bottlenecks that I can maybe improve here?
  
  Look at the memory usage. 300 httpd processes could take up 3000M
  easily, especially with stuff like php. In that case, the machine
  starts swapping and you hit the roof. As a general rule, do not allow
  more httpd processes than your machine can handle without swapping. Also,
  a long KeepAliveTimeout can work against you, by holding slots.
 
 Thanks Otto,
 
 I am still doing tests and tweak, but as far as swap, I checked that and same
 for keep alive in httpd.conf and I even changed it in:
 
 net.inet.tcp.keepinittime=10
 net.inet.tcp.keepidle=30
 net.inet.tcp.keepintvl=30

These parameters do not have a lot to do with what you are seeing.

I was talking about the KeepAliveTimeout of apache. It's by default
15s. With a long timeout, any process that has served a request will
wait 15s to see if the client issues more requests on the same
connection before it becomes available to serve other requests. For
more details, see
http://httpd.apache.org/docs/1.3/mod/core.html#keepalivetimeout
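
So with the default, i.e. effectively

KeepAliveTimeout 15

in httpd.conf, each child can sit idle for up to 15 seconds per client
before it is free again; lowering the value directly shrinks that window.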

 
 For testing only. I am not saying the value above are any good, but I am
 testing multiple things and reading a lot on sysctl and what each one does.
 
 KeepAliveTimeout is at 5 seconds.

Try lowering it even more.

 
 No swapping is happening, even with 1000 httpd running.
 
 load averages: 123.63, 39.74, 63.3285  01:26:47
 1064 processes:1063 idle, 1 on processor
 CPU states:  0.8% user,  0.0% nice,  3.1% system,  0.8% interrupt, 95.4% idle
 Memory: Real: 648M/1293M act/tot  Free: 711M  Swap: 0K/4096M used/tot



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Daniel Ouellet

Here is more tests with always repeated results.

I increased the number of concurrent connections by only 5, from 305 to
310, and got a 3 times slower response for exactly the same workload,
repeated every time. Very consistent, and from different clients as well.


You can do any variation of 10 to 300 connections and you will always 
get the same results, or very close to it. See that at the end as well 
for proof.


So, I know I am hitting a hard limit someplace, but can't find where.

Note that I use a difference of 5 here, but I can reproduce the results
almost every time just by increasing the number of connections by 1.
Going from 307 to 308, 75% of the time I get the same results as below,
meaning at times it's 6.7 seconds for the same transfer and at other
times 18.1 seconds.


See below. Always the same transfer size, always the same amount of 
requests, always 100% success, but 3x slower.


Also, if I continue to increase it more, then I start to also get drop 
in replies, etc.


So far I have played with 26 different sysctl settings that may affect
this, based on various possibilities from the man pages and Google, but
I can only improve it somewhat, not to the point of being able to use
500 connections or more, for example.


What is it that really limits the number of connections that badly and
that hard?


===
305 parallel

# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.71609 seconds
13098 mean bytes/connection
74.4481 fetches/sec, 975121 bytes/sec
msecs/connect: 1813.57 mean, 6007.53 max, 0.418 min
msecs/first-response: 509.309 mean, 1685.92 max, 3.606 min
HTTP response codes:
  code 200 -- 500
# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.8586 seconds
13098 mean bytes/connection
72.9012 fetches/sec, 954860 bytes/sec
msecs/connect: 1957.35 mean, 6007.17 max, 0.445 min
msecs/first-response: 485.676 mean, 1559.27 max, 3.317 min
HTTP response codes:
  code 200 -- 500
# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.81823 seconds
13098 mean bytes/connection
73.3328 fetches/sec, 960513 bytes/sec
msecs/connect: 1825.19 mean, 6007.11 max, 0.484 min
msecs/first-response: 508.281 mean, 1646.53 max, 3.422 min
HTTP response codes:
  code 200 -- 500

=
310 parallel

# http_load -parallel 310 -fetches 500 -timeout 30 /tmp/test
500 fetches, 310 max parallel, 6.549e+06 bytes, in 18.0998 seconds
13098 mean bytes/connection
27.6245 fetches/sec, 361826 bytes/sec
msecs/connect: 2281.39 mean, 18008.3 max, 0.434 min
msecs/first-response: 456.326 mean, 1555.78 max, 3.328 min
HTTP response codes:
  code 200 -- 500
# http_load -parallel 310 -fetches 500 -timeout 30 /tmp/test
500 fetches, 310 max parallel, 6.549e+06 bytes, in 18.1142 seconds
13098 mean bytes/connection
27.6027 fetches/sec, 361540 bytes/sec
msecs/connect: 2245.47 mean, 18011.4 max, 0.565 min
msecs/first-response: 460.068 mean, 1495.42 max, 3.32 min
HTTP response codes:
  code 200 -- 500
# http_load -parallel 310 -fetches 500 -timeout 30 /tmp/test
500 fetches, 310 max parallel, 6.549e+06 bytes, in 18.1635 seconds
13098 mean bytes/connection
27.5278 fetches/sec, 360559 bytes/sec
msecs/connect: 2485.7 mean, 18011.9 max, 0.598 min
msecs/first-response: 455.163 mean, 1573.78 max, 3.471 min
HTTP response codes:
  code 200 -- 500
#

===
10 parallel
# http_load -parallel 10 -fetches 500 -timeout 30 /tmp/test
500 fetches, 10 max parallel, 6.549e+06 bytes, in 6.01266 seconds
13098 mean bytes/connection
83.1579 fetches/sec, 1.0892e+06 bytes/sec
msecs/connect: 24.6605 mean, 6002.47 max, 0.349 min
msecs/first-response: 28.6373 mean, 798.5 max, 3.23 min
HTTP response codes:
  code 200 -- 500

==
20 parallel
# http_load -parallel 20 -fetches 500 -timeout 30 /tmp/test
500 fetches, 20 max parallel, 6.549e+06 bytes, in 7.12896 seconds
13098 mean bytes/connection
70.1365 fetches/sec, 918648 bytes/sec
msecs/connect: 48.676 mean, 6003.58 max, 0.342 min
msecs/first-response: 58.1521 mean, 1249.71 max, 3.216 min
HTTP response codes:
  code 200 -- 500


===
50 parallel
# http_load -parallel 50 -fetches 500 -timeout 30 /tmp/test
500 fetches, 50 max parallel, 6.549e+06 bytes, in 8.00917 seconds
13098 mean bytes/connection
62.4285 fetches/sec, 817688 bytes/sec
msecs/connect: 84.686 mean, 6003.49 max, 0.418 min
msecs/first-response: 174.045 mean, 1950.98 max, 3.349 min
HTTP response codes:
  code 200 -- 500



100 parallel
# http_load -parallel 100 -fetches 500 -timeout 30 /tmp/test
500 fetches, 100 max parallel, 6.549e+06 bytes, in 7.90241 seconds
13098 mean bytes/connection
63.2718 fetches/sec, 828735 bytes/sec
msecs/connect: 72.8683 mean, 6003.78 max, 0.417 min
msecs/first-response: 379.736 mean, 1964.26 max, 3.366 min
HTTP response codes:
  code 200 -- 500




Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Srebrenko Sehic

On 5/9/07, Daniel Ouellet [EMAIL PROTECTED] wrote:


I increase the number of contiguous connection only by 5, from 305 to
310, and you get 3 times slower response for always the same thing and
repeated all the time. Very consistent and from different clients as well.

You can do any variation of 10 to 300 connections and you will always
get the same results, or very close to it. See that at the end as well
for proof.

So, I know I am hitting a hard limit someplace, but can't find where.


You've assumed that Apache is the bottleneck, but perhaps your
benchmark tool could be limited in some way. I suggest you try with
apache benchmark or some other tool just to verify the results.

Apache (especially in the prefork model) is known to have concurrency
issues. I doubt that there are knobs you can twist OpenBSD-wise that
will compensate for Apache and somehow magically make it scale.
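
For example, the ab tool that comes with Apache can run roughly the same
test (command line illustrative):

# ab -n 500 -c 305 http://your.server/test.html

Comparing its numbers with http_load's should show whether the knee
around 305 connections comes from the server or from the client tool.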



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Daniel Ouellet

Srebrenko Sehic wrote:

On 5/9/07, Daniel Ouellet [EMAIL PROTECTED] wrote:


I increase the number of contiguous connection only by 5, from 305 to
310, and you get 3 times slower response for always the same thing and
repeated all the time. Very consistent and from different clients as 
well.


You can do any variation of 10 to 300 connections and you will always
get the same results, or very close to it. See that at the end as well
for proof.

So, I know I am hitting a hard limit someplace, but can't find where.


You've assumed that Apache is the bottleneck, but perhaps your
benchmark tool could be limited in some way. I suggest you try with
apache benchmark or some other tool just to verify the results.

Apache (especially in the prefork model) is known to have concurrency
issues. I doubt that there are knobs you can twist OpenBSD-wise that
will compensate for Apache and somehow magically make it scale.


Actually I have found a few things that fix it tonight.

I spent the last 24 hours reading like crazy and all night testing and
reading more.


I can now have two clients using 1000 parallel connections to one i386 
850MHz server, my old one that I was testing with and I get all that no 
problem now. No delay and I can even push it more, but I figure at 2000 
parallel connections I should be able to get some breathing time now.


I will send the results soon.

All only in sysctl.conf

Now, I am still having some drops, not many, but some, when I put pf
into action. So, that would be the next step I guess, but not now. I
need some sleep.


Thanks

Daniel



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Douglas Allan Tutty
On Wed, May 09, 2007 at 01:30:41AM -0400, Daniel Ouellet wrote:
 No swapping is happening, even with 1000 httpd running.
 
 load averages: 123.63, 39.74, 63.3285  01:26:47
 1064 processes:1063 idle, 1 on processor
 CPU states:  0.8% user,  0.0% nice,  3.1% system,  0.8% interrupt, 95.4% 
 idle
 Memory: Real: 648M/1293M act/tot  Free: 711M  Swap: 0K/4096M used/tot
 

How does this server do with 1000 non-httpd processes running?  Perhaps
I need a newer Nemeth et al, but in my 3rd edition, pg 759, middle of the
page, it says "Modern systems do not deal well with load averages over
about 6.0."

Could your bottleneck be in context-switching between so many processes?
With so many, the memory cache will be faulting during the context
switching and have to be retrieved from main memory.  I don't think that
such slow-downs appear in top, and I don't know about vmstat.  I don't
know if there's a tool to measure this on i386.
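
One rough way to watch it, if I have the columns right, is vmstat, e.g.:

# vmstat 1

and look at the cs column (context switches per second); systat(1) shows
the same counters interactively.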

I've never run httpd but it looks to me like a massively parallelized
problem where each connection is trivial to serve (hence low CPU usage,
no disk-io waiting) but there are just so many of them.

How does the server do with other connection services, e.g. pop or ftp?

Doug.



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Karsten McMinn

On 5/9/07, Daniel Ouellet [EMAIL PROTECTED] wrote:

I can now have two clients using 1000 parallel connections to one i386
850MHz server, my old one that I was testing with and I get all that no
problem now. No delay and I can even push it more, but I figure at 2000
parallel connections I should be able to get some breathing time now.


I've spent considerable time with tuning apache on openbsd to
consume all available resources in OpenBSD. Here's the
relevant httpd.conf sections:

Timeout 300
KeepAlive On
MaxKeepAliveRequests 5000
KeepAliveTimeout 15

MinSpareServers 20
MaxSpareServers 30
StartServers 50
MaxClients 5000
MaxRequestsPerChild 0

I had statically compiled php into my httpd binary and obviously
raised HARD_LIMIT to 5000, using OpenBSD's apache.

This netted me an ability to serve about a max of 3000
requests per second on a 1.6ghz athlon with 256MB of memory.

hth.



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Daniel Ouellet

Douglas Allan Tutty wrote:

On Wed, May 09, 2007 at 01:30:41AM -0400, Daniel Ouellet wrote:

No swapping is happening, even with 1000 httpd running.

load averages: 123.63, 39.74, 63.3285  01:26:47
1064 processes:1063 idle, 1 on processor
CPU states:  0.8% user,  0.0% nice,  3.1% system,  0.8% interrupt, 95.4% 
idle

Memory: Real: 648M/1293M act/tot  Free: 711M  Swap: 0K/4096M used/tot



How does this server do with 1000 non-httpd processes running?  Perhaps
I need a newer Nemeth et al, but in my 3rd edition, pg 759 middle of the
page says Modern systems do not deal welll with load averages over
about 6.0.


Be careful when reading these numbers here. Don't forget that I am doing
this in the lab, with abuse, etc. I am trying to push the server as much
as I can here. In production I do see some servers reaching 10, 18, and
sometimes I saw up to 25, but all these were extreme cases; most of the
time it's below 10.


I can't answer this question with proper knowledge here as I don't 
pretend to know that answer. May be someone else can speak knowingly 
about it?



Could your bottleneck be in context-switching between so many processes?
With so many, the memory cache will be faulting during the context
switching and have to be retreived from main memory.  I don't think that
such slow-downs appear in top, and I don't know about vmstat.  I don't
know if there's a tool to measure this on i386.


It wasn't. However, yes, there is, and I can see faulting. I checked both
vmstat and iostat to see what's up. Obviously the numbers are higher on
older hardware as it runs out of horsepower. But the problem was being
able to handle more than 300 parallel connections, and why it got 3x
slower when only 2 more processes were added. So, no, I don't think
context-switching had anything to do with it here.


You will see when I post the changes I did and the test I did. Some are 
surprising!



I've never run httpd but it looks to me like a massivly parralized
problem where each connection is trivial to serve (hense low CPU usage,
no disk-io waiting) but there are just so many of them.  


On multi-core and multi-processor hardware with proper memory it
shouldn't be a problem, I think, but I will know soon!



How does the server do with other connection services, e.g. pop or ftp?


I only run one application per server, always did and most likely always
will. So, any mail server is a mail server, and a web server is only a
web server here anyway. Even the DNS boxes are only running DNS, etc.




Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Daniel Ouellet

Karsten McMinn wrote:

On 5/9/07, Daniel Ouellet [EMAIL PROTECTED] wrote:

I can now have two clients using 1000 parallel connections to one i386
850MHz server, my old one that I was testing with and I get all that no
problem now. No delay and I can even push it more, but I figure at 2000
parallel connections I should be able to get some breathing time now.


I've spent considerable time with tuning apache on openbsd to
consume all available resources in OpenBSD. Here's the
relevant httpd.conf sections:


Thanks. My configuration is more aggressive than yours and I can tell
you for a fact that the problem and limitations were not in the httpd
configuration, but in the OS part, in my case anyway.


Some of your values, I think, would/could crash your system. Especially the:

MaxKeepAliveRequests 5000
MaxClients 5000

I don't think you could reach that high. Why? Simply from a memory usage
standpoint. That was my next exploration, but it's possible that one
apache process could take as much as 11MB:


 6035 www20   11M 9392K sleepnetcon   0:56  0.00% httpd

Obviously not all processes would use that much. The question really
depends on content. If it's small images, and lots of them, then each
process uses less memory. But if it is serving all big files, then it's
possible to use a good amount of memory per process. Now, I don't have
that answer here and I am not sure how to come up with some logic on
that, but even if each process were using only 1MB, then 5000 would give
you 5GB of RAM, which is more than what OpenBSD was supporting until not
so long ago, so you will start to swap and god knows what will happen
then.


So, I think these two values are not realistic or safe to use.


Timeout 300
KeepAlive On
MaxKeepAliveRequests 5000
KeepAliveTimeout 15


I use KeepAliveTimeout 5 and I am considering reducing it.

If you think about your suggestion here, you have KeepAliveTimeout 15
and then MaxKeepAliveRequests 5000; don't you see the paradox here?


If your server is really busy, with lots of images on one page for
example, then you would have a lot of processes stuck in the
KeepAliveTimeout stage; that's why you most likely increased your
MaxClients to 5000 to compensate for that, but that's wrong, I believe.
It makes your server use more resources and be slower to react.


I use a logic here for the value on how to fix it.

MaxKeepAliveRequests, I think, should be set based on how many additional
requests a browser that supports keep alive and multiple requests at once
could make for one URL. How many? Well, I think it's based on how many
elements your web page can have. That's the idea here, isn't it? Many
browsers will call the URL and, when images for example are on that page,
they will fire up additional requests to the web server. So, in theory,
the maximum number of requests you should allow should be the maximum
possible number of elements one page could have on it, no? Assuming a
user can click a few pages in a few seconds, I think anything above 1000
is not good. I could be wrong, but that's how I see it. I use 250 and it
serves me well and helps protect the server from abuse from one source.
I have some setups that allow a max of 100 here for MaxKeepAliveRequests.
But I would think that 1000 should be plenty and more than that may not
be good. Unless my thinking above is wrong?
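
Put as a concrete httpd.conf fragment, the shape of what I am arguing for
above would be roughly this (the numbers are just the ones from my own
setups, not a recommendation):

KeepAlive On
MaxKeepAliveRequests 250
KeepAliveTimeout 5
MaxClients 768

i.e. the per-connection request cap stays well below the total client cap.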


I can do more tests on that to know more obviously.

For testing reason in my lab I put MaxKeepAliveRequests 0, but I 
wouldn't use that in production for sure.


Your value may be good, I just think not, but that's open to discussion.

One thing for sure: having the same number for MaxKeepAliveRequests and
for MaxClients is wrong, I think, as you open yourself up to one attacker
from one source bringing your server down and hogging it all for himself.
I believe that MaxKeepAliveRequests should definitely be lower than your
MaxClients, not the same, for sure.



MinSpareServers 20
MaxSpareServers 30
StartServers 50


I also think that if you want to run such a busy server you should have
more StartServers and, for sure, a bigger margin between the min and max,
as otherwise it will constantly kill processes and start new ones, which
means you fork a lot, and forking is a pretty slow and costly operation.
Again, here I use some logic and base it on what the traffic is like. If
you allow multiple requests per connection, wouldn't it make sense to be
sure that you could serve that connection and all its requests without
having to fork new processes? Meaning, if you have 50 elements on your
page, then it's possible that some browser will send you 50 requests, so
why not make sure you have a minimum of 50 processes to serve them?
Again, that's logical to me. I have some setups where I keep a minimum of
50 spares and a maximum of 100 spares. Not always, but in some cases yes.
But it's better than the default for sure. (;



MaxClients 5000


Too high, I think, based on the above explanations.

Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Daniel Ouellet

Hi,

I am passing my findings around for the configuration of sysctl.conf to
remove a bottleneck I found in httpd, as I couldn't get more than 300
httpd processes without crapping out badly; above that, the server simply
got out of whack.


Everything is a default install and the tests are done with a server
that is an old one; dmesg at the end in case you are interested. This is
on OpenBSD 4.0 and I picked that server just to see what's possible, as
it's not really a very powerful one.


You can also see the iostat output and the vmstat as well with the 
changes in place.


You sure can see a few page faults as I am really pushing the server
hard, but even then I get decent results and the bottleneck was removed,
even with 2000 parallel connections. In that case I had to use two
different clients, as http_load only supports up to 1021 parallel
connections, so to test past that I used more than one client to push
the server more.


But all in all, the results are much better than a few days ago; now it
looks like we get more for the buck, and adding more powerful hardware
will be put to better use instead of suffering the same limitations.


I put also the value changed in sysctl.conf to come to this final setup.

I am not saying these values are the best possible choice, but they work
well in the test situation, and there are many tests as you will see.
Some are very surprising to me, like the change in net.inet.ip.portfirst.
Yes I know, but if I leave it at the default I can't get full success in
the tests below; I get timeouts, some errors, and efficiency is not as
good. Maybe that's because of the random port range calculations, I can't
say, but in any case the effect is there and tested.


I try to stay safe in my choices and comments are welcome, but I have to
point out as well that ALL the values below need to be changed to the
new values to get things working well. If even one of them is not at the
level below, the results in the tests start to be affected pretty badly
at times.


So it's not that only one value needs to be changed to address the
issues, but ALL of them below.


I am still working on finding maybe more restrictive values to keep the
system as stable, safe and close to the default as possible, but below
is a very good setup in my tests, and all the results are below as well.


As for the value in httpd.conf, they are still in progress to make them 
more normal, but for this test they are:


Timeout 300
KeepAlive On
MaxKeepAliveRequests 0 (shouldn't stay like this as limits needs to be 
in place)

KeepAliveTimeout 5
MinSpareServers 40
MaxSpareServers 80
StartServers 40
MaxClients 2048
MaxRequestsPerChild 0

Also, the httpd use .so module like php and is not compile statically.

For the value above, I think a more reasonable (still in progress as 
well) would be for a very busy server:


Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
MinSpareServers 50
MaxSpareServers 100
StartServers 75
MaxClients 768
MaxRequestsPerChild 0

However, I am not fully settled on them yet. I sent an earlier email
with an explanation of why some values should be picked:


http://marc.info/?l=openbsd-misc&m=117874246431437&w=2

Any comments on any parts or caution I have overlooked?

Thanks and hope this help some others that may suffer from the same 
problem I did.


Daniel

===
sysctl.conf changes.

kern.seminfo.semmni=1024
kern.seminfo.semmns=4096
kern.shminfo.shmall=16384
kern.maxclusters=12000
kern.maxproc=2048   # Increase for the process limits.
kern.maxfiles=5000
kern.shminfo.shmmax=67108864
kern.somaxconn=2048
net.bpf.bufsize=524288
net.inet.ip.maxqueue=1278
net.inet.ip.portfirst=32768
net.inet.ip.redirect=0
net.inet.tcp.keepinittime=10
net.inet.tcp.keepidle=30
net.inet.tcp.keepintvl=30
net.inet.tcp.mssdflt=1452
net.inet.tcp.recvspace=65535
net.inet.tcp.rstppslimit=400
net.inet.tcp.sendspace=65535
net.inet.tcp.synbucketlimit=420
net.inet.tcp.syncachelimit=20510



===
Tests with multiple parallel connections, from 10 to 1000. As expected,
the results get better as we go, and I was able to go up to 2000, but I
limited the server at 2048 in the recompiled version. At 2000 I get close
to 2x the delay, meaning it starts to go back up before that, but it
still completes fully without errors in less than the timeout of 30
seconds, which I couldn't do before at even 300 parallel connections.



# http_load -parallel 10 -fetches 1500 -timeout 30 /tmp/test
1500 fetches, 10 max parallel, 1.9647e+07 bytes, in 19.8742 seconds
13098 mean bytes/connection
75.4747 fetches/sec, 988568 bytes/sec
msecs/connect: 84.6428 mean, 6003.03 max, 0.347 min
msecs/first-response: 17.6985 mean, 1698.64 max, 3.236 min
HTTP response codes:
  code 200 -- 1500
# http_load -parallel 20 -fetches 1500 -timeout 30 /tmp/test
1500 fetches, 20 max parallel, 1.9647e+07 bytes, in 20.824 seconds
13098 mean bytes/connection
72.0324 fetches/sec, 943480 bytes/sec
msecs/connect: 

Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Marcos Laufer
Daniel,

Try the same test with this changes

Timeout 60
KeepAlive Off

If my guess is right, you'll notice big improvement.
Tell me how it goes

Marcos Laufer

- Original Message - 
From: Daniel Ouellet [EMAIL PROTECTED]
To: misc@openbsd.org
Sent: Wednesday, May 09, 2007 7:41 PM
Subject: Re: Bottleneck in httpd. I need help to address capacity issues on
max parallel and rate connections


Hi,

I am passing my finding around for the configuration of sysctl.conf to
remove bottleneck I found in httpd as I couldn't get more then 300 httpd
process without crapping out badly and above that, the server simply got
out of wack.

All is default install and the tests are done with a server that is an
old one. dmesg at the end in case you are interested. This is on OpenBSD
4.0 and I pick that server just to see what's possible as it's not
really a very powerful one.

You can also see the iostat output and the vmstat as well with the
changes in place.

You sure can see a few page fault as I am really pushing the server
much, but even then I get decent results and the bottleneck was remove,
even with 2000 parallel connections. In that case I had to use two
different clients as the http_load only support up to 1021 parallel
connections, so to test pass that, I use more then one clients to push
the server more.

But in all, the results are much better then a few days ago and now
looks like we get more for the buck and adding more powerful hardware
will be use better now instead of suffering the same limitations.

I put also the value changed in sysctl.conf to come to this final setup.

I am not saying the value are the best possible choice, but they work
well in the test situation and there is many as you will see. Some are
very surprising to me, like the change in net.inet.ip.portfirst. Yes I
know, but if I leave it as default, then I can't get full success in the
test below and get time out, some errors and efficiency is not as good.
May be that's because of the random ports range calculations, I can't
say, but in any case, the effect is there and tested.

I try to stay safe in my choices and comments are welcome, but I have to
point out as well that ALL the values below needs to be changes to that
new value to get working well. If even only one of them is not at the
level below, the results in the tests start to be affected pretty bad at
times.

So, not only one value needs to be changed or address the issues, but
ALL of them below.

I am still working on finding may be more restrictive value to keep the
system as stable and safe and close to the default as possible, but
below is a very good setup in y tests and all the results are below as well.

As for the value in httpd.conf, they are still in progress to make them
more normal, but for this test they are:

Timeout 300
KeepAlive On
MaxKeepAliveRequests 0 (shouldn't stay like this as limits needs to be
in place)
KeepAliveTimeout 5
MinSpareServers 40
MaxSpareServers 80
StartServers 40
MaxClients 2048
MaxRequestsPerChild 0

Also, the httpd use .so module like php and is not compile statically.

For the value above, I think a more reasonable (still in progress as
well) would be for a very busy server:

Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
MinSpareServers 50
MaxSpareServers 100
StartServers 75
MaxClients 768
MaxRequestsPerChild 0

However, I am not fully settled on them yet. I sent an earlier email with an
explanation of why some of the values should be picked:

http://marc.info/?l=openbsd-misc&m=117874246431437&w=2

Any comments on any part, or cautions I have overlooked?

Thanks, and I hope this helps others who may suffer from the same
problem I did.

Daniel

===
sysctl.conf changes.

kern.seminfo.semmni=1024
kern.seminfo.semmns=4096
kern.shminfo.shmall=16384
kern.maxclusters=12000
kern.maxproc=2048   # Increase for the process limits.
kern.maxfiles=5000
kern.shminfo.shmmax=67108864
kern.somaxconn=2048
net.bpf.bufsize=524288
net.inet.ip.maxqueue=1278
net.inet.ip.portfirst=32768
net.inet.ip.redirect=0
net.inet.tcp.keepinittime=10
net.inet.tcp.keepidle=30
net.inet.tcp.keepintvl=30
net.inet.tcp.mssdflt=1452
net.inet.tcp.recvspace=65535
net.inet.tcp.rstppslimit=400
net.inet.tcp.sendspace=65535
net.inet.tcp.synbucketlimit=420
net.inet.tcp.syncachelimit=20510
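
A minimal sketch of trying these on a running system first (assuming the stock
sysctl(8) behaviour; /etc/sysctl.conf itself is only read at boot, and very old
releases may want the -w flag):

# sysctl kern.somaxconn=2048
kern.somaxconn: 128 -> 2048
# sysctl net.inet.ip.portfirst net.inet.ip.portlast

The first command changes a single value on the live kernel and echoes the
old -> new values; the second prints the current ephemeral port range before
deciding on a portfirst change. Once settled, the var=value lines above go
into /etc/sysctl.conf so they survive a reboot.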



===
Tests with multiple parallel connections, from 10 to 1000. As expected,
the results get better as we go, and I was able to go up to 2000, but I
limited the server to 2048 in the recompiled version. At 2000 I get close
to 2x the delay, meaning it starts to go back up before that, but the runs
still complete fully, without errors, in less than the 30-second
timeout, which I couldn't do before at even 300 parallel connections.


# http_load -parallel 10 -fetches 1500 -timeout 30 /tmp/test
1500 fetches, 10 max parallel, 1.9647e+07 bytes, in 19.8742 seconds
13098 mean bytes/connection
75.4747 fetches/sec, 988568 bytes/sec

Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Daniel Ouellet

Marcos Laufer wrote:

Daniel,

Try the same test with these changes

Timeout 60
KeepAlive Off

If my guess is right, you'll notice a big improvement.
Tell me how it goes


Neither applies to the issue that was at hand. Timeout 60, or 300 as in
this case, has nothing to do with the connection rate or limit; in
some cases where processing in php scripts takes a long time, setting
less than Timeout 60 will stop the script from finishing. Also, Timeout
is the time the server will wait for an answer on the client side. The issue
here is not a lack of reply, or a delay in it. See:


http://httpd.apache.org/docs/1.3/mod/core.html#timeout

For more details.

As for KeepAlive Off, that would simply increase the number of connections
required to the server, which would have the opposite effect of helping.


http://httpd.apache.org/docs/1.3/mod/core.html#keepalive

I appreciate you looking at it, but that really has nothing to do with
the problem as it was described and demonstrated.


Thanks

Daniel



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-09 Thread Ted Unangst

On 5/9/07, Daniel Ouellet [EMAIL PROTECTED] wrote:

I tried to stay safe in my choices, and comments are welcome, but I have to
point out as well that ALL the values below need to be changed to the
new values to get this working well. If even one of them is not at the
level shown below, the test results start to be affected pretty badly at
times.
net.bpf.bufsize=524288
net.inet.ip.redirect=0


never mind the rest, but these two really make no sense.  none.



Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet
I am trying to improve my performance and fix my problem with httpd, but it
looks like I am hitting a ceiling regardless of whether I test in the lab
using an old 850MHz i386 or a new AMD64 at 1.6GHz. Both have 2GB of RAM, so
that much is the same on both. I can't pass more than ~300 to 325 simultaneous
httpd processes, and timeouts jump way up.


So I guess maybe the limit is in the connection handling of the TCP
stack, more than in httpd itself. But I am at a loss as to where to
look. Tested on both 4.1 and 3.9 just to see.


Where are the OS bottlenecks that I could maybe improve here?

Please read for more details and more can be provided as well.

I need some help, as I even went as far as ordering 4x X4100s with 2x dual
core 2.4GHz processors and 2x 10K SAS drives in them, with 8GB of RAM as
well, so 4GB per processor, and I am afraid I will hit the same limitations.
There isn't any reason that I shouldn't be able to get past these limits.


I don't have the new Suns yet, maybe a week before I have them, but I am
trying to get ahead of the setup to fix my problem and test in the lab. It
really is a capacity issue, and it looks like throwing more powerful hardware
at it will not fix it.


I have:

# sysctl kern.maxproc
kern.maxproc=2048

Both also have noatime set on the partition that the web files come
from, and I even send the httpd logs to /dev/null to be sure it's not
the log writing that slows it down.


I use http_load to test my configuration and changes, but I am not
successful at improving it further. It looks like connections are timing out,
and I can't get more than ~300 httpd processes serving. Yes, I have
also increased the hard limit of 250 and recompiled httpd to allow more,
and I can start 1500 httpd processes if I want and they do run, but
they do not appear to serve traffic, and I am still getting timeouts.
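
For reference, a sketch of the kind of rebuild meant here: Apache 1.3 has a
compiled-in HARD_SERVER_LIMIT (256 in the vanilla sources) and caps MaxClients
at it, so the define has to be raised at build time. The commands below are
illustrative for the stock 1.3 source tree, not the exact procedure for
OpenBSD's bundled httpd:

# env CFLAGS="-DHARD_SERVER_LIMIT=2048" ./configure
# make && make install

After that, MaxClients values up to 2048 are honoured in httpd.conf.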


Even if I start 2500 httpd processes via StartServers, to be sure I don't run
out and that starting additional ones is not the limit here, I can't
get more than about ~300 successful parallel connections at once with a
decent timeout:


# http_load -parallel 500 -fetches 2500 -timeout 20 /tmp/www2
2500 fetches, 500 max parallel, 1.25616e+07 bytes, in 41.815 seconds
5024.62 mean bytes/connection
59.7872 fetches/sec, 300408 bytes/sec
msecs/connect: 1868.76 mean, 18014 max, 0.597 min
msecs/first-response: 2741.85 mean, 19968.2 max, 4.005 min
345 timeouts
345 bad byte counts
HTTP response codes:
  code 200 -- 2155


# http_load -parallel 500 -fetches 2500 -timeout 60 /tmp/www2
http://www2.netcampaign.com/: byte count wrong
http://www2.netcampaign.com/: byte count wrong
2500 fetches, 500 max parallel, 1.37498e+07 bytes, in 42.3446 seconds
5499.91 mean bytes/connection
59.0394 fetches/sec, 324711 bytes/sec
msecs/connect: 2064.88 mean, 42024.6 max, 0.621 min
msecs/first-response: 2408.3 mean, 21687.7 max, 4.136 min
2 bad byte counts
HTTP response codes:
  code 200 -- 2500


The response time goes pretty high with many parallel fetches, which
is expected to be slower, yes, but how can I improve that? See the jump
between 300 and 400 in the AMD64 results below. Even if I ask for 400
parallel connections, looking at top and so on on the box, it looks like I
can't pass ~325? Both on the old server and the new server. So I guess it
must be something in the kernel setup that limits that?


Any clue would be appreciated; where could I possibly look for that?


Example:

OLD i386 850MHz
# http_load -parallel 100 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 100 max parallel, 1.37438e+07 bytes, in 32.1498 seconds
5497.53 mean bytes/connection
77.7609 fetches/sec, 427493 bytes/sec
msecs/connect: 96.7252 mean, 6008.79 max, 0.49 min
msecs/first-response: 985.229 mean, 11051.5 max, 3.514 min
HTTP response codes:
  code 200 -- 2500

New AMD64 1.6GHz
# http_load -parallel 100 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 100 max parallel, 1.38878e+07 bytes, in 12.8811 seconds
5555.11 mean bytes/connection
194.082 fetches/sec, 1.07815e+06 bytes/sec
msecs/connect: 84.7087 mean, 6003.59 max, 0.351 min
msecs/first-response: 236.256 mean, 1921.73 max, 2.066 min
HTTP response codes:
  code 200 -- 2500

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 200 max parallel, 1.36869e+07 bytes, in 11.8518 seconds
5474.78 mean bytes/connection
210.939 fetches/sec, 1.15484e+06 bytes/sec
msecs/connect: 178.411 mean, 6004.23 max, 0.353 min
msecs/first-response: 350.587 mean, 2427.51 max, 2.297 min
HTTP response codes:
  code 200 -- 2500

# http_load -parallel 300 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 300 max parallel, 1.37912e+07 bytes, in 11.8928 seconds
5516.47 mean bytes/connection
210.211 fetches/sec, 1.15962e+06 bytes/sec
msecs/connect: 612.953 mean, 8995.56 max, 0.344 min
msecs/first-response: 266.107 mean, 2345.62 max, 2.069 min
HTTP response codes:
  code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 400 max parallel, 1.35291e+07 bytes, in 18.209 seconds
5411.63 

Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Ted Unangst

On 5/8/07, Daniel Ouellet [EMAIL PROTECTED] wrote:

I use http_load to test my configuration and changes, but I am not
successful at improving it further. It looks like connections are timing out,
and I can't get more than ~300 httpd processes serving. Yes, I have
also increased the hard limit of 250 and recompiled httpd to allow more,
and I can start 1500 httpd processes if I want and they do run, but
they do not appear to serve traffic, and I am still getting timeouts.

Even if I start 2500 httpd processes via StartServers, to be sure I don't run
out and that starting additional ones is not the limit here, I can't
get more than about ~300 successful parallel connections at once with a
decent timeout:


first, are you sure you are testing the server and not the client?

second, what happens if you start another web server on port 8080 and
test simultaneously?



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Wijnand Wiersma

Daniel,

Maybe I am about to say something really stupid, but ok, here I go:
are you testing from one location only? Maybe that host is the
bottleneck itself.

Wijnand



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet

Wijnand Wiersma wrote:

Daniel,

Maybe I am about to say something really stupid, but ok, here I go:
are you testing from one location only? Maybe that host is the
bottleneck itself.


Nothing is stupid for me right now. I am looking for any ideas that can
help. Even if it looks stupid, I am willing to test it.


As for the test setup, all servers and clients are connected directly to
the same Cisco switch.




Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet

Ted Unangst wrote:

On 5/8/07, Daniel Ouellet [EMAIL PROTECTED] wrote:
first, are you sure you are testing the server and not the client?


I will try a different client machine. For now, I use a Sun V120 with nothing
else running on it as the client. I will use a beefier one to be sure and
report back.


Also, PF is not running on either the client or the servers for these tests.

I also tried these tests:

net.inet.ip.maxqueue: 300 -> 1000

and

kern.somaxconn: 128 -> 512

In any case, what I see is that I can't pass 5.8Mb/sec on the old i386
server and 9.0Mb/sec on the HP145 AMD64 one, regardless of whether I use 100
parallel connections or 400. More than 400 really pushes all the numbers
down, with delays, losses, etc.



second, what happens if you start another web server on port 8080 and
test simultaneously?


No, but I will. I am really looking for any ideas, as I am at a loss, and
I will use heavier clients to be sure they are not the problem here.




Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet

Ted Unangst wrote:

first, are you sure you are testing the server and not the client?


Yes, confirmed, it's not the client. I just did it from an IBM e365 with
dual-core processors. dmesg is lower down, but the results below for the Sun
and the IBM look similar. So, no client issue that I can see:


IBM e365 client:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.33069e+07 bytes, in 19.0603 seconds
5322.74 mean bytes/connection
131.163 fetches/sec, 698146 bytes/sec
msecs/connect: 140.559 mean, 6014.22 max, -7.799 min
msecs/first-response: 919.846 mean, 8114.42 max, -3.572 min
HTTP response codes:
  code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.39552e+07 bytes, in 18.2373 seconds
5582.08 mean bytes/connection
137.082 fetches/sec, 765203 bytes/sec
msecs/connect: 814.221 mean, 18006.5 max, -7.838 min
msecs/first-response: 1248.39 mean, 11165.7 max, -3.433 min
HTTP response codes:
  code 200 -- 2500


Sun V120 client:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.37375e+07 bytes, in 19.137 seconds
5494.99 mean bytes/connection
130.637 fetches/sec, 717851 bytes/sec
msecs/connect: 232.358 mean, 6005.86 max, 0.439 min
msecs/first-response: 872.213 mean, 10733.2 max, 3.409 min
HTTP response codes:
  code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.37627e+07 bytes, in 18.6019 seconds
5505.09 mean bytes/connection
134.395 fetches/sec, 739854 bytes/sec
msecs/connect: 1182 mean, 18013.3 max, 0.502 min
msecs/first-response: 1001.47 mean, 9873.65 max, 3.435 min
HTTP response codes:
  code 200 -- 2500


http_load Client dmesg:

# dmesg
OpenBSD 4.0 (GENERIC.MP) #967: Sat Sep 16 20:38:15 MDT 2006
[EMAIL PROTECTED]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1072672768 (1047532K)
avail mem = 907272192 (886008K)
using 22937 buffers containing 107474944 bytes (104956K) of memory
mainbus0 (root)
bios0 at mainbus0: SMBIOS rev. 2.34 @ 0x3ff7c000 (46 entries)
bios0: IBM IBM eServer 326m -[796976U]-
ipmi0 at mainbus0: version 1.5 interface KCS iobase 0xca2/2 spacing 1
mainbus0: Intel MP Specification (Version 1.4) (AMD  HAMMER  )
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Dual Core AMD Opteron(tm) Processor 280, 2394.39 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 1MB 
64b/line 16-way L2 cache

cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: apic clock running at 199MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Dual Core AMD Opteron(tm) Processor 280, 2394.00 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 1MB 
64b/line 16-way L2 cache

cpu1: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu1: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
mpbios: bus 0 is type PCI
mpbios: bus 1 is type PCI
mpbios: bus 2 is type PCI
mpbios: bus 3 is type PCI
mpbios: bus 4 is type PCI
mpbios: bus 5 is type PCI
mpbios: bus 6 is type PCI
mpbios: bus 7 is type PCI
mpbios: bus 8 is type PCI
mpbios: bus 9 is type ISA
ioapic0 at mainbus0 apid 4 pa 0xfec0, version 11, 16 pins
ioapic1 at mainbus0 apid 5 pa 0xfec01000, version 11, 16 pins
ioapic2 at mainbus0 apid 6 pa 0xfec02000, version 11, 16 pins
pci0 at mainbus0 bus 0: configuration mode 1
ppb0 at pci0 dev 1 function 0 ServerWorks HT-1000 PCI rev 0x00
pci1 at ppb0 bus 1
ppb1 at pci1 dev 13 function 0 ServerWorks HT-1000 PCIX rev 0xb2
pci2 at ppb1 bus 2
pciide0 at pci1 dev 14 function 0 ServerWorks HT-1000 SATA rev 0x00: DMA
pciide0: using apic 4 int 11 (irq 11) for native-PCI interrupt
pciide0: port 0: device present, speed: 1.5Gb/s
wd0 at pciide0 channel 0 drive 0: WDC WD800JD-23LSA0
wd0: 16-sector PIO, LBA48, 76324MB, 156312576 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
pciide0: port 1: PHY offline
pciide0: port 2: PHY offline
pciide0: port 3: PHY offline
pciide1 at pci1 dev 14 function 1 ServerWorks HT-1000 SATA rev 0x00
piixpm0 at pci0 dev 2 function 0 ServerWorks HT-1000 rev 0x00: polling
iic0 at piixpm0: disabled to avoid ipmi0 interactions
pciide2 at pci0 dev 2 function 1 ServerWorks HT-1000 IDE rev 0x00: DMA
atapiscsi0 at pciide2 channel 0 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: HL-DT-ST, CD-ROM GCR-8240N, 1.06 SCSI0 
5/cdrom removable

cd0(pciide2:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 0
pcib0 at pci0 dev 2 function 2 ServerWorks HT-1000 LPC rev 0x00
ohci0 at pci0 dev 3 

Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Joachim Schipper
On Tue, May 08, 2007 at 06:04:43PM -0400, Daniel Ouellet wrote:
 Ted Unangst wrote:
 first, are you sure you are testing the server and not the client?
 
 Yes, confirmed, it's not the client. I just did it from an IBM e365 with 
 dual-core processors. dmesg is lower down, but the results below for the Sun and 
 the IBM look similar. So, no client issue that I can see:
 
 IBM e365 client:
 
 # http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
 2500 fetches, 200 max parallel, 1.33069e+07 bytes, in 19.0603 seconds
 5322.74 mean bytes/connection
 131.163 fetches/sec, 698146 bytes/sec
 msecs/connect: 140.559 mean, 6014.22 max, -7.799 min
 msecs/first-response: 919.846 mean, 8114.42 max, -3.572 min
 HTTP response codes:
   code 200 -- 2500
 
 # http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
 2500 fetches, 400 max parallel, 1.39552e+07 bytes, in 18.2373 seconds
 5582.08 mean bytes/connection
 137.082 fetches/sec, 765203 bytes/sec
 msecs/connect: 814.221 mean, 18006.5 max, -7.838 min
 msecs/first-response: 1248.39 mean, 11165.7 max, -3.433 min
 HTTP response codes:
   code 200 -- 2500
 
 
 Sun V120 client:
 
 # http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
 2500 fetches, 200 max parallel, 1.37375e+07 bytes, in 19.137 seconds
 5494.99 mean bytes/connection
 130.637 fetches/sec, 717851 bytes/sec
 msecs/connect: 232.358 mean, 6005.86 max, 0.439 min
 msecs/first-response: 872.213 mean, 10733.2 max, 3.409 min
 HTTP response codes:
   code 200 -- 2500
 
 # http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
 2500 fetches, 400 max parallel, 1.37627e+07 bytes, in 18.6019 seconds
 5505.09 mean bytes/connection
 134.395 fetches/sec, 739854 bytes/sec
 msecs/connect: 1182 mean, 18013.3 max, 0.502 min
 msecs/first-response: 1001.47 mean, 9873.65 max, 3.435 min
 HTTP response codes:
   code 200 -- 2500

Just a question - what do you see when trying from localhost? That
would eliminate quite a few networking issues, at least.

Joachim

-- 
TFMotD: factor, primes (6) - factor a number, generate primes



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet

Ted Unangst wrote:

first, are you sure you are testing the server and not the client?


Even run locally, the numbers don't look much better. Even in this case,
it looks like it can't reach the number of parallel connections requested:


old i386
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 94 max parallel, 1.37816e+07 bytes, in 20.7814 seconds
5512.65 mean bytes/connection
120.3 fetches/sec, 663172 bytes/sec
msecs/connect: 326.667 mean, 6062.79 max, 1.248 min
msecs/first-response: 36.5991 mean, 6071.86 max, 3.419 min
HTTP response codes:
  code 200 -- 2500
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 90 max parallel, 1.38708e+07 bytes, in 20.9679 seconds
5548.31 mean bytes/connection
119.23 fetches/sec, 661525 bytes/sec
msecs/connect: 346.224 mean, 6130.06 max, 1.228 min
msecs/first-response: 43.7965 mean, 6055.29 max, 3.392 min
HTTP response codes:
  code 200 -- 2500


new amd64
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 64 max parallel, 1.33453e+07 bytes, in 14.2911 seconds
5338.11 mean bytes/connection
174.934 fetches/sec, 933819 bytes/sec
msecs/connect: 107.002 mean, 6016.89 max, 0.802 min
msecs/first-response: 19.2824 mean, 512.538 max, 1.706 min
HTTP response codes:
  code 200 -- 2500
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 63 max parallel, 1.37396e+07 bytes, in 14.1811 seconds
5495.84 mean bytes/connection
176.291 fetches/sec, 968869 bytes/sec
msecs/connect: 106.943 mean, 6022.11 max, -8.932 min
msecs/first-response: 21.5082 mean, 3041.49 max, 1.716 min
HTTP response codes:
  code 200 -- 2500



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Wijnand Wiersma

Daniel Ouellet tried to tell me:

Wijnand Wiersma wrote:

Daniel,

Maybe I am about to say something really stupid, but ok, here I go:
are you testing from one location only? Maybe that host is the
bottleneck itself.


Nothing is stupid for me right now. I am looking for any ideas that 
can help. Even if it looks stupid, I am willing to test it.


As for the test setup, all servers and clients are connected directly to 
the same Cisco switch.

I meant the client being the bottleneck ;-)
Sorry for not being clear.


Wijnand



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet

Joachim Schipper wrote:

Just a question - what do you seen when trying from localhost? That
would eliminate quite a few networking issues, at least.


Not that much different. I would even say it's maybe not as good
locally. Plus I sent another example for two different servers with the
test done locally as well. It should show up on marc very soon; not there yet.


Local:
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 52 max parallel, 1.42596e+07 bytes, in 20.8623 seconds
5703.82 mean bytes/connection
119.833 fetches/sec, 683507 bytes/sec
msecs/connect: 107.61 mean, 6061.48 max, 1.224 min
msecs/first-response: 39.1055 mean, 6008.52 max, 3.384 min
HTTP response codes:
  code 200 -- 2500

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 82 max parallel, 1.35499e+07 bytes, in 20.7909 seconds
5419.97 mean bytes/connection
120.245 fetches/sec, 651724 bytes/sec
msecs/connect: 290.4 mean, 6059.02 max, 1.253 min
msecs/first-response: 33.4435 mean, 6004.2 max, 3.459 min
HTTP response codes:
  code 200 -- 2500

Remote:

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.34383e+07 bytes, in 18.4801 seconds
5375.32 mean bytes/connection
135.281 fetches/sec, 727177 bytes/sec
msecs/connect: 1016.4 mean, 18012.9 max, 0.406 min
msecs/first-response: 1104.19 mean, 10505.5 max, 3.455 min
HTTP response codes:
  code 200 -- 2500
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.36846e+07 bytes, in 23.4292 seconds
5473.85 mean bytes/connection
106.704 fetches/sec, 584083 bytes/sec
msecs/connect: 391.978 mean, 6006.38 max, 0.486 min
msecs/first-response: 742.048 mean, 10497.9 max, 3.403 min
HTTP response codes:
  code 200 -- 2500



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet

Wijnand Wiersma wrote:

I meant the client being the bottleneck ;-)
Sorry for not being clear.


Nope. I sent updates on that too, with a more powerful client machine. And I
am doing tests now with three clients at once to see, and I can get a bit
more processes running on the server side, but still no more output from
that server.


It is capped somehow, and I am not sure what does it yet.



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Douglas Allan Tutty
On Tue, May 08, 2007 at 07:13:27PM -0400, Daniel Ouellet wrote:
 
 Nope. I sent updates on that too, with a more powerful client machine. And I am 
 doing tests now with three clients at once to see, and I can get a bit 
 more processes running on the server side, but still no more output from 
 that server.
 
 It is capped somehow, and I am not sure what does it yet.
 

I'm new at this, so please ignore this if it's not helpful.

Is this a bandwidth (hardware) limitation on the computer itself?  If so
then a faster processor won't help.  Bus contention?

Doug.



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet

Douglas Allan Tutty wrote:

It is capped somehow, and I am not sure what does it yet.



I'm new at this, so please ignore this if it's not helpful.

Is this a bandwidth (hardware) limitation on the computer itself?  If so
then a faster processor won't help.  Bus contention?


It could always be a possibility, but if you take the data sent and the
time spent sending it, you would see that in all the tests one server looks
like it caps at around 5.8Mb/sec and the other one at 9.0Mb/sec. These
numbers are surely way too low to be a bus problem here. Even drive speed:
it looks to me like drives these days can certainly spit out data a lot
faster than this.
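
As a quick sanity check on those caps (reading Mb/sec here as megabits, which
is what the http_load output works out to), taking two of the runs quoted
earlier in the thread:

  727177 bytes/sec x 8 = ~5.8 Mbit/sec   (old i386, 400 parallel, remote)
 1154840 bytes/sec x 8 = ~9.2 Mbit/sec   (AMD64, 200 parallel)

which lines up with the 5.8 and 9.0 figures above; the ceilings come straight
from the bytes/sec column of the test output, not from a separate measurement.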


I am trying so many different things, without success so far. But I am
sure there has to be something I am overlooking here. It doesn't make
sense to me that it would be capped at that level. I don't believe it
anyway, but on the other hand, I am running out of ideas to check, and
Google doesn't give me much more to try that I haven't done already.


I am sure Henning can get more out of his servers than this, but I am
not sure how he does it, to be honest.




Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Otto Moerbeek
On Tue, 8 May 2007, Daniel Ouellet wrote:

 I am trying to improve my performance and fix my problem with httpd, but it looks
 like I am hitting a ceiling regardless of whether I test in the lab using an old
 850MHz i386 or a new AMD64 at 1.6GHz. Both have 2GB of RAM, so that much is the
 same on both. I can't pass more than ~300 to 325 simultaneous httpd processes, and
 timeouts jump way up.
 
 So I guess maybe the limit is in the connection handling of the TCP stack,
 more than in httpd itself. But I am at a loss as to where to look. Tested
 on both 4.1 and 3.9 just to see.
 
 Where are the OS bottlenecks that I could maybe improve here?

Look at the memory usage. 300 httpd processes could take up 3000M
easily, especially with stuff like php. In that case, the machine
starts swapping and you hit the roof. As a general rule, do not allow
more httpd processes than your machine can handle without swapping. Also,
a long KeepAliveTimeout can work against you, by holding slots. 

-Otto
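
A quick way to check that estimate on the running server (a sketch using the
stock ps(1) keywords; the awk summary is only illustrative):

# ps -axo rss,command | grep '[h]ttpd' | \
	awk '{ s += $1; n++ } END { printf "%d procs, %.0f KB avg, %.0f MB total\n", n, s/n, s/1024 }'

Dividing the free memory top reports by the average RSS gives a rough upper
bound for MaxClients that keeps the machine out of swap.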



Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

2007-05-08 Thread Daniel Ouellet

Otto Moerbeek wrote:

Where are the OS bottlenecks that I could maybe improve here?


Look at the memory usage. 300 httpd processes could take up 3000M
easily, especially with stuff like php. In that case, the machine
starts swapping and you hit the roof. As a general rule, do not allow
more httpd processes than your machine can handle without swapping. Also,
a long KeepAliveTimeout can work against you, by holding slots. 


Thanks Otto,

I am still doing tests and tweaks, but as far as swap goes, I checked that,
and the same for keep-alive in httpd.conf, and I even changed these:


net.inet.tcp.keepinittime=10
net.inet.tcp.keepidle=30
net.inet.tcp.keepintvl=30

For testing only. I am not saying the values above are any good, but I am
testing multiple things and reading a lot about sysctl and what each one does.


KeepAliveTimeout is at 5 seconds.

No swapping is happening, even with 1000 httpd running.

load averages: 123.63, 39.74, 63.3285  01:26:47
1064 processes:1063 idle, 1 on processor
CPU states:  0.8% user,  0.0% nice,  3.1% system,  0.8% interrupt, 95.4% 
idle

Memory: Real: 648M/1293M act/tot  Free: 711M  Swap: 0K/4096M used/tot