RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-03-03 Thread Andrew Gallatin

Don Bowman writes:

  I'm not sure what effect this has on fxp. fxp is inherently limited
  by something internal to it, which prevents it from achieving
  high packet rates. bge is the best chip, but doesn't
  have the best BSD support.
  

Just curious - why is bge the best chip?  Is it because
it exports a really nice API (separate recv ring for small messages),
or is the chip inherently faster, regardless of its API?

I'm trying to design a new ethernet API for a firmware-based nic,
and I'm trying to convince a colleague that having separate
receive rings for small and large frames is a really good thing.

Thanks,

Drew


Re: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-03-03 Thread Luigi Rizzo
On Wed, Mar 03, 2004 at 10:03:11AM -0500, Andrew Gallatin wrote:
 
 Don Bowman writes:
 
   I'm not sure what effect this has on fxp. fxp is inherently limited
   by something internal to it, which prevents it from achieving
   high packet rates. bge is the best chip, but doesn't

But you should not compare apples and oranges: the fxp is a 100Mbit NIC,
the bge is a GigE NIC.

 Just curious - why is bge the best chip?  Is it because
 it exports a really nice API (separate recv ring for small messages),
 or is the chip inherently faster, regardless of its API?
 
 I'm trying to design a new ethernet API for a firmware-based nic,
 and I'm trying to convince a colleague that having separate
 receive rings for small and large frames is a really good thing.

I am actually not very convinced either, unless you are telling me
that there is a way to preserve ordering; otherwise you'd be in trouble
when, on your busy link, there is a mismatch between user-level and
link-level block sizes.

So, what is your design like: you want to pass the NIC buffers of
2-3 different sizes and let the NIC choose from the most appropriate
pool depending on the incoming frame size, but still return
received frames in a single ring in arrival order?
This would make sense, but having completely separate rings
(small frames here, large frames there) with no ordering relation
would not.
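
Concretely, the variant that would make sense might look like this
(a sketch only; the names are invented for illustration, not taken
from any real driver):

#include <sys/types.h>

#define RX_POOL_SMALL   0               /* e.g. MHLEN-sized buffers */
#define RX_POOL_LARGE   1               /* e.g. 2KB cluster buffers */

struct rx_free_desc {                   /* host -> NIC: one free list per pool */
        u_int64_t paddr;                /* bus address of the posted buffer */
        u_int32_t len;                  /* size of the posted buffer */
        u_int32_t cookie;               /* host-private buffer handle */
};

struct rx_done_desc {                   /* NIC -> host: ONE ring, arrival order */
        u_int32_t cookie;               /* which buffer the frame landed in */
        u_int16_t pool;                 /* RX_POOL_SMALL or RX_POOL_LARGE */
        u_int16_t len;                  /* actual frame length */
};

The NIC fills each frame from the best-fitting free pool, but posts
every completion to the single done ring, so arrival order is
preserved no matter which pool a frame was drawn from.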

cheers
luigi
 Drew


Re: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-03-03 Thread Andrew Gallatin

Luigi Rizzo writes:
  On Wed, Mar 03, 2004 at 10:03:11AM -0500, Andrew Gallatin wrote:
...
   I'm trying to design a new ethernet API for a firmware-based nic,
   and I'm trying to convince a colleague that having separate
   receive rings for small and large frames is a really good thing.
  
  I am actually not very convinced either, unless you are telling me
  that there is a way to preserve ordering; otherwise you'd be in trouble
  when, on your busy link, there is a mismatch between user-level and
  link-level block sizes.
  
  So, what is your design like: you want to pass the NIC buffers of
  2-3 different sizes and let the NIC choose from the most appropriate
  pool depending on the incoming frame size, but still return
  received frames in a single ring in arrival order?

Yes, exactly.  This way you get to pass the stack small (MHLEN)
frames in mbufs rather than clusters, without doing something like
copying them in the driver's rx interrupt handler.  You can allocate
tons of mbufs so that you can absorb the occasional burst (or spike in
host latency) without being as bad a pig as you'd be if you allocated
a huge number of clusters ;)

You also get to set yourself up for zero-copy receive by splitting
the headers into mbufs, and the payloads into jumbo clusters
that can get page-flipped.  But that's a lot trickier and not
really in the scope of the initial implementation.
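
To sketch the host side of that scheme (hypothetical names
throughout; this is the shape of the completion path, not the
actual driver):

#include <sys/param.h>
#include <sys/mbuf.h>

/* These would unmap the DMA buffer and return the mbuf that was
 * posted to the small (MHLEN) or large (cluster) free pool. */
extern struct mbuf *small_pool_take(u_int32_t cookie);
extern struct mbuf *large_pool_take(u_int32_t cookie);

/*
 * One call per done-ring entry, consumed in arrival order; 'pool',
 * 'cookie' and 'frame_len' come from the completion descriptor.
 * Neither case needs a copy in the rx handler.
 */
static struct mbuf *
rx_complete_one(int pool, u_int32_t cookie, u_int16_t frame_len)
{
        struct mbuf *m;

        m = (pool == 0) ? small_pool_take(cookie) : large_pool_take(cookie);
        m->m_len = m->m_pkthdr.len = frame_len;
        return (m);             /* handed up the stack in arrival order */
}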

Drew


Re: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-03-01 Thread Mike Tancsa
On Sat, 28 Feb 2004 23:17:44 -0500, in sentex.lists.freebsd.hackers:

 If you want to spend more time in kernel, perhaps change
 kern.polling.user_frac to 10?

 I might have HZ @ 2500 as well.

Hi,
Just curious as to the reasoning behind that?

---Mike


RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-03-01 Thread Mike Tancsa
At 09:38 PM 29/02/2004, Don Bowman wrote:

 I picked 2500 as the best for my system. It's higher than
 allowed by RFC 1323 and PAWS [kern/61404], but not by so much
 that I anticipate a problem.

Do you run the box with the supplied patch? On the firewall device I was
thinking of experimenting with, I do have long TCP sessions that it sounds
like HZ=2500 would break.


 For my target packets per second
 rate, it means that I can use a reasonable number of DMA
 descriptors. I found that bridging performance in particular
 needs the higher HZ to avoid dropping packets and to improve
 its performance.

In terms of fiddling with the em tunables, what are the drawbacks of moving
from 256 to 512 on EM_MAX_TXD and EM_MAX_RXD?

 more buffers == better ability to handle latency
 bursts, but worse for cache occupancy.

Buffers, as in net.inet.ip.intr_queue_maxlen?

Thanks,

---Mike 



RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-29 Thread Don Bowman
From: Mike Silbersack [mailto:[EMAIL PROTECTED]
 On Sat, 28 Feb 2004, Don Bowman wrote:
 
  You could use ipfw to limit the damage of a syn flood, e.g.
  a keep-state rule with a limit of ~2-5 per source IP, lower the
  timeouts, increase the hash buckets in ipfw, etc. This would
  use a mask on src-ip of all bits.
  something like:
  allow tcp from any to any setup limit src-addr 2
 
  this would only allow 2 concurrent TCP sessions per unique
  source address. Depends on the syn flood you are expecting
  to experience. You could also use dummynet to shape syn
  traffic to a fixed level i suppose.
 
 Does that really help?  If so, we need to optimize the syncache. :(

In a real-world situation, with some latency from the originating
syn-flood attacker, the syncache behaves fine.
In a synthetic test situation like this, with probably ~0 latency
from the initiator, the syncache gets overwhelmed too.



RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-29 Thread Robert Watson

On Sun, 29 Feb 2004, Mike Silbersack wrote:

 On Sat, 28 Feb 2004, Don Bowman wrote:
 
  this would only allow 2 concurrent TCP sessions per unique
  source address. Depends on the syn flood you are expecting
  to experience. You could also use dummynet to shape syn
  traffic to a fixed level i suppose.
 
 Does that really help?  If so, we need to optimize the syncache. :(

Given that we have syncookie support, the other thing we could consider
doing under high syn load is simply to drop the syncache from the loop
entirely.  The syncache provides us with the ability to gracefully
degrade as the syn rate goes up, but the FIFO cache bucket overflow
handling means we pay the cost of syncache entry allocation even in the
high load situation.  It might be interesting to measure when syncache
overflow is taking place, and simply drop it from the loop under a rate
known to exceed the syncache capacity, then re-enable it again once the
rate drops.  This would remove a memory allocation, queue walking, and in
the case of an SMP system, locking, from the syn handling path.
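
In rough pseudo-C, the idea might look like this (illustrative
only; none of it is the actual syncache code, and all names are
invented):

struct syn_pkt;                         /* stand-in for a parsed SYN segment */
extern int  syncache_insert(struct syn_pkt *);
extern void syncookie_send_synack(struct syn_pkt *);
extern int  syn_rate_exceeds_syncache_capacity(void);

#define E_BUCKET_OVERFLOW (-1)

static int syncookie_only;              /* set while the syncache is bypassed */

void
syn_input(struct syn_pkt *sp)
{
        if (syncookie_only) {
                /* No allocation, no bucket walk, no lock: all the
                 * connection state is encoded in the SYN-ACK
                 * sequence number. */
                syncookie_send_synack(sp);
                return;
        }
        if (syncache_insert(sp) == E_BUCKET_OVERFLOW &&
            syn_rate_exceeds_syncache_capacity())
                syncookie_only = 1;     /* drop the syncache from the loop */
}

/* Elsewhere, a timer clears syncookie_only once the measured SYN
 * rate falls back below what the syncache can absorb. */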

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Senior Research Scientist, McAfee Research



RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-29 Thread Don Bowman
From: Mike Tancsa [mailto:[EMAIL PROTECTED]
 
 On Sat, 28 Feb 2004 23:17:44 -0500, in sentex.lists.freebsd.hackers:

 If you want to spend more time in kernel, perhaps change
 kern.polling.user_frac to 10?

 I might have HZ @ 2500 as well.

 Hi,
 Just curious as to the reasoning behind that?

At high packet rates you don't have enough DMA
descriptors available to the em driver, and it will drop packets.
Increasing the number of DMA buffers will cause
problems with cache occupancy. Increasing the HZ
doesn't have a huge cost.


RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-29 Thread Mike Tancsa
At 08:44 PM 29/02/2004, Don Bowman wrote:
 From: Mike Tancsa [mailto:[EMAIL PROTECTED]

  On Sat, 28 Feb 2004 23:17:44 -0500, in sentex.lists.freebsd.hackers:
  If you want to spend more time in kernel, perhaps change
  kern.polling.user_frac to 10?
  I might have HZ @ 2500 as well.

  Hi,
  Just curious as to the reasoning behind that?

 At high packet rates you don't have enough DMA
 descriptors available to the em driver, and it will drop packets.
 Increasing the number of DMA buffers will cause
 problems with cache occupancy. Increasing the HZ
 doesn't have a huge cost.

But why that value? Did you determine it by trial and error or deduce it
based on some other factors? Also, is this value optimal for fxp-based boxes?

---Mike 



RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-29 Thread Don Bowman
From: Mike Tancsa [mailto:[EMAIL PROTECTED]

  At high packet rates you don't have enough DMA
  descriptors available to the em driver, and it will drop packets.
  Increasing the number of DMA buffers will cause
  problems with cache occupancy. Increasing the HZ
  doesn't have a huge cost.

 But why that value? Did you determine it by trial and error or deduce it
 based on some other factors? Also, is this value optimal for fxp-based boxes?

I picked 2500 as the best for my system. It's higher than
allowed by RFC 1323 and PAWS [kern/61404], but not by so much
that I anticipate a problem. For my target packets per second
rate, it means that I can use a reasonable number of DMA
descriptors. I found that bridging performance in particular
needs the higher HZ to avoid dropping packets and to improve
its performance.

I'm not sure what effect this has on fxp. fxp is inherently limited
by something internal to it, which prevents it from achieving
high packet rates. bge is the best chip, but doesn't
have the best BSD support.

The value of HZ needs to be based on your target packet
rate, the maximum latency in your system, and the size
of your buffers for all steps.

more buffers == better ability to handle latency
bursts, but worse for cache occupancy.
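
To put invented numbers on that relationship: at HZ=2500 the driver
is polled every 400us, so a 250,000 pps load presents at most
250000/2500 = 100 frames per poll. 256 RX descriptors then give you
roughly 256/250000 ~= 1ms of slack against a late poll, and 512
about 2ms, at the price of twice the descriptor and buffer
footprint in the cache.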

FreeBSD is not the best system for trying to guarantee
latency through: you can find things like ahd, syncache, and
arp freeing that will suddenly wake up and munch all
kinds of CPU time with an spl held. freebsd-current
is both better and worse: it's better with the fine-grained
locking, but worse since those locks can end up costing
you more than you would have spent just taking Giant
and being done with it; semaphores are expensive,
particularly on SMP systems.



RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Don Bowman
 I have a machine running 4.9.  P4 2.8Ghz, 800mhz bus, Intel PRO/1000
 ethernet connected to a Cisco, both sides are locked to 1000/FD.

 The kernel has HZ=1000, and DEVICE_POLLING, IPFW, DUMMYNET, etc. After
 only a few minutes of run time under an attack of ~90,000 pps. The attack
 has been limited at the router to JUST incoming TCP port 80
 traffic. I don't know why the machine is having such a hard time under
 the load. The cpu shows it is 90% idle even under the worst of the
 attack.  What am I doing wrong?

I think there's a problem with CPU time not getting properly
accounted for in device polling, so it may be busier than you think.

For this scenario, I would set net.inet.tcp.blackhole=2. You
might be spending a lot of time creating the ICMP unreachable
messages, rather than in the network driver (where device polling
would help).
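
For reference, the knobs (for TCP the suppressed reply is an RST,
for UDP an ICMP unreachable):

sysctl -w net.inet.tcp.blackhole=2   # closed ports: drop, send no RST
sysctl -w net.inet.udp.blackhole=1   # closed ports: drop, send no ICMP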

--don


Re: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Deepak Jain
It was kindly pointed out that I didn't include the symptoms of the
problem:

Without polling on, I get 70+% interrupt load, and I get livelock.

With polling on, I start getting huge amounts of input errors, packet
loss, and general unresponsiveness to the network. The web server on it
doesn't respond, though it occasionally will open the connection, just
not respond. accept_filter on/off makes no difference. I have read other
posts that say em systems can move 200kpps without serious incident.

Thanks in advance,

DJ

Deepak Jain wrote:

I have a machine running 4.9.  P4 2.8Ghz, 800mhz bus, Intel PRO/1000 
ethernet connected to a Cisco, both sides are locked to 1000/FD.

The kernel has HZ=1000, and DEVICE_POLLING, IPFW, DUMMYNET, etc. After 
only a few minutes of run time under an attack of ~90,000 pps. The attack 
has been limited at the router to JUST incoming TCP port 80 
traffic. I don't know why the machine is having such a hard time under 
the load. The cpu shows it is 90% idle even under the worst of the 
attack.  What am I doing wrong?

Thanks,

DJ

#sysctl -a |grep hz
kern.clockrate: { hz = 1000, tick = 1000, tickadj = 1, profhz = 1024, stathz = 128 }
#sysctl -a |grep polling
kern.polling.burst: 544
kern.polling.each_burst: 30
kern.polling.burst_max: 550
kern.polling.idle_poll: 1
kern.polling.poll_in_trap: 0
kern.polling.user_frac: 50
kern.polling.reg_frac: 30
kern.polling.short_ticks: 44151
kern.polling.lost_polls: 84925
kern.polling.pending_polls: 0
kern.polling.residual_burst: 0
kern.polling.handlers: 1
kern.polling.enable: 1
kern.polling.phase: 0
kern.polling.suspect: 39272
kern.polling.stalled: 5

Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.9-RELEASE #8: Sat Feb 28 23:42:41 GMT 2004
Timecounter "i8254"  frequency 1193182 Hz
CPU: Intel(R) Pentium(R) 4 CPU 2.80GHz (2806.38-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory  = 2147418112 (2097088K bytes)
avail memory = 2085978112 (2037088K bytes)
Preloaded elf kernel "kernel" at 0xc04fa000.
Warning: Pentium 4 CPU: PSE disabled
Pentium Pro MTRR support enabled
md0: <Malloc disk>
Using $PIR table, 12 entries at 0xc00fdea0
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib1: <PCI to PCI bridge (vendor=8086 device=2579)> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pcib2: <PCI to PCI bridge (vendor=8086 device=257b)> at device 3.0 on pci0
pci2: <PCI bus> on pcib2
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.16> port 0xb000-0xb01f mem 0xf300-0xf301 irq 12 at device 1.0 on pci2
em0:  Speed:N/A  Duplex:N/A
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xcc00-0xcc1f irq 11 at device 29.0 on pci0
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xc000-0xc01f irq 3 at device 29.1 on pci0
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xc400-0xc41f irq 12 at device 29.2 on pci0
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xc800-0xc81f irq 11 at device 29.3 on pci0
usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
pci0: <USB controller> at 29.7 irq 7
pcib3: <Intel 82801BA/BAM (ICH2) Hub to PCI bridge> at device 30.0 on pci0
pci3: <PCI bus> on pcib3
ahd0: <Adaptec 29320LP Ultra320 SCSI adapter> port 0x9400-0x94ff,0x9000-0x90ff mem 0xf202-0xf2021fff irq 11 at device 0.0 on pci3
aic7901A: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs
pci3: <unknown card> (vendor=0x105a, dev=0x3373) at 3.0 irq 10
pci3: <ATI Mach64-GR graphics accelerator> at 7.0 irq 11
pci3: <unknown card> (vendor=0x8086, dev=0x1051) at 8.0 irq 5
isab0: <PCI to ISA bridge (vendor=8086 device=24d0)> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 ATA100 controller> port 0xf000-0xf00f,0-0x3,0-0x7,0-0x3,0-0x7 irq 0 at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <unknown card> (vendor=0x8086, dev=0x24d3) at 31.3 irq 9
orm0: <Option ROMs> at iomem 0xc-0xc7fff,0xc8000-0xd17ff on isa0
pmtimer0 on 

Re: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Deepak Jain


Don Bowman wrote:

I have a machine running 4.9.  P4 2.8Ghz, 800mhz bus, Intel PRO/1000
ethernet connected to a Cisco, both sides are locked to 1000/FD.

The kernel has HZ=1000, and DEVICE_POLLING, IPFW, DUMMYNET, etc. After
only a few minutes of run time under an attack of ~90,000 pps. The attack
has been limited at the router to JUST incoming TCP port 80
traffic. I don't know why the machine is having such a hard time under
the load. The cpu shows it is 90% idle even under the worst of the
attack.  What am I doing wrong?


I think there's a problem with CPU time not getting properly
accounted for in device polling, so it may be busier than you think.
For this scenario, I would set net.inet.tcp.blackhole=2. You
might be spending a lot of time creating the ICMP unreachable
messages, rather than in the network driver (where device polling
would help).

I'd like to know more about the CPU time idea. I have
net.inet.udp.blackhole=2 and net.inet.tcp.blackhole=2 because I saw a
lot of destination-unreachable packets going out.

The system can hyperthread, but I thought the singlethreading of polling 
might have been an issue, so I recompiled the kernel without SMP.

DJ





Re: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Deepak Jain
And this was picked up in the messages log:

/kernel: stray irq 7
last message repeated 2 times
/kernel: too many stray irq 7's; not logging any more

DJ

Don Bowman wrote:

I have a machine running 4.9.  P4 2.8Ghz, 800mhz bus, Intel PRO/1000
ethernet connected to a Cisco, both sides are locked to 1000/FD.

The kernel has HZ=1000, and DEVICE_POLLING, IPFW, DUMMYNET, etc. After
only a few minutes of run time under an attack of ~90,000 pps. The attack
has been limited at the router to JUST incoming TCP port 80
traffic. I don't know why the machine is having such a hard time under
the load. The cpu shows it is 90% idle even under the worst of the
attack.  What am I doing wrong?


I think there's a problem with CPU time not getting properly
accounted for in device polling, so it may be busier than you think.
For this scenario, I would set net.inet.tcp.blackhole=2. You
might be spending a lot of time creating the ICMP unreachable
messages, rather than in the network driver (where device polling
would help).
--don




RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Don Bowman
 It was kindly pointed out that I didn't include the symptoms of the
 problem:

 Without polling on, I get 70+% interrupt load, and I get livelock.

 With polling on, I start getting huge amounts of input errors, packet
 loss, and general unresponsiveness to the network. The web server on it
 doesn't respond, though it occasionally will open the connection, just
 not respond. accept_filter on/off makes no difference. I have read other
 posts that say em systems can move 200kpps without serious incident.

 Thanks in advance,

 DJ

You may need to increase the MAX_RXD inside your em driver to e.g. 512.

With a similar system, em can handle ~800Kpps of bridging.

Your earlier email showed a very large number of RST messages,
which makes me suspect the blackhole actually wasn't enabled.

Not exactly sure what you're trying to do here. It sounds like
you are trying to generate a SYN flood on port 80, and your listen
queue is backing up. You've increased kern.ipc.somaxconn? Does your
application specify a fixed listen queue depth? Could it be increased?
Are you using apache as the server? Could you use a kqueue-enabled
one like thttpd?

Have you checked net.inet.ip.intr_queue_drops?
If it's showing 0, check net.inet.ip.intr_queue_maxlen; perhaps
increase it.

Have you sufficient mbufs and clusters? netstat -m.

If you want to spend more time in kernel, perhaps change
kern.polling.user_frac to 10?

I might have HZ @ 2500 as well.

You could use ipfw to limit the damage of a syn flood, e.g.
a keep-state rule with a limit of ~2-5 per source IP, lower the
timeouts, increase the hash buckets in ipfw, etc. This would
use a mask on src-ip of all bits.
something like:
allow tcp from any to any setup limit src-addr 2

This would only allow 2 concurrent TCP sessions per unique
source address. It depends on the syn flood you are expecting
to experience. You could also use dummynet to shape syn
traffic to a fixed level, I suppose.

Now, this will switch the DoS condition to elsewhere in
the kernel, and it might not win you anything.
net.inet.ip.fw.dyn_buckets=16384
net.inet.ip.fw.dyn_syn_lifetime=5
net.inet.ip.fw.dyn_max=32000

might be called for if you try that approach.
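
Pulling the knobs from this message together (values as quoted
above, not tuned recommendations; the last line is a kernel-build
change, not a sysctl):

sysctl -w net.inet.tcp.blackhole=2
sysctl -w kern.polling.user_frac=10
sysctl -w net.inet.ip.fw.dyn_buckets=16384
sysctl -w net.inet.ip.fw.dyn_syn_lifetime=5
sysctl -w net.inet.ip.fw.dyn_max=32000
ipfw add allow tcp from any to any setup limit src-addr 2
# plus "options HZ=2500" and EM_MAX_RXD=512 in the kernel config / em driver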




Re: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Deepak Jain


Don Bowman wrote:

It was kindly pointed out that I didn't include the symptoms of the
problem:

Without polling on, I get 70+% interrupt load, and I get livelock.

With polling on, I start getting huge amounts of input errors, packet
loss, and general unresponsiveness to the network. The web server on it
doesn't respond, though it occasionally will open the connection, just
not respond. accept_filter on/off makes no difference. I have read other
posts that say em systems can move 200kpps without serious incident.

Thanks in advance,

DJ


 You may need to increase the MAX_RXD inside your em driver to e.g. 512.

I didn't know if my card had a buffer bigger than the default 256. I can
increase it, but I didn't know how to determine how big a MAX_RXD my
card would support. When the system was under load, it was generating
2xHZ clock ticks (2000 when HZ was 1000); is that normal?

 With a similar system, em can handle ~800Kpps of bridging.

What settings did you use?

 Your earlier email showed a very large number of RST messages,
 which makes me suspect the blackhole actually wasn't enabled.
 Not exactly sure what you're trying to do here. It sounds like
 you are trying to generate a SYN flood on port 80, and your listen
 queue is backing up. You've increased kern.ipc.somaxconn? Does your
 application specify a fixed listen queue depth? Could it be increased?
 Are you using apache as the server? Could you use a kqueue-enabled
 one like thttpd?
Using apache; might go to squid or thttpd. Didn't think it would make a
big difference. Increased somaxconn. Basically the system is getting hammered
(after all filtering at the router) with valid GET requests on port 80.

 Have you checked net.inet.ip.intr_queue_drops?
 If it's showing 0, check net.inet.ip.intr_queue_maxlen; perhaps
 increase it.
net.inet.ip.intr_queue_maxlen: 500
net.inet.ip.intr_queue_drops: 0
p1003_1b.sigqueue_max: 0
No intr drops.

 Have you sufficient mbufs and clusters? netstat -m.

1026/5504/262144 mbufs in use (current/peak/max):
1026 mbufs allocated to data
1024/5460/65536 mbuf clusters in use (current/peak/max)
12296 Kbytes allocated to network (6% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
mbufs look fine.

 If you want to spend more time in kernel, perhaps change
 kern.polling.user_frac to 10?

I'll do that.

 I might have HZ @ 2500 as well.

 You could use ipfw to limit the damage of a syn flood, e.g.
 a keep-state rule with a limit of ~2-5 per source IP, lower the
 timeouts, increase the hash buckets in ipfw, etc. This would
 use a mask on src-ip of all bits.
 something like:
 allow tcp from any to any setup limit src-addr 2
This is a great idea. We were trapping those who crossed our connection 
thresholds and blackholing them upstream (automatically, with a script).


 this would only allow 2 concurrent TCP sessions per unique
 source address. Depends on the syn flood you are expecting
 to experience. You could also use dummynet to shape syn
 traffic to a fixed level i suppose.

 now... this will switch the DoS condition to elsewhere in
 the kernel, and it might not win you anything.

 net.inet.ip.fw.dyn_buckets=16384
 net.inet.ip.fw.dyn_syn_lifetime=5
 net.inet.ip.fw.dyn_max=32000

 might be called for if you try that approach.

I see where that should get us. We'll see.

Thanks!

DJ



RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Don Bowman
 ...

  You may need to increase the MAX_RXD inside your em driver to e.g. 512.

 I didn't know if my card had a buffer bigger than the default 256. I can
 increase it, but I didn't know how to determine how big a MAX_RXD my
 card would support. When the system was under load, it was generating
 2xHZ clock ticks (2000 when HZ was 1000); is that normal?

MAX_RXD is not a function of the card; it's the system RAM
you devote to DMA buffers. Too big will cause problems, since
it will flush through the cache, etc.

You should see (vmstat -i) a clk rate that exactly matches HZ.
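
For example:

vmstat -i | grep clk

should report a rate of ~1000 with HZ=1000, or ~2500 with HZ=2500;
the 2xHZ figure above is the thing to rule out.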



RE: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Mike Silbersack

On Sat, 28 Feb 2004, Don Bowman wrote:

 You could use ipfw to limit the damage of a syn flood, e.g.
 a keep-state rule with a limit of ~2-5 per source IP, lower the
 timeouts, increase the hash buckets in ipfw, etc. This would
 use a mask on src-ip of all bits.
 something like:
 allow tcp from any to any setup limit src-addr 2

 this would only allow 2 concurrent TCP sessions per unique
 source address. Depends on the syn flood you are expecting
 to experience. You could also use dummynet to shape syn
 traffic to a fixed level i suppose.

Does that really help?  If so, we need to optimize the syncache. :(

Mike "Silby" Silbersack


Re: em0, polling performance, P4 2.8ghz FSB 800mhz

2004-02-28 Thread Deepak Jain
 You could use ipfw to limit the damage of a syn flood, e.g.
 a keep-state rule with a limit of ~2-5 per source IP, lower the
 timeouts, increase the hash buckets in ipfw, etc. This would
 use a mask on src-ip of all bits.
 something like:
 allow tcp from any to any setup limit src-addr 2

 this would only allow 2 concurrent TCP sessions per unique
 source address. Depends on the syn flood you are expecting
 to experience. You could also use dummynet to shape syn
 traffic to a fixed level i suppose.

 Does that really help?  If so, we need to optimize the syncache. :(

I know that if I rate shape the setup traffic, it helps.

DJ
