Re: firewall is very slow, something's wrong

2007-10-10 Thread Henning Brauer
* Florin Andrei [EMAIL PROTECTED] [2007-10-09 22:54]:
 Henning Brauer wrote:
 * Florin Andrei [EMAIL PROTECTED] [2007-10-09 19:34]:
 then, an i386 kernel should perform considerably better than amd64 for 
 firewalling/routing/...
 That is surprising. What is the reason?
 we dunno really. it hasn't been benched in some time so it might not even be
 true any more, but last time the difference was dramatic.

 Then I will do some tests with 4.2 on gigabit-capable hardware. If anything 
 noteworthy comes out, I'll post the results.
 Don't expect something too fancy, but I guess anything is better than 
 nothing.

 How much RAM can the i386 kernel use on an amd64 machine?
 4GB minus pci space

 Hmmm.

 Please correct me if I'm wrong:
 Let's say a firewall is connected to a pretty fast Internet pipe (in the 
 gigabit range). Let's say there's a DDoS against this environment. In 
 theory, the firewall would need lots of RAM so that it can deal with the 
 incoming nasty packets, create an entry for each packet in the state table 
 (don't know the correct name for it in OpenBSD, sorry), then expire it 
 after a while.
 In theory, the firewall could be tweaked to expire unused states quickly, 
 but still, more RAM is better when dealing with a DDoS.

nope.
the kernel will not ever use more than 1 GB (or was it 768MB? memory 
fuzzy).
more than 1 GB of memory on a firewall even hurts. ok, not much. but a 
bit.

 What's still not clear to me is how much RAM I should provision per 1Gb of 
 bandwidth on OpenBSD, assuming there's an incoming worst-case-scenario 
 DDoS, that consumes RAM (and other resources) on the firewall yet leaves 
 some bandwidth open for legitimate traffic (so the firewall must be able to 
 continue to let the good traffic pass through). Also assuming some tweaking 
 has been done on the firewall to expire the bad stuff quickly without 
 affecting legitimate traffic.

RAM is not your concern on a firewall.

 If the SMP kernel does not actually hurt performance, I might have to use 
 it.
 it does. seriously. locking is not free.

 Aw, damn. I was hoping that's not quite the case.

 Well, then hopefully the dynamic routing daemons won't get too greedy and 
 DoS the firewall from within. :-)

no, they won't.
they only get the cpu cycles not required for packet forwarding (well, 
interrupts + softint handling really) anyway.

 Or I may have to re-think the whole 
 environment and forget the idea of doing any kind of dynamic routing on the 
 firewall - from a security perspective, dynamic routing on the firewall 
 sucks anyway.

no, not really, not if done right.

-- 
Henning Brauer, [EMAIL PROTECTED], [EMAIL PROTECTED]
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg  Amsterdam



Re: firewall is very slow, something's wrong

2007-10-10 Thread Siju George
On 10/9/07, Henning Brauer [EMAIL PROTECTED] wrote:
 * Florin Andrei [EMAIL PROTECTED] [2007-10-09 19:34]:
  then, an i386 kernel should perform considerably better than amd64 for
  firewalling/routing/...
 
  That is surprising. What is the reason?

 we dunno really. it hasn't been benched in some time so it might not even
 be true any more, but last time the difference was dramatic.


I thought by running an amd64 kernel will get me twice the speed than
an i386 on an amd64 machine since one is 64 bit processing and the
other is just 32 bit :-(

How about on sparc64 systems? Do you get twice the speed compared to
its 32 bit counterpart?

Thank you so much

Kind Regards

Siju



Re: firewall is very slow, something's wrong

2007-10-10 Thread Henning Brauer
* Siju George [EMAIL PROTECTED] [2007-10-10 15:10]:
 On 10/9/07, Henning Brauer [EMAIL PROTECTED] wrote:
  * Florin Andrei [EMAIL PROTECTED] [2007-10-09 19:34]:
   then, an i386 kernel should perform considerably better than amd64 for
   firewalling/routing/...
   That is surprising. What is the reason?
  we dunno really. it hasn't been benched in some time so it might not even
  be true any more, but last time the difference was dramatic.
 I thought by running an amd64 kernel will get me twice the speed than
 an i386 on an amd64 machine since one is 64 bit processing and the
 other is just 32 bit :-(

so you think a 20 ton truck is twice as fast as a 10 ton truck?

-- 
Henning Brauer, [EMAIL PROTECTED], [EMAIL PROTECTED]
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg  Amsterdam



Re: firewall is very slow, something's wrong

2007-10-10 Thread Peter N. M. Hansteen
Henning Brauer [EMAIL PROTECTED] writes:

 so you think a 20 ton truck is twice as fast as a 10 ton truck?

horizontal or vertical motion? assuming a perfectly spherical truck?

-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
http://bsdly.blogspot.com/ http://www.datadok.no/ http://www.nuug.no/
Remember to set the evil bit on all malicious network traffic
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



Re: firewall is very slow, something's wrong

2007-10-10 Thread Robert C Wittig

Siju George wrote:


I thought by running an amd64 kernel will get me twice the speed than
an i386 on an amd64 machine since one is 64 bit processing and the
other is just 32 bit :-(



64 bit processors (combined with 64 bit capable operating systems) have 
the ability to address more RAM than 32 bit processors because 64^2 is a 
much larger number than 32^2... lots more RAM addresses).


This does not speed things up, though, until you run out of RAM, and 
start having to access the swapfile.


The processor's speed... MHz, GHz, etc., will determine how fast the 
processor itself can process instructions.



--
-wittig http://www.robertwittig.com/
http://robertwittig.net/
http://robertwittig.org/
.



Re: firewall is very slow, something's wrong

2007-10-10 Thread Paul de Weerd
On Wed, Oct 10, 2007 at 09:24:25AM -0500, Robert C Wittig wrote:
| Siju George wrote:
|
| I thought by running an amd64 kernel will get me twice the speed than
| an i386 on an amd64 machine since one is 64 bit processing and the
| other is just 32 bit :-(
| 
|
| 64 bit processors (combined with 64 bit capable operating systems) have
| the ability to address more RAM than 32 bit processors because 64^2 is a
| much larger number than 32^2... lots more RAM addresses).
|
| This does not speed things up, though, until you run out of RAM, and
| start having to access the swapfile.
|
| The processor's speed... MHz, GHz, etc., will determine how fast the
| processor itself can process instructions.

Depending on your software, 64 bit processors can be quite a bit
faster. If you're dealing with 64bit integers, using 64bit registers,
etc., a lower clocked 64bit CPU might be faster than a 32bit CPU
clocking at a higher rate. In short: There is no short answer. It
depends on what you're doing.

From what Henning tells us (and what sounds logical to me), grabbing an
ethernet frame from a NIC and putting it on another NIC doesn't really
change much from 32bit to 64bit.

Your compiler also comes into play. If that is more tuned towards a
certain 32bit architecture (such as i386) than a certain 64bit arch
(because it's less popular, such as sparc64 or hppa64 or mips64),
this will impact your performance quite a bit.

Cheers,

Paul 'WEiRD' de Weerd

--
[++-]+++.+++[---].+++[+
+++-].++[-]+.--.[-]
 http://www.weirdnet.nl/




Re: firewall is very slow, something's wrong

2007-10-10 Thread Jon Radel
Robert C Wittig wrote:

 64 bit processors (combined with 64 bit capable operating systems) have
 the ability to address more RAM than 32 bit processors because 64^2 is a
 much larger number than 32^2... lots more RAM addresses).

The increase from 2^32 to 2^64 is even more impressive.  ;-)
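
The arithmetic behind this correction can be checked with a short Python
sketch. It only compares the theoretical address counts; real hardware and
kernels typically support less than the full 64-bit range, and the variable
names are chosen here purely for illustration.

```python
# Address-space arithmetic behind the 32-bit vs 64-bit discussion.
addresses_32 = 2 ** 32   # distinct byte addresses with 32-bit pointers
addresses_64 = 2 ** 64   # distinct byte addresses with 64-bit pointers

# The mistaken figures from the earlier message, for comparison.
squared_32 = 32 ** 2     # equals 2**10
squared_64 = 64 ** 2     # equals 2**12

GIB = 2 ** 30

print(addresses_32 // GIB, "GiB addressable with 32-bit pointers")
print(addresses_64 // addresses_32, "times as many addresses with 64 bits")
print(squared_64 // squared_32, "is the ratio of 64^2 to 32^2")
```

The point of the comparison is that squaring the bit width only quadruples
the number, while each added address bit doubles the address space.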

--Jon Radel




Re: firewall is very slow, something's wrong

2007-10-10 Thread Tony Abernethy
Robert C Wittig wrote:
 Siju George wrote:
 
  I thought by running an amd64 kernel will get me twice the 
 speed than
  an i386 on an amd64 machine since one is 64 bit processing and the
  other is just 32 bit :-(
  
 
 64 bit processors (combined with 64 bit capable operating 
 systems) have 
 the ability to address more RAM than 32 bit processors 
 because 64^2 is a 
 much larger number than 32^2... lots more RAM addresses).

Actually 2^64 vs 2^32  (64^2 is 2^12, 64 is 2^6, 32 is 2^5)

Other things equal, 64-bit should take twice as long because it 
takes 64 bits to do anything instead of 32 bits.

Not really that simple, because accessing 32 bits can involve
1) accessing the 64 bits that the 32 bits are in.
2) selecting the appropriate 32 bits of the 64 bits.

 
 This does not speed things up, though, until you run out of RAM, and 
 start having to access the swapfile.
The 64-bits does affect how big the swap file can be without
resorting to Rube Goldberg contraptions to identify what is what.

 
 The processor's speed... MHz, GHz, etc., will determine how fast the 
 processor itself can process instructions.
 
 
 -- 
 -wittig http://www.robertwittig.com/
  http://robertwittig.net/
  http://robertwittig.org/
 .



Re: firewall is very slow, something's wrong

2007-10-10 Thread Siju George
On 10/10/07, Henning Brauer [EMAIL PROTECTED] wrote:
 * Siju George [EMAIL PROTECTED] [2007-10-10 15:10]:
  On 10/9/07, Henning Brauer [EMAIL PROTECTED] wrote:
   * Florin Andrei [EMAIL PROTECTED] [2007-10-09 19:34]:
then, an i386 kernel should perform considerably better than amd64 for
firewalling/routing/...
That is surprising. What is the reason?
   we dunno really. it hasn't been benched in some time so it might not even
   be true any more, but last time the difference was dramatic.
  I thought by running an amd64 kernel will get me twice the speed than
  an i386 on an amd64 machine since one is 64 bit processing and the
  other is just 32 bit :-(

 so you think a 20 ton truck is twice as fast as a 10 ton truck?


O.K I get it :-)
So when does changing from 32 bit to a 64-bit processor actually help?

Kind Regards

Siju



Re: firewall is very slow, something's wrong

2007-10-10 Thread Scott Wells

And is it in a vacuum?

Peter N. M. Hansteen wrote:

Henning Brauer [EMAIL PROTECTED] writes:

  

so you think a 20 ton truck is twice as fast as a 10 ton truck?



horizontal or vertical motion? assuming a perfectly spherical truck?




Re: firewall is very slow, something's wrong

2007-10-10 Thread Tony Abernethy
Siju George wrote:
snip
  so you think a 20 ton truck is twice as fast as a 10 ton truck?
 O.K I get it :-)
 So when does changing from 32 bit to a 64-bit processor actually help?

Quoting Paul de Weerd,
In short: There is no short answer. It depends on what you're doing.
( Not to mention how you do it ;-)

Short answer:
When you *might* need more than a GB or so of RAM/swap. 
Most anything is faster than stuck.

Easy: 2:1 ratio *either direction* which is faster.
Hard: 10:1 ratio (again either direction).
(figure in loading/unloading times on the truck analogy)



Re: firewall is very slow, something's wrong

2007-10-10 Thread Stuart Henderson
On 2007/10/10 11:20, Tony Abernethy wrote:
 Siju George wrote:
 snip
   so you think a 20 ton truck is twice as fast as a 10 ton truck?
  O.K I get it :-)
  So when does changing from 32 bit to a 64-bit processor actually help?
 
 Quoting Paul de Weerd,
 In short: There is no short answer. It depends on what you're doing.
 ( Not to mention how you do it ;-)

There are other changes between i386/amd64 than the number of bits
(e.g. amd64 has more registers, which allows some other changes that
can improve performance for some things), so it depends a lot on
the code being run.

You can't even always say, software X is faster on arch Y, since
the way you use that software can give different results.

If you're looking for fastest, just benchmark as close to real-life
use on both, it's the easiest way. You also often need to test whether
what you're trying to run does work correctly on !i386 arch (it's not
uncommon for code to make assumptions which don't hold true on !i386).

Of course, there are reasons other than fastest you might choose
a particular arch.

 Short answer:
 When you *might* need more than a GB or so of RAM/swap. 
 Most anything is faster than stuck.

 Easy: 2:1 ratio *either direction* which is faster.
 Hard: 10:1 ratio (again either direction).

I'm not too sure I understand what you're saying here.



Re: firewall is very slow, something's wrong

2007-10-10 Thread Robert C Wittig

Paul de Weerd wrote:

wittig wrote:
| 64 bit processors (combined with 64 bit capable operating systems) have 
| the ability to address more RAM than 32 bit processors because 64^2 is a 
| much larger number than 32^2... lots more RAM addresses).


Oops! that should have read:

2^64 and 2^32


Depending on your software, 64 bit processors can be quite a bit
faster. If you're dealing with 64bit integers, using 64bit registers,
etc., a lower clocked 64bit CPU might be faster than a 32bit CPU
clocking at a higher rate. In short: There is no short answer. It
depends on what you're doing.



Point taken, particularly where big integers are concerned.


From what Henning tells us (and what sounds logical to me), grabbing an
ethernet frame from a NIC and putting it on another NIC doesn't really
change much from 32bit to 64bit.

Your compiler also comes into play. If that is more tuned towards a
certain 32bit architecture (such as i386) than a certain 64bit arch
(because it's less popular, such as sparc64 or hppa64 or mips64),
this will impact your performance quite a bit.



If you had to choose between, say, 2 gig RAM and a 32 bit CPU, or 1 gig 
RAM and a 64 bit CPU, which would be a better choice, in general?



--
-wittig http://www.robertwittig.com/
http://robertwittig.net/
http://robertwittig.org/
.



Re: firewall is very slow, something's wrong

2007-10-10 Thread Paul de Weerd
On Wed, Oct 10, 2007 at 12:34:48PM -0500, Robert C Wittig wrote:
| If you had to choose between, say, 2 gig RAM and a 32 bit CPU, or 1 gig
| RAM and a 64 bit CPU, which would be a better choice, in general?

There is no such generalization. The amount of RAM you need depends on
the task. For firewalling, you don't need lots. For a high-traffic,
caching webserver you do need much.

If, in general, you are firewalling .. you won't need much RAM. If, in
general, you are doing something else, you might need it. Like I said
in my previous mail, there is no short answer. No quick solution.
Everything has advantages and disadvantages. In some cases you may not
even want to run OpenBSD (*shock* !).

In general, you should look at the specific problem at hand and solve
it with the means available.

Cheers,

Paul 'WEiRD' de Weerd

--
[++-]+++.+++[---].+++[+
+++-].++[-]+.--.[-]
 http://www.weirdnet.nl/




Re: firewall is very slow, something's wrong

2007-10-10 Thread Ted Unangst
On 10/10/07, Robert C Wittig [EMAIL PROTECTED] wrote:
 If you had to choose between, say, 2 gig RAM and a 32 bit CPU, or 1 gig
 RAM and a 64 bit CPU, which would be a better choice, in general?

64-bit and 1 GB.  it's much easier to add another GB RAM later than to
add 32-bits.



Re: firewall is very slow, something's wrong

2007-10-10 Thread Henning Brauer
* Robert C Wittig [EMAIL PROTECTED] [2007-10-10 20:45]:
 If you had to choose between, say, 2 gig RAM and a 32 bit CPU, or 1 gig RAM 
 and a 64 bit CPU, which would be a better choice, in general?

for a packet filter/router/...? 32bit 2Gig and take a gig out.
for a database server? 64bit and add ram when required.
there is no in general.

-- 
Henning Brauer, [EMAIL PROTECTED], [EMAIL PROTECTED]
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg  Amsterdam



Re: firewall is very slow, something's wrong

2007-10-09 Thread Henning Brauer
* Florin Andrei [EMAIL PROTECTED] [2007-10-05 03:55]:
 The hardware is AMD64, Tyan Transport, 2 CPUs 2 cores each. I am using the 
 SMP kernel. The network card is Intel Pro/1000 PCI Express 4x dual gigabit 
 port, it carries both em0 and em1.

First, you want to run 4.2 or -current, that should about double your 
throughput.
then, an i386 kernel should perform considerably better than amd64 for 
firewalling/routing/...
next, you don't want SMP for such tasks. take out the second CPU and 
give it to somebody who can use it, and run the uniprocessor kernel.
last, increase net.inet.ip.ifq.maxlen until you see the congestion 
counter not increasing much any more under load. should not exceed 2500 
by too much. as a rule of thumb, 256 per gigE interface aren't too far 
off.
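
As a rough illustration of the rule of thumb above, the following Python
sketch computes a starting value for net.inet.ip.ifq.maxlen. The helper
names are hypothetical, and treating 2500 as a hard cap is an assumption
made for the example; the advice is still to raise the value under load
until the congestion counter stops growing.

```python
PER_GIGE_INTERFACE = 256   # rule of thumb quoted in the message
SOFT_CEILING = 2500        # "should not exceed 2500 by too much"


def starting_ifq_maxlen(gige_interfaces: int) -> int:
    """Return a first guess for net.inet.ip.ifq.maxlen (hypothetical helper)."""
    if gige_interfaces < 1:
        raise ValueError("need at least one interface")
    return min(PER_GIGE_INTERFACE * gige_interfaces, SOFT_CEILING)


def sysctl_conf_line(gige_interfaces: int) -> str:
    """Format the guess as a name=value line suitable for sysctl.conf."""
    return f"net.inet.ip.ifq.maxlen={starting_ifq_maxlen(gige_interfaces)}"


# Two gigabit ports, as with em0 and em1 in this thread.
print(sysctl_conf_line(2))
```

This is only a starting point; the measured congestion counter decides
whether the value needs to go higher.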

-- 
Henning Brauer, [EMAIL PROTECTED], [EMAIL PROTECTED]
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg  Amsterdam



Re: firewall is very slow, something's wrong

2007-10-09 Thread Florin Andrei

Henning Brauer wrote:

* Florin Andrei [EMAIL PROTECTED] [2007-10-09 19:34]:
then, an i386 kernel should perform considerably better than amd64 for 
firewalling/routing/...

That is surprising. What is the reason?


we dunno really. it hasn't been benched in some time so it might not even 
be true any more, but last time the difference was dramatic.


Then I will do some tests with 4.2 on gigabit-capable hardware. If 
anything noteworthy comes out, I'll post the results.
Don't expect something too fancy, but I guess anything is better than 
nothing.



How much RAM can the i386 kernel use on an amd64 machine?


4GB minus pci space


Hmmm.

Please correct me if I'm wrong:
Let's say a firewall is connected to a pretty fast Internet pipe (in the 
gigabit range). Let's say there's a DDoS against this environment. In 
theory, the firewall would need lots of RAM so that it can deal with the 
incoming nasty packets, create an entry for each packet in the state 
table (don't know the correct name for it in OpenBSD, sorry), then 
expire it after a while.
In theory, the firewall could be tweaked to expire unused states 
quickly, but still, more RAM is better when dealing with a DDoS.


What's still not clear to me is how much RAM I should provision per 1Gb 
of bandwidth on OpenBSD, assuming there's an incoming 
worst-case-scenario DDoS, that consumes RAM (and other resources) on the 
firewall yet leaves some bandwidth open for legitimate traffic (so the 
firewall must be able to continue to let the good traffic pass through). 
Also assuming some tweaking has been done on the firewall to expire the 
bad stuff quickly without affecting legitimate traffic.


But all that depends on the actual legitimate traffic and on the 
firewall rules.

I guess that's another way of saying more tests are needed. :-/

If the SMP kernel does not actually hurt performance, I might have to use 
it.


it does. seriously. locking is not free.


Aw, damn. I was hoping that's not quite the case.

Well, then hopefully the dynamic routing daemons won't get too greedy 
and DoS the firewall from within. :-) Or I may have to re-think the 
whole environment and forget the idea of doing any kind of dynamic 
routing on the firewall - from a security perspective, dynamic routing 
on the firewall sucks anyway.


Looks like my performance test matrix just got bigger by a factor of 2x. 
:-/ But the bad combinations should get pruned pretty quickly, I guess.


+-----+-------+-------+
|  \  | i386  | amd64 |
+-----+-------+-------+
| SMP |       |       |
+-----+-------+-------+
| UP  |       |       |
+-----+-------+-------+
--
Florin Andrei

http://florin.myip.org/



Re: firewall is very slow, something's wrong

2007-10-08 Thread Florin Andrei

Stuart Henderson wrote:

On 2007/10/04 17:48, Florin Andrei wrote:
All firewall rules are written as stateless as possible - I don't need 
stateful filtering, the setup is very simple (allow HTTP inbound, allow a 
few ICMP types, and that's it).

  congestion  116169  197.2/s


Try setting net.inet.ip.ifq.maxlen to 256 (sysctl/sysctl.conf),
if you still see the congestion count increasing then search for
net.inet.ip.ifq.maxlen in the list archives and have a read.


I raised maxlen to 300. I also enabled ACPI. It's still slow. The 
congestion counter is still not zero - currently at 386.5/s
One good thing is that there used to be a big pause when the kernel was 
booting up, probably waiting for some device or something - now with 
ACPI the pause is smaller. It's still waiting for something, just not as 
much.


I am watching the system with top, set to update every 1s, and I noticed 
there are a lot of interrupt load bursts on CPU0. The percentage of 
interrupt load is very uneven, sometimes as low as 15%, sometimes as 
high as 75%.
I unleashed the UDP flood and the firewall is totally frozen - can't do 
anything even on the local keyboard. Not even the display (running top) 
gets updated anymore. The machine is frozen solid. All network traffic 
stops immediately.

Kill the UDP flood and OpenBSD resumes normal operations.

I tried the uniprocessor kernel and it's exactly the same.

Comparison with Linux on the exact same hardware:
HTTP download speed through the firewall is 112 Mbyte / sec (saturating 
the GigE ports) and the interrupt load is relatively low and constant - 
about 30%.
Under UDP flood with Linux as a firewall, the current download finishes 
up, but a new one cannot get started. The system is not frozen at all, 
it's quite usable, in fact I can heavily overload it (running a bunch of 
CPU hogs) to the point where userspace becomes sluggish and load average 
is up to 250 or so, yet the firewall is not influenced at all.


So what's the deal here? The heavy interrupt load percentage seems to 
indicate an issue with the network driver if I'm not mistaken. But these 
are good and quite popular network cards - Intel Pro/1000 PCI Express 4x 
dual-port gigabit, seen by kernel as em0 and em1


--
Florin Andrei

http://florin.myip.org/



Re: firewall is very slow, something's wrong

2007-10-08 Thread Florin Andrei

Florin Andrei wrote:


I expected OpenBSD 4.1 to do better. But the thing is, even without the 
UDP flood, the OpenBSD firewall is very slow. I am downloading a huge 
file through it, via HTTP, and all I get is 4 Mbyte / sec. With Linux I 
get 112 Mbyte / sec.


Something's wrong. Or I'm doing something wrong.


Disabled all pf rules including NAT, now it's just pass in ; pass out
Now the download is able to saturate the gig ports, about 112 Mbyte / sec.
But it's still not constantly at 112, it sometime drops below that about 
10%. When that happens, CPU0 has 0% idle cycles. A lot of interrupts, 
always above 70% on CPU0, going to 99% when the download slows down.
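
To put the figures quoted in this thread in perspective, here is a small
Python sketch converting them to Mbit/s. It assumes "Mbyte" means decimal
megabytes and a nominal line rate of 1000 Mbit/s; both are assumptions, as
the messages do not say how the numbers were measured.

```python
LINE_RATE_MBIT = 1000.0   # nominal gigabit Ethernet, decimal units assumed


def mbyte_to_mbit(mbyte_per_sec: float) -> float:
    """Convert Mbyte/s to Mbit/s (8 bits per byte)."""
    return mbyte_per_sec * 8


fast_mbit = mbyte_to_mbit(112)   # the saturated case
slow_mbit = mbyte_to_mbit(4)     # the slow case with the full pf ruleset

print(f"fast: {fast_mbit} Mbit/s, {fast_mbit / LINE_RATE_MBIT:.1%} of line rate")
print(f"slow: {slow_mbit} Mbit/s, {slow_mbit / LINE_RATE_MBIT:.1%} of line rate")
print(f"ratio: {fast_mbit / slow_mbit:.0f}x")
```

Under these assumptions the saturated case is close to what a gigabit link
can carry once protocol overhead is accounted for, while the slow case uses
only a few percent of it.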

The congestion counter is now 0.

The UDP flood still freezes the system solid (but I discovered that the 
system clock continues to work more or less fine, it's just the text 
console and the firewall that are not responsive).


I still can't match the performance I get from Linux. Any suggestion is 
appreciated.


--
Florin Andrei

http://florin.myip.org/



Re: firewall is very slow, something's wrong

2007-10-08 Thread knitti
On 10/8/07, Florin Andrei [EMAIL PROTECTED] wrote:
 I still can't match the performance I get from Linux. Any suggestion is
 appreciated.

there were in the past postings on this list about problems with quad-port
em NICs. I am absolutely not in a position to tell whether they are relevant
for this situation.  If I remember correctly, there was a problem with TCP
checksum offloading, and a suggested fix in one instance was jumpering
the card down to 66 MHz. I can't tell if this is related in *any* way.

I think there are some people here who *could* tell if you'd post a dmesg.

greetings,
knitti



Re: firewall is very slow, something's wrong

2007-10-08 Thread Florin Andrei

knitti wrote:


there were in the past postings on this list about problems with quad-port
em NICs. I am absolutely not in a position to tell whether they are relevant
for this situation.  If I remember correctly, there was a problem with TCP
checksum offloading, and a suggested fix in one instance was jumpering
the card down to 66 MHz. I can't tell if this is related in *any* way.

I think there are some people here who *could* tell if you'd post a dmesg.


# dmesg 



OpenBSD 4.1 (GENERIC.MP) #1152: Sat Mar 10 19:22:57 MST 2007
[EMAIL PROTECTED]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 3220754432 (3145268K)
avail mem = 2757828608 (2693192K)
using 22937 buffers containing 322281472 bytes (314728K) of memory
mainbus0 (root)
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xf97e0 (61 entries)
bios0: empty empty
acpi0 at mainbus0: rev 2
acpi0: tables DSDT FACP APIC OEMB SRAT
acpitimer at acpi0 not configured
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Dual-Core AMD Opteron(tm) Processor 2216, 2394.33 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,CX16,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 1MB 64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: apic clock running at 205MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Dual-Core AMD Opteron(tm) Processor 2216, 2465.82 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,CX16,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 1MB 64b/line 16-way L2 cache
cpu1: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu1: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Dual-Core AMD Opteron(tm) Processor 2216, 2465.82 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,CX16,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu2: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 1MB 64b/line 16-way L2 cache
cpu2: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu2: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Dual-Core AMD Opteron(tm) Processor 2216, 2465.82 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,CX16,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu3: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 1MB 64b/line 16-way L2 cache
cpu3: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu3: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
ioapic0 at mainbus0 apid 4 pa 0xfec0, version 11, 16 pins
ioapic1 at mainbus0 apid 5 pa 0xfec01000, version 11, 16 pins
ioapic2 at mainbus0 apid 6 pa 0xfec02000, version 11, 16 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (P0P1)
acpiprt2 at acpi0: bus 2 (P1P2)
acpiprt3 at acpi0: bus 3 (BR14)
acpiprt4 at acpi0: bus 4 (BR1E)
acpiprt5 at acpi0: bus 5 (BR28)
acpiprt6 at acpi0: bus 6 (BR32)
acpiprt7 at acpi0: bus 7 (BR3C)
acpibtn at acpi0 not configured
ipmi0 at mainbus0: reserve send fails
pci0 at mainbus0 bus 0: configuration mode 1
ppb0 at pci0 dev 1 function 0 ServerWorks HT-1000 PCI rev 0x00
pci1 at ppb0 bus 1
ppb1 at pci1 dev 13 function 0 ServerWorks HT-1000 PCIX rev 0xc0
pci2 at ppb1 bus 2
pciide0 at pci1 dev 14 function 0 ServerWorks HT-1000 SATA rev 0x00: DMA
pciide0: using apic 4 int 11 (irq 11) for native-PCI interrupt
pciide0: port 0: device present, speed: 1.5Gb/s
wd0 at pciide0 channel 0 drive 0: ST3250620AS
wd0: 16-sector PIO, LBA48, 238475MB, 488397168 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
pciide0: port 1: PHY offline
pciide0: port 2: PHY offline
pciide0: port 3: PHY offline
piixpm0 at pci0 dev 2 function 0 ServerWorks HT-1000 rev 0x00: polling
iic0 at piixpm0
adt0 at iic0 addr 0x2e: emc6d100 rev 0x68
pciide1 at pci0 dev 2 function 1 ServerWorks HT-1000 IDE rev 0x00: DMA
atapiscsi0 at pciide1 channel 0 drive 1
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: TEAC, DV-28E-R, 1.8A SCSI0 5/cdrom removable
cd0(pciide1:0:1): using PIO mode 4, DMA mode 2, Ultra-DMA mode 0
pcib0 at pci0 dev 2 function 2 ServerWorks HT-1000 LPC rev 0x00
ohci0 at pci0 dev 3 function 0 ServerWorks HT-1000 USB rev 0x01: apic 4 int 10 (irq 10), version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
uhub0 at usb0
uhub0: ServerWorks OHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
ohci1 at pci0 dev 3 function 1 ServerWorks HT-1000 USB rev 0x01: 

Re: firewall is very slow, something's wrong

2007-10-08 Thread Karsten McMinn
On 10/8/07, Florin Andrei [EMAIL PROTECTED] wrote:
 snip
 The UDP flood still freezes the system solid (but I discovered that the
 system clock continues to work more or less fine, it's just the text
 console and the firewall that are not responsive).

 I still can't match the performance I get from Linux. Any suggestion is
 appreciated.

while it is dreadfully obvious that there is some weirdness
happening, you'll definitely get more performance by
switching to the latest snapshot or waiting for your 4.2 cd
if it hasn't come yet.  What model transport do you have
and what's the mainboard's bios rev?



Re: firewall is very slow, something's wrong

2007-10-07 Thread Claudio Jeker
On Thu, Oct 04, 2007 at 05:48:50PM -0700, Florin Andrei wrote:
 Dual-homed firewall, web server on the private network, firewall is 
 doing 1:1 NAT for the web server to the public interface of the 
 firewall. em0 is the public interface, em1 is the private one.
 
 In the exact same setup (same hardware even) I am comparing Linux and 
 OpenBSD for a firewall. Installed Linux on a hard-disc, OpenBSD on 
 another disc, and I'm just swapping discs while I'm testing.
 All firewall rules are written as stateless as possible - I don't need 
 stateful filtering, the setup is very simple (allow HTTP inbound, allow 
 a few ICMP types, and that's it).
 
 With Linux, I achieve gigabit transfer speeds through the firewall 
 (saturating the network ports), but the firewall refuses to let any new 
 connection through when I flood it with a bunch of small UDP packets 
 with random source addresses.
 
 I expected OpenBSD 4.1 to do better. But the thing is, even without the 
 UDP flood, the OpenBSD firewall is very slow. I am downloading a huge 
 file through it, via HTTP, and all I get is 4 Mbyte / sec. With Linux I 
 get 112 Mbyte / sec.
 
 Something's wrong. Or I'm doing something wrong.
 
 The hardware is AMD64, Tyan Transport, 2 CPUs 2 cores each. I am using 
 the SMP kernel. The network card is Intel Pro/1000 PCI Express 4x dual 
 gigabit port, it carries both em0 and em1.
 

I guess you need to enable acpi with config(8) as the system is quite
new and most newer system have busted MP BIOS infos. The effect is bad
interrupt routing and other crazyness -- which is often felt as slow
systems.

-- 
:wq Claudio



Re: firewall is very slow, something's wrong

2007-10-05 Thread Stuart Henderson
On 2007/10/04 17:48, Florin Andrei wrote:
 All firewall rules are written as stateless as possible - I don't need 
 stateful filtering, the setup is very simple (allow HTTP inbound, allow a 
 few ICMP types, and that's it).

You might want to re-think this, stateless rulesets are usually
slower. This is interesting:

http://www.undeadly.org/cgi?action=article&sid=20060927091645

   congestion  116169  197.2/s

Try setting net.inet.ip.ifq.maxlen to 256 (sysctl/sysctl.conf),
if you still see the congestion count increasing then search for
net.inet.ip.ifq.maxlen in the list archives and have a read.
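
One way to judge whether the congestion count is still increasing is to
compare two readings of the counter taken a known number of seconds apart
while the firewall is under load. The Python sketch below does that
arithmetic; the function name and the sample readings are hypothetical and
are not taken from any real measurement.

```python
def congestion_rate(first_count: int, second_count: int, seconds: float) -> float:
    """Increase of the congestion counter per second between two readings."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    if second_count < first_count:
        raise ValueError("counter went backwards (reset between readings?)")
    return (second_count - first_count) / seconds


# Hypothetical readings taken 10 seconds apart under load.
before, after = 116169, 118141
print(f"{congestion_rate(before, after, 10.0):.1f} congestion events per second")
```

A rate near zero after raising net.inet.ip.ifq.maxlen would suggest the
queue is no longer overflowing; a rate that stays high suggests the
bottleneck is elsewhere.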