jamal wrote:
On Wed, 2005-07-12 at 11:48 -0800, John Ronciak wrote:

On 12/7/05, Jeff Garzik <[EMAIL PROTECTED]> wrote:


So... under load, copybreak causes e1000 to fall over more rapidly than
no-copybreak?

If so, it sounds like copybreak should be disabled by default, and/or a
runtime switch added for it.

I wouldn't say "fall over".  With small-packet-only tests (the ones
being run for this exercise) _all_ packets are being copied, which is
why you see performance drop once the system becomes CPU bound.  Normal
cases don't have only small packets, and that is where the gains are.
Those cases are also what is not being tested, because I'm sure nobody
would be able to agree on an acceptable test for them.  Copybreak
probably shouldn't be used in routing use cases.  Since I think routing
is the special case and not the normal case, copybreak should be on by
default and disabled in cases like small-packet routing.
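
As a rough sketch of the copybreak idea (the function name, the helper
layout and the 256-byte cutoff below are made up for illustration; this
is not the actual e1000 code): packets at or below a size threshold are
copied into a freshly allocated small skb so the full-size receive
buffer can be recycled immediately, while larger packets go up the
stack in the original buffer.

#include <linux/skbuff.h>
#include <linux/string.h>

/* Illustrative sketch only -- not the actual e1000 code.  The cutoff
 * value and the function name are invented for this example. */
#define RX_COPYBREAK	256	/* copy packets at or below this size */

static struct sk_buff *rx_copybreak(struct sk_buff *rx_skb, unsigned int len)
{
	struct sk_buff *small;

	if (len > RX_COPYBREAK)
		return rx_skb;	/* large packet: hand the original buffer up */

	small = dev_alloc_skb(len + NET_IP_ALIGN);
	if (!small)
		return rx_skb;	/* allocation failed: fall back to the original */

	skb_reserve(small, NET_IP_ALIGN);	/* keep the IP header aligned */
	memcpy(skb_put(small, len), rx_skb->data, len);

	/* The full-size rx_skb can now be re-posted to the rx ring instead
	 * of travelling up the stack carrying only "len" useful bytes. */
	return small;
}

With small-packet-only traffic every packet takes the copy path, which
is why the CPU cost dominates in those tests.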


I am no longer sure that your results on copybreak for host-bound
packets can be trusted.  According to my tests, all your copybreak was
doing was making the prefetch look good.

Eric Dumazet <[EMAIL PROTECTED]> theorized there may be some value in
copybreak if you are host bound.  I have only seen it as an unnecessary
pain, really.


In my case, my production servers are usually RAM bound, not CPU bound.
Without copybreak-enabled NIC drivers, they crash within a few minutes.

Humor me: try to replace the code that does copybreak with a while loop
that counts down from 20 -> 0, and see if you notice any changes for
host-bound traffic.
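
(Taking that experiment literally, something like the snippet below in
place of the copy -- the variable name is invented, and "20" is just the
count jamal suggested.  The point is to see whether the measured win
comes from the extra delay rather than from the copy itself.)

/* illustrative stand-in for the copybreak copy: a dummy countdown of
 * roughly comparable cost, instead of copying the packet data */
volatile int spin = 20;

while (spin > 0)
	spin--;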

What are you trying to learn from these experiments that we don't already know?

You can probably tune your kernel without the prefetch stuff by adding some random delays in some spots.  For example, you might handle fewer interrupts if you slow down the interrupt handler and let the NIC feed you more packets at each interrupt: less system overhead, fewer atomic ops... and so on.
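
(For what it's worth, the usual knob for letting the NIC batch more
packets per interrupt is the rx coalescing setting; the interface name
and the delay value below are only examples.)

# illustrative values only -- raise the rx interrupt delay so each
# interrupt services a larger batch of packets
ethtool -C eth0 rx-usecs 100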

Another example of what can happen on SMP:

Say you have nice NIC coalescing parameters, so that you receive about 10 packets per interrupt.  As soon as the first packet is processed, another CPU might wake a user thread that tries to emit an answer packet while the first CPU still holds a lock in the NIC interrupt handler.

So adding a delay in the user thread might improve performance too, because it might reduce contention.

And conversely, if you optimize an application, using a better compiler for example, you might get worse performance.  Should we stop trying to have better compilers?

Eric