On Wed, 2005-07-12 at 16:11 -0800, David S. Miller wrote:
> From: John Ronciak <[EMAIL PROTECTED]>
> Date: Wed, 7 Dec 2005 16:09:21 -0800
> 
> > On 12/7/05, David S. Miller <[EMAIL PROTECTED]> wrote:
> >
> > I think Jesse's data and recommendation of only keeping the #1, #2
> > and #5 prefetches seem like the right thing to do with data to back
> > it up.  It also goes along with what Robert showed as well.
> 
> Ok.  Let's also see what Jamal's testing shows us.

Good news to the Intel folks: my results agree with yours this time.
Robert, could you double check on Opterons?
#1, #2, and #5 _with copybreak_ turned on gave the best results. I got
about the same results as with all prefetches turned on, +/- a few pps,
which could be attributed to experimental error.
Without copybreak on, the same setup using #1, #2, and #5 also gave the
best result.

The interesting thing is that #1 alone was close behind. #1 and #2
together were not as good. This leads me to believe that copybreak masks
cycles that make the prefetch look better. If you get rid of copybreak,
you end up with worse numbers - you need to have code with about the same
cpu and mem path characteristics if you are going to replace it.
I don't think copybreak is useful _at all_ for routing.  The accounting
reasoning makes sense, but I do plan to chase the test case John pointed
out with UDP at some later point...
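
For readers not familiar with the term, here is a rough userspace sketch
of the copybreak idea: a packet shorter than a threshold is copied into a
small freshly allocated buffer so the large receive buffer can stay on the
ring. This is not the driver code. The names rx_buf and rx_copybreak and
the COPYBREAK value of 256 are made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RX_BUF_SIZE 2048   /* illustrative receive buffer size */
#define COPYBREAK   256    /* hypothetical threshold, not the driver default */

/* Hypothetical stand-in for one receive ring slot. */
struct rx_buf {
    unsigned char *data;   /* large buffer owned by the ring */
    size_t len;            /* bytes received into it */
};

/*
 * Returns the buffer to hand up the stack.
 * *recycled is set to 1 if the ring slot keeps its large buffer,
 * or 0 if the large buffer was handed up and the slot needs a refill.
 */
static unsigned char *rx_copybreak(struct rx_buf *slot, int *recycled)
{
    if (slot->len < COPYBREAK) {
        unsigned char *small = malloc(slot->len ? slot->len : 1);
        if (small != NULL) {
            /* Small packet: pay for a copy, keep the big buffer. */
            memcpy(small, slot->data, slot->len);
            *recycled = 1;
            return small;
        }
    }
    /* Large packet (or allocation failure): hand up the original. */
    unsigned char *out = slot->data;
    slot->data = NULL;
    *recycled = 0;
    return out;
}
```

The memcpy on the small-packet path is the extra cpu and memory work
referred to above; removing it changes the timing of everything around it.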

the scary part, and i do hope i am wrong:
-------------------------------------------

It means that in the future any single line change to that code path
would result in performance either going up or down at least for the
hardware i tested. [I played again with putting a while loop counter
without the copybreak code and was able to "tune" the performance
numbers ;-> ]
So for the people maintaining this code - keep this in mind, and also
don't forget that different architectures would behave differently. I am
told the ARM, for example, would not suffer from any need to "tune"
because the prefetch is a hint and the CPU knows not to go and fetch when
it needs data.
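
To make "the prefetch is a hint" concrete, here is a minimal userspace
sketch, assuming GCC or Clang, using the __builtin_prefetch builtin. The
rx_desc layout and sum_lengths are hypothetical and only stand in for a
loop walking a receive ring. The prefetch does not change the result, only
(possibly) the timing, and the CPU is free to ignore it.

```c
#include <stddef.h>

/* Hypothetical descriptor layout, for illustration only. */
struct rx_desc {
    unsigned int length;
    unsigned int status;
};

/* Sum the packet lengths in a ring, hinting the next entry ahead of use. */
static unsigned long sum_lengths(const struct rx_desc *ring, size_t n)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n) {
            /* rw = 0 (read), locality = 3 (keep in all cache levels) */
            __builtin_prefetch(&ring[i + 1], 0, 3);
        }
        total += ring[i].length;
    }
    return total;
}
```

Whether a hint like this helps or hurts depends on how much other work
sits between the prefetch and the first real access, which is why small
changes to the surrounding code can move the numbers.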

conclusion?
----------

If Robert reaches the same conclusion, I think we should allow the patch
with the specific prefetches and copybreak turned on for the simple
reason that this gives the best performance _today_ on the hardware the
tests were run on.
It would be valuable for the Intel folks to retest on a variety of
hardware, in particular lower end and other architectures where this NIC
is used.

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
