On Wed, 2005-07-12 at 16:11 -0800, David S. Miller wrote:
> From: John Ronciak <[EMAIL PROTECTED]>
> Date: Wed, 7 Dec 2005 16:09:21 -0800
>
> > On 12/7/05, David S. Miller <[EMAIL PROTECTED]> wrote:
> >
> > I think Jesse's data and recommendation of only keeping the #1, #2
> > and #5 prefetches seem like the right thing to do with data to back
> > it up. It also goes along with what Robert showed as well.
>
> Ok. Let's also see what Jamal's testing shows us.

Good news for the Intel folks: my results agree with yours this time.
Robert, could you double check on Opterons?

#1, #2, and #5 _with copybreak_ turned on gave the best results. I got
about the same results as with all prefetches turned on, +/- a few pps,
which could be attributed to experimental error.

Without copybreak on, the same setup using #1, #2, and #5 also gave the
best result. The interesting thing is that #1 alone was close behind.
#1 and #2 together was not as good.

This leads me to believe that copybreak masks cycles that make the
prefetch look better. If you get rid of copybreak, you end up with worse
numbers - you need to have code with about the same cpu and mem path
characteristics if you are going to replace it.

I don't think copybreak is useful _at all_ for routing. The accounting
reasoning makes sense, but I do plan to chase the test case John pointed
out with UDP at some later point...

the scary part, and I do hope I am wrong:
-------------------------------------------

It means that in the future any single line change to that code path
would result in performance going either up or down, at least for the
hardware I tested.
[I played again with putting a while loop counter in without the
copybreak code and was able to "tune" the performance numbers ;-> ]

So for the people maintaining - keep this in mind, and also don't forget
that different architectures would behave differently. I am told the ARM,
for example, would not suffer from any need to "tune" because the
prefetch is a hint and the CPU knows not to go and fetch when it needs
data.

conclusion?
----------

If Robert reaches the same conclusion, I think we should allow the patch
with the specific prefetches and copybreak turned on, for the simple
reason that this gives the best performance _today_ on the hardware the
tests were run on. It would be valuable for the Intel folks to retest on
a variety of hardware, in particular lower end and other architectures
where this NIC is used.

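For readers who have not looked at the receive path, here is a minimal userspace sketch of the two mechanisms being compared above: a prefetch hint for the next receive buffer, and copybreak, where a small packet is copied into a right-sized buffer so the large receive buffer can be recycled. This is not the driver code. The names rx_one and struct rx_buf and the COPYBREAK value of 256 are hypothetical, chosen only for illustration, and __builtin_prefetch is the GCC/Clang builtin rather than the kernel's own prefetch helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold, for illustration only. */
#define COPYBREAK 256

/* Hypothetical stand-in for a receive descriptor's buffer. */
struct rx_buf {
    unsigned char *data;  /* large, pre-allocated receive buffer */
    size_t len;           /* bytes actually received */
};

/*
 * Process one received packet.
 *
 * Returns the buffer to pass up the stack. If the packet is shorter than
 * COPYBREAK it is copied into a freshly allocated small buffer (caller
 * frees it) and *reused is set to 1, meaning cur->data can be recycled.
 * Otherwise cur->data itself is returned and *reused is set to 0.
 */
unsigned char *rx_one(const struct rx_buf *cur, const struct rx_buf *next,
                      int *reused)
{
    assert(cur != NULL && reused != NULL);

    /* Hint only: ask the CPU to start pulling in the next packet's data. */
    if (next != NULL)
        __builtin_prefetch(next->data, 0 /* read */, 3 /* keep in cache */);

    if (cur->len < COPYBREAK) {
        /* Allocate at least one byte so a zero-length packet still works. */
        unsigned char *copy = malloc(cur->len ? cur->len : 1);
        if (copy == NULL) {
            *reused = 0;
            return NULL;
        }
        memcpy(copy, cur->data, cur->len);
        *reused = 1;
        return copy;
    }

    *reused = 0;
    return cur->data;
}
```

The copy in the small-packet branch is extra work that sits between the prefetch and the first touch of the next buffer, which is one way to picture how copybreak could mask cycles and make the prefetch look better. Any layout like this still has to be measured on the target hardware.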
cheers,
jamal