Jesse Brandeburg wrote:
On 12/7/05, David S. Miller <[EMAIL PROTECTED]> wrote:
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Thu, 08 Dec 2005 04:47:05 +0100
#4 + #5 as proposed in the patch cannot be a win
+ prefetch(next_skb);
+ prefetch(next_skb->data - NET_IP_ALIGN);
because at the time #5 is done, the CPU doesn't yet have next_skb->data in its cache
(because the #4 prefetch is the previous instruction)
prefetch(ptr) is mostly a free hint for the CPU.
prefetch(*ptr) can be expensive because of the needed indirection that might
slow down the CPU if *ptr is not yet available in its L1 cache.
Agreed. Doing a dependent prefetch back to back like
that is nearly pointless.
Right, after I wrote this code, I realized that, and it is demonstrable
that #4 hurts, if only a little.
I'm sticking with my suggestion we go to #1,#2,#5
I would try something else: #1, #2, #4bis
#4bis prefetch(&next_skb->data);
instead of any combination of #4 or #5
#4 prefetch(next_skb);
#5 prefetch(next_skb->data - NET_IP_ALIGN);
This way, the next time #1 is done (next loop iteration), the previous #4bis makes
the dereference hit the L1 cache, so the prefetch should be more efficient.
Eric
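
For illustration only, here is a minimal user-space sketch of the #4bis idea:
while processing the current buffer, prefetch the address of the next
buffer's data field, so that the later load of next_skb->data is more likely
to hit the cache. It is not the driver code from the patch. struct fake_skb,
prefetch_hint() and sum_list() are hypothetical names, and
__builtin_prefetch (GCC/Clang) is assumed as a stand-in for the kernel's
prefetch().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: only what the sketch needs. */
struct fake_skb {
    struct fake_skb *next;
    unsigned char *data;
    size_t len;
};

/* Assumed user-space replacement for the kernel's prefetch(): a pure hint. */
static inline void prefetch_hint(const void *p)
{
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p);
#else
    (void)p; /* no prefetch available: do nothing */
#endif
}

/* Walk the list and sum all payload bytes. */
static unsigned long sum_list(struct fake_skb *skb)
{
    unsigned long sum = 0;

    while (skb) {
        struct fake_skb *next_skb = skb->next;

        /*
         * #4bis-style hint: prefetch the cache line holding the next
         * buffer's data pointer. No dependent load is issued here; the
         * pointer itself is only read on the next iteration.
         */
        if (next_skb)
            prefetch_hint(&next_skb->data);

        assert(skb->data != NULL || skb->len == 0);
        for (size_t i = 0; i < skb->len; i++)
            sum += skb->data[i];

        skb = next_skb;
    }
    return sum;
}
```

Whether this helps in practice depends on the CPU and on how much work
separates the hint from the later dereference, so it would need to be
measured rather than assumed.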