Jesse Brandeburg wrote:
On 12/7/05, David S. Miller <[EMAIL PROTECTED]> wrote:

From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Thu, 08 Dec 2005 04:47:05 +0100


#4 followed by #5, as proposed in the patch, cannot be a win

+             prefetch(next_skb);
+             prefetch(next_skb->data - NET_IP_ALIGN);

because at the time #5 is done, the CPU doesn't yet have next_skb->data in its
cache (because the #4 prefetch is the immediately preceding instruction)

prefetch(ptr) is mostly a free hint for the CPU.

prefetch(*ptr) can be expensive because of the needed indirection that might
slow down the CPU if *ptr is not yet available in its L1 cache.

Agreed.  Doing a dependent prefetch back to back, so close like
that, is nearly pointless.
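
The distinction drawn above between prefetch(ptr) and prefetch(*ptr) can be sketched in user-space C. This is only an illustration under stated assumptions: struct pkt is a hypothetical stand-in for struct sk_buff, the GCC/Clang builtin __builtin_prefetch stands in for the kernel's prefetch(), and the NET_IP_ALIGN offset is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the field discussed here. */
struct pkt {
    unsigned char *data;    /* points at the packet payload */
};

/* Like #4: the address is already in a register, so issuing the hint
 * needs no memory access. */
static inline void prefetch_struct(const struct pkt *next)
{
    __builtin_prefetch(next);
}

/* Like #5 (without the NET_IP_ALIGN offset): next->data must be loaded
 * before the hint can be issued. If the line holding *next is not cached
 * yet, the prefetch cannot be issued until that load completes. */
static inline void prefetch_payload(const struct pkt *next)
{
    assert(next != NULL);           /* next really is dereferenced here */
    __builtin_prefetch(next->data);
}
```

Calling prefetch_struct(next) immediately followed by prefetch_payload(next) reproduces the back-to-back dependent pair: the first hint has had no time to complete when the second one needs its result.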


Right, after I wrote this code, I realized that, and it is demonstrable
that #4 hurts, if only a little.
I'm sticking with my suggestion that we go to #1,#2,#5

I would try another thing: #1,#2,#4bis

#4bis          prefetch(&next_skb->data);

instead of any combination of #4 or #5

#4             prefetch(next_skb);
#5             prefetch(next_skb->data - NET_IP_ALIGN);

This way, the next time #1 is done (in the next loop iteration), the previous #4bis should already have brought the data field of that skb into the L1 cache, so the dereference hits the cache and the prefetch should be more efficient.

Eric
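
The #4bis idea can be sketched as a loop in user-space C. This is a rough illustration, not the driver code: struct pkt and process_ring are hypothetical names, __builtin_prefetch (a GCC/Clang builtin) stands in for the kernel's prefetch(), and it is assumed, as the message implies, that #1 is a prefetch that dereferences skb->data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct pkt {
    unsigned char *data;    /* points at the packet payload */
    size_t len;             /* payload length in bytes */
};

/* Walks a ring of packets and sums their payload bytes (placeholder work). */
static unsigned long process_ring(struct pkt *const *ring, size_t n)
{
    unsigned long sum = 0;

    assert(ring != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct pkt *skb = ring[i];

        /* Assumed analogue of #1: needs skb->data, which the #4bis hint
         * issued during the previous iteration should have made cheap. */
        __builtin_prefetch(skb->data);

        if (i + 1 < n) {
            struct pkt *next_skb = ring[i + 1];

            /* Analogue of #4bis: hint the cache line holding the data
             * field itself; next_skb is not dereferenced here. */
            __builtin_prefetch(&next_skb->data);
        }

        for (size_t j = 0; j < skb->len; j++)
            sum += skb->data[j];
    }
    return sum;
}
```

The point of the arrangement is that the hint on &next_skb->data has a whole iteration of work in which to complete before the dependent load of that field happens. Whether it actually wins is something only a measurement on the real driver can show.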
