On Friday 05 May 2006 10:49, Eric Dumazet wrote:
> On a dual Opteron box, I noticed high oprofile numbers in net/core/dst.c,
> function dst_destroy(struct dst_entry *dst).
> 
> It appears the smp_rmb() done at the beginning of dst_destroy() is the 
> killer (this is an lfence machine instruction, which apparently is doing 
> a *lot* of things... maybe I/O related...): it is responsible for 80% 
> of the CPU time used by the whole function.
> 
> I don't understand the whole variety of available barriers very well, or 
> why this smp_rmb() is used in dst_destroy().
> I could not find the corresponding wmb that should be done somewhere in 
> the dst code.
> 
> Do we have an alternative to smp_rmb() in the dst_destroy()/kfree_skb() 
> context?

Eliminating it probably wouldn't help very much - it just flushes the 
loads already in flight. If it didn't do that, the next smp_rmb() would.
I'm surprised there are that many, though. Normally kernel code is spaghetti
enough that the CPU cannot speculate too many loads ahead.
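
On the question of the matching wmb: a read barrier only makes sense
when some writer orders its stores the same way. Here is a minimal
userspace sketch of that pairing, not the dst code. It uses C11 fences as
approximate stand-ins for smp_wmb()/smp_rmb() (the C11 fences are somewhat
stronger), and the names payload, ready, publisher and consume_once are
hypothetical, made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;          /* plain data published by the writer */
static atomic_int ready;     /* flag the reader polls */

/* Writer side: store the data, then the barrier, then the flag. */
static void *publisher(void *arg)
{
    (void)arg;
    payload = 42;
    atomic_thread_fence(memory_order_release);   /* stand-in for smp_wmb() */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

/* Reader side: load the flag, then the barrier, then the data. */
static int consume_once(void)
{
    pthread_t t;
    int rc, seen;

    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);

    rc = pthread_create(&t, NULL, publisher, NULL);
    assert(rc == 0);
    (void)rc;

    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;                                        /* spin until published */
    atomic_thread_fence(memory_order_acquire);   /* stand-in for smp_rmb() */
    seen = payload;          /* ordered after the flag load by the fence */

    pthread_join(t, NULL);
    return seen;
}
```

Calling consume_once() returns the value the writer stored before it set
the flag. Without the reader-side fence, the load of payload could in
principle be satisfied before the load of ready.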

But are you 100% sure the cost is not in the lock decl? That would make
more sense. Perhaps profile for cache misses too and double-check?

-Andi