On Friday 05 May 2006 10:49, Eric Dumazet wrote:
> On a dual Opteron box, I noticed high oprofile numbers in net/core/dst.c,
> function dst_destroy(struct dst_entry *dst).
>
> It appears the smp_rmb() done at the beginning of dst_destroy() is the
> killer (this is an lfence machine instruction, which apparently is doing
> a *lot* of things... maybe I/O related...): it is responsible for 80%
> of the CPU time used by the whole function.
>
> I don't understand the full variety of available barriers very well, or
> why this smp_rmb() is used in dst_destroy().
> I could not find the corresponding wmb that should be done somewhere in
> the dst code.
>
> Do we have an alternative to smp_rmb() in the dst_destroy()/kfree_skb()
> context?

Eliminating it probably wouldn't help very much - it just flushes the
loads already in flight. If it didn't do that, the next smp_rmb() would.

I'm surprised there are that many though. Normally kernel code is spaghetti
enough that the CPU cannot speculate too many loads ahead.

But are you 100% sure the cost is not in the lock decl? That would make
more sense. Perhaps profile for cache misses too and double check?

-Andi
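
For readers following the question about the "corresponding wmb": a read
barrier is only meaningful when it pairs with a write barrier on the side
that publishes the data. The sketch below is a userspace illustration of that
pairing, not the dst code. It assumes C11 fences as rough stand-ins for
smp_wmb()/smp_rmb(); they are not exact equivalents. The names payload,
ready, writer, reader and run_pairing_demo are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical shared state; nothing here is taken from net/core/dst.c. */
static int payload;          /* plain data published by the writer */
static atomic_int ready;     /* flag telling the reader the data is valid */

static void *writer(void *arg)
{
    (void)arg;
    payload = 42;                                  /* 1. store the data */
    atomic_thread_fence(memory_order_release);     /* rough smp_wmb() analogue */
    atomic_store_explicit(&ready, 1, memory_order_relaxed); /* 2. publish */
    return NULL;
}

static void *reader(void *arg)
{
    int *out = arg;

    /* 1. Spin until the writer has published the flag. */
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;
    atomic_thread_fence(memory_order_acquire);     /* rough smp_rmb() analogue */
    *out = payload;                                /* 2. only now read the data */
    return NULL;
}

/* Runs one writer and one reader; returns the value the reader observed,
 * or -1 if a thread could not be created. */
int run_pairing_demo(void)
{
    pthread_t w, r;
    int seen = -1;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&r, NULL, reader, &seen) != 0)
        return -1;
    if (pthread_create(&w, NULL, writer, NULL) != 0) {
        /* Release the spinning reader before giving up. */
        atomic_store(&ready, 1);
        pthread_join(r, NULL);
        return -1;
    }
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Because the reader's fence sits between the flag load and the data load, and
the writer's fence sits between the data store and the flag store, the reader
cannot observe the flag set and still see the old payload. Dropping either
fence breaks that guarantee on weakly ordered machines, which is why a lone
read barrier with no matching write barrier is worth questioning.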