On Tue, Mar 07, 2006 at 01:04:36PM +1100, Nick Piggin wrote:
> I'd say it will turn out to be more trouble than its worth, for the miserly
> cost avoiding one atomic_inc, and one atomic_dec_and_test on page-local data
> that will be in L1 cache. I'd never turn my nose up at anyone just having a
> go though :)
The cost is anything but miserly. Consider that every lock-prefixed
instruction is a memory barrier, which forces your OoO CPU to ramp down from
lots of instructions in flight to just 1 for the time it takes that
instruction to execute. That synchronization, not the cache access, is what
makes the atomic expensive.
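
A rough way to see this effect from user space is the sketch below. It is
not the kernel code under discussion: it uses C11 <stdatomic.h> rather than
the kernel's atomic_inc(), the helper names (plain_incs, atomic_incs) are
made up for illustration, and it assumes a single thread with the counter
already hot in L1, so any gap it shows comes from the atomic operation
itself rather than from cache misses or contention.

```c
#include <stdatomic.h>
#include <time.h>

/* Plain increments. volatile only stops the compiler from deleting the
 * loop; it adds no barrier and no lock prefix. Returns CPU seconds. */
static double plain_incs(long n, long *result)
{
    volatile long counter = 0;
    clock_t start = clock();

    for (long i = 0; i < n; i++)
        counter++;

    clock_t end = clock();
    *result = counter;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Atomic increments with the default sequentially consistent ordering.
 * On x86 this is typically emitted as a lock-prefixed add or xadd. */
static double atomic_incs(long n, long *result)
{
    atomic_long counter = 0;
    clock_t start = clock();

    for (long i = 0; i < n; i++)
        atomic_fetch_add(&counter, 1);

    clock_t end = clock();
    *result = atomic_load(&counter);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Call both with the same n (say, a few hundred million) and compare the
returned times. The ratio depends heavily on the CPU and compiler, so treat
any number it produces as indicative only.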
In the case of netperf, I ended up with a 2.5 Gbit/s (~30%) performance
improvement through nothing but micro-optimizations. There is method to
my madness. ;-)
-ben