On 20 Jan 2012, at 00:46, David Xu wrote:

> It depends on the hardware. A small conflict on a dual-core machine
> can become a large conflict on a large machine with lots of CPUs,
> because more CPUs may now be running the same code, which becomes
> a bottleneck. On a large machine with 1024 cores, much code needs
> to be redesigned.

You'll also find that the cost of atomic instructions varies a lot between
CPU models.  Between Core 2 and Sandy Bridge Core i7, the relative cost of
an atomic add (full barrier) dropped by about two thirds.  The cache
coherency logic has been significantly improved on the newer chips.
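
If you want to see this on your own hardware, something like the following
rough sketch is enough to compare machines.  The helper name
ns_per_atomic_add is made up for illustration, and it only measures the
uncontended, single-threaded case, so it shows the cost of the instruction
itself and not the cost of cache-line contention between cores.  Treat the
numbers as indicative only.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not from any library): average nanoseconds per
// sequentially consistent fetch_add on an uncontended atomic, measured on
// the calling thread.  Optionally reports the final counter value.
double ns_per_atomic_add(std::uint64_t iterations,
                         std::uint64_t *final_value = nullptr)
{
    std::atomic<std::uint64_t> counter{0};

    auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i)
        counter.fetch_add(1, std::memory_order_seq_cst);  // full barrier
    auto stop = std::chrono::steady_clock::now();

    if (final_value != nullptr)
        *final_value = counter.load();
    if (iterations == 0)
        return 0.0;  // avoid dividing by zero

    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Calling ns_per_atomic_add(10000000) on two different CPU models gives a
crude per-operation figure to compare; dividing it by the cost of a plain
(non-atomic) add on the same machine gives the relative cost.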

For portable code, it's worth remembering that ARMv8 (which doesn't entirely
exist yet) contains a set of barriers that closely match the semantics of the
C[++]11 memory orderings.  They do this not for performance (directly), but
for power efficiency, so using the least-restrictive memory ordering that is
still correct will eventually result in code for mobile devices that uses
less battery power, if it's in a hot path.
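
As a rough illustration of "least restrictive" in C++11 terms (a sketch, not
code from the tree; the function names are invented for the example): a
statistics counter only needs atomicity, so relaxed ordering is enough, and
publishing data to another thread needs release on the store and acquire on
the load, not a full sequentially consistent barrier.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical example 1: an event counter.  Nothing else is ordered
// against the increments, so memory_order_relaxed is sufficient.
long count_events(int threads, long per_thread)
{
    std::atomic<long> hits{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&hits, per_thread] {
            for (long i = 0; i < per_thread; ++i)
                hits.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto &th : pool)
        th.join();  // join() synchronizes with the end of each thread
    return hits.load(std::memory_order_relaxed);
}

// Hypothetical example 2: publishing a plain value through a flag.
// The release store makes the write to payload visible to the thread
// whose acquire load observes ready == true.
int publish_and_read(int value)
{
    int payload = 0;
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire))
            std::this_thread::yield();
        seen = payload;  // ordered after the producer's write
    });

    payload = value;
    ready.store(true, std::memory_order_release);
    consumer.join();
    return seen;
}
```

Writing both of these with the default memory_order_seq_cst would also be
correct, just stronger than needed.  The point above is that on hardware
whose barriers map closely onto these orderings, asking for only what you
need lets the compiler pick the cheaper instruction.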

David