We have this patch primarily to get valid profile counts. We observe that for some heavily threaded programs we get poor counter values because of data races in the counter updates (e.g., the counter value is only 15% of what it is supposed to be for a 10-thread program).
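
To illustrate the lost-update effect, here is a minimal sketch. It is not the code the patch emits. The names (update_plain, update_atomic, run_workers), the thread count and the iteration count are all made up for the example. The plain update is modelled as a separate load and store, so increments from other threads can be overwritten. The atomic update uses the __atomic_fetch_add builtin with __ATOMIC_RELAXED, as suggested in the quoted thread.

```c
#include <pthread.h>
#include <stdint.h>

#define N_THREADS 10
#define N_ITERS   100000

/* Hypothetical stand-ins for two 64-bit profile counters. */
int64_t plain_counter;
int64_t atomic_counter;

/* Models a non-atomic counter update: load, add, store.
   Relaxed atomic load/store are used only to keep the sketch free of
   undefined behaviour; the read-modify-write as a whole is still not
   atomic, so concurrent increments can be lost.  */
void update_plain(int64_t *c)
{
    int64_t v = __atomic_load_n(c, __ATOMIC_RELAXED);
    __atomic_store_n(c, v + 1, __ATOMIC_RELAXED);
}

/* Atomic read-modify-write; no ordering is needed for a pure counter. */
void update_atomic(int64_t *c)
{
    __atomic_fetch_add(c, 1, __ATOMIC_RELAXED);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < N_ITERS; i++) {
        update_plain(&plain_counter);
        update_atomic(&atomic_counter);
    }
    return NULL;
}

/* Runs N_THREADS workers over both counters and returns the final
   value of the atomic counter.  */
int64_t run_workers(void)
{
    pthread_t t[N_THREADS];

    plain_counter = 0;
    atomic_counter = 0;
    for (int i = 0; i < N_THREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < N_THREADS; i++)
        pthread_join(t[i], NULL);
    return atomic_counter;
}
```

After run_workers() returns, atomic_counter is always N_THREADS * N_ITERS, while plain_counter can be anywhere below that, depending on how the threads interleave.
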
In general, enabling atomic updates slows down programs (for some of my toy programs, the slowdown is 3x). That is the reason I use options to control value and edge profile counts.

-Rong

On Thu, Dec 20, 2012 at 8:57 AM, Andrew Pinski <pins...@gmail.com> wrote:
> On Thu, Dec 20, 2012 at 8:20 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>>> On Wed, Dec 19, 2012 at 4:29 PM, Andrew Pinski <pins...@gmail.com> wrote:
>>> >
>>> > On Wed, Dec 19, 2012 at 12:08 PM, Rong Xu <x...@google.com> wrote:
>>> > > Hi,
>>> > >
>>> > > This patch adds support for atomic updates of the profile counters.
>>> > > Tested with google internal benchmarks and fdo kernel build.
>>> >
>>> > I think you should use the __atomic_ functions instead of __sync_
>>> > functions as they allow better performance for simple counters as you
>>> > can use __ATOMIC_RELAXED.
>>>
>>> You are right. I think __ATOMIC_RELAXED should be OK here.
>>> Thanks for the suggestion.
>>>
>>> >
>>> > And this would be useful for the trunk also. I was going to implement
>>> > this exact thing this week but some other important stuff came up.
>>>
>>> I'll post trunk patch later.
>>
>> Yes, I like that patch, too. Even if the costs are quite high (and this is
>> why atomic updates were sort of voted down in the past) the alternative of
>> using TLS has problems with too much per-thread memory.
>
> Actually sometimes (on some processors) atomic increments are cheaper
> than doing a regular increment. Mainly because there is an
> instruction which can handle it in the L2 cache rather than populating
> the L1. Octeon is one such processor where this is true.
>
> Thanks,
> Andrew Pinski
>
>>
>> While there are even more alternatives, like recording the changes and
>> committing them in blocks (say at function return), I guess some solution is
>> better than no solution.
>>
>> Thanks,
>> Honza
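
For reference, the "commit in blocks at function return" alternative mentioned in the quoted thread could look roughly like the sketch below. This is only one reading of that idea, not something the patch implements. The counter name loop_edge_counter and the function sum_to are hypothetical. The count is kept in a local variable and added to the shared counter with a single relaxed atomic add when the function returns.

```c
#include <stdint.h>

/* Hypothetical shared profile counter for one loop edge. */
int64_t loop_edge_counter;

/* Hypothetical instrumented function: counts loop iterations in a
   local variable and commits the total once, on return.  */
int64_t sum_to(int64_t n)
{
    int64_t local_count = 0;   /* private to this call, no races */
    int64_t sum = 0;

    for (int64_t i = 1; i <= n; i++) {
        local_count++;
        sum += i;
    }

    /* One atomic update per call instead of one per iteration. */
    __atomic_fetch_add(&loop_edge_counter, local_count, __ATOMIC_RELAXED);
    return sum;
}
```

The trade-off is that the shared counter lags behind until the function returns, and a function with several exits has to commit on each of them.
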