The main purpose of this patch is to get valid profile counts. We
observe that for some heavily threaded programs the counters are badly
off because of data races in the counter updates (for example, a counter
value is only 15% of what it is supposed to be for a 10-thread program).

In general, enabling atomic updates slows programs down (for some of
my toy programs, the slowdown is 3x). That is the reason I use
options to control atomic updates for value and edge profile counts.
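
To make the two update styles concrete, here is a minimal sketch. It is
not the code the patch emits; the names plain_counter, atomic_counter,
worker and run_counters are made up for illustration. The plain counter
is bumped with a separate load and store, so increments from other
threads can be lost; the atomic counter uses __atomic_fetch_add with
__ATOMIC_RELAXED, as suggested in the thread below.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the code gcc emits. */
static long plain_counter;   /* updated with a separate load and store */
static long atomic_counter;  /* updated with one atomic read-modify-write */

struct worker_arg { long iters; };

static void *worker(void *p)
{
    long iters = ((struct worker_arg *)p)->iters;

    for (long i = 0; i < iters; i++) {
        /* Racy update: another thread may store between our load and
           our store, in which case its increment is overwritten.  The
           load and store are individually atomic only so that the
           sketch itself has no undefined behavior. */
        long v = __atomic_load_n(&plain_counter, __ATOMIC_RELAXED);
        __atomic_store_n(&plain_counter, v + 1, __ATOMIC_RELAXED);

        /* Atomic update: the whole increment is one indivisible
           operation.  Relaxed ordering is enough for a pure counter. */
        __atomic_fetch_add(&atomic_counter, 1, __ATOMIC_RELAXED);
    }
    return NULL;
}

/* Runs nthreads workers, each doing iters increments of both counters.
   Returns 0 on success, -1 if a thread could not be created. */
int run_counters(int nthreads, long iters, long *plain_out, long *atomic_out)
{
    pthread_t *tids = malloc(sizeof *tids * (size_t)nthreads);
    struct worker_arg arg = { iters };
    int started = 0;

    if (tids == NULL)
        return -1;

    plain_counter = 0;
    atomic_counter = 0;

    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);

    *plain_out = plain_counter;
    *atomic_out = atomic_counter;
    return started == nthreads ? 0 : -1;
}
```

Calling run_counters(10, 1000000, &plain, &atomic) from a small main
(built with -pthread) should always give atomic == 10000000, while
plain can come out lower by an amount that varies from run to run.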

-Rong

On Thu, Dec 20, 2012 at 8:57 AM, Andrew Pinski <pins...@gmail.com> wrote:
> On Thu, Dec 20, 2012 at 8:20 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>>> On Wed, Dec 19, 2012 at 4:29 PM, Andrew Pinski <pins...@gmail.com> wrote:
>>> >
>>> > On Wed, Dec 19, 2012 at 12:08 PM, Rong Xu <x...@google.com> wrote:
>>> > > Hi,
>>> > >
>>> > > This patch adds support for atomically updating the profile counters.
>>> > > Tested with google internal benchmarks and fdo kernel build.
>>> >
>>> > I think you should use the __atomic_ functions instead of __sync_
>>> > functions as they allow better performance for simple counters as you
>>> > can use __ATOMIC_RELAXED.
>>>
>>> You are right. I think __ATOMIC_RELAXED should be OK here.
>>> Thanks for the suggestion.
>>>
>>> >
>>> > And this would be useful for the trunk also.  I was going to implement
>>> > this exact thing this week but some other important stuff came up.
>>>
>>> I'll post trunk patch later.
>>
>> Yes, I like that patch, too. Even if the costs are quite high (and this is
>> why atomic updates were sort of voted down in the past), the alternative of
>> using TLS has problems with too much per-thread memory.
>
> Actually sometimes (on some processors) atomic increments are cheaper
> than doing a regular increment.  Mainly because there is an
> instruction which can handle it in the L2 cache rather than populating
> the L1.   Octeon is one such processor where this is true.
>
> Thanks,
> Andrew Pinski
>
>>
>> While there are even more alternatives, like recording the changes and
>> committing them in blocks (say at function return), I guess some solution is
>> better than no solution.
>>
>> Thanks,
>> Honza
