https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #45 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Linus Torvalds from comment #43)
> (In reply to Richard Biener from comment #42)
> > 
> > I think if we want to avoid doing optimizations on gcov counters we should
> > make them volatile. 
> 
> Honestly, that sounds like the cleanest and safest option to me.
> 
> That said, with the gcov counters apparently also being 64-bit, I suspect it
> will create some truly horrid code generation.
> 
> Presumably you'd end up getting a lot of load-load-add-adc-store-store
> instruction patterns, which is not just six instructions when just two
> should do - it also uses up two registers.
> 
> So while it sounds like the simplest and safest model, maybe it just makes
> code generation too unbearably bad?
> 
> Maybe nobody who uses gcov would care. But I suspect it might be quite the
> big performance regression, to the point where even people who thought they
> don't care will go "that's a bit much".
> 
> I wonder if there is some half-way solution that would allow at least a
> load-add-store-load-adc-store instruction sequence, which would then mean
> (a) one less register wasted and (b) potentially allow some peephole
> optimization turning it into just an addmem-adcmem instruction pair.
> 
> Turning just one of the memops into a volatile access might be enough
> (eg just the load, but not the store?)

It might be possible to introduce something like a __volatile_inc () which
implements a somewhat relaxed "volatile".

For user code

volatile long long x;
void foo () { x++; }

emitting add + adc with memory operands is only "incorrect" in that it
re-orders the subword reads with the subword writes; the reads and writes
still happen architecturally ...

That said, the coverage code could make this re-ordering explicit for
32-bit targets with some conditional code (add-with-overflow) that
eventually combines back nicely even with volatile ...
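
As a rough sketch of what "making the re-ordering explicit" could look like
at the source level (this is not what gcov emits today; the struct name
counter64, its lo/hi split, and counter_inc are made up for illustration),
the 64-bit counter can be viewed as two volatile 32-bit halves and the carry
propagated with __builtin_add_overflow, which GCC has provided since GCC 5:

```c
#include <stdint.h>

/* Hypothetical view of a 64-bit counter as two 32-bit halves.
   Which half comes first in memory depends on endianness; the
   sketch only relies on the field names, not on the layout.  */
struct counter64
{
  volatile uint32_t lo;
  volatile uint32_t hi;
};

/* Increment with the access order load-lo, store-lo, load-hi, store-hi,
   i.e. the subword reads and writes are interleaved on purpose.  */
static inline void
counter_inc (struct counter64 *c)
{
  uint32_t lo;
  /* Returns nonzero when the 32-bit addition wraps around.  */
  uint32_t carry = __builtin_add_overflow (c->lo, 1u, &lo);
  c->lo = lo;
  /* The high half is always read and written, as add + adc would do.  */
  c->hi = c->hi + carry;
}
```

Whether this actually combines into an add/adc pair with memory operands
depends on the target and on the combine pass; the sketch only shows the
intended access order, and it adds the carry unconditionally rather than
branching on it.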
