Hi!

On Tue, Sep 22, 2020 at 06:06:30PM +0100, Richard Sandiford wrote:
> Qing Zhao <qing.z...@oracle.com> writes:
> > Okay, thanks for the info.
> > then, what’s the current definition of UNSPEC_VOLATILE?
>
> I'm not sure it's written down anywhere TBH.  rtl.texi just says:
>
>   @code{unspec_volatile} is used for volatile operations and operations
>   that may trap; @code{unspec} is used for other operations.
>
> which seems like a cyclic definition: volatile expressions are defined
> to be expressions that are volatile.
volatile_insn_p returns true for unspec_volatile (and all other volatile
things).  Unfortunately the comment on this function is just as confused
as pretty much everything else :-/

> But IMO the semantics are that unspec_volatile patterns with a given
> set of inputs and outputs act for dataflow purposes like volatile asms
> with the same inputs and outputs.  The semantics of asm volatile are
> at least slightly more well-defined (if only by example); see extend.texi
> for details.  In particular:
>
>   Note that the compiler can move even @code{volatile asm} instructions
>   relative to other code, including across jump instructions.  For
>   example, on many targets there is a system register that controls the
>   rounding mode of floating-point operations.  Setting it with a
>   @code{volatile asm} statement, as in the following PowerPC example,
>   does not work reliably.
>
>   @example
>   asm volatile("mtfsf 255, %0" : : "f" (fpenv));
>   sum = x + y;
>   @end example
>
>   The compiler may move the addition back before the @code{volatile asm}
>   statement.  To make it work as expected, add an artificial dependency
>   to the @code{asm} by referencing a variable in the subsequent code,
>   for example:
>
>   @example
>   asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
>   sum = x + y;
>   @end example
>
> which is very similar to the unspec_volatile case we're talking about.

So just like volatile memory accesses, they have an (unknown) side
effect, which means they have to execute on the real machine as on the
abstract machine (wrt sequence points).  All side effects have to happen
exactly as often as prescribed, and in the same order.  Just like
volatile asm, too.  And there is no magic to it, there are no other
effects.
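To illustrate the "exactly as often, and in the same order" point in plain C (my own sketch, not from the thread): two stores through a volatile lvalue may not be merged into one, unlike ordinary stores, because each access is itself a side effect on the abstract machine.

```c
#include <assert.h>

/* A hypothetical device register; 'volatile' tells the compiler every
   access is an observable side effect.  */
volatile int hw_reg;

void poke_twice (void)
{
  hw_reg = 0;   /* must be emitted as a real store... */
  hw_reg = 1;   /* ...and may not be merged with this one */
}
```

With a plain (non-volatile) `int`, GCC is free to keep only the final store; with `volatile` it has to perform both, in order, just as a pair of unspec_volatile insns has to stay sequenced.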
> To take an x86 example:
>
>   void
>   f (char *x)
>   {
>     asm volatile ("");
>     x[0] = 0;
>     asm volatile ("");
>     x[1] = 0;
>     asm volatile ("");
>   }
>
> gets optimised to:
>
>         xorl    %eax, %eax
>         movw    %ax, (%rdi)

(If you use "#" or "#smth" you can see those in the generated asm --
completely empty asm is helpfully (uh...) not printed.)

> with the two stores being merged.  The same thing is IMO valid for
> unspec_volatile.  In both cases, you would need some kind of memory
> clobber to prevent the move and merge from happening.

Even then, x[] could be optimised away completely (with whole program
optimisation, or something).  The only way to really prevent the
compiler from optimising memory accesses is to make it not see the
details (with an asm or an unspec, for example).

> The above is conservatively correct.  But not all passes do it.
> E.g. combine does have a similar approach:
>
>   /* If INSN contains volatile references (specifically volatile MEMs),
>      we cannot combine across any other volatile references.

And this is correct, and the *minimum* to do even (this could change the
order of the side effects, depending on how combine places the resulting
insns in I2 and I3).

>      Even if INSN doesn't contain volatile references, any intervening
>      volatile insn might affect machine state.  */

Confusingly stated, but essentially correct (it is possible we place the
volatile at I2, and everything would still be sequenced correctly, but
combine does not guarantee that).

>   is_volatile_p = volatile_refs_p (PATTERN (insn))
>                   ? volatile_refs_p
>                   : volatile_insn_p;

Too much subtlety in there, heh.


Segher
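For concreteness, here is what "some kind of memory clobber" looks like in the example above (my sketch; the function name is mine): a "memory" clobber makes GCC assume each asm may read or write any memory, so the two byte stores can no longer be moved or merged across the asm statements.

```c
#include <assert.h>

/* Same shape as the f() in the quoted example, but each asm carries a
   "memory" clobber, so the compiler must keep the stores to x[0] and
   x[1] separate and in order relative to the asms.  */
void f_clobber (char *x)
{
  __asm__ volatile ("" : : : "memory");
  x[0] = 0;
  __asm__ volatile ("" : : : "memory");
  x[1] = 0;
  __asm__ volatile ("" : : : "memory");
}
```

Note this only constrains ordering and merging; as said above, nothing short of hiding the accesses inside the asm itself stops the compiler from optimising them away entirely when it can prove they are dead.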