https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113280

            Bug ID: 113280
           Summary: Strange error for empty inline assembly with +X
                    constraint
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david at westcontrol dot com
  Target Milestone: ---

I am getting strange results with an empty inline assembly line, used as an
optimisation barrier.  This is the test code I've tried in different setups:

#define opt 1

typedef int T;
T test(T a, T b) {
    T x = a + b;
#if opt == 1
    asm ("" : "+X" (x));   
#endif
#if opt == 2
    asm ("" : "+X" (x) : "0" (x));
#endif
#if opt == 3
    asm ("" :: "X" (x));
    asm ("" : "+X" (x));   
#endif
    return x - b;
} 

The idea of the assembly line is to be an optimisation barrier - it forces the
compiler to calculate the value of "x" before the assembly line, and has to use
that value (rather than any pre-calculated information) after the assembly
line.  This makes it something like an "asm("" ::: "memory")" barrier, but
focused on a single variable, which may not need to be moved out of registers. 
I've found this kind of construct "asm("" : "+g" (x))" useful to control
ordering of code around disabled interrupt sections of embedded code, and it
can also force the order of floating point calculations even while -ffast-math
is used.  Usually I've used "g" as the operand constraint, and that has worked
perfectly for everything except floating point types on targets that have
dedicated floating point registers.

I had expected "+X" (option 1 above) to work smoothly for floating point types.
 And it does, sometimes - other times I get an error:

    error: inconsistent operand constraints in an 'asm'

I've tested mainly on x86-64 (current trunk, which will be 14.0) and ARM 32-bit
(13.2), with -O2 optimisation.  I get no errors with -O0, but there is no need
of an optimisation barrier if optimisation is not enabled!  The exact compiler
version does not seem to make a difference, but the target does, as does the
type T.

I can't see why there should be a difference in the way these three variations
work, or why some combinations give errors.  When there is no compiler error,
the code (in all my testing) has been optimal as intended.

Some test results:

T = float :
  opt 1   arm-32 fail  x86-64 fail
  opt 2   arm-32 fail  x86-64 ok
  opt 3   arm-32 ok  x86-64 ok

T = int :
  opt 1   arm-32 fail  x86-64 fail
  opt 2   arm-32 fail  x86-64 ok
  opt 3   arm-32 ok  x86-64 ok

T = double :
  opt 1   arm-32 fail  x86-64 fail
  opt 2   arm-32 fail  x86-64 ok
  opt 3   arm-32 ok  x86-64 ok

T = long long int :
  opt 1   arm-32 ok  x86-64 fail
  opt 2   arm-32 ok  x86-64 ok
  opt 3   arm-32 ok  x86-64 ok

T = _Complex double :
  opt 1   arm-32 ok  x86-64 ok
  opt 2   arm-32 fail  x86-64 fail
  opt 3   arm-32 ok  x86-64 ok

T = long double :
  (arm-32 has 64-bit long double, so the results are not relevant)
  opt 1 gives "inconsistent operand constraints in an asm"
  opt 2 and 3 give "output constraint 0 must specify a single register", which
is an understandable error.


It looks like the "+X" constraint only works reliably if preceded by the
assembly line with the "X" input constraint.  For my own use, I can live with
that - it all goes in a macro anyway.  But maybe it points to a deeper issue,
or at least a potential for neatening things up.

Reply via email to