On Mon, 22 Feb 2021 at 16:30, Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
>
> Hi!
>
> First off, thank you for the patch!

You're welcome!

> On Mon, Feb 15, 2021 at 11:22:52PM +0000, Neven Sajko via Gcc-patches wrote:
> > There is a long-standing, but undocumented GCC inline assembly feature
> > that's part of the extended asm GCC extension to C and C++: extended
> > asm empty input constraints.
>
> There is no such thing.  *All* empty constraints have the same
> semantics: anything whatsoever will do.  Any register, any constant, any
> memory.

What I was trying to express is that input operand constraints are
unlike output operand constraints in that they can be empty. I now
realize I ended up being slightly confusing, though.

> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -1131,7 +1131,102 @@ the addressing register.
> >  @subsection Simple Constraints
> >  @cindex simple constraints
> >
> > -The simplest kind of constraint is a string full of letters, each of
> > +An input constraint is allowed to be an empty string, in which case it is
> > +called an empty input constraint.
>
> That is just shorthand for "empty constraint that is used for an input
> operand".  It is not special, and it *is* documented:
> https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints
>   The simplest kind of constraint is a string full of letters, each of
>   which describes one kind of operand that is permitted.
>
> A length zero string is allowed as well.  This could be made more
> explicit sure; OTOH, it isn't very often useful.  So your example
> (using it for making a dependency) is certainly useful to have.  But
> it is not a special case at all.

Syntactically, it's not a special case; but I definitely think the
semantics could be better documented. Proof:

* There's a relevant Stack Overflow question. If I didn't know better
I'd conclude from the discussion there that empty input constraints
are undocumented and unsupported, and there would surely be an answer
if the documentation on the GCC side was a bit better:
https://stackoverflow.com/questions/63305223/gcc-asm-with-empty-input-operand-constraint

* Clang has erroneously not supported empty constraints for many years
now (even though its internal documentation still says empty input
constraints are supported, and its external documentation says it
supports all the same constraints as GCC does). I suppose the Clang
developers may have been misled by the lack of explicit mention of the
feature in GCC's documentation.

> > (When an empty input constraint is used,
> > +the assembler template will most probably also be empty. I.e., the @code{asm}
> > +declaration need not contain actual assembly code.)
>
> Don't use parentheses like this in documentation please.

OK.

> > An empty input
> > +constraint can be used to create an artificial dependency on a C or C++
> > +variable (the variable that appears in the expression associated with the
> > +constraint) without incurring unnecessary costs to performance.
>
> It still needs a register (or memory) reserved there (or sometimes a
> constant can be used, but you have no dependency in that case!)

Yeah, this is a bit more complicated than I perhaps implied. An asm
volatile can tell the compiler "I need this value calculated at this
point", but the compiler may still choose to eliminate the calculation
from the generated code if it can perform it itself at compilation
time. Thus the programmer must currently be able to predict whether GCC
will be able to compute the value of some variable or expression; the
good thing is that this is usually easy to predict.

> > +An example of where such behavior may be useful is for preventing compiler
> > +optimizations like dead store elimination or hoisting code outside a loop for
> > +certain pieces of C or C++ code.
>
> You should not think about preventing the compiler from doing something.
> Instead, you can give the compiler extra information that makes it *do*
> something: it has to, because it has to implement the semantics your
> source program has.
>
> > Specific applications may include direct
> > +interaction with hardware features; or things like testing, fuzzing and
> > +benchmarking.
>
> What does this mean?

The manual already has examples for "direct interaction with hardware features".

Benchmarking is another relatively well known example of an activity
during which we may be inconvenienced by the compiler doing dead store
elimination and loop hoisting at certain specific places in the code.
E.g., Google's Benchmark has DoNotOptimize and Facebook's Folly has
doNotOptimizeAway:

https://github.com/google/benchmark/blob/master/include/benchmark/benchmark.h#L308
https://github.com/facebook/folly/blob/master/folly/BenchmarkUtil.h#L73

Unit testing and fuzzing are other such examples, specifically when
trying to test for undefined behavior with sanitizers.
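
A minimal sketch of the benchmarking use case follows. It is not the
code of either library linked above. KEEP_VALUE and sum_below are
hypothetical names chosen for illustration. The macro uses an empty
input constraint to tell GCC that the value is needed at that point.
The fallback to "g" under Clang is an assumption, added only so that
the sketch also builds there.

```c
/* Hypothetical helper, not part of GCC or of any library named above. */
#if defined(__clang__)
/* Assumption: Clang rejects "", so use the general constraint "g". */
#define KEEP_VALUE(x) __asm__ volatile("" : : "g"(x))
#else
/* Empty input constraint: any register, memory or constant will do. */
#define KEEP_VALUE(x) __asm__ volatile("" : : ""(x))
#endif

/* Sums the integers 0 .. n-1 and marks every partial sum as "used". */
unsigned sum_below(unsigned n)
{
    unsigned total = 0;
    for (unsigned i = 0; i < n; i++) {
        total += i;
        /* The asm statement emits no instructions; it only takes
           total as an input at this point in the loop. */
        KEEP_VALUE(total);
    }
    return total;
}
```

The result of the function is unchanged by the macro; for instance,
sum_below(5) still returns 10. Whether the loop survives in the
generated code still depends on what GCC can compute at compile time.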

> Here is a simple example showing why this isn't as simple to use as
> you imply here:
>
> ===
> void f(int x)
> {
>         asm volatile("" :: ""(x));
> }
>
> void g(void)
> {
>         return f(42);
> }
> ===
>
> Both functions compile to (taking aarch64 as example) just "ret".  But,
> if you look at what the compiler does, you see in the "dfinish" pass it
> has for f:
>
> (insn:TI 6 3 20 (asm_operands/v ("") ("") 0 [
>             (reg:SI 0 x0 [93])
>         ]
>          [
>             (asm_input:SI ("") zlc.c:3)
>         ]
>          [] zlc.c:3) "zlc.c":3:2 -1
>      (expr_list:REG_DEAD (reg:SI 0 x0 [93])
>         (nil)))
>
>
> (so it has register x0 as input), while function g has
>
> (insn:TI 5 2 16 (asm_operands/v ("") ("") 0 [
>             (const_int 42 [0x2a])
>         ]
>          [
>             (asm_input:SI ("") zlc.c:3)
>         ]
>          [] zlc.c:3) "zlc.c":3:2 -1
>      (nil))
>
> which has no dependency, gets fed the constant 42 instead, because
> *anything at all* is allowed by an empty constraint.
>
> You can also make this clear by using
>
>         asm volatile("# %0" :: ""(x));
>
> which gives
>         # x0
> resp.
>         # 42
>
> or, with -fverbose-asm:
>         # x0            // tmp93
> and
>         # 42            //
>
> which is clear as mud, but it means in f there was a variable as input
> to the asm, and in g there wasn't.

Thank you for the example.

I would be very satisfied if the wording from the end of Jonathan's
message made it to the documentation, though perhaps there should be
an additional warning about the issue that Segher pointed to: that GCC
may still eliminate a calculation if it can perform it at compilation
time.
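
To illustrate that warning, here is a hedged sketch of one workaround
that is commonly used in benchmarking code. It is not something
proposed in this thread, and HIDE_VALUE, KEEP_VALUE and square_opaque
are hypothetical names. The idea is to pass the input through an empty
asm with a "+r" operand first, so that GCC can no longer know its value
and therefore cannot fold the calculation into a constant.

```c
/* Hypothetical helpers, named here for illustration only. */

/* "+r": the asm both reads and writes x in a register, so the
   compiler must assume x holds an unknown value afterwards. */
#define HIDE_VALUE(x) __asm__ volatile("" : "+r"(x))

#if defined(__clang__)
/* Assumption: Clang rejects "", so use the general constraint "g". */
#define KEEP_VALUE(x) __asm__ volatile("" : : "g"(x))
#else
#define KEEP_VALUE(x) __asm__ volatile("" : : ""(x))
#endif

unsigned square_opaque(unsigned v)
{
    HIDE_VALUE(v);          /* v is now opaque to the optimizer */
    unsigned sq = v * v;    /* cannot be folded, even for a constant v */
    KEEP_VALUE(sq);         /* the result is needed at this point */
    return sq;
}
```

The computed value is the same as without the macros, so a call such as
square_opaque(7) still yields 49; only what the optimizer is allowed to
assume changes.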

Thanks,
Neven
