On Fri, Jun 04, 2021 at 06:02:27PM +0000, Andreas Krebbel via Gcc wrote:
> Hi,
> 
> I wonder if we could replace the register asm construct for
> inline assemblies with something a bit nicer and more obvious.
> E.g. turning this (real world example from IBM Z kernel code):
> 
> int diag8_response(int cmdlen, char *response, int *rlen)
> {
>         register unsigned long reg2 asm ("2") = (addr_t) cpcmd_buf;
>         register unsigned long reg3 asm ("3") = (addr_t) response;
>         register unsigned long reg4 asm ("4") = cmdlen | 0x40000000L;
>         register unsigned long reg5 asm ("5") = *rlen; /* <-- */
>         asm volatile(
>                 "       diag    %2,%0,0x8\n"
>                 "       brc     8,1f\n"
>                 "       agr     %1,%4\n"
>                 "1:\n"
>                 : "+d" (reg4), "+d" (reg5)
>                 : "d" (reg2), "d" (reg3), "d" (*rlen): "cc");
>         *rlen = reg5;
>         return reg4;
> }
> 
> into this:
> 
> int diag8_response(int cmdlen, char *response, int *rlen)
> {
>         unsigned long len = cmdlen | 0x40000000L;
> 
>         asm volatile(
>                 "       diag    %2,%0,0x8\n"
>                 "       brc     8,1f\n"
>                 "       agr     %1,%4\n"
>                 "1:\n"
>                 : "+{r4}" (len), "+{r5}" (*rlen)
>                 : "{r2}" ((addr_t)cpcmd_buf), "{r3}" ((addr_t)response), "d" 
> (*rlen): "cc");
>         return len;
> }
> 
> Apart from being much easier to read because the hard regs become part
> of the inline assembly it solves also a couple of other issues:
> 
> - function calls might clobber register asm variables see BZ100908
> - the constraints for the register asm operands are superfluous
> - one register asm variable cannot be used for 2 different inline
>   assemblies if the value is expected in different hard regs
> 
> I've started with a hackish implementation for IBM Z using the
> TARGET_MD_ASM_ADJUST hook and let all the places parsing constraints
> skip over the {} parts.  But perhaps it would be useful to make this a
> generic mechanism for all targets?!
> 
> Andrea

Hi all,

I would like to resurrect this topic
https://gcc.gnu.org/pipermail/gcc/2021-June/236269.html and have been
coming up with a first implementation in order to discuss this further.

Basically, I see two ways to implement this.  First is by letting LRA
assign the registers and the second one by introducing extra moves just
before/after asm statements.  Currently I went for the latter and emit
extra moves during expand into hard regs as specified by the
input/output constraints.

Before going forward I would like to get some feedback whether this approach
makes sense to you at all or whether you see some show stoppers.  I was
wondering whether my current approach is robust enough in the sense that no
other pass could potentially remove the extra moves I introduced before.
In particular I was first worried about code motion.  Initially I thought I
have to make use not only of hard regs but hard regs which are flagged as
register-asms in order to prevent optimizations to fiddly around with those
moves.  However, after some more investigation I tend to conclude that this is
not necessary.  Any thoughts about this approach?

With the current approach I can at least handle cases like:

int __attribute__ ((noipa))
foo (int x) { return x; }

int test (int x)
{
  asm ("foo %0,%1\n" :: "{r3}" (foo (x + 1)), "{r2}" (x));
  return x;
}

Note, this is written with the s390 ABI in mind where the first int argument
and return value are passed in register r2.  The point here is that r2 needs to
be altered and restored multiple times until we reach } of function test().
Luckily, during expand we get all this basically for free.

This brings me to the general question what should be allowed and what not?
Evaluation order of input expressions is probably unspecified similar to
function arguments.  However, what about this one:

int test (int x)
{
  register int y asm ("r5") = x + 1;
  asm ("foo %0,%1\n" : "={r4}" (y) : "{r1}" (y));
  return y;
}

IMHO the input is just fine but the output constraint is misleading and it is
not obvious in which register variable y resides after the asm statement.
With my current implementation, were I don't bail out, it is register r4
contrary to the decl.  Interestingly, the other way around where one register
is "aliased" by multiple variables is accepted by vanilla GCC:

int foo (int x, int y)
{
  register int a asm ("r1") = x;
  register int b asm ("r1") = y;
  return a + b;
}

Though, probably not intentionally.

Cheers,
Stefan

Reply via email to