https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62662

            Bug ID: 62662
           Summary: [4.9/5 Regression] Miscompilation of Qt on s390x
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
                CC: krebbel at gcc dot gnu.org, uweigand at gcc dot gnu.org
            Target: s390x-linux

struct A
{
  int a;
  void b ();
};
int c;
struct B
{
  struct C { A d; };
  static C e;
};
struct D
{
  B::C *d;
  D () : d (&B::e) { d->d.b (); }
};
struct F
{
  F (int *);
  int *f;
  D g;
};
void A::b ()
{
  volatile int x;
  __asm__("" : "=&d" (c), "=&d" (x), "=m" (*&a) : "a" (&a), "dm" (0));
}
F::F (int *p1) : f (p1) {}

compiled with -m64 -O2 -fvisibility=hidden -fPIC -march=z9-109 -mtune=z10
results in wrong-code:
        stg     %r15,120(%r15)
        stg     %r3,0(%r2)
.LCFI3:
        lay     %r15,-168(%r15)
.LCFI4:
        larl    %r4,c
        larl    %r1,_ZN1B1eE@GOTENT
        lhi     %r5,0
        lg      %r1,0(%r1)
        stg     %r1,8(%r2)
        st      %r2,0(%r4)
        st      %r3,164(%r15)
        lg      %r4,280(%r15)
        lg      %r15,288(%r15)
.LCFI5:
        br      %r4

The %r14 return register is not saved to the return register stack slot in the
prologue, but is restored from it in the epilogue and jumped to.

From a quick look, this seems to be caused by a bad interaction between the
          /* Fetch return address from stack before load multiple,
             this will do good for scheduling.  */
code in s390_emit_epilogue and s390_optimize_prologue.  If
cfun_frame_layout.save_return_addr_p is false, and the first register needing
a save during s390_emit_epilogue is below 13 and the last is above 14, then
s390_emit_epilogue attempts to optimize and loads the return address into a
register (typically %r4) before doing the load multiple insn.
s390_optimize_prologue doesn't take this into account: if at that point only
%r15 needs saving, it removes the store multiple insn in the prologue and also
replaces the load multiple, but it keeps the load from the return register
stack slot into %r4 and the return through %r4.
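
The interaction described above can be illustrated with a small toy model.
This is only a sketch of the reasoning, not GCC code: every type and function
name below (FrameLayout, EmittedCode, emit, optimize_prologue, ...) is
hypothetical, and the register range 6..15 in the usage note is an assumed
example rather than something taken from the dump.

```cpp
#include <cassert>

// Toy model only: these names are hypothetical and do not mirror GCC's
// internal data structures or the real s390 backend functions.
constexpr int kBaseReg = 13;    // %r13
constexpr int kReturnReg = 14;  // %r14, the return register
constexpr int kStackReg = 15;   // %r15, the stack pointer

struct FrameLayout {
  bool save_return_addr;  // models save_return_addr_p
  int first_gpr;          // first GPR covered by the store/load multiple
  int last_gpr;           // last GPR covered by the store/load multiple
};

struct EmittedCode {
  bool stores_return_slot;  // prologue writes the %r14 stack slot
  bool loads_return_slot;   // epilogue reads that slot into e.g. %r4
};

// The case from the report: save_return_addr is false, but the register
// range starts below 13 and ends above 14, so the epilogue fetches the
// return address from the stack ahead of the load multiple.
bool preloads_via_range_heuristic(const FrameLayout &f) {
  return !f.save_return_addr && f.first_gpr < kBaseReg &&
         f.last_gpr > kReturnReg;
}

// Code as first emitted: the store multiple covers every register in the
// range, so the %r14 slot is written whenever %r14 lies inside it.
EmittedCode emit(const FrameLayout &f) {
  bool covers_r14 = f.first_gpr <= kReturnReg && kReturnReg <= f.last_gpr;
  return {covers_r14, preloads_via_range_heuristic(f)};
}

// Later cleanup as described: if only %r15 turns out to need saving, the
// store multiple is dropped, but the early return-address load is kept.
EmittedCode optimize_prologue(EmittedCode code, int first_needed,
                              int last_needed) {
  if (first_needed == kStackReg && last_needed == kStackReg)
    code.stores_return_slot = false;
  return code;
}

// The wrong-code condition: the epilogue reads a slot nobody wrote.
bool reads_unwritten_return_slot(const EmittedCode &c) {
  return c.loads_return_slot && !c.stores_return_slot;
}
```

With an assumed initial range of %r6..%r15, emit() yields code that is still
consistent; after optimize_prologue() narrows the save to %r15 only,
reads_unwritten_return_slot() becomes true, which corresponds to the
"lg %r4,280(%r15)" / "br %r4" pair in the dump above.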

So, either s390_optimize_prologue also needs to adjust the return insn (and
remove the load of the return address from the stack slot if DSE can't handle
it?), or we need to arrange in this case that s390_optimize_prologue doesn't
optimize the return address store away.
