Sorry for the slow reply, been away for a few days.... Eric Botcazou <ebotca...@adacore.com> writes: > > This patch is a minimal change to prevent (subreg(mem)) from being > > simplified to use the outer mode for WORD_REGISTER_OPERATIONS. There > > is high probability of refining and/or re-implementing this for GCC 8 > > but such a change would be too invasive. This change at least ensures > > correctness but may prevent simplification of some acceptable cases. > > This one causes: > > +FAIL: gcc.dg/torture/builtin-complex-1.c -O3 -fomit-frame-pointer - > funroll- > loops -fpeel-loops -ftracer -finline-functions (test for excess errors) > +WARNING: gcc.dg/torture/builtin-complex-1.c -O3 -fomit-frame-pointer > - > funroll-loops -fpeel-loops -ftracer -finline-functions compilation > failed to produce executable > +FAIL: gcc.dg/torture/builtin-complex-1.c -O3 -g (test for excess > errors) > +WARNING: gcc.dg/torture/builtin-complex-1.c -O3 -g compilation > failed to > produce executable > +WARNING: program timed out. > +WARNING: program timed out. > > on SPARC 32-bit, i.e. LRA hangs. Reduced testcase attached, compile at > -O3 with a cc1 configured for sparc-sun-solaris2.10. > > > gcc/ > > PR target/78660 > > * lra-constraints.c (simplify_operand_subreg): Handle > > WORD_REGISTER_OPERATIONS targets. > > --- > > gcc/lra-constraints.c | 17 ++++++++++++----- > > 1 file changed, 12 insertions(+), 5 deletions(-) > > > > diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c index > > 66ff2bb..484a70d 100644 > > --- a/gcc/lra-constraints.c > > +++ b/gcc/lra-constraints.c > > @@ -1541,11 +1541,18 @@ simplify_operand_subreg (int nop, machine_mode > > reg_mode) subregs as we don't substitute such equiv memory (see > > processing equivalences in function lra_constraints) and because for > > spilled pseudos we allocate stack memory enough for the biggest > > - corresponding paradoxical subreg. */ > > - if (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode) > > - && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst))) > > - || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode) > > - && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg)))) > > + corresponding paradoxical subreg. > > + > > + However, never simplify a (subreg (mem ...)) for > > + WORD_REGISTER_OPERATIONS targets as this may lead to loading > junk > > + data into a register when the inner is narrower than outer or > > + missing important data from memory when the inner is wider > than > > + outer. */ > > + if (!WORD_REGISTER_OPERATIONS > > + && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode) > > + && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst))) > > + || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode) > > + && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN > (reg))))) > > return true; > > > > *curr_id->operand_loc[nop] = operand; > > I think that we might need: > > if (!(GET_MODE_PRECISION (mode) > GET_MODE_PRECISION (innermode) > && WORD_REGISTER_OPERATIONS) > && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode) > && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst))) > || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode) > && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg))))) > return true; > > i.e. we force the reloading only for paradoxical subregs.
Maybe, though I'm not entirely convinced. For WORD_REGISTER_OPERATIONS both paradoxical and normal SUBREGs can be a problem as the inner mode in both cases can be used elsewhere for a reload making the content of the spill slot wrong either in the subreg reload or the ordinary reload elsewhere. However, the condition can be tightened; I should not have made it so simplistic I guess. I.e. both modes must be no wider than a word and must be different precision to force an inner reload. Adding that check would fix this case for SPARC and should be fine for MIPS but I need to wait for a bootstrap build to be sure. I don't really understand why LRA can't reload this for SPARC though as I'm not sure there is any guarantee provided to backends that some SUBREGs will be reloaded using their outer mode. If there is such a guarantee then it would be much easier to reason about this logic but as it stands I suspect we are having to tweak LRA to cope with assumptions made in various targets that happen to have held true (and I have no doubt that MIPS has some of these as well especially in terms of the FP/GP register usage with float and int modes.) All being well we can capture such assumptions and formalise them so we ensure they hold true (or modify backends appropriately I guess). The condition would look like this, What do you think? if (!(GET_MODE_PRECISION (mode) != GET_MODE_PRECISION (innermode) && GET_MODE_SIZE (mode) <= UNITS_PER_WORD && GET_MODE_SIZE (innermode) <= UNITS_PER_WORD && WORD_REGISTER_OPERATIONS) && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode) && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst))) || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode) && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg))))) return true; Thanks, Matthew