On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches < gcc-patches@gcc.gnu.org> wrote:
> Hi Christophe, > > On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote: > > At some point during the development of this patch series, it appeared > > that in some cases the register allocator wants “VPR or general” > > rather than “VPR or general or FP” (which is the same thing as > > ALL_REGS). The series does not seem to require this anymore, but it > > seems to be a good thing to do anyway, to give the register allocator > > more freedom. > Not sure I fully understand this, but I guess it creates an extra class > the register allocator can use to group things that can go into VPR or > general reg? > > > > CLASS_MAX_NREGS and arm_hard_regno_nregs need adjustment to avoid a > > regression in gcc.dg/stack-usage-1.c when compiled with -mthumb > > -mfloat-abi=hard -march=armv8.1-m.main+mve.fp+fp.dp. > I have not looked into this failure, but ... > > > > 2022-01-13 Christophe Lyon <christophe.l...@foss.st.com> > > > > gcc/ > > * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS. > > (REG_CLASS_NAMES): Likewise. > > (REG_CLASS_CONTENTS): Likewise. > > (CLASS_MAX_NREGS): Handle VPR. > > * config/arm/arm.c (arm_hard_regno_nregs): Handle VPR. > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > > index bb75921f32d..c3559ca8703 100644 > > --- a/gcc/config/arm/arm.c > > +++ b/gcc/config/arm/arm.c > > @@ -25287,6 +25287,9 @@ thumb2_asm_output_opcode (FILE * stream) > > static unsigned int > > arm_hard_regno_nregs (unsigned int regno, machine_mode mode) > > { > > + if (IS_VPR_REGNUM (regno)) > > + return CEIL (GET_MODE_SIZE (mode), 2); > When do we ever want to use more than 1 register for VPR? > That was tricky. Richard Sandiford helped me analyze the problem, I guess I can quote him: RS> I think the problem is a combination of a few things: RS> RS> (1) arm_hard_regno_mode_ok rejects SImode in VPR, so SImode moves RS> to or from the VPR_REG class get the maximum cost. RS> RS> (2) IRA thinks from CLASS_MAX_NREGS and arm_hard_regno_nregs that RS> VPR is big enough to hold SImode. RS> RS> (3) If a class C1 is a superset of a class C2, and if C2 is big enough RS> to hold a mode M, IRA ensures that move costs for M involving C1 RS> are >= move costs for M involving C2. RS> RS> (1) is correct but (2) isn't. IMO (3) is dubious: the trigger should RS> be whether C2 is actually allowed to hold M, not whether C2 is big enough RS> to hold M. However, changing that is likely to cause problems elsewhere, RS> and could lead to classes like GENERAL_AND_FP_REGS being used when RS> FP_REGS are disabled (which might be confusing). RS> RS> “Fixing” (2) using: RS> RS> CEIL (GET_MODE_SIZE (mode), 2) RS> RS> for VPR_REG & VPR_REGNUM seems to make the costs correct. I don't know RS> if it would cause other problems though. RS> RS> I don't think CLASS_MAX_NREGS should do anything special for superclasses RS> of VPR_REG, even though that makes the definition non-obvious. If an RS> SImode is stored in GENERAL_AND_VPR_REGS, it will in reality be stored RS> in the GENERAL_REGS subset, so the maximum count should come from there RS> rather than VPR_REG. Does that answer your question? > > > > @@ -1453,7 +1456,9 @@ extern const char *fp_sysreg_names[NB_FP_SYSREGS]; > > ARM regs are UNITS_PER_WORD bits. > > FIXME: Is this true for iWMMX? */ > > #define CLASS_MAX_NREGS(CLASS, MODE) \ > > - (ARM_NUM_REGS (MODE)) > > + (CLASS == VPR_REG) \ > > + ? CEIL (GET_MODE_SIZE (MODE), 2) \ > > + : (ARM_NUM_REGS (MODE)) > > > Same. > >