Richard Earnshaw <richard.earns...@foss.arm.com> writes:
> On 08/12/2022 16:39, Tamar Christina via Gcc-patches wrote:
>> Hi All,
>>
>> At -O0 (as opposed to e.g. volatile) we can get into the situation where the
>> in0 and result RTL arguments passed to the division function are memory
>> locations instead of registers.  I think we could reject these early on by
>> checking that the gimple values are GIMPLE registers, but I think it's
>> better to handle it.
>>
>> As such I force them to registers and emit a move to the memory locations and
>> leave it up to reload to handle.  This fixes the ICE and still allows the
>> optimization in these cases, which improves the code quality a lot.
>>
>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>>
>> Ok for master?
>>
>> Thanks,
>> Tamar
>>
>> gcc/ChangeLog:
>>
>>      PR target/107988
>>      * config/aarch64/aarch64.cc
>>      (aarch64_vectorize_can_special_div_by_constant): Ensure input and output
>>      RTL are registers.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      PR target/107988
>>      * gcc.target/aarch64/pr107988-1.c: New test.
>>
>> --- inline copy of patch --
>> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
>> index b8dc3f070c8afc47c85fa18768c4da92c774338f..9f96424993c4fcccce90e1b241fcb3aa97025225 100644
>> --- a/gcc/config/aarch64/aarch64.cc
>> +++ b/gcc/config/aarch64/aarch64.cc
>> @@ -24337,12 +24337,27 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
>>      if (!VECTOR_TYPE_P (vectype))
>>        return false;
>>
>> +  if (!REG_P (in0))
>> +    in0 = force_reg (GET_MODE (in0), in0);
>> +
>>      gcc_assert (output);
>>
>> -  if (!*output)
>> -    *output = gen_reg_rtx (TYPE_MODE (vectype));
>> +  rtx res = NULL_RTX;
>> +
>> +  /* Once e get to this point we cannot reject the RTL,  if it's not a reg then
>> +     Create a new reg and write the result to the output afterwards.  */
>> +  if (!*output || !REG_P (*output))
>> +    res = gen_reg_rtx (TYPE_MODE (vectype));
>> +  else
>> +    res = *output;
>
> Why not write
>    rtx res = *output
>    if (!res || !REG_P (res))
>      res = gen_reg_rtx...
>
> then you don't need either the else clause or the dead NULL_RTX assignment.

I'd prefer that we use the expand_insn interface, which already has
logic for coercing inputs and outputs to predicates.  Something like:

  machine_mode mode = TYPE_MODE (vectype);
  unsigned int flags = aarch64_classify_vector_mode (mode);
  if ((flags & VEC_ANY_SVE) && !TARGET_SVE2)
    return false;

  ...

  expand_operand ops[3];
  create_output_operand (&ops[0], *output, mode);
  create_input_operand (&ops[1], in0, mode);
  create_fixed_operand (&ops[2], in1);
  expand_insn (insn_code, 3, ops);
  *output = ops[0].value;
  return true;

On this function: why do we have the VECTOR_TYPE_P condition in:

  /* We can use the optimized pattern.  */
  if (in0 == NULL_RTX && in1 == NULL_RTX)
    return true;

  if (!VECTOR_TYPE_P (vectype))
    return false;

?  It seems odd to be returning false after we have decided
(in the non-generating case) that everything is OK.  When would we see
a vector mode that has an associated division instruction (checked above
this), and yet not have a vector type?

Thanks,
Richard

>> +
>> +  emit_insn (gen_aarch64_bitmask_udiv3 (TYPE_MODE (vectype), res, in0, in1));
>> +
>> +  if (*output && res != *output)
>> +    emit_move_insn (*output, res);
>> +  else
>> +    *output = res;
>>
>> -  emit_insn (gen_aarch64_bitmask_udiv3 (TYPE_MODE (vectype), *output, in0, in1));
>>      return true;
>>    }
>>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/pr107988-1.c b/gcc/testsuite/gcc.target/aarch64/pr107988-1.c
>> new file mode 100644
>> index 0000000000000000000000000000000000000000..c4fd290271b738345173b569bdc58c092fba7fe9
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/pr107988-1.c
>> @@ -0,0 +1,10 @@
>> +/* { dg-do compile } */
>> +/* { dg-additional-options "-O0" } */
>> +typedef unsigned short __attribute__((__vector_size__ (16))) V;
>> +
>> +V
>> +foo (V v)
>> +{
>> +  v /= 255;
>> +  return v;
>> +}
>>
>
> Otherwise OK.
>
> R.
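
For reference, the simplification suggested in review (start from *output
and replace it only when it is null or not a register) can be checked in
isolation.  What follows is only a sketch under stated assumptions, not GCC
code: MockRtx, fresh_reg, pick_patch and pick_review are hypothetical
stand-ins for rtx, the result of gen_reg_rtx, and the two ways of choosing
the result register.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's rtx: it only records whether the
// value is a register (the property REG_P tests in the real code).
struct MockRtx { bool is_reg; };

// Hypothetical stand-in for the register gen_reg_rtx would create.
static MockRtx fresh_reg{true};

// Shape used in the posted patch: null initialiser plus if/else.
const MockRtx *pick_patch(const MockRtx *output) {
  const MockRtx *res = nullptr;
  if (!output || !output->is_reg)
    res = &fresh_reg;
  else
    res = output;
  return res;
}

// Shape suggested in review: start from the output, override if needed.
const MockRtx *pick_review(const MockRtx *output) {
  const MockRtx *res = output;
  if (!res || !res->is_reg)
    res = &fresh_reg;
  return res;
}
```

Both forms should pick the same value for a null output, a register output
and a non-register output; the second simply has no else branch and no
initial value that is always overwritten.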