On 08/12/2022 16:39, Tamar Christina via Gcc-patches wrote:
Hi All,
At -O0 (as opposed to e.g. volatile) we can get into the situation where the
in0 and result RTL arguments passed to the division function are memory
locations instead of registers. I think we could reject these early on by
checking that the gimple values are GIMPLE registers, but I think it's better to
handle them.
As such I force them to registers and emit a move to the memory locations and
leave it up to reload to handle. This fixes the ICE and still allows the
optimization in these cases, which improves the code quality a lot.
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
PR target/107988
* config/aarch64/aarch64.cc
(aarch64_vectorize_can_special_div_by_constant): Ensure input and output
RTL are registers.
gcc/testsuite/ChangeLog:
PR target/107988
* gcc.target/aarch64/pr107988-1.c: New test.
--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index b8dc3f070c8afc47c85fa18768c4da92c774338f..9f96424993c4fcccce90e1b241fcb3aa97025225 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -24337,12 +24337,27 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
if (!VECTOR_TYPE_P (vectype))
return false;
+ if (!REG_P (in0))
+ in0 = force_reg (GET_MODE (in0), in0);
+
gcc_assert (output);
- if (!*output)
- *output = gen_reg_rtx (TYPE_MODE (vectype));
+ rtx res = NULL_RTX;
+
+ /* Once we get to this point we cannot reject the RTL; if it's not a reg then
+ create a new reg and write the result to the output afterwards. */
+ if (!*output || !REG_P (*output))
+ res = gen_reg_rtx (TYPE_MODE (vectype));
+ else
+ res = *output;
Why not write
rtx res = *output;
if (!res || !REG_P (res))
res = gen_reg_rtx...
then you don't need either the else clause or the dead NULL_RTX assignment.
+
+ emit_insn (gen_aarch64_bitmask_udiv3 (TYPE_MODE (vectype), res, in0, in1));
+
+ if (*output && res != *output)
+ emit_move_insn (*output, res);
+ else
+ *output = res;
- emit_insn (gen_aarch64_bitmask_udiv3 (TYPE_MODE (vectype), *output, in0, in1));
return true;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/pr107988-1.c b/gcc/testsuite/gcc.target/aarch64/pr107988-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..c4fd290271b738345173b569bdc58c092fba7fe9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr107988-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O0" } */
+typedef unsigned short __attribute__((__vector_size__ (16))) V;
+
+V
+foo (V v)
+{
+ v /= 255;
+ return v;
+}
Otherwise OK.
R.
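
For reference, the control flow with the simplification suggested above can be sketched outside GCC roughly as follows. This is only an illustration under stated assumptions: Node, Rtx, new_node, moves_emitted and expand_div are hypothetical stand-ins, not GCC types or functions. The is_reg flag models REG_P, a fresh node models gen_reg_rtx, and the counter models emit_move_insn. The emission of the division instruction itself is elided.

```cpp
#include <deque>

// Hypothetical stand-in for GCC's rtx: it only records whether the
// value is a register (what REG_P would report).
struct Node { bool is_reg; };
using Rtx = Node *;

static std::deque<Node> pool;      // owns every node; deque keeps pointers stable
static int moves_emitted = 0;      // counts modelled emit_move_insn calls

// Models gen_reg_rtx (is_reg = true) or a memory operand (is_reg = false).
static Rtx new_node (bool is_reg)
{
  pool.push_back (Node{is_reg});
  return &pool.back ();
}

// Models the tail of the hook with the suggested simplification:
// initialise res from *output and only replace it when it is unusable.
static Rtx expand_div (Rtx *output)
{
  Rtx res = *output;
  if (!res || !res->is_reg)
    res = new_node (true);

  // ... the division instruction would be emitted here, writing to res ...

  if (*output && res != *output)
    ++moves_emitted;               // stands in for emit_move_insn (*output, res)
  else
    *output = res;
  return res;
}
```

With this shape there are three cases: a null *output is replaced by the fresh register, a memory *output is kept and receives one move from the temporary, and a register *output is used directly with no move.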