Hi Tamar, On Wed, 20 Jun 2018 at 15:35, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote: > > > On 20/06/18 11:33, Tamar Christina wrote: > > Hi Kyrill, > > > > Many thanks for the review! > > > > The 06/20/2018 09:43, Kyrill Tkachov wrote: > >> Hi Tamar, > >> > >> On 19/06/18 15:14, Tamar Christina wrote: > >>> Hi All, > >>> > >>> This patch requires > >>> https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01145.html to work, > >>> it has been accepted once already but caused a regression on certain > >>> configuratoins. > >>> I am re-submitting it with the required mid-end change and requesting a > >>> back-port. > >>> > >>> --- original patch. > >>> > >>> Taking the subreg of a vector mode on big-endian may result in an infinite > >>> recursion and eventually a segfault once we run out of stack space. > >>> > >>> As an example, taking a subreg of V4HF to SImode we end up in the > >>> following > >>> loop on big-endian: > >>> > >>> #861 0x00000000008462e9 in operand_subword_force > >>> src/gcc/gcc/emit-rtl.c:1787 > >>> #862 0x0000000000882a90 in emit_move_multi_word src/gcc/gcc/expr.c:3621 > >>> #863 0x000000000087eea1 in emit_move_insn_1 src/gcc/gcc/expr.c:3698 > >>> #864 0x000000000087f350 in emit_move_insn src/gcc/gcc/expr.c:3757 > >>> #865 0x000000000085e326 in copy_to_reg src/gcc/gcc/explow.c:603 > >>> #866 0x00000000008462e9 in operand_subword_force > >>> src/gcc/gcc/emit-rtl.c:1787 > >>> > >>> The reason is that operand_subword_force will always fail. When the value > >>> is in > >>> a register that can't be accessed as a multi word the code tries to > >>> create a new > >>> psuedo register and emit the value to it. Eventually you end up in > >>> simplify_gen_subreg > >>> which calls validate_subreg. > >>> > >>> validate_subreg will however always fail because of the > >>> REG_CAN_CHANGE_MODE_P check. > >>> > >>> On little endian this check always returns true. On big-endian this check > >>> is supposed > >>> to prevent values that have a size larger than word size, due to those > >>> being stored in > >>> VFP registers. > >>> > >>> However we are only interested in a subreg of the vector mode, so we > >>> should be checking > >>> the unit size, not the size of the entire mode. Doing this fixes the > >>> problem. > >>> > >>> Regtested on armeb-none-eabi and no regressions. > >>> Bootstrapped on arm-none-linux-gnueabihf and no issues. > >>> > >>> Ok for trunk? and for backport to GCC 8? > >>> > >>> Thanks, > >>> Tamar > >>> > >>> gcc/ > >>> 2018-06-19 Tamar Christina <tamar.christ...@arm.com> > >>> > >>> PR target/84711 > >>> * config/arm/arm.c (arm_can_change_mode_class): Use > >>> GET_MODE_UNIT_SIZE > >>> instead of GET_MODE_SIZE when comparing Units. > >>> > >>> gcc/testsuite/ > >>> 2018-06-19 Tamar Christina <tamar.christ...@arm.com> > >>> > >>> PR target/84711 > >>> * gcc.target/arm/big-endian-subreg.c: New. > >>> > >>> -- > >> > >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > >> index > >> 90d62e699bce9594879be2e3016c9b36c7e064c8..703632240822e762a906570964b949c783df56f3 > >> 100644 > >> --- a/gcc/config/arm/arm.c > >> +++ b/gcc/config/arm/arm.c > >> @@ -31508,8 +31508,8 @@ arm_can_change_mode_class (machine_mode from, > >> machine_mode to, > >> { > >> if (TARGET_BIG_END > >> && !(GET_MODE_SIZE (from) == 16 && GET_MODE_SIZE (to) == 8) > >> - && (GET_MODE_SIZE (from) > UNITS_PER_WORD > >> - || GET_MODE_SIZE (to) > UNITS_PER_WORD) > >> + && (GET_MODE_UNIT_SIZE (from) > UNITS_PER_WORD > >> + || GET_MODE_UNIT_SIZE (to) > UNITS_PER_WORD) > >>
After this commit (r262436), I have noticed regressions on armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu neon-fp16 FAIL: gcc.dg/vect/vect-nop-move.c execution test FAIL: g++.dg/torture/vshuf-v2si.C -O3 -g execution test FAIL: g++.dg/torture/vshuf-v4si.C -O3 -g execution test FAIL: g++.dg/torture/vshuf-v8hi.C -O3 -g execution test Can you have a look? > >> Does GET_MODE_UNIT_SIZE do what you want? Its documentation in rtl.texi > >> says: > >> "Returns the size in bytes of the subunits of a datum of mode @var{m}. > >> This is the same as @code{GET_MODE_SIZE} except in the case of complex > >> modes. For them, the unit size is the size of the real or imaginary > >> part." > >> > >> Does it also do the right thing for vector modes (GET_MODE_SIZE > >> (GET_MODE_INNER (mode))) ? > >> If so, this patch is ok, but we'll need to update the documentation to > >> make it more explicit. > > I don't know what the documentation is trying to say here, but the key is > > the first part: > > > > "returns the size in bytes of the subunits of a datum of mode m" > > > > MODE: V4SI, GET_MODE_UNIT_SIZE: 4, GET_MODE_SIZE: 16 > > > > So GET_MODE_UNIT_SIZE (m) * GET_MODE_NUNITS(m) == GET_MODE_SIZE (m) > > > > or GET_MODE_UNIT_SIZE (m) == GET_MODE_SIZE (GET_MODE_INNER (mode)). > > > > From this the only time GET_MODE_UNIT_SIZE is equal to GET_MODE_SIZE is on > > non-vector modes of V1 vector modes. > > Right, this is what I thought, but the documentation is not as clear as it > could be. > Your patch is ok for trunk. > > Thanks, > Kyrill > > > Kind Regards, > > Tamar > > > >> Thanks for the patch, > >> Kyrill > >> > >> > >> >