Hi Jakub,

On Fri, Feb 09, 2018 at 07:43:16AM +0100, Jakub Jelinek wrote:
> BSWAP is documented as:
> @item (bswap:@var{m} @var{x})
> Represents the value @var{x} with the order of bytes reversed, carried out
> in mode @var{m}, which must be a fixed-point machine mode.
> The mode of @var{x} must be @var{m} or @code{VOIDmode}.
> The fixed-point is something used widely in rtl.texi and is very confusing
> now that we have FIXED_POINT_TYPE floats, I assume it talks about integral
> modes or, because it is also used with vector modes, about INTEGRAL_MODE_P
> modes.  My understanding is that bswap on a vector integral mode is meant to
> be bswap of each element individually.

Right.  The documentation could use some improvement ;-)

> The rs6000 backend uses bswap not just on scalar integral modes and
> vector integral modes, but also on V4SF and V2DF, but ICEs in simplify-rtx.c
> where we don't expect bswap to be used on SF/DFmode (vector bswap is handled
> as bswap of each element).
> The following patch adjusts the rs6000 backend to use well defined bswaps
> on corresponding integral modes instead (and also does what we've done in
> i386 backend years ago, avoid subreg on the lhs because it breaks combine
> attempts to optimize it).

> Bootstrapped/regtested on powerpc64{,le}-linux, on powerpc64-linux including
> -m64/-m32, ok for trunk?
> 2018-02-09  Jakub Jelinek  <ja...@redhat.com>
>       PR target/84226
>       * config/rs6000/vsx.md (p9_xxbrq_v16qi): Change input operand
>       constraint from =wa to wa.  Avoid a subreg on the output operand,
>       instead use a pseudo and subreg it in a move.
>       (p9_xxbrd_<mode>): Changed to ...
>       (p9_xxbrd_v2di): ... this insn, without VSX_D iterator.
>       (p9_xxbrd_v2df): New expander.
>       (p9_xxbrw_<mode>): Changed to ...
>       (p9_xxbrw_v4si): ... this insn, without VSX_W iterator.
>       (p9_xxbrw_v4sf): New expander.
>       * gcc.target/powerpc/pr84226.c: New test.

> +(define_expand "p9_xxbrd_v2df"
> +  [(use (match_operand:V2DF 0 "vsx_register_operand" "=wa"))
> +   (use (match_operand:V2DF 1 "vsx_register_operand" "wa"))]
> +{
> +  rtx op0 = gen_reg_rtx (V2DImode);
> +  rtx op1 = gen_lowpart (V2DImode, operands[1]);
> +  emit_insn (gen_p9_xxbrd_v2di (op0, op1));
> +  emit_move_insn (operands[0], gen_lowpart (V2DFmode, op0));
> +  DONE;
> +})

Okay for trunk with or without such an improvement.  Thanks!


Reply via email to