https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119130
--- Comment #2 from Avinash Jayakar <avinashd at gcc dot gnu.org> ---
(In reply to Avinash Jayakar from comment #1)
> in altivec.md:3616
>
> (define_expand "convert_4f32_8f16"
>   [(set (match_operand:V8HI 0 "register_operand" "=v")
>         (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "v")
>                       (match_operand:V4SF 2 "register_operand" "v")]
>                      UNSPEC_CONVERT_4F32_8F16))]
>   "TARGET_P9_VECTOR"
> {
>   rtx rtx_tmp_hi = gen_reg_rtx (V4SImode);
>   rtx rtx_tmp_lo = gen_reg_rtx (V4SImode);
>
>   emit_insn (gen_vsx_xvcvsphp (rtx_tmp_hi, operands[1]));
>   emit_insn (gen_vsx_xvcvsphp (rtx_tmp_lo, operands[2]));
>   if (!BYTES_BIG_ENDIAN)
>     emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_hi, rtx_tmp_lo));
>   else
>     emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_lo, rtx_tmp_hi));
>   DONE;
> })
>
> I am not sure why the order of the operands is reversed in big-endian mode.
> The vpkuwum insn operates only on registers and therefore should be agnostic
> of endianness.
Sorry, I was wrong: endianness does matter for vector registers. Still, the
operands should not have been swapped here, because the pattern for vpkuwum
already handles the endianness adjustment. Since the swap then happens twice,
the two swaps cancel each other and wrong results are produced.
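
To illustrate the double swap, here is a minimal C sketch. It is only a
model, not GCC code: pack_low_halves, pattern_pack, expander_pack and the
target_needs_swap flag are hypothetical names made up for this example, and
the model deliberately does not say which endianness needs the swap. It only
shows that when both the expander and the underlying pattern swap the two
inputs, the result is the same as if nobody had swapped.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw pack step: low 16 bits of each word of a, then of each word of b.  */
static void pack_low_halves(uint16_t out[8], const uint32_t a[4],
                            const uint32_t b[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint16_t) a[i];
        out[i + 4] = (uint16_t) b[i];
    }
}

/* Hypothetical stand-in for an insn pattern that already compensates
   for endianness by swapping its inputs when the target needs it.  */
static void pattern_pack(uint16_t out[8], const uint32_t a[4],
                         const uint32_t b[4], int target_needs_swap)
{
    if (target_needs_swap)
        pack_low_halves(out, b, a);
    else
        pack_low_halves(out, a, b);
}

/* Hypothetical stand-in for the expander.  With expander_also_swaps set,
   it swaps the inputs once more before handing them to the pattern.  */
static void expander_pack(uint16_t out[8], const uint32_t a[4],
                          const uint32_t b[4], int target_needs_swap,
                          int expander_also_swaps)
{
    if (target_needs_swap && expander_also_swaps)
        pattern_pack(out, b, a, target_needs_swap);
    else
        pattern_pack(out, a, b, target_needs_swap);
}

/* Returns 1 when the doubly swapped result equals the unswapped raw pack
   and differs from the singly swapped (intended) result.  */
static int double_swap_cancels(void)
{
    const uint32_t a[4] = {1, 2, 3, 4};
    const uint32_t b[4] = {5, 6, 7, 8};
    uint16_t raw[8], once[8], twice[8];

    pack_low_halves(raw, a, b);          /* no swap at all */
    expander_pack(once, a, b, 1, 0);     /* only the pattern swaps */
    expander_pack(twice, a, b, 1, 1);    /* expander and pattern both swap */

    assert(once[0] == 5 && once[4] == 1);
    return memcmp(twice, raw, sizeof raw) == 0
           && memcmp(once, raw, sizeof raw) != 0;
}
```

In this model, dropping the swap in the expander (expander_also_swaps == 0)
leaves the pattern as the only place that adjusts for endianness, which is
the change suggested above.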