[PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders
Hi!

Similarly to the V{4,8}SFmode -> unsigned V{4,8}SImode conversion support
for AVX this one adds V{2,4}DFmode -> unsigned V{4,8}SImode conversion.

Ok for trunk?

2011-11-01  Jakub Jelinek  ja...@redhat.com

        * config/i386/sse.md (ssepackfltmode): New mode attr.
        (vec_pack_ufix_trunc_<mode>): New expander using VF2 iterator.

--- gcc/config/i386/sse.md.jj 2011-11-01 09:04:37.0 +0100
+++ gcc/config/i386/sse.md 2011-11-01 09:37:36.0 +0100
@@ -3127,6 +3127,56 @@ (define_expand "vec_pack_sfix_trunc_v2df
   DONE;
 })
 
+(define_mode_attr ssepackfltmode
+  [(V4DF "V8SI") (V2DF "V4SI")])
+
+(define_expand "vec_pack_ufix_trunc_<mode>"
+  [(match_operand:<ssepackfltmode> 0 "register_operand" "")
+   (match_operand:VF2 1 "register_operand" "")
+   (match_operand:VF2 2 "register_operand" "")]
+  "TARGET_AVX"
+{
+  REAL_VALUE_TYPE MTWO32r, TWO31r;
+  rtx two31r, mtwo32r, tmp[8];
+  int i;
+
+  for (i = 0; i < 6; i++)
+    tmp[i] = gen_reg_rtx (<MODE>mode);
+  tmp[6] = gen_reg_rtx (<ssepackfltmode>mode);
+  tmp[7] = gen_reg_rtx (<ssepackfltmode>mode);
+  real_ldexp (&TWO31r, &dconst1, 31);
+  two31r = const_double_from_real_value (TWO31r, DFmode);
+  two31r = ix86_build_const_vector (<MODE>mode, 1, two31r);
+  two31r = force_reg (<MODE>mode, two31r);
+  real_ldexp (&MTWO32r, &dconstm1, 32);
+  mtwo32r = const_double_from_real_value (MTWO32r, DFmode);
+  mtwo32r = ix86_build_const_vector (<MODE>mode, 1, mtwo32r);
+  mtwo32r = force_reg (<MODE>mode, mtwo32r);
+  emit_insn (gen_avx_cmp<mode>3 (tmp[0], operands[1], two31r, GEN_INT (29)));
+  emit_insn (gen_avx_cmp<mode>3 (tmp[1], operands[2], two31r, GEN_INT (29)));
+  emit_insn (gen_and<mode>3 (tmp[2], tmp[0], mtwo32r));
+  emit_insn (gen_and<mode>3 (tmp[3], tmp[1], mtwo32r));
+  emit_insn (gen_add<mode>3 (tmp[4], operands[1], tmp[2]));
+  emit_insn (gen_add<mode>3 (tmp[5], operands[2], tmp[3]));
+  if (<MODE>mode == V4DFmode)
+    {
+      emit_insn (gen_avx_cvttpd2dq256_2 (tmp[6], tmp[4]));
+      emit_insn (gen_avx_cvttpd2dq256_2 (tmp[7], tmp[5]));
+      emit_insn (gen_avx_vperm2f128v8si3 (operands[0], tmp[6], tmp[7],
+                                          GEN_INT (0x20)));
+    }
+  else
+    {
+      emit_insn (gen_sse2_cvttpd2dq (tmp[6], tmp[4]));
+      emit_insn (gen_sse2_cvttpd2dq (tmp[7], tmp[5]));
+      emit_insn (gen_vec_interleave_lowv2di (gen_lowpart (V2DImode,
+                                                          operands[0]),
+                                             gen_lowpart (V2DImode, tmp[6]),
+                                             gen_lowpart (V2DImode, tmp[7])));
+    }
+  DONE;
+})
+
 (define_expand "vec_pack_sfix_v4df"
   [(match_operand:V8SI 0 "register_operand" "")
    (match_operand:V4DF 1 "nonimmediate_operand" "")

        Jakub
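For readers who want to follow the data flow of the expander, here is a minimal scalar sketch in plain C of the same compare / and / add / truncate / pack sequence. It is not GCC code: the names cmp_ge_mask, and_mask, ufix_trunc_lane and pack_ufix_trunc_2x2 are made up for illustration, each function stands in for one lane of the corresponding vector instruction, and the sketch assumes inputs in the range (-1, 0x1p32).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one lane of the compare with predicate 29
   (greater-or-equal, non-signalling): all-ones if a >= b, else zero.  */
static uint64_t
cmp_ge_mask (double a, double b)
{
  return a >= b ? UINT64_MAX : 0;
}

/* Hypothetical stand-in for one lane of the bitwise AND: keep the bits
   of D where MASK is set, giving either D itself or +0.0.  */
static double
and_mask (uint64_t mask, double d)
{
  uint64_t bits;

  memcpy (&bits, &d, sizeof bits);
  bits &= mask;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* One lane: add -0x1p32 when X >= 0x1p31, then do a signed truncating
   conversion and reinterpret the 32 result bits as unsigned.  */
static uint32_t
ufix_trunc_lane (double x)
{
  const double two31 = 2147483648.0;    /* 0x1p31 */
  const double mtwo32 = -4294967296.0;  /* -0x1p32 */
  double adj, shifted;

  assert (x > -1.0 && x < 4294967296.0); /* assumed input range */
  adj = and_mask (cmp_ge_mask (x, two31), mtwo32); /* -0x1p32 or +0.0 */
  shifted = x + adj;                     /* now fits a signed 32-bit int */
  return (uint32_t) (int32_t) shifted;
}

/* Pack two 2-element inputs into one 4-element result, elements of A
   first, then elements of B.  */
static void
pack_ufix_trunc_2x2 (uint32_t out[4], const double a[2], const double b[2])
{
  for (int i = 0; i < 2; i++)
    {
      out[i] = ufix_trunc_lane (a[i]);
      out[i + 2] = ufix_trunc_lane (b[i]);
    }
}
```

Because the truncation is applied after the shift, a non-integral input at or above 0x1p31 is rounded toward zero on the shifted, negative value in this sketch; for example 3000000000.75 comes out as 3000000001.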
Re: [PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders
On Tue, Nov 1, 2011 at 10:07 AM, Jakub Jelinek ja...@redhat.com wrote:

> Similarly to the V{4,8}SFmode -> unsigned V{4,8}SImode conversion support
> for AVX this one adds V{2,4}DFmode -> unsigned V{4,8}SImode conversion.
>
> Ok for trunk?

Please put expander function into i386.c. IMO, this expander can be
better written using variable mode and indirect functions.

Otherwise, it looks OK.

Thanks,
Uros.
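The "variable mode and indirect functions" style suggested above can be illustrated outside of GCC with a small C sketch: the mode is an ordinary run-time value, and a switch picks a function pointer that is then called indirectly. Everything here is hypothetical (the enum vec_mode, the ge_* functions and count_ge are illustration only, not GCC identifiers); the per-mode functions just count how many lanes are greater than or equal to a bound.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vector modes, modelled as a plain run-time value.  */
enum vec_mode { MODE_V4SF, MODE_V8SF, MODE_V2DF, MODE_V4DF };

/* Shared workers for float and double lanes.  */
static int
ge_float (const float *v, int n, double bound)
{
  int count = 0;

  for (int i = 0; i < n; i++)
    count += v[i] >= bound;
  return count;
}

static int
ge_double (const double *v, int n, double bound)
{
  int count = 0;

  for (int i = 0; i < n; i++)
    count += v[i] >= bound;
  return count;
}

/* One entry point per mode, all with the same signature so that they
   can be called through a single function pointer type.  */
static int ge_v4sf (const void *v, double b) { return ge_float (v, 4, b); }
static int ge_v8sf (const void *v, double b) { return ge_float (v, 8, b); }
static int ge_v2df (const void *v, double b) { return ge_double (v, 2, b); }
static int ge_v4df (const void *v, double b) { return ge_double (v, 4, b); }

/* Mode-generic helper: select the per-mode function once, then call it
   indirectly.  */
static int
count_ge (enum vec_mode mode, const void *v, double bound)
{
  int (*fn) (const void *, double);

  switch (mode)
    {
    case MODE_V4SF: fn = ge_v4sf; break;
    case MODE_V8SF: fn = ge_v8sf; break;
    case MODE_V2DF: fn = ge_v2df; break;
    case MODE_V4DF: fn = ge_v4df; break;
    default: abort ();
    }
  return fn (v, bound);
}
```

The point of the style is that the body of the helper is written once; only the switch needs to know which modes exist.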
[PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders (take 2)
On Tue, Nov 01, 2011 at 11:16:07AM +0100, Uros Bizjak wrote:
> On Tue, Nov 1, 2011 at 10:07 AM, Jakub Jelinek ja...@redhat.com wrote:
>
> > Similarly to the V{4,8}SFmode -> unsigned V{4,8}SImode conversion support
> > for AVX this one adds V{2,4}DFmode -> unsigned V{4,8}SImode conversion.
> >
> > Ok for trunk?
>
> Please put expander function into i386.c. IMO, this expander can be
> better written using variable mode and indirect functions.

Like this?  Advantage is that fixuns_trunc<mode><sseintvecmodelower>2
pattern can use the helper too and shrink, disadvantage is that the stmts
in the new pattern are now in
vcmppd; vandpd; vaddpd; vcmppd; vandpd; vaddpd
order instead of
vcmppd; vcmppd; vandpd; vandpd; vaddpd; vaddpd;
(not sure why the scheduler didn't change it, but on the other side it is
scheduler's job).

2011-11-01  Jakub Jelinek  ja...@redhat.com

        * config/i386/i386-protos.h (ix86_expand_adjust_ufix_to_sfix_si): New
        prototype.
        * config/i386/i386.c (ix86_expand_adjust_ufix_to_sfix_si): New
        function.
        * config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): Use
        it.
        (ssepackfltmode): New mode attr.
        (vec_pack_ufix_trunc_<mode>): New expander.

--- gcc/config/i386/i386-protos.h.jj 2011-10-25 08:13:31.0 +0200
+++ gcc/config/i386/i386-protos.h 2011-11-01 14:18:59.0 +0100
@@ -109,6 +109,7 @@ extern void ix86_expand_convert_uns_sixf
 extern void ix86_expand_convert_uns_sidf_sse (rtx, rtx);
 extern void ix86_expand_convert_uns_sisf_sse (rtx, rtx);
 extern void ix86_expand_convert_sign_didf_sse (rtx, rtx);
+extern rtx ix86_expand_adjust_ufix_to_sfix_si (rtx);
 extern enum ix86_fpcmp_strategy ix86_fp_comparison_strategy (enum rtx_code);
 extern void ix86_expand_fp_absneg_operator (enum rtx_code, enum machine_mode,
                                             rtx[]);
--- gcc/config/i386/i386.c.jj 2011-10-31 20:44:13.0 +0100
+++ gcc/config/i386/i386.c 2011-11-01 14:26:31.0 +0100
@@ -17016,6 +17016,46 @@ ix86_expand_convert_uns_sisf_sse (rtx ta
   emit_move_insn (target, fp_hi);
 }
 
+/* Adjust a V*SFmode/V*DFmode value VAL so that *sfix_trunc* resp. fix_trunc*
+   pattern can be used on it instead of *ufix_trunc* resp. fixuns_trunc*.
+   This is done by subtracting 0x1p32 from VAL if VAL is greater or equal
+   (non-signalling) than 0x1p31.  */
+
+rtx
+ix86_expand_adjust_ufix_to_sfix_si (rtx val)
+{
+  REAL_VALUE_TYPE MTWO32r, TWO31r;
+  rtx two31r, mtwo32r, tmp[3];
+  enum machine_mode mode = GET_MODE (val);
+  enum machine_mode scalarmode = GET_MODE_INNER (mode);
+  rtx (*cmp) (rtx, rtx, rtx, rtx);
+  int i;
+
+  for (i = 0; i < 3; i++)
+    tmp[i] = gen_reg_rtx (mode);
+  real_ldexp (&TWO31r, &dconst1, 31);
+  two31r = const_double_from_real_value (TWO31r, scalarmode);
+  two31r = ix86_build_const_vector (mode, 1, two31r);
+  two31r = force_reg (mode, two31r);
+  real_ldexp (&MTWO32r, &dconstm1, 32);
+  mtwo32r = const_double_from_real_value (MTWO32r, scalarmode);
+  mtwo32r = ix86_build_const_vector (mode, 1, mtwo32r);
+  mtwo32r = force_reg (mode, mtwo32r);
+  switch (mode)
+    {
+    case V8SFmode: cmp = gen_avx_cmpv8sf3; break;
+    case V4SFmode: cmp = gen_avx_cmpv4sf3; break;
+    case V4DFmode: cmp = gen_avx_cmpv4df3; break;
+    case V2DFmode: cmp = gen_avx_cmpv2df3; break;
+    default: gcc_unreachable ();
+    }
+  emit_insn (cmp (tmp[0], val, two31r, GEN_INT (29)));
+  tmp[1] = expand_simple_binop (mode, AND, tmp[0], mtwo32r, tmp[1],
+                                0, OPTAB_DIRECT);
+  return expand_simple_binop (mode, PLUS, val, tmp[1], tmp[2],
+                              0, OPTAB_DIRECT);
+}
+
 /* A subroutine of ix86_build_signbit_mask.  If VECT is true,
    then replicate the value for all elements of the vector
    register.  */
--- gcc/config/i386/sse.md.jj 2011-11-01 09:04:37.0 +0100
+++ gcc/config/i386/sse.md 2011-11-01 14:25:52.0 +0100
@@ -2323,32 +2323,13 @@ (define_insn "fix_truncv4sfv4si2"
    (set_attr "mode" "TI")])
 
 (define_expand "fixuns_trunc<mode><sseintvecmodelower>2"
-  [(set (match_dup 4)
-        (unspec:VF1
-          [(match_operand:VF1 1 "register_operand" "")
-           (match_dup 2)
-           (const_int 29)] UNSPEC_PCMP))
-   (set (match_dup 5)
-        (and:VF1 (match_dup 4) (match_dup 3)))
-   (set (match_dup 6)
-        (plus:VF1 (match_dup 1) (match_dup 5)))
-   (set (match_operand:<sseintvecmode> 0 "register_operand" "")
-        (fix:<sseintvecmode> (match_dup 6)))]
+  [(match_operand:<sseintvecmode> 0 "register_operand" "")
+   (match_operand:VF1 1 "register_operand" "")]
   "TARGET_AVX"
 {
-  REAL_VALUE_TYPE MTWO32r, TWO31r;
-  int i;
-
-  real_ldexp (&TWO31r, &dconst1, 31);
-  operands[2] = const_double_from_real_value (TWO31r, SFmode);
-  operands[2] = ix86_build_const_vector (<MODE>mode, 1, operands[2]);
-  operands[2] = force_reg (<MODE>mode, operands[2]);
-  real_ldexp (&MTWO32r, &dconstm1, 32);
-  operands[3] = const_double_from_real_value (MTWO32r, SFmode);
-  operands[3] =
Re: [PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders (take 2)
On 11/01/2011 06:35 AM, Jakub Jelinek wrote:
> ... disadvantage is that the stmts
> in the new pattern are now in
> vcmppd; vandpd; vaddpd; vcmppd; vandpd; vaddpd
> order instead of
> vcmppd; vcmppd; vandpd; vandpd; vaddpd; vaddpd;
> (not sure why the scheduler didn't change it, but on the other side it is
> scheduler's job).

I wonder if the scheduling description didn't get updated properly?
If the scheduler believes that each insn takes 1 cycle, and there
is only one pipe for them, it won't reorder anything.

>         * config/i386/i386-protos.h (ix86_expand_adjust_ufix_to_sfix_si): New
>         prototype.
>         * config/i386/i386.c (ix86_expand_adjust_ufix_to_sfix_si): New
>         function.
>         * config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): Use
>         it.
>         (ssepackfltmode): New mode attr.
>         (vec_pack_ufix_trunc_<mode>): New expander.

Looks good to me.

r~
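The point about a 1-cycle, single-pipe description can be checked with a toy model. The C sketch below is purely illustrative and is not GCC's scheduler: total_cycles is a hypothetical in-order issue model with one pipe, where every instruction has the same latency and at most one dependence on an earlier instruction. The two arrays encode the two orders discussed in this thread.

```c
#include <assert.h>

/* Toy in-order, single-pipe issue model (an assumption made for this
   example).  dep[i] is the index of the earlier insn whose result insn i
   needs, or -1.  Returns the cycle in which the last result is ready.  */
static int
total_cycles (const int *dep, int n, int latency)
{
  int issue[16];
  int done = 0;

  assert (n > 0 && n <= 16 && latency >= 1);
  for (int i = 0; i < n; i++)
    {
      /* At most one insn issues per cycle, in program order.  */
      int cycle = i == 0 ? 0 : issue[i - 1] + 1;

      assert (dep[i] < i);
      /* Stall until the operand has been produced.  */
      if (dep[i] >= 0 && issue[dep[i]] + latency > cycle)
        cycle = issue[dep[i]] + latency;
      issue[i] = cycle;
      if (cycle + latency > done)
        done = cycle + latency;
    }
  return done;
}

/* vcmppd; vandpd; vaddpd; vcmppd; vandpd; vaddpd  */
static const int chained[6] = { -1, 0, 1, -1, 3, 4 };

/* vcmppd; vcmppd; vandpd; vandpd; vaddpd; vaddpd  */
static const int interleaved[6] = { -1, -1, 0, 1, 2, 3 };
```

In this model both orders finish in 6 cycles when the latency is 1, so there is nothing to gain from reordering; with a latency of 3 the chained order takes 16 cycles and the interleaved one 10.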
Re: [PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders (take 2)
On Tue, Nov 1, 2011 at 2:35 PM, Jakub Jelinek ja...@redhat.com wrote:

>>> Similarly to the V{4,8}SFmode -> unsigned V{4,8}SImode conversion support
>>> for AVX this one adds V{2,4}DFmode -> unsigned V{4,8}SImode conversion.
>>>
>>> Ok for trunk?
>>
>> Please put expander function into i386.c. IMO, this expander can be
>> better written using variable mode and indirect functions.
>
> Like this?  Advantage is that fixuns_trunc<mode><sseintvecmodelower>2
> pattern can use the helper too and shrink, disadvantage is that the stmts
> in the new pattern are now in
> vcmppd; vandpd; vaddpd; vcmppd; vandpd; vaddpd
> order instead of
> vcmppd; vcmppd; vandpd; vandpd; vaddpd; vaddpd;
> (not sure why the scheduler didn't change it, but on the other side it is
> scheduler's job).
>
> 2011-11-01  Jakub Jelinek  ja...@redhat.com
>
>        * config/i386/i386-protos.h (ix86_expand_adjust_ufix_to_sfix_si): New
>        prototype.
>        * config/i386/i386.c (ix86_expand_adjust_ufix_to_sfix_si): New
>        function.
>        * config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): Use
>        it.
>        (ssepackfltmode): New mode attr.
>        (vec_pack_ufix_trunc_<mode>): New expander.

OK.

Thanks,
Uros.