[PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders

2011-11-01 Thread Jakub Jelinek
Hi!

Similarly to the V{4,8}SFmode -> unsigned V{4,8}SImode conversion
support for AVX this one adds V{2,4}DFmode -> unsigned V{4,8}SImode
conversion.

Ok for trunk?
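
As a reading aid, here is a minimal scalar sketch in plain C of the idea behind the conversion: move values at or above 2^31 down by 2^32 so that a signed truncating conversion can be used, then reinterpret the result as unsigned. The function name `ufix_trunc_via_sfix` is hypothetical and not part of the patch, and the model assumes inputs in [0, 2^32). In this model a non-integral input at or above 2^31 becomes negative before truncation, so truncation toward zero gives one more than the floor of the input; integral inputs are converted exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one vector lane; not code from the patch.
   Converts a double in [0, 2^32) to uint32_t using only a signed
   truncating conversion.  */
static uint32_t
ufix_trunc_via_sfix (double x)
{
  /* Outside this range the signed conversion below would be undefined.  */
  assert (x >= 0.0 && x < 0x1p32);

  /* Values >= 2^31 do not fit in int32_t, so shift them down by 2^32.  */
  double adjusted = (x >= 0x1p31) ? x - 0x1p32 : x;

  /* adjusted is now in [-2^31, 2^31); the cast truncates toward zero.  */
  int32_t s = (int32_t) adjusted;

  /* Converting to unsigned is modulo 2^32, which undoes the shift.  */
  return (uint32_t) s;
}
```

For example, `ufix_trunc_via_sfix (4294967295.0)` adjusts the input to -1.0, truncates it to -1, and returns 4294967295.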

2011-11-01  Jakub Jelinek  ja...@redhat.com

	* config/i386/sse.md (ssepackfltmode): New mode attr.
	(vec_pack_ufix_trunc_<mode>): New expander using VF2 iterator.

--- gcc/config/i386/sse.md.jj   2011-11-01 09:04:37.0 +0100
+++ gcc/config/i386/sse.md  2011-11-01 09:37:36.0 +0100
@@ -3127,6 +3127,56 @@ (define_expand "vec_pack_sfix_trunc_v2df"
   DONE;
 })
 
+(define_mode_attr ssepackfltmode
+  [(V4DF "V8SI") (V2DF "V4SI")])
+
+(define_expand "vec_pack_ufix_trunc_<mode>"
+  [(match_operand:<ssepackfltmode> 0 "register_operand" "")
+   (match_operand:VF2 1 "register_operand" "")
+   (match_operand:VF2 2 "register_operand" "")]
+  "TARGET_AVX"
+{
+  REAL_VALUE_TYPE MTWO32r, TWO31r;
+  rtx two31r, mtwo32r, tmp[8];
+  int i;
+
+  for (i = 0; i < 6; i++)
+    tmp[i] = gen_reg_rtx (<MODE>mode);
+  tmp[6] = gen_reg_rtx (<ssepackfltmode>mode);
+  tmp[7] = gen_reg_rtx (<ssepackfltmode>mode);
+  real_ldexp (&TWO31r, &dconst1, 31);
+  two31r = const_double_from_real_value (TWO31r, DFmode);
+  two31r = ix86_build_const_vector (<MODE>mode, 1, two31r);
+  two31r = force_reg (<MODE>mode, two31r);
+  real_ldexp (&MTWO32r, &dconstm1, 32);
+  mtwo32r = const_double_from_real_value (MTWO32r, DFmode);
+  mtwo32r = ix86_build_const_vector (<MODE>mode, 1, mtwo32r);
+  mtwo32r = force_reg (<MODE>mode, mtwo32r);
+  emit_insn (gen_avx_cmp<mode>3 (tmp[0], operands[1], two31r, GEN_INT (29)));
+  emit_insn (gen_avx_cmp<mode>3 (tmp[1], operands[2], two31r, GEN_INT (29)));
+  emit_insn (gen_and<mode>3 (tmp[2], tmp[0], mtwo32r));
+  emit_insn (gen_and<mode>3 (tmp[3], tmp[1], mtwo32r));
+  emit_insn (gen_add<mode>3 (tmp[4], operands[1], tmp[2]));
+  emit_insn (gen_add<mode>3 (tmp[5], operands[2], tmp[3]));
+  if (<MODE>mode == V4DFmode)
+    {
+      emit_insn (gen_avx_cvttpd2dq256_2 (tmp[6], tmp[4]));
+      emit_insn (gen_avx_cvttpd2dq256_2 (tmp[7], tmp[5]));
+      emit_insn (gen_avx_vperm2f128v8si3 (operands[0], tmp[6], tmp[7],
+                                          GEN_INT (0x20)));
+    }
+  else
+    {
+      emit_insn (gen_sse2_cvttpd2dq (tmp[6], tmp[4]));
+      emit_insn (gen_sse2_cvttpd2dq (tmp[7], tmp[5]));
+      emit_insn (gen_vec_interleave_lowv2di (gen_lowpart (V2DImode,
+                                                          operands[0]),
+                                             gen_lowpart (V2DImode, tmp[6]),
+                                             gen_lowpart (V2DImode, tmp[7])));
+    }
+  DONE;
+})
+
 (define_expand "vec_pack_sfix_v4df"
   [(match_operand:V8SI 0 "register_operand" "")
    (match_operand:V4DF 1 "nonimmediate_operand" "")

Jakub


Re: [PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders

2011-11-01 Thread Uros Bizjak
On Tue, Nov 1, 2011 at 10:07 AM, Jakub Jelinek ja...@redhat.com wrote:

> Similarly to the V{4,8}SFmode -> unsigned V{4,8}SImode conversion
> support for AVX this one adds V{2,4}DFmode -> unsigned V{4,8}SImode
> conversion.
>
> Ok for trunk?

Please put expander function into i386.c. IMO, this expander can be
better written using variable mode and indirect functions.

Otherwise, it looks OK.

Thanks,
Uros.


[PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders (take 2)

2011-11-01 Thread Jakub Jelinek
On Tue, Nov 01, 2011 at 11:16:07AM +0100, Uros Bizjak wrote:
> On Tue, Nov 1, 2011 at 10:07 AM, Jakub Jelinek ja...@redhat.com wrote:
>
> > Similarly to the V{4,8}SFmode -> unsigned V{4,8}SImode conversion
> > support for AVX this one adds V{2,4}DFmode -> unsigned V{4,8}SImode
> > conversion.
> >
> > Ok for trunk?
>
> Please put expander function into i386.c. IMO, this expander can be
> better written using variable mode and indirect functions.

Like this?
Advantage is that fixuns_trunc<mode><sseintvecmodelower>2 pattern can use
the helper too and shrink, disadvantage is that the stmts in the new
pattern are now in vcmppd; vandpd; vaddpd; vcmppd; vandpd; vaddpd order
instead of vcmppd; vcmppd; vandpd; vandpd; vaddpd; vaddpd; (not sure why
the scheduler didn't change it, but on the other side it is scheduler's
job).
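
To make the vcmppd; vandpd; vaddpd sequence concrete, here is a rough lane-by-lane model in plain C, offered as a sketch rather than as what GCC emits. All function names are hypothetical. It assumes inputs in [0, 2^32), models comparison predicate 29 (greater-or-equal, ordered, non-signalling) as a per-lane all-ones mask, and packs two V2DF-like operands into four 32-bit results in the order a0, a1, b0, b1. Because the two cmp/and/add chains are independent, the result does not depend on whether they are emitted one chain after the other or interleaved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: view a double as raw bits and back.  */
static uint64_t
bits_of (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}

static double
double_of (uint64_t u)
{
  double d;
  memcpy (&d, &u, sizeof d);
  return d;
}

/* One lane of the adjustment: compare, mask, add.  */
static double
lane_adjust (double x)
{
  /* vcmppd model: all-ones if x >= 2^31, else zero.  */
  uint64_t mask = (x >= 0x1p31) ? ~UINT64_C (0) : 0;
  /* vandpd model: keeps -2^32 where the mask is set, +0.0 elsewhere.  */
  double addend = double_of (mask & bits_of (-0x1p32));
  /* vaddpd model.  */
  return x + addend;
}

/* Model of the V2DF case: adjust, truncate as signed, pack low halves.  */
static void
pack_ufix_trunc_v2df (const double a[2], const double b[2], uint32_t out[4])
{
  for (int i = 0; i < 2; i++)
    {
      /* Outside [0, 2^32) the signed casts below would be undefined.  */
      assert (a[i] >= 0.0 && a[i] < 0x1p32);
      assert (b[i] >= 0.0 && b[i] < 0x1p32);
      out[i] = (uint32_t) (int32_t) lane_adjust (a[i]);
      out[i + 2] = (uint32_t) (int32_t) lane_adjust (b[i]);
    }
}
```

Calling `pack_ufix_trunc_v2df` with a = {1.0, 2147483648.0} and b = {4294967295.0, 0.0} fills out with {1, 2147483648, 4294967295, 0}.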

2011-11-01  Jakub Jelinek  ja...@redhat.com

	* config/i386/i386-protos.h (ix86_expand_adjust_ufix_to_sfix_si): New
	prototype.
	* config/i386/i386.c (ix86_expand_adjust_ufix_to_sfix_si): New
	function.
	* config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): Use
	it.
	(ssepackfltmode): New mode attr.
	(vec_pack_ufix_trunc_<mode>): New expander.

--- gcc/config/i386/i386-protos.h.jj	2011-10-25 08:13:31.0 +0200
+++ gcc/config/i386/i386-protos.h   2011-11-01 14:18:59.0 +0100
@@ -109,6 +109,7 @@ extern void ix86_expand_convert_uns_sixf
 extern void ix86_expand_convert_uns_sidf_sse (rtx, rtx);
 extern void ix86_expand_convert_uns_sisf_sse (rtx, rtx);
 extern void ix86_expand_convert_sign_didf_sse (rtx, rtx);
+extern rtx ix86_expand_adjust_ufix_to_sfix_si (rtx);
 extern enum ix86_fpcmp_strategy ix86_fp_comparison_strategy (enum rtx_code);
 extern void ix86_expand_fp_absneg_operator (enum rtx_code, enum machine_mode,
rtx[]);
--- gcc/config/i386/i386.c.jj   2011-10-31 20:44:13.0 +0100
+++ gcc/config/i386/i386.c  2011-11-01 14:26:31.0 +0100
@@ -17016,6 +17016,46 @@ ix86_expand_convert_uns_sisf_sse (rtx ta
 emit_move_insn (target, fp_hi);
 }
 
+/* Adjust a V*SFmode/V*DFmode value VAL so that *sfix_trunc* resp. fix_trunc*
+   pattern can be used on it instead of *ufix_trunc* resp. fixuns_trunc*.
+   This is done by subtracting 0x1p32 from VAL if VAL is greater or equal
+   (non-signalling) than 0x1p31.  */
+
+rtx
+ix86_expand_adjust_ufix_to_sfix_si (rtx val)
+{
+  REAL_VALUE_TYPE MTWO32r, TWO31r;
+  rtx two31r, mtwo32r, tmp[3];
+  enum machine_mode mode = GET_MODE (val);
+  enum machine_mode scalarmode = GET_MODE_INNER (mode);
+  rtx (*cmp) (rtx, rtx, rtx, rtx);
+  int i;
+
+  for (i = 0; i < 3; i++)
+    tmp[i] = gen_reg_rtx (mode);
+  real_ldexp (&TWO31r, &dconst1, 31);
+  two31r = const_double_from_real_value (TWO31r, scalarmode);
+  two31r = ix86_build_const_vector (mode, 1, two31r);
+  two31r = force_reg (mode, two31r);
+  real_ldexp (&MTWO32r, &dconstm1, 32);
+  mtwo32r = const_double_from_real_value (MTWO32r, scalarmode);
+  mtwo32r = ix86_build_const_vector (mode, 1, mtwo32r);
+  mtwo32r = force_reg (mode, mtwo32r);
+  switch (mode)
+    {
+    case V8SFmode: cmp = gen_avx_cmpv8sf3; break;
+    case V4SFmode: cmp = gen_avx_cmpv4sf3; break;
+    case V4DFmode: cmp = gen_avx_cmpv4df3; break;
+    case V2DFmode: cmp = gen_avx_cmpv2df3; break;
+    default: gcc_unreachable ();
+    }
+  emit_insn (cmp (tmp[0], val, two31r, GEN_INT (29)));
+  tmp[1] = expand_simple_binop (mode, AND, tmp[0], mtwo32r, tmp[1],
+                                0, OPTAB_DIRECT);
+  return expand_simple_binop (mode, PLUS, val, tmp[1], tmp[2],
+                              0, OPTAB_DIRECT);
+}
+
 /* A subroutine of ix86_build_signbit_mask.  If VECT is true,
then replicate the value for all elements of the vector
register.  */
--- gcc/config/i386/sse.md.jj   2011-11-01 09:04:37.0 +0100
+++ gcc/config/i386/sse.md  2011-11-01 14:25:52.0 +0100
@@ -2323,32 +2323,13 @@ (define_insn "fix_truncv4sfv4si2"
    (set_attr "mode" "TI")])
 
 (define_expand "fixuns_trunc<mode><sseintvecmodelower>2"
-  [(set (match_dup 4)
-        (unspec:VF1
-          [(match_operand:VF1 1 "register_operand" "")
-           (match_dup 2)
-           (const_int 29)] UNSPEC_PCMP))
-   (set (match_dup 5)
-        (and:VF1 (match_dup 4) (match_dup 3)))
-   (set (match_dup 6)
-        (plus:VF1 (match_dup 1) (match_dup 5)))
-   (set (match_operand:<sseintvecmode> 0 "register_operand" "")
-        (fix:<sseintvecmode> (match_dup 6)))]
+  [(match_operand:<sseintvecmode> 0 "register_operand" "")
+   (match_operand:VF1 1 "register_operand" "")]
   "TARGET_AVX"
 {
-  REAL_VALUE_TYPE MTWO32r, TWO31r;
-  int i;
-
-  real_ldexp (&TWO31r, &dconst1, 31);
-  operands[2] = const_double_from_real_value (TWO31r, SFmode);
-  operands[2] = ix86_build_const_vector (<MODE>mode, 1, operands[2]);
-  operands[2] = force_reg (<MODE>mode, operands[2]);
-  real_ldexp (&MTWO32r, &dconstm1, 32);
-  operands[3] = const_double_from_real_value (MTWO32r, SFmode);
-  operands[3] = 

Re: [PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders (take 2)

2011-11-01 Thread Richard Henderson
On 11/01/2011 06:35 AM, Jakub Jelinek wrote:
> ... disadvantage is that the stmts in the new
> pattern are now in vcmppd; vandpd; vaddpd; vcmppd; vandpd; vaddpd order
> instead of vcmppd; vcmppd; vandpd; vandpd; vaddpd; vaddpd; (not sure why
> the scheduler didn't change it, but on the other side it is scheduler's
> job).

I wonder if the scheduling description didn't get updated properly?
If the scheduler believes that each insn takes 1 cycle, and there
is only one pipe for them, it won't reorder anything.
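
A toy model can illustrate this reasoning. The sketch below is not GCC's scheduler and none of its names come from GCC; it assumes a single in-order pipe that issues at most one insn per cycle, a uniform result latency, and at most one dependency per insn. Under those assumptions, with a latency of 1 both instruction orders finish in the same number of cycles, so there is nothing to gain from reordering; with a longer assumed latency the grouped order finishes sooner.

```c
#include <assert.h>

enum { NINSN = 6 };

/* dep[i] is the program-order index of the insn whose result insn i
   consumes, or -1 if it has no dependency.  Returns the cycle in which
   the last result becomes available.  */
static int
finish_cycle (const int dep[NINSN], int latency)
{
  int issue[NINSN];
  int done = 0;

  for (int i = 0; i < NINSN; i++)
    {
      /* A dependency must come earlier in program order.  */
      assert (dep[i] < i);

      /* One pipe, in order: at most one insn issues per cycle.  */
      int earliest = (i == 0) ? 0 : issue[i - 1] + 1;

      /* Wait until the producer's result is ready.  */
      if (dep[i] >= 0 && issue[dep[i]] + latency > earliest)
        earliest = issue[dep[i]] + latency;

      issue[i] = earliest;
      if (earliest + latency > done)
        done = earliest + latency;
    }
  return done;
}

/* vcmppd A; vandpd A; vaddpd A; vcmppd B; vandpd B; vaddpd B  */
static const int chained_order[NINSN] = { -1, 0, 1, -1, 3, 4 };

/* vcmppd A; vcmppd B; vandpd A; vandpd B; vaddpd A; vaddpd B  */
static const int grouped_order[NINSN] = { -1, -1, 0, 1, 2, 3 };
```

In this model `finish_cycle (chained_order, 1)` and `finish_cycle (grouped_order, 1)` both return 6, while with a latency of 3 the chained order returns 16 and the grouped order returns 10.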

> 	* config/i386/i386-protos.h (ix86_expand_adjust_ufix_to_sfix_si): New
> 	prototype.
> 	* config/i386/i386.c (ix86_expand_adjust_ufix_to_sfix_si): New
> 	function.
> 	* config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): Use
> 	it.
> 	(ssepackfltmode): New mode attr.
> 	(vec_pack_ufix_trunc_<mode>): New expander.

Looks good to me.


r~


Re: [PATCH] Add vec_pack_ufix_trunc_{v4df,v2df} expanders (take 2)

2011-11-01 Thread Uros Bizjak
On Tue, Nov 1, 2011 at 2:35 PM, Jakub Jelinek ja...@redhat.com wrote:

> > > Similarly to the V{4,8}SFmode -> unsigned V{4,8}SImode conversion
> > > support for AVX this one adds V{2,4}DFmode -> unsigned V{4,8}SImode
> > > conversion.
> > >
> > > Ok for trunk?
> >
> > Please put expander function into i386.c. IMO, this expander can be
> > better written using variable mode and indirect functions.
>
> Like this?
> Advantage is that fixuns_trunc<mode><sseintvecmodelower>2 pattern can use
> the helper too and shrink, disadvantage is that the stmts in the new
> pattern are now in vcmppd; vandpd; vaddpd; vcmppd; vandpd; vaddpd order
> instead of vcmppd; vcmppd; vandpd; vandpd; vaddpd; vaddpd; (not sure why
> the scheduler didn't change it, but on the other side it is scheduler's
> job).
>
> 2011-11-01  Jakub Jelinek  ja...@redhat.com
>
> 	* config/i386/i386-protos.h (ix86_expand_adjust_ufix_to_sfix_si): New
> 	prototype.
> 	* config/i386/i386.c (ix86_expand_adjust_ufix_to_sfix_si): New
> 	function.
> 	* config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): Use
> 	it.
> 	(ssepackfltmode): New mode attr.
> 	(vec_pack_ufix_trunc_<mode>): New expander.

OK.

Thanks,
Uros.