Re: [PATCH V4] PR88497 - Extend reassoc for vector bit_field_ref

2019-07-10 Thread Richard Biener
On Mon, 8 Jul 2019, Kewen.Lin wrote:

> Hi Richard,
> 
> Thanks a lot for your review and reply, I've updated the patch accordingly.
> 
> Main changes compared to previous one:
>   - Consider SVE (poly_int) on bit_field_ref size/offset.
>   - Filter valid candidates with sbitmap first.
>   - Support multiple modes (TODO 1).
>   - Test cases: add SSE2 support for 1..5, update run result check for 1,
> add two more cases (5,6)to verify multiple mode support (x86).
> 
> Bootstrapped and regression test passed on powerpc64le-unknown-linux-gnu
> and x86_64-linux-gnu.
> 
> Could you help to review it again?  Thanks in advance!

+  tree rhs = gimple_assign_rhs1 (oe1def);
+  tree vec = TREE_OPERAND (rhs, 0);
+  tree vec_type = TREE_TYPE (vec);
+
+  if (TREE_CODE (vec) != SSA_NAME || !VECTOR_TYPE_P (vec_type))
+   continue;
...
+  /* Ignore it if target machine can't support this VECTOR type.  */
+  if (!VECTOR_MODE_P (TYPE_MODE (vec_type)))
+   continue;

put the check below next to the one above, it's cheaper than the
poly-int stuff that currently preceeds it.

+  v_info_ptr *info_ptr = v_info_map.get (vec);
+  if (info_ptr)
+   {

there is get_or_insert which enables you to mostly merge the two cases
with a preceeding

  if (!existed)
v_info_ptr = new v_info;

also eliding the extra hash-lookup the put operation otherwise requires.

+  for (hash_map::iterator it = v_info_map.begin ();
+   it != v_info_map.end (); ++it)
+{
+  tree cand_vec = (*it).first;
+  v_info_ptr cand_info = (*it).second;
+  unsigned int num_elems = VECTOR_CST_NELTS (cand_vec).to_constant 
();

please add a quick out like

  if (cand_info->length () != num_elems)
continue;

since to have no holes and no overlap you need exactly that many.

I think you can avoid the somewhat ugly mode_to_total counter array.
Since you have sorted valid_vecs after modes simply do

  for (unsigned i = 0; i < valid_vecs.length () - 1; ++i)
{
  tree tvec = valid_vecs[i];
  enum machine_mode mode = TYPE_MODE (TREE_TYPE (tvec));

(please don't use unsigned int for mode!)

  /* Skip modes with only a single candidate.  */
  if (TYPE_MODE (TREE_TYPE (valid_vecs[i+1])) != mode)
continue;

  do
{
...
}
  while (...)

Richard.

> Kewen
> 
> 
> 
> gcc/ChangeLog
> 
> 2019-07-08  Kewen Lin  
> 
>   PR tree-optimization/88497
>   * tree-ssa-reassoc.c (reassociate_bb): Swap the positions of 
>   GIMPLE_BINARY_RHS check and gimple_visited_p check, call new 
>   function undistribute_bitref_for_vector.
>   (undistribute_bitref_for_vector): New function.
>   (cleanup_vinfo_map): Likewise.
>   (sort_by_mach_mode): Likewise.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-07-08  Kewen Lin  
> 
>   * gcc.dg/tree-ssa/pr88497-1.c: New test.
>   * gcc.dg/tree-ssa/pr88497-2.c: Likewise.
>   * gcc.dg/tree-ssa/pr88497-3.c: Likewise.
>   * gcc.dg/tree-ssa/pr88497-4.c: Likewise.
>   * gcc.dg/tree-ssa/pr88497-5.c: Likewise.
>   * gcc.dg/tree-ssa/pr88497-6.c: Likewise.
>   * gcc.dg/tree-ssa/pr88497-7.c: Likewise.
> 
> 
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Re: [PATCH V4] PR88497 - Extend reassoc for vector bit_field_ref

2019-07-09 Thread Segher Boessenkool
On Tue, Jul 09, 2019 at 10:28:06AM +0800, Kewen.Lin wrote:
> on 2019/7/9 上午12:32, Segher Boessenkool wrote:
> > On Mon, Jul 08, 2019 at 04:07:00PM +0800, Kewen.Lin wrote:
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
> >> @@ -0,0 +1,60 @@
> >> +/* { dg-do run } */
> >> +/* { dg-require-effective-target vect_double } */
> >> +/* { dg-require-effective-target powerpc_vsx_ok { target { powerpc*-*-* } 
> >> } } */
> > 
> > For "dg-do run" tests, you need "powerpc_vsx_hw".  "_ok" only tests if
> > the assembler can handle VSX instructions, not whether the test system
> > can run them.  (powerpc_vsx_ok is what you need for "dg-do assemble" or
> > "dg-do link" tests.  It also tests if you can use -mvsx, but that doesn't
> > do what you might hope it does: you can use -mvsx together with a -mcpu=
> > that doesn't support VSX, for example).
> 
> Thanks, I will update it.  But sorry that I can't find "powerpc_vsx_hw" but 
> "vsx_hw_available".  I guess it's the one you are referring to?

Yeah, sorry.  You can also use just "vsx_hw".

> And I happened
> to find the vect_double will force powerpc to check vsx_hw_available.

Yes :-)  So this whole line is unnecessary.


Segher


Re: [PATCH V4] PR88497 - Extend reassoc for vector bit_field_ref

2019-07-08 Thread Kewen.Lin
Hi Segher,

on 2019/7/9 上午12:32, Segher Boessenkool wrote:
> Hi Kewen,
> 
> On Mon, Jul 08, 2019 at 04:07:00PM +0800, Kewen.Lin wrote:
>> gcc/ChangeLog
> 
> (You have trailing spaces in the changelog, fwiw).
> 

Thanks for catching!

>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
>> @@ -0,0 +1,60 @@
>> +/* { dg-do run } */
>> +/* { dg-require-effective-target vect_double } */
>> +/* { dg-require-effective-target powerpc_vsx_ok { target { powerpc*-*-* } } 
>> } */
> 
> For "dg-do run" tests, you need "powerpc_vsx_hw".  "_ok" only tests if
> the assembler can handle VSX instructions, not whether the test system
> can run them.  (powerpc_vsx_ok is what you need for "dg-do assemble" or
> "dg-do link" tests.  It also tests if you can use -mvsx, but that doesn't
> do what you might hope it does: you can use -mvsx together with a -mcpu=
> that doesn't support VSX, for example).
> 

Thanks, I will update it.  But sorry that I can't find "powerpc_vsx_hw" but 
"vsx_hw_available".  I guess it's the one you are referring to?  And I happened
to find the vect_double will force powerpc to check vsx_hw_available.

This reminds me I should use sse2 instead of sse2-runtime for case 2-5.

>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
>> @@ -0,0 +1,37 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_float } */
>> +/* { dg-require-effective-target powerpc_altivec_ok { target { powerpc*-*-* 
>> } } } */
> 
> This one is fine, and the rest of the tests as well.
> 
> 
> Segher
> 



Re: [PATCH V4] PR88497 - Extend reassoc for vector bit_field_ref

2019-07-08 Thread Segher Boessenkool
Hi Kewen,

On Mon, Jul 08, 2019 at 04:07:00PM +0800, Kewen.Lin wrote:
> gcc/ChangeLog

(You have trailing spaces in the changelog, fwiw).

> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
> @@ -0,0 +1,60 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target vect_double } */
> +/* { dg-require-effective-target powerpc_vsx_ok { target { powerpc*-*-* } } 
> } */

For "dg-do run" tests, you need "powerpc_vsx_hw".  "_ok" only tests if
the assembler can handle VSX instructions, not whether the test system
can run them.  (powerpc_vsx_ok is what you need for "dg-do assemble" or
"dg-do link" tests.  It also tests if you can use -mvsx, but that doesn't
do what you might hope it does: you can use -mvsx together with a -mcpu=
that doesn't support VSX, for example).

> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_float } */
> +/* { dg-require-effective-target powerpc_altivec_ok { target { powerpc*-*-* 
> } } } */

This one is fine, and the rest of the tests as well.


Segher


[PATCH V4] PR88497 - Extend reassoc for vector bit_field_ref

2019-07-08 Thread Kewen.Lin
Hi Richard,

Thanks a lot for your review and reply, I've updated the patch accordingly.

Main changes compared to previous one:
  - Consider SVE (poly_int) on bit_field_ref size/offset.
  - Filter valid candidates with sbitmap first.
  - Support multiple modes (TODO 1).
  - Test cases: add SSE2 support for 1..5, update run result check for 1,
add two more cases (5,6)to verify multiple mode support (x86).

Bootstrapped and regression test passed on powerpc64le-unknown-linux-gnu
and x86_64-linux-gnu.

Could you help to review it again?  Thanks in advance!

Kewen



gcc/ChangeLog

2019-07-08  Kewen Lin  

PR tree-optimization/88497
* tree-ssa-reassoc.c (reassociate_bb): Swap the positions of 
GIMPLE_BINARY_RHS check and gimple_visited_p check, call new 
function undistribute_bitref_for_vector.
(undistribute_bitref_for_vector): New function.
(cleanup_vinfo_map): Likewise.
(sort_by_mach_mode): Likewise.

gcc/testsuite/ChangeLog

2019-07-08  Kewen Lin  

* gcc.dg/tree-ssa/pr88497-1.c: New test.
* gcc.dg/tree-ssa/pr88497-2.c: Likewise.
* gcc.dg/tree-ssa/pr88497-3.c: Likewise.
* gcc.dg/tree-ssa/pr88497-4.c: Likewise.
* gcc.dg/tree-ssa/pr88497-5.c: Likewise.
* gcc.dg/tree-ssa/pr88497-6.c: Likewise.
* gcc.dg/tree-ssa/pr88497-7.c: Likewise.


diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
new file mode 100644
index 000..d17de42
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-require-effective-target powerpc_vsx_ok { target { powerpc*-*-* } } } 
*/
+/* { dg-require-effective-target sse2_runtime { target { i?86-*-* x86_64-*-* } 
} } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-reassoc1" } */
+/* { dg-additional-options "-mvsx" { target { powerpc*-*-* } } } */
+/* { dg-additional-options "-msse2" { target { i?86-*-* x86_64-*-* } } } */
+
+/* To test reassoc can undistribute vector bit_field_ref summation.
+
+   arg1 and arg2 are two arrays whose elements of type vector double.
+   Assuming:
+ A0 = arg1[0], A1 = arg1[1], A2 = arg1[2], A3 = arg1[3],
+ B0 = arg2[0], B1 = arg2[1], B2 = arg2[2], B3 = arg2[3],
+
+   Then:
+ V0 = A0 * B0, V1 = A1 * B1, V2 = A2 * B2, V3 = A3 * B3,
+
+   reassoc transforms
+
+ accumulator += V0[0] + V0[1] + V1[0] + V1[1] + V2[0] + V2[1]
+  + V3[0] + V3[1];
+
+   into:
+
+ T = V0 + V1 + V2 + V3
+ accumulator += T[0] + T[1];
+
+   Fewer bit_field_refs, only two for 128 or more bits vector.  */
+
+typedef double v2df __attribute__ ((vector_size (16)));
+__attribute__ ((noinline)) double
+test (double accumulator, v2df arg1[], v2df arg2[])
+{
+  v2df temp;
+  temp = arg1[0] * arg2[0];
+  accumulator += temp[0] + temp[1];
+  temp = arg1[1] * arg2[1];
+  accumulator += temp[0] + temp[1];
+  temp = arg1[2] * arg2[2];
+  accumulator += temp[0] + temp[1];
+  temp = arg1[3] * arg2[3];
+  accumulator += temp[0] + temp[1];
+  return accumulator;
+}
+
+extern void abort (void);
+
+int
+main ()
+{
+  v2df v2[4] = {{1.0, 2.0}, {4.0, 8.0}, {1.0, 3.0}, {9.0, 27.0}};
+  v2df v3[4] = {{1.0, 4.0}, {16.0, 64.0}, {1.0, 2.0}, {3.0, 4.0}};
+  double acc = 100.0;
+  double res = test (acc, v2, v3);
+  if (res != 827.0)
+abort ();
+  return 0;
+}
+/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 2 "reassoc1" { target { 
powerpc*-*-* i?86-*-* x86_64-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
new file mode 100644
index 000..ce34c7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_float } */
+/* { dg-require-effective-target powerpc_altivec_ok { target { powerpc*-*-* } 
} } */
+/* { dg-require-effective-target sse2_runtime { target { i?86-*-* x86_64-*-* } 
} } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-reassoc1" } */
+/* { dg-additional-options "-maltivec" { target { powerpc*-*-* } } } */
+/* { dg-additional-options "-msse2" { target { i?86-*-* x86_64-*-* } } } */
+
+/* To test reassoc can undistribute vector bit_field_ref on multiplication.
+
+   v1, v2, v3, v4 of type vector float.
+
+   reassoc transforms
+
+ accumulator *= v1[0] * v1[1] * v1[2] * v1[3] *
+v2[0] * v2[1] * v2[2] * v2[3] *
+v3[0] * v3[1] * v3[2] * v3[3] *
+v4[0] * v4[1] * v4[2] * v4[3] ;
+
+   into:
+
+ T = v1 * v2 * v3 * v4;
+ accumulator *= T[0] * T[1] * T[2] * T[3];
+
+   Fewer bit_field_refs, only four for 128 or more bits vector.  */
+
+typedef float v4sf __attribute__ ((vector_size (16)));
+float
+test (float accumulator, v4sf v1, v4sf v2, v4sf v3, v4sf v4)
+{
+  accumulator *= v1[0] * v1[1] * v1[2] * v1[3];
+  accumulator *= v2[0] * v2[1] * v2[2] * v2[3];
+  accumulator *=