Re: [PATCH GCC][6/7]Support loop nest distribution for builtin partition

2017-10-12 Thread Bin.Cheng
On Thu, Oct 12, 2017 at 2:32 PM, Richard Biener wrote:
> On Thu, Oct 5, 2017 at 3:17 PM, Bin Cheng  wrote:
>> Hi,
>> This patch rewrites the classification part of builtin partition so that nested
>> builtin partitions are supported.  With this extension, the loop nest below:
>> void
>> foo (void)
>> {
>>   for (unsigned i = 0; i < M; ++i)
>>     for (unsigned j = 0; j < N; ++j)
>>       arr[i][j] = 0;
>> }
>>
>> will be distributed into a single memset, rather than a loop of memsets.
>> Bootstrapped and tested as part of the patch set on x86_64 and AArch64; is it OK?
>
> +  tree access_size = fold_convert (sizetype, TYPE_SIZE_UNIT (TREE_TYPE (ref)));
> +
>
> TYPE_SIZE_UNIT should always be sizetype.
Done.

>
> +  /* Classify the builtin kind.  */
> +  if (single_ld == NULL)
> +    classify_builtin_1 (loop, partition, single_st);
> +  else
> +    classify_builtin_2 (loop, rdg, partition, single_st, single_ld);
>
> maybe name those helpers classify_builtin_st and classify_builtin_ldst?
Done.  The updated patch is attached; I will apply it later.

Thanks,
bin
2017-10-12  Bin Cheng  

* tree-loop-distribution.c (struct builtin_info): New struct.
(struct partition): Refactor fields into struct builtin_info.
(partition_free): Free struct builtin_info.
(build_size_arg_loc, build_addr_arg_loc): Delete.
(generate_memset_builtin, generate_memcpy_builtin): Get memory range
information from struct builtin_info.
(find_single_drs): New function refactored from classify_partition.
Also moved builtin validity checks to this function.
(compute_access_range, alloc_builtin): New functions.
(classify_builtin_st, classify_builtin_ldst): New functions.
(classify_partition): Refactor code into functions find_single_drs,
classify_builtin_st and classify_builtin_ldst.
(distribute_loop): Don't do runtime alias check when distributing
loop nest.
(find_seed_stmts_for_distribution): New function.
(pass_loop_distribution::execute): Refactor code finding seed
stmts into above function.  Support distribution for the innermost
two-level loop nest.  Adjust dump information.

gcc/testsuite/ChangeLog
2017-10-12  Bin Cheng  

* gcc.dg/tree-ssa/ldist-28.c: New test.
* gcc.dg/tree-ssa/ldist-29.c: New test.
* gcc.dg/tree-ssa/ldist-30.c: New test.
* gcc.dg/tree-ssa/ldist-31.c: New test.

>
> Ok with those changes.
>
> Thanks,
> Richard.
>
From 8271ce0851a60b38226e92558bca234774e5503e Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 27 Sep 2017 13:00:59 +0100
Subject: [PATCH 6/7] loop_nest-builtin-pattern-20171012.txt

---
 gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c |  16 +
 gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c |  17 ++
 gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c |  16 +
 gcc/testsuite/gcc.dg/tree-ssa/ldist-31.c |  19 ++
 gcc/tree-loop-distribution.c | 507 +++
 5 files changed, 377 insertions(+), 198 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ldist-31.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c
new file mode 100644
index 000..4420139
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (1024)
+int arr[M][N];
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i < M; ++i)
+    for (unsigned j = 0; j < N; ++j)
+      arr[i][j] = 0;
+}
+
+/* { dg-final { scan-tree-dump "Loop nest . distributed: split to 0 loops and 1 library" "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c
new file mode 100644
index 000..9ce93e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+int arr[M][N];
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i < M; ++i)
+    for (unsigned j = 0; j < N - 1; ++j)
+      arr[i][j] = 0;
+}
+
+/* { dg-final { scan-tree-dump-not "Loop nest . distributed: split to" "ldist" } } */
+/* { dg-final { scan-tree-dump-times "Loop . distributed: split to 0 loops and 1 library" 1 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c
new file mode 100644
index 000..f31860a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns 

Re: [PATCH GCC][6/7]Support loop nest distribution for builtin partition

2017-10-12 Thread Richard Biener
On Thu, Oct 5, 2017 at 3:17 PM, Bin Cheng  wrote:
> Hi,
> This patch rewrites the classification part of builtin partition so that nested
> builtin partitions are supported.  With this extension, the loop nest below:
> void
> foo (void)
> {
>   for (unsigned i = 0; i < M; ++i)
>     for (unsigned j = 0; j < N; ++j)
>       arr[i][j] = 0;
> }
>
> will be distributed into a single memset, rather than a loop of memsets.
> Bootstrapped and tested as part of the patch set on x86_64 and AArch64; is it OK?

+  tree access_size = fold_convert (sizetype, TYPE_SIZE_UNIT (TREE_TYPE (ref)));
+

TYPE_SIZE_UNIT should always be sizetype.

+  /* Classify the builtin kind.  */
+  if (single_ld == NULL)
+    classify_builtin_1 (loop, partition, single_st);
+  else
+    classify_builtin_2 (loop, rdg, partition, single_st, single_ld);

maybe name those helpers classify_builtin_st and classify_builtin_ldst?

Ok with those changes.

Thanks,
Richard.

> Thanks,
> bin
> 2017-10-04  Bin Cheng  
>
> * tree-loop-distribution.c (struct builtin_info): New struct.
> (struct partition): Refactor fields into struct builtin_info.
> (partition_free): Free struct builtin_info.
> (build_size_arg_loc, build_addr_arg_loc): Delete.
> (generate_memset_builtin, generate_memcpy_builtin): Get memory range
> information from struct builtin_info.
> (find_single_drs): New function refactored from classify_partition.
> Also moved builtin validity checks to this function.
> (compute_access_range, alloc_builtin): New functions.
> (classify_builtin_1, classify_builtin_2): New functions.
> (classify_partition): Refactor code into functions find_single_drs,
> classify_builtin_1 and classify_builtin_2.
> (distribute_loop): Don't do runtime alias check when distributing
> loop nest.
> (find_seed_stmts_for_distribution): New function.
> (pass_loop_distribution::execute): Refactor code finding seed
> stmts into above function.  Support distribution for the innermost
> two-level loop nest.  Adjust dump information.
>
> gcc/testsuite/ChangeLog
> 2017-10-04  Bin Cheng  
>
> * gcc.dg/tree-ssa/ldist-28.c: New test.
> * gcc.dg/tree-ssa/ldist-29.c: New test.
> * gcc.dg/tree-ssa/ldist-30.c: New test.
> * gcc.dg/tree-ssa/ldist-31.c: New test.


[PATCH GCC][6/7]Support loop nest distribution for builtin partition

2017-10-05 Thread Bin Cheng
Hi,
This patch rewrites the classification part of builtin partition so that nested
builtin partitions are supported.  With this extension, the loop nest below:
void
foo (void)
{
  for (unsigned i = 0; i < M; ++i)
    for (unsigned j = 0; j < N; ++j)
      arr[i][j] = 0;
}

will be distributed into a single memset, rather than a loop of memsets.
Bootstrapped and tested as part of the patch set on x86_64 and AArch64; is it OK?
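
For readers following along, here is a minimal source-level sketch of what
the transformation amounts to.  It is not taken from the patch: the three
arrays and the names zero_nest, zero_rows, zero_once and check_equivalence
are made up for illustration, and M and N simply reuse the values from
ldist-28.c.  The point is only that, since every row is written in full, the
M per-row memsets cover one contiguous block and can be replaced by a single
call.

```c
#include <assert.h>
#include <string.h>

#define M 256
#define N 1024

/* Hypothetical arrays, one per variant, so the results can be compared.  */
static int arr_nest[M][N];
static int arr_rows[M][N];
static int arr_once[M][N];

/* The loop nest from the description above, written out as-is.  */
static void
zero_nest (void)
{
  for (unsigned i = 0; i < M; ++i)
    for (unsigned j = 0; j < N; ++j)
      arr_nest[i][j] = 0;
}

/* Only the inner loop replaced: a loop of memset, one call per row.  */
static void
zero_rows (void)
{
  for (unsigned i = 0; i < M; ++i)
    memset (arr_rows[i], 0, sizeof arr_rows[i]);
}

/* The whole nest replaced: one memset over M * N * sizeof (int) bytes.  */
static void
zero_once (void)
{
  memset (arr_once, 0, sizeof arr_once);
}

/* Poison all three arrays, run each variant, and return 1 if they leave
   identical contents behind, 0 otherwise.  */
static int
check_equivalence (void)
{
  memset (arr_nest, 0xff, sizeof arr_nest);
  memset (arr_rows, 0xff, sizeof arr_rows);
  memset (arr_once, 0xff, sizeof arr_once);

  zero_nest ();
  zero_rows ();
  zero_once ();

  return memcmp (arr_nest, arr_rows, sizeof arr_nest) == 0
	 && memcmp (arr_nest, arr_once, sizeof arr_nest) == 0;
}
```

Calling assert (check_equivalence ()) from a small main is enough to try it.
Note that the single-call form relies on each row being written completely;
with an inner bound of N - 1, as in ldist-29.c, the written bytes are no
longer contiguous across rows and only the per-row form applies.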

Thanks,
bin
2017-10-04  Bin Cheng  

* tree-loop-distribution.c (struct builtin_info): New struct.
(struct partition): Refactor fields into struct builtin_info.
(partition_free): Free struct builtin_info.
(build_size_arg_loc, build_addr_arg_loc): Delete.
(generate_memset_builtin, generate_memcpy_builtin): Get memory range
information from struct builtin_info.
(find_single_drs): New function refactored from classify_partition.
Also moved builtin validity checks to this function.
(compute_access_range, alloc_builtin): New functions.
(classify_builtin_1, classify_builtin_2): New functions.
(classify_partition): Refactor code into functions find_single_drs,
classify_builtin_1 and classify_builtin_2.
(distribute_loop): Don't do runtime alias check when distributing
loop nest.
(find_seed_stmts_for_distribution): New function.
(pass_loop_distribution::execute): Refactor code finding seed
stmts into above function.  Support distribution for the innermost
two-level loop nest.  Adjust dump information.

gcc/testsuite/ChangeLog
2017-10-04  Bin Cheng  

* gcc.dg/tree-ssa/ldist-28.c: New test.
* gcc.dg/tree-ssa/ldist-29.c: New test.
* gcc.dg/tree-ssa/ldist-30.c: New test.
* gcc.dg/tree-ssa/ldist-31.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c
new file mode 100644
index 000..4420139
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (1024)
+int arr[M][N];
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i < M; ++i)
+    for (unsigned j = 0; j < N; ++j)
+      arr[i][j] = 0;
+}
+
+/* { dg-final { scan-tree-dump "Loop nest . distributed: split to 0 loops and 1 library" "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c
new file mode 100644
index 000..9ce93e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+int arr[M][N];
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i < M; ++i)
+    for (unsigned j = 0; j < N - 1; ++j)
+      arr[i][j] = 0;
+}
+
+/* { dg-final { scan-tree-dump-not "Loop nest . distributed: split to" "ldist" } } */
+/* { dg-final { scan-tree-dump-times "Loop . distributed: split to 0 loops and 1 library" 1 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c
new file mode 100644
index 000..f31860a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+int a[M][N], b[M][N];
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i < M; ++i)
+    for (unsigned j = N; j > 0; --j)
+      a[i][j - 1] = b[i][j - 1];
+}
+
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to" 1 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-31.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-31.c
new file mode 100644
index 000..60a9f74
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-31.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+int a[M][N], b[M][N], c[M];
+
+void
+foo (void)
+{
+  for (int i = M - 1; i >= 0; --i)
+    {
+      c[i] = 0;
+      for (unsigned j = N; j > 0; --j)
+        a[i][j - 1] = b[i][j - 1];
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to 0 loops and 2 library" 1 "ldist" } } */
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 59a968c..237474f 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -581,72 +581,82 @@ build_rdg (struct loop *loop, control_dependences *cd)
 
 
 /* Kind of distributed loop.  */
 enum partition_kind {
 PKIND_NORMAL, PKIND_MEMSET, PKIND_MEMCPY, PKIND_MEMMOVE
 };
 
 /* Type of distributed loop.  */
 enum partition_type {