c: Add C2X BOOL_MAX and BOOL_WIDTH to limits.h

2020-07-02 Thread Joseph Myers
C2X adds BOOL_MAX and BOOL_WIDTH macros to .  As GCC only
supports values 0 and 1 for _Bool (regardless of the number of bits in
the representation, other bits are padding bits and if any of them are
nonzero, the representation is a trap representation), the values of
those macros can just be hardcoded directly in  rather than
needing corresponding predefined macros.

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  OK to
commit?

gcc/
2020-07-02  Joseph Myers  

* glimits.h [__STDC_VERSION__ > 201710L] (BOOL_MAX, BOOL_WIDTH):
New macros.

gcc/testsuite/
2020-07-02  Joseph Myers  

* gcc.dg/c11-bool-limits-1.c, gcc.dg/c2x-bool-limits-1.c: New
tests.

diff --git a/gcc/glimits.h b/gcc/glimits.h
index a37f496ef1b..50927510677 100644
--- a/gcc/glimits.h
+++ b/gcc/glimits.h
@@ -150,4 +150,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 # define ULLONG_WIDTH __LONG_LONG_WIDTH__
 #endif
 
+#if defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L
+/* C2X width and limit of _Bool.  */
+# undef BOOL_MAX
+# define BOOL_MAX 1
+# undef BOOL_WIDTH
+# define BOOL_WIDTH 1
+#endif
+
 #endif /* _LIMITS_H___ */
diff --git a/gcc/testsuite/gcc.dg/c11-bool-limits-1.c 
b/gcc/testsuite/gcc.dg/c11-bool-limits-1.c
new file mode 100644
index 000..9ca29be4d72
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-bool-limits-1.c
@@ -0,0 +1,13 @@
+/* Test limits for _Bool not in  in C11.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11" } */
+
+#include 
+
+#ifdef BOOL_MAX
+# error "unexpected BOOL_MAX"
+#endif
+
+#ifdef BOOL_WIDTH
+# error "unexpected BOOL_WIDTH"
+#endif
diff --git a/gcc/testsuite/gcc.dg/c2x-bool-limits-1.c 
b/gcc/testsuite/gcc.dg/c2x-bool-limits-1.c
new file mode 100644
index 000..d32b4ef59ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-bool-limits-1.c
@@ -0,0 +1,19 @@
+/* Test limits for _Bool in  in C2x.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#include 
+
+#ifndef BOOL_MAX
+# error "missing BOOL_MAX"
+#endif
+
+#ifndef BOOL_WIDTH
+# error "missing BOOL_WIDTH"
+#endif
+
+/* In principle _Bool can support values wider than 1 bit, stored via
+   type punning, but this is not supported by GCC.  */
+
+_Static_assert (BOOL_MAX == 1, "bad BOOL_MAX");
+_Static_assert (BOOL_WIDTH == 1, "bad BOOL_WIDTH");

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] OpenMP: Disable GPU threads when only teams are used

2020-07-02 Thread Andrew Stubbs

On 02/07/2020 18:00, Jakub Jelinek wrote:

On Thu, Jul 02, 2020 at 05:15:20PM +0100, Andrew Stubbs wrote:

This patch, originally by Kwok, auto-adjusts the default OpenMP target
arguments to set num_threads(1) when there are no parallel regions. There
may still be multiple teams in this case.

The result is that libgomp will not attempt to launch GPU threads that will
never get used.

OK to commit?


That doesn't look safe to me.
My understanding of the patch is that it looks for parallel construct
lexically in the target region, but that isn't sufficient, one can do that
only if the target region can't encounter a parallel construct in the target
region (i.e. the body and all functions that are called from it at runtime).


OpenMP is complicated. :-(

Is it normally expected that the runtime will always launch the maximum 
number of threads, just in case?


There's a cost to both launching and running excess threads that it 
would be nice to avoid, but the real point of the optimization is that 
launching fewer threads allows us to launch more teams.


AMD GPUs usually allow us to run 2040 or 2400 wavefronts simultaneously, 
so if we're running 15 unused threads for each team then we're limiting 
ourselves to 60 or 64 teams. If we limit each team to 1 thread then we 
can run the full 2040 or 2400 teams. Potentially, that's a 16x speed 
improvement on kernels that happen to not use parallel regions.


I would like to be able to do this, but it appears that the region data 
is insufficient for complex cases. Can you suggest a good way to solve this?



Perhaps one could ignore some builtin calls but it would need to be ones
where one can assume there will be no OpenMP code in them.

Also, it needs to avoid doing the optimization if there is or might
indirectly be called omp_get_thread_limit (), because if the optimization
forces thread_limit (1), that means that omp_get_thread_limit () in the
region will also return 1 rather than the expected value.


Would that not be the correct answer, if the number of threads actually 
has been limited to 1?


Thanks for the prompt review.

Andrew



[PATCH] PR fortran/95709 - [9/10/11 Regression] ICE in gfc_resolve_code, at fortran/resolve.c:11807

2020-07-02 Thread Harald Anlauf
The obsolescent (=legacy) assigned GOTO should only allow scalar integer
variables.  Check for proper conditions.

Regtested on x86_64-pc-linux-gnu.

OK for master / backports?

Thanks,
Harald


PR fortran/95709 - ICE in gfc_resolve_code, at fortran/resolve.c:11807

The legacy "assigned GOTO" accepts only scalar integer variables.
Check for proper arguments.

gcc/fortran/
PR fortran/95709
* resolve.c (gfc_resolve_code): Check for valid arguments to
asigned GOTO.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 5cc9f72e55c..34b1b80c31e 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -11814,10 +11814,13 @@ start:
 	case EXEC_GOTO:
 	  if (code->expr1 != NULL)
 	{
-	  if (code->expr1->ts.type != BT_INTEGER)
-		gfc_error ("ASSIGNED GOTO statement at %L requires an "
-			   "INTEGER variable", >expr1->where);
-	  else if (code->expr1->symtree->n.sym->attr.assign != 1)
+	  if (code->expr1->ts.type != BT_INTEGER
+		  || code->expr1->rank != 0
+		  || code->expr1->symtree == NULL)
+		gfc_error ("ASSIGNED GOTO statement at %L requires a "
+			   "scalar INTEGER variable", >expr1->where);
+	  else if (code->expr1->symtree->n.sym
+		   && code->expr1->symtree->n.sym->attr.assign != 1)
 		gfc_error ("Variable %qs has not been assigned a target "
 			   "label at %L", code->expr1->symtree->n.sym->name,
 			   >expr1->where);
diff --git a/gcc/testsuite/gfortran.dg/pr95709.f90 b/gcc/testsuite/gfortran.dg/pr95709.f90
new file mode 100644
index 000..6acd6fd14e6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95709.f90
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! { dg-options "-std=legacy" }
+! PR fortran/95709 - ICE in gfc_resolve_code, at fortran/resolve.c:11807
+
+program p
+  integer, parameter :: i(1) = 1
+  goto i(1)! { dg-error "requires a scalar INTEGER variable" }
+  goto i%kind, (1) ! { dg-error "requires a scalar INTEGER variable" }
+1 continue
+end


[committed 2/2] libstdc++: Require c++98_only effective target for a test

2020-07-02 Thread Jonathan Wakely via Gcc-patches

And one more like the previous patch.

Tested x86_64-linux, committed to trunk.


commit 5079855e7ebe8fd4a7f9005dd75fa35f8cd54daa
Author: Jonathan Wakely 
Date:   Thu Jul 2 21:27:12 2020 +0100

libstdc++: Require c++98_only effective target for a test

This test checks a conversion which only exists in C++98 and won't
compile since C++11. It uses { dg-options "-std=gnu++98" } so that it is
explicitly run in C++98 mode. This change also adds a target selector so
that the test will be skipped if the dg-options directive is filtered
out or overridden.

libstdc++-v3/ChangeLog:

* testsuite/27_io/basic_ios/conv/voidptr.cc: Add c++98_only
target selector.

diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/conv/voidptr.cc b/libstdc++-v3/testsuite/27_io/basic_ios/conv/voidptr.cc
index 680157f2d10..76a3ab7a6f4 100644
--- a/libstdc++-v3/testsuite/27_io/basic_ios/conv/voidptr.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_ios/conv/voidptr.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=gnu++98" }
+// { dg-do run { target c++98_only } }
 
 #include 
 #include 


[committed 1/2] libstdc++: Require c++98_only effective target for some tests

2020-07-02 Thread Jonathan Wakely via Gcc-patches
These tests verify that including C++11 headers fails to compile in
C++98 mode. They use { dg-options "-std=gnu++98" } so that they are
explicitly run in C++98 mode. This change also adds a target selector so
that the tests will be skipped even if the dg-options directive is
filtered out or overridden. This is in preparation for a desired future
change where tests do not use -std options, so that they can be tested
with e.g. --target_board=unix\"{-std=gnu++17,-std=gnu++20}\"

In some cases the dg-options and dg-do directives need to be reordered,
so that the -std=gnu++98 option is already added to the options before
the target selector is checked.

libstdc++-v3/ChangeLog:

* testsuite/18_support/headers/cstdalign/std_c++0x_neg.cc: Add
c++98_only target selector.
* testsuite/18_support/headers/cstdbool/std_c++0x_neg.cc:
Likewise.
* testsuite/18_support/headers/cstdint/std_c++0x_neg.cc:
Likewise.
* testsuite/18_support/headers/new/synopsis_cxx98.cc: Likewise.
* testsuite/19_diagnostics/headers/system_error/std_c++0x_neg.cc:
Likewise.
* testsuite/20_util/headers/type_traits/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/array/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/tuple/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/unordered_map/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/unordered_set/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/ccomplex/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/cfenv/std_c++0x_neg.cc:
Likewise.
* 
testsuite/26_numerics/headers/cmath/c99_classification_macros_c++98.cc:
Likewise.
* testsuite/26_numerics/headers/ctgmath/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/random/std_c++0x_neg.cc:
Likewise.
* testsuite/27_io/headers/cinttypes/std_c++0x_neg.cc: Likewise.
* testsuite/28_regex/headers/regex/std_c++0x_neg.cc: Likewise.
* testsuite/29_atomics/headers/atomic/std_c++0x_neg.cc:
Likewise.
* testsuite/30_threads/headers/condition_variable/std_c++0x_neg.cc:
Likewise.
* testsuite/30_threads/headers/future/std_c++0x_neg.cc:
Likewise.
* testsuite/30_threads/headers/mutex/std_c++0x_neg.cc: Likewise.
* testsuite/30_threads/headers/thread/std_c++0x_neg.cc:
Likewise.

Tested x86_64-linux, committed to trunk.


commit b857b179772951677cb12781f892454cd09d7831
Author: Jonathan Wakely 
Date:   Thu Jul 2 21:27:12 2020 +0100

libstdc++: Require c++98_only effective target for some tests

These tests verify that including C++11 headers fails to compile in
C++98 mode. They use { dg-options "-std=gnu++98" } so that they are
explicitly run in C++98 mode. This change also adds a target selector so
that the tests will be skipped even if the dg-options directive is
filtered out or overridden. This is in preparation for a desired future
change where tests do not use -std options, so that they can be tested
with e.g. --target_board=unix\"{-std=gnu++17,-std=gnu++20}\"

In some cases the dg-options and dg-do directives need to be reordered,
so that the -std=gnu++98 option is already added to the options before
the target selector is checked.

libstdc++-v3/ChangeLog:

* testsuite/18_support/headers/cstdalign/std_c++0x_neg.cc: Add
c++98_only target selector.
* testsuite/18_support/headers/cstdbool/std_c++0x_neg.cc:
Likewise.
* testsuite/18_support/headers/cstdint/std_c++0x_neg.cc:
Likewise.
* testsuite/18_support/headers/new/synopsis_cxx98.cc: Likewise.
* testsuite/19_diagnostics/headers/system_error/std_c++0x_neg.cc:
Likewise.
* testsuite/20_util/headers/type_traits/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/array/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/tuple/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/unordered_map/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/unordered_set/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/ccomplex/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/cfenv/std_c++0x_neg.cc:
Likewise.
* 
testsuite/26_numerics/headers/cmath/c99_classification_macros_c++98.cc:
Likewise.
* testsuite/26_numerics/headers/ctgmath/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/random/std_c++0x_neg.cc:
Likewise.
* 

[PATCH] Add -fld-path= to specify an arbitrary executable as the linker

2020-07-02 Thread Fāng-ruì Sòng via Gcc-patches

On 2020-07-01, Fāng-ruì Sòng wrote:

On 2020-07-01, Martin Liška wrote:

On 6/30/20 5:32 PM, Fāng-ruì Sòng wrote:

There is some concern about clang's -fuse-ld=path
http://lists.llvm.org/pipermail/cfe-dev/2020-June/065710.html and use
of COMPILER_PATH vs PATH.
Shall we introduce another option like -fld-path=path (PATH is used,
COMPILER_PATH is not used)?


I would recommend first landing a patch to LLVM and then we can do
a corresponding change to GCC.

Martin


Thank a lot for you welcoming words! This is what I intend to add for clang: 
https://reviews.llvm.org/D83015

I'll create a GCC patch superseding this one later.


Attached the new patch.
>From e7f86cdcaf03e4ddb98d0df9d07894d9ffb7d91a Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Thu, 2 Jul 2020 12:26:09 -0700
Subject: [PATCH] Add -fld-path= to specify an arbitrary executable as the
 linker

The value can be either a relative path (relative to a COMPILER_PATH
directory or a PATH directory) or an absolute path. -fld-path=
complements -fuse-ld={bfd,gold,lld} which specifies the linker flavor.

	PR driver/93645
	* common.opt (-fld-path=): Add -fld-path=
	* opts.c (common_handle_option): Handle OPT_fld_path_.
	* gcc.c (driver_handle_option): Likewise.
	* collect2.c (main): Likewise.
	* doc/invoke.texi: Document -fld-path=.
---
 gcc/collect2.c  | 57 -
 gcc/common.opt  |  4 
 gcc/doc/invoke.texi |  6 +
 gcc/gcc.c   |  2 +-
 gcc/opts.c  |  1 +
 5 files changed, 53 insertions(+), 17 deletions(-)

diff --git a/gcc/collect2.c b/gcc/collect2.c
index f8a5ce45994..efa652f7f82 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -844,6 +844,7 @@ main (int argc, char **argv)
   const char **ld1;
   bool use_plugin = false;
   bool use_collect_ld = false;
+  const char *ld_path = NULL;
 
   /* The kinds of symbols we will have to consider when scanning the
  outcome of a first pass link.  This is ALL to start with, then might
@@ -961,12 +962,21 @@ main (int argc, char **argv)
 	if (selected_linker == USE_DEFAULT_LD)
 	  selected_linker = USE_PLUGIN_LD;
 	  }
-	else if (strcmp (argv[i], "-fuse-ld=bfd") == 0)
-	  selected_linker = USE_BFD_LD;
-	else if (strcmp (argv[i], "-fuse-ld=gold") == 0)
-	  selected_linker = USE_GOLD_LD;
-	else if (strcmp (argv[i], "-fuse-ld=lld") == 0)
-	  selected_linker = USE_LLD_LD;
+	else if (strncmp (argv[i], "-fuse-ld=bfd", 9) == 0
+		 && selected_linker != USE_LD_MAX)
+	  {
+	if (strcmp (argv[i] + 9, "bfd") == 0)
+	  selected_linker = USE_BFD_LD;
+	else if (strcmp (argv[i] + 9, "gold") == 0)
+	  selected_linker = USE_GOLD_LD;
+	else if (strcmp (argv[i] + 9, "lld") == 0)
+	  selected_linker = USE_LLD_LD;
+	  }
+	else if (strncmp (argv[i], "-fld-path=", 10) == 0)
+	  {
+	ld_path = argv[i] + 10;
+	selected_linker = USE_LD_MAX;
+	  }
 	else if (strncmp (argv[i], "-o", 2) == 0)
 	  {
 	/* Parse the output filename if it's given so that we can make
@@ -1117,14 +1127,27 @@ main (int argc, char **argv)
   ld_file_name = find_a_file (, collect_ld_suffix, X_OK);
   use_collect_ld = ld_file_name != 0;
 }
-  /* Search the compiler directories for `ld'.  We have protection against
- recursive calls in find_a_file.  */
-  if (ld_file_name == 0)
-ld_file_name = find_a_file (, ld_suffixes[selected_linker], X_OK);
-  /* Search the ordinary system bin directories
- for `ld' (if native linking) or `TARGET-ld' (if cross).  */
-  if (ld_file_name == 0)
-ld_file_name = find_a_file (, full_ld_suffixes[selected_linker], X_OK);
+  if (selected_linker == USE_LD_MAX)
+{
+  /* If -fld-path= does not contain a slash, search for the command using
+	 the PATH environment variable.  */
+  if (lbasename (ld_path) == ld_path)
+	ld_file_name = find_a_file (, ld_path, X_OK);
+  else if (file_exists (ld_path))
+	ld_file_name = ld_path;
+}
+  else
+{
+  /* Search the compiler directories for `ld'.  We have protection against
+	 recursive calls in find_a_file.  */
+  if (ld_file_name == 0)
+	ld_file_name = find_a_file (, ld_suffixes[selected_linker], X_OK);
+  /* Search the ordinary system bin directories
+	 for `ld' (if native linking) or `TARGET-ld' (if cross).  */
+  if (ld_file_name == 0)
+	ld_file_name =
+	  find_a_file (, full_ld_suffixes[selected_linker], X_OK);
+}
 
 #ifdef REAL_NM_FILE_NAME
   nm_file_name = find_a_file (, REAL_NM_FILE_NAME, X_OK);
@@ -1297,9 +1320,11 @@ main (int argc, char **argv)
 #endif
 		}
 	  else if (!use_collect_ld
-		   && strncmp (arg, "-fuse-ld=", 9) == 0)
+		   && (strncmp (arg, "-fuse-ld=", 9) == 0 ||
+			   strncmp (arg, "-fld-path=", 10) == 0))
 		{
-		  /* Do not pass -fuse-ld={bfd|gold|lld} to the linker. */
+		  /* Do not pass -fuse-ld={bfd|gold|lld} or -fld-path= to the
+		 linker. */
 		  ld1--;
 		  ld2--;
 		}
diff --git a/gcc/common.opt b/gcc/common.opt
index df8af365d1b..d5dfbb5c9c6 100644
--- 

Re: [patch] Extend store merging to STRING_CST

2020-07-02 Thread Eric Botcazou
> So this variant combined with the rest of the patch is OK then.

Thanks.  It occurred to me that using string_constant might be slightly better 
(iti is already used by gimple_fold_builtin_memchr in the same file).


* gimple-fold.c (gimple_fold_builtin_memory_op): Fold calls that were
initially created for the assignment of a variable-sized destination
and whose source is now a string constant.

-- 
Eric Botcazoudiff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 4e3de95d2d2..72c5e43300a 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -700,7 +700,6 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
   gimple *stmt = gsi_stmt (*gsi);
   tree lhs = gimple_call_lhs (stmt);
   tree len = gimple_call_arg (stmt, 2);
-  tree destvar, srcvar;
   location_t loc = gimple_location (stmt);
 
   /* If the LEN parameter is a constant zero or in range where
@@ -741,7 +740,7 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 }
   else
 {
-  tree srctype, desttype;
+  tree srctype, desttype, destvar, srcvar, srcoff;
   unsigned int src_align, dest_align;
   tree off0;
   const char *tmp_str;
@@ -991,7 +990,9 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
   /* Choose between src and destination type for the access based
  on alignment, whether the access constitutes a register access
 	 and whether it may actually expose a declaration for SSA rewrite
-	 or SRA decomposition.  */
+	 or SRA decomposition.  Also try to expose a string constant, we
+	 might be able to concatenate several of them later into a single
+	 string store.  */
   destvar = NULL_TREE;
   srcvar = NULL_TREE;
   if (TREE_CODE (dest) == ADDR_EXPR
@@ -1008,7 +1009,16 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	   && (is_gimple_reg_type (srctype)
 		   || dest_align >= TYPE_ALIGN (srctype)))
 	srcvar = fold_build2 (MEM_REF, srctype, src, off0);
-  if (srcvar == NULL_TREE && destvar == NULL_TREE)
+  /* FIXME: Don't transform copies from strings with known original length.
+	 As soon as strlenopt tests that rely on it for passing are adjusted,
+	 this hack can be removed.  */
+  else if (gimple_call_alloca_for_var_p (stmt)
+	   && (srcvar = string_constant (src, , NULL, NULL))
+	   && integer_zerop (srcoff)
+	   && tree_int_cst_equal (TYPE_SIZE_UNIT (TREE_TYPE (srcvar)), len)
+	   && dest_align >= TYPE_ALIGN (TREE_TYPE (srcvar)))
+	srctype = TREE_TYPE (srcvar);
+  else
 	return false;
 
   /* Now that we chose an access type express the other side in
@@ -1071,19 +1081,29 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	  goto set_vop_and_replace;
 	}
 
-  /* We get an aggregate copy.  Use an unsigned char[] type to
-	 perform the copying to preserve padding and to avoid any issues
-	 with TREE_ADDRESSABLE types or float modes behavior on copying.  */
-  desttype = build_array_type_nelts (unsigned_char_type_node,
-	 tree_to_uhwi (len));
-  srctype = desttype;
-  if (src_align > TYPE_ALIGN (srctype))
-	srctype = build_aligned_type (srctype, src_align);
+  /* We get an aggregate copy.  If the source is a STRING_CST, then
+	 directly use its type to perform the copy.  */
+  if (TREE_CODE (srcvar) == STRING_CST)
+	  desttype = srctype;
+
+  /* Or else, use an unsigned char[] type to perform the copy in order
+	 to preserve padding and to avoid any issues with TREE_ADDRESSABLE
+	 types or float modes behavior on copying.  */
+  else
+	{
+	  desttype = build_array_type_nelts (unsigned_char_type_node,
+	 tree_to_uhwi (len));
+	  srctype = desttype;
+	  if (src_align > TYPE_ALIGN (srctype))
+	srctype = build_aligned_type (srctype, src_align);
+	  srcvar = fold_build2 (MEM_REF, srctype, src, off0);
+	}
+
   if (dest_align > TYPE_ALIGN (desttype))
 	desttype = build_aligned_type (desttype, dest_align);
-  new_stmt
-	= gimple_build_assign (fold_build2 (MEM_REF, desttype, dest, off0),
-			   fold_build2 (MEM_REF, srctype, src, off0));
+  destvar = fold_build2 (MEM_REF, desttype, dest, off0);
+  new_stmt = gimple_build_assign (destvar, srcvar);
+
 set_vop_and_replace:
   gimple_move_vops (new_stmt, stmt);
   if (!lhs)


Re: [PATCH] gcov: fix gcov-tool merge for TOPN counters

2020-07-02 Thread Gerald Pfeifer
On Tue, 16 Jun 2020, Martin Liška wrote:
> libgcc/ChangeLog:
> 
>   * libgcov-util.c (read_gcda_finalize): Remove const operator.
>   (merge_wrapper): Add both counts and use them properly.
>   (topn_to_memory_representation): New function.
>   (gcov_merge): Covert on disk representation to in memory
>   representation.

It may not have been this patch, but looking around the bootstrap
failure I reported earlier I noticed the following in stage 2:

  /scratch/tmp/gerald/GCC-HEAD/gcc/../libgcc/libgcov-util.c: In function 
  'int gcov_profile_merge(gcov_info*, gcov_info*, int, int)':
  /scratch/tmp/gerald/GCC-HEAD/gcc/../libgcc/libgcov-util.c:703:12: 
  warning: '*' may be used uninitialized [-Wmaybe-uninitialized]
  703 |   tgt_tail = tgt_infos[tgt_cnt - 1];
  |   ~^~~~

Do you also see this (or is this specific to my 32-bit tester)?

Gerald


[committed] libstdc++: Fix atomic tests (PR 91153, PR 93224)

2020-07-02 Thread Jonathan Wakely via Gcc-patches
These tests fail with AIX double double. Use different floating point
values that behave less surprisingly.

libstdc++-v3/ChangeLog:

PR libstdc++/91153
PR target/93224
* testsuite/29_atomics/atomic_float/1.cc: Use different values
for tests.
* testsuite/29_atomics/atomic_ref/float.cc: Likewise.

Tested powerpc64le-linux (both ibmlongdouble and ieeelongdouble).

Committed to master. Backport to gcc-10 to follow.

commit c6f431bba531bac3212b66069cf0f9718edf0132
Author: Jonathan Wakely 
Date:   Thu Jul 2 17:11:10 2020 +

libstdc++: Fix atomic tests (PR 91153, PR 93224)

These tests fail with AIX double double. Use different floating point
values that behave less surprisingly.

libstdc++-v3/ChangeLog:

PR libstdc++/91153
PR target/93224
* testsuite/29_atomics/atomic_float/1.cc: Use different values
for tests.
* testsuite/29_atomics/atomic_ref/float.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/1.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_float/1.cc
index cf80c84bbb9..a38b040ee8f 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_float/1.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/1.cc
@@ -453,27 +453,28 @@ test03()
 ok = a1.compare_exchange_strong(expected, 3.2l, mo);
 VERIFY( !ok && a1.load() == 2.56l && expected == 2.56l );
 
-f0 = a0.fetch_add(1.2l);
-VERIFY( f0 == 12.8l );
-VERIFY( a0 == 14.0l );
-f1 = a1.fetch_add(2.4l, mo);
-VERIFY( f1 == 2.56l );
-VERIFY( a1 == 4.96l );
+a1 = a0 = 0.5l;
+f0 = a0.fetch_add(0.25l);
+VERIFY( f0 == 0.5l );
+VERIFY( a0 == 0.75l );
+f1 = a1.fetch_add(0.25l, mo);
+VERIFY( f1 == 0.5l );
+VERIFY( a1 == 0.75l );
 
-f0 = a0.fetch_sub(1.2l);
-VERIFY( f0 == 14.0l );
-VERIFY( a0 == 12.8l );
-f1 = a1.fetch_sub(3.5l, mo);
-VERIFY( f1 == 4.96l );
-VERIFY( a1 == 1.46l );
+f0 = a0.fetch_sub(0.5l);
+VERIFY( f0 == 0.75l );
+VERIFY( a0 == 0.25l );
+f1 = a1.fetch_sub(0.5l, mo);
+VERIFY( f1 == 0.75l );
+VERIFY( a1 == 0.25l );
 
-f0 = a0 += 1.2l;
-VERIFY( f0 == 14.0l );
-VERIFY( a0 == 14.0l );
+f0 = a0 += 0.75l;
+VERIFY( f0 == 1.0l );
+VERIFY( a0 == 1.0l );
 
-f0 = a0 -= 0.8l;
-VERIFY( f0 == 13.2l );
-VERIFY( a0 == 13.2l );
+f0 = a0 -= 0.5l;
+VERIFY( f0 == 0.5l );
+VERIFY( a0 == 0.5l );
   }
 
   // Repeat for volatile std::atomic
@@ -540,27 +541,28 @@ test03()
 ok = a1.compare_exchange_strong(expected, 3.2l, mo);
 VERIFY( !ok && a1.load() == 2.56l && expected == 2.56l );
 
-f0 = a0.fetch_add(1.2l);
-VERIFY( f0 == 12.8l );
-VERIFY( a0 == 14.0l );
-f1 = a1.fetch_add(2.4l, mo);
-VERIFY( f1 == 2.56l );
-VERIFY( a1 == 4.96l );
+a1 = a0 = 0.5l;
+f0 = a0.fetch_add(0.25l);
+VERIFY( f0 == 0.5l );
+VERIFY( a0 == 0.75l );
+f1 = a1.fetch_add(0.25l, mo);
+VERIFY( f1 == 0.5l );
+VERIFY( a1 == 0.75l );
 
-f0 = a0.fetch_sub(1.2l);
-VERIFY( f0 == 14.0l );
-VERIFY( a0 == 12.8l );
-f1 = a1.fetch_sub(3.5l, mo);
-VERIFY( f1 == 4.96l );
-VERIFY( a1 == 1.46l );
+f0 = a0.fetch_sub(0.5l);
+VERIFY( f0 == 0.75l );
+VERIFY( a0 == 0.25l );
+f1 = a1.fetch_sub(0.5l, mo);
+VERIFY( f1 == 0.75l );
+VERIFY( a1 == 0.25l );
 
-f0 = a0 += 1.2l;
-VERIFY( f0 == 14.0l );
-VERIFY( a0 == 14.0l );
+f0 = a0 += 0.75l;
+VERIFY( f0 == 1.0l );
+VERIFY( a0 == 1.0l );
 
-f0 = a0 -= 0.8l;
-VERIFY( f0 == 13.2l );
-VERIFY( a0 == 13.2l );
+f0 = a0 -= 0.5l;
+VERIFY( f0 == 0.5l );
+VERIFY( a0 == 0.5l );
   }
 }
 
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_ref/float.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_ref/float.cc
index 0aee9ceedcf..4726a14f9ca 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_ref/float.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_ref/float.cc
@@ -267,31 +267,32 @@ test03()
 ok = a.compare_exchange_strong(expected, 6.4l, mo);
 VERIFY( !ok && a.load() == 204.8l && expected == 204.8l );
 
-v = a.fetch_add(3.2l);
-VERIFY( v == 204.8l );
-VERIFY( a == 208.0l );
-v = a.fetch_add(-8.5l, mo);
-VERIFY( v == 208.0l );
-VERIFY( a == 199.5l );
+a = 0.5l;
+v = a.fetch_add(0.5l);
+VERIFY( v == 0.5l );
+VERIFY( a == 1.0l );
+v = a.fetch_add(-0.2l, mo);
+VERIFY( v == 1.0l );
+VERIFY( a == 0.8l );
 
-v = a.fetch_sub(109.5l);
-VERIFY( v == 199.5l );
-VERIFY( a == 90.0l );
-v = a.fetch_sub(2, mo);
-VERIFY( v == 90.0l );
-VERIFY( a == 88.0l );
+v = a.fetch_sub(0.4l);
+VERIFY( v == 0.8l );
+VERIFY( a == 0.4l );
+v = a.fetch_sub(-0.4l, mo);
+VERIFY( v == 0.4l );
+VERIFY( a == 0.8l );
 
-v = a += 5.0l;
-VERIFY( v == 93.0l );
-VERIFY( a == 93.0l );
+v = a += .8l;
+VERIFY( v == 1.6l );
+VERIFY( a == 1.6l );
 
-   

Re: [PATCH] nvptx: Fix ICE in nvptx_vector_alignment on gcc.dg/attr-vector_size.c

2020-07-02 Thread Tom de Vries


On 7/2/20 3:10 PM, Roger Sayle wrote:
> Hi Tom,
> 
> Beware of doing something just because the default target hook does it that 
> way.

Well, what I'm saying is that if the default target hook doesn't assume
tree_fits_uhwi_p (size), the safest solution is to do the same in the
nvptx target hook.

Thanks,
- Tom

> See https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549128.html
> which fixes PR middle-end/90597 by correcting default_vector_alignment to do
> it the same way as proposed by this nvptx backend patch.
> 
> Many thanks for pointing out that the nvptx problem is PR target/90932.
> 
> Roger
> --
> 
> -Original Message-
> From: Tom de Vries  
> Sent: 02 July 2020 13:09
> To: Roger Sayle ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] nvptx: Fix ICE in nvptx_vector_alignment on 
> gcc.dg/attr-vector_size.c
> 
> On 6/29/20 7:19 PM, Roger Sayle wrote:
>> This patch addresses the ICE in gcc.dg/attr-vector_size.c during make 
>> -k check on nvptx-none.  The actual ICE looks like:
>>
>> testsuite/gcc.dg/attr-vector_size.c:29:1: internal compiler error: in 
>> tree_to_shwi, at tree.c:7321
>> 0xf53bf2 tree_to_shwi(tree_node const*)
>> ../../gcc/gcc/tree.c:7321
>> 0xff1969 nvptx_vector_alignment
>> ../../gcc/gcc/config/nvptx/nvptx.c:5105^M
>>
>> The problem is that the caller has ensured that TYPE_SIZE(type) is 
>> representable as an unsigned HOST_WIDE_INT, but nvptx_vector_alignment 
>> is accessing it as a signed HOST_WIDE_INT which overflows in 
>> pathological conditions.  Amongst those pathological conditions is 
>> that a TYPE_SIZE of zero can sometimes reach this function, prior to 
>> an error being emitted.  Making sure the result is not less than the 
>> mode's alignment and not greater than BIGGEST_ALIGNMENT fixes the ICEs, and 
>> generates the expected compile-time error messages.
>>
>> Tested on --target=nvptx-none, with a "make" and "make check" which 
>> results in four fewer unexpected failures and three more expected passes.
>> Ok for mainline?
>>
>>
>> 2020-06-29  Roger Sayle  
>>
>> gcc/ChangeLog:
>> * config/nvptx/nvptx.c (nvptx_vector_alignment): Use tree_to_uhwi
>> to access TYPE_SIZE (type).  Return at least the mode's alignment.
>>
>> Thanks,
>> Roger
>> --
>> Roger Sayle
>> NextMove Software
>> Cambridge, UK
>>
>> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 
>> e3e84df..bfad91b 100644
>> --- a/gcc/config/nvptx/nvptx.c
>> +++ b/gcc/config/nvptx/nvptx.c
>> @@ -5102,9 +5102,10 @@ static const struct attribute_spec 
>> nvptx_attribute_table[] =  static HOST_WIDE_INT  
>> nvptx_vector_alignment (const_tree type)  {
>> -  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
>> -
>> -  return MIN (align, BIGGEST_ALIGNMENT);
>> +  unsigned HOST_WIDE_INT align = tree_to_uhwi (TYPE_SIZE (type));
> 
> In the default target hook, we test tree_fits_uhwi_p, so I prefer we do the 
> same.
> 
>> +  if (align > BIGGEST_ALIGNMENT)
>> +return BIGGEST_ALIGNMENT;
> 
> I prefer using MIN to make code easier to read.
> 
>> +  return MAX (align, GET_MODE_ALIGNMENT (TYPE_MODE (type)));
>>  }
>>  
>>  /* Indicate that INSN cannot be duplicated.   */
> 
> Also, this is related to a PR, number included in commit log.
> 
> As of yet untested updated patch attached.
> 
> Thanks,
> - Tom
> 


Re: [PATCH] OpenMP: Disable GPU threads when only teams are used

2020-07-02 Thread Jakub Jelinek via Gcc-patches
On Thu, Jul 02, 2020 at 05:15:20PM +0100, Andrew Stubbs wrote:
> This patch, originally by Kwok, auto-adjusts the default OpenMP target
> arguments to set num_threads(1) when there are no parallel regions. There
> may still be multiple teams in this case.
> 
> The result is that libgomp will not attempt to launch GPU threads that will
> never get used.
> 
> OK to commit?

That doesn't look safe to me.
My understanding of the patch is that it looks for parallel construct
lexically in the target region, but that isn't sufficient, one can do that
only if the target region can't encounter a parallel construct in the target
region (i.e. the body and all functions that are called from it at runtime).

void
foo ()
{
  #pragma omp distribute parallel for simd
  for (int i = 0; i < 1000; i++)
do_something;
}

extern void baz (); // function that calls foo, unconditionally or conditionally
#pragma omp declare target to (foo, baz)

void
bar ()
{
  #pragma omp target teams
  baz ();
}

Perhaps one could ignore some builtin calls but it would need to be ones
where one can assume there will be no OpenMP code in them.

Also, it needs to avoid doing the optimization if there is or might
indirectly be called omp_get_thread_limit (), because if the optimization
forces thread_limit (1), that means that omp_get_thread_limit () in the
region will also return 1 rather than the expected value.

Jakub



[pushed] c++: Allow virtual consteval functions [PR88335]

2020-07-02 Thread Jason Merrill via Gcc-patches
Jakub's partial implementation of consteval virtual had trouble with the
current ABI requirement that we omit the vtable slot for a consteval virtual
function; it's difficult to use the normal code for constant evaluation and
also magically make the slots disappear if the vtables get written out.  I
notice that Clang trunk also doesn't implement that requirement, and it
seems unnecessary to me; I expect consteval virtual functions to be
extremely rare, so it should be fine to just give them a vtable slot as
normal but put zero in it if the vtable gets emitted.  I've commented as
much to the ABI committee.

One of Jakub's testcases points out that we weren't handling thunks in
our constexpr virtual handling; that is fixed here as well.

Incidentally, being able to use C++11 range-for definitely simplified
clear_consteval_vfns.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/c-family/ChangeLog:

* c-cppbuiltin.c (c_cpp_builtins): Define __cpp_consteval.

gcc/cp/ChangeLog:

Support C++20 consteval virtual functions.
* decl.c (grokfndecl): Allow consteval virtual.
* search.c (check_final_overrider): Check consteval mismatch.
* constexpr.c (cxx_eval_thunk_call): New.
(cxx_eval_call_expression): Call it.
* cvt.c (cp_get_fndecl_from_callee): Handle FDESC_EXPR.
* decl2.c (mark_vtable_entries): Track vtables with consteval.
(maybe_emit_vtables): Pass consteval_vtables through.
(clear_consteval_vfns): Replace consteval with nullptr.
(c_parse_final_cleanups): Call it.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/consteval-virtual1.C: New test.
* g++.dg/cpp2a/consteval-virtual2.C: New test.
* g++.dg/cpp2a/consteval-virtual3.C: New test.
* g++.dg/cpp2a/consteval-virtual4.C: New test.
* g++.dg/cpp2a/consteval-virtual5.C: New test.

Co-authored-by: Jakub Jelinek 
---
 gcc/c-family/c-cppbuiltin.c   |  2 +-
 gcc/cp/constexpr.c| 48 +++
 gcc/cp/cvt.c  | 11 ++--
 gcc/cp/decl.c |  9 ---
 gcc/cp/decl2.c| 39 ++--
 gcc/cp/search.c   | 36 +++
 .../g++.dg/cpp2a/consteval-virtual1.C | 12 
 .../g++.dg/cpp2a/consteval-virtual2.C | 22 +++
 .../g++.dg/cpp2a/consteval-virtual3.C | 53 
 .../g++.dg/cpp2a/consteval-virtual4.C | 48 +++
 .../g++.dg/cpp2a/consteval-virtual5.C | 61 +++
 11 files changed, 307 insertions(+), 34 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-virtual1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-virtual2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-virtual3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-virtual4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-virtual5.C

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index a7d65d63934..83f52fdf5d8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -995,7 +995,7 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_constexpr=201907L");
  cpp_define (pfile, "__cpp_constexpr_in_decltype=201711L");
  cpp_define (pfile, "__cpp_conditional_explicit=201806L");
- /* cpp_define (pfile, "__cpp_consteval=201811L"); */
+ cpp_define (pfile, "__cpp_consteval=201811L");
  cpp_define (pfile, "__cpp_constinit=201907L");
  cpp_define (pfile, "__cpp_deduction_guides=201907L");
  cpp_define (pfile, "__cpp_nontype_template_parameter_class=201806L");
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index f766abd3a11..1939166e907 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2129,6 +2129,52 @@ replace_result_decl (tree *tp, tree decl, tree 
replacement)
   return data.changed;
 }
 
+/* Evaluate the call T to virtual function thunk THUNK_FNDECL.  */
+
+static tree
+cxx_eval_thunk_call (const constexpr_ctx *ctx, tree t, tree thunk_fndecl,
+bool lval,
+bool *non_constant_p, bool *overflow_p)
+{
+  tree function = THUNK_TARGET (thunk_fndecl);
+
+  /* virtual_offset is only set in the presence of virtual bases, which make
+ the class non-literal, so we don't need to handle it here.  */
+  if (THUNK_VIRTUAL_OFFSET (thunk_fndecl))
+{
+  gcc_assert (!DECL_DECLARED_CONSTEXPR_P (function));
+  if (!ctx->quiet)
+   {
+ error ("call to non-% function %qD", function);
+ explain_invalid_constexpr_fn (function);
+   }
+  *non_constant_p = true;
+  return t;
+}
+
+  tree new_call = copy_node (t);
+  CALL_EXPR_FN (new_call) = function;
+  TREE_TYPE (new_call) = TREE_TYPE (TREE_TYPE (function));
+
+  tree offset = size_int (THUNK_FIXED_OFFSET (thunk_fndecl));
+
+  if (DECL_THIS_THUNK_P 

[PATCH] OpenMP: Disable GPU threads when only teams are used

2020-07-02 Thread Andrew Stubbs
This patch, originally by Kwok, auto-adjusts the default OpenMP target 
arguments to set num_threads(1) when there are no parallel regions. 
There may still be multiple teams in this case.


The result is that libgomp will not attempt to launch GPU threads that 
will never get used.


OK to commit?

Andrew
OpenMP: Disable GPU threads when only teams are used

	gcc/
	* omp-expand.c (contains_threads): New.
	(get_target_arguments): Add region argument.  Set number of threads
	to one if region does not contain threads.
	(expand_omp_target): Add extra argument in call to
	get_target_arguments.

Co-Authored-By: Andrew Stubbs  

diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index 0f07e51f7e8..6afe18d5ee0 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -8461,10 +8461,22 @@ push_target_argument_according_to_value (gimple_stmt_iterator *gsi, int device,
 }
 }
 
+static bool
+contains_threads (struct omp_region *region)
+{
+  if (!region)
+return false;
+
+  return region->type == GIMPLE_OMP_PARALLEL
+	 || contains_threads (region->inner)
+	 || contains_threads (region->next);
+}
+
 /* Create an array of arguments that is then passed to GOMP_target.  */
 
 static tree
-get_target_arguments (gimple_stmt_iterator *gsi, gomp_target *tgt_stmt)
+get_target_arguments (gimple_stmt_iterator *gsi, gomp_target *tgt_stmt,
+		  struct omp_region *region)
 {
   auto_vec  args;
   tree clauses = gimple_omp_target_clauses (tgt_stmt);
@@ -8481,6 +8493,11 @@ get_target_arguments (gimple_stmt_iterator *gsi, gomp_target *tgt_stmt)
 t = OMP_CLAUSE_THREAD_LIMIT_EXPR (c);
   else
 t = integer_minus_one_node;
+
+  if (tree_int_cst_equal (t, integer_zero_node)
+  && !contains_threads (region->inner))
+t = integer_one_node;
+
   push_target_argument_according_to_value (gsi, GOMP_TARGET_ARG_DEVICE_ALL,
 	   GOMP_TARGET_ARG_THREAD_LIMIT, t,
 	   );
@@ -8994,7 +9011,7 @@ expand_omp_target (struct omp_region *region)
 	depend = build_int_cst (ptr_type_node, 0);
   args.quick_push (depend);
   if (start_ix == BUILT_IN_GOMP_TARGET)
-	args.quick_push (get_target_arguments (, entry_stmt));
+	args.quick_push (get_target_arguments (, entry_stmt, region));
   break;
 case BUILT_IN_GOACC_PARALLEL:
   if (lookup_attribute ("oacc serial", DECL_ATTRIBUTES (child_fn)) != NULL)


Re: Fortran : Fortran translation issues PR52279

2020-07-02 Thread David Edelsohn via Gcc-patches
Mark,

A quick test with

const char hint [] = _(" [see %<-fno-allow-invalid-boz%>]");

reproduces the failure.

const char *hint = _(" [see %<-fno-allow-invalid-boz%>]");

seems to work.  I will do a full bootstrap test with that change later today.

Do you want me to commit it if it works or do you want me to report it to you?

Thanks, David

On Thu, Jul 2, 2020 at 9:38 AM David Edelsohn  wrote:
>
> On Thu, Jul 2, 2020 at 8:56 AM Mark Eggleston
>  wrote:
> >
> > On 02/07/2020 13:25, David Edelsohn wrote:
> > > On Thu, Jul 2, 2020 at 6:16 AM Mark Eggleston
> > >  wrote:
> > >> I've committed the change from array to pointer. Does this fix your 
> > >> builds?
> > >>
> > >> On 02/07/2020 08:18, Mark Eggleston wrote:
> > >>> On 01/07/2020 20:07, David Edelsohn wrote:
> >  This patch breaks bootstrap.
> > >>> Apologies. I didn't see this when I built the compiler the with
> > >>> bootstrap on x86_64. I'll endevour to get it fixed as soon as possible.
> > > Mark,
> > >
> > > Your change did nothing because I previously had removed the
> > > translation markers to restore bootstrap.  You were not sufficiently
> > > observant to check the ChangeLog or notice that the line already was
> > > changed.
> > >
> > > Please stop making random, untested changes.  This is inappropriate
> > > for GCC development.
> > OK. I had no way of confirming as the bootstrap worked for me. I naively
> > thought that it was covered as an "obvious" fix.
> >
> > I was concerned that I caused a breakage and wanted to fix it.
> >
> > Regarding the ChangeLog I was unfamiliar with this situation and will
> > check them if this ever happens again (I hope not).  I'll also keep
> > portability issues in mind.
> >
> > I see the issue of the incorrect hint text has been addressed. As I
> > understand it the string as is will not make it to the pot file so will
> > not be translated, is that acceptable? I would like to be able to close
> > the PR.
>
> Mark,
>
> I reverted your change for that one line to restore bootstrap for
> everyone because Trunk should not be broken for 6 hours and it was
> trivial to revert that one, localized change without affecting the
> rest of your patch.
>
> The translation message for hint should be in the pot file
> translation.  This needs to be fixed correctly.
>
> I would suggest that you restore your original change and work to
> reproduce the failure.  I don't know if you were not building with NLS
> or had some other local configuration.  It seems that
>
> hint[] = _("...")
>
> cannot be valid C syntax unless _( ... ) is a no-op in your build and
> the compiler sees a constant string.  The failure occurred for many
> other developers.
>
> Thanks, David


Re: [PATCH][RFC] Do not stream all zeros for gcda files.

2020-07-02 Thread Gerald Pfeifer
On Thu, 2 Jul 2020, Martin Liška wrote:
> All right, you convinced me and I'm going to install the patch.

I'm fraid this may have broke i386-unknown-freebsd-11.4 (with clang 10.0 
as bootstrap compiler, though that doesn't appear to be the trigger here):

/scratch/tmp/gerald/OBJ-0702-1130/./prev-gcc/xg++ ...
  -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti 
  -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
  -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute 
  -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
  -Wno-overlength-strings -Werror -fno-common ...
  -o gcov-dump.o -MT gcov-dump.o -MMD -MP -MF ./.deps/gcov-dump.TPo 
  /scratch/tmp/gerald/GCC-HEAD/gcc/gcov-dump.c
/scratch/tmp/gerald/GCC-HEAD/gcc/gcov-dump.c: In function 'void 
tag_function(const char*, unsigned int, int, unsigned int)':
/scratch/tmp/gerald/GCC-HEAD/gcc/gcov-dump.c:312:34: error: comparison of 
integer expressions of different signedness: 'long unsigned int' and 'int' 
[-Werror=sign-compare]
  312 |   if (gcov_position () - pos < length)
  |   ~~~^~~~

Gerald


Re: [PATCH] avoid -Wnonnull for lambda stubs (PR c++/95984)

2020-07-02 Thread Martin Sebor via Gcc-patches

On 7/1/20 3:25 PM, Jason Merrill wrote:

On 7/1/20 3:31 PM, Martin Sebor wrote:

The attached patch avoids null pointer checking for the first
argument to calls to the member operator() of lambda objects
emitted by the C++ front end.  This avoids both the spurious
-Wnonnull warnings for such calls as well as the ICE reported
in the bug.

In addition, the patch also avoids function argument checking
for any calls when the tf_warning bit isn't set.  This isn't
strictly necessary to avoid the ICE but it seems like a good
precaution against something similar occurring in other checks
in the future(*).

Finally, while testing the fix I noticed that the common code
doesn't recognize nullptr as a poiner when processing templates.
I've extended the handling to let it handle it as well.


Any possible value of nullptr_t is null, so I think ignoring it is 
appropriate.


In ordinary (including variadic) functions, GCC warns for all
null pointers, including nullptr.  In templates (and in auto
arguments), GCC warns for nullptr only when its converted to
a pointer type (e.g., (void*)nullptr) but not for naked nullptr.
As a user, I expect a warning for all null pointers in all these
contexts.  The distinction between the two kinds is too subtle
to appreciate by most of us (or to be useful in this context).
The test case below (I put it together to verify my patch was
working) shows the difference with lambdas:

  template  void f (L f) {
f ((void*)nullptr);   // -Wnonnull
  }

  template  void g (L f) {
f (nullptr);  // missing -Wnonnull
  }

  void h ()
  {
f ([](auto...) __attribute__ ((nonnull)) { });
g ([](auto...) __attribute__ ((nonnull)) { });
  }




[*] It seems to me that a more robust solution to prevent
the diagnostic subsystem from being re-entered as a result
of callbacks into the front end would be to have the pretty
printer disable all warnings prior to the bcallbacks and
re-enable them afterwards.  That would require an efficient
and reliable way of controlling all warnings (as well as
querying their state), which I think would be a useful
feature to have in any case.  For one thing, it would make
handling warnings and responding to #pragma GCC diagnostics
more robust.



+  const char *arg0str = IDENTIFIER_POINTER (arg0name);
+  closure = !strcmp (arg0str, "__closure");


Let's use id_equal here.


Done in the attached revision.

Martin

Exclude calls to variadic lambda stubs from -Wnonnull checking (PR c++/95984).

Resolves:
PR c++/95984 - Internal compiler error: Error reporting routines re-entered in -Wnonnull on a variadic lamnda
PR c++/96021 - missing -Wnonnull passing nullptr to a nonnull variadic lambda

gcc/c-family/ChangeLog:

	PR c++/95984
	* c-common.c (check_function_nonnull):
	(check_nonnull_arg):

gcc/cp/ChangeLog:

	PR c++/95984
	* call.c (build_over_call):

gcc/testsuite/ChangeLog:

	PR c++/95984
	* g++.dg/warn/Wnonnull6.C: New test.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index aae1ddb6b89..51ecde69f2d 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5308,12 +5308,26 @@ check_function_nonnull (nonnull_arg_ctx , int nargs, tree *argarray)
   int firstarg = 0;
   if (TREE_CODE (ctx.fntype) == METHOD_TYPE)
 {
+  bool closure = false;
+  if (ctx.fndecl)
+	{
+	  /* For certain lambda expressions the C++ front end emits calls
+	 that pass a null this pointer as an argument named __closure
+	 to the member operator() of empty function.  Detect those
+	 and avoid checking them, but proceed to check the remaining
+	 arguments.  */
+	  tree arg0 = DECL_ARGUMENTS (ctx.fndecl);
+	  if (tree arg0name = DECL_NAME (arg0))
+	closure = id_equal (arg0name, "__closure");
+	}
+
   /* In calls to C++ non-static member functions check the this
 	 pointer regardless of whether the function is declared with
 	 attribute nonnull.  */
   firstarg = 1;
-  check_function_arguments_recurse (check_nonnull_arg, , argarray[0],
-	firstarg);
+  if (!closure)
+	check_function_arguments_recurse (check_nonnull_arg, , argarray[0],
+	  firstarg);
 }
 
   tree attrs = lookup_attribute ("nonnull", TYPE_ATTRIBUTES (ctx.fntype));
@@ -5503,7 +5517,9 @@ check_nonnull_arg (void *ctx, tree param, unsigned HOST_WIDE_INT param_num)
  happen if the "nonnull" attribute was given without an operand
  list (which means to check every pointer argument).  */
 
-  if (TREE_CODE (TREE_TYPE (param)) != POINTER_TYPE)
+  tree paramtype = TREE_TYPE (param);
+  if (TREE_CODE (paramtype) != POINTER_TYPE
+  && TREE_CODE (paramtype) != NULLPTR_TYPE)
 return;
 
   /* Diagnose the simple cases of null arguments.  */
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index d8923be1d68..5341a572980 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8842,15 +8842,16 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
   gcc_assert (j <= nargs);
   nargs = j;
 
-  /* Avoid 

Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

2020-07-02 Thread Richard Biener via Gcc-patches
On Thu, Jul 2, 2020 at 3:22 PM xiezhiheng  wrote:
>
> Hi,
>
> This is a fix for pr94442.
> I modify get_inner_reference to handle the case for MEM[ptr, off].
> I extract the "off" and add it to the recorded offset, then I build a
> MEM[ptr, 0] and return it later.
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 3c68b0d754c..8cc18449a0c 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -7362,7 +7362,8 @@ tree
>  get_inner_reference (tree exp, poly_int64_pod *pbitsize,
>  poly_int64_pod *pbitpos, tree *poffset,
>  machine_mode *pmode, int *punsignedp,
> -int *preversep, int *pvolatilep)
> +int *preversep, int *pvolatilep,
> +bool include_memref_p)
>  {
>tree size_tree = 0;
>machine_mode mode = VOIDmode;
> @@ -7509,6 +7510,21 @@ get_inner_reference (tree exp, poly_int64_pod 
> *pbitsize,
> }
>   exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
> }
> + else if (include_memref_p
> +  && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME)
> +   {
> + tree off = TREE_OPERAND (exp, 1);
> + if (!integer_zerop (off))
> +   {
> + poly_offset_int boff = mem_ref_offset (exp);
> + boff <<= LOG2_BITS_PER_UNIT;
> + bit_offset += boff;
> +
> + exp = build2 (MEM_REF, TREE_TYPE (exp),
> +   TREE_OPERAND (exp, 0),
> +   build_int_cst (TREE_TYPE (off), 0));
> +   }
> +   }
>   goto done;
>
> default:
> @@ -10786,7 +10802,7 @@ expand_expr_real_1 (tree exp, rtx target, 
> machine_mode tmode,
> int reversep, volatilep = 0, must_force_mem;
> tree tem
>   = get_inner_reference (exp, , , , ,
> -, , );
> +, , , true);
> rtx orig_op0, memloc;
> bool clear_mem_expr = false;
>
> diff --git a/gcc/tree.h b/gcc/tree.h
> index a74872f5f3e..7df0d15f7f9 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -6139,7 +6139,8 @@ extern bool complete_ctor_at_level_p (const_tree, 
> HOST_WIDE_INT, const_tree);
> look for the ultimate containing object, which is returned and specify
> the access position and size.  */
>  extern tree get_inner_reference (tree, poly_int64_pod *, poly_int64_pod *,
> -tree *, machine_mode *, int *, int *, int *);
> +tree *, machine_mode *, int *, int *, int *,
> +bool = false);
>
>  extern tree build_personality_function (const char *);
>
>
> I add an argument "include_memref_p" to control whether to go into MEM_REF,
> because without it will cause the test case "Warray-bounds-46.c" to fail in 
> regression.
>
> It because function set_base_and_offset in gimple-ssa-warn-restrict.c
>   base = get_inner_reference (expr, , , _off,
>   , , , );
>   ...
>   ...
>   if (TREE_CODE (base) == MEM_REF)
> {
>   tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND (base, 
> 1));
>   extend_offset_range (memrefoff);
>   base = TREE_OPERAND (base, 0);
>
>   if (refoff != HOST_WIDE_INT_MIN
>   && TREE_CODE (expr) == COMPONENT_REF)
> {
>   /* Bump up the offset of the referenced subobject to reflect
>  the offset to the enclosing object.  For example, so that
>  in
>struct S { char a, b[3]; } s[2];
>strcpy (s[1].b, "1234");
>  REFOFF is set to s[1].b - (char*)s.  */
>   offset_int off = tree_to_shwi (memrefoff);
>   refoff += off;
> }
>
>   if (!integer_zerop (memrefoff))   <=
> /* A non-zero offset into an array of struct with flexible array
>members implies that the array is empty because there is no
>way to initialize such a member when it belongs to an array.
>This must be some sort of a bug.  */
> refsize = 0;
> }
>
> needs MEM_REF offset to judge whether refsize should be set to zero.
> But I fold the offset into bitpos and the offset will always be zero.
>
> Suggestion?

The thing you want to fix is not get_inner_reference but the aarch64 backend
to not make __builtin_aarch64_sqaddv16qi clobber global memory.  That way
CSE can happen on GIMPLE which can handle the difference in the IL just
fine.

Richard.


Re: Fortran : Fortran translation issues PR52279

2020-07-02 Thread David Edelsohn via Gcc-patches
On Thu, Jul 2, 2020 at 8:56 AM Mark Eggleston
 wrote:
>
> On 02/07/2020 13:25, David Edelsohn wrote:
> > On Thu, Jul 2, 2020 at 6:16 AM Mark Eggleston
> >  wrote:
> >> I've committed the change from array to pointer. Does this fix your builds?
> >>
> >> On 02/07/2020 08:18, Mark Eggleston wrote:
> >>> On 01/07/2020 20:07, David Edelsohn wrote:
>  This patch breaks bootstrap.
> >>> Apologies. I didn't see this when I built the compiler the with
> >>> bootstrap on x86_64. I'll endevour to get it fixed as soon as possible.
> > Mark,
> >
> > Your change did nothing because I previously had removed the
> > translation markers to restore bootstrap.  You were not sufficiently
> > observant to check the ChangeLog or notice that the line already was
> > changed.
> >
> > Please stop making random, untested changes.  This is inappropriate
> > for GCC development.
> OK. I had no way of confirming as the bootstrap worked for me. I naively
> thought that it was covered as an "obvious" fix.
>
> I was concerned that I caused a breakage and wanted to fix it.
>
> Regarding the ChangeLog I was unfamiliar with this situation and will
> check them if this ever happens again (I hope not).  I'll also keep
> portability issues in mind.
>
> I see the issue of the incorrect hint text has been addressed. As I
> understand it the string as is will not make it to the pot file so will
> not be translated, is that acceptable? I would like to be able to close
> the PR.

Mark,

I reverted your change for that one line to restore bootstrap for
everyone because Trunk should not be broken for 6 hours and it was
trivial to revert that one, localized change without affecting the
rest of your patch.

The translation message for hint should be in the pot file
translation.  This needs to be fixed correctly.

I would suggest that you restore your original change and work to
reproduce the failure.  I don't know if you were not building with NLS
or had some other local configuration.  It seems that

hint[] = _("...")

cannot be valid C syntax unless _( ... ) is a no-op in your build and
the compiler sees a constant string.  The failure occurred for many
other developers.

Thanks, David


Re: [PATCH] nvptx: : Add support for popcount and widening multiply instructions

2020-07-02 Thread Tom de Vries
On 7/1/20 3:06 PM, Roger Sayle wrote:
> 
> The following patch adds support for the popc and mul.wide instructions to 
> the nvptx backend.
> I've a follow-up patch for supporting mul.hi instructions, but those changes 
> require some minor
> tweaks to GCC's middle-end, so I'll submit those pieces separately.
> 
> Tested by "make" and "make -k check" on --build=nvptx-none hosted on 
> x86_64-pc-linux-gnu with
> no new regressions.
> 
> 2020-07-01  Roger Sayle  
> 
> gcc/ChangeLog:
> * config/nvptx/nvptx.md (popcount2): New instructions.
> (mulhishi3, mulsidi3, umulhisi3, umulsidi3): New instructions.
> 
> gcc/testsuite/ChangeLog:
> * gcc.target/nvptx/popc-1.c: New test.
> * gcc.target/nvptx/popc-2.c: New test.
> * gcc.target/nvptx/popc-3.c: New test.
> * gcc.target/nvptx/mul-wide.c: New test.
> * gcc.target/nvptx/umul-wide.c: New test.
> 
> 
> Ok for mainline?
> 

Hi Roger,

LGTM, please apply.

[ Btw, can you next time add the new files to the patch.  That's
somewhat more convenient to apply. ]

Thanks
- Tom



[PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

2020-07-02 Thread xiezhiheng
Hi,

This is a fix for pr94442.
I modify get_inner_reference to handle the case for MEM[ptr, off].
I extract the "off" and add it to the recorded offset, then I build a
MEM[ptr, 0] and return it later.

diff --git a/gcc/expr.c b/gcc/expr.c
index 3c68b0d754c..8cc18449a0c 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7362,7 +7362,8 @@ tree
 get_inner_reference (tree exp, poly_int64_pod *pbitsize,
 poly_int64_pod *pbitpos, tree *poffset,
 machine_mode *pmode, int *punsignedp,
-int *preversep, int *pvolatilep)
+int *preversep, int *pvolatilep,
+bool include_memref_p)
 {
   tree size_tree = 0;
   machine_mode mode = VOIDmode;
@@ -7509,6 +7510,21 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
}
  exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
}
+ else if (include_memref_p
+  && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME)
+   {
+ tree off = TREE_OPERAND (exp, 1);
+ if (!integer_zerop (off))
+   {
+ poly_offset_int boff = mem_ref_offset (exp);
+ boff <<= LOG2_BITS_PER_UNIT;
+ bit_offset += boff;
+
+ exp = build2 (MEM_REF, TREE_TYPE (exp),
+   TREE_OPERAND (exp, 0),
+   build_int_cst (TREE_TYPE (off), 0));
+   }
+   }
  goto done;
 
default:
@@ -10786,7 +10802,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode 
tmode,
int reversep, volatilep = 0, must_force_mem;
tree tem
  = get_inner_reference (exp, , , , ,
-, , );
+, , , true);
rtx orig_op0, memloc;
bool clear_mem_expr = false;
 
diff --git a/gcc/tree.h b/gcc/tree.h
index a74872f5f3e..7df0d15f7f9 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -6139,7 +6139,8 @@ extern bool complete_ctor_at_level_p (const_tree, 
HOST_WIDE_INT, const_tree);
look for the ultimate containing object, which is returned and specify
the access position and size.  */
 extern tree get_inner_reference (tree, poly_int64_pod *, poly_int64_pod *,
-tree *, machine_mode *, int *, int *, int *);
+tree *, machine_mode *, int *, int *, int *,
+bool = false);
 
 extern tree build_personality_function (const char *);


I add an argument "include_memref_p" to control whether to go into MEM_REF,
because without it will cause the test case "Warray-bounds-46.c" to fail in 
regression.

It because function set_base_and_offset in gimple-ssa-warn-restrict.c
  base = get_inner_reference (expr, , , _off,
  , , , );
  ...
  ...
  if (TREE_CODE (base) == MEM_REF)
{
  tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND (base, 1));
  extend_offset_range (memrefoff);
  base = TREE_OPERAND (base, 0);

  if (refoff != HOST_WIDE_INT_MIN
  && TREE_CODE (expr) == COMPONENT_REF)
{
  /* Bump up the offset of the referenced subobject to reflect
 the offset to the enclosing object.  For example, so that
 in
   struct S { char a, b[3]; } s[2];
   strcpy (s[1].b, "1234");
 REFOFF is set to s[1].b - (char*)s.  */
  offset_int off = tree_to_shwi (memrefoff);
  refoff += off;
}

  if (!integer_zerop (memrefoff))   <=
/* A non-zero offset into an array of struct with flexible array
   members implies that the array is empty because there is no
   way to initialize such a member when it belongs to an array.
   This must be some sort of a bug.  */
refsize = 0;
}

needs MEM_REF offset to judge whether refsize should be set to zero.
But I fold the offset into bitpos and the offset will always be zero.

Suggestion?


RE: [PATCH] nvptx: Fix ICE in nvptx_vector_alignment on gcc.dg/attr-vector_size.c

2020-07-02 Thread Roger Sayle
Hi Tom,

Beware of doing something just because the default target hook does it that way.
See https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549128.html
which fixes PR middle-end/90597 by correcting default_vector_alignment to do
it the same way as proposed by this nvptx backend patch.

Many thanks for pointing out that the nvptx problem is PR target/90932.

Roger
--

-Original Message-
From: Tom de Vries  
Sent: 02 July 2020 13:09
To: Roger Sayle ; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] nvptx: Fix ICE in nvptx_vector_alignment on 
gcc.dg/attr-vector_size.c

On 6/29/20 7:19 PM, Roger Sayle wrote:
> This patch addresses the ICE in gcc.dg/attr-vector_size.c during make 
> -k check on nvptx-none.  The actual ICE looks like:
> 
> testsuite/gcc.dg/attr-vector_size.c:29:1: internal compiler error: in 
> tree_to_shwi, at tree.c:7321
> 0xf53bf2 tree_to_shwi(tree_node const*)
> ../../gcc/gcc/tree.c:7321
> 0xff1969 nvptx_vector_alignment
> ../../gcc/gcc/config/nvptx/nvptx.c:5105^M
> 
> The problem is that the caller has ensured that TYPE_SIZE(type) is 
> representable as an unsigned HOST_WIDE_INT, but nvptx_vector_alignment 
> is accessing it as a signed HOST_WIDE_INT which overflows in 
> pathological conditions.  Amongst those pathological conditions is 
> that a TYPE_SIZE of zero can sometimes reach this function, prior to 
> an error being emitted.  Making sure the result is not less than the 
> mode's alignment and not greater than BIGGEST_ALIGNMENT fixes the ICEs, and 
> generates the expected compile-time error messages.
> 
> Tested on --target=nvptx-none, with a "make" and "make check" which 
> results in four fewer unexpected failures and three more expected passes.
> Ok for mainline?
> 
> 
> 2020-06-29  Roger Sayle  
> 
> gcc/ChangeLog:
> * config/nvptx/nvptx.c (nvptx_vector_alignment): Use tree_to_uhwi
> to access TYPE_SIZE (type).  Return at least the mode's alignment.
> 
> Thanks,
> Roger
> --
> Roger Sayle
> NextMove Software
> Cambridge, UK
> 
> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 
> e3e84df..bfad91b 100644
> --- a/gcc/config/nvptx/nvptx.c
> +++ b/gcc/config/nvptx/nvptx.c
> @@ -5102,9 +5102,10 @@ static const struct attribute_spec 
> nvptx_attribute_table[] =  static HOST_WIDE_INT  
> nvptx_vector_alignment (const_tree type)  {
> -  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
> -
> -  return MIN (align, BIGGEST_ALIGNMENT);
> +  unsigned HOST_WIDE_INT align = tree_to_uhwi (TYPE_SIZE (type));

In the default target hook, we test tree_fits_uhwi_p, so I prefer we do the 
same.

> +  if (align > BIGGEST_ALIGNMENT)
> +return BIGGEST_ALIGNMENT;

I prefer using MIN to make code easier to read.

> +  return MAX (align, GET_MODE_ALIGNMENT (TYPE_MODE (type)));
>  }
>  
>  /* Indicate that INSN cannot be duplicated.   */

Also, this is related to a PR, number included in commit log.

As of yet untested updated patch attached.

Thanks,
- Tom



Re: Fortran : Fortran translation issues PR52279

2020-07-02 Thread Mark Eggleston



On 02/07/2020 13:56, Mark Eggleston wrote:


On 02/07/2020 13:25, David Edelsohn wrote:

On Thu, Jul 2, 2020 at 6:16 AM Mark Eggleston
 wrote:
I've committed the change from array to pointer. Does this fix your 
builds?


On 02/07/2020 08:18, Mark Eggleston wrote:

On 01/07/2020 20:07, David Edelsohn wrote:

This patch breaks bootstrap.

Apologies. I didn't see this when I built the compiler the with
bootstrap on x86_64. I'll endevour to get it fixed as soon as 
possible.

Mark,

Your change did nothing because I previously had removed the
translation markers to restore bootstrap.  You were not sufficiently
observant to check the ChangeLog or notice that the line already was
changed.

Please stop making random, untested changes.  This is inappropriate
for GCC development.
OK. I had no way of confirming as the bootstrap worked for me. I 
naively thought that it was covered as an "obvious" fix.


I was concerned that I caused a breakage and wanted to fix it.

Regarding the ChangeLog I was unfamiliar with this situation and will 
check them if this ever happens again (I hope not).  I'll also keep 
portability issues in mind.


I see the issue of the incorrect hint text has been addressed.

whoops that's not true.
As I understand it the string as is will not make it to the pot file 
so will not be translated, is that acceptable? I would like to be able 
to close the PR.


regards,

Mark


Thanks, David


--
https://www.codethink.co.uk/privacy.html



Re: Fortran : Fortran translation issues PR52279

2020-07-02 Thread Mark Eggleston



On 02/07/2020 13:25, David Edelsohn wrote:

On Thu, Jul 2, 2020 at 6:16 AM Mark Eggleston
 wrote:

I've committed the change from array to pointer. Does this fix your builds?

On 02/07/2020 08:18, Mark Eggleston wrote:

On 01/07/2020 20:07, David Edelsohn wrote:

This patch breaks bootstrap.

Apologies. I didn't see this when I built the compiler the with
bootstrap on x86_64. I'll endevour to get it fixed as soon as possible.

Mark,

Your change did nothing because I previously had removed the
translation markers to restore bootstrap.  You were not sufficiently
observant to check the ChangeLog or notice that the line already was
changed.

Please stop making random, untested changes.  This is inappropriate
for GCC development.
OK. I had no way of confirming as the bootstrap worked for me. I naively 
thought that it was covered as an "obvious" fix.


I was concerned that I caused a breakage and wanted to fix it.

Regarding the ChangeLog I was unfamiliar with this situation and will 
check them if this ever happens again (I hope not).  I'll also keep 
portability issues in mind.


I see the issue of the incorrect hint text has been addressed. As I 
understand it the string as is will not make it to the pot file so will 
not be translated, is that acceptable? I would like to be able to close 
the PR.


regards,

Mark


Thanks, David


--
https://www.codethink.co.uk/privacy.html



[PATCH] tree-optimization/96028 - fix bogus externalizing of SLP node

2020-07-02 Thread Richard Biener
This guards externalizing a SLP node when it fails to code generate
to actually have scalar defs we can use.  It also makes failure
to do so not fell the whole SLP instance but instead try this again
on the parent.

This is the variant I installed.

2020-07-02  Richard Biener  

PR tree-optimization/96028
* tree-vect-slp.c (vect_slp_convert_to_external): Make sure
we have scalar stmts to use.
(vect_slp_analyze_node_operations): When analyzing a child
failed try externalizing the parent node.
---
 gcc/tree-vect-slp.c | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 532809d2667..be6b7a1ab29 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2788,6 +2788,7 @@ vect_slp_convert_to_external (vec_info *vinfo, slp_tree 
node,
 
   if (!is_a  (vinfo)
   || node == SLP_INSTANCE_TREE (node_instance)
+  || !SLP_TREE_SCALAR_STMTS (node).exists ()
   || vect_contains_pattern_stmt_p (SLP_TREE_SCALAR_STMTS (node)))
 return false;
 
@@ -2892,16 +2893,25 @@ vect_slp_analyze_node_operations (vec_info *vinfo, 
slp_tree node,
  doesn't result in any issue since we throw away the lvisited set
  when we fail.  */
   if (visited.contains (node)
-  || lvisited.add (node))
+  || lvisited.contains (node))
 return true;
 
+  bool res = true;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-if (!vect_slp_analyze_node_operations (vinfo, child, node_instance,
-  visited, lvisited, cost_vec))
-  return false;
+{
+  res = vect_slp_analyze_node_operations (vinfo, child, node_instance,
+ visited, lvisited, cost_vec);
+  if (!res)
+   break;
+}
 
-  bool res = vect_slp_analyze_node_operations_1 (vinfo, node, node_instance,
-cost_vec);
+  if (res)
+{
+  res = vect_slp_analyze_node_operations_1 (vinfo, node, node_instance,
+   cost_vec);
+  if (res)
+   lvisited.add (node);
+}
 
   /* When the node can be vectorized cost invariant nodes it references.
  This is not done in DFS order to allow the refering node
@@ -2943,13 +2953,12 @@ vect_slp_analyze_node_operations (vec_info *vinfo, 
slp_tree node,
  vect_prologue_cost_for_slp (child, cost_vec);
}
 
-  /* If this node can't be vectorized, try pruning the tree here rather
- than felling the whole thing.  */
+  /* If this node or any of its children can't be vectorized, try pruning
+ the tree here rather than felling the whole thing.  */
   if (!res && vect_slp_convert_to_external (vinfo, node, node_instance))
 {
   /* We'll need to revisit this for invariant costing and number
 of vectorized stmt setting.   */
-  lvisited.remove (node);
   res = true;
 }
 
-- 
2.26.2


Re: *ping* [patch, fortran] PR 27318, warn if interfaces do not match

2020-07-02 Thread Paul Richard Thomas via Gcc-patches
Hi Thomas and Dominique,

The patch looks fine to me. If Dominique has nothing to report then it is
OK for trunk.

Thanks

Paul


On Thu, 2 Jul 2020 at 11:15,  wrote:

> Le 2020-06-30 23:39, Thomas Koenig a écrit :
> > OK,
> >
> > so here is an updated version, which includes the updated test cases.
> >
> > Anything else?  OK for trunk?
> >
>
> Nothing to report!-)
>
> Thanks for the patch,
>
> Dominique
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein


RE: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-07-02 Thread Yangfei (Felix)
Hi,

> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Thursday, July 2, 2020 5:17 PM
> To: Yangfei (Felix) 
> Cc: Richard Biener ; Richard Biener
> ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182
> 

Cut...

> 
> Thanks, pushed to master with a minor whitespace fix for…

Thanks for finding it.

> > + nscalars = (STMT_SLP_TYPE (stmt_info)
> > +   ? vf * DR_GROUP_SIZE (stmt_info) : vf);
> 
> …the indentation on this line.  Hope you don't mind, but I also “reflowed”
> the commit message to make it fit within 72 chars.
> (The text itself is the same.)

It's OK.  :-)
BTW: Is this the rule for gcc git commit msg format? 72 chars instead of 80 
chars?

Felix


Re: Fortran : Fortran translation issues PR52279

2020-07-02 Thread David Edelsohn via Gcc-patches
On Thu, Jul 2, 2020 at 6:16 AM Mark Eggleston
 wrote:
>
> I've committed the change from array to pointer. Does this fix your builds?
>
> On 02/07/2020 08:18, Mark Eggleston wrote:
> > On 01/07/2020 20:07, David Edelsohn wrote:
> >> This patch breaks bootstrap.
> >
> > Apologies. I didn't see this when I built the compiler the with
> > bootstrap on x86_64. I'll endevour to get it fixed as soon as possible.

Mark,

Your change did nothing because I previously had removed the
translation markers to restore bootstrap.  You were not sufficiently
observant to check the ChangeLog or notice that the line already was
changed.

Please stop making random, untested changes.  This is inappropriate
for GCC development.

Thanks, David


Re: [PATCH] nvptx: Fix ICE in nvptx_vector_alignment on gcc.dg/attr-vector_size.c

2020-07-02 Thread Tom de Vries
On 6/29/20 7:19 PM, Roger Sayle wrote:
> This patch addresses the ICE in gcc.dg/attr-vector_size.c during make -k 
> check on
> nvptx-none.  The actual ICE looks like:
> 
> testsuite/gcc.dg/attr-vector_size.c:29:1: internal compiler error: in 
> tree_to_shwi, at tree.c:7321
> 0xf53bf2 tree_to_shwi(tree_node const*)
> ../../gcc/gcc/tree.c:7321
> 0xff1969 nvptx_vector_alignment
> ../../gcc/gcc/config/nvptx/nvptx.c:5105^M
> 
> The problem is that the caller has ensured that TYPE_SIZE(type) is 
> representable as
> an unsigned HOST_WIDE_INT, but nvptx_vector_alignment is accessing it as a
> signed HOST_WIDE_INT which overflows in pathological conditions.  Amongst 
> those
> pathological conditions is that a TYPE_SIZE of zero can sometimes reach this 
> function,
> prior to an error being emitted.  Making sure the result is not less than the 
> mode's
> alignment and not greater than BIGGEST_ALIGNMENT fixes the ICEs, and 
> generates 
> the expected compile-time error messages.
> 
> Tested on --target=nvptx-none, with a "make" and "make check" which results 
> in four
> fewer unexpected failures and three more expected passes.
> Ok for mainline?
> 
> 
> 2020-06-29  Roger Sayle  
> 
> gcc/ChangeLog:
> * config/nvptx/nvptx.c (nvptx_vector_alignment): Use tree_to_uhwi
> to access TYPE_SIZE (type).  Return at least the mode's alignment.
> 
> Thanks,
> Roger
> --
> Roger Sayle
> NextMove Software
> Cambridge, UK
> 
> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
> index e3e84df..bfad91b 100644
> --- a/gcc/config/nvptx/nvptx.c
> +++ b/gcc/config/nvptx/nvptx.c
> @@ -5102,9 +5102,10 @@ static const struct attribute_spec 
> nvptx_attribute_table[] =
>  static HOST_WIDE_INT
>  nvptx_vector_alignment (const_tree type)
>  {
> -  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
> -
> -  return MIN (align, BIGGEST_ALIGNMENT);
> +  unsigned HOST_WIDE_INT align = tree_to_uhwi (TYPE_SIZE (type));

In the default target hook, we test tree_fits_uhwi_p, so I prefer we do
the same.

> +  if (align > BIGGEST_ALIGNMENT)
> +return BIGGEST_ALIGNMENT;

I prefer using MIN to make code easier to read.

> +  return MAX (align, GET_MODE_ALIGNMENT (TYPE_MODE (type)));
>  }
>  
>  /* Indicate that INSN cannot be duplicated.   */

Also, this is related to a PR, number included in commit log.

As of yet untested updated patch attached.

Thanks,
- Tom
nvptx: Fix ICE in nvptx_vector_alignment on gcc.dg/attr-vector_size.c

This patch addresses the ICE in gcc.dg/attr-vector_size.c during
make -k check on nvptx-none.  The actual ICE looks like:

testsuite/gcc.dg/attr-vector_size.c:29:1: internal compiler error: \
  in tree_to_shwi, at tree.c:7321
0xf53bf2 tree_to_shwi(tree_node const*)
../../gcc/gcc/tree.c:7321
0xff1969 nvptx_vector_alignment
../../gcc/gcc/config/nvptx/nvptx.c:5105^M

The problem is that the caller has ensured that TYPE_SIZE(type) is
representable as an unsigned HOST_WIDE_INT, but nvptx_vector_alignment is
accessing it as a signed HOST_WIDE_INT which overflows in pathological
conditions.  Amongst those pathological conditions is that a TYPE_SIZE of
zero can sometimes reach this function, prior to an error being emitted.
Making sure the result is not less than the mode's alignment and not greater
than BIGGEST_ALIGNMENT fixes the ICEs, and generates the expected
compile-time error messages.

Tested on --target=nvptx-none, with a "make" and "make check" which results
in four fewer unexpected failures and three more expected passes.

2020-06-29  Roger Sayle  

gcc/ChangeLog:

	PR target/90932
	* config/nvptx/nvptx.c (nvptx_vector_alignment): Use tree_to_uhwi
	to access TYPE_SIZE (type).  Return at least the mode's alignment.

---
 gcc/config/nvptx/nvptx.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index e3e84dfd4e4..d2f321fcbcc 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5102,9 +5102,22 @@ static const struct attribute_spec nvptx_attribute_table[] =
 static HOST_WIDE_INT
 nvptx_vector_alignment (const_tree type)
 {
-  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
+  unsigned HOST_WIDE_INT align;
+  tree size = TYPE_SIZE (type);
+
+  /* Ensure align is not bigger than BIGGEST_ALIGNMENT.  */
+  if (tree_fits_uhwi_p (size))
+{
+  align = tree_to_uhwi (size);
+  align = MIN (align, BIGGEST_ALIGNMENT);
+}
+  else
+align = BIGGEST_ALIGNMENT;
+
+  /* Ensure align is not smaller than mode alignment.  */
+  align = MAX (align, GET_MODE_ALIGNMENT (TYPE_MODE (type)));
 
-  return MIN (align, BIGGEST_ALIGNMENT);
+  return align;
 }
 
 /* Indicate that INSN cannot be duplicated.   */


Re: Fortran : Fortran translation issues PR52279

2020-07-02 Thread David Edelsohn via Gcc-patches
I already had removed your change.  I hope that you did not re-break bootstrap.

You should be re-producing the breakage and then confirming the fix,
not randomly applying patches and asking others if you didn't break it
again.  That is not appropriate for GCC development.

Thanks David

On Thu, Jul 2, 2020 at 6:16 AM Mark Eggleston
 wrote:
>
> I've committed the change from array to pointer. Does this fix your builds?
>
> On 02/07/2020 08:18, Mark Eggleston wrote:
> > On 01/07/2020 20:07, David Edelsohn wrote:
> >> This patch breaks bootstrap.
> >
> > Apologies. I didn't see this when I built the compiler the with
> > bootstrap on x86_64. I'll endevour to get it fixed as soon as possible.
> >
> > regards,
> >
> > Mark
> >
> >> It is not portable to use _( ... ) to initialize an array.
> >>
> >> In file included from
> >> /nasfarm/edelsohn/src/src/gcc/fortran/gfortran.h:52,
> >>   from /nasfarm/edelsohn/src/src/gcc/fortran/check.c:32:
> >> /nasfarm/edelsohn/src/src/gcc/fortran/check.c: In function 'bool
> >> gfc_invalid_boz
> >> (const char*, locus*)':
> >> /nasfarm/edelsohn/src/src/gcc/intl.h:51:27: error: initializer fails
> >> to determine size of 'hint'
> >> 51 | # define _(msgid) gettext (msgid)
> >>|   ^~~
> >> /nasfarm/edelsohn/src/src/gcc/fortran/check.c:70:23: note: in
> >> expansion of macro
> >>   '_'
> >> 70 |   const char hint[] = _(" [see %<-fno-allow-invalid-boz%>]");
> >>|   ^
> >> /nasfarm/edelsohn/src/src/gcc/intl.h:51:27: error: array must be
> >> initialized with a brace-enclosed initializer
> >>
> --
> https://www.codethink.co.uk/privacy.html
>


Re: [PATCH] Fortran : Fill in missing array dimensions using the lower, bound (for review)

2020-07-02 Thread Mark Eggleston



On 01/07/2020 23:01, Jerry DeLisle wrote:



On 6/27/20 1:40 AM, Thomas Koenig via Fortran wrote:

Hi Mark,


Use -fdec-add-missing-indexes to enable feature. Also enabled by fdec.
A warning that the lower bound is being used for a mission dimension
is output unless suppressed by using -Wno-missing-index.


This is... seriously problematic.  I forsee all sorts of not-so-funny
interactions with more modern features.

What do people actually do with this kind of code?  What kind of test
cases do you have that "work" with this?

And people would actually want to save a few keystrokes so they don't
have to write A(N,M,1) instead of A(N,M)?  And is this even the right
fix, how sure are you of the semantics; is there documentation for
this feature (maybe on Bitsavers)?  If not, this is not be done.

If this goes in at all, I want this rejected with any modern Fortran
feature, i.e. it should not be contain

- allocatable arrays
- coarrays
- pointers
- derived types
- CLASS
- assumed-shape arrays
- assumed-rank arrays (well, it probably doesn't make sense)
- KIND=4 characters
- as an argument to any of the array intrinsics like MATMUL,
  EOSHIFT, ...

but even with these restrictions, I will still take a lot of convincing
that this make sense.

Just imagine what will happen if people specify -fdec for some
relatively benign reason (for example because they want -fdec-math)
and start not finding bugs in their code because of this feature.

Best regards

Thomas


Please stop fixing problematic DEC programs by using the compiler as 
the pet tool. Use an editor or python or some suitable tool to 
initialize arrays properly.


Agreed, I was tasked with upstreaming various legacy patches, 
fortunately there has been a parallel effort refactor Fortran code to 
reduce the need for these patches.  It is likely that this one is no 
longer required.  There is only one more is left, hopefully that will 
also be addressed in the code.  I'll relay your opinion as an argument 
for not attempting the last patch and I can address the many Fortran PRs 
instead.


regards,

Mark



I appreciate the effort, but need things run by here before the effort 
so you can spend the effort on really true compiler bugs, and not on 
the wishes of perhaps a paying customer.


We should never have caved on the previous DEC enhancement.

Just my humble opinion.

Jerry


--
https://www.codethink.co.uk/privacy.html



RE: [PATCH] Fix unnecessary register spill that depends on function ordering

2020-07-02 Thread Omar Tahir
> Omar Tahir  writes:
> > Hi Richard,
> >
> > From: Richard Sandiford 
> >> > @@ -3719,6 +3722,7 @@ static unsigned int rest_of_handle_sched (void) 
> >> > { #ifdef INSN_SCHEDULING
> >> > +  first_moveable_pseudo = last_moveable_pseudo;
> >> >if (flag_selective_scheduling
> >> >&& ! maybe_skip_selective_scheduling ())
> >> >  run_selective_scheduling ();
> >> 
> >> I think instead we should zero out both variables at the end of IRA.
> >> There are other places besides the scheduler that call into the IRA code, 
> >> so tackling the problem there seems more general.
> >
> > If you zero first_moveable_pseudo and last_moveable_pseudo after IRA then
> > they'll be zero for the second scheduler pass, which uses them.
> 
> Are you sure?  It shouldn't be doing that, since there are no pseudos
> left when the second scheduling pass runs.  RA replaces all pseudos
> with hard registers.
> 
> So if the values in the variables has a noticeable effect on sched2,
> I think that's even more reason to clear them after IRA :-)

That's a good point. A few other passes call functions that make use of
the moveable pseudo variables. But if they're before IRA then they should
be zero, and as you say if they're after IRA then there are no pseudos left!

I've moved the reset to the end of move_unallocated_pseudos. Unfortunately I
can't inline the patch as there's a form feed (^L) that's mangling the text,
not sure how to get around that.

Thanks,
Omar



gcc/ChangeLog:

2020-07-02: Omar Tahir 

* ira.c (move_unallocated_pseudos): Zero first_moveable_pseudo and
last_moveable_pseudo before returning.

gcc/testsuite/ChangeLog:

2020-07-02: Omar Tahir 

* gcc.target/aarch64/nospill.c: New test.


moveable_pseudo.patch
Description: moveable_pseudo.patch


Re: [patch] Extend store merging to STRING_CST

2020-07-02 Thread Richard Biener via Gcc-patches
On Thu, Jul 2, 2020 at 12:27 PM Eric Botcazou  wrote:
>
> > Sorry for the ping-pong but why's using a new char[len] type problematic?
>
> Because the type may be incompatible with that of the STRING_CST, so we would
> need to build a new STRING_CST with the new type.

I see.

> > That said, I do like p2 more even if we need to special-case STRING_CST
> > sources at the end for store-merging to be happy.
>
> OK.

So this variant combined with the rest of the patch is OK then.

Thanks,
Richard.

>
> --
> Eric Botcazou


Re: [patch] Extend store merging to STRING_CST

2020-07-02 Thread Eric Botcazou
> Sorry for the ping-pong but why's using a new char[len] type problematic?

Because the type may be incompatible with that of the STRING_CST, so we would 
need to build a new STRING_CST with the new type.

> That said, I do like p2 more even if we need to special-case STRING_CST
> sources at the end for store-merging to be happy.

OK.

-- 
Eric Botcazou


Re: Fortran : Fortran translation issues PR52279

2020-07-02 Thread Mark Eggleston

I've committed the change from array to pointer. Does this fix your builds?

On 02/07/2020 08:18, Mark Eggleston wrote:

On 01/07/2020 20:07, David Edelsohn wrote:

This patch breaks bootstrap.


Apologies. I didn't see this when I built the compiler the with 
bootstrap on x86_64. I'll endevour to get it fixed as soon as possible.


regards,

Mark


It is not portable to use _( ... ) to initialize an array.

In file included from 
/nasfarm/edelsohn/src/src/gcc/fortran/gfortran.h:52,

  from /nasfarm/edelsohn/src/src/gcc/fortran/check.c:32:
/nasfarm/edelsohn/src/src/gcc/fortran/check.c: In function 'bool 
gfc_invalid_boz

(const char*, locus*)':
/nasfarm/edelsohn/src/src/gcc/intl.h:51:27: error: initializer fails
to determine size of 'hint'
    51 | # define _(msgid) gettext (msgid)
   |   ^~~
/nasfarm/edelsohn/src/src/gcc/fortran/check.c:70:23: note: in 
expansion of macro

  '_'
    70 |   const char hint[] = _(" [see %<-fno-allow-invalid-boz%>]");
   |   ^
/nasfarm/edelsohn/src/src/gcc/intl.h:51:27: error: array must be
initialized with a brace-enclosed initializer


--
https://www.codethink.co.uk/privacy.html



Re: *ping* [patch, fortran] PR 27318, warn if interfaces do not match

2020-07-02 Thread dhumieres . dominique

Le 2020-06-30 23:39, Thomas Koenig a écrit :

OK,

so here is an updated version, which includes the updated test cases.

Anything else?  OK for trunk?



Nothing to report!-)

Thanks for the patch,

Dominique


[PATCH] tree-optimization/96028 - fix bogus externalizing of SLP node

2020-07-02 Thread Richard Biener
This guards externalizing a SLP node when it fails to code generate
to actually have scalar defs we can use.  It also makes failure
to do so not fell the whole SLP instance but instead try this again
on the parent.

Bootstrap / regtest running on x86_64-unknown-linux-gnu and should
fix an ICE observed on sparc-solaris.

2020-07-02  Richard Biener  

PR tree-optimization/96028
* tree-vect-slp.c (vect_slp_convert_to_external): Make sure
we have scalar stmts to use.
(vect_slp_analyze_node_operations): When analyzing a child
failed try externalizing the parent node.
---
 gcc/tree-vect-slp.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 532809d2667..a471f89248c 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2788,6 +2788,7 @@ vect_slp_convert_to_external (vec_info *vinfo, slp_tree 
node,
 
   if (!is_a  (vinfo)
   || node == SLP_INSTANCE_TREE (node_instance)
+  || !SLP_TREE_SCALAR_STMTS (node).exists ()
   || vect_contains_pattern_stmt_p (SLP_TREE_SCALAR_STMTS (node)))
 return false;
 
@@ -2895,13 +2896,18 @@ vect_slp_analyze_node_operations (vec_info *vinfo, 
slp_tree node,
   || lvisited.add (node))
 return true;
 
+  bool res;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-if (!vect_slp_analyze_node_operations (vinfo, child, node_instance,
-  visited, lvisited, cost_vec))
-  return false;
+{
+  res = vect_slp_analyze_node_operations (vinfo, child, node_instance,
+ visited, lvisited, cost_vec);
+  if (!res)
+   break;
+}
 
-  bool res = vect_slp_analyze_node_operations_1 (vinfo, node, node_instance,
-cost_vec);
+  if (res)
+res = vect_slp_analyze_node_operations_1 (vinfo, node, node_instance,
+ cost_vec);
 
   /* When the node can be vectorized cost invariant nodes it references.
  This is not done in DFS order to allow the refering node
@@ -2943,8 +2949,8 @@ vect_slp_analyze_node_operations (vec_info *vinfo, 
slp_tree node,
  vect_prologue_cost_for_slp (child, cost_vec);
}
 
-  /* If this node can't be vectorized, try pruning the tree here rather
- than felling the whole thing.  */
+  /* If this node or any of its children can't be vectorized, try pruning
+ the tree here rather than felling the whole thing.  */
   if (!res && vect_slp_convert_to_external (vinfo, node, node_instance))
 {
   /* We'll need to revisit this for invariant costing and number
-- 
2.26.2


Re: [PATCH] tree-cfg: Fix ICE with switch stmt to unreachable opt and forced labels [PR95857]

2020-07-02 Thread Richard Biener via Gcc-patches
On Thu, Jul 2, 2020 at 11:03 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> The following testcase ICEs, because during the cfg cleanup, we see:
>   switch (i$e_11)  [33.33%], case -3:  [33.33%], case 0: 
>  [33.33%], case 2:  [33.33%]>
> ...
> lab2:
>   __builtin_unreachable ();
> where lab2 is FORCED_LABEL.  The way it works, we go through the case labels
> and when we reach the first one that points to gimple_seq_unreachable*
> basic block, we remove the edge (if any) from the switch bb to the bb
> containing the label and bbs reachable only through that edge we've just
> removed.  Once we do that, we must throw away all other cases that use
> the same label (or some other labels from the same bb we've removed the edge
> to and the bb).  To avoid quadratic behavior, this is not done by walking
> all remaining cases immediately before removing, but only when processing
> them later.
> For normal labels this works, fine, if the label is in a deleted bb, it will
> have NULL label_to_block and we handle that case, or, if the unreachable bb
> has some other edge to it, only the edge will be removed and not the bb,
> and again, find_edge will not find the edge and we only remove the case.
> And if a label would be to some other block, that other block wouldn't have
> been removed earlier because there would be still an edge from the switch
> block.
> Now, FORCED_LABEL (and I think DECL_NONLOCAL too) break this, because
> those labels aren't removed, but instead moved to some surrounding basic
> block.  So, when we later process those, when their gimple_seq_unreachable*
> basic block is removed, label_to_block will return some unrelated block
> (in the testcase the switch bb), so we decide to keep the case which doesn't
> seem to be unreachable, but we don't really have an edge from the switch
> block to the block the label got moved to.
>
> I thought first about punting in gimple_seq_unreachable* on
> FORCED_LABEL/DECL_NONLOCAL labels, but that might penalize even code that
> doesn't care, so this instead just makes sure that for
> FORCED_LABEL/DECL_NONLOCAL labels that are being removed (and thus moved
> randomly) we remember in a hash_set the fact that those labels should be
> treated as removed for the purpose of the optimization, and later on
> handle those labels that way.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2020-07-02  Jakub Jelinek  
>
> PR tree-optimization/95857
> * tree-cfg.c (group_case_labels_stmt): When removing an unreachable
> base_bb, remember all forced and non-local labels on it and later
> treat those as if they have NULL label_to_block.  Formatting fix.
> Fix a comment typo.
>
> * gcc.dg/pr95857.c: New test.
>
> --- gcc/tree-cfg.c.jj   2020-06-30 11:18:52.981737596 +0200
> +++ gcc/tree-cfg.c  2020-06-30 12:50:22.068246981 +0200
> @@ -1728,6 +1728,7 @@ group_case_labels_stmt (gswitch *stmt)
>int old_size = gimple_switch_num_labels (stmt);
>int i, next_index, new_size;
>basic_block default_bb = NULL;
> +  hash_set *removed_labels = NULL;
>
>default_bb = gimple_switch_default_bb (cfun, stmt);
>
> @@ -1744,8 +1745,11 @@ group_case_labels_stmt (gswitch *stmt)
>base_bb = label_to_block (cfun, CASE_LABEL (base_case));
>
>/* Discard cases that have the same destination as the default case or
> -whose destiniation blocks have already been removed as unreachable.  
> */
> -  if (base_bb == NULL || base_bb == default_bb)
> +whose destination blocks have already been removed as unreachable.  
> */
> +  if (base_bb == NULL
> + || base_bb == default_bb
> + || (removed_labels
> + && removed_labels->contains (CASE_LABEL (base_case
> {
>   i++;
>   continue;
> @@ -1768,10 +1772,13 @@ group_case_labels_stmt (gswitch *stmt)
>   /* Merge the cases if they jump to the same place,
>  and their ranges are consecutive.  */
>   if (merge_bb == base_bb
> + && (removed_labels == NULL
> + || !removed_labels->contains (CASE_LABEL (merge_case)))
>   && wi::to_wide (CASE_LOW (merge_case)) == bhp1)
> {
> - base_high = CASE_HIGH (merge_case) ?
> - CASE_HIGH (merge_case) : CASE_LOW (merge_case);
> + base_high
> +   = (CASE_HIGH (merge_case)
> +  ? CASE_HIGH (merge_case) : CASE_LOW (merge_case));
>   CASE_HIGH (base_case) = base_high;
>   next_index++;
> }
> @@ -1792,7 +1799,29 @@ group_case_labels_stmt (gswitch *stmt)
> {
>   edge base_edge = find_edge (gimple_bb (stmt), base_bb);
>   if (base_edge != NULL)
> -   remove_edge_and_dominated_blocks (base_edge);
> +   {
> + for (gimple_stmt_iterator gsi = gsi_start_bb (base_bb);
> +  !gsi_end_p (gsi); 

[PATCH] tree-optimization/96022 - fix ICE with vectorized shift

2020-07-02 Thread Richard Biener
This fixes lane extraction for internal def vectorized shifts
with an effective scalar shift operand by always using lane zero
of the first vector stmt.

It also fixes a SLP build issue noticed on the testcase where
we end up building unary vector ops with the only operand built
form scalars which isn't profitable by itself.  The exception
is for stores.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-07-02  Richard Biener  

PR tree-optimization/96022
* tree-vect-stmts.c (vectorizable_shift): Only use the
first vector stmt when extracting the scalar shift amount.
* tree-vect-slp.c (vect_build_slp_tree_2): Also build unary
nodes with all-scalar children from scalars but not stores.
(vect_analyze_slp_instance): Mark the node not failed.

* g++.dg/vect/pr96022.cc: New testcase.
---
 gcc/testsuite/g++.dg/vect/pr96022.cc | 12 
 gcc/tree-vect-slp.c  | 11 ++-
 gcc/tree-vect-stmts.c|  6 --
 3 files changed, 22 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/vect/pr96022.cc

diff --git a/gcc/testsuite/g++.dg/vect/pr96022.cc 
b/gcc/testsuite/g++.dg/vect/pr96022.cc
new file mode 100644
index 000..ca6b27696f5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr96022.cc
@@ -0,0 +1,12 @@
+// { dg-do compile }
+// { dg-additional-options "-O3" }
+
+extern int arr_6[];
+extern char arr_7[] __attribute__((aligned));
+void test(short a, bool, int p8) {
+  for (bool b = 0; b < (bool)p8; b = 1)
+for (short c = 0; c < 5; c++) {
+  arr_6[c] = (long)2 << a - 30574;
+  arr_7[c] = 0;
+}
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 532809d2667..af123b5ab1d 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1530,11 +1530,11 @@ fail:
 
   vect_free_oprnd_info (oprnds_info);
 
-  /* If we have all children of a non-unary child built up from
- uniform scalars then just throw that away, causing it built up
- from scalars.  */
-  if (nops > 1
-  && is_a  (vinfo)
+  /* If we have all children of a child built up from uniform scalars
+ then just throw that away, causing it built up from scalars.
+ The exception is the SLP node for the vector store.  */
+  if (is_a  (vinfo)
+  && !STMT_VINFO_GROUPED_ACCESS (stmt_info)
   /* ???  Rejecting patterns this way doesn't work.  We'd have to
 do extra work to cancel the pattern so the uses see the
 scalar version.  */
@@ -2230,6 +2230,7 @@ vect_analyze_slp_instance (vec_info *vinfo,
  return false;
}
  /* Fatal mismatch.  */
+ matches[0] = true;
  matches[group_size / const_max_nunits * const_max_nunits] = false;
  vect_free_slp_tree (node, false);
}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 9b7b04ce2d3..d68547ed1b5 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -5403,7 +5403,7 @@ vectorizable_shift (vec_info *vinfo,
  if (!op1_vectype)
op1_vectype = get_vectype_for_scalar_type (vinfo,
   TREE_TYPE (op1),
-  slp_node);
+  slp_op1);
 
   /* Unlike the other binary operators, shifts/rotates have
  the rhs being int, instead of the same type as the lhs,
@@ -5575,11 +5575,11 @@ vectorizable_shift (vec_info *vinfo,
   /* Arguments are ready.  Create the new vector stmt.  */
   FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 {
-  vop1 = vec_oprnds1[i];
   /* For internal defs where we need to use a scalar shift arg
 extract the first lane.  */
   if (scalar_shift_arg && dt[1] == vect_internal_def)
{
+ vop1 = vec_oprnds1[0];
  new_temp = make_ssa_name (TREE_TYPE (TREE_TYPE (vop1)));
  gassign *new_stmt
= gimple_build_assign (new_temp,
@@ -5590,6 +5590,8 @@ vectorizable_shift (vec_info *vinfo,
  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
  vop1 = new_temp;
}
+  else
+   vop1 = vec_oprnds1[i];
   gassign *new_stmt = gimple_build_assign (vec_dest, code, vop0, vop1);
   new_temp = make_ssa_name (vec_dest, new_stmt);
   gimple_assign_set_lhs (new_stmt, new_temp);
-- 
2.26.2


Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-07-02 Thread Richard Sandiford
"Yangfei (Felix)"  writes:
> Hi,
>
>> -Original Message-
>> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
>> Sent: Wednesday, July 1, 2020 9:03 PM
>> To: Yangfei (Felix) 
>> Cc: Richard Biener ; Richard Biener
>> ; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182
>> 
>> "Yangfei (Felix)"  writes:
>> >> On June 30, 2020 4:23:03 PM GMT+02:00, Richard Sandiford
>> >>  wrote:
>> >> >Richard Biener  writes:
>> >> >> So it seems odd to somehow put in the number of vectors...  so to
>> >> >> me it would have made sense if it did
>> >> >>
>> >> >>   possible_npeel_number = lower_bound (nscalars);
>> >> >>
>> >> >> or whateveris necessary to make the polys happy.  Thus simply
>> >> >> elide the vect_get_num_vectors call?  But it's been very long
>> >> >> since I've dived into the alignment peeling code...
>> >> >
>> >> >Ah, I see what you mean.  So rather than:
>> >> >
>> >> >   /* Save info about DR in the hash table.  Also include peeling
>> >> >  amounts according to the explanation above.  */
>> >> >  for (j = 0; j < possible_npeel_number; j++)
>> >> >{
>> >> >  vect_peeling_hash_insert (_htab, loop_vinfo,
>> >> > dr_info, npeel_tmp);
>> >> >   npeel_tmp += target_align / dr_size;
>> >> >}
>> >> >
>> >> >just have something like:
>> >> >
>> >> >   while (known_le (npeel_tmp, nscalars))
>> >> > {
>> >> >   …
>> >> > }
>> >> >
>> >> >?
>> >>
>> >> Yeah.
>> >
>> > Not sure if I understand correctly.  I am supposing the following check in
>> the original code is not necessary if we go like that.
>> >
>> > 1822   if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>> >
>> > Is that correct?
>> 
>> I think we still need it.  I guess there are two choices:
>> 
>> - make nscalars default to npeel_tmp before the “if” above.
>
> I think this will be simpler.  How about the v2 patch?
> Bootstrapped and tested on aarch64-linux-gnu & x86_64-linux-gnu.
>
> Thanks,
> Felix
>
> From 69efce90fa746a2a86f8d673335b874c7df34d9f Mon Sep 17 00:00:00 2001
> From: Fei Yang 
> Date: Thu, 2 Jul 2020 14:39:23 +0800
> Subject: [PATCH] vect: Fix an ICE in exact_div [PR95961]
>
> In the test case for PR95961, vectorization factor computed by
> vect_determine_vectorization_factor is [8,8].  But this is updated to [1,1]
> later by vect_update_vf_for_slp.  When we call vect_get_num_vectors in
> vect_enhance_data_refs_alignment, the number of scalars which is based on
> the vectorization factor is not a multiple of the the number of elements in
> the vector type.  This leads to the ICE.  This isn't a simple stream of
> contiguous vector accesses.  It's hard to predict from the available 
> information
> how many vector accesses we'll actually need per iteration.  As discussed, 
> here
> we should use the number of scalars instead of the number of vectors as an 
> upper
> bound for the loop saving info about DR in the hash table.
>
> 2020-07-02  Felix Yang  
>
> gcc/
>   PR tree-optimization/95961
>   * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Use the
>   number of scalars instead of the number of vectors as an upper bound
>   for the loop saving info about DR in the hash table.  Remove unused
>   local variables.
>
> gcc/testsuite/
>
>   PR tree-optimization/95961
>   * gcc.target/aarch64/sve/pr95961.c: New test.

Thanks, pushed to master with a minor whitespace fix for…

> +   nscalars = (STMT_SLP_TYPE (stmt_info)
> + ? vf * DR_GROUP_SIZE (stmt_info) : vf);

…the indentation on this line.  Hope you don't mind, but I also
“reflowed” the commit message to make it fit within 72 chars.
(The text itself is the same.)

Richard


[committed] openmp: Diagnose non-rectangular loops with invalid steps

2020-07-02 Thread Jakub Jelinek via Gcc-patches
Hi!

The OpenMP 5 standard requires that if some loop in OpenMP loop nest refers
to some outer loop's iterator variable, then the subtraction of the 
multiplication
factors for the outer iterator multiplied by the outer increment modulo the
inner increment is 0.  For loops with non-constants in any of these we can't
diagnose it, it would be a task for something like -fsanitize=openmp,
but if all these are constant, we can diagnose it.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2020-07-02  Jakub Jelinek  

* omp-expand.c (expand_omp_for): Diagnose non-rectangular loops with
invalid steps - ((m2 - m1) * incr_outer) % incr must be 0 in valid
OpenMP non-rectangular loops.  Use XALLOCAVEC.

* c-c++-common/gomp/loop-7.c: New test.

--- gcc/omp-expand.c.jj 2020-06-29 14:51:54.855085915 +0200
+++ gcc/omp-expand.c2020-06-30 15:45:00.531813174 +0200
@@ -7122,15 +7122,55 @@ expand_omp_for (struct omp_region *regio
   struct omp_for_data fd;
   struct omp_for_data_loop *loops;
 
-  loops
-= (struct omp_for_data_loop *)
-  alloca (gimple_omp_for_collapse (last_stmt (region->entry))
- * sizeof (struct omp_for_data_loop));
+  loops = XALLOCAVEC (struct omp_for_data_loop,
+ gimple_omp_for_collapse (last_stmt (region->entry)));
   omp_extract_for_data (as_a  (last_stmt (region->entry)),
, loops);
   region->sched_kind = fd.sched_kind;
   region->sched_modifiers = fd.sched_modifiers;
   region->has_lastprivate_conditional = fd.lastprivate_conditional != 0;
+  if (fd.non_rect && !gimple_omp_for_combined_into_p (fd.for_stmt))
+{
+  for (int i = fd.first_nonrect; i <= fd.last_nonrect; i++)
+   if ((loops[i].m1 || loops[i].m2)
+   && (loops[i].m1 == NULL_TREE
+   || TREE_CODE (loops[i].m1) == INTEGER_CST)
+   && (loops[i].m2 == NULL_TREE
+   || TREE_CODE (loops[i].m2) == INTEGER_CST)
+   && TREE_CODE (loops[i].step) == INTEGER_CST
+   && TREE_CODE (loops[i - loops[i].outer].step) == INTEGER_CST)
+ {
+   tree t;
+   tree itype = TREE_TYPE (loops[i].v);
+   if (loops[i].m1 && loops[i].m2)
+ t = fold_build2 (MINUS_EXPR, itype, loops[i].m2, loops[i].m1);
+   else if (loops[i].m1)
+ t = fold_build1 (NEGATE_EXPR, itype, loops[i].m1);
+   else
+ t = loops[i].m2;
+   t = fold_build2 (MULT_EXPR, itype, t,
+fold_convert (itype,
+  loops[i - loops[i].outer].step));
+   if (TYPE_UNSIGNED (itype) && loops[i].cond_code == GT_EXPR)
+ t = fold_build2 (TRUNC_MOD_EXPR, itype,
+  fold_build1 (NEGATE_EXPR, itype, t),
+  fold_build1 (NEGATE_EXPR, itype,
+   fold_convert (itype,
+ loops[i].step)));
+   else
+ t = fold_build2 (TRUNC_MOD_EXPR, itype, t,
+  fold_convert (itype, loops[i].step));
+   if (integer_nonzerop (t))
+ error_at (gimple_location (fd.for_stmt),
+   "invalid OpenMP non-rectangular loop step; "
+   "%<(%E - %E) * %E%> is not a multiple of loop %d "
+   "step %qE",
+   loops[i].m2 ? loops[i].m2 : integer_zero_node,
+   loops[i].m1 ? loops[i].m1 : integer_zero_node,
+   loops[i - loops[i].outer].step, i + 1,
+   loops[i].step);
+ }
+}
 
   gcc_assert (EDGE_COUNT (region->entry->succs) == 2);
   BRANCH_EDGE (region->entry)->flags &= ~EDGE_ABNORMAL;
--- gcc/testsuite/c-c++-common/gomp/loop-7.c.jj 2020-06-30 16:29:38.969728972 
+0200
+++ gcc/testsuite/c-c++-common/gomp/loop-7.c2020-06-30 15:49:17.788152186 
+0200
@@ -0,0 +1,24 @@
+void
+foo (void)
+{
+  #pragma omp for collapse(2)  /* { dg-error "invalid OpenMP non-rectangular 
loop step" } */
+  for (int i = 0; i < 6; i++)
+for (int j = 4 * i; j < 7 * i; j += 2)
+  ;
+  #pragma omp for collapse(2)  /* { dg-error "invalid OpenMP non-rectangular 
loop step" } */
+  for (int i = 0; i < 32; i += 7)
+for (int j = 3 * i; j < 7 * i; j += 30)
+  ;
+  #pragma omp for collapse(2)
+  for (int i = 0; i < 6; i++)
+for (int j = 4 * i; j < 6 * i; j += 2)
+  ;
+  #pragma omp for collapse(2)
+  for (int i = 0; i < 6; i += 2)
+for (int j = 4 * i; j < 7 * i; j += 2)
+  ;
+  #pragma omp for collapse(2)
+  for (int i = 0; i < 6; i += 5)
+for (int j = 4 * i; j < 7 * i; j += 15)
+  ;
+}


Jakub



Re: [PATCH] VEC_COND_EXPR: do not expand comparisons feeding it

2020-07-02 Thread Richard Biener via Gcc-patches
On Thu, Jul 2, 2020 at 10:20 AM Martin Liška  wrote:
>
> On 6/30/20 3:03 PM, Richard Biener wrote:
> > Now to simply restore previous behavior for this particular case we should
> > probably make expand_vector_comparison walk immediate uses as you do
> > but then call expand_vector_condition for each VEC_COND_EXPR use,
> > making that return whether it "consumed" the comparison.  If all uses
> > consumed the comparison we should DCE it, if not, we should lower it?
>
> All right, I prepared patch for it.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

OK.  Can you please replace 'auto_bitmap *' args with 'bitmap'?
auto_bitmap decays to bitmap just fine.

Thanks,
Richard.

> Ready to be installed?
> Thanks,
> Martin


[PATCH] tree-cfg: Fix ICE with switch stmt to unreachable opt and forced labels [PR95857]

2020-07-02 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs, because during the cfg cleanup, we see:
  switch (i$e_11)  [33.33%], case -3:  [33.33%], case 0: 
 [33.33%], case 2:  [33.33%]>
...
lab2:
  __builtin_unreachable ();
where lab2 is FORCED_LABEL.  The way it works, we go through the case labels
and when we reach the first one that points to gimple_seq_unreachable*
basic block, we remove the edge (if any) from the switch bb to the bb
containing the label and bbs reachable only through that edge we've just
removed.  Once we do that, we must throw away all other cases that use
the same label (or some other labels from the same bb we've removed the edge
to and the bb).  To avoid quadratic behavior, this is not done by walking
all remaining cases immediately before removing, but only when processing
them later.
For normal labels this works, fine, if the label is in a deleted bb, it will
have NULL label_to_block and we handle that case, or, if the unreachable bb
has some other edge to it, only the edge will be removed and not the bb,
and again, find_edge will not find the edge and we only remove the case.
And if a label would be to some other block, that other block wouldn't have
been removed earlier because there would be still an edge from the switch
block.
Now, FORCED_LABEL (and I think DECL_NONLOCAL too) break this, because
those labels aren't removed, but instead moved to some surrounding basic
block.  So, when we later process those, when their gimple_seq_unreachable*
basic block is removed, label_to_block will return some unrelated block
(in the testcase the switch bb), so we decide to keep the case which doesn't
seem to be unreachable, but we don't really have an edge from the switch
block to the block the label got moved to.

I thought first about punting in gimple_seq_unreachable* on
FORCED_LABEL/DECL_NONLOCAL labels, but that might penalize even code that
doesn't care, so this instead just makes sure that for
FORCED_LABEL/DECL_NONLOCAL labels that are being removed (and thus moved
randomly) we remember in a hash_set the fact that those labels should be
treated as removed for the purpose of the optimization, and later on
handle those labels that way.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-07-02  Jakub Jelinek  

PR tree-optimization/95857
* tree-cfg.c (group_case_labels_stmt): When removing an unreachable
base_bb, remember all forced and non-local labels on it and later
treat those as if they have NULL label_to_block.  Formatting fix.
Fix a comment typo.

* gcc.dg/pr95857.c: New test.

--- gcc/tree-cfg.c.jj   2020-06-30 11:18:52.981737596 +0200
+++ gcc/tree-cfg.c  2020-06-30 12:50:22.068246981 +0200
@@ -1728,6 +1728,7 @@ group_case_labels_stmt (gswitch *stmt)
   int old_size = gimple_switch_num_labels (stmt);
   int i, next_index, new_size;
   basic_block default_bb = NULL;
+  hash_set *removed_labels = NULL;
 
   default_bb = gimple_switch_default_bb (cfun, stmt);
 
@@ -1744,8 +1745,11 @@ group_case_labels_stmt (gswitch *stmt)
   base_bb = label_to_block (cfun, CASE_LABEL (base_case));
 
   /* Discard cases that have the same destination as the default case or
-whose destiniation blocks have already been removed as unreachable.  */
-  if (base_bb == NULL || base_bb == default_bb)
+whose destination blocks have already been removed as unreachable.  */
+  if (base_bb == NULL
+ || base_bb == default_bb
+ || (removed_labels
+ && removed_labels->contains (CASE_LABEL (base_case
{
  i++;
  continue;
@@ -1768,10 +1772,13 @@ group_case_labels_stmt (gswitch *stmt)
  /* Merge the cases if they jump to the same place,
 and their ranges are consecutive.  */
  if (merge_bb == base_bb
+ && (removed_labels == NULL
+ || !removed_labels->contains (CASE_LABEL (merge_case)))
  && wi::to_wide (CASE_LOW (merge_case)) == bhp1)
{
- base_high = CASE_HIGH (merge_case) ?
- CASE_HIGH (merge_case) : CASE_LOW (merge_case);
+ base_high
+   = (CASE_HIGH (merge_case)
+  ? CASE_HIGH (merge_case) : CASE_LOW (merge_case));
  CASE_HIGH (base_case) = base_high;
  next_index++;
}
@@ -1792,7 +1799,29 @@ group_case_labels_stmt (gswitch *stmt)
{
  edge base_edge = find_edge (gimple_bb (stmt), base_bb);
  if (base_edge != NULL)
-   remove_edge_and_dominated_blocks (base_edge);
+   {
+ for (gimple_stmt_iterator gsi = gsi_start_bb (base_bb);
+  !gsi_end_p (gsi); gsi_next ())
+   if (glabel *stmt = dyn_cast  (gsi_stmt (gsi)))
+ {
+   if (FORCED_LABEL (gimple_label_label (stmt))
+   || DECL_NONLOCAL (gimple_label_label (stmt)))
+ {
+   /* 

Re: [PATCH] ipa-sra: Prevent constructing debug info from wrong argument

2020-07-02 Thread Richard Biener via Gcc-patches
On Wed, Jul 1, 2020 at 11:59 PM Martin Jambor  wrote:
>
> Hi,
>
> the mechanism generating debug info for removed parameters did not
> adjust index of the argument in the call statement to take into
> account extra arguments IPA-SRA might have produced when splitting a
> strucutre.  This patch addresses that omission and stops gdb from
> showing incorrect value for the removed parameter and says "value
> optimized out" instead.  The guality testcase will end up as
> UNSUPPORTED in the results which is how Richi told me on IRC we deal
> with this.
>
> It is possible to generate debug info to actually show the value of the
> removed parameter but so far my approaches to do that seem too
> controversial
> (https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546705.html), so
> before I come up with something better I'd like to push this to master
> and the gcc-10 branch in time for the GCC 10.2 release.
>
> Bootstrapped and tested on master on x86_64-linux, bootstrap on top of
> the gcc-10 branch is underway?  OK for both if it passes?

OK.

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2020-07-01  Martin Jambor  
>
> PR guality/95343
> * ipa-param-manipulation.c (ipa_param_adjustments::modify_call): 
> Adjust
> argument index if necessary.
>
> gcc/testsuite/ChangeLog:
>
> 2020-07-01  Martin Jambor  
>
> PR guality/95343
> * gcc.dg/guality/pr95343.c: New test.
> ---
>  gcc/ipa-param-manipulation.c   |  6 +++-
>  gcc/testsuite/gcc.dg/guality/pr95343.c | 45 ++
>  2 files changed, 50 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/guality/pr95343.c
>
> diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
> index 2cc4bc79dc1..5fc0de56556 100644
> --- a/gcc/ipa-param-manipulation.c
> +++ b/gcc/ipa-param-manipulation.c
> @@ -790,7 +790,11 @@ ipa_param_adjustments::modify_call (gcall *stmt,
>   if (!is_gimple_reg (old_parm) || kept[i])
> continue;
>   tree origin = DECL_ORIGIN (old_parm);
> - tree arg = gimple_call_arg (stmt, i);
> + tree arg;
> + if (transitive_remapping)
> +   arg = gimple_call_arg (stmt, index_map[i]);
> + else
> +   arg = gimple_call_arg (stmt, i);
>
>   if (!useless_type_conversion_p (TREE_TYPE (origin), TREE_TYPE 
> (arg)))
> {
> diff --git a/gcc/testsuite/gcc.dg/guality/pr95343.c 
> b/gcc/testsuite/gcc.dg/guality/pr95343.c
> new file mode 100644
> index 000..a3e57decda8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/guality/pr95343.c
> @@ -0,0 +1,45 @@
> +/* { dg-do run } */
> +/* { dg-options "-g -fno-ipa-icf" } */
> +
> +volatile int v;
> +
> +int __attribute__((noipa))
> +get_val0 (void)  {return 0;}
> +int __attribute__((noipa))
> +get_val2 (void)  {return 2;}
> +
> +struct S
> +{
> +  int a, b, c;
> +};
> +
> +static int __attribute__((noinline))
> +bar (struct S s, int x, int y, int z, int i)
> +{
> +  int r;
> +  v = s.a + s.b;   /* { dg-final { gdb-test . "i+1" "3" } } */
> +  return r;
> +}
> +
> +static int __attribute__((noinline))
> +foo (struct S s, int i)
> +{
> +  int r;
> +  r = bar (s, 3, 4, 5, i);
> +  return r;
> +}
> +
> +
> +int
> +main (void)
> +{
> +  struct S s;
> +  int i;
> +  i = get_val2 ();
> +  s.a = get_val0 ();
> +  s.b = get_val0 ();
> +  s.c = get_val0 ();
> +  int r = foo (s, i);
> +  v = r + i;
> +  return 0;
> +}
> --
> 2.27.0
>


[Ada] Reject components in extensions overlapping with the parent

2020-07-02 Thread Eric Botcazou
Such problematic components can be specified by means of a component clause 
but they cannot be fully supported by the type system.  They had initially 
been forbidden, then we decided to accept them by working around the type 
system, but this is very fragile and, for example, any static aggregate is 
guaranteed to trigger an ICE with the current implementation.

We now reject them again, except if the -gnatd.K switch is passed.

Tested on x86-64/Linux, applied on the mainline.


2020-07-02  Eric Botcazou  

* debug.adb (d.K): Document new usage.
* fe.h (Debug_Flag_Dot_KK): Declare.
* gcc-interface/decl.c (gnat_to_gnu_field): Give an error when the
component overlaps with the parent subtype, except with -gnatd.K.

-- 
Eric Botcazoudiff --git a/gcc/ada/debug.adb b/gcc/ada/debug.adb
index 63b14b2bd6d..0f73c2a17ae 100644
--- a/gcc/ada/debug.adb
+++ b/gcc/ada/debug.adb
@@ -128,7 +128,7 @@ package body Debug is
--  d.H
--  d.I  Do not ignore enum representation clauses in CodePeer mode
--  d.J  Relaxed rules for pragma No_Return
-   --  d.K
+   --  d.K  Do not reject components in extensions overlapping with parent
--  d.L  Depend on back end for limited types in if and case expressions
--  d.M  Relaxed RM semantics
--  d.N  Add node to all entities
@@ -898,6 +898,11 @@ package body Debug is
--   for that. If the procedure does in fact return normally, execution
--   is erroneous, and therefore unpredictable.
 
+   --  d.K  Do not reject components in extensions overlapping with the parent
+   --   component. Such components can be specified by means of a component
+   --   clause but they cannot be fully supported by the GCC type system.
+   --   This switch nevertheless allows them for the sake of compatibility.
+
--  d.L  Normally the front end generates special expansion for conditional
--   expressions of a limited type. This debug flag removes this special
--   case expansion, leaving it up to the back end to handle conditional
diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
index 043300d42c9..463a89c5fdb 100644
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -59,9 +59,11 @@ extern int Compiler_Abort (String_Pointer, String_Pointer, Boolean) ATTRIBUTE_NO
 
 /* debug: */
 
+#define Debug_Flag_Dot_KK	debug__debug_flag_dot_kk
 #define Debug_Flag_Dot_R	debug__debug_flag_dot_r
 #define Debug_Flag_NN		debug__debug_flag_nn
 
+extern Boolean Debug_Flag_Dot_KK;
 extern Boolean Debug_Flag_Dot_R;
 extern Boolean Debug_Flag_NN;
 
diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index cad06a4dd06..025714bd339 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -7234,12 +7234,12 @@ gnat_to_gnu_field (Entity_Id gnat_field, tree gnu_record_type, int packed,
 {
   Entity_Id gnat_parent = Parent_Subtype (gnat_record_type);
 
-  /* Ensure the position does not overlap with the parent subtype, if there
-	 is one.  This test is omitted if the parent of the tagged type has a
-	 full rep clause since, in this case, component clauses are allowed to
-	 overlay the space allocated for the parent type and the front-end has
-	 checked that there are no overlapping components.  */
-  if (Present (gnat_parent) && !Is_Fully_Repped_Tagged_Type (gnat_parent))
+  /* Ensure the position doesn't overlap with the parent subtype if there
+	 is one.  It would be impossible to build CONSTRUCTORs and accessing
+	 the parent could clobber the component in the extension if directly
+	 done.  We accept it with -gnatd.K for the sake of compatibility.  */
+  if (Present (gnat_parent)
+	  && !(Debug_Flag_Dot_KK && Is_Fully_Repped_Tagged_Type (gnat_parent)))
 	{
 	  tree gnu_parent = gnat_to_gnu_type (gnat_parent);
 


Re: [PATCH] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p.

2020-07-02 Thread Richard Biener via Gcc-patches
On Wed, Jul 1, 2020 at 10:02 PM Jeff Law  wrote:
>
> On Mon, 2020-06-29 at 08:58 +0200, Richard Biener wrote:
> > On Sat, Jun 27, 2020 at 4:52 PM Marc Glisse  wrote:
> > > On Fri, 26 Jun 2020, Jeff Law via Gcc-patches wrote:
> > >
> > > > In theory yes, but there are cases where paths converge (like you've 
> > > > shown) where
> > > > you may have evaluated to a constant on the paths, but it's not a 
> > > > constant at the
> > > > convergence point.  You have to be very careful using b_c_p like this 
> > > > and it's
> > > > been a regular source of kernel bugs.
> > > >
> > > >
> > > > I'd recommend looking at the .ssa dump and walk forward from there if 
> > > > the .ssa
> > > > dump looks correct.
> > >
> > > Here is the last dump before thread1 (105t.mergephi2). I don't see
> > > anything incorrect in it.
> > >
> > > ledtrig_cpu (_Bool is_active)
> > > {
> > >int old;
> > >int iftmp.0_1;
> > >int _5;
> > >
> > > [local count: 1073741824]:
> > >if (is_active_2(D) != 0)
> > >  goto ; [50.00%]
> > >else
> > >  goto ; [50.00%]
> > >
> > > [local count: 536870913]:
> > >
> > > [local count: 1073741824]:
> > ># iftmp.0_1 = PHI <1(2), -1(3)>
> > >_5 = __builtin_constant_p (iftmp.0_1);
> > >if (_5 != 0)
> > >  goto ; [50.00%]
> > >else
> > >  goto ; [50.00%]
> > >
> > > [local count: 536870913]:
> > >if (iftmp.0_1 >= -128)
> > >  goto ; [50.00%]
> > >else
> > >  goto ; [50.00%]
> > >
> > > [local count: 268435456]:
> > >if (iftmp.0_1 <= 127)
> > >  goto ; [34.00%]
> > >else
> > >  goto ; [66.00%]
> > >
> > > [local count: 91268056]:
> > >__asm__ __volatile__("asi %0,%1
> > > " : "ptr" "=Q" MEM[(int *)_active_cpus] : "val" "i" iftmp.0_1, "Q" 
> > > MEM[(int *)_active_cpus] : "memory", "cc");
> > >goto ; [100.00%]
> > >
> > > [local count: 982473769]:
> > >__asm__ __volatile__("laa %0,%2,%1
> > > " : "old" "=d" old_8, "ptr" "=Q" MEM[(int *)_active_cpus] : "val" "d" 
> > > iftmp.0_1, "Q" MEM[(int *)_active_cpus] : "memory", "cc");
> > >
> > > [local count: 1073741824]:
> > >return;
> > >
> > > }
> > >
> > > There is a single _b_c_p, the immediate asm argument is exactly the
> > > argument of _b_c_p, and it is in the branch protected by _b_c_p.
> > >
> > > Now the thread1 dump, for comparison
> > >
> > > ledtrig_cpu (_Bool is_active)
> > > {
> > >int old;
> > >int iftmp.0_4;
> > >int iftmp.0_6;
> > >int _7;
> > >int _12;
> > >int iftmp.0_13;
> > >int iftmp.0_14;
> > >
> > > [local count: 1073741824]:
> > >if (is_active_2(D) != 0)
> > >  goto ; [50.00%]
> > >else
> > >  goto ; [50.00%]
> > >
> > > [local count: 536870912]:
> > ># iftmp.0_6 = PHI <1(2)>
> > >_7 = __builtin_constant_p (iftmp.0_6);
> > >if (_7 != 0)
> > >  goto ; [50.00%]
> > >else
> > >  goto ; [50.00%]
> > >
> > > [local count: 536870912]:
> > ># iftmp.0_4 = PHI <-1(2)>
> > >_12 = __builtin_constant_p (iftmp.0_4);
> > >if (_12 != 0)
> > >  goto ; [50.00%]
> > >else
> > >  goto ; [50.00%]
> > >
> > > [local count: 268435456]:
> > >if (iftmp.0_4 >= -128)
> > >  goto ; [20.00%]
> > >else
> > >  goto ; [80.00%]
> > >
> > > [local count: 214748364]:
> > >if (iftmp.0_6 <= 127)
> > >  goto ; [12.00%]
> > >else
> > >  goto ; [88.00%]
> > >
> > > [local count: 91268056]:
> > ># iftmp.0_13 = PHI 
> > >__asm__ __volatile__("asi %0,%1
> > > " : "ptr" "=Q" MEM[(int *)_active_cpus] : "val" "i" iftmp.0_13, "Q" 
> > > MEM[(int *)_active_cpus] : "memory", "cc");
> > >goto ; [100.00%]
> > >
> > > [local count: 982473769]:
> > ># iftmp.0_14 = PHI  > > iftmp.0_6(3)>
> > >__asm__ __volatile__("laa %0,%2,%1
> > > " : "old" "=d" old_8, "ptr" "=Q" MEM[(int *)_active_cpus] : "val" "d" 
> > > iftmp.0_14, "Q" MEM[(int *)_active_cpus] : "memory", "cc");
> > >
> > > [local count: 1073741824]:
> > >return;
> > >
> > > }
> > >
> > > Thread1 decides to separate the paths is_active and !is_active
> > > (surprisingly, for one it optimizes out the comparison <= 127 and for the
> > > other the comparison >= -128, while it could optimize both in both cases).
> > > And it decides to converge after the comparisons, but before the asm.
> > >
> > > What the pass did does seem to hurt. It looks like if we duplicate _b_c_p,
> > > we may need to duplicate far enough to include all the blocks dominated by
> > > _b_c_p==true (including the asm, here). Otherwise, any _b_c_p can be
> > > optimized to true, because for a boolean
> > >
> > > b is the same as b ? true : false
> > > __builtin_constant_p(b ? true : false) would be the same as b ?
> > > __builtin_constant_p(true) : __builtin_constant_p(false), i.e. true.
> > >
> > > It is too bad we don't have any optimization pass using ranges between IPA
> > > and thread1, that would have gotten rid of the comparisons, and hence the
> > > 

Re: [PATCH] VEC_COND_EXPR: do not expand comparisons feeding it

2020-07-02 Thread Martin Liška

On 6/30/20 3:03 PM, Richard Biener wrote:

Now to simply restore previous behavior for this particular case we should
probably make expand_vector_comparison walk immediate uses as you do
but then call expand_vector_condition for each VEC_COND_EXPR use,
making that return whether it "consumed" the comparison.  If all uses
consumed the comparison we should DCE it, if not, we should lower it?


All right, I prepared patch for it.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
>From 366b61fb66efa9af0cfaa2a10df72a4224c708b8 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 30 Jun 2020 08:57:27 +0200
Subject: [PATCH] VEC_COND_EXPR: do not expand comparisons feeding it

gcc/ChangeLog:

	* tree-vect-generic.c (expand_vector_condition): Forward declaration.
	(expand_vector_comparison): Do not expand a comparison if all
	uses are consumed by a VEC_COND_EXPR.
	(expand_vector_operation): Change void return type to bool.
	(expand_vector_operations_1): Pass dce_ssa_names.
---
 gcc/tree-vect-generic.c | 59 -
 1 file changed, 52 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index a4b56195903..92884fad2d3 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -371,14 +371,53 @@ expand_vector_addition (gimple_stmt_iterator *gsi,
 a, b, code);
 }
 
+static bool
+expand_vector_condition (gimple_stmt_iterator *gsi, auto_bitmap *dce_ssa_names);
+
 /* Try to expand vector comparison expression OP0 CODE OP1 by
querying optab if the following expression:
 	VEC_COND_EXPR< OP0 CODE OP1, {-1,...}, {0,...}>
can be expanded.  */
 static tree
 expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
-  tree op1, enum tree_code code)
+			  tree op1, enum tree_code code,
+			  auto_bitmap *dce_ssa_names)
 {
+  tree lhs = gimple_assign_lhs (gsi_stmt (*gsi));
+  use_operand_p use_p;
+  imm_use_iterator iterator;
+  bool vec_cond_expr_only = true;
+
+  /* As seen in PR95830, we should not expand comparisons that are only
+ feeding a VEC_COND_EXPR statement.  */
+  auto_vec uses;
+  FOR_EACH_IMM_USE_FAST (use_p, iterator, lhs)
+uses.safe_push (USE_STMT (use_p));
+
+  for (unsigned i = 0; i < uses.length (); i ++)
+{
+  gassign *use = dyn_cast (uses[i]);
+  if (use != NULL
+	  && gimple_assign_rhs_code (use) == VEC_COND_EXPR
+	  && gimple_assign_rhs1 (use) == lhs)
+	{
+	  gimple_stmt_iterator it = gsi_for_stmt (use);
+	  if (!expand_vector_condition (, dce_ssa_names))
+	{
+	  vec_cond_expr_only = false;
+	  break;
+	}
+	}
+  else
+	{
+	  vec_cond_expr_only = false;
+	  break;
+	}
+}
+
+  if (!uses.is_empty () && vec_cond_expr_only)
+return NULL_TREE;
+
   tree t;
   if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
   && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
@@ -932,7 +971,8 @@ expand_vector_divmod (gimple_stmt_iterator *gsi, tree type, tree op0,
 
 /* Expand a vector condition to scalars, by using many conditions
on the vector's elements.  */
-static void
+
+static bool
 expand_vector_condition (gimple_stmt_iterator *gsi, auto_bitmap *dce_ssa_names)
 {
   gassign *stmt = as_a  (gsi_stmt (*gsi));
@@ -975,7 +1015,7 @@ expand_vector_condition (gimple_stmt_iterator *gsi, auto_bitmap *dce_ssa_names)
   if (expand_vec_cond_expr_p (type, TREE_TYPE (a1), code))
 {
   gcc_assert (TREE_CODE (a) == SSA_NAME || TREE_CODE (a) == VECTOR_CST);
-  return;
+  return true;
 }
 
   /* Handle vector boolean types with bitmasks.  If there is a comparison
@@ -1006,7 +1046,7 @@ expand_vector_condition (gimple_stmt_iterator *gsi, auto_bitmap *dce_ssa_names)
   a = gimplify_build2 (gsi, BIT_IOR_EXPR, type, a1, a2);
   gimple_assign_set_rhs_from_tree (gsi, a);
   update_stmt (gsi_stmt (*gsi));
-  return;
+  return true;
 }
 
   /* TODO: try and find a smaller vector type.  */
@@ -1070,11 +1110,14 @@ expand_vector_condition (gimple_stmt_iterator *gsi, auto_bitmap *dce_ssa_names)
   if (a_is_comparison)
 bitmap_set_bit (*dce_ssa_names,
 		SSA_NAME_VERSION (gimple_assign_lhs (assign)));
+
+  return false;
 }
 
 static tree
 expand_vector_operation (gimple_stmt_iterator *gsi, tree type, tree compute_type,
-			 gassign *assign, enum tree_code code)
+			 gassign *assign, enum tree_code code,
+			 auto_bitmap *dce_ssa_names)
 {
   machine_mode compute_mode = TYPE_MODE (compute_type);
 
@@ -1128,7 +1171,8 @@ expand_vector_operation (gimple_stmt_iterator *gsi, tree type, tree compute_type
 	  tree rhs1 = gimple_assign_rhs1 (assign);
 	  tree rhs2 = gimple_assign_rhs2 (assign);
 
-	  return expand_vector_comparison (gsi, type, rhs1, rhs2, code);
+	  return expand_vector_comparison (gsi, type, rhs1, rhs2, code,
+	   dce_ssa_names);
 	}
 
   case TRUNC_DIV_EXPR:
@@ -2213,7 +2257,8 @@ expand_vector_operations_1 

Re: [PATCH][RFC] Do not stream all zeros for gcda files.

2020-07-02 Thread Martin Liška

On 7/1/20 4:01 PM, Jan Hubicka wrote:

On 7/1/20 3:15 AM, Martin Liška wrote:

On 6/30/20 3:24 PM, Nathan Sidwell wrote:

On 6/30/20 8:53 AM, Martin Liška wrote:



Yes, if there's a real need for a solid compression, then I would use a
generic compressor of a final streamed file.


for avoidance of doubt, I'm good with your current solution


I think special casing zeros is good even in combination with a generic
compressor because it is very common special case to deal with (for
programs that exit frequently and have a lot of basic blocks like GCC
generic comporessor may easilly turn out to be a bottleneck if fed with
overhelming quantity of redundant data). I am also happy with current solution.

Thanks for working on it!


All right, you convinced me and I'm going to install the patch.

Martin


Honza


nathan


--
Nathan Sidwell




RE: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-07-02 Thread Yangfei (Felix)
Hi,

> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Wednesday, July 1, 2020 9:03 PM
> To: Yangfei (Felix) 
> Cc: Richard Biener ; Richard Biener
> ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182
> 
> "Yangfei (Felix)"  writes:
> >> On June 30, 2020 4:23:03 PM GMT+02:00, Richard Sandiford
> >>  wrote:
> >> >Richard Biener  writes:
> >> >> So it seems odd to somehow put in the number of vectors...  so to
> >> >> me it would have made sense if it did
> >> >>
> >> >>   possible_npeel_number = lower_bound (nscalars);
> >> >>
> >> >> or whateveris necessary to make the polys happy.  Thus simply
> >> >> elide the vect_get_num_vectors call?  But it's been very long
> >> >> since I've dived into the alignment peeling code...
> >> >
> >> >Ah, I see what you mean.  So rather than:
> >> >
> >> >/* Save info about DR in the hash table.  Also include peeling
> >> >   amounts according to the explanation above.  */
> >> >  for (j = 0; j < possible_npeel_number; j++)
> >> >{
> >> >  vect_peeling_hash_insert (_htab, loop_vinfo,
> >> >  dr_info, npeel_tmp);
> >> >npeel_tmp += target_align / dr_size;
> >> >}
> >> >
> >> >just have something like:
> >> >
> >> >while (known_le (npeel_tmp, nscalars))
> >> >  {
> >> >…
> >> >  }
> >> >
> >> >?
> >>
> >> Yeah.
> >
> > Not sure if I understand correctly.  I am supposing the following check in
> the original code is not necessary if we go like that.
> >
> > 1822   if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
> >
> > Is that correct?
> 
> I think we still need it.  I guess there are two choices:
> 
> - make nscalars default to npeel_tmp before the “if” above.

I think this will be simpler.  How about the v2 patch?
Bootstrapped and tested on aarch64-linux-gnu & x86_64-linux-gnu.

Thanks,
Felix


pr95961-v2.diff
Description: pr95961-v2.diff


Re: Fortran : Fortran translation issues PR52279

2020-07-02 Thread Mark Eggleston

On 01/07/2020 20:07, David Edelsohn wrote:

This patch breaks bootstrap.


Apologies. I didn't see this when I built the compiler the with 
bootstrap on x86_64. I'll endevour to get it fixed as soon as possible.


regards,

Mark


It is not portable to use _( ... ) to initialize an array.

In file included from /nasfarm/edelsohn/src/src/gcc/fortran/gfortran.h:52,
  from /nasfarm/edelsohn/src/src/gcc/fortran/check.c:32:
/nasfarm/edelsohn/src/src/gcc/fortran/check.c: In function 'bool gfc_invalid_boz
(const char*, locus*)':
/nasfarm/edelsohn/src/src/gcc/intl.h:51:27: error: initializer fails
to determine size of 'hint'
51 | # define _(msgid) gettext (msgid)
   |   ^~~
/nasfarm/edelsohn/src/src/gcc/fortran/check.c:70:23: note: in expansion of macro
  '_'
70 |   const char hint[] = _(" [see %<-fno-allow-invalid-boz%>]");
   |   ^
/nasfarm/edelsohn/src/src/gcc/intl.h:51:27: error: array must be
initialized with a brace-enclosed initializer


--
https://www.codethink.co.uk/privacy.html