[RFA/RFC] Improve DSE/aliasing around __builtin_object_size (PR89689, P2 regression)

2020-01-28 Thread Jeff Law

Here's two approaches to fix the regression in 89689.  Note this only
fixes the regression in the originally reported testcase.  THe deeper
issue Martin raises in C#1 is not addressed.  Thus if we go forward
with either patch ISTM that we would drop the regression markers, but
keep the BZ open.

So the key blocks in question as we enter DSE1:

> 

> ;;   basic block 2, loop depth 0, maybe hot
> ;;prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)
> ;;pred:   ENTRY (FALLTHRU,EXECUTABLE)
>   h = l;
>   h$data_33 = l.data;
>   if (h$data_33 != &__sb_slop)
> goto ; [70.00%]
>   else
> goto ; [30.00%]
> ;;succ:   3 [70.0% (guessed)]  (TRUE_VALUE,EXECUTABLE)
> ;;4 [30.0% (guessed)]  (FALSE_VALUE,EXECUTABLE)
> 
> ;;   basic block 3, loop depth 0, maybe hot
> ;;prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)
> ;;pred:   2 [70.0% (guessed)]  (TRUE_VALUE,EXECUTABLE)
>   *h$data_33 = 0;
> ;;succ:   4 [always]  (FALLTHRU,EXECUTABLE)
> 
> ;;   basic block 4, loop depth 0, maybe hot
> ;;prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
> ;;pred:   3 [always]  (FALLTHRU,EXECUTABLE)
> ;;2 [30.0% (guessed)]  (FALSE_VALUE,EXECUTABLE)
>   _23 = __builtin_object_size (h$data_33, 0);
>   _25 = __builtin___memcpy_chk (h$data_33, "abcd", 4, _23);
>   __builtin_puts (h$data_33);
>   _6 = h$data_33 == &__sb_slop;
>   _7 = (int) _6;
>   printf ("%d\n", _7);
>   _9 = h$data_33 == 
>   _10 = (int) _9;
>   printf ("%d\n", _10);
>   if (h$data_33 != &__sb_slop)
> goto ; [70.00%]
>   else
> goto ; [30.00%]
> 

So vrp1 is going to want to thread the jump at the end of bb4 which
requires duplicating bb4.  One version of bb4 will only be reachable
via the false edge from bb2.  That in turn exposes a cprop opportunity 
to replace h$data_33 with &__sbloop and the type of &__sbloop is a
char[1] array thus triggering the false positive warning.

We can avoid all that by realizing the store in bb3 is dead.  That in
turn makes the conditional at the end of bb2 pointless as bb3 is empty
and thus both arms ultimately transfer control to bb4 without any other
side effects meaning we can eliminate that conditional which in turn
eliminates the need for jump threading.

Again, this only fixes the original testcase and if there were other
statements in bb3 it wouldn't work and it won't work for the test in
c#1.  But the proposed changes are a general improvement.

DSE misses this case because ref_maybe_used_by_stmt_p returns true for
the virtual operand of the __builtin_object_size call.  Opps!

There's two easy ways to fix this, I've bootstrapped and regression
tested both.

First, the most conservative fix and perhaps the most appropriate for
gcc-10 -- special casing __builtin_object_size in tree-ssa-dse.c.

The second approach is more general and marks __builtin_object_size as
const rather than just pure.  Jakub and I kicked this around in IRC a
bit.  We've always marked __builtin_object_size as pure.  It was never
const.  THere is some concern that code motion might cause _b_o_s to
get different results, which would be particularly concerning for
_b_o_s types #1 and #3 which do use some contextual information. 
However, ISTM that motion is still going to be limited by the SSA
graph, though somewhat less because we wouldn't have a dependency on
the virtual operands.  So it *should* be safe, but there's some degree
of hesitation to make this change at this point in gcc-10's lifecycle.

Thoughts?  If we go forward obviously I'd add a testcase and ChangeLog
entries.

Jeff
diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 374143e5785..e88342f1b68 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -789,8 +789,13 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
 		  phi_def = use_stmt;
 		}
 	}
-	  /* If the statement is a use the store is not dead.  */
-	  else if (ref_maybe_used_by_stmt_p (use_stmt, ref))
+	  /* If the statement is a use the store is not dead.   Do not
+	 consider calls to __builtin_object_size as a use.  For gcc-11
+	 we plan mark __builtin_object_size as const.  */
+	  else if ((!is_gimple_call (use_stmt)
+		|| !gimple_call_builtin_p (as_a  (use_stmt),
+	   BUILT_IN_OBJECT_SIZE))
+		   && ref_maybe_used_by_stmt_p (use_stmt, ref))
 	{
 	  /* Handle common cases where we can easily build an ao_ref
 		 structure for USE_STMT and in doing so we find that the
commit f4e5d3f4755b6a5846ac20b53008b90131a8bb7c
Author: Jeff Law 
Date:   Tue Jan 28 18:16:09 2020 -0500

Mark _b_o_s as constant

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 5ab842c34c2..fa8b0641ab1 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -972,7 +972,7 @@ DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
 DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq")
 
 /* Object size checking builtins.  */
-DEF_GCC_BUILTIN   

Re: GCC fixincludes bug

2020-01-28 Thread Bruce Korb
On 1/23/20 8:07 AM, Pétur Orri Ragnarsson wrote:
> Hi.
> 
> I ran into a problem with fixincludes when building GCC. Your email is
> in the README file.
> 
> 
> When building a crosscompiling GCC 8.3.0 for powerpc the following "fix"
> is applied:
> 
>     $ diff /opt/pluto-targets/powerpc-rootfs/usr/include/open62541.h
> /opt/plutotoolchain/lib/gcc/powerpc-marel-linux-gnu/8.3.0/include-fixed/open62541.h
>     0a1,9
>     > /*  DO NOT EDIT THIS FILE.
>     >
>     >     It has been auto-edited by fixincludes from:
>     >
>     > "/opt/pluto-targets/powerpc-rootfs/usr/include/open62541.h"
>     >
>     >     This had to be done to correct non-standard usages in the
>     >     original, manufacturer supplied header file.  */
>     >
>     115c124
>     < # ifndef __COUNTER__ /* PPC GCC fix */
>     ---
>     > # ifndef __COUNTER__ /* __PPC__ GCC fix */
>    
> This fix is of course useless since it's editing a string inside a
> comment. Unfortunately it's also harmful since it causes this header
> file to be included in the compiler distribution in a way that it gets
> used before the system header. Then I went and upgraded the open62541
> library which made incompatible changes to its API (from v0.3 to v1.0 if
> it matters) causing builds to fail due to the wrong header being used.
> 
> I can apply some hack to fix this locally in our packaging system so
> it's not an urgent error, and it would also go away by building the
> compiler with the new header installed. But I feel like those are
> workarounds that are not very future-proof.
> 
> Is this something you think can/should be fixed? Do you need more info
> from me to figure out where and why this substitution happens?

It'll be a little while before I can re-pull all of GCC and look at it.
There's a substitution rule somewhere that effectively:

sed 's/PPC/__PPC__/'

somewhere. Add a constraint that prevents the application of that fix
when it appears after a "/*" then re-generate the C code that does all
the fixing.

Cheers - Bruce

> (I've sent this mail directly to you only instead of the mailing list
> since I don't know if I'm including enough information.)

Probably better to include gcc-patches as I have become less reliable on
email since I was retired.


Re: [committed] Add include hack to fix missing SCNuMAX defines in inttypes.h on hpux11.[01]*

2020-01-28 Thread Bruce Korb
On 1/25/20 9:34 AM, John David Anglin wrote:
> +/*
> + * Fix missing SCNuMAX defines in inttypes.h
> + */
> +fix = {
> +hackname  = hpux_c99_inttypes4;
> +mach  = "hppa*-hp-hpux11.[01]*";
> +files = inttypes.h;
> +sed   = "/^[ \t]*#[ \t]*define[ \t]*SCNxMAX[ \t]*SCNx64/a\\\n"
> + "#define SCNuMAX \t SCNu64\n";
> +sed   = "/^[ \t]*#[ \t]*define[ \t]*SCNxMAX[ \t]*SCNx32/a\\\n"
> + "#define SCNuMAX \t SCNu32\n";
> +test_text = "#define SCNxMAX SCNx64\n"
> + "#define SCNxMAX SCNx32\n";
> +};
> +
"" in the inserted text? OK. It works...


Re: [cris-decc0 8/9] cris: Move trivially from cc0 to reg:CC model, removing most optimizations.

2020-01-28 Thread Hans-Peter Nilsson
> From: Segher Boessenkool 
> Date: Mon, 27 Jan 2020 23:52:21 +0100

> Hi!
> 
> On Wed, Jan 22, 2020 at 07:11:27AM +0100, Hans-Peter Nilsson wrote:
> > I intend to put back as many as I find use for, of those
> > anonymous patterns in a controlled manner, with self-contained
> > test-cases proving their usability, rather than symmetry with
> > other instructions and similar addressing modes, which guided
> > the original introduction.  I've entered prX to track code
> > performance regressions related to this transition, with focus
> > on target-side causes and fixes; besides the function prologue
> > special-case, there were some checking presence of the bit-test
> > (btstq) instruction.
> 
> That's PR93372 (not X :-) ).

(Wow, someone read the log!)

Doh!  Thanks, will fix at next rebase.  (FWIW, I just added a
mail-send-hook to at least stop (and ask) me from sending email
with that kind of placeholder in the body. :)

> Do you have any estimate how much removing cc0 this way costs in
> performance (or code size, or any other metric)?

Not yet, by any quantifiable metrics, but I plan to do that once
I'm happy with compare-elim.c doing its job for the most glaring
cases.

Still, looking at differences to the cc0 version for libgcc, it
might even be an improvement due to better register allocation
(fewer saved registers and register moves and the like).  To
wit, that effect has a good chance of dominating over negative
fallout from missed compare-elimination opportunities and other
simplifications, just judging from initial observations.
Film at 11.

brgds, H-P


Re: [patch, fortran, wwwdocs] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Andrew Benson
Hi Tobias,

On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote:
> Thus, I do not think it is a real problem in practice. We could be nice,
> however, and add a note to the release notes (i.e.
> https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether
> it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git
> about how to change WWW files and CC Gerald as web maintainer when
> submitting wwwdocs patches.]

I've attached a draft patch to update the release notes about this ABI 
breakage. I don't know if I've explained it sufficiently clearly though?

-Andrew

-- 

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus
diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index dcce6b8..5a959a1 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -446,6 +446,13 @@ a work-in-progress.
 objects with allocatable components. In this case, the optional arguments
 STAT= and ERRMSG= are currently ignored.
   
+  
+The handling of module and submodule names has been reworked to allow the
+full 63-character length mandated by the standard. Previously symbol names
+were truncated if the combined length of module, submodule, and function
+name exceeded 126 characters. This change therefore breaks the ABI, but only
+for cases where this 126 character limit was exceeded.
+  
 
 
 


[patch, fortran] PR93486 - ICE on valid with nested submodules and long submodule names

2020-01-28 Thread Andrew Benson
I opened PR93486 for this problem:

The following causes an ICE with revision 
ad690d79cfbb905c5546c9333c5fd089d906505b:

module ivs
  interface l
 module procedure l_
  end interface l
contains
  function l_()
  end function l_
end module ivs

module aModeratleyLongModuleName
  use ivs
  interface
 module subroutine cmo()
 end subroutine cmo
  end interface
end module aModeratleyLongModuleName

submodule (aModeratleyLongModuleName) 
aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill
contains
  module procedure cmo
  end procedure cmo
end submodule aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill

submodule 
(aModeratleyLongModuleName:aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill)
 
sb
end submodule sb

submodule (aModeratleyLongModuleName:sb) sc
end submodule sc


$ gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data001/abenson/Galacticus/Tools_Devel_Install/bin/../
libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-git/configure --prefix=/home/abenson/Galacticus/
Tools_Devel --enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200128 (experimental) (GCC) 


$ gfortran -c test.F90 -o test.o
f951: internal compiler error: Segmentation fault
0xe1bc9f crash_signal
../../gcc-git/gcc/toplev.c:328
0x7f98119c71ef ???
/data001/abenson/Galacticus/Tools/glibc-2.12.1/signal/../sysdeps/unix/
sysv/linux/x86_64/sigaction.c:0
0x84b3f0 gfc_use_modules()
../../gcc-git/gcc/fortran/module.c:7315
0x85acc8 use_modules
../../gcc-git/gcc/fortran/parse.c:114
0x866daa parse_module
../../gcc-git/gcc/fortran/parse.c:6111
0x86733c gfc_parse_file()
../../gcc-git/gcc/fortran/parse.c:6428
0x8b780f gfc_be_parse_file
../../gcc-git/gcc/fortran/f95-lang.c:210
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


The ICE goes away if the module and/or submodule names are made shorter.

The problem is caused when loading (generic or operator) interfaces from a 
module file. The module name can include the submodule name, so a size of 
2*GFC_MAX_SYMBOL_LEN+2 is required to read this in.

The attached patch fixes this problem and regression tests cleanly *if* the 
patch for PR87103 is applied (otherwise that problem gets triggered by the new 
test case).

PR87103 has been a long-standing issue - there's a patch that works (either my 
original ugly fix, or the better fix proposed by Bernhard at https://
gcc.gnu.org/ml/fortran/2018-09/msg00044.html ). It would be very nice to get 
one of these patches committed.

-Andrew

-- 

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus
diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index 4487f65..b6a4e87 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -4568,7 +4568,9 @@ static void
 load_operator_interfaces (void)
 {
   const char *p;
-  char name[GFC_MAX_SYMBOL_LEN + 1], module[GFC_MAX_SYMBOL_LEN + 1];
+  /* "module" must be large enough for the case of submodules in which the name
+ has the form module.submodule */
+  char name[GFC_MAX_SYMBOL_LEN + 1], module[2 * GFC_MAX_SYMBOL_LEN + 2];
   gfc_user_op *uop;
   pointer_info *pi = NULL;
   int n, i;
@@ -4624,7 +4626,9 @@ static void
 load_generic_interfaces (void)
 {
   const char *p;
-  char name[GFC_MAX_SYMBOL_LEN + 1], module[GFC_MAX_SYMBOL_LEN + 1];
+  /* "module" must be large enough for the case of submodules in which the name
+ has the form module.submodule */
+  char name[GFC_MAX_SYMBOL_LEN + 1], module[2 * GFC_MAX_SYMBOL_LEN + 2];
   gfc_symbol *sym;
   gfc_interface *generic = NULL, *gen = NULL;
   int n, i, renamed;
diff --git a/gcc/testsuite/gfortran.dg/pr93486.f90 b/gcc/testsuite/gfortran.dg/pr93486.f90
new file mode 100644
index 000..5037d40
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93486.f90
@@ -0,0 +1,30 @@
+! { dg-do compile }
+! PR fortran/93486
+module ivs
+  interface l
+ module procedure l_
+  end interface l
+contains
+  function l_()
+  end function l_
+end module ivs
+
+module aModeratleyLongModuleName
+  use ivs
+  interface
+ module subroutine cmo()
+ end subroutine cmo
+  end interface
+end module aModeratleyLongModuleName
+
+submodule (aModeratleyLongModuleName) aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill
+contains
+  module procedure cmo
+  end procedure cmo
+end submodule aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill
+
+submodule (aModeratleyLongModuleName:aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill) sb
+end submodule sb
+
+submodule (aModeratleyLongModuleName:sb) sc
+end submodule sc
2020-01-28  Andrew Benson  

	PR fortran/93486
	* module.c: Increase size of var

[PATCH] c++: Fix class NTTP with template arguments [PR92948]

2020-01-28 Thread Marek Polacek
This PR points out an ICE with an alias template and class NTTP, but I
found that there are more issues.  Trouble arise when we use a
(non-type) template parameter as an argument to the template arg list of
a template that accepts a class NTTP and a conversion to a class is
involved, e.g.

  struct A { A(int) { } };
  template struct B { };

  template void fn () {
  B b;
  }

Normally for such a conversion we create a TARGET_EXPR with an
AGGR_INIT_EXPR that calls a __ct_comp with some arguments, but not in
this case: when converting X to type A in convert_nontype_argument we
are in a template and the template parameter 'X' is value-dependent, and
AGGR_INIT_EXPR don't work in templates.  So it just creates a TARGET_EXPR
that contains "A::A(*this, X)", but with that overload resolution fails:
  error: no matching function for call to 'A::A(A*, int)'
That happens because finish_call_expr for a BASELINK creates a dummy
object, so we have 'this' twice.  I thought for the value-dependent case
we could use IMPLICIT_CONV_EXPR, as in the patch below.  Note that I
only do this when we convert to a different type than the type of the
expr.  The point is to avoid the call to a converting ctor.

The second issue was an ICE in tsubst_copy: when there's a conversion
like the above involved then
 /* Wrapper to make a C++20 template parameter object const.  */
  op = tsubst_copy (op, args, complain, in_decl);
might not produce a _ZT... VAR_DECL with const type, so the assert
  gcc_assert (CP_TYPE_CONST_P (TREE_TYPE (op)));
fails.  Allowing IMPLICIT_CONV_EXPR there probably makes sense.

And since convert_nontype_argument uses value_dependent_expression_p
a lot, I used a dedicated bool to speed things up.

Bootstrapped/regtested on x86_64-linux, ok for trunk and maybe 9?

2020-01-28  Marek Polacek  

PR c++/92948 - Fix class NTTP with template arguments.
* pt.c (convert_nontype_argument): Use IMPLICIT_CONV_EXPR when
converting a value-dependent expression to a class type.
(tsubst_copy) : Allow IMPLICIT_CONV_EXPR
as the result of the tsubst_copy call.

* g++.dg/cpp2a/nontype-class28.C: New test.
* g++.dg/cpp2a/nontype-class29.C: New test.
---
 gcc/cp/pt.c  | 36 +++
 gcc/testsuite/g++.dg/cpp2a/nontype-class28.C | 37 
 gcc/testsuite/g++.dg/cpp2a/nontype-class29.C | 26 ++
 3 files changed, 85 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class28.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class29.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index e889c800aca..fccf4e95453 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -7074,7 +7074,8 @@ convert_nontype_argument (tree type, tree expr, 
tsubst_flags_t complain)
   if (non_dep)
 expr = instantiate_non_dependent_expr_internal (expr, complain);
 
-  if (value_dependent_expression_p (expr))
+  const bool val_dep_p = value_dependent_expression_p (expr);
+  if (val_dep_p)
 expr = canonicalize_expr_argument (expr, complain);
 
   /* 14.3.2/5: The null pointer{,-to-member} conversion is applied
@@ -7100,10 +7101,17 @@ convert_nontype_argument (tree type, tree expr, 
tsubst_flags_t complain)
 a conversion function with a value-dependent argument, which
 could invoke taking the address of a temporary representing
 the result of the conversion.  */
- if (COMPOUND_LITERAL_P (expr)
- && CONSTRUCTOR_IS_DEPENDENT (expr)
- && MAYBE_CLASS_TYPE_P (expr_type)
- && TYPE_HAS_CONVERSION (expr_type))
+ if (!same_type_ignoring_top_level_qualifiers_p (type, expr_type)
+ && ((COMPOUND_LITERAL_P (expr)
+  && CONSTRUCTOR_IS_DEPENDENT (expr)
+  && MAYBE_CLASS_TYPE_P (expr_type)
+  && TYPE_HAS_CONVERSION (expr_type))
+ /* Similarly, converting e.g. an integer to a class
+involves a constructor call.  convert_like would
+create a TARGET_EXPR, but in a template we can't
+use AGGR_INIT_EXPR, and the TARGET_EXPR would lead
+to a bogus error.  */
+ || (val_dep_p && MAYBE_CLASS_TYPE_P (type
{
  expr = build1 (IMPLICIT_CONV_EXPR, type, expr);
  IMPLICIT_CONV_EXPR_NONTYPE_ARG (expr) = true;
@@ -7189,8 +7197,7 @@ convert_nontype_argument (tree type, tree expr, 
tsubst_flags_t complain)
 
   /* Notice that there are constant expressions like '4 % 0' which
 do not fold into integer constants.  */
-  if (TREE_CODE (expr) != INTEGER_CST
- && !value_dependent_expression_p (expr))
+  if (TREE_CODE (expr) != INTEGER_CST && !val_dep_p)
{
  if (complain & tf_error)
{
@@ -7265,7 +7272,7 @@ convert_nontype_argument (tree type, tree expr, 
tsubst_flags_t complain)
Here, we do 

Re: [PATCH] testsuite: XFAIL gcc.dg/torture/pr93133.c for powerpc*-*-* [PR93393]

2020-01-28 Thread Segher Boessenkool
On Tue, Jan 28, 2020 at 10:43:24AM +, Richard Sandiford wrote:
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> OK to install?

Yes, thank you.


Segher


> 2020-01-28  Richard Sandiford  
> 
> gcc/testsuite/
>   PR testsuite/93393
>   * gcc.dg/torture/pr93133.c: XFAIL for powerpc*-*-*.
> ---
>  gcc/testsuite/gcc.dg/torture/pr93133.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/torture/pr93133.c 
> b/gcc/testsuite/gcc.dg/torture/pr93133.c
> index 21eae1eb3dd..19af6b8ee6d 100644
> --- a/gcc/testsuite/gcc.dg/torture/pr93133.c
> +++ b/gcc/testsuite/gcc.dg/torture/pr93133.c
> @@ -1,4 +1,4 @@
> -/* { dg-do run } */
> +/* { dg-do run { xfail powerpc*-*-* } } */
>  /* { dg-add-options ieee } */
>  /* { dg-require-effective-target fenv_exceptions } */
>  /* { dg-skip-if "fenv" { powerpc-ibm-aix* } } */


Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2020-01-28 Thread Jeff Law
On Thu, 2019-10-17 at 16:14 +0300, Alexander Monakov wrote:
> On Tue, 8 Oct 2019, Alexander Monakov wrote:
> 
> [massive snip]
> 
> > So in my opinion our CFG is good enough, the real issues with -Wclobbered 
> > false
> > positives are not due to phi nodes but other effects.
> > 
> > If you agree: what would be the next steps?
> 
> Hello,
> 
> may I ping this discussion?  I apologize for letting my cranky mood that day 
> leak
> too badly into the message.
No worries at all.  Just too much going on.  Happy to try and kick
things moving.  Particularly in the context of 21161 and 61118 which
are regressions.

jeff




Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2020-01-28 Thread Jeff Law
On Tue, 2019-10-08 at 18:04 +0300, Alexander Monakov wrote:
> On Thu, 3 Oct 2019, Jeff Law wrote:
> 
> > You may want to review the 2018 discussion:
> > 
> > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185287.html
> > 
> > THe 2018 discussion was primarily concerned with fixing the CFG
> > inaccuracies which would in turn fix 61118.  But would not fix 21161.
> 
> Well, PR 61118 was incorrectly triaged to begin with.
> 
> Let me show starting from a simplified testcase:
> 
> void bar(jmp_buf);
> void foo(int n)
> {
> jmp_buf jb;
> for (; n; n--)
> if (setjmp(jb))
> bar(jb);
> }
> 
> Here we warn that "parameter 'n' might be clobbered..." despite that
> "obviously" there is no way 'n' might be modified between setjmp and
> bar... right?
Not necessarily. You have to look at what objects are live.  N is
clearly going to be live throughout the entire loop and thus it will be
subject to setjmp/longjmp warnings.

The inaccuracies in our CFG at the setjmp call cause objects which are
not actually live across the call to appear to be live.  See PR 21161.
> 
> Now suppose 'bar' looks like
[ ... ]
It doesn't really matter what bar looks like.

> 
> 
> Still, I can imagine one can argue that the warning in the PR is bogus, as
> the loop assigns invariant values to problematic variables:
I'm not sure which PR you're referring to (21161, 65041 or some other?)


> 
> void foo(struct ctx *arg)
> {
> while (arg->cond) {
> void *var1 = >field;
> void (*var2)(void*) = globalfn;
> jmp_buf jb;
> if (!setjmp(jb))
> var2(var1);
> }
> etc(arg);
> }
> 
> and since "obviously" >field is invariant, there's no way setjmp will
> overwrite 'var1' with a different value, right?.. Except:
> 
> If the while loop is not entered, a longjmp from inside 'etc' would restore
> default (uninitialized) values; note the compiler doesn't pay attention to the
> lifetime of jmp_buf, it simply assumes 'etc' may cause setjmp to return
> abnormally (this is correct in case of other returns_twice functions such as
> vfork).
Whether or not we should warn for restoring an uninitialized state I
think is somethign we haven't really considered.  Depending on how many
assignments to the pseudo there are and the exact control flow we may
or may not warn.

> 
> 
> So the bottom line here that setjmps in loops are extra tricky, warnings for
> them that superficially appear to be false positives may indicate a real 
> issue,
> and we can do a better job documenting that and recommending safer patterns.
I don't think setjmps in loops add any real additional complexity.


> 
> 
> Now regarding claims that we emit "incorrect" CFG that is causing extra phi
> nodes, and that "fixing CFG" would magically eliminate those and help with
> false positives.
> 
> While in principle it is perhaps nicer to have IR that more closely models
> the abnormal flow, I don't see how it would reduce phi nodes; we'd go from
It certainly does move PHI nodes and it can reduce the number of false
positives.  It's not as effective as it could be because of lameness on
the RTL side (again see PR21161).

Jeff



[committed] add test for PR 93437

2020-01-28 Thread Martin Sebor

Done in r10-6309-g4dd27b527c503aa50909fe1eb7d308266b1e103a.

Martin


[COMMITTED] c++: Fix return deduction of lambda in discarded stmt.

2020-01-28 Thread Jason Merrill
A return statement in a discarded statement is not used for return type
deduction, but we still want to do deduction for a return statement in a
lambda in a discarded statement.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/93442
* parser.c (cp_parser_lambda_expression): Clear in_discarded_stmt.
---
 gcc/cp/parser.c  |  6 ++
 .../g++.dg/cpp1z/constexpr-if-lambda1.C  | 16 
 2 files changed, 22 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda1.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 72037ee7b46..b8327823777 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -10530,6 +10530,10 @@ cp_parser_lambda_expression (cp_parser* parser)
 parser->implicit_template_scope = 0;
 parser->auto_is_implicit_function_template_parm_p = false;
 
+/* The body of a lambda in a discarded statement is not discarded.  */
+bool discarded = in_discarded_stmt;
+in_discarded_stmt = 0;
+
 /* By virtue of defining a local class, a lambda expression has access to
the private variables of enclosing classes.  */
 
@@ -10560,6 +10564,8 @@ cp_parser_lambda_expression (cp_parser* parser)
 
 finish_struct (type, /*attributes=*/NULL_TREE);
 
+in_discarded_stmt = discarded;
+
 parser->num_template_parameter_lists = saved_num_template_parameter_lists;
 parser->in_statement = in_statement;
 parser->in_switch_statement_p = in_switch_statement_p;
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda1.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda1.C
new file mode 100644
index 000..64c4cd27fe6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda1.C
@@ -0,0 +1,16 @@
+// PR c++/93442
+// { dg-do compile { target c++17 } }
+
+struct bar {
+int foo(){return 0;}
+};
+
+int foobar() {
+if constexpr(true) {
+return 0;
+} else {
+return [](){
+return bar{};
+}().foo();
+}
+}

base-commit: 5aebfb71763c7c8d0bb96adcd0a5f94de96a2a13
-- 
2.18.1



[COMMITTED] c++: Fix guard variable and attribute weak.

2020-01-28 Thread Jason Merrill
My patch for PR 91476 worked for decls that are implicitly comdat/weak due
to C++ linkage rules, but broke variables explicitly marked weak.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/93477
PR c++/91476
* decl2.c (copy_linkage): Do copy DECL_ONE_ONLY and DECL_WEAK.
---
 gcc/cp/decl2.c|  8 ++--
 gcc/testsuite/g++.dg/abi/guard4.C | 11 +++
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/abi/guard4.C

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 1ecf0b937d5..98d8e6a6b53 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3228,8 +3228,12 @@ copy_linkage (tree guard, tree decl)
 {
   CP_DECL_THREAD_LOCAL_P (guard) = CP_DECL_THREAD_LOCAL_P (decl);
   set_decl_tls_model (guard, DECL_TLS_MODEL (decl));
-  /* We can't rely on DECL_WEAK (decl) or DECL_ONE_ONLY (decl) here, as
-they may not be set until import_export_decl at EOF.  */
+  if (DECL_ONE_ONLY (decl))
+   make_decl_one_only (guard, cxx_comdat_group (guard));
+  if (TREE_PUBLIC (decl))
+   DECL_WEAK (guard) = DECL_WEAK (decl);
+  /* Also check vague_linkage_p, as DECL_WEAK and DECL_ONE_ONLY might not
+be set until import_export_decl at EOF.  */
   if (vague_linkage_p (decl))
comdat_linkage (guard);
   DECL_VISIBILITY (guard) = DECL_VISIBILITY (decl);
diff --git a/gcc/testsuite/g++.dg/abi/guard4.C 
b/gcc/testsuite/g++.dg/abi/guard4.C
new file mode 100644
index 000..537c90524f3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/guard4.C
@@ -0,0 +1,11 @@
+// PR c++/93477
+// { dg-require-weak }
+
+namespace x {
+  struct s {
+s() {}
+static int a;
+  };
+  // { dg-final { scan-assembler {.weak[^\n]*_ZGVN1x1bE} } }
+  struct s __attribute__((weak)) b = s();
+}

base-commit: 5aebfb71763c7c8d0bb96adcd0a5f94de96a2a13
-- 
2.18.1



[committed] analyzer: fix ICE when longjmp isn't marked 'noreturn' (PR 93316)

2020-01-28 Thread David Malcolm
Comments 11-16 within PR analyzer/93316 discuss an ICE in some setjmp
tests seen on AIX and powerpc-darwin9.

The issue turned out to be an implicit assumption that longjmp is
marked "noreturn".  There are two places in engine.cc where the code
attempted to locate the longjmp GIMPLE_CALL by finding the final stmt
within its supernode, in one place casting it via "as_a ",
in the other using it as the location_t of the
"rewinding from longjmp..." event.

When longjmp isn't marked noreturn, its basic block and hence supernode
can have additional stmts after the longjmp; in the setjmp-3.c case
this was a GIMPLE_RETURN, leading to a ICE when casting this to a
GIMPLE_CALL.

This patch fixes the two places in question to use the
  rewind_info_t::get_longjmp_call
member function introduced by 342e14ffa30e9163a1a75e0a4fb21b6883d58dbe,
fixing the ICE (and ensuring the correct location_t is used for events).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Reproduced the bug and verified the fix with a stage 1 build on
powerpc-ibm-aix7.2.0.0 (gcc111), where there were no remaining FAILs in
analyzer.exp after the patch.

Committed to master as r10-6305-g5aebfb71763c7c8d0bb96adcd0a5f94de96a2a13.

gcc/analyzer/ChangeLog:
PR analyzer/93316
* engine.cc (rewind_info_t::update_model): Get the longjmp call
stmt via get_longjmp_call () rather than assuming it is the last
stmt in the longjmp's supernode.
(rewind_info_t::add_events_to_path): Get the location_t for the
rewind_from_longjmp_event via get_longjmp_call () rather than from
the supernode's get_end_location ().
---
 gcc/analyzer/engine.cc | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 2bc0aff6a6e..9acec704224 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -1333,21 +1333,13 @@ void
 rewind_info_t::update_model (region_model *model,
 const exploded_edge )
 {
-  const exploded_node _enode = *eedge.m_src;
-  const program_point _point = src_enode.get_point ();
-
-  const gimple *last_stmt
-= src_point.get_supernode ()->get_last_stmt ();
-  gcc_assert (last_stmt);
-  const gcall *longjmp_call = as_a  (last_stmt);
-
   const program_point _point = eedge.m_src->get_point ();
   const program_point _point = eedge.m_dest->get_point ();
 
   gcc_assert (longjmp_point.get_stack_depth ()
  >= setjmp_point.get_stack_depth ());
 
-  model->on_longjmp (longjmp_call,
+  model->on_longjmp (get_longjmp_call (),
 get_setjmp_call (),
 setjmp_point.get_stack_depth (), NULL);
 }
@@ -1368,7 +1360,7 @@ rewind_info_t::add_events_to_path (checker_path 
*emission_path,
 
   emission_path->add_event
 (new rewind_from_longjmp_event
- (, src_point.get_supernode ()->get_end_location (),
+ (, get_longjmp_call ()->location,
   src_point.get_fndecl (),
   src_stack_depth, this));
   emission_path->add_event
-- 
2.21.0



[PATCH] coroutines: Prevent repeated error messages for missing promise.

2020-01-28 Thread Iain Sandoe
While looking at Pr93458, I spotted the following non-fatal, but 
unhelpful diagnostic output.

If the user's coroutine return type omits the mandatory promise
type then we will currently restate that error each time we see
a coroutine keyword, which doesn't provide any new information.
This suppresses all but the first instance in each coroutine.

tested on x86_64-darwin16
OK for master?
thanks
Iain

gcc/cp/ChangeLog:

2020-01-28  Iain Sandoe  

* coroutines.cc (find_promise_type): Delete unused forward
declaration.
(struct coroutine_info): Add a bool for no promise type error.
(coro_promise_type_found_p): Only emit the error for a missing
promise once in each affected coroutine.

gcc/testsuite/ChangeLog:

2020-01-28  Iain Sandoe  

* g++.dg/coroutines/coro-missing-promise.C: New test.
---
 gcc/cp/coroutines.cc  |  7 +--
 .../g++.dg/coroutines/coro-missing-promise.C  | 20 +++
 2 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/coro-missing-promise.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 89ecea91b9a..a56bbc883ca 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -33,7 +33,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "gcc-rich-location.h"
 #include "hash-map.h"
 
-static tree find_promise_type (tree);
 static bool coro_promise_type_found_p (tree, location_t);
 
 /* GCC C++ coroutines implementation.
@@ -93,6 +92,7 @@ struct GTY((for_user)) coroutine_info
function into a coroutine.  */
   /* Flags to avoid repeated errors for per-function issues.  */
   bool coro_ret_type_error_emitted;
+  bool coro_promise_error_emitted;
 };
 
 struct coroutine_info_hasher : ggc_ptr_hash
@@ -447,7 +447,10 @@ coro_promise_type_found_p (tree fndecl, location_t loc)
   /* If we don't find it, punt on the rest.  */
   if (coro_info->promise_type == NULL_TREE)
{
- error_at (loc, "unable to find the promise type for this coroutine");
+ if (!coro_info->coro_promise_error_emitted)
+   error_at (loc, "unable to find the promise type for"
+ " this coroutine");
+ coro_info->coro_promise_error_emitted = true;
  return false;
}
 
diff --git a/gcc/testsuite/g++.dg/coroutines/coro-missing-promise.C 
b/gcc/testsuite/g++.dg/coroutines/coro-missing-promise.C
new file mode 100644
index 000..3fc21a4e2ca
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/coro-missing-promise.C
@@ -0,0 +1,20 @@
+//  { dg-additional-options "-fsyntax-only -w" }
+
+#include "coro.h"
+
+// Diagnose completely missing promise.
+
+// { dg-error {no type named 'promise_type' in 'struct NoPromiseHere'} "" { 
target *-*-* } 0 }
+
+struct NoPromiseHere {
+  coro::coroutine_handle<> handle;
+  NoPromiseHere () : handle (nullptr) {}
+  NoPromiseHere (coro::coroutine_handle<> handle) : handle (handle) {}
+};
+
+NoPromiseHere
+bar ()
+{
+  co_yield 22; // { dg-error {unable to find the promise type for this 
coroutine} }
+  co_return 0;
+}
-- 
2.24.1



Patch to fix PR93272

2020-01-28 Thread Vladimir Makarov

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93272

The patch was successfully tested and bootstrapped on x86_64.

Unfortunately it is hard to create a test case for the patch.  So there 
is no test for this PR.



commit 5c8a1211b9873a1b69ef7b2fddae181535bc3b0a (HEAD -> master, origin/master, origin/HEAD)
Author: Vladimir N. Makarov 
Date:   Tue Jan 28 15:43:44 2020 -0500

Fix for PR93272 - LRA: EH reg allocated to hold local variable

2020-01-28  Vladimir Makarov  

PR rtl-optimization/93272
* ira-lives.c (process_out_of_region_eh_regs): New function.
(process_bb_node_lives): Call it.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 326250b9b96..8d60dcf0864 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2020-01-28  Vladimir Makarov  
+
+	PR rtl-optimization/93272
+	* ira-lives.c (process_out_of_region_eh_regs): New function.
+	(process_bb_node_lives): Call it.
+
 2020-01-28  Jan Hubicka  
 
 	* coverage.c (read_counts_file): Make error message lowercase.
diff --git a/gcc/ira-lives.c b/gcc/ira-lives.c
index 155f825ea8d..f776fd2342f 100644
--- a/gcc/ira-lives.c
+++ b/gcc/ira-lives.c
@@ -1160,6 +1160,50 @@ non_conflicting_reg_copy_p (rtx_insn *insn)
   return SET_SRC (set);
 }
 
+#ifdef EH_RETURN_DATA_REGNO
+
+/* Add EH return hard registers as conflict hard registers to allocnos
+   living at end of BB.  For most allocnos it is already done in
+   process_bb_node_lives when we processing input edges but it does
+   not work when and EH edge is edge out of the current region.  This
+   function covers such out of region edges. */
+static void
+process_out_of_region_eh_regs (basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+  unsigned int i;
+  bitmap_iterator bi;
+  bool eh_p = false;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+if ((e->flags & EDGE_EH)
+	&& IRA_BB_NODE (e->dest)->parent != IRA_BB_NODE (bb)->parent)
+  eh_p = true;
+
+  if (! eh_p)
+return;
+
+  EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb), FIRST_PSEUDO_REGISTER, i, bi)
+{
+  ira_allocno_t a = ira_curr_regno_allocno_map[i];
+  for (int n = ALLOCNO_NUM_OBJECTS (a) - 1; n >= 0; n--)
+	{
+	  ira_object_t obj = ALLOCNO_OBJECT (a, n);
+	  for (int k = 0; ; k++)
+	{
+	  unsigned int regno = EH_RETURN_DATA_REGNO (k);
+	  if (regno == INVALID_REGNUM)
+		break;
+	  SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
+	  SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno);
+	}
+	}
+}
+}
+
+#endif
+
 /* Process insns of the basic block given by its LOOP_TREE_NODE to
update allocno live ranges, allocno hard register conflicts,
intersected calls, and register pressure info for allocnos for the
@@ -1213,6 +1257,10 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)
   EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi)
 	mark_pseudo_regno_live (j);
 
+#ifdef EH_RETURN_DATA_REGNO
+  process_out_of_region_eh_regs (bb);
+#endif
+
   freq = REG_FREQ_FROM_BB (bb);
   if (freq == 0)
 	freq = 1;


Verify sanity of indirect call/topn profiles

2020-01-28 Thread Jan Hubicka
Hi,
I will try to reming this next stage1 since it is not regression fix.
I found it useful to have bit of sanity checking of the topn profiles to
work out the bugs in merging and updating that was there.

Honza

gcc/ChangeLog:

2020-01-28  Jan Hubicka  

* profile.c (compute_value_histograms): Verify profile sanity.

diff --git a/gcc/profile.c b/gcc/profile.c
index cd754c4c66a..782534e5ab4 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -856,10 +856,10 @@ compute_value_histograms (histogram_values values, 
unsigned cfg_checksum,
   gimple_add_histogram_value (cfun, stmt, hist);
   hist->hvalue.counters =  XNEWVEC (gcov_type, hist->n_counters);
   for (j = 0; j < hist->n_counters; j++)
-if (aact_count)
-  hist->hvalue.counters[j] = aact_count[j];
-else
-  hist->hvalue.counters[j] = 0;
+   if (aact_count)
+ hist->hvalue.counters[j] = aact_count[j];
+   else
+ hist->hvalue.counters[j] = 0;
 
   if (hist->type == HIST_TYPE_TOPN_VALUES
  || hist->type == HIST_TYPE_INDIR_CALL)
@@ -871,6 +871,26 @@ compute_value_histograms (histogram_values values, 
unsigned cfg_checksum,
= RDIV (hist->hvalue.counters[2 * i + 2], GCOV_TOPN_VALUES);
 
  sort_hist_values (hist);
+
+ /* Check profile sanity.  */
+ if (hist->hvalue.counters[2] != -1)
+   for (int i = 0; i < GCOV_TOPN_VALUES - 1 && ok; i++)
+ for (int j = i + 1; j < GCOV_TOPN_VALUES && ok; j++)
+   if ((hist->hvalue.counters[i * 2 + 1]
+== hist->hvalue.counters[j * 2 + 1]
+&& hist->hvalue.counters[i * 2 + 2]
+&& hist->hvalue.counters[j * 2 + 2])
+   || hist->hvalue.counters[i * 2 + 2] < 0)
+ {
+   if (hist->type == HIST_TYPE_TOPN_VALUES)
+ error_at (gimple_location (stmt),
+   "corrupted profile info:"
+   " invalid topn profile histogram");
+   else
+ error_at (gimple_location (stmt),
+   "corrupted profile info:"
+   " indirect call profile histogram");
+ }
}
 
   /* Time profiler counter is not related to any statement,


diagnostics: make error message lowercase.

2020-01-28 Thread Jan Hubicka
Hi,
I comitted this ias obvious.  I also wonder if the next error should be
info or should be part of the first error message.

* coverage.c (read_counts_file): Make error message lowercase.

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 5e961b26f66..f29ff640c43 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -265,7 +265,7 @@ read_counts_file (void)
  else if (entry->lineno_checksum != lineno_checksum
   || entry->cfg_checksum != cfg_checksum)
{
- error ("Profile data for function %u is corrupted", fn_ident);
+ error ("profile data for function %u is corrupted", fn_ident);
  error ("checksum is (%x,%x) instead of (%x,%x)",
 entry->lineno_checksum, entry->cfg_checksum,
 lineno_checksum, cfg_checksum);


profile: fix profile count dumping

2020-01-28 Thread Jan Hubicka
Hi,
this patch fixes dump of quality. Comitted as obvious.

* profile-count.h (profile_quality_display_names): Fix ordering.

diff --git a/gcc/profile-count.c b/gcc/profile-count.c
index 0c792297826..c89914ff8a0 100644
--- a/gcc/profile-count.c
+++ b/gcc/profile-count.c
@@ -78,9 +78,9 @@ const char *profile_quality_display_names[] =
   "estimated locally",
   "estimated locally, globally 0",
   "estimated locally, globally 0 adjusted",
-  "adjusted",
-  "auto FDO",
   "guessed",
+  "auto FDO",
+  "adjusted",
   "precise"
 };
 


[COMMITTED] c++: Allow template rvalue-ref conv to bind to lvalue ref.

2020-01-28 Thread Jason Merrill
When I implemented the [over.match.ref] rule that a reference conversion
function needs to match l/rvalue of the target reference type it changed our
handling of this testcase.  It seems to me that our current behavior is what
the standard says, but it doesn't seem desirable, and all the other
compilers have our old behavior.  So let's limit the change to non-templates
until there's some clarification from the committee.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/90546
* call.c (build_user_type_conversion_1): Allow a template conversion
returning an rvalue reference to bind directly to an lvalue.
---
 gcc/cp/call.c |  2 ++
 gcc/testsuite/g++.dg/cpp0x/rv-conv3.C | 15 +++
 2 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/rv-conv3.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 009cb85ad60..fde29f8f631 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4115,6 +4115,8 @@ build_user_type_conversion_1 (tree totype, tree expr, int 
flags,
   EXPR_LOCATION (expr));
}
  else if (TYPE_REF_P (totype) && !ics->rvaluedness_matches_p
+  /* Limit this to non-templates for now (PR90546).  */
+  && !cand->template_decl
   && TREE_CODE (TREE_TYPE (totype)) != FUNCTION_TYPE)
{
  /* If we are called to convert to a reference type, we are trying
diff --git a/gcc/testsuite/g++.dg/cpp0x/rv-conv3.C 
b/gcc/testsuite/g++.dg/cpp0x/rv-conv3.C
new file mode 100644
index 000..5f727fcc0d5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/rv-conv3.C
@@ -0,0 +1,15 @@
+// PR c++/90546
+// { dg-do link { target c++11 } }
+
+struct Foo { };
+void test(const Foo&) {}
+Foo f;
+struct Bar {
+  template  operator T&&();
+};
+template<> Bar::operator const Foo&&() {
+return static_cast(f);
+}
+int main() {
+  test(Bar{});
+}

base-commit: 0968f7da26221bfc9203cd557c8636c0a0f24018
-- 
2.18.1



ipa: fix handling of multiple speculations (PR93318)

2020-01-28 Thread Jan Hubicka
Hi,
This patch started as work to resole Richard's comment on quadratic lookups
in resolve_speculation. While doing it I however noticed multiple problems
in the new speuclative call code which made the patch quite big. In 
particular:
 1) Before applying speculation we consider only targets with at lest
probability 1/2.
If profile is sane at most two targets can have probability greater or
equal to 1/2. So the new multi-target speculation code got enabled only
in very special scenario when there ae precisely two target with precise
probability 1/2 (which is tested by the single testcase).

As a conseuqence the multiple target logic got minimal test coverage and
this made us to miss several ICEs.
 2) Profile updating in profile merging, tree-inline and indirect call
expansion was wrong which led to inconsistent profiles (as already seen
on the testcase).
 3) Code responsible to turn speculative call to direct call was broken for
anything with more than one target.
 4) There were multiple cases where call_site_hash went out of sync which
eventually leads to an ICE..
 5) Some code expects that all speculative call targets forms a sequence in
the callee linked list but there is no code to maintain that invariant
nor a verifier.
Fixing this it became obvious that the current API of speculative_call_info is
not useful because it really builds on fact tht there are precisely three
components (direct call, ref and indirect call) in every speculative call
sequence.  I ended up replacing it with iterator API for direct call
(first_speculative_call_target, next_speculative_call_target) and accessors for
the other coponents updating comment in cgraph.h.

Finally I made the work with call site hash more effetive by updating edge
manipulation to keep them in sequence. So first one can be looked up from the
hash and then they can be iterated by callee.

There are other things that can be improved (for example the speculation should
start with most common target first), but I will try to keep that for next
stage1. This patch is mostly about getting rid of ICE and profile corruption
which is a regression from GCC 9.

lto-bootstrapped/regtested x86_64-linux and also tested with Firefox and
hack for less conservative indirect call profile merging.

Honza

gcc/ChangeLog:

2020-01-28  Jan Hubicka  

PR lto/93318
* cgraph.c (cgraph_add_edge_to_call_site_hash): Update call site
hash only when edge is first within the sequence.
(cgraph_edge::set_call_stmt): Update handling of speculative calls.
(symbol_table::create_edge): Do not set target_prob.
(cgraph_edge::remove_caller): Watch for speculative calls when updating
the call site hash.
(cgraph_edge::make_speculative): Drop target_prob parameter.
(cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(update_call_stmt_hash_for_removing_direct_edge): New function.
(cgraph_edge::resolve_speculation): Rewrite to new API.
(cgraph_edge::speculative_call_for_target): New member function.
(cgraph_edge::make_direct): Rewrite to new API; fix handling of
multiple speculation targets.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise; fix updating
of profile.
(verify_speculative_call): Verify that targets form an interval.
* cgraph.h (cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(cgraph_edge::next_speculative_call_target): New member function.
(cgraph_edge::speculative_call_target_ref): New member function.
(cgraph_edge;:speculative_call_indirect_edge): New member funtion.
(cgraph_edge): Remove target_prob.
* cgraphclones.c (cgraph_node::set_call_stmt_including_clones):
Fix handling of speculative calls.
* ipa-devirt.c (ipa_devirt): Fix handling of speculative cals.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-inline.c (speculation_useful_p): Use new speculative call API.
* ipa-profile.c (dump_histogram): Fix formating.
(ipa_profile_generate_summary): Watch for overflows.
(ipa_profile): Do not require probablity to be 1/2; update to new API.
* ipa-prop.c (ipa_make_edge_direct_to_target): Update to new API.
(update_indirect_edges_after_inlining): Update to new API.
* ipa-utils.c (ipa_merge_profiles): Rewrite merging of speculative call
profiles.
* profile-count.h: (profile_probability::adjusted): New.
* tree-inline.c (copy_bb): Update to new speculative call API; fix
updating of profile.
* value-prof.c (gimple_ic_transform): Rename to ...
(dump_ic_profile): ... this one; update dumping.
(stream_in_histogram_value): Fix formating.

Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

2020-01-28 Thread H.J. Lu
On Tue, Jan 28, 2020 at 10:58 AM Jakub Jelinek  wrote:
>
> On Tue, Jan 28, 2020 at 10:20:36AM -0800, H.J. Lu wrote:
> > From 66c534dedc7a9a632aa38c32e3f7c251b8f2c778 Mon Sep 17 00:00:00 2001
> > From: "H.J. Lu" 
> > Date: Mon, 27 Jan 2020 09:35:11 -0800
> > Subject: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES
> >
> > movaps/movups is one byte shorter than movdaq/movdqu.  But it isn't the
> > case for AVX nor AVX512.  This patch prefers TARGET_AVX over
> > TARGET_SSE_TYPELESS_STORES and adjust vmovups checks in assembly ouputs.
>
> If you haven't committed yet, please fix the movdaq typo in the description
> (to movdqa).
>

Will do.

Thanks.

-- 
H.J.


Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

2020-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2020 at 10:20:36AM -0800, H.J. Lu wrote:
> From 66c534dedc7a9a632aa38c32e3f7c251b8f2c778 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" 
> Date: Mon, 27 Jan 2020 09:35:11 -0800
> Subject: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES
> 
> movaps/movups is one byte shorter than movdaq/movdqu.  But it isn't the
> case for AVX nor AVX512.  This patch prefers TARGET_AVX over
> TARGET_SSE_TYPELESS_STORES and adjust vmovups checks in assembly ouputs.

If you haven't committed yet, please fix the movdaq typo in the description
(to movdqa).

Jakub



Re: [PATCH] diagnostic_metadata: unbreak xgettext (v2)

2020-01-28 Thread David Malcolm
On Tue, 2020-01-28 at 16:34 +0100, Jakub Jelinek wrote:
> On Tue, Jan 28, 2020 at 10:30:58AM -0500, David Malcolm wrote:
> > gcc/analyzer/ChangeLog:
> > * region-model.cc (poisoned_value_diagnostic::emit): Update for
> > renaming of warning_at overload to warning_meta.
> > * sm-file.cc (file_leak::emit): Likewise.
> > * sm-malloc.cc (double_free::emit): Likewise.
> > (possible_null_deref::emit): Likewise.
> > (possible_null_arg::emit): Likewise.
> > (null_deref::emit): Likewise.
> > (null_arg::emit): Likewise.
> > (use_after_free::emit): Likewise.
> > (malloc_leak::emit): Likewise.
> > (free_of_non_heap::emit): Likewise.
> > * sm-sensitive.cc (exposure_through_output_file::emit):
> > Likewise.
> > * sm-signal.cc (signal_unsafe_call::emit): Likewise.
> > * sm-taint.cc (tainted_array_index::emit): Likewise.
> > 
> > gcc/ChangeLog:
> > * diagnostic-core.h (warning_at): Rename overload to...
> > (warning_meta): ...this.
> > (emit_diagnostic_valist): Delete decl of overload taking
> > diagnostic_metadata.
> > * diagnostic.c (emit_diagnostic_valist): Likewise for defn.
> > (warning_at): Rename overload taking diagnostic_metadata to...
> > (warning_meta): ...this.
> > 
> > gcc/testsuite/ChangeLog:
> > * gcc.dg/plugin/diagnostic_plugin_test_metadata.c: Update for
> > renaming of warning_at overload to warning_meta.
> > * gcc.dg/plugin/diagnostic_plugin_test_paths.c: Likewise.
> 
> LGTM, thanks

Thanks.   B succeeded, so pushed to master
(6c8e584430bc5dc01b4438f3c38a2a10fcba7685).



Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

2020-01-28 Thread H.J. Lu
On Tue, Jan 28, 2020 at 10:04 AM Uros Bizjak  wrote:
>
> On Tue, Jan 28, 2020 at 6:51 PM H.J. Lu  wrote:
> >
> > On Tue, Jan 28, 2020 at 9:12 AM Uros Bizjak  wrote:
> > >
> > > On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu  wrote:
> > >
> > > > > You could move
> > > > >
> > > > > (match_test "TARGET_AVX")
> > > > >   (const_string "TI")
> > > > >
> > > > > up to bypass the cases below.
> > > > >
> > > >
> > > > I don't think we can do that.   There are 2 cases where we prefer 
> > > > movaps/movups:
> > > >
> > > > /* Use packed single precision instructions where posisble.  I.e.
> > > > movups instead   of movupd.  */
> > > > DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL,
> > > > "sse_packed_single_insn_optimal",
> > > >   m_BDVER | m_ZNVER)
> > > >
> > > > /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit 
> > > > stores.   */
> > > > DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores",
> > > >   m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC)
> > > >
> > > > We should always use movaps/movups for 
> > > > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL.
> > > > It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with 
> > > > TARGET_AVX
> > > > as m_BDVER | m_ZNVER support AVX.
> > >
> > > The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD target is
> > > only insn size, as advised in e.g. Software Optimization Guide for the
> > > AMD Family 15h Processors [1], section 7.1.2, where it is said:
> > >
> > > --quote--
> > > 7.1.2 Reduce Instruction SizeOptimization
> > >
> > > Reduce the size of instructions when possible.
> > >
> > > Rationale
> > >
> > > Using smaller instruction sizes improves instruction fetch throughput.
> > > Specific examples include the following:
> > >
> > > *In SIMD code, use the single-precision (PS) form of instructions
> > > instead of the double-precision (PD) form. For example, for register
> > > to register moves, MOVAPS achieves the same result as MOVAPD, but uses
> > > one less byte to encode the instruction and has no prefix byte. Other
> > > examples in which single-precision forms can be substituted for
> > > double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS,
> > > and SHUFPS.
> > > ...
> > > --/quote--
> > >
> > > Please note that this optimization applies only to non-AVX forms, as
> > > demonstrated by:
> > >
> > >0:   0f 28 c8movaps %xmm0,%xmm1
> > >3:   66 0f 28 c8 movapd %xmm0,%xmm1
> > >7:   c5 f8 28 d1 vmovaps %xmm1,%xmm2
> > >b:   c5 f9 28 d1 vmovapd %xmm1,%xmm2
> > >
> > > Also note that MOVDQA is missing in the above optimization. It is
> > > harmful to substitute MOVDQA with MOVAPS, as it can (and does)
> > > introduce +1 cycle forwarding penalty between FLT (FPA/FPM) and INT
> > > (VALU) FP clusters.
> > >
> > > Following the above optimization, it is obvious that
> > > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling was cargo-culted from
> > > one pattern to another. Its use should be reviewed and fixed where not
> > > appropriate.
> > >
> > > [1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf
> > >
> > > Uros.
> >
> > Here is the updated patch which moves TARGET_AVX before
> > TARGET_SSE_TYPELESS_STORES.   OK for master if there is
> > no regression?
> >
> > Thanks.
>
>
> +   (match_test "TARGET_AVX")
> + (const_string "")
> (and (match_test " == 16")
>
> Only MODE_SIZE == 16 cases will be left here, since TARGET_AVX is
> necessary for MODE_SIZE > 16. This test can be removed.
>
> OK with the above change.
>

This is the patch I am going to check in.

Thanks.

-- 
H.J.
From 66c534dedc7a9a632aa38c32e3f7c251b8f2c778 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 27 Jan 2020 09:35:11 -0800
Subject: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

movaps/movups is one byte shorter than movdaq/movdqu.  But it isn't the
case for AVX nor AVX512.  This patch prefers TARGET_AVX over
TARGET_SSE_TYPELESS_STORES and adjust vmovups checks in assembly ouputs.

gcc/

	PR target/91461
	* config/i386/i386.md (*movoi_internal_avx): Remove
	TARGET_SSE_TYPELESS_STORES check.
	(*movti_internal): Prefer TARGET_AVX over
	TARGET_SSE_TYPELESS_STORES.
	(*movtf_internal): Likewise.
	* config/i386/sse.md (mov_internal): Prefer TARGET_AVX over
	TARGET_SSE_TYPELESS_STORES.  Remove " == 16" check
	from TARGET_SSE_TYPELESS_STORES.

gcc/testsuite/

	PR target/91461
	* gcc.target/i386/avx256-unaligned-store-2.c: Don't check
	vmovups.
	* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
	* gcc.target/i386/pieces-memcpy-4.c: Likewise.
	* gcc.target/i386/pieces-memcpy-5.c: Likewise.
	* gcc.target/i386/pieces-memcpy-6.c: Likewise.
	* gcc.target/i386/pieces-strcpy-2.c: Likewise.
	* gcc.target/i386/pr90980-1.c: Likewise.
	* gcc.target/i386/pr87317-4.c: Check "\tvmovd\t" instead of
	"vmovd" to avoid matching "vmovdqu".
	* gcc.target/i386/pr87317-5.c: Likewise.
	* gcc.target/i386/pr87317-7.c: Likewise.
	* 

Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Andrew Benson
Hi Tobias,

Thanks - committed as 

https://gcc.gnu.org/git/gitweb.cgi?
p=gcc.git;a=commit;h=ad690d79cfbb905c5546c9333c5fd089d906505b

I'll send a separate email about changes to the release notes.

-Andrew

On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote:
> Hi Andrew,
> 
> On 1/28/20 5:41 PM, Andrew Benson wrote:
> >> Can you also update the comment before the #define? It currently has:
> > Thanks for the suggestion. An updated patch is attached.
> 
> Thanks. LGTM now.
> 
> >> PS: I wonder whether there are relevant systems which will fail because
> >> they do not handle that long symbol names...
> > 
> > Good question. Should I hold off on committing the patch until this can be
> > tested further? Or should I just go ahead and commit and deal with any
> > such
> > problems if they show up?
> 
> I think it is fine – as a user, one can always choose a shorter name if
> the system one is interested in doesn't support it. Besides, following
> the current scheme, we do not really have a choice without breaking the ABI.
> > Also, Richard Biener raised the question (in the PR) of whether this patch
> > would be an ABI change. I can see that it probably would be, but don't
> > know
> > for sure. If it would change the ABI is there anything else that I need to
> > include in the patch?
> 
> As just mentioned on the PR, I think it is a small ABI change – before a
> very long identified got cut off now it is available in full extent. — I
> do not see any simple way to avoid this ABI change and, frankly, I also
> do not think any real-world program uses that long identifiers.
> 
> Namely, instead of using 3 times the length of 42 character
> ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop") the standard permits 3
> times the length of 63 characters
> (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890_).
> Already 42 is quite long for a module, submodule or procedure name. And
> that does work even without the patch. And as only the sum counts, if
> one part only uses 10 characters (e.g. "my_module5"), now 2*58
> characters available two others.
> 
> If at the end it really causes problems (source code not available), the
> user still can modify the module file and change the procedure name to
> the trimmed value and continue.
> 
> Thus, I do not think it is a real problem in practice. We could be nice,
> however, and add a note to the release notes (i.e.
> https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether
> it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git
> about how to change WWW files and CC Gerald as web maintainer when
> submitting wwwdocs patches.]
> 
> Thanks,
> 
> Tobias
> 
> > patch.diff
> > 
> > diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
> > index 52bc045..69171f3 100644
> > --- a/gcc/fortran/trans.h
> > +++ b/gcc/fortran/trans.h
> > @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
> > 
> >   #include "predict.h"  /* For enum br_predictor and PRED_*.  */
> > 
> > -/* Mangled symbols take the form __module__name.  */
> > -#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
> > +/* Mangled symbols take the form __module__name or
> > __module.submodule__name.  */ +#define GFC_MAX_MANGLED_SYMBOL_LEN 
> > (GFC_MAX_SYMBOL_LEN*3+5)
> > 
> >   /* Struct for holding a block of statements.  It should be treated as an
> >   
> >  opaque entity and not modified directly.  This allows us to change
> >  the
> > 
> > diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90
> > b/gcc/testsuite/gfortran.dg/pr93461.f90 new file mode 100644
> > index 000..3bef326
> > --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/pr93461.f90
> > @@ -0,0 +1,22 @@
> > +! { dg-do compile }
> > +! PR fortran/93461
> > +module aModuleWithAnAllowedName
> > +  interface
> > + module subroutine aShortName()
> > + end subroutine aShortName
> > +  end interface
> > +end module aModuleWithAnAllowedName
> > +
> > +submodule (aModuleWithAnAllowedName)
> > aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName +contains
> > +  subroutine aShortName()
> > +call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
> > +call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
> > +  end subroutine aShortName
> > +
> > +  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
> > +  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
> > +
> > +  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
> > +  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso
> > +end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
> > 
> > ChangeLog
> > 
> > 2020-01-28  Andrew Benson
> > 
> > PR fortran/93461
> > * trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
> > GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
> > plus the "." between module and submodule names.
> > 
> > * gfortran.dg/pr93461.f90: New test.


-- 

* Andrew Benson: 

Re: [patch, fortran] PR93473 - ICE on valid with long module + submodule names

2020-01-28 Thread Andrew Benson
Thanks - committed as a83b5cc5828ee34471de415e8893242dd3b0a91b 

https://gcc.gnu.org/git/gitweb.cgi?
p=gcc.git;a=commit;h=a83b5cc5828ee34471de415e8893242dd3b0a91b

I'll close the PR.

Thanks,
Andrew

On Tuesday, January 28, 2020 6:32:21 PM PST Tobias Burnus wrote:
> LGTM now.
> 
> Thanks,
> 
> Tobias
> 
> On 1/28/20 5:41 PM, Andrew Benson wrote:
> > On Tuesday, January 28, 2020 9:27:58 AM PST Tobias Burnus wrote:
> >> On 1/28/20 12:41 AM, Andrew Benson wrote:
> >>> The problem occurs in set_syms_host_assoc() where the "parent1" and
> >>> "parent2" variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This
> >>> is insufficient when the parent names are a module+submodule name
> >>> concatenated with a ".". The patch above fixes this by increasing their
> >>> length to 2*GFC_MAX_SYMBOL_LEN+2.
> >>> 
> >>> A patch to fix this is attached. The patch regression tests cleanly - ok
> >>> to
> >>> commit?
> >> 
> >> The patch is okay, but can you add a comment – similar to the other
> >> patch – which makes clear why one needs twice the normal
> >> 
> >> GFC_MAX_SYMBOL_LEN (+2)? You currently have:
> >>>  const char dot[2] = ".";
> >>> 
> >>> -  char parent1[GFC_MAX_SYMBOL_LEN + 1];
> >>> -  char parent2[GFC_MAX_SYMBOL_LEN + 1];
> >>> +  char parent1[2 * GFC_MAX_SYMBOL_LEN + 2];
> >>> +  char parent2[2 * GFC_MAX_SYMBOL_LEN + 2];
> > 
> > Yes - I've added a comment here explaining the naming convention. An
> > updated patch is attached.
> > 
> > -Andrew


-- 

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus



Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

2020-01-28 Thread Uros Bizjak
On Tue, Jan 28, 2020 at 6:51 PM H.J. Lu  wrote:
>
> On Tue, Jan 28, 2020 at 9:12 AM Uros Bizjak  wrote:
> >
> > On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu  wrote:
> >
> > > > You could move
> > > >
> > > > (match_test "TARGET_AVX")
> > > >   (const_string "TI")
> > > >
> > > > up to bypass the cases below.
> > > >
> > >
> > > I don't think we can do that.   There are 2 cases where we prefer 
> > > movaps/movups:
> > >
> > > /* Use packed single precision instructions where posisble.  I.e.
> > > movups instead   of movupd.  */
> > > DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL,
> > > "sse_packed_single_insn_optimal",
> > >   m_BDVER | m_ZNVER)
> > >
> > > /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores.  
> > >  */
> > > DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores",
> > >   m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC)
> > >
> > > We should always use movaps/movups for 
> > > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL.
> > > It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with 
> > > TARGET_AVX
> > > as m_BDVER | m_ZNVER support AVX.
> >
> > The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD target is
> > only insn size, as advised in e.g. Software Optimization Guide for the
> > AMD Family 15h Processors [1], section 7.1.2, where it is said:
> >
> > --quote--
> > 7.1.2 Reduce Instruction SizeOptimization
> >
> > Reduce the size of instructions when possible.
> >
> > Rationale
> >
> > Using smaller instruction sizes improves instruction fetch throughput.
> > Specific examples include the following:
> >
> > *In SIMD code, use the single-precision (PS) form of instructions
> > instead of the double-precision (PD) form. For example, for register
> > to register moves, MOVAPS achieves the same result as MOVAPD, but uses
> > one less byte to encode the instruction and has no prefix byte. Other
> > examples in which single-precision forms can be substituted for
> > double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS,
> > and SHUFPS.
> > ...
> > --/quote--
> >
> > Please note that this optimization applies only to non-AVX forms, as
> > demonstrated by:
> >
> >0:   0f 28 c8movaps %xmm0,%xmm1
> >3:   66 0f 28 c8 movapd %xmm0,%xmm1
> >7:   c5 f8 28 d1 vmovaps %xmm1,%xmm2
> >b:   c5 f9 28 d1 vmovapd %xmm1,%xmm2
> >
> > Also note that MOVDQA is missing in the above optimization. It is
> > harmful to substitute MOVDQA with MOVAPS, as it can (and does)
> > introduce +1 cycle forwarding penalty between FLT (FPA/FPM) and INT
> > (VALU) FP clusters.
> >
> > Following the above optimization, it is obvious that
> > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling was cargo-culted from
> > one pattern to another. Its use should be reviewed and fixed where not
> > appropriate.
> >
> > [1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf
> >
> > Uros.
>
> Here is the updated patch which moves TARGET_AVX before
> TARGET_SSE_TYPELESS_STORES.   OK for master if there is
> no regression?
>
> Thanks.


+   (match_test "TARGET_AVX")
+ (const_string "")
(and (match_test " == 16")

Only MODE_SIZE == 16 cases will be left here, since TARGET_AVX is
necessary for MODE_SIZE > 16. This test can be removed.

OK with the above change.

Thanks,
Uros.

> --
> H.J.


[PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

2020-01-28 Thread H.J. Lu
On Tue, Jan 28, 2020 at 9:12 AM Uros Bizjak  wrote:
>
> On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu  wrote:
>
> > > You could move
> > >
> > > (match_test "TARGET_AVX")
> > >   (const_string "TI")
> > >
> > > up to bypass the cases below.
> > >
> >
> > I don't think we can do that.   There are 2 cases where we prefer 
> > movaps/movups:
> >
> > /* Use packed single precision instructions where posisble.  I.e.
> > movups instead   of movupd.  */
> > DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL,
> > "sse_packed_single_insn_optimal",
> >   m_BDVER | m_ZNVER)
> >
> > /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores.   
> > */
> > DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores",
> >   m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC)
> >
> > We should always use movaps/movups for 
> > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL.
> > It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with TARGET_AVX
> > as m_BDVER | m_ZNVER support AVX.
>
> The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD target is
> only insn size, as advised in e.g. Software Optimization Guide for the
> AMD Family 15h Processors [1], section 7.1.2, where it is said:
>
> --quote--
> 7.1.2 Reduce Instruction SizeOptimization
>
> Reduce the size of instructions when possible.
>
> Rationale
>
> Using smaller instruction sizes improves instruction fetch throughput.
> Specific examples include the following:
>
> *In SIMD code, use the single-precision (PS) form of instructions
> instead of the double-precision (PD) form. For example, for register
> to register moves, MOVAPS achieves the same result as MOVAPD, but uses
> one less byte to encode the instruction and has no prefix byte. Other
> examples in which single-precision forms can be substituted for
> double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS,
> and SHUFPS.
> ...
> --/quote--
>
> Please note that this optimization applies only to non-AVX forms, as
> demonstrated by:
>
>0:   0f 28 c8movaps %xmm0,%xmm1
>3:   66 0f 28 c8 movapd %xmm0,%xmm1
>7:   c5 f8 28 d1 vmovaps %xmm1,%xmm2
>b:   c5 f9 28 d1 vmovapd %xmm1,%xmm2
>
> Also note that MOVDQA is missing in the above optimization. It is
> harmful to substitute MOVDQA with MOVAPS, as it can (and does)
> introduce +1 cycle forwarding penalty between FLT (FPA/FPM) and INT
> (VALU) FP clusters.
>
> Following the above optimization, it is obvious that
> TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling was cargo-culted from
> one pattern to another. Its use should be reviewed and fixed where not
> appropriate.
>
> [1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf
>
> Uros.

Here is the updated patch which moves TARGET_AVX before
TARGET_SSE_TYPELESS_STORES.   OK for master if there is
no regression?

Thanks.

-- 
H.J.
From cbcf8b23b29588f12e464076dacacd4600d0059b Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 27 Jan 2020 09:35:11 -0800
Subject: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

movaps/movups is one byte shorter than movdaq/movdqu.  But it isn't the
case for AVX nor AVX512.  This patch prefers TARGET_AVX over
TARGET_SSE_TYPELESS_STORES and adjust vmovups checks in assembly ouputs.

gcc/

	PR target/91461
	* config/i386/i386.md (*movoi_internal_avx): Remove
	TARGET_SSE_TYPELESS_STORES check.
	(*movti_internal): Prefer TARGET_AVX over
	TARGET_SSE_TYPELESS_STORES.
	(*movtf_internal): Likewise.
	* config/i386/sse.md (mov_internal): Likewise.

gcc/testsuite/

	PR target/91461
	* gcc.target/i386/avx256-unaligned-store-2.c: Don't check
	vmovups.
	* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
	* gcc.target/i386/pieces-memcpy-4.c: Likewise.
	* gcc.target/i386/pieces-memcpy-5.c: Likewise.
	* gcc.target/i386/pieces-memcpy-6.c: Likewise.
	* gcc.target/i386/pieces-strcpy-2.c: Likewise.
	* gcc.target/i386/pr90980-1.c: Likewise.
	* gcc.target/i386/pr87317-4.c: Check "\tvmovd\t" instead of
	"vmovd" to avoid matching "vmovdqu".
	* gcc.target/i386/pr87317-5.c: Likewise.
	* gcc.target/i386/pr87317-7.c: Likewise.
	* gcc.target/i386/pr91461-1.c: New test.
	* gcc.target/i386/pr91461-2.c: Likewise.
	* gcc.target/i386/pr91461-3.c: Likewise.
	* gcc.target/i386/pr91461-4.c: Likewise.
	* gcc.target/i386/pr91461-5.c: Likewise.

Xi
---
 gcc/config/i386/i386.md   | 12 ++-
 gcc/config/i386/sse.md|  4 +-
 .../i386/avx256-unaligned-store-2.c   |  4 +-
 .../i386/avx256-unaligned-store-3.c   |  4 +-
 .../gcc.target/i386/pieces-memcpy-4.c |  3 +-
 .../gcc.target/i386/pieces-memcpy-5.c |  3 +-
 .../gcc.target/i386/pieces-memcpy-6.c |  3 +-
 .../gcc.target/i386/pieces-strcpy-2.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr87317-4.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr87317-5.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr87317-7.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr90980-1.c 

Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Tobias Burnus

Hi Andrew,

On 1/28/20 5:41 PM, Andrew Benson wrote:

Can you also update the comment before the #define? It currently has:

Thanks for the suggestion. An updated patch is attached.

Thanks. LGTM now.

PS: I wonder whether there are relevant systems which will fail because they
do not handle that long symbol names...

Good question. Should I hold off on committing the patch until this can be
tested further? Or should I just go ahead and commit and deal with any such
problems if they show up?


I think it is fine – as a user, one can always choose a shorter name if 
the system one is interested in doesn't support it. Besides, following 
the current scheme, we do not really have a choice without breaking the ABI.



Also, Richard Biener raised the question (in the PR) of whether this patch
would be an ABI change. I can see that it probably would be, but don't know
for sure. If it would change the ABI is there anything else that I need to
include in the patch?


As just mentioned on the PR, I think it is a small ABI change – before a 
very long identified got cut off now it is available in full extent. — I 
do not see any simple way to avoid this ABI change and, frankly, I also 
do not think any real-world program uses that long identifiers.


Namely, instead of using 3 times the length of 42 character 
("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop") the standard permits 3 
times the length of 63 characters 
(ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890_). 
Already 42 is quite long for a module, submodule or procedure name. And 
that does work even without the patch. And as only the sum counts, if 
one part only uses 10 characters (e.g. "my_module5"), now 2*58 
characters available two others.


If at the end it really causes problems (source code not available), the 
user still can modify the module file and change the procedure name to 
the trimmed value and continue.


Thus, I do not think it is a real problem in practice. We could be nice, 
however, and add a note to the release notes (i.e. 
https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether 
it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git 
about how to change WWW files and CC Gerald as web maintainer when 
submitting wwwdocs patches.]


Thanks,

Tobias


patch.diff

diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 52bc045..69171f3 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
  
  #include "predict.h"  /* For enum br_predictor and PRED_*.  */
  
-/* Mangled symbols take the form __module__name.  */

-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+/* Mangled symbols take the form __module__name or __module.submodule__name.  
*/
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)
  
  /* Struct for holding a block of statements.  It should be treated as an

 opaque entity and not modified directly.  This allows us to change the
diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 
b/gcc/testsuite/gfortran.dg/pr93461.f90
new file mode 100644
index 000..3bef326
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93461.f90
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! PR fortran/93461
+module aModuleWithAnAllowedName
+  interface
+ module subroutine aShortName()
+ end subroutine aShortName
+  end interface
+end module aModuleWithAnAllowedName
+
+submodule (aModuleWithAnAllowedName) 
aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
+contains
+  subroutine aShortName()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aShortName
+
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
+
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso
+end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName

ChangeLog

2020-01-28  Andrew Benson

PR fortran/93461
* trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
plus the "." between module and submodule names.

* gfortran.dg/pr93461.f90: New test.


Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-28 Thread Alexander Monakov
On Tue, 28 Jan 2020, Uecker, Martin wrote:

> Unfortunately, this is not as simple. It is not trivial to
> define the set of objects whose "address could have participated
> in the computation"
> [...]

I am not satisfied with the response, but I'm not sure I should
continue arguing here.

Alexander


Re: [patch, fortran] PR93473 - ICE on valid with long module + submodule names

2020-01-28 Thread Tobias Burnus

LGTM now.

Thanks,

Tobias

On 1/28/20 5:41 PM, Andrew Benson wrote:

On Tuesday, January 28, 2020 9:27:58 AM PST Tobias Burnus wrote:

On 1/28/20 12:41 AM, Andrew Benson wrote:

The problem occurs in set_syms_host_assoc() where the "parent1" and
"parent2" variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This
is insufficient when the parent names are a module+submodule name
concatenated with a ".". The patch above fixes this by increasing their
length to 2*GFC_MAX_SYMBOL_LEN+2.

A patch to fix this is attached. The patch regression tests cleanly - ok
to
commit?

The patch is okay, but can you add a comment – similar to the other
patch – which makes clear why one needs twice the normal

GFC_MAX_SYMBOL_LEN (+2)? You currently have:

 const char dot[2] = ".";

-  char parent1[GFC_MAX_SYMBOL_LEN + 1];
-  char parent2[GFC_MAX_SYMBOL_LEN + 1];
+  char parent1[2 * GFC_MAX_SYMBOL_LEN + 2];
+  char parent2[2 * GFC_MAX_SYMBOL_LEN + 2];

Yes - I've added a comment here explaining the naming convention. An updated
patch is attached.

-Andrew



Re: Return slot optimization for stack protector strong

2020-01-28 Thread Stefan Schulze Frielinghaus
On Mon, Jan 27, 2020 at 06:53:51PM +0100, Jakub Jelinek wrote:
> On Mon, Jan 27, 2020 at 06:49:23PM +0100, Stefan Schulze Frielinghaus wrote:
> > some function calls trigger the stack-protector-strong although such
> > calls are later on implemented via calls to internal functions.
> > Consider the following example:
> > 
> > long double
> > rintl_wrapper (long double x)
> > {
> >   return rintl (x);
> > }
> > 
> > On s390x a return value of type `long double` is passed via a return
> > slot.  Thus according to function `stack_protect_return_slot_p` a
> > function call like `rintl (x)` triggers the stack-protector-strong since
> > rintl is not an internal function.  However, in a later stage, during
> > `expand_call_stmt`, such a call is implemented via a call to an internal
> > function.  This means in the example, the call `rintl (x)` is expanded
> > into an assembler instruction with register operands only.  Thus this
> > late time decision renders the usage of the stack protector superfluous.
> 
> I doubt your predicate gives any guarantees that the builtin will be
> expanded inline rather than a library call.  Some builtins might be expanded
> inline or as a library call depending on various options, or depending on
> particular arguments etc.

My predicate is more or less just a copy of what happens in
`expand_call_stmt` where we have

decl = gimple_call_fndecl (stmt);
if (gimple_call_lhs (stmt)
&& !gimple_has_side_effects (stmt)
&& (optimize || (decl && called_as_built_in (decl
  {
internal_fn ifn = replacement_internal_fn (stmt);
if (ifn != IFN_LAST)
  {
expand_internal_call (ifn, stmt);
return;
  }
  }

There a decision is made whether a call is implemented as a call to an
internal function or not.  Thus using the very same logic it should be
possible to decide at an earlier stage that a call will be implemented
as a call to an internal function.  Since an internal function has no
linkage, it is therefore guaranteed that it will be inlined.

Do I miss something?

Cheers,
Stefan



Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-28 Thread Uros Bizjak
On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu  wrote:

> > You could move
> >
> > (match_test "TARGET_AVX")
> >   (const_string "TI")
> >
> > up to bypass the cases below.
> >
>
> I don't think we can do that.   There are 2 cases where we prefer 
> movaps/movups:
>
> /* Use packed single precision instructions where posisble.  I.e.
> movups instead   of movupd.  */
> DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL,
> "sse_packed_single_insn_optimal",
>   m_BDVER | m_ZNVER)
>
> /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores.   */
> DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores",
>   m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC)
>
> We should always use movaps/movups for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL.
> It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with TARGET_AVX
> as m_BDVER | m_ZNVER support AVX.

The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD target is
only insn size, as advised in e.g. Software Optimization Guide for the
AMD Family 15h Processors [1], section 7.1.2, where it is said:

--quote--
7.1.2 Reduce Instruction SizeOptimization

Reduce the size of instructions when possible.

Rationale

Using smaller instruction sizes improves instruction fetch throughput.
Specific examples include the following:

*In SIMD code, use the single-precision (PS) form of instructions
instead of the double-precision (PD) form. For example, for register
to register moves, MOVAPS achieves the same result as MOVAPD, but uses
one less byte to encode the instruction and has no prefix byte. Other
examples in which single-precision forms can be substituted for
double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS,
and SHUFPS.
...
--/quote--

Please note that this optimization applies only to non-AVX forms, as
demonstrated by:

   0:   0f 28 c8movaps %xmm0,%xmm1
   3:   66 0f 28 c8 movapd %xmm0,%xmm1
   7:   c5 f8 28 d1 vmovaps %xmm1,%xmm2
   b:   c5 f9 28 d1 vmovapd %xmm1,%xmm2

Also note that MOVDQA is missing in the above optimization. It is
harmful to substitute MOVDQA with MOVAPS, as it can (and does)
introduce +1 cycle forwarding penalty between FLT (FPA/FPM) and INT
(VALU) FP clusters.

Following the above optimization, it is obvious that
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling was cargo-culted from
one pattern to another. Its use should be reviewed and fixed where not
appropriate.

[1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf

Uros.


[PATCH, ivopts] Fix fast-math-pr55281.c ICE

2020-01-28 Thread Andrew Stubbs

This patch fixes an ICE compiling fast-math-pr55281.c for amdgcn.

The problem is that an "iv" is created in which both base and step are 
pointer types, but the base is converted to sizetype without also 
converting the step to a non-pointer type. Later in the compilation this 
results in a PLUS_EXPR with a pointer argument, which is invalid gimple.


The patch fixes the problem by ensuring that the step is converted at 
the same point the base is. This seems like it ought to be correct. If 
the step is not a pointer type then no conversion occurs.


I don't really understand why I only see this issue on amdgcn, but it 
might be because the pointer in question is in a MASK_LOAD which is 
perhaps not that commonly used?


I've tested this on amdgcn, and done a full bootstrap and test on x86_64 
also.


OK to commit?

Thanks

Andrew
Fix fast-math-pr55281.c ICE.

2020-01-28  Andrew Stubbs  

	gcc/
	* tree-ssa-loop-ivopts.c (add_iv_candidate_for_use): Convert step to
	basestep similarly to basetype.

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index a21f3077e74..1abeb13bb53 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -3472,7 +3472,7 @@ add_iv_candidate_for_use (struct ivopts_data *data, struct iv_use *use)
 {
   poly_uint64 offset;
   tree base;
-  tree basetype;
+  tree basetype, basestep;
   struct iv *iv = use->iv;
 
   add_candidate (data, iv->base, iv->step, false, use);
@@ -3482,9 +3482,14 @@ add_iv_candidate_for_use (struct ivopts_data *data, struct iv_use *use)
 
   /* Record common candidate with initial value zero.  */
   basetype = TREE_TYPE (iv->base);
+  basestep = iv->step;
   if (POINTER_TYPE_P (basetype))
-basetype = sizetype;
-  record_common_cand (data, build_int_cst (basetype, 0), iv->step, use);
+{
+  basetype = sizetype;
+  if (POINTER_TYPE_P (TREE_TYPE (iv->step)))
+	basestep = fold_convert (basetype, iv->step);
+}
+  record_common_cand (data, build_int_cst (basetype, 0), basestep, use);
 
   /* Compare the cost of an address with an unscaled index with the cost of
 an address with a scaled index and add candidate if useful.  */


Re: ACLE intrinsics: BFloat16 load intrinsics for AArch32

2020-01-28 Thread Delia Burduv
Ping.

From: Delia Burduv 
Sent: 22 January 2020 17:31
To: gcc-patches@gcc.gnu.org 
Cc: ni...@redhat.com ; Richard Earnshaw 
; Kyrylo Tkachov ; Ramana 
Radhakrishnan 
Subject: Re: ACLE intrinsics: BFloat16 load intrinsics for AArch32

Ping.

I will change the tests to use the exact input and output registers as
Richard Sandiford suggested for the AArch64 patches.

On 12/20/19 6:48 PM, Delia Burduv wrote:
> This patch adds the ARMv8.6 ACLE BFloat16 load intrinsics vld{q}_bf16
> as part of the BFloat16 extension.
> (https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics)
>
> The intrinsics are declared in arm_neon.h .
> A new test is added to check assembler output.
>
> This patch depends on the Arm back-end patche.
> (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html)
>
> Tested for regression on arm-none-eabi and armeb-none-eabi. I don't have
> commit rights, so if this is ok can someone please commit it for me?
>
> gcc/ChangeLog:
>
> 2019-11-14  Delia Burduv  
>
>  * config/arm/arm_neon.h (bfloat16_t): New typedef.
>  (bfloat16x4x2_t): New typedef.
>  (bfloat16x8x2_t): New typedef.
>  (bfloat16x4x3_t): New typedef.
>  (bfloat16x8x3_t): New typedef.
>  (bfloat16x4x4_t): New typedef.
>  (bfloat16x8x4_t): New typedef.
>  (vld2_bf16): New.
>  (vld2q_bf16): New.
>  (vld3_bf16): New.
>  (vld3q_bf16): New.
>  (vld4_bf16): New.
>  (vld4q_bf16): New.
>  (vld2_dup_bf16): New.
>  (vld2q_dup_bf16): New.
>   (vld3_dup_bf16): New.
>  (vld3q_dup_bf16): New.
>  (vld4_dup_bf16): New.
>  (vld4q_dup_bf16): New.
>  * config/arm/arm-builtins.c (E_V2BFmode): New mode.
>  (VAR13): New.
>  (arm_simd_types[Bfloat16x2_t]):New type.
>  * config/arm/arm-modes.def (V2BF): New mode.
>  * config/arm/arm-simd-builtin-types.def
>  (Bfloat16x2_t): New entry.
>  * config/arm/arm_neon_builtins.def
>  (vld2): Changed to VAR13 and added v4bf, v8bf
>  (vld2_dup): Changed to VAR8 and added v4bf, v8bf
>  (vld3): Changed to VAR13 and added v4bf, v8bf
>  (vld3_dup): Changed to VAR8 and added v4bf, v8bf
>  (vld4): Changed to VAR13 and added v4bf, v8bf
>  (vld4_dup): Changed to VAR8 and added v4bf, v8bf
>  * config/arm/iterators.md (VDXBF): New iterator.
>  (VQ2BF): New iterator.
>  (V_elem): Added V4BF, V8BF.
>  (V_sz_elem): Added V4BF, V8BF.
>  (V_mode_nunits): Added V4BF, V8BF.
>  (q): Added V4BF, V8BF.
>  *config/arm/neon.md (vld2): Used new iterators.
>  (vld2_dup): Used new iterators.
>  (vld2_dupv8bf): New.
>  (vst3): Used new iterators.
>  (vst3qa): Used new iterators.
>  (vst3qb): Used new iterators.
>  (vld3_dup): Used new iterators.
>  (vld3_dupv8bf): New.
>  (vst4): Used new iterators.
>  (vst4qa): Used new iterators.
>  (vst4qb): Used new iterators.
>  (vld4_dup): Used new iterators.
>  (vld4_dupv8bf): New.
>
>
> gcc/testsuite/ChangeLog:
>
> 2019-11-14  Delia Burduv  
>
>  * gcc.target/arm/simd/bf16_vldn_1.c: New test.


Re: ACLE intrinsics: BFloat16 store (vst{q}_bf16) intrinsics for AArch32

2020-01-28 Thread Delia Burduv
Ping.

From: Delia Burduv 
Sent: 22 January 2020 17:29
To: gcc-patches@gcc.gnu.org 
Cc: ni...@redhat.com ; Richard Earnshaw 
; Kyrylo Tkachov ; Ramana 
Radhakrishnan 
Subject: Re: ACLE intrinsics: BFloat16 store (vst{q}_bf16) intrinsics for 
AArch32

Ping.

I will change the tests to use the exact input and output registers as
Richard Sandiford suggested for the AArch64 patches.

On 12/20/19 6:46 PM, Delia Burduv wrote:
> This patch adds the ARMv8.6 ACLE BFloat16 store intrinsics
> vst{q}_bf16 as part of the BFloat16 extension.
> (https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics)
>
> The intrinsics are declared in arm_neon.h .
> A new test is added to check assembler output.
>
> This patch depends on the Arm back-end patche.
> (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html)
>
> Tested for regression on arm-none-eabi and armeb-none-eabi. I don't have
> commit rights, so if this is ok can someone please commit it for me?
>
> gcc/ChangeLog:
>
> 2019-11-14  Delia Burduv  
>
>  * config/arm/arm_neon.h (bfloat16_t): New typedef.
>  (bfloat16x4x2_t): New typedef.
>  (bfloat16x8x2_t): New typedef.
>  (bfloat16x4x3_t): New typedef.
>  (bfloat16x8x3_t): New typedef.
>  (bfloat16x4x4_t): New typedef.
>  (bfloat16x8x4_t): New typedef.
>  (vst2_bf16): New.
>  (vst2q_bf16): New.
>  (vst3_bf16): New.
>  (vst3q_bf16): New.
>  (vst4_bf16): New.
>  (vst4q_bf16): New.
>  * config/arm/arm-builtins.c (E_V2BFmode): New mode.
>  (VAR13): New.
>  (arm_simd_types[Bfloat16x2_t]):New type.
>  * config/arm/arm-modes.def (V2BF): New mode.
>  * config/arm/arm-simd-builtin-types.def
>  (Bfloat16x2_t): New entry.
>  * config/arm/arm_neon_builtins.def
>  (vst2): Changed to VAR13 and added v4bf, v8bf
>  (vst3): Changed to VAR13 and added v4bf, v8bf
>  (vst4): Changed to VAR13 and added v4bf, v8bf
>  * config/arm/iterators.md (VDXBF): New iterator.
>  (VQ2BF): New iterator.
>  (V_elem): Added V4BF, V8BF.
>  (V_sz_elem): Added V4BF, V8BF.
>  (V_mode_nunits): Added V4BF, V8BF.
>  (q): Added V4BF, V8BF.
>  *config/arm/neon.md (vst2): Used new iterators.
>  (vst3): Used new iterators.
>  (vst3qa): Used new iterators.
>  (vst3qb): Used new iterators.
>  (vst4): Used new iterators.
>  (vst4qa): Used new iterators.
>  (vst4qb): Used new iterators.
>
>
> gcc/testsuite/ChangeLog:
>
> 2019-11-14  Delia Burduv  
>
>  * gcc.target/arm/simd/bf16_vstn_1.c: New test.


Re: [GCC][PATCH][AArch32] ACLE intrinsics bfloat16 vmmla and vfma for AArch32 AdvSIMD

2020-01-28 Thread Delia Burduv
Ping.

From: Delia Burduv 
Sent: 22 January 2020 17:26
To: gcc-patches@gcc.gnu.org 
Cc: ni...@redhat.com ; Richard Earnshaw 
; Ramana Radhakrishnan 
; Kyrylo Tkachov 
Subject: Re: [GCC][PATCH][AArch32] ACLE intrinsics bfloat16 vmmla and vfma 
for AArch32 AdvSIMD

Ping.

I have read Richard Sandiford's comments on the AArch64 patches and I
will apply what is relevant to this patch as well. Particularly, I will
change the tests to use the exact input and output registers and I will
change the types of the rtl patterns.

On 12/20/19 6:44 PM, Delia Burduv wrote:
> This patch adds the ARMv8.6 ACLE intrinsics for vmmla, vfmab and vfmat
> as part of the BFloat16 extension.
> (https://developer.arm.com/docs/101028/latest.)
> The intrinsics are declared in arm_neon.h and the RTL patterns are
> defined in neon.md.
> Two new tests are added to check assembler output and lane indices.
>
> This patch depends on the Arm back-end patche.
> (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html)
>
> Tested for regression on arm-none-eabi and armeb-none-eabi. I don't have
> commit rights, so if this is ok can someone please commit it for me?
>
> gcc/ChangeLog:
>
> 2019-11-12  Delia Burduv  
>
>  * config/arm/arm_neon.h (vbfmmlaq_f32): New.
>(vbfmlalbq_f32): New.
>(vbfmlaltq_f32): New.
>(vbfmlalbq_lane_f32): New.
>(vbfmlaltq_lane_f32): New.
>  (vbfmlalbq_laneq_f32): New.
>(vbfmlaltq_laneq_f32): New.
>  * config/arm/arm_neon_builtins.def (vbfmmla): New.
>(vbfmab): New.
>(vbfmat): New.
>(vbfmab_lane): New.
>(vbfmat_lane): New.
>(vbfmab_laneq): New.
>(vbfmat_laneq): New.
>   * config/arm/iterators.md (BF_MA): New int iterator.
>(bt): New int attribute.
>(VQXBF): Copy of VQX with V8BF.
>(V_HALF): Added V8BF.
>* config/arm/neon.md (neon_vbfmmlav8hi): New insn.
>(neon_vbfmav8hi): New insn.
>(neon_vbfma_lanev8hi): New insn.
>(neon_vbfma_laneqv8hi): New expand.
>(neon_vget_high): Changed iterator to VQXBF.
>  * config/arm/unspecs.md (UNSPEC_BFMMLA): New UNSPEC.
>(UNSPEC_BFMAB): New UNSPEC.
>(UNSPEC_BFMAT): New UNSPEC.
>
> 2019-11-12  Delia Burduv  
>
>  * gcc.target/arm/simd/bf16_ma_1.c: New test.
>  * gcc.target/arm/simd/bf16_ma_2.c: New test.
>  * gcc.target/arm/simd/bf16_mmla_1.c: New test.


Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Andrew Benson
Hi Tobias,

> > The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to
> > GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus
> > function name (plus the additional "_"'s that get prepended), but
> > insufficient if a submodule name is included. The name then gets
> > truncated and can lead to two different functions having the same
> > (truncated) symbol name.
> > 
> > The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which
> > allows for the submodule name plus the "." added between module and
> > submodule names.
> > 
> > I've attached a patch for this which includes a new test case for this PR.
> > The patch regression tests cleanly.
> > 
> > OK to commit?
> 
> Can you also update the comment before the #define? It currently has:
> 
>   /* Mangled symbols take the form __module__name.  */
> -#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
> +#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)

Thanks for the suggestion. An updated patch is attached.

> PS: I wonder whether there are relevant systems which will fail because they
> do not handle that long symbol names...

Good question. Should I hold off on committing the patch until this can be 
tested further? Or should I just go ahead and commit and deal with any such 
problems if they show up?

Also, Richard Biener raised the question (in the PR) of whether this patch 
would be an ABI change. I can see that it probably would be, but don't know 
for sure. If it would change the ABI is there anything else that I need to 
include in the patch?

Thanks,
Andrew

-- 

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 52bc045..69171f3 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "predict.h"  /* For enum br_predictor and PRED_*.  */
 
-/* Mangled symbols take the form __module__name.  */
-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+/* Mangled symbols take the form __module__name or __module.submodule__name.  */
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)
 
 /* Struct for holding a block of statements.  It should be treated as an
opaque entity and not modified directly.  This allows us to change the
diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 b/gcc/testsuite/gfortran.dg/pr93461.f90
new file mode 100644
index 000..3bef326
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93461.f90
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! PR fortran/93461
+module aModuleWithAnAllowedName
+  interface
+ module subroutine aShortName()
+ end subroutine aShortName
+  end interface
+end module aModuleWithAnAllowedName
+
+submodule (aModuleWithAnAllowedName) aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
+contains
+  subroutine aShortName()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aShortName
+  
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
+
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso  
+end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
2020-01-28  Andrew Benson  

	PR fortran/93461
	* trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
	GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
	plus the "." between module and submodule names.

	* gfortran.dg/pr93461.f90: New test.


Re: [patch, fortran] PR93473 - ICE on valid with long module + submodule names

2020-01-28 Thread Andrew Benson
On Tuesday, January 28, 2020 9:27:58 AM PST Tobias Burnus wrote:
> On 1/28/20 12:41 AM, Andrew Benson wrote:
> > The problem occurs in set_syms_host_assoc() where the "parent1" and
> > "parent2" variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This
> > is insufficient when the parent names are a module+submodule name
> > concatenated with a ".". The patch above fixes this by increasing their
> > length to 2*GFC_MAX_SYMBOL_LEN+2.
> > 
> > A patch to fix this is attached. The patch regression tests cleanly - ok
> > to
> > commit?
> 
> The patch is okay, but can you add a comment – similar to the other
> patch – which makes clear why one needs twice the normal
> 
> GFC_MAX_SYMBOL_LEN (+2)? You currently have:
> > const char dot[2] = ".";
> > 
> > -  char parent1[GFC_MAX_SYMBOL_LEN + 1];
> > -  char parent2[GFC_MAX_SYMBOL_LEN + 1];
> > +  char parent1[2 * GFC_MAX_SYMBOL_LEN + 2];
> > +  char parent2[2 * GFC_MAX_SYMBOL_LEN + 2];

Yes - I've added a comment here explaining the naming convention. An updated 
patch is attached.

-Andrew

-- 

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus
2020-01-28  Andrew Benson  

	PR fortran/93473
	* parse.c: Increase length of char variables to allow them to hold
	a concatenated module + submodule name.

	* gfortran.dg/pr93473.f90: New test.
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 4bff0c8..f71a95d 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -6045,8 +6045,9 @@ set_syms_host_assoc (gfc_symbol *sym)
 {
   gfc_component *c;
   const char dot[2] = ".";
-  char parent1[GFC_MAX_SYMBOL_LEN + 1];
-  char parent2[GFC_MAX_SYMBOL_LEN + 1];
+  /* Symbols take the form module.submodule_ or module.name_. */
+  char parent1[2 * GFC_MAX_SYMBOL_LEN + 2];
+  char parent2[2 * GFC_MAX_SYMBOL_LEN + 2];
 
   if (sym == NULL)
 return;
diff --git a/gcc/testsuite/gfortran.dg/pr93473.f90 b/gcc/testsuite/gfortran.dg/pr93473.f90
new file mode 100644
index 000..dda8525
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93473.f90
@@ -0,0 +1,28 @@
+! { dg-do compile }
+! { dg-options "-ffree-line-length-none" }
+! PR fortran/93473
+module aModestlyLongModuleName
+  
+  type :: aTypeWithASignificantlyLongNameButStillAllowedOK
+  end type aTypeWithASignificantlyLongNameButStillAllowedOK
+  
+  interface
+ module function aFunctionWithALongButStillAllowedName(parameters) result(self)
+   type(aTypeWithASignificantlyLongNameButStillAllowedOK) :: self
+ end function aFunctionWithALongButStillAllowedName
+  end interface
+  
+end module aModestlyLongModuleName
+
+submodule (aModestlyLongModuleName) aTypeWithASignificantlyLongNameButStillAllowedOK_
+
+contains
+
+  module procedure aFunctionWithALongButStillAllowedName
+ class(*), pointer :: genericObject
+  end procedure aFunctionWithALongButStillAllowedName
+
+end submodule aTypeWithASignificantlyLongNameButStillAllowedOK_
+
+submodule (aModestlyLongModuleName:aTypeWithASignificantlyLongNameButStillAllowedOK_) aSubmoduleWithASignificantlyLongButStillAllowedName__
+end submodule aSubmoduleWithASignificantlyLongButStillAllowedName__


Re: [committed, amdgcn] Fix ICE on unsupported FP comparison

2020-01-28 Thread Andrew Stubbs

On 24/01/2020 14:58, Andrew Stubbs wrote:
I've committed this patch to fix an ICE building the 
gcc.dg/vect/fast-math-pr55281.c testcase.


Oops, I got that crossed. This was the fix for gcc.dg/pr50310-2.c. The 
fast-math-pr55281.c fix will be posted shortly.


The problem was that the combine pass was trying to use the "unle" and 
"ungt" FP comparison operators. There's no hardware support for these, 
so the operators should have been rejected, but the predicates were too 
loose.


Andrew




[PATCH] coroutines: Fix ICE on invalid (PR93458).

2020-01-28 Thread Iain Sandoe
Hi,

Since coroutine-ness is discovered lazily, we encounter the diagnostics
during each keyword parse.  We were not handling the case where a user code
failed to include fundamental information (e.g. the traits) in a graceful
manner (nor tolerating that level of fail for subsequent diagnostics).

Once we've emitted an error for this level of fail, then we suppress
additional copies (otherwise the same thing will be reported for every
coroutine keyword seen).

tested on x86_64-darwin16,
OK for trunk?
thanks
iain

gcc/cp/ChangeLog:

2020-01-28  Iain Sandoe  

* coroutines.cc (get_coroutine_info): Tolerate fatal errors that might
leave the info unset.
(find_coro_traits_template_decl): Note we emitted the error and
suppress duplicates.
(coro_promise_type_found_p): Reorder initialization so that we check
for the traits and their usability before allocation of the info
table.  Check for a suitable return type and emit a diagnostic for
here instead of relying on the lookup machinery.  This allows the
error to have a better location, and means we can suppress multiple
copies.
(coro_function_valid_p): Re-check for a valid promise (and thus the
traits) before proceeding.  Tolerate missing info as a fatal error.

gcc/testsuite/ChangeLog:

2020-01-28  Iain Sandoe  

* g++.dg/coroutines/pr93458-1-missing-traits.C: New test.
* g++.dg/coroutines/pr93458-2-bad-coro-type.C: New test.
---
 gcc/cp/coroutines.cc  | 76 ++-
 .../coroutines/pr93458-1-missing-traits.C | 10 +++
 .../coroutines/pr93458-2-bad-coro-type.C  | 12 +++
 3 files changed, 78 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr93458-1-missing-traits.C
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr93458-2-bad-coro-type.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index e8a6a4033f6..ca86c7e6950 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -169,7 +169,8 @@ get_or_insert_coroutine_info (tree fn_decl)
 coroutine_info *
 get_coroutine_info (tree fn_decl)
 {
-  gcc_checking_assert (coroutine_info_table != NULL);
+  if (coroutine_info_table == NULL)
+return NULL;
 
   coroutine_info **slot = coroutine_info_table->find_slot_with_hash
 (fn_decl, coroutine_info_hasher::hash (fn_decl), NO_INSERT);
@@ -255,11 +256,16 @@ static GTY(()) tree void_coro_handle_type;
 static tree
 find_coro_traits_template_decl (location_t kw)
 {
+  static bool error_emitted = false;
   tree traits_decl = lookup_qualified_name (std_node, coro_traits_identifier,
0, true);
-  if (traits_decl == NULL_TREE || traits_decl == error_mark_node)
+  /* If we are missing fundmental information, such as the traits, then don't
+ emit this for every keyword in a TU.  */
+  if (!error_emitted &&
+  (traits_decl == NULL_TREE || traits_decl == error_mark_node))
 {
   error_at (kw, "cannot find % template");
+  error_emitted = true;
   return NULL_TREE;
 }
   else
@@ -370,34 +376,45 @@ coro_promise_type_found_p (tree fndecl, location_t loc)
 {
   gcc_assert (fndecl != NULL_TREE);
 
-  /* Save the coroutine data on the side to avoid the overhead on every
- function decl.  */
-
-  /* We only need one entry per coroutine in a TU, the assumption here is that
- there are typically not 1000s.  */
   if (!coro_initialized)
 {
-  gcc_checking_assert (coroutine_info_table == NULL);
-  /* A table to hold the state, per coroutine decl.  */
-  coroutine_info_table =
-   hash_table::create_ggc (11);
-  /* Set up the identifiers we will use.  */
-  gcc_checking_assert (coro_traits_identifier == NULL);
+  /* Trees we only need to create once.
+Set up the identifiers we will use.  */
   coro_init_identifiers ();
-  /* Trees we only need to create once.  */
+
   /* Coroutine traits template.  */
   coro_traits_templ = find_coro_traits_template_decl (loc);
-  gcc_checking_assert (coro_traits_templ != NULL);
+  if (coro_traits_templ == NULL_TREE
+ || coro_traits_templ == error_mark_node)
+   return false;
+
   /*  coroutine_handle<> template.  */
   coro_handle_templ = find_coro_handle_template_decl (loc);
-  gcc_checking_assert (coro_handle_templ != NULL);
+  if (coro_handle_templ == NULL_TREE
+ || coro_handle_templ == error_mark_node)
+   return false;
+
   /*  We can also instantiate the void coroutine_handle<>  */
   void_coro_handle_type =
instantiate_coro_handle_for_promise_type (loc, NULL_TREE);
-  gcc_checking_assert (void_coro_handle_type != NULL);
+  if (void_coro_handle_type == NULL_TREE
+ || void_coro_handle_type == error_mark_node)
+   return false;
+
+  /* A table to hold the state, per coroutine decl.  */
+  gcc_checking_assert (coroutine_info_table 

[PR 93542] Make __has_include a builtin macro

2020-01-28 Thread Nathan Sidwell

The clever hack of '#define __has_include __has_include' breaks -dD
and -fdirectives-only, because that emits definitions.  This turns
__has_include into a proper builtin macro.  Thus it's never emitted
via -dD, and because use outside of directive processing is undefined,
we can just expand it anywhere.

I did try down this path the first time round, but it started looking 
rat holey.  Turns out not so rat holey after all.


nathan
--
Nathan Sidwell
2020-01-28  Nathan Sidwell  

	PR preprocessor/93452
	* internal.h (struct spec_nodes): Drop n__has_include{,_next}.
	* directives.c (lex_macro_node): Don't check __has_include redef.
	* expr.c (eval_token): Drop __has_include eval.
	(parse_has_include): Move to ...
	* macro.c (builtin_has_include): ... here.
	(_cpp_builtin_macro_text): Eval __has_include{,_next}.
	* include/cpplib.h (enum cpp_builtin_type): Add BT_HAS_INCLUDE{,_NEXT}.
	* init.c (builtin_array): Add them.
	(cpp_init_builtins): Drop __has_include{,_next} init here ...
	* pch.c (cpp_read_state): ... and here.
	* traditional.c (enum ls): Drop has_include states ...
	(_cpp_scan_out_logical_line): ... and here.


 gcc/testsuite/c-c++-common/cpp/pr93452-1.c | 10 ++
 gcc/testsuite/c-c++-common/cpp/pr93452-2.c | 11 ++
 libcpp/ChangeLog   | 16 +
 libcpp/directives.c|  4 +--
 libcpp/expr.c  | 58 --
 libcpp/include/cpplib.h|  4 ++-
 libcpp/init.c  | 13 ++-
 libcpp/internal.h  |  2 --
 libcpp/macro.c | 56 +
 libcpp/pch.c   |  2 --
 libcpp/traditional.c   | 20 +++
 11 files changed, 103 insertions(+), 93 deletions(-)

diff --git c/gcc/testsuite/c-c++-common/cpp/pr93452-1.c w/gcc/testsuite/c-c++-common/cpp/pr93452-1.c
new file mode 100644
index 000..f0986e49238
--- /dev/null
+++ w/gcc/testsuite/c-c++-common/cpp/pr93452-1.c
@@ -0,0 +1,10 @@
+/* { dg-do preprocess } */
+/* { dg-additional-options "-fdirectives-only" } */
+
+int main ()
+{
+  return 0;
+}
+
+/* A regexp that doesn't match itself!  */
+/* { dg-final { scan-file-not pr93452-1.i {_[_]has_include} } } */
diff --git c/gcc/testsuite/c-c++-common/cpp/pr93452-2.c w/gcc/testsuite/c-c++-common/cpp/pr93452-2.c
new file mode 100644
index 000..c9ab0e9f763
--- /dev/null
+++ w/gcc/testsuite/c-c++-common/cpp/pr93452-2.c
@@ -0,0 +1,11 @@
+/* { dg-do preprocess } */
+/* { dg-additional-options "-dD" } */
+
+#if __has_include ("who cares" )
+#endif
+int main ()
+{
+  return 0;
+}
+
+/* { dg-final { scan-file-not pr93452-2.i {__has_include} } } */
diff --git c/libcpp/directives.c w/libcpp/directives.c
index 10735c8c668..bbfdfcd3368 100644
--- c/libcpp/directives.c
+++ w/libcpp/directives.c
@@ -596,9 +596,7 @@ lex_macro_node (cpp_reader *pfile, bool is_def_or_undef)
   cpp_hashnode *node = token->val.node.node;
 
   if (is_def_or_undef
-	  && (node == pfile->spec_nodes.n_defined
-	  || node == pfile->spec_nodes.n__has_include
-	  || node == pfile->spec_nodes.n__has_include_next))
+	  && node == pfile->spec_nodes.n_defined)
 	cpp_error (pfile, CPP_DL_ERROR,
 		   "\"%s\" cannot be used as a macro name",
 		   NODE_NAME (node));
diff --git c/libcpp/expr.c w/libcpp/expr.c
index 6c56803e3b0..2ae9be07c1a 100644
--- c/libcpp/expr.c
+++ w/libcpp/expr.c
@@ -64,8 +64,6 @@ static unsigned int interpret_float_suffix (cpp_reader *, const uchar *, size_t)
 static unsigned int interpret_int_suffix (cpp_reader *, const uchar *, size_t);
 static void check_promotion (cpp_reader *, const struct op *);
 
-static cpp_num parse_has_include (cpp_reader *, cpp_hashnode *, include_type);
-
 /* Token type abuse to create unary plus and minus operators.  */
 #define CPP_UPLUS ((enum cpp_ttype) (CPP_LAST_CPP_OP + 1))
 #define CPP_UMINUS ((enum cpp_ttype) (CPP_LAST_CPP_OP + 2))
@@ -1159,10 +1157,6 @@ eval_token (cpp_reader *pfile, const cpp_token *token,
 case CPP_NAME:
   if (token->val.node.node == pfile->spec_nodes.n_defined)
 	return parse_defined (pfile);
-  else if (token->val.node.node == pfile->spec_nodes.n__has_include)
-	return parse_has_include (pfile, token->val.node.node, IT_INCLUDE);
-  else if (token->val.node.node == pfile->spec_nodes.n__has_include_next)
-	return parse_has_include (pfile, token->val.node.node, IT_INCLUDE_NEXT);
   else if (CPP_OPTION (pfile, cplusplus)
 	   && (token->val.node.node == pfile->spec_nodes.n_true
 		   || token->val.node.node == pfile->spec_nodes.n_false))
@@ -2189,55 +2183,3 @@ num_div_op (cpp_reader *pfile, cpp_num lhs, cpp_num rhs, enum cpp_ttype op,
   return lhs;
 }
 
-/* Handle meeting "__has_include" in a preprocessor expression.  */
-static cpp_num
-parse_has_include (cpp_reader *pfile, cpp_hashnode *op, include_type type)
-{
-  cpp_num result;
-
-  result.unsignedp = false;
-  

Re: [PATCH] analyzer: fix build with gcc 4.4 (PR 93276)

2020-01-28 Thread David Malcolm
On Tue, 2020-01-28 at 11:13 +0100, Jakub Jelinek wrote:
> On Fri, Jan 24, 2020 at 07:53:28PM -0500, David Malcolm wrote:
> > This patch fixes various build failures seen with gcc 4.4
> > 
> > gcc prior to 4.6 complains about:
> > 
> >   error: #pragma GCC diagnostic not allowed inside functions
> > 
> > for various uses of PUSH_IGNORE_WFORMAT and POP_IGNORE_WFORMAT.
> > This patch makes them a no-op with such compilers.
> 
> I think this is wrong.
> All that is really needed is make sure you #include "diagnostic-
> core.h"
> before including pretty-print.h.  By including
> diagnostic-core.h first, you do:
> #ifndef GCC_DIAG_STYLE
> #define GCC_DIAG_STYLE __gcc_tdiag__
> #endif
> and then pretty-print.h will do:
> #ifdef GCC_DIAG_STYLE
> #define GCC_PPDIAG_STYLE GCC_DIAG_STYLE
> #else
> #define GCC_PPDIAG_STYLE __gcc_diag__
> #endif
> If instead pretty-print.h is included first, then it will use
> __gcc_diag__
> instead of __gcc_tdiag__ and thus will assume %E/%D etc. can't be
> handled.
> 
> I've so far just tested that in stage3 with this patch analyzer
> builds
> without any -Wformat/-Wformat-extra-args warnings.

Aha!  Thanks - that's much better.
 
> Ok for trunk if it passes bootstrap/regtest?

LGTM.


Dave



Re: [PATCH] Add OpenACC acc_get_property support for AMD GCN

2020-01-28 Thread Andrew Stubbs

On 28/01/2020 14:55, Harwath, Frederik wrote:

Hi,
this patch adds full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin. This replaces
the existing stub in libgomp/plugin-gcn.c.

Andrew: The value returned for acc_property_memory ("size of device memory in 
bytes"
according to the spec) is the HSA_REGION_INFO_SIZE of the agent's data_region. 
This
has been adapted from a previous incomplete implementation that we had on the 
OG9 branch.
Does that sound reasonable?


I don't think we can do better than that.


I have tested the patch with amdgcn and nvptx offloading.

Ok to commit this to the main branch?


Best regards,
Frederik




@@ -544,6 +547,8 @@ struct hsa_context_info
   int agent_count;
   /* Array of agent_info structures describing the individual HSA agents.  */
   struct agent_info *agents;
+  /* Driver version string. */
+  char driver_version_s[30];
 };
 
 /* Format of the on-device heap.

@@ -1513,6 +1518,15 @@ init_hsa_context (void)
GOMP_PLUGIN_error ("Failed to list all HSA runtime agents");
 }
 
+  uint16_t minor, major;

+  status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MINOR, 
);
+  if (status != HSA_STATUS_SUCCESS)
+GOMP_PLUGIN_error ("Failed to obtain HSA runtime minor version");
+  status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MAJOR, 
);
+  if (status != HSA_STATUS_SUCCESS)
+GOMP_PLUGIN_error ("Failed to obtain HSA runtime major version");
+  sprintf (hsa_context.driver_version_s, "HSA Runtime %d.%d", major, minor);
+
   hsa_context.initialized = true;
   return true;
 }


If we're going to use a fixed-size buffer then we should use snprintf 
and emit GCN_WARNING if the return value is greater than 
"sizeof(driver_version_s)", even though that is unlikely. Do the same in 
the testcase, but use a bigger buffer so that truncation causes a 
mismatch and test failure.



@@ -75,6 +56,39 @@ void expect_device_properties
   "but was %s.\n", s);
   abort ();
 }
+}
+
+void expect_device_memory
+(acc_device_t dev_type, int dev_num, size_t expected_total_memory)
+{
+
+


I realise that an existing function in this testcase uses this layout, 
but the code style does not normally have the parameter list on the next 
line, and certainly not in column 1.



+
+int main()
+{


Space before ().

Thanks

Andrew


Re: [PATCH] diagnostic_metadata: unbreak xgettext (v2)

2020-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2020 at 10:30:58AM -0500, David Malcolm wrote:
> gcc/analyzer/ChangeLog:
>   * region-model.cc (poisoned_value_diagnostic::emit): Update for
>   renaming of warning_at overload to warning_meta.
>   * sm-file.cc (file_leak::emit): Likewise.
>   * sm-malloc.cc (double_free::emit): Likewise.
>   (possible_null_deref::emit): Likewise.
>   (possible_null_arg::emit): Likewise.
>   (null_deref::emit): Likewise.
>   (null_arg::emit): Likewise.
>   (use_after_free::emit): Likewise.
>   (malloc_leak::emit): Likewise.
>   (free_of_non_heap::emit): Likewise.
>   * sm-sensitive.cc (exposure_through_output_file::emit): Likewise.
>   * sm-signal.cc (signal_unsafe_call::emit): Likewise.
>   * sm-taint.cc (tainted_array_index::emit): Likewise.
> 
> gcc/ChangeLog:
>   * diagnostic-core.h (warning_at): Rename overload to...
>   (warning_meta): ...this.
>   (emit_diagnostic_valist): Delete decl of overload taking
>   diagnostic_metadata.
>   * diagnostic.c (emit_diagnostic_valist): Likewise for defn.
>   (warning_at): Rename overload taking diagnostic_metadata to...
>   (warning_meta): ...this.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.dg/plugin/diagnostic_plugin_test_metadata.c: Update for
>   renaming of warning_at overload to warning_meta.
>   * gcc.dg/plugin/diagnostic_plugin_test_paths.c: Likewise.

LGTM, thanks.

Jakub



Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-28 Thread H.J. Lu
On Tue, Jan 28, 2020 at 6:45 AM Uros Bizjak  wrote:
>
> On Tue, Jan 28, 2020 at 3:32 PM H.J. Lu  wrote:
> >
> > On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak  wrote:
> > >
> > > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu  wrote:
> > > >
> > > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak  wrote:
> > > > >
> > > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu  wrote:
> > > > > >
> > > > > > movaps/movups is one byte shorter than movdaq/movdqu.  But it isn't 
> > > > > > the
> > > > > > case for AVX nor AVX512.  We should disable 
> > > > > > TARGET_SSE_TYPELESS_STORES
> > > > > > for TARGET_AVX.
> > > > > >
> > > > > > gcc/
> > > > > >
> > > > > > PR target/91461
> > > > > > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable 
> > > > > > for
> > > > > > TARGET_AVX.
> > > > > > * config/i386/i386.md (*movoi_internal_avx): Remove
> > > > > > TARGET_SSE_TYPELESS_STORES check.
> > > > > >
> > > > > > gcc/testsuite/
> > > > > >
> > > > > > PR target/91461
> > > > > > * gcc.target/i386/pr91461-1.c: New test.
> > > > > > * gcc.target/i386/pr91461-2.c: Likewise.
> > > > > > * gcc.target/i386/pr91461-3.c: Likewise.
> > > > > > * gcc.target/i386/pr91461-4.c: Likewise.
> > > > > > * gcc.target/i386/pr91461-5.c: Likewise.
> > > > > > ---
> > > > > >  gcc/config/i386/i386.h|  4 +-
> > > > > >  gcc/config/i386/i386.md   |  4 +-
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 
> > > > > > +++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
> > > > > >  7 files changed, 203 insertions(+), 4 deletions(-)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
> > > > > >
> > > > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > > > > index 943e9a5c783..c134b04c5c4 100644
> > > > > > --- a/gcc/config/i386/i386.h
> > > > > > +++ b/gcc/config/i386/i386.h
> > > > > > @@ -516,8 +516,10 @@ extern unsigned char 
> > > > > > ix86_tune_features[X86_TUNE_LAST];
> > > > > >  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> > > > > > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
> > > > > >  #define TARGET_SSE_SPLIT_REGS  
> > > > > > ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> > > > > > +/* NB: movaps/movups is one byte shorter than movdaq/movdqu.  But 
> > > > > > it
> > > > > > +   isn't the case for AVX nor AVX512.  */
> > > > > >  #define TARGET_SSE_TYPELESS_STORES \
> > > > > > -   ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> > > > > > +   (!TARGET_AVX && 
> > > > > > ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
> > > > >
> > > > > This is wrong place to disable the feature.
> > > >
> > > > Like this?
> > >
> > > No.
> > >
> > > There is a mode attribute in i386.md/sse.md for relevant patterns.
> > > Please adapt calculation of mode attributes instead.
> > >
> >
> > Like this?
>
> Still no.
>
> You could move
>
> (match_test "TARGET_AVX")
>   (const_string "TI")
>
> up to bypass the cases below.
>

I don't think we can do that.   There are 2 cases where we prefer movaps/movups:

/* Use packed single precision instructions where posisble.  I.e.
movups instead   of movupd.  */
DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL,
"sse_packed_single_insn_optimal",
  m_BDVER | m_ZNVER)

/* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores.   */
DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores",
  m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC)

We should always use movaps/movups for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL.
It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with TARGET_AVX
as m_BDVER | m_ZNVER support AVX.

-- 
H.J.


[PATCH] diagnostic_metadata: unbreak xgettext (v2)

2020-01-28 Thread David Malcolm
On Tue, 2020-01-28 at 15:36 +0100, Jakub Jelinek wrote:
> On Tue, Jan 28, 2020 at 09:30:17AM -0500, David Malcolm wrote:
> > * diagnostic-core.h (warning_at): Rename overload to...
> > (warning_with_metadata_at): ...this.
> 
> I think this is just too long, especially for a function used at
> least
> in the analyzer pretty often, leading to lots of ugly formatting.
> Can't you use warning_meta or something similar instead?
> 
> > (emit_diagnostic_valist): Delete decl of overload taking
> > diagnostic_metadata.
> > * diagnostic.c (emit_diagnostic_valist): Likewise for defn.
> > (warning_at): Rename overload taking diagnostic_metadata to...
> > (warning_with_metadata_at): ...this.
> 
>   Jakub

Fair enough, thanks.  Bootstrap in progress of v2:

Changed in v2:
- rename from warning_with_metadata_at to warning_meta
- fix test plugins

While C++ can have overloads, xgettext can't deal with overloads that have
different argument positions, leading to two failures in "make gcc.pot":

emit_diagnostic_valist used incompatibly as both
--keyword=emit_diagnostic_valist:4
--flag=emit_diagnostic_valist:4:gcc-internal-format and
--keyword=emit_diagnostic_valist:5
--flag=emit_diagnostic_valist:5:gcc-internal-format

warning_at used incompatibly as both
--keyword=warning_at:3 --flag=warning_at:3:gcc-internal-format and
--keyword=warning_at:4 --flag=warning_at:4:gcc-internal-format

The emit_diagnostic_valist overload isn't used anywhere (I think it's
a leftover from an earlier iteration of the analyzer patch kit).

The warning_at overload is used throughout the analyzer but nowhere else.

Ideally I'd like to consolidate this argument with something
constructable in various ways:
- from a metadata and an int or
- from an int (or, better an "enum opt_code"),
so that the overload happens when implicitly choosing the ctor, but
that feels like stage 1 material.

In the meantime, fix xgettext by deleting the unused overload and
renaming the used one.

gcc/analyzer/ChangeLog:
* region-model.cc (poisoned_value_diagnostic::emit): Update for
renaming of warning_at overload to warning_meta.
* sm-file.cc (file_leak::emit): Likewise.
* sm-malloc.cc (double_free::emit): Likewise.
(possible_null_deref::emit): Likewise.
(possible_null_arg::emit): Likewise.
(null_deref::emit): Likewise.
(null_arg::emit): Likewise.
(use_after_free::emit): Likewise.
(malloc_leak::emit): Likewise.
(free_of_non_heap::emit): Likewise.
* sm-sensitive.cc (exposure_through_output_file::emit): Likewise.
* sm-signal.cc (signal_unsafe_call::emit): Likewise.
* sm-taint.cc (tainted_array_index::emit): Likewise.

gcc/ChangeLog:
* diagnostic-core.h (warning_at): Rename overload to...
(warning_meta): ...this.
(emit_diagnostic_valist): Delete decl of overload taking
diagnostic_metadata.
* diagnostic.c (emit_diagnostic_valist): Likewise for defn.
(warning_at): Rename overload taking diagnostic_metadata to...
(warning_meta): ...this.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic_plugin_test_metadata.c: Update for
renaming of warning_at overload to warning_meta.
* gcc.dg/plugin/diagnostic_plugin_test_paths.c: Likewise.
---
 gcc/analyzer/region-model.cc  | 19 ---
 gcc/analyzer/sm-file.cc   |  6 +--
 gcc/analyzer/sm-malloc.cc | 49 ++-
 gcc/analyzer/sm-sensitive.cc  |  7 +--
 gcc/analyzer/sm-signal.cc |  8 +--
 gcc/analyzer/sm-taint.cc  | 24 -
 gcc/diagnostic-core.h |  9 ++--
 gcc/diagnostic.c  | 16 ++
 .../plugin/diagnostic_plugin_test_metadata.c  |  4 +-
 .../plugin/diagnostic_plugin_test_paths.c | 13 ++---
 10 files changed, 73 insertions(+), 82 deletions(-)

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 62c96a6ceea..acaadcf9d3b 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3827,27 +3827,26 @@ public:
{
  diagnostic_metadata m;
  m.add_cwe (457); /* "CWE-457: Use of Uninitialized Variable".  */
- return warning_at (rich_loc, m,
-OPT_Wanalyzer_use_of_uninitialized_value,
-"use of uninitialized value %qE",
-m_expr);
+ return warning_meta (rich_loc, m,
+  OPT_Wanalyzer_use_of_uninitialized_value,
+  "use of uninitialized value %qE",
+  m_expr);
}
break;
   case POISON_KIND_FREED:
{
  diagnostic_metadata m;
  m.add_cwe (416); /* "CWE-416: Use After Free".  */
- return warning_at (rich_loc, m,
-

Re: [PATCH] gcc: Add new configure options to allow static libraries to be selected

2020-01-28 Thread Jonathan Wakely

On 22/01/20 15:39 +, Andrew Burgess wrote:

The motivation behind this change is to make it easier for a user to
link against static libraries on a target where dynamic libraries are
the default library type (for example GNU/Linux).

Further, my motivation is really for linking libraries into GDB,
however, the binutils-gdb/config/ directory is a copy of gcc/config/
so changes for GDB need to be approved by the GCC project first.

After making this change in the gcc/config/ directory I've run
autoreconf on all of the configure scripts in the GCC tree and a
couple have been updated, so I'll use one of these to describe what my
change does.

Consider libcpp, this library links against libiconv.  Currently if
the user builds on a system with both static and dynamic libiconv
installed then autotools will pick up the dynamic libiconv by
default.  This is almost certainly the right thing to do.

However, if the user wants to link against static libiconv then things
are a little harder, they could remove the dynamic libiconv from their
system, but this is probably a bad idea (other things might depend on
that library),


Typically they would only need to remove the .so symlink used when
linking, not the actual libiconv.so.NNN library that other things
depend on. If they put the symlink back again, it's only a problem
for other builds that might want the symlink, not for any
already-linked software using the shared library. Still not ideal
though.


or the user can build their own version of libiconv,
install it into a unique prefix, and then configure gcc using the
--with-libiconv-prefix=DIR flag.  This works fine, but is somewhat
annoying, the static library available, I just can't get autotools to
use it.


A much simpler solution is:

mkdir -p /tmp/iconv/lib
ln -s $real_iconv_prefix/include /tmp/iconv/
ln -s $real_iconv_prefix/lib64/libiconv.a /tmp/iconv/lib64

And then configure with --with-libiconv-prefix=/tmp/iconv

i.e. there's no need to build or install anything, just use symlinks
to create a new path that doesn't have the shared library.

It's still a bit more effort than with your new configure option, but
it's pretty trivial.



Re: [Ping][GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-28 Thread Vladimir Makarov

On 1/28/20 4:30 AM, Joel Hutton wrote:

On 28/01/2020 09:07, Eric Botcazou wrote:

Ping! Eric, do you have any objections to reverting?

See my comment posted in the audit trail of the TN on 01/20...
Probably missing live range splitting or somesuch, as envisioned by
Vladimir in its approval message.  Feel free to eventually revert it.

Great. Vladimir, Ok for trunk?


Yes. Thank you.



[PATCH] Add OpenACC acc_get_property support for AMD GCN

2020-01-28 Thread Harwath, Frederik
Hi,
this patch adds full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin. This replaces
the existing stub in libgomp/plugin-gcn.c.

Andrew: The value returned for acc_property_memory ("size of device memory in 
bytes"
according to the spec) is the HSA_REGION_INFO_SIZE of the agent's data_region. 
This
has been adapted from a previous incomplete implementation that we had on the 
OG9 branch.
Does that sound reasonable?

I have tested the patch with amdgcn and nvptx offloading.

Ok to commit this to the main branch?


Best regards,
Frederik

From 6f1855281c38993a088f9b4af020a786f8e05fe9 Mon Sep 17 00:00:00 2001
From: Frederik Harwath 
Date: Tue, 28 Jan 2020 08:01:00 +0100
Subject: [PATCH] Add OpenACC acc_get_property support for AMD GCN

Add full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin.

libgomp/
	* plugin-gcn.c (struct agent_info): Add fields "name" and
	"vendor_name" ...
	(GOMP_OFFLOAD_init_device): ... and init from here.
	(struct hsa_context_info): Add field "driver_version_s" ...
	(init_hsa_contest): ... and init from here.
	(GOMP_OFFLOAD_openacc_get_property): Replace stub with a proper
	implementation.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c:
	Enable test execution for amdgcn and host offloading targets.
	* testsuite/libgomp.oacc-fortran/acc_get_property.f90: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c
	(expect_device_properties): Split function into ...
	(expect_device_string_properties): ... this new function ...
	(expect_device_memory): ... and this new function.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c:
	Add test.
---
 libgomp/plugin/plugin-gcn.c   |  63 +++--
 .../acc_get_property-aux.c|  60 +---
 .../acc_get_property-gcn.c| 132 ++
 .../acc_get_property.c|   5 +-
 .../libgomp.oacc-fortran/acc_get_property.f90 |   2 -
 5 files changed, 224 insertions(+), 38 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c

diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 7854c142f05..0a09daaa0a4 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -425,7 +425,10 @@ struct agent_info
 
   /* The instruction set architecture of the device. */
   gcn_isa device_isa;
-
+  /* Name of the agent. */
+  char name[64];
+  /* Name of the vendor of the agent. */
+  char vendor_name[64];
   /* Command queues of the agent.  */
   hsa_queue_t *sync_queue;
   struct goacc_asyncqueue *async_queues, *omp_async_queue;
@@ -544,6 +547,8 @@ struct hsa_context_info
   int agent_count;
   /* Array of agent_info structures describing the individual HSA agents.  */
   struct agent_info *agents;
+  /* Driver version string. */
+  char driver_version_s[30];
 };
 
 /* Format of the on-device heap.
@@ -1513,6 +1518,15 @@ init_hsa_context (void)
 	GOMP_PLUGIN_error ("Failed to list all HSA runtime agents");
 }
 
+  uint16_t minor, major;
+  status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MINOR, );
+  if (status != HSA_STATUS_SUCCESS)
+GOMP_PLUGIN_error ("Failed to obtain HSA runtime minor version");
+  status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MAJOR, );
+  if (status != HSA_STATUS_SUCCESS)
+GOMP_PLUGIN_error ("Failed to obtain HSA runtime major version");
+  sprintf (hsa_context.driver_version_s, "HSA Runtime %d.%d", major, minor);
+
   hsa_context.initialized = true;
   return true;
 }
@@ -3410,15 +3424,19 @@ GOMP_OFFLOAD_init_device (int n)
 return hsa_error ("Error requesting maximum queue size of the GCN agent",
 		  status);
 
-  char buf[64];
   status = hsa_fns.hsa_agent_get_info_fn (agent->id, HSA_AGENT_INFO_NAME,
-	  );
+	  >name);
   if (status != HSA_STATUS_SUCCESS)
 return hsa_error ("Error querying the name of the agent", status);
 
-  agent->device_isa = isa_code (buf);
+  agent->device_isa = isa_code (agent->name);
   if (agent->device_isa < 0)
-return hsa_error ("Unknown GCN agent architecture.", HSA_STATUS_ERROR);
+return hsa_error ("Unknown GCN agent architecture", HSA_STATUS_ERROR);
+
+  status = hsa_fns.hsa_agent_get_info_fn (agent->id, HSA_AGENT_INFO_VENDOR_NAME,
+	  >vendor_name);
+  if (status != HSA_STATUS_SUCCESS)
+return hsa_error ("Error querying the vendor name of the agent", status);
 
   status = hsa_fns.hsa_queue_create_fn (agent->id, queue_size,
 	HSA_QUEUE_TYPE_MULTI,
@@ -4115,12 +4133,37 @@ GOMP_OFFLOAD_openacc_async_dev2host (int device, void *dst, const void *src,
 union goacc_property_value
 GOMP_OFFLOAD_openacc_get_property (int device, enum goacc_property prop)
 {
-  /* Stub. Check device and return default value for unsupported properties. */
-  /* TODO: Implement this function. */
-  get_agent_info (device);
+  

[PATCH] c++: Function declared with typedef with eh-specification.

2020-01-28 Thread Jason Merrill
We just need to handle the exception specification like other properties of
a function typedef.

Tested x86_64-pc-linux-gnu, applying to trunk/9.

PR c++/90731
* decl.c (grokdeclarator): Propagate eh spec from typedef.
---
 gcc/cp/decl.c| 1 +
 gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C | 6 ++
 2 files changed, 7 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index e55de5dd53d..6ad558eef9e 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -12848,6 +12848,7 @@ grokdeclarator (const cp_declarator *declarator,
  memfn_quals |= type_memfn_quals (type);
  rqual = type_memfn_rqual (type);
  type_quals = TYPE_UNQUALIFIED;
+ raises = TYPE_RAISES_EXCEPTIONS (type);
}
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C 
b/gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C
new file mode 100644
index 000..dd9924ff1b1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C
@@ -0,0 +1,6 @@
+// PR c++/90731
+// { dg-do compile { target c++17 } }
+
+typedef void T() noexcept(true);
+T t;
+void t() noexcept(true);

base-commit: a5ed4958a2c1b563e933b25ca3b481761cc40b07
-- 
2.18.1



Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-28 Thread Uros Bizjak
On Tue, Jan 28, 2020 at 3:32 PM H.J. Lu  wrote:
>
> On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak  wrote:
> >
> > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu  wrote:
> > >
> > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak  wrote:
> > > >
> > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu  wrote:
> > > > >
> > > > > movaps/movups is one byte shorter than movdaq/movdqu.  But it isn't 
> > > > > the
> > > > > case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
> > > > > for TARGET_AVX.
> > > > >
> > > > > gcc/
> > > > >
> > > > > PR target/91461
> > > > > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
> > > > > TARGET_AVX.
> > > > > * config/i386/i386.md (*movoi_internal_avx): Remove
> > > > > TARGET_SSE_TYPELESS_STORES check.
> > > > >
> > > > > gcc/testsuite/
> > > > >
> > > > > PR target/91461
> > > > > * gcc.target/i386/pr91461-1.c: New test.
> > > > > * gcc.target/i386/pr91461-2.c: Likewise.
> > > > > * gcc.target/i386/pr91461-3.c: Likewise.
> > > > > * gcc.target/i386/pr91461-4.c: Likewise.
> > > > > * gcc.target/i386/pr91461-5.c: Likewise.
> > > > > ---
> > > > >  gcc/config/i386/i386.h|  4 +-
> > > > >  gcc/config/i386/i386.md   |  4 +-
> > > > >  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
> > > > >  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
> > > > >  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 
> > > > > +++
> > > > >  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
> > > > >  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
> > > > >  7 files changed, 203 insertions(+), 4 deletions(-)
> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
> > > > >
> > > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > > > index 943e9a5c783..c134b04c5c4 100644
> > > > > --- a/gcc/config/i386/i386.h
> > > > > +++ b/gcc/config/i386/i386.h
> > > > > @@ -516,8 +516,10 @@ extern unsigned char 
> > > > > ix86_tune_features[X86_TUNE_LAST];
> > > > >  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> > > > > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
> > > > >  #define TARGET_SSE_SPLIT_REGS  
> > > > > ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> > > > > +/* NB: movaps/movups is one byte shorter than movdaq/movdqu.  But it
> > > > > +   isn't the case for AVX nor AVX512.  */
> > > > >  #define TARGET_SSE_TYPELESS_STORES \
> > > > > -   ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> > > > > +   (!TARGET_AVX && 
> > > > > ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
> > > >
> > > > This is wrong place to disable the feature.
> > >
> > > Like this?
> >
> > No.
> >
> > There is a mode attribute in i386.md/sse.md for relevant patterns.
> > Please adapt calculation of mode attributes instead.
> >
>
> Like this?

Still no.

You could move

(match_test "TARGET_AVX")
  (const_string "TI")

up to bypass the cases below.

Uros.


Uros.

>
> --
> H.J.


Re: [PATCH] diagnostic_metadata: unbreak xgettext

2020-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2020 at 09:30:17AM -0500, David Malcolm wrote:
>   * diagnostic-core.h (warning_at): Rename overload to...
>   (warning_with_metadata_at): ...this.

I think this is just too long, especially for a function used at least
in the analyzer pretty often, leading to lots of ugly formatting.
Can't you use warning_meta or something similar instead?

>   (emit_diagnostic_valist): Delete decl of overload taking
>   diagnostic_metadata.
>   * diagnostic.c (emit_diagnostic_valist): Likewise for defn.
>   (warning_at): Rename overload taking diagnostic_metadata to...
>   (warning_with_metadata_at): ...this.

Jakub



Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-28 Thread H.J. Lu
On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak  wrote:
>
> On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu  wrote:
> >
> > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak  wrote:
> > >
> > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu  wrote:
> > > >
> > > > movaps/movups is one byte shorter than movdaq/movdqu.  But it isn't the
> > > > case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
> > > > for TARGET_AVX.
> > > >
> > > > gcc/
> > > >
> > > > PR target/91461
> > > > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
> > > > TARGET_AVX.
> > > > * config/i386/i386.md (*movoi_internal_avx): Remove
> > > > TARGET_SSE_TYPELESS_STORES check.
> > > >
> > > > gcc/testsuite/
> > > >
> > > > PR target/91461
> > > > * gcc.target/i386/pr91461-1.c: New test.
> > > > * gcc.target/i386/pr91461-2.c: Likewise.
> > > > * gcc.target/i386/pr91461-3.c: Likewise.
> > > > * gcc.target/i386/pr91461-4.c: Likewise.
> > > > * gcc.target/i386/pr91461-5.c: Likewise.
> > > > ---
> > > >  gcc/config/i386/i386.h|  4 +-
> > > >  gcc/config/i386/i386.md   |  4 +-
> > > >  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
> > > >  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
> > > >  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++
> > > >  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
> > > >  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
> > > >  7 files changed, 203 insertions(+), 4 deletions(-)
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
> > > >
> > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > > index 943e9a5c783..c134b04c5c4 100644
> > > > --- a/gcc/config/i386/i386.h
> > > > +++ b/gcc/config/i386/i386.h
> > > > @@ -516,8 +516,10 @@ extern unsigned char 
> > > > ix86_tune_features[X86_TUNE_LAST];
> > > >  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> > > > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
> > > >  #define TARGET_SSE_SPLIT_REGS  
> > > > ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> > > > +/* NB: movaps/movups is one byte shorter than movdaq/movdqu.  But it
> > > > +   isn't the case for AVX nor AVX512.  */
> > > >  #define TARGET_SSE_TYPELESS_STORES \
> > > > -   ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> > > > +   (!TARGET_AVX && 
> > > > ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
> > >
> > > This is wrong place to disable the feature.
> >
> > Like this?
>
> No.
>
> There is a mode attribute in i386.md/sse.md for relevant patterns.
> Please adapt calculation of mode attributes instead.
>

Like this?


-- 
H.J.
From 1ba0c9ce5f764b8faa8c66b70e676af187a57415 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 27 Jan 2020 09:35:11 -0800
Subject: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

movaps/movups is one byte shorter than movdaq/movdqu.  But it isn't the
case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
for TARGET_AVX.

gcc/

	PR target/91461
	* config/i386/i386.md (*movoi_internal_avx): Remove
	TARGET_SSE_TYPELESS_STORES check.
	(*movti_internal): Disable TARGET_SSE_TYPELESS_STORES for
	TARGET_AVX.
	* config/i386/sse.md (mov_internal): Likewise.

gcc/testsuite/

	PR target/91461
	* gcc.target/i386/pr91461-1.c: New test.
	* gcc.target/i386/pr91461-2.c: Likewise.
	* gcc.target/i386/pr91461-3.c: Likewise.
	* gcc.target/i386/pr91461-4.c: Likewise.
	* gcc.target/i386/pr91461-5.c: Likewise.
---
 gcc/config/i386/i386.md   |  8 +--
 gcc/config/i386/sse.md|  2 +-
 gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
 gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
 gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++
 gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
 gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
 7 files changed, 203 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a125ab350bb..62aaf40a4af 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1980,9 +1980,7 @@ (define_insn "*movoi_internal_avx"
 	   (and (eq_attr "alternative" "1")
 		(match_test "TARGET_AVX512VL"))
 		 (const_string "XI")
-	   (ior (match_test 

[PATCH] diagnostic_metadata: unbreak xgettext

2020-01-28 Thread David Malcolm
On Tue, 2020-01-28 at 11:16 +0100, Jakub Jelinek wrote:
> On Wed, Dec 18, 2019 at 07:08:25PM -0500, David Malcolm wrote:
> > This patch adds support for associating a diagnostic message with
> > an
> > optional diagnostic_metadata object, so that plugins can add extra
> > data
> > to their diagnostics (e.g. mapping a diagnostic to a taxonomy or
> > coding
> > standard such as from CERT or MISRA).
> > 
> > Currently this only supports associating a CWE identifier with a
> > diagnostic (which is what I'm using for the warnings in the
> > analyzer
> > patch kit), but adding a diagnostic_metadata class allows for
> > future
> > growth in this area without an explosion of further "warning_at"
> > overloads for all of the different kinds of custom data that a
> > plugin
> > might want to add.
> > 
> > This version of the patch renames the overly-general
> > -fdiagnostics-show-metadata to -fdiagnostics-show-cwe and adds test
> > coverage for it via a plugin.
> > 
> > It also adds a note to the documentation that no GCC diagnostics
> > currently use this; it's a feature for plugins (and, at some point,
> > I hope, the analyzer).
> > 
> > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> > 
> > Committed to trunk as r279556.
> 
> Unfortunately, this patch broke the i18n.
> $ make gcc.pot
> /bin/sh ../../gcc/../mkinstalldirs po
> make srcextra
> make[1]: Entering directory '/usr/src/gcc/obj/gcc'
> cp -p gengtype-lex.c ../../gcc
> cp -p gengtype-lex.c ../../gcc
> make[1]: Leaving directory '/usr/src/gcc/obj/gcc'
> AWK=gawk /bin/sh ../../gcc/po/exgettext \
>   /usr/bin/xgettext gcc ../../gcc
> scanning for keywords, %e and %n strings...
> emit_diagnostic_valist used incompatibly as both --
> keyword=emit_diagnostic_valist:4
> --flag=emit_diagnostic_valist:4:gcc-internal-format and --
> keyword=emit_diagnostic_valist:5
> --flag=emit_diagnostic_valist:5:gcc-internal-format
> make: *** [Makefile:4338: po/gcc.pot] Error 1
> 
> While C++ can have overloads, xgettext can't deal with overloads that
> have
> different argument positions.
> emit_diagnostic_valist with the new args (i.e. the :5 one) isn't used
> right
> now, so one way around at least for now is to remove it again,
> another is to
> rename it.
> 
>   Jakub

Sorry about this.

There are actually two broken names; here's the patch I'm currently testing
(which fixes "make gcc.pot" for me); bootstrap is in progress:


While C++ can have overloads, xgettext can't deal with overloads that have
different argument positions, leading to two failures in "make gcc.pot":

emit_diagnostic_valist used incompatibly as both
--keyword=emit_diagnostic_valist:4
--flag=emit_diagnostic_valist:4:gcc-internal-format and
--keyword=emit_diagnostic_valist:5
--flag=emit_diagnostic_valist:5:gcc-internal-format

warning_at used incompatibly as both
--keyword=warning_at:3 --flag=warning_at:3:gcc-internal-format and
--keyword=warning_at:4 --flag=warning_at:4:gcc-internal-format

The emit_diagnostic_valist overload isn't used anywhere (I think it's
a leftover from an earlier iteration of the analyzer patch kit).

The warning_at overload is used throughout the analyzer but nowhere else.

Ideally I'd like to consolidate this argument with something
constructable in various ways:
- from a metadata and an int or
- from an int (or, better an "enum opt_code"),
so that the overload happens when implicitly choosing the ctor, but
that feels like stage 1 material.

In the meantime, fix xgettext by deleting the unused overload and
renaming the used one.

gcc/analyzer/ChangeLog:
* region-model.cc (poisoned_value_diagnostic::emit): Update for
renaming of warning_at overload to warning_with_metadata_at.
* sm-file.cc (file_leak::emit): Likewise.
* sm-malloc.cc (double_free::emit): Likewise.
(possible_null_deref::emit): Likewise.
(possible_null_arg::emit): Likewise.
(null_deref::emit): Likewise.
(null_arg::emit): Likewise.
(use_after_free::emit): Likewise.
(malloc_leak::emit): Likewise.
(free_of_non_heap::emit): Likewise.
* sm-sensitive.cc (exposure_through_output_file::emit): Likewise.
* sm-signal.cc (signal_unsafe_call::emit): Likewise.
* sm-taint.cc (tainted_array_index::emit): Likewise.

gcc/ChangeLog:
* diagnostic-core.h (warning_at): Rename overload to...
(warning_with_metadata_at): ...this.
(emit_diagnostic_valist): Delete decl of overload taking
diagnostic_metadata.
* diagnostic.c (emit_diagnostic_valist): Likewise for defn.
(warning_at): Rename overload taking diagnostic_metadata to...
(warning_with_metadata_at): ...this.
---
 gcc/analyzer/region-model.cc | 24 
 gcc/analyzer/sm-file.cc  |  6 ++--
 gcc/analyzer/sm-malloc.cc| 54 
 gcc/analyzer/sm-sensitive.cc |  7 +++--
 gcc/analyzer/sm-signal.cc|  8 +++---
 gcc/analyzer/sm-taint.cc | 27 

Re: [PATCH] Fix component mappings with derived types for OpenACC

2020-01-28 Thread Julian Brown
On Fri, 24 Jan 2020 10:58:49 +0100
Tobias Burnus  wrote:

> Hi Julian,
> 
> the gfortran part is rather obvious and it and the test case look
> fine to me → OK.
> The oacc-mem.c also looks okay, but I assume Thomas needs to 
> rubber-stamp it.

I understand that Thomas is unavailable for the time being, so won't be
able to use his rubber-stamp powers. I added the offending libgomp code
to start with though, so I think I can go ahead and commit the patch.
I'll hold off for 24 hours though in case there are any objections
(Jakub?).

Thanks,

Julian

> On 1/10/20 2:49 AM, Julian Brown wrote:
> > This patch fixes a bug with mapping Fortran components (i.e. with
> > the manual deep-copy support) which themselves have derived types.
> > I've also added a couple of new tests to make sure such mappings
> > are lowered correctly, and to check for the case that Tobias found
> > in the message:
> >
> >https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00215.html
> >
> > The previous incorrect mapping was causing the error condition
> > mentioned in that message to fail to trigger, and I think (my!)
> > code in libgomp (goacc_exit_data_internal) to handle
> > GOMP_MAP_STRUCT specially was papering over the bad mapping also.
> > Oops!
> >
> > I haven't attempted to implement the (harder) sub-copy detection, if
> > that is indeed supposed to be forbidden by the spec. This patch
> > should get us to the same behaviour in Fortran as in C & C++ though.
> >
> > Tested with offloading to nvptx, also with the (uncommitted)
> > reference-count self-checking patch enabled.
> >
> > OK?
> >
> > Thanks,
> >
> > Julian
> >
> > ChangeLog
> >
> >  gcc/fortran/
> >  * trans-openmp.c (gfc_trans_omp_clauses): Use inner not decl
> > for components with derived types.
> >
> >  gcc/testsuite/
> >  * gfortran.dg/goacc/mapping-tests-3.f90: New test.
> >  * gfortran.dg/goacc/mapping-tests-4.f90: New test.
> >
> >  libgomp/
> >  * oacc-mem.c (goacc_exit_data_internal): Remove special
> > (no-copyback) behaviour for GOMP_MAP_STRUCT.
> >
> > component-mappings-derived-types-1.diff
> >
> > commit 5e9d8846bbaa33a9bdb08adcf1ee9f224a8e8fc0
> > Author: Julian Brown
> > Date:   Wed Jan 8 15:57:46 2020 -0800
> >
> >  Fix component mappings with derived types for OpenACC
> >  
> >  gcc/fortran/
> >  * trans-openmp.c (gfc_trans_omp_clauses): Use inner
> > not decl for components with derived types.
> >  
> >  gcc/testsuite/
> >  * gfortran.dg/goacc/mapping-tests-3.f90: New test.
> >  * gfortran.dg/goacc/mapping-tests-4.f90: New test.
> >  
> >  libgomp/
> >  * oacc-mem.c (goacc_exit_data_internal): Remove
> > special (no-copyback) behaviour for GOMP_MAP_STRUCT.
> >
> > diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
> > index 9835d2aae3c..efc392d7fa6 100644
> > --- a/gcc/fortran/trans-openmp.c
> > +++ b/gcc/fortran/trans-openmp.c
> > @@ -2783,9 +2783,9 @@ gfc_trans_omp_clauses (stmtblock_t *block,
> > gfc_omp_clauses *clauses, }
> >   else
> > {
> > - OMP_CLAUSE_DECL (node) = decl;
> > + OMP_CLAUSE_DECL (node) = inner;
> >   OMP_CLAUSE_SIZE (node)
> > -   = TYPE_SIZE_UNIT (TREE_TYPE (decl));
> > +   = TYPE_SIZE_UNIT (TREE_TYPE (inner));
> > }
> > }
> >   else if (lastcomp->next
> > diff --git a/gcc/testsuite/gfortran.dg/goacc/mapping-tests-3.f90
> > b/gcc/testsuite/gfortran.dg/goacc/mapping-tests-3.f90 new file mode
> > 100644 index 000..312f596461e
> > --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/goacc/mapping-tests-3.f90
> > @@ -0,0 +1,15 @@
> > +! { dg-options "-fopenacc -fdump-tree-omplower" }
> > +
> > +subroutine foo
> > +  type one
> > +integer i, j
> > +  end type
> > +  type two
> > +type(one) A, B
> > +  end type
> > +
> > +  type(two) x
> > +
> > +  !$acc enter data copyin(x%A)
> > +! { dg-final { scan-tree-dump-times "omp target
> > oacc_enter_exit_data map\\(struct:x \\\[len: 1\\\]\\) map\\(to:x.a
> > \\\[len: \[0-9\]+\\\]\\)" 1 "omplower" } } +end diff --git
> > a/gcc/testsuite/gfortran.dg/goacc/mapping-tests-4.f90
> > b/gcc/testsuite/gfortran.dg/goacc/mapping-tests-4.f90 new file mode
> > 100644 index 000..6257af942df --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/goacc/mapping-tests-4.f90
> > @@ -0,0 +1,17 @@
> > +subroutine foo
> > +  type one
> > +integer i, j
> > +  end type
> > +  type two
> > +type(one) A, B
> > +  end type
> > +
> > +  type(two) x
> > +
> > +! This is accepted at present, although it represents a
> > probably-unintentional +! overlapping subcopy.
> > +  !$acc enter data copyin(x%A, x%A%i)
> > +! But this raises an error.
> > +  !$acc enter data copyin(x%A, x%A%i, x%A%i)
> > +! { dg-error ".x.a.i. appears more than once in map clauses"
> > "" { target 

[COMMITTED] libstdc++: Avoid using sizeof with function types (PR 93470)

2020-01-28 Thread Jonathan Wakely
PR libstdc++/93470
* include/bits/refwrap.h (reference_wrapper::operator()): Restrict
static assertion to object types.

Tested powerpc64le-linux and verified with Clang.

Committed to master. Backport to gcc-9-branch needed too.

commit 72a9fd209b6db3b5f81fbb008e22ea93c00465e5
Author: Jonathan Wakely 
Date:   Tue Jan 28 13:24:09 2020 +

libstdc++: Avoid using sizeof with function types (PR 93470)

PR libstdc++/93470
* include/bits/refwrap.h (reference_wrapper::operator()): Restrict
static assertion to object types.

diff --git a/libstdc++-v3/include/bits/refwrap.h 
b/libstdc++-v3/include/bits/refwrap.h
index 2bcd44d3894..717aa01629c 100644
--- a/libstdc++-v3/include/bits/refwrap.h
+++ b/libstdc++-v3/include/bits/refwrap.h
@@ -343,7 +343,8 @@ _GLIBCXX_MEM_FN_TRAITS(&& noexcept, false_type, true_type)
operator()(_Args&&... __args) const
{
 #if __cplusplus > 201703L
- static_assert(sizeof(type), "type must be complete");
+ if constexpr (is_object_v)
+   static_assert(sizeof(type), "type must be complete");
 #endif
  return std::__invoke(get(), std::forward<_Args>(__args)...);
}


[PATCH] libstdc++: Replace glibc-specific check for clock_gettime (PR 93325)

2020-01-28 Thread Jonathan Wakely
It's wrong to assume that clock_gettime is unavailable on any *-*-linux*
target that doesn't have glibc 2.17 or later. Use a generic test instead
of using __GLIBC_PREREQ. Only do that test when is_hosted=yes so that we
don't get an error for cross targets without a working linker.

This ensures that C library's clock_gettime will be used on non-glibc
targets, instead of an incorrect syscall to SYS_clock_gettime.

PR libstdc++/93325
* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Use AC_SEARCH_LIBS for
clock_gettime instead of explicit glibc version check.
* configure: Regenerate.

Tested powerpc64le-linux, built on centos7 and centos6 as well to
check the results are correct for various versions of glibc.

Committed to master. I plan to backport this later too.

commit 759812fddc81c0c131d4633b2a7f56412ce8dbed
Author: Jonathan Wakely 
Date:   Tue Jan 28 13:24:09 2020 +

libstdc++: Replace glibc-specific check for clock_gettime (PR 93325)

It's wrong to assume that clock_gettime is unavailable on any *-*-linux*
target that doesn't have glibc 2.17 or later. Use a generic test instead
of using __GLIBC_PREREQ. Only do that test when is_hosted=yes so that we
don't get an error for cross targets without a working linker.

This ensures that C library's clock_gettime will be used on non-glibc
targets, instead of an incorrect syscall to SYS_clock_gettime.

PR libstdc++/93325
* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Use AC_SEARCH_LIBS 
for
clock_gettime instead of explicit glibc version check.
* configure: Regenerate.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 39147916e94..ee5e0336f2c 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1422,20 +1422,14 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
 ac_has_nanosleep=yes
 ;;
   gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu)
-AC_MSG_CHECKING([for at least GNU libc 2.17])
-AC_TRY_COMPILE(
-  [#include ],
-  [
-  #if ! __GLIBC_PREREQ(2, 17)
-  #error
-  #endif
-  ],
-  [glibcxx_glibc217=yes], [glibcxx_glibc217=no])
-AC_MSG_RESULT($glibcxx_glibc217)
-
-if test x"$glibcxx_glibc217" = x"yes"; then
-  ac_has_clock_monotonic=yes
-  ac_has_clock_realtime=yes
+# Don't use link test for freestanding library, in case gcc_no_link=yes
+if test x"$is_hosted" = xyes; then
+  # Versions of glibc before 2.17 needed -lrt for clock_gettime.
+  AC_SEARCH_LIBS(clock_gettime, [rt])
+  if test x"$ac_cv_search_clock_gettime" = x"none required"; then
+ac_has_clock_monotonic=yes
+ac_has_clock_realtime=yes
+  fi
 fi
 ac_has_nanosleep=yes
 ac_has_sched_yield=yes


Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)

2020-01-28 Thread Segher Boessenkool
On Tue, Jan 28, 2020 at 10:42:16AM +0100, Richard Biener wrote:
> On Mon, Jan 27, 2020 at 10:47 PM Segher Boessenkool
>  wrote:
> > On Tue, Jan 21, 2020 at 02:10:21PM +, Wilco Dijkstra wrote:
> > > While code hoisting generally improves codesize, it can affect performance
> > > negatively. Benchmarking shows it doesn't help SPEC and negatively affects
> > > embedded benchmarks. Since the impact is relatively small with -O2 and 
> > > mainly
> > > affects -O3, the simplest option is to disable code hoisting for -O3 and 
> > > higher.
> >
> > Should this be a generic thing, not target-specific?
> 
> The change doesn't make sense - I'm sure if you'd look at the actual cases
> it's not code hoisting in itself but "optimizations" enabled by it that cause
> the issues.  It's probably the same thing as PRE causing issues downstream
> for some benchmarks but do you disable PRE then by default just because of 
> that?

Well, that depends on how bad it is.  And of course not if it is just
because of benchmark peculiarities, we're not a banchmark compiler after
all, we're meant to compile real code.  But if PRE (just to continue
your example) would mess that up more often than not, then yes, it
should not be enabled by default.


Segher


[PATCH] tree-optimization/93439 move clique bookkeeping to OMP expansion

2020-01-28 Thread Richard Biener
Autopar was doing clique bookkeeping too early when creating destination
functions but then later introducing new cliques via versioning loops.
The following moves the bookkeeping to the actual outlining process.

Bootstrapped & tested on x86_64-unknown-linux-gnu, pushed.

Richard.

2020-01-28  Richard Biener  

PR tree-optimization/93439
* tree-parloops.c (create_loop_fn): Move clique bookkeeping...
* tree-cfg.c (move_sese_region_to_fn): ... here.
(verify_types_in_gimple_reference): Verify used cliques are
tracked.

* gfortran.dg/graphite/pr93439.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/graphite/pr93439.f90 | 21 +
 gcc/tree-cfg.c | 17 +
 gcc/tree-parloops.c|  1 -
 3 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/graphite/pr93439.f90

diff --git a/gcc/testsuite/gfortran.dg/graphite/pr93439.f90 
b/gcc/testsuite/gfortran.dg/graphite/pr93439.f90
new file mode 100644
index 000..e815ab929e1
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/graphite/pr93439.f90
@@ -0,0 +1,21 @@
+! { dg-additional-options "-O2 -floop-parallelize-all -floop-unroll-and-jam 
-ftree-parallelize-loops=2" }
+
+module ai
+  integer, parameter :: dp = 8
+contains
+  subroutine qu(ja, nq, en, p5)
+real(kind = dp) :: nq(ja), en(ja), p5(ja)
+call tl(ja, nq, en, p5)
+  end subroutine qu
+
+  subroutine tl(ja, nq, en, p5)
+real(kind = dp) :: nq(9), en(9 * ja), p5(3 * ja)
+do mc = 1, ja
+   do mb = 1, 9
+  do ma = 1, 3
+ p5((mc - 1) * 3 + ma) = p5((mc - 1) * 3 + ma) - 1
+  end do
+   end do
+end do
+  end subroutine tl
+end module ai
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index fd69b366bf4..f7b817d94e6 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3226,6 +3226,13 @@ verify_types_in_gimple_reference (tree expr, bool 
require_lvalue)
  debug_generic_stmt (expr);
  return true;
}
+  if (MR_DEPENDENCE_CLIQUE (expr) != 0
+ && MR_DEPENDENCE_CLIQUE (expr) > cfun->last_clique)
+   {
+ error ("invalid clique in %qs", code_name);
+ debug_generic_stmt (expr);
+ return true;
+   }
 }
   else if (TREE_CODE (expr) == TARGET_MEM_REF)
 {
@@ -3245,6 +3252,13 @@ verify_types_in_gimple_reference (tree expr, bool 
require_lvalue)
  debug_generic_stmt (expr);
  return true;
}
+  if (MR_DEPENDENCE_CLIQUE (expr) != 0
+ && MR_DEPENDENCE_CLIQUE (expr) > cfun->last_clique)
+   {
+ error ("invalid clique in %qs", code_name);
+ debug_generic_stmt (expr);
+ return true;
+   }
 }
   else if (TREE_CODE (expr) == INDIRECT_REF)
 {
@@ -7744,6 +7758,9 @@ move_sese_region_to_fn (struct function *dest_cfun, 
basic_block entry_bb,
   after = bb;
 }
 
+  /* Adjust the maximum clique used.  */
+  dest_cfun->last_clique = saved_cfun->last_clique;
+
   loop->aux = NULL;
   loop0->aux = NULL;
   /* Loop sizes are no longer correct, fix them up.  */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index d315b797a66..d9250d36c72 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2202,7 +2202,6 @@ create_loop_fn (location_t loc)
   DECL_ARGUMENTS (decl) = t;
 
   allocate_struct_function (decl, false);
-  DECL_STRUCT_FUNCTION (decl)->last_clique = act_cfun->last_clique;
 
   /* The call to allocate_struct_function clobbers CFUN, so we need to restore
  it.  */
-- 
2.16.4


Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-28 Thread Uecker, Martin
Am Dienstag, den 28.01.2020, 11:01 +0100 schrieb Richard Biener:
> On Tue, Jan 28, 2020 at 8:20 AM Alexander Monakov  wrote:
> > 
> > On Tue, 28 Jan 2020, Uecker, Martin wrote:
> > 
> > > > (*) this also shows the level of "obfuscation" needed to fool compilers
> > > > to lose provenance knowledge is hard to predict.
> > > 
> > > Well, this is exactly the problem we want to address by defining
> > > a clear way to do this. Casting to an integer would be the way
> > > to state: "consider the pointer as escaped and forget the
> > > provenance"  and casting an integer to a  pointer would
> > > mean "this pointer may point to all objects whose pointer has
> > > escaped". This would give the programmer explicit control about
> > > this aspect and make most existing code using pointer-to-integer
> > > casts well-defined. At the same time, this should be simple
> > > to add to existing points-to analysis.
> > 
> > Can you explain why you make it required for the compiler to treat the
> > points-to set unnecessarily broader than it could prove? In the Matlab
> > example, there's a simple chain of computations that the compiler is
> > following to prove that the pointer resulting from the final cast is
> > derived from exactly one other pointer (no other pointers have
> > participated in the computations).
> > 
> > Or, in other words:
> > 
> > is there an example where a programmer can distinguish between the
> > requirement you explain above vs. the fine-grained interpretation
> > that GIMPLE aims to implement (at least as I understand it), which is:
> > 
> >   when the program creates a pointer by means of non-pointer computations
> >   (casts, representation access, etc), the resulting pointer may point to:
> > 
> > * any object which address could have participated in the computation
> >   (which is at worst the entire set of "exposed" objects up to that
> >    program point, but can be much narrower if the compiler can see
> >    the entire chain of computations)
> > 
> > * any objects which is not "exposed" but could have known address, e.g.
> >   because it is placed at a specific address during linking
> 
> Note for the current PTA implementation there's almost no cases we can
> handle conservatively enough.  Consider the simple
> 
>  int a[4];
>  int *p = [1];
>  uintptr_t pi = (uintptr_t)p;
>  pi += 4;
>  int *q = (int *)pi;
> 
> our PTA knows that p points to a (but not the exact offset), same for pi
> (the cast doesn't change the value).  Now you add 4 - this could lead
> you outside of 'a' so the points-to set becomes 'a and anything'.

PVNI would say that 'a' gets exposed in the first case and
then 'q' can point to all exposed objects because of the
second cast.

A correct and conservative implementation would do this:
In the PTA you would need to mark the address of a escaped
and then later assign a conservative points-to set (all
escaped objects) to q.

Yes, this limits optimizations, but I do not think this is
terrible. (optimizations could be re-enabled with a compiler
option)

> I'm also not sure what PVNI does to
> 
>  int a[4];
>  int *p = [1];
>  p += 10;
>  uintptr_t pi = (uintptr_t)p;
>  p = (int *)pi;
> 
> we assume that p points to a even after p += 10 (but it of course points
> outside of the object - obvious here, but not necessarily in more
> obfuscated cases).

This is UB because a pointer to outside of the object is formed.

> Now, can we assume pi points to a?  The cast isn't value-changing.  Do we have
> to assume (int *)pi points to anything?  So, is
> 
>  p = (int *)(uintptr_t)p;
> 
> something like "laundering" a pointer?  We don't preserve such simple 
> round-trip
> casts since they are value-preserving.  Are they provenance preserving?

Yes, such a pair would be "laundering" as it allows 'p' to then 
point to any exposed object provenance-wise.

For such casts the FE would maybe add a marker. Maybe a calls
to builtin functions 'builtin_expose(a)' and 'builting_bless'.
(having those builtins would be interesting on its own, btw).

Having said this, some optimizations could still be allowed using
the "as-if" rule and other lines of reasoning. Specifally, PVNI
states that 'p' gets assigned the provenance of the object the
integer values is the address of. So if the compiler can proof
that the address belongs to certain objects it can reassign the
points-to set to the new 'p'. Only if there is ambiguity between
which objects the address belongs to, the reasoning needs to
be more conservative. 

For example:
int a[3];
int b[3];

 // b also exposed
int *p = (int*)(uintptr_t)[3];

Here, p could point to the one-after address of 'a' or the
first element of 'b'. (but only because 'b' was also exposed).

If the compiler can prove that something like this can not
happen (e.g. by considering offsets), it can still do some
tracking of points-to sets. 

Best,
Martin



> Richard.
> 
> > Thanks.
> > Alexander

Re: [PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

2020-01-28 Thread Segher Boessenkool
Hi!

On Mon, Jan 27, 2020 at 04:28:29PM +, Wilco Dijkstra wrote:
> > On Thu, Jan 16, 2020 at 12:50:14PM +, Wilco Dijkstra wrote:
> >> The separate shrinkwrapping pass may insert stores in the middle
> >> of atomics loops which can cause issues on some implementations.
> >> Avoid this by delaying splitting of atomic patterns until after
> >> prolog/epilog generation.
> >
> > Note that this isn't specific to sws at all: there isn't anything
> > stopping later passes from doing this either.  Is there anything that
> > protects us from sched2 doing similar here, for example?
> 
> The expansions create extra basic blocks and insert various barriers
> that would stop any reasonable scheduler from doing it. And the
> current scheduler is basic block based.

Hrm, and probably no other passes after *logue can create new store
insns (ignoring peepholes and splitters, that would be almost *asking*
for trouble ;-) ).  I see.  Thanks for explaining.


Segher


Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-28 Thread Uecker, Martin
Am Dienstag, den 28.01.2020, 10:20 +0300 schrieb Alexander Monakov:
> On Tue, 28 Jan 2020, Uecker, Martin wrote:
> 
> > > (*) this also shows the level of "obfuscation" needed to fool compilers
> > > to lose provenance knowledge is hard to predict.
> > 
> > Well, this is exactly the problem we want to address by defining
> > a clear way to do this. Casting to an integer would be the way
> > to state: "consider the pointer as escaped and forget the 
> > provenance"  and casting an integer to a  pointer would
> > mean "this pointer may point to all objects whose pointer has
> > escaped". This would give the programmer explicit control about
> > this aspect and make most existing code using pointer-to-integer
> > casts well-defined. At the same time, this should be simple
> > to add to existing points-to analysis.
> 
> Can you explain why you make it required for the compiler to treat the
> points-to set unnecessarily broader than it could prove? In the Matlab
> example, there's a simple chain of computations that the compiler is
> following to prove that the pointer resulting from the final cast is
> derived from exactly one other pointer (no other pointers have
> participated in the computations).
> 
> Or, in other words:
> 
> is there an example where a programmer can distinguish between the
> requirement you explain above vs. the fine-grained interpretation
> that GIMPLE aims to implement (at least as I understand it), which is:
> 
>   when the program creates a pointer by means of non-pointer computations
>   (casts, representation access, etc), the resulting pointer may point to:
> 
> * any object which address could have participated in the computation
>   (which is at worst the entire set of "exposed" objects up to that
>    program point, but can be much narrower if the compiler can see
>    the entire chain of computations)
> 
> * any objects which is not "exposed" but could have known address, e.g.
>   because it is placed at a specific address during linking

Unfortunately, this is not as simple. It is not trivial to
define the set of objects whose "address could have participated
in the computation"

int a = ... random number
int b = 
if (a == b) {
  int *p = (int*)a; 

Did '' participate in the computation?

What if you output and integer using I/O and read it back in?

What if you copy an integer using control flow?

There are many similar questions like this.

If we want to make this part of the standard, we need to formulate
rules for all integer operations about how the provenance flows.


There are several problems with this. A compiler needs to be able
to compute the complete points-to set. If it might miss an object
which is allowed to be used it has to be conservative with
aliasing - the analysis become useless. 

So we always have the trade-off between making the rules simpler
and restrict the tracking or specify more complicated rules that
allow sophisticated tracking but  then all compilers that want
to do this optimization have to implement this complicated
rules or need to fall back to being conservative. 


Finally, all integer operations would have a potential hidden
second meaning when applied to addresses of objects, which
makes it much easier to reason about.

Assume you have a function

int difference(int a, int b)
{
return a - b;
}

And later you replace an expression 
'difference(a, a)' with '0' in the program.
This seems a trivial and logical thing to do, but if
this expression was applied to addresses, you might
have broken a provenance chain and introduced a bug.


In my opinion, integers should stay integers with simple
logical properties. 

Gruß,
Martin





Re: [RFC] [c-family] PR92867 - Add returns_arg attribute

2020-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2020 at 05:09:36PM +0530, Prathamesh Kulkarni wrote:
> On Tue, 28 Jan 2020 at 17:00, Jakub Jelinek  wrote:
> >
> > On Tue, Jan 28, 2020 at 04:56:59PM +0530, Prathamesh Kulkarni wrote:
> > > Thanks for the suggestions. In the attached patch I bumped up value of
> > > ERF_RETURNS_ARG_MASK
> > > to UINT_MAX >> 2, and use highest two bits for ERF_NOALIAS and 
> > > ERF_RETURNS_ARG.
> > > And use fn spec "Z" to store the argument number in fn-spec 
> > > format.
> > > Does that look OK ?
> >
> > No.
> >
> > +#define ERF_RETURN_ARG_MASK(UINT_MAX >> 2)
> >
> >  /* Nonzero if the return value is equal to the argument number
> > flags & ERF_RETURN_ARG_MASK.  */
> > -#define ERF_RETURNS_ARG(1 << 2)
> > +#define ERF_RETURNS_ARG(1 << (BITS_PER_WORD - 2))
> >
> > How is size of host int related to BITS_PER_WORD?  Not to mention that
> > if BITS_PER_WORD is 64 and host int is 32-bit, 1 << (64 - 2) is UB.
> Oops sorry. I should have used HOST_BITS_PER_INT.
> I assume that'd be correct ?

It still wouldn't, 1 << (HOST_BITS_PER_INT - 1) is negative number, you'd
need either 1U and verify all ERF_* flags uses, or avoid using the sign bit.
The patch has other issues, you don't verify that the argnum fits into
the bits (doesn't overflow into the other ERF_* bits), in
+  char *s = (char *) xmalloc (sizeof (char) * BITS_PER_WORD);  
   
+  s[0] = 'Z';  
   
+  sprintf (s + 1, "%lu", argnum);  
   
1) sizeof (char) is 1 by definition
2) it is pointless to allocate it and then deallocate (just use automatic
array)
3) it is unclear how is BITS_PER_WORD related to the length of decimal
encoded string + Z char + terminating '\0'.  The usual way is for unsigned
numbers to estimate number of digits by counting 3 digits per each 8 bits,
in your case of course + 2.

More importantly, the "fn spec" attribute isn't used just in
gimple_call_return_flags, but also in e.g. gimple_call_arg_flags which
assumes that the return stuff in there is a single char and the reaming
chars are for argument descriptions, or in decl_return_flags which you
haven't modified.

I must say I fail to see the point in trying to glue this together into the
"fn spec" argument so incompatibly, why can't we handle the fn spec with its
1-4 returns_arg and returns_arg attribute with arbitrary position
side-by-side?

Jakub



Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)

2020-01-28 Thread Ramana Radhakrishnan
On Tue, Nov 26, 2019 at 3:18 PM Wilco Dijkstra  wrote:
>
> Hi,
>
> While code hoisting generally improves codesize, it can affect performance
> negatively. Benchmarking shows it doesn't help SPEC and negatively affects
> embedded benchmarks. Since the impact is relatively small with -O2 and mainly
> affects -O3, the simplest option is to disable code hoisting for -O3 and 
> higher.
>
> OK for commit?
>
> ChangeLog:
> 2019-11-26  Wilco Dijkstra  
>
> PR tree-optimization/80155
> * common/config/arm/arm-common.c (arm_option_optimization_table):
> Disable -fcode-hoisting with -O3.
> --
>
> diff --git a/gcc/common/config/arm/arm-common.c 
> b/gcc/common/config/arm/arm-common.c
> index 
> b761d3abd670a144a593c4b410b1e7fbdcb52f56..3e11f21b7dd76cc071b645c32a6fdb4a92511279
>  100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -39,6 +39,8 @@ static const struct default_options 
> arm_option_optimization_table[] =
>  /* Enable section anchors by default at -O1 or higher.  */
>  { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
>  { OPT_LEVELS_FAST, OPT_fsched_pressure, NULL, 1 },
> +/* Disable code hoisting with -O3 or higher.  */
> +{ OPT_LEVELS_3_PLUS, OPT_fcode_hoisting, NULL, 0 },
>  { OPT_LEVELS_NONE, 0, NULL, 0 }
>};
>

What are the cases in O3 where this is slower with embedded benchmarks
and how can we fix it ? Keeping the target "different" in this manner
doesn't augur well for the long term.

Ramana


Re: [PR47785] COLLECT_AS_OPTIONS

2020-01-28 Thread Richard Biener
On Fri, Jan 24, 2020 at 7:04 AM Prathamesh Kulkarni
 wrote:
>
> On Mon, 20 Jan 2020 at 15:44, Richard Biener  
> wrote:
> >
> > On Wed, Jan 8, 2020 at 11:20 AM Prathamesh Kulkarni
> >  wrote:
> > >
> > > On Tue, 5 Nov 2019 at 17:38, Richard Biener  
> > > wrote:
> > > >
> > > > On Tue, Nov 5, 2019 at 12:17 AM Kugan Vivekanandarajah
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > > Thanks for the review.
> > > > >
> > > > > On Tue, 5 Nov 2019 at 03:57, H.J. Lu  wrote:
> > > > > >
> > > > > > On Sun, Nov 3, 2019 at 6:45 PM Kugan Vivekanandarajah
> > > > > >  wrote:
> > > > > > >
> > > > > > > Thanks for the reviews.
> > > > > > >
> > > > > > >
> > > > > > > On Sat, 2 Nov 2019 at 02:49, H.J. Lu  wrote:
> > > > > > > >
> > > > > > > > On Thu, Oct 31, 2019 at 6:33 PM Kugan Vivekanandarajah
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Wed, 30 Oct 2019 at 03:11, H.J. Lu  
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > On Sun, Oct 27, 2019 at 6:33 PM Kugan Vivekanandarajah
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Richard,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the review.
> > > > > > > > > > >
> > > > > > > > > > > On Wed, 23 Oct 2019 at 23:07, Richard Biener 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Oct 21, 2019 at 10:04 AM Kugan Vivekanandarajah
> > > > > > > > > > > >  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the pointers.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, 11 Oct 2019 at 22:33, Richard Biener 
> > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Oct 11, 2019 at 6:15 AM Kugan 
> > > > > > > > > > > > > > Vivekanandarajah
> > > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > > > > Thanks for the review.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, 2 Oct 2019 at 20:41, Richard Biener 
> > > > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Oct 2, 2019 at 10:39 AM Kugan 
> > > > > > > > > > > > > > > > Vivekanandarajah
> > > > > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > As mentioned in the PR, attached patch adds 
> > > > > > > > > > > > > > > > > COLLECT_AS_OPTIONS for
> > > > > > > > > > > > > > > > > passing assembler options specified with -Wa, 
> > > > > > > > > > > > > > > > > to the link-time driver.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The proposed solution only works for uniform 
> > > > > > > > > > > > > > > > > -Wa options across all
> > > > > > > > > > > > > > > > > TUs. As mentioned by Richard Biener, 
> > > > > > > > > > > > > > > > > supporting non-uniform -Wa flags
> > > > > > > > > > > > > > > > > would require either adjusting partitioning 
> > > > > > > > > > > > > > > > > according to flags or
> > > > > > > > > > > > > > > > > emitting multiple object files  from a single 
> > > > > > > > > > > > > > > > > LTRANS CU. We could
> > > > > > > > > > > > > > > > > consider this as a follow up.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Bootstrapped and regression tests on  
> > > > > > > > > > > > > > > > > arm-linux-gcc. Is this OK for trunk?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > While it works for your simple cases it is 
> > > > > > > > > > > > > > > > unlikely to work in practice since
> > > > > > > > > > > > > > > > your implementation needs the assembler options 
> > > > > > > > > > > > > > > > be present at the link
> > > > > > > > > > > > > > > > command line.  I agree that this might be the 
> > > > > > > > > > > > > > > > way for people to go when
> > > > > > > > > > > > > > > > they face the issue but then it needs to be 
> > > > > > > > > > > > > > > > documented somewhere
> > > > > > > > > > > > > > > > in the manual.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > That is, with COLLECT_AS_OPTION (why singular?  
> > > > > > > > > > > > > > > > I'd expected
> > > > > > > > > > > > > > > > COLLECT_AS_OPTIONS) available to cc1 we could 
> > > > > > > > > > > > > > > > stream this string
> > > > > > > > > > > > > > > > to lto_options and re-materialize it at link 
> > > > > > > > > > > > > > > > time (and diagnose mismatches
> > > > > > > > > > > > > > > > even if we like).
> > > > > > > > > > > > > > > OK. I will try to implement this. So the idea is 
> > > > > > > > > > > > > > > if we provide
> > > > > > > > > > > > > > > -Wa,options as part of the lto compile, this 
> > > > > > > > > > > > > > > should be available
> > > > > > > > > > > > > > > during link time. Like in:

Re: [RFC] [c-family] PR92867 - Add returns_arg attribute

2020-01-28 Thread Prathamesh Kulkarni
On Tue, 28 Jan 2020 at 17:00, Jakub Jelinek  wrote:
>
> On Tue, Jan 28, 2020 at 04:56:59PM +0530, Prathamesh Kulkarni wrote:
> > Thanks for the suggestions. In the attached patch I bumped up value of
> > ERF_RETURNS_ARG_MASK
> > to UINT_MAX >> 2, and use highest two bits for ERF_NOALIAS and 
> > ERF_RETURNS_ARG.
> > And use fn spec "Z" to store the argument number in fn-spec format.
> > Does that look OK ?
>
> No.
>
> +#define ERF_RETURN_ARG_MASK(UINT_MAX >> 2)
>
>  /* Nonzero if the return value is equal to the argument number
> flags & ERF_RETURN_ARG_MASK.  */
> -#define ERF_RETURNS_ARG(1 << 2)
> +#define ERF_RETURNS_ARG(1 << (BITS_PER_WORD - 2))
>
> How is size of host int related to BITS_PER_WORD?  Not to mention that
> if BITS_PER_WORD is 64 and host int is 32-bit, 1 << (64 - 2) is UB.
Oops sorry. I should have used HOST_BITS_PER_INT.
I assume that'd be correct ?

Thanks,
Prathamesh
>
> Jakub
>


Re: [RFC] [c-family] PR92867 - Add returns_arg attribute

2020-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2020 at 04:56:59PM +0530, Prathamesh Kulkarni wrote:
> Thanks for the suggestions. In the attached patch I bumped up value of
> ERF_RETURNS_ARG_MASK
> to UINT_MAX >> 2, and use highest two bits for ERF_NOALIAS and 
> ERF_RETURNS_ARG.
> And use fn spec "Z" to store the argument number in fn-spec format.
> Does that look OK ?

No.

+#define ERF_RETURN_ARG_MASK(UINT_MAX >> 2) 
   

   
 /* Nonzero if the return value is equal to the argument number 
   
flags & ERF_RETURN_ARG_MASK.  */
   
-#define ERF_RETURNS_ARG(1 << 2)
   
+#define ERF_RETURNS_ARG(1 << (BITS_PER_WORD - 2))  
   

How is size of host int related to BITS_PER_WORD?  Not to mention that
if BITS_PER_WORD is 64 and host int is 32-bit, 1 << (64 - 2) is UB.

Jakub



Re: [RFC] [c-family] PR92867 - Add returns_arg attribute

2020-01-28 Thread Prathamesh Kulkarni
On Mon, 27 Jan 2020 at 17:36, Richard Biener  wrote:
>
> On Fri, Jan 24, 2020 at 11:53 PM Joseph Myers  wrote:
> >
> > On Fri, 24 Jan 2020, Prathamesh Kulkarni wrote:
> >
> > > The middle-end representation issue of ERF_RETURNS_ARG still remains,
> > > which restricts the attribute till first four args. The patch simply
> > > emits sorry(), for arguments beyond first four..
> >
> > I think this should be fixed (e.g. make the middle-end check for the
> > attribute, or something like that).
>
> Since it's pure optimization you can also simply chose to ignore this without
> notice.
>
> Note ERF_RETURN_ARG_MASK is limited to the number of available
> bits in an int and practically the only current setter was via "fn spec"
> which uses a single digit [1-9] to denote the argument (so limiting to
> three is indeed an odd choice but matches builtins using this at the moment).
>
> Feel free to up ERF_RETURN_ARG_MASK (but then you need to adjust
> the ERF_ flag defines).
Hi,
Thanks for the suggestions. In the attached patch I bumped up value of
ERF_RETURNS_ARG_MASK
to UINT_MAX >> 2, and use highest two bits for ERF_NOALIAS and ERF_RETURNS_ARG.
And use fn spec "Z" to store the argument number in fn-spec format.
Does that look OK ?

In gimple_call_return_flags, I didn't remove the existing fn spec
"0-3" in this patch, since RET1 (and possibly others?) depend on it.
I will remove that and adjust other cases to use fn-spec "Z"
if that's OK, in follow-up patch.

Thanks,
Prathamesh
>
> >  The language semantics of the
> > attribute should not be driven by such internal implementation details;
> > rather, implementation details should be determined by the language
> > semantics to be implemented.
> >
> > The sorry () has coding style issues.  Diagnostics should not end with '.'
> > or '\n', should use full words (attribute not attr, arguments not args)
> > and programming language text in them should be surrounded by %<%> (so
> > %).
> >
> > --
> > Joseph S. Myers
> > jos...@codesourcery.com
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index dc9579c5c60..2ed41ed136d 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -150,6 +150,7 @@ static tree handle_designated_init_attribute (tree *, tree, tree, int, bool *);
 static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
 		   int, bool *);
 static tree handle_copy_attribute (tree *, tree, tree, int, bool *);
+static tree handle_returns_arg_attribute (tree *, tree, tree, int, bool *);
 
 /* Helper to define attribute exclusions.  */
 #define ATTR_EXCL(name, function, type, variable)	\
@@ -484,6 +485,8 @@ const struct attribute_spec c_common_attribute_table[] =
 			  handle_noinit_attribute, attr_noinit_exclusions },
   { "access",		  1, 3, false, true, true, false,
 			  handle_access_attribute, NULL },
+  { "returns_arg",	  1, 1, true, false, false, false,
+			  handle_returns_arg_attribute, NULL },
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
 };
 
@@ -4603,6 +4606,53 @@ handle_patchable_function_entry_attribute (tree *, tree name, tree args,
   return NULL_TREE;
 }
 
+/* Handle a "returns_arg" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_returns_arg_attribute (tree *node, tree name, tree args,
+			  int flags, bool *no_add_attrs)
+{
+  tree decl = *node;
+  tree rettype = TREE_TYPE (decl);
+
+  if (TREE_CODE (rettype) == METHOD_TYPE
+  || TREE_CODE (rettype) == FUNCTION_TYPE)
+rettype = TREE_TYPE (rettype);
+
+  if (VOID_TYPE_P (rettype))
+{
+  warning_at (DECL_SOURCE_LOCATION (decl), OPT_Wattributes,
+		  "%qE attribute ignored on a function returning %qT.",
+		  name, rettype);
+  *no_add_attrs = true;
+  return NULL_TREE;
+}
+
+  if (!positional_argument (TREE_TYPE (decl), name, TREE_VALUE (args),
+			TREE_CODE (rettype)))
+{
+  *no_add_attrs = true;
+  return NULL_TREE;
+}
+
+  *no_add_attrs = false;
+  gcc_assert (args);
+  tree val = TREE_VALUE (args);
+  unsigned HOST_WIDE_INT argnum = tree_to_uhwi (val);
+  char *s = (char *) xmalloc (sizeof (char) * BITS_PER_WORD);
+  s[0] = 'Z';
+  sprintf (s + 1, "%lu", argnum);
+
+  tree attr = tree_cons (get_identifier ("fn spec"),
+			 build_tree_list (NULL_TREE,
+	  build_string (strlen (s), s)),
+			 NULL_TREE);
+  decl_attributes (node, attr, flags);
+  free (s);
+  return NULL_TREE;
+}
+
 /* Attempt to partially validate a single attribute ATTR as if
it were to be applied to an entity OPER.  */
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ec99c38a607..3531e0c8292 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3566,6 +3566,19 @@ diagnosed.  Because a pure function cannot have any observable side
 effects it does not make sense for such a function to return @code{void}.
 Declaring such a function is diagnosed.
 
+@item returns_arg (@var{position})
+@cindex 

Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-28 Thread Richard Sandiford
Jeff Law  writes:
> On Mon, 2020-01-27 at 16:41 +, Richard Sandiford wrote:
>> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
>> 
>> Failed to match this instruction:
>> (set (reg:SI 95)
>> (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
>> (const_int 3 [0x3])
>> (const_int 0 [0])) 0)
>> (const_int 19 [0x13])))
>> 
>> If we perform the natural simplification to:
>> 
>> (set (reg:SI 95)
>> (ashift:SI (sign_extract:SI (reg:SI 97)
>> (const_int 3 [0x3])
>> (const_int 0 [0])) 0)
>> (const_int 19 [0x13])))
>> 
>> then the pattern matches.  And it turns out that we do have a
>> simplification like that already, but it would only kick in for
>> extractions from a reg, not a subreg.  E.g.:
> Yea.  I ran into similar problems with the extract/extend bits in
> combine.  And we know it's a fairly general problem that we don't
> handle SUBREGs anywhere near as consistently as REGs.
>
>
>> 
>> (set (reg:SI 95)
>> (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
>> (const_int 3 [0x3])
>> (const_int 0 [0])) 0)
>> (const_int 19 [0x13])))
>> 
>> would simplify to:
>> 
>> (set (reg:SI 95)
>> (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
>> (const_int 3 [0x3])
>> (const_int 0 [0])) 0)
>> (const_int 19 [0x13])))
>> 
>> IMO the subreg case is even more obviously a simplification
>> than the bare reg case, since the net effect is to remove
>> either one or two subregs, rather than simply change the
>> position of a subreg/truncation.
>> 
>> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
>> for -m32 on x86_64-linux-gnu, because we could then simplify
>> a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
>> pattern did already seem to want to handle QImode extractions:
>> 
>>   "ix86_match_ccmode (insn, CCNOmode)
>>&& ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
>>|| GET_MODE (operands[2]) == SImode
>>|| GET_MODE (operands[2]) == HImode
>>|| GET_MODE (operands[2]) == QImode)
>> 
>> but I'm not sure how often the QI case would trigger in practice,
>> since the zero_extract mode was restricted to HI and above.  I checked
>> the other x86 patterns and couldn't see any other instances of this.
>> 
>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
>> OK to install?
>> 
>> Richard
>> 
>> 
>> 2020-01-27  Richard Sandiford  
>> 
>> gcc/
>>  PR rtl-optimization/87763
>>  * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>>  simplification to handle subregs as well as bare regs.
>>  * config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
> Do you need to check for and reject paradoxicals here?  If not, OK as-
> is.  If you need to check, then that's pre-approved as well.

Thanks, pushed.

On the paradoxical thing: the outer subreg is always non-paradoxical
for simplify_truncation, so we don't need to check that specifically.
For the inner subreg we need to handle the paradoxical case for the PR.

I wondered at one point about punting for non-paradoxical inner subregs,
but there didn't seem to be a good reason to keep an expression like
(truncate (op (truncate...))) over (op (truncate ...)).  If it turns
out there is a good reason though, changing SUBREG_P to
paradoxical_subreg_p would be fine in terms of this PR.

Richard


[Patch, committed][Fortran] avoid ICE in gfc_omp_check_optional_argument (PR93464)

2020-01-28 Thread Tobias Burnus

(This was a GCC 10 regression, affecting both OpenMP and OpenACC.)

In gfc_omp_check_optional_argument:

DECL_LANG_SPECIFIC is always available (check at the top). However, if 
decl is not a PARM_DECL, it can have two reasons: Either it is no 
parameter (C sense; Fortran: dummy argument) at all or – as gfortran 
does for assumed-shape arrays – it is a locally defined decl which 
belongs to the array parameter. In the latter case, 
GFC_DECL_SAVED_DESCRIPTOR contains the actual PARAM_DECL.


Well, as this test case shows, the GFC_DESCRIPTOR_TYPE_P condition is 
fulfilled, but as it is a locally defined variable, 
GFC_DECL_SAVED_DESCRIPTOR is NULL.


The reason that it didn't show up before is that most of the local 
arrays do not have DECL_LANG_SPECIFIC – this one does because it is a 
character string (used for the string length).


Applied as obvious after building and regtesting.

Cheers,

Tobias

commit 627d59b6b3062de921fbdd80b2b48de18f599d03
Author: Tobias Burnus 
Date:   Tue Jan 28 11:54:57 2020 +0100

[Fortran] avoid ICE in gfc_omp_check_optional_argument (PR93464)

PR fortran/93464
* openmp.c (gfc_omp_check_optional_argument): Avoid ICE when
DECL_LANG_SPECIFIC and GFC_DESCRIPTOR_TYPE_P but not
GFC_DECL_SAVED_DESCRIPTOR as for local allocatable character vars.

PR fortran/93464
* gfortran.dg/goacc/pr93464.f90: New.

diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index 4040ff284b3..eb8842b0ab8 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,9 +1,16 @@
+2020-01-28  Tobias Burnus  
+
+	PR fortran/93464
+	* openmp.c (gfc_omp_check_optional_argument): Avoid ICE when
+	DECL_LANG_SPECIFIC and GFC_DESCRIPTOR_TYPE_P but not
+	GFC_DECL_SAVED_DESCRIPTOR as for local allocatable character vars.
+
 2020-01-28  Tobias Burnus  
 
 	* gfortran.texi (Runtime): Remove tailing '.' in @menu.
 
 2020-01-27  Tobias Burnus  
 
 	PR fortran/85781
 	* trans-expr.c (gfc_conv_substring): Handle non-ARRAY_TYPE strings
 	of Bind(C) procedures.
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index fd60bbbed5d..9550499 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -95,19 +95,20 @@ gfc_omp_check_optional_argument (tree decl, bool for_present_check)
   /* For assumed-shape arrays, a local decl with arg->data is used.  */
   if (TREE_CODE (decl) != PARM_DECL
   && (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))
 	  || GFC_ARRAY_TYPE_P (TREE_TYPE (decl
 {
   is_array_type = true;
   decl = GFC_DECL_SAVED_DESCRIPTOR (decl);
 }
 
-  if (TREE_CODE (decl) != PARM_DECL
+  if (decl == NULL_TREE
+  || TREE_CODE (decl) != PARM_DECL
   || !DECL_LANG_SPECIFIC (decl)
   || !GFC_DECL_OPTIONAL_ARGUMENT (decl))
 return NULL_TREE;
 
/* Scalars with VALUE attribute which are passed by value use a hidden
   argument to denote the present status.  They are passed as nonpointer type
   with one exception: 'type(c_ptr), value' as 'void*'.  */
/* Cf. trans-expr.c's gfc_conv_expr_present.  */
if (TREE_CODE (TREE_TYPE (decl)) != POINTER_TYPE
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index eac18206b12..d9441cb0a2e 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,9 +1,14 @@
+2020-01-28  Tobias Burnus  
+
+	PR fortran/93464
+	* gfortran.dg/goacc/pr93464.f90: New.
+
 2020-01-28  Richard Sandiford  
 
 	PR tree-optimization/93434
 	* gcc.c-torture/execute/pr93434.c: New test.
 
 2020-01-28  Richard Sandiford  
 
 	PR testsuite/93460
 	* gcc.dg/torture/pr93170.c: Add -Wpsabi.
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr93464.f90 b/gcc/testsuite/gfortran.dg/goacc/pr93464.f90
new file mode 100644
index 000..922106540f9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/pr93464.f90
@@ -0,0 +1,16 @@
+! { dg-do compile }
+!
+! PR fortran/93464
+!
+! Contributed by G. Steinmetz
+!
+program p
+   character :: c(2) = 'a'
+   character, allocatable :: z(:)
+   !$acc parallel
+   !$omp target
+   z = c
+   !$acc end parallel
+   !$omp end target
+   print *, z
+end


[PATCH] testsuite: XFAIL gcc.dg/torture/pr93133.c for powerpc*-*-* [PR93393]

2020-01-28 Thread Richard Sandiford
Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2020-01-28  Richard Sandiford  

gcc/testsuite/
PR testsuite/93393
* gcc.dg/torture/pr93133.c: XFAIL for powerpc*-*-*.
---
 gcc/testsuite/gcc.dg/torture/pr93133.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr93133.c 
b/gcc/testsuite/gcc.dg/torture/pr93133.c
index 21eae1eb3dd..19af6b8ee6d 100644
--- a/gcc/testsuite/gcc.dg/torture/pr93133.c
+++ b/gcc/testsuite/gcc.dg/torture/pr93133.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { xfail powerpc*-*-* } } */
 /* { dg-add-options ieee } */
 /* { dg-require-effective-target fenv_exceptions } */
 /* { dg-skip-if "fenv" { powerpc-ibm-aix* } } */


[COMMITTED] testsuite: Add -Wpsabi to gcc.dg/torture/pr93170.c [PR93460]

2020-01-28 Thread Richard Sandiford
Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Pushed as obvious.

Richard


2020-01-28  Richard Sandiford  

gcc/testsuite/
PR testsuite/93460
* gcc.dg/torture/pr93170.c: Add -Wpsabi.
---
 gcc/testsuite/gcc.dg/torture/pr93170.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.dg/torture/pr93170.c 
b/gcc/testsuite/gcc.dg/torture/pr93170.c
index 25a93a3743e..80910bfb3d5 100644
--- a/gcc/testsuite/gcc.dg/torture/pr93170.c
+++ b/gcc/testsuite/gcc.dg/torture/pr93170.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target int128 } */
+/* { dg-options "-Wno-psabi" } */
 /* { dg-additional-options "-frename-registers -fno-tree-forwprop 
-fno-tree-fre -fira-algorithm=priority -mstringop-strategy=loop 
--param=hot-bb-frequency-fraction=0 -Wno-psabi" { target { x86_64-*-* i?86-*-* 
} } } */
 
 typedef unsigned char v64u8 __attribute__ ((vector_size (64)));


Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)

2020-01-28 Thread Kyrill Tkachov

Hi Stam,

On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:


On 12/10/19 5:03 PM, Kyrill Tkachov wrote:

Hi Stam,

On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:

Pinging with more correct maintainers this time :)

Also would need to backport to gcc7,8,9, but need to get this approved
first!


Sorry for the delay.

Same here now! Sorry totally forget about this in the lead up to Xmas!

Done the changes marked below and also removed the unnecessary extra #defines
from the test.



This is ok with a nit on the testcase...


diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c 
b/gcc/testsuite/gcc.target/arm/pr91816.c
new file mode 100644
index 
..757c897e9c0db32709227b3fdf1b4a8033428232
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr91816.c
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
+int printf(const char *, ...);
+

I think this needs a couple of effective target checks like arm_hard_vfp_ok and 
arm_thumb2_ok. See other tests in gcc.target/arm that add -mthumb to the 
options.

Thanks,
Kyrill



Re: [PATCH] gcc: Add new configure options to allow static libraries to be selected

2020-01-28 Thread Iain Sandoe

Hi Andrew,

Andrew Burgess  wrote:


* Jeff Law  [2020-01-22 13:52:27 -0700]:


On Wed, 2020-01-22 at 15:39 +, Andrew Burgess wrote:

The motivation behind this change is to make it easier for a user to
link against static libraries on a target where dynamic libraries are
the default library type (for example GNU/Linux).

Further, my motivation is really for linking libraries into GDB,
however, the binutils-gdb/config/ directory is a copy of gcc/config/
so changes for GDB need to be approved by the GCC project first.

After making this change in the gcc/config/ directory I've run
autoreconf on all of the configure scripts in the GCC tree and a
couple have been updated, so I'll use one of these to describe what my
change does.

Consider libcpp, this library links against libiconv.  Currently if
the user builds on a system with both static and dynamic libiconv
installed then autotools will pick up the dynamic libiconv by
default.  This is almost certainly the right thing to do.

However, if the user wants to link against static libiconv then things
are a little harder, they could remove the dynamic libiconv from their
system, but this is probably a bad idea (other things might depend on
that library), or the user can build their own version of libiconv,
install it into a unique prefix, and then configure gcc using the
--with-libiconv-prefix=DIR flag.  This works fine, but is somewhat
annoying, the static library available, I just can't get autotools to
use it.

My change then adds a new flag --with-libiconv-type=TYPE, where type
is either auto, static, or shared.  The default auto, ensures we keep
the existing behaviour unchanged.

If the user configures with --with-libiconv-type=static then the
configure script will ignore any dynamic libiconv it finds, and will
only look for a static libiconv, if no static libiconv is found then
the configure will continue as though there is no libiconv at all
available.

Similarly a user can specify --with-libiconv-type=shared and force the
use of shared libiconv, any static libiconv will be ignored.

As I've implemented this change within the AC_LIB_LINKFLAGS_BODY macro
then only libraries configured using the AC_LIB_LINKFLAGS or
AC_LIB_HAVE_LINKFLAGS macros will gain the new configure flag.

If this is accepted into GCC then there will be follow on patches for
binutils and GDB to regenerate some configure scripts in those
projects.

For GCC only two configure scripts needed updated after this commit,
libcpp and libstdc++-v3, both of which link against libiconv.


This kinda surprises me, gcc/ also configures for iconv


config/ChangeLog:

* lib-link.m4 (AC_LIB_LINKFLAGS_BODY): Add new
--with-libXXX-type=... option.  Use this to guide the selection of
either a shared library or a static library.

libcpp/ChangeLog:

* configure: Regnerate.

libstdc++-v3/ChangeLog:

* configure: Regnerate.

s/Regnerate/Regenerate/

This isn't strictly a regression bugfix.  But given the nature of these
files I think we probably need to be a bit more lax and allow safe
changes so that downstream uses can move forward independent of the gcc
development and release schedule.

So, OK.


Thanks for the flexibility.  Now pushed.


this (r10-6269,  
https://gcc.gnu.org/g:e7c26e04b2dd6266d62d5a5825ff7eb44d1cf14e) causes or  
exposes a problem which breaks bootstrap on all Darwin platforms I tried.


Bootstrap fails stage1 self-check with:
cc1: internal compiler error: in on_diagnostic, at input.c:2182

* AFAICT, this is caused by self-test attempting to do something that  
libcpp was not configured to support.


* All viable Darwin platforms have libiconv installed (but Darwin’s /lib is  
/usr/lib; this might well apply to other BSD derivatives too).


 * Before the patch, libcpp and gcc configury finds this and they agree on the 
availability of ICONV (#define HAVE_ICONV 1).

 * After the patch libcpp no longer thinks iconv is available, but gcc 
continues to find it - and that, I think, leads to it attempting tests for 
which libcpp has not been configured.

 * I can work around this by adding —with-iconv-prefix=/usr to the configure 
line which forces both libcpp and gcc to find the library explicitly - but 
that’s only a short-term solution.

Can you clarify why there’s no need to match the configury changes in  
libcpp / gcc / libstdc++ ?


thanks
Iain





Re: [committed] Add diagnostic_metadata and CWE support

2020-01-28 Thread Jakub Jelinek
On Wed, Dec 18, 2019 at 07:08:25PM -0500, David Malcolm wrote:
> This patch adds support for associating a diagnostic message with an
> optional diagnostic_metadata object, so that plugins can add extra data
> to their diagnostics (e.g. mapping a diagnostic to a taxonomy or coding
> standard such as from CERT or MISRA).
> 
> Currently this only supports associating a CWE identifier with a
> diagnostic (which is what I'm using for the warnings in the analyzer
> patch kit), but adding a diagnostic_metadata class allows for future
> growth in this area without an explosion of further "warning_at"
> overloads for all of the different kinds of custom data that a plugin
> might want to add.
> 
> This version of the patch renames the overly-general
> -fdiagnostics-show-metadata to -fdiagnostics-show-cwe and adds test
> coverage for it via a plugin.
> 
> It also adds a note to the documentation that no GCC diagnostics
> currently use this; it's a feature for plugins (and, at some point,
> I hope, the analyzer).
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> 
> Committed to trunk as r279556.

Unfortunately, this patch broke the i18n.
$ make gcc.pot
/bin/sh ../../gcc/../mkinstalldirs po
make srcextra
make[1]: Entering directory '/usr/src/gcc/obj/gcc'
cp -p gengtype-lex.c ../../gcc
cp -p gengtype-lex.c ../../gcc
make[1]: Leaving directory '/usr/src/gcc/obj/gcc'
AWK=gawk /bin/sh ../../gcc/po/exgettext \
/usr/bin/xgettext gcc ../../gcc
scanning for keywords, %e and %n strings...
emit_diagnostic_valist used incompatibly as both 
--keyword=emit_diagnostic_valist:4
--flag=emit_diagnostic_valist:4:gcc-internal-format and 
--keyword=emit_diagnostic_valist:5
--flag=emit_diagnostic_valist:5:gcc-internal-format
make: *** [Makefile:4338: po/gcc.pot] Error 1

While C++ can have overloads, xgettext can't deal with overloads that have
different argument positions.
emit_diagnostic_valist with the new args (i.e. the :5 one) isn't used right
now, so one way around at least for now is to remove it again, another is to
rename it.

Jakub



Re: [PATCH] analyzer: fix build with gcc 4.4 (PR 93276)

2020-01-28 Thread Jakub Jelinek
On Fri, Jan 24, 2020 at 07:53:28PM -0500, David Malcolm wrote:
> This patch fixes various build failures seen with gcc 4.4
> 
> gcc prior to 4.6 complains about:
> 
>   error: #pragma GCC diagnostic not allowed inside functions
> 
> for various uses of PUSH_IGNORE_WFORMAT and POP_IGNORE_WFORMAT.
> This patch makes them a no-op with such compilers.

I think this is wrong.
All that is really needed is make sure you #include "diagnostic-core.h"
before including pretty-print.h.  By including
diagnostic-core.h first, you do:
#ifndef GCC_DIAG_STYLE
#define GCC_DIAG_STYLE __gcc_tdiag__
#endif
and then pretty-print.h will do:
#ifdef GCC_DIAG_STYLE
#define GCC_PPDIAG_STYLE GCC_DIAG_STYLE
#else
#define GCC_PPDIAG_STYLE __gcc_diag__
#endif
If instead pretty-print.h is included first, then it will use __gcc_diag__
instead of __gcc_tdiag__ and thus will assume %E/%D etc. can't be handled.

I've so far just tested that in stage3 with this patch analyzer builds
without any -Wformat/-Wformat-extra-args warnings.

Ok for trunk if it passes bootstrap/regtest?

2020-01-28  Jakub Jelinek  

* analyzer.h (PUSH_IGNORE_WFORMAT, POP_IGNORE_WFORMAT): Remove.
* constraint-manager.cc: Include diagnostic-core.h before graphviz.h.
(range::dump, equiv_class::print): Don't use PUSH_IGNORE_WFORMAT or
POP_IGNORE_WFORMAT.
* state-purge.cc: Include diagnostic-core.h before
gimple-pretty-print.h.
(state_purge_annotator::add_node_annotations, print_vec_of_names):
Don't use PUSH_IGNORE_WFORMAT or POP_IGNORE_WFORMAT.
* region-model.cc: Move diagnostic-core.h include before graphviz.h.
(path_var::dump, svalue::print, constant_svalue::print_details,
region::dump_to_pp, region::dump_child_label, region::print_fields,
map_region::print_fields, map_region::dump_dot_to_pp,
map_region::dump_child_label, array_region::print_fields,
array_region::dump_dot_to_pp): Don't use PUSH_IGNORE_WFORMAT or
POP_IGNORE_WFORMAT.

--- gcc/analyzer/analyzer.h.jj  2020-01-27 22:40:57.012420793 +0100
+++ gcc/analyzer/analyzer.h 2020-01-28 10:57:28.078715244 +0100
@@ -100,22 +100,6 @@ public:
   ~auto_cfun () { pop_cfun (); }
 };
 
-/* Macros for temporarily suppressing -Wformat and -Wformat-extra-args,
-   for those versions of GCC that support pragmas within a function
-   (4.6 onwards).  */
-
-#if GCC_VERSION >= 4006
-# define PUSH_IGNORE_WFORMAT \
-  _Pragma("GCC diagnostic push") \
-  _Pragma("GCC diagnostic ignored \"-Wformat\"") \
-  _Pragma("GCC diagnostic ignored \"-Wformat-extra-args\"")
-# define POP_IGNORE_WFORMAT \
-  _Pragma("GCC diagnostic pop")
-#else
-# define PUSH_IGNORE_WFORMAT
-# define POP_IGNORE_WFORMAT
-#endif
-
 /* A template for creating hash traits for a POD type.  */
 
 template 
--- gcc/analyzer/constraint-manager.cc.jj   2020-01-22 22:37:04.478533500 
+0100
+++ gcc/analyzer/constraint-manager.cc  2020-01-28 10:54:41.403226986 +0100
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.
 #include "gimple-iterator.h"
 #include "fold-const.h"
 #include "selftest.h"
+#include "diagnostic-core.h"
 #include "graphviz.h"
 #include "function.h"
 #include "analyzer/analyzer.h"
@@ -120,13 +121,11 @@ bound::get_relation_as_str () const
 void
 range::dump (pretty_printer *pp) const
 {
-PUSH_IGNORE_WFORMAT
   pp_printf (pp, "%qE %s x %s %qE",
 m_lower_bound.m_constant,
 m_lower_bound.get_relation_as_str (),
 m_upper_bound.get_relation_as_str (),
 m_upper_bound.m_constant);
-POP_IGNORE_WFORMAT
 }
 
 /* Determine if there is only one possible value for this range.
@@ -200,9 +199,7 @@ equiv_class::print (pretty_printer *pp)
 {
   if (i > 0)
pp_string (pp, " == ");
-PUSH_IGNORE_WFORMAT
   pp_printf (pp, "%qE", m_constant);
-POP_IGNORE_WFORMAT
 }
   pp_character (pp, '}');
 }
--- gcc/analyzer/state-purge.cc.jj  2020-01-14 22:57:20.802253484 +0100
+++ gcc/analyzer/state-purge.cc 2020-01-28 10:57:17.642872508 +0100
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.
 #include "tree-phinodes.h"
 #include "options.h"
 #include "ssa-iterators.h"
+#include "diagnostic-core.h"
 #include "gimple-pretty-print.h"
 #include "function.h"
 #include "analyzer/analyzer.h"
@@ -444,12 +445,10 @@ state_purge_annotator::add_node_annotati
state_purge_per_ssa_name *per_name_data = (*iter).second;
if (per_name_data->get_function () == n.m_fun)
 {
-PUSH_IGNORE_WFORMAT
   if (per_name_data->needed_at_point_p (before_supernode))
 pp_printf (pp, "%qE needed here", name);
   else
 pp_printf (pp, "%qE not needed here", name);
-POP_IGNORE_WFORMAT
 }
pp_newline (pp);
  }
@@ -476,9 +475,7 @@ print_vec_of_names (graphviz_out *gv, co
 {
   if (i > 0)
pp_string (pp, ", ");
-PUSH_IGNORE_WFORMAT
   pp_printf (pp, "%qE", name);
-POP_IGNORE_WFORMAT
 }
   pp_printf (pp, "}");
   

Re: [PATCH] predcom: Fix invalid store-store commoning [PR93434]

2020-01-28 Thread Richard Biener
On Tue, Jan 28, 2020 at 10:49 AM Richard Sandiford
 wrote:
>
> Richard Sandiford  writes:
> > predcom has the following code to stop one rogue load from
> > interfering with other store-load opportunities:
> >
> >   /* If A is read and B write or vice versa and there is unsuitable
> >dependence, instead of merging both components into a component
> >that will certainly not pass suitable_component_p, just put the
> >read into bad component, perhaps at least the write together with
> >all the other data refs in it's component will be optimizable.  */
> >
> > But when store-store commoning was added later, this had the effect
> > of ignoring loads that occur between two candidate stores.
> >
> > There is code further up to handle loads and stores with unknown
> > dependences:
> >
> >   /* Don't do store elimination if there is any unknown dependence for
> >any store data reference.  */
> >   if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
> > && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
> > || DDR_NUM_DIST_VECTS (ddr) == 0))
> >   eliminate_store_p = false;
> >
> > But the store-load code above skips loads for *known* dependences
> > if (a) the load has already been marked "bad" or (b) the data-ref
> > machinery knows the dependence distance, but determine_offsets
> > can't handle the combination.
> >
> > (a) happens to be the problem in the testcase, but a different
> > sequence could have given (b) instead.  We have writes to individual
> > fields of a structure and reads from the whole structure.  Since
> > determine_offsets requires the types to be the same, it returns false
> > for each such read/write combination.
> >
> > This patch records which components have had loads removed and
> > prevents store-store commoning for them.  It's a bit too pessimistic,
> > since there shouldn't be a problem if a "bad" load dominates all stores
> > in a component.  But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p
> > here and (b) the handling for that case would probably need to be
> > removed again if we handled more exotic cases in future.
>
> Forgot to add:
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for master
> and eventually for backports?

OK for master and backports.
Richard.

> Thanks,
> Richard
>
> >
> > 2020-01-28  Richard Sandiford  
> >
> > gcc/
> >   PR tree-optimization/93434
> >   * tree-predcom.c (split_data_refs_to_components): Record which
> >   components have had aliasing loads removed.  Prevent store-store
> >   commoning for all such components.
> >
> > gcc/testsuite/
> >   PR tree-optimization/93434
> >   * gcc.c-torture/execute/pr93434.c: New test.
> > ---
> >  gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++
> >  gcc/tree-predcom.c| 24 +++--
> >  2 files changed, 58 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c
> >
> > diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
> > index 047c7fa9583..d2dcfe7f42d 100644
> > --- a/gcc/tree-predcom.c
> > +++ b/gcc/tree-predcom.c
> > @@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop,
> >/* Don't do store elimination if loop has multiple exit edges.  */
> >bool eliminate_store_p = single_exit (loop) != NULL;
> >basic_block last_always_executed = last_always_executed_block (loop);
> > +  auto_bitmap no_store_store_comps;
> >
> >FOR_EACH_VEC_ELT (datarefs, i, dr)
> >  {
> > @@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop,
> >else if (DR_IS_READ (dra) && ib != bad)
> >   {
> > if (ia == bad)
> > - continue;
> > + {
> > +   bitmap_set_bit (no_store_store_comps, ib);
> > +   continue;
> > + }
> > else if (!determine_offset (dra, drb, _off))
> >   {
> > +   bitmap_set_bit (no_store_store_comps, ib);
> > merge_comps (comp_father, comp_size, bad, ia);
> > continue;
> >   }
> > @@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop,
> >else if (DR_IS_READ (drb) && ia != bad)
> >   {
> > if (ib == bad)
> > - continue;
> > + {
> > +   bitmap_set_bit (no_store_store_comps, ia);
> > +   continue;
> > + }
> > else if (!determine_offset (dra, drb, _off))
> >   {
> > +   bitmap_set_bit (no_store_store_comps, ia);
> > merge_comps (comp_father, comp_size, bad, ib);
> > continue;
> >   }
> > @@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop,
> >comp->refs.quick_push (dataref);
> >  }
> >
> > +  if (eliminate_store_p)
> > +{
> > +  bitmap_iterator bi;
> > +  EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi)
> > + {
> > +   ca = component_of (comp_father, ia);
> > +   if (ca != bad)
> 

Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-28 Thread Richard Biener
On Tue, Jan 28, 2020 at 8:20 AM Alexander Monakov  wrote:
>
> On Tue, 28 Jan 2020, Uecker, Martin wrote:
>
> > > (*) this also shows the level of "obfuscation" needed to fool compilers
> > > to lose provenance knowledge is hard to predict.
> >
> > Well, this is exactly the problem we want to address by defining
> > a clear way to do this. Casting to an integer would be the way
> > to state: "consider the pointer as escaped and forget the
> > provenance"  and casting an integer to a  pointer would
> > mean "this pointer may point to all objects whose pointer has
> > escaped". This would give the programmer explicit control about
> > this aspect and make most existing code using pointer-to-integer
> > casts well-defined. At the same time, this should be simple
> > to add to existing points-to analysis.
>
> Can you explain why you make it required for the compiler to treat the
> points-to set unnecessarily broader than it could prove? In the Matlab
> example, there's a simple chain of computations that the compiler is
> following to prove that the pointer resulting from the final cast is
> derived from exactly one other pointer (no other pointers have
> participated in the computations).
>
> Or, in other words:
>
> is there an example where a programmer can distinguish between the
> requirement you explain above vs. the fine-grained interpretation
> that GIMPLE aims to implement (at least as I understand it), which is:
>
>   when the program creates a pointer by means of non-pointer computations
>   (casts, representation access, etc), the resulting pointer may point to:
>
> * any object which address could have participated in the computation
>   (which is at worst the entire set of "exposed" objects up to that
>program point, but can be much narrower if the compiler can see
>the entire chain of computations)
>
> * any objects which is not "exposed" but could have known address, e.g.
>   because it is placed at a specific address during linking

Note for the current PTA implementation there's almost no cases we can
handle conservatively enough.  Consider the simple

 int a[4];
 int *p = [1];
 uintptr_t pi = (uintptr_t)p;
 pi += 4;
 int *q = (int *)pi;

our PTA knows that p points to a (but not the exact offset), same for pi
(the cast doesn't change the value).  Now you add 4 - this could lead
you outside of 'a' so the points-to set becomes 'a and anything'.

I'm also not sure what PVNI does to

 int a[4];
 int *p = [1];
 p += 10;
 uintptr_t pi = (uintptr_t)p;
 p = (int *)pi;

we assume that p points to a even after p += 10 (but it of course points
outside of the object - obvious here, but not necessarily in more
obfuscated cases).
Now, can we assume pi points to a?  The cast isn't value-changing.  Do we have
to assume (int *)pi points to anything?  So, is

 p = (int *)(uintptr_t)p;

something like "laundering" a pointer?  We don't preserve such simple round-trip
casts since they are value-preserving.  Are they provenance preserving?

Richard.

> Thanks.
> Alexander


[Patch][Fortran] gfortran.texi - minor style cleanup

2020-01-28 Thread Tobias Burnus
Most items in the @menu do not end with a period; all of them do not end 
with one when used as @section. Hence, it make sense to unify the style 
to w/o tailing '.'.


Committed.

Tobias

commit 4593f60558474983c79cbbdedc31b4c19e63b671
Author: Tobias Burnus 
Date:   Tue Jan 28 10:58:00 2020 +0100

[Fortran] gfortran.texi - minor style cleanup

* gfortran.texi (Runtime): Remove tailing '.' in @menu.

diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index bfc3b224ecb..4040ff284b3 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,3 +1,7 @@
+2020-01-28  Tobias Burnus  
+
+	* gfortran.texi (Runtime): Remove tailing '.' in @menu.
+
 2020-01-27  Tobias Burnus  
 
 	PR fortran/85781
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index a50634ab9d2..0b52c1b6ab8 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -604,15 +604,15 @@ Malformed environment variables are silently ignored.
 * GFORTRAN_STDIN_UNIT:: Unit number for standard input
 * GFORTRAN_STDOUT_UNIT:: Unit number for standard output
 * GFORTRAN_STDERR_UNIT:: Unit number for standard error
-* GFORTRAN_UNBUFFERED_ALL:: Do not buffer I/O for all units.
+* GFORTRAN_UNBUFFERED_ALL:: Do not buffer I/O for all units
 * GFORTRAN_UNBUFFERED_PRECONNECTED:: Do not buffer I/O for preconnected units.
 * GFORTRAN_SHOW_LOCUS::  Show location for runtime errors
 * GFORTRAN_OPTIONAL_PLUS:: Print leading + where permitted
 * GFORTRAN_LIST_SEPARATOR::  Separator for list output
 * GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
 * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
-* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files.
-* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files.
+* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files
+* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files
 @end menu
 
 @node TMPDIR


Re: [PATCH] predcom: Fix invalid store-store commoning [PR93434]

2020-01-28 Thread Richard Sandiford
Richard Sandiford  writes:
> predcom has the following code to stop one rogue load from
> interfering with other store-load opportunities:
>
>   /* If A is read and B write or vice versa and there is unsuitable
>dependence, instead of merging both components into a component
>that will certainly not pass suitable_component_p, just put the
>read into bad component, perhaps at least the write together with
>all the other data refs in it's component will be optimizable.  */
>
> But when store-store commoning was added later, this had the effect
> of ignoring loads that occur between two candidate stores.
>
> There is code further up to handle loads and stores with unknown
> dependences:
>
>   /* Don't do store elimination if there is any unknown dependence for
>any store data reference.  */
>   if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
> && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
> || DDR_NUM_DIST_VECTS (ddr) == 0))
>   eliminate_store_p = false;
>
> But the store-load code above skips loads for *known* dependences
> if (a) the load has already been marked "bad" or (b) the data-ref
> machinery knows the dependence distance, but determine_offsets
> can't handle the combination.
>
> (a) happens to be the problem in the testcase, but a different
> sequence could have given (b) instead.  We have writes to individual
> fields of a structure and reads from the whole structure.  Since
> determine_offsets requires the types to be the same, it returns false
> for each such read/write combination.
>
> This patch records which components have had loads removed and
> prevents store-store commoning for them.  It's a bit too pessimistic,
> since there shouldn't be a problem if a "bad" load dominates all stores
> in a component.  But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p
> here and (b) the handling for that case would probably need to be
> removed again if we handled more exotic cases in future.

Forgot to add:

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for master
and eventually for backports?

Thanks,
Richard

>
> 2020-01-28  Richard Sandiford  
>
> gcc/
>   PR tree-optimization/93434
>   * tree-predcom.c (split_data_refs_to_components): Record which
>   components have had aliasing loads removed.  Prevent store-store
>   commoning for all such components.
>
> gcc/testsuite/
>   PR tree-optimization/93434
>   * gcc.c-torture/execute/pr93434.c: New test.
> ---
>  gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++
>  gcc/tree-predcom.c| 24 +++--
>  2 files changed, 58 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c
>
> diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
> index 047c7fa9583..d2dcfe7f42d 100644
> --- a/gcc/tree-predcom.c
> +++ b/gcc/tree-predcom.c
> @@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop,
>/* Don't do store elimination if loop has multiple exit edges.  */
>bool eliminate_store_p = single_exit (loop) != NULL;
>basic_block last_always_executed = last_always_executed_block (loop);
> +  auto_bitmap no_store_store_comps;
>  
>FOR_EACH_VEC_ELT (datarefs, i, dr)
>  {
> @@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop,
>else if (DR_IS_READ (dra) && ib != bad)
>   {
> if (ia == bad)
> - continue;
> + {
> +   bitmap_set_bit (no_store_store_comps, ib);
> +   continue;
> + }
> else if (!determine_offset (dra, drb, _off))
>   {
> +   bitmap_set_bit (no_store_store_comps, ib);
> merge_comps (comp_father, comp_size, bad, ia);
> continue;
>   }
> @@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop,
>else if (DR_IS_READ (drb) && ia != bad)
>   {
> if (ib == bad)
> - continue;
> + {
> +   bitmap_set_bit (no_store_store_comps, ia);
> +   continue;
> + }
> else if (!determine_offset (dra, drb, _off))
>   {
> +   bitmap_set_bit (no_store_store_comps, ia);
> merge_comps (comp_father, comp_size, bad, ib);
> continue;
>   }
> @@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop,
>comp->refs.quick_push (dataref);
>  }
>  
> +  if (eliminate_store_p)
> +{
> +  bitmap_iterator bi;
> +  EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi)
> + {
> +   ca = component_of (comp_father, ia);
> +   if (ca != bad)
> + comps[ca]->eliminate_store_p = false;
> + }
> +}
> +
>for (i = 0; i < n; i++)
>  {
>comp = comps[i];
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr93434.c 
> b/gcc/testsuite/gcc.c-torture/execute/pr93434.c
> new file mode 100644
> index 000..e786252794b
> --- /dev/null
> +++ 

Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)

2020-01-28 Thread Richard Biener
On Mon, Jan 27, 2020 at 10:47 PM Segher Boessenkool
 wrote:
>
> Hi!
>
> On Tue, Jan 21, 2020 at 02:10:21PM +, Wilco Dijkstra wrote:
> > While code hoisting generally improves codesize, it can affect performance
> > negatively. Benchmarking shows it doesn't help SPEC and negatively affects
> > embedded benchmarks. Since the impact is relatively small with -O2 and 
> > mainly
> > affects -O3, the simplest option is to disable code hoisting for -O3 and 
> > higher.
>
> Should this be a generic thing, not target-specific?

The change doesn't make sense - I'm sure if you'd look at the actual cases
it's not code hoisting in itself but "optimizations" enabled by it that cause
the issues.  It's probably the same thing as PRE causing issues downstream
for some benchmarks but do you disable PRE then by default just because of that?

Richard.

>
> Segher


[PATCH] predcom: Fix invalid store-store commoning [PR93434]

2020-01-28 Thread Richard Sandiford
predcom has the following code to stop one rogue load from
interfering with other store-load opportunities:

  /* If A is read and B write or vice versa and there is unsuitable
 dependence, instead of merging both components into a component
 that will certainly not pass suitable_component_p, just put the
 read into bad component, perhaps at least the write together with
 all the other data refs in it's component will be optimizable.  */

But when store-store commoning was added later, this had the effect
of ignoring loads that occur between two candidate stores.

There is code further up to handle loads and stores with unknown
dependences:

  /* Don't do store elimination if there is any unknown dependence for
 any store data reference.  */
  if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
  && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
  || DDR_NUM_DIST_VECTS (ddr) == 0))
eliminate_store_p = false;

But the store-load code above skips loads for *known* dependences
if (a) the load has already been marked "bad" or (b) the data-ref
machinery knows the dependence distance, but determine_offsets
can't handle the combination.

(a) happens to be the problem in the testcase, but a different
sequence could have given (b) instead.  We have writes to individual
fields of a structure and reads from the whole structure.  Since
determine_offsets requires the types to be the same, it returns false
for each such read/write combination.

This patch records which components have had loads removed and
prevents store-store commoning for them.  It's a bit too pessimistic,
since there shouldn't be a problem if a "bad" load dominates all stores
in a component.  But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p
here and (b) the handling for that case would probably need to be
removed again if we handled more exotic cases in future.

2020-01-28  Richard Sandiford  

gcc/
PR tree-optimization/93434
* tree-predcom.c (split_data_refs_to_components): Record which
components have had aliasing loads removed.  Prevent store-store
commoning for all such components.

gcc/testsuite/
PR tree-optimization/93434
* gcc.c-torture/execute/pr93434.c: New test.
---
 gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++
 gcc/tree-predcom.c| 24 +++--
 2 files changed, 58 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c

diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index 047c7fa9583..d2dcfe7f42d 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop,
   /* Don't do store elimination if loop has multiple exit edges.  */
   bool eliminate_store_p = single_exit (loop) != NULL;
   basic_block last_always_executed = last_always_executed_block (loop);
+  auto_bitmap no_store_store_comps;
 
   FOR_EACH_VEC_ELT (datarefs, i, dr)
 {
@@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop,
   else if (DR_IS_READ (dra) && ib != bad)
{
  if (ia == bad)
-   continue;
+   {
+ bitmap_set_bit (no_store_store_comps, ib);
+ continue;
+   }
  else if (!determine_offset (dra, drb, _off))
{
+ bitmap_set_bit (no_store_store_comps, ib);
  merge_comps (comp_father, comp_size, bad, ia);
  continue;
}
@@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop,
   else if (DR_IS_READ (drb) && ia != bad)
{
  if (ib == bad)
-   continue;
+   {
+ bitmap_set_bit (no_store_store_comps, ia);
+ continue;
+   }
  else if (!determine_offset (dra, drb, _off))
{
+ bitmap_set_bit (no_store_store_comps, ia);
  merge_comps (comp_father, comp_size, bad, ib);
  continue;
}
@@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop,
   comp->refs.quick_push (dataref);
 }
 
+  if (eliminate_store_p)
+{
+  bitmap_iterator bi;
+  EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi)
+   {
+ ca = component_of (comp_father, ia);
+ if (ca != bad)
+   comps[ca]->eliminate_store_p = false;
+   }
+}
+
   for (i = 0; i < n; i++)
 {
   comp = comps[i];
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr93434.c 
b/gcc/testsuite/gcc.c-torture/execute/pr93434.c
new file mode 100644
index 000..e786252794b
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr93434.c
@@ -0,0 +1,36 @@
+typedef struct creal_T {
+  double re;
+  double im;
+} creal_T;
+
+#define N 16
+int main() {
+  int k;
+  int i;
+  int j;
+  creal_T t2[N];
+  double inval;
+
+  inval = 1.0;
+  for (j = 0; j < N; ++j) {
+t2[j].re = 0;
+t2[j].im = 

Re: [Ping][GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-28 Thread Joel Hutton
On 28/01/2020 09:07, Eric Botcazou wrote:
>> Ping! Eric, do you have any objections to reverting?
>
> See my comment posted in the audit trail of the TN on 01/20...

> Probably missing live range splitting or somesuch, as envisioned by
> Vladimir in its approval message.  Feel free to eventually revert it.

Great. Vladimir, Ok for trunk?

Changelog:

2020-01-21  Joel Hutton  

 PR target/93221
 * ira.c (ira): Revert use of simplified LRA algorithm.

gcc/testsuite/ChangeLog:

2020-01-21  Joel Hutton  

 PR target/93221
 * gcc.target/aarch64/pr93221.c: New test.
From 1a2980ef6eeb76dbf0556f806a85a4f49ad3ebdd Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Tue, 21 Jan 2020 09:37:48 +
Subject: [PATCH] [IRA] Fix bug 93221 by reverting 11b8091fb

11b8091fb introduced a simplified LRA algorithm for -O0 that turned off
hard register splitting, this causes a problem for parameters passed in
multiple registers on aarch64. This fixes bug 93221.
---
 gcc/ira.c  | 38 +-
 gcc/testsuite/gcc.target/aarch64/pr93221.c | 10 ++
 2 files changed, 25 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr93221.c

diff --git a/gcc/ira.c b/gcc/ira.c
index 46091adf8109263c72343dccfe4913857b5c74ae..c8b5f869da121506f0414901271eae9810689316 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5205,35 +5205,27 @@ ira (FILE *f)
   /* Perform target specific PIC register initialization.  */
   targetm.init_pic_reg ();
 
-  if (optimize)
-{
-  ira_conflicts_p = true;
-
-  /* Determine the number of pseudos actually requiring coloring.  */
-  unsigned int num_used_regs = 0;
-  for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
-	if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
-	  num_used_regs++;
-
-  /* If there are too many pseudos and/or basic blocks (e.g. 10K
-	 pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
-	 use simplified and faster algorithms in LRA.  */
-  lra_simple_p
-	= ira_use_lra_p
-	  && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
-}
-  else
-{
-  ira_conflicts_p = false;
-  lra_simple_p = ira_use_lra_p;
-}
+  ira_conflicts_p = optimize > 0;
+
+  /* Determine the number of pseudos actually requiring coloring.  */
+  unsigned int num_used_regs = 0;
+  for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
+if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
+  num_used_regs++;
+
+  /* If there are too many pseudos and/or basic blocks (e.g. 10K
+ pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
+ use simplified and faster algorithms in LRA.  */
+  lra_simple_p
+= ira_use_lra_p
+  && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
 
   if (lra_simple_p)
 {
   /* It permits to skip live range splitting in LRA.  */
   flag_caller_saves = false;
   /* There is no sense to do regional allocation when we use
-	 simplified LRA.  */
+	simplified LRA.  */
   flag_ira_region = IRA_REGION_ONE;
   ira_conflicts_p = false;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/pr93221.c b/gcc/testsuite/gcc.target/aarch64/pr93221.c
new file mode 100644
index ..4dc2c3d0149423dd3d666f7428277ffa9eb765c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr93221.c
@@ -0,0 +1,10 @@
+/* PR target/93221 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */
+
+struct S { __Int32x4_t b[2]; };
+
+void
+foo (struct S x)
+{
+}
-- 
2.17.1



[PATCH] Fix 2 typos in documentation of libstdc++.

2020-01-28 Thread Martin Liška

Hi.

It's a simple documentation fix.
I'm going to install the patch.

Martin

libstdc++-v3/ChangeLog:

2020-01-28  Martin Liska  

PR libstdc++/93478
* include/std/atomic: Fix typo.
* include/std/optional: Likewise.
---
 libstdc++-v3/include/std/atomic   | 2 +-
 libstdc++-v3/include/std/optional | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)


diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index 66c62381e6b..40f23bdfc96 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -174,7 +174,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /**
*  @brief Generic atomic type, primary class template.
*
-   *  @tparam _Tp  Type to be made atomic, must be trivally copyable.
+   *  @tparam _Tp  Type to be made atomic, must be trivially copyable.
*/
   template
 struct atomic
diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index 09e7a7efca7..b920a1453ba 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -453,7 +453,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 * Such a separate base class template is necessary in order to
 * conditionally make copy/move constructors trivial.
 *
-* When the contained value is trivally copy/move constructible,
+* When the contained value is trivially copy/move constructible,
 * the copy/move constructors of _Optional_base will invoke the
 * trivial copy/move constructor of _Optional_payload. Otherwise,
 * they will invoke _Optional_payload(bool, const _Optional_payload&)



Re: [Ping][GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-28 Thread Eric Botcazou
> Ping! Eric, do you have any objections to reverting?

See my comment posted in the audit trail of the TN on 01/20...

-- 
Eric Botcazou


* PING * Re: [OpenACC] bump version for 2.6 plus libgomp.texi update — document acc_attach/acc_detach, acc_*_async, acc_*_finalize(_async) functions

2020-01-28 Thread Tobias Burnus

*PING*

Those two patches bump the OpenACC version number from 2.0 (alias 
201306) to OpenACC 2.6 (alias 201711). I believe except for bugs and 
known omissions (e.g. PR93225+93226), the OpenACC 2.6 support is complete.


Additionally, it updates the documentation accordingly, no longer marks 
OpenACC as experimental and documents some missing run-time functions.


* https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00600.html – main patch
* https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01173.html – remove more 
'experimental' status.


OK? — Sandra was so kind and looke through the documentation, which look 
fine to here.


Tobias

On 1/20/20 10:39 PM, Sandra Loosemore wrote:

On 1/20/20 3:08 AM, Tobias Burnus wrote:

Hi Sandra,

On 1/20/20 5:39 AM, Sandra Loosemore wrote:
I happen to have noticed a couple weeks ago that this language about 
OpenACC support being experimental appears in multiple places in the 
gfortran manual, […]  The same disclaimer for that option in the 
main GCC manual was removed years ago, so unless the Fortran support 
is much more broken than the C/C++ support, I think it ought to be 
removed from the Fortran manual as well.  […]


I concur. That would be the attached patch (on top of the previous 
patch* in this thread).


This is good, thank you.

-Sandra


Re: [PATCH 3/3] Check array contiguity for OpenACC/Fortran

2020-01-28 Thread Tobias Burnus

(CC: fortran@ as it relates to Fortran.)

Hi all,

On 1/7/20 12:16 PM, Tobias Burnus wrote:
in terms of the check, it looks fine to me – but I am not sure about 
the spec.


* [OpenACC] Actually, I simply missed the bit (here: OpenACC 3; OpenACC 
2.6 is same): “Any array or subarray in a data clause, including Fortran 
array pointers, must be a contiguous block of memory, except for dynamic 
multidimensional C arrays.” (2.7.1 under Restrictions).


* OpenMP (quoting TR8): “If a list item is an array section, it must 
specify contiguous storage.” (2.22.7.1 map Clause under Restrictions). 
However, that one seems to miss the case: “non_cont_ptr => A(::2); !$omp 
... map(non_cont_ptr)” as non_cont_ptr is noncontiguous and an array but 
not an array section.


In any case, those are restrictions imposed on the user – which the 
compiler may or may not report. (A good one will, I presume. Although, 
one can also regard it as implementation defined/vendor extension as GCC 
will properly handle those – by mapping also the gaps.)


Cheers,

Tobias


At least the following test case seems to work fine:

integer :: A(10,10), out(12)
A = reshape([(i, i=0,100)], shape(A))
!$omp target map(A(3:6,3:5), out)
!$acc parallel copy(A(3:6,3:5), out)
  out = reshape(A(3:6,3:5), shape(out))
  A(3:6,3:5) = 4
!$acc end parallel
!$omp end target
print '(4i3)', out
print '(10i3)', A
end

The reason that it works is that the stride is included in the length 
calculation:
#pragma omp target num_teams(1) thread_limit(0) map(tofrom:MEM[(c_char 
*)_6] [len: _5])


Has for the section (with A(3:6,3:5) -> parm(1:4,1:3)):
  parm.1.dim[0].lbound = 1;
  parm.1.dim[0].ubound = 4;
  parm.1.dim[0].stride = 1;
  parm.1.dim[1].lbound = 1;
  parm.1.dim[1].ubound = 3;
  parm.1.dim[1].stride = 10;
And instead of doing '(4-1+1) * (3-1+1)' = 12 (i.e. multiplying the 
extends),

the code does: 'stride * (3-1+1)' = 30.

Cheers,

Tobias

PS: It also works if one puts the stride on the ptr, i.e.:

integer,target :: A(10,10), out(12)
integer, pointer :: ptr(:,:)
A = reshape([(i, i=0,100)], shape(A))
ptr => A(3:6,3:5)
!$omp target map(ptr, out)
!$acc parallel copy(ptr, out)
  out = reshape(ptr, shape(out))
  ptr = 4
!$acc end parallel
!$omp end target
print '(4i3)', out
print '(10i3)', A
end

On 1/4/20 3:25 AM, Julian Brown wrote:

This patch tightens up error checking for array references used in
OpenACC clauses such that they must now be contiguous. I believe this
matches up to the spec (as of 2.6). I've tried to make it so an error
only triggers if the compiler is sure that a given array expression must
be non-contiguous at compile time, although this errs on the side
of potentially allowing the user to shoot themselves in the foot at
runtime: at least one existing test in the testsuite appears to expect
that behaviour.
[…]


Re: [PATCH] forwprop: Tweak choice of VEC_PERM_EXPR filler [PR92822]

2020-01-28 Thread Richard Biener
On Mon, 27 Jan 2020, Richard Sandiford wrote:

> For the 2s failures in the PR, we have a V4SF VEC_PERM_EXPR in
> which the first two elements are duplicates of one element and
> the other two are don't-care:
> 
> v4sf_b = VEC_PERM_EXPR ;
> 
> The heuristic was to extend this with a blend:
> 
> v4sf_b = VEC_PERM_EXPR ;
> 
> but it seems better to extend a partial duplicate to a full duplicate:
> 
> v4sf_b = VEC_PERM_EXPR ;
> 
> Obviously this is still just a heuristic though.
> 
> I wondered whether to restrict this to two elements or more
> but couldn't find any examples in which it made a difference.
> Either way should be fine for the purposes of fixing this PR.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  Richard mentioned
> in the PR that he had a different fix in mind, but since I'd tested
> this overnight, I thought I might as well post it anyway as a possible
> belt-and-braces fix.  OK to install?

OK.

It's a sound heuristic - IIRC I played with some form of this
(not sure if I tried to restrict it to all-same elts) and for
some cases it was worse on x86.  On x86 there's a splat but
obviously not from a random vector lane so if one wants to use
that it's an extract (move to lane zero) and then splat.

Anyway, if we have particular cases that regress with this
we want testsuite coverage.  [not sure how difficult it is
to cover all cases up to N elements with some scripting...  looking
over the assembly diff would be not fun in any case...]

Thanks,
Richard.

> Richard
> 
> 
> 2020-01-27  Richard Sandiford  
> 
> gcc/
>   PR tree-optimization/92822
>   * tree-ssa-forwprop.c (simplify_vector_constructor): When filling
>   out the don't-care elements of a vector whose significant elements
>   are duplicates, make the don't-care elements duplicates too.
> ---
>  gcc/tree-ssa-forwprop.c | 22 --
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
> index d63e87c8a5b..5203891950a 100644
> --- a/gcc/tree-ssa-forwprop.c
> +++ b/gcc/tree-ssa-forwprop.c
> @@ -2455,16 +2455,26 @@ simplify_vector_constructor (gimple_stmt_iterator 
> *gsi)
>it and its source indexes to make the permutation supported.
>For now it mimics a blend.  */
>vec_perm_builder sel (refnelts, refnelts, 1);
> +  bool all_same_p = true;
>for (i = 0; i < elts.length (); ++i)
> - sel.quick_push (elts[i].second + elts[i].first * refnelts);
> + {
> +   sel.quick_push (elts[i].second + elts[i].first * refnelts);
> +   all_same_p &= known_eq (sel[i], sel[0]);
> + }
>/* And fill the tail with "something".  It's really don't care,
>   and ideally we'd allow VEC_PERM to have a smaller destination
> -  vector.  As heuristic try to preserve a uniform orig[0] which
> -  facilitates later pattern-matching VEC_PERM_EXPR to a
> -  BIT_INSERT_EXPR.  */
> +  vector.  As a heuristic:
> +
> +  (a) if what we have so far duplicates a single element, make the
> +  tail do the same
> +
> +  (b) otherwise preserve a uniform orig[0].  This facilitates
> +  later pattern-matching of VEC_PERM_EXPR to a BIT_INSERT_EXPR.  */
>for (; i < refnelts; ++i)
> - sel.quick_push ((elts[0].second == 0 && elts[0].first == 0
> -  ? 0 : refnelts) + i);
> + sel.quick_push (all_same_p
> + ? sel[0]
> + : (elts[0].second == 0 && elts[0].first == 0
> +? 0 : refnelts) + i);
>vec_perm_indices indices (sel, orig[1] ? 2 : 1, refnelts);
>if (!can_vec_perm_const_p (TYPE_MODE (perm_type), indices))
>   return false;


Re: [patch, fortran] PR93473 - ICE on valid with long module + submodule names

2020-01-28 Thread Tobias Burnus

On 1/28/20 12:41 AM, Andrew Benson wrote:

The problem occurs in set_syms_host_assoc() where the "parent1" and "parent2"
variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This is insufficient
when the parent names are a module+submodule name concatenated with a ".". The
patch above fixes this by increasing their length to 2*GFC_MAX_SYMBOL_LEN+2.

A patch to fix this is attached. The patch regression tests cleanly - ok to
commit?


The patch is okay, but can you add a comment – similar to the other 
patch – which makes clear why one needs twice the normal 
GFC_MAX_SYMBOL_LEN (+2)? You currently have:



const char dot[2] = ".";
-  char parent1[GFC_MAX_SYMBOL_LEN + 1];
-  char parent2[GFC_MAX_SYMBOL_LEN + 1];
+  char parent1[2 * GFC_MAX_SYMBOL_LEN + 2];
+  char parent2[2 * GFC_MAX_SYMBOL_LEN + 2];


Tobias




Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-28 Thread Tobias Burnus

Hi Andrew,

On 1/27/20 9:57 PM, Andrew Benson wrote:

The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to
GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus function name
(plus the additional "_"'s that get prepended), but insufficient if a submodule
name is included. The name then gets truncated and can lead to two different
functions having the same (truncated) symbol name.

The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which allows for
the submodule name plus the "." added between module and submodule names.

I've attached a patch for this which includes a new test case for this PR. The
patch regression tests cleanly.

OK to commit?


Can you also update the comment before the #define? It currently has:

 /* Mangled symbols take the form __module__name.  */
-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)

Otherweise, it LGTM.

Tobias

PS: I wonder whether there are relevant systems which will fail because they
do not handle that long symbol names...