Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)

2018-07-30 Thread Martin Sebor

On 07/30/2018 02:21 PM, Bernd Edlinger wrote:

On 07/30/18 21:52, Martin Sebor wrote:

On 07/30/2018 09:24 AM, Bernd Edlinger wrote:

On 07/30/18 01:05, Martin Sebor wrote:

On 07/29/2018 04:56 AM, Bernd Edlinger wrote:

Hi!

This fixes two wrong code bugs where string_constant
returns over length string constants.  Initializers
like that are rejected in C++, but valid in C.


If by valid you are referring to declarations like the one in
the added test:

 const char a[2][3] = { "1234", "xyz" };

then (as I explained), the excess elements in "1234" make
the char[3] initialization and thus the test case undefined.
I have resolved bug 86711 as invalid on those grounds.

Bug 86711 has a valid test case that needs to be fixed, along
with bug 86688 that I raised for the same underlying problem:
considering the excess nul as part of the string.  As has been
discussed in a separate bug, rather than working around
the excessively long strings in the middle-end, it would be
preferable to avoid creating them to begin with.

I'm already working on a fix for bug 86688, in part because
I introduced the code change and also because I'm making other
changes in this area -- bug 86552.  Both of these in response
to your comments.



Sorry, I must admit, I have completely lost track of how many things
you are trying to work on in parallel.

Nevertheless I started to review your pr86552 patch here:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html

But so far you did not respond to me.

Well, actually I doubt your patch applies to trunk;
maybe you could rebase it and post it again
instead?


I read your comments and have been busy working on enhancing
the patch (among other things).  There are a large number of
additional contexts where constant strings are expected and
where a missing nul needs to be detected.  Some include
additional instances of strlen calls that my initial patch
didn't handle, many more others that involve other string
functions.  I have posted an updated patch that applies
cleanly and that handles the first set.

There is also a class of problems involving constant character
arrays initialized by a braced list, as in char [] = { x, y, z };
Those are currently not recognized as strings even if they are
nul-terminated, but they are far more likely to be a source of
these problems than string literals, certainly in C++ where
string initializers must fit in the array.  I am testing a patch
to convert those into STRING_CST so they can be handled as well.

Since initializing arrays with more elements than fit is
undefined in C and since the behavior is undefined at compile
time it seems to me that rejecting such initializers with
a hard error (as opposed to a warning) would be appropriate
and obviate having to deal with them in the middle-end.



We do not want to change what is currently accepted by the
front end. period.


On whose behalf are you making such categorical statements?
It was Jakub and Richard's suggestion in bug 86714 to reject
the undefined excessive initializers and I happen to like
the idea.  I don't recall anyone making a decision about what
"we" do or don't want to change.

That said, if rejecting such initializers is not acceptable
an alternate solution that I believe Richard preferred is to
trim excess elements early on, e.g., during gimplification
(or at some other point after the front-end is done).  That's
okay with me too.



But there is no reason why ambiguous string constants
have to be passed to the middle end.

For instance char c[2] = "a\0"; should look like c[1] = "a";
while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c
will cut the excess precision off anyway.

That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0";

I propose to have all STRING_CST always be created by the
FE with explicit nul termination, but the
TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH (str) in the normal case (nul-terminated)
TYPE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH (str) if not nul-terminated, i.e. truncated in the initializer.

Do you understand what I mean?
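The proposed invariant can be illustrated outside the compiler. The following is only a sketch under stated assumptions: struct string_cst_model and model_is_nul_terminated are hypothetical names that stand in for a STRING_CST and its accessors; they are not GCC's tree API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a STRING_CST; not GCC's tree representation.  */
struct string_cst_model
{
  const char *pointer;  /* models TREE_STRING_POINTER: always ends in a nul  */
  size_t length;        /* models TREE_STRING_LENGTH: counts that nul  */
  size_t type_size;     /* models TYPE_SIZE_UNIT of the array type  */
};

/* True when the terminating nul fits in the array being initialized.  */
bool
model_is_nul_terminated (const struct string_cst_model *s)
{
  /* The proposal requires every constant to carry an explicit nul.  */
  assert (s->length > 0 && s->pointer[s->length - 1] == '\0');
  return s->type_size >= s->length;
}

/* Models char c[4] = "abc"; the nul fits in the array.  */
const struct string_cst_model fits = { "abc", 4, 4 };

/* Models char c[2] = "aaa"; stored as "aa" plus the explicit nul,
   while the array itself holds only 2 bytes.  */
const struct string_cst_model truncated = { "aa", 3, 2 };
```

Under this model a consumer never has to look past TREE_STRING_LENGTH bytes, and a single size comparison tells it whether the object in memory is a nul-terminated string.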


I don't insist on any particular internal representation as long
as it makes it possible to detect and diagnose common bugs, and
(ideally) also help mitigate their worst consequences.

Martin


Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)

2018-07-30 Thread Martin Sebor

On 07/30/2018 03:11 PM, Bernd Edlinger wrote:

Hi,


@@ -621,6 +674,12 @@ c_strlen (tree src, int only_value)
maxelts = maxelts / eltsize - 1;
  }

+  /* Unless the caller is prepared to handle it by passing in a non-null
+ ARR, fail if the terminating nul doesn't fit in the array the string
+ is stored in (as in const char a[3] = "123";  */
+  if (!arr && maxelts < strelts)
+return NULL_TREE;
+


this is c_strlen, how is the caller ever supposed to handle non-zero terminated strings???
especially if you do this above?


Callers that pass in a non-null ARR handle them by issuing
a warning.  The rest get back a null result.  It should be
evident from the rest of the patch.  It can be debated what
each caller should do when it detects such a missing nul
where one is expected.  Different approaches may be more
or less appropriate for different callers/functions (e.g.,
strcpy vs strlen).
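The calling convention described above can be sketched as a run-time analogue. This is not c_strlen: array_strlen and its missing_nul out-parameter are hypothetical names chosen for illustration, and the helper works on plain arrays rather than trees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GCC's c_strlen.  Returns the length of the
   string stored in the SIZE-byte array A, or -1 if no nul is found
   within the array.  When MISSING_NUL is non-null it is set to tell
   the caller whether the failure was due to a missing terminator, so
   that the caller can diagnose it; callers passing a null pointer
   simply get the failure result.  */
long
array_strlen (const char *a, size_t size, bool *missing_nul)
{
  assert (a != NULL);
  const char *nul = memchr (a, '\0', size);
  if (missing_nul)
    *missing_nul = nul == NULL;
  if (!nul)
    return -1;
  return (long) (nul - a);
}
```

For an array such as const char a[3] = "123"; the helper returns -1 in both modes; only a caller that supplied the flag learns why and can warn.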


+c_strlen (tree src, int only_value, tree *arr /* = NULL */)
{
  STRIP_NOPS (src);
+
+  /* Used to detect non-nul-terminated strings in subexpressions
+ of a conditional expression.  When ARR is null, point it at
+ one of the elements for simplicity.  */
+  tree arrs[] = { NULL_TREE, NULL_TREE };
+  if (!arr)
+arr = arrs;



@@ -11427,7 +11478,9 @@ string_constant (tree arg, tree *ptr_offset)
  unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
  length = string_length (TREE_STRING_POINTER (init), charsize,
  length / charsize);
-  if (compare_tree_int (array_size, length + 1) < 0)
+  if (nulterm)
+*nulterm = array_elts > length;
+  else if (array_elts <= length)
return NULL_TREE;


I don't understand why you can't use
compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)), TREE_STRING_LENGTH (init))
instead of this convoluted code above ???

Sorry, this patch does not look like it is ready any time soon.


I'm open to technical comments on the substance of my changes
but I'm not interested in your opinion of the readiness of
the patch (whatever that might mean), certainly not if you
have formed it after skimming a random handful of lines out
of a 600 line patch.


But actually I am totally puzzled by your priorities.
This is what I see right now:

1) We have missing warnings.
2) We have wrong code bugs.
3) We have apparently a specification error on the C Language standard (*)


Why are you prioritizing 1) over 2) thus blocking my attempts to fix a wrong code
issue, and why do you not tackle 3) in your WG14?


My priorities are none of your concern.

Your "attempts to fix" issues interfere with my work on a number
of projects.  You are not being helpful -- instead, by submitting
changes that you know fully well conflict with mine, you are
impeding and undermining my work.  That is why I object to them.


(*) which means that GCC is currently removing code from assertions
as I pointed out here: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01695.html

This happens because GCC follows the language standards literally right now.

I would say too literally, and it proves that the language standard's logic is
flawed IMHO.


I have no idea what your point is about standards, but bugs
like the one in the example, including those arising from
uninitialized arrays, could be detected with only minor
enhancements to the tree-ssa-strlen pass.  Implementing some
of this is among the projects I'm expected and expecting to
work on for GCC 9.  This patch is a small step in that
direction.

If you care about detecting bugs I would expect you to be
supportive rather than dismissive of this work, and helpful
in bringing it to fruition rather than putting it down or
questioning my priorities.  Especially since the work was
prompted by your own (valid) complaint that GCC doesn't
diagnose them.

Martin



Re: [PATCH] Make strlen range computations more conservative

2018-07-30 Thread Martin Sebor

On 07/27/2018 12:48 AM, Bernd Edlinger wrote:

I have one more example similar to PR86259, that resembles IMHO real world code:

Consider the following:


int fun (char *p)
{
  char buf[16];

  assert(strlen(p) < 4); //here: security relevant check

  sprintf(buf, "echo %s - %s", p, p); //here: security relevant code
  return system(buf);
}


What is wrong with the assertion?

Nothing, except it is removed, when this function is called from untrusted code:

void untrusted_fun (void)
{
   char b[2] = "ab"; /* no room for the terminating nul */
   fun(b);
}

Don't try to execute that: after "ab" there can be "; rm -rF / ;" on your stack.

Even the slightly safer check "assert(strnlen(p, 4) < 4);" would have
been removed.

Now that is a simple error and it would be easy to fix -- normally.
But when the assertion is removed, the security relevant code
is allowed to continue where it creates more damage and is
suddenly much harder to debug.


sprintf() is a known source of buffer overflows.  The recommended
practice is to use snprintf.  An alternate mechanism to constrain
the number of bytes formatted by an individual %s directive is to
use the precision, such as %.4s.
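Both suggestions can be combined in one call. The following is a minimal sketch, assuming a hypothetical helper named format_echo and a limit of 3 bytes per argument (the value the assertion in the example was meant to enforce); it is not taken from the code under discussion.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats "echo P - P" into BUF, reading at most
   3 bytes of P for each %s directive.  Returns 0 on success, or -1 if
   the output did not fit in SIZE bytes or a formatting error occurred.  */
int
format_echo (char *buf, size_t size, const char *p)
{
  assert (buf != NULL && p != NULL);
  /* snprintf never writes more than SIZE bytes, and the precision
     bounds how much of P each directive consumes.  */
  int n = snprintf (buf, size, "echo %.3s - %.3s", p, p);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

With a 16-byte buffer the longest possible result, "echo abc - abc", takes 15 bytes including the nul, so the call cannot overflow regardless of what follows the first 3 bytes of P. The array P points to must still provide at least 3 readable bytes or a nul before that.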


So, I start to believe that strlen range assumptions are unsafe, unless
we can prove that the string is in fact zero terminated.

I would like to guard the strlen range checks with a new option, maybe
-fassume-zero-terminated-char-arrays, and enable that under -Ofast only.

What do you think?


I'm not opposed to providing options to control various
features but I'm not in favor of disabling them by default as
a solution to accommodate buggy code.  For every instance of
a bug in a program with undefined behavior, whether it's reading
or writing past the end of an object or subobject, or integer
overflow, it's possible to show security-related consequences.
One could just as easily create a test case where allowing strlen
to read past the end of a member array could be exploited to cause
a subsequent buffer overflow.   Some of these consequences might
be in some cases mitigated by one strategy and others in other
cases by another.  There's no silver bullet -- the best approach
is to drive improvements to code to help weed out these bugs.

Even without _FORTIFY_SOURCE GCC diagnoses (some) writes past
the end of subobjects by string functions.  With _FORTIFY_SOURCE=2
it calls abort.  This is the default on popular distributions,
including Fedora, RHEL, and Ubuntu.   -Wstringop-truncation tries
to help detect the creation of unterminated strings by strncpy
and strncat.  There is little reason in my mind to treat strlen
or any other function as special, except perhaps for the few
existing exceptions of the raw memory functions (memcpy, et al.)

As you know, I have already posted a patch to detect a subset
of the problem of calling strlen on non-terminated arrays.
More such issues, including uses of dynamically created and
uninitialized arrays, can be detected by relatively modest
enhancements to the tree-ssa-strlen pass (also on my list
of things to do).  It may also be worth considering moving
the "initializer-string for array chars is too long" warning
from -Wc++-compat to -Wall or -Wextra.  But I would much rather
focus on these solutions and work toward overall improvements
than on weakening optimization to accommodate undefined code.
With sufficient awareness as a result of warnings such code
should all but disappear.  Following stricter rules opens up
opportunities for deeper analyses to enable more optimization
and detect even more bugs.

Martin


Re: [PATCH] Libraries' configure scripts should not read config-ml.in when multilib is disabled

2018-07-30 Thread John Ericson

On 07/30/18 19:48, Joseph Myers wrote:

On the contrary, I think an important principle here is that non-multilib
and multilib builds follow the same code paths as far as possible, with
the multilib variables just set to trivial values (modulo osdirname) in
the case of a non-multilib build - a non-multilib build should be building
libraries exactly the same, with the same logic and the same variable
settings, as the default multilib in a multilib build.

I wholeheartedly agree with that principle. [FWIW, I've taken a similar
approach with keeping the cross and native compilation code paths as
close as possible. This is why I'm building the libraries separately in
the first place.]


That said, it is my tentative understanding that the point of having 
config-ml is to cordon-off all the necessarily-multilib-specific logic 
so it doesn't pollute everything else. When that script isn't run, I 
think the Makefiles already contain default "trivial values" for 
capitalized MULTI* variables (which are the only ones actually used by 
the build itself), yielding precisely that deduplication of code paths 
we both want.


John


[PATCH] convert braced initializers to strings (PR 71625)

2018-07-30 Thread Martin Sebor

The middle-end contains code to determine the lengths of constant
character arrays initialized by string literals.  The code is used
in a number of optimizations and warnings.

However, the code is unable to deal with constant arrays initialized
using the braced initializer syntax, as in

  const char a[] = { '1', '2', '\0' };

The attached patch extends the C and C++ front-ends to convert such
initializers into a STRING_CST form.
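As an illustration of why the conversion is safe, the two initialization styles produce identical objects. This is only a sketch: the names braced, literal and same_contents are made up for the example and do not appear in the patch.

```c
#include <assert.h>
#include <string.h>

/* The same three bytes written in both initializer styles.  */
const char braced[] = { '1', '2', '\0' };
const char literal[] = "12";

/* Returns nonzero if the two arrays above hold the same bytes.  */
int
same_contents (void)
{
  assert (sizeof braced == sizeof literal);
  return memcmp (braced, literal, sizeof literal) == 0;
}
```

Since both arrays have the same size and contents, a call such as strlen (braced) could in principle be folded to 2 just like strlen (literal), which is the optimization the PR asks for.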

The goal of this work is to both enable existing optimizations for
such arrays, and to help detect bugs due to using non-nul terminated
arrays where nul-terminated strings are expected.  The latter is
an extension of the GCC 8 -Wstringop-overflow and
-Wstringop-truncation warnings that help detect or prevent reading
past the end of dynamically created character arrays.  Future work
includes detecting potential past-the-end reads from uninitialized
local character arrays.

Tested on x86_64-linux.

Martin

PR tree-optimization/71625 - missing strlen optimization on different array initialization style

gcc/c/ChangeLog:

	PR tree-optimization/71625
	* c-parser.c (c_parser_declaration_or_fndef): Call
	convert_braced_list_to_string.

gcc/c-family/ChangeLog:

	PR tree-optimization/71625
	* c-common.c (convert_braced_list_to_string): New function.
	* c-common.h (convert_braced_list_to_string): Declare it.

gcc/cp/ChangeLog:

	PR tree-optimization/71625
	* parser.c (cp_parser_init_declarator):  Call
	convert_braced_list_to_string.

gcc/testsuite/ChangeLog:

	PR tree-optimization/71625
	* g++.dg/init/string2.C: New test.
	* g++.dg/init/string3.C: New test.
	* gcc.dg/strlenopt-55.c: New test.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 422d668..9a93175 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -8345,4 +8345,72 @@ maybe_add_include_fixit (rich_location *richloc, const char *header)
   free (text);
 }
 
+/* Attempt to convert a braced array initializer list CTOR into
+   a STRING_CST for convenience and efficiency.  When non-null,
+   use EVAL to attempt to evaluate constants (used by C++).
+   MAXELTS gives the maximum number of elements to accept.
+   Return the converted string on success or null on failure.  */
+
+tree
+convert_braced_list_to_string (tree ctor, tree (*eval)(tree),
+			   unsigned HOST_WIDE_INT maxelts)
+{
+  unsigned HOST_WIDE_INT nelts = CONSTRUCTOR_NELTS (ctor);
+
+  auto_vec<char> str;
+  str.reserve (nelts + 1);
+
+  unsigned HOST_WIDE_INT i;
+  tree index, value;
+
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), i, index, value)
+{
+  unsigned HOST_WIDE_INT idx = index ? tree_to_uhwi (index) : i;
+
+  /* auto_vec is limited to UINT_MAX elements.  */
+  if (idx > UINT_MAX)
+	return NULL_TREE;
+
+  /* Attempt to evaluate constants.  */
+  if (eval)
+	value = eval (value);
+
+  /* Avoid non-constant initializers.  */
+ if (!tree_fits_uhwi_p (value))
+	return NULL_TREE;
+
+  /* Skip over embedded nuls.  */
+  unsigned val = tree_to_uhwi (value);
+  if (!val)
+	continue;
+
+  /* Bail if the CTOR has a block of more than 256 embedded nuls
+	 due to implicitly initialized elements.  */
+  unsigned nelts = (idx - str.length ()) + 1;
+  if (nelts > 256)
+	return NULL_TREE;
+
+  if (nelts > 1)
+	{
+	  str.reserve (idx);
+	  str.quick_grow_cleared (idx);
+	}
+
+  if (idx > maxelts)
+	return NULL_TREE;
+
+  str.safe_insert (idx, val);
+}
+
+  /* Append a nul for the empty initializer { } and for the last
+ explicit initializer in the loop above that is a nul.  */
+  if (!nelts || str.length () < i)
+str.safe_push (0);
+
+  /* Build a string literal but return the embedded STRING_CST.  */
+  tree res = build_string_literal (str.length (), str.begin ());
+  res = TREE_OPERAND (TREE_OPERAND (res, 0), 0);
+  return res;
+}
+
 #include "gt-c-family-c-common.h"
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index fcec95b..343a1ae 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1331,6 +1331,8 @@ extern void maybe_add_include_fixit (rich_location *, const char *);
 extern void maybe_suggest_missing_token_insertion (rich_location *richloc,
 		   enum cpp_ttype token_type,
 		   location_t prev_token_loc);
+extern tree convert_braced_list_to_string (tree, tree (*)(tree) = NULL,
+	   unsigned HOST_WIDE_INT = -1);
 
 #if CHECKING_P
 namespace selftest {
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 7a92628..e12d270 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -2126,6 +2126,23 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
 	  if (d != error_mark_node)
 		{
 		  maybe_warn_string_init (init_loc, TREE_TYPE (d), init);
+
+		  /* Convert a string CONSTRUCTOR into a STRING_CST.  */
+		  tree valtype = TREE_TYPE (init.value);
+		  if (TREE_CODE (init.value) == CONSTRUCTOR
+		  && TREE_CODE (valtype) == ARRAY_TYPE)
+		{
+		  valtype = TREE_TYPE (valtype);

Re: [PATCH] Libraries' configure scripts should not read config-ml.in when multilib is disabled

2018-07-30 Thread Joseph Myers
On Mon, 30 Jul 2018, John Ericson wrote:

> I understand that building them separately is not supported, but am
> nevertheless hoping the patch can be upstreamed on the grounds
> that this generally cleans up the build system in accordance with the
> principle that "feature foo" variables need not be written and should not be
> read when feature foo is disabled. libgcc's configure script, for example, has

On the contrary, I think an important principle here is that non-multilib 
and multilib builds follow the same code paths as far as possible, with 
the multilib variables just set to trivial values (modulo osdirname) in 
the case of a non-multilib build - a non-multilib build should be building 
libraries exactly the same, with the same logic and the same variable 
settings, as the default multilib in a multilib build.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Libraries' configure scripts should not read config-ml.in when multilib is disabled

2018-07-30 Thread John Ericson
Currently some multilib variables are initialized, and config-ml.in
instantiated, whether or not a multilib build is being performed. I ran 
into this because I am building the runtime libraries (libatomic right 
now) separately from GCC. Multilib is disabled, and no multilib 
variables are set, yet the configure script fails after it cannot find 
`${multi_basedir}/config-ml.in`.


I understand that building them separately is not supported, but am
nevertheless hoping the patch can be upstreamed on the grounds that
this generally cleans up the build system in accordance with the
principle that "feature foo" variables need not be written and should
not be read when feature foo is disabled. libgcc's configure script,
for example, has some similar-looking bespoke m4 script with
`. ${libgcc_topdir}/config-ml.in` instead, side-stepping the issue.
Perhaps with this fixed, all the libraries could use the same macro.
[I suppose I'd be willing to investigate that.]


Thanks,

John

[Continuing from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86746. Did 
not know this sort of thing should exclusively go to this list. My 
apologies.]


From 3e0e7d6e5cfdc46342fcad5fe6b24b4f47af0d87 Mon Sep 17 00:00:00 2001
Message-Id: 
<3e0e7d6e5cfdc46342fcad5fe6b24b4f47af0d87.1532988611.git.John.Ericson@Obsidian.Systems>
From: John Ericson 
Date: Mon, 30 Jul 2018 18:06:02 -0400
Subject: [PATCH] multilib: Don't bother with multilib configuration
To: gcc-patches@gcc.gnu.org

* config/multi.m4: Don't bother with multilib configuration when
it is disabled.
---
 ChangeLog   |  5 +
 config/multi.m4 | 50 +++--
 2 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4bc5123c84e..2445071bea4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2018-07-30  John Ericson  
+
+   * config/multi.m4: Don't bother with multilib configuration when it
+   is disabled.
+
 2018-07-19  DJ Delorie  
 
* MAINTAINERS (m32c, msp43, rl78, libiberty, build): Remove myself
diff --git a/config/multi.m4 b/config/multi.m4
index bba338a8265..f5081155564 100644
--- a/config/multi.m4
+++ b/config/multi.m4
@@ -20,40 +20,45 @@ AC_ARG_ENABLE(multilib,
   no)  multilib=no ;;
   *)   AC_MSG_ERROR([bad value $enableval for multilib option]) ;;
  esac],
- [multilib=yes])
+[multilib=yes])
 
-# We may get other options which we leave undocumented:
-# --with-target-subdir, --with-multisrctop, --with-multisubdir
-# See config-ml.in if you want the gory details.
+if test "x$multilib" = xyes; then
 
-if test "$srcdir" = "."; then
-  if test "$with_target_subdir" != "."; then
-multi_basedir="$srcdir/$with_multisrctop../$2"
+  # We may get other options which we leave undocumented:
+  # --with-target-subdir, --with-multisrctop, --with-multisubdir
+  # See config-ml.in if you want the gory details.
+
+  if test "$srcdir" = "."; then
+if test "$with_target_subdir" != "."; then
+  multi_basedir="$srcdir/$with_multisrctop../$2"
+else
+  multi_basedir="$srcdir/$with_multisrctop$2"
+fi
   else
-multi_basedir="$srcdir/$with_multisrctop$2"
+multi_basedir="$srcdir/$2"
+  fi
+  AC_SUBST(multi_basedir)
+
+  # Even if the default multilib is not a cross compilation,
+  # it may be that some of the other multilibs are.
+  if test $cross_compiling = no && test $multilib = yes \
+ && test "x${with_multisubdir}" != x ; then
+ cross_compiling=maybe
   fi
-else
-  multi_basedir="$srcdir/$2"
-fi
-AC_SUBST(multi_basedir)
 
-# Even if the default multilib is not a cross compilation,
-# it may be that some of the other multilibs are.
-if test $cross_compiling = no && test $multilib = yes \
-   && test "x${with_multisubdir}" != x ; then
-   cross_compiling=maybe
 fi
 
-AC_OUTPUT_COMMANDS([
+AC_CONFIG_COMMANDS([config-ml],
+[if test "x$multilib" = xyes; then
 # Only add multilib support code if we just rebuilt the top-level
 # Makefile.
 case " $CONFIG_FILES " in
  *" ]m4_default([$1],Makefile)[ "*)
ac_file=]m4_default([$1],Makefile)[ . ${multi_basedir}/config-ml.in
;;
-esac],
-  [
-srcdir="$srcdir"
+esac
+fi],
+[srcdir="$srcdir"
 host="$host"
 target="$target"
 with_multisubdir="$with_multisubdir"
@@ -64,4 +69,5 @@ multi_basedir="$multi_basedir"
 CONFIG_SHELL=${CONFIG_SHELL-/bin/sh}
 CC="$CC"
 CXX="$CXX"
-GFORTRAN="$GFORTRAN"])])dnl
+GFORTRAN="$GFORTRAN"])
+])dnl
-- 
2.17.1



Re: [PATCH] Introduce instance discriminators

2018-07-30 Thread Alexandre Oliva
On Jul 24, 2018, Alexandre Oliva  wrote:

> Ok to install the first two patches?  (the third is just for reference)

Ping?

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01419.html


> Introduce instance discriminators

> From: Alexandre Oliva 

> With -gnateS, the Ada compiler sets itself up to output discriminators
> for different instantiations of generics, but the middle and back ends
> have lacked support for that.  This patch introduces the missing bits,
> translating the GNAT-internal representation of the per-file instance
> map to an instance_table that maps decls to instance discriminators.


> From: Alexandre Oliva  , Olivier Hainque  
> 
> for  gcc/ChangeLog

>   * debug.h (decl_to_instance_map_t): New type.
>   (decl_to_instance_map): Declare.
>   (maybe_create_decl_to_instance_map): New inline function.
>   * final.c (bb_discriminator, last_bb_discriminator): New statics,
>   to track basic block discriminators.
>   (final_start_function_1): Initialize them.
>   (final_scan_insn_1): On NOTE_INSN_BASIC_BLOCK, track
>   bb_discriminator.
>   (decl_to_instance_map): New variable.
>   (map_decl_to_instance, maybe_set_discriminator): New functions.
>   (notice_source_line): Set discriminator.

> for  gcc/ada

>   * trans.c: Include debug.h.
>   (file_map): New static variable.
>   (gigi): Set it.  Create decl_to_instance_map when needed.
>   (Subprogram_Body_to_gnu): Pass gnu_subprog_decl to...
>   (Sloc_to_locus): ... this.  Add decl parm, map it to instance.
>   * gigi.h (Sloc_to_locus): Adjust declaration.

> for  gcc/testsuite/ChangeLog

>   * gnat.dg/dinst.adb: New.
>   * gnat.dg/dinst_pkg.ads, gnat.dg/dinst_pkg.adb: New.

[...]

> Save discriminator info for LTO

> From: Alexandre Oliva 

> for  gcc/ChangeLog

>   * gimple-streamer-in.c (input_bb): Restore BB discriminator.
>   * gimple-streamer-out.c (output_bb): Save it.
>   * lto-streamer-in.c (input_struct_function_base): Restore
>   instance discriminator if available.  Create map on demand.
>   * lto-streamer-out.c (output_struct_function_base): Save it if
>   available.
>   * final.c (decl_to_instance_map): Document LTO strategy.


-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free!            FSF Latin America board member
GNU Toolchain Engineer             Free Software Evangelist


Re: [PATCH] Move -Walloca and related warnings from c.opt to common.opt

2018-07-30 Thread Iain Buclaw
On 30 July 2018 at 17:45, Martin Sebor  wrote:
> On 07/30/2018 09:28 AM, Jakub Jelinek wrote:
>>
>> On Sun, Jul 29, 2018 at 08:35:39PM +0200, Iain Buclaw wrote:
>>>
>>> Since r262910, it was noticed that new -Walloca-larger-than= warnings
>>> started appearing when building the D frontend's standard library.
>>> These have been checked and verified as valid, and appropriate fixes
>>> will be sent on upstream.
>>>
>>> As for the warning itself, as it is now default on, it would be
>>> preferable to be able to control it.  Given the choice between adding
>>> these options to the D frontend or moving them to common, I'd rather
>>> move them to common.
>>
>>
>> It is strange to add a warning option for languages that don't even have
>> alloca.  Instead, the warning shouldn't be enabled by default (but Martin
>> doesn't want to listen to that), or at least should be enabled only for
>> the languages where it makes sense.
>
>
> I agree that the warning shouldn't be enabled for languages that
> don't have support for alloca.
>

Fair enough, I'll add them to D instead then.

I am curious though if it's possible to trigger warnings from languages
that don't have direct support for alloca.  However having a look, the
only places that generate a call to alloca deal with variable-sized
objects.

Iain.


Re: Fwd: [PATCH, rs6000] Replace __uint128_t and __int128_t with __uint128 and __int128 in Power PC built-in documentation

2018-07-30 Thread Segher Boessenkool
On Fri, Jul 27, 2018 at 10:07:20AM -0500, Kelvin Nilsen wrote:
> On 7/26/18 9:54 AM, Segher Boessenkool wrote:
> > On Thu, Jul 26, 2018 at 08:40:01AM -0500, Kelvin Nilsen wrote:
> >> To improve internal consistency and to improve consistency with published 
> >> ABI documents, this patch replaces the __uint128_t type with __uint128 and 
> >> replaces __int128_t with __int128.
> > 
> >> Is this ok for trunk?
> > 
> > Looks good, thanks!  Most (all?) of these functions are not documented
> > in the ABI, but this is a step forward anyway.  Okay for trunk.
> > 
> > What do things like error messages involving these functions look like?
> > What types do those say?
> 
> Thanks for review and approval.  To respond to your question about error 
> messages:
> > 
> > microdoc3.c:22:3: error: invalid parameter combination for AltiVec 
> > intrinsic ‘__builtin_vec_vaddcuq’
> >u1 = vec_vaddcuq (d2, d3);
> >^~

Ah, so no type at all.  Well that's good :-)  Thanks,


Segher


Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)

2018-07-30 Thread Bernd Edlinger
Hi,

>@@ -621,6 +674,12 @@ c_strlen (tree src, int only_value)
>   maxelts = maxelts / eltsize - 1;
>   }
> 
>+  /* Unless the caller is prepared to handle it by passing in a non-null
>+ ARR, fail if the terminating nul doesn't fit in the array the string
>+ is stored in (as in const char a[3] = "123";  */
>+  if (!arr && maxelts < strelts)
>+return NULL_TREE;
>+

this is c_strlen, how is the caller ever supposed to handle non-zero terminated strings???
especially if you do this above?

>+c_strlen (tree src, int only_value, tree *arr /* = NULL */)
> {
>   STRIP_NOPS (src);
>+
>+  /* Used to detect non-nul-terminated strings in subexpressions
>+ of a conditional expression.  When ARR is null, point it at
>+ one of the elements for simplicity.  */
>+  tree arrs[] = { NULL_TREE, NULL_TREE };
>+  if (!arr)
>+arr = arrs;

>@@ -11427,7 +11478,9 @@ string_constant (tree arg, tree *ptr_offset)
>   unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
>   length = string_length (TREE_STRING_POINTER (init), charsize,
> length / charsize);
>-  if (compare_tree_int (array_size, length + 1) < 0)
>+  if (nulterm)
>+*nulterm = array_elts > length;
>+  else if (array_elts <= length)
> return NULL_TREE;

I don't understand why you can't use
compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)), TREE_STRING_LENGTH (init))
instead of this convoluted code above ???

Sorry, this patch does not look like it is ready any time soon.


But actually I am totally puzzled by your priorities.
This is what I see right now:

1) We have missing warnings.
2) We have wrong code bugs.
3) We have apparently a specification error on the C Language standard (*)


Why are you prioritizing 1) over 2) thus blocking my attempts to fix a wrong code
issue, and why do you not tackle 3) in your WG14?



(*) which means that GCC is currently removing code from assertions
as I pointed out here: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01695.html

This happens because GCC follows the language standards literally right now.

I would say too literally, and it proves that the language standard's logic is
flawed IMHO.

Thanks
Bernd.


Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)

2018-07-30 Thread Bernd Edlinger
On 07/30/18 21:52, Martin Sebor wrote:
> On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
>> On 07/30/18 01:05, Martin Sebor wrote:
>>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>>> Hi!
>>>>
>>>> This fixes two wrong code bugs where string_constant
>>>> returns over length string constants.  Initializers
>>>> like that are rejected in C++, but valid in C.
>>>
>>> If by valid you are referring to declarations like the one in
>>> the added test:
>>>
>>>  const char a[2][3] = { "1234", "xyz" };
>>>
>>> then (as I explained), the excess elements in "1234" make
>>> the char[3] initialization and thus the test case undefined.
>>> I have resolved bug 86711 as invalid on those grounds.
>>>
>>> Bug 86711 has a valid test case that needs to be fixed, along
>>> with bug 86688 that I raised for the same underlying problem:
>>> considering the excess nul as part of the string.  As has been
>>> discussed in a separate bug, rather than working around
>>> the excessively long strings in the middle-end, it would be
>>> preferable to avoid creating them to begin with.
>>>
>>> I'm already working on a fix for bug 86688, in part because
>>> I introduced the code change and also because I'm making other
>>> changes in this area -- bug 86552.  Both of these in response
>>> to your comments.
>>>
>>
>> Sorry, I must admit, I have completely lost track on how many things
>> you are trying to work in parallel.
>>
>> Nevertheless I started to review you pr86552 patch here:
>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html
>>
>> But so far you did not respond to me.
>>
>> Well actually I doubt your patch does apply to trunk,
>> maybe you start to re-base that one, and post it again
>> instead?
> 
> I read your comments and have been busy working on enhancing
> the patch (among other things).  There are a large number of
> additional contexts where constant strings are expected and
> where a missing nul needs to be detected.  Some include
> additional instances of strlen calls that my initial patch
> didn't handle, many more others that involve other string
> functions.  I have posted an updated patch that applies
> cleanly and that handles the first set.
> 
> There is also a class of problems involving constant character
> arrays initialized by a braced list, as in char [] = { x, y, z };
> Those are currently not recognized as strings even if they are
> nul-terminated, but they are far more likely to be a source of
> these problems than string literals, certainly in C++ where
> string initializers must fit in the array.  I am testing a patch
> to convert those into STRING_CST so they can be handled as well.
> 
> Since initializing arrays with more elements than fit is
> undefined in C and since the behavior is undefined at compile
> time it seems to me that rejecting such initializers with
> a hard error (as opposed to a warning) would be appropriate
> and obviate having to deal with them in the middle-end.
> 

We do not want to change what is currently accepted by the
front end. period.

But there is no reason why ambiguous string constants
have to be passed to the middle end.

For instance char c[2] = "a\0"; should look like c[1] = "a";
while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c
will cut the excess precision off anyway.

That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0";

I propose to have all STRING_CST always be created by the
FE with explicit nul termination, but with
TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH (str) in the normal
(nul-terminated) case, and
TYPE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH (str) if the string is
not nul-terminated because it was truncated in the initializer.
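
A minimal sketch of this invariant, written as a stand-alone C model rather
than GCC code.  The struct and function names are hypothetical: type_size
stands in for TYPE_SIZE_UNIT (TREE_TYPE (str)) and string_length for
TREE_STRING_LENGTH (str), which under the proposal always counts an explicit
terminating nul.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a string constant; not a GCC data structure.  */
struct string_cst_model
{
  size_t type_size;      /* Size in bytes of the array being initialized.  */
  size_t string_length;  /* Bytes stored, including the explicit nul.  */
};

/* Model the initializer TEXT for a char array of TYPE_SIZE bytes.  */
struct string_cst_model
model_initializer (size_t type_size, const char *text)
{
  struct string_cst_model m = { type_size, strlen (text) + 1 };
  return m;
}

/* True if the initialized object itself holds a terminating nul.  */
bool
object_is_nul_terminated (struct string_cst_model m)
{
  return m.type_size >= m.string_length;
}

void
demo_string_cst_model (void)
{
  /* char a[4] = "abc";  the nul fits in the array.  */
  assert (object_is_nul_terminated (model_initializer (4, "abc")));
  /* char b[3] = "abc";  valid C, but the array has no room for the nul.  */
  assert (!object_is_nul_terminated (model_initializer (3, "abc")));
}
```

With this representation a consumer would only need to compare the two sizes
to tell whether the object is a nul-terminated string.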

Do you understand what I mean?


Bernd.


[PATCH] avoid incomplete types in -Warray-bounds (PR 86741)

2018-07-30 Thread Martin Sebor

The enhanced handling of MEM_REFs in -Warray-bounds assumes
the object from whose address an offset is being computed has
a complete type.  When the type is incomplete the size of the
object isn't known, so whether the offset (or index) from its
beginning is valid cannot be reliably determined.  The attached
patch avoids dealing with such objects.

Martin
PR tree-optimization/86741 - ICE in -Warray-bounds indexing into an object of incomplete type

gcc/ChangeLog:

	PR tree-optimization/86741
	* tree-vrp.c (vrp_prop::check_mem_ref): Avoid incomplete types.

gcc/testsuite/ChangeLog:

	PR tree-optimization/86741
	* gcc.dg/Warray-bounds-33.c: New test.

Index: gcc/testsuite/gcc.dg/Warray-bounds-33.c
===
--- gcc/testsuite/gcc.dg/Warray-bounds-33.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Warray-bounds-33.c	(working copy)
@@ -0,0 +1,36 @@
+/* PR tree-optimization/86741 - ICE in -Warray-bounds indexing into
+   an object of incomplete type
+   { dg-do compile }
+   { dg-options "-O2 -Wall" }  */
+
+struct S
+{
+  int s;
+};
+
+void f (void);
+
+void test_void (void)
+{
+  extern void v;
+  struct S *b = (struct S*)&v;
+  if (b->s)
+    f ();
+}
+
+void test_incomplete_enum (void)
+{
+  extern enum E e;
+  struct S *b = (struct S*)&e;
+  if (b->s)
+    f ();
+}
+
+void test_func (void)
+{
+  struct S *b = (struct S*)&f;
+  if (b->s)
+    f ();
+}
+
+/* { dg-prune-output "taking address of expression of type .void." } */
Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c	(revision 263072)
+++ gcc/tree-vrp.c	(working copy)
@@ -5048,9 +5048,12 @@ vrp_prop::check_mem_ref (location_t location, tree
  a reference/subscript via a pointer to an object that is not
  an element of an array.  References to members of structs and
  unions are excluded because MEM_REF doesn't make it possible
- to identify the member where the reference originated.  */
+ to identify the member where the reference originated.
+ Incomplete types are excluded as well because their size is
+ not known.  */
   tree reftype = TREE_TYPE (arg);
   if (POINTER_TYPE_P (reftype)
+  || !COMPLETE_TYPE_P (reftype)
   || RECORD_OR_UNION_TYPE_P (reftype))
 return;
 


Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)

2018-07-30 Thread Martin Sebor

On 07/30/2018 09:24 AM, Bernd Edlinger wrote:

On 07/30/18 01:05, Martin Sebor wrote:

On 07/29/2018 04:56 AM, Bernd Edlinger wrote:

Hi!

This fixes two wrong code bugs where string_constant
returns over length string constants.  Initializers
like that are rejected in C++, but valid in C.


If by valid you are referring to declarations like the one in
the added test:

 const char a[2][3] = { "1234", "xyz" };

then (as I explained), the excess elements in "1234" make
the char[3] initialization and thus the test case undefined.
I have resolved bug 86711 as invalid on those grounds.

Bug 86711 has a valid test case that needs to be fixed, along
with bug 86688 that I raised for the same underlying problem:
considering the excess nul as part of the string.  As has been
discussed in a separate bug, rather than working around
the excessively long strings in the middle-end, it would be
preferable to avoid creating them to begin with.

I'm already working on a fix for bug 86688, in part because
I introduced the code change and also because I'm making other
changes in this area -- bug 86552.  Both of these in response
to your comments.



Sorry, I must admit, I have completely lost track on how many things
you are trying to work in parallel.

Nevertheless I started to review you pr86552 patch here:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html

But so far you did not respond to me.

Well actually I doubt your patch does apply to trunk,
maybe you start to re-base that one, and post it again
instead?


I read your comments and have been busy working on enhancing
the patch (among other things).  There are a large number of
additional contexts where constant strings are expected and
where a missing nul needs to be detected.  Some include
additional instances of strlen calls that my initial patch
didn't handle, many more others that involve other string
functions.  I have posted an updated patch that applies
cleanly and that handles the first set.

There is also a class of problems involving constant character
arrays initialized by a braced list, as in char [] = { x, y, z };
Those are currently not recognized as strings even if they are
nul-terminated, but they are far more likely to be a source of
these problems than string literals, certainly in C++ where
string initializers must fit in the array.  I am testing a patch
to convert those into STRING_CST so they can be handled as well.

Since initializing arrays with more elements than fit is
undefined in C and since the behavior is undefined at compile
time it seems to me that rejecting such initializers with
a hard error (as opposed to a warning) would be appropriate
and obviate having to deal with them in the middle-end.

Martin


Re: [PATCH] warn for strlen of arrays with missing nul (PR 86552)

2018-07-30 Thread Martin Sebor

On 07/26/2018 02:53 AM, Bernd Edlinger wrote:

@@ -567,13 +597,17 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
accesses.  Note that this implies the result is not going to be emitted
   into the instruction stream.

+   When ARR is non-null and the string is not properly nul-terminated,
+   set *ARR to the declaration of the outermost constant object whose
+   initializer (or one of its elements) is not nul-terminated.
+
The value returned is of type `ssizetype'.

Unfortunately, string_constant can't access the values of const char
arrays with initializers, so neither can we do so here.  */


Maybe drop that sentence when it is no longer true?


I believe what the sentence means is that folding the length of arrays
like this isn't handled:

  const char a[] = { 'a', 'b', '\0' };

because they are represented not as STRING_CST but rather as
aggregate CONSTRUCTORs.

Adding this handling has been something I've been meaning to
do for some time and this seems like a good opportunity to do
it.  I'm testing a patch with this enhancement.
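
As an illustration of why the two spellings deserve the same treatment, the
sketch below (plain C, not part of the patch; all names are hypothetical)
checks that a braced list ending in '\0' and the equivalent string literal
produce byte-identical objects, even though per the explanation above the
compiler represents the former as a CONSTRUCTOR and the latter as a
STRING_CST.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Spelled as a braced list of characters.  */
const char braced[] = { 'a', 'b', '\0' };

/* Spelled as a string literal.  */
const char literal[] = "ab";

/* True if both spellings yield objects of the same size and contents.  */
bool
same_bytes (void)
{
  return sizeof braced == sizeof literal
         && memcmp (braced, literal, sizeof braced) == 0;
}

void
demo_braced_vs_literal (void)
{
  assert (same_bytes ());
  assert (strlen (braced) == 2);
}
```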


 tree
-c_strlen (tree src, int only_value)
+c_strlen (tree src, int only_value, tree *arr /* = NULL */)
 {
   STRIP_NOPS (src);
   if (TREE_CODE (src) == COND_EXPR
@@ -581,24 +615,31 @@ c_strlen (tree src, int only_value)
 {
   tree len1, len2;

-  len1 = c_strlen (TREE_OPERAND (src, 1), only_value);
-  len2 = c_strlen (TREE_OPERAND (src, 2), only_value);
+  len1 = c_strlen (TREE_OPERAND (src, 1), only_value, arr);
+  len2 = c_strlen (TREE_OPERAND (src, 2), only_value, arr);


Wow, what happens here if the first operand is non-zero terminated and the 
second is zero-terminated?


It depends on the size of the first array and on the length
of the second.  This test case triggers a warning:

  const char a[2][3] = { "123", "12" };

  int f (int i)
  {
    return __builtin_strlen (i < 0 ? a[0] : a[1]);
  }

  warning: ‘__builtin_strlen’ argument missing terminating nul [-Wstringop-overflow=]

  note: referenced argument declared here
  const char a[2][3] = { "123", "12" };
 ^

but this one didn't:

  const char a[3] = "123";
  const char b[4] = "123";

  int f (int i)
  {
    return __builtin_strlen (i < 0 ? a : b);
  }

The usual arguments both for and against issuing a diagnostic
here apply but I think it makes more sense to err on the side
of caution and diagnose them than not so I enhanced the patch
to do that, as well as handle a few more strlen contexts.
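
For comparison, here is a rough run-time analogue of what the warning looks
for in the second test case: each arm of the conditional is scanned for a nul
within the bounds of its own array.  This is only a sketch in plain C; the
helper names and the arrays a3 and b4 are made up for the example and do not
appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the length of the string stored in the SIZE-byte array S,
   or -1 if there is no nul within the array.  */
long
bounded_length (const char *s, size_t size)
{
  const char *nul = memchr (s, '\0', size);
  return nul ? (long) (nul - s) : -1;
}

const char a3[3] = "123";   /* No room for the nul: not a string.  */
const char b4[4] = "123";   /* Nul-terminated.  */

/* Analogue of strlen (i < 0 ? a3 : b4) that reports a missing nul
   instead of reading past the end of the array.  */
long
checked_length (int i)
{
  return i < 0 ? bounded_length (a3, sizeof a3)
               : bounded_length (b4, sizeof b4);
}

void
demo_checked_length (void)
{
  assert (checked_length (-1) == -1);
  assert (checked_length (1) == 3);
}
```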

Martin


Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)

2018-07-30 Thread Martin Sebor

Attached is an updated version of the patch that handles more
instances of calling strlen() on a constant array that is not
a nul-terminated string.

No other functions except strlen are explicitly handled yet,
and neither are constant arrays with braced-initializer lists
like const char a[] = { 'a', 'b', 'c' };  I am testing
an independent solution for those (bug 86552).  Once those
are handled the warning will be able to detect those as well.

Tested on x86_64-linux.

On 07/25/2018 05:38 PM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01124.html

The fix for bug 86532 has been checked in so this enhancement
can now be applied on top of it (with only minor adjustments).

On 07/19/2018 02:08 PM, Martin Sebor wrote:

In the discussion of my patch for pr86532 Bernd noted that
GCC silently accepts constant character arrays with no
terminating nul as arguments to strlen (and other string
functions).

The attached patch is a first step in detecting these kinds
of bugs in strlen calls by issuing -Wstringop-overflow.
The next step is to modify all other handlers of built-in
functions to detect the same problem (not part of this patch).
Yet another step is to detect these problems in arguments
initialized using the non-string form:

  const char a[] = { 'a', 'b', 'c' };

This patch is meant to apply on top of the one for bug 86532
(I tested it with an earlier version of that patch so there
is code in the context that does not appear in the latest
version of the other diff).

Martin





PR tree-optimization/86552 - missing warning for reading past the end of non-string arrays

gcc/ChangeLog:

	PR tree-optimization/86552
	* builtins.h (warn_string_no_nul): Declare.
	(c_strlen): Add argument.
	* builtins.c (warn_string_no_nul): New function.
	(fold_builtin_strlen): Add argument.  Detect missing nul.
	(fold_builtin_1): Adjust.
	(string_length): Add argument and use it.
	(c_strlen): Same.
	(expand_builtin_strlen): Detect missing nul.
	* expr.c (string_constant): Add arguments.  Detect missing nul
	terminator and outermost declaration it's missing in.
	* expr.h (string_constant): Add argument.
	* fold-const.c (c_getstr): Change argument to bool*, rename
	other arguments.
	* fold-const-call.c (fold_const_call): Detect missing nul.
	* gimple-fold.c (get_range_strlen): Add argument.
	(get_maxval_strlen): Adjust.
	* gimple-fold.h (get_range_strlen): Add argument.

gcc/testsuite/ChangeLog:

	PR tree-optimization/86552
	* gcc.dg/warn-string-no-nul.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index aa3e0d8..f4924d5 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -150,7 +150,7 @@ static tree stabilize_va_list_loc (location_t, tree, int);
 static rtx expand_builtin_expect (tree, rtx);
 static tree fold_builtin_constant_p (tree);
 static tree fold_builtin_classify_type (tree);
-static tree fold_builtin_strlen (location_t, tree, tree);
+static tree fold_builtin_strlen (location_t, tree, tree, tree);
 static tree fold_builtin_inf (location_t, tree, int);
 static tree rewrite_call_expr (location_t, tree, int, tree, int, ...);
 static bool validate_arg (const_tree, enum tree_code code);
@@ -550,6 +550,36 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
   return n;
 }
 
+/* For a call expression EXP to a function that expects a string argument,
+   issue a diagnostic due to it being called with an argument NONSTR
+   that is a character array with no terminating NUL.  */
+
+void
+warn_string_no_nul (location_t loc, tree exp, tree fndecl, tree nonstr)
+{
+  loc = expansion_point_location_if_in_system_header (loc);
+
+  bool warned;
+  if (exp)
+    {
+      if (!fndecl)
+	fndecl = get_callee_fndecl (exp);
+      warned = warning_at (loc, OPT_Wstringop_overflow_,
+			   "%K%qD argument missing terminating nul",
+			   exp, fndecl);
+    }
+  else
+    {
+      gcc_assert (fndecl);
+      warned = warning_at (loc, OPT_Wstringop_overflow_,
+			   "%qD argument missing terminating nul",
+			   fndecl);
+    }
+
+  if (warned && DECL_P (nonstr))
+    inform (DECL_SOURCE_LOCATION (nonstr), "referenced argument declared here");
+}
+
 /* Compute the length of a null-terminated character string or wide
character string handling character sizes of 1, 2, and 4 bytes.
TREE_STRING_LENGTH is not the right way because it evaluates to
@@ -567,37 +597,60 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
accesses.  Note that this implies the result is not going to be emitted
into the instruction stream.
 
+   When ARR is non-null and the string is not properly nul-terminated,
+   set *ARR to the declaration of the outermost constant object whose
+   initializer (or one of its elements) is not nul-terminated.
+
The value returned is of type `ssizetype'.
 
Unfortunately, string_constant can't access the values of const char
arrays with initializers, so neither can we do so here.  */
 
 tree
-c_strlen (tree src, int only_value)

Re: Fold pointer range checks with equal spans

2018-07-30 Thread Richard Sandiford
[Sorry, somehow missed this till now]

Richard Biener  writes:
> On Mon, Jul 23, 2018 at 5:05 PM Richard Sandiford
>  wrote:
>>
>> Marc Glisse  writes:
>> > On Fri, 20 Jul 2018, Richard Sandiford wrote:
>> >
>> >> --- gcc/match.pd 2018-07-18 18:44:22.565914281 +0100
>> >> +++ gcc/match.pd 2018-07-20 11:24:33.692045585 +0100
>> >> @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> >>(if (inverse_conditions_p (@0, @2)
>> >> && element_precision (type) == element_precision (op_type))
>> >> (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1)))
>> >> +
>> >> +/* For pointers @0 and @2 and unsigned constant offset @1, look for
>> >> +   expressions like:
>> >> +
>> >> +   A: (@0 + @1 < @2) | (@2 + @1 < @0)
>> >> +   B: (@0 + @1 <= @2) | (@2 + @1 <= @0)
>> >> +
>> >> +   If pointers are known not to wrap, B checks whether @1 bytes starting at
>> >> +   @0 and @2 do not overlap, while A tests the same thing for @1 + 1 bytes.
>> >> +   A is more efficiently tested as:
>> >> +
>> >> +   ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2)
>> >> +
>> >> +   as long as @1 * 2 doesn't overflow.  B is the same with @1 replaced
>> >> +   with @1 - 1.  */
>> >> +(for ior (truth_orif truth_or bit_ior)
>> >> + (for cmp (le lt)
>> >> +  (simplify
>> >> +   (ior (cmp (pointer_plus:s @0 INTEGER_CST@1) @2)
>> >> +(cmp (pointer_plus:s @2 @1) @0))
>> >
>> > Do you want :c on cmp, in case it appears as @2 > @0 + @1 ? (may need some
>> > care with "cmp == LE_EXPR" below)
>> > Do you want :s on cmp as well?
>> >
>> >> +   (if (!flag_trapv && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
>> >
>> > Don't you want TYPE_OVERFLOW_UNDEFINED?
>>
>> Thanks, fixed below.  Think the cmp == LE_EXPR stuff is still ok with :c,
>> since the generated code sets cmp to LE_EXPR when matching GE_EXPR.
>>
>> Tested as before.  OK to install?
>>
>> Richard
>>
>>
>> 2018-07-23  Richard Sandiford  
>>
>> gcc/
>> * match.pd: Optimise pointer range checks.
>>
>> gcc/testsuite/
>> * gcc.dg/pointer-range-check-1.c: New test.
>> * gcc.dg/pointer-range-check-2.c: Likewise.
>>
>> Index: gcc/match.pd
>> ===
>> --- gcc/match.pd 2018-07-23 15:56:47.0 +0100
>> +++ gcc/match.pd 2018-07-23 15:58:33.480269844 +0100
>> @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> (if (inverse_conditions_p (@0, @2)
>>  && element_precision (type) == element_precision (op_type))
>>  (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1)))
>> +
>> +/* For pointers @0 and @2 and unsigned constant offset @1, look for
>> +   expressions like:
>> +
>> +   A: (@0 + @1 < @2) | (@2 + @1 < @0)
>> +   B: (@0 + @1 <= @2) | (@2 + @1 <= @0)
>> +
>> +   If pointers are known not to wrap, B checks whether @1 bytes starting at
>> +   @0 and @2 do not overlap, while A tests the same thing for @1 + 1 bytes.
>> +   A is more efficiently tested as:
>> +
>> +   ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2)
>> +
>> +   as long as @1 * 2 doesn't overflow.  B is the same with @1 replaced
>> +   with @1 - 1.  */
>> +(for ior (truth_orif truth_or bit_ior)
>> + (for cmp (le lt)
>> +  (simplify
>> +   (ior (cmp:cs (pointer_plus:s @0 INTEGER_CST@1) @2)
>> +   (cmp:cs (pointer_plus:s @2 @1) @0))
>> +   (if (!flag_trapv && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>
> no need to check flag_trapv, pointer arithmetic is not covered by -ftrapv.

This was there because we're converting pointer arithmetic into integer
arithmetic, and -ftrapv could cause that new integer code to trap.
TYPE_OVERFLOW_UNDEFINED says that pointer overflow is undefined,
but it seemed bad form to make it actually trap, especially when the
above does some reassociation too.

>> +/* Convert the B form to the A form.  */
>> + (with { offset_int off = wi::to_offset (@1) - (cmp == LE_EXPR ? 1 : 0); }
>> + /* Always fails for negative values.  */
>> + (if (wi::min_precision (off, UNSIGNED) * 2 <= TYPE_PRECISION (sizetype))
>> +  /* It doesn't matter whether we use @2 - @0 or @0 - @2, so let
>> +tree_swap_operands_p pick a canonical order.  */
>
> gimple_resimplify takes care of that - well, it doesn't since minus isn't
> commutative...  I guess you get better CSE later when doing this thus ok,
> but it does look a bit off here ;)
>
> I think you shouldn't use 'sizetype' without guarding this whole thing
> with TYPE_PRECISION (sizetype) == TYPE_PRECISION (TREE_TYPE (@0)).

OK, hadn't realised that could be false.  Would building the appropriate
unsigned type be OK without the condition, or does it need to be sizetype?
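
For reference, the equivalence that the quoted match.pd comment relies on can
be checked with a small stand-alone C program.  This is only a sketch: it
models the pointers as uintptr_t values so that the unsigned wraparound is
well defined, and all function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form A as written: (p + n < q) | (q + n < p).  */
bool
form_a_naive (uintptr_t p, uintptr_t q, uintptr_t n)
{
  return (p + n < q) | (q + n < p);
}

/* Form B as written: (p + n <= q) | (q + n <= p).  */
bool
form_b_naive (uintptr_t p, uintptr_t q, uintptr_t n)
{
  return (p + n <= q) | (q + n <= p);
}

/* Single-comparison replacement for form A.  It relies on unsigned
   wraparound of p - q and assumes n * 2 does not overflow.  */
bool
form_a_folded (uintptr_t p, uintptr_t q, uintptr_t n)
{
  return (uintptr_t) (p - q + n) > (uintptr_t) (n * 2);
}

/* Form B is form A with n replaced by n - 1.  The case n == 0 is not
   handled, mirroring the guard that rejects negative offsets.  */
bool
form_b_folded (uintptr_t p, uintptr_t q, uintptr_t n)
{
  return form_a_folded (p, q, n - 1);
}

/* Compare the naive and folded forms over a small range of addresses.  */
bool
forms_agree (void)
{
  for (uintptr_t p = 1000; p < 1064; p++)
    for (uintptr_t q = 1000; q < 1064; q++)
      for (uintptr_t n = 0; n <= 16; n++)
        {
          if (form_a_naive (p, q, n) != form_a_folded (p, q, n))
            return false;
          if (n > 0 && form_b_naive (p, q, n) != form_b_folded (p, q, n))
            return false;
        }
  return true;
}

void
demo_forms (void)
{
  assert (forms_agree ());
}
```

The folded form works because p - q + n lies in [0, 2n] exactly when the
distance between the two addresses is at most n; any larger distance in
either direction lands outside that interval once the subtraction wraps.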

> Since the original IL performs an ordered compare of two eventually unrelated
> pointers (or pointers adjusted in a way that crosses object
> boundaries) (uh... ;))
> I wonder if you can use POINTER_DIFF_EXPR here to avoid the sizetype
> conversion?  Since POINTER_PLUS_EXPR offsets are supposed to 

Re: [PATCH] arm: Generate correct const_ints (PR86640)

2018-07-30 Thread Segher Boessenkool
On Mon, Jul 30, 2018 at 03:55:30PM +0100, Kyrill Tkachov wrote:
> Hi Segher,
> 
> On 30/07/18 14:14, Segher Boessenkool wrote:
> >In arm_block_set_aligned_vect 8-bit constants are generated as zero-
> >extended const_ints, not sign-extended as required.  Fix that.
> >
> >Tamar tested the patch (see PR); no problems were found.  Is this okay
> >for trunk?
> >
> 
> The patch is okay but please add the testcase from the PR to gcc.dg/
> or somewhere else generic (it's not arm-specific).

It only failed with very specific options, which aren't valid on every
ARM config either I think?

-O3 -mfpu=neon -mfloat-abi=hard -march=armv7-a

I don't know the magic incantations for ARM tests, sorry.
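
For readers unfamiliar with the invariant mentioned in the quoted description:
a const_int for a narrow mode is expected to hold the value sign-extended from
that mode.  The sketch below shows the difference for an 8-bit value in plain
C.  It is an illustration only, not the code from the patch, and both function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the low 8 bits of VALUE sign-extended to 64 bits, which is
   the canonical form described above for an 8-bit constant.  */
int64_t
sign_extend_8 (uint64_t value)
{
  int64_t low = (int64_t) (value & 0xff);
  return (low ^ 0x80) - 0x80;
}

/* Return the low 8 bits of VALUE zero-extended to 64 bits, which is
   the form the description above says was generated by mistake.  */
int64_t
zero_extend_8 (uint64_t value)
{
  return (int64_t) (value & 0xff);
}

void
demo_extend (void)
{
  assert (sign_extend_8 (0xff) == -1);
  assert (zero_extend_8 (0xff) == 255);
}
```

The two forms agree for values below 0x80 and differ for everything from 0x80
to 0xff, which is why only some constants would expose the problem.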


Segher


Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Richard Biener
On July 30, 2018 4:41:19 PM GMT+02:00, Bernd Edlinger wrote:
>
>
>On 07/30/18 15:03, Richard Biener wrote:
>> On Mon, 30 Jul 2018, Bernd Edlinger wrote:
>> 
>>> Hi,
>>>
>>> this is how I would like to handle the over length strings issue in the C FE.
>>> If the string constant is exactly the right length and ends in one explicit
>>> NUL character, shorten it by one character.
>>>
>>> I thought Martin would be working on it,  but as this is a really simple fix,
>>> I would dare to send it to gcc-patches anyway, hope you don't mind...
>>>
>>> The patch is relative to the other patch here:
>>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01800.html
>>>
>>>
>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>>> Is it OK for trunk?
>> 
>> I'll leave this to FE maintainers but can I ask you to verify the
>> (other) FEs do not leak this kind of invalid initializers to the
>> middle-end?  I suggest to put this verification in
>> output_constructor which otherwise happily truncates initializers
>> with excess size.  There's also gimplification which might elide
>> a = { "abcd", "cdse" }; to  a.x = "abcd"; a.y = "cdse"; but
>> hopefully there the GIMPLE verifier (verify_gimple_assign_single)
>> verifies this - well, it only dispatches to useless_type_conversion_p
>> (lhs_type, rhs1_type) for this case, but non-flexarrays should be
>> handled fine there.
>> 
>> Richard.
>> 
>
>At the moment I would already be happy if all STRING_CSTs were
>zero terminated.
>
>However Go does not create zero-terminated STRING_CSTs, @Ian sorry,
>could you look at changing this to include the terminating NUL char?
>
>Index: gcc/go/go-gcc.cc
>===
>--- gcc/go/go-gcc.cc   (revision 263068)
>+++ gcc/go/go-gcc.cc   (working copy)
>@@ -1394,7 +1394,7 @@ Gcc_backend::string_constant_expression(const
>std:
> TYPE_QUAL_CONST);
>tree string_type = build_array_type(const_char_type, index_type);
>TYPE_STRING_FLAG(string_type) = 1;
>-  tree string_val = build_string(val.length(), val.data());
>+  tree string_val = build_string(val.length() + 1, val.data());
>TREE_TYPE(string_val) = string_type;
>  
>return this->make_expression(string_val);
>
>
>I am pretty sure that the C++ FE allows overlength initializers
>with -permissive.  They should be hedged in string_constant IMHO,
>however with the patch I am still holding back on Jeff's request
>I ran over a string constant in tree-ssa-strlen.c (get_min_string_length)
>that had a terminating NUL char but the index range type did not
>include the string terminator.  One just needs to be careful here.
>
>A quick survey shows that Fortran creates C strings with range
>1..n, which puts the pr86532 address computation again in question.
>Remember, you asked for array_ref_element_size but not for
>array_ref_low_bound, and Jeff acked the patch in this state.

The (earlier) changes in this area are unfortunately in a quite messy state.
It would be nice to roll back completely and attack this in smaller and more
obvious issues.

Richard. 

>
>
>Thanks
>Bernd.



[PATCH] PR libstdc++/86734 make reverse_iterator::operator-> more robust

2018-07-30 Thread Jonathan Wakely

Implement the proposed resolution from LWG 1052, which also resolves
DR 2188 by avoiding taking the address in the first place.

PR libstdc++/86734
* include/bits/stl_iterator.h (reverse_iterator::operator->): Call
_S_to_pointer (LWG 1052, LWG 2188).
(reverse_iterator::_S_to_pointer): Define overloaded helper functions.
* testsuite/24_iterators/reverse_iterator/dr1052.cc: New test.
* testsuite/24_iterators/reverse_iterator/dr2188.cc: New test.

Tested powerpc64le-linux, committed to trunk.


commit 7e221535ac8ad07732c1ce019dd2c8c889a95dce
Author: Jonathan Wakely 
Date:   Mon Jul 30 15:14:11 2018 +0100

PR libstdc++/86734 make reverse_iterator::operator-> more robust

Implement the proposed resolution from LWG 1052, which also resolves
DR 2188 by avoiding taking the address in the first place.

PR libstdc++/86734
* include/bits/stl_iterator.h (reverse_iterator::operator->): Call
_S_to_pointer (LWG 1052, LWG 2188).
(reverse_iterator::_S_to_pointer): Define overloaded helper functions.
* testsuite/24_iterators/reverse_iterator/dr1052.cc: New test.
* testsuite/24_iterators/reverse_iterator/dr2188.cc: New test.

diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h
index 0d5f20bc2c6..8562f879c16 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -122,6 +122,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   */
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 235 No specification of default ctor for reverse_iterator
+  // 1012. reverse_iterator default ctor should value initialize
   _GLIBCXX17_CONSTEXPR
   reverse_iterator() : current() { }
 
@@ -182,7 +183,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   */
   _GLIBCXX17_CONSTEXPR pointer
   operator->() const
-  { return &(operator*()); }
+  {
+   // _GLIBCXX_RESOLVE_LIB_DEFECTS
+   // 1052. operator-> should also support smart pointers
+   _Iterator __tmp = current;
+   --__tmp;
+   return _S_to_pointer(__tmp);
+  }
 
   /**
*  @return  @c *this
@@ -286,6 +293,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _GLIBCXX17_CONSTEXPR reference
   operator[](difference_type __n) const
   { return *(*this + __n); }
+
+    private:
+      template<typename _Tp>
+	static _GLIBCXX17_CONSTEXPR _Tp*
+	_S_to_pointer(_Tp* __p)
+        { return __p; }
+
+      template<typename _Tp>
+	static _GLIBCXX17_CONSTEXPR pointer
+	_S_to_pointer(_Tp __t)
+        { return __t.operator->(); }
 };
 
   //@{
diff --git a/libstdc++-v3/testsuite/24_iterators/reverse_iterator/dr1052.cc b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/dr1052.cc
new file mode 100644
index 000..2704010a083
--- /dev/null
+++ b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/dr1052.cc
@@ -0,0 +1,82 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }
+
+// PR libstdc++/86734
+// LWG 1052. reverse_iterator::operator-> should also support smart pointers
+// LWG 2775. reverse_iterator does not compile for fancy pointers
+
+#include <iterator>
+#include <testsuite_hooks.h>
+
+void
+test01()
+{
+  // Example 1 from LWG 1052
+
+  struct X { int m; };
+
+  static X x;
+
+  struct IterX {
+    typedef std::bidirectional_iterator_tag iterator_category;
+    typedef X& reference;
+    struct pointer
+    {
+      pointer(X& v) : value(v) {}
+      X& value;
+      X* operator->() const {return &value;}
+    };
+    typedef std::ptrdiff_t difference_type;
+    typedef X value_type;
+    // additional iterator requirements not important for this issue
+
+    reference operator*() const { return x; }
+    pointer operator->() const { return pointer(x); }
+    IterX& operator--() {return *this;}
+
+  };
+
+  std::reverse_iterator<IterX> ix;
+  VERIFY( &ix->m == &(*ix).m );
+}
+
+void
+test02()
+{
+  // Example 2 from LWG 1052
+
+  struct P {
+    P() : first(10), second(20.0) { }
+    int first;
+    double second;
+  };
+  P op;
+  std::reverse_iterator<P*> ri(&op + 1);
+  VERIFY( ri->first == 10 );
+}
+
+// N.B. Example 3 from LWG 1052 isn't expected to work,
+// because a caching iterator like IterX is not a 

[PATCH] Add workaround for aligned_alloc bug on AIX

2018-07-30 Thread Jonathan Wakely

20_util/memory_resource/2.cc FAILs on AIX 7.2.0.0, because aligned_alloc
incorrectly requires the alignment to be a multiple of sizeof(void*).

This adds a workaround to the operator new overload taking an alignment
value, to increase the alignment (and size) if needed.

* libsupc++/new_opa.cc (operator new(size_t, align_val_t)): Add
workaround for aligned_alloc bug on AIX.
* testsuite/18_support/new_aligned.cc: New test.

Tested powerpc64le-linux and powerpc-aix7.2.0.0, committed to trunk.
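
The adjustment described above can be sketched in plain C on top of C11
aligned_alloc.  This is not the libsupc++ code: the wrapper and helper names
are hypothetical, and unlike the patch (which applies the minimum-alignment
step only on AIX) the sketch applies it unconditionally for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate SIZE bytes aligned to ALIGN, which must be a power of two.
   The alignment is raised to at least sizeof (void *) and the size is
   rounded up to a multiple of the alignment before calling
   aligned_alloc.  Release the result with free.  */
void *
alloc_with_alignment (size_t size, size_t align)
{
  if (size == 0)
    size = 1;

  /* Some implementations reject alignments smaller than a pointer.  */
  if (align < sizeof (void *))
    align = sizeof (void *);

  /* C11: the size shall be an integral multiple of the alignment.  */
  size_t rem = size & (align - 1);
  if (rem)
    size += align - rem;

  return aligned_alloc (align, size);
}

/* True if P is aligned to ALIGN, which must be a power of two.  */
bool
is_aligned (const void *p, size_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}

void
demo_alloc (void)
{
  void *p = alloc_with_alignment (1, 1);
  assert (p != NULL && is_aligned (p, 1));
  free (p);
}
```

Raising the alignment is harmless for the caller because any address aligned
to sizeof (void *) is also aligned to every smaller power of two.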


commit 8f66f02efa907dbcd3b88a761a4106e4f0354ccd
Author: Jonathan Wakely 
Date:   Mon Jul 30 11:47:19 2018 +0100

Add workaround for aligned_alloc bug on AIX

20_util/memory_resource/2.cc FAILs on AIX 7.2.0.0, because aligned_alloc
incorrectly requires the alignment to be a multiple of sizeof(void*).

This adds a workaround to the operator new overload taking an alignment
value, to increase the alignment (and size) if needed.

* libsupc++/new_opa.cc (operator new(size_t, align_val_t)): Add
workaround for aligned_alloc bug on AIX.
* testsuite/18_support/new_aligned.cc: New test.

diff --git a/libstdc++-v3/libsupc++/new_opa.cc b/libstdc++-v3/libsupc++/new_opa.cc
index 097280d9b54..7c4bb79cdab 100644
--- a/libstdc++-v3/libsupc++/new_opa.cc
+++ b/libstdc++-v3/libsupc++/new_opa.cc
@@ -95,6 +95,12 @@ operator new (std::size_t sz, std::align_val_t al)
 sz = 1;
 
 #if _GLIBCXX_HAVE_ALIGNED_ALLOC
+# ifdef _AIX
+  /* AIX 7.2.0.0 aligned_alloc incorrectly has posix_memalign's requirement
+   * that alignment is a multiple of sizeof(void*).  */
+  if (align < sizeof(void*))
+    align = sizeof(void*);
+# endif
   /* C11: the value of size shall be an integral multiple of alignment.  */
   if (std::size_t rem = sz & (align - 1))
 sz += align - rem;
diff --git a/libstdc++-v3/testsuite/18_support/new_aligned.cc b/libstdc++-v3/testsuite/18_support/new_aligned.cc
new file mode 100644
index 000..a9f539d36e8
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/new_aligned.cc
@@ -0,0 +1,119 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++17" }
+// { dg-do run { target c++17 } }
+
+#include <new>
+#include <memory>
+#include <testsuite_hooks.h>
+
+struct Test
+{
+  Test(std::size_t size, std::size_t a)
+  : size(size), alignment(std::align_val_t{a}),
+    p(::operator new(size, alignment))
+  { }
+
+  ~Test() { ::operator delete(p, size, alignment); }
+
+  std::size_t size;
+  std::align_val_t alignment;
+  void* p;
+
+  bool valid() const { return p != nullptr; }
+
+  bool aligned() const
+  {
+    auto ptr = p;
+    auto space = size;
+    return std::align((std::size_t)alignment, size, ptr, space) == p;
+  }
+};
+
+// operator new(size_t size, align_val_t alignment) has
+// undefined behaviour if the alignment argument is not
+// a valid alignment value (i.e. not a power of two).
+//
+// Unlike posix_memalign there is no requirement that
+// alignment >= sizeof(void*).
+//
+// Unlike aligned_alloc there is no requirement that
+// size is an integer multiple of alignment.
+
+void
+test01()
+{
+  // Test small values that would not be valid for
+  // posix_memalign or aligned_alloc.
+
+  Test t11{1, 1};
+  VERIFY( t11.valid() );
+  VERIFY( t11.aligned() );
+
+  Test t21{2, 1};
+  VERIFY( t21.valid() );
+  VERIFY( t21.aligned() );
+
+  Test t12{1, 2};
+  VERIFY( t12.valid() );
+  VERIFY( t12.aligned() );
+
+  Test t22{2, 2};
+  VERIFY( t22.valid() );
+  VERIFY( t22.aligned() );
+
+  Test t32{3, 2};
+  VERIFY( t32.valid() );
+  VERIFY( t32.aligned() );
+
+  Test t42{4, 2};
+  VERIFY( t42.valid() );
+  VERIFY( t42.aligned() );
+
+  Test t24{2, 4};
+  VERIFY( t24.valid() );
+  VERIFY( t24.aligned() );
+
+  Test t34{3, 4};
+  VERIFY( t34.valid() );
+  VERIFY( t34.aligned() );
+
+  Test t44{4, 4};
+  VERIFY( t44.valid() );
+  VERIFY( t44.aligned() );
+
+  // Test some larger values.
+
+  Test t128_16{128, 16};
+  VERIFY( t128_16.valid() );
+  VERIFY( t128_16.aligned() );
+
+  Test t128_64{128, 64};
+  VERIFY( t128_64.valid() );
+  VERIFY( t128_64.aligned() );
+
+  Test t64_128{64, 128};
+  VERIFY( t64_128.valid() );
+  VERIFY( t64_128.aligned() );
+}
+
+int
+main()
+{
+  test01();
+}
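
The size and alignment adjustments in the new_opa.cc hunk above can be sketched in plain C. This is only an illustration of the arithmetic, not the libstdc++ code: the helper names adjust_alignment and round_up_size are hypothetical, and the alignment is assumed to be a power of two, as the test comments require.

```c
#include <stddef.h>

/* Hypothetical helper: models the AIX workaround from the patch, which
   never passes an alignment smaller than sizeof(void*) to aligned_alloc.  */
static size_t
adjust_alignment (size_t align)
{
  if (align < sizeof (void *))
    align = sizeof (void *);
  return align;
}

/* Hypothetical helper: models the C11 requirement that the size passed to
   aligned_alloc is an integral multiple of the alignment.  Assumes align
   is a power of two, so (align - 1) is a mask of the low bits.  */
static size_t
round_up_size (size_t sz, size_t align)
{
  if (sz == 0)
    sz = 1;
  size_t rem = sz & (align - 1);
  if (rem != 0)
    sz += align - rem;
  return sz;
}
```

With these helpers a request such as size 3, alignment 2 is rounded up to 4 bytes, and size 64, alignment 128 is rounded up to 128 bytes, which is why the test exercises sizes both smaller and larger than the alignment.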


Re: [0/5] C-SKY port

2018-07-30 Thread Sandra Loosemore

On 07/26/2018 05:04 PM, Joseph Myers wrote:


Could you provide the proposed GCC website changes for the port
(backends.html, readings.html, news item for index.html)?  readings.html,
in particular, would link to the ABI and ISA documentation, while
backends.html gives summary information about the properties of both the
architecture and the GCC port.


The attached patch is a bit drafty, but I think it at least has
placeholders for everything.


-Sandra

? htdocs/backends.html.~1.78.~
? htdocs/index.html.~1.1090.~
? htdocs/readings.html.~1.296.~
Index: htdocs/backends.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/backends.html,v
retrieving revision 1.78
diff -u -r1.78 backends.html
--- htdocs/backends.html	28 Jul 2018 22:20:20 -	1.78
+++ htdocs/backends.html	30 Jul 2018 16:56:17 -
@@ -76,6 +76,7 @@
 c6x|   S CB  gi
 cr16   |L  F C   gs
 cris   |   F  B cgi   s
+csky   |  b   ia
 epiphany   | C   gi   s
 fr30   | ??FI B  pb mgs
 frv| ??   B   b   i   s
Index: htdocs/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.1090
diff -u -r1.1090 index.html
--- htdocs/index.html	30 Jul 2018 01:38:08 -	1.1090
+++ htdocs/index.html	30 Jul 2018 16:56:17 -
@@ -52,6 +52,11 @@
 News
 
 
+C-SKY support
+ [2018-xx-xx]
+ GCC support for C-SKY V2 processors has been added.  This back
+   end was contributed by C-SKY Microsystems and Mentor Graphics.
+
 https://gcc.gnu.org/wiki/cauldron2018;>GNU Tools Cauldron 2018
 [2018-07-29]
 Will be held in Manchester, September 7-9 2018
Index: htdocs/readings.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
retrieving revision 1.296
diff -u -r1.296 readings.html
--- htdocs/readings.html	28 Jul 2018 22:20:20 -	1.296
+++ htdocs/readings.html	30 Jul 2018 16:56:17 -
@@ -120,6 +120,11 @@
http://developer.axis.com/;>Site with CPU documentation
  
  
+ C-SKY
+   Manufacturer: C-SKY Microsystems
+   https://github.com/c-sky/csky-doc;>C-SKY Documentation
+ 
+ 
  Epiphany
   Manufacturer: Adapteva
   http://www.adapteva.com/;>Manufacturer's website with
Index: htdocs/gcc-9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.15
diff -u -r1.15 changes.html
--- htdocs/gcc-9/changes.html	25 Jul 2018 21:54:59 -	1.15
+++ htdocs/gcc-9/changes.html	30 Jul 2018 16:56:17 -
@@ -138,6 +138,13 @@
 
 
 
+C-SKY
+
+  
+A new back end targeting C-SKY V2 processors has been contributed to GCC.
+  
+
+
 
 
 


[5/5] C-SKY port v2: libgcc

2018-07-30 Thread Sandra Loosemore

2018-07-30  Jojo  
Huibin Wang  
Sandra Loosemore  
Chung-Lin Tang  

C-SKY port: libgcc

libgcc/
* config.host: Add C-SKY support.
* config/csky/*: New.
diff --git a/libgcc/config.host b/libgcc/config.host
index 18cabaf..bd4ef1e 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -108,6 +108,9 @@ cr16-*-*)
 crisv32-*-*)
 	cpu_type=cris
 	;;
+csky*-*-*)
+	cpu_type=csky
+	;;
 fido-*-*)
 	cpu_type=m68k
 	;;
@@ -507,6 +510,15 @@ cris-*-elf)
 cris-*-linux* | crisv32-*-linux*)
 	tmake_file="$tmake_file cris/t-cris t-softfp-sfdf t-softfp cris/t-linux"
 	;;
+csky-*-elf*)
+	tmake_file="csky/t-csky t-fdpbit"
+	extra_parts="$extra_parts crti.o crtn.o"
+	;;
+csky-*-linux*)
+	tmake_file="$tmake_file csky/t-csky t-slibgcc-libgcc t-fdpbit csky/t-linux-csky"
+	extra_parts="$extra_parts crti.o crtn.o"
+	md_unwind_header=csky/linux-unwind.h
+	;;
 epiphany-*-elf* | epiphany-*-rtems*)
 	tmake_file="$tmake_file epiphany/t-epiphany t-fdpbit epiphany/t-custom-eqsf"
 	extra_parts="$extra_parts crti.o crtint.o crtrunc.o crtm1reg-r43.o crtm1reg-r63.o crtn.o"
diff --git a/libgcc/config/csky/crti.S b/libgcc/config/csky/crti.S
new file mode 100644
index 000..3e4beb9
--- /dev/null
+++ b/libgcc/config/csky/crti.S
@@ -0,0 +1,140 @@
+# Define _init and _fini entry points for C-SKY.
+# Copyright (C) 2018 Free Software Foundation, Inc.
+# Contributed by C-SKY Microsystems and Mentor Graphics.
+#
+# This file is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by the
+# Free Software Foundation; either version 3, or (at your option) any
+# later version.
+#
+# This file is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# Under Section 7 of GPL version 3, you are granted additional
+# permissions described in the GCC Runtime Library Exception, version
+# 3.1, as published by the Free Software Foundation.
+#
+# You should have received a copy of the GNU General Public License and
+# a copy of the GCC Runtime Library Exception along with this program;
+# see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+# .
+
+
+# This file just makes a stack frame for the contents of the .fini and
+# .init sections.  Users may put any desired instructions in those
+# sections.
+
+	.file "crti.S"
+
+/* We use more complicated versions of this code with GLIBC.  */
+#if defined(__gnu_linux__)
+
+#ifndef PREINIT_FUNCTION
+# define PREINIT_FUNCTION __gmon_start__
+#endif
+
+#ifndef PREINIT_FUNCTION_WEAK
+# define PREINIT_FUNCTION_WEAK 1
+#endif
+
+#if PREINIT_FUNCTION_WEAK
+	.global PREINIT_FUNCTION
+	.weak PREINIT_FUNCTION
+	.align 4
+	.type call_weak_fn, %function
+call_weak_fn:
+	// push  lr
+	subi    sp, 4
+	stw lr, (sp)
+#ifdef  __PIC__
+	lrw a2, PREINIT_FUNCTION@GOT
+	addu    a2, gb
+	ldw a2, (a2)
+#else
+	lrw a2, PREINIT_FUNCTION
+#endif
+	cmpnei  a2, 0
+	bf  1f
+	jsr a2
+1:
+	// pop lr
+	ldw lr, (sp)
+	addi    sp, 4
+	rts
+
+	.align 4
+#else
+	.hidden PREINIT_FUNCTION
+#endif /* PREINIT_FUNCTION_WEAK */
+
+	.section .init,"ax",@progbits
+	.align 4
+	.globl _init
+	.type _init, @function
+_init:
+	subi    sp, 8
+	stw lr, (sp, 0)
+#ifdef __PIC__
+	//  stw gb, (sp, 4)
+	bsr .Lgetpc
+.Lgetpc:
+	lrw gb, .Lgetpc@GOTPC
+	add gb, lr
+#endif
+#if PREINIT_FUNCTION_WEAK
+#ifdef __PIC__
+	lrw a2, call_weak_fn@GOTOFF
+	add a2, gb
+	jsr a2
+#else
+	jsri    call_weak_fn
+#endif
+#else /* !PREINIT_FUNCTION_WEAK */
+#ifdef  __PIC__
+	lrw a2, PREINIT_FUNCTION@PLT
+	addu    a2, gb
+	ldw a2, (a2)
+	jsr a2
+#else
+	jsri    PREINIT_FUNCTION
+#endif
+#endif /* PREINIT_FUNCTION_WEAK */
+
+	br  2f
+	.literals
+	.align  4
+2:
+	.section .fini,"ax",@progbits
+	.align 4
+	.globl _fini
+	.type _fini, @function
+_fini:
+	subi    sp,8
+	stw lr, (sp, 0)
+	br  2f
+	.literals
+	.align  4
+2:
+
+/* These are the non-GLIBC versions.  */
+#else  /* !defined(__gnu_linux__) */
+	.section  ".init"
+	.global  _init
+	.type  _init,@function
+	.align  2
+_init:
+	subi  sp, 16
+	st.w  lr, (sp, 12)
+	mov r0, r0
+
+	.section  ".fini"
+	.global  _fini
+	.type  _fini,@function
+	.align  2
+_fini:
+	subi  sp, 16
+	st.w  lr, (sp, 12)
+	mov r0, r0
+#endif /* defined(__gnu_linux__) */
diff --git a/libgcc/config/csky/crtn.S b/libgcc/config/csky/crtn.S
new file mode 100644
index 000..8bef996
--- /dev/null
+++ b/libgcc/config/csky/crtn.S
@@ -0,0 +1,55 @@
+# Terminate C-SKY .init and .fini sections.
+# Copyright (C) 2018 Free Software Foundation, Inc.
+# Contributed by C-SKY Microsystems and Mentor Graphics.
+#
+# This file is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public 

[0/5] C-SKY port v2

2018-07-30 Thread Sandra Loosemore
This patch series is a revised version of the C-SKY port, taking into 
account review comments received so far on the initial patch set.  The 
changes made are:


- Removed excess whitespace in predicates.md and other places (part 2).
- Defined TARGET_CUSTOM_FUNCTION_DESCRIPTORS (part 2).
- Moved cse_cc pass later (part 2) and added a test case (part 4).
- Fixed libgcc alphabetization problem (part 5).

Parts 1 and 3 are unchanged (and already approved) but I've included
them for completeness.

For more background on this target, see the initial patch posting here.

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01289.html

I also have a patch with draft changes for the web site that I'll post
separately.

-Sandra


[4/5] C-SKY port v2: Testsuite

2018-07-30 Thread Sandra Loosemore

2018-07-30  Sandra Loosemore  
Chung-Lin Tang  
Xianmiao Qu  

C-SKY port: Testsuite

gcc/testsuite/
* g++.dg/Wno-frame-address.C: Adjust for C-SKY.
* g++.dg/torture/type-generic-1.C: Likewise.
* gcc.c-torture/compile/2804-1.c: Likewise.
* gcc.c-torture/execute/20101011-1.c: Likewise.
* gcc.c-torture/execute/ieee/mul-subnormal-single-1.x: Likewise.
* gcc.dg/20020312-2.c: Likewise.
* gcc.dg/Wno-frame-address.c: Likewise.
* gcc.dg/c11-true_min-1.c: Likewise.
* gcc.dg/sibcall-10.c: Likewise.
* gcc.dg/sibcall-9.c: Likewise.
* gcc.dg/stack-usage-1.c: Likewise.
* gcc.dg/torture/float32-tg-3.c: Likewise.
* gcc.dg/torture/float32x-tg-3.c: Likewise.
* gcc.dg/torture/float64-tg-3.c: Likewise.
* gcc.dg/torture/float64x-tg-3.c: Likewise.
* gcc.dg/torture/type-generic-1.c: Likewise.
* gcc.target/csky/*: New.
* lib/target-supports.exp (check_profiling_available): Add
csky-*-elf.
(check_effective_target_hard_float): Handle C-SKY targets with
single-precision hard float only.
(check_effective_target_logical_op_short_circuit): Handle C-SKY.
diff --git a/gcc/testsuite/g++.dg/Wno-frame-address.C b/gcc/testsuite/g++.dg/Wno-frame-address.C
index a2df034..54a02fe 100644
--- a/gcc/testsuite/g++.dg/Wno-frame-address.C
+++ b/gcc/testsuite/g++.dg/Wno-frame-address.C
@@ -1,5 +1,5 @@
 // { dg-do compile }
-// { dg-skip-if "Cannot access arbitrary stack frames." { arm*-*-* hppa*-*-* ia64-*-* } }
+// { dg-skip-if "Cannot access arbitrary stack frames." { arm*-*-* hppa*-*-* ia64-*-* csky*-*-* } }
 // { dg-options "-Werror" }
 // { dg-additional-options "-mbackchain" { target s390*-*-* } }
 
diff --git a/gcc/testsuite/g++.dg/torture/type-generic-1.C b/gcc/testsuite/g++.dg/torture/type-generic-1.C
index 4d82592..7708724 100644
--- a/gcc/testsuite/g++.dg/torture/type-generic-1.C
+++ b/gcc/testsuite/g++.dg/torture/type-generic-1.C
@@ -4,6 +4,7 @@
 /* { dg-do run } */
 /* { dg-add-options ieee } */
 /* { dg-skip-if "No Inf/NaN support" { spu-*-* } } */
+/* { dg-skip-if "No subnormal support" { csky-*-* } { "-mhard-float" } } */
 
 #include "../../gcc.dg/tg-tests.h"
 
diff --git a/gcc/testsuite/gcc.c-torture/compile/2804-1.c b/gcc/testsuite/gcc.c-torture/compile/2804-1.c
index 5c6b731..35464c2 100644
--- a/gcc/testsuite/gcc.c-torture/compile/2804-1.c
+++ b/gcc/testsuite/gcc.c-torture/compile/2804-1.c
@@ -4,6 +4,7 @@
 /* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { ia32 && { ! nonpic } } } } */
 /* { dg-skip-if "No 64-bit registers" { m32c-*-* } } */
 /* { dg-skip-if "Not enough 64-bit registers" { pdp11-*-* } { "-O0" } { "" } } */
+/* { dg-xfail-if "Inconsistent constraint on asm" { csky-*-* } { "-O0" } { "" } } */
 /* { dg-xfail-if "" { h8300-*-* } } */
 
 /* Copyright (C) 2000, 2003 Free Software Foundation */
diff --git a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
index dda49a5..f95d900 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
@@ -93,6 +93,10 @@ __aeabi_idiv0 (int return_value)
 #elif defined (__nvptx__)
 /* There isn't even a signal function.  */
 # define DO_TEST 0
+#elif defined (__csky__)
+  /* This presently doesn't raise SIGFPE even on csky-linux-gnu, much
+ less bare metal.  See the implementation of __divsi3 in libgcc.  */
+# define DO_TEST 0
 #else
 # define DO_TEST 1
 #endif
diff --git a/gcc/testsuite/gcc.c-torture/execute/ieee/mul-subnormal-single-1.x b/gcc/testsuite/gcc.c-torture/execute/ieee/mul-subnormal-single-1.x
index 16df951..ee40863 100644
--- a/gcc/testsuite/gcc.c-torture/execute/ieee/mul-subnormal-single-1.x
+++ b/gcc/testsuite/gcc.c-torture/execute/ieee/mul-subnormal-single-1.x
@@ -1,3 +1,8 @@
+if {[istarget "csky-*-*"] && [check_effective_target_hard_float]} {
+# The C-SKY hardware FPU only supports flush-to-zero mode.
+set torture_execute_xfail "csky-*-*"
+return 1
+}
 if [istarget "epiphany-*-*"] {
 # The Epiphany single-precision floating point format does not
 # support subnormals.
diff --git a/gcc/testsuite/gcc.dg/20020312-2.c b/gcc/testsuite/gcc.dg/20020312-2.c
index f5929e0..f8be3ce 100644
--- a/gcc/testsuite/gcc.dg/20020312-2.c
+++ b/gcc/testsuite/gcc.dg/20020312-2.c
@@ -111,6 +111,11 @@ extern void abort (void);
 /* No pic register.  */
 #elif defined (__nvptx__)
 /* No pic register.  */
+#elif defined (__csky__)
+/* Pic register is r28, but some cores only have r0-r15.  */
+# if defined (__CK807__) || defined (__CK810__)
+#   define PIC_REG  "r28"
+# endif
 #else
 # error "Modify the test for your target."
 #endif
diff --git a/gcc/testsuite/gcc.dg/Wno-frame-address.c b/gcc/testsuite/gcc.dg/Wno-frame-address.c
index e6dfe52..9fe4d07 100644
--- a/gcc/testsuite/gcc.dg/Wno-frame-address.c
+++ 

[3/5] C-SKY port v2: Documentation

2018-07-30 Thread Sandra Loosemore

2018-07-30  Sandra Loosemore  

C-SKY port: Documentation

gcc/
* doc/extend.texi (C-SKY Function Attributes): New section.
* doc/invoke.texi (Option Summary): Add C-SKY options.
(C-SKY Options): New section.
* doc/md.texi (Machine Constraints): Document C-SKY constraints.
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index c7745c4..71c1d01 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2324,6 +2324,7 @@ GCC plugins may provide their own attributes.
 * AVR Function Attributes::
 * Blackfin Function Attributes::
 * CR16 Function Attributes::
+* C-SKY Function Attributes::
 * Epiphany Function Attributes::
 * H8/300 Function Attributes::
 * IA-64 Function Attributes::
@@ -4145,6 +4146,38 @@ function entry and exit sequences suitable for use in an interrupt handler
 when this attribute is present.
 @end table
 
+@node C-SKY Function Attributes
+@subsection C-SKY Function Attributes
+
+These function attributes are supported by the C-SKY back end:
+
+@table @code
+@item interrupt
+@itemx isr
+@cindex @code{interrupt} function attribute, C-SKY
+@cindex @code{isr} function attribute, C-SKY
+Use these attributes to indicate that the specified function
+is an interrupt handler.
+The compiler generates function entry and exit sequences suitable for
+use in an interrupt handler when either of these attributes is present.
+
+Use of these attributes requires the @option{-mistack} command-line option
+to enable support for the necessary interrupt stack instructions.  They
+are ignored with a warning otherwise.  @xref{C-SKY Options}.
+
+@item naked
+@cindex @code{naked} function attribute, C-SKY
+This attribute allows the compiler to construct the
+requisite function declaration, while allowing the body of the
+function to be assembly code. The specified function will not have
+prologue/epilogue sequences generated by the compiler. Only basic
+@code{asm} statements can safely be included in naked functions
+(@pxref{Basic Asm}). While using extended @code{asm} or a mixture of
+basic @code{asm} and C code may appear to work, they cannot be
+depended upon to work reliably and are not supported.
+@end table
+
+
 @node Epiphany Function Attributes
 @subsection Epiphany Function Attributes
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e0e59f6..75e147e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -703,6 +703,16 @@ Objective-C and Objective-C++ Dialects}.
 -msim  -mint32  -mbit-ops
 -mdata-model=@var{model}}
 
+@emph{C-SKY Options}
+@gccoptlist{-march=@var{arch}  -mcpu=@var{cpu} @gol
+-mbig-endian  -EB  -mlittle-endian  -EL @gol
+-mhard-float  -msoft-float  -mfpu=@var{fpu}  -mdouble-float  -mfdivdu @gol
+-melrw  -mistack  -mmp  -mcp  -mcache  -msecurity  -mtrust @gol
+-mdsp  -medsp  -mvdsp @gol
+-mdiv  -msmart  -mhigh-registers  -manchor @gol
+-mpushpop  -mmultiple-stld  -mconstpool  -mstack-size  -mccrt @gol
+-mbranch-cost=@var{n}  -mcse-cc  -msched-prolog}
+
 @emph{Darwin Options}
 @gccoptlist{-all_load  -allowable_client  -arch  -arch_errors_fatal @gol
 -arch_only  -bind_at_load  -bundle  -bundle_loader @gol
@@ -14468,6 +14478,7 @@ platform.
 * C6X Options::
 * CRIS Options::
 * CR16 Options::
+* C-SKY Options::
 * Darwin Options::
 * DEC Alpha Options::
 * FR30 Options::
@@ -17581,6 +17592,205 @@ However, @samp{far} is not valid with @option{-mcr16c}, as the
 CR16C architecture does not support the far data model.
 @end table
 
+@node C-SKY Options
+@subsection C-SKY Options
+@cindex C-SKY Options
+
+GCC supports these options when compiling for C-SKY V2 processors.
+
+@table @gcctabopt
+
+@item -march=@var{arch}
+@opindex march=
+Specify the C-SKY target architecture.  Valid values for @var{arch} are:
+@samp{ck801}, @samp{ck802}, @samp{ck803}, @samp{ck807}, and @samp{ck810}.
+The default is @samp{ck810}.
+
+@item -mcpu=@var{cpu}
+@opindex mcpu=
+Specify the C-SKY target processor.  Valid values for @var{cpu} are:
+@samp{ck801}, @samp{ck801t},
+@samp{ck802}, @samp{ck802t}, @samp{ck802j},
+@samp{ck803}, @samp{ck803h}, @samp{ck803t}, @samp{ck803ht},
+@samp{ck803f}, @samp{ck803fh}, @samp{ck803e}, @samp{ck803eh},
+@samp{ck803et}, @samp{ck803eht}, @samp{ck803ef}, @samp{ck803efh},
+@samp{ck803ft}, @samp{ck803eft}, @samp{ck803efht}, @samp{ck803r1},
+@samp{ck803hr1}, @samp{ck803tr1}, @samp{ck803htr1}, @samp{ck803fr1},
+@samp{ck803fhr1}, @samp{ck803er1}, @samp{ck803ehr1}, @samp{ck803etr1},
+@samp{ck803ehtr1}, @samp{ck803efr1}, @samp{ck803efhr1}, @samp{ck803ftr1},
+@samp{ck803eftr1}, @samp{ck803efhtr1},
+@samp{ck803s}, @samp{ck803st}, @samp{ck803se}, @samp{ck803sf},
+@samp{ck803sef}, @samp{ck803seft},
+@samp{ck807e}, @samp{ck807ef}, @samp{ck807}, @samp{ck807f},
+@samp{ck810e}, @samp{ck810et}, @samp{ck810ef}, @samp{ck810eft},
+@samp{ck810}, @samp{ck810v}, @samp{ck810f}, @samp{ck810t}, @samp{ck810fv},
+@samp{ck810tv}, @samp{ck810ft}, and @samp{ck810ftv}.
+
+@item -mbig-endian
+@opindex mbig-endian
+@itemx -EB
+@opindex -EB
+@itemx 

[2/5] C-SKY port v2: Backend implementation

2018-07-30 Thread Sandra Loosemore

2018-07-30  Jojo  
Huibin Wang  
Sandra Loosemore  
Chung-Lin Tang  

C-SKY port: Backend implementation

gcc/
* config/csky/*: New.
* common/config/csky/*: New.


csky-gcc-2.patch.gz
Description: application/gzip


[1/5] C-SKY port v2: Configury

2018-07-30 Thread Sandra Loosemore

2018-07-30  Jojo  
Huibin Wang  
Sandra Loosemore  
Chung-Lin Tang  
Andrew Jenner  

C-SKY port: Configury

gcc/
* config.gcc (csky-*-*): New.
* configure.ac: Add csky to targets for dwarf2 debug_line support.
* configure: Regenerated.

contrib/
* config-list.mk (LIST): Add csky-elf and csky-linux-gnu.

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index c3537d2..d9e48a9 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -40,6 +40,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   arm-symbianelf avr-elf \
   bfin-elf bfin-uclinux bfin-linux-uclibc bfin-rtems bfin-openbsd \
   c6x-elf c6x-uclinux cr16-elf cris-elf cris-linux crisv32-elf crisv32-linux \
+  csky-elf csky-linux-gnu \
   epiphany-elf epiphany-elfOPT-with-stack-offset=16 fido-elf \
   fr30-elf frv-elf frv-linux ft32-elf h8300-elf hppa-linux-gnu \
   hppa-linux-gnuOPT-enable-sjlj-exceptions=yes hppa64-linux-gnu \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 78e84c2..cd98836 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1278,6 +1278,70 @@ crisv32-*-linux* | cris-*-linux*)
 		;;
 	esac
 	;;
+csky-*-*)
+	if test x${with_endian} != x; then
+	case ${with_endian} in
+		big|little)		;;
+		*)
+		echo "with_endian=${with_endian} not supported."
+		exit 1
+		;;
+	esac
+	fi
+	if test x${with_float} != x; then
+	case ${with_float} in
+		soft | hard) ;;
+		*) echo \
+		"Unknown floating point type used in --with-float=$with_float"
+		exit 1
+		;;
+	esac
+	fi
+	tm_file="csky/csky.h"
+	md_file="csky/csky.md"
+	out_file="csky/csky.c"
+	tm_p_file="${tm_p_file} csky/csky-protos.h"
+	extra_options="${extra_options} csky/csky_tables.opt"
+
+	if test x${enable_tpf_debug} = xyes; then
+	tm_defines="${tm_defines} ENABLE_TPF_DEBUG"
+	fi
+
+	case ${target} in
+	csky-*-elf*)
+		tm_file="dbxelf.h elfos.h newlib-stdint.h ${tm_file} csky/csky-elf.h"
+		tmake_file="csky/t-csky csky/t-csky-elf"
+		default_use_cxa_atexit=no
+		;;
+	csky-*-linux*)
+		tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} csky/csky-linux-elf.h"
+		tmake_file="${tmake_file} csky/t-csky csky/t-csky-linux"
+
+		if test "x${enable_multilib}" = xyes ; then
+		tm_file="$tm_file ./sysroot-suffix.h"
+		tmake_file="${tmake_file} csky/t-sysroot-suffix"
+		fi
+
+		case ${target} in
+		csky-*-linux-gnu*)
+			tm_defines="$tm_defines DEFAULT_LIBC=LIBC_GLIBC"
+			;;
+		csky-*-linux-uclibc*)
+			tm_defines="$tm_defines DEFAULT_LIBC=LIBC_UCLIBC"
+			default_use_cxa_atexit=no
+			;;
+		*)
+			echo "Unknown target $target"
+			exit 1
+			;;
+		esac
+		;;
+	*)
+		echo "Unknown target $target"
+		exit 1
+		;;
+	esac
+	;;
 epiphany-*-elf | epiphany-*-rtems*)
 	tm_file="${tm_file} dbxelf.h elfos.h"
 	tmake_file="${tmake_file} epiphany/t-epiphany"
@@ -3831,6 +3895,10 @@ case "${target}" in
 		fi
 		;;
 
+csky-*-*)
+	supported_defaults="cpu endian float"
+	;;
+
 	arm*-*-*)
 		supported_defaults="arch cpu float tune fpu abi mode tls"
 		for which in cpu tune arch; do
diff --git a/gcc/configure b/gcc/configure
index 80ac4a3..b7a8e36 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -27838,7 +27838,7 @@ esac
 # ??? Once 2.11 is released, probably need to add first known working
 # version to the per-target configury.
 case "$cpu_type" in
-  aarch64 | alpha | arc | arm | avr | bfin | cris | i386 | m32c | m68k \
+  aarch64 | alpha | arc | arm | avr | bfin | cris | csky | i386 | m32c | m68k \
   | microblaze | mips | nios2 | pa | riscv | rs6000 | score | sparc | spu \
   | tilegx | tilepro | visium | xstormy16 | xtensa)
 insn="nop"
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 4fc851c..65f9c92 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4932,7 +4932,7 @@ esac
 # ??? Once 2.11 is released, probably need to add first known working
 # version to the per-target configury.
 case "$cpu_type" in
-  aarch64 | alpha | arc | arm | avr | bfin | cris | i386 | m32c | m68k \
+  aarch64 | alpha | arc | arm | avr | bfin | cris | csky | i386 | m32c | m68k \
   | microblaze | mips | nios2 | pa | riscv | rs6000 | score | sparc | spu \
   | tilegx | tilepro | visium | xstormy16 | xtensa)
 insn="nop"


Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Jakub Jelinek
On Mon, Jul 30, 2018 at 04:28:50PM +, Bernd Edlinger wrote:
> >>> generic.texi says they need not be.  Making the STRING_CST contain only
> >>> the bytes of the initializer and not the trailing NUL in the C case where
> >>> the trailing NUL does not fit in the object initialized would of course
> >>> mean you get non-NUL-terminated STRING_CSTs for valid C code as well.
> >>
> >> One thing is whether TREE_STRING_LENGTH includes the trailing NUL byte,
> >> that doesn't need to be the case e.g. for the shortened initializers.
> >> The other thing is whether we as a convenience for the compiler's internals
> >> want to waste some memory for the NUL termination; I think we could avoid
> >> some bugs that way.
> > 
> > TREE_STRING_LENGTH includes the NUL if it is logically part of the object,
> > but should not if the NUL is purely an implementation convenience.
> > 
> 
> To complicate things a bit more STRING_CST that are created by the Ada FE
> for the purpose of ASM constraints, do not contain a NUL character.
> That is in TREE_STRING_LENGTH, there is of course always another NUL char
> after the payload.  Likewise for C __attribute__ values.

If there is a NUL at TREE_STRING_POINTER (x) + TREE_STRING_LENGTH (x), that
is ok.

Jakub


Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Bernd Edlinger
On 07/30/18 18:01, Joseph Myers wrote:
> On Mon, 30 Jul 2018, Jakub Jelinek wrote:
> 
>> On Mon, Jul 30, 2018 at 03:52:39PM +, Joseph Myers wrote:
>>> On Mon, 30 Jul 2018, Bernd Edlinger wrote:
>>>
>>>> In the moment I would already be happy if all STRING_CSTs would
>>>> be zero terminated.
>>>
>>> generic.texi says they need not be.  Making the STRING_CST contain only
>>> the bytes of the initializer and not the trailing NUL in the C case where
>>> the trailing NUL does not fit in the object initialized would of course
>>> mean you get non-NUL-terminated STRING_CSTs for valid C code as well.
>>
>> One thing is whether TREE_STRING_LENGTH includes the trailing NUL byte,
>> that doesn't need to be the case e.g. for the shortened initializers.
>> The other thing is whether we as a convenience for the compiler's internals
>> want to waste some memory for the NUL termination; I think we could avoid
>> some bugs that way.
> 
> TREE_STRING_LENGTH includes the NUL if it is logically part of the object,
> but should not if the NUL is purely an implementation convenience.
> 

To complicate things a bit more, STRING_CSTs that are created by the Ada FE
for the purpose of ASM constraints do not contain a NUL character.
That is, not within TREE_STRING_LENGTH; there is of course always another NUL
char after the payload.  Likewise for C __attribute__ values.


Thanks
Bernd.


Re: [PATCH] combine: Allow combining two insns to two insns

2018-07-30 Thread Segher Boessenkool
On Tue, Jul 24, 2018 at 05:18:41PM +, Segher Boessenkool wrote:
> This patch allows combine to combine two insns into two.  This helps
> in many cases, by reducing instruction path length, and also allowing
> further combinations to happen.  PR85160 is a typical example of code
> that it can improve.
> 
> This patch does not allow such combinations if either of the original
> instructions was a simple move instruction.  In those cases combining
> the two instructions increases register pressure without improving the
> code.  With this move test register pressure no longer increases
> noticeably as far as I can tell.
> 
> (At first I also didn't allow either of the resulting insns to be a
> move instruction.  But that is actually a very good thing to have, as
> should have been obvious).
> 
> Tested for many months; tested on about 30 targets.
> 
> I'll commit this later this week if there are no objections.

Done now, with the testcase at 
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01856.html .


Segher


Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Bernd Edlinger
On 07/30/18 17:57, Jakub Jelinek wrote:
> On Mon, Jul 30, 2018 at 03:52:39PM +, Joseph Myers wrote:
>> On Mon, 30 Jul 2018, Bernd Edlinger wrote:
>>
>>> In the moment I would already be happy if all STRING_CSTs would
>>> be zero terminated.
>>
>> generic.texi says they need not be.  Making the STRING_CST contain only
>> the bytes of the initializer and not the trailing NUL in the C case where
>> the trailing NUL does not fit in the object initialized would of course
>> mean you get non-NUL-terminated STRING_CSTs for valid C code as well.
> 
> One thing is whether TREE_STRING_LENGTH includes the trailing NUL byte,
> that doesn't need to be the case e.g. for the shortened initializers.
> The other thing is whether we as a convenience for the compiler's internals
> want to waste some memory for the NUL termination; I think we could avoid
> some bugs that way.
> 

Yes, exactly.  Currently the middle-end tries to determine whether a STRING_CST
is NUL-terminated, but that is broken for wide characters, for instance
in c_getstr:

   else if (string[string_length - 1] != '\0')
 {
   /* Support only properly NUL-terminated strings but handle
  consecutive strings within the same array, such as the six
  substrings in "1\0002\0003".  */
   return NULL;
 }
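
One way a byte-wise check like the one quoted above can misjudge a wide string is sketched below.  This is a stand-alone illustration, not GCC code: the helper names and the sample array are hypothetical, and a little-endian layout with 4-byte characters is assumed.

```c
#include <stddef.h>

/* Byte-wise test in the style of the quoted check: it looks only at the
   very last byte of the representation.  */
static int
last_byte_is_zero (const unsigned char *bytes, size_t nbytes)
{
  return nbytes > 0 && bytes[nbytes - 1] == 0;
}

/* Element-wise test: every byte of the last element, of width elt_size,
   must be zero for the string to count as terminated.  */
static int
last_element_is_zero (const unsigned char *bytes, size_t nbytes,
                      size_t elt_size)
{
  if (elt_size == 0 || nbytes < elt_size || nbytes % elt_size != 0)
    return 0;
  for (size_t i = nbytes - elt_size; i < nbytes; i++)
    if (bytes[i] != 0)
      return 0;
  return 1;
}

/* Assumed little-endian image of a one-element array of 4-byte characters
   holding 'A' (0x41) and no terminator.  */
static const unsigned char wide_a_le[4] = { 0x41, 0x00, 0x00, 0x00 };
```

For wide_a_le the byte-wise test reports a terminator because the high-order byte of 'A' is zero, while the element-wise test correctly reports that the only element is non-zero.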

It would be much better if any string constant could be zero terminated.

If STRING_CSTs were always zero-terminated, the check for a non-zero-terminated
value would be much easier:

compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)), TREE_STRING_LENGTH (init))
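
The idea behind that comparison can be modelled outside the compiler.  The sketch below is a toy, not GCC's tree API: struct toy_string_cst and both functions are hypothetical, and it assumes the stored length counts the terminating NUL.

```c
#include <stddef.h>

/* Toy stand-in for a string constant; not GCC's tree representation.  */
struct toy_string_cst
{
  const char *bytes;  /* payload, always followed by a NUL in storage */
  size_t length;      /* models TREE_STRING_LENGTH, counting the NUL */
  size_t type_size;   /* models TYPE_SIZE_UNIT of the initialized array */
};

/* Three-way comparison returning -1, 0 or 1, in the spirit of
   compare_tree_int.  */
static int
toy_compare (size_t a, size_t b)
{
  return (a > b) - (a < b);
}

/* The initialized object contains the terminating NUL only if the array
   type is at least as large as the constant.  */
static int
toy_object_is_terminated (const struct toy_string_cst *s)
{
  return toy_compare (s->type_size, s->length) >= 0;
}
```

Under these assumptions a 4-byte constant "abc\0" initializing a char[4] is terminated, while the same constant initializing a char[3] is not, and no inspection of the payload bytes is needed.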



Bernd.


[PATCH] testcase for 2-2 combine

2018-07-30 Thread Segher Boessenkool
Committing.


Segher


2018-07-30  Segher Boessenkool  

gcc/testsuite/
PR rtl-optimization/85160
* gcc.target/powerpc/combine-2-2.c: New testcase.

---
 gcc/testsuite/gcc.target/powerpc/combine-2-2.c | 17 +
 1 file changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/combine-2-2.c

diff --git a/gcc/testsuite/gcc.target/powerpc/combine-2-2.c 
b/gcc/testsuite/gcc.target/powerpc/combine-2-2.c
new file mode 100644
index 000..234476d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/combine-2-2.c
@@ -0,0 +1,17 @@
+/* { dg-options "-O2" } */
+
+/* PR85160 */
+
+/* Originally, the "x >> 14" are CSEd away (eventually becoming a srawi
+   instruction), and the two ANDs remain separate instructions because
+   combine cannot deal with this.
+
+   Now that combine knows how to combine two RTL insns into two, it manages
+   to make this just the sum of two rlwinm instructions.  */
+
+int f(int x)
+{
+  return ((x >> 14) & 6) + ((x >> 14) & 4);
+}
+
+/* { dg-final { scan-assembler-not {\msrawi\M} } } */
-- 
1.8.3.1



Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Joseph Myers
On Mon, 30 Jul 2018, Jakub Jelinek wrote:

> On Mon, Jul 30, 2018 at 03:52:39PM +, Joseph Myers wrote:
> > On Mon, 30 Jul 2018, Bernd Edlinger wrote:
> > 
> > > In the moment I would already be happy if all STRING_CSTs would
> > > be zero terminated.
> > 
> > generic.texi says they need not be.  Making the STRING_CST contain only 
> > the bytes of the initializer and not the trailing NUL in the C case where 
> > the trailing NUL does not fit in the object initialized would of course 
> > mean you get non-NUL-terminated STRING_CSTs for valid C code as well.
> 
> One thing is whether TREE_STRING_LENGTH includes the trailing NUL byte,
> that doesn't need to be the case e.g. for the shortened initializers.
> The other thing is whether we as a convenience for the compiler's internals
> want to waste some memory for the NUL termination; I think we could avoid
> some bugs that way.

TREE_STRING_LENGTH includes the NUL if it is logically part of the object, 
but should not if the NUL is purely an implementation convenience.
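
The two cases can be seen at the C source level.  The sketch below is only an illustration; the array names and the bounded_length helper are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* The NUL fits in the array, so it is logically part of the object.  */
static const char with_nul[4] = "abc";

/* Valid C: the array holds exactly 'a', 'b', 'c' and no terminator.  */
static const char without_nul[3] = "abc";

/* Length of the string stored in an object of SIZE bytes, never reading
   past the end of the object.  Returns SIZE if no NUL is found.  */
static size_t
bounded_length (const char *p, size_t size)
{
  const char *end = memchr (p, '\0', size);
  return end ? (size_t) (end - p) : size;
}
```

Calling strlen on without_nul would read past the object, which is why code consuming such constants has to know whether the terminator belongs to the object or not.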

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Jakub Jelinek
On Mon, Jul 30, 2018 at 03:52:39PM +, Joseph Myers wrote:
> On Mon, 30 Jul 2018, Bernd Edlinger wrote:
> 
> > At the moment I would already be happy if all STRING_CSTs were
> > zero-terminated.
> 
> generic.texi says they need not be.  Making the STRING_CST contain only 
> the bytes of the initializer and not the trailing NUL in the C case where 
> the trailing NUL does not fit in the object initialized would of course 
> mean you get non-NUL-terminated STRING_CSTs for valid C code as well.

One thing is whether TREE_STRING_LENGTH includes the trailing NUL byte,
that doesn't need to be the case e.g. for the shortened initializers.
The other thing is whether we as a convenience for the compiler's internals
want to waste some memory for the NUL termination; I think we could avoid
some bugs that way.

Jakub


Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Joseph Myers
On Mon, 30 Jul 2018, Bernd Edlinger wrote:

> At the moment I would already be happy if all STRING_CSTs were
> zero-terminated.

generic.texi says they need not be.  Making the STRING_CST contain only 
the bytes of the initializer and not the trailing NUL in the C case where 
the trailing NUL does not fit in the object initialized would of course 
mean you get non-NUL-terminated STRING_CSTs for valid C code as well.
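
As a minimal illustration of the valid C case referred to above (a sketch
only, not GCC code; the arrays `exact` and `roomy` and the helper
`has_nul_within` are names made up for this example):

```c
#include <stddef.h>
#include <string.h>

/* Valid C (though ill-formed in C++): the array has room for 'a', 'b'
   and 'c' only, so the literal's trailing NUL is not stored.  */
static const char exact[3] = "abc";

/* Here the trailing NUL fits and is part of the object.  */
static const char roomy[4] = "abc";

/* Hypothetical helper: return 1 if a NUL byte occurs within the first
   SIZE bytes of BUF, 0 otherwise.  It never reads past SIZE bytes.  */
static int
has_nul_within (const char *buf, size_t size)
{
  return memchr (buf, '\0', size) != NULL;
}
```

With these definitions, has_nul_within (exact, sizeof exact) is 0 and
has_nul_within (roomy, sizeof roomy) is 1, which is the distinction a
consumer of such constants would need to respect.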

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Joseph Myers
On Mon, 30 Jul 2018, Bernd Edlinger wrote:

> Hi,
> 
> this is how I would like to handle the over length strings issue in the C FE.
> If the string constant is exactly the right length and ends in one explicit
> NUL character, shorten it by one character.

I don't think shortening should be limited to that case.  I think the case 
where the constant is longer than that (and so gets an unconditional 
pedwarn) should also have it shortened - any constant that doesn't fit in 
the object being initialized should be shortened to fit, whether diagnosed 
or not, we should define GENERIC / GIMPLE to disallow too-large string 
constants in initializers, and should add an assertion somewhere in the 
middle-end that no too-large string constants reach it.
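
The shortening rule suggested here can be sketched in plain C as follows.
This is only an illustration under assumptions: `clamp_string_length` is
a hypothetical name, not a function in the C front end, and a real change
would operate on STRING_CST trees rather than on bare sizes.

```c
#include <stddef.h>

/* Hypothetical helper: given the number of bytes in a string constant
   (including its trailing NUL) and the size in bytes of the array
   being initialized, return how many bytes should be kept so that the
   constant never exceeds the object.  */
static size_t
clamp_string_length (size_t cst_bytes, size_t object_bytes)
{
  return cst_bytes > object_bytes ? object_bytes : cst_bytes;
}
```

For char a[3] = "abc" the 4 bytes of the literal would be clamped to 3;
for the diagnosed char a[3] = "1234" the 5 bytes would also be clamped
to 3; for char a[8] = "abc" the 4 bytes are kept unchanged.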

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Move -Walloca and related warnings from c.opt to common.opt

2018-07-30 Thread Martin Sebor

On 07/30/2018 09:28 AM, Jakub Jelinek wrote:

On Sun, Jul 29, 2018 at 08:35:39PM +0200, Iain Buclaw wrote:

Since r262910, it was noticed that new -Walloca-larger-than= warnings
started appearing when building the D frontend's standard library.
These have been checked and verified as valid, and appropriate fixes
will be sent on upstream.

As for the warning itself, as it is now default on, it would be
preferable to be able to control it.  Given the choice between adding
these options to the D frontend or moving them to common, I'd rather
move them to common.


It is strange to add a warning option for languages that don't even have
alloca.  Instead, the warning shouldn't be enabled by default (but Martin
doesn't want to listen to that), or at least should be enabled only for the
languages where it makes sense.


I agree that the warning shouldn't be enabled for languages that
don't have support for alloca.

Martin


[PATCH][c++] Fix DECL_BY_REFERENCE of clone parms

2018-07-30 Thread Tom de Vries
Hi,

Consider test.C compiled at -O0 -g:
...
class string {
public:
  string (const char *p) { this->p = p ; }
  string (const string &s) { this->p = s.p; }

private:
  const char *p;
};

class foo {
public:
  foo (string dir_hint) {}
};

int
main (void)
{
  string s = "This is just a string";
  foo bar(s);
  return 0;
}
...

When parsing foo::foo, the dir_hint parameter gets a DECL_ARG_TYPE of
'struct string & restrict'.  Then during finish_struct, we call
clone_constructors_and_destructors and create clones for foo::foo, and
set the DECL_ARG_TYPE in the same way.

Later on, during finish_function, cp_genericize is called for the original
foo::foo, which sets the type of parm dir_hint to DECL_ARG_TYPE, and sets
DECL_BY_REFERENCE of dir_hint to 1.

After that, during maybe_clone_body update_cloned_parm is called with:
...
(gdb) call debug_generic_expr (parm.typed.type)
struct string & restrict
(gdb) call debug_generic_expr (cloned_parm.typed.type)
struct string
...
The type of the cloned_parm is then set to the type of parm, but
DECL_BY_REFERENCE is not set.

When doing cp_genericize for the clone later on,
TREE_ADDRESSABLE (TREE_TYPE ()) is no longer true for the updated type of
the parm, so DECL_BY_REFERENCE is not set there either.

This patch fixes the problem by copying DECL_BY_REFERENCE in update_cloned_parm.

Build and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

[c++] Fix DECL_BY_REFERENCE of clone parms

2018-07-30  Tom de Vries  

PR debug/86687
* optimize.c (update_cloned_parm): Copy DECL_BY_REFERENCE.

* g++.dg/guality/pr86687.C: New test.

---
 gcc/cp/optimize.c  |  2 ++
 gcc/testsuite/g++.dg/guality/pr86687.C | 28 
 2 files changed, 30 insertions(+)

diff --git a/gcc/cp/optimize.c b/gcc/cp/optimize.c
index 0e9b84ed8a4..3923a5fc6c4 100644
--- a/gcc/cp/optimize.c
+++ b/gcc/cp/optimize.c
@@ -46,6 +46,8 @@ update_cloned_parm (tree parm, tree cloned_parm, bool first)
   /* We may have taken its address.  */
   TREE_ADDRESSABLE (cloned_parm) = TREE_ADDRESSABLE (parm);
 
+  DECL_BY_REFERENCE (cloned_parm) = DECL_BY_REFERENCE (parm);
+
   /* The definition might have different constness.  */
   TREE_READONLY (cloned_parm) = TREE_READONLY (parm);
 
diff --git a/gcc/testsuite/g++.dg/guality/pr86687.C b/gcc/testsuite/g++.dg/guality/pr86687.C
new file mode 100644
index 000..140a6fce596
--- /dev/null
+++ b/gcc/testsuite/g++.dg/guality/pr86687.C
@@ -0,0 +1,28 @@
+// PR debug/86687
+// { dg-do run }
+// { dg-options "-g" }
+
+class string {
+public:
+  string (int p) { this->p = p ; }
+  string (const string &s) { this->p = s.p; }
+
+  int p;
+};
+
+class foo {
+public:
+  foo (string dir_hint) {
+p = dir_hint.p; // { dg-final { gdb-test . "dir_hint.p" 3 } }
+  }
+
+  int p;
+};
+
+int
+main (void)
+{
+  string s = 3;
+  foo bar(s);
+  return !(bar.p == 3);
+}


Re: [PATCH] Move -Walloca and related warnings from c.opt to common.opt

2018-07-30 Thread Jakub Jelinek
On Sun, Jul 29, 2018 at 08:35:39PM +0200, Iain Buclaw wrote:
> Since r262910, it was noticed that new -Walloca-larger-than= warnings
> started appearing when building the D frontend's standard library.
> These have been checked and verified as valid, and appropriate fixes
> will be sent on upstream.
> 
> As for the warning itself, as it is now default on, it would be
> preferable to be able to control it.  Given the choice between adding
> these options to the D frontend or moving them to common, I'd rather
> move them to common.

It is strange to add a warning option for languages that don't even have
alloca.  Instead, the warning shouldn't be enabled by default (but Martin
doesn't want to listen to that), or at least should be enabled only for the
languages where it makes sense.

Jakub


Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)

2018-07-30 Thread Bernd Edlinger
On 07/30/18 01:05, Martin Sebor wrote:
> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>> Hi!
>>
>> This fixes two wrong code bugs where string_constant
>> returns over length string constants.  Initializers
>> like that are rejected in C++, but valid in C.
> 
> If by valid you are referring to declarations like the one in
> the added test:
> 
>      const char a[2][3] = { "1234", "xyz" };
> 
> then (as I explained), the excess elements in "1234" make
> the char[3] initialization and thus the test case undefined.
> I have resolved bug 86711 as invalid on those grounds.
> 
> Bug 86711 has a valid test case that needs to be fixed, along
> with bug 86688 that I raised for the same underlying problem:
> considering the excess nul as part of the string.  As has been
> discussed in a separate bug, rather than working around
> the excessively long strings in the middle-end, it would be
> preferable to avoid creating them to begin with.
> 
> I'm already working on a fix for bug 86688, in part because
> I introduced the code change and also because I'm making other
> changes in this area -- bug 86552.  Both of these in response
> to your comments.
> 

Sorry, I must admit, I have completely lost track of how many things
you are trying to work on in parallel.

Nevertheless I started to review your pr86552 patch here:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html

But so far you have not responded to me.

Well, actually I doubt your patch applies to trunk;
maybe you could re-base that one and post it again
instead?


Thanks
Bernd.


Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Martin Sebor

On 07/30/2018 05:51 AM, Bernd Edlinger wrote:

Hi,

this is how I would like to handle the over length strings issue in the C FE.
If the string constant is exactly the right length and ends in one explicit
NUL character, shorten it by one character.

I thought Martin would be working on it,  but as this is a really simple fix,
I would dare to send it to gcc-patches anyway, hope you don't mind...


And as I said repeatedly, I am working on it along with a number
of other things in this very area.  There are a number of issues
to solve:

1) issue a warning for the non-nul terminated strings (bug 86552:
   raised in response to your comments, initial patch posted,
   more work in progress)
2) avoid creating overlong string literals (bug 86688: raised
   partly also in response to your earlier comments)
3) handle braced-initializers (as in char a[] = { 1, 2, 0 }; )
   I'm testing a patch for that.

I am actively working on all three of these so please, for the fourth
time, let me finish my work.  Submitting patches that try to handle
any of these issues in a slightly or substantially different way at
the same time isn't helpful.  On the contrary, it's very disruptive.
This has nothing to do with ownership and everything with
coordination and an apparent divergence of opinions.

There are over 10,000 open bugs in Bugzilla.  Working on any of
them that's not assigned to anyone would be helpful.  This is
not.

Martin


Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)

2018-07-30 Thread Martin Sebor

On 07/30/2018 12:57 AM, Richard Biener wrote:

On Sun, 29 Jul 2018, Martin Sebor wrote:


On 07/29/2018 04:56 AM, Bernd Edlinger wrote:

Hi!

This fixes two wrong code bugs where string_constant
returns over length string constants.  Initializers
like that are rejected in C++, but valid in C.


If by valid you are referring to declarations like the one in
the added test:

const char a[2][3] = { "1234", "xyz" };

then (as I explained), the excess elements in "1234" make
the char[3] initialization and thus the test case undefined.
I have resolved bug 86711 as invalid on those grounds.

Bug 86711 has a valid test case that needs to be fixed, along
with bug 86688 that I raised for the same underlying problem:
considering the excess nul as part of the string.  As has been
discussed in a separate bug, rather than working around
the excessively long strings in the middle-end, it would be
preferable to avoid creating them to begin with.

I'm already working on a fix for bug 86688, in part because
I introduced the code change and also because I'm making other
changes in this area -- bug 86552.  Both of these in response
to your comments.

I would normally welcome someone else helping with my work
but (as I already made clear last week) it's counterproductive
to have two people working in the very same area at the same
time, especially when they are working at cross purposes as
you seem to be hell-bent on doing.


I have xfailed strlenopt-49.c, which tests this feature.


That's not appropriate.  The purpose of the test is to verify
the fix for bug 86428: namely, that a call to strlen() on
an array initialized with a string of the same length is
folded, such as in:

const char b[4] = "123\0";

That's a valid initializer that can and should continue to be
folded.  The test needs to continue to exercise that feature.
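
For reference, a small C sketch of the valid case described above (the
function name `length_of_b` is invented for this illustration, and
whether the call is actually folded depends on the compiler and the
options used):

```c
#include <stddef.h>
#include <string.h>

/* The literal "123\0" has type char[5] (the explicit NUL plus the
   implicit one).  Initializing a char[4] with it stores '1', '2', '3'
   and '\0', so the array is a properly terminated string of length 3.  */
static const char b[4] = "123\0";

size_t
length_of_b (void)
{
  /* A candidate for being folded to the constant 3 at compile time.  */
  return strlen (b);
}
```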

The test also happens to exercise invalid/overlong initializers.
This is because, in my view, it's safer to fold the result
of such calls to a constant than to call the library
function and have it either return an unpredictable value or
perhaps even crash.


Personally I don't think that it is worth the effort to
optimize something that is per se invalid in C++.


This is a C test, not C++.  (I don't suppose you are actually
saying that only the common subset between C and C++ is worth
optimizing.)

Just in case it isn't clear from the above: the point of
the test exercising the behavior for overlong strings isn't
optimizing undefined behavior but rather avoiding the worst
consequences of it.  I have already explained this once
before so I'm starting to wonder if I'm being insufficiently
clear or if you are not receiving or reading (or understanding)
my responses.  We can have a broader discussion about whether
this is the best approach for GCC to take either in this instance
or in general, but in the meantime I would appreciate it if you
refrained from undoing my changes just because you don't agree
with or don't understand the motivation behind them.

Martin

PS I continue to wonder about your motivation and ethics.  It's
rare to have someone so openly, blatantly and persistently try
to undermine someone else's work.


Martin, you are clearly the one being hostile here - Bernd is trying
to help.  In fact his patches are more focused, easier to understand
and thus easier to review.


As I explained, it's unhelpful for two people to be making changes
to the same code at the same time, especially when one is undoing
the other one's changes.  I have made it clear that I am working
in this area -- I welcome and address valid feedback but I can't
very well do that while someone is compromising my work.

I appreciate test cases and suggestions for improvements but
please avoid making changes to this code while I'm working on
it.  It is not helpful.

Martin


Re: [PATCH] arm: Generate correct const_ints (PR86640)

2018-07-30 Thread Kyrill Tkachov

Hi Segher,

On 30/07/18 14:14, Segher Boessenkool wrote:

In arm_block_set_aligned_vect 8-bit constants are generated as zero-
extended const_ints, not sign-extended as required.  Fix that.
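
The difference between the two representations can be shown with a small
standalone C sketch.  It is not the GCC implementation: `zext8` and
`sext8` are hypothetical helpers that only mimic reducing a value to an
8-bit mode in the two ways discussed.

```c
#include <stdint.h>

/* Hypothetical helper: keep the low 8 bits of V and zero-extend them.  */
static int64_t
zext8 (int64_t v)
{
  return (int64_t) ((uint64_t) v & 0xff);
}

/* Hypothetical helper: keep the low 8 bits of V and sign-extend them,
   the form described above as required for an 8-bit constant.  */
static int64_t
sext8 (int64_t v)
{
  uint64_t low = (uint64_t) v & 0xff;
  /* Flip bit 7, then subtract its weight again: values with bit 7 set
     end up negative, all others are unchanged.  */
  return (int64_t) (low ^ 0x80) - 0x80;
}
```

For the byte value 0xff, zext8 yields 255 whereas sext8 yields -1; for
values below 0x80 the two agree.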

Tamar tested the patch (see PR); no problems were found.  Is this okay
for trunk?



The patch is okay but please add the testcase from the PR to gcc.dg/
or somewhere else generic (it's not arm-specific).
Thanks Tamar for testing.

Kyrill



Segher


2018-07-30  Segher Boessenkool 

PR target/86640
* config/arm/arm.c (arm_block_set_aligned_vect): Use gen_int_mode
instead of GEN_INT.

---
 gcc/config/arm/arm.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index cf12ace..f5eece4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -30046,7 +30046,6 @@ arm_block_set_aligned_vect (rtx dstbase,
   rtx dst, addr, mem;
   rtx val_vec, reg;
   machine_mode mode;
-  unsigned HOST_WIDE_INT v = value;
   unsigned int offset = 0;

   gcc_assert ((align & 0x3) == 0);
@@ -30065,10 +30064,8 @@ arm_block_set_aligned_vect (rtx dstbase,

   dst = copy_addr_to_reg (XEXP (dstbase, 0));

-  v = sext_hwi (v, BITS_PER_WORD);
-
   reg = gen_reg_rtx (mode);
-  val_vec = gen_const_vec_duplicate (mode, GEN_INT (v));
+  val_vec = gen_const_vec_duplicate (mode, gen_int_mode (value, QImode));
   /* Emit instruction loading the constant value.  */
   emit_move_insn (reg, val_vec);

--
1.8.3.1





Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Bernd Edlinger


On 07/30/18 15:03, Richard Biener wrote:
> On Mon, 30 Jul 2018, Bernd Edlinger wrote:
> 
>> Hi,
>>
>> this is how I would like to handle the over length strings issue in the C FE.
>> If the string constant is exactly the right length and ends in one explicit
>> NUL character, shorten it by one character.
>>
>> I thought Martin would be working on it,  but as this is a really simple fix,
>> I would dare to send it to gcc-patches anyway, hope you don't mind...
>>
>> The patch is relative to the other patch here: 
>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01800.html
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> 
> I'll leave this to FE maintainers but can I ask you to verify the
> (other) FEs do not leak this kind of invalid initializers to the
> middle-end?  I suggest to put this verification in
> output_constructor which otherwise happily truncates initializers
> with excess size.  There's also gimplification which might elide
> a = { "abcd", "cdse" }; to  a.x = "abcd"; a.y = "cdse"; but
> hopefully there the GIMPLE verifier (verify_gimple_assign_single)
> verifies this - well, it only dispatches to useless_type_conversion_p
> (lhs_type, rhs1_type) for this case, but non-flexarrays should be
> handled fine there.
> 
> Richard.
> 

At the moment I would already be happy if all STRING_CSTs were
zero-terminated.

However Go does not create zero-terminated STRING_CSTs, @Ian sorry,
could you look at changing this to include the terminating NUL char?

Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc(revision 263068)
+++ gcc/go/go-gcc.cc(working copy)
@@ -1394,7 +1394,7 @@ Gcc_backend::string_constant_expression(const std:
  TYPE_QUAL_CONST);
tree string_type = build_array_type(const_char_type, index_type);
TYPE_STRING_FLAG(string_type) = 1;
-  tree string_val = build_string(val.length(), val.data());
+  tree string_val = build_string(val.length() + 1, val.data());
TREE_TYPE(string_val) = string_type;
  
return this->make_expression(string_val);


I am pretty sure that the C++ FE allows overlength initializers
with -permissive.  They should be hedged in string_constant IMHO,
however with the patch I am still holding back on Jeff's request
I ran over a string constant in tree-ssa-strlen.c (get_min_string_length)
that had a terminating NUL char but the index range type did not
include the string terminator.  One just needs to be careful here.

A quick survey shows that Fortran creates C strings with range
1..n, which puts the pr86532 address computation again in question.
Remember, you asked for array_ref_element_size but not for
array_ref_low_bound, and Jeff acked the patch in this state.



Thanks
Bernd.


Re: [libgomp, nvptx, committed] Calculate default dims per device

2018-07-30 Thread Cesar Philippidis
On 07/30/2018 03:19 AM, Tom de Vries wrote:
> 
> [libgomp, nvptx] Calculate default dims per device
> 
> The default dimensions are calculated using per-device properties, but
> initialized once and used on all devices.
> 
> This patch fixes this problem by introducing per-device default dimensions.

Neat, thanks!

I wonder if it's worthwhile to optimize the case where a system has more
than one identical GPU.

Cesar


Re: [PATCH][GCC][AARCH64] Canonicalize aarch64 widening simd plus insns

2018-07-30 Thread Kyrill Tkachov



On 30/07/18 14:30, Christophe Lyon wrote:

Hi,

On Tue, 24 Jul 2018 at 17:39, Kyrill Tkachov
 wrote:


On 24/07/18 16:12, James Greenhalgh wrote:

On Thu, Jul 19, 2018 at 07:35:22AM -0500, Matthew Malcomson wrote:

Hi again.

Providing an updated patch to include the formatting suggestions.

Please try not to top-post replies, it makes the conversation thread
harder to follow (reply continues below!).


On 12/07/18 11:39, Sudakshina Das wrote:

Hi Matthew

On 12/07/18 11:18, Richard Sandiford wrote:

Looks good to me FWIW (not a maintainer), just a minor formatting thing:

Matthew Malcomson  writes:

diff --git a/gcc/config/aarch64/aarch64-simd.md
b/gcc/config/aarch64/aarch64-simd.md
index
aac5fa146ed8dde4507a0eb4ad6a07ce78d2f0cd..67b29cbe2cad91e031ee23be656ec61a403f2cf9
100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3302,38 +3302,78 @@
 DONE;
   })
   -(define_insn "aarch64_w"
+(define_insn "aarch64_subw"
 [(set (match_operand: 0 "register_operand" "=w")
-(ADDSUB: (match_operand: 1 "register_operand"
"w")
-(ANY_EXTEND:
-  (match_operand:VD_BHSI 2 "register_operand" "w"]
+(minus:
+ (match_operand: 1 "register_operand" "w")
+ (ANY_EXTEND:
+   (match_operand:VD_BHSI 2 "register_operand" "w"]

The (minus should be under the "(match_operand":

(define_insn "aarch64_subw"
[(set (match_operand: 0 "register_operand" "=w")
 (minus: (match_operand: 1 "register_operand" "w")
(ANY_EXTEND:
  (match_operand:VD_BHSI 2 "register_operand" "w"]

Same for the other patterns.

Thanks,
Richard


You will need a maintainer's approval but this looks good to me.
Thanks for doing this. I would only point out one other nit which you
can choose to ignore:

+/* Ensure
+   saddw2 and one saddw for the function add()
+   ssubw2 and one ssubw for the function subtract()
+   uaddw2 and one uaddw for the function uadd()
+   usubw2 and one usubw for the function usubtract() */
+
+/* { dg-final { scan-assembler-times "\[ \t\]ssubw2\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]ssubw\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]saddw2\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]saddw\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]usubw2\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]usubw\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]uaddw2\[ \t\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\[ \t\]uaddw\[ \t\]+" 1 } } */

The scan-assembly directives for the different
functions can be placed right below each of them and that would
make it easier to read the expected results in the test and you
can get rid of the comments saying the same.

Thanks for the first-line review Sudi.

OK for trunk.


I've committed this on behalf of Matthew as: https://gcc.gnu.org/r262949


The new test fails with -mabi=ilp32:
 gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]saddw2[ \t]+ 1
 gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]saddw[ \t]+ 1
 gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]ssubw2[ \t]+ 1
 gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]ssubw[ \t]+ 1
 gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]uaddw2[ \t]+ 1
 gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]uaddw[ \t]+ 1
 gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]usubw2[ \t]+ 1
 gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]usubw[ \t]+ 1

I'm not sure how much we care about ilp32?


Looks like the usual pitfall of unsigned longs being 32-bit in ILP32.
The test should be made to use "long long"s, which are guaranteed to be
at least 64 bits, instead of just longs.
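
A minimal C sketch of the point about type widths (illustrative only, not
the contents of vect_su_add_sub.c; `long_long_bits` and `long_bits` are
made-up names):

```c
#include <limits.h>

/* long is 64 bits wide under LP64 but only 32 bits under ILP32, so code
   that needs 64-bit elements should not rely on it.  long long is at
   least 64 bits wide on every conforming implementation.  */
_Static_assert (sizeof (long long) * CHAR_BIT >= 64,
                "long long must be at least 64 bits wide");

/* Hypothetical helper: width of long long in bits.  */
int
long_long_bits (void)
{
  return (int) (sizeof (long long) * CHAR_BIT);
}

/* Hypothetical helper: width of long in bits; this is the value that
   differs between LP64 and ILP32.  */
int
long_bits (void)
{
  return (int) (sizeof (long) * CHAR_BIT);
}
```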

Thanks,
Kyrill


Christophe


Thanks,
Kyrill


Thanks,
James





Re: [PATCH][Middle-end] disable strcmp/strncmp inlining with O2 below and Os

2018-07-30 Thread Christophe Lyon
On Wed, 25 Jul 2018 at 19:08, Qing Zhao  wrote:
>
> Hi,
>
> As Wilco suggested, the newly added strcmp/strncmp inlining should only be 
> enabled with O2 and above.
>
> this is the simple patch for this change.
>
> tested on both X86 and aarch64.
>
> Okay for trunk?
>
> Qing
>
> gcc/ChangeLog:
>
> +2018-07-25  Qing Zhao  
> +
> +   * builtins.c (inline_expand_builtin_string_cmp): Disable inlining
> +   when optimization level is lower than 2 or optimize for size.
> +
>
> gcc/testsuite/ChangeLog:
>
> +2018-07-25  Qing Zhao  
> +
> +   * gcc.dg/strcmpopt_5.c: Change to O2 to enable the transformation.
> +   * gcc.dg/strcmpopt_6.c: Likewise.
> +
>

Hi,

After this change, I've noticed that:
FAIL: gcc.dg/strcmpopt_6.c scan-rtl-dump-times expand "__builtin_memcmp" 6
on arm-none-linux-gnueabi
--with-mode thumb
--with-cpu cortex-a9
and forcing -march=armv5t via RUNTESTFLAGS

The log says:
gcc.dg/strcmpopt_6.c: pattern found 4 times
FAIL: gcc.dg/strcmpopt_6.c scan-rtl-dump-times expand "__builtin_memcmp" 6

Christophe


Re: [PATCH][GCC][AARCH64] Canonicalize aarch64 widening simd plus insns

2018-07-30 Thread Christophe Lyon
Hi,

On Tue, 24 Jul 2018 at 17:39, Kyrill Tkachov
 wrote:
>
>
> On 24/07/18 16:12, James Greenhalgh wrote:
> > On Thu, Jul 19, 2018 at 07:35:22AM -0500, Matthew Malcomson wrote:
> > > Hi again.
> > >
> > > Providing an updated patch to include the formatting suggestions.
> >
> > Please try not to top-post replies, it makes the conversation thread
> > harder to follow (reply continues below!).
> >
> > > On 12/07/18 11:39, Sudakshina Das wrote:
> > > > Hi Matthew
> > > >
> > > > On 12/07/18 11:18, Richard Sandiford wrote:
> > > >> Looks good to me FWIW (not a maintainer), just a minor formatting 
> > > >> thing:
> > > >>
> > > >> Matthew Malcomson  writes:
> > > >>> diff --git a/gcc/config/aarch64/aarch64-simd.md
> > > >>> b/gcc/config/aarch64/aarch64-simd.md
> > > >>> index
> > > >>> aac5fa146ed8dde4507a0eb4ad6a07ce78d2f0cd..67b29cbe2cad91e031ee23be656ec61a403f2cf9
> > > >>> 100644
> > > >>> --- a/gcc/config/aarch64/aarch64-simd.md
> > > >>> +++ b/gcc/config/aarch64/aarch64-simd.md
> > > >>> @@ -3302,38 +3302,78 @@
> > > >>> DONE;
> > > >>>   })
> > > >>>   -(define_insn "aarch64_w"
> > > >>> +(define_insn "aarch64_subw"
> > > >>> [(set (match_operand: 0 "register_operand" "=w")
> > > >>> -(ADDSUB: (match_operand: 1 "register_operand"
> > > >>> "w")
> > > >>> -(ANY_EXTEND:
> > > >>> -  (match_operand:VD_BHSI 2 "register_operand" "w"]
> > > >>> +(minus:
> > > >>> + (match_operand: 1 "register_operand" "w")
> > > >>> + (ANY_EXTEND:
> > > >>> +   (match_operand:VD_BHSI 2 "register_operand" "w"]
> > > >>
> > > >> The (minus should be under the "(match_operand":
> > > >>
> > > >> (define_insn "aarch64_subw"
> > > >>[(set (match_operand: 0 "register_operand" "=w")
> > > >> (minus: (match_operand: 1 "register_operand" "w")
> > > >>(ANY_EXTEND:
> > > >>  (match_operand:VD_BHSI 2 "register_operand" "w"]
> > > >>
> > > >> Same for the other patterns.
> > > >>
> > > >> Thanks,
> > > >> Richard
> > > >>
> > > >
> > > > You will need a maintainer's approval but this looks good to me.
> > > > Thanks for doing this. I would only point out one other nit which you
> > > > can choose to ignore:
> > > >
> > > > +/* Ensure
> > > > +   saddw2 and one saddw for the function add()
> > > > +   ssubw2 and one ssubw for the function subtract()
> > > > +   uaddw2 and one uaddw for the function uadd()
> > > > +   usubw2 and one usubw for the function usubtract() */
> > > > +
> > > > +/* { dg-final { scan-assembler-times "\[ \t\]ssubw2\[ \t\]+" 1 } } */
> > > > +/* { dg-final { scan-assembler-times "\[ \t\]ssubw\[ \t\]+" 1 } } */
> > > > +/* { dg-final { scan-assembler-times "\[ \t\]saddw2\[ \t\]+" 1 } } */
> > > > +/* { dg-final { scan-assembler-times "\[ \t\]saddw\[ \t\]+" 1 } } */
> > > > +/* { dg-final { scan-assembler-times "\[ \t\]usubw2\[ \t\]+" 1 } } */
> > > > +/* { dg-final { scan-assembler-times "\[ \t\]usubw\[ \t\]+" 1 } } */
> > > > +/* { dg-final { scan-assembler-times "\[ \t\]uaddw2\[ \t\]+" 1 } } */
> > > > +/* { dg-final { scan-assembler-times "\[ \t\]uaddw\[ \t\]+" 1 } } */
> > > >
> > > > The scan-assembly directives for the different
> > > > functions can be placed right below each of them and that would
> > > > make it easier to read the expected results in the test and you
> > > > can get rid of the comments saying the same.
> >
> > Thanks for the first-line review Sudi.
> >
> > OK for trunk.
> >
>
> I've committed this on behalf of Matthew as: https://gcc.gnu.org/r262949
>

The new test fails with -mabi=ilp32:
gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]saddw2[ \t]+ 1
gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]saddw[ \t]+ 1
gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]ssubw2[ \t]+ 1
gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]ssubw[ \t]+ 1
gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]uaddw2[ \t]+ 1
gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]uaddw[ \t]+ 1
gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]usubw2[ \t]+ 1
gcc.target/aarch64/simd/vect_su_add_sub.c scan-assembler-times [ \t]usubw[ \t]+ 1

I'm not sure how much we care about ilp32?

Christophe

> Thanks,
> Kyrill
>
> > Thanks,
> > James
> >
>


Re: [PATCH] Fix PR middle-end/86705

2018-07-30 Thread Richard Biener
On Sun, Jul 29, 2018 at 6:27 PM Jozef Lawrynowicz
 wrote:
>
> pr45678-2.c ICEs for msp430-elf with -mlarge, because an alignment of
> POINTER_SIZE is attempted. POINTER_SIZE with -mlarge is 20-bits, so further
> code in the middle-end that expects this to be a power of 2 causes odd
> alignments to be set, in this case eventually resulting in an ICE.
>
> The test ICEs on gcc-7-branch, gcc-8-branch, and current trunk. It
> successfully builds on gcc-6-branch.
> The failure is caused by r235172.
>
> Successfully bootstrapped and regtested the attached patch for
> x86-64-pc-linux-gnu, and msp430-elf with -mlarge, on trunk.
>
> Ok for gcc-7-branch, gcc-8-branch and trunk?

I wonder if most (if not all) places you touch want to use
get_mode_alignment (Pmode) instead?  (or ptr_mode)

Anyhow, the patch is otherwise obvious though factoring
the thing might be nice (thus my suggestion above...)
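
The power-of-2 expectation mentioned in the quoted description can be
illustrated with a short C sketch.  `is_power_of_two` is a hypothetical
helper written for this example, not a GCC function.

```c
/* Hypothetical helper: nonzero if X is a power of two.  A power of two
   has exactly one bit set, so clearing its lowest set bit gives zero.  */
static int
is_power_of_two (unsigned int x)
{
  return x != 0 && (x & (x - 1)) == 0;
}
```

A 20-bit pointer size fails this check, while 16 and 32 pass it, which is
why an alignment derived directly from such a size can end up odd.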

Richard.


Re: [PATCH 08/11] targhooks - provide an alternative hook for targets that never execute speculatively

2018-07-30 Thread Richard Biener
On Fri, 27 Jul 2018, Richard Earnshaw wrote:

> 
> This hook adds an alternative implementation for the target hook
> TARGET_HAVE_SPECULATION_SAFE_VALUE; it can be used by targets that have no
> CPU implementations that execute code speculatively.  All that is needed for
> such targets now is to add:
> 
>  #undef TARGET_HAVE_SPECULATION_SAFE_VALUE
>  #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed.
> 
> to where you have your other target hooks and you're done.

OK.

> gcc:
>   * targhooks.h (speculation_safe_value_not_needed): New prototype.
>   * targhooks.c (speculation_safe_value_not_needed): New function.
>   * target.def (have_speculation_safe_value): Update documentation.
>   * doc/tm.texi: Regenerated.
> ---
>  gcc/doc/tm.texi | 5 +
>  gcc/target.def  | 7 ++-
>  gcc/targhooks.c | 7 +++
>  gcc/targhooks.h | 1 +
>  4 files changed, 19 insertions(+), 1 deletion(-)
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH 01/11] Add __builtin_speculation_safe_value

2018-07-30 Thread Richard Biener
On Fri, 27 Jul 2018, Richard Earnshaw wrote:

> 
> This patch defines a new intrinsic function
> __builtin_speculation_safe_value.  A generic default implementation is
> defined which will attempt to use the backend pattern
> "speculation_safe_barrier".  If this pattern is not defined, or if it
> is not available, then the compiler will emit a warning, but
> compilation will continue.
> 
> Note that the test spec-barrier-1.c will currently fail on all
> targets.  This is deliberate, the failure will go away when
> appropriate action is taken for each target backend.

OK.

Thanks,
Richard.

> gcc:
>   * builtin-types.def (BT_FN_PTR_PTR_VAR): New function type.
>   (BT_FN_I1_I1_VAR, BT_FN_I2_I2_VAR, BT_FN_I4_I4_VAR): Likewise.
>   (BT_FN_I8_I8_VAR, BT_FN_I16_I16_VAR): Likewise.
>   * builtin-attrs.def (ATTR_NOVOPS_NOTHROW_LEAF_LIST): New attribute
>   list.
>   * builtins.def (BUILT_IN_SPECULATION_SAFE_VALUE_N): New builtin.
>   (BUILT_IN_SPECULATION_SAFE_VALUE_PTR): New internal builtin.
>   (BUILT_IN_SPECULATION_SAFE_VALUE_1): Likewise.
>   (BUILT_IN_SPECULATION_SAFE_VALUE_2): Likewise.
>   (BUILT_IN_SPECULATION_SAFE_VALUE_4): Likewise.
>   (BUILT_IN_SPECULATION_SAFE_VALUE_8): Likewise.
>   (BUILT_IN_SPECULATION_SAFE_VALUE_16): Likewise.
>   * builtins.c (expand_speculation_safe_value): New function.
>   (expand_builtin): Call it.
>   * doc/cpp.texi: Document predefine __HAVE_SPECULATION_SAFE_VALUE.
>   * doc/extend.texi: Document __builtin_speculation_safe_value.
>   * doc/md.texi: Document "speculation_barrier" pattern.
>   * doc/tm.texi.in: Pull in TARGET_SPECULATION_SAFE_VALUE and
>   TARGET_HAVE_SPECULATION_SAFE_VALUE.
>   * doc/tm.texi: Regenerated.
>   * target.def (have_speculation_safe_value, speculation_safe_value): New
>   hooks.
>   * targhooks.c (default_have_speculation_safe_value): New function.
>   (default_speculation_safe_value): New function.
>   * targhooks.h (default_have_speculation_safe_value): Add prototype.
>   (default_speculation_safe_value): Add prototype.
> 
> c-family:
>   * c-common.c (speculation_safe_resolve_call): New function.
>   (speculation_safe_resolve_params): New function.
>   (speculation_safe_resolve_return): New function.
>   (resolve_overloaded_builtin): Handle __builtin_speculation_safe_value.
>   * c-cppbuiltin.c (c_cpp_builtins): Add pre-define for
>   __HAVE_SPECULATION_SAFE_VALUE.
> 
> testsuite:
>   * c-c++-common/spec-barrier-1.c: New test.
>   * c-c++-common/spec-barrier-2.c: New test.
>   * gcc.dg/spec-barrier-3.c: New test.
> ---
>  gcc/builtin-attrs.def   |   2 +
>  gcc/builtin-types.def   |   6 +
>  gcc/builtins.c  |  60 ++
>  gcc/builtins.def|  22 
>  gcc/c-family/c-common.c | 164 
> 
>  gcc/c-family/c-cppbuiltin.c |   7 +-
>  gcc/doc/cpp.texi|   4 +
>  gcc/doc/extend.texi |  91 +++
>  gcc/doc/md.texi |  15 +++
>  gcc/doc/tm.texi |  31 ++
>  gcc/doc/tm.texi.in  |   4 +
>  gcc/target.def  |  35 ++
>  gcc/targhooks.c |  32 ++
>  gcc/targhooks.h |   3 +
>  gcc/testsuite/c-c++-common/spec-barrier-1.c |  38 +++
>  gcc/testsuite/c-c++-common/spec-barrier-2.c |  17 +++
>  gcc/testsuite/gcc.dg/spec-barrier-3.c   |  13 +++
>  17 files changed, 543 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/c-c++-common/spec-barrier-1.c
>  create mode 100644 gcc/testsuite/c-c++-common/spec-barrier-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/spec-barrier-3.c
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PATCH] arm: Generate correct const_ints (PR86640)

2018-07-30 Thread Segher Boessenkool
In arm_block_set_aligned_vect 8-bit constants are generated as zero-
extended const_ints, not sign-extended as required.  Fix that.
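
As an illustration of the difference, take the byte value 0xff: in
sign-extended form an 8-bit constant with that bit pattern is -1, not 255.
The sketch below models that canonicalization in plain C.  The helper
sext8 is hypothetical and only stands in for what
gen_int_mode (value, QImode) is expected to produce; it is not GCC code.

```c
#include <stdint.h>

/* Hypothetical model: reduce VALUE to its low 8 bits and sign-extend the
   result, which is the form expected for an 8-bit const_int.  */
static int64_t
sext8 (uint64_t value)
{
  uint64_t low = value & 0xff;
  /* Flip the sign bit, then subtract it again to propagate it upward.  */
  return (int64_t) (low ^ 0x80) - 0x80;
}
```

The code removed by the patch sign-extended from BITS_PER_WORD instead,
which leaves a value such as 0xff unchanged.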

Tamar tested the patch (see PR); no problems were found.  Is this okay
for trunk?


Segher


2018-07-30  Segher Boessenkool  

PR target/86640
* config/arm/arm.c (arm_block_set_aligned_vect): Use gen_int_mode
instead of GEN_INT.

---
 gcc/config/arm/arm.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index cf12ace..f5eece4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -30046,7 +30046,6 @@ arm_block_set_aligned_vect (rtx dstbase,
   rtx dst, addr, mem;
   rtx val_vec, reg;
   machine_mode mode;
-  unsigned HOST_WIDE_INT v = value;
   unsigned int offset = 0;
 
   gcc_assert ((align & 0x3) == 0);
@@ -30065,10 +30064,8 @@ arm_block_set_aligned_vect (rtx dstbase,
 
   dst = copy_addr_to_reg (XEXP (dstbase, 0));
 
-  v = sext_hwi (v, BITS_PER_WORD);
-
   reg = gen_reg_rtx (mode);
-  val_vec = gen_const_vec_duplicate (mode, GEN_INT (v));
+  val_vec = gen_const_vec_duplicate (mode, gen_int_mode (value, QImode));
   /* Emit instruction loading the constant value.  */
   emit_move_insn (reg, val_vec);
 
-- 
1.8.3.1



Re: [PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Richard Biener
On Mon, 30 Jul 2018, Bernd Edlinger wrote:

> Hi,
> 
> this is how I would like to handle the over length strings issue in the C FE.
> If the string constant is exactly the right length and ends in one explicit
> NUL character, shorten it by one character.
> 
> I thought Martin would be working on it, but as this is a really simple fix,
> I would dare to send it to gcc-patches anyway, hope you don't mind...
> 
> The patch is relative to the other patch here: 
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01800.html
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

I'll leave this to the FE maintainers, but can I ask you to verify that
the (other) FEs do not leak this kind of invalid initializer to the
middle-end?  I suggest putting this verification in
output_constructor, which otherwise happily truncates initializers
with excess size.  There's also gimplification, which might lower
a = { "abcd", "cdse" }; to  a.x = "abcd"; a.y = "cdse"; but
hopefully there the GIMPLE verifier (verify_gimple_assign_single)
verifies this.  Well, it only dispatches to useless_type_conversion_p
(lhs_type, rhs1_type) for this case, but non-flexarrays should be
handled fine there.

Richard.


[OBVIOUS][ARM][libgcc] Fix comment for code working on architectures >= 4

2018-07-30 Thread Christophe Lyon
Hi,

In r261840 I added an inaccurate comment: the code works on
architectures >= 4, not > 4.

I committed this obvious fix as r263066:

2018-07-30  Christophe Lyon  

   * config/arm/ieee754-df.S: Fix comment for code working on
   architectures >= 4.
   * config/arm/ieee754-sf.S: Likewise.

Index: libgcc/config/arm/ieee754-df.S
===
--- libgcc/config/arm/ieee754-df.S  (revision 263065)
+++ libgcc/config/arm/ieee754-df.S  (revision 263066)
@@ -657,7 +657,7 @@
beq LSYM(Lml_1)

@ Here is the actual multiplication.
-   @ This code works on architecture versions > 4
+   @ This code works on architecture versions >= 4
umull   ip, lr, xl, yl
mov r5, #0
umlal   lr, r5, xh, yl
Index: libgcc/config/arm/ieee754-sf.S
===
--- libgcc/config/arm/ieee754-sf.S  (revision 263065)
+++ libgcc/config/arm/ieee754-sf.S  (revision 263066)
@@ -461,7 +461,7 @@
orr r1, r3, r1, lsr #5

@ The actual multiplication.
-   @ This code works on architecture versions > 4
+   @ This code works on architecture versions >= 4
umull   r3, r1, r0, r1

@ Put final sign in r0.


[PATCH] Fix the damage done by my other patch from yesterday to strlenopt-49.c

2018-07-30 Thread Bernd Edlinger
Hi,

this is how I would like to handle the over length strings issue in the C FE.
If the string constant is exactly the right length and ends in one explicit
NUL character, shorten it by one character.
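
The rule can be modelled outside the compiler as follows.  This is a
stand-alone sketch under stated assumptions, not the c-typeck.c code:
adjusted_length and its parameters are hypothetical names, LEN is assumed
to count the implicit terminator (as TREE_STRING_LENGTH does in the patch),
and UNIT is the size of one character in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the length adjustment.  P points to LEN bytes of
   string data, where LEN counts the implicit terminator; ARRAY_SIZE is
   the size in bytes of the array being initialized.  Returns the length
   to keep.  */
static size_t
adjusted_length (const char *p, size_t len, size_t unit, size_t array_size)
{
  assert (unit >= 1 && unit <= 4 && len >= unit);

  /* Too long even without the implicit terminator: this case is
     diagnosed and the string is left unchanged.  */
  if (array_size < len - unit)
    return len;

  /* Fits together with its terminator: nothing to do.  */
  if (array_size >= len)
    return len;

  /* Only the implicit terminator is in excess.  Drop it if the last
     explicit character is itself a NUL.  */
  if (len >= 2 * unit)
    {
      size_t shorter = len - unit;
      if (memcmp (p + shorter - unit, "\0\0\0\0", unit) == 0)
        return shorter;
    }
  return len;
}
```

For const char a3[3] = "12\0"; the literal occupies 4 bytes including the
implicit terminator, so the model returns 3.  For a char[3] initialized
with "abc" the last explicit character is not a NUL and the length stays 4.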

I thought Martin would be working on it, but as this is a really simple fix,
I would dare to send it to gcc-patches anyway, hope you don't mind...

The patch is relative to the other patch here: 
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01800.html


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.

gcc/c:
2018-07-30  Bernd Edlinger  

	* c-typeck.c (digest_init): Fix overlength strings.

testsuite:
2018-07-30  Bernd Edlinger  

	* gcc.dg/strlenopt-49.c: Adjust test expectations.

diff -pur gcc/c/c-typeck.c gcc/c/c-typeck.c
--- gcc/c/c-typeck.c	2018-06-20 18:35:15.0 +0200
+++ gcc/c/c-typeck.c	2018-07-30 12:17:34.175481372 +0200
@@ -7435,29 +7435,38 @@ digest_init (location_t init_loc, tree t
 		}
 	}
 
-	  TREE_TYPE (inside_init) = type;
 	  if (TYPE_DOMAIN (type) != NULL_TREE
 	  && TYPE_SIZE (type) != NULL_TREE
 	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
 	{
 	  unsigned HOST_WIDE_INT len = TREE_STRING_LENGTH (inside_init);
+	  unsigned unit = TYPE_PRECISION (typ1) / BITS_PER_UNIT;
 
 	  /* Subtract the size of a single (possibly wide) character
 		 because it's ok to ignore the terminating null char
 		 that is counted in the length of the constant.  */
-	  if (compare_tree_int (TYPE_SIZE_UNIT (type),
-(len - (TYPE_PRECISION (typ1)
-	/ BITS_PER_UNIT))) < 0)
+	  if (compare_tree_int (TYPE_SIZE_UNIT (type), len - unit) < 0)
 		pedwarn_init (init_loc, 0,
 			  ("initializer-string for array of chars "
 			   "is too long"));
-	  else if (warn_cxx_compat
-		   && compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
-		warning_at (init_loc, OPT_Wc___compat,
-			("initializer-string for array chars "
-			 "is too long for C++"));
+	  else if (compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
+		{
+		  if (warn_cxx_compat)
+		warning_at (init_loc, OPT_Wc___compat,
+("initializer-string for array chars "
+ "is too long for C++"));
+		  if (len >= 2 * unit)
+		{
+		  const char *p = TREE_STRING_POINTER (inside_init);
+
+		  len -= unit;
+		  if (memcmp (p + len - unit, "\0\0\0\0", unit) == 0)
+			inside_init = build_string (len, p);
+		}
+		}
 	}
 
+	  TREE_TYPE (inside_init) = type;
 	  return inside_init;
 	}
   else if (INTEGRAL_TYPE_P (typ1))
diff -pur gcc/testsuite/gcc.dg/strlenopt-49.c gcc/testsuite/gcc.dg/strlenopt-49.c
--- gcc/testsuite/gcc.dg/strlenopt-49.c	2018-07-30 13:02:34.735478726 +0200
+++ gcc/testsuite/gcc.dg/strlenopt-49.c	2018-07-30 13:08:21.074859303 +0200
@@ -11,9 +11,6 @@ const char a3[3] = "12\0";
 const char a8[8] = "1234567\0";
 const char a9[9] = "12345678\0";
 
-const char ax[9] = "12345678\0\0\0\0";   /* { dg-warning "initializer-string for array of chars is too long" } */
-const char ay[9] = "\00012345678\0\0\0\0";   /* { dg-warning "initializer-string for array of chars is too long" } */
-
 
 int len1 (void)
 {
@@ -27,27 +24,13 @@ int len (void)
   return len;
 }
 
-int lenx (void)
-{
-  size_t lenx = strlen (ax);
-  return lenx;
-}
-
-int leny (void)
-{
-  size_t leny = strlen (ay);
-  return leny;
-}
-
 int cmp88 (void)
 {
   int cmp88 = memcmp (a8, "1234567\0", sizeof a8);
   return cmp88;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" { xfail *-*-* } } }
-   { dg-final { scan-tree-dump-times "len0 = 0;" 1 "gimple" { xfail *-*-* } } }
-   { dg-final { scan-tree-dump-times "len = 18;" 1 "gimple" { xfail *-*-* } } }
-   { dg-final { scan-tree-dump-times "lenx = 8;" 1 "gimple" { xfail *-*-* } } }
-   { dg-final { scan-tree-dump-times "leny = 0;" 1 "gimple" { xfail *-*-* } } }
-   { dg-final { scan-tree-dump-times "cmp88 = 0;" 1 "gimple" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "len0 = 0;" 1 "gimple" } }
+   { dg-final { scan-tree-dump-times "len = 18;" 1 "gimple" } }
+   { dg-final { scan-tree-dump-times "cmp88 = 0;" 1 "gimple" } } */


[11/11] Insert pattern statements into vec_basic_blocks

2018-07-30 Thread Richard Sandiford
The point of this patch is to put pattern statements in the same
vec_basic_block as the statements they replace, with the pattern
statements for S coming between S and S's original predecessor.
This removes the need to handle them specially in various places.


2018-07-30  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_basic_block): Expand comment.
(_stmt_vec_info::pattern_def_seq): Delete.
(STMT_VINFO_PATTERN_DEF_SEQ): Likewise.
(is_main_pattern_stmt_p): New function.
* tree-vect-loop.c (vect_determine_vf_for_stmt_1): Rename to...
(vect_determine_vf_for_stmt): ...this, deleting the original
function with this name.  Remove vectype_maybe_set_p argument
and test is_pattern_stmt_p instead.  Retain the "examining..."
message from the previous vect_determine_vf_for_stmt.
(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
(vect_analyze_loop_2): Don't treat pattern statements specially.
(vect_transform_loop): Likewise.  Use vect_orig_stmt to find the
insertion point.
* tree-vect-slp.c (vect_detect_hybrid_slp): Expect pattern statements
to be in the statement list, without needing to follow
STMT_VINFO_RELATED_STMT.  Remove PATTERN_DEF_SEQ handling.
* tree-vect-stmts.c (vect_analyze_stmt): Don't handle pattern
statements specially.
(vect_remove_dead_scalar_stmts): Ignore pattern statements.
* tree-vect-patterns.c (vect_set_pattern_stmt): Insert the pattern
statement into the vec_basic_block immediately before the statement
it replaces.
(append_pattern_def_seq): Likewise.  If the original statement is
itself a pattern statement, associate the new one with the original
statement.
(vect_split_statement): Use append_pattern_def_seq to insert the
first pattern statement.
(vect_recog_vector_vector_shift_pattern): Remove mention of
STMT_VINFO_PATTERN_DEF_SEQ.
(adjust_bool_stmts): Get the last pattern statement from the
stmt_vec_info chain.
(vect_mark_pattern_stmts): Rename to...
(vect_replace_stmt_with_pattern): ...this.  Remove the
PATTERN_DEF_SEQ handling and process only the pattern statement given.
Use append_pattern_def_seq when replacing a pattern statement with
another pattern statement, and use vec_basic_block::remove instead
of gsi_remove to remove the old one.
(vect_pattern_recog_1): Update accordingly.  Remove PATTERN_DEF_SEQ
handling.  On failure, remove any half-formed pattern sequence from
the vec_basic_block.  Install the vector type in pattern statements
that don't yet have one.
(vect_pattern_recog): Iterate over statements that are added
by previous recognizers, but skipping those that have already
been replaced, or the main pattern statement in such a replacement.

Index: gcc/tree-vectorizer.h
===
*** gcc/tree-vectorizer.h   2018-07-30 12:32:46.658356275 +0100
--- gcc/tree-vectorizer.h   2018-07-30 12:32:49.898327734 +0100
*** #define SLP_TREE_TWO_OPERATORS(S)(S)-
*** 172,178 
  #define SLP_TREE_DEF_TYPE(S)   (S)->def_type
  
  /* Information about the phis and statements in a block that we're trying
!to vectorize, in their original order.  */
  class vec_basic_block
  {
  public:
--- 172,184 
  #define SLP_TREE_DEF_TYPE(S)   (S)->def_type
  
  /* Information about the phis and statements in a block that we're trying
!to vectorize.  This includes the phis and statements that were in the
!original scalar code, in their original order.  It also includes any
!pattern statements that the vectorizer has created to replace some
!of the scalar ones.  Such pattern statements come immediately before
!the statement that they replace; that is, all pattern statements P for
!which vect_orig_stmt (P) == S form a sequence that comes immediately
!before S.  */
  class vec_basic_block
  {
  public:
*** struct _stmt_vec_info {
*** 870,880 
  pattern).  */
stmt_vec_info related_stmt;
  
-   /* Used to keep a sequence of def stmts of a pattern stmt if such exists.
-  The sequence is attached to the original statement rather than the
-  pattern statement.  */
-   gimple_seq pattern_def_seq;
- 
/* List of datarefs that are known to have the same alignment as the dataref
   of this stmt.  */
vec same_align_refs;
--- 876,881 
*** #define STMT_VINFO_DR_INFO(S) \
*** 1048,1054 
  
  #define STMT_VINFO_IN_PATTERN_P(S) (S)->in_pattern_p
  #define STMT_VINFO_RELATED_STMT(S) (S)->related_stmt
- #define STMT_VINFO_PATTERN_DEF_SEQ(S)  (S)->pattern_def_seq
  #define STMT_VINFO_SAME_ALIGN_REFS(S)  

[10/11] Make the vectoriser do its own DCE

2018-07-30 Thread Richard Sandiford
The vectoriser normally leaves a later DCE pass to remove the scalar
code, but we've accumulated various bits of code to remove cases that
DCE can't handle, such as removing the scalar stores that have been
replaced by vector stores, and the scalar calls to internal functions.
(The latter must be removed for correctness, since no underlying scalar
optabs exist for those calls.)

Now that vec_basic_block gives us an easy way of iterating over the
original scalar code (ignoring any new code inserted by the vectoriser),
it seems easier to do the DCE directly.  This involves marking the few
cases in which the vector code needs part of the original scalar code
to be kept around.


2018-07-30  Richard Sandiford  

gcc/
* tree-vectorizer.h (_stmt_vec_info::used_by_vector_code_p): New
member variable.
(vect_mark_used_by_vector_code): Declare.
(vect_remove_dead_scalar_stmts): Likewise.
(vect_transform_stmt): Return void.
(vect_remove_stores): Delete.
* tree-vectorizer.c (vec_info::remove_stmt): Handle phis.
* tree-vect-stmts.c (vect_mark_used_by_vector_code): New function.
(vectorizable_call, vectorizable_simd_clone_call): Don't remove
scalar calls here.
(vectorizable_load): Mark unhoisted scalar loads that feed a
load-and-broadcast operation as being needed by the vector code.
(vect_transform_stmt): Return void.
(vect_remove_stores): Delete.
(vect_maybe_remove_scalar_stmt): New function.
(vect_remove_dead_scalar_stmts): Likewise.
* tree-vect-slp.c (vect_slp_bb): Call vect_remove_dead_scalar_stmts.
(vect_remove_slp_scalar_calls): Delete.
(vect_schedule_slp): Don't call it.  Don't remove scalar stores here.
* tree-vect-loop.c (vectorizable_reduction): Mark scalar phis that
are retained by the vector code.
(vectorizable_live_operation): Mark scalar live-out statements that
are retained by the vector code.
(vect_transform_loop_stmt): Remove seen_store argument.  Mark gconds
in nested loops as being needed by the vector code.  Replace the
outer loop's gcond with a dummy condition.
(vect_transform_loop): Update calls accordingly.  Don't remove
scalar stores or calls here; call vect_remove_dead_scalar_stmts
instead.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-30 12:32:42.790390350 +0100
+++ gcc/tree-vectorizer.h   2018-07-30 12:32:46.658356275 +0100
@@ -925,6 +925,10 @@ struct _stmt_vec_info {
   /* For both loads and stores.  */
   bool simd_lane_access_p;
 
+  /* True if the vectorized code keeps this statement in its current form.
+ Only meaningful for statements that were in the original scalar code.  */
+  bool used_by_vector_code_p;
+
   /* Classifies how the load or store is going to be implemented
  for loop vectorization.  */
   vect_memory_access_type memory_access_type;
@@ -1522,6 +1526,7 @@ extern stmt_vec_info vect_finish_replace
 extern stmt_vec_info vect_finish_stmt_generation (stmt_vec_info, gimple *,
  gimple_stmt_iterator *);
 extern bool vect_mark_stmts_to_be_vectorized (loop_vec_info);
+extern void vect_mark_used_by_vector_code (stmt_vec_info);
 extern tree vect_get_store_rhs (stmt_vec_info);
 extern tree vect_get_vec_def_for_operand_1 (stmt_vec_info, enum vect_def_type);
 extern tree vect_get_vec_def_for_operand (tree, stmt_vec_info, tree = NULL);
@@ -1532,9 +1537,8 @@ extern void vect_get_vec_defs_for_stmt_c
 extern tree vect_init_vector (stmt_vec_info, tree, tree,
   gimple_stmt_iterator *);
 extern tree vect_get_vec_def_for_stmt_copy (vec_info *, tree);
-extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
+extern void vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
 slp_tree, slp_instance);
-extern void vect_remove_stores (stmt_vec_info);
 extern bool vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, slp_instance,
   stmt_vector_for_cost *);
 extern bool vectorizable_condition (stmt_vec_info, gimple_stmt_iterator *,
@@ -1554,6 +1558,7 @@ extern gcall *vect_gen_while (tree, tree
 extern tree vect_gen_while_not (gimple_seq *, tree, tree, tree);
 extern bool vect_get_vector_types_for_stmt (stmt_vec_info, tree *, tree *);
 extern tree vect_get_mask_type_for_stmt (stmt_vec_info);
+extern void vect_remove_dead_scalar_stmts (vec_info *);
 
 /* In tree-vect-data-refs.c.  */
 extern bool vect_can_force_dr_alignment_p (const_tree, unsigned int);
Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   2018-07-30 12:32:42.790390350 +0100
+++ gcc/tree-vectorizer.c   2018-07-30 12:32:46.658356275 +0100
@@ -653,8 +653,13 @@ 

[09/11] Add a vec_basic_block structure

2018-07-30 Thread Richard Sandiford
This patch adds a vec_basic_block that records the scalar phis and
scalar statements that we need to vectorise.  This is a slight
simplification in its own right, since it avoids unnecessary statement
lookups and shaves >50 LOC.  But the main reason for doing it is
to allow the final patch in the series to treat pattern statements
less specially.

Putting phis (which are logically parallel) and normal statements
(which are logically serial) into a single list might seem dangerous,
but I think in practice it should be fine.  Very little vectoriser
code needs to handle the parallel nature of phis specially, and code
that does can still do so.  Having a single list simplifies code that
wants to look at every scalar phi or stmt in isolation.
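
The single list described above is an intrusive doubly-linked list with
three operations.  The following is a toy model in C, assuming nothing
beyond what the class declaration in the patch shows; toy_stmt and
toy_block are hypothetical stand-ins for stmt_vec_info and
vec_basic_block, and the function bodies are illustrative rather than
copied from tree-vectorizer.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for stmt_vec_info and vec_basic_block.  */
struct toy_stmt
{
  struct toy_stmt *prev, *next;
  int id;
};

struct toy_block
{
  struct toy_stmt *first, *last;
};

/* Append STMT to the end of BLOCK.  */
static void
toy_add_to_end (struct toy_block *block, struct toy_stmt *stmt)
{
  stmt->prev = block->last;
  stmt->next = NULL;
  if (block->last)
    block->last->next = stmt;
  else
    block->first = stmt;
  block->last = stmt;
}

/* Insert STMT immediately before BEFORE, which must already be in BLOCK.  */
static void
toy_add_before (struct toy_block *block, struct toy_stmt *stmt,
                struct toy_stmt *before)
{
  assert (before != NULL);
  stmt->next = before;
  stmt->prev = before->prev;
  if (before->prev)
    before->prev->next = stmt;
  else
    block->first = stmt;
  before->prev = stmt;
}

/* Unlink STMT from BLOCK and clear its links.  */
static void
toy_remove (struct toy_block *block, struct toy_stmt *stmt)
{
  if (stmt->prev)
    stmt->prev->next = stmt->next;
  else
    block->first = stmt->next;
  if (stmt->next)
    stmt->next->prev = stmt->prev;
  else
    block->last = stmt->prev;
  stmt->prev = stmt->next = NULL;
}
```

Walking from first through next visits every entry exactly once, which is
the loop that FOR_EACH_VEC_BB_STMT expands to in the patch.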


2018-07-30  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_basic_block): New structure.
(vec_info::blocks, _stmt_vec_info::block, _stmt_vec_info::prev)
(_stmt_vec_info::next): New member variables.
(FOR_EACH_VEC_BB_STMT, FOR_EACH_VEC_BB_STMT_REVERSE): New macros.
(vec_basic_block::vec_basic_block): New function.
* tree-vectorizer.c (vec_basic_block::add_to_end): Likewise.
(vec_basic_block::add_before): Likewise.
(vec_basic_block::remove): Likewise.
(vec_info::~vec_info): Free the vec_basic_blocks.
(vec_info::remove_stmt): Remove the statement from the containing
vec_basic_block.
* tree-vect-patterns.c (vect_determine_precisions)
(vect_pattern_recog): Iterate over vec_basic_blocks.
* tree-vect-loop.c (vect_determine_vectorization_factor)
(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
(vect_analyze_loop_operations, vect_transform_loop): Likewise.
(_loop_vec_info::_loop_vec_info): Construct vec_basic_blocks.
* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise.
(vect_detect_hybrid_slp): Iterate over vec_basic_blocks.
* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
(vect_finish_replace_stmt, vectorizable_condition): Remove the original
statement from the containing block.
(hoist_defs_of_uses): Likewise the statement that we're hoisting.

Index: gcc/tree-vectorizer.h
===
*** gcc/tree-vectorizer.h   2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vectorizer.h   2018-07-30 12:43:34.508651861 +0100
*** #define SLP_TREE_LOAD_PERMUTATION(S)
*** 171,177 
--- 171,200 
  #define SLP_TREE_TWO_OPERATORS(S)  (S)->two_operators
  #define SLP_TREE_DEF_TYPE(S)   (S)->def_type
  
+ /* Information about the phis and statements in a block that we're trying
+to vectorize, in their original order.  */
+ class vec_basic_block
+ {
+ public:
+   vec_basic_block (basic_block);
+ 
+   void add_to_end (stmt_vec_info);
+   void add_before (stmt_vec_info, stmt_vec_info);
+   void remove (stmt_vec_info);
+ 
+   basic_block bb () const { return m_bb; }
+   stmt_vec_info first () const { return m_first; }
+   stmt_vec_info last () const { return m_last; }
+ 
+ private:
+   /* The block itself.  */
+   basic_block m_bb;
  
+   /* The first and last statements in the block, forming a double-linked list.
+  The list includes both phis and true statements.  */
+   stmt_vec_info m_first;
+   stmt_vec_info m_last;
+ };
  
  /* Describes two objects whose addresses must be unequal for the vectorized
 loop to be valid.  */
*** struct vec_info {
*** 249,254 
--- 272,280 
/* Cost data used by the target cost model.  */
void *target_cost_data;
  
+   /* The basic blocks in the vectorization region.  */
+   auto_vec blocks;
+ 
  private:
stmt_vec_info new_stmt_vec_info (gimple *stmt);
void set_vinfo_for_stmt (gimple *, stmt_vec_info);
*** struct dr_vec_info {
*** 776,781 
--- 802,812 
  typedef struct data_reference *dr_p;
  
  struct _stmt_vec_info {
+   /* The block to which the statement belongs, or null if none.  */
+   vec_basic_block *block;
+ 
+   /* Link chains for the previous and next statements in BLOCK.  */
+   stmt_vec_info prev, next;
  
enum stmt_vec_info_type type;
  
*** #define VECT_SCALAR_BOOLEAN_TYPE_P(TYPE)
*** 1072,1077 
--- 1103,1129 
 && TYPE_PRECISION (TYPE) == 1  \
 && TYPE_UNSIGNED (TYPE)))
  
+ /* Make STMT_INFO iterate over each statement in vec_basic_block VEC_BB
+in forward order.  */
+ 
+ #define FOR_EACH_VEC_BB_STMT(VEC_BB, STMT_INFO) \
+   for (stmt_vec_info STMT_INFO = (VEC_BB)->first (); STMT_INFO; \
+STMT_INFO = STMT_INFO->next)
+ 
+ /* Make STMT_INFO iterate over each statement in vec_basic_block VEC_BB
+in backward order.  */
+ 
+ #define FOR_EACH_VEC_BB_STMT_REVERSE(VEC_BB, STMT_INFO) \
+   for (stmt_vec_info STMT_INFO = (VEC_BB)->last (); STMT_INFO; \
+STMT_INFO = STMT_INFO->prev)
+ 
+ /* 

[08/11] Make hoist_defs_of_uses use vec_info::lookup_def

2018-07-30 Thread Richard Sandiford
This patch makes hoist_defs_of_uses use vec_info::lookup_def instead of:

  if (!gimple_nop_p (def_stmt)
  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))

to test whether a feeding scalar statement needs to be hoisted out
of the vectorised loop.  It isn't worth doing in its own right,
but it's a prerequisite for the next patch, which needs to update
the stmt_vec_infos of the hoisted statements.
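
The hoistability test itself can be sketched without any GCC data
structures.  In the toy model below, the in_region flag stands for
"vinfo->lookup_def returns a non-null stmt_vec_info"; toy_def and
toy_can_hoist_defs are hypothetical names, and the model covers only the
first loop of hoist_defs_of_uses, not the actual statement movement.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_OPS 4

/* Toy definition statement.  IN_REGION models a successful lookup_def.  */
struct toy_def
{
  bool in_region;
  bool is_phi;
  size_t num_ops;
  const struct toy_def *ops[TOY_MAX_OPS];
};

/* Return true if every in-region definition feeding STMT could be hoisted
   without recursion: it must not be a phi, and none of its own operands
   may be defined inside the region.  */
static bool
toy_can_hoist_defs (const struct toy_def *stmt)
{
  for (size_t i = 0; i < stmt->num_ops; i++)
    {
      const struct toy_def *def = stmt->ops[i];
      if (!def->in_region)
        continue;
      if (def->is_phi)
        return false;
      for (size_t j = 0; j < def->num_ops; j++)
        if (def->ops[j]->in_region)
          return false;
    }
  return true;
}
```

A definition whose operands all come from outside the region passes the
test; one that depends on a phi or on a second in-region definition does
not.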


2018-07-30  Richard Sandiford  

gcc/
* tree-vect-stmts.c (hoist_defs_of_uses): Use vec_info::lookup_def
instead of gimple_nop_p and flow_bb_inside_loop_p to decide
whether a statement needs to be hoisted.

Index: gcc/tree-vect-stmts.c
===
*** gcc/tree-vect-stmts.c   2018-07-30 12:42:35.633169005 +0100
--- gcc/tree-vect-stmts.c   2018-07-30 12:42:35.629169040 +0100
*** permute_vec_elements (tree x, tree y, tr
*** 7322,7370 
  static bool
  hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
  {
ssa_op_iter i;
tree op;
bool any = false;
  
FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
! {
!   gimple *def_stmt = SSA_NAME_DEF_STMT (op);
!   if (!gimple_nop_p (def_stmt)
! && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
!   {
! /* Make sure we don't need to recurse.  While we could do
!so in simple cases when there are more complex use webs
!we don't have an easy way to preserve stmt order to fulfil
!dependencies within them.  */
! tree op2;
! ssa_op_iter i2;
! if (gimple_code (def_stmt) == GIMPLE_PHI)
return false;
! FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt, i2, SSA_OP_USE)
!   {
! gimple *def_stmt2 = SSA_NAME_DEF_STMT (op2);
! if (!gimple_nop_p (def_stmt2)
! && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt2)))
!   return false;
!   }
! any = true;
!   }
! }
  
if (!any)
  return true;
  
FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
! {
!   gimple *def_stmt = SSA_NAME_DEF_STMT (op);
!   if (!gimple_nop_p (def_stmt)
! && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
!   {
! gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt);
! gsi_remove (&gsi, false);
! gsi_insert_on_edge_immediate (loop_preheader_edge (loop), def_stmt);
!   }
! }
  
return true;
  }
--- 7322,7360 
  static bool
  hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
  {
+   vec_info *vinfo = stmt_info->vinfo;
ssa_op_iter i;
tree op;
bool any = false;
  
FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
! if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))
!   {
!   /* Make sure we don't need to recurse.  While we could do
!  so in simple cases when there are more complex use webs
!  we don't have an easy way to preserve stmt order to fulfil
!  dependencies within them.  */
!   tree op2;
!   ssa_op_iter i2;
!   if (gimple_code (def_stmt_info->stmt) == GIMPLE_PHI)
! return false;
!   FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt_info->stmt, i2, SSA_OP_USE)
! if (vinfo->lookup_def (op2))
return false;
!   any = true;
!   }
  
if (!any)
  return true;
  
FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
! if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))
!   {
!   gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt_info->stmt);
!   gsi_remove (&gsi, false);
!   gsi_insert_on_edge_immediate (loop_preheader_edge (loop),
! def_stmt_info->stmt);
!   }
  
return true;
  }


[07/11] Use single basic block array in loop_vec_info

2018-07-30 Thread Richard Sandiford
_loop_vec_info::_loop_vec_info used get_loop_array to get the
order of the blocks when creating stmt_vec_infos, but then used
dfs_enumerate_from to get the order of the blocks that the rest
of the vectoriser uses.  We should be able to use that order
for creating stmt_vec_infos too.


2018-07-30  Richard Sandiford  

gcc/
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Use the
result of dfs_enumerate_from when constructing stmt_vec_infos,
instead of additionally calling get_loop_body.

Index: gcc/tree-vect-loop.c
===
*** gcc/tree-vect-loop.c2018-07-30 12:40:59.366015643 +0100
--- gcc/tree-vect-loop.c2018-07-30 12:40:59.362015678 +0100
*** _loop_vec_info::_loop_vec_info (struct l
*** 834,844 
  scalar_loop (NULL),
  orig_loop_info (NULL)
  {
!   /* Create/Update stmt_info for all stmts in the loop.  */
!   basic_block *body = get_loop_body (loop);
!   for (unsigned int i = 0; i < loop->num_nodes; i++)
  {
!   basic_block bb = body[i];
gimple_stmt_iterator si;
  
for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
--- 834,851 
  scalar_loop (NULL),
  orig_loop_info (NULL)
  {
!   /* CHECKME: We want to visit all BBs before their successors (except for
!  latch blocks, for which this assertion wouldn't hold).  In the simple
!  case of the loop forms we allow, a dfs order of the BBs would the same
!  as reversed postorder traversal, so we are safe.  */
! 
!   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
! bbs, loop->num_nodes, loop);
!   gcc_assert (nbbs == loop->num_nodes);
! 
!   for (unsigned int i = 0; i < nbbs; i++)
  {
!   basic_block bb = bbs[i];
gimple_stmt_iterator si;
  
for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
*** _loop_vec_info::_loop_vec_info (struct l
*** 855,870 
  add_stmt (stmt);
}
  }
-   free (body);
- 
-   /* CHECKME: We want to visit all BBs before their successors (except for
-  latch blocks, for which this assertion wouldn't hold).  In the simple
-  case of the loop forms we allow, a dfs order of the BBs would the same
-  as reversed postorder traversal, so we are safe.  */
- 
-   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
- bbs, loop->num_nodes, loop);
-   gcc_assert (nbbs == loop->num_nodes);
  }
  
  /* Free all levels of MASKS.  */
--- 862,867 


[06/11] Handle VMAT_INVARIANT separately

2018-07-30 Thread Richard Sandiford
Invariant loads were handled as a variation on the code for contiguous
loads.  We detected whether they were invariant or not as a byproduct of
creating the vector pointer ivs: vect_create_data_ref_ptr passed back an
inv_p to say whether the pointer was invariant.

But vectorised invariant loads just keep the original scalar load,
so this meant that detecting invariant loads had the side-effect of
creating an unwanted vector pointer iv.  The placement of the code
also meant that we'd create a vector load and then not use the result.
In principle this is wrong code, since there's no guarantee that there's
a vector's worth of accessible data at that address, but we rely on DCE
to get rid of the load before any harm is done.

E.g., for an invariant load in an inner loop (which seems like the more
common use case for this code), we'd create:

   vectp_a.6_52 =  + 4;

   # vectp_a.5_53 = PHI 

   # vectp_a.5_55 = PHI 

   vect_next_a_11.7_57 = MEM[(int *)vectp_a.5_55];
   next_a_11 = a[_1];
   vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};

   vectp_a.5_56 = vectp_a.5_55 + 4;

   vectp_a.5_54 = vectp_a.5_53 + 0;

whereas all we want is:

   next_a_11 = a[_1];
   vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};

This patch moves the handling to its own block and makes
vect_create_data_ref_ptr assert (when creating a full iv) that the
address isn't invariant.
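
In scalar terms, the desired two-statement form amounts to one scalar load
followed by a broadcast.  The sketch below is a toy C model of that shape,
assuming four lanes as in the dump above; toy_load_and_broadcast and
TOY_LANES are hypothetical names, not vectoriser code.

```c
#include <stddef.h>

#define TOY_LANES 4

/* Toy model of an invariant load feeding a vector: read the scalar once
   and copy it into every lane.  Only one element at ADDR is accessed.  */
static void
toy_load_and_broadcast (int out[TOY_LANES], const int *addr)
{
  int scalar = *addr;
  for (size_t i = 0; i < TOY_LANES; i++)
    out[i] = scalar;
}
```

A full vector load from the same address would read TOY_LANES elements,
which is the access that is not guaranteed to be valid.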

The ncopies handling is unfortunate, but a preexisting issue.
Richi's suggestion of using a vector of vector statements would
let us reuse one statement for all copies.


2018-07-30  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_create_data_ref_ptr): Remove inv_p
parameter.
* tree-vect-data-refs.c (vect_create_data_ref_ptr): Likewise.
When creating an iv, assert that the step is not known to be zero.
(vect_setup_realignment): Update call accordingly.
* tree-vect-stmts.c (vectorizable_store): Likewise.
(vectorizable_load): Likewise.  Handle VMAT_INVARIANT separately.

Index: gcc/tree-vectorizer.h
===
*** gcc/tree-vectorizer.h   2018-07-30 12:32:29.586506669 +0100
--- gcc/tree-vectorizer.h   2018-07-30 12:40:13.0 +0100
*** extern bool vect_analyze_data_refs (vec_
*** 1527,1533 
  extern void vect_record_base_alignments (vec_info *);
  extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,
  tree *, gimple_stmt_iterator *,
! gimple **, bool, bool *,
  tree = NULL_TREE, tree = NULL_TREE);
  extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,
 stmt_vec_info, tree);
--- 1527,1533 
  extern void vect_record_base_alignments (vec_info *);
  extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,
  tree *, gimple_stmt_iterator *,
! gimple **, bool,
  tree = NULL_TREE, tree = NULL_TREE);
  extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,
 stmt_vec_info, tree);
Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c   2018-07-30 12:32:26.214536374 +0100
--- gcc/tree-vect-data-refs.c   2018-07-30 12:32:32.546480596 +0100
*** vect_create_addr_base_for_vector_ref (st
*** 4674,4689 
  
Return the increment stmt that updates the pointer in PTR_INCR.
  
!3. Set INV_P to true if the access pattern of the data reference in the
!   vectorized loop is invariant.  Set it to false otherwise.
! 
!4. Return the pointer.  */
  
  tree
  vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,
  struct loop *at_loop, tree offset,
  tree *initial_address, gimple_stmt_iterator *gsi,
! gimple **ptr_incr, bool only_init, bool *inv_p,
  tree byte_offset, tree iv_step)
  {
const char *base_name;
--- 4674,4686 
  
Return the increment stmt that updates the pointer in PTR_INCR.
  
!3. Return the pointer.  */
  
  tree
  vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,
  struct loop *at_loop, tree offset,
  tree *initial_address, gimple_stmt_iterator *gsi,
! gimple **ptr_incr, bool only_init,
  tree byte_offset, tree iv_step)
  {
const char *base_name;
*** vect_create_data_ref_ptr (stmt_vec_info
*** 4705,4711 
bool insert_after;
tree indx_before_incr, indx_after_incr;
gimple *incr;
-   tree step;
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
  
gcc_assert 

[05/11] Add a vect_stmt_to_vectorize helper function

2018-07-30 Thread Richard Sandiford
This patch adds a helper that does the opposite of vect_orig_stmt:
go from the original scalar statement to the statement that should
actually be vectorised.

The last two hunks of vectorizable_reduction can compare against the
helper's result directly because reduc_stmt_info (first hunk) and
stmt_info (second hunk) are already pattern statements if appropriate.
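
As a rough model of how the two helpers relate (a sketch under assumptions, not the GCC definitions): the struct, its field names and link_pattern below are hypothetical stand-ins for stmt_vec_info and the STMT_VINFO_* accessors.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for GCC's stmt_vec_info.
struct stmt_info_sketch
{
  bool in_pattern_p = false;    // scalar statement replaced by a pattern
  bool pattern_stmt_p = false;  // this statement is the pattern replacement
  stmt_info_sketch *related_stmt = nullptr;
};

// Pattern statement -> original scalar statement (identity otherwise).
inline stmt_info_sketch *orig_stmt(stmt_info_sketch *s)
{
  return s->pattern_stmt_p ? s->related_stmt : s;
}

// Original scalar statement -> statement to vectorise (identity otherwise).
inline stmt_info_sketch *stmt_to_vectorize(stmt_info_sketch *s)
{
  return s->in_pattern_p ? s->related_stmt : s;
}

// Link a scalar statement and its pattern replacement to each other.
inline void link_pattern(stmt_info_sketch &scalar, stmt_info_sketch &pattern)
{
  scalar.in_pattern_p = true;
  scalar.related_stmt = &pattern;
  pattern.pattern_stmt_p = true;
  pattern.related_stmt = &scalar;
}
```

In this model, calling stmt_to_vectorize on something that is already a pattern statement returns it unchanged, so comparing its result with an "already pattern if appropriate" statement covers both of the cases that the old code tested separately.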


2018-07-30  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_stmt_to_vectorize): New function.
* tree-vect-loop.c (vect_update_vf_for_slp): Use it.
(vectorizable_reduction): Likewise.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
(vect_detect_hybrid_slp_stmts): Likewise.
* tree-vect-stmts.c (vect_is_simple_use): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-30 12:32:26.218536339 +0100
+++ gcc/tree-vectorizer.h   2018-07-30 12:32:29.586506669 +0100
@@ -1131,6 +1131,17 @@ vect_orig_stmt (stmt_vec_info stmt_info)
   return stmt_info;
 }
 
+/* If STMT_INFO has been replaced by a pattern statement, return the
+   replacement statement, otherwise return STMT_INFO itself.  */
+
+inline stmt_vec_info
+vect_stmt_to_vectorize (stmt_vec_info stmt_info)
+{
+  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
+return STMT_VINFO_RELATED_STMT (stmt_info);
+  return stmt_info;
+}
+
 /* Return true if BB is a loop header.  */
 
 static inline bool
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c    2018-07-30 12:32:26.214536374 +0100
+++ gcc/tree-vect-loop.c    2018-07-30 12:32:29.586506669 +0100
@@ -1424,9 +1424,7 @@ vect_update_vf_for_slp (loop_vec_info lo
   gsi_next (&si))
{
  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
- if (STMT_VINFO_IN_PATTERN_P (stmt_info)
- && STMT_VINFO_RELATED_STMT (stmt_info))
-   stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
+ stmt_info = vect_stmt_to_vectorize (stmt_info);
  if ((STMT_VINFO_RELEVANT_P (stmt_info)
   || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
  && !PURE_SLP_STMT (stmt_info))
@@ -6111,8 +6109,7 @@ vectorizable_reduction (stmt_vec_info st
return true;
 
   stmt_vec_info reduc_stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
-  if (STMT_VINFO_IN_PATTERN_P (reduc_stmt_info))
-   reduc_stmt_info = STMT_VINFO_RELATED_STMT (reduc_stmt_info);
+  reduc_stmt_info = vect_stmt_to_vectorize (reduc_stmt_info);
 
   if (STMT_VINFO_VEC_REDUCTION_TYPE (reduc_stmt_info)
  == EXTRACT_LAST_REDUCTION)
@@ -6145,8 +6142,7 @@ vectorizable_reduction (stmt_vec_info st
   if (ncopies > 1
  && STMT_VINFO_RELEVANT (reduc_stmt_info) <= vect_used_only_live
  && (use_stmt_info = loop_vinfo->lookup_single_use (phi_result))
- && (use_stmt_info == reduc_stmt_info
- || STMT_VINFO_RELATED_STMT (use_stmt_info) == reduc_stmt_info))
+ && vect_stmt_to_vectorize (use_stmt_info) == reduc_stmt_info)
single_defuse_cycle = true;
 
   /* Create the destination vector  */
@@ -6915,8 +6911,7 @@ vectorizable_reduction (stmt_vec_info st
   if (ncopies > 1
   && (STMT_VINFO_RELEVANT (stmt_info) <= vect_used_only_live)
   && (use_stmt_info = loop_vinfo->lookup_single_use (reduc_phi_result))
-  && (use_stmt_info == stmt_info
- || STMT_VINFO_RELATED_STMT (use_stmt_info) == stmt_info))
+  && vect_stmt_to_vectorize (use_stmt_info) == stmt_info)
 {
   single_defuse_cycle = true;
   epilog_copies = 1;
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-30 12:32:26.218536339 +0100
+++ gcc/tree-vect-slp.c 2018-07-30 12:32:29.586506669 +0100
@@ -1969,11 +1969,7 @@ vect_analyze_slp_instance (vec_info *vin
   /* Collect the stores and store them in SLP_TREE_SCALAR_STMTS.  */
   while (next_info)
 {
- if (STMT_VINFO_IN_PATTERN_P (next_info)
- && STMT_VINFO_RELATED_STMT (next_info))
-   scalar_stmts.safe_push (STMT_VINFO_RELATED_STMT (next_info));
- else
-   scalar_stmts.safe_push (next_info);
+ scalar_stmts.safe_push (vect_stmt_to_vectorize (next_info));
  next_info = DR_GROUP_NEXT_ELEMENT (next_info);
 }
 }
@@ -1983,11 +1979,7 @@ vect_analyze_slp_instance (vec_info *vin
 SLP_TREE_SCALAR_STMTS.  */
   while (next_info)
 {
- if (STMT_VINFO_IN_PATTERN_P (next_info)
- && STMT_VINFO_RELATED_STMT (next_info))
-   scalar_stmts.safe_push (STMT_VINFO_RELATED_STMT (next_info));
- else
-   scalar_stmts.safe_push (next_info);
+ scalar_stmts.safe_push (vect_stmt_to_vectorize (next_info));
  next_info = REDUC_GROUP_NEXT_ELEMENT (next_info);
 }
   /* Mark the first element of the 

[04/11] Add a vect_orig_stmt helper function

2018-07-30 Thread Richard Sandiford
This patch just adds a helper function for going from a potential
pattern statement to the original scalar statement.


2018-07-30  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_orig_stmt): New function.
* tree-vect-data-refs.c (vect_preserves_scalar_order_p): Use it.
* tree-vect-loop.c (vect_model_reduction_cost): Likewise.
(vect_create_epilog_for_reduction): Likewise.
(vectorizable_live_operation): Likewise.
* tree-vect-slp.c (vect_find_last_scalar_stmt_in_slp): Likewise.
(vect_detect_hybrid_slp_stmts, vect_schedule_slp): Likewise.
* tree-vect-stmts.c (vectorizable_call): Likewise.
(vectorizable_simd_clone_call, vect_remove_stores): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-30 12:32:22.718567174 +0100
+++ gcc/tree-vectorizer.h   2018-07-30 12:32:26.218536339 +0100
@@ -1120,6 +1120,17 @@ is_pattern_stmt_p (stmt_vec_info stmt_in
   return stmt_info->pattern_stmt_p;
 }
 
+/* If STMT_INFO is a pattern statement, return the statement that it
+   replaces, otherwise return STMT_INFO itself.  */
+
+inline stmt_vec_info
+vect_orig_stmt (stmt_vec_info stmt_info)
+{
+  if (is_pattern_stmt_p (stmt_info))
+return STMT_VINFO_RELATED_STMT (stmt_info);
+  return stmt_info;
+}
+
 /* Return true if BB is a loop header.  */
 
 static inline bool
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-30 12:32:08.934688600 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-30 12:32:26.214536374 +0100
@@ -214,10 +214,8 @@ vect_preserves_scalar_order_p (dr_vec_in
  (but could happen later) while reads will happen no later than their
  current position (but could happen earlier).  Reordering is therefore
  only possible if the first access is a write.  */
-  if (is_pattern_stmt_p (stmtinfo_a))
-stmtinfo_a = STMT_VINFO_RELATED_STMT (stmtinfo_a);
-  if (is_pattern_stmt_p (stmtinfo_b))
-stmtinfo_b = STMT_VINFO_RELATED_STMT (stmtinfo_b);
+  stmtinfo_a = vect_orig_stmt (stmtinfo_a);
+  stmtinfo_b = vect_orig_stmt (stmtinfo_b);
   stmt_vec_info earlier_stmt_info = get_earlier_stmt (stmtinfo_a, stmtinfo_b);
   return !DR_IS_WRITE (STMT_VINFO_DATA_REF (earlier_stmt_info));
 }
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c    2018-07-30 12:32:22.714567210 +0100
+++ gcc/tree-vect-loop.c    2018-07-30 12:32:26.214536374 +0100
@@ -3814,10 +3814,7 @@ vect_model_reduction_cost (stmt_vec_info
 
   vectype = STMT_VINFO_VECTYPE (stmt_info);
   mode = TYPE_MODE (vectype);
-  stmt_vec_info orig_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
-
-  if (!orig_stmt_info)
-orig_stmt_info = stmt_info;
+  stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
 
   code = gimple_assign_rhs_code (orig_stmt_info->stmt);
 
@@ -4738,13 +4735,8 @@ vect_create_epilog_for_reduction (vecstmt);
   group_size = 1;
 }
@@ -7898,10 +7886,8 @@ vectorizable_live_operation (stmt_vec_in
   return true;
 }
 
-  /* If stmt has a related stmt, then use that for getting the lhs.  */
-  gimple *stmt = (is_pattern_stmt_p (stmt_info)
- ? STMT_VINFO_RELATED_STMT (stmt_info)->stmt
- : stmt_info->stmt);
+  /* Use the lhs of the original scalar statement.  */
+  gimple *stmt = vect_orig_stmt (stmt_info)->stmt;
 
   lhs = (is_a <gphi *> (stmt)) ? gimple_phi_result (stmt)
: gimple_get_lhs (stmt);
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-30 12:32:22.714567210 +0100
+++ gcc/tree-vect-slp.c 2018-07-30 12:32:26.218536339 +0100
@@ -1848,8 +1848,7 @@ vect_find_last_scalar_stmt_in_slp (slp_t
 
   for (int i = 0; SLP_TREE_SCALAR_STMTS (node).iterate (i, &stmt_vinfo); i++)
 {
-  if (is_pattern_stmt_p (stmt_vinfo))
-   stmt_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
+  stmt_vinfo = vect_orig_stmt (stmt_vinfo);
   last = last ? get_later_stmt (stmt_vinfo, last) : stmt_vinfo;
 }
 
@@ -2314,10 +2313,7 @@ vect_detect_hybrid_slp_stmts (slp_tree n
   gcc_checking_assert (PURE_SLP_STMT (stmt_vinfo));
   /* If we get a pattern stmt here we have to use the LHS of the
  original stmt for immediate uses.  */
-  gimple *stmt = stmt_vinfo->stmt;
-  if (! STMT_VINFO_IN_PATTERN_P (stmt_vinfo)
- && STMT_VINFO_RELATED_STMT (stmt_vinfo))
-   stmt = STMT_VINFO_RELATED_STMT (stmt_vinfo)->stmt;
+  gimple *stmt = vect_orig_stmt (stmt_vinfo)->stmt;
   tree def;
   if (gimple_code (stmt) == GIMPLE_PHI)
def = gimple_phi_result (stmt);
@@ -4087,8 +4083,7 @@ vect_schedule_slp (vec_info *vinfo)
  if (!STMT_VINFO_DATA_REF (store_info))
break;
 
- if (is_pattern_stmt_p (store_info))
-   store_info = 

[03/11] Remove vect_transform_stmt grouped_store argument

2018-07-30 Thread Richard Sandiford
Nothing now uses the grouped_store value passed back by
vect_transform_stmt, so we might as well remove it.


2018-07-30  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_transform_stmt): Remove grouped_store
argument.
* tree-vect-stmts.c (vect_transform_stmt): Likewise.
* tree-vect-loop.c (vect_transform_loop_stmt): Update call accordingly.
(vect_transform_loop): Likewise.
* tree-vect-slp.c (vect_schedule_slp_instance): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-30 12:32:19.366596715 +0100
+++ gcc/tree-vectorizer.h   2018-07-30 12:32:22.718567174 +0100
@@ -1459,7 +1459,7 @@ extern tree vect_init_vector (stmt_vec_i
   gimple_stmt_iterator *);
 extern tree vect_get_vec_def_for_stmt_copy (vec_info *, tree);
 extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
- bool *, slp_tree, slp_instance);
+slp_tree, slp_instance);
 extern void vect_remove_stores (stmt_vec_info);
 extern bool vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, slp_instance,
   stmt_vector_for_cost *);
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-30 12:32:09.114687014 +0100
+++ gcc/tree-vect-stmts.c   2018-07-30 12:32:22.718567174 +0100
@@ -9662,8 +9662,7 @@ vect_analyze_stmt (stmt_vec_info stmt_in
 
 bool
 vect_transform_stmt (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
-bool *grouped_store, slp_tree slp_node,
- slp_instance slp_node_instance)
+slp_tree slp_node, slp_instance slp_node_instance)
 {
   vec_info *vinfo = stmt_info->vinfo;
   bool is_store = false;
@@ -9727,7 +9726,6 @@ vect_transform_stmt (stmt_vec_info stmt_
 last store in the chain is reached.  Store stmts before the last
 one are skipped, and there vec_stmt_info shouldn't be freed
 meanwhile.  */
- *grouped_store = true;
  stmt_vec_info group_info = DR_GROUP_FIRST_ELEMENT (stmt_info);
  if (DR_GROUP_STORE_COUNT (group_info) == DR_GROUP_SIZE (group_info))
is_store = true;
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c    2018-07-30 12:32:16.190624704 +0100
+++ gcc/tree-vect-loop.c    2018-07-30 12:32:22.714567210 +0100
@@ -8243,8 +8243,7 @@ vect_transform_loop_stmt (loop_vec_info
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location, "transform statement.\n");
 
-  bool grouped_store = false;
-  if (vect_transform_stmt (stmt_info, gsi, &grouped_store, NULL, NULL))
+  if (vect_transform_stmt (stmt_info, gsi, NULL, NULL))
 *seen_store = stmt_info;
 }
 
@@ -8425,7 +8424,7 @@ vect_transform_loop (loop_vec_info loop_
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location, "transform phi.\n");
- vect_transform_stmt (stmt_info, NULL, NULL, NULL, NULL);
+ vect_transform_stmt (stmt_info, NULL, NULL, NULL);
}
}
 
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-30 12:32:19.366596715 +0100
+++ gcc/tree-vect-slp.c 2018-07-30 12:32:22.714567210 +0100
@@ -3853,7 +3853,6 @@ vect_transform_slp_perm_load (slp_tree n
 vect_schedule_slp_instance (slp_tree node, slp_instance instance,
scalar_stmts_to_slp_tree_map_t *bst_map)
 {
-  bool grouped_store;
   gimple_stmt_iterator si;
   stmt_vec_info stmt_info;
   unsigned int group_size;
@@ -3945,11 +3944,11 @@ vect_schedule_slp_instance (slp_tree nod
  vec v1;
  unsigned j;
  tree tmask = NULL_TREE;
- vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
+ vect_transform_stmt (stmt_info, &si, node, instance);
  v0 = SLP_TREE_VEC_STMTS (node).copy ();
  SLP_TREE_VEC_STMTS (node).truncate (0);
  gimple_assign_set_rhs_code (stmt, ocode);
- vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
+ vect_transform_stmt (stmt_info, &si, node, instance);
  gimple_assign_set_rhs_code (stmt, code0);
  v1 = SLP_TREE_VEC_STMTS (node).copy ();
  SLP_TREE_VEC_STMTS (node).truncate (0);
@@ -3994,7 +3993,7 @@ vect_schedule_slp_instance (slp_tree nod
  return;
}
 }
-  vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
+  vect_transform_stmt (stmt_info, &si, node, instance);
 
   /* Restore stmt def-types.  */
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)


[02/11] Remove vect_schedule_slp return value

2018-07-30 Thread Richard Sandiford
Nothing now uses the vect_schedule_slp return value, so it's not worth
propagating the value through vect_schedule_slp_instance.


2018-07-30  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_schedule_slp): Return void.
* tree-vect-slp.c (vect_schedule_slp_instance): Likewise.
(vect_schedule_slp): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-30 12:32:09.114687014 +0100
+++ gcc/tree-vectorizer.h   2018-07-30 12:32:19.366596715 +0100
@@ -1575,7 +1575,7 @@ extern bool vect_transform_slp_perm_load
  gimple_stmt_iterator *, poly_uint64,
  slp_instance, bool, unsigned *);
 extern bool vect_slp_analyze_operations (vec_info *);
-extern bool vect_schedule_slp (vec_info *);
+extern void vect_schedule_slp (vec_info *);
 extern bool vect_analyze_slp (vec_info *, unsigned);
 extern bool vect_make_slp_decision (loop_vec_info);
 extern void vect_detect_hybrid_slp (loop_vec_info);
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-30 12:32:09.026687790 +0100
+++ gcc/tree-vect-slp.c 2018-07-30 12:32:19.366596715 +0100
@@ -3849,11 +3849,11 @@ vect_transform_slp_perm_load (slp_tree n
 
 /* Vectorize SLP instance tree in postorder.  */
 
-static bool
+static void
 vect_schedule_slp_instance (slp_tree node, slp_instance instance,
scalar_stmts_to_slp_tree_map_t *bst_map)
 {
-  bool grouped_store, is_store;
+  bool grouped_store;
   gimple_stmt_iterator si;
   stmt_vec_info stmt_info;
   unsigned int group_size;
@@ -3862,14 +3862,14 @@ vect_schedule_slp_instance (slp_tree nod
   slp_tree child;
 
   if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
-return false;
+return;
 
   /* See if we have already vectorized the same set of stmts and reuse their
  vectorized stmts.  */
   if (slp_tree *leader = bst_map->get (SLP_TREE_SCALAR_STMTS (node)))
 {
   SLP_TREE_VEC_STMTS (node).safe_splice (SLP_TREE_VEC_STMTS (*leader));
-  return false;
+  return;
 }
 
   bst_map->put (SLP_TREE_SCALAR_STMTS (node).copy (), node);
@@ -3991,11 +3991,10 @@ vect_schedule_slp_instance (slp_tree nod
}
  v0.release ();
  v1.release ();
- return false;
+ return;
}
 }
-  is_store = vect_transform_stmt (stmt_info, &si, &grouped_store, node,
- instance);
+  vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
 
   /* Restore stmt def-types.  */
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
@@ -4005,8 +4004,6 @@ vect_schedule_slp_instance (slp_tree nod
FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (child), j, child_stmt_info)
  STMT_VINFO_DEF_TYPE (child_stmt_info) = vect_internal_def;
   }
-
-  return is_store;
 }
 
 /* Replace scalar calls from SLP node NODE with setting of their lhs to zero.
@@ -4048,14 +4045,12 @@ vect_remove_slp_scalar_calls (slp_tree n
 
 /* Generate vector code for all SLP instances in the loop/basic block.  */
 
-bool
+void
 vect_schedule_slp (vec_info *vinfo)
 {
   vec<slp_instance> slp_instances;
   slp_instance instance;
   unsigned int i;
-  bool is_store = false;
-
 
   scalar_stmts_to_slp_tree_map_t *bst_map
 = new scalar_stmts_to_slp_tree_map_t ();
@@ -4063,8 +4058,8 @@ vect_schedule_slp (vec_info *vinfo)
   FOR_EACH_VEC_ELT (slp_instances, i, instance)
 {
   /* Schedule the tree of INSTANCE.  */
-  is_store = vect_schedule_slp_instance (SLP_INSTANCE_TREE (instance),
- instance, bst_map);
+  vect_schedule_slp_instance (SLP_INSTANCE_TREE (instance),
+ instance, bst_map);
   if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
  "vectorizing stmts using SLP.\n");
@@ -4099,6 +4094,4 @@ vect_schedule_slp (vec_info *vinfo)
  vinfo->remove_stmt (store_info);
 }
 }
-
-  return is_store;
 }


[01/11] Schedule SLP earlier

2018-07-30 Thread Richard Sandiford
vect_transform_loop used to call vect_schedule_slp lazily when it
came across the first SLP statement, but it seems easier to do it
before the main loop.


2018-07-30  Richard Sandiford  

gcc/
* tree-vect-loop.c (vect_transform_loop_stmt): Remove slp_scheduled
argument.
(vect_transform_loop): Update calls accordingly.  Schedule SLP
instances before the main loop, if any exist.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c    2018-07-30 12:32:15.0 +0100
+++ gcc/tree-vect-loop.c    2018-07-30 12:32:16.190624704 +0100
@@ -8199,14 +8199,12 @@ scale_profile_for_vect_loop (struct loop
 }
 
 /* Vectorize STMT_INFO if relevant, inserting any new instructions before GSI.
-   When vectorizing STMT_INFO as a store, set *SEEN_STORE to its stmt_vec_info.
-   *SLP_SCHEDULE is a running record of whether we have called
-   vect_schedule_slp.  */
+   When vectorizing STMT_INFO as a store, set *SEEN_STORE to its
+   stmt_vec_info.  */
 
 static void
 vect_transform_loop_stmt (loop_vec_info loop_vinfo, stmt_vec_info stmt_info,
- gimple_stmt_iterator *gsi,
- stmt_vec_info *seen_store, bool *slp_scheduled)
+ gimple_stmt_iterator *gsi, stmt_vec_info *seen_store)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
@@ -8237,24 +8235,10 @@ vect_transform_loop_stmt (loop_vec_info
dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
 }
 
-  /* SLP.  Schedule all the SLP instances when the first SLP stmt is
- reached.  */
-  if (slp_vect_type slptype = STMT_SLP_TYPE (stmt_info))
-{
-
-  if (!*slp_scheduled)
-   {
- *slp_scheduled = true;
-
- DUMP_VECT_SCOPE ("scheduling SLP instances");
-
- vect_schedule_slp (loop_vinfo);
-   }
-
-  /* Hybrid SLP stmts must be vectorized in addition to SLP.  */
-  if (slptype == pure_slp)
-   return;
-}
+  /* Pure SLP statements have already been vectorized.  We still need
+ to apply loop vectorization to hybrid SLP statements.  */
+  if (PURE_SLP_STMT (stmt_info))
+return;
 
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location, "transform statement.\n");
@@ -8284,7 +8268,6 @@ vect_transform_loop (loop_vec_info loop_
   tree niters_vector_mult_vf = NULL_TREE;
   poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   unsigned int lowest_vf = constant_lower_bound (vf);
-  bool slp_scheduled = false;
   gimple *stmt;
   bool check_profitability = false;
   unsigned int th;
@@ -8390,6 +8373,14 @@ vect_transform_loop (loop_vec_info loop_
 /* This will deal with any possible peeling.  */
 vect_prepare_for_masked_peels (loop_vinfo);
 
+  /* Schedule the SLP instances first, then handle loop vectorization
+ below.  */
+  if (!loop_vinfo->slp_instances.is_empty ())
+{
+  DUMP_VECT_SCOPE ("scheduling SLP instances");
+  vect_schedule_slp (loop_vinfo);
+}
+
   /* FORNOW: the vectorizer supports only loops which body consist
  of one basic block (header + empty latch). When the vectorizer will
  support more involved loop forms, the order by which the BBs are
@@ -8468,16 +8459,15 @@ vect_transform_loop (loop_vec_info loop_
  stmt_vec_info pat_stmt_info
= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
-   &si, &seen_store,
-   &slp_scheduled);
+   &si, &seen_store);
}
  stmt_vec_info pat_stmt_info
= STMT_VINFO_RELATED_STMT (stmt_info);
  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
-   &seen_store, &slp_scheduled);
+   &seen_store);
}
  vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
-   &seen_store, &slp_scheduled);
+   &seen_store);
}
  gsi_next (&si);
  if (seen_store)


[00/11] Add a vec_basic_block of scalar statements

2018-07-30 Thread Richard Sandiford
This series puts the statements that need to be vectorised into a
"vec_basic_block" structure of linked stmt_vec_infos, and then puts
pattern statements into this block rather than hanging them off the
original scalar statement.

Partly this is clean-up, since making pattern statements more like
first-class statements removes a lot of indirection.  The diffstat
for the series is:

 7 files changed, 691 insertions(+), 978 deletions(-)

It also makes it easier to do something approaching proper DCE
on the scalar code (patch 10).  However, the main motivation is
to allow the result of an earlier pattern statement to be reused
as the STMT_VINFO_RELATED_STMT for a later (non-pattern) statement.
I have two current uses for this:

(1) The way overwidening detection works means that we can sometimes
be left with sequences of the form:

  type1 narrowed = ... + ...;   // originally done in type2
  type2 extended = (type2) narrowed;
  type3 truncated = (type3) extended;

which cast_forwprop can simplify to:

  type1 narrowed = ... + ...;   // originally done in type2
  type3 truncated = (type3) narrowed;

But if type3 == type1, we really want to replace truncated
directly with narrowed.  The current representation doesn't
allow this.
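
To make case (1) concrete, here is a small sketch under assumed types (type1 == type3 == uint8_t, type2 == uint32_t; the function names are made up for the example).  It shows why "truncated" carries no information beyond "narrowed" when the outer types agree.

```cpp
#include <cassert>
#include <cstdint>

// type1 narrowed = ... + ...;   (the addition already done in type1)
inline std::uint8_t narrowed_sum(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>(a + b);
}

// The full three-statement sequence, ending in the type3 truncation.
inline std::uint8_t truncated_sum(std::uint8_t a, std::uint8_t b)
{
  std::uint8_t narrowed = narrowed_sum(a, b);
  std::uint32_t extended = narrowed;            // type2 extended = (type2) narrowed;
  return static_cast<std::uint8_t>(extended);   // type3 truncated = (type3) extended;
}
```

The two functions agree for every input, since extending and then truncating back to the same width is an identity.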

(2) For SVE extending loads, we want to look for:

  type1 narrow = *ptr;
  type2 extended = (type2) narrow; // only use of narrow

replace narrow with:

  type2 tmp = .LOAD_EXT (ptr, ...);

and replace extended directly with tmp.  (Deleting narrow and
replacing tmp with a .LOAD_EXT would move the location of the
load and so wouldn't be safe in general.)

The series doesn't do either of these things; it just lays the
groundwork.  It applies on top of:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01308.html

I tested each individual patch on aarch64-linux-gnu and the series as a
whole on aarch64-linux-gnu with SVE, aarch64_be-elf and x86_64-linux-gnu.
OK to install?

Thanks,
Richard


Re: [GCC][PATCH][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-07-30 Thread Sam Tebbs

Hi all,

This update fixes an internal compiler error: the pattern's predicate
could match operands that the constraints then rejected during register
allocation.  The pattern's predicate now includes two left_consecutive
checks so that it no longer matches in that case.  The changelog and
description from before still apply.


Thanks,
Sam


On 07/24/2018 05:23 PM, Sam Tebbs wrote:



On 07/23/2018 02:14 PM, Sam Tebbs wrote:



On 07/23/2018 12:38 PM, Renlin Li wrote:

+(define_insn "*aarch64_bfxil"
+  [(set (match_operand:GPI 0 "register_operand" "=r,r")
+    (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "r,0")
+    (match_operand:GPI 3 "const_int_operand" "n, Ulc"))
+    (and:GPI (match_operand:GPI 2 "register_operand" "0,r")
+    (match_operand:GPI 4 "const_int_operand" "Ulc, n"))))]
+  "INTVAL (operands[3]) == ~INTVAL (operands[4])"
+  {
+    switch (which_alternative)
+    {
+  case 0:
+    operands[3] = GEN_INT (ceil_log2 (INTVAL (operands[3])));
+    return "bfxil\\t%0, %1, 0, %3";
+  case 1:
+    operands[3] = GEN_INT (ceil_log2 (INTVAL (operands[4])));
+    return "bfxil\\t%0, %2, 0, %3";
+  default:
+    gcc_unreachable ();
+    }
+  }
+  [(set_attr "type" "bfm")]
+)


Hi Sam,

Is it possible that operand[3] or operand[4] is 1?

ceil_log2() could return 0 if the argument is 1.
The width field of the resulting instruction will be 0. Is it still 
correct?


Regards,
Renlin
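
To make the edge case concrete, here is a sketch with simplified stand-ins for the hwint helpers (the *_sketch names and low_mask_width are assumptions for illustration, not GCC code).  For masks of the form 2^n - 1, ceil_log2 gives the field width except when the mask is 1.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in: index of the highest set bit, or -1 for zero.
inline int floor_log2_sketch(std::uint64_t x)
{
  int result = -1;
  while (x != 0)
    {
      x >>= 1;
      ++result;
    }
  return result;
}

// Simplified stand-in for ceil_log2.
inline int ceil_log2_sketch(std::uint64_t x)
{
  return x == 0 ? 0 : floor_log2_sketch(x - 1) + 1;
}

// Width of a mask of the form 2^n - 1, counted bit by bit.
inline int low_mask_width(std::uint64_t mask)
{
  int width = 0;
  while (mask & 1)
    {
      mask >>= 1;
      ++width;
    }
  return width;
}
```

With these definitions a mask of 3 or 0xffffffff gives the same answer either way, but a mask of 1 gives 0 from ceil_log2_sketch and 1 from low_mask_width, which is the zero-width case asked about above.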



Hi Renlin,

I think you have found an edge-case that I didn't think of and that 
the code would fail under. I have added a check for this situation 
and will send the updated patch soon.


Thanks,
Sam


Here is the updated patch that fixes the problem outlined by Renlin, 
and adds another testcase to check for the same issue. The changelog 
and example from my earlier email (sent on 07/20/2018 10:31 AM) still 
apply.


Thanks,
Sam



On 07/20/2018 10:33 AM, Sam Tebbs wrote:

Please disregard the original patch and see this updated version.


On 07/20/2018 10:31 AM, Sam Tebbs wrote:

Hi all,

Here is an updated patch that does the following:

* Adds a new constraint in config/aarch64/constraints.md to check 
for a constant integer that is left consecutive. This addresses 
Richard Henderson's suggestion about combining the 
aarch64_is_left_consecutive call and the const_int match in the 
pattern.


* Merges the two patterns defined into one.

* Changes the pattern's type attribute to bfm.

* Improves the comment above the aarch64_is_left_consecutive 
implementation.


* Makes the pattern use the GPI iterator to accept smaller integer 
sizes (an extra test is added to check for this).


* Improves the tests in combine_bfxil.c to ensure they aren't 
optimised away and that they check for the pattern's correctness.


Below is a new changelog and the example given before.

Is this OK for trunk?

This patch adds an optimisation that exploits the AArch64 BFXIL
instruction when or-ing the result of two bitwise and operations with
non-overlapping bitmasks (e.g. (a & 0x) | (b & 0x)).

Example:

unsigned long long combine(unsigned long long a, unsigned long long b) {
  return (a & 0xll) | (b & 0xll);
}

void read(unsigned long long a, unsigned long long b, unsigned long long *c) {
  *c = combine(a, b);
}

When compiled with -O2, read would result in:

read:
  and   x5, x1, #0x
  and   x4, x0, #0x
  orr   x4, x4, x5
  str   x4, [x2]
  ret

With this patch, it instead results in:

read:
  mov    x4, x0
  bfxil    x4, x1, 0, 32
  str    x4, [x2]
  ret



Bootstrapped and regtested on aarch64-none-linux-gnu and 
aarch64-none-elf with no regressions.
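
For reference, a sketch of why the and/and/orr sequence and the mov/bfxil sequence in the example compute the same value.  The mask digits are missing from the example as archived, so the two constants below are an assumption inferred from the "0, 32" operands of the generated bfxil; the bfxil function is a software model written for this illustration, not compiler code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed masks (inferred from "bfxil x4, x1, 0, 32", not from the source).
constexpr std::uint64_t HIGH_MASK = 0xffffffff00000000ull;
constexpr std::uint64_t LOW_MASK = 0x00000000ffffffffull;

// The source-level expression: high half of A, low half of B.
inline std::uint64_t combine(std::uint64_t a, std::uint64_t b)
{
  return (a & HIGH_MASK) | (b & LOW_MASK);
}

// Model of "bfxil dst, src, lsb, width": copy WIDTH bits of SRC, starting
// at bit LSB, into the low WIDTH bits of DST and leave the rest of DST
// unchanged.  Assumes 1 <= width and lsb + width <= 64.
inline std::uint64_t bfxil(std::uint64_t dst, std::uint64_t src,
                           unsigned lsb, unsigned width)
{
  std::uint64_t field_mask
    = width == 64 ? ~std::uint64_t (0) : ((std::uint64_t (1) << width) - 1);
  return (dst & ~field_mask) | ((src >> lsb) & field_mask);
}
```

Under that assumption, combine (a, b) equals bfxil (a, b, 0, 32) for all inputs, which matches the "mov x4, x0" followed by "bfxil x4, x1, 0, 32" above.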



gcc/
2018-07-11  Sam Tebbs

    PR target/85628
    * config/aarch64/aarch64.md (*aarch64_bfxil): Define.
    * config/aarch64/constraints.md (Ulc): Define.
    * config/aarch64/aarch64-protos.h (aarch64_is_left_consecutive):
    Define.
    * config/aarch64/aarch64.c (aarch64_is_left_consecutive): New function.


gcc/testsuite
2018-07-11  Sam Tebbs  

    PR target/85628
    * gcc.target/aarch64/combine_bfxil.c: New file.
    * gcc.target/aarch64/combine_bfxil_2.c: New file.


On 07/19/2018 02:02 PM, Sam Tebbs wrote:

Hi Richard,

Thanks for the feedback. I find that using "is_left_consecutive" 
is more descriptive than checking for it being a power of 2 - 1, 
since it describes the requirement (having consecutive ones from 
the MSB) more explicitly. I would be happy to change it though if 
that is the consensus.


I have addressed your point about just returning the string 
instead of using output_asm_insn and have changed it locally. 
I'll send an updated patch soon.



On 07/17/2018 02:33 AM, Richard Henderson wrote:

On 07/16/2018 10:10 AM, Sam Tebbs wrote:

+++ b/gcc/config/aarch64/aarch64.c
@@ -1439,6 +1439,14 @@ aarch64_hard_regno_caller_save_mode 
(unsigned regno, 

[committed] Resync inline implementation of ceil_log2 (PR 86506)

2018-07-30 Thread Richard Sandiford
In r262961 I only updated the out-of-line copy of ceil_log2.  This patch
applies the same change to the other (inline) one.
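
A small sketch of what the guard changes, using a simplified 64-bit stand-in for floor_log2 (the *_sketch, unguarded and guarded names are assumptions for illustration).  Without the guard, x == 0 wraps around in x - 1 and the result is the full bit width rather than 0.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in: index of the highest set bit, or -1 for zero.
inline int floor_log2_sketch(std::uint64_t x)
{
  int result = -1;
  while (x != 0)
    {
      x >>= 1;
      ++result;
    }
  return result;
}

// Old inline form: for x == 0, x - 1 wraps to all ones.
inline int ceil_log2_unguarded(std::uint64_t x)
{
  return floor_log2_sketch(x - 1) + 1;
}

// Resynced form: zero is handled explicitly.
inline int ceil_log2_guarded(std::uint64_t x)
{
  return x == 0 ? 0 : floor_log2_sketch(x - 1) + 1;
}
```

The two forms only differ for x == 0; every nonzero input takes the same path through floor_log2_sketch.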

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  Applied as obvious.

Richard


2018-07-30  Richard Sandiford  

gcc/
* hwint.h (ceil_log2): Resync with hwint.c implementation.

Index: gcc/hwint.h
===
--- gcc/hwint.h 2018-05-02 08:38:14.433364094 +0100
+++ gcc/hwint.h 2018-07-30 12:21:39.204235940 +0100
@@ -242,7 +242,7 @@ floor_log2 (unsigned HOST_WIDE_INT x)
 static inline int
 ceil_log2 (unsigned HOST_WIDE_INT x)
 {
-  return floor_log2 (x - 1) + 1;
+  return x == 0 ? 0 : floor_log2 (x - 1) + 1;
 }
 
 static inline int


[C++2A] Implement P1008R1 - prohibit aggregates with user-declared constructors

2018-07-30 Thread Jakub Jelinek
Hi!

Seems what is considered an aggregate type keeps changing in every single
C++ version.
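
As a quick illustration of the rule change (a sketch that needs C++17 or later for std::is_aggregate; the struct names and p1008_rules are made up for the example): a class whose only constructor is defaulted or deleted on its first declaration is an aggregate up to C++17 but not under P1008R1.

```cpp
#include <cassert>
#include <type_traits>

struct Plain { int a, b; };                             // no constructors
struct Defaulted { Defaulted () = default; int a, b; }; // user-declared
struct Deleted { Deleted () = delete; int a, b; };      // user-declared

// True when the translation unit is compiled as something newer than C++17.
constexpr bool p1008_rules = __cplusplus > 201703L;

inline bool defaulted_is_aggregate()
{
  return std::is_aggregate<Defaulted>::value;
}

inline bool deleted_is_aggregate()
{
  return std::is_aggregate<Deleted>::value;
}
```

On a compiler that implements P1008R1, both functions should return false with -std=c++2a and true with -std=c++17, while Plain stays an aggregate in every dialect.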

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-07-27  Jakub Jelinek  

P1008R1 - prohibit aggregates with user-declared constructors
* class.c (check_bases_and_members): For C++2a set
CLASSTYPE_NON_AGGREGATE based on TYPE_HAS_USER_CONSTRUCTOR rather than
type_has_user_provided_or_explicit_constructor.

* g++.dg/ext/is_aggregate.C: Add tests with deleted or defaulted ctor.
* g++.dg/cpp0x/defaulted1.C (main): Ifdef out for C++2a B b = {1};.
* g++.dg/cpp0x/deleted2.C: Expect error for C++2a.
* g++.dg/cpp2a/aggr1.C: New test.
* g++.dg/cpp2a/aggr2.C: New test.

--- gcc/cp/class.c.jj   2018-07-25 18:46:56.168149411 +0200
+++ gcc/cp/class.c  2018-07-27 21:21:22.382933221 +0200
@@ -5571,7 +5571,9 @@ check_bases_and_members (tree t)
  Again, other conditions for being an aggregate are checked
  elsewhere.  */
   CLASSTYPE_NON_AGGREGATE (t)
-|= (type_has_user_provided_or_explicit_constructor (t)
+|= ((cxx_dialect < cxx2a
+? type_has_user_provided_or_explicit_constructor (t)
+: TYPE_HAS_USER_CONSTRUCTOR (t))
|| TYPE_POLYMORPHIC_P (t));
   /* This is the C++98/03 definition of POD; it changed in C++0x, but we
  retain the old definition internally for ABI reasons.  */
--- gcc/testsuite/g++.dg/ext/is_aggregate.C.jj  2017-03-31 08:39:08.141481789 +0200
+++ gcc/testsuite/g++.dg/ext/is_aggregate.C 2018-07-27 21:46:58.832053123 +0200
@@ -61,6 +61,8 @@ struct K { int a, b; virtual void foo ()
 struct L : virtual public A { int d, e; };
 struct M : protected A { int d, e; };
 struct N : private A { int d, e; };
+struct O { O () = delete; int a, b, c; };
+struct P { P () = default; int a, b, c; };
 typedef int T;
 typedef float U;
 typedef int V __attribute__((vector_size (4 * sizeof (int))));
@@ -94,6 +96,13 @@ main ()
   assert (NTEST (L));
   assert (NTEST (M));
   assert (NTEST (N));
+#if __cplusplus > 201703L
+  assert (NTEST (O));
+  assert (NTEST (P));
+#else
+  assert (PTEST (O));
+  assert (PTEST (P));
+#endif
   assert (PTEST (int[]));
   assert (PTEST (double[]));
   assert (PTEST (T[2]));
@@ -114,4 +123,6 @@ main ()
   assert (PTEST (L[]));
   assert (PTEST (M[6]));
   assert (PTEST (N[]));
+  assert (PTEST (O[]));
+  assert (PTEST (P[]));
 }
--- gcc/testsuite/g++.dg/cpp0x/defaulted1.C.jj  2014-03-10 10:50:13.280983792 +0100
+++ gcc/testsuite/g++.dg/cpp0x/defaulted1.C 2018-07-27 22:03:48.425103195 +0200
@@ -23,7 +23,9 @@ struct B
 int main()
 {
   A a1, a2;
+#if __cplusplus <= 201703L
   B b = {1};
+#endif
   a1 = a2;
 }
 
--- gcc/testsuite/g++.dg/cpp0x/deleted2.C.jj 2013-12-09 14:32:14.279033576 +0100
+++ gcc/testsuite/g++.dg/cpp0x/deleted2.C   2018-07-27 22:04:35.539198867 +0200
@@ -6,4 +6,4 @@ struct A {
  A() = delete;
 };
 
-A a = {1};
+A a = {1}; // { dg-error "could not convert" "" { target c++2a } }
--- gcc/testsuite/g++.dg/cpp2a/aggr1.C.jj   2018-07-27 21:33:20.566391559 +0200
+++ gcc/testsuite/g++.dg/cpp2a/aggr1.C  2018-07-27 22:06:09.093388836 +0200
@@ -0,0 +1,15 @@
+// { dg-do compile { target c++11 } }
+struct A {
+  A () = delete;   // { dg-message "declared here" "" { target c++2a } }
+};
+struct B {
+  B () = default;
+  int b = 0;
+};
+struct C {
+  C (C&&) = default;   // { dg-message "candidate" "" { target c++2a } }
+  int c, d;
+};
+A a {};                // { dg-error "use of deleted function" "" { target c++2a } }
+B b = {1};             // { dg-error "could not convert" "" { target { c++11_only || c++2a } } }
+C *c = new C {2, 3};   // { dg-error "no matching function for call to" "" { target c++2a } }
--- gcc/testsuite/g++.dg/cpp2a/aggr2.C.jj   2018-07-27 22:10:51.956963223 +0200
+++ gcc/testsuite/g++.dg/cpp2a/aggr2.C  2018-07-27 22:15:20.820509168 +0200
@@ -0,0 +1,25 @@
+// { dg-do run { target c++11 } }
+
+struct A;
+struct B { operator A (); };
+struct A { A (const A &) = default; A () = default; B a; };
+A a {B {}};
+bool seen;
+
+B::operator A ()
+{
+  seen = true;
+  return A ();
+}
+
+int
+main ()
+{
+#if __cplusplus > 201703L
+  if (!seen)
+    __builtin_abort ();
+#else
+  if (seen)
+    __builtin_abort ();
+#endif
+}

Jakub
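
As a rough illustration of what the patch changes, the sketch below checks
the aggregate status of classes shaped like O and P from the is_aggregate.C
hunk.  It is not part of the patch.  It assumes a compiler that provides the
__is_aggregate builtin (GCC and Clang do) and that implements P1008R1 when
__cplusplus is greater than 201703L; the struct and function names are made
up for the example.

```cpp
#include <cassert>

// Same shapes as structs O and P added to is_aggregate.C in the patch.
struct DeletedCtor { DeletedCtor() = delete; int a, b, c; };
struct DefaultedCtor { DefaultedCtor() = default; int a, b, c; };
// No declared constructor at all: an aggregate in every dialect.
struct Plain { int a, b, c; };

// True when the dialect predates C++2a, i.e. when a user-declared but not
// user-provided constructor still leaves the class an aggregate.
constexpr bool declared_ctor_keeps_aggregate() {
  return __cplusplus <= 201703L;
}

static_assert(__is_aggregate(Plain), "no constructors: always an aggregate");

// Runtime spot checks that follow the #if in the patched test.
inline void aggregate_examples() {
  assert(__is_aggregate(DeletedCtor) == declared_ctor_keeps_aggregate());
  assert(__is_aggregate(DefaultedCtor) == declared_ctor_keeps_aggregate());
}
```

Arrays of such classes remain aggregates in every dialect, which is why the
patch adds PTEST (O[]) and PTEST (P[]) unconditionally.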


[libgomp, nvptx. committed] Handle per-function max-threads-per-block in default dims

2018-07-30 Thread Tom de Vries
Hi,

Build and reg-tested on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom
[libgomp, nvptx] Handle per-function max-threads-per-block in default dims

Currently parallel-loop-1.c fails at -O0 on a Quadro M1200, because one of the
kernel launch configurations exceeds the resources available in the device, due
to the default dimensions chosen by the runtime.

This patch fixes that by taking the per-function max_threads_per_block into
account when using the default dimensions.

2018-07-27  Tom de Vries  

	* plugin/plugin-nvptx.c (MIN, MAX): Redefine.
	(nvptx_exec): Ensure worker and vector default dims don't exceed
	targ_fn->max_threads_per_block.

---
 libgomp/plugin/plugin-nvptx.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 5c522aaf281..b6ec5f88d59 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -141,6 +141,11 @@ init_cuda_lib (void)
 
 #include "secure_getenv.h"
 
+#undef MIN
+#undef MAX
+#define MIN(X,Y) ((X) < (Y) ? (X) : (Y))
+#define MAX(X,Y) ((X) > (Y) ? (X) : (Y))
+
 /* Convenience macros for the frequently used CUDA library call and
error handling sequence as well as CUDA library calls that
do the error checking themselves or don't do it at all.  */
@@ -1135,6 +1140,7 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
   void *kargs[1];
   void *hp, *dp;
   struct nvptx_thread *nvthd = nvptx_thread ();
+  int warp_size = nvthd->ptx_dev->warp_size;
   const char *maybe_abort_msg = "(perhaps abort was called)";
 
   function = targ_fn->fn;
@@ -1175,7 +1181,6 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 
 	  int gang, worker, vector;
 	  {
-	int warp_size = nvthd->ptx_dev->warp_size;
 	int block_size = nvthd->ptx_dev->max_threads_per_block;
 	int cpu_size = nvthd->ptx_dev->max_threads_per_multiprocessor;
 	int dev_size = nvthd->ptx_dev->num_sms;
@@ -1213,9 +1218,25 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 	}
   pthread_mutex_unlock (&ptx_dev_lock);
 
-  for (i = 0; i != GOMP_DIM_MAX; i++)
-	if (!dims[i])
-	  dims[i] = nvthd->ptx_dev->default_dims[i];
+  {
+	bool default_dim_p[GOMP_DIM_MAX];
+	for (i = 0; i != GOMP_DIM_MAX; i++)
+	  {
+	default_dim_p[i] = !dims[i];
+	if (default_dim_p[i])
+	  dims[i] = nvthd->ptx_dev->default_dims[i];
+	  }
+
+	if (default_dim_p[GOMP_DIM_VECTOR])
+	  dims[GOMP_DIM_VECTOR]
+	= MIN (dims[GOMP_DIM_VECTOR],
+		   (targ_fn->max_threads_per_block / warp_size * warp_size));
+
+	if (default_dim_p[GOMP_DIM_WORKER])
+	  dims[GOMP_DIM_WORKER]
+	= MIN (dims[GOMP_DIM_WORKER],
+		   targ_fn->max_threads_per_block / dims[GOMP_DIM_VECTOR]);
+  }
 }
 
   /* Check if the accelerator has sufficient hardware resources to
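
The clamping done by the last hunk can be sketched in isolation as below.
This is a simplified model, not the plugin code: the Dim enum and
clamp_default_dims are hypothetical names, and the sketch assumes that
max_threads_per_block is at least warp_size so that the vector bound is
nonzero.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for the GOMP_DIM_* indices.
enum Dim { kGang = 0, kWorker = 1, kVector = 2, kDimMax = 3 };

// Clamp only the dimensions that were filled in from the defaults, so that
// worker * vector does not exceed the per-function thread limit.
inline void clamp_default_dims(int dims[kDimMax],
                               const bool default_dim_p[kDimMax],
                               int max_threads_per_block, int warp_size) {
  // Assumption of this sketch; it keeps the division below well defined.
  assert(warp_size > 0 && max_threads_per_block >= warp_size);

  // Vector length: largest multiple of the warp size within the limit.
  if (default_dim_p[kVector])
    dims[kVector] = std::min(dims[kVector],
                             max_threads_per_block / warp_size * warp_size);

  // Workers: whatever still fits once the vector length is fixed.
  if (default_dim_p[kWorker])
    dims[kWorker] = std::min(dims[kWorker],
                             max_threads_per_block / dims[kVector]);
}
```

With defaults of 32 workers and 32 vector lanes and a per-function limit of
512 threads, the vector length stays at 32 and the worker count drops to 16.
Dimensions the user set explicitly are left untouched.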


[libgomp, nvptx, committed] Calculate default dims per device

2018-07-30 Thread Tom de Vries
Hi,

Build and reg-tested on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom
[libgomp, nvptx] Calculate default dims per device

The default dimensions are calculated using per-device properties, but
initialized once and used on all devices.

This patch fixes this problem by introducing per-device default dimensions.

2018-07-27  Tom de Vries  

	* plugin/plugin-nvptx.c (struct ptx_device): Add default_dims field.
	(nvptx_open_device): Init default_dims for device.
	(nvptx_exec): Use default_dims from device.

---
 libgomp/plugin/plugin-nvptx.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 3a4077a1315..5c522aaf281 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -417,6 +417,7 @@ struct ptx_device
   int warp_size;
   int max_threads_per_block;
   int max_threads_per_multiprocessor;
+  int default_dims[GOMP_DIM_MAX];
 
   struct ptx_image_data *images;  /* Images loaded on device.  */
   pthread_mutex_t image_lock; /* Lock for above list.  */
@@ -818,6 +819,9 @@ nvptx_open_device (int n)
   if (r != CUDA_SUCCESS)
 async_engines = 1;
 
+  for (int i = 0; i != GOMP_DIM_MAX; i++)
+ptx_dev->default_dims[i] = 0;
+
   ptx_dev->images = NULL;
   pthread_mutex_init (&ptx_dev->image_lock, NULL);
 
@@ -1152,15 +1156,22 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 
   if (seen_zero)
 {
-  /* See if the user provided GOMP_OPENACC_DIM environment
-	 variable to specify runtime defaults. */
-  static int default_dims[GOMP_DIM_MAX];
-
   pthread_mutex_lock (&ptx_dev_lock);
-  if (!default_dims[0])
+
+  static int gomp_openacc_dims[GOMP_DIM_MAX];
+  if (!gomp_openacc_dims[0])
+	{
+	  /* See if the user provided GOMP_OPENACC_DIM environment
+	 variable to specify runtime defaults.  */
+	  for (int i = 0; i < GOMP_DIM_MAX; ++i)
+	gomp_openacc_dims[i] = GOMP_PLUGIN_acc_default_dim (i);
+	}
+
+  if (!nvthd->ptx_dev->default_dims[0])
 	{
+	  int default_dims[GOMP_DIM_MAX];
 	  for (int i = 0; i < GOMP_DIM_MAX; ++i)
-	default_dims[i] = GOMP_PLUGIN_acc_default_dim (i);
+	default_dims[i] = gomp_openacc_dims[i];
 
 	  int gang, worker, vector;
 	  {
@@ -1196,12 +1207,15 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 			 default_dims[GOMP_DIM_GANG],
 			 default_dims[GOMP_DIM_WORKER],
 			 default_dims[GOMP_DIM_VECTOR]);
+
+	  for (i = 0; i != GOMP_DIM_MAX; i++)
+	nvthd->ptx_dev->default_dims[i] = default_dims[i];
 	}
   pthread_mutex_unlock (&ptx_dev_lock);
 
   for (i = 0; i != GOMP_DIM_MAX; i++)
 	if (!dims[i])
-	  dims[i] = default_dims[i];
+	  dims[i] = nvthd->ptx_dev->default_dims[i];
 }
 
   /* Check if the accelerator has sufficient hardware resources to
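
The difference between the old function-local static cache and the new
per-device field can be modelled as below.  This is a sketch only: Device,
compute_defaults and the heuristic inside it are invented for the example,
and the locking that the plugin does around this code is left out.

```cpp
#include <cassert>

constexpr int kDimMax = 3;

// Hypothetical device record; slot 0 of default_dims being 0 means
// "defaults not computed yet", as in the patch.
struct Device {
  int num_sms;
  int default_dims[kDimMax];
  explicit Device(int sms) : num_sms(sms), default_dims{0, 0, 0} {}
};

// Made-up heuristic standing in for the per-device calculation.
inline void compute_defaults(const Device& dev, int out[kDimMax]) {
  out[0] = 2 * dev.num_sms;  // gangs scale with the multiprocessor count
  out[1] = 8;                // workers
  out[2] = 32;               // vector length
}

// Old scheme: one static cache, filled from whichever device comes first.
inline const int* shared_default_dims(const Device& dev) {
  static int cache[kDimMax];
  if (cache[0] == 0)
    compute_defaults(dev, cache);
  return cache;
}

// New scheme: each device lazily fills and keeps its own defaults.
inline const int* per_device_default_dims(Device& dev) {
  if (dev.default_dims[0] == 0)
    compute_defaults(dev, dev.default_dims);
  return dev.default_dims;
}
```

With two devices of different sizes, the shared cache hands the first
device's gang count to the second one as well, while the per-device version
gives each device its own value.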


Re: [patch] adjust default nvptx launch geometry for OpenACC offloaded regions

2018-07-30 Thread Tom de Vries
On 07/11/2018 09:13 PM, Cesar Philippidis wrote:
> 2018-07-XX  Cesar Philippidis  
>   Tom de Vries  
> 
>   gcc/
>   * config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Rename to ...
>   (PTX_DEFAULT_RUNTIME_DIM): ... this.
>   (nvptx_goacc_validate_dims): Set default worker and gang dims to
>   PTX_DEFAULT_RUNTIME_DIM.
>   (nvptx_dim_limit): Ignore GOMP_DIM_WORKER;

That's an independent patch.

Committed as below.

Thanks,
- Tom
[nvptx, offloading] Determine default workers at runtime

Currently, if the user doesn't specify the number of workers for an OpenACC
region, the compiler hardcodes it to a default value.

This patch removes this functionality, such that the libgomp runtime can decide
on a default value.

2018-07-27  Cesar Philippidis  
	Tom de Vries  

	* config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Rename to ...
	(PTX_DEFAULT_RUNTIME_DIM): ... this.
	(nvptx_goacc_validate_dims): Set default worker and gang dims to
	PTX_DEFAULT_RUNTIME_DIM.
	(nvptx_dim_limit): Ignore GOMP_DIM_WORKER.

---
 gcc/config/nvptx/nvptx.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 5608bee8a8d..c1946e75f42 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5165,7 +5165,7 @@ nvptx_expand_builtin (tree exp, rtx target, rtx ARG_UNUSED (subtarget),
 /* Define dimension sizes for known hardware.  */
 #define PTX_VECTOR_LENGTH 32
 #define PTX_WORKER_LENGTH 32
-#define PTX_GANG_DEFAULT  0 /* Defer to runtime.  */
+#define PTX_DEFAULT_RUNTIME_DIM 0 /* Defer to runtime.  */
 
 /* Implement TARGET_SIMT_VF target hook: number of threads in a warp.  */
 
@@ -5214,9 +5214,9 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level)
 {
   dims[GOMP_DIM_VECTOR] = PTX_VECTOR_LENGTH;
   if (dims[GOMP_DIM_WORKER] < 0)
-	dims[GOMP_DIM_WORKER] = PTX_WORKER_LENGTH;
+	dims[GOMP_DIM_WORKER] = PTX_DEFAULT_RUNTIME_DIM;
   if (dims[GOMP_DIM_GANG] < 0)
-	dims[GOMP_DIM_GANG] = PTX_GANG_DEFAULT;
+	dims[GOMP_DIM_GANG] = PTX_DEFAULT_RUNTIME_DIM;
   changed = true;
 }
 
@@ -5230,9 +5230,6 @@ nvptx_dim_limit (int axis)
 {
   switch (axis)
 {
-case GOMP_DIM_WORKER:
-  return PTX_WORKER_LENGTH;
-
 case GOMP_DIM_VECTOR:
   return PTX_VECTOR_LENGTH;
 


abstract remaining wide int operations in VRP

2018-07-30 Thread Aldy Hernandez

...well, most of them anyhow...

I got tired of submitting these piecemeal, and it's probably easier to 
review them in one go.


There should be no difference in functionality, barring an extra call to 
set_and_canonicalize_value_range (instead of set_value_range) due to the 
way I've organized multiplication and lshifts for maximal sharing.  This 
also gets rid of some repetitive stuff.


I've also added a value_range::dump like wide_int::dump.  It makes 
debugging a lot easier.


My next patch will move all the wide_int_range_* stuff into 
wide-int-range.[hc].


I'm really liking how this is turning out, BTW: a *lot* cleaner, less 
code duplication, and shareable to boot :).


OK pending one more round of tests?

Aldy
gcc/

	* tree-vrp.c (zero_nonzero_bits_from_bounds): Rename to...
	(wide_int_set_zero_nonzero_bits): ...this.
	(zero_nonzero_bits_from_vr): Rename to...
	(vrp_set_zero_nonzero_bits): ...this.
	(extract_range_from_multiplicative_op_1): Abstract wide int
	code...
	(wide_int_range_multiplicative_op): ...here.
	(extract_range_from_binary_expr_1): Extract wide int binary
	operations into their own functions.
	(wide_int_range_lshift): New.
	(wide_int_range_can_optimize_bit_op): New.
	(wide_int_range_shift_undefined_p): New.
	(wide_int_range_bit_xor): New.
	(wide_int_range_bit_ior): New.
	(wide_int_range_bit_and): New.
	(wide_int_range_trunc_mod): New.
	(extract_range_into_wide_ints): New.
	(vrp_shift_undefined_p): New.
	(extract_range_from_multiplicative_op): New.
	(vrp_can_optimize_bit_op): New.
	* tree-vrp.h (value_range::dump): New.
	(wide_int_range_multiplicative_op): New.
	(wide_int_range_lshift): New.
	(wide_int_range_shift_undefined_p): New.
	(wide_int_range_bit_xor): New.
	(wide_int_range_bit_ior): New.
	(wide_int_range_bit_and): New.
	(wide_int_range_trunc_mod): New.
	(zero_nonzero_bits_from_bounds): Rename to...
	(wide_int_set_zero_nonzero_bits): ...this.
	(zero_nonzero_bits_from_vr): Rename to...
	(vrp_set_zero_nonzero_bits): ...this.
	(range_easy_mask_min_max): Rename to...
	(wide_int_range_can_optimize_bit_op): ...this.
	* vr-values.c (simplify_bit_ops_using_ranges): Rename
	zero_nonzero_bits_from_vr into vrp_set_zero_nonzero_bits.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 7ab8898b453..619be7d07be 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -1008,45 +1008,54 @@ wide_int_binop_overflow (wide_int ,
   return !overflow;
 }
 
-/* For range [LB, UB] compute two wide_int bitmasks.  In *MAY_BE_NONZERO
-   bitmask, if some bit is unset, it means for all numbers in the range
-   the bit is 0, otherwise it might be 0 or 1.  In *MUST_BE_NONZERO
-   bitmask, if some bit is set, it means for all numbers in the range
-   the bit is 1, otherwise it might be 0 or 1.  */
+/* For range [LB, UB] compute two wide_int bit masks.
+
+   In the MAY_BE_NONZERO bit mask, if some bit is unset, it means that
+   for all numbers in the range the bit is 0, otherwise it might be 0
+   or 1.
+
+   In the MUST_BE_NONZERO bit mask, if some bit is set, it means that
+   for all numbers in the range the bit is 1, otherwise it might be 0
+   or 1.  */
 
 void
-zero_nonzero_bits_from_bounds (signop sign,
-			   const wide_int &lb, const wide_int &ub,
-			   wide_int *may_be_nonzero,
-			   wide_int *must_be_nonzero)
+wide_int_set_zero_nonzero_bits (signop sign,
+				const wide_int &lb, const wide_int &ub,
+				wide_int &may_be_nonzero,
+				wide_int &must_be_nonzero)
 {
-  *may_be_nonzero = wi::minus_one (lb.get_precision ());
-  *must_be_nonzero = wi::zero (lb.get_precision ());
+  may_be_nonzero = wi::minus_one (lb.get_precision ());
+  must_be_nonzero = wi::zero (lb.get_precision ());
 
   if (wi::eq_p (lb, ub))
 {
-  *may_be_nonzero = lb;
-  *must_be_nonzero = *may_be_nonzero;
+  may_be_nonzero = lb;
+  must_be_nonzero = may_be_nonzero;
 }
   else if (wi::ge_p (lb, 0, sign) || wi::lt_p (ub, 0, sign))
 {
   wide_int xor_mask = lb ^ ub;
-  *may_be_nonzero = lb | ub;
-  *must_be_nonzero = lb & ub;
+  may_be_nonzero = lb | ub;
+  must_be_nonzero = lb & ub;
   if (xor_mask != 0)
 	{
 	  wide_int mask = wi::mask (wi::floor_log2 (xor_mask), false,
-may_be_nonzero->get_precision ());
-	  *may_be_nonzero = *may_be_nonzero | mask;
-	  *must_be_nonzero = wi::bit_and_not (*must_be_nonzero, mask);
+may_be_nonzero.get_precision ());
+	  may_be_nonzero = may_be_nonzero | mask;
+	  must_be_nonzero = wi::bit_and_not (must_be_nonzero, mask);
 	}
 }
 }
 
-/* Like zero_nonzero_bits_from_bounds, but use the range in value_range VR.  */
+/* Value range wrapper for wide_int_set_zero_nonzero_bits.
+
+   Compute MAY_BE_NONZERO and MUST_BE_NONZERO bit masks for range in VR.
+
+   Return TRUE if VR was a constant range and we were able to compute
+   the bit masks.  */
 
 bool
-zero_nonzero_bits_from_vr (const tree expr_type,
+vrp_set_zero_nonzero_bits (const tree expr_type,
 			   value_range *vr,
 			   wide_int *may_be_nonzero,
 			   wide_int *must_be_nonzero)
@@ 
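
To make the bit mask comment in the hunk above concrete, here is a small
sketch of the same computation on plain 64-bit unsigned values instead of
wide_int.  It is a simplification made for illustration: the names are
invented, and only the unsigned case is modelled, so the branch that leaves
the conservative masks for signed ranges crossing zero does not appear.

```cpp
#include <cassert>
#include <cstdint>

struct BitMasks {
  std::uint64_t may_be_nonzero;   // a clear bit is 0 in every value of the range
  std::uint64_t must_be_nonzero;  // a set bit is 1 in every value of the range
};

// Index of the highest set bit; x must be nonzero.
inline int floor_log2_u64(std::uint64_t x) {
  int result = -1;
  while (x != 0) {
    x >>= 1;
    ++result;
  }
  return result;
}

// Masks for the unsigned range [lb, ub].
inline BitMasks zero_nonzero_bits(std::uint64_t lb, std::uint64_t ub) {
  assert(lb <= ub);
  if (lb == ub)
    return {lb, lb};  // a single value: every bit is known exactly

  // Bits above the highest differing bit are shared by lb and ub and by
  // everything in between; bits below it can take either value.
  const std::uint64_t xor_mask = lb ^ ub;  // nonzero because lb != ub
  const int high = floor_log2_u64(xor_mask);
  const std::uint64_t low_mask = (std::uint64_t{1} << high) - 1;
  return {(lb | ub) | low_mask, (lb & ub) & ~low_mask};
}
```

For the range [4, 6] (binary 100 to 110) this gives may_be_nonzero = 7 and
must_be_nonzero = 4: bit 2 is set in every value, and bits 0 and 1 vary.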

Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)

2018-07-30 Thread Richard Biener
On Sun, 29 Jul 2018, Martin Sebor wrote:

> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
> > Hi!
> > 
> > This fixes two wrong code bugs where string_constant
> > returns over length string constants.  Initializers
> > like that are rejected in C++, but valid in C.
> 
> If by valid you are referring to declarations like the one in
> the added test:
> 
> const char a[2][3] = { "1234", "xyz" };
> 
> then (as I explained), the excess elements in "1234" make
> the char[3] initialization and thus the test case undefined.
> I have resolved bug 86711 as invalid on those grounds.
> 
> Bug 86711 has a valid test case that needs to be fixed, along
> with bug 86688 that I raised for the same underlying problem:
> considering the excess nul as part of the string.  As has been
> discussed in a separate bug, rather than working around
> the excessively long strings in the middle-end, it would be
> preferable to avoid creating them to begin with.
> 
> I'm already working on a fix for bug 86688, in part because
> I introduced the code change and also because I'm making other
> changes in this area -- bug 86552.  Both of these in response
> to your comments.
> 
> I would normally welcome someone else helping with my work
> but (as I already made clear last week) it's counterproductive
> to have two people working in the very same area at the same
> time, especially when they are working at cross purposes as
> you seem to be hell-bent on doing.
> 
> > I have xfailed strlenopt-49.c, which tests this feature.
> 
> That's not appropriate.  The purpose of the test is to verify
> the fix for bug 86428: namely, that a call to strlen() on
> an array initialized with a string of the same length is
> folded, such as in:
> 
> const char b[4] = "123\0";
> 
> That's a valid initializer that can and should continue to be
> folded.  The test needs to continue to exercise that feature.
> 
> The test also happens to exercise invalid/overlong initializers.
> This is because, in my view, it's safer to fold the result
> of such calls to a constant than to call the library
> function and have it either return an unpredictable value or
> perhaps even crash.
> 
> > Personally I don't think that it is worth the effort to
> > optimize something that is per se invalid in C++.
> 
> This is a C test, not C++.  (I don't suppose you are actually
> saying that only the common subset between C and C++ is worth
> optimizing.)
> 
> Just in case it isn't clear from the above: the point of
> the test exercising the behavior for overlong strings isn't
> optimizing undefined behavior but rather avoiding the worst
> consequences of it.  I have already explained this once
> before so I'm starting to wonder if I'm being insufficiently
> clear or if you are not receiving or reading (or understanding)
> my responses.  We can have a broader discussion about whether
> this is the best approach for GCC to take either in this instance
> or in general, but in the meantime I would appreciate it if you
> refrained from undoing my changes just because you don't agree
> with or don't understand the motivation behind them.
> 
> Martin
> 
> PS I continue to wonder about your motivation and ethics.  It's
> rare to have someone so openly, blatantly and persistently try
> to undermine someone else's work.

Martin, you are clearly the one being hostile here - Bernd is trying
to help.  In fact his patches are more focused, easier to understand
and thus easier to review.

Get your feelings about "ownership" under control.

Richard.


Re: [PATCH] Avoid another non zero terminated string constant

2018-07-30 Thread Richard Biener
On Sun, 29 Jul 2018, Bernd Edlinger wrote:

> Hi!
> 
> This fixes another not NUL terminated string literal that is created
> in tree-ssa-forwprop.c at simplify_builtin_call.
> 
> src_buf is set up to contain a NUL at src_buf[src_len], thus use src_len + 1
> as length parameter to build_string_literal.  All other uses of
> build_string_literal do it right, as far as I can see.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

OK.

Richard.


[PING][PATCH] [v4][aarch64] Avoid tag collisions for loads falkor

2018-07-30 Thread Siddhesh Poyarekar

Hello,

Ping!

On 07/24/2018 12:37 PM, Siddhesh Poyarekar wrote:

Hi,

This is a rewrite of the tag collision avoidance patch that Kugan had
written as a machine reorg pass back in February.

The falkor hardware prefetching system uses a combination of the
source, destination and offset to decide which prefetcher unit to
train with the load.  This is great when loads in a loop are
sequential but sub-optimal if there are unrelated loads in a loop that
tag to the same prefetcher unit.

This pass attempts to rename the destination register of such colliding
loads using routines available in regrename.c so that their tags do
not collide.  This shows some performance gains with mcf and xalancbmk
(~5% each) and will be tweaked further.  The pass is placed near the
fag end of the pass list so that subsequent passes don't inadvertently
end up undoing the renames.

A full gcc bootstrap and testsuite ran successfully on aarch64, i.e. it
did not introduce any new regressions.  I also did a make-check with
-mcpu=falkor to ensure that there were no regressions.  The couple of
regressions I found were target-specific and were related to scheduling
and cost differences and are not correctness issues.

Changes from v3:
- Avoid renaming argument/return registers and registers that have a
   specific architectural meaning, i.e. stack pointer, frame pointer,
   etc.  Try renaming their aliases instead.

Changes from v2:
- Ignore SVE instead of asserting that falkor does not support SVE.

Changes from v1:

- Fixed up issues pointed out by Kyrill
- Avoid renaming R0/V0 since they could be return values
- Fixed minor formatting issues.

2018-07-02  Siddhesh Poyarekar  
Kugan Vivekanandarajah  

* config/aarch64/falkor-tag-collision-avoidance.c: New file.
* config.gcc (extra_objs): Build it.
* config/aarch64/t-aarch64 (falkor-tag-collision-avoidance.o):
Likewise.
* config/aarch64/aarch64-passes.def
(pass_tag_collision_avoidance): New pass.
* config/aarch64/aarch64.c (qdf24xx_tunings): Add
AARCH64_EXTRA_TUNE_RENAME_LOAD_REGS to tuning_flags.
(aarch64_classify_address): Remove static qualifier.
(aarch64_address_info, aarch64_address_type): Move to...
* config/aarch64/aarch64-protos.h: ... here.
(make_pass_tag_collision_avoidance): New function.
* config/aarch64/aarch64-tuning-flags.def (rename_load_regs):
New tuning flag.

CC: james.greenha...@arm.com
CC: kyrylo.tkac...@foss.arm.com
---
  gcc/config.gcc|   2 +-
  gcc/config/aarch64/aarch64-passes.def |   1 +
  gcc/config/aarch64/aarch64-protos.h   |  49 +
  gcc/config/aarch64/aarch64-tuning-flags.def   |   2 +
  gcc/config/aarch64/aarch64.c  |  48 +-
  .../aarch64/falkor-tag-collision-avoidance.c  | 881 ++
  gcc/config/aarch64/t-aarch64  |   9 +
  7 files changed, 946 insertions(+), 46 deletions(-)
  create mode 100644 gcc/config/aarch64/falkor-tag-collision-avoidance.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 78e84c2b864..8f5e458e8a6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -304,7 +304,7 @@ aarch64*-*-*)
extra_headers="arm_fp16.h arm_neon.h arm_acle.h"
c_target_objs="aarch64-c.o"
cxx_target_objs="aarch64-c.o"
-   extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o"
+   extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o falkor-tag-collision-avoidance.o"
target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.c"
target_has_targetm_common=yes
;;
diff --git a/gcc/config/aarch64/aarch64-passes.def b/gcc/config/aarch64/aarch64-passes.def
index 87747b420b0..f61a8870aa1 100644
--- a/gcc/config/aarch64/aarch64-passes.def
+++ b/gcc/config/aarch64/aarch64-passes.def
@@ -19,3 +19,4 @@
 .  */
  
  INSERT_PASS_AFTER (pass_regrename, 1, pass_fma_steering);
+INSERT_PASS_AFTER (pass_machine_reorg, 1, pass_tag_collision_avoidance);
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index af5db9c5953..647ad7a9c37 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -288,6 +288,49 @@ struct tune_params
const struct cpu_prefetch_tune *prefetch;
  };
  
+/* Classifies an address.
+
+   ADDRESS_REG_IMM
+   A simple base register plus immediate offset.
+
+   ADDRESS_REG_WB
+   A base register indexed by immediate offset with writeback.
+
+   ADDRESS_REG_REG
+   A base register indexed by (optionally scaled) register.
+
+   ADDRESS_REG_UXTW
+   A base register indexed by (optionally scaled) zero-extended register.
+
+   ADDRESS_REG_SXTW
+   A base register indexed by (optionally scaled) sign-extended register.
+
+   ADDRESS_LO_SUM
+   A LO_SUM rtx with a base register and "LO12" symbol relocation.
+
+   ADDRESS_SYMBOLIC:
+