[PATCH, contrib] Re: port contrib/download_prerequisites script to macOS

2017-04-06 Thread Jerry DeLisle
ping

On 04/04/2017 06:10 PM, Damian Rouson wrote:
> All,
> 
> The attached patch modifies the contrib/download_prerequisites script to
> work on macOS.  The revised script detects the operating system and adjusts
> the shasum and md5 commands to their expected names and arguments on macOS.
> The revised script also uses curl if wget is not present; macOS ships with
> curl but not wget.
> 
> Tested on macOS and Lubuntu and Fedora Linux distributions. 
> 
> Ok for trunk?
> 
> Damian
> 
> 
> 2017-04-05  Damian Rouson  
> 
> * download_prerequisites (md5_check): New function emulates Linux
> 'md5 --check' on macOS.  Modified script for macOS compatibility.
> 


Re: [RS6000] Fix PR45053

2017-04-06 Thread Segher Boessenkool
On Thu, Feb 07, 2013 at 07:15:05PM +1030, Alan Modra wrote:
> I think this one counts as obvious, but I'll ask for permission anyway.
> Bootstrapped etc. powerpc-linux.  OK everywhere?

Okay for trunk (and backports, if you want).  -O2 may look odd but
there is a nice big comment explaining it (please add something like
"(see PR45053)" though).

Thanks,


Segher


>   PR target/45053
>   * config/rs6000/t-crtstuff (CRTSTUFF_T_CFLAGS): Add -O2.
> 
> Index: libgcc/config/rs6000/t-crtstuff
> ===
> --- libgcc/config/rs6000/t-crtstuff   (revision 195839)
> +++ libgcc/config/rs6000/t-crtstuff   (working copy)
> @@ -1,3 +1,6 @@
>  # If .sdata is enabled __CTOR_{LIST,END}__ go into .sdata instead of
>  # .ctors.
> -CRTSTUFF_T_CFLAGS = -msdata=none
> +# Do not build crtend.o with -Os as that can result in references to
> +# out-of-line register save/restore functions, which may be unresolved
> +# as crtend.o is linked after libgcc.a.
> +CRTSTUFF_T_CFLAGS = -msdata=none -O2
> 


[PATCH v2,testsuite] PR79867: Merge fixes for windows DLL loading problem from libffi

2017-04-06 Thread Daniel Santos
We currently have two copies of target-libpath.exp in the tree under
gcc/testsuite/lib and libffi/testsuite/lib.  It was originally pulled
into the libffi project (from downstream gcc) in 2009
(https://github.com/libffi/libffi/commit/5cbe2058c128e848446ae79fe15ee54260a90559).
Then in 2012, Anthony Green (from libffi) modified it to correct this
Windows problem (and thank you:
https://github.com/libffi/libffi/commit/bd78c9c3311244dd5f877c915b0dff91621dd253).
In 2015, this file got pulled from upstream libffi back into gcc, thus
beginning two separate development paths
(https://github.com/gcc-mirror/gcc/commit/89d8a412de548b218cf7c967e65ad98bceb1ed4e).

This patch merges the changes from libffi upstream which correctly solve
the Windows DLL load path problem in set_ld_library_path_env_vars and
restore_ld_library_path_env_vars, thus fixing most of PR79867.  However,
there is still incorrect behaviour in DejaGNU's unix_load that should
eventually be addressed, although I cannot yet point to a specific
failure that it is causing.

Ultimately, I think that this functionality should be moved upstream to
DejaGNU where it can be managed more cleanly in board config files, but
we'll have to keep this code in gcc for when DejaGNU doesn't have
set/restore or push/pop libpath functionality.

Signed-off-by: Daniel Santos 
---
 gcc/testsuite/lib/target-libpath.exp | 21 +
 1 file changed, 21 insertions(+)

diff --git a/gcc/testsuite/lib/target-libpath.exp 
b/gcc/testsuite/lib/target-libpath.exp
index 9b3e201ed68..b6d01b31016 100644
--- a/gcc/testsuite/lib/target-libpath.exp
+++ b/gcc/testsuite/lib/target-libpath.exp
@@ -23,6 +23,7 @@ set orig_shlib_path_saved 0
 set orig_ld_library_path_32_saved 0
 set orig_ld_library_path_64_saved 0
 set orig_dyld_library_path_saved 0
+set orig_path_saved 0
 set orig_gcc_exec_prefix_saved 0
 set orig_gcc_exec_prefix_checked 0
 
@@ -55,6 +56,7 @@ proc set_ld_library_path_env_vars { } {
   global orig_ld_library_path_32_saved
   global orig_ld_library_path_64_saved
   global orig_dyld_library_path_saved
+  global orig_path_saved
   global orig_gcc_exec_prefix_saved
   global orig_gcc_exec_prefix_checked
   global orig_ld_library_path
@@ -63,6 +65,7 @@ proc set_ld_library_path_env_vars { } {
   global orig_ld_library_path_32
   global orig_ld_library_path_64
   global orig_dyld_library_path
+  global orig_path
   global orig_gcc_exec_prefix
   global env
 
@@ -110,6 +113,10 @@ proc set_ld_library_path_env_vars { } {
   set orig_dyld_library_path "$env(DYLD_LIBRARY_PATH)"
   set orig_dyld_library_path_saved 1
 }
+if [info exists env(PATH)] {
+  set orig_path "$env(PATH)"
+  set orig_path_saved 1
+}
   }
 
   # We need to set ld library path in the environment.  Currently,
@@ -164,6 +171,13 @@ proc set_ld_library_path_env_vars { } {
   } else {
 setenv DYLD_LIBRARY_PATH "$ld_library_path"
   }
+  if { [istarget *-*-cygwin*] || [istarget *-*-mingw*] } {
+if { $orig_path_saved } {
+  setenv PATH "$ld_library_path:$orig_path"
+} else {
+  setenv PATH "$ld_library_path"
+}
+  }
 
   verbose -log "LD_LIBRARY_PATH=[getenv LD_LIBRARY_PATH]"
   verbose -log "LD_RUN_PATH=[getenv LD_RUN_PATH]"
@@ -201,12 +215,14 @@ proc restore_ld_library_path_env_vars { } {
   global orig_ld_library_path_32_saved
   global orig_ld_library_path_64_saved
   global orig_dyld_library_path_saved
+  global orig_path_saved
   global orig_ld_library_path
   global orig_ld_run_path
   global orig_shlib_path
   global orig_ld_library_path_32
   global orig_ld_library_path_64
   global orig_dyld_library_path
+  global orig_path
   global env
 
   restore_gcc_exec_prefix_env_var
@@ -245,6 +261,11 @@ proc restore_ld_library_path_env_vars { } {
   } elseif [info exists env(DYLD_LIBRARY_PATH)] {
 unsetenv DYLD_LIBRARY_PATH
   }
+  if { $orig_path_saved } {
+setenv PATH "$orig_path"
+  } elseif [info exists env(PATH)] {
+unsetenv PATH
+  }
 }
 
 ###
-- 
2.11.0



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Bernd Edlinger
On 04/06/17 09:47, Richard Biener wrote:
> On Wed, 5 Apr 2017, Bernd Edlinger wrote:
>
>> On 04/05/17 19:22, Bernd Edlinger wrote:
>>> On 04/05/17 18:08, Jakub Jelinek wrote:
>>>
>>> Yes, exactly.  I really want to reach the deadline for gcc-7.
>>> Fixing the name is certainly the most important first step,
>>> and if everybody agrees on "typeless_storage", for the name
>>> I can start with adjusting the name, and look into how
>>> to use a spare type-flag that should be a mechanical change.
>>>
>>
>> Jakub, I just renamed the attribute and reworked the patch
>> as you suggested, reg-testing is not yet completed, but
>> it looks good so far.  I also added a few more tests.
>>
>> I have changed the documentation as Richi suggested, but
>> I am not too sure what to say here.
>
> The alias.c changes are not sufficient.  I think what you want is
> sth like
>
> Index: gcc/alias.c
> ===
> --- gcc/alias.c   (revision 246678)
> +++ gcc/alias.c   (working copy)
> @@ -136,6 +136,9 @@ struct GTY(()) alias_set_entry {
>bool is_pointer;
>/* Nonzero if is_pointer or if one of childs have has_pointer set.  */
>bool has_pointer;
> +  /* Nonzero if we have a child serving as typeless storage (or are
> + such storage ourselves).  */
> +  bool has_typeless_storage;
>
>/* The children of the alias set.  These are not just the immediate
>   children, but, in fact, all descendants.  So, if we have:
> @@ -419,7 +422,8 @@ alias_set_subset_of (alias_set_type set1
>/* Check if set1 is a subset of set2.  */
>ase2 = get_alias_set_entry (set2);
>if (ase2 != 0
> -  && (ase2->has_zero_child
> +  && (ase2->has_typeless_storage
> +   || ase2->has_zero_child
> || (ase2->children && ase2->children->get (set1
>  return true;
>

I think get_alias_set(t) will return 0 for typeless_storage
types, and therefore has_zero_child will be set anyway.
I think both mean the same thing in the end, but it depends on
what typeless_storage should actually mean, and we have
not yet the same idea about it.

> @@ -825,6 +829,7 @@ init_alias_set_entry (alias_set_type set
>ase->has_zero_child = false;
>ase->is_pointer = false;
>ase->has_pointer = false;
> +  ase->has_typeless_storage = false;
>gcc_checking_assert (!get_alias_set_entry (set));
>(*alias_sets)[set] = ase;
>return ase;
> @@ -955,6 +960,7 @@ get_alias_set (tree t)
>   Just be pragmatic here and make sure the array and its element
>   type get the same alias set assigned.  */
>else if (TREE_CODE (t) == ARRAY_TYPE
> +&& ! TYPE_TYPELESS_STORAGE (t)
>  && (!TYPE_NONALIASED_COMPONENT (t)
>  || TYPE_STRUCTURAL_EQUALITY_P (t)))
>  set = get_alias_set (TREE_TYPE (t));
> @@ -1094,6 +1100,15 @@ get_alias_set (tree t)
>
>TYPE_ALIAS_SET (t) = set;
>
> +  if (TREE_CODE (t) == ARRAY_TYPE
> +  && TYPE_TYPELESS_STORAGE (t))
> +{
> +  alias_set_entry *ase = get_alias_set_entry (set);
> +  if (!ase)
> + ase = init_alias_set_entry (set);
> +  ase->has_typeless_storage = true;
> +}
> +
>/* If this is an aggregate type or a complex type, we must record any
>   component aliasing information.  */
>if (AGGREGATE_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
> @@ -1173,6 +1188,8 @@ record_alias_subset (alias_set_type supe
>   superset_entry->has_zero_child = true;
>if (subset_entry->has_pointer)
>   superset_entry->has_pointer = true;
> +   if (subset_entry->has_typeless_storage)
> + superset_entry->has_typeless_storage = true;
>
> if (subset_entry->children)
>   {
>
>
> please also restrict TYPE_TYPELESS_STORAGE to ARRAY_TYPEs (otherwise
> more complications will arise).
>
> Index: gcc/cp/class.c
> ===
> --- gcc/cp/class.c  (revision 246678)
> +++ gcc/cp/class.c  (working copy)
> @@ -2083,7 +2083,8 @@ fixup_attribute_variants (tree t)
>tree attrs = TYPE_ATTRIBUTES (t);
>unsigned align = TYPE_ALIGN (t);
>bool user_align = TYPE_USER_ALIGN (t);
> -  bool may_alias = lookup_attribute ("may_alias", attrs);
> +  bool may_alias = TYPE_TYPELESS_STORAGE (t)
> +  || lookup_attribute ("may_alias", attrs);
>
>if (may_alias)
>  fixup_may_alias (t);
> @@ -7345,6 +7348,12 @@ finish_struct_1 (tree t)
>   the class or perform any other required target modifications.  */
>targetm.cxx.adjust_class_at_definition (t);
>
> +  if (cxx_dialect >= cxx1z && cxx_type_contains_byte_buffer (t))
> +{
> +  TYPE_TYPELESS_STORAGE (t) = 1;
> +  fixup_attribute_variants (t);
> ...
>
> I don't think you need all this given alias.c only looks at
> TYPE_MAIN_VARIANTs.

I wanted to be able to declare a int __attribute__((typeless_storage))
as in the test case, and the sample in the spec.  And that
information is not in the 

[wwwdocs, PATCH v2] C++ terminology: the One Definition Rule in diagnostics

2017-04-06 Thread David Malcolm
On Thu, 2017-04-06 at 10:44 -0600, Jeff Law wrote:
> On 03/24/2017 03:29 AM, Martin Liška wrote:
> > I would like to ping that. I'm not sure what the agreement is after
> > reading the discussion in:
> > https://gcc.gnu.org/ml/gcc/2017-03/msg00070.html
> > 
> > Martin Sebor may know, CC'ing him.
> Not sure if you're pinging the internal_error_cont stuff, or the ODR 
> diagnostics changes.
> 
> WRT the ODR diagnostics, I'd say let's go with the C++17 style 
> (all-lowercase with a hyphen).

Here's an updated version of the patch for the coding conventions page
of the website, which makes that recommendation (with a leading "the").

For ease of review, the pertinent table headings are:

Use ... | instead of ... | Rationale

OK to commit?

(I've got a followup patch to update the diagnostics accordingly)

> If you've got a pointer to the internal_err_cont changes, please pass
> it 
> along.
> 
> jeff
Index: htdocs/codingconventions.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/codingconventions.html,v
retrieving revision 1.80
diff -u -p -r1.80 codingconventions.html
--- htdocs/codingconventions.html	13 Mar 2017 22:01:05 -	1.80
+++ htdocs/codingconventions.html	6 Apr 2017 20:55:51 -
@@ -448,6 +448,12 @@ and code.  The following table lists som
 "Objective C"
   
   
+"the one-definition rule"
+"One Definition Rule", "the C++ One Definition Rule",
+  or "one definition rule"
+Style used in the C++17 standard
+  
+  
 "prologue"
 "prolog"
 Established convention


Re: [PATCH] Fix PR80334

2017-04-06 Thread Rainer Orth
Hi Richard,

> The following patch makes sure to preserve (mis-)alignment of memory
> references when IVOPTs generates TARGET_MEM_REFs for them.
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> Richard.
>
> 2017-04-06  Richard Biener  
>
>   PR tree-optimization/80334
>   * tree-ssa-loop-ivopts.c (rewrite_use_address): Properly
>   preserve alignment of accesses.
>
>   * g++.dg/torture/pr80334.C: New testcase.

the new testcase FAILs on 32-bit Solaris/SPARC:

+FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)
+FAIL: g++.dg/torture/pr80334.C   -O1  (test for excess errors)
+FAIL: g++.dg/torture/pr80334.C   -O2  (test for excess errors)
+FAIL: g++.dg/torture/pr80334.C   -O2 -flto  (test for excess errors)
+FAIL: g++.dg/torture/pr80334.C   -O2 -flto -flto-partition=none  (test for excess errors)
+FAIL: g++.dg/torture/pr80334.C   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
+FAIL: g++.dg/torture/pr80334.C   -O3 -g  (test for excess errors)
+FAIL: g++.dg/torture/pr80334.C   -Os  (test for excess errors)

Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/torture/pr80334.C:11:20: 
warning: requested alignment 16 is larger than 8 [-Wattributes]

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] PR80101: Fix ICE in store_data_bypass_p

2017-04-06 Thread Segher Boessenkool
On Thu, Apr 06, 2017 at 10:05:54PM +0200, Eric Botcazou wrote:
> > [This is a repost of a patch previously posted on 3/29/2017.
> > Eric, I hope you might consider that this falls within your scope
> > of maintenance.  Thanks.]
> 
> My viewpoint is that it's better to keep the assertions and fix the back-end 
> instead, which looks rather straightforward.

The only straightforward way I see is to use a rs6000_store_data_bypass_p
instead, which would be doing the same thing.  :-(

[ It is of course easy to work around the specific problem in the testcase,
but that does not solve any of the underlying problems. ]


Segher


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Bernd Edlinger
On 04/05/17 23:02, Bernd Edlinger wrote:
> On 04/05/17 19:22, Bernd Edlinger wrote:
>> On 04/05/17 18:08, Jakub Jelinek wrote:
>>
>> Yes, exactly.  I really want to reach the deadline for gcc-7.
>> Fixing the name is certainly the most important first step,
>> and if everybody agrees on "typeless_storage", for the name
>> I can start with adjusting the name, and look into how
>> to use a spare type-flag that should be a mechanical change.
>>
>
> Jakub, I just renamed the attribute and reworked the patch
> as you suggested, reg-testing is not yet completed, but
> it looks good so far.  I also added a few more tests.
>

Aehm, sorry, actually I ran into a problem with the latest
patch version, where I tried to convert the TYPE_ATTRIBUTE
into a TYPE_FLAG here:
https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00254.html

For instance, g++.dg/cpp1z/init-statement6.C now fails with the
internal error "same canonical type node for different types",
and I was not able to fix it immediately.

Although I would have liked it better this way, I think this
can be fixed separately, unless someone sees an obvious thinko
in the previous version.

So for the moment I have restored typeless_storage as an
ordinary TYPE_ATTRIBUTE.  At least it bootstraps and
causes no test regressions, as always with
languages=all,ada,go,obj-c++


Thanks
Bernd.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 246678)
+++ gcc/doc/extend.texi	(working copy)
@@ -6656,6 +6656,36 @@
 @option{-fstrict-aliasing}, which is on by default at @option{-O2} or
 above.
 
+@item typeless_storage
+@cindex @code{typeless_storage} type attribute
+In the context of section 6.5 paragraph 6 of the C11 standard,
+an object of this type behaves as if it has no declared type.
+In the context of section 6.5 paragraph 7 of the C11 standard,
+an object or a pointer of this type behaves as if it were a
+character type.
+This attribute is similar to the @code{may_alias} attribute,
+except that it is not restricted to pointers.
+
+Example of use:
+
+@smallexample
+typedef int __attribute__((__typeless_storage__)) int_a;
+
+int
+main (void)
+@{
+  int_a a = 0x12345678;
+  short *b = (short *) &a;
+
+  b[1] = 0;
+
+  if (a == 0x12345678)
+abort();
+
+  exit(0);
+@}
+@end smallexample
+
 @item packed
 @cindex @code{packed} type attribute
 This attribute, attached to @code{struct} or @code{union} type
Index: gcc/alias.c
===
--- gcc/alias.c	(revision 246678)
+++ gcc/alias.c	(working copy)
@@ -879,6 +879,10 @@ get_alias_set (tree t)
   t = TREE_TYPE (t);
 }
 
+  /* Honor the typeless_storage type attribute.  */
+  if (lookup_attribute ("typeless_storage", TYPE_ATTRIBUTES (t)))
+return 0;
+
   /* Variant qualifiers don't affect the alias set, so get the main
  variant.  */
   t = TYPE_MAIN_VARIANT (t);
@@ -1234,7 +1238,9 @@ record_component_aliases (tree type)
 		/* VECTOR_TYPE and ARRAY_TYPE share the alias set with their
 		   element type and that type has to be normalized to void *,
 		   too, in the case it is a pointer. */
-		while (!canonical_type_used_p (t) && !POINTER_TYPE_P (t))
+		while (!canonical_type_used_p (t) && !POINTER_TYPE_P (t)
+		   && !lookup_attribute ("typeless_storage",
+	 TYPE_ATTRIBUTES (t)))
 		  {
 		gcc_checking_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
 		t = TREE_TYPE (t);
Index: gcc/tree.c
===
--- gcc/tree.c	(revision 246678)
+++ gcc/tree.c	(working copy)
@@ -8041,7 +8041,8 @@ build_pointer_type_for_mode (tree to_type, machine
 
   /* If the pointed-to type has the may_alias attribute set, force
  a TYPE_REF_CAN_ALIAS_ALL pointer to be generated.  */
-  if (lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type)))
+  if (lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type))
+  || lookup_attribute ("typeless_storage", TYPE_ATTRIBUTES (to_type)))
 can_alias_all = true;
 
   /* In some cases, languages will have things that aren't a POINTER_TYPE
@@ -8110,7 +8111,8 @@ build_reference_type_for_mode (tree to_type, machi
 
   /* If the pointed-to type has the may_alias attribute set, force
  a TYPE_REF_CAN_ALIAS_ALL pointer to be generated.  */
-  if (lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type)))
+  if (lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type))
+  || lookup_attribute ("typeless_storage", TYPE_ATTRIBUTES (to_type)))
 can_alias_all = true;
 
   /* In some cases, languages will have things that aren't a REFERENCE_TYPE
Index: gcc/c-family/c-attribs.c
===
--- gcc/c-family/c-attribs.c	(revision 246678)
+++ gcc/c-family/c-attribs.c	(working copy)
@@ -265,6 +265,7 @@ const struct attribute_spec c_common_attribute_tab
   { "nothrow",0, 0, true,  false, false,
 			  

Re: [PR 79905] ICE with vector_type

2017-04-06 Thread Segher Boessenkool
Hi!

On Thu, Apr 06, 2017 at 02:34:03PM -0400, Nathan Sidwell wrote:
> Segher, this fixes a C++ ICE where TYPE_CANONICALs didn't match, but the 
> types were the same (and non-structural comparison).  The underlying 
> cause is that types with different TYPE_NAME are considered different 
> canonical types.  add_builtin_type smacked TYPE_NAME of the canonical 
> type, therefore guaranteeing that any subsequent vector types would be 
> thought of as different.

> Index: config/rs6000/rs6000.c
> ===
> --- config/rs6000/rs6000.c(revision 246647)
> +++ config/rs6000/rs6000.c(working copy)
> @@ -17257,6 +17257,22 @@ rs6000_expand_builtin (tree exp, rtx tar
>gcc_unreachable ();
>  }
>  
> +/* Create a builtin vector type with a name.  Taking care not to give
> +   the canonical type a name.  */
> +
> +static tree
> +rs6000_vt (const char *name, tree elt_type, unsigned num_elts)

I don't like this cryptic name very much.  Maybe you could just use a
longer name and indent differently (break at the "=" for example), or
do a macro around where it is used a lot?

But, okay for trunk whatever you decide on this.  Thanks!


Segher


Re: [PATCH] PR80101: Fix ICE in store_data_bypass_p

2017-04-06 Thread Eric Botcazou
> [This is a repost of a patch previously posted on 3/29/2017.
> Eric, I hope you might consider that this falls within your scope
> of maintenance.  Thanks.]

My viewpoint is that it's better to keep the assertions and fix the back-end 
instead, which looks rather straightforward.

-- 
Eric Botcazou


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Bernd Edlinger
On 04/06/17 21:14, Richard Biener wrote:
> On April 6, 2017 7:39:14 PM GMT+02:00, Bernd Edlinger 
>  wrote:
>> On 04/06/17 16:17, Florian Weimer wrote:
>>>> Here is what I want to write in the doc:
>>>>
>>>> @item typeless_storage
>>>> @cindex @code{typeless_storage} type attribute
>>>> A type declared with this attribute behaves like a character type
>>>> with respect to aliasing semantics.
>>>> This attribute is similar to the @code{may_alias} attribute,
>>>> except that it is not restricted to pointers.
>>>
>>> As Jakub pointed out, this is not what we need here.  An object of
>>> type char does *not* have untyped storage.  Accessing it as a
>>> different type is still undefined.
>>>
>>
>> but, do you agree that this is valid in C11?
>>
>> typedef char char_a[4];
>>
>> int
>> main (void)
>> {
>>   char_a a = {1,2,3,4};
>>   short *b = (short *) &a;
>>
>>   b[1] = 0;
>>
>>   if (a[0] == 1 && a[1] == 2 && a[2] == 3 && a[3] == 4)
>> abort();
>>
>>   exit(0);
>> }
>>
>>
>> all I want to do is replace "char" with a different type.
>
> Why?

- It feels more orthogonal this way.
- Otherwise malloc would have magic power, in creating objects with no
   declared type.
- And implementing something like malloc in plain C would actually be
   forbidden, which is ridiculous, because I already have done it.
- It was easy to implement in the middle end.
- It feels useful in C and C++.
- Jason says :)


Bernd.

>
> Richard.
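
The "malloc in plain C" point above can be made concrete with a minimal sketch, assuming a C11 compiler.  The names pool, pool_alloc and POOL_SIZE are hypothetical and do not come from the patch.  The allocator only does pointer arithmetic; the contested step is what a caller does with the result, because the bytes it receives belong to an object whose declared type is an array of unsigned char.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bump allocator; every name here is made up for
   illustration.  The pool has the declared type "array of unsigned
   char", which is why handing its bytes out as ints or structs is
   questionable under a strict reading of C11 6.5.  */
#define POOL_SIZE 1024

static _Alignas (max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used;

static void *
pool_alloc (size_t size)
{
  const size_t align = _Alignof (max_align_t);
  /* Round the request up so the next block stays suitably aligned.  */
  size_t rounded = (size + align - 1) / align * align;

  assert (pool_used <= POOL_SIZE);
  if (rounded < size || rounded > POOL_SIZE - pool_used)
    return NULL;                /* Arithmetic overflow or pool exhausted.  */

  void *p = pool + pool_used;
  pool_used += rounded;
  return p;
}
```

A caller that writes "int *p = pool_alloc (sizeof *p); *p = 1;" stores an int into storage declared as unsigned char.  That is the access pattern the attribute is meant to bless; with malloc the same code raises no question because the storage has no declared type.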


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Bernd Edlinger
On 04/06/17 21:05, Florian Weimer wrote:
> On 04/06/2017 08:49 PM, Bernd Edlinger wrote:
>
>> For instance how do you "declare an object without a declared type"?
>
> malloc and other allocation functions return pointers to objects without
> a declared type.
>

Thanks Florian,

this discussion is very helpful.

How about this for the documentation:

@item typeless_storage
@cindex @code{typeless_storage} type attribute
In the context of section 6.5 paragraph 6 of the C11 standard,
an object of this type behaves as if it has no declared type.
In the context of section 6.5 paragraph 7 of the C11 standard,
an object or a pointer of this type behaves as if it were a
character type.
This attribute is similar to the @code{may_alias} attribute,
except that it is not restricted to pointers.

Example of use:

@smallexample
typedef int __attribute__((__typeless_storage__)) int_a;

int
main (void)
@{
   int_a a = 0x12345678;
   short *b = (short *) &a;

   b[1] = 0;

   if (a == 0x12345678)
 abort();

   exit(0);
@}
@end smallexample


Bernd.
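
For comparison, here is a sketch of the same access written with the existing may_alias attribute, where the attribute sits on the type of the pointer used for the access rather than on the object.  It assumes GCC or a compatible compiler, a 32-bit int and a 16-bit short; short_a and clear_one_half are names chosen for illustration only.

```c
#include <assert.h>

/* short_a is a made-up name; __may_alias__ is the existing GCC
   attribute that the proposed text compares against.  */
typedef short __attribute__ ((__may_alias__)) short_a;

/* Overwrite one 16-bit half of an int through a may_alias pointer and
   return the modified int.  Assumes 32-bit int and 16-bit short.  */
static int
clear_one_half (void)
{
  int a = 0x12345678;
  short_a *b = (short_a *) &a;

  assert (sizeof (int) == 2 * sizeof (short));
  b[1] = 0;     /* Which half is cleared depends on endianness.  */
  return a;
}
```

In the proposed example the roles are reversed: the object a carries the attribute and the access goes through a plain short pointer.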


[PATCH, i386] Introduce SSE 4.1 SI->DI zero-extension to moves between SSE registers

2017-04-06 Thread Uros Bizjak
Attached patch considerably improves zero-extended SImode -> DImode
moves between SSE registers for SSE4.1 targets. The patch teaches the
compiler to generate:

vmovdqa m(%rip), %ymm1
vpmovzxdq   %xmm1, %xmm1
vpsrlw  %xmm1, %xmm0, %xmm0

to zero-extend the value in the SSE register, instead of round
tripping the value to GPR:

vmovdqa m(%rip), %ymm1
vmovd   %xmm1, %eax
vmovq   %rax, %xmm1
vpsrlw  %xmm1, %xmm0, %xmm0

... or horrible code for targets without preference to inter-unit moves.

As mentioned by Jakub, there are other optimization opportunities with
count argument handling.

2017-04-06  Uros Bizjak  

PR target/80286
* config/i386/sse.md (*vec_extractv4si_0_zext_sse4): New pattern.
* config/i386/i386.md (*zero_extendsidi2):
Add (?*x,*x) and (?*v,*v) alternatives.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 246738)
+++ config/i386/i386.md (working copy)
@@ -3767,10 +3767,10 @@
 
 (define_insn "*zero_extendsidi2"
   [(set (match_operand:DI 0 "nonimmediate_operand"
-   "=r,?r,?o,r   ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x,*r")
+   "=r,?r,?o,r   ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x,?*x,?*v,*r")
(zero_extend:DI
 (match_operand:SI 1 "x86_64_zext_operand"
-   "0 ,rm,r ,rmWz,0,r   ,m   ,*Yj,*x,r   ,m  ,*k")))]
+   "0 ,rm,r ,rmWz,0,r   ,m   ,*Yj,*x,r   ,m  , *x, *v,*k")))]
   ""
 {
   switch (get_attr_type (insn))
@@ -3791,6 +3791,15 @@
   return "%vpextrd\t{$0, %1, %k0|%k0, %1, 0}";
 
 case TYPE_SSEMOV:
+  if (SSE_REG_P (operands[0]) && SSE_REG_P (operands[1]))
+   {
+ if (EXT_REX_SSE_REG_P (operands[0])
+ || EXT_REX_SSE_REG_P (operands[1]))
+   return "vpmovzxdq\t{%t1, %g0|%g0, %t1}";
+ else
+   return "%vpmovzxdq\t{%1, %0|%0, %1}";
+   }
+
   if (GENERAL_REG_P (operands[0]))
return "%vmovd\t{%1, %k0|%k0, %1}";
 
@@ -3813,6 +3822,10 @@
(eq_attr "alternative" "10")
  (const_string "sse2")
(eq_attr "alternative" "11")
+ (const_string "sse4")
+   (eq_attr "alternative" "12")
+ (const_string "avx512f")
+   (eq_attr "alternative" "13")
  (const_string "x64_avx512bw")
   ]
   (const_string "*")))
@@ -3821,16 +3834,16 @@
  (const_string "multi")
(eq_attr "alternative" "5,6")
  (const_string "mmxmov")
-   (eq_attr "alternative" "7,9,10")
+   (eq_attr "alternative" "7,9,10,11,12")
  (const_string "ssemov")
(eq_attr "alternative" "8")
  (const_string "sselog1")
-   (eq_attr "alternative" "11")
+   (eq_attr "alternative" "13")
  (const_string "mskmov")
   ]
   (const_string "imovx")))
(set (attr "prefix_extra")
- (if_then_else (eq_attr "alternative" "8")
+ (if_then_else (eq_attr "alternative" "8,11,12")
(const_string "1")
(const_string "*")))
(set (attr "length_immediate")
@@ -3848,7 +3861,7 @@
(set (attr "mode")
  (cond [(eq_attr "alternative" "5,6")
  (const_string "DI")
-   (eq_attr "alternative" "7,8,9")
+   (eq_attr "alternative" "7,8,9,11,12")
  (const_string "TI")
   ]
   (const_string "SI")))])
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 246738)
+++ config/i386/sse.md  (working copy)
@@ -13516,18 +13516,6 @@
   "#"
   [(set_attr "isa" "*,sse4,*,*")])
 
-(define_insn_and_split "*vec_extractv4si_0_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-   (zero_extend:DI
- (vec_select:SI
-   (match_operand:V4SI 1 "register_operand" "v")
-   (parallel [(const_int 0)]]
-  "TARGET_64BIT && TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_FROM_VEC"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0) (zero_extend:DI (match_dup 1)))]
-  "operands[1] = gen_lowpart (SImode, operands[1]);")
-
 (define_insn "*vec_extractv2di_0_sse"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=v,m")
(vec_select:DI
@@ -13546,6 +13534,35 @@
   [(set (match_dup 0) (match_dup 1))]
   "operands[1] = gen_lowpart (mode, operands[1]);")
 
+(define_insn "*vec_extractv4si_0_zext_sse4"
+  [(set (match_operand:DI 0 "register_operand" "=r,x,v")
+   (zero_extend:DI
+ (vec_select:SI
+   (match_operand:V4SI 1 "register_operand" "Yj,x,v")
+   (parallel [(const_int 0)]]
+  "TARGET_SSE4_1"
+  "#"
+  [(set_attr "isa" "x64,*,avx512f")])
+
+(define_insn "*vec_extractv4si_0_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Richard Biener
On April 6, 2017 8:12:29 PM GMT+02:00, Bernd Edlinger 
 wrote:
>On 04/06/17 19:47, Florian Weimer wrote:
>> On 04/06/2017 07:39 PM, Bernd Edlinger wrote:
>>> On 04/06/17 16:17, Florian Weimer wrote:
>>>>> Here is what I want to write in the doc:
>>>>>
>>>>> @item typeless_storage
>>>>> @cindex @code{typeless_storage} type attribute
>>>>> A type declared with this attribute behaves like a character type
>>>>> with respect to aliasing semantics.
>>>>> This attribute is similar to the @code{may_alias} attribute,
>>>>> except that it is not restricted to pointers.
>>>>
>>>> As Jakub pointed out, this is not what we need here.  An object of
>>>> type char does *not* have untyped storage.  Accessing it as a
>>>> different type is still undefined.
>>>>
>>>
>>> but, do you agree that this is valid in C11?
>>>
>>> typedef char char_a[4];
>>>
>>> int
>>> main (void)
>>> {
>>>char_a a = {1,2,3,4};
>>>short *b = (short *) &a;
>>>
>>>b[1] = 0;
>>>
>>>if (a[0] == 1 && a[1] == 2 && a[2] == 3 && a[3] == 4)
>>>  abort();
>>>
>>>exit(0);
>>> }
>>>
>>>
>>> all I want to do is replace "char" with a different type.
>>
>> Thanks a lot for posting a concrete example.
>>
>> The effective type of a[2] and a[3] is char.  The character type
>> wildcard in 6.5(7) only applies to the type of the lvalue expression
>> used for the access, not the effective type of the object being
>> accessed.  The type of the LHS of the assignment expression is short.
>> So the access is undefined.
>>
>
>exactly *that* is what I want to make valid with that attribute, which
>would be also useful in C and kernel code, IMHO.
>
>But isn't the effective type changed by the assignment b[1] = 0;
>as described in 6.5(6):
>"If a value is stored into an object having no declared type through an
>lvalue having a type that is not a character type, then the type of the
>lvalue becomes the effective type of the object for that access and for
>subsequent accesses that do not modify the stored value."

Yes.  I think the example is valid.  At least GCC's memory model makes it so.

Richard.

>
>
>Bernd.
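
The rule quoted from 6.5(6) can be sketched for the uncontroversial case: storage obtained from malloc, which has no declared type.  The helper name reuse_untyped_storage is made up for illustration, and the sketch says nothing about the char array case debated above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: reuses one malloc'ed block first as an int and
   then as a float.  malloc'ed storage has no declared type, so each
   store sets the effective type for the reads that follow it.  */
static int
reuse_untyped_storage (void)
{
  void *p = malloc (sizeof (int) > sizeof (float)
                    ? sizeof (int) : sizeof (float));
  assert (p != NULL);

  int *ip = p;
  *ip = 42;                     /* Effective type is now int.  */
  int seen = *ip;               /* Read through the matching type.  */

  float *fp = p;
  *fp = 1.5f;                   /* This store makes the effective type float.  */
  int ok = (*fp == 1.5f);       /* Reading *ip at this point would be undefined.  */

  free (p);
  return ok ? seen : -1;
}
```

The open question in this thread is whether a store through a short lvalue may do the same to an object that was declared as a char array.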



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Richard Biener
On April 6, 2017 7:39:14 PM GMT+02:00, Bernd Edlinger 
 wrote:
>On 04/06/17 16:17, Florian Weimer wrote:
>>> Here is what I want to write in the doc:
>>>
>>> @item typeless_storage
>>> @cindex @code{typeless_storage} type attribute
>>> A type declared with this attribute behaves like a character type
>>> with respect to aliasing semantics.
>>> This attribute is similar to the @code{may_alias} attribute,
>>> except that it is not restricted to pointers.
>>
>> As Jakub pointed out, this is not what we need here.  An object of
>> type char does *not* have untyped storage.  Accessing it as a
>> different type is still undefined.
>>
>
>but, do you agree that this is valid in C11?
>
>typedef char char_a[4];
>
>int
>main (void)
>{
>   char_a a = {1,2,3,4};
>   short *b = (short *) &a;
>
>   b[1] = 0;
>
>   if (a[0] == 1 && a[1] == 2 && a[2] == 3 && a[3] == 4)
> abort();
>
>   exit(0);
>}
>
>
>all I want to do is replace "char" with a different type.

Why?

Richard.

>Bernd.
>
>> The documentation says that the memory region is considered to be
>> untyped, like a memory region returned by malloc (but obviously not
>> with the implication that the memory region is separated from
>> everything else).
>>
>> Thanks,
>> Florian



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Richard Biener
On April 6, 2017 4:51:01 PM GMT+02:00, Florian Weimer  
wrote:
>On 04/06/2017 04:43 PM, Jonathan Wakely wrote:
>> On 06/04/17 16:23 +0200, Richard Biener wrote:
>>> On Thu, 6 Apr 2017, Florian Weimer wrote:
>>>
>>>> On 04/06/2017 04:11 PM, Bernd Edlinger wrote:
>>>>
>>>>> I think it is not too complicated to do in the C++ FE.
>>>>> The FE looks for arrays of std::byte and unsigned char,
>>>>> and sets the attribute when the final type is constructed.
>>>>>
>>>>> What I am trying to do is just extend the semantics of may_alias
>>>>> a bit, and then have the C++ FE use it in the way it has to.
>>>>
>>>> We also need this for some POSIX and Linux kernel interfaces.  A
>>>> C++-only solution would not help with that.
>>>
>>> Example(s)?
>>
>> sockaddr_storage comes to mind.
>
>Right.  The kernel also has many APIs which return multiple 
>variable-length data blocks, such as getdents64, and many more 
>interfaces in combination with read/recv system calls.  Variable length
>means that you cannot declare the appropriate type after the first data
>item, so you technically have to use malloc.
>
>POSIX interfaces which exhibit a similar pattern are getpwnam_r and 
>friends, but for them, you can probably use malloc without ill effect 
>(although there are still performance concerns).

Can you give a concrete example which shows the issue and how typeless_storage 
helps?

Thanks,
Richard.

>Thanks,
>Florian
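
A minimal sketch of the variable-length pattern described above, assuming a
hypothetical record header (struct rec_header is illustrative only and is not
the real getdents64 layout).  Each header is read with memcpy, which is
well-defined even when the buffer was declared as a char array; the attribute
proposed in this thread aims to make the direct struct access valid instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record header, loosely modelled on variable-length
   directory entries.  Not a real kernel structure.  */
struct rec_header
{
  unsigned short reclen;   /* total size of this record in bytes */
  unsigned char kind;      /* illustrative payload tag */
};

/* Count the records stored back to back in BUF[0..LEN) without ever
   dereferencing a struct rec_header pointer into the char buffer.  */
static size_t
count_records (const char *buf, size_t len)
{
  size_t pos = 0, n = 0;

  while (len - pos >= sizeof (struct rec_header))
    {
      struct rec_header h;

      memcpy (&h, buf + pos, sizeof h);   /* byte-wise copy of the header */
      if (h.reclen < sizeof h || h.reclen > len - pos)
        break;                            /* malformed or truncated record */
      pos += h.reclen;
      n++;
    }
  return n;
}
```

The copy costs a few bytes per record; compilers commonly turn such a small
fixed-size memcpy into a plain load, but that is an optimization and not a
guarantee.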



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Florian Weimer

On 04/06/2017 08:49 PM, Bernd Edlinger wrote:


For instance how do you "declare an object without a declared type"?


malloc and other allocation functions return pointers to objects without 
a declared type.


Thanks,
Florian
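
A minimal sketch of what "no declared type" means in practice, assuming C11
effective-type rules (6.5(6)); the function name is made up for illustration.
Storage returned by malloc takes its effective type from the most recent store
through a non-character lvalue, so the same bytes can legitimately hold a long
and later a double.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if both stores were read back correctly, 0 otherwise
   (including allocation failure).  */
static int
reuse_allocated_storage (void)
{
  size_t size = sizeof (long) > sizeof (double)
                ? sizeof (long) : sizeof (double);
  void *p = malloc (size);
  if (p == NULL)
    return 0;

  long *lp = p;
  *lp = 42;                 /* effective type of the storage is now long */
  long seen = *lp;          /* read through the matching type */

  double *dp = p;
  *dp = 1.5;                /* a new store changes the effective type */
  int ok = (seen == 42 && *dp == 1.5);

  free (p);
  return ok;
}
```

Doing the same with an object declared as long would not be valid, because a
declared object keeps its declared type as its effective type.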



[PATCH, i386]: Fix PR 79733, ICE in int_mode_for_mode, at stor-layout.c

2017-04-06 Thread Uros Bizjak
Hello!

Attached patch rewrites handling of
IX86_BUILTIN_K{,OR}TEST{C,Z}{8,16,32,64} builtins. It is possible to
get VOIDmode const0_rtx when expanding arguments, and code was not
prepared to handle it. The expansion crashed when copying VOIDmode
immediate to a reg via copy_to_reg.

2017-04-06  Uros Bizjak  

PR target/79733
* config/i386/i386.c (ix86_expand_builtin)
: Determine insn operand
mode from insn data. Convert operands to insn operand mode.
Copy operands that don't satisfy insn predicate to a register.

testsuite/ChangeLog:

2017-04-06  Uros Bizjak  

PR target/79733
* gcc.target/i386/pr79733.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN, will be backported to gcc-6 branch.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 246733)
+++ config/i386/i386.c  (working copy)
@@ -37752,98 +37752,82 @@ rdseed_step:
 
 case IX86_BUILTIN_KTESTC8:
   icode = CODE_FOR_ktestqi;
-  mode0 = QImode;
-  mode1 = CCCmode;
+  mode3 = CCCmode;
   goto kortest;
 
 case IX86_BUILTIN_KTESTZ8:
   icode = CODE_FOR_ktestqi;
-  mode0 = QImode;
-  mode1 = CCZmode;
+  mode3 = CCZmode;
   goto kortest;
 
 case IX86_BUILTIN_KTESTC16:
   icode = CODE_FOR_ktesthi;
-  mode0 = HImode;
-  mode1 = CCCmode;
+  mode3 = CCCmode;
   goto kortest;
 
 case IX86_BUILTIN_KTESTZ16:
   icode = CODE_FOR_ktesthi;
-  mode0 = HImode;
-  mode1 = CCZmode;
+  mode3 = CCZmode;
   goto kortest;
 
 case IX86_BUILTIN_KTESTC32:
   icode = CODE_FOR_ktestsi;
-  mode0 = SImode;
-  mode1 = CCCmode;
+  mode3 = CCCmode;
   goto kortest;
 
 case IX86_BUILTIN_KTESTZ32:
   icode = CODE_FOR_ktestsi;
-  mode0 = SImode;
-  mode1 = CCZmode;
+  mode3 = CCZmode;
   goto kortest;
 
 case IX86_BUILTIN_KTESTC64:
   icode = CODE_FOR_ktestdi;
-  mode0 = DImode;
-  mode1 = CCCmode;
+  mode3 = CCCmode;
   goto kortest;
 
 case IX86_BUILTIN_KTESTZ64:
   icode = CODE_FOR_ktestdi;
-  mode0 = DImode;
-  mode1 = CCZmode;
+  mode3 = CCZmode;
   goto kortest;
 
 case IX86_BUILTIN_KORTESTC8:
   icode = CODE_FOR_kortestqi;
-  mode0 = QImode;
-  mode1 = CCCmode;
+  mode3 = CCCmode;
   goto kortest;
 
 case IX86_BUILTIN_KORTESTZ8:
   icode = CODE_FOR_kortestqi;
-  mode0 = QImode;
-  mode1 = CCZmode;
+  mode3 = CCZmode;
   goto kortest;
 
 case IX86_BUILTIN_KORTESTC16:
   icode = CODE_FOR_kortesthi;
-  mode0 = HImode;
-  mode1 = CCCmode;
+  mode3 = CCCmode;
   goto kortest;
 
 case IX86_BUILTIN_KORTESTZ16:
   icode = CODE_FOR_kortesthi;
-  mode0 = HImode;
-  mode1 = CCZmode;
+  mode3 = CCZmode;
   goto kortest;
 
 case IX86_BUILTIN_KORTESTC32:
   icode = CODE_FOR_kortestsi;
-  mode0 = SImode;
-  mode1 = CCCmode;
+  mode3 = CCCmode;
   goto kortest;
 
 case IX86_BUILTIN_KORTESTZ32:
   icode = CODE_FOR_kortestsi;
-  mode0 = SImode;
-  mode1 = CCZmode;
+  mode3 = CCZmode;
   goto kortest;
 
 case IX86_BUILTIN_KORTESTC64:
   icode = CODE_FOR_kortestdi;
-  mode0 = DImode;
-  mode1 = CCCmode;
+  mode3 = CCCmode;
   goto kortest;
 
 case IX86_BUILTIN_KORTESTZ64:
   icode = CODE_FOR_kortestdi;
-  mode0 = DImode;
-  mode1 = CCZmode;
+  mode3 = CCZmode;
 
 kortest:
   arg0 = CALL_EXPR_ARG (exp, 0); /* Mask reg src1.  */
@@ -37851,19 +37835,32 @@ rdseed_step:
   op0 = expand_normal (arg0);
   op1 = expand_normal (arg1);
 
-  op0 = copy_to_reg (op0);
-  op0 = lowpart_subreg (mode0, op0, GET_MODE (op0));
-  op1 = copy_to_reg (op1);
-  op1 = lowpart_subreg (mode0, op1, GET_MODE (op1));
+  mode0 = insn_data[icode].operand[0].mode;
+  mode1 = insn_data[icode].operand[1].mode;
 
+  if (GET_MODE (op0) != VOIDmode)
+   op0 = force_reg (GET_MODE (op0), op0);
+
+  op0 = gen_lowpart (mode0, op0);
+
+  if (!insn_data[icode].operand[0].predicate (op0, mode0))
+   op0 = copy_to_mode_reg (mode0, op0);
+
+  if (GET_MODE (op1) != VOIDmode)
+   op1 = force_reg (GET_MODE (op1), op1);
+
+  op1 = gen_lowpart (mode1, op1);
+
+  if (!insn_data[icode].operand[1].predicate (op1, mode1))
+   op1 = copy_to_mode_reg (mode1, op1);
+
   target = gen_reg_rtx (QImode);
-  emit_insn (gen_rtx_SET (target, const0_rtx));
 
   /* Emit kortest.  */
   emit_insn (GEN_FCN (icode) (op0, op1));
   /* And use setcc to return result from flags.  */
   ix86_expand_setcc (target, EQ,
-gen_rtx_REG (mode1, FLAGS_REG), const0_rtx);
+gen_rtx_REG (mode3, FLAGS_REG), const0_rtx);
   return target;
 
 

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Bernd Edlinger
On 04/06/17 20:19, Florian Weimer wrote:
>
> I don't know what your patch does, but your proposed documentation does
> not make this valid because “declared as char” is still not “having no
> declared type”.  Or put differently, “behaves like a character type” is
> not what we actually want here.
>

What the patch does is just so simple but it is hard for me to find the
right words so that really everybody understands:

Technically, we already have the may_alias attribute, that forces all
access through pointers to have "alias set 0" that in turn makes all
other objects volatile, unless the compiler can prove that the address
is in fact different.  But it has no impact on DECLs, so if you
use may_alias on a type, and you declare an object with that type,
then directly accessing that object by name does NOT have "alias set 0".
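
A minimal sketch of the existing may_alias behaviour described above, assuming
GCC or a compatible compiler; the typedef and function names are made up for
illustration.  The attribute takes effect for accesses made through a pointer
to the attributed type.

```c
#include <assert.h>
#include <stddef.h>

/* GCC extension: accesses through a short_a pointer may alias any object.  */
typedef short __attribute__ ((__may_alias__)) short_a;

/* Clear an int by writing short-sized pieces through a may_alias pointer,
   then read the int back directly.  */
static int
clear_through_alias (void)
{
  int a = 0x12345678;
  short_a *b = (short_a *) &a;

  for (size_t i = 0; i < sizeof a / sizeof *b; i++)
    b[i] = 0;               /* permitted to alias the int object */

  return a;
}
```

Without the attribute on the typedef, the stores through b would violate the
aliasing rules and the compiler would be free to assume a is unchanged.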

When I noticed that in the context of PR79671 I initially thought that
was by accident, but Richi pointed out that this is a useful feature for
vector types, that are always declared as may_alias, and moreover
the may_alias is / has been always documented to have only meaning
on pointers, all that changed is that the TBAA aliasing oracle has
improved recently to follow the specified behavior more closely.

My patch simply duplicates the semantic of may_alias and adds
"alias set 0" for accesses through DECLs of that type.


However, I must confess I find it difficult to understand the
language in which the ISO standard is written.

For instance how do you "declare an object without a declared type"?


> Let me repeat that I don't know if this is merely a documentation issue.
>
> Thanks,
> Florian


Re: [PATCH] Don't error about x86 return value in SSE reg (or x86 reg) or argument in SSE reg too early (PR target/80298)

2017-04-06 Thread Uros Bizjak
On Wed, Apr 5, 2017 at 5:37 PM, Uros Bizjak  wrote:

> 2017-04-05  Uros Bizjak  
>
> PR target/80298
> * config/i386/mmintrin.h: Add -msse target option when __SSE__ is
> not defined for x86_64 target.  Add -mmmx target option when __SSE2__
> is not defined.
> * config/i386/mm3dnow.h: Add -msse target when __SSE__ is not defined
> for x86_64 target.  Handle -m3dnowa option.
>
> I choose not to include testcases, since mm_malloc includes stdlib.h,
> which uses SSE register return with -O2, resulting in:

Including only  and its dependant mmintrin.h tests the
issue as well while avoiding external include files.

2017-04-06  Uros Bizjak  

PR target/80298
* gcc.target/i386/pr80298-1.c: New test.
* gcc.target/i386/pr80298-2.c: Ditto.

Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.

Uros.

Index: gcc.target/i386/pr80298-1.c
===
--- gcc.target/i386/pr80298-1.c (nonexistent)
+++ gcc.target/i386/pr80298-1.c (working copy)
@@ -0,0 +1,7 @@
+/* PR target/80298 */
+/* { dg-do compile } */
+/* { dg-options "-mno-sse -mmmx" } */
+
+#include 
+
+int i;
Index: gcc.target/i386/pr80298-2.c
===
--- gcc.target/i386/pr80298-2.c (nonexistent)
+++ gcc.target/i386/pr80298-2.c (working copy)
@@ -0,0 +1,7 @@
+/* PR target/80298 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-sse -mmmx" } */
+
+#include 
+
+int i;


Re: [PR 79905] ICE with vector_type

2017-04-06 Thread Nathan Sidwell

On 04/06/2017 11:13 AM, Bill Schmidt wrote:


Nathan's patch regstraps cleanly.  I'll try Richard's variant (dropping the if 
test below) now.


As expected, this version passes regstrap as well.


Thanks for testing.

Segher, this fixes a C++ ICE where TYPE_CANONICALs didn't match, but the 
types were the same (and non-structural comparison).  The underlying 
cause is that types with different TYPE_NAME are considered different 
canonical types.  add_builtin_type smacked TYPE_NAME of the canonical 
type, therefore guaranteeing that any subsequent vector types would be 
thought of as different.


ok?

Bill's been testing these patches.

nathan

--
Nathan Sidwell
2017-04-05  Nathan Sidwell  
	Richard Biener  

	PR target/79905
	* config/rs6000/rs6000.c (rs6000_vt): New.
	(rs6000_init_builtins): Use it.

	testsuite/
	* g++.dg/torture/pr79905.C: New.

Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 246647)
+++ config/rs6000/rs6000.c	(working copy)
@@ -17257,6 +17257,22 @@ rs6000_expand_builtin (tree exp, rtx tar
   gcc_unreachable ();
 }
 
+/* Create a builtin vector type with a name.  Taking care not to give
+   the canonical type a name.  */
+
+static tree
+rs6000_vt (const char *name, tree elt_type, unsigned num_elts)
+{
+  tree result = build_vector_type (elt_type, num_elts);
+
+  /* Copy so we don't give the canonical type a name.  */
+  result = build_variant_type_copy (result);
+
+  add_builtin_type (name, result);
+
+  return result;
+}
+
 static void
 rs6000_init_builtins (void)
 {
@@ -17273,18 +17289,25 @@ rs6000_init_builtins (void)
 
   V2SI_type_node = build_vector_type (intSI_type_node, 2);
   V2SF_type_node = build_vector_type (float_type_node, 2);
-  V2DI_type_node = build_vector_type (intDI_type_node, 2);
-  V2DF_type_node = build_vector_type (double_type_node, 2);
+  V2DI_type_node = rs6000_vt (TARGET_POWERPC64 ? "__vector long"
+			  : "__vector long long", intDI_type_node, 2);
+  V2DF_type_node = rs6000_vt ("__vector double", double_type_node, 2);
   V4HI_type_node = build_vector_type (intHI_type_node, 4);
-  V4SI_type_node = build_vector_type (intSI_type_node, 4);
-  V4SF_type_node = build_vector_type (float_type_node, 4);
-  V8HI_type_node = build_vector_type (intHI_type_node, 8);
-  V16QI_type_node = build_vector_type (intQI_type_node, 16);
-
-  unsigned_V16QI_type_node = build_vector_type (unsigned_intQI_type_node, 16);
-  unsigned_V8HI_type_node = build_vector_type (unsigned_intHI_type_node, 8);
-  unsigned_V4SI_type_node = build_vector_type (unsigned_intSI_type_node, 4);
-  unsigned_V2DI_type_node = build_vector_type (unsigned_intDI_type_node, 2);
+  V4SI_type_node = rs6000_vt ("__vector signed int", intSI_type_node, 4);
+  V4SF_type_node = rs6000_vt ("__vector float", float_type_node, 4);
+  V8HI_type_node = rs6000_vt ("__vector signed short", intHI_type_node, 8);
+  V16QI_type_node = rs6000_vt ("__vector signed char", intQI_type_node, 16);
+
+  unsigned_V16QI_type_node = rs6000_vt ("__vector unsigned char",
+	unsigned_intQI_type_node, 16);
+  unsigned_V8HI_type_node = rs6000_vt ("__vector unsigned short",
+   unsigned_intHI_type_node, 8);
+  unsigned_V4SI_type_node = rs6000_vt ("__vector unsigned int",
+   unsigned_intSI_type_node, 4);
+  unsigned_V2DI_type_node = rs6000_vt (TARGET_POWERPC64
+   ? "__vector unsigned long"
+   : "__vector unsigned long long",
+   unsigned_intDI_type_node, 2);
 
   opaque_V2SF_type_node = build_opaque_vector_type (float_type_node, 2);
   opaque_V2SI_type_node = build_opaque_vector_type (intSI_type_node, 2);
@@ -17299,8 +17322,9 @@ rs6000_init_builtins (void)
  must live in VSX registers.  */
   if (intTI_type_node)
 {
-  V1TI_type_node = build_vector_type (intTI_type_node, 1);
-  unsigned_V1TI_type_node = build_vector_type (unsigned_intTI_type_node, 1);
+  V1TI_type_node = rs6000_vt ("__vector __int128", intTI_type_node, 1);
+  unsigned_V1TI_type_node = rs6000_vt ("__vector unsigned __int128",
+	   unsigned_intTI_type_node, 1);
 }
 
   /* The 'vector bool ...' types must be kept distinct from 'vector unsigned ...'
@@ -17432,83 +17456,16 @@ rs6000_init_builtins (void)
   tdecl = add_builtin_type ("__pixel", pixel_type_node);
   TYPE_NAME (pixel_type_node) = tdecl;
 
-  bool_V16QI_type_node = build_vector_type (bool_char_type_node, 16);
-  bool_V8HI_type_node = build_vector_type (bool_short_type_node, 8);
-  bool_V4SI_type_node = build_vector_type (bool_int_type_node, 4);
-  bool_V2DI_type_node = build_vector_type (bool_long_type_node, 2);
-  pixel_V8HI_type_node = build_vector_type (pixel_type_node, 8);
-
-  tdecl = add_builtin_type ("__vector unsigned char", unsigned_V16QI_type_node);
-  TYPE_NAME (unsigned_V16QI_type_node) = tdecl;
-
-  tdecl = add_builtin_type ("__vector signed char", V16QI_type_node);
-  TYPE_NAME (V16QI_type_node) = tdecl;
-

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Florian Weimer

On 04/06/2017 08:12 PM, Bernd Edlinger wrote:

On 04/06/17 19:47, Florian Weimer wrote:

On 04/06/2017 07:39 PM, Bernd Edlinger wrote:

On 04/06/17 16:17, Florian Weimer wrote:

Here is what I want to write in the doc:

@item typeless_storage
@cindex @code{typeless_storage} type attribute
A type declared with this attribute behaves like a character type
with respect to aliasing semantics.
This attribute is similar to the @code{may_alias} attribute,
except that it is not restricted to pointers.


As Jakub pointed out, this is not what we need here.  An object of type
char does *not* have untyped storage.  Accessing it as a different type
is still undefined.



but, do you agree that this is valid in C11?

typedef char char_a[4];

int
main (void)
{
   char_a a = {1,2,3,4};
   short *b = (short *) &a;

   b[1] = 0;

   if (a[0] == 1 && a[1] == 2 && a[2] == 3 && a[3] == 4)
 abort();

   exit(0);
}


all I want to do is replace "char" with a different type.


Thanks a lot for posting a concrete example.

The effective type of a[2] and a[3] is char.  The character type wildcard
in 6.5(7) only applies to the type of the lvalue expression used for the
access, not the effective type of the object being accessed.  The type
of the LHS of the assignment expression is short.  So the access is
undefined.



exactly *that* is what I want to make valid with that attribute, which
would be also useful in C and kernel code, IMHO.


And I think we all agree that this is a laudable goal.


But isn't the effective type changed by the assignment b[1] = 0;
as described in 6.5(6):
"If a value is stored into an object having no declared type through an
lvalue having a type that is not a character type, then the type of the
lvalue becomes the effective type of the object for that access and for
subsequent accesses that do not modify the stored value."


I don't know what your patch does, but your proposed documentation does 
not make this valid because “declared as char” is still not “having no 
declared type”.  Or put differently, “behaves like a character type” is 
not what we actually want here.


Let me repeat that I don't know if this is merely a documentation issue.

Thanks,
Florian


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Bernd Edlinger
On 04/06/17 19:47, Florian Weimer wrote:
> On 04/06/2017 07:39 PM, Bernd Edlinger wrote:
>> On 04/06/17 16:17, Florian Weimer wrote:
 Here is what I want to write in the doc:

 @item typeless_storage
 @cindex @code{typeless_storage} type attribute
 A type declared with this attribute behaves like a character type
 with respect to aliasing semantics.
 This attribute is similar to the @code{may_alias} attribute,
 except that it is not restricted to pointers.
>>>
>>> As Jakub pointed out, this is not what we need here.  An object of type
>>> char does *not* have untyped storage.  Accessing it as a different type
>>> is still undefined.
>>>
>>
>> but, do you agree that this is valid in C11?
>>
>> typedef char char_a[4];
>>
>> int
>> main (void)
>> {
>>    char_a a = {1,2,3,4};
>>    short *b = (short *) &a;
>>
>>    b[1] = 0;
>>
>>    if (a[0] == 1 && a[1] == 2 && a[2] == 3 && a[3] == 4)
>>      abort();
>>
>>    exit(0);
>> }
>>
>>
>> all I want to do is replace "char" with a different type.
>
> Thanks a lot for posting a concrete example.
>
> The effective type of a[2] and a[3] is char.  The character type wildcard
> in 6.5(7) only applies to the type of the lvalue expression used for the
> access, not the effective type of the object being accessed.  The type
> of the LHS of the assignment expression is short.  So the access is
> undefined.
>

exactly *that* is what I want to make valid with that attribute, which
would be also useful in C and kernel code, IMHO.

But isn't the effective type changed by the assignment b[1] = 0;
as described in 6.5(6):
"If a value is stored into an object having no declared type through an
lvalue having a type that is not a character type, then the type of the
lvalue becomes the effective type of the object for that access and for
subsequent accesses that do not modify the stored value."



Bernd.


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Florian Weimer

On 04/06/2017 07:39 PM, Bernd Edlinger wrote:

On 04/06/17 16:17, Florian Weimer wrote:

Here is what I want to write in the doc:

@item typeless_storage
@cindex @code{typeless_storage} type attribute
A type declared with this attribute behaves like a character type
with respect to aliasing semantics.
This attribute is similar to the @code{may_alias} attribute,
except that it is not restricted to pointers.


As Jakub pointed out, this is not what we need here.  An object of type
char does *not* have untyped storage.  Accessing it as a different type
is still undefined.



but, do you agree that this is valid in C11?

typedef char char_a[4];

int
main (void)
{
   char_a a = {1,2,3,4};
   short *b = (short *) &a;

   b[1] = 0;

   if (a[0] == 1 && a[1] == 2 && a[2] == 3 && a[3] == 4)
 abort();

   exit(0);
}


all I want to do is replace "char" with a different type.


Thanks a lot for posting a concrete example.

The effective type of a[2] and a[3] is char.  The character type wildcard
in 6.5(7) only applies to the type of the lvalue expression used for the
access, not the effective type of the object being accessed.  The type 
of the LHS of the assignment expression is short.  So the access is 
undefined.


Florian
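
A minimal sketch of the one-directional rule explained above, assuming C11
6.5(7); both helper names are made up for illustration.  Reading a short
object through a character lvalue is allowed, while writing a short into a
declared char array needs a byte-wise copy to stay defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Allowed: the lvalue has character type, the object is a short.  */
static int
bytes_are_zero (const short *s)
{
  const unsigned char *p = (const unsigned char *) s;

  for (size_t i = 0; i < sizeof *s; i++)
    if (p[i] != 0)
      return 0;
  return 1;
}

/* The reverse direction: DST points into a char array, so a store
   through a short lvalue would be undefined.  memcpy copies the
   object representation byte by byte instead.  */
static void
store_short (char *dst, short value)
{
  memcpy (dst, &value, sizeof value);
}
```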


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Bernd Edlinger
On 04/06/17 16:17, Florian Weimer wrote:
>> Here is what I want to write in the doc:
>>
>> @item typeless_storage
>> @cindex @code{typeless_storage} type attribute
>> A type declared with this attribute behaves like a character type
>> with respect to aliasing semantics.
>> This attribute is similar to the @code{may_alias} attribute,
>> except that it is not restricted to pointers.
>
> As Jakub pointed out, this is not what we need here.  An object of type
> char does *not* have untyped storage.  Accessing it as a different type
> is still undefined.
>

but, do you agree that this is valid in C11?

typedef char char_a[4];

int
main (void)
{
   char_a a = {1,2,3,4};
   short *b = (short *) &a;

   b[1] = 0;

   if (a[0] == 1 && a[1] == 2 && a[2] == 3 && a[3] == 4)
 abort();

   exit(0);
}


all I want to do is replace "char" with a different type.

Bernd.

> The documentation says that the memory region is considered to be
> untyped, like a memory region returned by malloc (but obviously not with
> the implication that the memory region is separated from everything else).
>
> Thanks,
> Florian


Re: [PATCH] Destroy arguments for _Cilk_spawn calling in the child (PR 80038)

2017-04-06 Thread Jeff Law

On 03/31/2017 07:50 AM, Xi Ruoyao wrote:

Hi,

I've sent this patch once 
().
But I haven't got any response.  I'd like to resend it instead of pinging, and 
explain it more.
There's a couple things going on here.   Cilk+ does not currently have a 
maintainer, thus review of Cilk+ changes can easily fall through the 
cracks.  Additionally, the plan is to deprecate Cilk+ in gcc-7, so it's 
very very low on our priority list.


With the likely deprecation in mind, I've only done a cursory review of 
the changes -- mostly to verify that they hit Cilk+ paths only.


What's the purpose behind changing when we set the in_lto_p flag?

Jeff



Re: [PATCH] Increase memory limit for genautomata on AIX

2017-04-06 Thread Jeff Law

On 03/31/2017 05:42 AM, Sam Thursfield wrote:

When doing 64-bit builds of GCC on AIX, genautomata experiences malloc()
failures. It seems that a 512MB heap is no longer big enough.

I also updated the comments in this file to match the values passed to
the linker. The AIX `ld` manual says this about the maxdata option:

Sets the maximum size (in bytes) that is allowed for the user data
area (or user heap) when the executable program is run.

So the comments were giving incorrect values.

gcc/

* config/rs6000/x-aix: Increase memory limit for genautomata on AIX.
  It has been experiencing malloc() failures during 64-bit compiler
  builds. Also correct the comments.
Thanks.  I removed the comment about jc1 (Java was removed a while ago) 
and installed the rest of your patch onto the trunk.


Jeff


Re: [PATCH] Fix dwarf2out ICE with C++17 inline static data members with redundant redeclaration (PR debug/80234)

2017-04-06 Thread Jeff Law

On 03/29/2017 01:42 PM, Jakub Jelinek wrote:

Hi!

When a C++17 inline static data member has a redundant out-of-class
deprecated redeclaration, we can end up with 2 DW_TAG_variable in
DW_TAG_compile_unit, one DW_AT_declaration and one with DW_AT_specification
pointing to it (the latter emitted for the redeclaration), before
gen_member_die can do its job.  In there we want to move the declaration
DIE into the class and have a CU child DW_TAG_variable that has
DW_AT_specification pointing to it.  But in this case we've put
the DIE with DW_AT_specification into the hash table and gen_member_die
ICEs in splice_child_die.  The following patch handles that case gracefully,
by moving the DW_AT_declaration DIE into the class instead of trying to
move the DW_AT_specification one, and by making sure we don't create
yet another DIE with DW_AT_specification because we already have one.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-03-29  Jakub Jelinek  

PR debug/80234
* dwarf2out.c (gen_member_die): Handle C++17 inline static data
members with redundant out-of-class redeclaration.

* g++.dg/debug/dwarf2/pr80234-1.C: New test.
* g++.dg/debug/dwarf2/pr80234-2.C: New test.

OK.
jeff



Re: [PATCH 0/3] Introduce internal_error_cont and exclude it from pot files

2017-04-06 Thread Jeff Law

On 03/24/2017 03:29 AM, Martin Liška wrote:

I would like to ping that. I'm not sure what's agreement after I read
discussion in: https://gcc.gnu.org/ml/gcc/2017-03/msg00070.html

Martin Sebor may know, CC'ing him.
Not sure if you're pinging the internal_error_cont stuff, or the ODR 
diagnostics changes.


WRT the ODR diagnostics, I'd say let's go with the C++17 style 
(all-lowercase with a hyphen).


If you've got a pointer to the internal_err_cont changes, please pass it 
along.


jeff



Re: [PATCH, GCC/ARM, gcc-6-branch] Fix PR80082: LDRD erronously used for 64bit load on ARMv7-R

2017-04-06 Thread Thomas Preudhomme



On 06/04/17 14:05, Ramana Radhakrishnan wrote:

On Mon, Mar 27, 2017 at 12:15 PM, Thomas Preudhomme
 wrote:

Hi,

Currently GCC is happy to use LDRD to perform a 64bit load on ARMv7-R,
as shown by the testcase in this patch. However, LDRD is only atomic
when the LPAE extensions are available, which they are not for ARMv7-R. This
commit solves the issue by introducing a new feature bit to distinguish
LPAE extensions instead of deducing it from div instruction
availability.



Ok but with the testsuite fix that I just approved,  please also fix
in gcc-5 branch.


The backport already contains it. I haven't asked for gcc-5 branch yet because 
testing is still ongoing. Will send a separate mail once testing is done.


Best regards,

Thomas


Re: [PATCH] S/390: Optimize atomic_compare_exchange and atomic_compare builtins.

2017-04-06 Thread Ulrich Weigand
I wrote (incorrectly):

> >[(parallel
> >  [(set (match_operand:SI 0 "register_operand" "")
> >   (match_operator:SI 1 "s390_eqne_operator"
> > -   [(match_operand:CCZ1 2 "register_operand")
> > +   [(match_operand 2 "cc_reg_operand")
> > (match_operand 3 "const0_operand")]))
> >   (clobber (reg:CC CC_REGNUM))])]
> >""
> > -  "emit_insn (gen_sne (operands[0], operands[2]));
> > -   if (GET_CODE (operands[1]) == EQ)
> > - emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> > +  "machine_mode mode = GET_MODE (operands[2]);
> > +   if (TARGET_Z196)
> > + {
> > +   rtx cond, ite;
> > +
> > +   if (GET_CODE (operands[1]) == NE)
> > +cond = gen_rtx_NE (VOIDmode, operands[2], const0_rtx);
> > +   else
> > +cond = gen_rtx_EQ (VOIDmode, operands[2], const0_rtx);
> 
> I guess as a result cond is now always the same as operands[1] and
> could be just taken from there?

This is wrong -- I didn't notice the mode changes (in the cstore
pattern, the mode on the operator is SImode, but for the if_then_else
we want VOIDmode).

> > +   ite = gen_rtx_IF_THEN_ELSE (SImode, cond, const1_rtx, const0_rtx);
> > +   emit_insn (gen_rtx_SET (operands[0], ite));
> 
> Also, since you're just emitting RTL directly, maybe you could simply use
> the expander pattern above to do so (and not use emit_insn followed
> by DONE in this case?)

Therefore this doesn't work either.

Sorry for the confusion.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] S/390: Optimize atomic_compare_exchange and atomic_compare builtins.

2017-04-06 Thread Ulrich Weigand
Dominik Vogt wrote:

> > v3:
> > 
> >   * Remove sne* patterns.
> >   * Move alignment check from s390_expand_cs to s390.md.
> >   * Use s_operand instead of memory_nosymref_operand.
> >   * Remove memory_nosymref_operand.
> >   * Allow any CC-mode in cstorecc4 for TARGET_Z196.
> >   * Fix EQ with TARGET_Z196 in cstorecc4.
> >   * Duplicate CS patterns for CCZmode.
> > 
> > Bootstrapped and regression tested on a zEC12 with s390 and s390x
> > biarch.

>  s390_emit_compare_and_swap (enum rtx_code code, rtx old, rtx mem,
> - rtx cmp, rtx new_rtx)
> + rtx cmp, rtx new_rtx, machine_mode ccmode)
>  {
> -  emit_insn (gen_atomic_compare_and_swapsi_internal (old, mem, cmp, 
> new_rtx));
> -  return s390_emit_compare (code, gen_rtx_REG (CCZ1mode, CC_REGNUM),
> - const0_rtx);
> +  switch (GET_MODE (mem))
> +{
> +case SImode:
> +  if (ccmode == CCZ1mode)
> + emit_insn (gen_atomic_compare_and_swapsiccz1_internal (old, mem, cmp,
> +new_rtx));
> +  else
> + emit_insn (gen_atomic_compare_and_swapsiccz_internal (old, mem, cmp,
> +new_rtx));
> +  break;
> +case DImode:
> +  if (ccmode == CCZ1mode)
> + emit_insn (gen_atomic_compare_and_swapdiccz1_internal (old, mem, cmp,
> +new_rtx));
> +  else
> + emit_insn (gen_atomic_compare_and_swapdiccz_internal (old, mem, cmp,
> +   new_rtx));
> +  break;
> +case TImode:
> +  if (ccmode == CCZ1mode)
> + emit_insn (gen_atomic_compare_and_swapticcz1_internal (old, mem, cmp,
> +new_rtx));
> +  else
> + emit_insn (gen_atomic_compare_and_swapticcz_internal (old, mem, cmp,
> +   new_rtx));
> +  break;

These expanders don't really do anything different depending on the
mode of the accessed word (SI/DI/TImode), so this seems like a bit of
unnecessary duplication.  The original code was correct in always
calling the SImode variant, even if this looks a bit odd.  Maybe a
better fix is to just remove the mode from this expander.

> +  if (TARGET_Z196
> +  && (mode == SImode || mode == DImode))
> +do_const_opt = (is_weak && CONST_INT_P (cmp));
> +
> +  if (do_const_opt)
> +{
> +  const int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
> +  rtx cc = gen_rtx_REG (CCZmode, CC_REGNUM);
> +
> +  skip_cs_label = gen_label_rtx ();
> +  emit_move_insn (output, mem);
> +  emit_move_insn (btarget, const0_rtx);
> +  emit_insn (gen_rtx_SET (cc, gen_rtx_COMPARE (CCZmode, output, cmp)));
> +  s390_emit_jump (skip_cs_label, gen_rtx_NE (VOIDmode, cc, const0_rtx));
> +  add_int_reg_note (get_last_insn (), REG_BR_PROB, very_unlikely);
> +  /* If the jump is not taken, OUTPUT is the expected value.  */
> +  cmp = output;
> +  /* Reload newval to a register manually, *after* the compare and jump
> +  above.  Otherwise Reload might place it before the jump.  */
> +}
> +  else
> +cmp = force_reg (mode, cmp);
> +  new_rtx = force_reg (mode, new_rtx);
> +  s390_emit_compare_and_swap (EQ, output, mem, cmp, new_rtx,
> +   (do_const_opt) ? CCZmode : CCZ1mode);
> +
> +  /* We deliberately accept non-register operands in the predicate
> + to ensure the write back to the output operand happens *before*
> + the store-flags code below.  This makes it easier for combine
> + to merge the store-flags code with a potential test-and-branch
> + pattern following (immediately!) afterwards.  */
> +  if (output != vtarget)
> +emit_move_insn (vtarget, output);
> +
> +  if (skip_cs_label != NULL)
> +  emit_label (skip_cs_label);

So if do_const_opt is true, but output != vtarget, the code above will
write to output, but this is then never moved to vtarget.  This looks
incorrect.

> +  if (TARGET_Z196 && do_const_opt)

do_const_opt seems to always imply TARGET_Z196.

> +; Peephole to combine a load-and-test from volatile memory which combine does
> +; not do.
> +(define_peephole2
> +  [(set (match_operand:GPR 0 "register_operand")
> + (match_operand:GPR 2 "memory_operand"))
> +   (set (reg CC_REGNUM)
> + (compare (match_dup 0) (match_operand:GPR 1 "const0_operand")))]
> +  "s390_match_ccmode(insn, CCSmode) && TARGET_EXTIMM
> +   && GENERAL_REG_P (operands[0])
> +   && satisfies_constraint_T (operands[2])"
> +  [(parallel
> +[(set (reg:CCS CC_REGNUM)
> +   (compare:CCS (match_dup 2) (match_dup 1)))
> + (set (match_dup 0) (match_dup 2))])])

We should really try to understand why this isn't done earlier and
fix the problem there ...

>[(parallel
>  [(set (match_operand:SI 0 "register_operand" "")
> 

RE: [patch, MIPS] Update -mvirt option description

2017-04-06 Thread Moore, Catherine


> -Original Message-
> From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
> Sent: Friday, March 31, 2017 4:00 PM
> To: Moore, Catherine 
> Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org)  patc...@gcc.gnu.org>; roland.il...@gmx.de
> Subject: [patch, MIPS] Update -mvirt option description
> 
> Hi Catherine,
> 
> I'm following up on PR target/80057 to update the description of
> -mvirt.
> 
> I agree with Roland that the description is inconsistent and should not
> state 'application specific' as none of the other ASE options include
> it. Instead I suggest we put the shortened form of (VZ) in like (XPA)
> is shown for -mxpa. The short form of virtualization ASE is usually VZ
> not VIRT which has always irritated me about this option name but that
> is history.
> 
> What do you think?

This looks good to me.
> 
> gcc/
>   PR target/80057
>   * config/mips/mips.opt (-mvirt): Update description.
> 




[PATCH] PR80101: Fix ICE in store_data_bypass_p

2017-04-06 Thread Kelvin Nilsen

[This is a repost of a patch previously posted on 3/29/2017.
Eric, I hope you might consider that this falls within your scope
of maintenance.  Thanks.]

This problem reports an assertion error when certain rtl expressions
which are not eligible as producers or consumers of a store bypass
optimization are passed as arguments to the store_data_bypass_p
function.  The proposed patch returns false from store_data_bypass_p
rather than terminating with an assertion error.  False indicates that
the passed arguments are not eligible for the store bypass scheduling
optimization.
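
As a rough illustration of the approach (not the code in recog.c), the
sketch below uses hypothetical, much-simplified stand-ins for RTL codes
and patterns.  The names `enum code`, `struct pattern` and `all_sets_p`
are invented for this example.  The point is only that an ineligible
element makes the predicate answer "no" instead of aborting:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTL codes; not GCC's real enum rtx_code.  */
enum code { SET, CLOBBER, USE };

/* Hypothetical stand-in for a PARALLEL: a vector of element codes.  */
struct pattern
{
  size_t n;                /* number of elements */
  const enum code *elts;   /* code of each element */
};

/* Return true if every element of P that is not a CLOBBER is a SET.
   Any other element makes the pattern ineligible, and that is
   reported by returning false rather than by asserting.  */
bool
all_sets_p (const struct pattern *p)
{
  for (size_t i = 0; i < p->n; i++)
    {
      if (p->elts[i] == CLOBBER)
        continue;

      if (p->elts[i] != SET)
        return false;
    }
  return true;
}
```

A caller that previously relied on the assertion now simply sees a
pattern that does not qualify for the bypass.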

The patch has been bootstrapped without regressions on
powerpc64le-unknown-linux-gnu.  Is this ok for the trunk?

gcc/ChangeLog:

2017-03-29  Kelvin Nilsen  

PR target/80101
* recog.c (store_data_bypass_p): Rather than terminate with
assertion error, return false if either function argument is not a
single_set or a PARALLEL with SETs inside.

gcc/testsuite/ChangeLog:

2017-03-29  Kelvin Nilsen  

PR target/80101
* gcc.target/powerpc/pr80101-1.c: New test.


Index: gcc/recog.c
===
--- gcc/recog.c (revision 246469)
+++ gcc/recog.c (working copy)
@@ -3663,9 +3663,12 @@ peephole2_optimize (void)

 /* Common predicates for use with define_bypass.  */

-/* True if the dependency between OUT_INSN and IN_INSN is on the store
-   data not the address operand(s) of the store.  IN_INSN and OUT_INSN
-   must be either a single_set or a PARALLEL with SETs inside.  */
+/* Returns true if the dependency between OUT_INSN and IN_INSN is on
+   the stored data, false if there is no dependency.  Note that a
+   consumer instruction that loads only the address (rather than the
+   value) stored by a producer instruction does not represent a
+   dependency.  If IN_INSN or OUT_INSN are not a single_set or a
+   PARALLEL with SETs inside, this function returns false.  */

 int
 store_data_bypass_p (rtx_insn *out_insn, rtx_insn *in_insn)
@@ -3701,7 +3704,8 @@ store_data_bypass_p (rtx_insn *out_insn, rtx_insn
 if (GET_CODE (out_exp) == CLOBBER)
   continue;

-gcc_assert (GET_CODE (out_exp) == SET);
+   if (GET_CODE (out_exp) != SET)
+ return false;

 if (reg_mentioned_p (SET_DEST (out_exp), SET_DEST (in_set)))
   return false;
@@ -3711,7 +3715,8 @@ store_data_bypass_p (rtx_insn *out_insn, rtx_insn
   else
 {
   in_pat = PATTERN (in_insn);
-  gcc_assert (GET_CODE (in_pat) == PARALLEL);
+  if (GET_CODE (in_pat) != PARALLEL)
+   return false;

   for (i = 0; i < XVECLEN (in_pat, 0); i++)
{
@@ -3720,7 +3725,8 @@ store_data_bypass_p (rtx_insn *out_insn, rtx_insn
  if (GET_CODE (in_exp) == CLOBBER)
continue;

- gcc_assert (GET_CODE (in_exp) == SET);
+ if (GET_CODE (in_exp) != SET)
+   return false;

  if (!MEM_P (SET_DEST (in_exp)))
return false;
@@ -3734,7 +3740,8 @@ store_data_bypass_p (rtx_insn *out_insn, rtx_insn
   else
 {
   out_pat = PATTERN (out_insn);
-  gcc_assert (GET_CODE (out_pat) == PARALLEL);
+ if (GET_CODE (out_pat) != PARALLEL)
+   return false;

   for (j = 0; j < XVECLEN (out_pat, 0); j++)
 {
@@ -3743,7 +3750,8 @@ store_data_bypass_p (rtx_insn *out_insn, rtx_insn
   if (GET_CODE (out_exp) == CLOBBER)
 continue;

-  gcc_assert (GET_CODE (out_exp) == SET);
+ if (GET_CODE (out_exp) != SET)
+   return false;

   if (reg_mentioned_p (SET_DEST (out_exp), SET_DEST (in_exp)))
 return false;
Index: gcc/testsuite/gcc.target/powerpc/pr80101-1.c
===
--- gcc/testsuite/gcc.target/powerpc/pr80101-1.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr80101-1.c(working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power6" } } */
+/* { dg-require-effective-target dfp_hw } */
+/* { dg-options "-mcpu=power6 -mno-sched-epilog -Ofast" } */
+
+/* Prior to resolving PR 80101, this test case resulted in an internal
+   compiler error.  The role of this test program is to assure that
+   dejagnu's "test for excess errors" does not find any.  */
+
+int b;
+
+void e ();
+
+int c ()
+{
+  struct
+  {
+int a[b];
+  } d;
+  if (d.a[0])
+e ();
+}



Re: [PR 79905] ICE with vector_type

2017-04-06 Thread Bill Schmidt

> On Apr 6, 2017, at 9:26 AM, Bill Schmidt  wrote:
> 
> 
>> On Apr 6, 2017, at 9:19 AM, Bill Schmidt  wrote:
>> 
>> 
>>> On Apr 6, 2017, at 9:04 AM, Richard Biener  
>>> wrote:
>>> 
>>> On Thu, Apr 6, 2017 at 1:28 PM, Nathan Sidwell  wrote:
>>>> Let's try this one then.
>> 
>> Nathan's patch regstraps cleanly.  I'll try Richard's variant (dropping the 
>> if test below) now.

As expected, this version passes regstrap as well.

Bill
> 
> FYI, the test case should also include:
> 
> // { dg-require-effective-target powerpc_altivec_ok } 
> 
> to avoid problems on targets without AltiVec support.
> 
> Bill
>> 
>> Bill
>>> 
>>> I'd call this
>>> 
>>> +  if (result == TYPE_CANONICAL (result))
>>> +/* Copy so we don't give the canonical type a name.  */
>>> +result = build_variant_type_copy (result);
>>> 
>>> premature optimization -- I wonder if anything breaks if you always copy?
>>> (that is, I expect result is always the canonical type here?)
>>> 
>>> Richard.
>>> 
>>>> nathan
>>>> 
>>>> --
>>>> Nathan Sidwell
>>> 
>> 
> 



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Florian Weimer

On 04/06/2017 05:05 PM, Jakub Jelinek wrote:
> On Thu, Apr 06, 2017 at 04:51:01PM +0200, Florian Weimer wrote:
> > On 04/06/2017 04:43 PM, Jonathan Wakely wrote:
> > > On 06/04/17 16:23 +0200, Richard Biener wrote:
> > > > On Thu, 6 Apr 2017, Florian Weimer wrote:
> > > > 
> > > > > On 04/06/2017 04:11 PM, Bernd Edlinger wrote:
> > > > > 
> > > > > > I think it is not too complicated to do in the C++ FE.
> > > > > > The FE looks for array of std::byte and unsigned char,
> > > > > > and sets the attribute when the final type is constructed.
> > > > > >
> > > > > > What I am trying to do is just extend the semantic of may_alias
> > > > > > a bit, and then have the C++ FE use it in the way it has to.
> > > > > 
> > > > > We also need this for some POSIX and Linux kernel interfaces.  A
> > > > > C++-only
> > > > > solution would not help with that.
> > > > 
> > > > Example(s)?
> > > 
> > > sockaddr_storage comes to mind.
> > 
> > Right.  The kernel also has many APIs which return multiple variable-length
> > data blocks, such as getdents64, and many more interfaces in combination
> 
> The kernel uses -fno-strict-aliasing I think, so it doesn't care.

These APIs (getdents64, inotify, lots of netlink stuff, probably more) 
extend to user space.


Thanks,
Florian



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Jakub Jelinek
On Thu, Apr 06, 2017 at 04:51:01PM +0200, Florian Weimer wrote:
> On 04/06/2017 04:43 PM, Jonathan Wakely wrote:
> > On 06/04/17 16:23 +0200, Richard Biener wrote:
> > > On Thu, 6 Apr 2017, Florian Weimer wrote:
> > > 
> > > > On 04/06/2017 04:11 PM, Bernd Edlinger wrote:
> > > > 
> > > > > I think it is not too complicated to do in the C++ FE.
> > > > > The FE looks for array of std::byte and unsigned char,
> > > > > and sets the attribute when the final type is constructed.
> > > > >
> > > > > What I am trying to do is just extend the semantic of may_alias
> > > > > a bit, and then have the C++ FE use it in the way it has to.
> > > > 
> > > > We also need this for some POSIX and Linux kernel interfaces.  A
> > > > C++-only
> > > > solution would not help with that.
> > > 
> > > Example(s)?
> > 
> > sockaddr_storage comes to mind.
> 
> Right.  The kernel also has many APIs which return multiple variable-length
> data blocks, such as getdents64, and many more interfaces in combination

The kernel uses -fno-strict-aliasing I think, so it doesn't care.

Jakub


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Florian Weimer

On 04/06/2017 04:43 PM, Jonathan Wakely wrote:
> On 06/04/17 16:23 +0200, Richard Biener wrote:
> > On Thu, 6 Apr 2017, Florian Weimer wrote:
> > 
> > > On 04/06/2017 04:11 PM, Bernd Edlinger wrote:
> > > 
> > > > I think it is not too complicated to do in the C++ FE.
> > > > The FE looks for array of std::byte and unsigned char,
> > > > and sets the attribute when the final type is constructed.
> > > >
> > > > What I am trying to do is just extend the semantic of may_alias
> > > > a bit, and then have the C++ FE use it in the way it has to.
> > > 
> > > We also need this for some POSIX and Linux kernel interfaces.  A
> > > C++-only
> > > solution would not help with that.
> > 
> > Example(s)?
> 
> sockaddr_storage comes to mind.

Right.  The kernel also has many APIs which return multiple 
variable-length data blocks, such as getdents64, and many more 
interfaces in combination with read/recv system calls.  Variable length 
means that you cannot declare the appropriate type after the first data 
item, so you technically have to use malloc.

POSIX interfaces which exhibit a similar pattern are getpwnam_r and 
friends, but for them, you can probably use malloc without ill effect 
(although there are still performance concerns).
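
For reference, a common portable workaround today is to copy each
record header out of the byte buffer with memcpy instead of accessing
the buffer through a struct pointer.  The sketch below is only an
illustration under assumptions: `struct rec_hdr` and `count_records`
are made-up names, and the record layout is hypothetical, not that of
getdents64 or any other kernel interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record header: the length of the whole record
   (header included), followed by a type tag and the payload.  */
struct rec_hdr
{
  unsigned short reclen;
  unsigned short type;
};

/* Count the well-formed records in BUF[0..LEN).  Each header is
   copied into a properly typed object with memcpy, so the buffer is
   never read through a struct rec_hdr lvalue.  */
size_t
count_records (const unsigned char *buf, size_t len)
{
  size_t off = 0, n = 0;

  while (len - off >= sizeof (struct rec_hdr))
    {
      struct rec_hdr h;
      memcpy (&h, buf + off, sizeof h);

      if (h.reclen < sizeof h || h.reclen > len - off)
        break;  /* malformed or truncated record */

      off += h.reclen;
      n++;
    }
  return n;
}
```

The copy avoids the aliasing question at the cost of extra moves, which
is part of the performance concern mentioned above.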


Thanks,
Florian


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Jonathan Wakely

On 06/04/17 16:23 +0200, Richard Biener wrote:
> On Thu, 6 Apr 2017, Florian Weimer wrote:
> 
> > On 04/06/2017 04:11 PM, Bernd Edlinger wrote:
> > 
> > > I think it is not too complicated to do in the C++ FE.
> > > The FE looks for array of std::byte and unsigned char,
> > > and sets the attribute when the final type is constructed.
> > >
> > > What I am trying to do is just extend the semantic of may_alias
> > > a bit, and then have the C++ FE use it in the way it has to.
> > 
> > We also need this for some POSIX and Linux kernel interfaces.  A C++-only
> > solution would not help with that.
> 
> Example(s)?

sockaddr_storage comes to mind.
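
To make the sockaddr_storage case concrete, here is a minimal sketch of
how such an object is typically handled without relying on the
aliasing rules: the IPv4 view is copied out with memcpy rather than
reached through a cast pointer.  The helper names `make_ipv4_storage`
and `ipv4_port` are invented for this example; only the POSIX socket
types and byte-order functions are real.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Build a sockaddr_storage that holds an IPv4 address with PORT
   (given in host byte order).  */
struct sockaddr_storage
make_ipv4_storage (unsigned short port)
{
  struct sockaddr_storage ss;
  struct sockaddr_in in4;

  memset (&ss, 0, sizeof ss);
  memset (&in4, 0, sizeof in4);
  in4.sin_family = AF_INET;
  in4.sin_port = htons (port);
  memcpy (&ss, &in4, sizeof in4);
  return ss;
}

/* Return the port stored in SS in host byte order, or -1 if SS does
   not hold an IPv4 address.  The sockaddr_in is copied out instead of
   being accessed through a cast of SS.  */
int
ipv4_port (const struct sockaddr_storage *ss)
{
  struct sockaddr_in in4;

  if (ss->ss_family != AF_INET)
    return -1;

  memcpy (&in4, ss, sizeof in4);
  return ntohs (in4.sin_port);
}
```

Much existing code casts the pointer instead, which is the pattern an
attribute like the one under discussion would have to cover.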




[PATCH v2,rs6000] PR80108: Fix ICE with cross compiler

2017-04-06 Thread Kelvin Nilsen
I am reposting this patch, previously posted just moments ago, to
correct the subject so that it clarifies that this is a rs6000-specific
patch.  Thanks.

PR 80108 describes an ICE that occurs on an existing test program when
compiled with a particular combination of target options.

This patch fixes the compiler to reject that particular combination of
target options since it is not meaningful and duplicates the offending
test case with a dg-options directive to exercise the problematic
command-line options.

Thanks to feedback from Pat Haugen, Michael Meissner, and Segher
Boessenkool, version 2 of this proposed patch integrates the following
refinements:

1. Issue an error message when -mpower9-minmax is used in combination
   with -mcpu=power9 if specific prerequisite target options have been
   explicitly disabled.

2. Change the exclude-opts clause on the test case's dg-skip-if
   directive from -mcpu=power9 to -mcpu=405.  (This was a
   copy-and-paste error when this line was borrowed from a
   different test program.)

3. Remove -m32 from the dg-options directive.  Though this target
   option had been specified in the original problem report, subsequent
   testing confirmed that the original ICE occurs independent of this
   option.  Eliminating this option allows the regression test to be
   exercised in more contexts.

This patch has been bootstrapped and tested with no regressions on both
powerpc64-unknown-linux-gnu and powerpc64le-unknown-linux-gnu.  Is
this ok for the trunk?
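
For readers following the flag logic, the checks in the patch rest on
two bitmask idioms: turning on every bit of a group that the user did
not mention, and detecting that some bit of the group was explicitly
switched off.  The sketch below shows both with hypothetical feature
bits; `FEAT_*`, `enable_unless_explicit` and `explicitly_disabled_p`
are names made up for this illustration and do not exist in rs6000.c.

```c
#include <stdbool.h>

/* Hypothetical feature bits, standing in for OPTION_MASK_* values.  */
#define FEAT_A 0x1u
#define FEAT_B 0x2u
#define FEAT_C 0x4u
#define FEAT_GROUP (FEAT_A | FEAT_B | FEAT_C)

/* Turn on every bit of GROUP that the user did not set or clear
   explicitly.  EXPLICIT_MASK has a bit set for each option that
   appeared on the command line, whether as -mfoo or -mno-foo.  */
unsigned
enable_unless_explicit (unsigned flags, unsigned explicit_mask,
                        unsigned group)
{
  return flags | (group & ~explicit_mask);
}

/* Return true if some bit of GROUP was mentioned explicitly and is
   off in FLAGS, i.e. the user disabled it.  */
bool
explicitly_disabled_p (unsigned flags, unsigned explicit_mask,
                       unsigned group)
{
  return (group & explicit_mask) != (group & flags & explicit_mask);
}
```

In the patch the first idiom corresponds to the
`ISA_3_0_MASKS_SERVER & ~rs6000_isa_flags_explicit` update, and the
second to the comparison that precedes the new error call.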

gcc/ChangeLog:

2017-04-06  Kelvin Nilsen  

PR target/80108
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enhance special handling given to the TARGET_P9_MINMAX option in
relation to certain other options.

gcc/testsuite/ChangeLog:

2017-04-06  Kelvin Nilsen  

PR target/80108
* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp: New file.
* gcc.target/powerpc/ppc-fortran/pr80108-1.f90: New test.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 246573)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -4273,8 +4273,40 @@ rs6000_option_override_internal (bool global_init_
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
  unless the user explicitly used the -mno- to disable the code.  */
   if (TARGET_P9_VECTOR || TARGET_MODULO || TARGET_P9_DFORM_SCALAR
-  || TARGET_P9_DFORM_VECTOR || TARGET_P9_DFORM_BOTH > 0 || TARGET_P9_MINMAX)
+  || TARGET_P9_DFORM_VECTOR || TARGET_P9_DFORM_BOTH > 0)
 rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~rs6000_isa_flags_explicit);
+  else if (TARGET_P9_MINMAX)
+{
+  if (have_cpu)
+   {
+ if (cpu_index == PROCESSOR_POWER9)
+   {
+ /* legacy behavior: allow -mcpu=power9 with certain
+capabilities explicitly disabled.  */
+ rs6000_isa_flags |=
+   (ISA_3_0_MASKS_SERVER & ~rs6000_isa_flags_explicit);
+ /* However, reject this automatic fix if certain
+capabilities required for TARGET_P9_MINMAX support
+have been explicitly disabled.  */
+ if (((OPTION_MASK_VSX | OPTION_MASK_UPPER_REGS_SF
+   | OPTION_MASK_UPPER_REGS_DF) & rs6000_isa_flags)
+ != (OPTION_MASK_VSX | OPTION_MASK_UPPER_REGS_SF
+  | OPTION_MASK_UPPER_REGS_DF))
+   error ("-mpower9-minmax incompatible with explicitly disabled options");
+   }
+ else
+   error ("Power9 target option is incompatible with -mcpu= for "
+  " less than power9");
+   }
+  else if ((ISA_3_0_MASKS_SERVER & rs6000_isa_flags_explicit)
+  != (ISA_3_0_MASKS_SERVER & rs6000_isa_flags
+  & rs6000_isa_flags_explicit))
+   /* Enforce that none of the ISA_3_0_MASKS_SERVER flags
+  were explicitly cleared.  */
+   error ("-mpower9-minmax incompatible with explicitly disabled options");
+  else
+   rs6000_isa_flags |= ISA_3_0_MASKS_SERVER;
+}
   else if (TARGET_P8_VECTOR || TARGET_DIRECT_MOVE || TARGET_CRYPTO)
 rs6000_isa_flags |= (ISA_2_7_MASKS_SERVER & ~rs6000_isa_flags_explicit);
   else if (TARGET_VSX)
Index: gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
===
--- gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp  (revision 246624)
@@ -0,0 +1,65 @@
+#   Copyright (C) 2004-2017 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later 

Re: [gomp4] add support for fortran allocate support with declare create

2017-04-06 Thread Cesar Philippidis
On 04/06/2017 02:05 AM, Thomas Schwinge wrote:

>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
>> @@ -0,0 +1,25 @@
>> +! Verify that OpenACC declared allocatable arrays have implicit
>> +! OpenACC enter and exit pragmas at the time of allocation and
>> +! deallocation.
>> +
>> +! { dg-additional-options "-fdump-tree-original" }
>> +[...]
>> +! { dg-final { scan-tree-dump-times "pragma acc enter data 
>> map.declare_allocate" 1 "gimple" } }
>> +! { dg-final { scan-tree-dump-times "pragma acc exit data 
>> map.declare_deallocate" 1 "gimple" } }
> 
> UNRESOLVED: gfortran.dg/goacc/declare-allocatable-1.f90   -O   
> scan-tree-dump-times gimple "pragma acc enter data map.declare_allocate" 1
> UNRESOLVED: gfortran.dg/goacc/declare-allocatable-1.f90   -O   
> scan-tree-dump-times gimple "pragma acc exit data map.declare_deallocate" 1
> PASS: gfortran.dg/goacc/declare-allocatable-1.f90   -O  (test for excess 
> errors)
> 
> "original" vs. "gimple" -- which one should it be?

I'm bad at noticing new unresolved test cases.

It could be either, but I changed it to original to ensure that the
fortran FE inserts those acc enter/exit data directives appropriately.

This patch has been committed to gomp-4_0-branch.

Cesar
2017-04-06  Cesar Philippidis  

	gcc/testsuite/
	* gfortran.dg/goacc/declare-allocatable-1.f90: Correct test.

diff --git a/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90 b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
index 9195055..b6bb6b3 100644
--- a/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
@@ -21,5 +21,5 @@ program allocate
   deallocate (a)
 end program allocate
 
-! { dg-final { scan-tree-dump-times "pragma acc enter data map.declare_allocate" 1 "gimple" } }
-! { dg-final { scan-tree-dump-times "pragma acc exit data map.declare_deallocate" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "pragma acc enter data map.declare_allocate" 1 "original" } }
+! { dg-final { scan-tree-dump-times "pragma acc exit data map.declare_deallocate" 1 "original" } }


Re: [PR 79905] ICE with vector_type

2017-04-06 Thread Bill Schmidt

> On Apr 6, 2017, at 9:19 AM, Bill Schmidt  wrote:
> 
> 
>> On Apr 6, 2017, at 9:04 AM, Richard Biener  
>> wrote:
>> 
>> On Thu, Apr 6, 2017 at 1:28 PM, Nathan Sidwell  wrote:
>>> Let's try this one then.
> 
> Nathan's patch regstraps cleanly.  I'll try Richard's variant (dropping the 
> if test below) now.

FYI, the test case should also include:

// { dg-require-effective-target powerpc_altivec_ok } 

to avoid problems on targets without AltiVec support.

Bill
> 
> Bill
>> 
>> I'd call this
>> 
>> +  if (result == TYPE_CANONICAL (result))
>> +/* Copy so we don't give the canonical type a name.  */
>> +result = build_variant_type_copy (result);
>> 
>> premature optimization -- I wonder if anything breaks if you always copy?
>> (that is, I expect result is always the canonical type here?)
>> 
>> Richard.
>> 
>>> nathan
>>> 
>>> --
>>> Nathan Sidwell
>> 
> 



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Richard Biener
On Thu, 6 Apr 2017, Florian Weimer wrote:

> On 04/06/2017 04:11 PM, Bernd Edlinger wrote:
> 
> > I think it is not too complicated to do in the C++ FE.
> > The FE looks for array of std::byte and unsigned char,
> > and sets the attribute when the final type is constructed.
> > 
> > What I am trying to do is just extend the semantic of may_alias
> > a bit, and then have the C++ FE use it in the way it has to.
> 
> We also need this for some POSIX and Linux kernel interfaces.  A C++-only
> solution would not help with that.

Example(s)?

> > Here is what I want to write in the doc:
> > 
> > @item typeless_storage
> > @cindex @code{typeless_storage} type attribute
> > A type declared with this attribute behaves like a character type
> > with respect to aliasing semantics.
> > This attribute is similar to the @code{may_alias} attribute,
> > except that it is not restricted to pointers.
> 
> As Jakub pointed out, this is not what we need here.  An object of type char
> does *not* have untyped storage.  Accessing it as a different type is still
> undefined.
> 
> The documentation says that the memory region is considered to be untyped,
> like a memory region returned by malloc (but obviously not with the
> implication that the memory region is separated from everything else).
> 
> Thanks,
> Florian
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Richard Biener
On Thu, 6 Apr 2017, Bernd Edlinger wrote:

> On 04/06/17 09:55, Richard Biener wrote:
> > On Thu, 6 Apr 2017, Jakub Jelinek wrote:
> >
> >> On Thu, Apr 06, 2017 at 09:47:10AM +0200, Richard Biener wrote:
> >>> @@ -955,6 +960,7 @@ get_alias_set (tree t)
> >>>   Just be pragmatic here and make sure the array and its element
> >>>   type get the same alias set assigned.  */
> >>>else if (TREE_CODE (t) == ARRAY_TYPE
> >>> +&& ! TYPE_TYPELESS_STORAGE (t)
> >>>  && (!TYPE_NONALIASED_COMPONENT (t)
> >>>  || TYPE_STRUCTURAL_EQUALITY_P (t)))
> >>>  set = get_alias_set (TREE_TYPE (t));
> >>> @@ -1094,6 +1100,15 @@ get_alias_set (tree t)
> >>>
> >>>TYPE_ALIAS_SET (t) = set;
> >>>
> >>> +  if (TREE_CODE (t) == ARRAY_TYPE
> >>> +  && TYPE_TYPELESS_STORAGE (t))
> >>
> >> Shouldn't TYPE_TYPELESS_STORAGE apply even for non-array types?
> >> If somebody chooses to store anything in long long
> >> __attribute__((typeless_storage)), so be it.  Or what kind of complications
> >> do you see for that?
> >
> > It's a new feature so I don't see why we should allow that.  Given that
> > people will have to do sth when the compiler doesn't support it the
> > only "reliable" way of using it is on an array of char anyway.
> >
> > The complication starts when people use it on a type that currently
> > uses alias-set zero (because "zero" doesn't get an alias_set_entry).
> >
> 
> The typeless_storage does not need to implement all the C++ semantic
> by itself.  It would be possible, but then it is not as generic as
> it could be.  What I'd really like to have is make an arbitrary
> type behave as if it were a char with respect to aliasing.
> 
> In my mind, the typeless_storage attribute has a value of its own,
> but it can be used to implement the C++17 semantic of std::byte [N].
> 
> So I would not want to completely change the way TBAA is working
> today.  I believe it is doing a fairly good job.
> 
> The TBAA machinery, for instance, does not need to propagate this
> attribute from the member to the enclosing struct; that is
> also not done for a struct that contains a char.
> 
> I think it is not too complicated to do in the C++ FE.
> The FE looks for array of std::byte and unsigned char,
> and sets the attribute when the final type is constructed.
> 
> What I am trying to do is just extend the semantic of may_alias
> a bit, and then have the C++ FE use it in the way it has to.
> 
> Here is what I want to write in the doc:
> 
> @item typeless_storage
> @cindex @code{typeless_storage} type attribute
> A type declared with this attribute behaves like a character type
> with respect to aliasing semantics.
> This attribute is similar to the @code{may_alias} attribute,
> except that it is not restricted to pointers.
> 
> Example of use:
> 
> @smallexample
> typedef int __attribute__((__typeless_storage__)) int_a;
> 
> int
> main (void)
> @{
>int_a a = 0x12345678;
>    short *b = (short *) &a;
> 
>b[1] = 0;
> 
>if (a == 0x12345678)
>  abort();
> 
>exit(0);
> @}
> @end smallexample
> 
> 
> we should first agree on that.

I don't see anyone needing the above example, it's not going to be
portable in any way.

Please don't invent sth that invites users to write bad code.
I'd even restrict it to work on arrays of chars only!
(arrays of byte-size integer types)

Richard.


Re: [PR 79905] ICE with vector_type

2017-04-06 Thread Bill Schmidt

> On Apr 6, 2017, at 9:04 AM, Richard Biener  wrote:
> 
> On Thu, Apr 6, 2017 at 1:28 PM, Nathan Sidwell  wrote:
>> Let's try this one then.

Nathan's patch regstraps cleanly.  I'll try Richard's variant (dropping the if 
test below) now.

Bill
> 
> I'd call this
> 
> +  if (result == TYPE_CANONICAL (result))
> +/* Copy so we don't give the canonical type a name.  */
> +result = build_variant_type_copy (result);
> 
> premature optimization -- I wonder if anything breaks if you always copy?
> (that is, I expect result is always the canonical type here?)
> 
> Richard.
> 
>> nathan
>> 
>> --
>> Nathan Sidwell
> 



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Florian Weimer

On 04/06/2017 04:11 PM, Bernd Edlinger wrote:

> I think it is not too complicated to do in the C++ FE.
> The FE looks for array of std::byte and unsigned char,
> and sets the attribute when the final type is constructed.
>
> What I am trying to do is just extend the semantic of may_alias
> a bit, and then have the C++ FE use it in the way it has to.

We also need this for some POSIX and Linux kernel interfaces.  A 
C++-only solution would not help with that.

> Here is what I want to write in the doc:
>
> @item typeless_storage
> @cindex @code{typeless_storage} type attribute
> A type declared with this attribute behaves like a character type
> with respect to aliasing semantics.
> This attribute is similar to the @code{may_alias} attribute,
> except that it is not restricted to pointers.

As Jakub pointed out, this is not what we need here.  An object of type 
char does *not* have untyped storage.  Accessing it as a different type 
is still undefined.

The documentation says that the memory region is considered to be 
untyped, like a memory region returned by malloc (but obviously not with 
the implication that the memory region is separated from everything else).


Thanks,
Florian


[PATCH v2] PR80108: Fix ICE with cross compiler

2017-04-06 Thread Kelvin Nilsen

PR 80108 describes an ICE that occurs on an existing test program when
compiled with a particular combination of target options.

This patch fixes the compiler to reject that particular combination of
target options since it is not meaningful and duplicates the offending
test case with a dg-options directive to exercise the problematic
command-line options.

Thanks to feedback from Pat Haugen, Michael Meissner, and Segher
Boessenkool, version 2 of this proposed patch integrates the following
refinements:

1. Issue an error message when -mpower9-minmax is used in combination
   with -mcpu=power9 if specific prerequisite target options have been
   explicitly disabled.

2. Change the exclude-opts clause on the test case's dg-skip-if
   directive from -mcpu=power9 to -mcpu=405.  (This was a
   copy-and-paste error when this line was borrowed from a
   different test program.)

3. Remove -m32 from the dg-options directive.  Though this target
   option had been specified in the original problem report, subsequent
   testing confirmed that the original ICE occurs independent of this
   option.  Eliminating this option allows the regression test to be
   exercised in more contexts.

This patch has been bootstrapped and tested with no regressions on both
powerpc64-unknown-linux-gnu and powerpc64le-unknown-linux-gnu.  Is
this ok for the trunk?

gcc/ChangeLog:

2017-04-06  Kelvin Nilsen  

PR target/80108
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enhance special handling given to the TARGET_P9_MINMAX option in
relation to certain other options.

gcc/testsuite/ChangeLog:

2017-04-06  Kelvin Nilsen  

PR target/80108
* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp: New file.
* gcc.target/powerpc/ppc-fortran/pr80108-1.f90: New test.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 246573)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -4273,8 +4273,40 @@ rs6000_option_override_internal (bool global_init_
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
  unless the user explicitly used the -mno- to disable the code.  */
   if (TARGET_P9_VECTOR || TARGET_MODULO || TARGET_P9_DFORM_SCALAR
-  || TARGET_P9_DFORM_VECTOR || TARGET_P9_DFORM_BOTH > 0 || TARGET_P9_MINMAX)
+  || TARGET_P9_DFORM_VECTOR || TARGET_P9_DFORM_BOTH > 0)
 rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~rs6000_isa_flags_explicit);
+  else if (TARGET_P9_MINMAX)
+{
+  if (have_cpu)
+   {
+ if (cpu_index == PROCESSOR_POWER9)
+   {
+ /* legacy behavior: allow -mcpu=power9 with certain
+capabilities explicitly disabled.  */
+ rs6000_isa_flags |=
+   (ISA_3_0_MASKS_SERVER & ~rs6000_isa_flags_explicit);
+ /* However, reject this automatic fix if certain
+capabilities required for TARGET_P9_MINMAX support
+have been explicitly disabled.  */
+ if (((OPTION_MASK_VSX | OPTION_MASK_UPPER_REGS_SF
+   | OPTION_MASK_UPPER_REGS_DF) & rs6000_isa_flags)
+ != (OPTION_MASK_VSX | OPTION_MASK_UPPER_REGS_SF
+  | OPTION_MASK_UPPER_REGS_DF))
+   error ("-mpower9-minmax incompatible with explicitly disabled options");
+   }
+ else
+   error ("Power9 target option is incompatible with -mcpu= for "
+  " less than power9");
+   }
+  else if ((ISA_3_0_MASKS_SERVER & rs6000_isa_flags_explicit)
+  != (ISA_3_0_MASKS_SERVER & rs6000_isa_flags
+  & rs6000_isa_flags_explicit))
+   /* Enforce that none of the ISA_3_0_MASKS_SERVER flags
+  were explicitly cleared.  */
+   error ("-mpower9-minmax incompatible with explicitly disabled options");
+  else
+   rs6000_isa_flags |= ISA_3_0_MASKS_SERVER;
+}
   else if (TARGET_P8_VECTOR || TARGET_DIRECT_MOVE || TARGET_CRYPTO)
 rs6000_isa_flags |= (ISA_2_7_MASKS_SERVER & ~rs6000_isa_flags_explicit);
   else if (TARGET_VSX)
Index: gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
===
--- gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp  (revision 246624)
@@ -0,0 +1,65 @@
+#   Copyright (C) 2004-2017 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+# 
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty 

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Bernd Edlinger
On 04/06/17 09:55, Richard Biener wrote:
> On Thu, 6 Apr 2017, Jakub Jelinek wrote:
>
>> On Thu, Apr 06, 2017 at 09:47:10AM +0200, Richard Biener wrote:
>>> @@ -955,6 +960,7 @@ get_alias_set (tree t)
>>>   Just be pragmatic here and make sure the array and its element
>>>   type get the same alias set assigned.  */
>>>else if (TREE_CODE (t) == ARRAY_TYPE
>>> +  && ! TYPE_TYPELESS_STORAGE (t)
>>>&& (!TYPE_NONALIASED_COMPONENT (t)
>>>|| TYPE_STRUCTURAL_EQUALITY_P (t)))
>>>  set = get_alias_set (TREE_TYPE (t));
>>> @@ -1094,6 +1100,15 @@ get_alias_set (tree t)
>>>
>>>TYPE_ALIAS_SET (t) = set;
>>>
>>> +  if (TREE_CODE (t) == ARRAY_TYPE
>>> +  && TYPE_TYPELESS_STORAGE (t))
>>
>> Shouldn't TYPE_TYPELESS_STORAGE apply even for non-array types?
>> If somebody chooses to store anything in long long
>> __attribute__((typeless_storage)), so be it.  Or what kind of complications
>> do you see for that?
>
> It's a new feature so I don't see why we should allow that.  Given that
> people will have to do sth when the compiler doesn't support it the
> only "reliable" way of using it is on an array of char anyway.
>
> The complication starts when people use it on a type that currently
> uses alias-set zero (because "zero" doesn't get an alias_set_entry).
>

The typeless_storage does not need to implement all the C++ semantic
by itself.  It would be possible, but then it is not as generic as
it could be.  What I'd really like to have is make an arbitrary
type behave as if it were a char with respect to aliasing.

In my mind, the typeless_storage attribute has a value of its own,
but it can be used to implement the C++17 semantic of std::byte [N].

So I would not want to completely change the way TBAA is working
today.  I believe it is doing a fairly good job.

The TBAA machinery, for instance, does not need to propagate this
attribute from the member to the enclosing struct; that is
also not done for a struct that contains a char.

I think it is not too complicated to do in the C++ FE.
The FE looks for array of std::byte and unsigned char,
and sets the attribute when the final type is constructed.

What I am trying to do is just extend the semantic of may_alias
a bit, and then have the C++ FE use it in the way it has to.

Here is what I want to write in the doc:

@item typeless_storage
@cindex @code{typeless_storage} type attribute
A type declared with this attribute behaves like a character type
with respect to aliasing semantics.
This attribute is similar to the @code{may_alias} attribute,
except that it is not restricted to pointers.

Example of use:

@smallexample
typedef int __attribute__((__typeless_storage__)) int_a;

int
main (void)
@{
   int_a a = 0x12345678;
   short *b = (short *) &a;

   b[1] = 0;

   if (a == 0x12345678)
 abort();

   exit(0);
@}
@end smallexample


we should first agree on that.
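
For comparison, this is roughly what the existing may_alias attribute
already allows, where the attribute sits on the type used for the
access rather than on the type of the object.  It is a sketch modelled
on the may_alias example in the GCC manual; the function name
`clear_via_alias` is invented here, and the code assumes that an int is
exactly two shorts wide.

```c
/* Accesses through short_a may alias objects of any type.  */
typedef short __attribute__ ((__may_alias__)) short_a;

/* Clear both halves of a local int through a short_a pointer and
   return the result.  Assumes sizeof (int) == 2 * sizeof (short).  */
int
clear_via_alias (void)
{
  int a = 0x12345678;
  short_a *b = (short_a *) &a;

  b[0] = 0;
  b[1] = 0;

  return a;
}
```

With the proposed attribute the annotation would move from the accessing
type to the declared type of the storage.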

Bernd.


Re: [PR 79905] ICE with vector_type

2017-04-06 Thread Richard Biener
On Thu, Apr 6, 2017 at 1:28 PM, Nathan Sidwell  wrote:
> Let's try this one then.

I'd call this

+  if (result == TYPE_CANONICAL (result))
+/* Copy so we don't give the canonical type a name.  */
+result = build_variant_type_copy (result);

premature optimization -- I wonder if anything breaks if you always copy?
(that is, I expect result is always the canonical type here?)

Richard.

> nathan
>
> --
> Nathan Sidwell


[PATCH] Fix PR80341

2017-04-06 Thread Richard Biener

This fixes an old bug I introduced.  We may not pass in the final
truncation type to get_unwidened for divisions but need to preserve
the original values.  That means the INTEGER_CST handling needs
extension for the !for_type case as well.  And some refactoring.
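
To see why a division cannot simply be carried out in the final
truncation type, consider the small sketch below.  It is not the
testcase from the PR, and both function names are made up for this
illustration.  Truncating the operands first changes the quotient
whenever a high bit of an operand contributes to the result:

```c
/* Divide in the wide type, then truncate the quotient.  This is what
   the source expression means.  */
unsigned char
div_then_truncate (int a, int b)
{
  return (unsigned char) (a / b);
}

/* Truncate the operands first, then divide in the narrow type.  This
   is what an over-eager narrowing would compute.  The caller must
   ensure that B is not a multiple of 256, or the narrowed divisor
   becomes zero.  */
unsigned char
truncate_then_div (int a, int b)
{
  return (unsigned char) ((unsigned char) a / (unsigned char) b);
}
```

For a = 256 and b = 2 the first function yields 128, while the second
yields 0, because the narrowed dividend has already lost its only set
bit.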

(as much as I hate working on this premature folding code...)

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-04-06  Richard Biener  

PR middle-end/80341
* tree.c (get_unwidened): Also handle ! for_type case for
INTEGER_CSTs.
* convert.c (do_narrow): Split out from ...
(convert_to_integer_1): ... here.  Do not pass final truncation
type to get_unwidened for TRUNC_DIV_EXPR.

* gcc.dg/torture/pr80341.c: New testcase.

Index: gcc/tree.c
===
*** gcc/tree.c  (revision 246728)
--- gcc/tree.c  (working copy)
*** get_unwidened (tree op, tree for_type)
*** 9033,9045 
}
  }
  
!   /* If we finally reach a constant see if it fits in for_type and
   in that case convert it.  */
!   if (for_type
!   && TREE_CODE (win) == INTEGER_CST
!   && TREE_TYPE (win) != for_type
!   && int_fits_type_p (win, for_type))
! win = fold_convert (for_type, win);
  
return win;
  }
--- 9033,9053 
}
  }
  
!   /* If we finally reach a constant see if it fits in sth smaller and
   in that case convert it.  */
!   if (TREE_CODE (win) == INTEGER_CST)
! {
!   tree wtype = TREE_TYPE (win);
!   unsigned prec = wi::min_precision (win, TYPE_SIGN (wtype));
!   if (for_type)
!   prec = MAX (prec, final_prec);
!   if (prec < TYPE_PRECISION (wtype))
!   {
! tree t = lang_hooks.types.type_for_size (prec, TYPE_UNSIGNED (wtype));
! if (t && TYPE_PRECISION (t) < TYPE_PRECISION (wtype))
!   win = fold_convert (t, win);
!   }
! }
  
return win;
  }
Index: gcc/convert.c
===
*** gcc/convert.c   (revision 246728)
--- gcc/convert.c   (working copy)
*** convert_to_real_maybe_fold (tree type, t
*** 413,418 
--- 413,495 
return convert_to_real_1 (type, expr, dofold || CONSTANT_CLASS_P (expr));
  }
  
+ /* Try to narrow EX_FORM ARG0 ARG1 in narrowed arg types producing a
+result in TYPE.  */
+ 
+ static tree
+ do_narrow (location_t loc,
+  enum tree_code ex_form, tree type, tree arg0, tree arg1,
+  tree expr, unsigned inprec, unsigned outprec, bool dofold)
+ {
+   /* Do the arithmetic in type TYPEX,
+  then convert result to TYPE.  */
+   tree typex = type;
+ 
+   /* Can't do arithmetic in enumeral types
+  so use an integer type that will hold the values.  */
+   if (TREE_CODE (typex) == ENUMERAL_TYPE)
+ typex = lang_hooks.types.type_for_size (TYPE_PRECISION (typex),
+   TYPE_UNSIGNED (typex));
+ 
+   /* But now perhaps TYPEX is as wide as INPREC.
+  In that case, do nothing special here.
+  (Otherwise would recurse infinitely in convert.  */
+   if (TYPE_PRECISION (typex) != inprec)
+ {
+   /* Don't do unsigned arithmetic where signed was wanted,
+or vice versa.
+Exception: if both of the original operands were
+unsigned then we can safely do the work as unsigned.
+Exception: shift operations take their type solely
+from the first argument.
+Exception: the LSHIFT_EXPR case above requires that
+we perform this operation unsigned lest we produce
+signed-overflow undefinedness.
+And we may need to do it as unsigned
+if we truncate to the original size.  */
+   if (TYPE_UNSIGNED (TREE_TYPE (expr))
+ || (TYPE_UNSIGNED (TREE_TYPE (arg0))
+ && (TYPE_UNSIGNED (TREE_TYPE (arg1))
+ || ex_form == LSHIFT_EXPR
+ || ex_form == RSHIFT_EXPR
+ || ex_form == LROTATE_EXPR
+ || ex_form == RROTATE_EXPR))
+ || ex_form == LSHIFT_EXPR
+ /* If we have !flag_wrapv, and either ARG0 or
+ARG1 is of a signed type, we have to do
+PLUS_EXPR, MINUS_EXPR or MULT_EXPR in an unsigned
+type in case the operation in outprec precision
+could overflow.  Otherwise, we would introduce
+signed-overflow undefinedness.  */
+ || ((!TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg0))
+  || !TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1)))
+ && ((TYPE_PRECISION (TREE_TYPE (arg0)) * 2u
+  > outprec)
+ || (TYPE_PRECISION (TREE_TYPE (arg1)) * 2u
+ > outprec))
+ && (ex_form == PLUS_EXPR
+ || ex_form == MINUS_EXPR
+ || ex_form == MULT_EXPR)))
+   {
+ if (!TYPE_UNSIGNED (typex))
+   typex = unsigned_type_for 

Re: [PATCH] Cherry-pick upstream r299036 from libsanitizer (PR sanitizer/80166).

2017-04-06 Thread Martin Liška
On 04/06/2017 02:57 PM, Jakub Jelinek wrote:
> On Thu, Apr 06, 2017 at 02:53:06PM +0200, Martin Liška wrote:
>> PING^1
> 
> I've acked that on IRC, with:
> marxin: please use PR sanitizer/80166 in the ChangeLog, ok with that

Sorry, I must have missed that due to connection troubles.

Anyway, thanks,
Martin

> 
>> On 03/31/2017 10:34 AM, Martin Liška wrote:
>>> Hello.
>>>
>>> Cherry-picking the commit to fix PR reported originally to the GCC.
>>> Ready to install after it finishes regression tests?
> 
>   Jakub
> 



Re: [PATCH, GCC/ARM, gcc-6-branch] Fix PR80082: LDRD erroneously used for 64bit load on ARMv7-R

2017-04-06 Thread Ramana Radhakrishnan
On Mon, Mar 27, 2017 at 12:15 PM, Thomas Preudhomme
 wrote:
> Hi,
>
> Currently GCC is happy to use LDRD to perform a 64bit load on ARMv7-R,
> as shown by the testcase in this patch. However, LDRD is only atomic
> when the LPAE extensions are available, which they are not for ARMv7-R. This
> commit solves the issue by introducing a new feature bit to distinguish
> LPAE extensions instead of deducing their presence from div instruction
> availability.
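
For readers unfamiliar with the kind of code affected, the following is a
rough sketch of a 64-bit atomic load and store written with the GCC
__atomic builtins. It is not the testcase from the patch; the function
names are invented for illustration. On a target without a single-copy
atomic 64-bit load instruction the compiler has to pick a different
sequence (or a library call) for load_u64.

```c
#include <stdint.h>

/* Atomically load a 64-bit value.  */
uint64_t load_u64(const uint64_t *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

/* Atomically store a 64-bit value.  */
void store_u64(uint64_t *p, uint64_t v)
{
    __atomic_store_n(p, v, __ATOMIC_SEQ_CST);
}

/* Store V into a local slot and read it back through the atomic load.  */
uint64_t roundtrip_u64(uint64_t v)
{
    uint64_t slot = 0;

    store_u64(&slot, v);
    return load_u64(&slot);
}
```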


Ok but with the testsuite fix that I just approved,  please also fix
in gcc-5 branch.

Thanks,
Ramana

>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2017-03-22  Thomas Preud'homme  
>
> PR target/80082
> * config/arm/arm-protos.h (FL_LPAE): Define macro.
> (FL_FOR_ARCH7VE): Add FL_LPAE.
> (arm_arch_lpae): Declare extern.
> * config/arm/arm.c (arm_arch_lpae): Declare.
> (arm_option_override): Define arm_arch_lpae.
> * config/arm/arm.h (TARGET_HAVE_LPAE): Redefine in terms of
> arm_arch_lpae.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2017-03-22  Thomas Preud'homme  
>
> PR target/80082
> * gcc.target/arm/atomic_loaddi_10.c: New testcase.
> * gcc.target/arm/atomic_loaddi_11.c: Likewise.
>
>
> Testing: bootstrapped for -march=armv7ve and testsuite shows no regression.
>
> Is this ok for gcc-6-branch?
>
> Best regards,
>
> Thomas


Re: [PATCH, GCC/testsuite/ARM, stage4, ping] Compile atomic_loaddi_11 for Cortex-R5

2017-04-06 Thread Ramana Radhakrishnan
On Tue, Apr 4, 2017 at 6:00 PM, Thomas Preudhomme
 wrote:
> Hi,
>
> gcc.target/arm/atomic_loaddi_11.c testcase contributed in r246365 does
> not test the changed code since ARMv7-R does not have division
> instructions in ARM state. This patch changes it to target Cortex-R5
> processor instead which does have division instructions in ARM state.
>
> ChangeLog entry is as follows:
>
> *** gcc/testsuite/ChangeLog ***
>
> 2017-03-22  Thomas Preud'homme  
> PR target/80082
> * gcc.target/arm/atomic_loaddi_11.c: Target Cortex-R5 instead of
> ARMv7-R.
>
> Is this ok for stage4?

OK.

Ramana
>
> Best regards,
>
> Thomas
>
> On 30/03/17 11:55, Thomas Preudhomme wrote:
>>
>> Ping?
>>
>> Best regards,
>>
>> Thomas
>>
>> On 23/03/17 17:09, Thomas Preudhomme wrote:
>>>
>>> My apologies, this works for either -march or -mcpu other than cortex-r4 in
>>> RUNTESTFLAGS.
>>>
>>> ChangeLog entry is unchanged:
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2017-03-22  Thomas Preud'homme
>>> PR target/80082
>>> * gcc.target/arm/atomic_loaddi_11.c: Target Cortex-R5 instead of
>>> ARMv7-R.
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> On 23/03/17 16:53, Thomas Preudhomme wrote:

>>>> Sorry, I forgot about -march. Hold on.
>>>>
>>>> On 23/03/17 16:51, Thomas Preudhomme wrote:
>>>>>
>>>>> Please find attached an updated patch. ChangeLog entry unchanged:
>>>>>
>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>
>>>>> 2017-03-22  Thomas Preud'homme
>>>>> PR target/80082
>>>>> * gcc.target/arm/atomic_loaddi_11.c: Target Cortex-R5 instead of
>>>>> ARMv7-R.
>>>>>
>>>>> Is this ok for stage4?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Thomas
>>>>>
>>>>> On 23/03/17 16:19, Thomas Preudhomme wrote:
>>>>>>
>>>>>> Mmmh I probably need to add a dg-skip-if in there. Will respin the
>>>>>> patch.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>> On 23/03/17 16:10, Richard Earnshaw (lists) wrote:
>>>>>>>
>>>>>>> On 23/03/17 16:02, Thomas Preudhomme wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> gcc.target/arm/atomic_loaddi_11.c testcase contributed in r246365
>>>>>>>> does
>>>>>>>> not test the changed code since ARMv7-R does not have division
>>>>>>>> instructions in ARM state. This patch changes it to target Cortex-R5
>>>>>>>> processor instead which does have division instructions in ARM
>>>>>>>> state.
>>>>>>>>
>>>>>>>> ChangeLog entry is as follows:
>>>>>>>>
>>>>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>>>>
>>>>>>>> 2017-03-22  Thomas Preud'homme
>>>>>>> Will that work properly if doing multilib testing with a specific CPU
>>>>>>> target?
>>>>>>>
>>>>>>> R.
>>>>>>>
>>>>>


Re: [PATCH] Cherry-pick upstream r299036 from libsanitizer (PR sanitizer/80166).

2017-04-06 Thread Jakub Jelinek
On Thu, Apr 06, 2017 at 02:53:06PM +0200, Martin Liška wrote:
> PING^1

I've acked that on IRC, with:
marxin: please use PR sanitizer/80166 in the ChangeLog, ok with that

> On 03/31/2017 10:34 AM, Martin Liška wrote:
> > Hello.
> > 
> > Cherry-picking the commit to fix PR reported originally to the GCC.
> > Ready to install after it finishes regression tests?

Jakub


Re: [PATCH] On x86 allow if-conversion of more than one insn as long as there is at most one cmov (PR tree-optimization/79390)

2017-04-06 Thread Jakub Jelinek
On Thu, Apr 06, 2017 at 02:50:27PM +0200, Rainer Orth wrote:
> 2017-04-06  Rainer Orth  
> 
>   * gcc.target/i386/pr79390.c: Allow for cmovl.a.

Please mention PR tree-optimization/79390 in the ChangeLog
and commit message.  Ok with that change.

> # HG changeset patch
> # Parent  7c92d635959dcb1a757b301344d8519dde9e1e7a
> Fix gcc.target/i386/pr79390.c for Solaris as
> 
> diff --git a/gcc/testsuite/gcc.target/i386/pr79390.c 
> b/gcc/testsuite/gcc.target/i386/pr79390.c
> --- a/gcc/testsuite/gcc.target/i386/pr79390.c
> +++ b/gcc/testsuite/gcc.target/i386/pr79390.c
> @@ -25,4 +25,4 @@ foo (void)
>return jp;
>  }
>  
> -/* { dg-final { scan-assembler "\[ \\t\]cmov\[a-z]+\[ \\t\]" } } */
> +/* { dg-final { scan-assembler "\[ \\t\]cmov\[a-z.]+\[ \\t\]" } } */

Jakub


Re: [PATCH] Cherry-pick upstream r299036 from libsanitizer (PR sanitizer/80166).

2017-04-06 Thread Martin Liška
PING^1

On 03/31/2017 10:34 AM, Martin Liška wrote:
> Hello.
> 
> Cherry-picking the commit to fix PR reported originally to the GCC.
> Ready to install after it finishes regression tests?
> 
> Thanks,
> Martin
> 



Re: [PATCH] Support multiple files w/ -i option in gcov (PR gcov-profile/80224).

2017-04-06 Thread Martin Liška
PING^1

On 03/28/2017 11:06 AM, Martin Liška wrote:
> Hello.
> 
> This fixes the PR, in which the intermediate format is currently dumped in a
> slightly different manner.
> I believe it can share the majority of file creation (and destruction) with
> the normal format.
> Apart from that I refined the usage string from:
> 
> Usage: gcov [OPTION]... SOURCE|OBJ...
> 
> to:
> Usage: gcov [OPTION...] SOURCE|OBJ...
> 
> Patch survives make check -k -j10 RUNTESTFLAGS="gcov.exp"
> 
> Ready for trunk?
> Thanks,
> Martin
> 



Re: [PATCH] On x86 allow if-conversion of more than one insn as long as there is at most one cmov (PR tree-optimization/79390)

2017-04-06 Thread Rainer Orth
Jeff Law  writes:

> On 04/01/2017 06:20 AM, Jakub Jelinek wrote:
>> Hi!
>>
>> As discussed in the PR, in the following testcase we don't if-convert
>> with the generic (and many other) tuning, because we default to
>> --param max-rtl-if-conversion-insns=1 in most of the tunings.
>> The problem we have is with multiple cmov instructions, but in the
>> testcase there is just one cmov and the other insn is turned into a SSE
>> max insn, which is fine.
>>
>> This patch stops artificially lowering that param, and for one_if_conv_insn
>> tuning it instead rejects the if-conversion if the resulting sequence has
>> multiple cmov instructions.  The hook is passed if_info too, so it can
>> in the future do better heuristics based on predictability of the edges,
>> how far the uses of the cmov result are (I assume cmov major problem is
>> latency, right?) etc.
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>
>> 2017-04-01  Jakub Jelinek  
>>
>>  PR tree-optimization/79390
>>  * target.h (struct noce_if_info): Declare.
>>  * targhooks.h (default_noce_conversion_profitable_p): Declare.
>>  * target.def (noce_conversion_profitable_p): New target hook.
>>  * ifcvt.h (struct noce_if_info): New type, moved from ...
>>  * ifcvt.c (struct noce_if_info): ... here.
>>  (noce_conversion_profitable_p): Renamed to ...
>>  (default_noce_conversion_profitable_p): ... this.  No longer
>>  static nor inline.
>>  (noce_try_store_flag_constants, noce_try_addcc,
>>  noce_try_store_flag_mask, noce_try_cmove, noce_try_cmove_arith,
>>  noce_convert_multiple_sets): Use targetm.noce_conversion_profitable_p
>>  instead of noce_conversion_profitable_p.
>>  * config/i386/i386.c: Include ifcvt.h.
>>  (ix86_option_override_internal): Don't override
>>  PARAM_MAX_RTL_IF_CONVERSION_INSNS default.
>>  (ix86_noce_conversion_profitable_p): New function.
>>  (TARGET_NOCE_CONVERSION_PROFITABLE_P): Redefine.
>>  * config/i386/x86-tune.def (X86_TUNE_ONE_IF_CONV_INSN): Adjust comment.
>>  * doc/tm.texi.in (TARGET_NOCE_CONVERSION_PROFITABLE_P): Add.
>>  * doc/tm.texi: Regenerated.
>>
>>  * gcc.target/i386/pr79390.c: New test.
>>  * gcc.dg/ifcvt-4.c: Use -mtune-ctrl=^one_if_conv_insn for i?86/x86_64.
> OK.

the new test FAILs on Solaris/x86 with /bin/as:

FAIL: gcc.target/i386/pr79390.c scan-assembler [ t]cmov[a-z]+[ t]

That's because gcc emits

cmovl.a %edx, %eax

instead of

cmova   %edx, %eax
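
As a quick way to see the difference between the old and the new pattern,
here is a sketch using POSIX extended regular expressions as a stand-in for
the Tcl regexps DejaGnu actually uses. The helper name line_matches and the
two pattern variables are made up for this illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Pattern before the fix: mnemonic made of lowercase letters only.  */
const char *const old_pattern = "[ \t]cmov[a-z]+[ \t]";
/* Pattern after the fix: also accept a '.' inside the mnemonic.  */
const char *const new_pattern = "[ \t]cmov[a-z.]+[ \t]";

/* Return 1 if PATTERN matches somewhere in LINE, 0 if it does not,
   and -1 if PATTERN cannot be compiled.  */
int line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

The Solaris spelling cmovl.a is rejected by the old pattern because the '.'
stops the letter run before any whitespace, while both spellings satisfy
the new one.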

Fixed as follows, tested with the appropriate runtest invocations on
i386-pc-solaris2.12 and x86_64-pc-linux-gnu.

I guess this is obvious?

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-04-06  Rainer Orth  

* gcc.target/i386/pr79390.c: Allow for cmovl.a.

# HG changeset patch
# Parent  7c92d635959dcb1a757b301344d8519dde9e1e7a
Fix gcc.target/i386/pr79390.c for Solaris as

diff --git a/gcc/testsuite/gcc.target/i386/pr79390.c b/gcc/testsuite/gcc.target/i386/pr79390.c
--- a/gcc/testsuite/gcc.target/i386/pr79390.c
+++ b/gcc/testsuite/gcc.target/i386/pr79390.c
@@ -25,4 +25,4 @@ foo (void)
   return jp;
 }
 
-/* { dg-final { scan-assembler "\[ \\t\]cmov\[a-z]+\[ \\t\]" } } */
+/* { dg-final { scan-assembler "\[ \\t\]cmov\[a-z.]+\[ \\t\]" } } */


[PATCH] Refactor tree-affine.c wide_int_ext_for_comb

2017-04-06 Thread Richard Biener

This makes a following real change smaller, thus I am testing
this refactoring separately at this stage.

Bootstrap / regtest in progress on x86_64-unknown-linux-gnu.

Richard.

2017-04-06  Richard Biener  

* tree-affine.c (wide_int_ext_for_comb): Take type rather
than aff_tree.
(aff_combination_const): Adjust.
(aff_combination_scale): Likewise.
(aff_combination_add_elt): Likewise.
(aff_combination_add_cst): Likewise.
(aff_combination_convert): Likewise.
(add_elt_to_tree): Likewise.  Remove unused argument.
(aff_combination_to_tree): Adjust calls to add_elt_to_tree.
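
For context, the helper only needs the precision of the type, which is why
passing the type instead of the whole aff_tree is enough. A simplified
sketch of sign extension to a given precision follows; it works on int64_t
rather than widest_int, and sext_to_precision is an illustrative name, not
a GCC function.

```c
#include <stdint.h>

/* Sign-extend the low PREC bits of VALUE.  PREC must be in 1..64.  */
int64_t sext_to_precision(int64_t value, unsigned prec)
{
    uint64_t sign, low;

    if (prec >= 64)
        return value;

    sign = UINT64_C(1) << (prec - 1);                    /* sign bit mask */
    low = (uint64_t) value & ((UINT64_C(1) << prec) - 1); /* keep PREC bits */

    /* Flipping the sign bit and subtracting it again propagates it
       into the upper bits; both operands fit in int64_t here.  */
    return (int64_t) (low ^ sign) - (int64_t) sign;
}
```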

Index: gcc/tree-affine.c
===
--- gcc/tree-affine.c   (revision 246727)
+++ gcc/tree-affine.c   (working copy)
@@ -34,9 +34,9 @@ along with GCC; see the file COPYING3.
 /* Extends CST as appropriate for the affine combinations COMB.  */
 
 widest_int
-wide_int_ext_for_comb (const widest_int &cst, aff_tree *comb)
+wide_int_ext_for_comb (const widest_int &cst, tree type)
 {
-  return wi::sext (cst, TYPE_PRECISION (comb->type));
+  return wi::sext (cst, TYPE_PRECISION (type));
 }
 
 /* Initializes affine combination COMB so that its value is zero in TYPE.  */
@@ -59,7 +59,7 @@ void
 aff_combination_const (aff_tree *comb, tree type, const widest_int &cst)
 {
   aff_combination_zero (comb, type);
-  comb->offset = wide_int_ext_for_comb (cst, comb);;
+  comb->offset = wide_int_ext_for_comb (cst, comb->type);;
 }
 
 /* Sets COMB to single element ELT.  */
@@ -81,7 +81,7 @@ aff_combination_scale (aff_tree *comb, c
 {
   unsigned i, j;
 
-  widest_int scale = wide_int_ext_for_comb (scale_in, comb);
+  widest_int scale = wide_int_ext_for_comb (scale_in, comb->type);
   if (scale == 1)
 return;
 
@@ -91,11 +91,11 @@ aff_combination_scale (aff_tree *comb, c
   return;
 }
 
-  comb->offset = wide_int_ext_for_comb (scale * comb->offset, comb);
+  comb->offset = wide_int_ext_for_comb (scale * comb->offset, comb->type);
   for (i = 0, j = 0; i < comb->n; i++)
 {
   widest_int new_coef
-   = wide_int_ext_for_comb (scale * comb->elts[i].coef, comb);
+   = wide_int_ext_for_comb (scale * comb->elts[i].coef, comb->type);
   /* A coefficient may become zero due to overflow.  Remove the zero
 elements.  */
   if (new_coef == 0)
@@ -132,7 +132,7 @@ aff_combination_add_elt (aff_tree *comb,
   unsigned i;
   tree type;
 
-  widest_int scale = wide_int_ext_for_comb (scale_in, comb);
+  widest_int scale = wide_int_ext_for_comb (scale_in, comb->type);
   if (scale == 0)
 return;
 
@@ -140,7 +140,7 @@ aff_combination_add_elt (aff_tree *comb,
 if (operand_equal_p (comb->elts[i].val, elt, 0))
   {
widest_int new_coef
- = wide_int_ext_for_comb (comb->elts[i].coef + scale, comb);
+ = wide_int_ext_for_comb (comb->elts[i].coef + scale, comb->type);
if (new_coef != 0)
  {
comb->elts[i].coef = new_coef;
@@ -191,7 +191,7 @@ aff_combination_add_elt (aff_tree *comb,
 static void
 aff_combination_add_cst (aff_tree *c, const widest_int &cst)
 {
-  c->offset = wide_int_ext_for_comb (c->offset + cst, c);
+  c->offset = wide_int_ext_for_comb (c->offset + cst, c->type);
 }
 
 /* Adds COMB2 to COMB1.  */
@@ -230,7 +230,7 @@ aff_combination_convert (aff_tree *comb,
   if (TYPE_PRECISION (type) == TYPE_PRECISION (comb_type))
 return;
 
-  comb->offset = wide_int_ext_for_comb (comb->offset, comb);
+  comb->offset = wide_int_ext_for_comb (comb->offset, comb->type);
   for (i = j = 0; i < comb->n; i++)
 {
   if (comb->elts[i].coef == 0)
@@ -374,15 +374,14 @@ tree_to_aff_combination (tree expr, tree
combination COMB.  */
 
 static tree
-add_elt_to_tree (tree expr, tree type, tree elt, const widest_int &scale_in,
-aff_tree *comb ATTRIBUTE_UNUSED)
+add_elt_to_tree (tree expr, tree type, tree elt, const widest_int &scale_in)
 {
   enum tree_code code;
   tree type1 = type;
   if (POINTER_TYPE_P (type))
 type1 = sizetype;
 
-  widest_int scale = wide_int_ext_for_comb (scale_in, comb);
+  widest_int scale = wide_int_ext_for_comb (scale_in, type);
 
   if (scale == -1
   && POINTER_TYPE_P (TREE_TYPE (elt)))
@@ -466,11 +465,10 @@ aff_combination_to_tree (aff_tree *comb)
   gcc_assert (comb->n == MAX_AFF_ELTS || comb->rest == NULL_TREE);
 
   for (i = 0; i < comb->n; i++)
-expr = add_elt_to_tree (expr, type, comb->elts[i].val, comb->elts[i].coef,
-   comb);
+expr = add_elt_to_tree (expr, type, comb->elts[i].val, comb->elts[i].coef);
 
   if (comb->rest)
-expr = add_elt_to_tree (expr, type, comb->rest, 1, comb);
+expr = add_elt_to_tree (expr, type, comb->rest, 1);
 
   /* Ensure that we get x - 1, not x + (-1) or x + 0xff..f if x is
  unsigned.  */
@@ -484,8 +482,7 @@ aff_combination_to_tree (aff_tree *comb)
   off = comb->offset;
   sgn = 1;
 }
-  return add_elt_to_tree (expr, 

[PATCH] Fix PR80334

2017-04-06 Thread Richard Biener

The following patch makes sure to preserve (mis-)alignment of memory
references when IVOPTs generates TARGET_MEM_REFs for them.
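
To show where such misaligned accesses come from, here is a C sketch that
mirrors the layout of the testcase (which itself is C++). It only does
offset arithmetic; the function name is illustrative, and the numbers in
the comments assume GCC's usual layout for packed structs.

```c
#include <stddef.h>

struct A { _Alignas(16) char c; };                  /* size 16, align 16 */
struct B { struct A unpacked; char d; } __attribute__((packed)); /* size 17 */

/* Misalignment, in bytes, of element I's 'unpacked' member relative to
   the alignment struct A normally has, assuming the array itself starts
   on a 16-byte boundary.  */
size_t unpacked_misalignment(size_t i)
{
    size_t offset = i * sizeof(struct B) + offsetof(struct B, unpacked);

    return offset % _Alignof(struct A);
}
```

Only element 0 of such an array has its member at the alignment the type
of the member promises, so a memory reference built from that type alone
would claim more alignment than the access really has.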

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2017-04-06  Richard Biener  

PR tree-optimization/80334
* tree-ssa-loop-ivopts.c (rewrite_use_address): Properly
preserve alignment of accesses.

* g++.dg/torture/pr80334.C: New testcase.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 246724)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -7396,7 +7396,11 @@ rewrite_use_address (struct ivopts_data
 base_hint = var_at_stmt (data->current_loop, cand, use->stmt);
 
   iv = var_at_stmt (data->current_loop, cand, use->stmt);
-  ref = create_mem_ref (, TREE_TYPE (*use->op_p), ,
+  tree type = TREE_TYPE (*use->op_p);
+  unsigned int align = get_object_alignment (*use->op_p);
+  if (align != TYPE_ALIGN (type))
+type = build_aligned_type (type, align);
+  ref = create_mem_ref (, type, ,
reference_alias_ptr_type (*use->op_p),
iv, base_hint, data->speed);
   copy_ref_info (ref, *use->op_p);
Index: gcc/testsuite/g++.dg/torture/pr80334.C
===
--- gcc/testsuite/g++.dg/torture/pr80334.C  (nonexistent)
+++ gcc/testsuite/g++.dg/torture/pr80334.C  (working copy)
@@ -0,0 +1,18 @@
+// { dg-do run }
+
+struct A { alignas(16) char c; };
+struct B { A unpacked; char d; } __attribute__((packed));
+
+char x;
+
+int
+main()
+{
+  alignas(16) B b[3];
+  for (int i = 0; i < 3; i++) b[i].unpacked.c = 'a' + i;
+  for (int i = 0; i < 3; i++)
+{
+  auto a = new A(b[i].unpacked);
+  x = a->c;
+}
+}


[PATCH] Fix ICE/wrong-code with address-spaces and SRA/SCCVN

2017-04-06 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-04-06  Richard Biener  

PR tree-optimization/80262
* tree-sra.c (build_ref_for_offset): Preserve address-space
information.
* tree-ssa-sccvn.c (vn_reference_maybe_forwprop_address):
Drop useless address-space information on MEM_REF offsets.

* gcc.target/i386/pr80262.c: New testcase.

Index: gcc/tree-sra.c
===
*** gcc/tree-sra.c  (revision 246724)
--- gcc/tree-sra.c  (working copy)
*** build_ref_for_offset (location_t loc, tr
*** 1638,1643 
--- 1638,1650 
unsigned HOST_WIDE_INT misalign;
unsigned int align;
  
+   /* Preserve address-space information.  */
+   addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (base));
+   if (as != TYPE_ADDR_SPACE (exp_type))
+ exp_type = build_qualified_type (exp_type,
+TYPE_QUALS (exp_type)
+| ENCODE_QUAL_ADDR_SPACE (as));
+ 
gcc_checking_assert (offset % BITS_PER_UNIT == 0);
get_object_alignment_1 (base, , );
base = get_addr_base_and_unit_offset (base, _offset);
Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c(revision 246724)
--- gcc/tree-ssa-sccvn.c(working copy)
*** vn_reference_maybe_forwprop_address (vec
*** 1233,1240 
  && tem[tem.length () - 2].opcode == MEM_REF)
{
  vn_reference_op_t new_mem_op = [tem.length () - 2];
! new_mem_op->op0 = fold_convert (TREE_TYPE (mem_op->op0),
! new_mem_op->op0);
}
  else
gcc_assert (tem.last ().opcode == STRING_CST);
--- 1233,1240 
  && tem[tem.length () - 2].opcode == MEM_REF)
{
  vn_reference_op_t new_mem_op = [tem.length () - 2];
! new_mem_op->op0 = wide_int_to_tree (TREE_TYPE (mem_op->op0),
! new_mem_op->op0);
}
  else
gcc_assert (tem.last ().opcode == STRING_CST);
Index: gcc/testsuite/gcc.target/i386/pr80262.c
===
*** gcc/testsuite/gcc.target/i386/pr80262.c (nonexistent)
--- gcc/testsuite/gcc.target/i386/pr80262.c (working copy)
***
*** 0 
--- 1,26 
+ /* { dg-do compile } */
+ /* { dg-options "-O2" } */
+ 
+ typedef struct {
+   int v;
+ } S1;
+ S1 clearS1 () { S1 s1 = { 0 }; return s1; }
+  
+ typedef struct {
+   S1 s1[4];
+ } S2;
+ void clearS2 (__seg_gs S2* p, int n) {
+   for (int i = 0; i < n; ++i)
+ p->s1[i] = clearS1 ();
+ }
+  
+ typedef struct {
+   int pad;
+   S2 s2;
+ } S3;
+  
+ long int BASE;
+  
+ void fn1(int n) {
+   clearS2 (&(((__seg_gs S3*)(BASE))->s2), n);
+ }


[Patch, testsuite] Fix failing builtin-sprintf-warn-{3,10}.c for avr

2017-04-06 Thread Senthil Kumar Selvaraj
Hi,

  This patch fixes a whole bunch of failures reported for
  gcc.dg/tree-ssa/builtin-sprintf-warn-{3,10}.c for the avr target.

  builtin-sprintf-warn-10.c fails because the bounds in the warning
  messages expect 4 digit wide exponents i.e. __DBL_MAX_EXP__ > 999.
  For the avr, floats and doubles are both 32 bits wide, __DBL_MAX_EXP__
  == 128, and the max number of exponent digits can only be 3.
  The computed size thus ends up one short of the value the test
  expects. The patch makes the test run only for targets with double64plus.
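
  The digit counts mentioned above can be checked with a few lines of C.
  This is only a sketch with an invented helper name; the values 1024 and
  128 are the __DBL_MAX_EXP__ figures for a 64-bit double and for the
  32-bit double on avr, respectively.

```c
/* Number of decimal digits needed to print the non-negative value N.  */
int decimal_digits(int n)
{
    int digits = 1;

    while (n >= 10) {
        n /= 10;
        digits++;
    }
    return digits;
}
```

  decimal_digits (1024) is 4 while decimal_digits (128) is 3, which
  accounts for the expected byte counts being off by one on avr.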

  builtin-sprintf-warn-3.c fails because the test appears to assume all
  non lp64 targets to be ilp32. For the avr, pointer size and int size
  are equal, but both are 16 bits, not 32. The patch fixes this by
  explicitly adding avr to the dejagnu selector.

  Ok for trunk?

Regards
Senthil

gcc/testsuite/ChangeLog:

2017-04-06  Senthil Kumar Selvaraj  

* gcc.dg/tree-ssa/builtin-sprintf-warn-10.c: Require double64plus.
* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (void test_too_large): Add
  avr-*-* to non-lp64 selector.


diff --git gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-10.c 
gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-10.c
index 1213e89f7bb..30599ad04dc 100644
--- gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-10.c
+++ gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-10.c
@@ -2,6 +2,7 @@
Test to verify the correctness of ranges of output computed for floating
point directives.
{ dg-do compile }
+   { dg-require-effective-target double64plus }
{ dg-options "-O2 -Wformat -Wformat-overflow -ftrack-macro-expansion=0" } */
 
 typedef __builtin_va_list va_list;
diff --git gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c 
gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
index 72ec3afaa41..9db7ad74f37 100644
--- gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
+++ gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
@@ -358,19 +358,19 @@ void test_too_large (char *d, int x, __builtin_va_list va)
 
   __builtin_snprintf (d, imax,"%c", x);
   __builtin_snprintf (d, imax_p1, "%c", x);   /* { dg-warning "specified bound 
\[0-9\]+ exceeds .INT_MAX." "INT_MAX + 1" { target lp64 } } */
-  /* { dg-warning "specified bound \[0-9\]+ exceeds maximum object size" 
"INT_MAX + 1" { target { ilp32 } } .-1 } */
+  /* { dg-warning "specified bound \[0-9\]+ exceeds maximum object size" 
"INT_MAX + 1" { target { { avr-*-* } || ilp32 } } .-1 } */
 
   __builtin_vsnprintf (d, imax,"%c", va);
   __builtin_vsnprintf (d, imax_p1, "%c", va);   /* { dg-warning "specified 
bound \[0-9\]+ exceeds .INT_MAX." "INT_MAX + 1" { target lp64 } } */
-  /* { dg-warning "specified bound \[0-9\]+ exceeds maximum object size" 
"INT_MAX + 1" { target { ilp32 } } .-1 } */
+  /* { dg-warning "specified bound \[0-9\]+ exceeds maximum object size" 
"INT_MAX + 1" { target { { avr-*-* } || ilp32 } } .-1 } */
 
   __builtin___snprintf_chk (d, imax,0, imax,"%c", x);
   __builtin___snprintf_chk (d, imax_p1, 0, imax_p1, "%c", x);   /* { 
dg-warning "specified bound \[0-9\]+ exceeds .INT_MAX." "INT_MAX + 1" { target 
lp64 } } */
-  /* { dg-warning "specified bound \[0-9\]+ exceeds maximum object size" 
"INT_MAX + 1" { target { ilp32 } } .-1 } */
+  /* { dg-warning "specified bound \[0-9\]+ exceeds maximum object size" 
"INT_MAX + 1" { target { { avr-*-* } || ilp32 } } .-1 } */
 
   __builtin___vsnprintf_chk (d, imax,0, imax,"%c", va);
   __builtin___vsnprintf_chk (d, imax_p1, 0, imax_p1, "%c", va);   /* { 
dg-warning "specified bound \[0-9\]+ exceeds .INT_MAX." "INT_MAX + 1" { target 
lp64 } } */
-  /* { dg-warning "specified bound \[0-9\]+ exceeds maximum object size" 
"INT_MAX + 1" { target { ilp32 } } .-1 } */
+  /* { dg-warning "specified bound \[0-9\]+ exceeds maximum object size" 
"INT_MAX + 1" { target { { avr-*-* } || ilp32 } } .-1 } */
 
   const size_t ptrmax = __PTRDIFF_MAX__;
   const size_t ptrmax_m1 = ptrmax - 1;


Re: [PR 79905] ICE with vector_type

2017-04-06 Thread Nathan Sidwell

Let's try this one then.

nathan

--
Nathan Sidwell
2017-04-05  Nathan Sidwell  

	PR target/79905
	* config/rs6000/rs6000.c (rs6000_vt): New.
	(rs6000_init_builtins): Use it.

	testsuite/
	* g++.dg/torture/pr79905.C: New.

Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 246647)
+++ config/rs6000/rs6000.c	(working copy)
@@ -17257,6 +17257,23 @@ rs6000_expand_builtin (tree exp, rtx tar
   gcc_unreachable ();
 }
 
+/* Create a builtin vector type with a name.  Taking care not to give
+   the canonical type a name.  */
+
+static tree
+rs6000_vt (const char *name, tree elt_type, unsigned num_elts)
+{
+  tree result = build_vector_type (elt_type, num_elts);
+
+  if (result == TYPE_CANONICAL (result))
+/* Copy so we don't give the canonical type a name.  */
+result = build_variant_type_copy (result);
+
+  add_builtin_type (name, result);
+
+  return result;
+}
+
 static void
 rs6000_init_builtins (void)
 {
@@ -17273,18 +17290,25 @@ rs6000_init_builtins (void)
 
   V2SI_type_node = build_vector_type (intSI_type_node, 2);
   V2SF_type_node = build_vector_type (float_type_node, 2);
-  V2DI_type_node = build_vector_type (intDI_type_node, 2);
-  V2DF_type_node = build_vector_type (double_type_node, 2);
+  V2DI_type_node = rs6000_vt (TARGET_POWERPC64 ? "__vector long"
+			  : "__vector long long", intDI_type_node, 2);
+  V2DF_type_node = rs6000_vt ("__vector double", double_type_node, 2);
   V4HI_type_node = build_vector_type (intHI_type_node, 4);
-  V4SI_type_node = build_vector_type (intSI_type_node, 4);
-  V4SF_type_node = build_vector_type (float_type_node, 4);
-  V8HI_type_node = build_vector_type (intHI_type_node, 8);
-  V16QI_type_node = build_vector_type (intQI_type_node, 16);
-
-  unsigned_V16QI_type_node = build_vector_type (unsigned_intQI_type_node, 16);
-  unsigned_V8HI_type_node = build_vector_type (unsigned_intHI_type_node, 8);
-  unsigned_V4SI_type_node = build_vector_type (unsigned_intSI_type_node, 4);
-  unsigned_V2DI_type_node = build_vector_type (unsigned_intDI_type_node, 2);
+  V4SI_type_node = rs6000_vt ("__vector signed int", intSI_type_node, 4);
+  V4SF_type_node = rs6000_vt ("__vector float", float_type_node, 4);
+  V8HI_type_node = rs6000_vt ("__vector signed short", intHI_type_node, 8);
+  V16QI_type_node = rs6000_vt ("__vector signed char", intQI_type_node, 16);
+
+  unsigned_V16QI_type_node = rs6000_vt ("__vector unsigned char",
+	unsigned_intQI_type_node, 16);
+  unsigned_V8HI_type_node = rs6000_vt ("__vector unsigned short",
+   unsigned_intHI_type_node, 8);
+  unsigned_V4SI_type_node = rs6000_vt ("__vector unsigned int",
+   unsigned_intSI_type_node, 4);
+  unsigned_V2DI_type_node = rs6000_vt (TARGET_POWERPC64
+   ? "__vector unsigned long"
+   : "__vector unsigned long long",
+   unsigned_intDI_type_node, 2);
 
   opaque_V2SF_type_node = build_opaque_vector_type (float_type_node, 2);
   opaque_V2SI_type_node = build_opaque_vector_type (intSI_type_node, 2);
@@ -17299,8 +17323,9 @@ rs6000_init_builtins (void)
  must live in VSX registers.  */
   if (intTI_type_node)
 {
-  V1TI_type_node = build_vector_type (intTI_type_node, 1);
-  unsigned_V1TI_type_node = build_vector_type (unsigned_intTI_type_node, 1);
+  V1TI_type_node = rs6000_vt ("__vector __int128", intTI_type_node, 1);
+  unsigned_V1TI_type_node = rs6000_vt ("__vector unsigned __int128",
+	   unsigned_intTI_type_node, 1);
 }
 
   /* The 'vector bool ...' types must be kept distinct from 'vector unsigned ...'
@@ -17432,83 +17457,16 @@ rs6000_init_builtins (void)
   tdecl = add_builtin_type ("__pixel", pixel_type_node);
   TYPE_NAME (pixel_type_node) = tdecl;
 
-  bool_V16QI_type_node = build_vector_type (bool_char_type_node, 16);
-  bool_V8HI_type_node = build_vector_type (bool_short_type_node, 8);
-  bool_V4SI_type_node = build_vector_type (bool_int_type_node, 4);
-  bool_V2DI_type_node = build_vector_type (bool_long_type_node, 2);
-  pixel_V8HI_type_node = build_vector_type (pixel_type_node, 8);
-
-  tdecl = add_builtin_type ("__vector unsigned char", unsigned_V16QI_type_node);
-  TYPE_NAME (unsigned_V16QI_type_node) = tdecl;
-
-  tdecl = add_builtin_type ("__vector signed char", V16QI_type_node);
-  TYPE_NAME (V16QI_type_node) = tdecl;
-
-  tdecl = add_builtin_type ("__vector __bool char", bool_V16QI_type_node);
-  TYPE_NAME (bool_V16QI_type_node) = tdecl;
-
-  tdecl = add_builtin_type ("__vector unsigned short", unsigned_V8HI_type_node);
-  TYPE_NAME (unsigned_V8HI_type_node) = tdecl;
-
-  tdecl = add_builtin_type ("__vector signed short", V8HI_type_node);
-  TYPE_NAME (V8HI_type_node) = tdecl;
-
-  tdecl = add_builtin_type ("__vector __bool short", bool_V8HI_type_node);
-  TYPE_NAME (bool_V8HI_type_node) = tdecl;
-
-  tdecl = add_builtin_type ("__vector unsigned int", unsigned_V4SI_type_node);
-  TYPE_NAME (unsigned_V4SI_type_node) = 

Re: [PATCH] Fix PR80281

2017-04-06 Thread Richard Biener
On Wed, 5 Apr 2017, Christophe Lyon wrote:

> On 5 April 2017 at 13:41, Bin.Cheng  wrote:
> > On Wed, Apr 5, 2017 at 12:38 PM, Markus Trippelsdorf
> >  wrote:
> >> On 2017.04.03 at 15:20 +0200, Richard Biener wrote:
> >>> I'm re-testing the following variant.
> >>>
> >>> Richard.
> >>>
> >>> 2017-04-03  Richard Biener  
> >>>
> >>>   PR middle-end/80281
> >>>   * match.pd (A + (-B) -> A - B): Make sure to preserve unsigned
> >>>   arithmetic done for the negate or the plus.  Simplify.
> >>>   (A - (-B) -> A + B): Likewise.
> >>>   * fold-const.c (split_tree): Make sure to not negate pointers.
> >>>
> >>>   * gcc.dg/torture/pr80281.c: New testcase.
> >>
> >> gcc.dg/tree-ssa/pr40921.c started to fail with -march=skylake:
> >>
> >>  % gcc -march=skylake -c -O2 -fdump-tree-optimized -ffast-math -c gcc.dg/tree-ssa/pr40921.c
> >>  % cat pr40921.i.227t.optimized | grep "\-y"
> >>_3 = -y_4(D);
> > Also on AArch64.
> >
> 
> And on some arm configurations, if that's easier to reproduce:
> * -mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard
> * --with-cpu=cortex-a15 --with-fpu=neon-vfpv4
> * --with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8

These are all spurious -- when you allow FMAs to be detected there'll
be a unary minus, but which SSA name is negated depends on SSA name
allocation.

It's somewhat hard to fortify the testcase against the FMA case so
the following simply turns off FMA detection.
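
For reference, a minimal sketch of the identity the testcase relies on,
using the unsigned variant where wraparound makes the two forms exactly
equal.  The function names are illustrative and not part of the testcase.

```c
/* Form as written in the testcase: add a negated product.  */
unsigned int add_negated (unsigned int x, unsigned int y, unsigned int z)
{
  return x + (-y * z * z);
}

/* Form without the unary minus: a plain subtraction.  Unsigned
   arithmetic wraps modulo 2^N, so both functions always agree.  */
unsigned int subtract (unsigned int x, unsigned int y, unsigned int z)
{
  return x - y * z * z;
}
```

The scan-tree-dump directive checks that no "-y" negate survives in the
optimized dump; with FMA detection enabled a negate can legitimately
reappear in the floating-point variants.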

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-04-06  Richard Biener  

PR middle-end/80281
* gcc.dg/tree-ssa/pr40921.c: Add -ffp-contract=off.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr40921.c (revision 246725)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr40921.c (working copy)
@@ -1,26 +1,24 @@
-
 /* { dg-do compile } */
-/* { dg-options "-O2  -fdump-tree-optimized -ffast-math" } */
+/* { dg-options "-O2  -fdump-tree-optimized -ffast-math -ffp-contract=off" } */
 
 unsigned int foo (unsigned int x, unsigned int y, unsigned int z)
 {
-  return x + (-y * z * z);
+  return x + (-y * z * z);
 }
 
 float bar (float x, float y, float z)
 {
-  return x + (-y * z * z);
+  return x + (-y * z * z);
 }
 
 float bar2 (float x, float y, float z)
 {
-  return x + (-y * z * z * 5.0f);
+  return x + (-y * z * z * 5.0f);
 }
 
 float bar3 (float x, float y, float z)
 {
-  return x + (-y * x * -z);
+  return x + (-y * x * -z);
 }
 
-
 /* { dg-final { scan-tree-dump-times "_* = -y_" 0 "optimized" } } */


[unified-autovect: Patch 1b/N] Instruction tile and grammar creation.

2017-04-06 Thread Sameera Deshpande
Hi Richard,

This is regarding the implementation of auto-vectorization based on my talk
titled "Improving the effectiveness and generality of GCC auto-vectorization"
at the GNU Tools Cauldron 2015.

I have applied the previous patches to the 'unified-autovect' branch.  They
generate a primitive reorder tree (PRT) for load/store operations and perform
a few optimizations to ensure that an optimal PRT is available for pattern
matching.  The options to enable unified auto-vectorization are
'-ftree-vectorize -fopt-info-vec-all=file.txt -O2
-ftree-loop-vectorize-unified'.

I am now working on a pattern matcher for the PRT using a bottom-up rewrite
system (BURS), as described in "BURS Automata Generation" by Todd Proebsting.
The paper describes an algorithm to generate BURS transition tables for
cost-effective pattern matching of complex target instructions.  It takes
target instructions represented as a context-free grammar in normal form as
input, and generates the transition table at build time of the compiler.

We have divided this problem into 3 parts:
1. Grammar creation for permute instructions of the desired target at build
time.
2. Transition table generation, and generation of the code that runs at
compile time, both done at build time.
3. Pattern matching code at compile time.

The attached patch addresses the first part.

As the number of states in the automaton increases exponentially with the
number of operators to be supported, I wanted to restrict the operators
considered when generating the BURS automaton.  As the scalar-to-vector
translator for arithmetic and logical instructions is already in place in
GCC, and those operators do not play any role in permute order creation, the
BURS is generated only for permute operators.  For all other operators,
GCC's current framework is to be used for code generation.

Hence, the only operators that are supported by the automaton are
 ILV_2    : Interleave the contents of 2 registers of size x and generate a
register of size 2x.
 EXTR_2,0 : Extract the even elements from a register of size x and generate
a register of size x/2.
 EXTR_2,1 : Extract the odd elements from a register of size x and generate a
register of size x/2.
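
As a rough scalar model of these three operators (a sketch only: the
function names and the use of plain int arrays are assumptions made for
illustration, not code from the patch):

```c
#include <stddef.h>

/* ILV_2: interleave two inputs of n elements each into out, which must
   have room for 2 * n elements.  */
void ilv_2 (const int *a, const int *b, size_t n, int *out)
{
  for (size_t i = 0; i < n; i++)
    {
      out[2 * i] = a[i];
      out[2 * i + 1] = b[i];
    }
}

/* EXTR_2,k: extract the even (k == 0) or odd (k == 1) elements of an
   n-element input into out, which must have room for n / 2 elements.  */
void extr_2 (const int *in, size_t n, unsigned k, int *out)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = in[2 * i + k];
}
```

For example, interleaving {0, 1, 2, 3} with {4, 5, 6, 7} yields
{0, 4, 1, 5, 2, 6, 3, 7}, and EXTR_2,1 applied to {0, 1, 2, 3} yields {1, 3}.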

The number of states also depends upon the leaves in the grammar.  Hence, we
are restricting the grammar to have 3 leaves, namely REG, MEM and CONST.

We have defined a new macro TARGET_VEC_PERM_CONST_ORDER which takes the
target-supported permute orders as input in struct vec_perm_order_spec.

The current patch constructs instruction tiles for the target from this macro
and optimizes those tiles using k-arity promotion reduction and redundancy
elimination to have an optimal PRT per instruction.  It then generates a
context-free grammar in normal form which can then be accepted by the BURS
algorithm to construct the transition table.

There are multiple issues that need to be handled to support permute
instructions across all architectures:
1. Constraints on input/output operands : We allow permutations on memory as
well as registers to allow strided load/store and register permutes.
2. Different vector sizes supported : Each vector size needs to be listed in
the macro in order to add support for that vector type.
3. Conditions under which the instruction is available : Like the condition
field in RTL in md patterns.
4. Arity of permute operation : A multiple-input, single-output model is
supported.
5. Default rules to be generated for the grammar to be complete : We need to
identify the default rules that are to be generated for the pattern matcher
so that the grammar will be complete.  We have added
 goal --> reg
 goal --> mem
 reg --> mem
 reg --> const
 mem --> reg
 mem --> const
   :
   :
 reg_32 --> REG
 reg_16 --> REG
 reg_8 --> REG
 reg_4 --> REG
 reg_2 --> REG
 reg --> reg_2
 reg --> reg_4
 reg --> reg_8
 reg --> reg_16
 reg --> reg_32
   :
   :
 mem --> MEM
 const --> CONST
 reg --> EXTR_2,0 (reg)
 reg --> EXTR_2,1 (reg)
 reg --> ILV (reg, reg)

As we create 1 instruction tile per permute order in
TARGET_VEC_PERM_CONST_ORDER, the more detailed the macro, the more accurate
the pattern matcher becomes.

Interestingly, even though different entries are created for different vector
sizes, a single tile is created for all vector sizes whenever possible.
E.g., in the generated file insn-vect-inst-tiles.h, which I have attached as a
sample, the tile for permute order
permute order -  0  16  2  18  4  20  6  22  8  24  10  26  12  28  14  30
ILV_2 (
  EXTR_2,0 (PHR:0) , 
  EXTR_2,0 (PHR:1))

is the same as that for
permute order -  0  8  2  10  4  12  6  14 
ILV_2 (
  EXTR_2,0 (PHR:0) , 
  EXTR_2,0 (PHR:1))

This can be useful for SVE-like extensions where the vector size can be dynamic.
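
The size independence of this tile can be checked with a small sketch.  It
assumes that PHR:0 holds element indices 0 .. n-1 and PHR:1 holds n .. 2n-1;
the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fill order[0 .. n-1] with the permute order described by the tile
   ILV_2 (EXTR_2,0 (PHR:0), EXTR_2,0 (PHR:1)) for vectors of n elements.  */
void tile_permute_order (size_t n, unsigned *order)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    {
      order[2 * i] = (unsigned) (2 * i);         /* Even element of PHR:0.  */
      order[2 * i + 1] = (unsigned) (n + 2 * i); /* Even element of PHR:1.  */
    }
}
```

With n = 8 this produces 0 8 2 10 4 12 6 14, and with n = 16 it produces
0 16 2 18 ... 14 30, matching the two permute orders listed above.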

Richard, can you please see if this looks correct, or do I need additional 
information to successfully generate pattern matcher 

Re: [PR 79905] ICE with vector_type

2017-04-06 Thread Richard Biener
On Wed, Apr 5, 2017 at 10:33 PM, Bill Schmidt
 wrote:
> On 4/5/17 9:14 AM, Nathan Sidwell wrote:
>>
>>> Thanks for the patch!  Looks like there are some compile problems.  I
>>> can fix "resut", but not sure what the intent is for "canonical":
>>
>> I'm a dumbass.  I built the x86_64 compiler :(
>> try this.
>>
>> nathan
> Thanks!  Regtest showed that this blows up on the gcc_assert for Fortran
> and Go.  If I change the assert to an explicit test, everything is
> fine.  Corrected patch attached; needs a change log ofc.

Ick.

As said, why not do

+static tree
+rs6000_vt (const char *name, tree elt_type, unsigned num_elts)
+{
+  tree result = build_vector_type (elt_type, num_elts);
+  tree orig_name = TYPE_NAME (result);
+
+  /* Tell set_underlying_type to create a clone.  */
+  TYPE_NAME (result) = error_mark_node;

result = build_variant_type_copy  (result);

+  tree named_type = add_builtin_type (name, result);
+  TYPE_NAME (result) = orig_name;
+
+  if (named_type && TREE_TYPE (named_type) != result)
+{
+  result = TREE_TYPE (named_type);
+  TREE_USED (result) = true;

why's this needed?

+}

I don't like that error_mark_node hack at all.

Richard.


> Bill


Re: patch to fix PR70703

2017-04-06 Thread Richard Biener
On Wed, Apr 5, 2017 at 6:18 PM, Vladimir Makarov  wrote:
>
>
> On 04/05/2017 12:07 PM, Vladimir Makarov wrote:
>>
>>
>>
>> I'll correct the patch.
>>
> Here is the patch I've committed.

Note that in such contexts it's better to just use [u]int64_t.
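
A minimal sketch of what that would look like, modelled on the expression
in the patch (the function name and parameters are illustrative only; div
is assumed to be non-zero):

```c
#include <stdint.h>

/* Scale cost by mult / div.  The product is formed in 64 bits, so it
   does not overflow on hosts where long is only 32 bits wide.  */
int scaled_cost (int cost, int mult, int div)
{
  return (int) (((int64_t) cost * mult) / div);
}
```

int64_t has the same width on every host, whereas long is 32 bits on some
hosts (for example 64-bit Windows).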

Richard.

> 2017-04-05  Vladimir Makarov  
>
> PR rtl-optimization/70703
> * ira-color.c (update_conflict_hard_regno_costs): Use
> HOST_WIDE_INT instead of long.
>
> Index: ira-color.c
> ===
> --- ira-color.c (revision 246707)
> +++ ira-color.c (working copy)
> @@ -1522,7 +1522,7 @@
> index = ira_class_hard_reg_index[aclass][hard_regno];
> if (index < 0)
>   continue;
> -   cost = (int) (((long) conflict_costs [i] * mult) / div);
> +   cost = (int) (((HOST_WIDE_INT) conflict_costs [i] * mult) /
> div);
> if (cost == 0)
>   continue;
> cont_p = true;
>


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Jonathan Wakely

On 05/04/17 20:46 +, Bernd Edlinger wrote:

It does the same as may_alias but additionally objects
declared with that type have alias set 0, I just don't
know if I have yet found the right words so that it can
be understood.  Based on the feedback I have now written:

+@item typeless_storage
+@cindex @code{typeless_storage} type attribute
+An object declared with a type with this attribute behaves like a
+character type with respect to aliasing semantics.


This says that an object behaves like a type, which is a category
error. Don't you mean that a type declared with this attribute behaves
like a character type w.r.t. aliasing semantics?

Or that an object of a type with this attribute behaves like an object of
character type w.r.t. aliasing semantics?
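
For context, a small sketch of the existing character-type rule that the
wording refers to: an lvalue of character type may be used to access the
stored value of any object.  The helper name is made up for illustration,
and nothing here uses the proposed attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the bytes of an arbitrary object.  Reading through unsigned char
   is permitted by the aliasing rules whatever the object's type is.  */
unsigned byte_sum (const void *obj, size_t size)
{
  const unsigned char *p = (const unsigned char *) obj;
  unsigned sum = 0;
  for (size_t i = 0; i < size; i++)
    sum += p[i];
  return sum;
}
```

For instance, a uint32_t holding 0x01020304 sums to 10 regardless of byte
order.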



Re: [PATCH] Fix MMX/SSE/AVX* shifts by non-immediate scalar (PR target/80286)

2017-04-06 Thread Jakub Jelinek
On Thu, Apr 06, 2017 at 10:47:03AM +0200, Uros Bizjak wrote:
> > +(define_insn_and_split "*3_1"
> > +  [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand")
> > +   (any_lshift:VI2_AVX2_AVX512BW
> > + (match_operand:VI2_AVX2_AVX512BW 1 "register_operand")
> > + (sign_extend:DI (match_operand:SI 2 "nonmemory_operand"]
> > +  "TARGET_SSE2 &&  && 
> > +   && can_create_pseudo_p ()"
> > +  "#"
> > +  "&& 1"
> > +  [(set (match_dup 3) (zero_extend:DI (match_dup 2)))
> > +   (set (match_dup 0) (any_lshift:VI2_AVX2_AVX512BW
> > +   (match_dup 1) (match_dup 3)))]
> > +{
> > +  operands[3] = gen_reg_rtx (DImode);
> > +})
> >
> Yes, something like this. You can use any_extend instead of
> sign_extend, so the pattern will also remove possible zero_extend of
> count operand.

The pattern splits it immediately (during split1) into a zext + shift,
so unless we let the pattern survive in this form after reload (but then
we need constraints and it is unclear which ones), I don't see an
advantage in matching it for zext; it is split into exactly what was
there before.
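
As background on why a sign-extended count can be replaced by a
zero-extended one for the logical shifts, here is a scalar model of one
16-bit lane.  It is a sketch under the assumption that any 64-bit count
above 15 clears the lane, as for psrlw; the function names are illustrative.

```c
#include <stdint.h>

/* One 16-bit lane of a logical right shift whose count is taken from a
   64-bit value: counts above 15 give zero.  */
uint16_t psrlw_lane (uint16_t value, uint64_t count)
{
  return count > 15 ? 0 : (uint16_t) (value >> count);
}

/* Shift by a 32-bit count that was sign-extended to 64 bits.  */
uint16_t shift_with_sext (uint16_t value, int32_t count)
{
  return psrlw_lane (value, (uint64_t) (int64_t) count);
}

/* Shift by the same 32-bit count, zero-extended instead.  */
uint16_t shift_with_zext (uint16_t value, int32_t count)
{
  return psrlw_lane (value, (uint64_t) (uint32_t) count);
}
```

A negative count becomes a huge value under either extension, so both
variants return 0; for non-negative counts the two extensions are identical.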

> >   (match_operand: 3 "register_operand" "Yk")))])
> > that is a transformation we want to do on the define_insn part of
> > define_insn_and_split, but not exactly what we want to do on the split
> > part of the insn - there we want literally match_dup 0, match_dup 1,
> > and instead of the 2 other match_operand match_dup 2 and match_dup 3.
> 
> Hm, I'm not that versed in define_subst, but that looks quite a
> drawback of define_subst to me.

Perhaps, but we'd need to define what it means to subst a
define_insn_and_split.

Jakub


Re: [PATCH] S/390: Optimize atomic_compare_exchange and atomic_compare builtins.

2017-04-06 Thread Dominik Vogt
On Wed, Apr 05, 2017 at 02:52:00PM +0100, Dominik Vogt wrote:
> On Mon, Mar 27, 2017 at 09:27:35PM +0100, Dominik Vogt wrote:
> > The attached patch optimizes the atomic_exchange and
> > atomic_compare patterns on s390 and s390x (mostly limited to
> > SImode and DImode).  Among general optimization, the changes fix
> > most of the problems reported in PR 80080:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080
> > 
> > Bootstrapped and regression tested on a zEC12 with s390 and s390x
> > biarch.
> 
> New version attached.

This time it really is.  :-)

> v3:
> 
>   * Remove sne* patterns.
>   * Move alignment check from s390_expand_cs to s390.md.
>   * Use s_operand instead of memory_nosymref_operand.
>   * Remove memory_nosymref_operand.
>   * Allow any CC-mode in cstorecc4 for TARGET_Z196.
>   * Fix EQ with TARGET_Z196 in cstorecc4.
>   * Duplicate CS patterns for CCZmode.
> 
> Bootstrapped and regression tested on a zEC12 with s390 and s390x
> biarch.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog-dv-atomic-gcc7

* s390-protos.h (s390_expand_cs_hqi): Removed.
(s390_expand_cs, s390_expand_atomic_exchange_tdsi): New prototypes.
* config/s390/s390.c (s390_emit_compare_and_swap): Handle all integer
modes as well as CCZ1mode and CCZmode.
(s390_expand_atomic_exchange_tdsi, s390_expand_atomic): Adapt to new
signature of s390_emit_compare_and_swap.
(s390_expand_cs_hqi): Likewise, make static.
(s390_expand_cs_tdsi): Generate an explicit compare before trying
compare-and-swap, in some cases.
(s390_expand_cs): Wrapper function.
(s390_expand_atomic_exchange_tdsi): New backend specific expander for
atomic_exchange.
* config/s390/s390.md (CCZZ1): New mode iterator for compare-and-swap.
(define_peephole2): New peephole to help combining the load-and-test
pattern with volatile memory.
("cstorecc4"): Use load-on-condition and deal with CCZmode for
TARGET_Z196.
("atomic_compare_and_swap"): Merge the patterns for small and
large integers.  Forbid symref memory operands.  Move expander to
s390.c.  Require cc register.
("atomic_compare_and_swap_internal")
("*atomic_compare_and_swap_1")
("*atomic_compare_and_swapdi_2")
("*atomic_compare_and_swapsi_3"): Duplicate for CCZ1mode and
CCZmode.  Use s_operand to forbid symref memory operands.
("atomic_exchange"): Allow and implement all integer modes.
gcc/testsuite/ChangeLog-dv-atomic-gcc7

* gcc.target/s390/md/atomic_compare_exchange-1.c: New test.
* gcc.target/s390/md/atomic_compare_exchange-1.inc: New test.
* gcc.target/s390/md/atomic_exchange-1.c: New test.
From d5e4c5785eaee076112d8493b5104db6689fe209 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Thu, 23 Feb 2017 17:23:11 +0100
Subject: [PATCH] S/390: Optimize atomic_compare_exchange and
 atomic_compare builtins.

1) Use the load-and-test instructions for atomic_exchange if the value is 0.
2) If IS_WEAK is true, compare the memory contents before a compare-and-swap
   and skip the CS instructions if the value is not the expected one.
---
 gcc/config/s390/s390-protos.h  |   4 +-
 gcc/config/s390/s390.c | 176 ++-
 gcc/config/s390/s390.md| 150 -
 .../gcc.target/s390/md/atomic_compare_exchange-1.c |  84 ++
 .../s390/md/atomic_compare_exchange-1.inc  | 336 +
 .../gcc.target/s390/md/atomic_exchange-1.c | 309 +++
 6 files changed, 980 insertions(+), 79 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/atomic_compare_exchange-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/md/atomic_compare_exchange-1.inc
 create mode 100644 gcc/testsuite/gcc.target/s390/md/atomic_exchange-1.c

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 7f06a20..3fdb320 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -112,8 +112,8 @@ extern void s390_expand_vec_strlen (rtx, rtx, rtx);
 extern void s390_expand_vec_movstr (rtx, rtx, rtx);
 extern bool s390_expand_addcc (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 extern bool s390_expand_insv (rtx, rtx, rtx, rtx);
-extern void s390_expand_cs_hqi (machine_mode, rtx, rtx, rtx,
-   rtx, rtx, bool);
+extern void s390_expand_cs (machine_mode, rtx, rtx, rtx, rtx, rtx, bool);
+extern void s390_expand_atomic_exchange_tdsi (rtx, rtx, rtx);
 extern void s390_expand_atomic (machine_mode, enum rtx_code,
rtx, rtx, rtx, bool);
 extern void s390_expand_tbegin (rtx, rtx, rtx, bool);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 2cb8947..b1d6088 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -1762,11 +1762,40 @@ 

Re: [gomp4] add support for fortran allocate support with declare create

2017-04-06 Thread Thomas Schwinge
Hi Cesar!

On Wed, 5 Apr 2017 08:23:58 -0700, Cesar Philippidis wrote:
> This patch implements the OpenACC 2.5 behavior of fortran allocate on
> variables marked with declare create as defined in Section 2.13.2 in the
> OpenACC spec.

Thanks!


> While working on adding support for allocate, I noticed that OpenACC
> declare has a number of quirks. For starters, the fortran FE wasn't
> lowering them properly, so there was no way for omplower to utilize them
> inside acc parallel regions.

> There is still some unimplemented functionality.
> [...]

File (at least some of these?) as separate issues, I guess?


> I've applied this patch to gomp-4_0-branch.

Not reviewed, but I noticed:

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
> @@ -0,0 +1,25 @@
> +! Verify that OpenACC declared allocatable arrays have implicit
> +! OpenACC enter and exit pragmas at the time of allocation and
> +! deallocation.
> +
> +! { dg-additional-options "-fdump-tree-original" }
> +[...]
> +! { dg-final { scan-tree-dump-times "pragma acc enter data map.declare_allocate" 1 "gimple" } }
> +! { dg-final { scan-tree-dump-times "pragma acc exit data map.declare_deallocate" 1 "gimple" } }

UNRESOLVED: gfortran.dg/goacc/declare-allocatable-1.f90   -O   scan-tree-dump-times gimple "pragma acc enter data map.declare_allocate" 1
UNRESOLVED: gfortran.dg/goacc/declare-allocatable-1.f90   -O   scan-tree-dump-times gimple "pragma acc exit data map.declare_deallocate" 1
PASS: gfortran.dg/goacc/declare-allocatable-1.f90   -O  (test for excess errors)

"original" vs. "gimple" -- which one should it be?


Grüße
 Thomas


Re: [PATCH] Fix MMX/SSE/AVX* shifts by non-immediate scalar (PR target/80286)

2017-04-06 Thread Jakub Jelinek
On Thu, Apr 06, 2017 at 10:40:07AM +0200, Jakub Jelinek wrote:
> On Thu, Apr 06, 2017 at 09:33:58AM +0200, Uros Bizjak wrote:
> > Newly introduced alternatives (x/x) and (v/v) are valid also for
> > 32-bit targets, so we have to adjust insn constraint of
> > *vec_extractv4si_0_zext and enable alternatives accordingly. After the
> 
> That is true.  But if we provide just the x/x and v/v alternatives in
> *vec_extractv4si_0_zext, then it will be forced to always do the zero
> extraction on the SSE registers in 32-bit mode.  Is that what we want?

Also, I think we can do the zero extension even without SSE4.1:
if we have a spare SSE register (or before reload), we can use
pxor into that scratch reg and punpck* it; if we don't, we can
construct a V4SI constant in memory with { -1, 0, 0, 0 } or so
and AND with that.
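
A scalar model of the mask variant, as a sketch only: the v4si struct and
the function names are assumptions for illustration and stand in for an
SSE register and a pand instruction.

```c
#include <stdint.h>

/* Four 32-bit lanes standing in for a V4SI register; lane[0] is the
   lowest element.  */
typedef struct { uint32_t lane[4]; } v4si;

/* AND with the constant { -1, 0, 0, 0 }: keeps lane 0 and clears the
   rest, so the low 64 bits hold lane 0 zero-extended to DImode.  */
v4si zext_low_lane (v4si v)
{
  static const v4si mask = { { 0xffffffffu, 0, 0, 0 } };
  v4si r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = v.lane[i] & mask.lane[i];
  return r;
}

/* Read the low quadword of the register (little-endian lane order).  */
uint64_t low_quadword (v4si v)
{
  return (uint64_t) v.lane[0] | ((uint64_t) v.lane[1] << 32);
}
```

The pxor + punpck* variant gives the same low quadword by interleaving
lane 0 with a zeroed register instead of masking.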

Jakub


Re: [PATCH] Fix MMX/SSE/AVX* shifts by non-immediate scalar (PR target/80286)

2017-04-06 Thread Uros Bizjak
On Thu, Apr 6, 2017 at 10:40 AM, Jakub Jelinek  wrote:
> On Thu, Apr 06, 2017 at 09:33:58AM +0200, Uros Bizjak wrote:
>> Newly introduced alternatives (x/x) and (v/v) are valid also for
>> 32-bit targets, so we have to adjust insn constraint of
>> *vec_extractv4si_0_zext and enable alternatives accordingly. After the
>
> That is true.  But if we provide just the x/x and v/v alternatives in
> *vec_extractv4si_0_zext, then it will be forced to always do the zero
> extraction on the SSE registers in 32-bit mode.  Is that what we want?

Yes, for SSE4 targets. We are sure that we have SSE source register
here, and there is no direct zero-extension to a general reg in
32-bit case.

> As for the define_insn_and_split that would transform sign extensions
> used solely by the vector shifts by scalar shift count, did you mean
> something like following (for every shift pattern)?
>
> --- sse.md.jj1  2017-04-04 19:51:01.0 +0200
> +++ sse.md  2017-04-06 10:26:26.877545109 +0200
> @@ -10696,6 +10696,22 @@
> (set_attr "prefix" "orig,vex")
> (set_attr "mode" "")])
>
> +(define_insn_and_split "*3_1"
> +  [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand")
> +   (any_lshift:VI2_AVX2_AVX512BW
> + (match_operand:VI2_AVX2_AVX512BW 1 "register_operand")
> + (sign_extend:DI (match_operand:SI 2 "nonmemory_operand"]
> +  "TARGET_SSE2 &&  && 
> +   && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 3) (zero_extend:DI (match_dup 2)))
> +   (set (match_dup 0) (any_lshift:VI2_AVX2_AVX512BW
> +   (match_dup 1) (match_dup 3)))]
> +{
> +  operands[3] = gen_reg_rtx (DImode);
> +})
>
Yes, something like this. You can use any_extend instead of
sign_extend, so the pattern will also remove possible zero_extend of
count operand.

>  (define_insn "3"
>[(set (match_operand:VI48_AVX2 0 "register_operand" "=x,x,v")
> (any_lshift:VI48_AVX2
>
> The problem with that is that apparently our infrastructure doesn't support
> define_subst for define_insn_and_split (and define_split), so either we'd
> need to have separate define_insn_and_split for masked and for non-masked,
> or we'd need to extend the define_subst infrastructure for
> define_insn_and_split somehow.  Looking say at
> (define_subst "mask"
>   [(set (match_operand:SUBST_V 0)
> (match_operand:SUBST_V 1))]
>   "TARGET_AVX512F"
>   [(set (match_dup 0)
> (vec_merge:SUBST_V
>   (match_dup 1)
>   (match_operand:SUBST_V 2 "vector_move_operand" "0C")
>   (match_operand: 3 "register_operand" "Yk")))])
> that is a transformation we want to do on the define_insn part of
> define_insn_and_split, but not exactly what we want to do on the split
> part of the insn - there we want literally match_dup 0, match_dup 1,
> and instead of the 2 other match_operand match_dup 2 and match_dup 3.

Hm, I'm not that versed in define_subst, but that looks quite a
drawback of define_subst to me.

Uros.


Re: [PATCH] Fix MMX/SSE/AVX* shifts by non-immediate scalar (PR target/80286)

2017-04-06 Thread Jakub Jelinek
On Thu, Apr 06, 2017 at 09:33:58AM +0200, Uros Bizjak wrote:
> Newly introduced alternatives (x/x) and (v/v) are valid also for
> 32-bit targets, so we have to adjust insn constraint of
> *vec_extractv4si_0_zext and enable alternatives accordingly. After the

That is true.  But if we provide just the x/x and v/v alternatives in
*vec_extractv4si_0_zext, then it will be forced to always do the zero
extraction on the SSE registers in 32-bit mode.  Is that what we want?

As for the define_insn_and_split that would transform sign extensions
used solely by the vector shifts by scalar shift count, did you mean
something like following (for every shift pattern)?

--- sse.md.jj1  2017-04-04 19:51:01.0 +0200
+++ sse.md  2017-04-06 10:26:26.877545109 +0200
@@ -10696,6 +10696,22 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "")])
 
+(define_insn_and_split "*3_1"
+  [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand")
+   (any_lshift:VI2_AVX2_AVX512BW
+ (match_operand:VI2_AVX2_AVX512BW 1 "register_operand")
+ (sign_extend:DI (match_operand:SI 2 "nonmemory_operand"]
+  "TARGET_SSE2 &&  && 
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 3) (zero_extend:DI (match_dup 2)))
+   (set (match_dup 0) (any_lshift:VI2_AVX2_AVX512BW
+   (match_dup 1) (match_dup 3)))]
+{
+  operands[3] = gen_reg_rtx (DImode);
+})
+
 (define_insn "3"
   [(set (match_operand:VI48_AVX2 0 "register_operand" "=x,x,v")
(any_lshift:VI48_AVX2

The problem with that is that apparently our infrastructure doesn't support
define_subst for define_insn_and_split (and define_split), so either we'd
need to have separate define_insn_and_split for masked and for non-masked,
or we'd need to extend the define_subst infrastructure for
define_insn_and_split somehow.  Looking say at
(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
(match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
(vec_merge:SUBST_V
  (match_dup 1)
  (match_operand:SUBST_V 2 "vector_move_operand" "0C")
  (match_operand: 3 "register_operand" "Yk")))])
that is a transformation we want to do on the define_insn part of
define_insn_and_split, but not exactly what we want to do on the split
part of the insn - there we want literally match_dup 0, match_dup 1,
and instead of the 2 other match_operand match_dup 2 and match_dup 3.

Jakub


Re: [PATCH] Fix MMX/SSE/AVX* shifts by non-immediate scalar (PR target/80286)

2017-04-06 Thread Uros Bizjak
On Thu, Apr 6, 2017 at 9:33 AM, Uros Bizjak  wrote:
> On Tue, Apr 4, 2017 at 5:09 PM, Jakub Jelinek  wrote:
>> On Tue, Apr 04, 2017 at 02:33:24PM +0200, Uros Bizjak wrote:
>>> > I assume split those before reload.  Because we want to give reload a
>>> > chance to do the zero extension on GPRs if it is more beneficial, and it
>>> > might choose to store it into memory and load into XMM from memory and
>>> > that is hard to do after reload.
>>>
>>> Yes, split before reload, and hope that alternative's decorations play
>>> well with RA.
>>
>> Haven't done these splitters yet, just playing now with:
>> typedef long long __m256i __attribute__ ((__vector_size__ (32), __may_alias__));
>> typedef int __v4si __attribute__ ((__vector_size__ (16)));
>> typedef short __v8hi __attribute__ ((__vector_size__ (16)));
>> typedef int __v8si __attribute__ ((__vector_size__ (32)));
>> typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
>> extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>> _mm256_castsi256_si128 (__m256i __A) { return (__m128i) __builtin_ia32_si_si256 ((__v8si)__A); }
>> extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>> _mm_cvtsi128_si32 (__m128i __A) { return __builtin_ia32_vec_ext_v4si ((__v4si)__A, 0); }
>> extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>> _mm_srli_epi16 (__m128i __A, int __B) { return (__m128i)__builtin_ia32_psrlwi128 ((__v8hi)__A, __B); }
>> __m256i m;
>> __m128i foo (__m128i minmax)
>> {
>>   int shift = _mm_cvtsi128_si32 (_mm256_castsi256_si128 (m));
>>   return _mm_srli_epi16 (minmax, shift);
>> }
>> to see what it emits (in that case we already have zero extension rather
>> than sign extension).
>>> > With ? in front of it or without?  I admit I've only tried so far:
>>>
>>> I'd leave ?* in this case. In my experience, RA allocates alternative
>>> with ?* only when really needed.
>>
>> So far I have following, which seems to work fine for the above testcase and
>> -O2 -m64 -mavx2, but doesn't work for -O2 -m32 -mavx2.
>> For 64-bit combiner matches the *vec_extractv4si_0_zext pattern and as that
>> doesn't have ? nor * in the constraint, it is used.
>> For 32-bit there is no such pattern and we end up with just zero_extendsidi2
>> pattern and apparently either the ? or * prevent IRA/LRA from using it.
>> If I remove both ?*, I get nice code even for 32-bit.
>
> Newly introduced alternatives (x/x) and (v/v) are valid also for
> 32-bit targets, so we have to adjust insn constraint of
> *vec_extractv4si_0_zext and enable alternatives accordingly. After the
> adjustment, the pattern will be split to a zero-extend.

Attached patch fixes your testcase above for 64 and 32-bit targets.
What do you think?

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6ed2390..d1c3c16 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3767,10 +3767,10 @@
 
 (define_insn "*zero_extendsidi2"
   [(set (match_operand:DI 0 "nonimmediate_operand"
-   "=r,?r,?o,r   ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x,*r")
+   "=r,?r,?o,r   ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x,?*x,?*v,*r")
(zero_extend:DI
 (match_operand:SI 1 "x86_64_zext_operand"
-   "0 ,rm,r ,rmWz,0,r   ,m   ,*Yj,*x,r   ,m  ,*k")))]
+   "0 ,rm,r ,rmWz,0,r   ,m   ,*Yj,*x,r   ,m  , *x, *v,*k")))]
   ""
 {
   switch (get_attr_type (insn))
@@ -3791,6 +3791,15 @@
   return "%vpextrd\t{$0, %1, %k0|%k0, %1, 0}";
 
 case TYPE_SSEMOV:
+  if (SSE_REG_P (operands[0]) && SSE_REG_P (operands[1]))
+   {
+ if (EXT_REX_SSE_REG_P (operands[0])
+ || EXT_REX_SSE_REG_P (operands[1]))
+   return "vpmovzxdq\t{%t1, %g0|%g0, %t1}";
+ else
+   return "%vpmovzxdq\t{%1, %0|%0, %1}";
+   }
+
   if (GENERAL_REG_P (operands[0]))
return "%vmovd\t{%1, %k0|%k0, %1}";
 
@@ -3813,6 +3822,10 @@
(eq_attr "alternative" "10")
  (const_string "sse2")
(eq_attr "alternative" "11")
+ (const_string "sse4")
+   (eq_attr "alternative" "12")
+ (const_string "avx512f")
+   (eq_attr "alternative" "13")
  (const_string "x64_avx512bw")
   ]
   (const_string "*")))
@@ -3821,16 +3834,16 @@
  (const_string "multi")
(eq_attr "alternative" "5,6")
  (const_string "mmxmov")
-   (eq_attr "alternative" "7,9,10")
+   (eq_attr "alternative" "7,9,10,11,12")
  (const_string "ssemov")
(eq_attr "alternative" "8")
  (const_string "sselog1")
-   (eq_attr "alternative" "11")
+   (eq_attr "alternative" "13")
  (const_string "mskmov")
   ]
   (const_string "imovx")))
(set (attr 

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Richard Biener
On Thu, 6 Apr 2017, Jakub Jelinek wrote:

> On Thu, Apr 06, 2017 at 09:47:10AM +0200, Richard Biener wrote:
> > @@ -955,6 +960,7 @@ get_alias_set (tree t)
> >   Just be pragmatic here and make sure the array and its element
> >   type get the same alias set assigned.  */
> >else if (TREE_CODE (t) == ARRAY_TYPE
> > +  && ! TYPE_TYPELESS_STORAGE (t)
> >&& (!TYPE_NONALIASED_COMPONENT (t)
> >|| TYPE_STRUCTURAL_EQUALITY_P (t)))
> >  set = get_alias_set (TREE_TYPE (t));
> > @@ -1094,6 +1100,15 @@ get_alias_set (tree t)
> >  
> >TYPE_ALIAS_SET (t) = set;
> >  
> > +  if (TREE_CODE (t) == ARRAY_TYPE
> > +  && TYPE_TYPELESS_STORAGE (t))
> 
> Shouldn't TYPE_TYPELESS_STORAGE apply even for non-array types?
> If somebody chooses to store anything in long long
> __attribute__((typeless_storage)), so be it.  Or what kind of complications
> do you see for that?

It's a new feature so I don't see why we should allow that.  Given that
people will have to do something when the compiler doesn't support it, the
only "reliable" way of using it is on an array of char anyway.

The complication starts when people use it on a type that currently
uses alias-set zero (because "zero" doesn't get an alias_set_entry).
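
To illustrate the fallback point, a hedged sketch of how user code might
guard the attribute.  typeless_storage is only the name proposed in this
thread and may not exist in any released compiler; the TYPELESS_STORAGE
macro, the raw_storage typedef and the function are hypothetical.

```c
#include <string.h>

/* Use the proposed attribute only if the compiler reports support for
   it; otherwise expand to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(typeless_storage)
#  define TYPELESS_STORAGE __attribute__((typeless_storage))
# endif
#endif
#ifndef TYPELESS_STORAGE
# define TYPELESS_STORAGE
#endif

/* An array of character type: the one form that also works without the
   attribute, since character types may access any object.  */
typedef unsigned char raw_storage[sizeof (long long)] TYPELESS_STORAGE;

/* Store a value into the raw buffer and read it back.  */
long long round_trip (long long value)
{
  raw_storage buf;
  long long out;
  memcpy (buf, &value, sizeof value);
  memcpy (&out, buf, sizeof out);
  return out;
}
```

Without the attribute the macro expands to nothing and the typedef is a
plain array of unsigned char, which is why only the char-array form is
portable.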

Richard.


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Jakub Jelinek
On Thu, Apr 06, 2017 at 09:47:10AM +0200, Richard Biener wrote:
> @@ -955,6 +960,7 @@ get_alias_set (tree t)
>   Just be pragmatic here and make sure the array and its element
>   type get the same alias set assigned.  */
>else if (TREE_CODE (t) == ARRAY_TYPE
> +&& ! TYPE_TYPELESS_STORAGE (t)
>  && (!TYPE_NONALIASED_COMPONENT (t)
>  || TYPE_STRUCTURAL_EQUALITY_P (t)))
>  set = get_alias_set (TREE_TYPE (t));
> @@ -1094,6 +1100,15 @@ get_alias_set (tree t)
>  
>TYPE_ALIAS_SET (t) = set;
>  
> +  if (TREE_CODE (t) == ARRAY_TYPE
> +  && TYPE_TYPELESS_STORAGE (t))

Shouldn't TYPE_TYPELESS_STORAGE apply even for non-array types?
If somebody chooses to store anything in long long
__attribute__((typeless_storage)), so be it.  Or what kind of complications
do you see for that?

Jakub


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Richard Biener
On Wed, 5 Apr 2017, Bernd Edlinger wrote:

> On 04/05/17 19:22, Bernd Edlinger wrote:
> > On 04/05/17 18:08, Jakub Jelinek wrote:
> >
> > Yes, exactly.  I really want to reach the deadline for gcc-7.
> > Fixing the name is certainly the most important first step,
> > and if everybody agrees on "typeless_storage", for the name
> > I can start with adjusting the name, and look into how
> > to use a spare type-flag that should be a mechanical change.
> >
> 
> Jakub, I just renamed the attribute and reworked the patch
> as you suggested, reg-testing is not yet completed, but
> it looks good so far.  I also added a few more tests.
> 
> I have changed the documentation as Richi suggested, but
> I am not too sure what to say here.

The alias.c changes are not sufficient.  I think what you want is
something like:

Index: gcc/alias.c
===
--- gcc/alias.c (revision 246678)
+++ gcc/alias.c (working copy)
@@ -136,6 +136,9 @@ struct GTY(()) alias_set_entry {
   bool is_pointer;
   /* Nonzero if is_pointer or if one of childs have has_pointer set.  */
   bool has_pointer;
+  /* Nonzero if we have a child serving as typeless storage (or are
+ such storage ourselves).  */
+  bool has_typeless_storage;
 
   /* The children of the alias set.  These are not just the immediate
  children, but, in fact, all descendants.  So, if we have:
@@ -419,7 +422,8 @@ alias_set_subset_of (alias_set_type set1
   /* Check if set1 is a subset of set2.  */
   ase2 = get_alias_set_entry (set2);
   if (ase2 != 0
-  && (ase2->has_zero_child
+  && (ase2->has_typeless_storage
+ || ase2->has_zero_child
  || (ase2->children && ase2->children->get (set1))))
 return true;
 
@@ -825,6 +829,7 @@ init_alias_set_entry (alias_set_type set
   ase->has_zero_child = false;
   ase->is_pointer = false;
   ase->has_pointer = false;
+  ase->has_typeless_storage = false;
   gcc_checking_assert (!get_alias_set_entry (set));
   (*alias_sets)[set] = ase;
   return ase;
@@ -955,6 +960,7 @@ get_alias_set (tree t)
  Just be pragmatic here and make sure the array and its element
  type get the same alias set assigned.  */
   else if (TREE_CODE (t) == ARRAY_TYPE
+  && ! TYPE_TYPELESS_STORAGE (t)
   && (!TYPE_NONALIASED_COMPONENT (t)
   || TYPE_STRUCTURAL_EQUALITY_P (t)))
 set = get_alias_set (TREE_TYPE (t));
@@ -1094,6 +1100,15 @@ get_alias_set (tree t)
 
   TYPE_ALIAS_SET (t) = set;
 
+  if (TREE_CODE (t) == ARRAY_TYPE
+  && TYPE_TYPELESS_STORAGE (t))
+{
+  alias_set_entry *ase = get_alias_set_entry (set);
+  if (!ase)
+   ase = init_alias_set_entry (set);
+  ase->has_typeless_storage = true;
+}
+
   /* If this is an aggregate type or a complex type, we must record any
  component aliasing information.  */
   if (AGGREGATE_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
@@ -1173,6 +1188,8 @@ record_alias_subset (alias_set_type supe
superset_entry->has_zero_child = true;
   if (subset_entry->has_pointer)
superset_entry->has_pointer = true;
+ if (subset_entry->has_typeless_storage)
+   superset_entry->has_typeless_storage = true;
 
  if (subset_entry->children)
{


Please also restrict TYPE_TYPELESS_STORAGE to ARRAY_TYPEs (otherwise
more complications will arise).
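
To illustrate the flag propagation in the alias.c hunks above, here is a toy
model in C.  It is not GCC's code: the toy_* names, the fixed-size table and
the boolean child array are all assumptions made for the sketch, and the
pointer handling of the real alias_set_subset_of is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_SETS 16

/* Toy stand-in for an alias set entry (hypothetical layout).  */
struct toy_entry {
  bool has_zero_child;
  bool has_typeless_storage;
  bool child[TOY_MAX_SETS];   /* child[i]: set i is a descendant */
};

static struct toy_entry toy_sets[TOY_MAX_SETS];

/* Mark SET as being typeless storage itself.  */
void toy_mark_typeless(int set)
{
  assert(set > 0 && set < TOY_MAX_SETS);
  toy_sets[set].has_typeless_storage = true;
}

/* Record SUBSET as a child of SUPERSET and propagate the flags upward.  */
void toy_record_subset(int superset, int subset)
{
  assert(superset > 0 && superset < TOY_MAX_SETS);
  assert(subset >= 0 && subset < TOY_MAX_SETS);
  struct toy_entry *sup = &toy_sets[superset];
  const struct toy_entry *sub = &toy_sets[subset];

  if (subset == 0) {
    sup->has_zero_child = true;
    return;
  }
  if (sub->has_zero_child)
    sup->has_zero_child = true;
  if (sub->has_typeless_storage)
    sup->has_typeless_storage = true;
  for (int i = 0; i < TOY_MAX_SETS; i++)
    if (sub->child[i])
      sup->child[i] = true;
  sup->child[subset] = true;
}

/* True if SET1 is treated as a subset of SET2.  */
bool toy_subset_of(int set1, int set2)
{
  assert(set1 >= 0 && set1 < TOY_MAX_SETS);
  assert(set2 >= 0 && set2 < TOY_MAX_SETS);
  if (set1 == set2 || set2 == 0)
    return true;
  const struct toy_entry *e = &toy_sets[set2];
  return e->has_typeless_storage || e->has_zero_child || e->child[set1];
}
```

In this model, a struct whose set has a typeless child accepts every other
set as a subset, while a struct without such a child only accepts its
recorded descendants.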

Index: gcc/cp/class.c
===
--- gcc/cp/class.c  (revision 246678)
+++ gcc/cp/class.c  (working copy)
@@ -2083,7 +2083,8 @@ fixup_attribute_variants (tree t)
   tree attrs = TYPE_ATTRIBUTES (t);
   unsigned align = TYPE_ALIGN (t);
   bool user_align = TYPE_USER_ALIGN (t);
-  bool may_alias = lookup_attribute ("may_alias", attrs);
+  bool may_alias = TYPE_TYPELESS_STORAGE (t)
+  || lookup_attribute ("may_alias", attrs);

   if (may_alias)
 fixup_may_alias (t);
@@ -7345,6 +7348,12 @@ finish_struct_1 (tree t)
  the class or perform any other required target modifications.  */
   targetm.cxx.adjust_class_at_definition (t);

+  if (cxx_dialect >= cxx1z && cxx_type_contains_byte_buffer (t))
+{
+  TYPE_TYPELESS_STORAGE (t) = 1;
+  fixup_attribute_variants (t);
...

I don't think you need all this given alias.c only looks at
TYPE_MAIN_VARIANTs.

Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c   (revision 246678)
+++ gcc/cp/decl.c   (working copy)
@@ -14081,10 +14081,11 @@ start_enum (tree name, tree enumtype, tree underly
  enumtype = pushtag (name, enumtype, /*tag_scope=*/ts_current);

  /* std::byte aliases anything.  */
- if (enumtype != error_mark_node
+ if (cxx_dialect >= cxx1z
+ && enumtype != error_mark_node
  && TYPE_CONTEXT (enumtype) == std_node
  && !strcmp ("byte", TYPE_NAME_STRING (enumtype)))
-   TYPE_ALIAS_SET (enumtype) = 0;
+ 

Re: [PATCH] Fix MMX/SSE/AVX* shifts by non-immediate scalar (PR target/80286)

2017-04-06 Thread Uros Bizjak
On Tue, Apr 4, 2017 at 5:09 PM, Jakub Jelinek wrote:
> On Tue, Apr 04, 2017 at 02:33:24PM +0200, Uros Bizjak wrote:
>> > I assume split those before reload.  Because we want to give reload a
>> > chance to do the zero extension on GPRs if it is more beneficial, and it
>> > might choose to store it into memory and load into XMM from memory and
>> > that is hard to do after reload.
>>
>> Yes, split before reload, and hope that alternative's decorations play
>> well with RA.
>
> Haven't done these splitters yet, just playing now with:
> typedef long long __m256i __attribute__ ((__vector_size__ (32), __may_alias__));
> typedef int __v4si __attribute__ ((__vector_size__ (16)));
> typedef short __v8hi __attribute__ ((__vector_size__ (16)));
> typedef int __v8si __attribute__ ((__vector_size__ (32)));
> typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
> extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> _mm256_castsi256_si128 (__m256i __A) { return (__m128i) __builtin_ia32_si_si256 ((__v8si)__A); }
> extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> _mm_cvtsi128_si32 (__m128i __A) { return __builtin_ia32_vec_ext_v4si ((__v4si)__A, 0); }
> extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> _mm_srli_epi16 (__m128i __A, int __B) { return (__m128i)__builtin_ia32_psrlwi128 ((__v8hi)__A, __B); }
> __m256i m;
> __m128i foo (__m128i minmax)
> {
>   int shift = _mm_cvtsi128_si32 (_mm256_castsi256_si128 (m));
>   return _mm_srli_epi16 (minmax, shift);
> }
> to see what it emits (in that case we already have zero extension rather
> than sign extension).
>> > With ? in front of it or without?  I admit I've only tried so far:
>>
>> I'd leave ?* in this case. In my experience, RA allocates alternative
>> with ?* only when really needed.
>
> So far I have following, which seems to work fine for the above testcase and
> -O2 -m64 -mavx2, but doesn't work for -O2 -m32 -mavx2.
> For 64-bit combiner matches the *vec_extractv4si_0_zext pattern and as that
> doesn't have ? nor * in the constraint, it is used.
> For 32-bit there is no such pattern and we end up with just zero_extendsidi2
> pattern and apparently either the ? or * prevent IRA/LRA from using it.
> If I remove both ?*, I get nice code even for 32-bit.

Newly introduced alternatives (x/x) and (v/v) are valid also for
32-bit targets, so we have to adjust insn constraint of
*vec_extractv4si_0_zext and enable alternatives accordingly. After the
adjustment, the pattern will be split to a zero-extend.

With -m32, I get:

(insn 10 8 13 2 (set (reg:SI 98)
(vec_select:SI (reg:V4SI 95)
(parallel [
(const_int 0 [0])
]))) "pr80286.c":9 3663 {*vec_extractv4si_0}
 (expr_list:REG_DEAD (reg:V4SI 95)
(nil)))
(insn 13 10 14 2 (set (reg:DI 101 [ _7 ])
(zero_extend:DI (reg:SI 98))) "pr80286.c":11 131 {*zero_extendsidi2}
 (expr_list:REG_DEAD (reg:SI 98)
(nil)))

and for SSE4+, combine can merge these two patterns to
*vec_extractv4si_0_zext, with the anticipation that pmovzx will be
generated.

Uros.

> --- gcc/config/i386/sse.md.jj   2017-04-04 12:45:08.0 +0200
> +++ gcc/config/i386/sse.md  2017-04-04 16:54:58.667382522 +0200
> @@ -13517,16 +13517,17 @@ (define_insn "*vec_extract[(set_attr "isa" "*,sse4,*,*")])
>
>  (define_insn_and_split "*vec_extractv4si_0_zext"
> -  [(set (match_operand:DI 0 "register_operand" "=r")
> +  [(set (match_operand:DI 0 "register_operand" "=r,x,v")
> (zero_extend:DI
>   (vec_select:SI
> -   (match_operand:V4SI 1 "register_operand" "v")
> +   (match_operand:V4SI 1 "register_operand" "v,x,v")
> (parallel [(const_int 0)]))))]
>"TARGET_64BIT && TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_FROM_VEC"
>"#"
>"&& reload_completed"
>[(set (match_dup 0) (zero_extend:DI (match_dup 1)))]
> -  "operands[1] = gen_lowpart (SImode, operands[1]);")
> +  "operands[1] = gen_lowpart (SImode, operands[1]);"
> +  [(set_attr "isa" "*,sse4,avx512f")])
>
>  (define_insn "*vec_extractv2di_0_sse"
>[(set (match_operand:DI 0 "nonimmediate_operand" "=v,m")
> --- gcc/config/i386/i386.md.jj  2017-04-03 13:43:50.0 +0200
> +++ gcc/config/i386/i386.md 2017-04-04 16:54:09.786014373 +0200
> @@ -3767,10 +3767,10 @@ (define_expand "zero_extendsidi2"
>
>  (define_insn "*zero_extendsidi2"
>[(set (match_operand:DI 0 "nonimmediate_operand"
> -   "=r,?r,?o,r   ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x,*r")
> +   "=r,?r,?o,r   ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x,*r,?*x,?*v")
> (zero_extend:DI
>  (match_operand:SI 1 "x86_64_zext_operand"
> -   "0 ,rm,r ,rmWz,0,r   ,m   ,*Yj,*x,r   ,m  ,*k")))]
> +   "0 ,rm,r ,rmWz,0,r   ,m   ,*Yj,*x,r   ,m  ,*k,x  
> 

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-06 Thread Richard Biener
On Wed, 5 Apr 2017, Jason Merrill wrote:

> On Wed, Apr 5, 2017 at 1:41 PM, Bernd Edlinger wrote:
> > On 04/05/17 17:20, Richard Biener wrote:
> >>> Also, wonder if you need to mark all types containing such arrays,
> >>> if you couldn't just set that flag in C++ on unsigned char/std::byte
> >>> arrays (and on anything with that attribute), have that imply alias set
> >>> 0 on it and then let the rest of alias machinery handle aggregate types
> >>> containing such fields.
> >>
> >> Yes, I expected it to work like this (didn't look at the patch yet).
> 
> My impression is that this is how GCC 6 worked, but GCC 7 decides to
> ignore alias set 0 members.  Is that right?

Yes, in GCC 6 an access via an aggregate type that had an alias-set zero
member was using alias-set zero.

So we'd reinstate that behavior for aggregates containing
a type marked with the proposed new flag.

> > I want to allow *only* what the C++ standard requires or what Jason says
> > of course :), and not a single bit more, because it suppresses otherwise
> > correct optimizations.
> >
> > So a struct with a std::byte member is not alias_set 0,
> > only the std::byte itself is, but an array of std::byte
> > is itself typeless_storage, and makes the whole structure
> > also typeless_storage
> 
> Well, only the array member, not the whole structure, but it may make
> sense for GCC to treat the whole structure as such internally.

Yes, it's the easiest way to implement the required behavior.  I
wouldn't document it that way, though.  Instead, clearly state that only
the marked member behaves so, but that for the purpose of accesses via
a container type the standard's access rules apply as if the member
were accessed with a character type.
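
As a rough illustration of a container type with a byte-array member, here
is a sketch in C.  The struct toy_packet and its helpers are hypothetical
names, and the code relies only on character-type access through memcpy
rather than on the proposed flag:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container with a raw byte buffer as one member.  */
struct toy_packet {
  int tag;
  unsigned char payload[sizeof(double)];
};

/* Build a packet whose payload holds the bytes of VALUE.  */
struct toy_packet toy_packet_make(int tag, double value)
{
  struct toy_packet p;
  p.tag = tag;
  memcpy(p.payload, &value, sizeof value);
  return p;
}

/* Read the payload bytes back as a double.  */
double toy_packet_value(const struct toy_packet *p)
{
  double value;
  assert(p != NULL);
  memcpy(&value, p->payload, sizeof value);
  return value;
}
```

Copying a struct toy_packet by assignment copies the payload bytes along
with the tag, so the stored double survives an access through the container
type.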

Richard.