Re: [PATCH][4/4] SLP induction vectorization

2017-06-06 Thread Michael Meissner
On Tue, Jun 06, 2017 at 09:38:04AM +0200, Richard Biener wrote:
> On Sat, 3 Jun 2017, Richard Biener wrote:
> 
> > On June 3, 2017 1:38:14 AM GMT+02:00, Michael Meissner 
> >  wrote:
> > >On Fri, Jun 02, 2017 at 03:22:27PM +0200, Richard Biener wrote:
> > >> 
> > >> This implements vectorization of SLP inductions (in the not outer
> > >loop
> > >> vectorization case for now).
> > >> 
> > >> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > >> 
> > >> More testing is appreciated, I'm throwing it at SPEC2k6 now.
> > >
> > >I was going to apply to the PowerPC and do a spec run but the patch for
> > >tree-vect-loop.c doesn't apply.  Could you regenerate the patch, and
> > >also tell
> > >me the subversion id it is based off of?
> > 
> > Hum, it should apply cleanly (maybe one rejected hunk that I already 
> > committed separately).  I'll double check on Tuesday.
> 
> My SPEC 2k6 testing went ok and I've committed the patch as r248909
> now.

FWIW, I saw no significant differences in Spec 2006 CPU on a little endian
power8 system comparing spec built with GCC svn id 248908 and 248909 using the
same switches and libraries.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] handle bzero/bcopy in DSE and aliasing (PR 80933, 80934)

2017-06-06 Thread Martin Sebor

Note I'd be _much_ more sympathetic to simply canonicalizing all of
bzero and bcopy
to memset / memmove and be done with all the above complexity.


Attached is an updated patch along these lines.  Please let me
know if it matches your expectations.
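
For reference, the canonicalization discussed above amounts to the following mapping.  This is a minimal sketch, not code from the patch; zero_bytes and copy_bytes are hypothetical stand-ins for bzero and bcopy so the example does not depend on <strings.h>.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for bzero (dst, n): clear N bytes at DST.
inline void zero_bytes(void *dst, std::size_t n)
{
  std::memset(dst, 0, n);
}

// Hypothetical stand-in for bcopy (src, dst, n).  Note the argument
// order (source first) and that overlap is permitted, hence memmove
// rather than memcpy.
inline void copy_bytes(const void *src, void *dst, std::size_t n)
{
  std::memmove(dst, src, n);
}
```

Because bcopy allows overlapping regions, memmove (not memcpy) is the equivalent call.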

FWIW, although I don't feel too strongly about bzero et al. I'm
not sure that this approach is the right one in general.  It might
(slightly) simplify GCC itself, but other than the incidental code
generation improvement, it offers no benefit to users.  In some
cases, it even degrades user experience by causing GCC to issue
diagnostics that refer to functions that don't appear in the source
code, such as for:

  char d[1];

  void* f (const void *p)
  {
    bzero (d, 7);
  }

  warning: ‘__builtin_memset’ writing 7 bytes into a region of size 1 
overflows the destination [-Wstringop-overflow=]


For some functions like mempcpy it might even result in worse code overall
(slower and bigger).

In other cases (like profiling) it loses interesting information.

I think these types of transformations would be justified if they
were done based on measurably improved efficiency of the generated
code, but I'm uneasy about swapping calls to one function for another
solely because it simplifies the implementation.  Not least because
it doesn't seem like a viable general approach to simplifying the
implementation.

Martin

PS I stopped short of simplifying GCC to remove the existing special
handling of these three built-ins.  If the patch is approved I'm
willing to do the cleanup in a subsequent pass.
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 37bb236..363d104 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -804,6 +804,10 @@ Wnon-template-friend
 C++ ObjC++ Var(warn_nontemplate_friend) Init(1) Warning
 Warn when non-templatized friend functions are declared within a template.
 
+Wclass-memaccess
+C++ ObjC++ Var(warn_class_memaccess) Warning LangEnabledBy(C++ ObjC++, Wall)
+Warn for unsafe raw memory writes to objects of class types.
+
 Wnon-virtual-dtor
 C++ ObjC++ Var(warn_nonvdtor) Warning LangEnabledBy(C++ ObjC++,Weffc++)
 Warn about non-virtual destructors.
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 51260f0..6d8e77e 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8146,6 +8146,396 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
   return call;
 }
 
+/* Return the DECL of the first non-public data member of class TYPE
+   or null if none can be found.  */
+
+static tree
+first_non_public_field (tree type)
+{
+  if (!CLASS_TYPE_P (type))
+    return NULL_TREE;
+
+  for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+    {
+      if (TREE_CODE (field) != FIELD_DECL)
+	continue;
+      if (TREE_STATIC (field))
+	continue;
+      if (TREE_PRIVATE (field) || TREE_PROTECTED (field))
+	return field;
+    }
+
+  int i = 0;
+
+  for (tree base_binfo, binfo = TYPE_BINFO (type);
+       BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
+    {
+      tree base = TREE_TYPE (base_binfo);
+
+      if (tree field = first_non_public_field (base))
+	return field;
+    }
+
+  return NULL_TREE;
+}
+
+/* Return true if all copy and move assignment operator overloads for
+   class TYPE are trivial and at least one of them is not deleted and,
+   when ACCESS is set, accessible.  Return false otherwise.  Set
+   HASASSIGN to true when the TYPE has a (not necessarily trivial)
+   copy or move assignment.  */
+
+static bool
+has_trivial_copy_assign_p (tree type, bool access, bool *hasassign)
+{
+  tree fns = cp_assignment_operator_id (NOP_EXPR);
+  fns = lookup_fnfields_slot (type, fns);
+
+  bool all_trivial = true;
+
+  /* Iterate over copy and move assignment overloads.  */
+
+  for (ovl_iterator oi (fns); oi; ++oi)
+    {
+      tree f = *oi;
+
+      bool accessible = !access || !(TREE_PRIVATE (f) || TREE_PROTECTED (f));
+
+      /* Skip template assignment operators and deleted functions.  */
+      if (TREE_CODE (f) != FUNCTION_DECL || DECL_DELETED_FN (f))
+	continue;
+
+      if (accessible)
+	*hasassign = true;
+
+      if (!accessible || !trivial_fn_p (f))
+	all_trivial = false;
+
+      /* Break early when both properties have been determined.  */
+      if (*hasassign && !all_trivial)
+	break;
+    }
+
+  /* Return true if they're all trivial and one of the expressions
+     TYPE() = TYPE() or TYPE() = (TYPE&)() is valid.  */
+  tree ref = cp_build_reference_type (type, false);
+  return (all_trivial
+	  && (is_trivially_xible (MODIFY_EXPR, type, type)
+	      || is_trivially_xible (MODIFY_EXPR, type, ref)));
+}
+
+/* Return true if all copy and move ctor overloads for class TYPE are
+   trivial and at least one of them is not deleted and, when ACCESS is
+   set, accessible.  Return false otherwise.  Set each element of HASCTOR[]
+   to true when the TYPE has a (not necessarily trivial) default and copy
+   (or move) ctor, respectively.  */
+
+static bool
+has_trivial_copy_p (tree type, bool access, bool 

Re: SSA range class and removal of VR_ANTI_RANGEs

2017-06-06 Thread Jeff Law
On 05/23/2017 08:34 AM, Jakub Jelinek wrote:
> On Tue, May 23, 2017 at 10:29:58AM -0400, David Malcolm wrote:
>>> Do we really want methods starting with capital letters?
>>> I understand why you can't use union, but I don't think we use
>>> CamelCase
>>> anywhere.
>>
>> FWIW in the JIT, I have a class switch_ (i.e. with a trailing
>> underscore).  Maybe we should use trailing underscores for names that
>> clash with reserved words?
> 
> union_ and not_ is just fine with me.
Likewise.  No strong opinions here.

jeff


Re: SSA range class and removal of VR_ANTI_RANGEs

2017-06-06 Thread Jeff Law
On 05/23/2017 05:28 AM, Nathan Sidwell wrote:
> On 05/23/2017 06:48 AM, Aldy Hernandez wrote:
> 
>> The class can live outside of his work, as can be demonstrated by the
>> attached patch.  With it, I was able to rewrite the post-VRP range
>> information to use this class and get rid of VR_ANTI_RANGE throughout
>> the compiler.  A VR_ANTI_RANGE of ~[5,10] becomes [-MIN,4][11,+MAX].
> 
> Seems useful.
That's the idea :-)

The general consensus is that ANTI ranges are just painful to support
and they can be represented as two distinct sub-intervals -- and we know
how to operate on those subintervals reasonably well.

I haven't looked at this version of the patch, but earlier versions
did show how dropping anti range support and instead using the new
representation cleaned things up considerably.

In reality I suspect the only really important anti range is ~[0,0]
anyway.  Everything else is likely small potatoes.
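
To make the representation concrete, here is a minimal sketch of rewriting an anti-range as sub-intervals.  The interval struct and the function names are hypothetical and are not the irange API from the patch; plain int stands in for the SSA name's type.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical closed interval [lo, hi] over int; not GCC's irange.
struct interval
{
  int lo;
  int hi;
};

// Express the anti-range ~[lo, hi] as the sub-intervals on either side
// of it.  A side is omitted when [lo, hi] already touches that bound of
// the type.
std::vector<interval> anti_range_to_intervals(int lo, int hi)
{
  std::vector<interval> out;
  if (lo > INT_MIN)
    out.push_back({INT_MIN, lo - 1});
  if (hi < INT_MAX)
    out.push_back({hi + 1, INT_MAX});
  return out;
}

// Return true if V lies in any of the sub-intervals of R.
bool contains(const std::vector<interval> &r, int v)
{
  for (const interval &i : r)
    if (i.lo <= v && v <= i.hi)
      return true;
  return false;
}
```

With this, ~[5,10] yields [INT_MIN,4] and [11,INT_MAX], and the "nonzero" case ~[0,0] yields [INT_MIN,-1] and [1,INT_MAX].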


> 
>> +  /* Remove negative numbers from the range.  */
>> +  irange positives;
>> +  range_positives (, exptype, allow_zero ? 0 : 1);
> 
> 'allow_zero ? 0 : 1' looks mighty strange. I know that's a nit, but you
> placed it front and centre!
> 
>> +  if (positives.Intersect (*ir))
> 
> I notice you have a number of Uppercase member fns ...
Aldy, this ought to get fixed :-)

jeff


Re: [PATCH] Another extract_muldiv-induced overflow (PR sanitizer/80932)

2017-06-06 Thread Jeff Law
On 06/06/2017 04:31 AM, Marek Polacek wrote:
> Another case of extract_muldiv running off the rails.  Here it did a wrong
> distribution; turning 
> 
>   ((A * x) - (B * x)) * -6
> 
> into
> 
>   (A' * x) - (B' * x)
> 
> incurred an overflow in the subtraction.  The fix is essentially the same
> as what I did in sanitizer/80800.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk/7/6?
> 
> 2017-06-06  Marek Polacek  
> 
>   PR sanitizer/80932
>   * fold-const.c (extract_muldiv_1) : Add
>   TYPE_OVERFLOW_WRAPS check. 
> 
>   * c-c++-common/ubsan/pr80932.c: New test.
OK
jeff
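
As a general illustration of why this kind of distribution is unsafe when signed overflow is undefined, the sketch below evaluates both forms in 64 bits and reports whether every intermediate value fits in int.  It is not the PR 80932 test case: it shows the related hazard of an intermediate product overflowing, and all function names are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// True if V is representable in int.
inline bool fits_int(std::int64_t v)
{
  return v >= INT_MIN && v <= INT_MAX;
}

// Evaluate ((a * x) - (b * x)) * c step by step and report whether
// every intermediate value fits in int.
bool original_form_ok(int a, int b, int x, int c)
{
  std::int64_t ax = std::int64_t(a) * x;
  std::int64_t bx = std::int64_t(b) * x;
  if (!fits_int(ax) || !fits_int(bx))
    return false;
  std::int64_t diff = ax - bx;
  if (!fits_int(diff))
    return false;
  return fits_int(diff * c);
}

// Same check for the distributed form ((a * c) * x) - ((b * c) * x).
bool distributed_form_ok(int a, int b, int x, int c)
{
  std::int64_t ac = std::int64_t(a) * c;
  std::int64_t bc = std::int64_t(b) * c;
  if (!fits_int(ac) || !fits_int(bc))
    return false;
  std::int64_t acx = ac * x;
  std::int64_t bcx = bc * x;
  if (!fits_int(acx) || !fits_int(bcx))
    return false;
  return fits_int(acx - bcx);
}
```

For a = 1000000000, b = 999999999, x = 1, c = -6 the original form stays within int at every step, while the distributed form does not.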


Re: Clarify define_insn documentation

2017-06-06 Thread Jeff Law
On 06/06/2017 12:55 PM, Richard Sandiford wrote:
> This patch tries to clarify some of the restrictions on define_insn
> conditions, and also on the use of "#".
> 
> OK to install?
> 
> Richard
> 
> 
> 2017-06-06  Richard Sandiford  
> 
> gcc/
>   * doc/md.texi: Clarify the restrictions on a define_insn condition.
>   Say that # requires an associated define_split to exist, and that
>   the define_split must be suitable for use after register allocation.
OK.
jeff


Re: [PR80693] drop value of parallel SETs dropped by combine

2017-06-06 Thread Alexandre Oliva
On May 18, 2017, Alexandre Oliva  wrote:

> When an insn used by combine has multiple SETs, only the non-REG_UNUSED
> set is used: others will end up dropped on the floor.  We have to take
> note of the dropped REG_UNUSED SETs, clearing their cached values, so
> that, even if the REGs remain used (e.g. because they were referenced
> in the used SET_SRC), we will not use properties of the latest value
> as if they applied to the earlier one.

> Regstrapped on x86_64-linux-gnu.  Ok to install?

> for  gcc/ChangeLog

>   PR rtl-optimization/80693
>   * combine.c (distribute_notes): Add IDEST parameter.  Reset any
>   REG_UNUSED REGs that are not IDEST, if IDEST is given.  Adjust
>   all callers.

> for  gcc/testsuite/ChangeLog

>   PR rtl-optimization/80693
>   * gcc.dg/pr80693.c: New.

Ping?

https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01444.html

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: RFC: [PATCH] Add warn_if_not_aligned attribute

2017-06-06 Thread Martin Sebor

On 06/06/2017 04:57 PM, H.J. Lu wrote:

On Tue, Jun 6, 2017 at 10:34 AM, Martin Sebor  wrote:

On 06/06/2017 10:59 AM, H.J. Lu wrote:


On Tue, Jun 6, 2017 at 9:10 AM, Martin Sebor  wrote:


On 06/06/2017 10:07 AM, Martin Sebor wrote:



On 06/05/2017 11:45 AM, H.J. Lu wrote:



On Mon, Jun 5, 2017 at 8:11 AM, Joseph Myers 
wrote:



The new attribute needs documentation.  Should the test be in
c-c++-common




This feature does support C++.  But C++ compiler issues a slightly
different warning at a different location.


or does this feature not support C++?



Here is the updated patch with documentation and a C++ test.  This
patch caused a few testsuite failures:

FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o compile



/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/compat//struct-align-1.h:169:1:

warning: alignment 1 of 'struct B2_m_inner_p_outer' is less than 16

FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)



/export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/torture/pr80334.C:4:8:

warning: alignment 1 of 'B' is less than 16



Users often want the ability to control a warning, even when it
certainly indicates a bug.  I would suggest to add an option to
make it possible for this warning as well.

Btw., a bug related to some of those this warning is meant to
detect is assigning the address of an underaligned object to
a pointer of a natively aligned type.  Clang has an option
to detect this problem: -Waddress-of-packed-member.  It might
make a nice follow-on enhancement to add support for the same
thing.  I mention this because I think it would make sense to
consider this when choosing the name of the GCC option (i.e.,
rather than having two distinct but closely related warnings,
have one that detects both of these types of alignment bugs).




A bug that has some additional context on this is pr 51628.
A possible name for the new option suggested there is -Wpacked.

Martin



Isn't -Waddress-of-packed-member a subset of or the same as
-Wpacked?



In Clang it's neither.  -Waddress-of-packed-member only triggers
when the address of a packed member is taken but not for the cases
in bug 53037 (i.e., reducing the alignment of a member).  It's
also enabled by default, while -Wpacked needs to be specified
explicitly (i.e., it's in neither -Wall or -Wextra).
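
A minimal sketch of the situation -Waddress-of-packed-member is about, assuming a GCC- or Clang-compatible compiler (the packed attribute is an extension).  The type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical example type; packing removes the padding before VALUE.
struct __attribute__((packed)) packed_rec
{
  char tag;
  int value;     // lands at offset 1, below int's natural alignment
};

// Read the member without forming an int* to it.  Writing
// "int *p = &r.value;" instead is the pattern Clang's option flags,
// since P would claim an alignment the object does not have.
int read_value(const packed_rec &r)
{
  int v;
  const char *base = reinterpret_cast<const char *>(&r);
  std::memcpy(&v, base + offsetof(packed_rec, value), sizeof v);
  return v;
}
```

Copying the bytes with memcpy is valid at any alignment, which is why it sidesteps the problem the warning is meant to catch.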

FWIW, I don't really have a strong opinion about the names of
the options.  My input is that the proliferation of fine-grained
warning options for closely related problems tends to make it
tricky to get their interactions right (both in the compiler
and for users).  Enabling both/all such options can lead to
multiple warnings for what boils down to essentially the same
bug in the same expression, overwhelming the user in repetitive
diagnostics.



There is already -Wpacked.  Should I overload it for this?


I'd say yes if -Wpacked were at least in -Wall.  But it's
an opt-in kind of warning that's not even in -Wextra, and
relaxing an explicitly specified alignment seems more like
a bug than just something that might be nice to know about.
I would expect warn_if_not_aligned to trigger a warning even
without -Wall (i.e., as you have it in your patch, but with
an option to control it).  That would suggest three levels
of warnings:

1) warn by default (warn_if_not_aligned violation)
2) warn with -Wall (using a type with attribute aligned to
   define a member of a packed struct)
3) warn if requested (current -Wpacked)

So one way to deal with it would be to change -Wpacked to
take an argument between 0 and 3, set the default to
correspond to the (1) above, and have -Wall bump it up to
(2).

If the equivalent of -Waddress-of-packed-member were to be
implemented in GCC it would probably be a candidate to add
to the (2) above.(*)

This might be more involved than you envisioned.  A slightly
simpler alternative would be to add a different option, say
something like -Walign=N, and have it handle just (1) and
(2) above, leaving -Wpacked unchanged.

Martin

PS [*] On a related note, in the Clang discussion of
-Waddress-of-packed-member they briefly considered reusing
-Wcast-align for the same purpose, but decided against it
because it apparently involves an explicit cast.  That
doesn't seem to me like a strong enough argument not to
change -Wcast-align to trigger on implicit conversions that
increase alignment.  (The option is essentially useless on
most targets so this extension would make it generally
useful.)


Re: [PATCH,AIX] Enable libiberty to read AIX XCOFF

2017-06-06 Thread DJ Delorie

David Edelsohn  writes:
> This patch generally looks good to me -- it clearly is an incremental
> improvement.  One of the libiberty maintainers, such as Ian, needs to
> approve the patch.

As AIX maintainer, I think you have the authority to approve patches
like this, which only affect your OS.  I see no reason to reject the
patch myself, other than:

+  symtab = XNEWVEC (struct external_syment, ocr->nsyms * SYMESZ);
+  if (!simple_object_internal_read (sobj->descriptor,

There's no check to see if XNEWVEC succeeded.


Also, the use of XDELETEVEC is inconsistently protected with a "if (foo
!= NULL)" throughout, but passing NULL to XDELETEVEC (essentially,
free()) is allowed anyway, so this is only a stylistic issue, which I'm
not particularly worried about.
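
To illustrate the stylistic point, here is a short sketch under the assumption stated above that the deallocation macro is essentially free().  MY_DELETEVEC and release_buffers are hypothetical names, not libiberty's.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for an XDELETEVEC-style macro that wraps free().
#define MY_DELETEVEC(p) std::free(static_cast<void *>(p))

// Release both buffers.  Neither call needs an "if (p != NULL)" guard,
// because free() is specified to do nothing for a null pointer.
void release_buffers(int *symtab, char *strtab)
{
  MY_DELETEVEC(symtab);
  MY_DELETEVEC(strtab);
}
```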



Re: [PATCH,AIX] Enable XCOFF in libbacktrace on AIX

2017-06-06 Thread David Edelsohn
Tony,

This patch seems like a reasonable start for the minimal libbacktrace
framework needed by GCC Go.  The libbacktrace maintainer, Ian, needs
to approve the patch.

https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01186.html

I'll defer to Ian on the style of the fileline.c changes.

Thanks, David


Re: [PATCH,AIX] Enable libiberty to read AIX XCOFF

2017-06-06 Thread David Edelsohn
Tony,

This patch generally looks good to me -- it clearly is an incremental
improvement.  One of the libiberty maintainers, such as Ian, needs to
approve the patch.

https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01181.html

+  if (strcmp (name, ".text") == 0)
+textptr = scnptr;

The above code does not seem very robust.  What if the application is
compiled with -ffunction-sections so the text section is not named
".text"?

+  if (strtab == NULL)
+{
+ XDELETEVEC (symtab);
+  XDELETEVEC (scnbuf);
+  return errmsg;

The first XDELETEVEC (symtab) is indented incorrectly and should be fixed.

Thanks, David


[C++ PATCH] Spell correction tweak

2017-06-06 Thread Nathan Sidwell
Spelling correction blew up for me on the modules branch because 
suggest_alternatives_for was using the raw namespace accessor.  That led 
me to reconsider the behaviour.   This patch implements qualified name 
lookup (which it used to do), but ignoring using directives (which is 
new).  That way we'll see inline namespaces at the same time as their 
parent.  Which also means we shouldn't walk them in their own right. 
This also means we won't count them as part of the walking limit.


As inline-namespaces are intended as an implementation detail that the 
user doesn't care about, I think this is a better behaviour.
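
The language rule this relies on can be seen in a few lines.  This is only an illustration of inline namespace lookup, with hypothetical function names; it is not part of the patch or its test case.

```cpp
#include <cassert>

namespace B
{
  inline namespace I
  {
    int foo = 42;
  }
}

// Qualified lookup in B finds I's members without naming I.
int via_parent() { return B::foo; }

// The fully qualified spelling names the same object.
int via_inline() { return B::I::foo; }
```

Since B::foo and B::I::foo are the same entity, looking in B already covers I, so walking I separately would only repeat the search.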


Committed to trunk.

nathan
--
Nathan Sidwell
2017-06-06  Nathan Sidwell  

	* name-lookup.c (suggest_alternatives_for): Use qualified lookup
	sans using directives.  Don't walk into inline namespaces.

	* g++.dg/pr45330.C: Add inline namespace case.

Index: cp/name-lookup.c
===
--- cp/name-lookup.c	(revision 248928)
+++ cp/name-lookup.c	(working copy)
@@ -4714,9 +4714,10 @@ suggest_alternatives_for (location_t loc
   for (unsigned ix = 0; ix != worklist.length (); ix++)
     {
       tree ns = worklist[ix];
+      name_lookup lookup (name);
 
-      if (tree value = ovl_skip_hidden (find_namespace_value (ns, name)))
-	candidates.safe_push (value);
+      if (lookup.search_qualified (ns, false))
+	candidates.safe_push (lookup.value);
 
       if (!limited)
 	{
@@ -4728,7 +4729,8 @@ suggest_alternatives_for (location_t loc
 	  for (tree decl = NAMESPACE_LEVEL (ns)->names;
 	       decl; decl = TREE_CHAIN (decl))
 	    if (TREE_CODE (decl) == NAMESPACE_DECL
-		&& !DECL_NAMESPACE_ALIAS (decl))
+		&& !DECL_NAMESPACE_ALIAS (decl)
+		&& !DECL_NAMESPACE_INLINE_P (decl))
 	      children.safe_push (decl);
 
 	  while (!limited && !children.is_empty ())
Index: testsuite/g++.dg/pr45330.C
===
--- testsuite/g++.dg/pr45330.C	(revision 248928)
+++ testsuite/g++.dg/pr45330.C	(working copy)
@@ -1,4 +1,4 @@
-// { dg-do compile }
+// { dg-do compile { target c++11 } }
 // Search std, __cxxabiv1, and global namespaces, plus two more,
 // breadth first
 
@@ -17,7 +17,10 @@ namespace A
 
 namespace B
 {
-  int foo;			// { dg-message "B::foo" "suggested alternative" }
+  inline namespace I
+  {
+    int foo;			// { dg-message "B::I::foo" "suggested alternative" }
+  }
 }
 
 namespace C


Re: RFC: [PATCH] Add warn_if_not_aligned attribute

2017-06-06 Thread H.J. Lu
On Tue, Jun 6, 2017 at 10:34 AM, Martin Sebor  wrote:
> On 06/06/2017 10:59 AM, H.J. Lu wrote:
>>
>> On Tue, Jun 6, 2017 at 9:10 AM, Martin Sebor  wrote:
>>>
>>> On 06/06/2017 10:07 AM, Martin Sebor wrote:


 On 06/05/2017 11:45 AM, H.J. Lu wrote:
>
>
> On Mon, Jun 5, 2017 at 8:11 AM, Joseph Myers 
> wrote:
>>
>>
>> The new attribute needs documentation.  Should the test be in
>> c-c++-common
>
>
>
> This feature does support C++.  But C++ compiler issues a slightly
> different warning at a different location.
>
>> or does this feature not support C++?
>>
>
> Here is the updated patch with documentation and a C++ test.  This
> patch caused a few testsuite failures:
>
> FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o compile
>
>
>
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/compat//struct-align-1.h:169:1:
>
> warning: alignment 1 of 'struct B2_m_inner_p_outer' is less than 16
>
> FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)
>
>
>
> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/torture/pr80334.C:4:8:
>
> warning: alignment 1 of 'B' is less than 16
>

 Users often want the ability to control a warning, even when it
 certainly indicates a bug.  I would suggest to add an option to
 make it possible for this warning as well.

 Btw., a bug related to some of those this warning is meant to
 detect is assigning the address of an underaligned object to
 a pointer of a natively aligned type.  Clang has an option
 to detect this problem: -Waddress-of-packed-member.  It might
 make a nice follow-on enhancement to add support for the same
 thing.  I mention this because I think it would make sense to
 consider this when choosing the name of the GCC option (i.e.,
 rather than having two distinct but closely related warnings,
 have one that detects both of these types of alignment bugs).
>>>
>>>
>>>
>>> A bug that has some additional context on this is pr 51628.
>>> A possible name for the new option suggested there is -Wpacked.
>>>
>>> Martin
>>
>>
>> Isn't -Waddress-of-packed-member a subset of or the same as
>> -Wpacked?
>
>
> In Clang it's neither.  -Waddress-of-packed-member only triggers
> when the address of a packed member is taken but not for the cases
> in bug 53037 (i.e., reducing the alignment of a member).  It's
> also enabled by default, while -Wpacked needs to be specified
> explicitly (i.e., it's in neither -Wall or -Wextra).
>
> FWIW, I don't really have a strong opinion about the names of
> the options.  My input is that the proliferation of fine-grained
> warning options for closely related problems tends to make it
> tricky to get their interactions right (both in the compiler
> and for users).  Enabling both/all such options can lead to
> multiple warnings for what boils down to essentially the same
> bug in the same expression, overwhelming the user in repetitive
> diagnostics.
>

There is already -Wpacked.  Should I overload it for this?


-- 
H.J.


[PATCH, rs6000] Fix vec_mulo and vec_mule builtin implementations

2017-06-06 Thread Carl E. Love
GCC Maintainers:

The support for the vec_mulo and vec_mule that was recently submitted
has a couple of bugs.  Specifically, they were implemented with
int/unsigned int args and return int/unsigned int.  The return types
should have been long long/unsigned long long.  Additionally it was
noted that the unsigned version returned a signed result by mistake.
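
As a scalar illustration of why the result type has to be twice as wide as the element type, here is a sketch of the even/odd multiplies on four ints.  The function names are hypothetical, and "even" and "odd" mean array indices 0,2 and 1,3 here; how those map to hardware lanes on a given endianness is outside this sketch.

```cpp
#include <array>
#include <cassert>
#include <climits>

// Scalar model of a widening multiply of the even-indexed elements.
std::array<long long, 2> mule_model(const std::array<int, 4> &a,
                                    const std::array<int, 4> &b)
{
  return {{ (long long) a[0] * b[0], (long long) a[2] * b[2] }};
}

// Scalar model of a widening multiply of the odd-indexed elements.
std::array<long long, 2> mulo_model(const std::array<int, 4> &a,
                                    const std::array<int, 4> &b)
{
  return {{ (long long) a[1] * b[1], (long long) a[3] * b[3] }};
}
```

A product such as INT_MAX * INT_MAX does not fit in int, which is why an int return type cannot be right for these overloads.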

The following patch fixes these issues.  The patch has been tested on
powerpc64le-unknown-linux-gnu (Power 8 LE) and on
powerpc64-unknown-linux-gnu (Power 8 BE) with no regressions.

Is the patch OK for gcc mainline?

  Carl Love
---

gcc/ChangeLog:

2017-06-08  Carl Love  

* config/rs6000/rs6000-c.c: The return type of the following
built-in functions was implemented as int not long long.  Fix sign
of return value for the unsigned version of vec_mulo and vec_mule.
vector unsigned long long vec_bperm (vector unsigned long long,
 vector unsigned char)
vector signed long long vec_mule (vector signed int,
  vector signed int)
vector unsigned long long vec_mule (vector unsigned int,
vector unsigned int)
vector signed long long vec_mulo (vector signed int,
  vector signed int)
vector unsigned long long vec_mulo (vector unsigned int,
vector unsigned int)
* doc/extend.texi: Fix the documentation for the
built-in functions.

gcc/testsuite/ChangeLog:

2017-06-08  Carl Love  

* gcc.target/powerpc/builtins-3.c: Fix vec_mule, vec_mulo test cases.
---
 gcc/config/rs6000/rs6000-c.c  | 12 +--
 gcc/doc/extend.texi   | 13 +++-
 gcc/testsuite/gcc.target/powerpc/builtins-3.c | 30 ++-
 3 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index b602dee..a917ea7 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -2212,9 +2212,9 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESH,
 RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 },
   { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESH,
-RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
-  { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESH,
-RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI,
+RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
+  { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULEUH,
+RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI,
 RS6000_BTI_unsigned_V4SI, 0 },
   { ALTIVEC_BUILTIN_VEC_VMULEUB, ALTIVEC_BUILTIN_VMULEUB,
 RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_unsigned_V16QI, 0 },
@@ -2231,9 +2231,9 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOUH,
 RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, 
RS6000_BTI_unsigned_V8HI, 0 },
   { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH,
-RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
-  { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH,
-RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI,
+RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
+  { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOUH,
+RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI,
 RS6000_BTI_unsigned_V4SI, 0 },
   { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH,
 RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 },
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d147d5a..d467a16 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -16345,10 +16345,10 @@ vector signed short vec_mule (vector signed char,
 vector unsigned int vec_mule (vector unsigned short,
   vector unsigned short);
 vector signed int vec_mule (vector signed short, vector signed short);
-vector unsigned int vec_mule (vector unsigned int,
-  vector unsigned int);
-vector signed int vec_mule (vector signed int,
-vector signed int);
+vector unsigned long long vec_mule (vector unsigned int,
+vector unsigned int);
+vector signed long long vec_mule (vector signed int,
+  vector signed int);
 
 vector signed int vec_vmulesh (vector signed short,
vector signed short);
@@ -16368,7 +16368,10 @@ vector signed short vec_mulo (vector signed char, 
vector signed char);
 vector unsigned int vec_mulo (vector unsigned short,
   vector unsigned short);
 

Re: [PATCH] warn on mem calls modifying objects of non-trivial types (PR 80560)

2017-06-06 Thread Martin Sebor

On 06/05/2017 07:53 PM, Martin Sebor wrote:

On 06/05/2017 01:13 PM, Martin Sebor wrote:

On 06/05/2017 10:07 AM, Martin Sebor wrote:

Maybe I should use a different approach and instead of trying
to see if a function is deleted use trivially_xible to see if
it's usable.  That will mean changing the diagnostics from
"with a deleted special function" to "without trivial special
function" but it will avoid calling synthesized_method_walk
while still avoiding giving bogus suggestions.


Actually, this would check for one possible argument type and not
others, so I think it's better to keep looking at the declarations.  You
can do that by just looking them up (lookup_fnfields_slot) and iterating
over them, you don't need to call synthesized_method_walk.


You mean using trivially_xible might check assignability or copy
constructibility from const T& but not from T& (or the other way
around), and you think both (or perhaps even other forms) should
be considered?

E.g., given:

  struct S
  {
    S& operator= (const S&) = default;
    void operator= (S&) = delete;
  };

  void f (S *d, const S *s)
  {
    memcpy(d, s, sizeof *d);   // don't warn here
  }

  void g (S *d, S *s)
  {
    memcpy(d, s, sizeof *d);   // but warn here
  }

And your suggestion is to iterate over the assignment operator
(and copy ctor) overloads for S looking for one that's trivial,
public, and not deleted?

If that's it, I was thinking of just checking for the const T&
overload (as if by using std::is_trivially_copy_assignable()).
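
The difference between the two checks can be seen with the standard type traits.  This is a sketch using the struct S from the example above; the two constant names are hypothetical, and it only models the question being asked, not what the patch does internally.

```cpp
#include <cassert>
#include <type_traits>

struct S
{
  S& operator= (const S&) = default;
  void operator= (S&) = delete;
};

// Considers only assignment from a const S&, which selects the
// defaulted (trivial) overload.
constexpr bool const_source_ok
  = std::is_trivially_copy_assignable<S>::value;

// Asks the same question for a non-const lvalue source.  Overload
// resolution picks the deleted operator= (S&), so the answer differs.
constexpr bool nonconst_source_ok
  = std::is_trivially_assignable<S&, S&>::value;
```

A check limited to the const T& overload therefore accepts S, while iterating over all the overloads would also notice the deleted one.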

I don't mind trying the approach you suggest.  It should be more
accurate.  I just want to make sure we're on the same page.


Actually, after some more thought and testing the approach I have
a feeling that distinguishing between the two cases above is not
what you meant.

Classes that overload copy assignment or copy ctors on the constness
of the argument are tricky to begin with and using raw memory calls
on them seems suspect and worthy of a warning.

I'm guessing what you meant by "checking for one possible argument
type and not others" is actually checking to make sure all copy
assignment (and copy ctor) overloads are trivial, not just some,
and at least one of them is accessible.  I'll go with that unless
I hear otherwise.


Attached is an update that implements this simplified approach.


Another update, this one rebased on top of trunk (where OVL_NEXT
doesn't exist).  This version also adds a couple of minor tweaks
to handle template ctors and assignment operators, and to handle
and exercise move ctors and move assignment and const/non-const
overloads of each.  Tested on x86_64-linux.

Martin
PR c++/80560 - warn on undefined memory operations involving non-trivial types

gcc/c-family/ChangeLog:

	PR c++/80560
	* c.opt (-Wclass-memaccess): New option.

gcc/cp/ChangeLog:

	PR c++/80560
	* call.c (first_non_public_field, maybe_warn_class_memaccess): New
	functions.
	(has_trivial_copy_assign_p, has_trivial_copy_p): Ditto.
	(build_cxx_call): Call maybe_warn_class_memaccess.

gcc/ChangeLog:

	PR c++/80560
	* dumpfile.c (dump_register): Avoid calling memset to initialize
	a class with a default ctor.
	* gcc.c (struct compiler): Remove const qualification.
	* genattrtab.c (gen_insn_reserv): Replace memset with initialization.
	* hash-table.h: Ditto.
	* ipa-cp.c (allocate_and_init_ipcp_value): Replace memset with
	  assignment.
	* ipa-prop.c (ipa_free_edge_args_substructures): Ditto.
	* omp-low.c (lower_omp_ordered_clauses): Replace memset with
	default ctor.
	* params.h (struct param_info): Make struct members non-const.
	* tree-switch-conversion.c (emit_case_bit_tests): Replace memset
	with default initialization.
	* vec.h (vec_copy_construct, vec_default_construct): New helper
	functions.
	(vec::copy, vec::splice, vec::reserve): Replace memcpy
	with vec_copy_construct.
	(vect::quick_grow_cleared): Replace memset with default ctor.
	(vect::vec_safe_grow_cleared, vec_safe_grow_cleared): Same.
	* doc/invoke.texi (-Wclass-memaccess): Document.

libcpp/ChangeLog:

	PR c++/80560
	* line-map.c (line_maps::~line_maps): Avoid calling htab_delete
	with a null pointer.
	(linemap_init): Avoid calling memset on an object of a non-trivial
	type.

libitm/ChangeLog:

	PR c++/80560
	* beginend.cc (GTM::gtm_thread::rollback): Avoid calling memset
	on an object of a non-trivial type.
	(GTM::gtm_transaction_cp::commit): Use assignment instead of memcpy
	to copy an object.
	* method-ml.cc (orec_iterator::reinit): Avoid -Wclass-memaccess.

gcc/testsuite/ChangeLog:

	PR c++/80560
	* g++.dg/Wclass-memaccess.C: New test.
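
Several of the entries above replace memset with initialization or assignment.  The general shape of that change, as a minimal sketch with a hypothetical class rather than any of the GCC types touched by the patch:

```cpp
#include <cassert>
#include <string>

// Hypothetical class with a non-trivial member, so clearing it with
// memset would bypass std::string's invariants.
struct record
{
  int id = 0;
  std::string name;
};

// Reset R to its default state by assigning a value-initialized
// temporary instead of calling memset (&r, 0, sizeof r).
void reset(record &r)
{
  r = record ();
}
```

The assignment runs the members' own assignment operators, so it stays correct if the class later gains further non-trivial members.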


Re: [PATCH] vec_merge + vec_duplicate + vec_concat simplification

2017-06-06 Thread Marc Glisse

On Tue, 6 Jun 2017, Kyrill Tkachov wrote:


Another vec_merge simplification that's missing is transforming:
(vec_merge (vec_duplicate x) (vec_concat (y) (z)) (const_int N))
into
(vec_concat x z) if N == 1 (0b01) or
(vec_concat y x) if N == 2 (0b10)


Do we have a canonicalization somewhere that guarantees we cannot get
(vec_merge (vec_concat (y) (z)) (vec_duplicate x) (const_int N))
?

I was wondering if it would be possible to merge the transformations for 
concat and duplicate into a single one, but maybe it becomes too 
unreadable.
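
To make the two forms concrete, here is a small model of the operations on two-element vectors.  It is only a sketch: it assumes the usual vec_merge reading in which bit i of the mask selects element i from the first operand, and the C++ functions below are hypothetical stand-ins for the RTL codes, not GCC code.

```cpp
#include <array>
#include <cassert>

using V2 = std::array<int, 2>;

// Model of (vec_duplicate x): both lanes hold x.
V2 vec_duplicate(int x) { return V2{{x, x}}; }

// Model of (vec_concat a b): lane 0 is a, lane 1 is b.
V2 vec_concat(int a, int b) { return V2{{a, b}}; }

// Model of (vec_merge a b mask): if bit i of mask is set, lane i comes
// from a, otherwise from b.
V2 vec_merge(const V2& a, const V2& b, unsigned mask) {
  V2 r{};
  for (unsigned i = 0; i < 2; ++i)
    r[i] = ((mask >> i) & 1u) ? a[i] : b[i];
  return r;
}
```

Under this model the duplicate-first form with N == 1 gives (vec_concat x z) and with N == 2 gives (vec_concat y x); the swapped operand order gives the same two results with the masks exchanged.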


--
Marc Glisse


Re: add VxWorks specific crtstuff implementation files for Ada runtimes

2017-06-06 Thread Olivier Hainque
Hi Nathan,

> On Jun 5, 2017, at 12:58 , Nathan Sidwell  wrote:
> 
> On 06/02/2017 11:58 AM, Olivier Hainque wrote:
>> Hello,
> 
>> 2017-06-02  Olivier Hainque  
>>  ada/
>>  * vx_crtbegin_auto.c: New file.
>>  * vx_crtbegin.c: New file.
>>  * vx_crtbegin.inc: New file.
>>  * vx_crtend.c: New file.
> 
> + *  Copyright (C) 2016, Free Software Foundation, Inc. 
> 
> Date update?

Yes, I can do that. For a moment I thought that keeping
the date of the latest change to the source contents was
correct. I understand that committing upstream represents
an event of interest as well.

Regards,

Olivier





Go patch committed: fix types used in interface method tables

2017-06-06 Thread Ian Lance Taylor
This patch by Than McIntosh fixes the types that the Go frontend
passes to the middle-end for interface method tables.  The Go frontend
was using Go function types for values that are actually C function
pointers.  The value is a pointer either way, but clearly the type
should be correct.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 248765)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-e5870eac67d4d5b1f86bdbfb13dadf4d5723f71d
+7e3904e4370ccfd9062c2661c612476288244e17
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 248528)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -10156,9 +10156,13 @@ Call_expression::do_must_eval_in_order()
 Expression*
 Call_expression::interface_method_function(
 Interface_field_reference_expression* interface_method,
-Expression** first_arg_ptr)
+Expression** first_arg_ptr,
+Location location)
 {
-  *first_arg_ptr = interface_method->get_underlying_object();
+  Expression* object = interface_method->get_underlying_object();
+  Type* unsafe_ptr_type = Type::make_pointer_type(Type::make_void_type());
+  *first_arg_ptr =
+  Expression::make_unsafe_cast(unsafe_ptr_type, object, location);
   return interface_method->get_function();
 }
 
@@ -10267,7 +10271,8 @@ Call_expression::do_get_backend(Translat
   else
 {
   Expression* first_arg;
-  fn = this->interface_method_function(interface_method, &first_arg);
+  fn = this->interface_method_function(interface_method, &first_arg,
+   location);
   fn_args[0] = first_arg->get_backend(context);
 }
 
@@ -15392,10 +15397,16 @@ Interface_mtable_expression::do_type()
   Typed_identifier tid("__type_descriptor", 
Type::make_type_descriptor_ptr_type(),
this->location());
   sfl->push_back(Struct_field(tid));
+  Type* unsafe_ptr_type = Type::make_pointer_type(Type::make_void_type());
   for (Typed_identifier_list::const_iterator p = interface_methods->begin();
p != interface_methods->end();
++p)
-sfl->push_back(Struct_field(*p));
+{
+  // We want C function pointers here, not func descriptors; model
+  // using void* pointers.
+  Typed_identifier method(p->name(), unsafe_ptr_type, p->location());
+  sfl->push_back(Struct_field(method));
+}
   Struct_type* st = Type::make_struct_type(sfl, this->location());
   st->set_is_struct_incomparable();
   this->method_table_type_ = st;
@@ -15456,11 +15467,18 @@ Interface_mtable_expression::do_get_back
   else
 td_type = Type::make_pointer_type(this->type_);
 
+  std::vector<Backend::Btyped_identifier> bstructfields;
+
   // Build an interface method table for a type: a type descriptor followed by a
   // list of function pointers, one for each interface method.  This is used for
   // interfaces.
   Expression_list* svals = new Expression_list();
-  svals->push_back(Expression::make_type_descriptor(td_type, loc));
+  Expression* tdescriptor = Expression::make_type_descriptor(td_type, loc);
+  svals->push_back(tdescriptor);
+
+  Btype* tdesc_btype = tdescriptor->type()->get_backend(gogo);
+  Backend::Btyped_identifier btd("_type", tdesc_btype, loc);
+  bstructfields.push_back(btd);
 
   Named_type* nt = this->type_->named_type();
   Struct_type* st = this->type_->struct_type();
@@ -15480,13 +15498,24 @@ Interface_mtable_expression::do_get_back
   Named_object* no = m->named_object();
 
   go_assert(no->is_function() || no->is_function_declaration());
+
+  Btype* fcn_btype = m->type()->get_backend_fntype(gogo);
+  Backend::Btyped_identifier bmtype(p->name(), fcn_btype, loc);
+  bstructfields.push_back(bmtype);
+
   svals->push_back(Expression::make_func_code_reference(no, loc));
 }
 
-  Btype* btype = this->type()->get_backend(gogo);
-  Expression* mtable = Expression::make_struct_composite_literal(this->type(),
- svals, loc);
-  Bexpression* ctor = mtable->get_backend(context);
+  Btype *btype = gogo->backend()->struct_type(bstructfields);
+  std::vector<Bexpression*> ctor_bexprs;
+  for (Expression_list::const_iterator pe = svals->begin();
+   pe != svals->end();
+   ++pe)
+{
+  ctor_bexprs.push_back((*pe)->get_backend(context));
+}
+  Bexpression* ctor =
+  gogo->backend()->constructor_expression(btype, ctor_bexprs, loc);
 
   bool is_public = has_hidden_methods && this->type_->named_type() != NULL;
   std::string asm_name(go_selectively_encode_id(mangled_name));
Index: gcc/go/gofrontend/expressions.h

Re: Reorgnanization of profile count maintenance code, part 1

2017-06-06 Thread Jan Hubicka
> Hi!
> 
> On Thu, Jun 01, 2017 at 01:35:56PM +0200, Jan Hubicka wrote:
> > +  /* FIXME: shrink wrapping violates this sanity check.  */
> > +  gcc_checking_assert ((num >= 0
> > +   && (num <= REG_BR_PROB_BASE
> > +   || den <= REG_BR_PROB_BASE)
> > +   && den > 0) || 1);
> > +  ret.m_val = RDIV (m_val * num, den);
> > +  return ret;
> 
> Sorry if I missed this...  But where/how does it violate this?

It sums multiple probabilities together and overflows the limit.
> 
> > +  /* Take care for overflows!  */
> > +  if (num.m_val < 8196 || m_val < 8196)
> > +ret.m_val = RDIV (m_val * num.m_val, den.m_val);
> > +  else
> > +ret.m_val = RDIV (m_val * RDIV (num.m_val * 8196, den.m_val), 8196);
> > +  return ret;
> 
> 8196 is a strange number, did you mean 8192?

Yep I meant :) Not sure how it got there.

I will fix both problems incrementally.
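
For reference, a standalone sketch of the scaling logic quoted above with the constant changed to 8192.  The names rdiv, kScale and apply_scale are hypothetical; rdiv is assumed to round to nearest the way RDIV does, and this is not the committed code.

```cpp
#include <cassert>
#include <cstdint>

// Rounding division, assumed to match RDIV: (x + y / 2) / y.
constexpr std::int64_t rdiv(std::int64_t x, std::int64_t y) {
  return (x + y / 2) / y;
}

// 8192 == 2^13, the value that was intended instead of 8196.
constexpr std::int64_t kScale = 8192;

// Scale val by num / den.  When both val and num are large, the ratio is
// first reduced to a fixed-point fraction of kScale so that the
// intermediate product stays smaller.
std::int64_t apply_scale(std::int64_t val, std::int64_t num,
                         std::int64_t den) {
  if (num < kScale || val < kScale)
    return rdiv(val * num, den);
  return rdiv(val * rdiv(num * kScale, den), kScale);
}

static_assert(kScale == (std::int64_t{1} << 13), "8192 is a power of two");
static_assert((8196 & 8195) != 0, "8196 is not a power of two");
```

With a power of two the final division by kScale can become a shift; the fixed-point step trades a little precision for headroom in the multiplication.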

Thanks,
Honza
> 
> 
> Segher


Re: [gcn][patch] Add -mgpu option and plumb in assembler/linker

2017-06-06 Thread Andrew Stubbs

On 29/05/17 18:27, Martin Jambor wrote:

I apologize for taking so long to reply; I was traveling for the past two
weeks, and just before that we suffered some local infrastructure
issues that prevented me from working on this too.


And I've just been on vacation for a week. :-)


On Fri, Apr 28, 2017 at 06:06:39PM +0100, Andrew Stubbs wrote:
At the moment I prefer to use --with-as and --with-ld configure


Those only work with absolute paths, which is not suitable for a 
toolchain that may be installed in a user's home directory, which will 
be my use case.



despite the error message.  I guess we are fine with passing -fno-lto
or rather disabling lto at configure time for the time being.


I configure with --disable-lto.


This is the one thing that was difficult for me to get working.
I had to upgrade my kernel and both run-time libraries to the newest
ROCm 1.5, re-compile llvm and lld from ROCm github branches, and
rewrite our testing kernel invoker to use non-deprecated HSA 1.1
functions (we had been using hsa_code_object_deserialize and friends
from HSA 1.0).  But finally, my kernels get loaded, started and work.


Hmm, I didn't realize it would be so hard. I guess I'm joining the party 
late.



3. Add -mgpu option and corresponding --with-gpu. I've deliberately used
"gpu" instead of "cpu" because I want offloading compilers to be able to say
"-mcpu=foo -foffload=-mgpu=bar", or even have the host compiler just
understand -mgpu and DTRT.


As far as I am concerned, this seems like a good idea.


Thomas objects to the new option, and after talking with him the 
reasoning seems sound. GCC has been moving away from -mcpu in any case, 
so I guess I'll put -march and -mtune back, and use those for the same 
purpose.


I'll commit the patch with those changes soonish.

Andrew


[PATCH] PR c++/80990 use cv-qualifiers in class template argument deduction

2017-06-06 Thread Jonathan Wakely

This fixes class template argument deduction so that cv-qualifiers are
not ignored.

Bootstrapped and tested powerpc64le-linux. OK for trunk?

PR c++/80990
* pt.c (do_class_deduction): Build qualified type.

commit 1a81d74fa9cf32cb33580cc4a5e9a7c5868ce32b
Author: Jonathan Wakely 
Date:   Tue Jun 6 18:38:30 2017 +0100

PR c++/80990 use cv-qualifiers in class template argument deduction

gcc/cp:

	PR c++/80990
	* pt.c (do_class_deduction): Build qualified type.

gcc/testsuite:

	PR c++/80990
	* g++.dg/cpp1z/class-deduction39.C: New.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index a04f2e0..ada94bd 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -25367,7 +25367,7 @@ do_class_deduction (tree ptype, tree tmpl, tree init, int flags,
   --cp_unevaluated_operand;
   release_tree_vector (args);
 
-  return TREE_TYPE (t);
+  return cp_build_qualified_type (TREE_TYPE (t), cp_type_quals (ptype));
 }
 
 /* Replace occurrences of 'auto' in TYPE with the appropriate type deduced
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction39.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction39.C
new file mode 100644
index 000..98a3664
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction39.C
@@ -0,0 +1,15 @@
+// { dg-options -std=c++1z }
+
+template  struct A { };
+
+A a;
+const A c = a;
+volatile A v = a;
+const volatile A cv = a;
+
+template  struct same;
+template  struct same {};
+
+same s1;
+same s2;
+same s3;
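
For readers who want to try the behaviour outside the testsuite, here is a small self-contained sketch of what the fix is meant to guarantee: cv-qualifiers written next to a deduced class template name end up on the deduced type.  It uses a hypothetical Box template rather than the test's own A, and assumes a C++17 compiler.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class template; its argument is deduced from the ctor.
template <typename T>
struct Box {
  Box(T v) : value(v) {}
  T value;
};

Box b(1);                   // deduced as Box<int>
const Box c = b;            // expected type: const Box<int>
volatile Box v = b;         // expected type: volatile Box<int>
const volatile Box cv = b;  // expected type: const volatile Box<int>

// These only compile if the qualifiers are kept by deduction.
static_assert(std::is_same<decltype(b), Box<int>>::value, "plain");
static_assert(std::is_same<decltype(c), const Box<int>>::value, "const");
static_assert(std::is_same<decltype(v), volatile Box<int>>::value,
              "volatile");
static_assert(std::is_same<decltype(cv), const volatile Box<int>>::value,
              "const volatile");
```

If the qualifiers were dropped during deduction, the second to fourth static_asserts would fail to compile.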


Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.

2017-06-06 Thread David Miller
From: David Miller 
Date: Mon, 05 Jun 2017 20:54:46 -0400 (EDT)

> From: Eric Botcazou 
> Date: Tue, 06 Jun 2017 00:02:06 +0200
> 
>>> That seems to work as well, following is going through a testsuite
>>> run right now:
>>> 
>>> 
>>> [PATCH] sparc: Fix stack references in return delay slot.
>>> 
>>> gcc/
>>> 
>>> PR target/80968
>>> * config/sparc/sparc.c (sparc_expand_prologue): Emit frame
>>> blockage if function uses alloca.
>> 
>> Probably worth applying on all active branches I'd think.
> 
> I agree, this is a really nasty bug.
> 
> I'll do that after some more testing.

This is now done, thanks again.


Clarify define_insn documentation

2017-06-06 Thread Richard Sandiford
This patch tries to clarify some of the restrictions on define_insn
conditions, and also on the use of "#".

OK to install?

Richard


2017-06-06  Richard Sandiford  

gcc/
* doc/md.texi: Clarify the restrictions on a define_insn condition.
Say that # requires an associated define_split to exist, and that
the define_split must be suitable for use after register allocation.

Index: gcc/doc/md.texi
===
--- gcc/doc/md.texi 2017-05-31 10:02:28.617662269 +0100
+++ gcc/doc/md.texi 2017-06-06 19:49:40.453694308 +0100
@@ -115,7 +115,7 @@ A @code{define_insn} is an RTL expressio
 
 @enumerate
 @item
-An optional name.  The presence of a name indicate that this instruction
+An optional name.  The presence of a name indicates that this instruction
 pattern can perform a certain standard job for the RTL-generation
 pass of the compiler.  This pass knows certain names and will use
 the instruction patterns with those names, if the names are defined
@@ -166,9 +166,21 @@ individual insn, and only after the insn
 recognition template.  The insn's operands may be found in the vector
 @code{operands}.
 
-For an insn where the condition has once matched, it
-cannot later be used to control register allocation by excluding
-certain register or value combinations.
+An instruction condition cannot become more restrictive as compilation
+progresses.  If the condition accepts a particular RTL instruction at
+one stage of compilation, it must continue to accept that instruction
+until the final pass.  For example, @samp{!reload_completed} and
+@samp{can_create_pseudo_p ()} are both invalid instruction conditions,
+because they are true during the earlier RTL passes and false during
+the later ones.  For the same reason, if a condition accepts an
+instruction before register allocation, it cannot later try to control
+register allocation by excluding certain register or value combinations.
+
+Although a condition cannot become more restrictive as compilation
+progresses, the condition for a nameless pattern @emph{can} become
+more permissive.  For example, a nameless instruction can require
+@samp{reload_completed} to be true, in which case it only matches
+after register allocation.
 
 @item
 The @dfn{output template} or @dfn{output statement}: This is either
@@ -561,6 +573,11 @@ already defined, then you can simply use
 instead of writing an output template that emits the multiple assembler
 instructions.
 
+Note that @code{#} only has an effect while generating assembly code;
+it does not affect whether a split occurs earlier.  An associated
+@code{define_split} must exist and it must be suitable for use after
+register allocation.
+
 If the macro @code{ASSEMBLER_DIALECT} is defined, you can use construct
 of the form @samp{@{option0|option1|option2@}} in the templates.  These
 describe multiple variants of assembler language syntax.


Re: [PATCH 9/13] D: D2 Testsuite Dejagnu files.

2017-06-06 Thread Mike Stump
On Jun 6, 2017, at 10:48 AM, Iain Buclaw  wrote:
>> Something this large can be integration tested on a svn/git branch, if you 
>> need others to help out.
>> 
> 
> It would probably be easier for me to maintain also, rather than
> continuously regenerating patches each time I make an update to
> reflect something that changed in GCC.

A branch doesn't remove the need to post patches for review; only eases the 
burden of working with external people before it hits trunk.



Re: MinGW compilation warnings in libiberty's waitpid.c

2017-06-06 Thread Iain Buclaw
On 30 May 2017 at 19:10, Joel Brobecker  wrote:
>> This has been on my todo-list for a little while, as re-syncing is
>> something I normally do after pushing D language support updates into
>> libiberty.  However I decided to give it a wait until I got all
>> pending patches in, the last of which I'm just pushing in now.
>
> That's very kind of you to help us with that :). Thank you!
>

I ended up not having time before going on holiday.  If the resync
hasn't already been done, I'll do it now.

Iain.


Re: [PATCH 9/13] D: D2 Testsuite Dejagnu files.

2017-06-06 Thread Iain Buclaw
On 31 May 2017 at 01:32, Mike Stump  wrote:
> On May 28, 2017, at 2:16 PM, Iain Buclaw  wrote:
>>
>> This patch adds D language support to the GCC test suite.
>
> Ok.  If you could ensure that gcc without D retains all its goodness and 
> that gcc with D works on 2 different systems, that will help ensure 
> integration smoothness.
>
> Something this large can be integration tested on a svn/git branch, if you 
> need others to help out.
>

It would probably be easier for me to maintain also, rather than
continuously regenerating patches each time I make an update to
reflect something that changed in GCC.

Iain.


Re: [PATCH GCC][5/5]Enable tree loop distribution at -O3 and above optimization levels.

2017-06-06 Thread Jeff Law
On 06/02/2017 05:52 AM, Bin Cheng wrote:
> Hi,
> This patch enables -ftree-loop-distribution by default at -O3 and above 
> optimization levels.
> Bootstrapped and tested at O2/O3 on x86_64 and AArch64.  Is it OK?
> 
> Note I don't have a strong opinion here and am fine with it being either
> accepted or rejected.
> 
> Thanks,
> bin
> 2017-05-31  Bin Cheng  
> 
>   * opts.c (default_options_table): Enable OPT_ftree_loop_distribution
>   for -O3 and above levels.
I think the question is how this generally impacts the performance
of the generated code and, to a lesser degree, compile time.

Do you have any performance data?

jeff



Re: [PATCH 1/1] Remove redundant definition of srcrootpre

2017-06-06 Thread Jeff Law
On 06/05/2017 01:30 AM, coypu wrote:
> This script has the only occurrence of it, and in this line
> it defines and exports it.
> 
> ---
>  config-ml.in | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/config-ml.in b/config-ml.in
> index 47f153350..58c80a35c 100644
> --- a/config-ml.in
> +++ b/config-ml.in
> @@ -493,7 +493,6 @@ multi-do:
> true; \
>   else \
> rootpre=`${PWD_COMMAND}`/; export rootpre; \
> -   srcrootpre=`cd $(srcdir); ${PWD_COMMAND}`/; export srcrootpre; \
> lib=`echo "$${rootpre}" | sed -e 's,^.*/\([^/][^/]*\)/$$,\1,'`; \
> compiler="$(CC)"; \
> for i in `$${compiler} --print-multi-lib 2>/dev/null`; do \
> 
I think this is used outside GCC (newlib in particular).  It may be a
remnant of when folks used to build everything in a single unified tree
for crosses.

jeff


Re: RFC: [PATCH] Add warn_if_not_aligned attribute

2017-06-06 Thread Martin Sebor

On 06/06/2017 10:59 AM, H.J. Lu wrote:
> On Tue, Jun 6, 2017 at 9:10 AM, Martin Sebor  wrote:
>> On 06/06/2017 10:07 AM, Martin Sebor wrote:
>>> On 06/05/2017 11:45 AM, H.J. Lu wrote:
>>>> On Mon, Jun 5, 2017 at 8:11 AM, Joseph Myers 
>>>> wrote:
>>>>> The new attribute needs documentation.  Should the test be in
>>>>> c-c++-common
>>>>
>>>> This feature does support C++.  But C++ compiler issues a slightly
>>>> different warning at a different location.
>>>>
>>>>> or does this feature not support C++?
>>>>
>>>> Here is the updated patch with documentation and a C++ test.  This
>>>> patch caused a few testsuite failures:
>>>>
>>>> FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o compile
>>>>
>>>> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/compat//struct-align-1.h:169:1:
>>>> warning: alignment 1 of 'struct B2_m_inner_p_outer' is less than 16
>>>>
>>>> FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)
>>>>
>>>> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/torture/pr80334.C:4:8:
>>>> warning: alignment 1 of 'B' is less than 16
>>>
>>> Users often want the ability to control a warning, even when it
>>> certainly indicates a bug.  I would suggest adding an option that
>>> makes this possible for this warning as well.
>>>
>>> Btw., a bug related to some of those this warning is meant to
>>> detect is assigning the address of an underaligned object to
>>> a pointer of a natively aligned type.  Clang has an option
>>> to detect this problem: -Waddress-of-packed-member.  It might
>>> make a nice follow-on enhancement to add support for the same
>>> thing.  I mention this because I think it would make sense to
>>> consider this when choosing the name of the GCC option (i.e.,
>>> rather than having two distinct but closely related warnings,
>>> have one that detects both of these types of alignment bugs).
>>
>> A bug that has some additional context on this is pr 51628.
>> A possible name for the new option suggested there is -Wpacked.
>>
>> Martin
>
> Isn't -Waddress-of-packed-member a subset of or the same as
> -Wpacked?


In Clang it's neither.  -Waddress-of-packed-member only triggers
when the address of a packed member is taken but not for the cases
in bug 53037 (i.e., reducing the alignment of a member).  It's
also enabled by default, while -Wpacked needs to be specified
explicitly (i.e., it's in neither -Wall nor -Wextra).

FWIW, I don't really have a strong opinion about the names of
the options.  My input is that the proliferation of fine-grained
warning options for closely related problems tends to make it
tricky to get their interactions right (both in the compiler
and for users).  Enabling both/all such options can lead to
multiple warnings for what boils down to essentially the same
bug in the same expression, overwhelming the user in repetitive
diagnostics.
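
To make the underaligned-member case concrete, here is a minimal sketch.  It assumes the GCC/Clang packed attribute; the Packed struct and read_value helper are hypothetical, and the sketch does not use the proposed warn_if_not_aligned attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical packed struct: 'value' sits at offset 1, so it is not
// aligned the way a plain std::uint32_t object would be.
struct __attribute__((packed)) Packed {
  char tag;
  std::uint32_t value;
};

static_assert(sizeof(Packed) == 5, "no padding in a packed struct");
static_assert(offsetof(Packed, value) == 1, "value is underaligned");
static_assert(alignof(Packed) == 1, "packed lowers the alignment");

// Taking &p.value and storing it in a std::uint32_t* is the pattern that
// Clang's -Waddress-of-packed-member is described as detecting.  Copying
// the bytes out avoids forming such a pointer at all.
std::uint32_t read_value(const Packed& p) {
  std::uint32_t out;
  const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&p);
  std::memcpy(&out, bytes + offsetof(Packed, value), sizeof out);
  return out;
}
```

The two problems discussed in the thread show up at different places in such code: the reduced alignment is visible at the struct definition, while the address-of problem only appears where a pointer to the member is formed.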

Martin


Re: [PATCH 0/5 v3] Vect peeling cost model

2017-06-06 Thread Andreas Schwab
http://gcc.gnu.org/ml/gcc-testresults/2017-06/msg00297.html

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH 14/14] rs6000: Remove rs6000_nonimmediate_operand

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> Now rs6000_nonimmediate_operand is just nonimmediate_operand.
>
>
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/predicates.md (rs6000_nonimmediate_operand): Delete.
> * config/rs6000/rs6000.md (*movsi_internal1, movsi_from_sf,
> *mov_softfloat, and an anonymous splitter): Use
> nonimmediate_operand instead of rs6000_nonimmediate_operand.

Okay.

Thanks, David


Re: [PATCH 13/14] rs6000: Remove spe_acc and spefscr

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> We can also remove the two other SPE registers.
>
>
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/darwin.h (REGISTER_NAMES): Delete the SPE_ACC and
> SPEFSCR registers.
> * config/rs6000/rs6000.c (rs6000_reg_names, alt_reg_names): Ditto.
> (enum rs6000_reg_type): Delete SPE_ACC_TYPE and SPEFSCR_REG_TYPE.
> (rs6000_debug_reg_global): Adjust.
> (rs6000_init_hard_regno_mode_ok): Adjust.
> (rs6000_dbx_register_number): Adjust.
> * config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTER): Change to 115.
> (FIXED_REGISTERS, CALL_USED_REGISTERS, CALL_REALLY_USED_REGISTERS):
> Remove SPE_ACC and SPEFSCR.
> (REG_ALLOC_ORDER): Ditto.
> (FRAME_POINTER_REGNUM): Change to 111.
> (enum reg_class): Remove the SPE_ACC and SPEFSCR registers.
> (REG_CLASS_NAMES): Ditto.
> (REG_CLASS_CONTENTS): Delete the SPE_ACC and SPEFSCR registers.
> (REGISTER_NAMES): Ditto.
> (ADDITIONAL_REG_NAMES): Ditto.
> (rs6000_reg_names): Ditto.
> * config/rs6000/rs6000.md: Renumber some register number
> define_constants.

Okay.

Thanks, David


Re: [PATCH 12/14] rs6000: Remove SPE high registers

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> Now we can remove the SPE high registers.
>
>
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/darwin.h (REGISTER_NAMES): Delete the SPE high
> registers.
> * config/rs6000/rs6000.c (rs6000_reg_names, alt_reg_names): Ditto.
> * config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTER): Change from 149
> to 117.
> (DWARF_REG_TO_UNWIND_COLUMN): Do not define.
> (FIXED_REGISTERS, CALL_USED_REGISTERS, CALL_REALLY_USED_REGISTERS):
> Delete the SPE high registers.
> (REG_ALLOC_ORDER): Ditto.
> (enum reg_class): Remove SPE_HIGH_REGS.
> (REG_CLASS_NAMES): Ditto.
> (REG_CLASS_CONTENTS): Delete the SPE high registers.
> (REGISTER_NAMES): Ditto.
> (rs6000_reg_names): Ditto.
> * doc/tm.texi.in: Remove SPE as example.
> * doc/tm.texi: Regenerate.

Okay.

Thanks, David


Re: [PATCH 11/14] rs6000: Remove type attribute "brinc"

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> Nothing uses it anymore.
>
>
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/8540.md (ppc8540_brinc): Delete.
> * config/rs6000/e500mc.md (e500mc_brinc): Delete.
> * config/rs6000/e500mc64.md (e500mc64_brinc): Delete.
> * config/rs6000/rs6000.md (type): Remove "brinc".

Okay.

Thanks, David


Re: [PATCH 10/14] rs6000: Remove spe.md, spe.h, linuxspe.h

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> 2017-06-06  Segher Boessenkool  
>
> * config.gcc (powerpc*-*-*): Don't add spe.h to extra_headers.
> (powerpc*-linux*spe*): Use ${cpu_type} instead of rs6000.
> * config/rs6000/linuxspe.h: Delete file.
> * config/rs6000/rs6000.md: Don't include spe.md.
> * config/rs6000/spe.h: Delete file.
> * config/rs6000/spe.md: Delete file.
> * config/rs6000/t-rs6000: Remove spe.md.

Okay.

Thanks, David


Re: [PATCH 09/14] rs6000: Remove reg_or_none500mem_operand

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/predicates.md (reg_or_mem_operand): Reformat.
> (reg_or_none500mem_operand): Delete.
> * config/rs6000/rs6000.md (extendsfdf2): Use reg_or_mem_operand
> instead of reg_or_none500mem_operand.

Okay.

Thanks, David


Re: [PATCH 08/14] rs6000: Remove -mspe options

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/rs6000.c (rs6000_option_override_internal): Delete
> handling of SPE flags.
> * config/rs6000/rs6000.opt (-mspe, -mspe=no, -mspe=yes): Delete.

Okay.

Thanks, David


Re: [PATCH 07/14] rs6000: Remove TARGET_SPE and TARGET_SPE_ABI and friends

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/rs6000-common.c (rs6000_handle_option): Remove
> SPE ABI handling.
> * config/rs6000/paired.md (paired_negv2sf2): Rename to negv2sf2.
> (paired_absv2sf2, paired_addv2sf3, paired_subv2sf3, paired_mulv2sf3,
> paired_divv2sf3): Similar.
> * config/rs6000/predicates.md: Replace TARGET_SPE, TARGET_SPE_ABI,
> SPE_VECTOR_MODE and SPE_HIGH_REGNO_P by 0; simplify.
> * config/rs6000/rs6000-builtin.def: Delete RS6000_BUILTIN_E and
> RS6000_BUILTIN_S.
> Delete BU_SPE_1, BU_SPE_2, BU_SPE_3, BU_SPE_E, BU_SPE_P, and BU_SPE_X.
> Rename the paired_* instruction patterns.
> * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Do not
> define __SPE__.
> * config/rs6000/rs6000-protos.h (invalid_e500_subreg): Delete.
> * config/rs6000/rs6000.c: Delete RS6000_BUILTIN_E and 
> RS6000_BUILTIN_S.
> (struct rs6000_stack): Delete fields spe_gp_save_offset, spe_gp_size,
> spe_padding_size, and spe_64bit_regs_used.  Replace TARGET_SPE and
> TARGET_SPE_ABI with 0, simplify.  Replace SPE_VECTOR_MODE with
> PAIRED_VECTOR_MODE.
> (struct machine_function): Delete field spe_insn_chain_scanned_p.
> (spe_func_has_64bit_regs_p): Delete.
> (spe_expand_predicate_builtin): Delete.
> (spe_expand_evsel_builtin): Delete.
> (TARGET_DWARF_REGISTER_SPAN): Do not define.
> (TARGET_MEMBER_TYPE_FORCES_BLK): Do not define.
> (invalid_e500_subreg): Delete.
> (rs6000_legitimize_address): Always force_reg op2 as well, for
> paired single memory accesses.
> (rs6000_member_type_forces_blk): Delete.
> (rs6000_spe_function_arg): Delete.
> (rs6000_expand_unop_builtin): Delete SPE handling.
> (rs6000_expand_binop_builtin): Ditto.
> (spe_expand_stv_builtin): Delete.
> (bdesc_2arg_spe): Delete.
> (spe_expand_builtin): Delete.
> (spe_expand_predicate_builtin): Delete.
> (spe_expand_evsel_builtin): Delete.
> (rs6000_invalid_builtin): Remove RS6000_BTM_SPE handling.
> (spe_init_builtins): Delete.
> (spe_func_has_64bit_regs_p): Delete.
> (savres_routine_name): Delete "info" parameter.  Adjust callers.
> (rs6000_emit_stack_reset): Ditto.
> (rs6000_dwarf_register_span): Delete.
> * config/rs6000/rs6000.h (TARGET_SPE_ABI, TARGET_SPE,
> UNITS_PER_SPE_WORD, SPE_HIGH_REGNO_P, SPE_SIMD_REGNO_P,
> SPE_VECTOR_MODE, RS6000_BTM_SPE, RS6000_BUILTIN_E, RS6000_BUILTIN_S):
> Delete.
> * config/rs6000/rs6000.md (FIRST_SPE_HIGH_REGNO, LAST_SPE_HIGH_REGNO):
> Delete.
> * config/rs6000/rs6000.opt (-mabi=spe, -mabi=no-spe): Delete.
> * config/rs6000/spe.md: Delete every pattern that uses TARGET_SPE.
> * config/rs6000/vector.md (absv2sf2, negv2sf2, addv2sf3, subv2sf3,
> mulv2sf3, divv2sf3): Delete expanders.

Okay.

Thanks, David


Re: [PATCH 06/14] rs6000: Remove UNSPEC_MV_CR_GT

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/rs6000.md (UNSPEC_MV_CR_GT): Delete.

Okay.

Thanks, David


Re: [PATCH 05/14] rs6000: Remove output_e500_flip_gt_bit

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/rs6000-protos.h (output_e500_flip_gt_bit): Delete.
> * config/rs6000/rs6000.c: Ditto.

Okay.

Thanks, David


Re: [PATCH 04/14] rs6000: Remove rs6000_cbranch_operator

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> rs6000_cbranch_operator now is just comparison_operator, so just use
> that directly.
>
>
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/predicates.md (rs6000_cbranch_operator): Delete.
> * config/rs6000/rs6000.md: Replace rs6000_cbranch_operator by
> comparison_operator.

Okay.

Thanks, David


Re: [PATCH 03/14] rs6000: Remove -mfloat-gprs

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> This deletes -mfloat-gprs and the variables that go with it.
>
>
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/rs6000.c: Remove everything related to -mfloat-gprs.
> * config/rs6000/rs6000.opt: Ditto.
> * config/rs6000/t-rtems: Ditto.

Okay.

Thanks, David


Re: [PATCH 02/14] rs6000: Remove TARGET_E500_{SINGLE,DOUBLE}

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> Similarly, TARGET_E500_{SINGLE,DOUBLE} is always false now.
>
>
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/predicates.md: Replace TARGET_E500_DOUBLE and
> TARGET_E500_SINGLE by 0, simplify.
> * config/rs6000/rs6000.c: Ditto.
> (rs6000_option_override_internal): Delete CHECK_E500_OPTIONS.
> (spe_build_register_parallel): Delete.
> * config/rs6000/rs6000.h: Delete TARGET_E500_SINGLE,
> TARGET_E500_DOUBLE, and CHECK_E500_OPTIONS.
> * config/rs6000/rs6000.md: Replace TARGET_E500_DOUBLE,
> TARGET_E500_SINGLE, and  by 0, simplify.
> (E500_CONVERT): Delete.
> * config/rs6000/spe.md: Remove many patterns and all define_constants.

Okay.

Thanks, David


Re: [PATCH 01/14] rs6000: Remove TARGET_FPRS

2017-06-06 Thread David Edelsohn
On Tue, Jun 6, 2017 at 11:56 AM, Segher Boessenkool
 wrote:
> Since rs6000 no longer supports SPE, TARGET_FPRS now always is true.
>
> This makes TARGET_{SF,DF}_SPE always false.  Many patterns in spe.md
> can now be deleted; which makes it possible to merge e.g. negdd2 with
> *negdd2_fpr.
>
> Finally, e500.h is deleted (it isn't used).
>
>
> 2017-06-06  Segher Boessenkool  
>
> * config/rs6000/darwin.md: Replace TARGET_FPRS by 1 and simplify.
> * config/rs6000/dfp.md: Ditto.
> (negdd2, *negdd2_fpr): Merge.
> (absdd2, *absdd2_fpr): Merge.
> (negtd2, *negtd2_fpr): Merge.
> (abstd2, *abstd2_fpr): Merge.
> * config/rs6000/e500.h: Delete file.
> * config/rs6000/predicates.md (rs6000_cbranch_operator): Replace
> TARGET_FPRS by 1 and simplify.
> * config/rs6000/rs6000-c.c: Ditto.
> * config/rs6000/rs6000.c: Ditto.  Also replace TARGET_SF_SPE and
> TARGET_DF_SPE by 0.
> * config/rs6000/rs6000.h: Ditto.  Delete TARGET_SF_SPE and
> TARGET_DF_SPE.
> * config/rs6000/rs6000.md: Ditto.
> (floatdidf2, *floatdidf2_fpr): Merge.
> (move_from_CR_gt_bit): Delete.
> * config/rs6000/spe.md: Replace TARGET_FPRS by 1 and simplify.
> (E500_CR_IOR_COMPARE): Delete.
> (All patterns that require !TARGET_FPRS): Delete.
> * config/rs6000/vsx.md: Replace TARGET_FPRS by 1 and simplify.

Okay.

Thanks, David


Re: RFC: [PATCH] Add warn_if_not_aligned attribute

2017-06-06 Thread H.J. Lu
On Tue, Jun 6, 2017 at 9:10 AM, Martin Sebor  wrote:
> On 06/06/2017 10:07 AM, Martin Sebor wrote:
>>
>> On 06/05/2017 11:45 AM, H.J. Lu wrote:
>>>
>>> On Mon, Jun 5, 2017 at 8:11 AM, Joseph Myers 
>>> wrote:

>>>> The new attribute needs documentation.  Should the test be in
>>>> c-c++-common
>>>
>>>
>>> This feature does support C++.  But C++ compiler issues a slightly
>>> different warning at a different location.
>>>
>>>> or does this feature not support C++?

>>>
>>> Here is the updated patch with documentation and a C++ test.  This
>>> patch caused a few testsuite failures:
>>>
>>> FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o compile
>>>
>>>
>>> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/compat//struct-align-1.h:169:1:
>>>
>>> warning: alignment 1 of 'struct B2_m_inner_p_outer' is less than 16
>>>
>>> FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)
>>>
>>>
>>> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/torture/pr80334.C:4:8:
>>>
>>> warning: alignment 1 of 'B' is less than 16
>>>
>>
>> Users often want the ability to control a warning, even when it
>> certainly indicates a bug.  I would suggest adding an option to
>> make that possible for this warning as well.
>>
>> Btw., a bug related to some of those this warning is meant to
>> detect is assigning the address of an underaligned object to
>> a pointer of a natively aligned type.  Clang has an option
>> to detect this problem: -Waddress-of-packed-member.  It might
>> make a nice follow-on enhancement to add support for the same
>> thing.  I mention this because I think it would make sense to
>> consider this when choosing the name of the GCC option (i.e.,
>> rather than having two distinct but closely related warnings,
>> have one that detects both of these types of alignment bugs).
>
>
> A bug that has some additional context on this is pr 51628.
> A possible name for the new option suggested there is -Wpacked.
>
> Martin

Isn't -Waddress-of-packed-member a subset of or the same as
-Wpacked?

-- 
H.J.


Re: [PATCH] handle bzero/bcopy in DSE and aliasing (PR 80933, 80934)

2017-06-06 Thread Jeff Law
On 06/04/2017 09:36 AM, Bernhard Reutner-Fischer wrote:
> On 2 June 2017 13:12:41 CEST, Richard Biener  
> wrote:
> 
>> Note I'd be _much_ more sympathetic to simply canonicalizing all of
>> bzero and bcopy
>> to memset / memmove and be done with all the above complexity.
> 
> Indeed and even more so since SUSv3 marked it LEGACY and both were removed in 
> SUSv4.
> thanks,
Likewise.
jeff


Re: C PATCH to improve enum and struct redefinition diagnostic (PR c/79983)

2017-06-06 Thread Joseph Myers
On Tue, 6 Jun 2017, Marek Polacek wrote:

> 2017-06-06  Marek Polacek  
> 
>   PR c/79983
>   * c-decl.c (start_struct): Use the location of TYPE_STUB_DECL of
>   ref.
>   (start_enum): Use the location of TYPE_STUB_DECL of enumtype.
> 
>   * gcc.dg/pr79983.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


C PATCH to improve enum and struct redefinition diagnostic (PR c/79983)

2017-06-06 Thread Marek Polacek
This patch improves the enum and struct redefinition diagnostics.  In
particular, we'll now point to the first definition in the "originally
defined here" note, not to the forward declaration.

Now, you could argue that we don't have to be setting the location by lookup_tag
at all, because we should always have the location in TYPE_STUB_DECL, and then
we could get rid of that lookup_tag parameter completely, but I think this is
good enough for now.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-06-06  Marek Polacek  

PR c/79983
* c-decl.c (start_struct): Use the location of TYPE_STUB_DECL of
ref.
(start_enum): Use the location of TYPE_STUB_DECL of enumtype.

* gcc.dg/pr79983.c: New test.

diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index f2b8096..3a0a4f5 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -7453,6 +7453,9 @@ start_struct (location_t loc, enum tree_code code, tree name,
 ref = lookup_tag (code, name, true, );
   if (ref && TREE_CODE (ref) == code)
 {
+  if (TYPE_STUB_DECL (ref))
+   refloc = DECL_SOURCE_LOCATION (TYPE_STUB_DECL (ref));
+
   if (TYPE_SIZE (ref))
{
  if (code == UNION_TYPE)
@@ -8185,7 +8188,10 @@ start_enum (location_t loc, struct c_enum_contents *the_enum, tree name)
   /* Update type location to the one of the definition, instead of e.g.
  a forward declaration.  */
   else if (TYPE_STUB_DECL (enumtype))
-DECL_SOURCE_LOCATION (TYPE_STUB_DECL (enumtype)) = loc;
+{
+  enumloc = DECL_SOURCE_LOCATION (TYPE_STUB_DECL (enumtype));
+  DECL_SOURCE_LOCATION (TYPE_STUB_DECL (enumtype)) = loc;
+}
 
   if (C_TYPE_BEING_DEFINED (enumtype))
 error_at (loc, "nested redefinition of %", name);
diff --git gcc/testsuite/gcc.dg/pr79983.c gcc/testsuite/gcc.dg/pr79983.c
index e69de29..84aae69 100644
--- gcc/testsuite/gcc.dg/pr79983.c
+++ gcc/testsuite/gcc.dg/pr79983.c
@@ -0,0 +1,15 @@
+/* PR c/79983 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+struct S;
+struct S { int i; }; /* { dg-message "originally defined here" } */
+struct S { int i, j; }; /* { dg-error "redefinition of 'struct S'" } */
+
+enum E;
+enum E { A, B, C }; /* { dg-message "originally defined here" } */
+enum E { D, F }; /* { dg-error "nested redefinition of 'enum E'|redeclaration of 'enum E'" } */
+
+union U;
+union U { int i; }; /* { dg-message "originally defined here" } */
+union U { int i; double d; }; /* { dg-error "redefinition of 'union U'" } */

Marek


Re: [PATCH, rs6000] Fold vector shifts in GIMPLE

2017-06-06 Thread Will Schmidt
On Thu, 2017-06-01 at 10:15 -0500, Bill Schmidt wrote:
> > On Jun 1, 2017, at 2:48 AM, Richard Biener  
> > wrote:
> > 
> > On Wed, May 31, 2017 at 10:01 PM, Will Schmidt
> >  wrote:
> >> Hi,
> >> 
> >> Add support for early expansion of vector shifts.  Including
> >> vec_sl (shift left), vec_sr (shift right), vec_sra (shift
> >> right algebraic), vec_rl (rotate left).
> >> Part of this includes adding the vector shift right instructions to
> >> the list of those instructions having an unsigned second argument.
> >> 
> >> The VSR (vector shift right) folding is a bit more complex than
> >> the others. This is due to requiring arg0 be unsigned for an algebraic
> >> shift before the gimple RSHIFT_EXPR assignment is built.
> > 
> > Jakub, do we sanitize that undefinedness of left shifts of negative values
> > and/or overflow of left shift of nonnegative values?


On Thu, 2017-06-01 at 10:17 +0200, Jakub Jelinek wrote:
> We don't yet, see PR77823 - all I've managed to do before stage1 was over
> was instrumentation of signed arithmetic integer overflow on vectors,
> division, shift etc. are tasks maybe for this stage1.
> 
> That said, shift instrumentation in particular is done early because every
> FE has different rules, and so if it is coming from target builtins that are
> folded into something, it wouldn't be instrumented anyway. 


On Thu, 2017-06-01 at 10:15 -0500, Bill Schmidt wrote:
> > 
> > Will, how is that defined in the intrinsics operation?  It might need
> > similar treatment as the abs case.
> 
> Answering for Will -- vec_sl is defined to simply shift bits off the end to the
> left and fill with zeros from the right, regardless of whether the source type
> is signed or unsigned.  The result type is signed iff the source type is
> signed.  So a negative value can become positive as a result of the
> operation.
> 
> The same is true of vec_rl, which will naturally rotate bits regardless of
> signedness.


> > 
> > [I'd rather make the negative left shift case implementation defined
> > given C and C++ standards
> > do not agree to 100% AFAIK]

With the above answers, how does this one stand?

[ I have no issue adding the TYPE_OVERFLOW_WRAPS logic to treat some of
the cases differently, I'm just unclear on whether none/some/all of the
shifts will require that logic.  :-) ]

thanks,
-Will
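
For reference, the vec_sl semantics Bill describes above can be modelled
per lane in plain C.  This is only a scalar sketch, not the rs6000
folding code: the helper name lane_shift_left is hypothetical, the 32-bit
lane width is picked for illustration, and masking the count to its low
five bits is an assumption about how the word-sized shift treats its
count operand.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of vec_sl.  */
static int32_t
lane_shift_left (int32_t value, uint32_t count)
{
  /* Shift in the unsigned domain so that bits leaving the top are
     simply dropped and zeros enter from the right, with no signed
     overflow involved.  The count mask is an assumption (see above).  */
  uint32_t bits = (uint32_t) value << (count & 31u);

  /* The result keeps the signedness of the source.  Converting an
     out-of-range value back to int32_t is implementation-defined in
     ISO C; GCC documents it as reduction modulo 2^32.  */
  return (int32_t) bits;
}
```

In this model a positive input such as 0x40000000 shifted by one comes
out negative, and a negative input can come out positive, which is the
wrapping behaviour the TYPE_OVERFLOW_WRAPS question is about.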




> > 
> > Richard.
> > 
> >> [gcc]
> >> 
> >> 2017-05-26  Will Schmidt  
> >> 
> >>* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
> >>for early expansion of vector shifts (sl,sr,sra,rl).
> >>(builtin_function_type): Add vector shift right instructions
> >>to the unsigned argument list.
> >> 
> >> [gcc/testsuite]
> >> 
> >> 2017-05-26  Will Schmidt  
> >> 
> >>* testsuite/gcc.target/powerpc/fold-vec-shift-char.c: New.
> >>* testsuite/gcc.target/powerpc/fold-vec-shift-int.c: New.
> >>* testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c: New.
> >>* testsuite/gcc.target/powerpc/fold-vec-shift-short.c: New.
> >> 
> >> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> >> index 8adbc06..6ee0bfd 100644
> >> --- a/gcc/config/rs6000/rs6000.c
> >> +++ b/gcc/config/rs6000/rs6000.c
> >> @@ -17408,6 +17408,76 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> >>gsi_replace (gsi, g, true);
> >>return true;
> >>   }
> >> +/* Flavors of vec_rotate_left . */
> >> +case ALTIVEC_BUILTIN_VRLB:
> >> +case ALTIVEC_BUILTIN_VRLH:
> >> +case ALTIVEC_BUILTIN_VRLW:
> >> +case P8V_BUILTIN_VRLD:
> >> +  {
> >> +   arg0 = gimple_call_arg (stmt, 0);
> >> +   arg1 = gimple_call_arg (stmt, 1);
> >> +   lhs = gimple_call_lhs (stmt);
> >> +   gimple *g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1);
> >> +   gimple_set_location (g, gimple_location (stmt));
> >> +   gsi_replace (gsi, g, true);
> >> +   return true;
> >> +  }
> >> +  /* Flavors of vector shift right algebraic.  vec_sra{b,h,w} -> vsra{b,h,w}. */
> >> +case ALTIVEC_BUILTIN_VSRAB:
> >> +case ALTIVEC_BUILTIN_VSRAH:
> >> +case ALTIVEC_BUILTIN_VSRAW:
> >> +case P8V_BUILTIN_VSRAD:
> >> +  {
> >> +   arg0 = gimple_call_arg (stmt, 0);
> >> +   arg1 = gimple_call_arg (stmt, 1);
> >> +   lhs = gimple_call_lhs (stmt);
> >> +   gimple *g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
> >> +   gimple_set_location (g, gimple_location (stmt));
> >> +   gsi_replace (gsi, g, true);
> >> +   return true;
> >> +  }
> >> +   /* Flavors of vector shift left.  builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
> >> +case ALTIVEC_BUILTIN_VSLB:
> >> +case ALTIVEC_BUILTIN_VSLH:
> >> +case ALTIVEC_BUILTIN_VSLW:
> >> +case P8V_BUILTIN_VSLD:
> >> +  {
> >> +   arg0 = gimple_call_arg (stmt, 0);
> >> +  

Re: Fix PR rtl-optimization/80474

2017-06-06 Thread Jeff Law
On 06/05/2017 04:40 PM, Eric Botcazou wrote:
> This is a regression present on the 6 branch for MIPS (and latent on the 7 
> branch and mainline).  The reorg pass generates wrong code for the attached 
> testcase with a combination of options under the form of a double lo_sum(sym) 
> instruction for a single hi(sym).
> 
> The pass starts from a convoluted CFG with 4 instances of the same pair:
> 
>   set $16,hi(sym)  (1)
>   set $16,lo_sum($16,sym)  (2)
> 
> and generates wrong code after 12 steps: first, the 4 (1) instructions are 
> moved into delay slots, then one of them is stolen and moved into another 
> slot, then a second one is deleted, then a third one is also stolen, and then 
> the first stolen one is deleted.  Then 3 (2) instructions are moved into the 
> same delay slots (again empty) as the (1) and, finally, one of them is stolen.
> 
> All this dance considerably changes the live ranges of the $16 register, in 
> particular the deletion of 2 (1) instructions.  The strategy of reorg is not 
> to update the liveness info, but to leave markers instead so that it can be 
> recomputed locally.  But there is a hitch: it doesn't leave a marker when
> 
>   /* Ignore if this was in a delay slot and it came from the target of
>  a branch.  */
>   if (INSN_FROM_TARGET_P (insn))
> return;
> 
> This exception has been there since the dawn of time and I can guess what kind
> of reasoning led to it, but it's probably valid only for simple situations and
> not for the kind of big transformations described above.
> 
> Lifting it fixes the wrong code because it leaves the necessary markers when 
> instructions that were stolen are then deleted.  Surprisingly(?) enough, it 
> doesn't seem to have much effect outside this case (e.g. 0 changes for the 
> entire compile.exp testsuite at -O2 on SPARC and virtually same cc1 binaries).
> 
> Tested on SPARC/Solaris, objections to applying it on mainline,7 & 6 branches?
> 
> 
> 2017-06-06  Eric Botcazou  
> 
>   PR rtl-optimization/80474
>   * reorg.c (update_block): Do not ignore instructions in a delay slot.
> 
I'll trust your judgement on this one...  The updating parts of reorg.c
were always IMHO sketchy and anything which brings more consistency to
the update mechanism is a step forward -- and IMHO this patch fits that
category.

jeff


Re: Reorganization of profile count maintenance code, part 1

2017-06-06 Thread Segher Boessenkool
Hi!

On Thu, Jun 01, 2017 at 01:35:56PM +0200, Jan Hubicka wrote:
> +  /* FIXME: shrink wrapping violates this sanity check.  */
> +  gcc_checking_assert ((num >= 0
> + && (num <= REG_BR_PROB_BASE
> + || den <= REG_BR_PROB_BASE)
> + && den > 0) || 1);
> +  ret.m_val = RDIV (m_val * num, den);
> +  return ret;

Sorry if I missed this...  But where/how does it violate this?

> +  /* Take care for overflows!  */
> +  if (num.m_val < 8196 || m_val < 8196)
> +ret.m_val = RDIV (m_val * num.m_val, den.m_val);
> +  else
> +ret.m_val = RDIV (m_val * RDIV (num.m_val * 8196, den.m_val), 8196);
> +  return ret;

8196 is a strange number, did you mean 8192?


Segher
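
To illustrate why a power of two is the expected constant here, below is
a standalone sketch of the same overflow-avoiding scaling written with
8192 (2^13).  It is not the code from the patch: scale_count, rdiv and
SCALE are hypothetical names, unsigned 64-bit arithmetic stands in for
the patch's own types, and the sketch assumes num <= den and that all
inputs are below 2^50.

```c
#include <stdint.h>

/* 2^13: used both as the "small operand" threshold and as the
   fixed-point scale of the intermediate ratio.  */
#define SCALE 8192u

/* Division rounded to nearest, in the spirit of the RDIV macro.  */
static uint64_t
rdiv (uint64_t x, uint64_t y)
{
  return (x + y / 2) / y;
}

/* Compute val * num / den without overflowing 64 bits, under the
   assumptions stated above.  */
static uint64_t
scale_count (uint64_t val, uint64_t num, uint64_t den)
{
  /* If either factor is below 2^13, the product stays below 2^63.  */
  if (num < SCALE || val < SCALE)
    return rdiv (val * num, den);

  /* Otherwise reduce num/den to a 13-bit fixed-point ratio first,
     trading a little precision for a product that cannot overflow.  */
  return rdiv (val * rdiv (num * SCALE, den), SCALE);
}
```

With a power of two the threshold corresponds to a whole number of bits
of headroom, which is presumably what the 8196 in the patch was meant
to express.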


Re: RFC: [PATCH] Add warn_if_not_aligned attribute

2017-06-06 Thread Martin Sebor

On 06/06/2017 10:07 AM, Martin Sebor wrote:

On 06/05/2017 11:45 AM, H.J. Lu wrote:

On Mon, Jun 5, 2017 at 8:11 AM, Joseph Myers 
wrote:

The new attribute needs documentation.  Should the test be in
c-c++-common


This feature does support C++.  But C++ compiler issues a slightly
different warning at a different location.


or does this feature not support C++?



Here is the updated patch with documentation and a C++ test.  This
patch caused a few testsuite failures:

FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o compile

/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/compat//struct-align-1.h:169:1:

warning: alignment 1 of 'struct B2_m_inner_p_outer' is less than 16

FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)

/export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/torture/pr80334.C:4:8:

warning: alignment 1 of 'B' is less than 16



Users often want the ability to control a warning, even when it
certainly indicates a bug.  I would suggest adding an option to
make that possible for this warning as well.

Btw., a bug related to some of those this warning is meant to
detect is assigning the address of an underaligned object to
a pointer of a natively aligned type.  Clang has an option
to detect this problem: -Waddress-of-packed-member.  It might
make a nice follow-on enhancement to add support for the same
thing.  I mention this because I think it would make sense to
consider this when choosing the name of the GCC option (i.e.,
rather than having two distinct but closely related warnings,
have one that detects both of these types of alignment bugs).


A bug that has some additional context on this is pr 51628.
A possible name for the new option suggested there is -Wpacked.

Martin


Re: RFC: [PATCH] Add warn_if_not_aligned attribute

2017-06-06 Thread Martin Sebor

On 06/05/2017 11:45 AM, H.J. Lu wrote:

On Mon, Jun 5, 2017 at 8:11 AM, Joseph Myers  wrote:

The new attribute needs documentation.  Should the test be in c-c++-common


This feature does support C++.  But C++ compiler issues a slightly
different warning at a different location.


or does this feature not support C++?



Here is the updated patch with documentation and a C++ test.  This
patch caused a few testsuite failures:

FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o compile

/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/compat//struct-align-1.h:169:1:
warning: alignment 1 of 'struct B2_m_inner_p_outer' is less than 16

FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)

/export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/torture/pr80334.C:4:8:
warning: alignment 1 of 'B' is less than 16



Users often want the ability to control a warning, even when it
certainly indicates a bug.  I would suggest adding an option to
make that possible for this warning as well.

Btw., a bug related to some of those this warning is meant to
detect is assigning the address of an underaligned object to
a pointer of a natively aligned type.  Clang has an option
to detect this problem: -Waddress-of-packed-member.  It might
make a nice follow-on enhancement to add support for the same
thing.  I mention this because I think it would make sense to
consider this when choosing the name of the GCC option (i.e.,
rather than having two distinct but closely related warnings,
have one that detects both of these types of alignment bugs).

Martin
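
A minimal sketch of the kind of bug described above, assuming the
GCC/Clang packed attribute and a target where int has a natural
alignment greater than 1.  The struct and helper names are hypothetical
and are not taken from the patch or from PR 51628.

```c
#include <stddef.h>

/* Hypothetical packed record: 'value' is placed at offset 1, so it
   does not have the natural alignment of int.  */
struct __attribute__ ((packed)) record
{
  char tag;
  int value;
};

/* Return nonzero if 'value' sits at an offset that is not a multiple
   of int's natural alignment.  When this holds, an expression such as
       int *p = &r.value;
   stores the address of an underaligned object in a pointer to a
   naturally aligned type, which is the pattern that Clang's
   -Waddress-of-packed-member is meant to flag.  */
static int
value_is_underaligned (void)
{
  return offsetof (struct record, value) % _Alignof (int) != 0;
}
```

The check is done with offsetof rather than by actually taking the
member's address, so the sketch itself stays free of the problem it
describes.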


[PATCH 14/14] rs6000: Remove rs6000_nonimmediate_operand

2017-06-06 Thread Segher Boessenkool
Now rs6000_nonimmediate_operand is just nonimmediate_operand.


2017-06-06  Segher Boessenkool  

* config/rs6000/predicates.md (rs6000_nonimmediate_operand): Delete.
* config/rs6000/rs6000.md (*movsi_internal1, movsi_from_sf,
*mov_softfloat, and an anonymous splitter): Use
nonimmediate_operand instead of rs6000_nonimmediate_operand.

---
 gcc/config/rs6000/predicates.md |  7 ---
 gcc/config/rs6000/rs6000.md | 14 +++---
 2 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 1bf9194..aa1c01b 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1150,13 +1150,6 @@ (define_predicate "splat_input_operand"
   return gpc_reg_operand (op, mode);
 })
 
-;; Return true if OP is a non-immediate operand.
-(define_predicate "rs6000_nonimmediate_operand"
-  (match_code "reg,subreg,mem")
-{
-  return nonimmediate_operand (op, mode);
-})
-
 ;; Return true if operand is an operator used in rotate-and-mask instructions.
 (define_predicate "rotate_mask_operator"
   (match_code "rotate,ashift,lshiftrt"))
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d0120d1..1bb565a 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6693,7 +6693,7 @@ (define_insn "movsi_low"
 ;; XXLXOR 0 XXLORC -1P9 const MTVSRWZ  MFVSRWZ
 ;; MF%1 MT%0 MT%0 NOP
 (define_insn "*movsi_internal1"
-  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand"
+  [(set (match_operand:SI 0 "nonimmediate_operand"
"=r, r,   r,   ?*wI,?*wH,
 m,  ?Z,  ?Z,  r,   r,
 r,  ?*wIwH,  ?*wJwK,  ?*wJwK,  ?*wu,
@@ -6749,7 +6749,7 @@ (define_insn "*movsi_internal1"
 4,  4,   4,   4")])
 
 (define_insn "*movsi_internal1_single"
-  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "=r,r,r,m,r,r,r,r,*c*l,*h,*h,m,*f")
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,m,r,r,r,r,*c*l,*h,*h,m,*f")
 (match_operand:SI 1 "input_operand" "r,U,m,r,I,L,n,*h,r,r,0,f,m"))]
   "TARGET_SINGLE_FPU &&
(gpc_reg_operand (operands[0], SImode) || gpc_reg_operand (operands[1], SImode))"
@@ -6790,7 +6790,7 @@ (define_insn "*movsi_internal1_single"
 ;; VSX->VSX
 
 (define_insn_and_split "movsi_from_sf"
-  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand"
+  [(set (match_operand:SI 0 "nonimmediate_operand"
"=r, r,   ?*wI,?*wH, m,
 m,  wY,  Z,   r,wIwH,
 ?wK")
@@ -7236,7 +7236,7 @@ (define_insn "*mov_softfloat"
 ;; LWZ  LFSLXSSP  LXSSPX STWSTFIWX
 ;; STXSIWX  GPR->VSX   VSX->GPR   GPR->GPR
 (define_insn_and_split "movsf_from_si"
-  [(set (match_operand:SF 0 "rs6000_nonimmediate_operand"
+  [(set (match_operand:SF 0 "nonimmediate_operand"
"=!r,   f, wb,wu,m, Z,
 Z, wy,?r,!r")
 
@@ -7521,7 +7521,7 @@ (define_insn_and_split "*mov_32bit"
   [(set_attr "length" "8,8,8,8,20,20,16")])
 
 (define_insn_and_split "*mov_softfloat"
-  [(set (match_operand:FMOVE128 0 "rs6000_nonimmediate_operand" "=Y,r,r")
+  [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=Y,r,r")
(match_operand:FMOVE128 1 "input_operand" "r,YGHF,r"))]
   "TARGET_SOFT_FLOAT
&& (gpc_reg_operand (operands[0], mode)
@@ -8463,7 +8463,7 @@ (define_insn "p8_mfvsrd_4_disf"
 ;;AVX const  
 
 (define_insn "*movdi_internal32"
-  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
+  [(set (match_operand:DI 0 "nonimmediate_operand"
  "=Y,r, r, ^m,^d, ^d,
   r, ^wY,   $Z,^wb,   $wv,^wi,
   *wo,   *wo,   *wv,   *wi,   *wi,*wv,
@@ -8525,7 +8525,7 @@ (define_split
 }")
 
 (define_split
-  [(set (match_operand:DIFD 0 "rs6000_nonimmediate_operand" "")
+  [(set (match_operand:DIFD 0 "nonimmediate_operand" "")
 (match_operand:DIFD 1 "input_operand" ""))]
   "reload_completed && !TARGET_POWERPC64
&& gpr_or_gpr_p (operands[0], operands[1])
-- 
1.9.3



[PATCH 13/14] rs6000: Remove spe_acc and spefscr

2017-06-06 Thread Segher Boessenkool
We can also remove the two other SPE registers.


2017-06-06  Segher Boessenkool  

* config/rs6000/darwin.h (REGISTER_NAMES): Delete the SPE_ACC and
SPEFSCR registers.
* config/rs6000/rs6000.c (rs6000_reg_names, alt_reg_names): Ditto.
(enum rs6000_reg_type): Delete SPE_ACC_TYPE and SPEFSCR_REG_TYPE.
(rs6000_debug_reg_global): Adjust.
(rs6000_init_hard_regno_mode_ok): Adjust.
(rs6000_dbx_register_number): Adjust.
* config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTER): Change to 115.
(FIXED_REGISTERS, CALL_USED_REGISTERS, CALL_REALLY_USED_REGISTERS):
Remove SPE_ACC and SPEFSCR.
(REG_ALLOC_ORDER): Ditto.
(FRAME_POINTER_REGNUM): Change to 111.
(enum reg_class): Remove the SPE_ACC and SPEFSCR registers.
(REG_CLASS_NAMES): Ditto.
(REG_CLASS_CONTENTS): Delete the SPE_ACC and SPEFSCR registers.
(REGISTER_NAMES): Ditto.
(ADDITIONAL_REG_NAMES): Ditto.
(rs6000_reg_names): Ditto.
* config/rs6000/rs6000.md: Renumber some register number
define_constants.

---
 gcc/config/rs6000/darwin.h  |  1 -
 gcc/config/rs6000/rs6000.c  | 16 ---
 gcc/config/rs6000/rs6000.h  | 48 +
 gcc/config/rs6000/rs6000.md | 10 --
 4 files changed, 22 insertions(+), 53 deletions(-)

diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index 2422f25..90fc757 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -192,7 +192,6 @@ extern int darwin_emit_branch_islands;
 "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23", \
 "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31", \
 "vrsave", "vscr",  \
-"spe_acc", "spefscr",   \
 "sfp", \
 "tfhar", "tfiar", "texasr" \
 }
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 426400c..415ac1b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -408,8 +408,6 @@ enum rs6000_reg_type {
   FPR_REG_TYPE,
   SPR_REG_TYPE,
   CR_REG_TYPE,
-  SPE_ACC_TYPE,
-  SPEFSCR_REG_TYPE
 };
 
 /* Map register class to register type.  */
@@ -1473,8 +1471,6 @@ char rs6000_reg_names[][8] =
   "16", "17", "18", "19", "20", "21", "22", "23",
   "24", "25", "26", "27", "28", "29", "30", "31",
   "vrsave", "vscr",
-  /* SPE registers.  */
-  "spe_acc", "spefscr",
   /* Soft frame pointer.  */
   "sfp",
   /* HTM SPR registers.  */
@@ -1501,8 +1497,6 @@ static const char alt_reg_names[][8] =
   "%v16", "%v17", "%v18", "%v19", "%v20", "%v21", "%v22", "%v23",
   "%v24", "%v25", "%v26", "%v27", "%v28", "%v29", "%v30", "%v31",
   "vrsave", "vscr",
-  /* SPE registers.  */
-  "spe_acc", "spefscr",
   /* Soft frame pointer.  */
   "sfp",
   /* HTM SPR registers.  */
@@ -2470,8 +2464,6 @@ rs6000_debug_reg_global (void)
   rs6000_debug_reg_print (CA_REGNO, CA_REGNO, "ca");
   rs6000_debug_reg_print (VRSAVE_REGNO, VRSAVE_REGNO, "vrsave");
   rs6000_debug_reg_print (VSCR_REGNO, VSCR_REGNO, "vscr");
-  rs6000_debug_reg_print (SPE_ACC_REGNO, SPE_ACC_REGNO, "spe_a");
-  rs6000_debug_reg_print (SPEFSCR_REGNO, SPEFSCR_REGNO, "spe_f");
 
   fputs ("\nVirtual/stack/frame registers:\n", stderr);
   for (v = 0; v < ARRAY_SIZE (virtual_regs); v++)
@@ -2980,8 +2972,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   rs6000_regno_regclass[CA_REGNO] = NO_REGS;
   rs6000_regno_regclass[VRSAVE_REGNO] = VRSAVE_REGS;
   rs6000_regno_regclass[VSCR_REGNO] = VRSAVE_REGS;
-  rs6000_regno_regclass[SPE_ACC_REGNO] = SPE_ACC_REGS;
-  rs6000_regno_regclass[SPEFSCR_REGNO] = SPEFSCR_REGS;
   rs6000_regno_regclass[TFHAR_REGNO] = SPR_REGS;
   rs6000_regno_regclass[TFIAR_REGNO] = SPR_REGS;
   rs6000_regno_regclass[TEXASR_REGNO] = SPR_REGS;
@@ -3004,8 +2994,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   reg_class_to_reg_type[(int)LINK_OR_CTR_REGS] = SPR_REG_TYPE;
   reg_class_to_reg_type[(int)CR_REGS] = CR_REG_TYPE;
   reg_class_to_reg_type[(int)CR0_REGS] = CR_REG_TYPE;
-  reg_class_to_reg_type[(int)SPE_ACC_REGS] = SPE_ACC_TYPE;
-  reg_class_to_reg_type[(int)SPEFSCR_REGS] = SPEFSCR_REG_TYPE;
 
   if (TARGET_VSX)
 {
@@ -37363,10 +37351,6 @@ rs6000_dbx_register_number (unsigned int regno, unsigned int format)
 return 356;
   if (regno == VSCR_REGNO)
 return 67;
-  if (regno == SPE_ACC_REGNO)
-return 99;
-  if (regno == SPEFSCR_REGNO)
-return 612;
 #endif
   return regno;
 }
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index a154c5d..edfa546 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1017,7 +1017,7 @@ enum data_align { align_abi, align_opt, align_both };
 
The 3 HTM registers 

[PATCH 11/14] rs6000: Remove type attribute "brinc"

2017-06-06 Thread Segher Boessenkool
Nothing uses it anymore.


2017-06-06  Segher Boessenkool  

* config/rs6000/8540.md (ppc8540_brinc): Delete.
* config/rs6000/e500mc.md (e500mc_brinc): Delete.
* config/rs6000/e500mc64.md (e500mc64_brinc): Delete.
* config/rs6000/rs6000.md (type): Remove "brinc".

---
 gcc/config/rs6000/8540.md | 6 --
 gcc/config/rs6000/e500mc.md   | 6 --
 gcc/config/rs6000/e500mc64.md | 6 --
 gcc/config/rs6000/rs6000.md   | 1 -
 4 files changed, 19 deletions(-)

diff --git a/gcc/config/rs6000/8540.md b/gcc/config/rs6000/8540.md
index fae369d..7b91b5b 100644
--- a/gcc/config/rs6000/8540.md
+++ b/gcc/config/rs6000/8540.md
@@ -182,12 +182,6 @@ (define_insn_reservation "ppc8540_float_vector_divide" 29
   "ppc8540_decode,ppc8540_issue+ppc8540_mu_stage0+ppc8540_mu_div,\
ppc8540_mu_div*28")
 
-;; Brinc
-(define_insn_reservation "ppc8540_brinc" 1
-  (and (eq_attr "type" "brinc")
-   (eq_attr "cpu" "ppc8540,ppc8548"))
-  "ppc8540_decode,ppc8540_issue+ppc8540_su_stage0+ppc8540_retire")
-
 ;; Simple vector
 (define_insn_reservation "ppc8540_simple_vector" 1
   (and (eq_attr "type" "vecsimple,veclogical,vecmove")
diff --git a/gcc/config/rs6000/e500mc.md b/gcc/config/rs6000/e500mc.md
index 9878aaa..9f7f884 100644
--- a/gcc/config/rs6000/e500mc.md
+++ b/gcc/config/rs6000/e500mc.md
@@ -132,12 +132,6 @@ (define_insn_reservation "e500mc_mtjmpr" 1
(eq_attr "cpu" "ppce500mc"))
   "e500mc_decode,e500mc_issue+e500mc_su_stage0+e500mc_retire")
 
-;; Brinc.
-(define_insn_reservation "e500mc_brinc" 1
-  (and (eq_attr "type" "brinc")
-   (eq_attr "cpu" "ppce500mc"))
-  "e500mc_decode,e500mc_issue+e500mc_su_stage0+e500mc_retire")
-
 ;; Loads.
 (define_insn_reservation "e500mc_load" 3
   (and (eq_attr "type" "load,load_l,sync")
diff --git a/gcc/config/rs6000/e500mc64.md b/gcc/config/rs6000/e500mc64.md
index 366b4c4..6f1ec81 100644
--- a/gcc/config/rs6000/e500mc64.md
+++ b/gcc/config/rs6000/e500mc64.md
@@ -151,12 +151,6 @@ (define_insn_reservation "e500mc64_mtjmpr" 1
(eq_attr "cpu" "ppce500mc64"))
   "e500mc64_decode,e500mc64_issue+e500mc64_su_stage0+e500mc64_retire")
 
-;; Brinc.
-(define_insn_reservation "e500mc64_brinc" 1
-  (and (eq_attr "type" "brinc")
-   (eq_attr "cpu" "ppce500mc64"))
-  "e500mc64_decode,e500mc64_issue+e500mc64_su_stage0+e500mc64_retire")
-
 ;; Loads.
 (define_insn_reservation "e500mc64_load" 3
   (and (eq_attr "type" "load,load_l,sync")
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ec25f45..9cf761c 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -179,7 +179,6 @@ (define_attr "type"
branch,jmpreg,mfjmpr,mtjmpr,trap,isync,sync,load_l,store_c,
cr_logical,delayed_cr,mfcr,mfcrf,mtcr,
fpcompare,fp,fpsimple,dmul,sdiv,ddiv,ssqrt,dsqrt,
-   brinc,
vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,
vecfloat,vecfdiv,vecdouble,mffgpr,mftgpr,crypto,
veclogical,veccmpfx,vecexts,vecmove,
-- 
1.9.3



[PATCH 12/14] rs6000: Remove SPE high registers

2017-06-06 Thread Segher Boessenkool
Now we can remove the SPE high registers.


2017-06-06  Segher Boessenkool  

* config/rs6000/darwin.h (REGISTER_NAMES): Delete the SPE high
registers.
* config/rs6000/rs6000.c (rs6000_reg_names, alt_reg_names): Ditto.
* config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTER): Change from 149
to 117.
(DWARF_REG_TO_UNWIND_COLUMN): Do not define.
(FIXED_REGISTERS, CALL_USED_REGISTERS, CALL_REALLY_USED_REGISTERS):
Delete the SPE high registers.
(REG_ALLOC_ORDER): Ditto.
(enum reg_class): Remove SPE_HIGH_REGS.
(REG_CLASS_NAMES): Ditto.
(REG_CLASS_CONTENTS): Delete the SPE high registers.
(REGISTER_NAMES): Ditto.
(rs6000_reg_names): Ditto.
* doc/tm.texi.in: Remove SPE as example.
* doc/tm.texi: Regenerate.

---
 gcc/config/rs6000/darwin.h |   6 +--
 gcc/config/rs6000/rs6000.c |  14 +-
 gcc/config/rs6000/rs6000.h | 122 ++---
 gcc/doc/tm.texi|   2 -
 gcc/doc/tm.texi.in |   2 -
 5 files changed, 30 insertions(+), 116 deletions(-)

diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index 61e5e83..2422f25 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -194,11 +194,7 @@ extern int darwin_emit_branch_islands;
 "vrsave", "vscr",  \
 "spe_acc", "spefscr",   \
 "sfp", \
-"tfhar", "tfiar", "texasr", \
-"rh0",  "rh1",  "rh2",  "rh3",  "rh4",  "rh5",  "rh6",  "rh7", \
-"rh8",  "rh9",  "rh10", "rh11", "rh12", "rh13", "rh14", "rh15",\
-"rh16", "rh17", "rh18", "rh19", "rh20", "rh21", "rh22", "rh23",\
-"rh24", "rh25", "rh26", "rh27", "rh28", "rh29", "rh30", "rh31" \
+"tfhar", "tfiar", "texasr" \
 }
 
 /* This outputs NAME to FILE.  */
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 89f9fc2..426400c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1478,12 +1478,7 @@ char rs6000_reg_names[][8] =
   /* Soft frame pointer.  */
   "sfp",
   /* HTM SPR registers.  */
-  "tfhar", "tfiar", "texasr",
-  /* SPE High registers.  */
-  "0",  "1",  "2",  "3",  "4",  "5",  "6",  "7",
-  "8",  "9", "10", "11", "12", "13", "14", "15",
- "16", "17", "18", "19", "20", "21", "22", "23",
- "24", "25", "26", "27", "28", "29", "30", "31"
+  "tfhar", "tfiar", "texasr"
 };
 
 #ifdef TARGET_REGNAMES
@@ -1511,12 +1506,7 @@ static const char alt_reg_names[][8] =
   /* Soft frame pointer.  */
   "sfp",
   /* HTM SPR registers.  */
-  "tfhar", "tfiar", "texasr",
-  /* SPE High registers.  */
-  "%rh0",  "%rh1",  "%rh2",  "%rh3",  "%rh4",  "%rh5",  "%rh6",   "%rh7",
-  "%rh8",  "%rh9",  "%rh10", "%r11",  "%rh12", "%rh13", "%rh14", "%rh15",
-  "%rh16", "%rh17", "%rh18", "%rh19", "%rh20", "%rh21", "%rh22", "%rh23",
-  "%rh24", "%rh25", "%rh26", "%rh27", "%rh28", "%rh29", "%rh30", "%rh31"
+  "tfhar", "tfiar", "texasr"
 };
 #endif
 
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index da3b877..a154c5d 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1017,7 +1017,7 @@ enum data_align { align_abi, align_opt, align_both };
 
The 3 HTM registers aren't also included in DWARF_FRAME_REGISTERS.  */
 
-#define FIRST_PSEUDO_REGISTER 149
+#define FIRST_PSEUDO_REGISTER 117
 
 /* This must be included for pre gcc 3.0 glibc compatibility.  */
 #define PRE_GCC3_DWARF_FRAME_REGISTERS 77
@@ -1026,16 +1026,6 @@ enum data_align { align_abi, align_opt, align_both };
aren't included in DWARF_FRAME_REGISTERS.  */
 #define DWARF_FRAME_REGISTERS (FIRST_PSEUDO_REGISTER - 4)
 
-/* The SPE has an additional 32 synthetic registers, with DWARF debug
-   info numbering for these registers starting at 1200.  While eh_frame
-   register numbering need not be the same as the debug info numbering,
-   we choose to number these regs for eh_frame at 1200 too.
-
-   We must map them here to avoid huge unwinder tables mostly consisting
-   of unused space.  */
-#define DWARF_REG_TO_UNWIND_COLUMN(r) \
-  ((r) >= 1200 ? ((r) - 1200 + (DWARF_FRAME_REGISTERS - 32)) : (r))
-
 /* Use standard DWARF numbering for DWARF debugging information.  */
 #define DBX_REGISTER_NUMBER(REGNO) rs6000_dbx_register_number ((REGNO), 0)
 
@@ -1066,10 +1056,7 @@ enum data_align { align_abi, align_opt, align_both };
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
1, 1   \
-   , 1, 1, 1, 1, 1, 1,\
-   /* SPE High registers.  */ \
-   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 

[PATCH 10/14] rs6000: Remove spe.md, spe.h, linuxspe.h

2017-06-06 Thread Segher Boessenkool
2017-06-06  Segher Boessenkool  

* config.gcc (powerpc*-*-*): Don't add spe.h to extra_headers.
(powerpc*-linux*spe*): Use ${cpu_type} instead of rs6000.
* config/rs6000/linuxspe.h: Delete file.
* config/rs6000/rs6000.md: Don't include spe.md.
* config/rs6000/spe.h: Delete file.
* config/rs6000/spe.md: Delete file.
* config/rs6000/t-rs6000: Remove spe.md.

---
 gcc/config.gcc   |4 +-
 gcc/config/rs6000/linuxspe.h |   32 --
 gcc/config/rs6000/rs6000.md  |1 -
 gcc/config/rs6000/spe.h  | 1107 --
 gcc/config/rs6000/spe.md |   28 --
 gcc/config/rs6000/t-rs6000   |1 -
 6 files changed, 2 insertions(+), 1171 deletions(-)
 delete mode 100644 gcc/config/rs6000/linuxspe.h
 delete mode 100644 gcc/config/rs6000/spe.h
 delete mode 100644 gcc/config/rs6000/spe.md

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f55dcaa..a311cd95 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -457,7 +457,7 @@ powerpc*-*-*)
extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h x86intrin.h"
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h si2vmx.h"
-   extra_headers="${extra_headers} spe.h paired.h"
+   extra_headers="${extra_headers} paired.h"
case x$with_cpu in

xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
cpu_is_64bit=yes
@@ -2510,7 +2510,7 @@ powerpc*-*-linux*)
powerpc*-*-linux*altivec*)
tm_file="${tm_file} rs6000/linuxaltivec.h" ;;
powerpc*-*-linux*spe*)
-   tm_file="${tm_file} rs6000/linuxspe.h rs6000/e500.h" ;;
+   tm_file="${tm_file} ${cpu_type}/linuxspe.h ${cpu_type}/e500.h" ;;
powerpc*-*-linux*paired*)
tm_file="${tm_file} rs6000/750cl.h" ;;
esac
diff --git a/gcc/config/rs6000/linuxspe.h b/gcc/config/rs6000/linuxspe.h
deleted file mode 100644
index 92efabf..000
--- a/gcc/config/rs6000/linuxspe.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/* Definitions of target machine for GNU compiler,
-   for PowerPC e500 machines running GNU/Linux.
-   Copyright (C) 2003-2017 Free Software Foundation, Inc.
-   Contributed by Aldy Hernandez (a...@quesejoda.com).
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published
-   by the Free Software Foundation; either version 3, or (at your
-   option) any later version.
-
-   GCC is distributed in the hope that it will be useful, but WITHOUT
-   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
-   License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with GCC; see the file COPYING3.  If not see
-   <http://www.gnu.org/licenses/>.  */
-
-/* Override rs6000.h and sysv4.h definition.  */
-#if (TARGET_DEFAULT & MASK_LITTLE_ENDIAN)
-#undef TARGET_DEFAULT
-#define TARGET_DEFAULT (MASK_STRICT_ALIGN | MASK_LITTLE_ENDIAN)
-#else
-#undef TARGET_DEFAULT
-#define TARGET_DEFAULT MASK_STRICT_ALIGN
-#endif
-
-#undef  ASM_DEFAULT_SPEC
-#define ASM_DEFAULT_SPEC "-mppc -mspe -me500"
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index edb5208..ec25f45 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14629,7 +14629,6 @@ (define_insn "*cmp_hw"
 (include "vector.md")
 (include "vsx.md")
 (include "altivec.md")
-(include "spe.md")
 (include "dfp.md")
 (include "paired.md")
 (include "crypto.md")
diff --git a/gcc/config/rs6000/spe.h b/gcc/config/rs6000/spe.h
deleted file mode 100644
index 3d556c0..000
--- a/gcc/config/rs6000/spe.h
+++ /dev/null
@@ -1,1107 +0,0 @@
-/* PowerPC E500 user include file.
-   Copyright (C) 2002-2017 Free Software Foundation, Inc.
-   Contributed by Aldy Hernandez (al...@redhat.com).
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published
-   by the Free Software Foundation; either version 3, or (at your
-   option) any later version.
-
-   GCC is distributed in the hope that it will be useful, but WITHOUT
-   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
-   License for more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;

[PATCH 09/14] rs6000: Remove reg_or_none500mem_operand

2017-06-06 Thread Segher Boessenkool
2017-06-06  Segher Boessenkool  

* config/rs6000/predicates.md (reg_or_mem_operand): Reformat.
(reg_or_none500mem_operand): Delete.
* config/rs6000/rs6000.md (extendsfdf2): Use reg_or_mem_operand
instead of reg_or_none500mem_operand.

---
 gcc/config/rs6000/predicates.md | 18 +-
 gcc/config/rs6000/rs6000.md |  2 +-
 2 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 4edfdbb..1bf9194 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -970,19 +970,11 @@ (define_predicate "scc_eq_operand"
 
 ;; Return 1 if the operand is a general non-special register or memory operand.
 (define_predicate "reg_or_mem_operand"
- (ior (match_operand 0 "memory_operand")
- (ior (and (match_code "mem")
-   (match_test "macho_lo_sum_memory_operand (op, mode)"))
-  (ior (match_operand 0 "volatile_mem_operand")
-   (match_operand 0 "gpc_reg_operand")
-
-;; Return 1 if the operand is either an easy FP constant or memory or reg.
-(define_predicate "reg_or_none500mem_operand"
-  (if_then_else (match_code "mem")
- (ior (match_operand 0 "memory_operand")
- (match_test "macho_lo_sum_memory_operand (op, mode)")
- (match_operand 0 "volatile_mem_operand"))
- (match_operand 0 "gpc_reg_operand")))
+  (ior (match_operand 0 "memory_operand")
+   (and (match_code "mem")
+   (match_test "macho_lo_sum_memory_operand (op, mode)"))
+   (match_operand 0 "volatile_mem_operand")
+   (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is CONST_DOUBLE 0, register or memory operand.
 (define_predicate "zero_reg_mem_operand"
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3fea231..edb5208 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4638,7 +4638,7 @@ (define_insn "*cmp_fpr"
 ;; Floating point conversions
 (define_expand "extendsfdf2"
   [(set (match_operand:DF 0 "gpc_reg_operand")
-   (float_extend:DF (match_operand:SF 1 "reg_or_none500mem_operand")))]
+   (float_extend:DF (match_operand:SF 1 "reg_or_mem_operand")))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT"
 {
   if (HONOR_SNANS (SFmode))
-- 
1.9.3



[PATCH 08/14] rs6000: Remove -mspe options

2017-06-06 Thread Segher Boessenkool
2017-06-06  Segher Boessenkool  

* config/rs6000/rs6000.c (rs6000_option_override_internal): Delete
handling of SPE flags.
* config/rs6000/rs6000.opt (-mspe, -mspe=no, -mspe=yes): Delete.

---
 gcc/config/rs6000/rs6000.c   | 18 --
 gcc/config/rs6000/rs6000.opt | 12 
 2 files changed, 30 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a2bf968..89f9fc2 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4146,24 +4146,6 @@ rs6000_option_override_internal (bool global_init_p)
   gcc_assert (tune_index >= 0);
   rs6000_cpu = processor_target_table[tune_index].processor;
 
-  /* Pick defaults for SPE related control flags.  Do this early to make sure
- that the TARGET_ macros are representative ASAP.  */
-  {
-int spe_capable_cpu =
-  (rs6000_cpu == PROCESSOR_PPC8540
-   || rs6000_cpu == PROCESSOR_PPC8548);
-
-if (!global_options_set.x_rs6000_spe)
-  rs6000_spe = spe_capable_cpu;
-  }
-
-  if (global_options_set.x_rs6000_spe && rs6000_spe)
-error ("not configured for SPE instruction set");
-
-  if (main_target_opt != NULL
-  && main_target_opt->x_rs6000_spe != rs6000_spe)
-error ("target attribute or pragma changes SPE ABI");
-
   if (rs6000_cpu == PROCESSOR_PPCE300C2 || rs6000_cpu == PROCESSOR_PPCE300C3
   || rs6000_cpu == PROCESSOR_PPCE500MC || rs6000_cpu == PROCESSOR_PPCE500MC64
   || rs6000_cpu == PROCESSOR_PPCE5500)
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index a1a7753..28d8993 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -353,22 +353,10 @@ misel=yes
 Target RejectNegative Alias(misel)
 Deprecated option.  Use -misel instead.
 
-mspe
-Target Var(rs6000_spe) Save
-Generate SPE SIMD instructions on E500.
-
 mpaired
 Target Var(rs6000_paired_float) Save
 Generate PPC750CL paired-single instructions.
 
-mspe=no
-Target RejectNegative Alias(mspe) NegativeAlias
-Deprecated option.  Use -mno-spe instead.
-
-mspe=yes
-Target RejectNegative Alias(mspe)
-Deprecated option.  Use -mspe instead.
-
 mdebug=
 Target RejectNegative Joined
 -mdebug=   Enable debug output.
-- 
1.9.3



[PATCH 06/14] rs6000: Remove UNSPEC_MV_CR_GT

2017-06-06 Thread Segher Boessenkool
2017-06-06  Segher Boessenkool  

* config/rs6000/rs6000.md (UNSPEC_MV_CR_GT): Delete.


---
 gcc/config/rs6000/rs6000.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 108ad8f..997d1fe 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -95,7 +95,6 @@ (define_c_enum "unspec"
UNSPEC_TLSGOTTPREL
UNSPEC_TLSTLS
UNSPEC_FIX_TRUNC_TF ; fadd, rounding towards zero
-   UNSPEC_MV_CR_GT ; move_from_CR_gt_bit
UNSPEC_STFIWX
UNSPEC_POPCNTB
UNSPEC_FRES
-- 
1.9.3



[PATCH 05/14] rs6000: Remove output_e500_flip_gt_bit

2017-06-06 Thread Segher Boessenkool
2017-06-06  Segher Boessenkool  

* config/rs6000/rs6000-protos.h (output_e500_flip_gt_bit): Delete.
* config/rs6000/rs6000.c: Ditto.

---
 gcc/config/rs6000/rs6000-protos.h |  1 -
 gcc/config/rs6000/rs6000.c| 18 --
 2 files changed, 19 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 0344823..2955d97 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -133,7 +133,6 @@ extern void rs6000_emit_sISEL (machine_mode, rtx[]);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
 extern void rs6000_emit_cbranch (machine_mode, rtx[]);
 extern char * output_cbranch (rtx, const char *, int, rtx_insn *);
-extern char * output_e500_flip_gt_bit (rtx, rtx);
 extern const char * output_probe_stack_range (rtx, rtx);
 extern bool rs6000_emit_set_const (rtx, rtx);
 extern int rs6000_emit_cmove (rtx, rtx, rtx, rtx);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 1ad08d0..a31c608 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24987,24 +24987,6 @@ output_cbranch (rtx op, const char *label, int reversed, rtx_insn *insn)
   return string;
 }
 
-/* Return the string to flip the GT bit on a CR.  */
-char *
-output_e500_flip_gt_bit (rtx dst, rtx src)
-{
-  static char string[64];
-  int a, b;
-
-  gcc_assert (GET_CODE (dst) == REG && CR_REGNO_P (REGNO (dst))
- && GET_CODE (src) == REG && CR_REGNO_P (REGNO (src)));
-
-  /* GT bit.  */
-  a = 4 * (REGNO (dst) - CR0_REGNO) + 1;
-  b = 4 * (REGNO (src) - CR0_REGNO) + 1;
-
-  sprintf (string, "crnot %d,%d", a, b);
-  return string;
-}
-
 /* Return insn for VSX or Altivec comparisons.  */
 
 static rtx
-- 
1.9.3
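
For readers unfamiliar with the CR layout, the arithmetic in the deleted
output_e500_flip_gt_bit can be sketched as below.  This is only an
illustration, not GCC code: gt_bit and flip_gt_bit are hypothetical names,
and the field index stands in for REGNO (x) - CR0_REGNO.  Each condition
register field holds four bits (LT, GT, EQ, SO), so the GT bit of field n
is bit 4n + 1.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of GCC.  cr_field is 0..7 for CR0..CR7,
// standing in for REGNO (x) - CR0_REGNO in the deleted function.
int gt_bit (int cr_field)
{
  assert (cr_field >= 0 && cr_field < 8);
  return 4 * cr_field + 1;   // bits within a field: LT=0, GT=1, EQ=2, SO=3
}

// Build the same "crnot dst,src" text that the deleted function produced.
std::string flip_gt_bit (int dst_field, int src_field)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "crnot %d,%d",
                 gt_bit (dst_field), gt_bit (src_field));
  return std::string (buf);
}
```

With these definitions, flip_gt_bit (1, 1) yields "crnot 5,5", i.e. the GT
bit of CR1 is inverted in place.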



[PATCH 04/14] rs6000: Remove rs6000_cbranch_operator

2017-06-06 Thread Segher Boessenkool
rs6000_cbranch_operator now is just comparison_operator, so just use
that directly.


2017-06-06  Segher Boessenkool  

* config/rs6000/predicates.md (rs6000_cbranch_operator): Delete.
* config/rs6000/rs6000.md: Replace rs6000_cbranch_operator by
comparison_operator.

---
 gcc/config/rs6000/predicates.md | 4 
 gcc/config/rs6000/rs6000.md | 8 
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index dd961a7..11aecbd 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1238,10 +1238,6 @@ (define_predicate "branch_comparison_operator"
   GET_MODE (XEXP (op, 0))),
  1"
 
-;; Return 1 if OP is a valid comparison operator for "cbranch" instructions.
-(define_predicate "rs6000_cbranch_operator"
-  (match_operand 0 "comparison_operator"))
-
 ;; Return 1 if OP is an unsigned comparison operator.
 (define_predicate "unsigned_comparison_operator"
   (match_code "ltu,gtu,leu,geu"))
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index efca26c..108ad8f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11430,7 +11430,7 @@ (define_insn "probe_stack_range"
 ;; insns, and branches.
 
 (define_expand "cbranch4"
-  [(use (match_operator 0 "rs6000_cbranch_operator"
+  [(use (match_operator 0 "comparison_operator"
  [(match_operand:GPR 1 "gpc_reg_operand" "")
   (match_operand:GPR 2 "reg_or_short_operand" "")]))
(use (match_operand 3 ""))]
@@ -11453,7 +11453,7 @@ (define_expand "cbranch4"
 }")
 
 (define_expand "cbranch4"
-  [(use (match_operator 0 "rs6000_cbranch_operator"
+  [(use (match_operator 0 "comparison_operator"
  [(match_operand:FP 1 "gpc_reg_operand" "")
   (match_operand:FP 2 "gpc_reg_operand" "")]))
(use (match_operand 3 ""))]
@@ -11683,7 +11683,7 @@ (define_expand "cstore4_unsigned_imm"
 })
 
 (define_expand "cstore4"
-  [(use (match_operator 1 "rs6000_cbranch_operator"
+  [(use (match_operator 1 "comparison_operator"
  [(match_operand:GPR 2 "gpc_reg_operand")
   (match_operand:GPR 3 "reg_or_short_operand")]))
(clobber (match_operand:GPR 0 "gpc_reg_operand"))]
@@ -11746,7 +11746,7 @@ (define_expand "cstore4"
 })
 
 (define_expand "cstore4"
-  [(use (match_operator 1 "rs6000_cbranch_operator"
+  [(use (match_operator 1 "comparison_operator"
  [(match_operand:FP 2 "gpc_reg_operand")
   (match_operand:FP 3 "gpc_reg_operand")]))
(clobber (match_operand:SI 0 "gpc_reg_operand"))]
-- 
1.9.3



[PATCH 03/14] rs6000: Remove -mfloat-gprs

2017-06-06 Thread Segher Boessenkool
This deletes -mfloat-gprs and the variables that go with it.


2017-06-06  Segher Boessenkool  

* config/rs6000/rs6000.c: Remove everything related to -mfloat-gprs.
* config/rs6000/rs6000.opt: Ditto.
* config/rs6000/t-rtems: Ditto.

---
 gcc/config/rs6000/rs6000.c   | 12 +---
 gcc/config/rs6000/rs6000.opt | 20 
 gcc/config/rs6000/t-rtems|  9 ++---
 3 files changed, 3 insertions(+), 38 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8d578f4..1ad08d0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2797,9 +2797,6 @@ rs6000_debug_reg_global (void)
   if (rs6000_darwin64_abi)
 fprintf (stderr, DEBUG_FMT_S, "darwin64_abi", "true");
 
-  if (rs6000_float_gprs)
-fprintf (stderr, DEBUG_FMT_S, "float_gprs", "true");
-
   fprintf (stderr, DEBUG_FMT_S, "single_float",
   (TARGET_SINGLE_FLOAT ? "true" : "false"));
 
@@ -4198,12 +4195,6 @@ rs6000_option_override_internal (bool global_init_p)
 
 if (!global_options_set.x_rs6000_spe)
   rs6000_spe = spe_capable_cpu;
-
-if (!global_options_set.x_rs6000_float_gprs)
-  rs6000_float_gprs =
-(rs6000_cpu == PROCESSOR_PPC8540 ? 1
- : rs6000_cpu == PROCESSOR_PPC8548 ? 2
- : 0);
   }
 
   if (global_options_set.x_rs6000_spe_abi
@@ -4218,8 +4209,7 @@ rs6000_option_override_internal (bool global_init_p)
 
   if (main_target_opt != NULL
   && ((main_target_opt->x_rs6000_spe_abi != rs6000_spe_abi)
-  || (main_target_opt->x_rs6000_spe != rs6000_spe)
-  || (main_target_opt->x_rs6000_float_gprs != rs6000_float_gprs)))
+  || (main_target_opt->x_rs6000_spe != rs6000_spe)))
 error ("target attribute or pragma changes SPE ABI");
 
   if (rs6000_cpu == PROCESSOR_PPCE300C2 || rs6000_cpu == PROCESSOR_PPCE300C3
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index fdac5c7..c5c11c5 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -449,26 +449,6 @@ mwarn-altivec-long
 Target Var(rs6000_warn_altivec_long) Init(1) Save
 Warn about deprecated 'vector long ...' AltiVec type usage.
 
-mfloat-gprs=
-Target RejectNegative Joined Enum(rs6000_float_gprs) Var(rs6000_float_gprs) Save
--mfloat-gprs=  Select GPR floating point method.
-
-Enum
-Name(rs6000_float_gprs) Type(unsigned char)
-Valid arguments to -mfloat-gprs=:
-
-EnumValue
-Enum(rs6000_float_gprs) String(yes) Value(1)
-
-EnumValue
-Enum(rs6000_float_gprs) String(single) Value(1)
-
-EnumValue
-Enum(rs6000_float_gprs) String(double) Value(2)
-
-EnumValue
-Enum(rs6000_float_gprs) String(no) Value(0)
-
 mlong-double-
 Target RejectNegative Joined UInteger Var(rs6000_long_double_type_size) Save
 -mlong-double-  Specify size of long double (64 or 128 bits).
diff --git a/gcc/config/rs6000/t-rtems b/gcc/config/rs6000/t-rtems
index 7c7637d..723c6a3 100644
--- a/gcc/config/rs6000/t-rtems
+++ b/gcc/config/rs6000/t-rtems
@@ -30,8 +30,8 @@ MULTILIB_DIRNAMES += m403 m505 m603e m604 m860 m7400 m8540 me6500
 MULTILIB_OPTIONS += m32
 MULTILIB_DIRNAMES += m32
 
-MULTILIB_OPTIONS += msoft-float/mfloat-gprs=double
-MULTILIB_DIRNAMES += nof gprsdouble
+MULTILIB_OPTIONS += msoft-float
+MULTILIB_DIRNAMES += nof
 
 MULTILIB_OPTIONS += mno-spe/mno-altivec
 MULTILIB_DIRNAMES += nospe noaltivec
@@ -56,10 +56,6 @@ MULTILIB_MATCHES += mcpu?750=
 # Map 8548 to 8540
 MULTILIB_MATCHES   += mcpu?8540=mcpu?8548
 
-# Map -mcpu=8540 -mfloat-gprs=single to -mcpu=8540
-# (mfloat-gprs=single is implicit default)
-MULTILIB_MATCHES   += mcpu?8540=mcpu?8540/mfloat-gprs?single
-
 # Enumeration of multilibs
 
 MULTILIB_REQUIRED += msoft-float
@@ -73,7 +69,6 @@ MULTILIB_REQUIRED += mcpu=7400
 MULTILIB_REQUIRED += mcpu=7400/msoft-float
 MULTILIB_REQUIRED += mcpu=8540
 MULTILIB_REQUIRED += mcpu=8540/msoft-float/mno-spe
-MULTILIB_REQUIRED += mcpu=8540/mfloat-gprs=double
 MULTILIB_REQUIRED += mcpu=860
 MULTILIB_REQUIRED += mcpu=e6500/m32
 MULTILIB_REQUIRED += mcpu=e6500/m32/msoft-float/mno-altivec
-- 
1.9.3



[PATCH 02/14] rs6000: Remove TARGET_E500_{SINGLE,DOUBLE}

2017-06-06 Thread Segher Boessenkool
Similarly, TARGET_E500_{SINGLE,DOUBLE} is always false now.


2017-06-06  Segher Boessenkool  

* config/rs6000/predicates.md: Replace TARGET_E500_DOUBLE and
TARGET_E500_SINGLE by 0, simplify.
* config/rs6000/rs6000.c: Ditto.
(rs6000_option_override_internal): Delete CHECK_E500_OPTIONS.
(spe_build_register_parallel): Delete.
* config/rs6000/rs6000.h: Delete TARGET_E500_SINGLE,
TARGET_E500_DOUBLE, and CHECK_E500_OPTIONS.
* config/rs6000/rs6000.md: Replace TARGET_E500_DOUBLE,
TARGET_E500_SINGLE, and  by 0, simplify.
(E500_CONVERT): Delete.
* config/rs6000/spe.md: Remove many patterns and all define_constants.

---
 gcc/config/rs6000/predicates.md |  19 +-
 gcc/config/rs6000/rs6000.c  | 188 +
 gcc/config/rs6000/rs6000.h  |  11 +-
 gcc/config/rs6000/rs6000.md |  73 ++---
 gcc/config/rs6000/spe.md| 589 ++--
 5 files changed, 60 insertions(+), 820 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index a9bf854..dd961a7 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -299,7 +299,7 @@ (define_predicate "const_0_to_15_operand"
 (define_predicate "gpc_reg_operand"
   (match_operand 0 "register_operand")
 {
-  if ((TARGET_E500_DOUBLE || TARGET_SPE) && invalid_e500_subreg (op, mode))
+  if (TARGET_SPE && invalid_e500_subreg (op, mode))
 return 0;
 
   if (GET_CODE (op) == SUBREG)
@@ -331,7 +331,7 @@ (define_predicate "gpc_reg_operand"
 (define_predicate "int_reg_operand"
   (match_operand 0 "register_operand")
 {
-  if ((TARGET_E500_DOUBLE || TARGET_SPE) && invalid_e500_subreg (op, mode))
+  if (TARGET_SPE && invalid_e500_subreg (op, mode))
 return 0;
 
   if (GET_CODE (op) == SUBREG)
@@ -357,7 +357,7 @@ (define_predicate "int_reg_operand"
 (define_predicate "int_reg_operand_not_pseudo"
   (match_operand 0 "register_operand")
 {
-  if ((TARGET_E500_DOUBLE || TARGET_SPE) && invalid_e500_subreg (op, mode))
+  if (TARGET_SPE && invalid_e500_subreg (op, mode))
 return 0;
 
   if (GET_CODE (op) == SUBREG)
@@ -606,7 +606,7 @@ (define_predicate "easy_fp_constant"
 return 0;
 
   /* Consider all constants with -msoft-float to be easy.  */
-  if ((TARGET_SOFT_FLOAT || TARGET_E500_SINGLE 
+  if ((TARGET_SOFT_FLOAT
   || (TARGET_HARD_FLOAT && (TARGET_SINGLE_FLOAT && ! TARGET_DOUBLE_FLOAT)))
   && mode != DImode)
 return 1;
@@ -1014,10 +1014,9 @@ (define_predicate "reg_or_mem_operand"
 ;; Return 1 if the operand is either an easy FP constant or memory or reg.
 (define_predicate "reg_or_none500mem_operand"
   (if_then_else (match_code "mem")
- (and (match_test "!TARGET_E500_DOUBLE")
- (ior (match_operand 0 "memory_operand")
-  (ior (match_test "macho_lo_sum_memory_operand (op, mode)")
-   (match_operand 0 "volatile_mem_operand"
+ (ior (match_operand 0 "memory_operand")
+ (match_test "macho_lo_sum_memory_operand (op, mode)")
+ (match_operand 0 "volatile_mem_operand"))
  (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is CONST_DOUBLE 0, register or memory operand.
@@ -1137,7 +1136,7 @@ (define_predicate "input_operand"
 return 1;
 
   /* Do not allow invalid E500 subregs.  */
-  if ((TARGET_E500_DOUBLE || TARGET_SPE)
+  if (TARGET_SPE
   && GET_CODE (op) == SUBREG
   && invalid_e500_subreg (op, mode))
 return 0;
@@ -1205,7 +1204,7 @@ (define_predicate "splat_input_operand"
 (define_predicate "rs6000_nonimmediate_operand"
   (match_code "reg,subreg,mem")
 {
-  if ((TARGET_E500_DOUBLE || TARGET_SPE)
+  if (TARGET_SPE
   && GET_CODE (op) == SUBREG
   && invalid_e500_subreg (op, mode))
 return 0;
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 4a37a58..8d578f4 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2038,15 +2038,6 @@ rs6000_hard_regno_nregs_internal (int regno, machine_mode mode)
   else if (ALTIVEC_REGNO_P (regno))
 reg_size = UNITS_PER_ALTIVEC_WORD;
 
-  /* The value returned for SCmode in the E500 double case is 2 for
- ABI compatibility; storing an SCmode value in a single register
- would require function_arg and rs6000_spe_function_arg to handle
- SCmode so as to pass the value correctly in a pair of
- registers.  */
-  else if (TARGET_E500_DOUBLE && FLOAT_MODE_P (mode) && mode != SCmode
-  && !DECIMAL_FLOAT_MODE_P (mode) && SPE_SIMD_REGNO_P (regno))
-reg_size = UNITS_PER_FP_WORD;
-
   else
 reg_size = UNITS_PER_WORD;
 
@@ -2818,12 +2809,6 @@ rs6000_debug_reg_global (void)
   fprintf (stderr, DEBUG_FMT_S, "soft_float",
   (TARGET_SOFT_FLOAT ? "true" : "false"));
 
-  fprintf (stderr, DEBUG_FMT_S, "e500_single",
-  (TARGET_E500_SINGLE ? "true" : "false"));
-
-  fprintf (stderr, DEBUG_FMT_S, "e500_double",
-

[PATCH 01/14] rs6000: Remove TARGET_FPRS

2017-06-06 Thread Segher Boessenkool
Since rs6000 no longer supports SPE, TARGET_FPRS is now always true.

This makes TARGET_{SF,DF}_SPE always false.  Many patterns in spe.md
can now be deleted, which makes it possible to merge, e.g., negdd2 with
*negdd2_fpr.

Finally, e500.h is deleted (it isn't used).
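
The "replace TARGET_FPRS by 1 and simplify" step can be sketched as below.
This is only an illustration under stated assumptions: TARGET_FPRS here is
a hypothetical constant standing in for the real target macro, and
old_cond, new_cond and count_mismatches are made-up names modelled on the
movdf_low_si insn condition in darwin.md.

```cpp
// Hypothetical stand-in for the target macro; not GCC's definition.
// Once SPE support is gone, the real macro is always true.
constexpr bool TARGET_FPRS = true;

// Condition as written before the patch.
constexpr bool old_cond (bool macho, bool hard_float, bool is_64bit)
{
  return macho && hard_float && TARGET_FPRS && !is_64bit;
}

// Condition after replacing TARGET_FPRS by 1 and dropping the "&& 1".
constexpr bool new_cond (bool macho, bool hard_float, bool is_64bit)
{
  return macho && hard_float && !is_64bit;
}

// Count the input combinations on which the two conditions disagree.
int count_mismatches ()
{
  int mismatches = 0;
  for (int bits = 0; bits < 8; ++bits)
    {
      bool macho = (bits & 1) != 0;
      bool hard_float = (bits & 2) != 0;
      bool is_64bit = (bits & 4) != 0;
      if (old_cond (macho, hard_float, is_64bit)
          != new_cond (macho, hard_float, is_64bit))
        ++mismatches;
    }
  return mismatches;
}
```

Since the dropped operand is a constant true, count_mismatches () returns 0:
the simplified condition accepts exactly the same inputs as the old one.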


2017-06-06  Segher Boessenkool  

* config/rs6000/darwin.md: Replace TARGET_FPRS by 1 and simplify.
* config/rs6000/dfp.md: Ditto.
(negdd2, *negdd2_fpr): Merge.
(absdd2, *absdd2_fpr): Merge.
(negtd2, *negtd2_fpr): Merge.
(abstd2, *abstd2_fpr): Merge.
* config/rs6000/e500.h: Delete file.
* config/rs6000/predicates.md (rs6000_cbranch_operator): Replace
TARGET_FPRS by 1 and simplify.
* config/rs6000/rs6000-c.c: Ditto.
* config/rs6000/rs6000.c: Ditto.  Also replace TARGET_SF_SPE and
TARGET_DF_SPE by 0.
* config/rs6000/rs6000.h: Ditto.  Delete TARGET_SF_SPE and
TARGET_DF_SPE.
* config/rs6000/rs6000.md: Ditto.
(floatdidf2, *floatdidf2_fpr): Merge.
(move_from_CR_gt_bit): Delete.
* config/rs6000/spe.md: Replace TARGET_FPRS by 1 and simplify.
(E500_CR_IOR_COMPARE): Delete.
(All patterns that require !TARGET_FPRS): Delete.
* config/rs6000/vsx.md: Replace TARGET_FPRS by 1 and simplify.

---
 gcc/config/rs6000/darwin.md |  16 +--
 gcc/config/rs6000/dfp.md|  48 ++-
 gcc/config/rs6000/e500.h|  45 --
 gcc/config/rs6000/predicates.md |  10 +-
 gcc/config/rs6000/rs6000-c.c|  14 +-
 gcc/config/rs6000/rs6000.c  | 297 +---
 gcc/config/rs6000/rs6000.h  |  38 ++---
 gcc/config/rs6000/rs6000.md | 262 +--
 gcc/config/rs6000/spe.md| 170 ---
 gcc/config/rs6000/vsx.md|   2 +-
 10 files changed, 171 insertions(+), 731 deletions(-)
 delete mode 100644 gcc/config/rs6000/e500.h

diff --git a/gcc/config/rs6000/darwin.md b/gcc/config/rs6000/darwin.md
index fde67fd..a60185a 100644
--- a/gcc/config/rs6000/darwin.md
+++ b/gcc/config/rs6000/darwin.md
@@ -30,7 +30,7 @@ (define_insn "movdf_low_si"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=f,!r")
 (mem:DF (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,b")
(match_operand 2 "" ""]
-  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_FPRS && !TARGET_64BIT"
+  "TARGET_MACHO && TARGET_HARD_FLOAT && !TARGET_64BIT"
   "*
 {
   switch (which_alternative)
@@ -61,7 +61,7 @@ (define_insn "movdf_low_di"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=f,!r")
 (mem:DF (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,b")
(match_operand 2 "" ""]
-  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_64BIT"
+  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_64BIT"
   "*
 {
   switch (which_alternative)
@@ -81,7 +81,7 @@ (define_insn "movdf_low_st_si"
   [(set (mem:DF (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
(match_operand 2 "" "")))
(match_operand:DF 0 "gpc_reg_operand" "f"))]
-  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_FPRS && ! TARGET_64BIT"
+  "TARGET_MACHO && TARGET_HARD_FLOAT && ! TARGET_64BIT"
   "stfd %0,lo16(%2)(%1)"
   [(set_attr "type" "store")
(set_attr "length" "4")])
@@ -90,7 +90,7 @@ (define_insn "movdf_low_st_di"
   [(set (mem:DF (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b")
(match_operand 2 "" "")))
(match_operand:DF 0 "gpc_reg_operand" "f"))]
-  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_64BIT"
+  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_64BIT"
   "stfd %0,lo16(%2)(%1)"
   [(set_attr "type" "store")
(set_attr "length" "4")])
@@ -99,7 +99,7 @@ (define_insn "movsf_low_si"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=f,!r")
 (mem:SF (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,b")
(match_operand 2 "" ""]
-  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_FPRS && ! TARGET_64BIT"
+  "TARGET_MACHO && TARGET_HARD_FLOAT && ! TARGET_64BIT"
   "@
lfs %0,lo16(%2)(%1)
lwz %0,lo16(%2)(%1)"
@@ -110,7 +110,7 @@ (define_insn "movsf_low_di"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=f,!r")
 (mem:SF (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,b")
(match_operand 2 "" ""]
-  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_64BIT"
+  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_64BIT"
   "@
lfs %0,lo16(%2)(%1)
lwz %0,lo16(%2)(%1)"
@@ -121,7 +121,7 @@ (define_insn "movsf_low_st_si"
   [(set (mem:SF (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,b")
(match_operand 2 "" "")))
(match_operand:SF 0 "gpc_reg_operand" "f,!r"))]
-  "TARGET_MACHO && TARGET_HARD_FLOAT && TARGET_FPRS && ! TARGET_64BIT"
+  "TARGET_MACHO && TARGET_HARD_FLOAT 

[PATCH 00/14] rs6000: Delete SPE things

2017-06-06 Thread Segher Boessenkool
This series removes most SPE things.

Tested (all languages) on powerpc64-linux (power7, {-m32,-m64}); on
powerpc64le-linux (power8, all languages); on AIX (power7, default
languages); on powerpc-linux (compile only) and on powerpc-linux-gnuspe
(compile only).

Is this okay for trunk?


Segher


Segher Boessenkool (14):
  rs6000: Remove TARGET_FPRS
  rs6000: Remove TARGET_E500_{SINGLE,DOUBLE}
  rs6000: Remove -mfloat-gprs
  rs6000: Remove rs6000_cbranch_operator
  rs6000: Remove output_e500_flip_gt_bit
  rs6000: Remove UNSPEC_MV_CR_GT
  rs6000: Remove TARGET_SPE and TARGET_SPE_ABI and friends
  rs6000: Remove -mspe options
  rs6000: Remove reg_or_none500mem_operand
  rs6000: Remove spe.md, spe.h, linuxspe.h
  rs6000: Remove type attribute "brinc"
  rs6000: Remove SPE high registers
  rs6000: Remove spe_acc and spefscr
  rs6000: Remove rs6000_nonimmediate_operand

 gcc/common/config/rs6000/rs6000-common.c |9 -
 gcc/config.gcc   |4 +-
 gcc/config/rs6000/8540.md|6 -
 gcc/config/rs6000/darwin.h   |7 +-
 gcc/config/rs6000/darwin.md  |   16 +-
 gcc/config/rs6000/dfp.md |   48 +-
 gcc/config/rs6000/e500.h |   45 -
 gcc/config/rs6000/e500mc.md  |6 -
 gcc/config/rs6000/e500mc64.md|6 -
 gcc/config/rs6000/linuxspe.h |   32 -
 gcc/config/rs6000/paired.md  |   12 +-
 gcc/config/rs6000/predicates.md  |   87 +-
 gcc/config/rs6000/rs6000-builtin.def |  313 +--
 gcc/config/rs6000/rs6000-c.c |   18 +-
 gcc/config/rs6000/rs6000-protos.h|2 -
 gcc/config/rs6000/rs6000.c   | 1956 +
 gcc/config/rs6000/rs6000.h   |  235 +-
 gcc/config/rs6000/rs6000.md  |  354 +--
 gcc/config/rs6000/rs6000.opt |   40 -
 gcc/config/rs6000/spe.h  | 1107 --
 gcc/config/rs6000/spe.md | 3512 --
 gcc/config/rs6000/t-rs6000   |1 -
 gcc/config/rs6000/t-rtems|9 +-
 gcc/config/rs6000/vector.md  |   95 -
 gcc/config/rs6000/vsx.md |2 +-
 gcc/doc/tm.texi  |2 -
 gcc/doc/tm.texi.in   |2 -
 27 files changed, 305 insertions(+), 7621 deletions(-)
 delete mode 100644 gcc/config/rs6000/e500.h
 delete mode 100644 gcc/config/rs6000/linuxspe.h
 delete mode 100644 gcc/config/rs6000/spe.h
 delete mode 100644 gcc/config/rs6000/spe.md

-- 
1.9.3



[C++/80979] ADL of friends

2017-06-06 Thread Nathan Sidwell
This fixes 80979, an ICE in the new duplicate lookup matching code. 
That code is enabled when we discover using declarations are in play. 
And it was barfing on meeting an already-marked function.


That function was in the lookup twice, which was a surprise.

That had happened because the function was a friend, but not an 
invisible one.  So we added it as part of the namespace ADL and also 
when adding the class's friends.  We only later discovered functions 
introduced by a using declaration.


The fix is to only add invisible friends during the class ADL.
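
The distinction between visible and invisible (hidden) friends can be
sketched as below.  This is only an illustration with made-up names (lib,
Widget, hidden, visible, call_both); it does not reproduce the ICE, it
just shows the two kinds of friend that ADL has to deal with.

```cpp
namespace lib {
  struct Widget {
    // Declared only inside the class: an invisible (hidden) friend.
    // Ordinary lookup cannot see it; it is found solely via ADL on Widget.
    friend int hidden (const Widget &) { return 1; }

    // Also declared at namespace scope below, so this friend is visible
    // and is already found when ADL searches namespace lib.
    friend int visible (const Widget &);
  };

  int visible (const Widget &) { return 2; }
}

// Both calls are unqualified and resolved through argument-dependent lookup.
int call_both ()
{
  lib::Widget w;
  return hidden (w) * 10 + visible (w);
}
```

Here visible is reachable both through namespace lib and through Widget's
friend list, which is the situation in which a lookup can encounter the
same function twice; hidden is reachable only through the class.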

nathan
--
Nathan Sidwell
2017-06-06  Nathan Sidwell  

	PR c++/80979
	* name-lookup.c (adl_class_only): Don't add visible friends.

Index: cp/name-lookup.c
===
--- cp/name-lookup.c	(revision 248914)
+++ cp/name-lookup.c	(working copy)
@@ -801,6 +801,12 @@ name_lookup::adl_class_only (tree type)
 	  if (CP_DECL_CONTEXT (fn) != context)
 	continue;
 
+	  /* Only interested in anticipated friends.  (Non-anticipated
+	 ones will have been inserted during the namespace
+	 adl.)  */
+	  if (!DECL_ANTICIPATED (fn))
+	continue;
+
 	  /* Template specializations are never found by name lookup.
 	 (Templates themselves can be found, but not template
 	 specializations.)  */
Index: testsuite/g++.dg/lookup/pr80979.C
===
--- testsuite/g++.dg/lookup/pr80979.C	(revision 0)
+++ testsuite/g++.dg/lookup/pr80979.C	(working copy)
@@ -0,0 +1,26 @@
+// pr C++/80979 ICE with late discovery of using directives during ADL
+// of a friend declaration.
+
+namespace Tiny {
+  class Handsome {};
+  void Dahl (Handsome &, Handsome &);
+
+  namespace Jack {
+class Vladof {
+  friend void Dahl (Vladof &, Vladof &);
+};
+void Dahl (Vladof &, Vladof &);
+  }
+
+  struct BDonk {};
+
+  namespace Tina {
+void Dahl (BDonk &, Jack::Vladof &);
+  }
+  using Tina::Dahl;
+}
+
+void BoomBoom (Tiny::BDonk &vault, Tiny::Jack::Vladof &hunter)
+{
+  Dahl (vault, hunter);
+}


Re: [PATCH, testsuite] Remove NO_LABEL_VALUES

2017-06-06 Thread Mike Stump
On Jun 6, 2017, at 2:23 AM, Tom de Vries  wrote:
> 
> OK for trunk?

Ok.


Re: [PATCH 9/13] D: D2 Testsuite Dejagnu files.

2017-06-06 Thread Iain Buclaw
On 31 May 2017 at 11:11, Matthias Klose  wrote:
> On 30.05.2017 16:32, Mike Stump wrote:
>> On May 28, 2017, at 2:16 PM, Iain Buclaw  wrote:
>>>
>>> This patch adds D language support to the GCC test suite.
>>
>> Ok.  If you could ensure that gcc without D retains all its goodness and
>> that gcc with D works on 2 different systems, that will help ensure
>> integration smoothness.
>>
>> Something this large can be integration tested on a svn/git branch, if you 
>> need others to help out.
>
> I built the library (x86 and ARM32) and the D frontend on several Debian
> architectures and OSes (Linux, KFreeBSD, Hurd) in the past, but can do that
> with the proposed patches again. A svn/git branch would be helpful for that,
> if a recent test is required.
>
> Matthias

Indeed, I have all cross-compilers running on explore.dgnu.org at
least.  A little hello world program should give a rough idea of which
targets are supported by libphobos - x86/64, ARM/64, MIPS/64 and
PPC/64 should be either supported or partial.  And if partial, then I
suspect that it's only because there's a missing 'CRuntime_xxx'
version define in the gcc/config patches.  Although I only use gdc on
Linux myself, the reference D compiler is tested on x86/Linux, OSX,
Solaris, and FreeBSD.  I'll optimistically say we should work on these
as well, although off the top of my head, the module dso support may
need looking at in the library.

Iain.


Re: How to do scan-tree-dump for test.o

2017-06-06 Thread Richard Biener
On Tue, Jun 6, 2017 at 3:39 PM, Tom de Vries  wrote:
> [ was: Re: [nvptx, PATCH, 3/3] Add v2di support ]
>
> On 06/06/2017 03:12 PM, Tom de Vries wrote:
>>
>> diff --git a/libgomp/testsuite/libgomp.oacc-c/vec.c
>> b/libgomp/testsuite/libgomp.oacc-c/vec.c
>> new file mode 100644
>> index 000..79c1c17
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-c/vec.c
>> @@ -0,0 +1,48 @@
>> +/* { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
>> +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
>> +/* { dg-additional-options "-std=c99 -ftree-slp-vectorize
>> -foffload=-ftree-slp-vectorize -foffload=-fdump-tree-slp1
>> -foffload=-save-temps -save-temps" } */
>> +
>> +#include 
>> +#include 
>> +
>> +long long int p[32 *1000] __attribute__((aligned(16)));
>> +long long int p2[32 *1000] __attribute__((aligned(16)));
>> +
>> +int
>> +main (void)
>> +{
>> +#pragma acc parallel num_gangs(1) num_workers(1) vector_length(32)
>> +  {
>> +if (((unsigned long int)p & (0xfULL)) != 0)
>> +  __builtin_abort ();
>> +if (((unsigned long int)p2 & (0xfULL)) != 0)
>> +  __builtin_abort ();
>> +
>> +for (unsigned int k = 0; k < 1; k += 1)
>> +  {
>> +#pragma acc loop vector
>> +   for (unsigned long long int j = 0; j < 32; j += 1)
>> + {
>> +   unsigned long long a, b;
>> +   unsigned long long *p3, *p4;
>> +   p3 = (unsigned long long *)((unsigned long long int)p &
>> (~0xfULL));
>> +   p4 = (unsigned long long *)((unsigned long long int)p2 &
>> (~0xfULL));
>> +
>> +   for (unsigned int i = 0; i < 1000; i += 2)
>> + {
>> +   a = p3[j * 1000 + i];
>> +   b = p3[j * 1000 + i + 1];
>> +
>> +   p4[j * 1000 + i] = a;
>> +   p4[j * 1000 + i + 1] = b;
>> + }
>> + }
>> +  }
>> +  }
>> +
>> +  return 0;
>> +}
>> +
>> +/* Todo: make a scan-tree-dump variant that scans vec.o instead.  */
>> +/* { dg-final { file copy -force [glob vec.o.*] [regsub \.o\. [glob
>> vec.o.*] \.c\.] } } */
>> +/* { dg-final { scan-tree-dump "vector\\(2\\) long long unsigned int"
>> "slp1" } } */
>
>
> Hi,
>
> we have scan-tree-dump that scans in test.c.* files.  But when we run lto1
> for the offloaded region, we produce test.o.* files instead.  In the
> test-case above, I work around that by using 'dg-final { file copy }'. What
> is a good way to get rid of this workaround ?
>
> Add scan-o-tree-dump ?
>
> Or make the "slp1" field smarter, and allow f.i. "o.slp1" ?

There is the same issue with regular LTO tests using scan-tree-dump which
end up scanning the "fat" compilation dumpfile.  Maybe add
scan-ltrans-tree-dump and scan-wpa-ipa-dump that look at appropriate
files plus passing appropriate flags to generate dumpfiles in known locations
(I think part of them end up in /tmp).

Richard.

> Thanks,
> - Tom


Re: [PATCH 0/13] D: Submission of D Front End

2017-06-06 Thread Iain Buclaw
On 29 May 2017 at 22:57, Eric Botcazou  wrote:
>> The upstream DMD compiler that comprises all components of the
>> standalone part is now implemented in D programming language itself.
>> However here GDC is still using the C++ implementation, it is a future
>> goal to switch to being a self-hosted compiler minus the GCC binding
>> interface (similar to Ada?), however extended platform support is
>> something I wish to address first before I make this a consideration.
>
> Yes, the Ada compiler is written in Ada and the glue code (called gigi) lives
> in ada/gcc-interface and is written in C++.
>
> --
> Eric Botcazou

OK, so there should be no surprises when d/dfrontend itself is
replaced by D sources.

Iain.


How to do scan-tree-dump for test.o

2017-06-06 Thread Tom de Vries

[ was: Re: [nvptx, PATCH, 3/3] Add v2di support ]

On 06/06/2017 03:12 PM, Tom de Vries wrote:

diff --git a/libgomp/testsuite/libgomp.oacc-c/vec.c b/libgomp/testsuite/libgomp.oacc-c/vec.c
new file mode 100644
index 000..79c1c17
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/vec.c
@@ -0,0 +1,48 @@
+/* { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+/* { dg-additional-options "-std=c99 -ftree-slp-vectorize -foffload=-ftree-slp-vectorize -foffload=-fdump-tree-slp1 -foffload=-save-temps -save-temps" } */
+
+#include 
+#include 
+
+long long int p[32 *1000] __attribute__((aligned(16)));
+long long int p2[32 *1000] __attribute__((aligned(16)));
+
+int
+main (void)
+{
+#pragma acc parallel num_gangs(1) num_workers(1) vector_length(32)
+  {
+if (((unsigned long int)p & (0xfULL)) != 0)
+  __builtin_abort ();
+if (((unsigned long int)p2 & (0xfULL)) != 0)
+  __builtin_abort ();
+
+for (unsigned int k = 0; k < 1; k += 1)
+  {
+#pragma acc loop vector
+   for (unsigned long long int j = 0; j < 32; j += 1)
+ {
+   unsigned long long a, b;
+   unsigned long long *p3, *p4;
+   p3 = (unsigned long long *)((unsigned long long int)p & (~0xfULL));
+   p4 = (unsigned long long *)((unsigned long long int)p2 & (~0xfULL));
+
+   for (unsigned int i = 0; i < 1000; i += 2)
+ {
+   a = p3[j * 1000 + i];
+   b = p3[j * 1000 + i + 1];
+   
+   p4[j * 1000 + i] = a;
+   p4[j * 1000 + i + 1] = b;
+ }
+ }
+  }
+  }
+
+  return 0;
+}
+
+/* Todo: make a scan-tree-dump variant that scans vec.o instead.  */
+/* { dg-final { file copy -force [glob vec.o.*] [regsub \.o\. [glob vec.o.*] \.c\.] } } */
+/* { dg-final { scan-tree-dump "vector\\(2\\) long long unsigned int" "slp1" } } */


Hi,

we have scan-tree-dump that scans in test.c.* files.  But when we run 
lto1 for the offloaded region, we produce test.o.* files instead.  In 
the test-case above, I work around that by using 'dg-final { file copy 
}'. What is a good way to get rid of this workaround ?


Add scan-o-tree-dump ?

Or make the "slp1" field smarter, and allow f.i. "o.slp1" ?

Thanks,
- Tom


Re: [PATCH][GCC][AArch64] Fix subreg bug in scalar copysign

2017-06-06 Thread James Greenhalgh
On Wed, Mar 15, 2017 at 04:04:35PM +, Tamar Christina wrote:
> Hi All, 
> 
> This fixes a bug in the scalar version of copysign where, due to a subreg,
> we were generating less than efficient code.
> 
> This patch replaces
> 
>   return x * __builtin_copysignf (150.0f, y);
> 
> which used to generate
> 
>   adrp x1, .LC1
>   mov x0, 2147483648
>   ins v3.d[0], x0
>   ldr s2, [x1, #:lo12:.LC1]
>   bsl v3.8b, v1.8b, v2.8b
>   fmul s0, s0, s3
>   ret
> 
> .LC1:
>   .word   1125515264
> 
> with
>   mov x0, 1125515264
>   movi v2.2s, 0x80, lsl 24
>   fmov d3, x0
>   bit v3.8b, v1.8b, v2.8b
>   fmul s0, s0, s3
>   ret
> 
> removing the incorrect ins.
> 
> Regression tested on aarch64-none-linux-gnu and no regressions.
> 
> OK for trunk?

OK.

Thanks,
James

> gcc/
> 2017-03-15  Tamar Christina  
> 
>   * config/aarch64/aarch64.md
>   (copysignsf3): Fix mask generation.



Re: [Patch AArch64] Do not increase data alignment at -Os and with -fconserve-stack.

2017-06-06 Thread Ramana Radhakrishnan
Ping..

Ramana

On Tue, May 2, 2017 at 10:52 AM, Ramana Radhakrishnan
 wrote:
> We unnecessarily align data to 8 byte alignments even when -Os is specified.
> This brings the logic in the AArch64 backend more in line with the ARM
> backend and helps save some image size in a few places. Caught by an
> internal report on the size of rodata sections being high with aarch64 gcc.
>
> * config/aarch64/aarch64.h (AARCH64_EXPAND_ALIGNMENT): New.
>   (DATA_ALIGNMENT): Update to use AARCH64_EXPAND_ALIGNMENT.
>   (LOCAL_ALIGNMENT): Update to use AARCH64_EXPAND_ALIGNMENT.
>
> Bootstrapped and regression tested on aarch64-none-linux-gnu with no
> regressions.
>
> Ok to commit ?
>
>
> cheers
> Ramana
>


Re: [PATCH] Introduce 4-stages profiledbootstrap to get a better profile.

2017-06-06 Thread Martin Liška
On 05/29/2017 07:04 AM, Markus Trippelsdorf wrote:
> On 2017.05.25 at 11:55 +0200, Martin Liška wrote:
>> Hi.
>>
>> As I spoke about the PGO with Honza and Richi, current 3-stage is not ideal
>> for the following 2 reasons:
>>
>> 1) stageprofile compiler is trained just on libraries that are built during
>> stage2
>> 2) apart from that, as the compiler is also used to build the final
>> compiler, profile is being updated during the build. So the stage2 compiler
>> is making different decisions.
>>
>> Both problems can be resolved by adding another step in between current
>> stage2 and stage3 where we train stage2 compiler by building compiler with
>> default options.
> 
> Another issue that I've noticed is that LTO doesn't get used in the
> final stage (stagefeedback) with "bootstrap-O3 bootstrap-lto".
> It only is used during training. So either move -flto to stagefeedback,
> or use -flto both during training and during the final stage.
> 

You're right, thus I'm suggesting to use -flto in all stages of PGO if
'make profiledbootstrap' is invoked.

Martin
From 77fb3302ef98548d37bf0f891ff09bca297f77fa Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 6 Jun 2017 11:23:38 +0200
Subject: [PATCH] Enable -flto in all PGO stages for
 bootstrap-lto-{,noplugin}.mk.

config/ChangeLog:

2017-06-06  Martin Liska  

	* bootstrap-lto-noplugin.mk: Enable -flto in all PGO stages.
	* bootstrap-lto.mk: Likewise.
---
 config/bootstrap-lto-noplugin.mk | 4 +++-
 config/bootstrap-lto.mk  | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/config/bootstrap-lto-noplugin.mk b/config/bootstrap-lto-noplugin.mk
index a5073365b5a..fc980d2bc17 100644
--- a/config/bootstrap-lto-noplugin.mk
+++ b/config/bootstrap-lto-noplugin.mk
@@ -3,4 +3,6 @@
 
 STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
 STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
-STAGEprofile_CFLAGS += -fno-lto
+STAGEprofile_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
+STAGEtrain_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
+STAGEfeedback_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
diff --git a/config/bootstrap-lto.mk b/config/bootstrap-lto.mk
index 9e065e1d85a..50b86ef1c81 100644
--- a/config/bootstrap-lto.mk
+++ b/config/bootstrap-lto.mk
@@ -2,7 +2,9 @@
 
 STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1
 STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEprofile_CFLAGS += -fno-lto
+STAGEprofile_CFLAGS += -flto=jobserver -frandom-seed=1
+STAGEtrain_CFLAGS += -flto=jobserver -frandom-seed=1
+STAGEfeedback_CFLAGS += -flto=jobserver -frandom-seed=1
 
 # assumes the host supports the linker plugin
 LTO_AR = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ar$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/
-- 
2.13.0



Re: [PATCH][AArch64] Add crypto_pmull attribute

2017-06-06 Thread James Greenhalgh
On Fri, Mar 10, 2017 at 06:37:30AM +, Hurugalawadi, Naveen wrote:
> Hi James,
> 
> >> You need to do this for all cores which might be affected by this change,
> >> i.e. all those which model neon_mul_d_long.
> 
> Thanks for pointing out the missing cores in patch.
> Added the support as per your comments.
> 
> Please find attached the modified patch and let us know
> if it's okay for stage1?

From an AArch64 perspective, this is OK - but please wait for an ARM
approval before you commit it.

Thanks,
James

> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 7ad3a76..1aa1b96 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -5818,7 +5818,7 @@
>   UNSPEC_PMULL))]
>   "TARGET_SIMD && TARGET_CRYPTO"
>   "pmull\\t%0.1q, %1.1d, %2.1d"
> -  [(set_attr "type" "neon_mul_d_long")]
> +  [(set_attr "type" "crypto_pmull")]
>  )
>  
>  (define_insn "aarch64_crypto_pmullv2di"
> @@ -5828,5 +5828,5 @@
> UNSPEC_PMULL2))]
>"TARGET_SIMD && TARGET_CRYPTO"
>"pmull2\\t%0.1q, %1.2d, %2.2d"
> -  [(set_attr "type" "neon_mul_d_long")]
> +  [(set_attr "type" "crypto_pmull")]
>  )
> diff --git a/gcc/config/aarch64/thunderx2t99.md 
> b/gcc/config/aarch64/thunderx2t99.md
> index 0dd7199..67011ac 100644
> --- a/gcc/config/aarch64/thunderx2t99.md
> +++ b/gcc/config/aarch64/thunderx2t99.md
> @@ -441,3 +441,8 @@
>(and (eq_attr "tune" "thunderx2t99")
> (eq_attr "type" "neon_store2_one_lane,neon_store2_one_lane_q"))
>"thunderx2t99_ls01,thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_pmull" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +   (eq_attr "type" "crypto_pmull"))
> +  "thunderx2t99_f1")
> diff --git a/gcc/config/arm/cortex-a53.md b/gcc/config/arm/cortex-a53.md
> index 7cf5fc5..049ac85 100644
> --- a/gcc/config/arm/cortex-a53.md
> +++ b/gcc/config/arm/cortex-a53.md
> @@ -379,7 +379,7 @@
>neon_sat_mul_b_long, neon_sat_mul_h_long,\
>neon_sat_mul_s_long, neon_sat_mul_h_scalar_q,\
>neon_sat_mul_s_scalar_q, neon_sat_mul_h_scalar_long,\
> -  neon_sat_mul_s_scalar_long, neon_mla_b_q,\
> +  neon_sat_mul_s_scalar_long, crypto_pmull, neon_mla_b_q,\
>neon_mla_h_q, neon_mla_s_q, neon_mla_b_long,\
>neon_mla_h_long, neon_mla_s_long,\
>neon_mla_h_scalar_q, neon_mla_s_scalar_q,\
> diff --git a/gcc/config/arm/cortex-a57.md b/gcc/config/arm/cortex-a57.md
> index fd30758..ebf4a49 100644
> --- a/gcc/config/arm/cortex-a57.md
> +++ b/gcc/config/arm/cortex-a57.md
> @@ -76,7 +76,7 @@
>  neon_mul_h_scalar_long, neon_mul_s_scalar_long,\
>  neon_sat_mul_b_long, neon_sat_mul_h_long,\
>  neon_sat_mul_s_long, neon_sat_mul_h_scalar_long,\
> -neon_sat_mul_s_scalar_long")
> +neon_sat_mul_s_scalar_long, crypto_pmull")
>   (const_string "neon_multiply")
> (eq_attr "type" "neon_mul_b_q, neon_mul_h_q, neon_mul_s_q,\
>  neon_mul_h_scalar_q, neon_mul_s_scalar_q,\
> diff --git a/gcc/config/arm/crypto.md b/gcc/config/arm/crypto.md
> index 46b0715..a5e558b 100644
> --- a/gcc/config/arm/crypto.md
> +++ b/gcc/config/arm/crypto.md
> @@ -81,7 +81,7 @@
>   UNSPEC_VMULLP64))]
>"TARGET_CRYPTO"
>"vmull.p64\\t%q0, %P1, %P2"
> -  [(set_attr "type" "neon_mul_d_long")]
> +  [(set_attr "type" "crypto_pmull")]
>  )
>  
>  (define_insn "crypto_"
> diff --git a/gcc/config/arm/exynos-m1.md b/gcc/config/arm/exynos-m1.md
> index 5d397cc..b54d4c8 100644
> --- a/gcc/config/arm/exynos-m1.md
> +++ b/gcc/config/arm/exynos-m1.md
> @@ -78,7 +78,7 @@
>  neon_sat_mul_s_scalar, neon_sat_mul_s_scalar_q,\
>  neon_sat_mul_b_long, neon_sat_mul_h_long,\
>  neon_sat_mul_s_long, neon_sat_mul_h_scalar_long,\
> -neon_sat_mul_s_scalar_long")
> +neon_sat_mul_s_scalar_long, crypto_pmull")
>   (const_string "neon_multiply")
>  
> (eq_attr "type" "neon_mla_b, neon_mla_h, neon_mla_s,\
> diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
> index b0b375c..253f496 100644
> --- a/gcc/config/arm/types.md
> +++ b/gcc/config/arm/types.md
> @@ -539,6 +539,7 @@
>  ; crypto_sha1_slow
>  ; crypto_sha256_fast
>  ; crypto_sha256_slow
> +; crypto_pmull
>  ;
>  ; The classification below is for coprocessor instructions
>  ;
> @@ -1078,6 +1079,7 @@
>crypto_sha1_slow,\
>crypto_sha256_fast,\
>crypto_sha256_slow,\
> +  crypto_pmull,\
>coproc"
> (const_string "untyped"))
>  
> diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md
> index 62a0732..34a13f4 100644
> --- a/gcc/config/arm/xgene1.md
> +++ b/gcc/config/arm/xgene1.md
> @@ -527,5 +527,6 @@
>  

Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-06-06 Thread Ramana Radhakrishnan
On Tue, Jun 6, 2017 at 1:56 PM, James Greenhalgh
 wrote:
> On Tue, May 02, 2017 at 04:37:21PM +0100, Tamar Christina wrote:
>> Hi All,
>>
>> This patch adjusts the cost model for Cortex-A53 to increase the costs of
>> an integer division. The reason for this is that we want to always expand
>> the division to a multiply when doing a division by constant.
>>
>> On the Cortex-A53 shifts are modeled to cost 1 instruction,
>> when doing the expansion we have to perform two shifts and an addition.
>> However because the cost model can't model things such as fusing of shifts,
>> we have to fully cost both shifts.
>>
>> This leads to the cost model telling us that for the Cortex-A53 we can never
>> do the expansion. By increasing the costs of the division by two instructions
>> we recover the room required in the cost calculation to do the expansions.
>>
>> The reason for all of this is that currently the code does not produce what
>> you'd expect, which is that divisions by constants are always expanded. Also
>> it's inconsistent because unsigned division does get expanded.
>>
>> This all reduces the ability to do CSE when using signed modulo since that
>> one is also expanded.
>>
>> Given:
>>
>> void f5(void)
>> {
>>   int x = 0;
>>   while (x > -1000)
>>   {
>> g(x % 300);
>> x--;
>>   }
>> }
>>
>>
>> we now generate
>>
>>   smull   x0, w19, w21
>>   asr x0, x0, 37
>>   sub w0, w0, w19, asr 31
>>   msub w0, w0, w20, w19
>>   sub w19, w19, #1
>>   bl  g
>>
>> as opposed to
>>
>>   sdiv w0, w19, w20
>>   msub w0, w0, w20, w19
>>   sub w19, w19, #1
>>   bl  g
>>
>>
>> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.
>>
>> OK for trunk?
>
> OK for AArch64, but you'll need an ARM OK too.

OK.

Ramana
>
> Thanks,
> James
>
>> gcc/
>> 2017-05-02  Tamar Christina  
>>
>>   * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv cost.
>
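
For readers following the expanded sequence above, here is a C sketch of
what smull / asr / sub / msub compute for x % 300.  The listing keeps the
multiplier in w21 without showing its value; 458129845 = ceil (2^37 / 300)
is assumed here because it is the value that goes with a shift by 37.  The
function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed multiplier: ceil (2^37 / 300).  The listing keeps it in w21
   without showing its value.  */
#define MAGIC_300 INT64_C (458129845)

/* Truncating signed division by 300, following the smull / asr / sub
   sequence.  Relies on >> of a negative value being an arithmetic
   shift, which is implementation-defined in C but is what GCC does.  */
int32_t
sdiv300 (int32_t x)
{
  int32_t q = (int32_t) (((int64_t) x * MAGIC_300) >> 37);	/* smull, asr */
  return q - (x >> 31);		/* adds 1 when x is negative */
}

/* x % 300 via multiply-subtract, as msub does.  */
int32_t
smod300 (int32_t x)
{
  return x - sdiv300 (x) * 300;
}

/* Compare against the built-in operators.  */
int
check_smod300 (void)
{
  /* The range f5 walks through, plus the extremes.  */
  for (int32_t x = 0; x > -1000; x--)
    assert (smod300 (x) == x % 300);
  assert (sdiv300 (INT32_MIN) == INT32_MIN / 300);
  assert (sdiv300 (INT32_MAX) == INT32_MAX / 300);
  return 1;
}
```

The subtraction of x >> 31 is what turns the floor result of the shift
into C's truncating division for negative inputs.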


Re: [PATCH][AArch64] Add STP pattern to store a vec_concat of two 64-bit registers

2017-06-06 Thread James Greenhalgh
On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> On top of the previous vec_merge simplifications [1] we can add this pattern
> to perform a store of a vec_concat of two 64-bit values in distinct registers
> as an STP.  This avoids constructing such a vector explicitly in a register
> and storing it as a Q register.
> This way for the code in the testcase we can generate:
> 
> construct_lane_1:
> ldp d1, d0, [x0]
> fmov d3, 1.0e+0
> fmov d2, 2.0e+0
> fadd d4, d1, d3
> fadd d5, d0, d2
> stp d4, d5, [x1, 32]
> ret
> 
> construct_lane_2:
> ldp x2, x0, [x0]
> add x3, x2, 1
> add x4, x0, 2
> stp x3, x4, [x1, 32]
> ret
> 
> instead of the current:
> construct_lane_1:
> ldp d0, d1, [x0]
> fmov d3, 1.0e+0
> fmov d2, 2.0e+0
> fadd d0, d0, d3
> fadd d1, d1, d2
> dup v0.2d, v0.d[0]
> ins v0.d[1], v1.d[0]
> str q0, [x1, 32]
> ret
> 
> construct_lane_2:
> ldp x2, x3, [x0]
> add x0, x2, 1
> add x2, x3, 2
> dup v0.2d, x0
> ins v0.d[1], x2
> str q0, [x1, 32]
> ret
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for GCC 8?

OK.

Thanks,
James

> 2017-06-06  Kyrylo Tkachov  
> 
> * config/aarch64/aarch64-simd.md (store_pair_lanes):
> New pattern.
> * config/aarch64/constraints.md (Uml): New constraint.
> * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): New
> predicate.
> 
> 2017-06-06  Kyrylo Tkachov  
> 
> * gcc.target/aarch64/store_v2vec_lanes.c: New test.
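
The testcase itself is not included in the message.  A source along the
following lines, using the GNU C vector extension, would be expected to
correspond to the listings above (two scalars loaded from the first
argument, 1 and 2 added, the pair stored 32 bytes into the second
argument).  It is a reconstruction from the assembly, not the contents of
store_v2vec_lanes.c, and check_construct_lanes is a made-up helper.

```c
#include <assert.h>

typedef long long v2di __attribute__ ((vector_size (16)));
typedef double v2df __attribute__ ((vector_size (16)));

void
construct_lane_1 (double *y, v2df *z)
{
  double y0 = y[0] + 1;
  double y1 = y[1] + 2;
  v2df x = { y0, y1 };
  z[2] = x;			/* byte offset 32, matching [x1, 32] */
}

void
construct_lane_2 (long long *y, v2di *z)
{
  long long y0 = y[0] + 1;
  long long y1 = y[1] + 2;
  v2di x = { y0, y1 };
  z[2] = x;			/* byte offset 32, matching [x1, 32] */
}

/* Check that both lanes land in the third vector slot.  */
int
check_construct_lanes (void)
{
  double yd[2] = { 10.0, 20.0 };
  long long yl[2] = { 10, 20 };
  v2df zd[3];
  v2di zl[3];

  construct_lane_1 (yd, zd);
  construct_lane_2 (yl, zl);
  assert (zd[2][0] == 11.0 && zd[2][1] == 22.0);
  assert (zl[2][0] == 11 && zl[2][1] == 22);
  return 1;
}
```

Because the two lanes already live in separate registers, the store of x
can be done as one STP of the pair, which is the point of the new pattern.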




Re: [PATCH][AArch64] Allow const0_rtx operand for atomic compare-exchange patterns

2017-06-06 Thread James Greenhalgh
On Tue, Feb 28, 2017 at 12:29:50PM +, Kyrill Tkachov wrote:
> Hi all,
> 
> For the testcase in this patch we currently generate:
> foo:
> mov w1, 0
> ldaxr   w2, [x0]
> cmp w2, 3
> bne .L2
> stxr w3, w1, [x0]
> cmp w3, 0
> .L2:
> cset w0, eq
> ret
> 
> Note that the STXR could have been storing the WZR register instead of moving
> zero into w1.
> This is due to overly strict predicates and constraints in the store
> exclusive pattern and the atomic compare exchange expanders and splitters.
> This simple patch fixes that in the patterns concerned and with it we can
> generate:
> foo:
> ldaxr   w1, [x0]
> cmp w1, 3
> bne .L2
> stxr w2, wzr, [x0]
> cmp w2, 0
> .L2:
> cset w0, eq
> ret
> 
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for GCC 8?

OK.

Thanks,
James

> 2017-02-28  Kyrylo Tkachov  
> 
> * config/aarch64/atomics.md (atomic_compare_and_swap expander):
> Use aarch64_reg_or_zero predicate for operand 4.
> (aarch64_compare_and_swap define_insn_and_split):
> Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
> (aarch64_store_exclusive): Likewise for operand 2.
> 
> 2017-02-28  Kyrylo Tkachov  
> 
> * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.
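
The source of the testcase is not quoted above.  Judging only from the
listing (an acquire load, a compare against 3, a single store-exclusive of
zero with no retry loop, and a cset of the result), it is presumably a
weak compare-and-swap along these lines.  This is a guess at its shape
using the __atomic_compare_exchange_n builtin, not the actual test file,
and check_foo is a made-up helper.

```c
#include <assert.h>

/* Weak compare-and-swap of *A from 3 to 0 with acquire ordering;
   returns nonzero on success.  */
int
foo (int *a)
{
  int x = 3;
  return __atomic_compare_exchange_n (a, &x, 0, 1,
				      __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
}

/* A weak exchange may fail spuriously, so retry while the value still
   matches the expected one.  */
int
check_foo (void)
{
  int v = 3, w = 5;

  while (!foo (&v) && v == 3)
    ;
  assert (v == 0);		/* 3 was replaced by 0 */
  assert (!foo (&w));		/* 5 does not match, so no store */
  assert (w == 5);
  return 1;
}
```

The desired value is the constant 0, which is why the store can use WZR
directly once the predicates accept const0_rtx.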



[nvptx, PATCH, 3/3] Add v2di support

2017-06-06 Thread Tom de Vries

Hi,

this patch adds v2di support to the nvptx target.  This allows us to 
generate 128-bit loads and stores.


Tested in nvptx mainkernel mode and x86_64 accelerator mode.

OK for trunk?

Thanks,
- Tom
Add v2di support

2017-06-06  Tom de Vries  

	* config/nvptx/nvptx-modes.def: Add V2DImode.
	* config/nvptx/nvptx-protos.h (nvptx_data_alignment): Declare.
	* config/nvptx/nvptx.c (nvptx_ptx_type_from_mode): Handle V2DImode.
	(nvptx_output_mov_insn): Handle lack of mov.b128.
	(nvptx_print_operand): Handle 'H' and 'L' codes.
	(nvptx_vector_mode_supported): Allow V2DImode.
	(nvptx_preferred_simd_mode): New function.
	(nvptx_data_alignment): New function.
	(TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Redefine to
	nvptx_preferred_simd_mode.
	* config/nvptx/nvptx.h (STACK_BOUNDARY, BIGGEST_ALIGNMENT): Change from
	64 to 128 bits.
	(DATA_ALIGNMENT): Define.  Set to nvptx_data_alignment.

	* config/nvptx/nvptx.md (VECIM): Add V2DI.

	* gcc.target/nvptx/decl-init.c: Update alignment.
	* gcc.target/nvptx/slp-2-run.c: New test.
	* gcc.target/nvptx/slp-2.c: New test.
	* gcc.target/nvptx/v2di.c: New test.

	* testsuite/libgomp.oacc-c/vec.c: New test.

---
 gcc/config/nvptx/nvptx-modes.def   |  2 +
 gcc/config/nvptx/nvptx-protos.h|  1 +
 gcc/config/nvptx/nvptx.c   | 68 +-
 gcc/config/nvptx/nvptx.h   |  6 ++-
 gcc/config/nvptx/nvptx.md  |  2 +-
 gcc/testsuite/gcc.target/nvptx/decl-init.c |  2 +-
 gcc/testsuite/gcc.target/nvptx/slp-2-run.c | 23 ++
 gcc/testsuite/gcc.target/nvptx/slp-2.c | 25 +++
 gcc/testsuite/gcc.target/nvptx/v2di.c  | 12 ++
 libgomp/testsuite/libgomp.oacc-c/vec.c | 48 +
 10 files changed, 183 insertions(+), 6 deletions(-)

diff --git a/gcc/config/nvptx/nvptx-modes.def b/gcc/config/nvptx/nvptx-modes.def
index d49429c..ff61b36 100644
--- a/gcc/config/nvptx/nvptx-modes.def
+++ b/gcc/config/nvptx/nvptx-modes.def
@@ -1 +1,3 @@
 VECTOR_MODE (INT, SI, 2);  /* V2SI */
+
+VECTOR_MODE (INT, DI, 2);  /* V2DI */
diff --git a/gcc/config/nvptx/nvptx-protos.h b/gcc/config/nvptx/nvptx-protos.h
index 16b316f..c3e3b84 100644
--- a/gcc/config/nvptx/nvptx-protos.h
+++ b/gcc/config/nvptx/nvptx-protos.h
@@ -41,6 +41,7 @@ extern void nvptx_function_end (FILE *);
 extern void nvptx_output_skip (FILE *, unsigned HOST_WIDE_INT);
 extern void nvptx_output_ascii (FILE *, const char *, unsigned HOST_WIDE_INT);
 extern void nvptx_register_pragmas (void);
+extern unsigned int nvptx_data_alignment (const_tree, unsigned int);
 
 #ifdef RTX_CODE
 extern void nvptx_expand_oacc_fork (unsigned);
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index d513ddb..1c84b1b 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -236,6 +236,8 @@ nvptx_ptx_type_from_mode (machine_mode mode, bool promote)
 
 case V2SImode:
   return ".v2.u32";
+case V2DImode:
+  return ".v2.u64";
 
 default:
   gcc_unreachable ();
@@ -2181,7 +2183,20 @@ nvptx_output_mov_insn (rtx dst, rtx src)
 	? "%.\tmov%t0\t%0, %1;" : "%.\tmov.b%T0\t%0, %1;");
 
   if (GET_MODE_SIZE (dst_inner) == GET_MODE_SIZE (src_inner))
-return "%.\tmov.b%T0\t%0, %1;";
+{
+  if (GET_MODE_BITSIZE (dst_mode) == 128
+	  && GET_MODE_BITSIZE (GET_MODE (src)) == 128)
+	{
+	  /* mov.b128 is not supported.  */
+	  if (dst_inner == V2DImode && src_inner == TImode)
+	return "%.\tmov.u64\t%0.x, %L1;\n\t%.\tmov.u64\t%0.y, %H1;";
+	  else if (dst_inner == TImode && src_inner == V2DImode)
+	return "%.\tmov.u64\t%L0, %1.x;\n\t%.\tmov.u64\t%H0, %1.y;";
+
+	  gcc_unreachable ();
+	}
+  return "%.\tmov.b%T0\t%0, %1;";
+}
 
   return "%.\tcvt%t0%t1\t%0, %1;";
 }
@@ -2419,6 +2434,20 @@ nvptx_print_operand (FILE *file, rtx x, int code)
   fprintf (file, "%s", nvptx_ptx_type_from_mode (mode, code == 't'));
   break;
 
+case 'H':
+case 'L':
+  {
+	rtx inner_x = SUBREG_REG (x);
+	machine_mode inner_mode = GET_MODE (inner_x);
+	machine_mode split = maybe_split_mode (inner_mode);
+
+	output_reg (file, REGNO (inner_x), split,
+		(code == 'H'
+		 ? GET_MODE_SIZE (inner_mode) / 2
+		 : 0));
+  }
+  break;
+
 case 'S':
   {
 	nvptx_shuffle_kind kind = (nvptx_shuffle_kind) UINTVAL (x);
@@ -5363,7 +5392,38 @@ nvptx_goacc_reduction (gcall *call)
 static bool
 nvptx_vector_mode_supported (machine_mode mode)
 {
-  return mode == V2SImode;
+  return (mode == V2SImode
+	  || mode == V2DImode);
+}
+
+/* Return the preferred mode for vectorizing scalar MODE.  */
+
+static machine_mode
+nvptx_preferred_simd_mode (machine_mode mode)
+{
+  switch (mode)
+{
+case DImode:
+  return V2DImode;
+case SImode:
+  return V2SImode;
+
+default:
+  return default_preferred_simd_mode (mode);
+}
+}
+
+unsigned int
+nvptx_data_alignment (const_tree type, unsigned int basic_align)
+{
+  if (TREE_CODE (type) == 

Re: [PING] Re: [PATCH] Fix-it hints for -Wimplicit-fallthrough

2017-06-06 Thread Marek Polacek
On Fri, May 26, 2017 at 02:13:56PM -0600, Martin Sebor wrote:
> On 05/26/2017 01:59 PM, David Malcolm wrote:
> > Ping:
> >   https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00334.html
> > 
> > On Thu, 2017-05-04 at 14:16 -0400, David Malcolm wrote:
> > > As of r247522, fix-it-hints can suggest the insertion of new lines.
> > > 
> > > This patch updates -Wimplicit-fallthrough to provide suggestions
> > > with fix-it hints, showing the user where to insert "break;" or
> > > fallthrough attributes.
> > > 
> > > For example:
> > > 
> > >  test.c: In function 'set_x':
> > >  test.c:15:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
> > > x = a;
> > > ~~^~~
> > >  test.c:22:5: note: here
> > >   case 'b':
> > >   ^~~~
> > >  test.c:22:5: note: insert '__attribute__ ((fallthrough));' to silence this warning
> > >  +__attribute__ ((fallthrough));
> > >   case 'b':
> > >   ^~~~
> > >  test.c:22:5: note: insert 'break;' to avoid fall-through
> > >  +break;
> > >   case 'b':
> > >   ^~~~
> 
> I haven't read the patch but the notes above make me wonder:
> 
> If the location of at least one of the hints is always the same
> as that of the first "note: here" would it make sense to lose
> the latter and reduce the size of the output?  (Or lose it in
> the cases where one of the fix-it hints does share a location
> with it).

I agree that it's a tad verbose but I'm not sure it'd be easy enough
to suppress printing
 case 2:
 ^~~~
multiple times.  So I'd be fine with the current state.
I'd also be happy with e.g.

a.c: In function ‘foo’:
a.c:7:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
   x = 1;
   ~~^~~
a.c:8:5: note: here
 case 2:
 ^~~~
a.c:8:5: note: insert ‘__attribute__ ((fallthrough));’ to silence this warning
+__attribute__ ((fallthrough));
 or insert ‘break;’ to avoid fall-through
+break;

but I don't think we've got the means to do so.

On the patch, I'd think that add_newline_fixit_with_indentation doesn't
belong in gimplify.c, even though it only has one user.  But I won't be
able to approve it in any case.

Marek
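
To make the two suggested fix-its concrete, here is a small C sketch of
what the code looks like after applying each one.  The function bodies
are invented for illustration (the thread only shows the diagnostics),
and the FALLTHROUGH macro is a local convenience so the snippet also
builds with compilers that lack the attribute.

```c
#include <assert.h>

/* Use the attribute where the compiler provides it; otherwise fall back
   to an empty statement.  */
#if defined (__has_attribute)
# if __has_attribute (fallthrough)
#  define FALLTHROUGH __attribute__ ((fallthrough))
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH ((void) 0)
#endif

/* Fall-through is intended: 'a' sets x and then also runs the 'b' arm.
   The attribute statement silences the warning.  */
int
set_x_fallthrough (char c, int a, int b)
{
  int x = 0;
  switch (c)
    {
    case 'a':
      x = a;
      FALLTHROUGH;
    case 'b':
      x += b;
      break;
    default:
      break;
    }
  return x;
}

/* Fall-through was a mistake: 'break;' keeps the arms separate.  */
int
set_x_break (char c, int a, int b)
{
  int x = 0;
  switch (c)
    {
    case 'a':
      x = a;
      break;
    case 'b':
      x = b;
      break;
    default:
      break;
    }
  return x;
}

/* Show that the two fixes give different results for 'a'.  */
int
check_set_x (void)
{
  assert (set_x_fallthrough ('a', 1, 2) == 3);
  assert (set_x_fallthrough ('b', 1, 2) == 2);
  assert (set_x_break ('a', 1, 2) == 1);
  assert (set_x_break ('b', 1, 2) == 2);
  return 1;
}
```

The two hints are not interchangeable, which is presumably why the
diagnostic offers both and leaves the choice to the user.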


Re: [PATCH][GCC][AARCH64]Adjust costs so udiv is preferred over sdiv when both are valid. [Patch (1/2)]

2017-06-06 Thread James Greenhalgh
On Tue, May 02, 2017 at 04:37:16PM +0100, Tamar Christina wrote:
> Hi All, 
> 
> This patch adjusts the cost model so that when both sdiv and udiv are possible
> it prefers udiv over sdiv. This was done by making sdiv slightly more
> expensive instead of making udiv cheaper to keep the baseline costs of a
> division the same as before.
> 
> For aarch64 this patch along with my other two related mid-end changes
> makes a big difference in division by constants.

This patch seems to have an unrelated change to the MOD/UMOD costs to delete
the handling of floating-point values. That change makes sense, but would
have been better in a separate patch.

> Given:
> 
> int f2(int x)
> {
>   return ((x * x) % 300) + ((x * x) / 300);
> }
> 
> we now generate
> 
> f2:
>   mul w0, w0, w0
>   mov w1, 33205
>   movk w1, 0x1b4e, lsl 16
>   mov w2, 300
>   umull   x1, w0, w1
>   lsr x1, x1, 37
>   msub w0, w1, w2, w0
>   add w0, w0, w1
>   ret
> 
> as opposed to
> 
> f2:
>   mul w0, w0, w0
>   mov w2, 33205
>   movk w2, 0x1b4e, lsl 16
>   mov w3, 300
>   smull   x1, w0, w2
>   umull   x2, w0, w2
>   asr x1, x1, 37
>   sub w1, w1, w0, asr 31
>   lsr x2, x2, 37
>   msub w0, w1, w3, w0
>   add w0, w0, w2
>   ret
> 
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.
> 
> OK for trunk?

OK.

Thanks,
James

> gcc/
> 2017-05-02  Tamar Christina  
> 
>   * config/aarch64/aarch64.c (aarch64_rtx_costs): Make sdiv more expensive than udiv.
>   Remove floating point cases from mod.
> 
> gcc/testsuite/
> 2017-05-02  Tamar Christina  
> 
>   * gcc.target/aarch64/sdiv_costs_1.c: New.
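
The mov/movk pair in the new f2 listing builds 33205 + (0x1b4e << 16) =
458129845, which is ceil (2^37 / 300); that is why the quotient falls out
of a umull followed by a shift right by 37.  A C sketch of the same
computation follows.  The names udiv300, f2_expanded and check_udiv300
are made up for illustration; f2 is the function from the message.

```c
#include <assert.h>
#include <stdint.h>

/* 33205 + (0x1b4e << 16), the constant built by mov/movk in the listing;
   equal to ceil (2^37 / 300).  */
#define MAGIC_300 ((UINT64_C (0x1b4e) << 16) + 33205)

/* Unsigned division by 300 as a widening multiply and a shift.  */
uint32_t
udiv300 (uint32_t n)
{
  return (uint32_t) (((uint64_t) n * MAGIC_300) >> 37);	/* umull, lsr */
}

/* f2 from the message.  */
int
f2 (int x)
{
  return ((x * x) % 300) + ((x * x) / 300);
}

/* The same value with the division and modulo sharing one quotient.
   Only meaningful while x * x does not overflow.  */
int
f2_expanded (int x)
{
  uint32_t sq = (uint32_t) (x * x);
  uint32_t q = udiv300 (sq);
  return (int) (sq - q * 300 + q);	/* msub, add */
}

int
check_udiv300 (void)
{
  /* Sample the 32-bit range with a prime stride, then hit the top value.  */
  for (uint64_t n = 0; n <= UINT32_MAX; n += 65521)
    assert (udiv300 ((uint32_t) n) == (uint32_t) n / 300);
  assert (udiv300 (UINT32_MAX) == UINT32_MAX / 300);
  for (int x = -1000; x <= 1000; x++)
    assert (f2_expanded (x) == f2 (x));
  return 1;
}
```

Since x * x cannot be negative, the unsigned form is valid for both the
quotient and the remainder, so one multiply serves both.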



[PATCH] libgo: fix support for ia64

2017-06-06 Thread Andreas Schwab
This adds support for ia64 in lfstack.

Andreas.

diff --git a/libgo/go/runtime/lfstack_64bit.go b/libgo/go/runtime/lfstack_64bit.go
index b314a3ba21..99dcec02de 100644
--- a/libgo/go/runtime/lfstack_64bit.go
+++ b/libgo/go/runtime/lfstack_64bit.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// +build amd64 arm64 mips64 mips64le ppc64 ppc64le s390x arm64be alpha mipsn64 sparc64
+// +build amd64 arm64 mips64 mips64le ppc64 ppc64le s390x arm64be alpha mipsn64 sparc64 ia64
 
 package runtime
 
@@ -38,12 +38,22 @@ const (
// room in the bottom for the count.
sparcLinuxAddrBits = 52
sparcLinuxCntBits  = 64 - sparcLinuxAddrBits + 3
+
+   // On IA64, the virtual address space is divided into 8 regions, with
+   // 52 address bits each (with 64k page size).
+   ia64AddrBits = 55
+   ia64CntBits = 64 - ia64AddrBits + 3
 )
 
 func lfstackPack(node *lfnode, cnt uintptr) uint64 {
if GOARCH == "sparc64" && GOOS == "linux" {
return 
uint64(uintptr(unsafe.Pointer(node)))<<(64-sparcLinuxAddrBits) | 
uint64(cnt&(1<> 
sparcLinuxCntBits << 3)))
}
+   if GOARCH == "ia64" {
+   return (*lfnode)(unsafe.Pointer(uintptr((val >> ia64CntBits << 
3)&(1<<(64-3)-1) | val&^(1<<(64-3)-1
+   }
return (*lfnode)(unsafe.Pointer(uintptr(val >> cntBits << 3)))
 }
-- 
2.13.1


-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[nvptx, PATCH, 2/3 ] Add v2si support

2017-06-06 Thread Tom de Vries

Hi,

this patch adds v2si support to the nvptx target.

Tested in nvptx mainkernel mode and x86_64 accelerator mode.

OK for trunk?

Thanks,
- Tom

Add v2si support

2017-06-06  Tom de Vries  

	* config/nvptx/nvptx-modes.def: New file.  Add V2SImode.
	* config/nvptx/nvptx.c (nvptx_ptx_type_from_mode): Handle V2SImode.
	(nvptx_vector_mode_supported): New function.  Allow V2SImode.
	(TARGET_VECTOR_MODE_SUPPORTED_P): Redefine to nvptx_vector_mode_supported.
	* config/nvptx/nvptx.md (VECIM): New mode iterator. Add V2SI.
	(mov_insn): New define_insn.
	(define_expand "mov): New define_expand.

	* gcc.target/nvptx/slp-run.c: New test.
	* gcc.target/nvptx/slp.c: New test.
	* gcc.target/nvptx/v2si-cvt.c: New test.
	* gcc.target/nvptx/v2si-run.c: New test.
	* gcc.target/nvptx/v2si.c: New test.
	* gcc.target/nvptx/vec.inc: New test.

---
 gcc/config/nvptx/nvptx-modes.def  |  1 +
 gcc/config/nvptx/nvptx.c  | 12 +
 gcc/config/nvptx/nvptx.md | 29 +++
 gcc/testsuite/gcc.target/nvptx/slp-run.c  | 23 +
 gcc/testsuite/gcc.target/nvptx/slp.c  | 25 ++
 gcc/testsuite/gcc.target/nvptx/v2si-cvt.c | 39 +++
 gcc/testsuite/gcc.target/nvptx/v2si-run.c | 83 +++
 gcc/testsuite/gcc.target/nvptx/v2si.c | 12 +
 gcc/testsuite/gcc.target/nvptx/vec.inc| 18 +++
 9 files changed, 242 insertions(+)

diff --git a/gcc/config/nvptx/nvptx-modes.def b/gcc/config/nvptx/nvptx-modes.def
new file mode 100644
index 000..d49429c
--- /dev/null
+++ b/gcc/config/nvptx/nvptx-modes.def
@@ -0,0 +1 @@
+VECTOR_MODE (INT, SI, 2);  /* V2SI */
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index beaad2c..d513ddb 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -234,6 +234,9 @@ nvptx_ptx_type_from_mode (machine_mode mode, bool promote)
 case DFmode:
   return ".f64";
 
+case V2SImode:
+  return ".v2.u32";
+
 default:
   gcc_unreachable ();
 }
@@ -5357,6 +5360,12 @@ nvptx_goacc_reduction (gcall *call)
 }
 }
 
+static bool
+nvptx_vector_mode_supported (machine_mode mode)
+{
+  return mode == V2SImode;
+}
+
 #undef TARGET_OPTION_OVERRIDE
 #define TARGET_OPTION_OVERRIDE nvptx_option_override
 
@@ -5471,6 +5480,9 @@ nvptx_goacc_reduction (gcall *call)
 #undef TARGET_GOACC_REDUCTION
 #define TARGET_GOACC_REDUCTION nvptx_goacc_reduction
 
+#undef TARGET_VECTOR_MODE_SUPPORTED_P
+#define TARGET_VECTOR_MODE_SUPPORTED_P nvptx_vector_mode_supported
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-nvptx.h"
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index f2ed63b..ba0567c 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -184,6 +184,7 @@
 (define_mode_iterator SDCM [SC DC])
 (define_mode_iterator BITS [SI SF])
 (define_mode_iterator BITD [DI DF])
+(define_mode_iterator VECIM [V2SI])
 
 ;; This mode iterator allows :P to be used for patterns that operate on
 ;; pointer-sized quantities.  Exactly one of the two alternatives will match.
@@ -201,6 +202,20 @@
%.\\tsetp.eq.u32\\t%0, 1, 1;")
 
 (define_insn "*mov_insn"
+  [(set (match_operand:VECIM 0 "nonimmediate_operand" "=R,R,m")
+	(match_operand:VECIM 1 "general_operand" "Ri,m,R"))]
+  "!MEM_P (operands[0]) || REG_P (operands[1])"
+{
+  if (which_alternative == 1)
+return "%.\\tld%A1%u1\\t%0, %1;";
+  if (which_alternative == 2)
+return "%.\\tst%A0%u0\\t%0, %1;";
+
+  return nvptx_output_mov_insn (operands[0], operands[1]);
+}
+  [(set_attr "subregs_ok" "true")])
+
+(define_insn "*mov_insn"
   [(set (match_operand:QHSDIM 0 "nonimmediate_operand" "=R,R,m")
 	(match_operand:QHSDIM 1 "general_operand" "Ri,m,R"))]
   "!MEM_P (operands[0]) || REG_P (operands[1])"
@@ -242,6 +257,20 @@
   ""
   "%.\\tmov%t0\\t%0, %%ar%1;")
 
+ (define_expand "mov"
+  [(set (match_operand:VECIM 0 "nonimmediate_operand" "")
+	(match_operand:VECIM 1 "general_operand" ""))]
+  ""
+{
+  if (MEM_P (operands[0]) && !REG_P (operands[1]))
+{
+  rtx tmp = gen_reg_rtx (mode);
+  emit_move_insn (tmp, operands[1]);
+  emit_move_insn (operands[0], tmp);
+  DONE;
+}
+})
+
 (define_expand "mov"
   [(set (match_operand:QHSDISDFM 0 "nonimmediate_operand" "")
 	(match_operand:QHSDISDFM 1 "general_operand" ""))]
diff --git a/gcc/testsuite/gcc.target/nvptx/slp-run.c b/gcc/testsuite/gcc.target/nvptx/slp-run.c
new file mode 100644
index 000..dedec471
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/slp-run.c
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-slp-vectorize" } */
+
+#include "slp.c"
+
+int
+main(void)
+{
+  unsigned int i;
+  for (i = 0; i < 1000; i += 1)
+{
+  p[i] = i;
+  p2[i] = 0;
+}
+
+  foo ();
+
+  for (i = 0; i < 1000; i += 1)
+if (p2[i] != i)
+  return 1;
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/nvptx/slp.c b/gcc/testsuite/gcc.target/nvptx/slp.c
new file mode 100644

[nvptx, PATCH, 1/3] Add generic v2 vector mode support

2017-06-06 Thread Tom de Vries

Hi,

this patch adds generic v2 vector mode support for nvptx.

Tested in nvptx mainkernel mode and x86_64 accelerator mode.

OK for trunk?

Thanks,
- Tom

Add generic v2 vector mode support

2017-06-06  Tom de Vries  

	* config/nvptx/nvptx.c (nvptx_print_operand): Handle v2 vector mode.

---
 gcc/config/nvptx/nvptx.c | 37 +
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 2eb5570..beaad2c 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -2403,9 +2403,15 @@ nvptx_print_operand (FILE *file, rtx x, int code)
 case 'u':
   if (x_code == SUBREG)
 	{
-	  mode = GET_MODE (SUBREG_REG (x));
-	  if (split_mode_p (mode))
-	mode = maybe_split_mode (mode);
+	  machine_mode inner_mode = GET_MODE (SUBREG_REG (x));
+	  if (VECTOR_MODE_P (inner_mode)
+	  && (GET_MODE_SIZE (mode)
+		  <= GET_MODE_SIZE (GET_MODE_INNER (inner_mode))))
+	mode = GET_MODE_INNER (inner_mode);
+	  else if (split_mode_p (inner_mode))
+	mode = maybe_split_mode (inner_mode);
+	  else
+	mode = inner_mode;
 	}
   fprintf (file, "%s", nvptx_ptx_type_from_mode (mode, code == 't'));
   break;
@@ -2506,7 +2512,14 @@ nvptx_print_operand (FILE *file, rtx x, int code)
 	machine_mode inner_mode = GET_MODE (inner_x);
 	machine_mode split = maybe_split_mode (inner_mode);
 
-	if (split_mode_p (inner_mode)
+	if (VECTOR_MODE_P (inner_mode)
+		&& (GET_MODE_SIZE (mode)
+		<= GET_MODE_SIZE (GET_MODE_INNER (inner_mode))))
+	  {
+		output_reg (file, REGNO (inner_x), VOIDmode);
+		fprintf (file, ".%s", SUBREG_BYTE (x) == 0 ? "x" : "y");
+	  }
+	else if (split_mode_p (inner_mode)
 		&& (GET_MODE_SIZE (inner_mode) == GET_MODE_SIZE (mode)))
 	  output_reg (file, REGNO (inner_x), split);
 	else
@@ -2548,6 +2561,22 @@ nvptx_print_operand (FILE *file, rtx x, int code)
 	fprintf (file, "0d%08lx%08lx", vals[1], vals[0]);
 	  break;
 
+	case CONST_VECTOR:
+	  {
+	unsigned n = CONST_VECTOR_NUNITS (x);
+	fprintf (file, "{ ");
+	for (unsigned i = 0; i < n; ++i)
+	  {
+		if (i != 0)
+		  fprintf (file, ", ");
+
+		rtx elem = CONST_VECTOR_ELT (x, i);
+		output_addr_const (file, elem);
+	  }
+	fprintf (file, " }");
+	  }
+	  break;
+
 	default:
 	  output_addr_const (file, x);
 	}
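
As a rough illustration of the operand text this patch aims to emit, here is a
minimal standalone sketch.  It is not the nvptx.c code: format_v2_const and
v2_component are hypothetical helpers that only mirror the "{ a, b }" form
printed for CONST_VECTOR and the ".x"/".y" suffix chosen from the subreg byte
offset.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a two-element integer vector constant the
   way the CONST_VECTOR case prints it, e.g. "{ 1, 2 }".  Returns the
   number of characters snprintf would have written.  */
static int
format_v2_const (char *buf, size_t size, const int elts[2])
{
  return snprintf (buf, size, "{ %d, %d }", elts[0], elts[1]);
}

/* Hypothetical helper: choose the component suffix for a scalar subreg
   of a two-element vector register.  Byte offset 0 selects the first
   element ("x"); any other offset selects the second ("y").  */
static const char *
v2_component (unsigned byte_offset)
{
  return byte_offset == 0 ? "x" : "y";
}
```

With V2SI, the second element sits at byte offset 4, so v2_component (4)
yields "y".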


Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-06-06 Thread James Greenhalgh
On Tue, May 02, 2017 at 04:37:21PM +0100, Tamar Christina wrote:
> Hi All, 
> 
> This patch adjusts the cost model for Cortex-A53 to increase the costs of
> an integer division. The reason for this is that we want to always expand
> the division to a multiply when doing a division by constant.
> 
> On the Cortex-A53, shifts are modeled to cost 1 instruction, and
> when doing the expansion we have to perform two shifts and an addition.
> However because the cost model can't model things such as fusing of shifts,
> we have to fully cost both shifts.
> 
> This leads to the cost model telling us that for the Cortex-A53 we can never
> do the expansion. By increasing the costs of the division by two instructions
> we recover the room required in the cost calculation to do the expansions.
> 
> The reason for all of this is that currently the code does not produce what
> you'd expect, which is that divisions by constants are always expanded. Also
> it's inconsistent because unsigned division does get expanded.
> 
> This all reduces the ability to do CSE when using signed modulo since that
> one is also expanded.
> 
> Given:
> 
> void f5(void)
> {
>   int x = 0;
>   while (x > -1000)
>   {
>     g(x % 300);
>     x--;
>   }
> }
> 
> 
> we now generate
> 
>   smull   x0, w19, w21
>   asr     x0, x0, 37
>   sub     w0, w0, w19, asr 31
>   msub    w0, w0, w20, w19
>   sub     w19, w19, #1
>   bl      g
> 
> as opposed to
> 
>   sdiv    w0, w19, w20
>   msub    w0, w0, w20, w19
>   sub     w19, w19, #1
>   bl      g
> 
> 
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.
> 
> OK for trunk?

OK for AArch64, but you'll need an ARM OK too.

Thanks,
James

> gcc/
> 2017-05-02  Tamar Christina  
> 
>   * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv 
> cost.
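
For readers who want to see what the expanded sequence in the quoted mail
computes, here is a minimal C sketch of signed division and modulo by 300
using a multiply and shifts.  The constant 458129845 is ceil (2^37 / 300); it
is assumed here to be the value held in w21, since the mail does not show it.
The function names are made up for the illustration, and the code relies on
right shifts of negative values being arithmetic, which GCC guarantees but ISO
C leaves implementation-defined.

```c
#include <stdint.h>

/* Signed division by 300 without a divide instruction.
   458129845 == ceil (2^37 / 300); assumed, not taken from the mail.  */
static int32_t
div_by_300 (int32_t n)
{
  /* smull: widen to 64 bits and multiply by the magic constant.  */
  int64_t prod = (int64_t) n * 458129845;
  /* asr #37: floor of the scaled product.  */
  int32_t q = (int32_t) (prod >> 37);
  /* sub ..., asr 31: n >> 31 is -1 for negative n, so this adds 1 and
     turns the floor into C's round-toward-zero quotient.  */
  q -= n >> 31;
  return q;
}

/* msub: remainder = n - quotient * 300.  */
static int32_t
mod_by_300 (int32_t n)
{
  return n - div_by_300 (n) * 300;
}
```

The multiply and shifts replace the sdiv; the trailing msub is the same in
both sequences.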



[PATCH] Fix PR80974

2017-06-06 Thread Richard Biener

The following fixes PR80974 by not being too clever when preserving
SSA info during VN.  I didn't want to invent a way to unwind changes
done in this hack.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-06-06  Richard Biener  

PR tree-optimization/80974
* tree-ssa-sccvn.c (set_ssa_val_to): Do not change but only
keep or clear leaders SSA info.

* gcc.dg/torture/pr80974.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 248913)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -3328,6 +3328,9 @@ set_ssa_val_to (tree from, tree to)
   == get_addr_base_and_unit_offset (TREE_OPERAND (to, 0), ))
   && coff == toff))
 {
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, " (changed)\n");
+
   /* If we equate two SSA names we have to make the side-band info
  of the leader conservative (and remember whatever original value
 was present).  */
@@ -3342,22 +3345,6 @@ set_ssa_val_to (tree from, tree to)
 gimple_bb (SSA_NAME_DEF_STMT (to
/* Keep the info from the dominator.  */
;
- else if (SSA_NAME_IS_DEFAULT_DEF (from)
-  || dominated_by_p_w_unex
-   (gimple_bb (SSA_NAME_DEF_STMT (to)),
-gimple_bb (SSA_NAME_DEF_STMT (from
-   {
- /* Save old info.  */
- if (! VN_INFO (to)->info.range_info)
-   {
- VN_INFO (to)->info.range_info = SSA_NAME_RANGE_INFO (to);
- VN_INFO (to)->range_info_anti_range_p
-   = SSA_NAME_ANTI_RANGE_P (to);
-   }
- /* Use that from the dominator.  */
- SSA_NAME_RANGE_INFO (to) = SSA_NAME_RANGE_INFO (from);
- SSA_NAME_ANTI_RANGE_P (to) = SSA_NAME_ANTI_RANGE_P (from);
-   }
  else
{
  /* Save old info.  */
@@ -3369,6 +3356,12 @@ set_ssa_val_to (tree from, tree to)
}
  /* Rather than allocating memory and unioning the info
 just clear it.  */
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "clearing range info of ");
+ print_generic_expr (dump_file, to);
+ fprintf (dump_file, "\n");
+   }
  SSA_NAME_RANGE_INFO (to) = NULL;
}
}
@@ -3381,17 +3374,6 @@ set_ssa_val_to (tree from, tree to)
 gimple_bb (SSA_NAME_DEF_STMT (to
/* Keep the info from the dominator.  */
;
- else if (SSA_NAME_IS_DEFAULT_DEF (from)
-  || dominated_by_p_w_unex
-   (gimple_bb (SSA_NAME_DEF_STMT (to)),
-gimple_bb (SSA_NAME_DEF_STMT (from
-   {
- /* Save old info.  */
- if (! VN_INFO (to)->info.ptr_info)
-   VN_INFO (to)->info.ptr_info = SSA_NAME_PTR_INFO (to);
- /* Use that from the dominator.  */
- SSA_NAME_PTR_INFO (to) = SSA_NAME_PTR_INFO (from);
-   }
  else if (! SSA_NAME_PTR_INFO (from)
   /* Handle the case of trivially equivalent info.  */
   || memcmp (SSA_NAME_PTR_INFO (to),
@@ -3403,14 +3385,18 @@ set_ssa_val_to (tree from, tree to)
VN_INFO (to)->info.ptr_info = SSA_NAME_PTR_INFO (to);
  /* Rather than allocating memory and unioning the info
 just clear it.  */
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "clearing points-to info of ");
+ print_generic_expr (dump_file, to);
+ fprintf (dump_file, "\n");
+   }
  SSA_NAME_PTR_INFO (to) = NULL;
}
}
}
 
   VN_INFO (from)->valnum = to;
-  if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, " (changed)\n");
   return true;
 }
   if (dump_file && (dump_flags & TDF_DETAILS))
Index: gcc/testsuite/gcc.dg/torture/pr80974.c
===
--- gcc/testsuite/gcc.dg/torture/pr80974.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr80974.c  (working copy)
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+
+int a, b, c, d, e, f, g[4];
+
+static int fn1 ()
+{
+  int h, i;
+  if (b)
+goto L1;
+L2:;
+   int m = a;
+   while (1)
+ {
+   int n = 2;
+   e = !f && (n = 5);
+   

Re: [PATCH] gcc::context creation

2017-06-06 Thread Richard Biener
On Mon, 5 Jun 2017, Nathan Sidwell wrote:

> On 06/05/2017 08:50 AM, Jakub Jelinek wrote:
> 
> > It was the intent that there is no unnecessary gap, the difference
> > between those two should be simply the maximum any FE registers.
> > So, on your branch you'd bump it to 4 and on trunk when merging your branch.
> 
> I can live with that.

Me too, in case you still need approval.

Richard.


Re: [PATCH] remove incorrect assert

2017-06-06 Thread Richard Biener
On Mon, Jun 5, 2017 at 1:14 PM, Gaius Mulley
 wrote:
>
> Hi,
>
> here is a tiny patch which removes an assert which I believe is wrong.
> I think it is an anomaly as the only callee (determine_max_movement at
> gcc/tree-ssa-loop-im.c:749) tests the asserted result against NULL.  (If
> the assert really were correct then the else statement is redundant
> (dead code) at line gcc/tree-ssa-loop-im.c:758).

That check checks for the first exit, !simple_mem_ref_in_stmt.

The assert is about having populated the hash properly when gathering
all simple_mem_refs.

> I think the else is correct and the assert wrong as the gnu modula-2
> front end can provoke the assert to fail (thus needing the else statement).

Then sth is wrong with either hashing, simple_mem_ref_in_stmt or the
IL generated by modula-2.

Richard.

> I've run the C regression tests and there are no changes in the results
> if the patch is applied.
>
> regards,
> Gaius
>
>
> --- gcc-versionno-orig/gcc/tree-ssa-loop-im.c   2017-06-01 17:06:50.228216946 
> +0100
> +++ gcc-versionno/gcc/tree-ssa-loop-im.c2017-06-01 21:34:55.623992245 
> +0100
> @@ -602,7 +602,6 @@
>hash = iterative_hash_expr (*mem, 0);
>ref = memory_accesses.refs->find_with_hash (*mem, hash);
>
> -  gcc_assert (ref != NULL);
>return ref;
>  }
>
> @@ -2631,5 +2630,3 @@
>  {
>return new pass_lim (ctxt);
>  }
> -
> -
>


[nvptx, committed] Add and use split_mode_p

2017-06-06 Thread Tom de Vries

Hi,

this patch adds and uses new utility function split_mode_p.

Committed as trivial.

Thanks,
- Tom

Add and use split_mode_p

2017-05-12  Tom de Vries  

	* config/nvptx/nvptx.c (split_mode_p): New function.
	(nvptx_declare_function_name, nvptx_print_operand): Use split_mode_p.

---
 gcc/config/nvptx/nvptx.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 75ecc94..2eb5570 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -328,6 +328,14 @@ maybe_split_mode (machine_mode mode)
   return VOIDmode;
 }
 
+/* Return true if mode should be treated as two registers.  */
+
+static bool
+split_mode_p (machine_mode mode)
+{
+  return maybe_split_mode (mode) != VOIDmode;
+}
+
 /* Output a register, subreg, or register pair (with optional
enclosing braces).  */
 
@@ -1277,7 +1285,7 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
 	  machine_mode mode = PSEUDO_REGNO_MODE (i);
 	  machine_mode split = maybe_split_mode (mode);
 
-	  if (split != VOIDmode)
+	  if (split_mode_p (mode))
 	mode = split;
 	  fprintf (file, "\t.reg%s ", nvptx_ptx_type_from_mode (mode, true));
 	  output_reg (file, i, split, -2);
@@ -2396,9 +2404,8 @@ nvptx_print_operand (FILE *file, rtx x, int code)
   if (x_code == SUBREG)
 	{
 	  mode = GET_MODE (SUBREG_REG (x));
-	  machine_mode split = maybe_split_mode (mode);
-	  if (split != VOIDmode)
-	mode = split;
+	  if (split_mode_p (mode))
+	mode = maybe_split_mode (mode);
 	}
   fprintf (file, "%s", nvptx_ptx_type_from_mode (mode, code == 't'));
   break;
@@ -2499,7 +2506,7 @@ nvptx_print_operand (FILE *file, rtx x, int code)
 	machine_mode inner_mode = GET_MODE (inner_x);
 	machine_mode split = maybe_split_mode (inner_mode);
 
-	if (split != VOIDmode
+	if (split_mode_p (inner_mode)
 		&& (GET_MODE_SIZE (inner_mode) == GET_MODE_SIZE (mode)))
 	  output_reg (file, REGNO (inner_x), split);
 	else
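
The shape of the helper can be shown in isolation.  The sketch below uses toy
stand-ins rather than GCC's machine_mode and maybe_split_mode: the enum and
all toy_* names are hypothetical.  The two split cases follow the TImode and
complex-mode handling visible in the related nvptx_print_operand change.

```c
#include <stdbool.h>

/* Toy stand-ins for machine modes; not GCC's real enum.  */
enum toy_mode { TOY_VOID, TOY_SI, TOY_DI, TOY_TI, TOY_SF, TOY_SC };

/* Toy analogue of maybe_split_mode: return the mode of each half for
   modes that occupy two registers, TOY_VOID otherwise.  */
static enum toy_mode
toy_maybe_split_mode (enum toy_mode mode)
{
  switch (mode)
    {
    case TOY_TI:
      return TOY_DI;	/* 128-bit integer: two 64-bit halves.  */
    case TOY_SC:
      return TOY_SF;	/* Complex float: real and imaginary parts.  */
    default:
      return TOY_VOID;
    }
}

/* Toy analogue of split_mode_p: true if MODE needs two registers.  */
static bool
toy_split_mode_p (enum toy_mode mode)
{
  return toy_maybe_split_mode (mode) != TOY_VOID;
}
```

Callers that only need a yes/no answer use the predicate; callers that need
the half mode still call the splitting function directly.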


[nvptx, committed] Use maybe_split_mode in nvptx_print_operand

2017-06-06 Thread Tom de Vries

Hi,

this patch uses maybe_split_mode in nvptx_print_operand.

Committed as trivial.

Thanks,
- Tom

Use maybe_split_mode in nvptx_print_operand

2017-05-12  Tom de Vries  

	* config/nvptx/nvptx.c (nvptx_print_operand): Use maybe_split_mode.

---
 gcc/config/nvptx/nvptx.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 4c35c16..75ecc94 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -2396,10 +2396,9 @@ nvptx_print_operand (FILE *file, rtx x, int code)
   if (x_code == SUBREG)
 	{
 	  mode = GET_MODE (SUBREG_REG (x));
-	  if (mode == TImode)
-	mode = DImode;
-	  else if (COMPLEX_MODE_P (mode))
-	mode = GET_MODE_INNER (mode);
+	  machine_mode split = maybe_split_mode (mode);
+	  if (split != VOIDmode)
+	mode = split;
 	}
   fprintf (file, "%s", nvptx_ptx_type_from_mode (mode, code == 't'));
   break;


RE: [PATCH,testsuite] Add check_effective_target_rdynamic and use it in g++.dg/lto/pr69589_0.C.

2017-06-06 Thread Toma Tabacu
Thanks, Rainer.
Committed as r248916.

Thanks for suggesting a backport, Renlin.

Regards,
Toma



Re: [PATCH][Aarch64] Add vectorized mersenne twister

2017-06-06 Thread Jonathan Wakely

On 06/06/17 12:41 +0100, Charles Baylis wrote:
> On 6 June 2017 at 11:07, James Greenhalgh  wrote:
>
> > If we don't mind that, and instead take a GCC-centric view, we could avoid
> > namespace pollution by using the GCC-internal names for the intrinsics
> > (__builtin_aarch64_...). As those names don't form a guaranteed interface,
> > that would tie us to a GCC version.
> >
> > So we have a few solutions to choose from, each of which invokes a trade-off:
> >
> >   1 Use the current names and pollute the namespace.
> >   2 Use the GCC internal __builtin_aarch64* names and tie libstdc++ to GCC
> > internals.
> >   3 Define a new set of namespace-clean names and complicate the Neon
> > intrinsic interface while we migrate old users to new names.
> >
> > I can see why the libstdc++ community would prefer 3) over the other options,
> > but I'm reticent to take that route as the cost to our intrinsic maintainers
> > and users looks high. I've added some of the other ARM port maintainers
> > for their opinion.
> >
> > Are there any other options I'm missing?
>
> If solving for C++ only is OK, then it might be feasible to do something like:
>
> namespace __arm_neon_for_ext_random {
> #include// like arm_neon.h, but without
> include guards [*]
> }
>
> Then the libstdc++ headers can use "using namespace
> __arm_neon_for_ext_random" inside the functions which use NEON
> intrinsics.
>
> [*] without include guards so that other header files can use the same
> trick in their own namespace.
>
> I'm not sure if this will work for all host compilers, with GCC I
> think it's OK because the intrinsics are implemented as inline
> functions, rather than #defines, but not all compilers have taken that
> approach.

It would be a bit better, but this valid (albeit silly) program would
be rejected:

#define int32x4_t ++/!!!POISONED!!!/++
#include 
int main() { }

Users are allowed to use names like int32x4_t for their own macros,
because that's not a name reserved for the implementation.

ARM developers are probably aware of those names and would avoid them,
but developers writing portable code that might get used on ARM are
now also unable to use those names if  might use them.

Of course, using names that aren't ALL_CAPS for macros is crazy, but
the standard library has to support craziness as long as it doesn't
violate the standard. That's why including  in libstdc++
headers is problematic. I would be prepared to live with it for a
non-standard extension header, but Ulrich makes a good point about
this code maybe going into namespace std one day.



Re: [PATCH][Aarch64] Add vectorized mersenne twister

2017-06-06 Thread Charles Baylis
On 6 June 2017 at 11:07, James Greenhalgh  wrote:

> If we don't mind that, and instead take a GCC-centric view, we could avoid
> namespace pollution by using the GCC-internal names for the intrinsics
> (__builtin_aarch64_...). As those names don't form a guaranteed interface,
> that would tie us to a GCC version.
>
> So we have a few solutions to choose from, each of which invokes a trade-off:
>
>   1 Use the current names and pollute the namespace.
>   2 Use the GCC internal __builtin_aarch64* names and tie libstdc++ to GCC
> internals.
>   3 Define a new set of namespace-clean names and complicate the Neon
> intrinsic interface while we migrate old users to new names.
>
> I can see why the libstdc++ community would prefer 3) over the other options,
> but I'm reticent to take that route as the cost to our intrinsic maintainers
> and users looks high. I've added some of the other ARM port maintainers
> for their opinion.
>
> Are there any other options I'm missing?

If solving for C++ only is OK, then it might be feasible to do something like:

namespace __arm_neon_for_ext_random {
#include// like arm_neon.h, but without
include guards [*]
}

Then the libstdc++ headers can use "using namespace
__arm_neon_for_ext_random" inside the functions which use NEON
intrinsics.

[*] without include guards so that other header files can use the same
trick in their own namespace.

I'm not sure if this will work for all host compilers, with GCC I
think it's OK because the intrinsics are implemented as inline
functions, rather than #defines, but not all compilers have taken that
approach.


Re: [PATCH][Aarch64] Add vectorized mersenne twister

2017-06-06 Thread Jonathan Wakely

On 06/06/17 12:33 +0100, James Greenhalgh wrote:

On Tue, Jun 06, 2017 at 11:47:45AM +0100, Jonathan Wakely wrote:

On 06/06/17 11:23 +0100, Jonathan Wakely wrote:
>On 06/06/17 11:07 +0100, James Greenhalgh wrote:
>>On Fri, Jun 02, 2017 at 07:13:17PM +0100, Jonathan Wakely wrote:
>>>On 02/06/17 19:19 +0200, Ulrich Drepper wrote:
On Fri, Jun 2, 2017 at 5:46 PM, Michael Collison
 wrote:
>This implementation includes "arm_neon.h" when including the optimized 
.  This has the effect of polluting the global namespace with the Neon intrinsics, so user macros 
and functions could potentially clash with them.  Is this acceptable given this only happens when  
is explicitly included?

I don't think it is.  Ideally the sfmt classes would get added to the
standard some day; they are much better performance-wise than the mt
classes.  At that point you'd have no excuse anymore.

Why are the symbols in arm_neon.h not prefixed with an underscore as
it is the case for equivalent definitions of other platforms (notably
x86)?  If you start adding such optimizations I suggest you create
headers which do not pollute the namespace and include it in
arm_neon.h and whatever other header is needed to define the symbols
needed for compatibility.

Nip the problem in the bud now, this is probably just the beginning.
>>
>>We're a good number of years late to do that without causing some pain.
>>
>>The arm_neon.h names for AArch64 come from those for the ARM port, which
>>themselves started out in the mid 2000s in ARM's proprietary compiler. These
>>names are now all over the place; from LLVM, through popular proprietary
>>toolchains, through a variety of developer guides and codebases, and in
>>to GCC.
>>
>>Forging a new interface is possible, but expecting migration to the
>>namespace-safe names seems unlikely given the wide spread of the current
>>names. Essentially, we'd be doubling our maintenance, and asking all
>>other compilers to follow. It is possible, but we'd not want to do it
>>lightly.
>>
>>In particular, my immediate worry is needing other compilers to support the
>>new names if they are to use libstdc++.
>>
>>If we don't mind that, and instead take a GCC-centric view, we could avoid
>>namespace pollution by using the GCC-internal names for the intrinsics
>>(__builtin_aarch64_...). As those names don't form a guaranteed interface,
>>that would tie us to a GCC version.
>
>Libstdc++ is already tied to a GCC version when used with GCC. Using
>the internal names might cause problems when using libstdc++ with
>non-GNU compilers if they don't use the same names (it's used with at
>least Clang and ICC).
>
>>So we have a few solutions to choose from, each of which invokes a trade-off:
>>
>>1 Use the current names and pollute the namespace.
>>2 Use the GCC internal __builtin_aarch64* names and tie libstdc++ to GCC
>>  internals.
>>3 Define a new set of namespace-clean names and complicate the Neon
>>  intrinsic interface while we migrate old users to new names.
>>
>>I can see why the libstdc++ community would prefer 3) over the other options,
>>but I'm reticent to take that route as the cost to our intrinsic maintainers
>>and users looks high. I've added some of the other ARM port maintainers
>>for their opinion.
>>
>>Are there any other options I'm missing?
>
>3b) Define a new subset of names for the parts needed by this patch,
>  but don't expect users to migrate to them.
>
>That probably doesn't have any advantage over 2.
>


I suppose  could do something like:

#if __GNUC__ >= 8  // This is GCC not Clang pretending to be GCC 4.2
namespace __detail {
 using __int32x4_t = __Int32x4_t;
 using __int8x16_t = __Int8x16_t;
}
#else
#include "arm_neon.h"
namespace __detail {
 using __int32x4_t = int32x4_t;
 using __int8x16_t = int8x16_t;
}
#endif

But we'd need to redefine or wrap the operations like veorq_u8 which
would be a pain.


Thanks for the feedback. We've got the clear message that this approach
is unacceptable, the suggestions for other possible ways forward are very
helpful.

I think we got some of the way towards this when we were discussing this
internally. For many of the intrinsics needed by this patch we can get
away with using the GCC vector syntax (a ^ b for veorq_u8) and avoid
relying on the header at all. I think we ought to go back to that idea and
see how far we can take it, and which names we really do need to agree on
or mask in this way.

I presume using the GCC vector extensions (which are certainly supported
by Clang, I don't know about ICC) will be OK?


Yes, I had a look in the Clang arm_neon.h header and it uses the same
vector extensions. If all the operations needed for Michael's patch
can be done using those instead of the named functions I'd be happy
with that. The handful of typedefs needed can be defined in some
libstdc++ header, rather than by including in arm_neon.h

I assume we don't actually need to 

Re: [PATCH][Aarch64] Add vectorized mersenne twister

2017-06-06 Thread James Greenhalgh
On Tue, Jun 06, 2017 at 11:47:45AM +0100, Jonathan Wakely wrote:
> On 06/06/17 11:23 +0100, Jonathan Wakely wrote:
> >On 06/06/17 11:07 +0100, James Greenhalgh wrote:
> >>On Fri, Jun 02, 2017 at 07:13:17PM +0100, Jonathan Wakely wrote:
> >>>On 02/06/17 19:19 +0200, Ulrich Drepper wrote:
> On Fri, Jun 2, 2017 at 5:46 PM, Michael Collison
>  wrote:
> >This implementation includes "arm_neon.h" when including the optimized 
> >.  This has the effect of polluting the global namespace 
> >with the Neon intrinsics, so user macros and functions could potentially 
> >clash with them.  Is this acceptable given this only happens when 
> > is explicitly included?
> 
> I don't think it is.  Ideally the sfmt classes would get added to the
> standard some day; they are much better performance-wise than the mt
> classes.  At that point you'd have no excuse anymore.
> 
> Why are the symbols in arm_neon.h not prefixed with an underscore as
> it is the case for equivalent definitions of other platforms (notably
> x86)?  If you start adding such optimizations I suggest you create
> headers which do not pollute the namespace and include it in
> arm_neon.h and whatever other header is needed to define the symbols
> needed for compatibility.
> 
> Nip the problem in the bud now, this is probably just the beginning.
> >>
> >>We're a good number of years late to do that without causing some pain.
> >>
> >>The arm_neon.h names for AArch64 come from those for the ARM port, which
> >>themselves started out in the mid 2000s in ARM's proprietary compiler. These
> >>names are now all over the place; from LLVM, through popular proprietary
> >>toolchains, through a variety of developer guides and codebases, and in
> >>to GCC.
> >>
> >>Forging a new interface is possible, but expecting migration to the
> >>namespace-safe names seems unlikely given the wide spread of the current
> >>names. Essentially, we'd be doubling our maintenance, and asking all
> >>other compilers to follow. It is possible, but we'd not want to do it
> >>lightly.
> >>
> >>In particular, my immediate worry is needing other compilers to support the
> >>new names if they are to use libstdc++.
> >>
> >>If we don't mind that, and instead take a GCC-centric view, we could avoid
> >>namespace pollution by using the GCC-internal names for the intrinsics
> >>(__builtin_aarch64_...). As those names don't form a guaranteed interface,
> >>that would tie us to a GCC version.
> >
> >Libstdc++ is already tied to a GCC version when used with GCC. Using
> >the internal names might cause problems when using libstdc++ with
> >non-GNU compilers if they don't use the same names (it's used with at
> >least Clang and ICC).
> >
> >>So we have a few solutions to choose from, each of which invokes a 
> >>trade-off:
> >>
> >>1 Use the current names and pollute the namespace.
> >>2 Use the GCC internal __builtin_aarch64* names and tie libstdc++ to GCC
> >>  internals.
> >>3 Define a new set of namespace-clean names and complicate the Neon
> >>  intrinsic interface while we migrate old users to new names.
> >>
> >>I can see why the libstdc++ community would prefer 3) over the other 
> >>options,
> >>but I'm reticent to take that route as the cost to our intrinsic maintainers
> >>and users looks high. I've added some of the other ARM port maintainers
> >>for their opinion.
> >>
> >>Are there any other options I'm missing?
> >
> >3b) Define a new subset of names for the parts needed by this patch,
> >  but don't expect users to migrate to them.
> >
> >That probably doesn't have any advantage over 2.
> >
> 
> 
> I suppose  could do something like:
> 
> #if __GNUC__ >= 8  // This is GCC not Clang pretending to be GCC 4.2
> namespace __detail {
>  using __int32x4_t = __Int32x4_t;
>  using __int8x16_t = __Int8x16_t;
> }
> #else
> #include "arm_neon.h"
> namespace __detail {
>  using __int32x4_t = int32x4_t;
>  using __int8x16_t = int8x16_t;
> }
> #endif
> 
> But we'd need to redefine or wrap the operations like veorq_u8 which
> would be a pain.

Thanks for the feedback. We've got the clear message that this approach
is unacceptable, the suggestions for other possible ways forward are very
helpful.

I think we got some of the way towards this when we were discussing this
internally. For many of the intrinsics needed by this patch we can get
away with using the GCC vector syntax (a ^ b for veorq_u8) and avoid
relying on the header at all. I think we ought to go back to that idea and
see how far we can take it, and which names we really do need to agree on
or mask in this way.

I presume using the GCC vector extensions (which are certainly supported
by Clang, I don't know about ICC) will be OK?
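
To make the vector-syntax idea concrete, here is a minimal sketch of a 16-byte
XOR written with the GCC vector extensions instead of veorq_u8.  The names
u8x16 and xor_u8x16 are made up for this example and are not taken from the
patch or from arm_neon.h; the only assumption is a compiler that supports the
vector_size attribute.

```c
#include <stdint.h>

/* Hypothetical typedef: sixteen unsigned 8-bit lanes in one 128-bit
   vector, declared with the GCC vector extension rather than taken
   from arm_neon.h.  */
typedef uint8_t u8x16 __attribute__ ((vector_size (16)));

/* Lane-wise XOR using the plain ^ operator.  No intrinsic header is
   needed, so no arm_neon.h names enter the global namespace.  */
static inline u8x16
xor_u8x16 (u8x16 a, u8x16 b)
{
  return a ^ b;
}
```

On AArch64 one would expect the compiler to pick a single vector EOR for this,
but that is up to the code generator rather than anything the source
guarantees.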

Cheers,
James



Re: [PATCH][Aarch64] Add vectorized mersenne twister

2017-06-06 Thread Ulrich Drepper
On Tue, Jun 6, 2017 at 12:07 PM, James Greenhalgh
 wrote:
> We're a good number of years late to do that without causing some pain.

Well, it's pain for those who deserve it.  Who thought it to be a
smart idea to pollute the global namespace?

It's a one-time deal.


> So we have a few solutions to choose from, each of which invokes a trade-off:
>
>   1 Use the current names and pollute the namespace.

IMO unacceptable.


>   2 Use the GCC internal __builtin_aarch64* names and tie libstdc++ to GCC
> internals.

Maybe.


>   3 Define a new set of namespace-clean names and complicate the Neon
> intrinsic interface while we migrate old users to new names.

See Jonathan's proposal.  I never suggested that those who don't care
about namespace pollution would have to change their code.  Add
appropriate aliases.

There is perhaps number 4:

- use the x86-64 intrinsics, which you map to aarch64 intrinsics.
Isn't this compatibility layer planned anyway?  I don't know whether
everything maps 1-to-1 or whether you would lose performance, but this
way you could use the arch-specific code I wrote a long time ago for
x86-64.


Re: [PING] [PING] Make the OpenACC C++ acc_on_device wrapper "always inline"

2017-06-06 Thread Jakub Jelinek
On Tue, Jun 06, 2017 at 01:16:03PM +0200, Thomas Schwinge wrote:
> On Tue, 6 Jun 2017 08:58:21 +0200, Jakub Jelinek  wrote:
> > On Tue, Jun 06, 2017 at 08:35:40AM +0200, Thomas Schwinge wrote:
> > > > > commit 9cc3a384c17e9f692f7864c604d2e2f9fbf0bac9
> > > > > Author: Thomas Schwinge 
> > > > > Date:   Tue May 23 13:21:14 2017 +0200
> > > > > 
> > > > > Make the OpenACC C++ acc_on_device wrapper "always inline"
> > > > > 
> > > > > libgomp/
> > > > > * openacc.h [__cplusplus] (acc_on_device): Mark as "always
> > > > > inline".
> > > > > * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: 
> > > > > Remove
> > > > > file; test cases already present...
> > > > > * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: 
> > > > > ... in
> > > > > this file.  Update.
> > > > > * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: 
> > > > > Remove
> > > > > file; test cases now present...
> > > > > * testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c: 
> > > > > ... in
> > > > > this new file.
> > > > > * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: 
> > > > > Update.
> > 
> > I don't like this very much.
> 
> Thanks for having a look.  Would you please clarify whether "this"
> applies to my "always inline" changes and testing additions that you
> quoted, or rather to the C++ "acc_on_device" wrapper function as it is
> currently present?

The C++ acc_on_device wrapper altogether, though of course always inline on
it doesn't sound right either (what if you want to take the address of
acc_on_device?).
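
The address-taking concern can be sketched as follows. on_device_wrapper is a hypothetical stand-in for the C++ wrapper, not the libgomp code, and the comments describe the behaviour I would expect from GCC rather than anything verified against the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the C++ acc_on_device wrapper; not libgomp code.
__attribute__((always_inline)) inline int on_device_wrapper(int dev)
{
  return dev == 0;
}

// Direct call: the attribute asks the compiler to inline the body here.
inline int call_direct(int dev)
{
  return on_device_wrapper(dev);
}

// Indirect call: taking the address requires an out-of-line copy of the
// function, and a call through a pointer the compiler cannot see through
// (volatile here) is not inlined, so any folding of the argument is lost.
inline int call_via_pointer(int dev)
{
  int (*volatile fp)(int) = on_device_wrapper;
  return fp(dev);
}

// Both paths compute the same value; only the code generation differs.
inline void self_check()
{
  assert(call_direct(0) == call_via_pointer(0));
  assert(call_direct(7) == call_via_pointer(7));
}
```

So "always inline" only helps the direct-call path, which is presumably why it does not sound right as a general fix.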

> > Can't you instead just turn the builtin into BT_FN_INT_VAR and diagnose
> > during folding if it has no or 2+ arguments or if the argument is not type
> > compatible with int?
> 
> Thanks for the suggestion, I'll look into that!
> 
> In terms of incremental progress, do you oppose that I commit my existing
> patch now, and then rework the builtin in a later patch?

We are in stage1 and this doesn't seem to be a blocker, so I think it is
better to do it right; there is no need to do it incrementally.

Jakub


Re: [PING] [PING] Make the OpenACC C++ acc_on_device wrapper "always inline"

2017-06-06 Thread Thomas Schwinge
Hi Jakub!

On Tue, 6 Jun 2017 08:58:21 +0200, Jakub Jelinek  wrote:
> On Tue, Jun 06, 2017 at 08:35:40AM +0200, Thomas Schwinge wrote:
> > > > commit 9cc3a384c17e9f692f7864c604d2e2f9fbf0bac9
> > > > Author: Thomas Schwinge 
> > > > Date:   Tue May 23 13:21:14 2017 +0200
> > > > 
> > > > Make the OpenACC C++ acc_on_device wrapper "always inline"
> > > > 
> > > > libgomp/
> > > > * openacc.h [__cplusplus] (acc_on_device): Mark as "always
> > > > inline".
> > > > * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: 
> > > > Remove
> > > > file; test cases already present...
> > > > * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: 
> > > > ... in
> > > > this file.  Update.
> > > > * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: 
> > > > Remove
> > > > file; test cases now present...
> > > > * testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c: 
> > > > ... in
> > > > this new file.
> > > > * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: 
> > > > Update.
> 
> I don't like this very much.

Thanks for having a look.  Would you please clarify whether "this"
applies to my "always inline" changes and testing additions that you
quoted, or rather to the C++ "acc_on_device" wrapper function as it is
currently present?

> Can't you instead just turn the builtin into BT_FN_INT_VAR and diagnose
> during folding if it has no or 2+ arguments or if the argument is not type
> compatible with int?

Thanks for the suggestion, I'll look into that!

In terms of incremental progress, do you oppose that I commit my existing
patch now, and then rework the builtin in a later patch?


Grüße
 Thomas

