[PATCH, Fortran, 79229, v1] [7 Regression] ICE in gfc_trans_assignment_1 with -fcheck=mem

2017-02-06 Thread Andre Vehreschild
Hi all,

the attached patch fixes PR 79229 (which is not a regression).
The issue was that the code generating the check expected a
pointer type, but got an indirect ref to it. This is fixed now.

Bootstraps and regtests ok on x86_64-linux/f25. Ok for trunk?

Regards,
Andre
-- 
Andre Vehreschild * Kreuzherrenstr. 8 * 52062 Aachen
Tel.: +49 241 9291018 * Email: ve...@gmx.de 
gcc/testsuite/ChangeLog:

2017-02-07  Andre Vehreschild  

PR fortran/79229
* gfortran.dg/class_allocate_24.f90: New test.


gcc/fortran/ChangeLog:

2017-02-07  Andre Vehreschild  

PR fortran/79229
* trans-expr.c (gfc_trans_assignment_1): Deref indirect refs when
compiling with -fcheck=mem to check the pointer and not the data.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 138af56..fcd6f16 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -9959,13 +9959,16 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
 	  tree cond;
 	  const char* msg;
 
+	  tmp = INDIRECT_REF_P (lse.expr)
+	  ? gfc_build_addr_expr (NULL_TREE, lse.expr) : lse.expr;
+
 	  /* We should only get array references here.  */
-	  gcc_assert (TREE_CODE (lse.expr) == POINTER_PLUS_EXPR
-		  || TREE_CODE (lse.expr) == ARRAY_REF);
+	  gcc_assert (TREE_CODE (tmp) == POINTER_PLUS_EXPR
+		  || TREE_CODE (tmp) == ARRAY_REF);
 
 	  /* 'tmp' is either the pointer to the array(POINTER_PLUS_EXPR)
 	 or the array itself(ARRAY_REF).  */
-	  tmp = TREE_OPERAND (lse.expr, 0);
+	  tmp = TREE_OPERAND (tmp, 0);
 
 	  /* Provide the address of the array.  */
 	  if (TREE_CODE (lse.expr) == ARRAY_REF)
diff --git a/gcc/testsuite/gfortran.dg/class_allocate_24.f90 b/gcc/testsuite/gfortran.dg/class_allocate_24.f90
new file mode 100644
index 0000000..883247d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/class_allocate_24.f90
@@ -0,0 +1,16 @@
+! { dg-do compile }
+! { dg-options "-fcheck=mem" }
+! 
+! Compile time check only, to test that the ICE is fixed in the assignment of the
+! default initializer of the class to sf.
+
+implicit none
+
+type :: t
+  integer, pointer :: data => null ()
+end type
+
+class(t), dimension(:), allocatable :: sf
+allocate (t :: sf (1))
+end
+


[PATCH] Fix target selector in builtin-apply-2.c

2017-02-06 Thread Segher Boessenkool
Revision r245228 introduced a syntax error in a target selector in
builtin-apply-2.c: || wants a single expression on each side, you
cannot have "{a} || {b} || {c}", instead you need to write this as
"{a} || {{b} || {c}}".  In this testcase, however, b and c are real target
selectors (not effective-target keywords), so we can just write
"{a} || {b c}".  This also removes the unnecessary default arguments.

Tested on powerpc64-linux {-m32,-m64}; is this okay for trunk?


Segher


2017-02-07  Segher Boessenkool  

gcc/testsuite/
* gcc.dg/torture/stackalign/builtin-apply-2.c: Fix syntax error in
target selector.  Delete superfluous default arguments.

---
 gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c
index eb9d3a9..d033010 100644
--- a/gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c
+++ b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c
@@ -9,8 +9,8 @@
 /* arm_hf_eabi: Variadic funcs use Base AAPCS.  Normal funcs use VFP variant.
avr: Variadic funcs don't pass arguments in registers, while normal funcs
 do.  */
-/* { dg-skip-if "Variadic funcs use different argument passing from normal funcs" { arm_hf_eabi || { avr-*-* } || { riscv*-*-* } } "*" "" } */
-/* { dg-skip-if "Variadic funcs have all args on stack. Normal funcs have args in registers." { nds32*-*-* } "*" "" } */
+/* { dg-skip-if "Variadic funcs use different argument passing from normal funcs" { arm_hf_eabi || { avr-*-* riscv*-*-* } } } */
+/* { dg-skip-if "Variadic funcs have all args on stack. Normal funcs have args in registers." { nds32*-*-* } } */
 /* { dg-require-effective-target untyped_assembly } */

 
-- 
1.9.3



[PATCH] avoid ICE when attempting to init a flexible array member (PR c++/79363)

2017-02-06 Thread Martin Sebor

The attached patch avoids another ICE (in addition to the already
fixed bug 72775) in flexible array member NSDMI.  To avoid code
duplication and for consistency I factored the diagnostic code
out of perform_member_init and into a new helper.

Martin
PR c++/79363 - ICE with NSDMI and array

gcc/cp/ChangeLog:

	PR c++/79363
	* init.c (maybe_reject_flexarray_init): New function.
	(perform_member_init): Call it.

gcc/testsuite/ChangeLog:

	PR c++/79363
	* g++.dg/ext/flexary12.C: Adjust.
	* g++.dg/ext/flexary20.C: Same.
	* g++.dg/ext/flexary21.C: Same.
	* g++.dg/ext/flexary22.C: New test.

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 42f1c61..c83d2eb 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -597,6 +597,33 @@ get_nsdmi (tree member, bool in_ctor)
   return init;
 }
 
+/* Diagnose the flexible array MEMBER if its INITializer is non-null
+   and return true if so.  Otherwise return false.  */
+
+static bool
+maybe_reject_flexarray_init (tree member, tree init)
+{
+  tree type = TREE_TYPE (member);
+
+  if (!init
+  || TREE_CODE (type) != ARRAY_TYPE
+  || TYPE_DOMAIN (type))
+return false;
+
+  /* Point at the flexible array member declaration if it's initialized
+ in-class, and at the ctor if it's initialized in a ctor member
+ initializer list.  */
+  location_t loc;
+  if (DECL_INITIAL (member) == init
+  || DECL_DEFAULTED_FN (current_function_decl))
+loc = DECL_SOURCE_LOCATION (member);
+  else
+loc = DECL_SOURCE_LOCATION (current_function_decl);
+
+  error_at (loc, "initializer for flexible array member %q#D", member);
+  return true;
+}
+
 /* Initialize MEMBER, a FIELD_DECL, with INIT, a TREE_LIST of
arguments.  If TREE_LIST is void_type_node, an empty initializer
list was given; if NULL_TREE no initializer was given.  */
@@ -722,10 +749,18 @@ perform_member_init (tree member, tree init)
 	{
 	  if (init)
 	{
-	  if (TREE_CHAIN (init))
+	  /* Check to make sure the member initializer is valid and
+		 something like a CONSTRUCTOR in: T a[] = { 1, 2 } and
+		 if it isn't, return early to avoid triggering another
+		 error below.  */
+	  if (maybe_reject_flexarray_init (member, init))
+		return;
+
+	  if (TREE_CODE (init) != TREE_LIST || TREE_CHAIN (init))
 		init = error_mark_node;
 	  else
 		init = TREE_VALUE (init);
+
 	  if (BRACE_ENCLOSED_INITIALIZER_P (init))
 		init = digest_init (type, init, tf_warning_or_error);
 	}
@@ -800,16 +835,9 @@ perform_member_init (tree member, tree init)
 	   in that case.  */
 	init = build_x_compound_expr_from_list (init, ELK_MEM_INIT,
 		tf_warning_or_error);
-  if (TREE_CODE (type) == ARRAY_TYPE
-	  && TYPE_DOMAIN (type) == NULL_TREE
-	  && init != NULL_TREE)
-	{
-	  error_at (DECL_SOURCE_LOCATION (current_function_decl),
-		"member initializer for flexible array member");
-	  inform (DECL_SOURCE_LOCATION (member), "%q#D initialized", member);
-	}
 
-  if (init)
+  /* Reject a member initializer for a flexible array member.  */
+  if (init && !maybe_reject_flexarray_init (member, init))
 	finish_expr_stmt (cp_build_modify_expr (input_location, decl,
 		INIT_EXPR, init,
 		tf_warning_or_error));
diff --git a/gcc/testsuite/g++.dg/ext/flexary12.C b/gcc/testsuite/g++.dg/ext/flexary12.C
index db80bf4..763ffa3 100644
--- a/gcc/testsuite/g++.dg/ext/flexary12.C
+++ b/gcc/testsuite/g++.dg/ext/flexary12.C
@@ -44,8 +44,9 @@ struct D {
   D ();
 };
 
-D::D ():// { dg-error "member initializer for flexible array member" }
-  a ("c")   // { dg-error "incompatible types in assignment of .const char \\\[2\\\]. to .int \\\[\\\]." }
+D::D ():// { dg-error "initializer for flexible array member" }
+  a ("c")   // the initializer also has an invalid type but emitting
+// just the error above is sufficient
 { }
 
 
diff --git a/gcc/testsuite/g++.dg/ext/flexary20.C b/gcc/testsuite/g++.dg/ext/flexary20.C
index 2c8ab29..10a06b4 100644
--- a/gcc/testsuite/g++.dg/ext/flexary20.C
+++ b/gcc/testsuite/g++.dg/ext/flexary20.C
@@ -4,22 +4,22 @@
 
 struct S {
   int i;
-  char a[] = "foo";
-  S () {} // { dg-error "member initializer for flexible array member" }
+  char a[] = "foo";   // { dg-error "initializer for flexible array member" }
+  S () {}
 };
 
-struct T { // { dg-error "member initializer for flexible array member" }
+struct T {
   int i;
-  char a[] = "foo";
+  char a[] = "foo";   // { dg-error "initializer for flexible array member" }
 };
 
 struct U {
   int i;
-  char a[] = "foo";
+  char a[] = "foo";   // { dg-error "initializer for flexible array member" }
   U ();
 };
 
-U::U() {} // { dg-error "member initializer for flexible array member" }
+U::U() {}
 
 int
 main ()
@@ -29,17 +29,17 @@ main ()
 
 struct V {
   int i;
-  struct W { // { dg-error "member initializer for flexible array member" }
+  struct W {
 int j;
-char a[] = "foo";
+char a[] = "foo";   // { dg-error "initializer for flexible array member" }
   } w;
   V () {}
 };
 

Re: [RFA] [PR tree-optimization/79095][PATCH 3/4] Improve ASSERT_EXPRs and simplification of overflow tests

2017-02-06 Thread Jeff Law

On 02/06/2017 06:06 AM, Richard Biener wrote:


+  /* If OP0 CODE OP1 is an overflow comparison, if it can be expressed
+ as a simple equality test, then prefer that over its current form
+ for evaluation.
+
+ An overflow test which collapses to an equality test can always be
+ expressed as a comparison of one argument against zero.  Overflow
+ occurs when the chosen argument is zero and does not occur if the
+ chosen argument is not zero.  */
+  tree x;
+  if (overflow_comparison_p (code, op0, op1, use_equiv_p, false, &x))


> This somehow feels like a hack so I'd add a comment why we do not change
> the IL in the first place.

It doesn't feel like a hack to me.  It's just discovering alternate
ranges/forms for a class of comparisons that are sometimes more useful
than the canonical form we have in the IL (which promotes
{ADD,SUB}_OVERFLOW detection).


Regardless, a comment WRT why we don't transform the IL sounds quite
reasonable.
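
To make the claim in the quoted comment concrete, here is a small sketch (not code from the patch) of two unsigned overflow tests that collapse to an equality test against zero.  The helper names are made up for illustration, and plain `unsigned` stands in for the GIMPLE operands.

```cpp
#include <cassert>
#include <limits>

// Comparison form of "x - 1 wrapped around": only true when x is 0.
bool sub1_overflows_cmp(unsigned x) { return x - 1u > x; }

// Equality form: the chosen argument (x itself) compared against zero.
bool sub1_overflows_eq(unsigned x) { return x == 0u; }

// Comparison form of "x + 1 wrapped around".
bool add1_overflows_cmp(unsigned x) { return x + 1u < x; }

// Equality form: here the chosen argument is the sum, not x.
bool add1_overflows_eq(unsigned x) {
    unsigned sum = x + 1u;  // unsigned arithmetic wraps, so this is well defined
    return sum == 0u;
}

// Both forms must agree for a single value.
bool forms_agree(unsigned x) {
    return sub1_overflows_cmp(x) == sub1_overflows_eq(x)
           && add1_overflows_cmp(x) == add1_overflows_eq(x);
}

// Check boundary values plus a strided sample of the whole range.
bool forms_agree_on_samples() {
    const unsigned max = std::numeric_limits<unsigned>::max();
    const unsigned boundary[] = {0u, 1u, 2u, max - 1u, max};
    for (unsigned x : boundary)
        if (!forms_agree(x)) return false;
    for (unsigned x = 0u; x < max - 65537u; x += 65537u)
        if (!forms_agree(x)) return false;
    return true;
}

// Small self-check so the sketch can be exercised directly.
void self_check() { assert(forms_agree_on_samples()); }
```

In both cases overflow occurs exactly when the chosen argument is zero, which is the property the comment relies on.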




> Feeding overflow_comparison_p the original and the swapped comparison
> looks like it makes it more expensive given its stmt walking?

No.  In both cases we have to walk one or both chains to see if we have
a statement like X +- CST.  Then and only then we walk the other chain 
to see if it is X.


Identification of X in the other chain can happen at the start of the 
chain, end of the chain or somewhere in the middle (and I've seen all 
three happen in practice).



If we knew we were going to fully walk both chains, then the answer 
would be that we should walk them once to the end, then do detection 
both ways on the result of fully walking both chains.


> I'd see whether returning a second output from it (whether we matched
> op0 or op1) would simplify callers.

I think it could slightly.  Let me play with that.

jeff



Re: [PATCH] PR68972: g++.dg/cpp1y/vla-initlist1.C test case fails on power

2017-02-06 Thread Rainer Orth
Kelvin Nilsen  writes:

> Index: gcc/testsuite/g++.dg/cpp1y/vla-initlist1.C
> ===
> --- gcc/testsuite/g++.dg/cpp1y/vla-initlist1.C(revision 245156)
> +++ gcc/testsuite/g++.dg/cpp1y/vla-initlist1.C(working copy)
> @@ -1,4 +1,5 @@
>  // { dg-do run { target c++11 } }
> +// { dg-skip-if "power overwrites two slots of array i" { "power*-*-*"
> } { "*" } { "" } }

Please omit the default args to dg-skip-if: they're unnecessary.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [RFA] [PR tree-optimization/79095][PATCH 2/4] Add infrastructure to detect overflow checks

2017-02-06 Thread Jeff Law

On 02/06/2017 05:52 AM, Richard Biener wrote:

+  /* Now look at the defining statement of OP1 to see if it adds
+or subtracts a nonzero constant from another operand.  */
+  if (op1_def
+ && is_gimple_assign (op1_def)
+ && gimple_assign_rhs_code (op1_def) == PLUS_EXPR
+ && TREE_CODE (gimple_assign_rhs2 (op1_def)) == INTEGER_CST
+ && wi::ne_p (gimple_assign_rhs2 (op1_def), 0))


> ! integer_zerop ()

Fixed.


+
+ tree type = TREE_TYPE (op0);
+ wide_int max = wi::max_value (TYPE_PRECISION (type), UNSIGNED);
+ HOST_WIDE_INT inc = TREE_INT_CST_LOW (gimple_assign_rhs2
(op1_def));


> You nowhere tested that rhs2 fits a HOST_WIDE_INT.  You can simply make inc
> a tree and use that in the max + inc expression below.

Also fixed.



> I'll have to see where the function is used to make sense of it.

Understood.

jeff



[PATCH] PR68972: g++.dg/cpp1y/vla-initlist1.C test case fails on power

2017-02-06 Thread Kelvin Nilsen

The test g++.dg/cpp1y/vla-initlist1.C makes assumptions that the memory
used to represent the private temporary variables of neighboring control
blocks at the same control nesting level is:

1. found at the same address, and
2. not overwritten between when the first block ends and the second
block begins.

While these assumptions are valid with some optimization choices on some
architectures, these assumptions do not hold universally.

With optimization disabled on the power architecture, the
g++.dg/cpp1y/vla-initlist1.C test program runs initialization code to
allocate the variable-length array a[] before entry into the second of
two neighboring control blocks.  This initialization code overwrites the
first two cells of the array i[] that were initialized by the first of
the two neighboring control blocks.  Thus, the initialization value
stored into i[1] is no longer present when this value is subsequently
fetched as a[1].i from within the second control block.

This patch disables this particular test case on power hardware.

The patch has been bootstrapped and tested on
powerpc64le-unknown-linux with no regressions.

Is this ok for trunk?

gcc/testsuite/ChangeLog:

2017-02-06  Kelvin Nilsen  

PR target/68972
* g++.dg/cpp1y/vla-initlist1.C: Add dg-skip-if directive to
disable this test on power architecture.

Index: gcc/testsuite/g++.dg/cpp1y/vla-initlist1.C
===
--- gcc/testsuite/g++.dg/cpp1y/vla-initlist1.C  (revision 245156)
+++ gcc/testsuite/g++.dg/cpp1y/vla-initlist1.C  (working copy)
@@ -1,4 +1,5 @@
 // { dg-do run { target c++11 } }
+// { dg-skip-if "power overwrites two slots of array i" { "power*-*-*"
} { "*" } { "" } }
 // { dg-options "-Wno-vla" }

 #include 



Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-06 Thread Jeff Law

On 02/06/2017 08:33 AM, Richard Biener wrote:


> ah, indeed vr0type is VR_ANTI_RANGE and yes we have the case
> of a range with an anti-range "inside".  This also covers [-1,1] v ~[0,0]
> where you choose the much larger anti-range now.  So at least we
> want to have some idea about the sizes of the ranges (ideally we'd
> choose the smaller though for most further propagations anti-ranges
> often degenerate to varying...)

vr0 as an anti-singleton range like ~[0,0] is the only one likely of any
interest right now and that's always going to have a range that is all
but one value :-)


vr1 is the tricky case.  We could do vr1.max - vr1.min and if that
overflows or is some "large" value (say > 65536 just to throw out a
value), then we conclude creating the singleton anti-range like ~[0,0]
is more useful.
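
The vr1 heuristic could be sketched roughly as follows.  This is only an illustration: `Range` and `prefer_singleton_anti_range` are hypothetical names rather than VRP's data structures, the bounds are modelled as 32-bit integers, and 65536 is the throwaway threshold from the paragraph above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a value range [min, max]; not GCC's value_range.
struct Range {
    std::int32_t min;
    std::int32_t max;
};

// Returns true when the range is wide enough that recording a singleton
// anti-range such as ~[0,0] is assumed to be more useful.
bool prefer_singleton_anti_range(const Range& vr1, std::int64_t large = 65536) {
    // Subtract in 64 bits so we can tell whether max - min would have
    // overflowed the 32-bit type of the bounds.
    const std::int64_t width =
        static_cast<std::int64_t>(vr1.max) - static_cast<std::int64_t>(vr1.min);
    const bool overflows = width > std::numeric_limits<std::int32_t>::max();
    return overflows || width > large;
}

// A few sample ranges covering the small, threshold and overflow cases.
bool heuristic_examples_hold() {
    const Range small = {-1, 1};
    const Range at_threshold = {0, 65536};
    const Range above_threshold = {0, 65537};
    const Range full = {std::numeric_limits<std::int32_t>::min(),
                        std::numeric_limits<std::int32_t>::max()};
    return !prefer_singleton_anti_range(small)
           && !prefer_singleton_anti_range(at_threshold)
           && prefer_singleton_anti_range(above_threshold)
           && prefer_singleton_anti_range(full);
}

// Small self-check so the sketch can be exercised directly.
void self_check() { assert(heuristic_examples_hold()); }
```

With this sketch, [-1,1] keeps the range, while anything wider than the threshold (or wide enough to overflow the subtraction) falls back to the anti-range.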


Jeff


Add Ada support for aarch64-*-freebsd*

2017-02-06 Thread John Marino
The attached patch to gcc trunk enables Ada support on 
aarch64-*-freebsd*.  All tests pass except those affected by the 
currently missing unwind support (c52103x, c52104x, c52104y, cb1010a, 
cb1010c, cb1010d,  gnat.dg/null_pointer_deref* (3)).


I'd also like to get this patch backported to the gcc-6 branch.  There's 
a one-line content difference between patches, so I've attached the 6.x 
specific patch as well.


My copyright assignment is on file and in order.

Thanks,
John

gcc/ada/Changelog:
2017-02-XX  John Marino  

* gcc-interface/Makefile.in: Support aarch64-freebsd



--- gcc/ada/gcc-interface/Makefile.in.orig  2017-02-06 16:29:55 UTC
+++ gcc/ada/gcc-interface/Makefile.in
@@ -1495,6 +1495,34 @@ ifeq ($(strip $(filter-out x86_64 kfreeb
   LIBRARY_VERSION := $(LIB_VERSION)
 endif
 
+# aarch64 FreeBSD
+ifeq ($(strip $(filter-out %aarch64 freebsd%,$(target_cpu) $(target_os))),)
+  LIBGNAT_TARGET_PAIRS = \
+  a-intnam.ads
--- gcc/ada/gcc-interface/Makefile.in.orig  2017-02-06 18:55:09 UTC
+++ gcc/ada/gcc-interface/Makefile.in
@@ -1475,6 +1475,34 @@ ifeq ($(strip $(filter-out x86_64 kfreeb
   LIBRARY_VERSION := $(LIB_VERSION)
 endif
 
+# aarch64 FreeBSD
+ifeq ($(strip $(filter-out %aarch64 freebsd%,$(target_cpu) $(target_os))),)
+  LIBGNAT_TARGET_PAIRS = \
+  a-intnam.ads

Re: New Port for RISC-V v3

2017-02-06 Thread Palmer Dabbelt
On Mon, 06 Feb 2017 11:19:56 PST (-0800), ja...@redhat.com wrote:
> On Mon, Feb 06, 2017 at 11:18:12AM -0800, Palmer Dabbelt wrote:
>> OK, great!  I think we're all set:
>>
>>  * Here's the responses to the documentation comments
>>,
>>.
>>
>>  * I believe the patch was silently dropped because it was over the size
>>limits, so I gzip'd the patch and sent it to the mailing list here
>>.
>>
>>  * We don't touch anything in any of the other ports.
>>
>> If you give the OK, then I can commit this as soon as I figure out git-svn
>> (which I'm looking at now).
>
> Ok (or just check out svn, apply the patch and commit from svn).

I believe we're in.  Thanks for the help!


Re: [wwwdocs] Add a case to porting_to + a question wrt validity of another one

2017-02-06 Thread Marek Polacek
On Mon, Feb 06, 2017 at 03:00:38PM -0500, Nathan Sidwell wrote:
> On 02/06/2017 01:03 PM, Jason Merrill wrote:
> 
> > No, this is a bug.
> 
> Yup.

I've opened PR79393.

Marek


C++ PATCH for c++/71193 (incomplete error in template)

2017-02-06 Thread Jason Merrill
71193 is a case where we now properly treat as non-dependent some
expressions that we previously considered type-dependent, and as a
result try to resolve their containing expressions, which can lead to
incomplete type errors that weren't seen before even though the code
was ill-formed.

This patch adjusts one situation this might occur to catch the
problem, give a pedwarn instead of an error, and treat the expression
as type-dependent.  This doesn't help the testcase for 71773, where
the access is more complicated.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit debb4f99e55e5c0b8c64fd1957d41417dc951870
Author: Jason Merrill 
Date:   Mon Feb 6 14:50:03 2017 -0500

PR c++/71193 - incomplete types in templates

* parser.c (cp_parser_postfix_dot_deref_expression): In a template
handle incomplete type by pedwarning and then treating as dependent.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 0aa738b..1813adb 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7309,7 +7309,7 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
 
   /* Enter the scope corresponding to the type of the object
  given by the POSTFIX_EXPRESSION.  */
-  if (!dependent_p && TREE_TYPE (postfix_expression) != NULL_TREE)
+  if (!dependent_p)
 {
   scope = TREE_TYPE (postfix_expression);
   /* According to the standard, no expression should ever have
@@ -7324,26 +7324,50 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
 required to be of complete type for purposes of class member
 access (5.2.5) outside the member function body.  */
   if (postfix_expression != current_class_ref
+ && scope != error_mark_node
  && !(processing_template_decl
   && current_class_type
   && (same_type_ignoring_top_level_qualifiers_p
   (scope, current_class_type))))
-   scope = complete_type_or_else (scope, postfix_expression);
-  /* Let the name lookup machinery know that we are processing a
-class member access expression.  */
-  parser->context->object_type = scope;
-  /* If something went wrong, we want to be able to discern that case,
-as opposed to the case where there was no SCOPE due to the type
-of expression being dependent.  */
-  if (!scope)
-   scope = error_mark_node;
-  /* If the SCOPE was erroneous, make the various semantic analysis
-functions exit quickly -- and without issuing additional error
-messages.  */
-  if (scope == error_mark_node)
-   postfix_expression = error_mark_node;
+   {
+ scope = complete_type (scope);
+ if (!COMPLETE_TYPE_P (scope))
+   {
+ /* In a template, be permissive by treating an object expression
+of incomplete type as dependent (after a pedwarn).  */
+ diagnostic_t kind = (processing_template_decl
+  ? DK_PEDWARN
+  : DK_ERROR);
+ cxx_incomplete_type_diagnostic
+   (location_of (postfix_expression),
+postfix_expression, scope, kind);
+ if (processing_template_decl)
+   {
+ dependent_p = true;
+ scope = TREE_TYPE (postfix_expression) = NULL_TREE;
+   }
+   }
+   }
+
+  if (!dependent_p)
+   {
+ /* Let the name lookup machinery know that we are processing a
+class member access expression.  */
+ parser->context->object_type = scope;
+ /* If something went wrong, we want to be able to discern that case,
+as opposed to the case where there was no SCOPE due to the type
+of expression being dependent.  */
+ if (!scope)
+   scope = error_mark_node;
+ /* If the SCOPE was erroneous, make the various semantic analysis
+functions exit quickly -- and without issuing additional error
+messages.  */
+ if (scope == error_mark_node)
+   postfix_expression = error_mark_node;
+   }
 }
-  else
+
+  if (dependent_p)
 /* Tell cp_parser_lookup_name that there was an object, even though it's
type-dependent.  */
 parser->context->object_type = unknown_type_node;
diff --git a/gcc/testsuite/g++.dg/template/incomplete8.C b/gcc/testsuite/g++.dg/template/incomplete8.C
new file mode 100644
index 0000000..d6cde6a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/incomplete8.C
@@ -0,0 +1,11 @@
+// PR c++/71193
+// { dg-options "" }
+
+class Heap;
+class A {
+public:  
+  Heap *m_fn1();
+};
+template  class B : A {
+  void m_fn2() { m_fn1()->HashSeed; } // { dg-warning "incomplete" }
+};


Re: [PATCH] PR target/66144, PowerPC improve vector compare

2017-02-06 Thread Segher Boessenkool
On Mon, Feb 06, 2017 at 01:28:29PM -0500, Michael Meissner wrote:
> Can I check it into the trunk?

Yes please.  Thanks,

Segher


> 2017-02-06  Michael Meissner  
> 
>   PR target/66144
>   * config/rs6000/vector.md (vcond): Allow the true and
>   false values to be constant vectors with all 0 or all 1 bits set.
>   (vcondu): Likewise.
>   * config/rs6000/predicates.md (vector_int_reg_or_same_bit): New
>   predicate.
>   (fpmask_comparison_operator): Update comment.
>   (vecint_comparison_operator): New predicate.
>   * config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Optimize
>   vector conditionals when the true and false values are constant
>   vectors with all 0 bits or all 1 bits set.
> 
> [gcc/testsuite]
> 2017-02-06  Michael Meissner  
> 
>   PR target/66144
>   * gcc.target/powerpc/pr66144-1.c: New test.
>   * gcc.target/powerpc/pr66144-2.c: Likewise.
>   * gcc.target/powerpc/pr66144-3.c: Likewise.


Re: [PATCH] Fix __atomic to not implement atomic loads with CAS.

2017-02-06 Thread Richard Henderson

On 02/03/2017 05:44 AM, Ramana Radhakrishnan wrote:

On 02/02/17 15:21, Torvald Riegel wrote:

On Thu, 2017-02-02 at 14:48 +, Ramana Radhakrishnan wrote:

On 30/01/17 18:54, Torvald Riegel wrote:

This patch fixes the __atomic builtins to not implement supposedly
lock-free atomic loads based on just a compare-and-swap operation.

If there is no hardware-backed atomic load for a certain memory
location, the current implementation can implement the load with a CAS
while claiming that the access is lock-free.  This is a bug in the cases
of volatile atomic loads and atomic loads to read-only-mapped memory; it
also creates a lot of contention in case of concurrent atomic loads,
which results in at least counter-intuitive performance because most
users probably understand "lock-free" to mean hardware-backed (and thus
"fast") instead of just in the progress-criteria sense.

This patch implements option 3b of the choices described here:
https://gcc.gnu.org/ml/gcc/2017-01/msg00167.html
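
For readers unfamiliar with the problem, a CAS-based "load" looks roughly like the sketch below.  This is an illustration using `std::atomic`, not what the builtins expand to; `load_via_cas` and `load_plain` are made-up names.  Note that the CAS variant needs a writable object, which is why it cannot serve read-only-mapped memory.

```cpp
#include <atomic>
#include <cassert>

// Emulate a load with compare-and-swap.  If the object happens to hold the
// guessed value, the CAS stores that same value back (a real write);
// otherwise it fails and copies the current value into `expected`.
template <typename T>
T load_via_cas(std::atomic<T>& obj) {
    T expected = T();
    obj.compare_exchange_strong(expected, expected);
    return expected;
}

// A genuine atomic load only needs read access, so it accepts a const object.
template <typename T>
T load_plain(const std::atomic<T>& obj) {
    return obj.load();
}

// Both variants should observe the same value and leave it unchanged.
bool cas_load_matches_plain_load() {
    std::atomic<int> nonzero(42);
    std::atomic<int> zero(0);
    return load_via_cas(nonzero) == 42 && load_plain(nonzero) == 42
           && load_via_cas(zero) == 0 && load_plain(zero) == 0;
}

// Small self-check so the sketch can be exercised directly.
void self_check() { assert(cas_load_matches_plain_load()); }
```

Every caller of `load_via_cas` takes the cache line for writing, which is the source of the contention described above.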



Will Deacon pointed me at this thread asking if something similar could
be done on ARM.


It would be nice if someone more familiar with ARM could double-check
that ARM is not affected.  I guess ARM isn't, but that's based on me
looking at machine descriptions, which I hadn't ever done before working
on this patch...



ARM doesn't have __int128 support, so I don't think the problem exists there.

On ARM, on architecture levels (i.e arch < armv6k) that do not have single copy
atomic routines we end up with calling the kernel helper routines where the
appropriate handling is done by the kernel depending on whether you are
multicore or not.

__atomic_load on ARM appears to be ok as well

except for

__atomic_load_di which should really be the ldrexd / strexd loop but we could
ameliorate that similar to your option 3b.


No, look again.  ldrexd has 64-bit single-copy semantics WITHOUT requiring the 
strexd.  It's only the AArch64 (64-bit) LDXP that requires the store.



r~


Re: [wwwdocs] Add a case to porting_to + a question wrt validity of another one

2017-02-06 Thread Nathan Sidwell

On 02/06/2017 01:03 PM, Jason Merrill wrote:


No, this is a bug.


Yup.



--
Nathan Sidwell


Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Jason Merrill
On Mon, Feb 6, 2017 at 11:42 AM, Mike Stump  wrote:
> I'll copy Jason to see if he recalls any systems where this might still fail.

Not particularly; I expected it to fail everywhere except recent
glibc, but apparently that isn't the case.

Jason


Re: [C++ PATCH] Fix ICEs with #pragma GCC ivdep (PR c++/79379)

2017-02-06 Thread Jason Merrill
OK.

On Mon, Feb 6, 2017 at 2:30 PM, Jakub Jelinek  wrote:
> Hi!
>
> #pragma GCC ivdep can make it through into potential_constant_expression_1
> as the following testcase shows.  As it is just a hint to the compiler,
> I think it isn't needed to make loops with that pragma non-constexpr, so
> the patch just ignores the hint for determination of what is a potential
> constant expression and ditto in cxx_eval_constant_expression.  The pragma
> is represented by ANNOTATE_EXPR with first operand being the condition
> of the loop and second operand annot_expr_ivdep_kind.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-02-06  Jakub Jelinek  
>
> PR c++/79379
> * constexpr.c (cxx_eval_constant_expression): Handle ANNOTATE_EXPR.
> (potential_constant_expression_1): Likewise.
>
> * g++.dg/cpp1y/constexpr-79379.C: New test.
>
> --- gcc/cp/constexpr.c.jj   2017-02-03 23:35:37.0 +0100
> +++ gcc/cp/constexpr.c  2017-02-06 16:00:21.903921006 +0100
> @@ -4518,6 +4518,14 @@ cxx_eval_constant_expression (const cons
>  *non_constant_p = true;
>return t;
>
> +case ANNOTATE_EXPR:
> +  gcc_assert (tree_to_uhwi (TREE_OPERAND (t, 1)) == 
> annot_expr_ivdep_kind);
> +  r = cxx_eval_constant_expression (ctx, TREE_OPERAND (t, 0),
> +   lval,
> +   non_constant_p, overflow_p,
> +   jump_target);
> +  break;
> +
>  default:
>if (STATEMENT_CODE_P (TREE_CODE (t)))
> {
> @@ -5689,6 +5697,10 @@ potential_constant_expression_1 (tree t,
> return false;
>}
>
> +case ANNOTATE_EXPR:
> +  gcc_assert (tree_to_uhwi (TREE_OPERAND (t, 1)) == 
> annot_expr_ivdep_kind);
> +  return RECUR (TREE_OPERAND (t, 0), rval);
> +
>  default:
>if (objc_is_property_ref (t))
> return false;
> --- gcc/testsuite/g++.dg/cpp1y/constexpr-79379.C.jj 2017-02-06 
> 16:09:30.457863715 +0100
> +++ gcc/testsuite/g++.dg/cpp1y/constexpr-79379.C2017-02-06 
> 16:08:49.0 +0100
> @@ -0,0 +1,19 @@
> +// PR c++/79379
> +// { dg-do compile { target c++14 } }
> +// { dg-options "-O2" }
> +
> +template 
> +constexpr int
> +foo (int x)
> +{
> +  int q[64] = { 0 }, r = 0;
> +#pragma GCC ivdep
> +  for (int i = 0; i < x; ++i)
> +q[i] += 2;
> +  for (int i = 0; i < x; ++i)
> +r += q[i];
> +  return r + N;
> +}
> +
> +constexpr int a = foo<0> (17);
> +static_assert (a == 34, "");
>
> Jakub


Re: [C++ PATCH] Fix ICE with -fpermissive on post{inc,dec}rement (PR c++/79377)

2017-02-06 Thread Jason Merrill
OK.

On Mon, Feb 6, 2017 at 2:32 PM, Jakub Jelinek  wrote:
> Hi!
>
> As the testcase shows, with -fpermissive we call operator++() or
> operator--() if operator++(int) or operator--(int) is not available,
> but the new code in build_min_non_dep_op_overload wasn't expecting
> such a change.  Fixed thusly, bootstrapped/regtested on x86_64-linux
> and i686-linux, ok for trunk?
>
> 2017-02-06  Jakub Jelinek  
>
> PR c++/79377
> * tree.c (build_min_non_dep_op_overload): For POST{INC,DEC}REMENT_EXPR
> allow one fewer than expected arguments if flag_permissive.
>
> * g++.dg/lookup/pr79377.C: New test.
>
> --- gcc/cp/tree.c.jj2017-01-26 20:35:13.0 +0100
> +++ gcc/cp/tree.c   2017-02-06 17:30:26.101413319 +0100
> @@ -2938,8 +2938,10 @@ build_min_non_dep_op_overload (enum tree
>nargs = call_expr_nargs (non_dep);
>
>expected_nargs = cp_tree_code_length (op);
> -  if (op == POSTINCREMENT_EXPR
> -  || op == POSTDECREMENT_EXPR)
> +  if ((op == POSTINCREMENT_EXPR
> +   || op == POSTDECREMENT_EXPR)
> +  /* With -fpermissive non_dep could be operator++().  */
> +  && (!flag_permissive || nargs != expected_nargs))
>  expected_nargs += 1;
>gcc_assert (nargs == expected_nargs);
>
> --- gcc/testsuite/g++.dg/lookup/pr79377.C.jj2017-02-06 17:41:22.316931623 
> +0100
> +++ gcc/testsuite/g++.dg/lookup/pr79377.C   2017-02-06 17:37:09.0 
> +0100
> @@ -0,0 +1,36 @@
> +// PR c++/79377
> +// { dg-do run }
> +// { dg-options "-fpermissive" }
> +
> +struct A
> +{
> +  A () : a (0) {}
> +  A& operator++ () { ++a; ++c; return *this; }
> +  int a;
> +  static int c;
> +};
> +
> +int A::c = 0;
> +
> +template 
> +void
> +foo (A& a)
> +{
> +  a++; // { dg-warning "trying prefix operator instead" }
> +  if (A::c != 3 || a.a != 3) __builtin_abort ();
> +  ++a;
> +  if (A::c != 4 || a.a != 4) __builtin_abort ();
> +}
> +
> +int
> +main ()
> +{
> +  A a;
> +  if (A::c != 0 || a.a != 0) __builtin_abort ();
> +  ++a;
> +  if (A::c != 1 || a.a != 1) __builtin_abort ();
> +  a++; // { dg-warning "trying prefix operator instead" }
> +  if (A::c != 2 || a.a != 2) __builtin_abort ();
> +  foo (a);
> +  if (A::c != 4 || a.a != 4) __builtin_abort ();
> +}
>
> Jakub


Re: [C++ PATCH] Fix ICE in invalid decomposition (PR c++/79372)

2017-02-06 Thread Jason Merrill
OK.

On Mon, Feb 6, 2017 at 2:34 PM, Jakub Jelinek  wrote:
> Hi!
>
> The following patch fixes an ICE that occurs when cp_finish_decomp already
> manages to diagnose the decomposition as erroneous while parsing the
> template.  In that case the types of all the decls for the decomp
> identifiers are error_mark_node, but they don't have the DECL_VALUE_EXPR
> that tsubst_decomp_names has been asserting.  Fixed thusly,
> bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-02-06  Jakub Jelinek  
>
> PR c++/79372
> * decl.c (cp_finish_decomp): On error set decl type to 
> error_mark_node.
> * pt.c (tsubst_expr): Don't call tsubst_decomp_names on decompositions
> with error_mark_node type.
>
> * g++.dg/cpp1z/decomp25.C: New test.
>
> --- gcc/cp/decl.c.jj2017-01-31 09:26:02.0 +0100
> +++ gcc/cp/decl.c   2017-02-06 18:02:02.115840352 +0100
> @@ -7378,6 +7378,7 @@ cp_finish_decomp (tree decl, tree first,
> }
>   first = DECL_CHAIN (first);
> }
> +  TREE_TYPE (decl) = error_mark_node;
>if (DECL_P (decl) && DECL_NAMESPACE_SCOPE_P (decl))
> SET_DECL_ASSEMBLER_NAME (decl, get_identifier (""));
>return;
> --- gcc/cp/pt.c.jj  2017-02-03 23:35:37.0 +0100
> +++ gcc/cp/pt.c 2017-02-06 18:12:17.533828738 +0100
> @@ -15765,7 +15765,9 @@ tsubst_expr (tree t, tree args, tsubst_f
>   const_init = (DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P
> (pattern_decl));
> cp_finish_decl (decl, init, const_init, NULL_TREE, 0);
> -   if (VAR_P (decl) && DECL_DECOMPOSITION_P (decl))
> +   if (VAR_P (decl)
> +   && DECL_DECOMPOSITION_P (decl)
> +   && TREE_TYPE (pattern_decl) != error_mark_node)
>   {
> unsigned int cnt;
> tree first;
> --- gcc/testsuite/g++.dg/cpp1z/decomp25.C.jj  2017-02-06 18:03:40.318562362 +0100
> +++ gcc/testsuite/g++.dg/cpp1z/decomp25.C   2017-02-06 18:15:37.0 +0100
> @@ -0,0 +1,20 @@
> +// PR c++/79372
> +// { dg-do compile { target c++11 } }
> +// { dg-options "" }
> +
> +template <typename T>
> +struct S
> +{
> +  enum E { A };
> +  void f () { auto [x] = 0; x++; } // { dg-error "cannot decompose non-array non-class type" }
> +   // { dg-warning "decomposition declaration only available with" "" { target c++14_down } .-1 }
> +  void g (T t) { auto [y] = t; y++; }  // { dg-error "cannot decompose non-array non-class type" }
> +}; // { dg-warning "decomposition declaration only available with" "" { target c++14_down } .-1 }
> +
> +int
> +main ()
> +{
> +  S <int> s;
> +  s.f ();
> +  s.g (5);
> +}
>
> Jakub


[PATCH, i386]: Simplify vector modes -> vec_extract* splitter

2017-02-06 Thread Uros Bizjak
No functional changes.

2017-02-06  Uros Bizjak  

* config/i386/sse.md (vector modes -> vec_extract* splitter): Use
explicit subreg RTX with operand 1.  Use VECTOR_MODE_P predicate
to simplify split condition.
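
For readers who do not want to decode the RTL condition, here is a minimal
sketch of the size/ISA part of the simplified split condition as a plain
predicate.  TargetFlags and vector_size_supported are made-up names for
illustration only, not anything in GCC; the sketch also assumes that AVX and
AVX512F imply SSE, which is why dropping the outer TARGET_SSE test does not
change behavior.

```cpp
#include <cstddef>

// Hypothetical stand-ins for TARGET_SSE / TARGET_AVX / TARGET_AVX512F.
struct TargetFlags {
  bool sse;
  bool avx;
  bool avx512f;
};

// Each vector width is accepted only when the instruction set that
// provides registers of that width is enabled.
bool vector_size_supported(const TargetFlags &t, std::size_t size_in_bytes) {
  return (t.sse && size_in_bytes == 16)
         || (t.avx && size_in_bytes == 32)
         || (t.avx512f && size_in_bytes == 64);
}
```

With only SSE enabled, a 16-byte vector qualifies and a 32-byte one does not.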

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a19c8f2..1654890 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -13669,26 +13669,19 @@
 ;; vector modes into vec_extract*.
 (define_split
   [(set (match_operand:SWI48x 0 "nonimmediate_operand")
-   (match_operand:SWI48x 1 "register_operand"))]
+   (subreg:SWI48x (match_operand 1 "register_operand") 0))]
   "can_create_pseudo_p ()
-   && SUBREG_P (operands[1])
-   && REG_P (SUBREG_REG (operands[1]))
-   && (GET_MODE_CLASS (GET_MODE (SUBREG_REG (operands[1]))) == MODE_VECTOR_INT
-   || (GET_MODE_CLASS (GET_MODE (SUBREG_REG (operands[1])))
-  == MODE_VECTOR_FLOAT))
-   && SUBREG_BYTE (operands[1]) == 0
-   && TARGET_SSE
-   && (GET_MODE_SIZE (GET_MODE (SUBREG_REG (operands[1]))) == 16
-   || (GET_MODE_SIZE (GET_MODE (SUBREG_REG (operands[1]))) == 32
-  && TARGET_AVX)
-   || (GET_MODE_SIZE (GET_MODE (SUBREG_REG (operands[1]))) == 64
-  && TARGET_AVX512F))
+   && REG_P (operands[1])
+   && VECTOR_MODE_P (GET_MODE (operands[1]))
+   && ((TARGET_SSE && GET_MODE_SIZE (GET_MODE (operands[1])) == 16)
+   || (TARGET_AVX && GET_MODE_SIZE (GET_MODE (operands[1])) == 32)
+   || (TARGET_AVX512F && GET_MODE_SIZE (GET_MODE (operands[1])) == 64))
    && (<MODE>mode == SImode || TARGET_64BIT || MEM_P (operands[0]))"
   [(set (match_dup 0) (vec_select:SWI48x (match_dup 1)
 (parallel [(const_int 0)])))]
 {
   rtx tmp;
-  operands[1] = SUBREG_REG (operands[1]);
+
   switch (GET_MODE_SIZE (GET_MODE (operands[1])))
 {
 case 64:


Re: [PATCH] use zero as the lower bound for a signed-unsigned range (PR 79327)

2017-02-06 Thread Jakub Jelinek
On Sat, Feb 04, 2017 at 09:07:23AM +0100, Jakub Jelinek wrote:
> You've committed the patch unnecessarily complicated, see above.
> The following gives the same testsuite result.
> 
> dirtype is one of the standard {un,}signed {char,short,int,long,long long}
> types, all of them have 0 in their ranges.
> For VR_RANGE we almost always set res.knownrange to true:
>   /* Set KNOWNRANGE if the argument is in a known subrange
>  of the directive's type (KNOWNRANGE may be reset below).  */
>   res.knownrange
> = (!tree_int_cst_equal (TYPE_MIN_VALUE (dirtype), argmin)
>|| !tree_int_cst_equal (TYPE_MAX_VALUE (dirtype), argmax));
> (the exception is in case that range clearly has to include zero),
> and reset it only if adjust_range_for_overflow returned true, which means
> it also set the range to TYPE_M{IN,AX}_VALUE (dirtype) and again
> includes zero.
> So IMNSHO likely_adjust in what you've committed is always true
> when you use it and thus just a useless computation and something to make
> the code harder to understand.
> 
> Even if you don't trust this, with the ranges in argmin/argmax, it is
> IMHO undesirable to set it differently at the different code paths,
> if you want to check whether the final range includes zero and at least
> one other value, just do
> -  if (likely_adjust && maybebase && base != 10)
> +  if ((tree_int_cst_sgn (argmin) < 0 || tree_int_cst_sgn (argmax) > 0)
>  && maybebase && base != 10)
> Though, it is useless both for the above reason and for the reason that you
> actually do something only:
>   if (res.range.min == 1)
> res.range.likely += base == 8 ? 1 : 2;
>   else if (res.range.min == 2
>&& base == 16
>&& (dir.width[0] == 2 || dir.prec[0] == 2))
> ++res.range.likely;
> where if the range doesn't include zero, you would never get
> res.range.min of 1 and for base == 16 also not 2.
> 
> 2017-02-04  Jakub Jelinek  
> 
>   PR tree-optimization/79327
>   * gimple-ssa-sprintf.c (format_integer): Remove likely_adjust
>   variable, its initialization and use.

Now bootstrapped/regtested on x86_64-linux and i686-linux (ignore the
testsuite/ChangeLog part, I've committed that part in another commit already),
ok for trunk?
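
To make the quoted argument easier to follow, here is a small standalone
model of the tail of format_integer as it looks after the patch.  The
function name and parameters are illustrative stand-ins for res.range.min,
maybebase, base, dir.width[0] and dir.prec[0]; this is a sketch of the logic
quoted above, not the GCC code itself.

```cpp
// Returns the "likely" output length given the minimum length.
// min_len   : minimum number of bytes of output
// maybebase : true when a base prefix may be emitted
// base      : 8, 10 or 16
// width/prec: directive width and precision
unsigned long likely_length(unsigned long min_len, bool maybebase, int base,
                            long width, long prec) {
  unsigned long likely = min_len;
  if (maybebase && base != 10) {
    if (min_len == 1)
      likely += base == 8 ? 1 : 2;  // room for "0" or "0x"
    else if (min_len == 2 && base == 16 && (width == 2 || prec == 2))
      ++likely;
  }
  return likely;
}
```

The adjustment only fires when the minimum length is 1 (or 2 for base 16),
which per the argument above cannot happen unless the range includes zero, so
a separate likely_adjust flag adds nothing.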

> --- gcc/gimple-ssa-sprintf.c.jj   2017-02-04 08:43:12.0 +0100
> +++ gcc/gimple-ssa-sprintf.c  2017-02-04 08:45:33.173709580 +0100
> @@ -1232,10 +1232,6 @@ format_integer (const directive , tr
> of the format string by returning [-1, -1].  */
>  return fmtresult ();
>  
> -  /* True if the LIKELY counter should be adjusted upward from the MIN
> - counter to account for arguments with unknown values.  */
> -  bool likely_adjust = false;
> -
>fmtresult res;
>  
>/* Using either the range the non-constant argument is in, or its
> @@ -1265,14 +1261,6 @@ format_integer (const directive , tr
>  
> res.argmin = argmin;
> res.argmax = argmax;
> -
> -   /* Set the adjustment for an argument whose range includes
> -  zero since that doesn't include the octal or hexadecimal
> -  base prefix.  */
> -   wide_int wzero = wi::zero (wi::get_precision (min));
> -   if (wi::le_p (min, wzero, SIGNED)
> -   && !wi::neg_p (max))
> - likely_adjust = true;
>   }
>else if (range_type == VR_ANTI_RANGE)
>   {
> @@ -1307,11 +1295,6 @@ format_integer (const directive , tr
>  
>if (!argmin)
>  {
> -  /* Set the adjustment for an argument whose range includes
> -  zero since that doesn't include the octal or hexadecimal
> -  base prefix.  */
> -  likely_adjust = true;
> -
>if (TREE_CODE (argtype) == POINTER_TYPE)
>   {
> argmin = build_int_cst (pointer_sized_int_node, 0);
> @@ -1371,7 +1354,7 @@ format_integer (const directive , tr
>else
>  {
>res.range.likely = res.range.min;
> -  if (likely_adjust && maybebase && base != 10)
> +  if (maybebase && base != 10)
>   {
> if (res.range.min == 1)
>   res.range.likely += base == 8 ? 1 : 2;

Jakub


[PATCH] Fix cprop ICE with conditional asserts becoming non-conditional (PR rtl-optimization/79386)

2017-02-06 Thread Jakub Jelinek
Hi!

The recent r244993 change where bypass_conditional_jumps is called only
after splitting blocks for unconditional traps can result in ICEs during
bypass_conditional_jumps, because on the (unreachable, to be removed later)
new basic blocks created by that splitting we don't have cprop per-bb data
structures like cprop_avout computed (nor space for them allocated).
bypass_conditional_jumps already uses bypass_last_basic_block variable
to avoid touching basic blocks created during that function, so this
patch just extends that to also the basic blocks created for splitting
after unconditional traps.
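
The idea of recording a watermark before any new blocks are created can be
sketched outside of GCC as follows.  Block, blocks_to_visit and simulate are
hypothetical names; the sketch only assumes that blocks are numbered in
creation order, so everything at or above the watermark is new and has no
per-block data.

```cpp
#include <cstddef>
#include <vector>

struct Block {
  int id;
};

// A later pass visits only the blocks that existed when the watermark was
// taken; newer blocks are skipped.
std::vector<int> blocks_to_visit(const std::vector<Block> &blocks,
                                 std::size_t watermark) {
  std::vector<int> ids;
  for (std::size_t i = 0; i < blocks.size() && i < watermark; ++i)
    ids.push_back(blocks[i].id);
  return ids;
}

// Take the watermark first, then split (which appends blocks), then visit.
std::vector<int> simulate(std::size_t initial, std::size_t added_by_splitting) {
  std::vector<Block> blocks;
  for (std::size_t i = 0; i < initial; ++i)
    blocks.push_back(Block{static_cast<int>(i)});

  const std::size_t watermark = blocks.size();  // plays the role of bypass_last_basic_block

  for (std::size_t i = 0; i < added_by_splitting; ++i)
    blocks.push_back(Block{static_cast<int>(initial + i)});

  return blocks_to_visit(blocks, watermark);
}
```

Starting with three blocks and adding two by splitting, only the original
three are visited.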

Bootstrapped/regtested on {x86_64,i686,powerpc64{,le}}-linux, ok for trunk?

2017-02-06  Jakub Jelinek  

PR rtl-optimization/79386
* cprop.c (bypass_conditional_jumps): Initialize
bypass_last_basic_block already before splitting bbs after
unconditional traps...
(bypass_conditional_jumps): ... rather than here.

* gcc.c-torture/compile/pr79386.c: New test.

--- gcc/cprop.c.jj  2017-01-30 09:31:48.0 +0100
+++ gcc/cprop.c 2017-02-06 14:37:07.157093577 +0100
@@ -1697,7 +1697,6 @@ bypass_conditional_jumps (void)
   if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb == EXIT_BLOCK_PTR_FOR_FN (cfun))
 return 0;
 
-  bypass_last_basic_block = last_basic_block_for_fn (cfun);
   mark_dfs_back_edges ();
 
   changed = 0;
@@ -1863,6 +1862,11 @@ one_cprop_pass (void)
  }
}
 
+  /* Make sure bypass_conditional_jumps will ignore not just its new
+basic blocks, but also the ones after unconditional traps (those are
+unreachable and will be eventually removed as such).  */
+  bypass_last_basic_block = last_basic_block_for_fn (cfun);
+
   while (!uncond_traps.is_empty ())
{
  rtx_insn *insn = uncond_traps.pop ();
--- gcc/testsuite/gcc.c-torture/compile/pr79386.c.jj  2017-02-06 14:43:31.932063697 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr79386.c   2017-02-06 14:43:19.0 +0100
@@ -0,0 +1,46 @@
+/* PR rtl-optimization/79386 */
+
+int a, b;
+
+int
+foo (int x)
+{
+  int c;
+  int *d, *e;
+
+  if (b == 0)
+{
+  c = 0;
+  e = 
+  d = 
+}
+  else
+{
+  int f;
+
+  c = 1;
+  for (f = 0; f < 9; ++f)
+   c *= 3;
+  e = (int *) (__UINTPTR_TYPE__) c;
+  d = 
+}
+  *e = c < 3;
+  if (*e != 0)
+{
+  int g;
+
+  b += (a != 0) ? a : 1;
+  if (g != 0 || x != 0)
+   *d = 0;
+  if (b >= 0)
+   {
+ if (g != 0)
+   g = x;
+ if (*d / g != 0)
+   for (;;)
+ ;
+   }
+}
+
+  return b * (a != 0 && *d != 0);
+}

Jakub


[C++ PATCH] Fix ICE in invalid decomposition (PR c++/79372)

2017-02-06 Thread Jakub Jelinek
Hi!

The following patch fixes an ICE that occurs when cp_finish_decomp, already
while parsing the template, manages to diagnose the decomposition as
erroneous.  In that case the types of all the decls for the decomp
identifiers are error_mark_node, but the decls don't have the
DECL_VALUE_EXPR that tsubst_decomp_names has been asserting.  Fixed thusly,
bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
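
The shape of the fix (mark the declaration with an error sentinel, then skip
the later processing step when the sentinel is seen) can be sketched in
isolation.  Type, Decl, error_type and should_substitute_names are
hypothetical names for illustration; error_type merely plays the role that
error_mark_node plays in the front end.

```cpp
#include <string>
#include <vector>

struct Type {
  std::string name;
};

// Sentinel compared by address, in the spirit of error_mark_node.
const Type error_type{"<error>"};

struct Decl {
  const Type *type;
  bool is_decomposition;
  std::vector<std::string> names;
};

// Only substitute the decomposition names when the pattern declaration was
// not already diagnosed as erroneous.
bool should_substitute_names(const Decl &pattern) {
  return pattern.is_decomposition && pattern.type != &error_type;
}
```

A declaration whose type was set to the sentinel at parse time is then left
alone at instantiation time instead of tripping an assertion.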

2017-02-06  Jakub Jelinek  

PR c++/79372
* decl.c (cp_finish_decomp): On error set decl type to error_mark_node.
* pt.c (tsubst_expr): Don't call tsubst_decomp_names on decompositions
with error_mark_node type.

* g++.dg/cpp1z/decomp25.C: New test.

--- gcc/cp/decl.c.jj  2017-01-31 09:26:02.0 +0100
+++ gcc/cp/decl.c   2017-02-06 18:02:02.115840352 +0100
@@ -7378,6 +7378,7 @@ cp_finish_decomp (tree decl, tree first,
}
  first = DECL_CHAIN (first);
}
+  TREE_TYPE (decl) = error_mark_node;
   if (DECL_P (decl) && DECL_NAMESPACE_SCOPE_P (decl))
SET_DECL_ASSEMBLER_NAME (decl, get_identifier (""));
   return;
--- gcc/cp/pt.c.jj  2017-02-03 23:35:37.0 +0100
+++ gcc/cp/pt.c 2017-02-06 18:12:17.533828738 +0100
@@ -15765,7 +15765,9 @@ tsubst_expr (tree t, tree args, tsubst_f
  const_init = (DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P
(pattern_decl));
cp_finish_decl (decl, init, const_init, NULL_TREE, 0);
-   if (VAR_P (decl) && DECL_DECOMPOSITION_P (decl))
+   if (VAR_P (decl)
+   && DECL_DECOMPOSITION_P (decl)
+   && TREE_TYPE (pattern_decl) != error_mark_node)
  {
unsigned int cnt;
tree first;
--- gcc/testsuite/g++.dg/cpp1z/decomp25.C.jj  2017-02-06 18:03:40.318562362 +0100
+++ gcc/testsuite/g++.dg/cpp1z/decomp25.C   2017-02-06 18:15:37.0 +0100
@@ -0,0 +1,20 @@
+// PR c++/79372
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+template <typename T>
+struct S
+{
+  enum E { A };
+  void f () { auto [x] = 0; x++; } // { dg-error "cannot decompose non-array non-class type" }
+   // { dg-warning "decomposition declaration only available with" "" { target c++14_down } .-1 }
+  void g (T t) { auto [y] = t; y++; }  // { dg-error "cannot decompose non-array non-class type" }
+}; // { dg-warning "decomposition declaration only available with" "" { target c++14_down } .-1 }
+
+int
+main ()
+{
+  S <int> s;
+  s.f ();
+  s.g (5);
+}

Jakub


[C++ PATCH] Fix ICE with -fpermissive on post{inc,dec}rement (PR c++/79377)

2017-02-06 Thread Jakub Jelinek
Hi!

As the testcase shows, with -fpermissive we call operator++() or
operator--() if operator++(int) or operator--(int) is not available,
but the new code in build_min_non_dep_op_overload wasn't expecting
such a change.  Fixed thusly, bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?
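
As a rough standalone model of the adjusted check: a postfix operator
normally carries one extra (dummy int) argument, except when -fpermissive
has substituted the prefix operator.  Op and expected_arg_count are made-up
names, and base_arity stands in for whatever cp_tree_code_length returns;
this is a sketch of the condition in the patch, not the GCC function.

```cpp
enum class Op { PostIncrement, PostDecrement, PreIncrement, Other };

// Returns the argument count the assertion should expect.
// base_arity : arity of the operator without the postfix dummy argument
// nargs      : number of arguments the resolved call really has
// permissive : whether -fpermissive is in effect
int expected_arg_count(Op op, int base_arity, int nargs, bool permissive) {
  int expected = base_arity;
  const bool is_postfix =
      op == Op::PostIncrement || op == Op::PostDecrement;
  // With -fpermissive the call may be to the prefix operator, in which case
  // nargs already equals the unadjusted count and nothing is added.
  if (is_postfix && (!permissive || nargs != expected))
    expected += 1;
  return expected;
}
```

Without -fpermissive a one-argument call for a postfix operator still yields
an expected count of two, so the assertion keeps catching real mismatches.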

2017-02-06  Jakub Jelinek  

PR c++/79377
* tree.c (build_min_non_dep_op_overload): For POST{INC,DEC}REMENT_EXPR
allow one fewer than expected arguments if flag_permissive.

* g++.dg/lookup/pr79377.C: New test.

--- gcc/cp/tree.c.jj  2017-01-26 20:35:13.0 +0100
+++ gcc/cp/tree.c   2017-02-06 17:30:26.101413319 +0100
@@ -2938,8 +2938,10 @@ build_min_non_dep_op_overload (enum tree
   nargs = call_expr_nargs (non_dep);
 
   expected_nargs = cp_tree_code_length (op);
-  if (op == POSTINCREMENT_EXPR
-  || op == POSTDECREMENT_EXPR)
+  if ((op == POSTINCREMENT_EXPR
+   || op == POSTDECREMENT_EXPR)
+  /* With -fpermissive non_dep could be operator++().  */
+  && (!flag_permissive || nargs != expected_nargs))
 expected_nargs += 1;
   gcc_assert (nargs == expected_nargs);
 
--- gcc/testsuite/g++.dg/lookup/pr79377.C.jj  2017-02-06 17:41:22.316931623 +0100
+++ gcc/testsuite/g++.dg/lookup/pr79377.C   2017-02-06 17:37:09.0 +0100
@@ -0,0 +1,36 @@
+// PR c++/79377
+// { dg-do run }
+// { dg-options "-fpermissive" }
+
+struct A
+{
+  A () : a (0) {}
+  A& operator++ () { ++a; ++c; return *this; }
+  int a;
+  static int c;
+};
+
+int A::c = 0;
+
+template 
+void
+foo (A& a)
+{
+  a++; // { dg-warning "trying prefix operator instead" }
+  if (A::c != 3 || a.a != 3) __builtin_abort ();
+  ++a;
+  if (A::c != 4 || a.a != 4) __builtin_abort ();
+}
+
+int
+main ()
+{
+  A a;
+  if (A::c != 0 || a.a != 0) __builtin_abort ();
+  ++a;
+  if (A::c != 1 || a.a != 1) __builtin_abort ();
+  a++; // { dg-warning "trying prefix operator instead" }
+  if (A::c != 2 || a.a != 2) __builtin_abort ();
+  foo (a);
+  if (A::c != 4 || a.a != 4) __builtin_abort ();
+}

Jakub


[C++ PATCH] Fix ICEs with #pragma GCC ivdep (PR c++/79379)

2017-02-06 Thread Jakub Jelinek
Hi!

#pragma GCC ivdep can make it through into potential_constant_expression_1
as the following testcase shows.  As it is just a hint to the compiler,
I think it isn't needed to make loops with that pragma non-constexpr, so
the patch just ignores the hint for determination of what is a potential
constant expression and ditto in cxx_eval_constant_expression.  The pragma
is represented by ANNOTATE_EXPR with first operand being the condition
of the loop and second operand annot_expr_ivdep_kind.
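
The approach of looking through the annotation and evaluating the wrapped
condition can be illustrated with a toy expression evaluator.  Expr and
evaluate are hypothetical and far simpler than GCC trees; the only point is
that the Annotate case forwards to its first operand and ignores the hint.

```cpp
struct Expr {
  enum Kind { Constant, Add, Annotate };
  Kind kind;
  int value;        // used by Constant
  const Expr *op0;  // Add: left operand; Annotate: the wrapped expression
  const Expr *op1;  // Add: right operand; unused otherwise
};

int evaluate(const Expr &e) {
  switch (e.kind) {
    case Expr::Constant:
      return e.value;
    case Expr::Add:
      return evaluate(*e.op0) + evaluate(*e.op1);
    case Expr::Annotate:
      // The annotation is only a hint, so the result is that of operand 0.
      return evaluate(*e.op0);
  }
  return 0;
}
```

Wrapping an expression in an Annotate node therefore never changes its value.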

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-06  Jakub Jelinek  

PR c++/79379
* constexpr.c (cxx_eval_constant_expression): Handle ANNOTATE_EXPR.
(potential_constant_expression_1): Likewise.

* g++.dg/cpp1y/constexpr-79379.C: New test.

--- gcc/cp/constexpr.c.jj   2017-02-03 23:35:37.0 +0100
+++ gcc/cp/constexpr.c  2017-02-06 16:00:21.903921006 +0100
@@ -4518,6 +4518,14 @@ cxx_eval_constant_expression (const cons
 *non_constant_p = true;
   return t;
 
+case ANNOTATE_EXPR:
+  gcc_assert (tree_to_uhwi (TREE_OPERAND (t, 1)) == annot_expr_ivdep_kind);
+  r = cxx_eval_constant_expression (ctx, TREE_OPERAND (t, 0),
+   lval,
+   non_constant_p, overflow_p,
+   jump_target);
+  break;
+
 default:
   if (STATEMENT_CODE_P (TREE_CODE (t)))
{
@@ -5689,6 +5697,10 @@ potential_constant_expression_1 (tree t,
return false;
   }
 
+case ANNOTATE_EXPR:
+  gcc_assert (tree_to_uhwi (TREE_OPERAND (t, 1)) == annot_expr_ivdep_kind);
+  return RECUR (TREE_OPERAND (t, 0), rval);
+
 default:
   if (objc_is_property_ref (t))
return false;
--- gcc/testsuite/g++.dg/cpp1y/constexpr-79379.C.jj 2017-02-06 16:09:30.457863715 +0100
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-79379.C    2017-02-06 16:08:49.0 +0100
@@ -0,0 +1,19 @@
+// PR c++/79379
+// { dg-do compile { target c++14 } }
+// { dg-options "-O2" }
+
+template <int N>
+constexpr int
+foo (int x)
+{
+  int q[64] = { 0 }, r = 0;
+#pragma GCC ivdep
+  for (int i = 0; i < x; ++i)
+q[i] += 2;
+  for (int i = 0; i < x; ++i)
+r += q[i];
+  return r + N;
+}
+
+constexpr int a = foo<0> (17);
+static_assert (a == 34, "");

Jakub


Re: [PATCH] sched: Do not move expensive insns speculatively (PR68664)

2017-02-06 Thread Segher Boessenkool
On Fri, Feb 03, 2017 at 06:28:21PM -0600, Segher Boessenkool wrote:
> On Fri, Feb 03, 2017 at 04:31:58PM -0700, Jeff Law wrote:
> > >+@deftypefn {Target Hook} bool TARGET_SCHED_CAN_SPECULATE_INSN (rtx_insn 
> > >*@var{insn})
> > >+Some instructions should never be speculated by the schedulers, usually
> > >+ because the instruction is too expensive to get this wrong.  This hook
> > >+ should return @code{false} if @var{insn} should not be speculated.
> > >+@end deftypefn
> > Consider adding something like this:
> > 
> > Define this hook to return false for instructions which are not fully 
> > modeled by the pipeline description to avoid DFA size explosion. 
> > Otherwise the scheduler may erroneously speculate those instructions 
> > into a pipeline bubble that is too small which may severely impact 
> > performance.
> 
> Well, it speculates it even _if_ you correctly model it, currently
> anyway.  But I'll write something similar, good idea.

This is what I committed:

+DEFHOOK
+(can_speculate_insn,
+ "Some instructions should never be speculated by the schedulers, usually\n\
+ because the instruction is too expensive to get this wrong.  Often such\n\
+ instructions have long latency, and often they are not fully modeled in the\n\
+ pipeline descriptions.  This hook should return @code{false} if @var{insn}\n\
+ should not be speculated.",
+ bool, (rtx_insn *insn), hook_bool_rtx_insn_true)
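
For illustration, the hook mechanism amounts to a function pointer with a
permissive default that a target may replace.  The following is a minimal
sketch; Insn, its latency field, the threshold of 20 and all function names
are invented for the example and are not part of GCC.

```cpp
// Hypothetical instruction record; a real rtx_insn carries no such field.
struct Insn {
  int latency;
};

// Default behavior: every instruction may be speculated, mirroring a
// default hook that always returns true.
bool default_can_speculate(const Insn &) { return true; }

// Hypothetical target override: refuse to speculate long-latency
// instructions.  The threshold is arbitrary.
bool expensive_target_can_speculate(const Insn &insn) {
  return insn.latency < 20;
}

using SpeculateHook = bool (*)(const Insn &);

// The scheduler consults whichever hook the target installed.
bool scheduler_may_speculate(SpeculateHook hook, const Insn &insn) {
  return hook(insn);
}
```

A target that installs the second function keeps, say, a 40-cycle divide from
being speculated while leaving cheap instructions unaffected.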


Segher


Re: [PATCH] Fix __atomic to not implement atomic loads with CAS.

2017-02-06 Thread Jakub Jelinek
On Wed, Feb 01, 2017 at 06:26:04PM +0100, Torvald Riegel wrote:
> On Mon, 2017-01-30 at 19:54 +0100, Torvald Riegel wrote:
> > This patch fixes the __atomic builtins to not implement supposedly
> > lock-free atomic loads based on just a compare-and-swap operation.
> 
> After an off-list OK by Jakub, I have committed this as r245098.
> Jakub will take care of the OpenMP side in a follow-up patch.

Here it is.  It is an ABI change for -fopenmp -mcx16 on x86_64, but -mcx16
used to be ABI incompatible with -mno-cx16, and it affects mostly just
__int128 and long double atomics.
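
In other words, inline expansion now needs both capabilities for the mode in
question.  A minimal sketch of that decision follows; ModeCapabilities and
can_expand_inline are made-up names, and the fields simply stand in for the
results of can_compare_and_swap_p and can_atomic_load_p.

```cpp
// Hypothetical summary of what the target offers for one machine mode.
struct ModeCapabilities {
  bool compare_and_swap;
  bool atomic_load;
};

// Expand inline only when the mode has both a compare-and-swap and a real
// atomic load; otherwise the caller falls back to another strategy.
bool can_expand_inline(const ModeCapabilities &m) {
  return m.compare_and_swap && m.atomic_load;
}
```

A mode that has compare-and-swap but no genuine atomic load is the case that
changes behavior with this patch.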

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2017-02-06  Jakub Jelinek  

* omp-expand.c (expand_omp_atomic_fetch_op,
expand_omp_atomic_pipeline): Return false if can_atomic_load_p is
false.

--- gcc/omp-expand.c.jj 2017-01-25 17:17:52.0 +0100
+++ gcc/omp-expand.c    2017-02-06 13:41:22.703372812 +0100
@@ -6241,7 +6241,7 @@ expand_omp_atomic_fetch_op (basic_block
  matter is that (with the exception of i486 vs i586 and xadd) all targets
  that support any atomic operaton optab also implements compare-and-swap.
  Let optabs.c take care of expanding any compare-and-swap loop.  */
-  if (!can_compare_and_swap_p (imode, true))
+  if (!can_compare_and_swap_p (imode, true) || !can_atomic_load_p (imode))
 return false;
 
   gsi = gsi_last_bb (load_bb);
@@ -6318,7 +6318,8 @@ expand_omp_atomic_pipeline (basic_block
   type = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (addr)));
   itype = TREE_TYPE (TREE_TYPE (cmpxchg));
 
-  if (!can_compare_and_swap_p (TYPE_MODE (itype), true))
+  if (!can_compare_and_swap_p (TYPE_MODE (itype), true)
+  || !can_atomic_load_p (TYPE_MODE (itype)))
 return false;
 
   /* Load the initial value, replacing the GIMPLE_OMP_ATOMIC_LOAD.  */


Jakub


Re: New Port for RISC-V v3

2017-02-06 Thread Jakub Jelinek
On Mon, Feb 06, 2017 at 11:18:12AM -0800, Palmer Dabbelt wrote:
> OK, great!  I think we're all set:
> 
>  * Here's the responses to the documentation comments
>,
>.
> 
>  * I believe the patch was silently dropped because it was over the size
>limits, so I gzip'd the patch and sent it to the mailing list here
>.
> 
>  * We don't touch anything in any of the other ports.
> 
> If you give the OK, then I can commit this as soon as I figure out git-svn
> (which I'm looking at now).

Ok (or just check out svn, apply the patch and commit from svn).

Jakub


Re: New Port for RISC-V v3

2017-02-06 Thread Palmer Dabbelt
On Mon, 06 Feb 2017 00:21:36 PST (-0800), ja...@redhat.com wrote:
> On Sun, Feb 05, 2017 at 10:38:18AM -0800, Palmer Dabbelt wrote:
>> There have been a handful of changes since we submitted our v2 port:
>>
>>  * Some documentation formatting fixes.
>>
>>  * A documentation typo fix.
>>
>>  * Some changes to wwwdocs, which have been mailed to the list.
>>
>>  * The port now builds via contrib/config-list.mk.  I worked around the
>>    warnings in other parts of the codebase with some "#pragma GCC diagnostic
>>    ignored" when I couldn't fix them properly, so the patches aren't useful,
>>    but I fixed the warnings in our port reasonably.  I can try to fix all
>>    these reasonably, but it might take a while.
>>
>> As far as I know there are currently no outstanding problems with this port,
>> so I think it's at the point where we should talk about actually getting the
>> code in.  We have been accepted as maintainers of the port, and I have write
>> access to the repositories, so I think we're all good to go on that end.  Of
>> course if there are any remaining comments I'd love to fix them, but it seems
>> the comments on our v2 were somewhat minimal.
>>
>> What's the procedure for moving forward with the port?
>>
>> Thanks to everyone who helped with reviewing the port!
>>
>> [PATCH 1/6] RISC-V Port: gcc/config/riscv/riscv.c
>> [PATCH 2/6] RISC-V Port: gcc
>> [PATCH 3/6] RISC-V Port: libgcc
>> [PATCH 4/6] RISC-V Port: libatomic
>> [PATCH 5/6] RISC-V Port: gcc/testsuite
>> [PATCH 6/6] RISC-V Port: contrib
>
> Richard in another mail said he is ok with the patchset, Sandra had some
> notes on the documentation patch, and I have seen just 5 of these 6 patches
> posted in v3 (the 2/6 patch is missing).
> From RM POV as long as it doesn't affect other targets it is ok for trunk,
> but please don't delay it too much (i.e. resolve Sandra's comments, post the
> missing patch, then check it in).

OK, great!  I think we're all set:

 * Here's the responses to the documentation comments
   ,
   .

 * I believe the patch was silently dropped because it was over the size
   limits, so I gzip'd the patch and sent it to the mailing list here
   .

 * We don't touch anything in any of the other ports.

If you give the OK, then I can commit this as soon as I figure out git-svn
(which I'm looking at now).

Thanks!


Re: Missing Patch #2 from the RISC-V v3 Submission

2017-02-06 Thread Palmer Dabbelt
I think this got dropped because it was over the size limit for this
mailing list.  I've attached a gzip'd version of the patch instead.

On Mon, Feb 6, 2017 at 10:53 AM, Palmer Dabbelt  wrote:
> Sorry, I'm not sure what happened to this patch.  Here's patch #2 from our v3
> submission, which I seem to have dropped.
>


0001-RISC-V-Port-gcc.patch.gz
Description: GNU Zip compressed data


[PATCH 2/2] Alphabetize RISC-V Options section

2017-02-06 Thread Palmer Dabbelt
2017-02-06  Palmer Dabbelt  

* doc/invoke.texi (RISC-V Options): Alphabetize.
---
 gcc/ChangeLog   |   4 ++
 gcc/doc/invoke.texi | 158 ++--
 2 files changed, 83 insertions(+), 79 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 60eee29..d22fad7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2017-02-06  Palmer Dabbelt  
 
+   * doc/invoke.texi (RISC-V Options): Alphabetize.
+
+2017-02-06  Palmer Dabbelt  
+
* doc/invoke.texi (RISC-V Options): Use two spaces to separate
options.
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ae413b1..cabaa1b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -959,6 +959,20 @@ Objective-C and Objective-C++ Dialects}.
 @emph{PowerPC Options}
 See RS/6000 and PowerPC Options.
 
+@emph{RISC-V Options}
+@gccoptlist{-mbranch-cost=@var{N-instruction} @gol
+-mmemcpy  -mno-memcpy @gol
+-mplt  -mno-plt @gol
+-mabi=@var{ABI-string} @gol
+-mfdiv  -mno-fdiv @gol
+-mdiv  -mno-div @gol
+-march=@var{ISA-string} @gol
+-mtune=@var{processor-string} @gol
+-msmall-data-limit=@var{N-bytes} @gol
+-msave-restore  -mno-save-restore @gol
+-mcmodel=@var{code-model} @gol
+-mexplicit-relocs  -mno-explicit-relocs @gol}
+
 @emph{RL78 Options}
 @gccoptlist{-msim  -mmul=none  -mmul=g13  -mmul=g14  -mallregs @gol
 -mcpu=g10  -mcpu=g13  -mcpu=g14  -mg10  -mg13  -mg14 @gol
@@ -1026,20 +1040,6 @@ See RS/6000 and PowerPC Options.
 -mstack-protector-guard-offset=@var{offset} @gol
 -mlra  -mno-lra}
 
-@emph{RISC-V Options}
-@gccoptlist{-mbranch-cost=@var{N-instruction} @gol
--mmemcpy  -mno-memcpy @gol
--mplt  -mno-plt @gol
--mabi=@var{ABI-string} @gol
--mfdiv  -mno-fdiv @gol
--mdiv  -mno-div @gol
--march=@var{ISA-string} @gol
--mtune=@var{processor-string} @gol
--msmall-data-limit=@var{N-bytes} @gol
--msave-restore  -mno-save-restore @gol
--mcmodel=@var{code-model} @gol
--mexplicit-relocs  -mno-explicit-relocs @gol}
-
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
 -mcpu=@gol
@@ -13756,9 +13756,9 @@ platform.
 * PDP-11 Options::
 * picoChip Options::
 * PowerPC Options::
+* RISC-V Options::
 * RL78 Options::
 * RS/6000 and PowerPC Options::
-* RISC-V Options::
 * RX Options::
 * S/390 and zSeries Options::
 * Score Options::
@@ -20831,6 +20831,70 @@ these warnings.
 
 These are listed under @xref{RS/6000 and PowerPC Options}.
 
+@node RISC-V Options
+@subsection RISC-V Options
+@cindex RISC-V Options
+
+These command-line options are defined for RISC-V targets:
+
+@table @gcctabopt
+@item -mbranch-cost=@var{n}
+@opindex mbranch-cost
+Set the cost of branches to roughly @var{n} instructions.
+
+@item -mmemcpy
+@itemx -mno-memcpy
+@opindex mmemcpy
+Don't optimize block moves.
+
+@item -mplt
+@itemx -mno-plt
+@opindex plt
+When generating PIC code, allow the use of PLTs. Ignored for non-PIC.
+
+@item -mabi=@var{ABI-string}
+@opindex mabi
+Specify integer and floating-point calling convention.  This defaults to the
+natural calling convention: e.g.@ LP64 for RV64I, ILP32 for RV32I, LP64D for
+RV64G.
+
+@item -mfdiv
+@itemx -mno-fdiv
+@opindex mfdiv
+Use hardware floating-point divide and square root instructions.  This requires
+the F or D extensions for floating-point registers.
+
+@item -mdiv
+@itemx -mno-div
+@opindex mdiv
+Use hardware instructions for integer division.  This requires the M extension.
+
+@item -march=@var{ISA-string}
+@opindex march
+Generate code for given RISC-V ISA (e.g.@ @samp{rv64im}).  ISA strings must be
+lower-case.  Examples include @samp{rv64i}, @samp{rv32g}, and @samp{rv32imaf}.
+
+@item -mtune=@var{processor-string}
+@opindex mtune
+Optimize the output for the given processor, specified by microarchitecture
+name.
+
+@item -msmall-data-limit=@var{n}
+@opindex msmall-data-limit
+Put global and static data smaller than @var{n} bytes into a special section
+(on some targets).
+
+@item -msave-restore
+@itemx -mno-save-restore
+@opindex msave-restore
+Use smaller but slower prologue and epilogue code.
+
+@item -mcmodel=@var{code-model}
+@opindex mcmodel
+Specify the code model.
+
+@end table
+
 @node RL78 Options
 @subsection RL78 Options
 @cindex RL78 Options
@@ -22152,70 +22216,6 @@ offset from that base register. The default for those is as specified in the
 relevant ABI.
 @end table
 
-@node RISC-V Options
-@subsection RISC-V Options
-@cindex RISC-V Options
-
-These command-line options are defined for RISC-V targets:
-
-@table @gcctabopt
-@item -mbranch-cost=@var{n}
-@opindex mbranch-cost
-Set the cost of branches to roughly @var{n} instructions.
-
-@item -mmemcpy
-@itemx -mno-memcpy
-@opindex mmemcpy
-Don't optimize block moves.
-
-@item -mplt
-@itemx -mno-plt
-@opindex plt
-When generating PIC code, allow the use of PLTs. Ignored for non-PIC.
-
-@item -mabi=@var{ABI-string}
-@opindex mabi
-Specify integer and floating-point calling convention.  

[PATCH 1/2] Use two spaces to separate options in the RISC-V docs

2017-02-06 Thread Palmer Dabbelt
gcc/ChangeLog:

2017-02-06  Palmer Dabbelt  

* doc/invoke.texi (RISC-V Options): Use two spaces to separate
options.
---
 gcc/ChangeLog   |  5 +
 gcc/doc/invoke.texi | 12 ++--
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 28b8156..60eee29 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2017-02-06  Palmer Dabbelt  
 
+   * doc/invoke.texi (RISC-V Options): Use two spaces to separate
+   options.
+
+2017-02-06  Palmer Dabbelt  
+
* config/riscv/riscv.c: New file.
* gcc/common/config/riscv/riscv-common.c: Likewise.
* config.gcc: Likewise.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ce3c704..ae413b1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1028,17 +1028,17 @@ See RS/6000 and PowerPC Options.
 
 @emph{RISC-V Options}
 @gccoptlist{-mbranch-cost=@var{N-instruction} @gol
--mmemcpy -mno-memcpy @gol
--mplt -mno-plt @gol
+-mmemcpy  -mno-memcpy @gol
+-mplt  -mno-plt @gol
 -mabi=@var{ABI-string} @gol
--mfdiv -mno-fdiv @gol
--mdiv -mno-div @gol
+-mfdiv  -mno-fdiv @gol
+-mdiv  -mno-div @gol
 -march=@var{ISA-string} @gol
 -mtune=@var{processor-string} @gol
 -msmall-data-limit=@var{N-bytes} @gol
--msave-restore -mno-save-restore @gol
+-msave-restore  -mno-save-restore @gol
 -mcmodel=@var{code-model} @gol
--mexplicit-relocs -mno-explicit-relocs @gol}
+-mexplicit-relocs  -mno-explicit-relocs @gol}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
-- 
2.10.2



RISC-V Post-v3 Documentation Fixups

2017-02-06 Thread Palmer Dabbelt
A few problems in the RISC-V documentation were found as part of the
submission.  Here are some fixes for them, which I'll commit on top of the
rest of the RISC-V port, which has already been accepted.

Thanks for noticing this!



[PATCH] Fix fixincludes for canadian cross builds

2017-02-06 Thread Bernd Edlinger
Hi,

I noticed that there is a subtle problem with build!=host
configurations.

That is, the fixincludes machinery uses the path that would
work on the target system to find the headers that need to be
fixed, but the build machine can have different header files than
the target machine, even if they are at the same location.  This can
theoretically cause a mis-compilation of the target libraries.

However the mkheaders script works on the target, and would fix it up,
but the target libraries are not rebuilt, and they may have used the
wrong fixed headers.

To fix this inconsistency I would like to introduce a new make
variable BUILD_SYSTEM_HEADER_DIR that is identical to SYSTEM_HEADER_DIR
if build==host and which is CROSS_SYSTEM_HEADER_DIR for canadian cross
configs.

Only mkheaders.conf uses SYSTEM_HEADER_DIR because it runs on the
host system, all other places should use BUILD_SYSTEM_HEADER_DIR.
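
The selection rule for the new variable can be written down as a tiny
function.  This is only a model of the ifeq logic in the patch; Config and
build_system_header_dir are hypothetical names, and any paths passed in are
placeholders rather than values taken from a real configuration.

```cpp
#include <string>

// Hypothetical bundle of the relevant configure/make values.
struct Config {
  std::string build;                    // build triplet
  std::string host;                     // host triplet
  std::string system_header_dir;        // SYSTEM_HEADER_DIR
  std::string cross_system_header_dir;  // CROSS_SYSTEM_HEADER_DIR
};

// When build and host match, the ordinary system header directory is the
// right one on the build machine; otherwise use the cross directory.
std::string build_system_header_dir(const Config &c) {
  return c.build == c.host ? c.system_header_dir
                           : c.cross_system_header_dir;
}
```

For a canadian cross (build differs from host) this yields the cross header
directory, so fixincludes reads headers that exist on the build machine.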

I tested this change with different arm-linux-gnueabihf cross
compilers, and verified that mkheaders still works on the host system.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
2017-02-06  Bernd Edlinger  

* Makefile.in (BUILD_SYSTEM_HEADER_DIR): New make variable.
	(LIMITS_H_TEST, if_multiarch, stmp-fixinc): Use BUILD_SYSTEM_HEADER_DIR
	instead of SYSTEM_HEADER_DIR.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in	(revision 245184)
+++ gcc/Makefile.in	(working copy)
@@ -517,11 +517,18 @@
 # macro is also used in a double-quoted context.
 SYSTEM_HEADER_DIR = `echo @SYSTEM_HEADER_DIR@ | sed -e :a -e 's,[^/]*/\.\.\/,,' -e ta`
 
+# Path to the system headers on the build machine
+ifeq ($(build),$(host))
+BUILD_SYSTEM_HEADER_DIR = $(SYSTEM_HEADER_DIR)
+else
+BUILD_SYSTEM_HEADER_DIR = `echo $(CROSS_SYSTEM_HEADER_DIR) | sed -e :a -e 's,[^/]*/\.\.\/,,' -e ta`
+endif
+
 # Control whether to run fixincludes.
 STMP_FIXINC = @STMP_FIXINC@
 
 # Test to see whether <limits.h> exists in the system header files.
-LIMITS_H_TEST = [ -f $(SYSTEM_HEADER_DIR)/limits.h ]
+LIMITS_H_TEST = [ -f $(BUILD_SYSTEM_HEADER_DIR)/limits.h ]
 
 # Directory for prefix to system directories, for
 # each of $(system_prefix)/usr/include, $(system_prefix)/usr/lib, etc.
@@ -572,7 +579,7 @@
 else
   ifeq ($(enable_multiarch),auto)
 # SYSTEM_HEADER_DIR is makefile syntax, cannot be evaluated in configure.ac
-if_multiarch = $(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib/*/crti.o),$(1))
+if_multiarch = $(if $(wildcard $(shell echo $(BUILD_SYSTEM_HEADER_DIR))/../../usr/lib/*/crti.o),$(1))
   else
 if_multiarch =
   endif
@@ -2990,11 +2997,11 @@
 	sysroot_headers_suffix=`echo $${ml} | sed -e 's/;.*$$//'`; \
 	multi_dir=`echo $${ml} | sed -e 's/^[^;]*;//'`; \
 	fix_dir=include-fixed$${multi_dir}; \
-	if ! $(inhibit_libc) && test ! -d ${SYSTEM_HEADER_DIR}; then \
+	if ! $(inhibit_libc) && test ! -d ${BUILD_SYSTEM_HEADER_DIR}; then \
 	  echo The directory that should contain system headers does not exist: >&2 ; \
-	  echo "  ${SYSTEM_HEADER_DIR}" >&2 ; \
+	  echo "  ${BUILD_SYSTEM_HEADER_DIR}" >&2 ; \
 	  tooldir_sysinc=`echo "${gcc_tooldir}/sys-include" | sed -e :a -e "s,[^/]*/\.\.\/,," -e ta`; \
-	  if test "x${SYSTEM_HEADER_DIR}" = "x$${tooldir_sysinc}"; \
+	  if test "x${BUILD_SYSTEM_HEADER_DIR}" = "x$${tooldir_sysinc}"; \
 	  then sleep 1; else exit 1; fi; \
 	fi; \
 	$(mkinstalldirs) $${fix_dir}; \
@@ -3005,7 +3012,7 @@
 	  export TARGET_MACHINE srcdir SHELL MACRO_LIST && \
 	  cd $(build_objdir)/fixincludes && \
 	  $(SHELL) ./fixinc.sh "$${gcc_dir}/$${fix_dir}" \
-	$(SYSTEM_HEADER_DIR) $(OTHER_FIXINCLUDES_DIRS) ); \
+	$(BUILD_SYSTEM_HEADER_DIR) $(OTHER_FIXINCLUDES_DIRS) ); \
 	rm -f $${fix_dir}/syslimits.h; \
 	if [ -f $${fix_dir}/limits.h ]; then \
 	  mv $${fix_dir}/limits.h $${fix_dir}/syslimits.h; \


Re: [wwwdocs] Add a case to porting_to + a question wrt validity of another one

2017-02-06 Thread Markus Trippelsdorf
On 2017.02.06 at 18:13 +0100, Marek Polacek wrote:
> This patch adds a description of something I noticed while doing the 
> Fedora mass rebuild.  Do we want to say more about the invalidity of
> the incomplete type case?
> +
> +GCC 7 no longer accepts ill-formed code involving use of an incomplete type:
> +
> +namespace N {
> +class C;
> +class A {
> +  C fn1();
> +};
> +template  class B : A {
> +  void fn2() { fn1().x; }
> +};
> +}
> +
> +
> +

You could drop the namespace. Also "struct A" would be better, because
otherwise fn1 is private and thus inaccessible in fn2.

You could also mention Chromium and Node.js as prominent examples.
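
For comparison, a sketch of what the simplified example could look like once it is made valid, i.e. with the namespace dropped, "struct A" used, and C complete before the template is defined. The template parameter name T and the member x are assumptions made for illustration only.

```cpp
#include <cassert>

struct C { int x; };  // C is complete before the template definition

struct A {            // struct, so fn1 is accessible from B
  C fn1() { return C{42}; }
};

template <typename T> struct B : A {
  // fn1() does not depend on T, so its return type has to be complete
  // at the point where the template is defined.
  int fn2() { return fn1().x; }
};
```

Moving the definition of C below B would bring back the ill-formed case the porting note is about.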

-- 
Markus


Re: [PATCH] PR target/66144, PowerPC improve vector compare

2017-02-06 Thread Michael Meissner
On Fri, Feb 03, 2017 at 06:07:56PM -0600, Segher Boessenkool wrote:
> On Fri, Feb 03, 2017 at 04:25:00PM -0500, Michael Meissner wrote:
> > +;; Return 1 if operand is either a vector constant of all 0 bits or a
> > +;; vector constant of all 1 bits.
> > +(define_predicate "vector_int_same_bit"
> > +  (match_code "const_vector")
> > +{
> > +  if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT)
> > +return 0;
> > +
> > +  else
> > +return op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode);
> > +})
> 
> This predicate is unused as far as I see?

I removed it.  Thanks.

> > +  /* Optimize vec1 == vec2, to know the mask generates -1/0.  */
> > +  if (GET_MODE_CLASS (dest_mode) == MODE_VECTOR_INT)
> >  {
> > -  tmp = op_true;
> > -  op_true = op_false;
> > -  op_false = tmp;
> > +  if (op_true == constant_m1 && op_false == constant_0)
> > +   {
> > + emit_move_insn (dest, mask);
> > + return 1;
> > +   }
> > +
> > +  else if (op_true == constant_0 && op_false == constant_m1)
> > +   {
> > + emit_insn (gen_rtx_SET (dest, gen_rtx_NOT (dest_mode, mask)));
> > + return 1;
> > +   }
> >  }
> 
> Do you need to test for dest_mode == mask_mode here, like below?

I reworked the test, see below.

> > +  if (op_true == constant_m1 && dest_mode == mask_mode)
> > +op_true = mask;
> > +  else if (!REG_P (op_true) && !SUBREG_P (op_true))
> > +op_true = force_reg (dest_mode, op_true);
> > +
> > +  if (op_false == constant_0 && dest_mode == mask_mode)
> > +op_false = mask;
> > +  else if (!REG_P (op_false) && !SUBREG_P (op_false))
> > +op_false = force_reg (dest_mode, op_false);
> 
> Another thing you could try is, if either op_true or op_false is 0
> or -1, let the result be
>   (mask & op_true) | (~mask & op_false)
> 
> and let the rest of the optimisers sort it out (it's a single vor/vand
> or vorc/vandc, or a vnot, or nothing).  A later improvement perhaps.
> Or does it already handle all cases now :-)

That will be a later optimization if desired.  I reran the tests with no
regressions.
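
A scalar model of the select expression Segher mentions may help to see why the two constant cases collapse to the mask or its complement. This is a sketch on a single 32-bit lane; select_bits is a name made up for this illustration, not an rs6000 function.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise select on one lane: take op_true where the mask bits are 1
// and op_false where they are 0.
constexpr std::uint32_t select_bits(std::uint32_t mask,
                                    std::uint32_t op_true,
                                    std::uint32_t op_false) {
  return (mask & op_true) | (~mask & op_false);
}
```

With op_true all ones and op_false zero the result is the mask itself; with the operands swapped it is the complement of the mask, which corresponds to the move and the NOT emitted in the patch.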

When I reran the spec benchmark in more controlled conditions, I wasn't able to
reproduce the gcc speedups like I saw with a quick run.  I did see the tonto
speedups with these changes.

As I said (though perhaps I wasn't clear), I will be in the office on Tuesday, but
I will be on vacation starting on Wednesday.  I would prefer checking in the
changes today (Monday) just in case something needs fixing before I leave.  Can
I check it into the trunk?

[gcc]
2017-02-06  Michael Meissner  

PR target/66144
* config/rs6000/vector.md (vcond): Allow the true and
false values to be constant vectors with all 0 or all 1 bits set.
(vcondu): Likewise.
* config/rs6000/predicates.md (vector_int_reg_or_same_bit): New
predicate.
(fpmask_comparison_operator): Update comment.
(vecint_comparison_operator): New predicate.
* config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Optimize
vector conditionals when the true and false values are constant
vectors with all 0 bits or all 1 bits set.

[gcc/testsuite]
2017-02-06  Michael Meissner  

PR target/66144
* gcc.target/powerpc/pr66144-1.c: New test.
* gcc.target/powerpc/pr66144-2.c: Likewise.
* gcc.target/powerpc/pr66144-3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 245137)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -808,6 +808,21 @@ (define_predicate "all_ones_constant"
   (and (match_code "const_int,const_double,const_wide_int,const_vector")
(match_test "op == CONSTM1_RTX (mode) && !FLOAT_MODE_P (mode)")))
 
+;; Return 1 if operand is a vector int register or is either a vector constant
+;; of all 0 bits or a vector constant of all 1 bits.
+(define_predicate "vector_int_reg_or_same_bit"
+  (match_code "reg,subreg,const_vector")
+{
+  if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT)
+return 0;
+
+  else if (REG_P (op) || SUBREG_P (op))
+return vint_operand (op, mode);
+
+  else
+return op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode);
+})
+
 ;; Return 1 if operand is 0.0.
 (define_predicate "zero_fp_constant"
   (and (match_code "const_double")
@@ -1260,8 +1275,8 @@ (define_predicate "scc_rev_comparison_op
   (and (match_operand 0 "branch_comparison_operator")
(match_code "ne,le,ge,leu,geu,ordered")))
 
-;; Return 1 if OP is a comparison operator suitable for vector/scalar
-;; comparisons that generate a -1/0 mask.
+;; Return 1 if OP is a comparison operator suitable for floating point
+;; vector/scalar comparisons that generate a -1/0 mask.
 

Re: [PATCH 2/6] RISC-V Port: gcc

2017-02-06 Thread Palmer Dabbelt
On Sun, 05 Feb 2017 21:36:29 PST (-0800), san...@codesourcery.com wrote:
> I didn't see a v3 with the documentation patches go by yet, and I had
> some nit-picky comments on v2 (in addition to the ones Joseph already
> asked for):

No problem.

>  > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 4b13aeb..581c4ef 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -1046,6 +1046,20 @@ See RS/6000 and PowerPC Options.
>>   -mstack-protector-guard-offset=@var{offset} @gol
>>   -mlra  -mno-lra}
>>
>> +@emph{RISC-V Options}
>> +@gccoptlist{-mbranch-cost=@var{N-instruction} @gol
>> +-mmemcpy -mno-memcpy @gol
>> +-mplt -mno-plt @gol
>> +-mabi=@var{ABI-string} @gol
>> +-mfdiv -mno-fdiv @gol
>> +-mdiv -mno-div @gol
>> +-march=@var{ISA-string} @gol
>> +-mtune=@var{processor-string} @gol
>> +-msmall-data-limit=@var{N-bytes} @gol
>> +-msave-restore -mno-save-restore @gol
>> +-mcmodel=@var{code-model} @gol
>> +-mexplicit-relocs -mno-explicit-relocs @gol}
>
> Please use 2 spaces to separate options on the same line in @gccoptlist
> instead of only 1.

https://github.com/riscv/riscv-gcc/commit/7b6385fe73a38221bfbc2a2f7983a266960f7ea4

>> @@ -13881,6 +13895,7 @@ platform.
>>   * PowerPC Options::
>>   * RL78 Options::
>>   * RS/6000 and PowerPC Options::
>> +* RISC-V Options::
>>   * RX Options::
>>   * S/390 and zSeries Options::
>>   * Score Options::
>
> Can we please keep this properly alphabetized?  Likewise with the
> placement of the section itself.

https://github.com/riscv/riscv-gcc/commit/8a64242d0d0b90707e84368fcafeefc1b73c1311

Thanks for the feedback!


Re: [wwwdocs] Fix four references to Intel instruction set references

2017-02-06 Thread Jakub Jelinek
On Mon, Feb 06, 2017 at 06:44:30PM +0100, Gerald Pfeifer wrote:
> Intel's webmasters were at it again (or rather not).
> 
> One old link in svn.html contained "319433-020" in its file name, the 
> new one I found shows "319433-028" on the title page, so I assume this
> is a newer version of the same?

Unfortunately not.  In -025 they've removed most of the instructions
from the manual (moved to another manual), so e.g. -028 is useless for
anything except the very latest ISAs.  So when talking about VL,BW,DQ
at least, all those are in the base manuals now.

Jakub


Re: [wwwdocs] Add a case to porting_to + a question wrt validity of another one

2017-02-06 Thread Jason Merrill
On Mon, Feb 6, 2017 at 12:13 PM, Marek Polacek  wrote:
> This patch adds a description of something I noticed while doing the
> Fedora mass rebuild.  Do we want to say more about the invalidity of
> the incomplete type case?

Here's the relevant rule in the standard:

14.6/8: "If a hypothetical instantiation of a template immediately
following its definition would be ill-formed due to a construct that
does not depend on a template parameter, the program is ill-formed; no
diagnostic is required.  If the interpretation of such a construct in
the hypothetical instantiation is different from the interpretation of
the corresponding construct in any actual instantiation of the
template, the program is ill-formed; no diagnostic is required. [
Note: This can happen in situations including the following:
* a type used in a non-dependent name is incomplete at the point at
which a template is defined but is complete at the point at which an
instantiation is performed, "

> Furthermore I noticed that the testcase below doesn't compile anymore since
> r240874; Nathan, is that desirable?
>
> class C {
> public:
>   virtual ~C() noexcept(false);
> };
> class A : virtual C {
>   virtual C m_fn1() = 0;
> };
> class B : A {};
>
> s.cc:8:7: error: looser throw specifier for ‘virtual B::~B() noexcept (false)’
>  class B : A {};
>^
> s.cc:5:7: error:   overriding ‘virtual A::~A() noexcept’
>  class A : virtual C {
>^

No, this is a bug.
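
To illustrate why the diagnostic is wrong, here is a sketch based on the testcase, using struct instead of class so that the destructors are accessible to the checks. The implicitly declared destructors of A and B should inherit the potentially-throwing specification from C, so there is nothing looser to complain about.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct C {
  virtual ~C() noexcept(false) {}
};
struct A : virtual C {
  virtual C m_fn1() = 0;
};
struct B : A {};

// The implicit destructors call ~C(), which may throw, so they are
// themselves noexcept(false).
static_assert(!noexcept(std::declval<A&>().~A()), "A::~A() may throw");
static_assert(!noexcept(std::declval<B&>().~B()), "B::~B() may throw");
```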

Jason


[wwwdocs] readings.html - remove alpha

2017-02-06 Thread Gerald Pfeifer
I've been looking for replacement links for these for a while, but
if you look at the domain names and what has happened in the industry
the last 15 years, it's kinda clear why I haven't found a whole lot. ;-)

And it's not like this architecture hasn't been EOLed a while ago, so
I went ahead and committed the below.

Gerald

Index: readings.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
retrieving revision 1.256
diff -u -r1.256 readings.html
--- readings.html   3 Feb 2017 07:43:57 -   1.256
+++ readings.html   6 Feb 2017 17:44:52 -
@@ -65,15 +65,6 @@
ARM Documentation
  
 
- alpha
-   Manufacturer: Compaq (DEC)
-   http://www.tru64unix.compaq.com/docs/base_doc/DOCUMENTATION/V51A_HTML/ARH9MBTE/TITLE.HTM;>Calling
-  Standard for AXP Systems documents the calling conventions for
-  Digital Unix; chapters 2-4 apply to other Alpha Unix systems as well.
-   http://h18000.www1.hp.com/cpq-alphaserver/technology/chip-docs.html;>Alpha
-   Hardware Reference Manuals
- 
-
  andes (nds32)
   Manufacturer: Various licenses of Andes Technology Corporation.
   CPUs include: AndesCore families N7, N8, SN8, N9, N10, N12 and N13.


[wwwdocs] Fix four references to Intel instruction set references

2017-02-06 Thread Gerald Pfeifer
Intel's webmasters were at it again (or rather not).

One old link in svn.html contained "319433-020" in its file name, the 
new one I found shows "319433-028" on the title page, so I assume this
is a newer version of the same?

Similar for the other old link in svn.html which has "319433-015.pdf".

And similarly so for the two different(!) links on gcc-5/changes.html.

Hopefully by consolidating to use the same URL for four references
this will become easier in the future.

Applied.

Gerald

Index: svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.208
diff -u -r1.208 svn.html
--- svn.html6 Aug 2016 15:37:39 -   1.208
+++ svn.html6 Feb 2017 15:20:18 -
@@ -374,7 +374,7 @@
   avx-512vlbwdq
   The goal of this branch is to implement the Intel AVX-512{VL,BW,DQ}
   Programming Reference
-  (https://software.intel.com/sites/default/files/managed/c6/a9/319433-020.pdf;>link).
+  (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf;>link).
   The branch is maintained by Yukhin Kirill mailto:kirill.yuk...@intel.com;>kirill.yuk...@intel.com.
   Patches should be marked with the tag [AVX512] in the subject
@@ -390,7 +390,7 @@
 
   mpx
   The goal of this branch is to support Intel MPX technology
-  (http://download-software.intel.com/sites/default/files/319433-015.pdf;>link).
+  (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf;>link).
   The branch is maintained by
   Ilya Enkovich mailto:ilya.enkov...@intel.com;>ilya.enkov...@intel.com
   Patches should be marked with the tag [MPX] in the subject
Index: gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.142
diff -u -r1.142 changes.html
--- gcc-5/changes.html  5 Feb 2017 21:47:16 -   1.142
+++ gcc-5/changes.html  6 Feb 2017 15:20:19 -
@@ -776,7 +776,7 @@
 IA-32/x86-64
   
 New ISA extensions support
-   https://software.intel.com/sites/default/files/managed/0d/53/319433-023.pdf;>
+   https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf;>
AVX-512{BW,DQ,VL,IFMA,VBMI} of Intel's CPU
codenamed Skylake Server was added to GCC.  That includes inline
assembly support, new intrinsics, and basic autovectorization.  These
@@ -788,7 +788,7 @@
-mavx512ifma and for AVX-512 Vector Bit Manipulation
Instructions: -mavx512vbmi.
New ISA extensions support
-   https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf;>
+   https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf;>
Intel MPX was added to GCC.  This new extension is available via the
-mmpx compiler switch.  Intel MPX is a set of processor 
features which,
with compiler, run-time library and OS support, brings increased 
robustness to


[wwwdocs] Add a case to porting_to + a question wrt validity of another one

2017-02-06 Thread Marek Polacek
This patch adds a description of something I noticed while doing the 
Fedora mass rebuild.  Do we want to say more about the invalidity of
the incomplete type case?

Furthermore I noticed that the testcase below doesn't compile anymore since
r240874; Nathan, is that desirable?

class C {
public:
  virtual ~C() noexcept(false);
};
class A : virtual C {
  virtual C m_fn1() = 0;
};
class B : A {};

s.cc:8:7: error: looser throw specifier for ‘virtual B::~B() noexcept (false)’
 class B : A {};
   ^
s.cc:5:7: error:   overriding ‘virtual A::~A() noexcept’
 class A : virtual C {
   ^

If it's desirable that g++ rejects this, we should say something about it in
porting_to.  Any ideas what that might be?

Thanks,

Index: gcc-7/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/porting_to.html,v
retrieving revision 1.3
diff -u -r1.3 porting_to.html
--- gcc-7/porting_to.html   3 Feb 2017 07:55:27 -   1.3
+++ gcc-7/porting_to.html   6 Feb 2017 17:08:44 -
@@ -33,6 +33,23 @@
 
 C++ language issues
 
+Invalid use of incomplete type
+
+
+GCC 7 no longer accepts ill-formed code involving use of an incomplete type:
+
+namespace N {
+class C;
+class A {
+  C fn1();
+};
+template  class B : A {
+  void fn2() { fn1().x; }
+};
+}
+
+
+
 Mangling change for conversion operators
 
 

Marek


Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Dominique d'Humières
For the record the test XPASS on darwin16 since at least r244095, but not on 
darwin10:

XPASS: g++.dg/tls/thread_local-order2.C  -std=c++11 execution test
XPASS: g++.dg/tls/thread_local-order2.C  -std=c++14 execution test
XPASS: g++.dg/tls/thread_local-order2.C  -std=c++11 execution test
XPASS: g++.dg/tls/thread_local-order2.C  -std=c++14 execution test

Dominique



Re: Fix profile updating in ifcombine

2017-02-06 Thread Richard Earnshaw (lists)
On 06/02/17 15:54, Jiong Wang wrote:
> On 06/02/17 15:26, Jan Hubicka wrote:
>> I think it is not a regression, just the testcase is fragile and
>> depends on outcome
>> of ifcombine.  It seems it was updated several times in the past. I am
>> not quite
>> sure what the test is testing.
> 
> They are trying to make sure optimal stack adjustment decisions are made.
> 
> Fixing the testcases by disabling the relevant transformation passes looks
> like one way to me.  The other way, which might be more reliable, is to dump
> the decisions made during aarch64 frame layout if dump_file is set, and to
> prefix each dump entry with the function name so that it is easier for
> dejagnu to catch.  We then scan the rtl dump instead of instructions.
>  
> 

We only care that the epilogue instructions appear at least once.  So
for the epilogue we should probably just adjust the tests to use
scan-assembler, rather than scan-assembler-times.
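
The difference between the two kinds of check can be modelled roughly as below. This is a simplified sketch: the real DejaGnu directives match regular expressions against the compiler output, while this model uses plain substrings, and the function names are invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Rough model of scan-assembler-times: count non-overlapping matches.
std::size_t count_occurrences(const std::string& text,
                              const std::string& needle) {
  assert(!needle.empty());
  std::size_t count = 0;
  for (std::size_t pos = text.find(needle); pos != std::string::npos;
       pos = text.find(needle, pos + needle.size()))
    ++count;
  return count;
}

// Rough model of scan-assembler: the pattern only has to appear once.
bool appears_at_least_once(const std::string& text,
                           const std::string& needle) {
  return text.find(needle) != std::string::npos;
}
```

If a pass such as ifcombine changes how many copies of the epilogue are emitted, an exact count breaks while the presence check keeps passing.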

R.


Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Mike Stump
On Feb 6, 2017, at 3:33 AM, Rainer Orth  wrote:
> 
> Hi Gerald,
> 
>> Copying the two guys listed as testsuite maintainers in gcc/MAINTAINERS
>> may help; let me do that for you.
>> 
>> That said, if this fails to fail, the patch might be considered obvious,
>> not requiring approval?
> 
> it's not: while it may XPASS with newer glibc versions, it still XFAILs
> e.g. on Solaris (and probably others).  So unconditionally removing the
> xfail *-*-* trades an XPASS->PASS on some Linux versions against a
> XFAIL->FAIL elsewhere, which isn't acceptable.

So, if it passes most everywhere, then I think the systems where it fails need 
to be identified and listed.

Are there any solaris systems where it works?

Systems like darwin and freebsd and aix seem to suggest that things should 
generally work; which means that the problem is likely just specific 
implementations of specific software.

Anyone know of any other systems where it fails?

I'll copy Jason to see if he recalls any systems where this might still fail.

Anyway, I'd recommend just xfailing on solaris, and getting on with life.  
Other systems that fail, will trivially add themselves, or identify a way to 
xfail or otherwise mark as unsupported or add a new requirement if they prefer.
Any objections?



[PATCH] Fix test failure at -O0 by pruning output

2017-02-06 Thread Jonathan Wakely

PR libstdc++/79323
* testsuite/20_util/duration/literals/range.cc: Prune extra output
at -O0.

Tested x86_64-linux, committed to trunk.

commit c090246da29b04f260a31886537f1e4790652f08
Author: Jonathan Wakely 
Date:   Mon Feb 6 14:13:51 2017 +

Fix test failure at -O0 by pruning output

PR libstdc++/79323
* testsuite/20_util/duration/literals/range.cc: Prune extra output
at -O0.

diff --git a/libstdc++-v3/testsuite/20_util/duration/literals/range.cc 
b/libstdc++-v3/testsuite/20_util/duration/literals/range.cc
index 451d161..eafc806 100644
--- a/libstdc++-v3/testsuite/20_util/duration/literals/range.cc
+++ b/libstdc++-v3/testsuite/20_util/duration/literals/range.cc
@@ -28,3 +28,4 @@ test01()
   auto h = 9223372036854775808h;
   // { dg-error "cannot be represented" "" { target *-*-* } 890 }
 }
+// { dg-prune-output "in constexpr expansion" } // needed for -O0
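
As background for the dg-error in this test, a small sketch of why the literal cannot be represented: 9223372036854775808 is one more than the largest value of a 64-bit signed type. The helper name is made up, and the check assumes the representation type of std::chrono::hours is no wider than 64 bits, which is not guaranteed by the standard.

```cpp
#include <cassert>
#include <chrono>
#include <limits>

// Returns true if VALUE fits into the representation type of
// std::chrono::hours (assumed to be at most 64 bits wide).
bool fits_in_hours_rep(unsigned long long value) {
  using rep = std::chrono::hours::rep;
  return value <=
         static_cast<unsigned long long>(std::numeric_limits<rep>::max());
}
```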


Re: [PATCH] avoid overwriting minimum length of unknown strings (PR 79376)

2017-02-06 Thread Jeff Law

On 02/04/2017 10:37 AM, Martin Sebor wrote:

There's a "thinko" in the get_range_strlen() function that computes
the range of possible string lengths for a character pointer that
may point to an array holding a string of unknown length and a string
literal.  The bug lets the function return the length of the string
as the lower bound (and the size of the array -1 as the upper bound),
which isn't correct.  The lower bound in this case must be zero because
the array could be the empty string.

The attached patch corrects this error.
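
A much simplified model of the intended result, not the gimple-fold.c code: the pointer may refer either to an array with unknown contents or to a literal, and the combined range has to cover both. All names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct LenRange {
  std::size_t min;
  std::size_t max;
};

// A char array of ARRAY_SIZE elements with unknown contents can hold
// any string from the empty string up to ARRAY_SIZE - 1 characters.
LenRange unknown_string_in_array(std::size_t array_size) {
  assert(array_size > 0);
  return LenRange{0, array_size - 1};
}

// A string literal has exactly one possible length.
LenRange string_literal(std::size_t length) {
  return LenRange{length, length};
}

// The pointer may refer to either object, so take the union.
LenRange either(LenRange a, LenRange b) {
  return LenRange{std::min(a.min, b.min), std::max(a.max, b.max)};
}
```

For an 8-element array and the literal "abc" this gives [0, 7], whereas the behaviour described above would have reported 3 as the lower bound.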

Martin

gcc-79376.diff


gcc/ChangeLog:

* gimple-fold.c (get_range_strlen): Set the minimum length to zero.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/builtin-sprintf-warn-14.c: New test.
* gcc.dg/tree-ssa/pr79376.c: New test.

OK.
jeff



Re: Fix profile updating in ifcombine

2017-02-06 Thread Jiong Wang

On 06/02/17 15:26, Jan Hubicka wrote:

I think it is not a regression, just the testcase is fragile and depends on
outcome
of ifcombine.  It seems it was updated several times in the past. I am not quite
sure what the test is testing.


They are trying to make sure optimal stack adjustment decisions are made.

Fixing the testcases by disabling the relevant transformation passes looks
like one way to me.  The other way, which might be more reliable, is to dump
the decisions made during aarch64 frame layout if dump_file is set, and to
prefix each dump entry with the function name so that it is easier for
dejagnu to catch.  We then scan the rtl dump instead of instructions.
 



Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-06 Thread Jeff Law

On 02/06/2017 08:33 AM, Richard Biener wrote:

On Mon, Feb 6, 2017 at 4:18 PM, Jeff Law  wrote:

On 02/06/2017 08:15 AM, Jeff Law wrote:


On 02/06/2017 01:11 AM, Richard Biener wrote:


On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:


This is the first of a 4 part series to address the issues around 79095.

This patch addresses improvements in determining ranges of binary
expressions in three ways.

First if we are otherwise unable to find a range for the result of a
MINUS_EXPR, if we know the arguments are not equal, then we know the
resultant range is ~[0,0].

Second, for EXACT_DIV_EXPR, if the numerator has the range ~[0,0], then
resultant range is currently [TYPE_MIN/DENOM,TYPE_MAX/DENOM].  That is
rarely a useful range.   A resultant range of ~[0,0] is actually more
useful
since it often tells us something important about the difference of two
pointers.

Finally, when vrp2 discovers an updated range for an object that had
a range
discovered by vrp1, if the new range is ~[0,0], prefer that new range in
some cases.  This is needed to avoid losing the newly discovered ~[0,0]
range for EXACT_DIV_EXPR.
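
The arithmetic fact behind the second point can be shown with a tiny sketch. This is a plain integer model, not VRP code, and exact_div is a name invented here: if the numerator is nonzero and the division is exact, the quotient is nonzero as well.

```cpp
#include <cassert>

// Models EXACT_DIV_EXPR: the caller guarantees that DEN divides NUM.
long exact_div(long num, long den) {
  assert(den != 0 && num % den == 0);
  return num / den;
}
```

For pointer subtraction this means that a byte difference known to be ~[0,0] yields an element difference that is also ~[0,0].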

Bootstrapped and regression tested with the other patches in this
series.
OK for the trunk?

Jeff

* tree-vrp.c (extract_range_from_binary_expr): For
EXACT_DIV_EXPR,
if the numerator has the range ~[0,0] make the resultant range
~[0,0].  For MINUS_EXPR with no derived range, if the
operands are
known to be not equal, then the resulting range is ~[0,0].
(intersect_ranges): In some cases prefer ~[0,0].

commit b7baf46ab62e28d2dbc22e9dcd4404926d59df18
Author: Jeff Law 
Date:   Fri Feb 3 15:45:58 2017 -0500

Improved ranges

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index b429217..3338d8b 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3298,6 +3298,37 @@ extract_range_from_binary_expr (value_range *vr,

   extract_range_from_binary_expr_1 (vr, code, expr_type, _vr0,
);
 }
+
+  /* EXACT_DIV_EXPR is typically used for pointer subtraction;
+ as a result a ~[0,0] may be better than what has already
+ been computed.
+
+ In particular if numerator has the range ~[0,0], then the
+ result range is going to be something like
+ [MININT/DIVISOR,MAXINT/DIVISOR], which is rarely useful.
+
+ So instead make the result range ~[0,0].  */
+  if (code == EXACT_DIV_EXPR
+  && TREE_CODE (op0) == SSA_NAME
+  && vr0.type == VR_ANTI_RANGE
+  && vr0.min == vr0.max
+  && integer_zerop (vr0.min))
+set_value_range_to_nonnull (vr, TREE_TYPE (op0));



The above belongs in extract_range_from_binary_expr_1, in principle the
cases below as well (though there's pre-existing VARYING result
handling).


Do you want those existing cases moved, it's easy enough to do.


The _1 ones are supposed to be the actual range computations while
the routine you patched is responsible for interfacing with a
lattice.  The
_1 routines can be used from code outside of VRP.


OK.  Good to know.


 /* Extract range information from a unary operation CODE based on
@@ -8620,6 +8651,12 @@ intersect_ranges (enum value_range_type *vr0type,
  else if (vrp_val_is_min (vr1min)
   && vrp_val_is_max (vr1max))
;
+ /* Choose the anti-range if it is ~[0,0], that range is
special
+enough to special case.  */
+ else if (*vr0type == VR_ANTI_RANGE
+  && *vr0min == *vr0max
+  && integer_zerop (*vr0min))
+   ;



Huh.  If I spotted the place of the change correctly then we cannot
arrive
here with vr0 == ~[0,0] as *vr0type is VR_RANGE.  In the case covered
we'd have the only case intersecting [-1, 1] and ~[0,0] that you'd change
to ~[0,0] instead of [-1,1] which generally would be a bad choice (apart
from your implementation error as vr1 is the anti-range here).


Nope.  It's in the right place.  We have a ~[0,0] for vr0 and vr1 is
typically going to be [4,4] or [8,8].  Thus we're in this case:


I matched the above hunk to the following context:

  else if ((maxeq || operand_less_p (*vr0max, vr1max) == 1)
   && (mineq || operand_less_p (vr1min, *vr0min) == 1))
{
  /* ( [  ] ) or ([  ] ) or ( [  ]) */
  if (*vr0type == VR_RANGE
  && vr1type == VR_RANGE)
/* Choose the inner range.  */
;
  else if (*vr0type == VR_ANTI_RANGE
   && vr1type == VR_RANGE)
{
...
  /* Choose the anti-range if the range is effectively varying.  */
  else if (vrp_val_is_min (vr1min)
   && vrp_val_is_max (vr1max))
;
  /* Else choose the range.  */
  else
{
  *vr0type = vr1type;
  *vr0min = vr1min;
  *vr0max = vr1max;
}
}

ah, indeed vr0type is VR_ANTI_RANGE and yes we have the case
of a range with an anti-range "inside".  This also covers 

RE: [PATCH] MIPS: Fix mode mismatch error between Loongson builtin arguments and insn operands.

2017-02-06 Thread Matthew Fortune
Toma Tabacu  writes:
> Matthew Fortune writes:
> >
> > That's not what I hoped but is what I was concerned about as I believe
> > it means we have a change of behaviour.  It boils down to simply
> > ignoring the argument type of unsigned char.  My guess is that a zero
> > extension is created but then immediately eliminated because of the
> paradoxical subreg.
> >
> > I think you need to create a temporary and perform the zero extension
> > to ensure we honour the unsigned char operand:
> >
> >   rtx new_dst = gen_reg_rtx (SImode);
> >   emit_insn (gen_zero_extendqisi2 (new_dst, ops[2].value));
> >   ops[2].value = new_dst;
> >
> > This should mean that the testcase I sent always has a zero extension
> > but if you change the type of 'amount' to be unsigned char then there
> > should not be a zero extension as the argument will be assumed to be
> > correctly zero extended already and the explicitly introduced
> zero_extend will be eliminated.
> >
> 
> I have made it generate a zero_extend instead of a SUBREG.
> However, the pattern associated with gen_zero_extendqisi2 does not work
> with immediate operands, so I had to add an extra step in which the
> argument is put into a QImode register before being passed to
> gen_zero_extendqisi2.
> 
> Is this OK ?
> 
> Regards,
> Toma
> 
> gcc/
> 
>   * config/mips/mips.c (mips_expand_builtin_insn): Convert the QImode
>   argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
>   builtins to SImode and emit a zero-extend, if necessary.
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index
> da7fa8f..bab5b93 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -16571,9 +16571,35 @@ mips_expand_builtin_insn (enum insn_code icode,
> unsigned int nops,  {
>machine_mode imode;
>int rangelo = 0, rangehi = 0, error_opno = 0;
> +  rtx qireg, sireg;
> 
>switch (icode)
>  {
> +/* The third operand of these instructions is in SImode, so we need
> to
> +   bring the corresponding builtin argument from QImode into
> SImode.  */
> +case CODE_FOR_loongson_pshufh:
> +case CODE_FOR_loongson_psllh:
> +case CODE_FOR_loongson_psllw:
> +case CODE_FOR_loongson_psrah:
> +case CODE_FOR_loongson_psraw:
> +case CODE_FOR_loongson_psrlh:
> +case CODE_FOR_loongson_psrlw:
> +  gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> +  sireg = gen_reg_rtx (SImode);
> +  /* We need to put the immediate in a register because
> +  gen_zero_extendqisi2 does not accept immediate operands.  */
> +  if (CONST_INT_P (ops[2].value))
> + {
> +   qireg = gen_reg_rtx (QImode);
> +   emit_insn (gen_rtx_SET (qireg, ops[2].value));
> +   emit_insn (gen_zero_extendqisi2 (sireg, qireg));
> + } else {
> +   emit_insn (gen_zero_extendqisi2 (sireg, ops[2].value));
> + }

Almost but not quite. There is a force_reg helper that takes care of
this i.e. can get rid of the qireg local and the whole if statement.

  emit_insn (gen_zero_extendqisi2 (sireg, force_reg (QImode, ops[2].value)));
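
What the explicit zero extension is meant to guarantee can be modelled in plain C++. This is a sketch of the semantics only, with a made-up function name; it says nothing about the RTL that is actually generated.

```cpp
#include <cassert>
#include <cstdint>

// Models a builtin parameter of type `unsigned char` that the
// instruction consumes as a 32-bit value: only the low 8 bits of the
// caller's value survive, and the result is zero-extended.
std::uint32_t shift_amount_as_simode(int amount) {
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(amount));
}
```

A caller passing 0x101 therefore ends up with a shift amount of 1, which is what honouring the unsigned char argument type requires.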

> +  ops[2].value = sireg;
> +  ops[2].mode = SImode;
> +  break;
> +
>  case CODE_FOR_msa_addvi_b:
>  case CODE_FOR_msa_addvi_h:
>  case CODE_FOR_msa_addvi_w:

OK with that change.

Thanks,
Matthew


Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Jeff Law

On 02/06/2017 08:36 AM, Richard Biener wrote:

On Mon, 6 Feb 2017, Jakub Jelinek wrote:


On Mon, Feb 06, 2017 at 03:26:01PM +0100, Richard Biener wrote:

Works for me.  Like VECT_SCALAR_BOOLEAN_TYPE_P () to not confuse it
with VECTOR_BOOLEAN_TYPE_P?


So like this if it passes bootstrap/regtest?


Ok.

No objections from me either.

jeff



Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Jeff Law

On 02/02/2017 03:32 PM, Jakub Jelinek wrote:

On Thu, Feb 02, 2017 at 10:12:32AM -0700, Jeff Law wrote:

On 02/01/2017 03:45 AM, Richard Biener wrote:


I agree.  But this means we should look for a vectorizer-local fix
without a new global predicate then (there seem to be subtly different
needs and coming up with good names for all of them sounds difficult...).

Well, we could go with Jakub's INTEGRAL_BOOLEAN_TYPE as posted, but in
contexts where we use it and really depend on single bit objects, we add the
precision == 1 check back.  Jakub's patch removes the type precision check
in tree-vect-patterns for example.  There's likely all kinds of places where
we need to add that check as well.


The 3 cases in tree-vect-patterns.c where I've removed the check were
exactly what the proposed macro does, i.e.
  if ((TYPE_PRECISION (TREE_TYPE (rhs1)) != 1
   || !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
  && TREE_CODE (TREE_TYPE (rhs1)) != BOOLEAN_TYPE)
return false;
i.e. bail out unless the rhs1 type is a BOOLEAN_TYPE (any precision,
assuming it has only valid values of 0 and 1) or unless it is an unsigned
precision 1 integer type (i.e. something that, if it also has QImode,
forwprop etc. could have changed a BOOLEAN_TYPE with precision 1 into).

Doh!  I should have read those hunks more closely.  Sorry for creating
a lengthier than necessary discussion.


Jeff
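
For readers following along, here is a small stand-alone sketch of why the
three open-coded tests and the proposed macro agree.  It is not GCC code:
mock_type, mock_code and the function names are hypothetical stand-ins for
tree types, TREE_CODE and friends, and it assumes the type is one of the
three integral kinds the macro mentions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC tree codes; not the real definitions.  */
enum mock_code { MOCK_BOOLEAN_TYPE, MOCK_INTEGER_TYPE, MOCK_ENUMERAL_TYPE };

/* Hypothetical model of the three type properties the check looks at.  */
struct mock_type
{
  enum mock_code code;
  unsigned precision;
  bool is_unsigned;
};

/* The open-coded "bail out" test quoted in the thread.  */
bool
old_check_bails_out (const struct mock_type *t)
{
  return (t->precision != 1 || !t->is_unsigned)
         && t->code != MOCK_BOOLEAN_TYPE;
}

/* Model of VECT_SCALAR_BOOLEAN_TYPE_P as defined in the patch.  */
bool
mock_vect_scalar_boolean_type_p (const struct mock_type *t)
{
  return t->code == MOCK_BOOLEAN_TYPE
         || ((t->code == MOCK_INTEGER_TYPE
              || t->code == MOCK_ENUMERAL_TYPE)
             && t->precision == 1
             && t->is_unsigned);
}

/* Count the modelled types on which the old test and the negated
   macro disagree.  */
int
count_mismatches (void)
{
  const enum mock_code codes[] =
    { MOCK_BOOLEAN_TYPE, MOCK_INTEGER_TYPE, MOCK_ENUMERAL_TYPE };
  int mismatches = 0;

  for (int c = 0; c < 3; c++)
    for (unsigned prec = 1; prec <= 8; prec++)
      for (int uns = 0; uns <= 1; uns++)
        {
          struct mock_type t = { codes[c], prec, uns != 0 };
          if (old_check_bails_out (&t)
              != !mock_vect_scalar_boolean_type_p (&t))
            mismatches++;
        }
  return mismatches;
}

/* Asserts that the two formulations agree on every modelled type.  */
void
run_checks (void)
{
  assert (count_mismatches () == 0);
}
```

Calling run_checks () from a test driver exercises every combination of
code, precision 1..8 and signedness in the model.  The agreement relies on
the stated assumption that only integral types reach the check.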


Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Richard Biener
On Mon, 6 Feb 2017, Jakub Jelinek wrote:

> On Mon, Feb 06, 2017 at 03:26:01PM +0100, Richard Biener wrote:
> > Works for me.  Like VECT_SCALAR_BOOLEAN_TYPE_P () to not confuse it
> > with VECTOR_BOOLEAN_TYPE_P?
> 
> So like this if it passes bootstrap/regtest?

Ok.

Richard.

> 2017-02-06  Jakub Jelinek  
> 
>   PR tree-optimization/79284
>   * tree-vectorizer.h (VECT_SCALAR_BOOLEAN_TYPE_P): Define.
>   * tree-vect-stmts.c (vect_get_vec_def_for_operand,
>   vectorizable_mask_load_store, vectorizable_operation,
>   vect_is_simple_cond, get_same_sized_vectype): Use it instead
>   of comparing TREE_CODE of a type against BOOLEAN_TYPE.
>   * tree-vect-patterns.c (check_bool_pattern, search_type_for_mask_1,
>   vect_recog_bool_pattern, vect_recog_mask_conversion_pattern): Likewise.
>   * tree-vect-slp.c (vect_get_constant_vectors): Likewise.
>   * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
>   Remove redundant gimple_code (stmt) == GIMPLE_ASSIGN test after
>   is_gimple_assign (stmt).  Replace another such test with
>   is_gimple_assign (stmt).
> testsuite/
>   * gcc.c-torture/compile/pr79284.c: New test.
> 
> --- gcc/tree-vectorizer.h.jj  2017-01-01 12:45:36.0 +0100
> +++ gcc/tree-vectorizer.h 2017-02-06 16:17:41.850539082 +0100
> @@ -784,6 +784,18 @@ struct dataref_aux {
>  /* The maximum vectorization factor supported by any target (V64QI).  */
>  #define MAX_VECTORIZATION_FACTOR 64
>  
> +/* Nonzero if TYPE represents a (scalar) boolean type or type
> +   in the middle-end compatible with it (unsigned precision 1 integral
> +   types).  Used to determine which types should be vectorized as
> +   VECTOR_BOOLEAN_TYPE_P.  */
> +
> +#define VECT_SCALAR_BOOLEAN_TYPE_P(TYPE) \
> +  (TREE_CODE (TYPE) == BOOLEAN_TYPE  \
> +   || ((TREE_CODE (TYPE) == INTEGER_TYPE \
> + || TREE_CODE (TYPE) == ENUMERAL_TYPE)   \
> +   && TYPE_PRECISION (TYPE) == 1 \
> +   && TYPE_UNSIGNED (TYPE)))
> +
>  extern vec stmt_vec_info_vec;
>  
>  void init_stmt_vec_info_vec (void);
> --- gcc/tree-vect-stmts.c.jj  2017-01-31 22:36:31.719323924 +0100
> +++ gcc/tree-vect-stmts.c 2017-02-06 16:20:25.435433608 +0100
> @@ -1420,7 +1420,7 @@ vect_get_vec_def_for_operand (tree op, g
>  
>if (vectype)
>   vector_type = vectype;
> -  else if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE
> +  else if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
>  && VECTOR_BOOLEAN_TYPE_P (stmt_vectype))
>   vector_type = build_same_sized_truth_vector_type (stmt_vectype);
>else
> @@ -2029,7 +2029,7 @@ vectorizable_mask_load_store (gimple *st
>  
>mask = gimple_call_arg (stmt, 2);
>  
> -  if (TREE_CODE (TREE_TYPE (mask)) != BOOLEAN_TYPE)
> +  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (mask)))
>  return false;
>  
>/* FORNOW. This restriction should be relaxed.  */
> @@ -5275,9 +5275,9 @@ vectorizable_operation (gimple *stmt, gi
>of booleans or vector of integers).  We use output
>vectype because operations on boolean don't change
>type.  */
> -  if (TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE)
> +  if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op0)))
>   {
> -   if (TREE_CODE (TREE_TYPE (scalar_dest)) != BOOLEAN_TYPE)
> +   if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (scalar_dest)))
>   {
> if (dump_enabled_p ())
>   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> @@ -7666,7 +7666,7 @@ vect_is_simple_cond (tree cond, vec_info
>  
>/* Mask case.  */
>if (TREE_CODE (cond) == SSA_NAME
> -  && TREE_CODE (TREE_TYPE (cond)) == BOOLEAN_TYPE)
> +  && VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (cond)))
>  {
>gimple *lhs_def_stmt = SSA_NAME_DEF_STMT (cond);
>if (!vect_is_simple_use (cond, vinfo, &lhs_def_stmt,
> @@ -9059,7 +9059,7 @@ get_mask_type_for_scalar_type (tree scal
>  tree
>  get_same_sized_vectype (tree scalar_type, tree vector_type)
>  {
> -  if (TREE_CODE (scalar_type) == BOOLEAN_TYPE)
> +  if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
>  return build_same_sized_truth_vector_type (vector_type);
>  
>return get_vectype_for_scalar_type_and_size
> --- gcc/tree-vect-patterns.c.jj   2017-01-31 22:36:29.933346731 +0100
> +++ gcc/tree-vect-patterns.c  2017-02-06 16:20:09.636636952 +0100
> @@ -3158,9 +3158,7 @@ check_bool_pattern (tree var, vec_info *
>break;
>  
>  CASE_CONVERT:
> -  if ((TYPE_PRECISION (TREE_TYPE (rhs1)) != 1
> -|| !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
> -   && TREE_CODE (TREE_TYPE (rhs1)) != BOOLEAN_TYPE)
> +  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1)))
>   return false;
>if (! check_bool_pattern (rhs1, vinfo, stmts))
>   return false;
> @@ -3474,9 +3472,7 @@ search_type_for_mask_1 (tree var, vec_in
>if (TREE_CODE (var) != SSA_NAME)
>  return NULL_TREE;
>  
> -  if 

Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Jeff Law

On 02/06/2017 07:23 AM, Richard Biener wrote:

On Thu, 2 Feb 2017, Jakub Jelinek wrote:


On Thu, Feb 02, 2017 at 10:12:32AM -0700, Jeff Law wrote:

On 02/01/2017 03:45 AM, Richard Biener wrote:


I agree.  But this means we should look for a vectorizer-local fix
without a new global predicate then (there seem to be subtly different
needs and coming up with good names for all of them sounds difficult...).

Well, we could go with Jakub's INTEGRAL_BOOLEAN_TYPE as posted, but in
contexts where we use it and really depend on single bit objects, we add the
precision == 1 check back.  Jakub's patch removes the type precision check
in tree-vect-patterns for example.  There's likely all kinds of places where
we need to add that check as well.


The 3 cases in tree-vect-patterns.c where I've removed the check were
exactly what the proposed macro does, i.e.
  if ((TYPE_PRECISION (TREE_TYPE (rhs1)) != 1
   || !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
  && TREE_CODE (TREE_TYPE (rhs1)) != BOOLEAN_TYPE)
return false;
i.e. bail out unless the rhs1 type is a BOOLEAN_TYPE (any precision,
assuming it has only valid values of 0 and 1) or unless it is an unsigned
precision-1 integer type (i.e. something that (if it also has QImode)
forwprop etc. could have changed a BOOLEAN_TYPE with precision 1 into).

Note Fortran has been using, I think, a precision-one boolean_type_node even
in GCC 6.x; it had:
  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
  boolean_true_node = build_int_cst (boolean_type_node, 1);
  boolean_false_node = build_int_cst (boolean_type_node, 0);
Fortran logical(4) etc. are built using:
  new_type = make_unsigned_type (bit_size);
  TREE_SET_CODE (new_type, BOOLEAN_TYPE);
  TYPE_MAX_VALUE (new_type) = build_int_cst (new_type, 1);
  TYPE_PRECISION (new_type) = 1;
thus I believe they have e.g. SImode or DImode rather than QImode, but still
TYPE_PRECISION of 1.  The non-Ada boolean is also:
  boolean_type_node = make_unsigned_type (BOOL_TYPE_SIZE);
  TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);
  TYPE_PRECISION (boolean_type_node) = 1;
where BOOL_TYPE_SIZE is CHAR_TYPE_SIZE everywhere but on powerpc*-darwin*
with some command line option.
Ada is
  boolean_type_node = make_unsigned_type (8);
  TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);
and thus it is indeed precision 8 QImode.

So, requiring precision 1 for all BOOLEAN_TYPEs in the vectorizer would
only affect Ada.  Requiring QImode or whatever other fixed TYPE_MODE
in the macro for precision 1 unsigned non-BOOLEAN_TYPE integers would
actually also affect Fortran a lot, because e.g. SImode precision 1
INTEGER_TYPE is considered compatible with SImode precision 1 BOOLEAN_TYPE.


/me is frustrated that we have booleans with nonstandard precision, even
though I understand why it was done.  It creates numerous headaches.


Ditto.


I guess I'm mostly frustrated that we have BOOLEAN_TYPE/ENUMERAL_TYPE at
all given precision-1 unsigned INTEGER_TYPEs are fine to represent
0/1 valued entities (aka bools).  But then we also have BImode...

IIRC BImode was the most natural way to support predication on ia64 and
likely could be used in that manner on other targets if we wanted to get
serious about predication.


jeff
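
A rough sketch of the mode point made above, for anyone skimming the
thread.  This is a toy model, not GCC code: mock_type, mock_mode and
mock_qimode_only_p are hypothetical names, and the four sample types simply
transcribe the precision/mode combinations described in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for tree codes and machine modes.  */
enum mock_code { MOCK_BOOLEAN_TYPE, MOCK_INTEGER_TYPE, MOCK_ENUMERAL_TYPE };
enum mock_mode { MOCK_QImode, MOCK_SImode };

struct mock_type
{
  enum mock_code code;
  unsigned precision;
  bool is_unsigned;
  enum mock_mode mode;
};

/* Model of the macro from the patch: the mode is not consulted.  */
bool
mock_vect_scalar_boolean_type_p (const struct mock_type *t)
{
  return t->code == MOCK_BOOLEAN_TYPE
         || ((t->code == MOCK_INTEGER_TYPE
              || t->code == MOCK_ENUMERAL_TYPE)
             && t->precision == 1
             && t->is_unsigned);
}

/* Hypothetical stricter variant that also demands QImode for the
   non-BOOLEAN_TYPE case; this is the alternative argued against.  */
bool
mock_qimode_only_p (const struct mock_type *t)
{
  return t->code == MOCK_BOOLEAN_TYPE
         || ((t->code == MOCK_INTEGER_TYPE
              || t->code == MOCK_ENUMERAL_TYPE)
             && t->precision == 1
             && t->is_unsigned
             && t->mode == MOCK_QImode);
}

/* Sample types transcribed from the description in the thread.  */
const struct mock_type c_bool =
  { MOCK_BOOLEAN_TYPE, 1, true, MOCK_QImode };
const struct mock_type fortran_logical4 =
  { MOCK_BOOLEAN_TYPE, 1, true, MOCK_SImode };
const struct mock_type ada_boolean =
  { MOCK_BOOLEAN_TYPE, 8, true, MOCK_QImode };
/* An SImode precision-1 unsigned integer, considered compatible with
   the Fortran logical(4) above.  */
const struct mock_type simode_uint1 =
  { MOCK_INTEGER_TYPE, 1, true, MOCK_SImode };

void
run_mode_checks (void)
{
  /* The macro as posted accepts all four.  */
  assert (mock_vect_scalar_boolean_type_p (&c_bool));
  assert (mock_vect_scalar_boolean_type_p (&fortran_logical4));
  assert (mock_vect_scalar_boolean_type_p (&ada_boolean));
  assert (mock_vect_scalar_boolean_type_p (&simode_uint1));

  /* A QImode requirement would reject the SImode integer even though
     it is compatible with the Fortran boolean.  */
  assert (!mock_qimode_only_p (&simode_uint1));
}
```

In this model the QImode-restricted variant treats fortran_logical4 and
simode_uint1 differently although the thread describes them as compatible
types, which is the inconsistency the macro as posted avoids.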


Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-06 Thread Richard Biener
On Mon, Feb 6, 2017 at 4:28 PM, Jeff Law  wrote:
> On 02/06/2017 01:11 AM, Richard Biener wrote:
>>
>> On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:
>>>
>>> This is the first of a 4 part series to address the issues around 79095.
>>>
>>> This patch addresses improvements in determining ranges of binary
>>> expressions in three ways.
>>>
>>> First if we are otherwise unable to find a range for the result of a
>>> MINUS_EXPR, if we know the arguments are not equal, then we know the
>>> resultant range is ~[0,0].
>>>
>>> Second, for EXACT_DIV_EXPR, if the numerator has the range ~[0,0], then
>>> resultant range is currently [TYPE_MIN/DENOM,TYPE_MAX/DENOM].  That is
>>> rarely a useful range.   A resultant range of ~[0,0] is actually more
>>> useful
>>> since it often tells us something important about the difference of two
>>> pointers.
>>>
>>> Finally, when vrp2 discovers an updated range for an object that had a
>>> range
>>> discovered by vrp1, if the new range is ~[0,0], prefer that new range in
>>> some cases.  This is needed to avoid losing the newly discovered ~[0,0]
>>> range for EXACT_DIV_EXPR.
>>>
>>> Bootstrapped and regression tested with the other patches in this series.
>>> OK for the trunk?
>>>
>>> Jeff
>>>
>>> * tree-vrp.c (extract_range_from_binary_expr): For
>>> EXACT_DIV_EXPR,
>>> if the numerator has the range ~[0,0] make the resultant range
>>> ~[0,0].  For MINUS_EXPR with no derived range, if the operands
>>> are
>>> known to be not equal, then the resulting range is ~[0,0].
>>> (intersect_ranges): In some cases prefer ~[0,0].
>>>
>>> commit b7baf46ab62e28d2dbc22e9dcd4404926d59df18
>>> Author: Jeff Law 
>>> Date:   Fri Feb 3 15:45:58 2017 -0500
>>>
>>> Improved ranges
>>>
>>> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
>>> index b429217..3338d8b 100644
>>> --- a/gcc/tree-vrp.c
>>> +++ b/gcc/tree-vrp.c
>>> @@ -3298,6 +3298,37 @@ extract_range_from_binary_expr (value_range *vr,
>>>
>>>extract_range_from_binary_expr_1 (vr, code, expr_type, _vr0,
>>> );
>>>  }
>>> +
>>> +  /* EXACT_DIV_EXPR is typically used for pointer subtraction;
>>> + as a result a ~[0,0] may be better than what has already
>>> + been computed.
>>> +
>>> + In particular if numerator has the range ~[0,0], then the
>>> + result range is going to be something like
>>> + [MININT/DIVISOR,MAXINT/DIVISOR], which is rarely useful.
>>> +
>>> + So instead make the result range ~[0,0].  */
>>> +  if (code == EXACT_DIV_EXPR
>>> +  && TREE_CODE (op0) == SSA_NAME
>>> +  && vr0.type == VR_ANTI_RANGE
>>> +  && vr0.min == vr0.max
>>> +  && integer_zerop (vr0.min))
>>> +set_value_range_to_nonnull (vr, TREE_TYPE (op0));
>>
>>
>> The above belongs in extract_range_from_binary_expr_1, in principle the
>> cases below as well (though there's pre-existing VARYING result handling).
>> The _1 ones are supposed to be the actual range computations while
>> the routine you patched is responsible for interfacing with a lattice.
>> The
>> _1 routines can be used from code outside of VRP.
>
> So moving the new MINUS_EXPR code or the existing PLUS_EXPR/MINUS_EXPR code
> is easy, but would require passing in op0/op1.  I'm guessing we don't want
> to do that.
>
> We can still move the EXACT_DIV_EXPR case as that doesn't depend on looking
> at the actual operand.

Yes please.

Richard.

> jeff
>


Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-06 Thread Richard Biener
On Mon, Feb 6, 2017 at 4:18 PM, Jeff Law  wrote:
> On 02/06/2017 08:15 AM, Jeff Law wrote:
>>
>> On 02/06/2017 01:11 AM, Richard Biener wrote:
>>>
>>> On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:

 This is the first of a 4 part series to address the issues around 79095.

 This patch addresses improvements in determining ranges of binary
 expressions in three ways.

 First if we are otherwise unable to find a range for the result of a
 MINUS_EXPR, if we know the arguments are not equal, then we know the
 resultant range is ~[0,0].

 Second, for EXACT_DIV_EXPR, if the numerator has the range ~[0,0], then
 resultant range is currently [TYPE_MIN/DENOM,TYPE_MAX/DENOM].  That is
 rarely a useful range.   A resultant range of ~[0,0] is actually more
 useful
 since it often tells us something important about the difference of two
 pointers.

 Finally, when vrp2 discovers an updated range for an object that had
 a range
 discovered by vrp1, if the new range is ~[0,0], prefer that new range in
 some cases.  This is needed to avoid losing the newly discovered ~[0,0]
 range for EXACT_DIV_EXPR.

 Bootstrapped and regression tested with the other patches in this
 series.
 OK for the trunk?

 Jeff

 * tree-vrp.c (extract_range_from_binary_expr): For
 EXACT_DIV_EXPR,
 if the numerator has the range ~[0,0] make the resultant range
 ~[0,0].  For MINUS_EXPR with no derived range, if the
 operands are
 known to be not equal, then the resulting range is ~[0,0].
 (intersect_ranges): In some cases prefer ~[0,0].

 commit b7baf46ab62e28d2dbc22e9dcd4404926d59df18
 Author: Jeff Law 
 Date:   Fri Feb 3 15:45:58 2017 -0500

 Improved ranges

 diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
 index b429217..3338d8b 100644
 --- a/gcc/tree-vrp.c
 +++ b/gcc/tree-vrp.c
 @@ -3298,6 +3298,37 @@ extract_range_from_binary_expr (value_range *vr,

extract_range_from_binary_expr_1 (vr, code, expr_type, _vr0,
 );
  }
 +
 +  /* EXACT_DIV_EXPR is typically used for pointer subtraction;
 + as a result a ~[0,0] may be better than what has already
 + been computed.
 +
 + In particular if numerator has the range ~[0,0], then the
 + result range is going to be something like
 + [MININT/DIVISOR,MAXINT/DIVISOR], which is rarely useful.
 +
 + So instead make the result range ~[0,0].  */
 +  if (code == EXACT_DIV_EXPR
 +  && TREE_CODE (op0) == SSA_NAME
 +  && vr0.type == VR_ANTI_RANGE
 +  && vr0.min == vr0.max
 +  && integer_zerop (vr0.min))
 +set_value_range_to_nonnull (vr, TREE_TYPE (op0));
>>>
>>>
>>> The above belongs in extract_range_from_binary_expr_1, in principle the
>>> cases below as well (though there's pre-existing VARYING result
>>> handling).
>>
>> Do you want those existing cases moved, it's easy enough to do.
>>
>>> The _1 ones are supposed to be the actual range computations while
>>> the routine you patched is responsible for interfacing with a
>>> lattice.  The
>>> _1 routines can be used from code outside of VRP.
>>
>> OK.  Good to know.
>>
  /* Extract range information from a unary operation CODE based on
 @@ -8620,6 +8651,12 @@ intersect_ranges (enum value_range_type *vr0type,
   else if (vrp_val_is_min (vr1min)
&& vrp_val_is_max (vr1max))
 ;
 + /* Choose the anti-range if it is ~[0,0], that range is
 special
 +enough to special case.  */
 + else if (*vr0type == VR_ANTI_RANGE
 +  && *vr0min == *vr0max
 +  && integer_zerop (*vr0min))
 +   ;
>>>
>>>
>>> Huh.  If I spotted the place of the change correctly then we cannot
>>> arrive
>>> here with vr0 == ~[0,0] as *vr0type is VR_RANGE.  In the case covered
>>> we'd have the only case intersecting [-1, 1] and ~[0,0] that you'd change
>>> to ~[0,0] instead of [-1,1] which generally would be a bad choice (apart
>>> from your implementation error as vr1 is the anti-range here).
>>
>> Nope.  It's in the right place.  We have a ~[0,0] for vr0 and vr1 is
>> typically going to be [4,4] or [8,8].  Thus we're in this case:

I matched the above hunk to the following context:

  else if ((maxeq || operand_less_p (*vr0max, vr1max) == 1)
   && (mineq || operand_less_p (vr1min, *vr0min) == 1))
{
  /* ( [  ] ) or ([  ] ) or ( [  ]) */
  if (*vr0type == VR_RANGE
  && vr1type == VR_RANGE)
/* Choose the inner range.  */
;
  else if (*vr0type == VR_ANTI_RANGE
   && vr1type == VR_RANGE)
{
...
  /* Choose the anti-range if the range is effectively 

Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Jakub Jelinek
On Mon, Feb 06, 2017 at 03:26:01PM +0100, Richard Biener wrote:
> Works for me.  Like VECT_SCALAR_BOOLEAN_TYPE_P () to not confuse it
> with VECTOR_BOOLEAN_TYPE_P?

So like this if it passes bootstrap/regtest?

2017-02-06  Jakub Jelinek  

PR tree-optimization/79284
* tree-vectorizer.h (VECT_SCALAR_BOOLEAN_TYPE_P): Define.
* tree-vect-stmts.c (vect_get_vec_def_for_operand,
vectorizable_mask_load_store, vectorizable_operation,
vect_is_simple_cond, get_same_sized_vectype): Use it instead
of comparing TREE_CODE of a type against BOOLEAN_TYPE.
* tree-vect-patterns.c (check_bool_pattern, search_type_for_mask_1,
vect_recog_bool_pattern, vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-slp.c (vect_get_constant_vectors): Likewise.
* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
Remove redundant gimple_code (stmt) == GIMPLE_ASSIGN test after
is_gimple_assign (stmt).  Replace another such test with
is_gimple_assign (stmt).
testsuite/
* gcc.c-torture/compile/pr79284.c: New test.

--- gcc/tree-vectorizer.h.jj2017-01-01 12:45:36.0 +0100
+++ gcc/tree-vectorizer.h   2017-02-06 16:17:41.850539082 +0100
@@ -784,6 +784,18 @@ struct dataref_aux {
 /* The maximum vectorization factor supported by any target (V64QI).  */
 #define MAX_VECTORIZATION_FACTOR 64
 
+/* Nonzero if TYPE represents a (scalar) boolean type or type
+   in the middle-end compatible with it (unsigned precision 1 integral
+   types).  Used to determine which types should be vectorized as
+   VECTOR_BOOLEAN_TYPE_P.  */
+
+#define VECT_SCALAR_BOOLEAN_TYPE_P(TYPE) \
+  (TREE_CODE (TYPE) == BOOLEAN_TYPE\
+   || ((TREE_CODE (TYPE) == INTEGER_TYPE   \
+   || TREE_CODE (TYPE) == ENUMERAL_TYPE)   \
+   && TYPE_PRECISION (TYPE) == 1   \
+   && TYPE_UNSIGNED (TYPE)))
+
 extern vec stmt_vec_info_vec;
 
 void init_stmt_vec_info_vec (void);
--- gcc/tree-vect-stmts.c.jj2017-01-31 22:36:31.719323924 +0100
+++ gcc/tree-vect-stmts.c   2017-02-06 16:20:25.435433608 +0100
@@ -1420,7 +1420,7 @@ vect_get_vec_def_for_operand (tree op, g
 
   if (vectype)
vector_type = vectype;
-  else if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE
+  else if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
   && VECTOR_BOOLEAN_TYPE_P (stmt_vectype))
vector_type = build_same_sized_truth_vector_type (stmt_vectype);
   else
@@ -2029,7 +2029,7 @@ vectorizable_mask_load_store (gimple *st
 
   mask = gimple_call_arg (stmt, 2);
 
-  if (TREE_CODE (TREE_TYPE (mask)) != BOOLEAN_TYPE)
+  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (mask)))
 return false;
 
   /* FORNOW. This restriction should be relaxed.  */
@@ -5275,9 +5275,9 @@ vectorizable_operation (gimple *stmt, gi
 of booleans or vector of integers).  We use output
 vectype because operations on boolean don't change
 type.  */
-  if (TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE)
+  if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op0)))
{
- if (TREE_CODE (TREE_TYPE (scalar_dest)) != BOOLEAN_TYPE)
+ if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (scalar_dest)))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -7666,7 +7666,7 @@ vect_is_simple_cond (tree cond, vec_info
 
   /* Mask case.  */
   if (TREE_CODE (cond) == SSA_NAME
-  && TREE_CODE (TREE_TYPE (cond)) == BOOLEAN_TYPE)
+  && VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (cond)))
 {
   gimple *lhs_def_stmt = SSA_NAME_DEF_STMT (cond);
   if (!vect_is_simple_use (cond, vinfo, &lhs_def_stmt,
@@ -9059,7 +9059,7 @@ get_mask_type_for_scalar_type (tree scal
 tree
 get_same_sized_vectype (tree scalar_type, tree vector_type)
 {
-  if (TREE_CODE (scalar_type) == BOOLEAN_TYPE)
+  if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
 return build_same_sized_truth_vector_type (vector_type);
 
   return get_vectype_for_scalar_type_and_size
--- gcc/tree-vect-patterns.c.jj 2017-01-31 22:36:29.933346731 +0100
+++ gcc/tree-vect-patterns.c2017-02-06 16:20:09.636636952 +0100
@@ -3158,9 +3158,7 @@ check_bool_pattern (tree var, vec_info *
   break;
 
 CASE_CONVERT:
-  if ((TYPE_PRECISION (TREE_TYPE (rhs1)) != 1
-  || !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
- && TREE_CODE (TREE_TYPE (rhs1)) != BOOLEAN_TYPE)
+  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1)))
return false;
   if (! check_bool_pattern (rhs1, vinfo, stmts))
return false;
@@ -3474,9 +3472,7 @@ search_type_for_mask_1 (tree var, vec_in
   if (TREE_CODE (var) != SSA_NAME)
 return NULL_TREE;
 
-  if ((TYPE_PRECISION (TREE_TYPE (var)) != 1
-   || !TYPE_UNSIGNED (TREE_TYPE (var)))
-  && TREE_CODE (TREE_TYPE (var)) != BOOLEAN_TYPE)
+  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (var)))
 return NULL_TREE;
 
 

Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-06 Thread Jeff Law

On 02/06/2017 01:11 AM, Richard Biener wrote:

On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:

This is the first of a 4 part series to address the issues around 79095.

This patch addresses improvements in determining ranges of binary
expressions in three ways.

First if we are otherwise unable to find a range for the result of a
MINUS_EXPR, if we know the arguments are not equal, then we know the
resultant range is ~[0,0].

Second, for EXACT_DIV_EXPR, if the numerator has the range ~[0,0], then
resultant range is currently [TYPE_MIN/DENOM,TYPE_MAX/DENOM].  That is
rarely a useful range.   A resultant range of ~[0,0] is actually more useful
since it often tells us something important about the difference of two
pointers.

Finally, when vrp2 discovers an updated range for an object that had a range
discovered by vrp1, if the new range is ~[0,0], prefer that new range in
some cases.  This is needed to avoid losing the newly discovered ~[0,0]
range for EXACT_DIV_EXPR.

Bootstrapped and regression tested with the other patches in this series.
OK for the trunk?

Jeff

* tree-vrp.c (extract_range_from_binary_expr): For EXACT_DIV_EXPR,
if the numerator has the range ~[0,0] make the resultant range
~[0,0].  For MINUS_EXPR with no derived range, if the operands are
known to be not equal, then the resulting range is ~[0,0].
(intersect_ranges): In some cases prefer ~[0,0].

commit b7baf46ab62e28d2dbc22e9dcd4404926d59df18
Author: Jeff Law 
Date:   Fri Feb 3 15:45:58 2017 -0500

Improved ranges

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index b429217..3338d8b 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3298,6 +3298,37 @@ extract_range_from_binary_expr (value_range *vr,

   extract_range_from_binary_expr_1 (vr, code, expr_type, _vr0, );
 }
+
+  /* EXACT_DIV_EXPR is typically used for pointer subtraction;
+ as a result a ~[0,0] may be better than what has already
+ been computed.
+
+ In particular if numerator has the range ~[0,0], then the
+ result range is going to be something like
+ [MININT/DIVISOR,MAXINT/DIVISOR], which is rarely useful.
+
+ So instead make the result range ~[0,0].  */
+  if (code == EXACT_DIV_EXPR
+  && TREE_CODE (op0) == SSA_NAME
+  && vr0.type == VR_ANTI_RANGE
+  && vr0.min == vr0.max
+  && integer_zerop (vr0.min))
+set_value_range_to_nonnull (vr, TREE_TYPE (op0));


The above belongs in extract_range_from_binary_expr_1, in principle the
cases below as well (though there's pre-existing VARYING result handling).
The _1 ones are supposed to be the actual range computations while
the routine you patched is responsible for interfacing with a lattice.  The
_1 routines can be used from code outside of VRP.

So moving the new MINUS_EXPR code or the existing PLUS_EXPR/MINUS_EXPR
code is easy, but would require passing in op0/op1.  I'm guessing we
don't want to do that.


We can still move the EXACT_DIV_EXPR case as that doesn't depend on 
looking at the actual operand.


jeff
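
As a sanity check on the two range claims in the patch description, here
is a brute-force sketch over small integers.  It is only an illustration
of the arithmetic, not VRP code; the function names are made up for this
example and the ranges are arbitrary small ones.

```c
#include <assert.h>
#include <stdbool.h>

/* True if the closed range [lo, hi] contains zero, i.e. the range
   cannot express "this value is nonzero".  */
bool
range_contains_zero (long lo, long hi)
{
  return lo <= 0 && 0 <= hi;
}

/* MINUS_EXPR claim: for every a != b in [lo, hi], a - b is nonzero,
   so the result fits the anti-range ~[0,0].  */
bool
minus_of_unequal_is_nonzero (int lo, int hi)
{
  for (int a = lo; a <= hi; a++)
    for (int b = lo; b <= hi; b++)
      if (a != b && a - b == 0)
        return false;
  return true;
}

/* EXACT_DIV_EXPR claim: for every nonzero n in [lo, hi] that DIVISOR
   divides exactly, n / DIVISOR is nonzero as well.  DIVISOR must be
   nonzero.  */
bool
exact_div_of_nonzero_is_nonzero (int lo, int hi, int divisor)
{
  for (int n = lo; n <= hi; n++)
    if (n != 0 && n % divisor == 0 && n / divisor == 0)
      return false;
  return true;
}

void
run_range_checks (void)
{
  assert (minus_of_unequal_is_nonzero (-50, 50));
  assert (exact_div_of_nonzero_is_nonzero (-64, 64, 4));
  assert (exact_div_of_nonzero_is_nonzero (-64, 64, 8));

  /* A [MIN/DIVISOR, MAX/DIVISOR] style result, here for an 8-bit
     signed range and divisor 4, still contains zero and so loses the
     "nonzero" fact that ~[0,0] keeps.  */
  assert (range_contains_zero (-128 / 4, 127 / 4));
}
```

The last assertion is the reason ~[0,0] is the more useful answer: the
wide range admits zero, so a later comparison of the quotient against zero
could not be folded from it.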



Re: Fix profile updating in ifcombine

2017-02-06 Thread Jan Hubicka
> 
> After this patch (r245151), I've noticed regressions on aarch64:
>   gcc.target/aarch64/test_frame_1.c scan-assembler-times ldr\tx30,
> \\[sp\\], [0-9]+ 2
>   gcc.target/aarch64/test_frame_2.c scan-assembler-times ldp\tx19,
> x30, \\[sp\\], [0-9]+ 1
>   gcc.target/aarch64/test_frame_4.c scan-assembler-times ldp\tx19,
> x30, \\[sp\\], [0-9]+ 1
>   gcc.target/aarch64/test_frame_6.c scan-assembler-times ldr\tx30, \\[sp\\] 2
>   gcc.target/aarch64/test_frame_7.c scan-assembler-times ldp\tx19,
> x30, \\[sp\\] 1
> 
> now FAIL.
> 
> For instance in test_frame_1.c, we used to generate:
>         ccmp    w2, w1, 0, eq
>         bne     .L6
>         ldrb    w0, [sp, 123]
>         ldrb    w1, [sp, 124]
>         cmp     w0, 99
>         ccmp    w1, w0, 0, eq
>         cset    w0, eq
> .L6:
>         ldr     x30, [sp], 224
>         ret
> 
> and now:
>         ccmp    w2, w1, 0, eq
>         beq     .L11
>         ldr     x30, [sp], 224
>         ret
>         .p2align 3
> .L11:
>         ldrb    w0, [sp, 123]
>         ldrb    w1, [sp, 124]
>         cmp     w0, 99
>         ldr     x30, [sp], 224
>         ccmp    w1, w0, 0, eq
>         cset    w0, eq
>         ret
> 
> which is 2 instructions more, as the control flow is less efficient.
> 
> Do we want to just update the tests (increasing the number of expected
> str/ldr/stp/ldp to match the new code generation), or do we
> consider this a regression caused by this patch?

I think it is not a regression; the testcase is just fragile and depends on
the outcome of ifcombine.  It seems it was updated several times in the past.
I am not quite sure what the test is testing, but probably just updating the
testcase and/or disabling the ifconvert pass for it is the right answer.

Honza


Re: [GIMPLE FE] Avoid ICE with __builtin_abs

2017-02-06 Thread Richard Biener
On Mon, 6 Feb 2017, Richard Biener wrote:

> On Sat, 4 Feb 2017, Prathamesh Kulkarni wrote:
> 
> > Hi,
> > The following test-case ICE's with -fgimple:
> > 
> > int __GIMPLE foo(int a)
> > {
> >   int t1;
> >   t1_1 = __builtin_abs (a);
> >   return t1_1;
> > }
> > 
> > gimplefe-2.c:4:3: internal compiler error: in get_callee_fndecl, at tree.c:9500
> >t1_1 = __builtin_abs (a);
> >^~~~
> > 0xe96e8d get_callee_fndecl(tree_node const*)
> > ../../gcc/gcc/tree.c:9500
> > 0x924d75 gimple_build_call_from_tree(tree_node*)
> > ../../gcc/gcc/gimple.c:351
> > 0x6c86b3 c_parser_gimple_statement
> > ../../gcc/gcc/c/gimple-parser.c:393
> > 0x6c86b3 c_parser_gimple_compound_statement
> > ../../gcc/gcc/c/gimple-parser.c:216
> > 0x6c86b3 c_parser_parse_gimple_body(c_parser*)
> > ../../gcc/gcc/c/gimple-parser.c:93
> > 0x6b04f1 c_parser_declaration_or_fndef
> > ../../gcc/gcc/c/c-parser.c:2081
> > 0x6b883b c_parser_external_declaration
> > ../../gcc/gcc/c/c-parser.c:1464
> > 0x6b92a1 c_parser_translation_unit
> > ../../gcc/gcc/c/c-parser.c:1344
> > 0x6b92a1 c_parse_file()
> > ../../gcc/gcc/c/c-parser.c:18141
> > 0x717832 c_common_parse_file()
> > ../../gcc/gcc/c-family/c-opts.c:1102
> > 
> > This happens because __builtin_abs(a) gets folded to 
> > and get_callee_fndecl expects CALL_EXPR.
> > 
> > The attached patch tries to fix the issue by building gimple_assign
> > with appropriate subcode
> > for functions that get folded to expression instead of trying to build
> > it as a function-call.
> > Is it OK to commit after bootstrap+test ?
> 
> No.  The proper fix is to not use the C frontend call-expr parsing
> and building -- it does have many more issues I think.

Something as simple as

Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 245203)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -946,17 +946,11 @@
orig_expr = expr;
start = expr.get_start ();
finish = c_parser_tokens_buf (parser, 0)->get_finish ();
-   expr.value = c_build_function_call_vec (expr_loc, arg_loc,
-   expr.value,
-   exprlist, origtypes);
+   expr.value = build_call_vec (TREE_TYPE (TREE_TYPE (expr.value)),
+expr.value, exprlist);
+   SET_EXPR_LOCATION (expr.value, expr_loc);
    set_c_expr_source_range (&expr, start, finish);
-
expr.original_code = ERROR_MARK;
-   if (TREE_CODE (expr.value) == INTEGER_CST
-   && TREE_CODE (orig_expr.value) == FUNCTION_DECL
-   && DECL_BUILT_IN_CLASS (orig_expr.value) == BUILT_IN_NORMAL
-   && DECL_FUNCTION_CODE (orig_expr.value) == BUILT_IN_CONSTANT_P)
- expr.original_code = C_MAYBE_CONST_EXPR;
expr.original_type = NULL;
if (exprlist)
  {

for example fixes the bogus promotions inserted for

short int __GIMPLE ()
foo (short int s)
{
  short int D_1803;

  bb_2:
  D_1803 = s;

L0:
  return D_1803;

}


int __GIMPLE ()
main (int argc, char * * argv)
{
  short int s;
  int D_1805;
  int _1;
  short _2;

  bb_2:
  s = (short int) argc;
  _1 = (int) s;
  _2 = foo (_1);
  D_1805 = (int) _2;

L0:
  return D_1805;

}

it should also fix the folding you see.  Otherwise untested, of course
(and we shouldn't build a CALL_EXPR but instead refactor this so we
can build a GIMPLE_CALL directly)

Richard.


RE: [PATCH] MIPS: Fix mode mismatch error between Loongson builtin arguments and insn operands.

2017-02-06 Thread Toma Tabacu
Matthew Fortune writes:
> 
> That's not what I hoped but is what I was concerned about as I believe it
> means we have a change of behaviour.  It boils down to simply ignoring the
> argument type of unsigned char.  My guess is that a zero extension is
> created but then immediately eliminated because of the paradoxical subreg.
> 
> I think you need to create a temporary and perform the zero extension to
> ensure we honour the unsigned char operand:
> 
>   rtx new_dst = gen_reg_rtx (SImode);
>   emit_insn (gen_zero_extendqisi2 (new_dst, ops[2].value));
>   ops[2].value = new_dst;
> 
> This should mean that the testcase I sent always has a zero extension but if
> you change the type of 'amount' to be unsigned char then there should not be
> a zero extension as the argument will be assumed to be correctly zero extended
> already and the explicitly introduced zero_extend will be eliminated.
> 

I have made it generate a zero_extend instead of a SUBREG.
However, the pattern associated with gen_zero_extendqisi2 does not work with
immediate operands, so I had to add an extra step in which the argument is put
into a QImode register before being passed to gen_zero_extendqisi2.

Is this OK ?

Regards,
Toma

gcc/

* config/mips/mips.c (mips_expand_builtin_insn): Convert the QImode
argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
builtins to SImode and emit a zero-extend, if necessary.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index da7fa8f..bab5b93 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -16571,9 +16571,35 @@ mips_expand_builtin_insn (enum insn_code icode, unsigned int nops,
 {
   machine_mode imode;
   int rangelo = 0, rangehi = 0, error_opno = 0;
+  rtx qireg, sireg;
 
   switch (icode)
 {
+/* The third operand of these instructions is in SImode, so we need to
+   bring the corresponding builtin argument from QImode into SImode.  */
+case CODE_FOR_loongson_pshufh:
+case CODE_FOR_loongson_psllh:
+case CODE_FOR_loongson_psllw:
+case CODE_FOR_loongson_psrah:
+case CODE_FOR_loongson_psraw:
+case CODE_FOR_loongson_psrlh:
+case CODE_FOR_loongson_psrlw:
+  gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
+  sireg = gen_reg_rtx (SImode);
+  /* We need to put the immediate in a register because
+gen_zero_extendqisi2 does not accept immediate operands.  */
+  if (CONST_INT_P (ops[2].value))
+   {
+ qireg = gen_reg_rtx (QImode);
+ emit_insn (gen_rtx_SET (qireg, ops[2].value));
+ emit_insn (gen_zero_extendqisi2 (sireg, qireg));
+   } else {
+ emit_insn (gen_zero_extendqisi2 (sireg, ops[2].value));
+   }
+  ops[2].value = sireg;
+  ops[2].mode = SImode;
+  break;
+
 case CODE_FOR_msa_addvi_b:
 case CODE_FOR_msa_addvi_h:
 case CODE_FOR_msa_addvi_w:
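
To illustrate the difference discussed above between the paradoxical
SUBREG and the explicit zero_extend, here is a plain C model.  It is only
an analogy, not RTL: a 32-bit integer stands in for the SImode register,
its low byte for the QImode argument, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models the paradoxical SUBREG: the wider register is read as-is, so
   whatever sits in the upper 24 bits leaks into the SImode value.  */
uint32_t
read_as_paradoxical_subreg (uint32_t reg)
{
  return reg;
}

/* Models zero_extend: only the low byte is kept and the upper bits of
   the result are forced to zero.  */
uint32_t
read_as_zero_extend (uint32_t reg)
{
  return (uint32_t) (uint8_t) reg;
}

void
run_extend_checks (void)
{
  /* Register whose low byte is 5 but whose upper bits are stale.  */
  uint32_t dirty = 0xABCDEF05u;
  assert (read_as_zero_extend (dirty) == 5u);
  assert (read_as_paradoxical_subreg (dirty) == 0xABCDEF05u);

  /* If the value is already correctly zero-extended, both readings
     agree, which is why the extension can be dropped in that case.  */
  uint32_t clean = 5u;
  assert (read_as_zero_extend (clean)
          == read_as_paradoxical_subreg (clean));
}
```

The second check mirrors the expectation stated in the review: when the
argument is already known to be zero extended, the explicit zero_extend
is redundant and can be eliminated.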



Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-06 Thread Jeff Law

On 02/06/2017 08:15 AM, Jeff Law wrote:

On 02/06/2017 01:11 AM, Richard Biener wrote:

On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:

This is the first of a 4 part series to address the issues around 79095.

This patch addresses improvements in determining ranges of binary
expressions in three ways.

First if we are otherwise unable to find a range for the result of a
MINUS_EXPR, if we know the arguments are not equal, then we know the
resultant range is ~[0,0].

Second, for EXACT_DIV_EXPR, if the numerator has the range ~[0,0], then
resultant range is currently [TYPE_MIN/DENOM,TYPE_MAX/DENOM].  That is
rarely a useful range.   A resultant range of ~[0,0] is actually more useful
since it often tells us something important about the difference of two
pointers.

Finally, when vrp2 discovers an updated range for an object that had a range
discovered by vrp1, if the new range is ~[0,0], prefer that new range in
some cases.  This is needed to avoid losing the newly discovered ~[0,0]
range for EXACT_DIV_EXPR.

Bootstrapped and regression tested with the other patches in this series.
OK for the trunk?

Jeff

* tree-vrp.c (extract_range_from_binary_expr): For EXACT_DIV_EXPR,
if the numerator has the range ~[0,0] make the resultant range
~[0,0].  For MINUS_EXPR with no derived range, if the operands are
known to be not equal, then the resulting range is ~[0,0].
(intersect_ranges): In some cases prefer ~[0,0].

commit b7baf46ab62e28d2dbc22e9dcd4404926d59df18
Author: Jeff Law 
Date:   Fri Feb 3 15:45:58 2017 -0500

Improved ranges

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index b429217..3338d8b 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3298,6 +3298,37 @@ extract_range_from_binary_expr (value_range *vr,

   extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1);
 }
+
+  /* EXACT_DIV_EXPR is typically used for pointer subtraction;
+ as a result a ~[0,0] may be better than what has already
+ been computed.
+
+ In particular if numerator has the range ~[0,0], then the
+ result range is going to be something like
+ [MININT/DIVISOR,MAXINT/DIVISOR], which is rarely useful.
+
+ So instead make the result range ~[0,0].  */
+  if (code == EXACT_DIV_EXPR
+  && TREE_CODE (op0) == SSA_NAME
+  && vr0.type == VR_ANTI_RANGE
+  && vr0.min == vr0.max
+  && integer_zerop (vr0.min))
+set_value_range_to_nonnull (vr, TREE_TYPE (op0));


The above belongs in extract_range_from_binary_expr_1, in principle the
cases below as well (though there's pre-existing VARYING result
handling).

Do you want those existing cases moved?  It's easy enough to do.


The _1 ones are supposed to be the actual range computations while
the routine you patched is responsible for interfacing with a lattice.  The
_1 routines can be used from code outside of VRP.

OK.  Good to know.


 /* Extract range information from a unary operation CODE based on
@@ -8620,6 +8651,12 @@ intersect_ranges (enum value_range_type *vr0type,
  else if (vrp_val_is_min (vr1min)
   && vrp_val_is_max (vr1max))
;
+ /* Choose the anti-range if it is ~[0,0], that range is special
+enough to special case.  */
+ else if (*vr0type == VR_ANTI_RANGE
+  && *vr0min == *vr0max
+  && integer_zerop (*vr0min))
+   ;


Huh.  If I spotted the place of the change correctly then we cannot arrive
here with vr0 == ~[0,0] as *vr0type is VR_RANGE.  In the case covered
we'd have the only case intersecting [-1, 1] and ~[0,0] that you'd change
to ~[0,0] instead of [-1,1] which generally would be a bad choice (apart
from your implementation error as vr1 is the anti-range here).

Nope.  It's in the right place.  We have a ~[0,0] for vr0 and vr1 is
typically going to be [4,4] or [8,8].  Thus we're in this case:
Sorry, vr1 is typically going to be some very wide range.  It's the 
range from the prior vrp pass, not the denominator.


jeff



Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-06 Thread Jeff Law

On 02/06/2017 01:11 AM, Richard Biener wrote:

On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:

This is the first of a 4 part series to address the issues around 79095.

This patch addresses improvements in determining ranges of binary
expressions in three ways.

First if we are otherwise unable to find a range for the result of a
MINUS_EXPR, if we know the arguments are not equal, then we know the
resultant range is ~[0,0].

Second, for EXACT_DIV_EXPR, if the numerator has the range ~[0,0], then
resultant range is currently [TYPE_MIN/DENOM,TYPE_MAX/DENOM].  That is
rarely a useful range.   A resultant range of ~[0,0] is actually more useful
since it often tells us something important about the difference of two
pointers.

Finally, when vrp2 discovers an updated range for an object that had a range
discovered by vrp1, if the new range is ~[0,0], prefer that new range in
some cases.  This is needed to avoid losing the newly discovered ~[0,0]
range for EXACT_DIV_EXPR.

Bootstrapped and regression tested with the other patches in this series.
OK for the trunk?

Jeff

* tree-vrp.c (extract_range_from_binary_expr): For EXACT_DIV_EXPR,
if the numerator has the range ~[0,0] make the resultant range
~[0,0].  For MINUS_EXPR with no derived range, if the operands are
known to be not equal, then the resulting range is ~[0,0].
(intersect_ranges): In some cases prefer ~[0,0].

commit b7baf46ab62e28d2dbc22e9dcd4404926d59df18
Author: Jeff Law 
Date:   Fri Feb 3 15:45:58 2017 -0500

Improved ranges

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index b429217..3338d8b 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3298,6 +3298,37 @@ extract_range_from_binary_expr (value_range *vr,

   extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1);
 }
+
+  /* EXACT_DIV_EXPR is typically used for pointer subtraction;
+ as a result a ~[0,0] may be better than what has already
+ been computed.
+
+ In particular if numerator has the range ~[0,0], then the
+ result range is going to be something like
+ [MININT/DIVISOR,MAXINT/DIVISOR], which is rarely useful.
+
+ So instead make the result range ~[0,0].  */
+  if (code == EXACT_DIV_EXPR
+  && TREE_CODE (op0) == SSA_NAME
+  && vr0.type == VR_ANTI_RANGE
+  && vr0.min == vr0.max
+  && integer_zerop (vr0.min))
+set_value_range_to_nonnull (vr, TREE_TYPE (op0));


The above belongs in extract_range_from_binary_expr_1, in principle the
cases below as well (though there's pre-existing VARYING result handling).

Do you want those existing cases moved?  It's easy enough to do.


The _1 ones are supposed to be the actual range computations while
the routine you patched is responsible for interfacing with a lattice.  The
_1 routines can be used from code outside of VRP.

OK.  Good to know.


 /* Extract range information from a unary operation CODE based on
@@ -8620,6 +8651,12 @@ intersect_ranges (enum value_range_type *vr0type,
  else if (vrp_val_is_min (vr1min)
   && vrp_val_is_max (vr1max))
;
+ /* Choose the anti-range if it is ~[0,0], that range is special
+enough to special case.  */
+ else if (*vr0type == VR_ANTI_RANGE
+  && *vr0min == *vr0max
+  && integer_zerop (*vr0min))
+   ;


Huh.  If I spotted the place of the change correctly then we cannot arrive
here with vr0 == ~[0,0] as *vr0type is VR_RANGE.  In the case covered
we'd have the only case intersecting [-1, 1] and ~[0,0] that you'd change
to ~[0,0] instead of [-1,1] which generally would be a bad choice (apart
from your implementation error as vr1 is the anti-range here).
Nope.  It's in the right place.  We have a ~[0,0] for vr0 and vr1 is 
typically going to be [4,4] or [8,8].  Thus we're in this case:


  else if ((maxeq || operand_less_p (*vr0max, vr1max) == 1)
   && (mineq || operand_less_p (vr1min, *vr0min) == 1))
{
  /* ( [  ] ) or ([  ] ) or ( [  ]) */

mineq and maxeq are both false.  So neither of these subcases apply:

  /* Choose the right gap if the left is empty.  */
  if (mineq)
[ ... ]
  /* Choose the left gap if the right is empty.  */
  else if (maxeq)

This doesn't apply either:

 /* Choose the anti-range if the range is effectively varying.  */
  else if (vrp_val_is_min (vr1min)
   && vrp_val_is_max (vr1max))

Even if vr1 is something larger, we're almost never going to derive 
anything useful from vr1 because vr0 is ~[0,0].




Re: [PATCH] PR 78348: Limit amount of unrolling in test.

2017-02-06 Thread Andreas Krebbel
On 02/06/2017 03:43 PM, Dominik Vogt wrote:
> The attached patch fixes the s390x test failure reported here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
> Tested on s390x biarch.
> 
> Ciao
> 
> Dominik ^_^  ^_^
> 
Applied. Thanks!

-Andreas-



Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Dominik Vogt
On Mon, Feb 06, 2017 at 01:22:39PM +0100, Rainer Orth wrote:
> Hi Dominik,
> 
> > On Mon, Feb 06, 2017 at 12:33:21PM +0100, Rainer Orth wrote:
> >> > Copying the two guys listed as testsuite maintainers in gcc/MAINTAINERS
> >> > may help; let me do that for you.
> >> >
> >> > That said, if this fails to fail, the patch might be considered obvious,
> >> > not requiring an approval?
> >> 
> >> it's not: while it may XPASS with newer glibc versions, it still XFAILs
> >> e.g. on Solaris (and probably others).
> >
> > It's been so long that I cannot tell what the reference to
> > glibc-2.18 means.  I've only ever tested this on s390 and s390x,
> > and the test may or may not PASS on other targets with
> > glibc-2.18+.
> >
> >>  So unconditionally removing the
> >> xfail *-*-* trades an XPASS->PASS on some Linux versions against a
> >> XFAIL->FAIL elsewhere, which isn't acceptable.
> >
> > Okay, so what would you suggest?
> >
> >   // { dg-do run { xfail !s390*-*-* } } 
> >
> > or
> >
> >   // { dg-do run { xfail *-*-solaris } } 
> >
> > or something else?  We'll probably only get this list right by
> > trial and error anyway.
> 
> how about checking the gcc-testresults archives for XPASSes to get an
> idea?

In the newest 300 matches on gcc-testresults,

  XPASS: g++.dg/tls/thread_local-order2.C  -std=c++14 execution test

appears for the following targets:

  s390
  s390x
  i386-unknown-freebsd10.3
  i686-pc-linux-gnu
  x86_64-pc-linux-gnu
  x86_64-apple-darwin16.4.0
  x86_64-unknown-freebsd12.0
  powerpc64le-unknown-linux-gnu
  powerpc-ibm-aix7.2.0.0
  aarch64-unknown-linux-gnu
  aarch64-suse-linux-gnu
  hppa-unknown-linux-gnu
  armv6-unknown-freebsd12.0
  target:arm-none-linux-gnueabi, host:i686-pc-linux-gnu
  target:m68k-unknown-linux-gnu; host:x86_64-suse-linux-gnu
  target:sh4-unknown-linux-gnu; host:i686-pc-linux-gnu

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] PR 78348: Limit amount of unrolling in test.

2017-02-06 Thread Richard Biener
On Mon, Feb 6, 2017 at 3:43 PM, Dominik Vogt  wrote:
> The attached patch fixes the s390x test failure reported here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
> Tested on s390x biarch.

Ok.

Richard.

> Ciao
>
> Dominik ^_^  ^_^
>
> --
>
> Dominik Vogt
> IBM Germany


[PATCH] PR 78348: Limit amount of unrolling in test.

2017-02-06 Thread Dominik Vogt
The attached patch fixes the s390x test failure reported here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
Tested on s390x biarch.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog-pr78348

PR 78348
* gcc.dg/tree-ssa/ldist-24.c: Add "--param max-completely-peel-times=8"
to dg-options.
From b242b323e411a8833b8ee5c8d5185259ae4a4d29 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Mon, 6 Feb 2017 15:38:17 +0100
Subject: [PATCH] PR 78348: Limit amount of unrolling in test.

This prevents a FAIL in ldist-24.c on s390x.
---
 gcc/testsuite/gcc.dg/tree-ssa/ldist-24.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-24.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-24.c
index 7ab79e6..75f7b8f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-24.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-24.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fdump-tree-ldist-details" } */
+/* { dg-options "-O3 -fdump-tree-ldist-details --param max-completely-peel-times=8" } */
 
 typedef struct S {
 double z[8][25];
-- 
2.3.0



Re: Fix profile updating in ifcombine

2017-02-06 Thread Christophe Lyon
Hi,


On 3 February 2017 at 08:56, Jeff Law  wrote:
> On 02/02/2017 01:29 PM, Jan Hubicka wrote:
>>
>> Hi,
>> this patch fixes profile updating in ifcombine.  This is not hard to do
>> and ifcombine is the #2 profile update offender out of the tree passes
>> (#1 is the vectorizer).
>>
>> I think this counts as a regression, because one can trigger an arbitrarily
>> bad profile after ifconversion and definitely construct a testcase where
>> this will cause us to optimize for size where we optimized for speed
>> previously.
>>
>> Bootstrapped/regtested x86_64-linux. Will commit it tomorrow (after testers
>> pick up the threading fix) unless there are complaints.
>>
>> * gcc.dg/tree-ssa/ssa-ifcombine-1.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-2.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-3.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-4.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-5.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-6.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-7.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-8.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-9.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-10.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-11.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-12.c: Check for no profile mismatches.
>> * gcc.dg/tree-ssa/ssa-ifcombine-13.c: Check for no profile mismatches.
>>
>> * tree-ssa-ifcombine.c (update_profile_after_ifcombine): New function.
>> (ifcombine_ifandif): Use it.
>>
>> Index: testsuite/gcc.dg/tree-ssa/threadbackward-1.c
>> ===
>> --- testsuite/gcc.dg/tree-ssa/threadbackward-1.c(revision 0)
>> +++ testsuite/gcc.dg/tree-ssa/threadbackward-1.c(working copy)
>> @@ -0,0 +1,9 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-ethread" } */
>> +char *c;
>> +int t()
>> +{
>> +  for (int i=0;i<5000;i++)
>> +c[i]=i;
>> +}
>> +/* { dg-final { scan-tree-dump-times "Registering FSM jump thread" 1 "ethread"} } */
>
> Seems like this is unrelated to fixing profile updates in ifcombine.  If you
> want the new test it's probably best done in another independent patch :-)
>
>
>> Index: tree-ssa-ifcombine.c
>> ===
>> --- tree-ssa-ifcombine.c(revision 245134)
>> +++ tree-ssa-ifcombine.c(working copy)
>> @@ -332,6 +332,51 @@ recognize_bits_test (gcond *cond, tree *
>>return true;
>>  }
>>
>> +
>> +/* Update profile after code in outer_cond_bb was adjuted so
>
> s/adjuted/adjusted/
>
> OK with the nit fixed.
>
> jeff
>

After this patch (r245151), I've noticed regressions on aarch64:
  gcc.target/aarch64/test_frame_1.c scan-assembler-times ldr\tx30, \\[sp\\], [0-9]+ 2
  gcc.target/aarch64/test_frame_2.c scan-assembler-times ldp\tx19, x30, \\[sp\\], [0-9]+ 1
  gcc.target/aarch64/test_frame_4.c scan-assembler-times ldp\tx19, x30, \\[sp\\], [0-9]+ 1
  gcc.target/aarch64/test_frame_6.c scan-assembler-times ldr\tx30, \\[sp\\] 2
  gcc.target/aarch64/test_frame_7.c scan-assembler-times ldp\tx19, x30, \\[sp\\] 1

now FAIL.

For instance in test_frame_1.c, we used to generate:
        ccmp    w2, w1, 0, eq
        bne     .L6
        ldrb    w0, [sp, 123]
        ldrb    w1, [sp, 124]
        cmp     w0, 99
        ccmp    w1, w0, 0, eq
        cset    w0, eq
.L6:
        ldr     x30, [sp], 224
        ret

and now:
        ccmp    w2, w1, 0, eq
        beq     .L11
        ldr     x30, [sp], 224
        ret
        .p2align 3
.L11:
        ldrb    w0, [sp, 123]
        ldrb    w1, [sp, 124]
        cmp     w0, 99
        ldr     x30, [sp], 224
        ccmp    w1, w0, 0, eq
        cset    w0, eq
        ret

which is 2 instructions more, as the control flow is less efficient.

Do we want to just update the tests (increasing the number of expected
str/ldr/stp/ldp to match the new code generation), or do we
consider this a regression caused by this patch?

Christophe


Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Richard Biener
On Mon, 6 Feb 2017, Jakub Jelinek wrote:

> On Mon, Feb 06, 2017 at 09:14:24AM +0100, Richard Biener wrote:
> > > > +/* Nonzero if TYPE represents a (scalar) boolean type or type
> > > > +   in the middle-end compatible with it.  */
> > > > +
> > > > +#define INTEGRAL_BOOLEAN_TYPE_P(TYPE) \
> > > > +  (TREE_CODE (TYPE) == BOOLEAN_TYPE\
> > > > +   || ((TREE_CODE (TYPE) == INTEGER_TYPE   \
> > > > +   || TREE_CODE (TYPE) == ENUMERAL_TYPE)   \
> > > > +   && TYPE_PRECISION (TYPE) == 1   \
> > > > +   && TYPE_UNSIGNED (TYPE)))
> > > > 
> > > > (just to quote what you proposed).
> 
> > > I could agree to that.  Alternately, we could restore the TYPE_PRECISION
> > > checks that Jakub removed in the vectorizer.
> 
> I haven't removed any, there were just 3 changes of the kind:
> -  if ((TYPE_PRECISION (TREE_TYPE (var)) != 1
> -   || !TYPE_UNSIGNED (TREE_TYPE (var)))
> -  && TREE_CODE (TREE_TYPE (var)) != BOOLEAN_TYPE)
> +  if (!INTEGRAL_BOOLEAN_TYPE_P (TREE_TYPE (var)))
> which are almost equivalent, except that the old code could let through
> non-INTEGRAL_TYPE_P types with TYPE_PRECISION 1 (whatever that means
> for the various kinds of types, e.g. log2 of the number of vector
> elements etc.).
> 
> > 
> > Yeah, I guess at this point I prefer a vectorizer-local fix.
> 
> So shall I move the INTEGRAL_BOOLEAN_TYPE_P macro (or change it to
> another name, e.g. including VECT in it?) to tree-vectorizer.h?

Works for me.  Like VECT_SCALAR_BOOLEAN_TYPE_P () to not confuse it
with VECTOR_BOOLEAN_TYPE_P?

Richard.
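
For readers following the predicate discussion, here is a minimal standalone model of the check proposed above. It is only a sketch: struct type_model and the reduced enum are hypothetical stand-ins for GCC's tree type nodes, and the function name simply borrows the VECT_SCALAR_BOOLEAN_TYPE_P suggestion. It is not the macro as it would appear in tree-vectorizer.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for GCC's tree codes and type nodes.  */
enum type_code { BOOLEAN_TYPE, INTEGER_TYPE, ENUMERAL_TYPE, REAL_TYPE };

struct type_model
{
  enum type_code code;   /* models TREE_CODE (type) */
  unsigned precision;    /* models TYPE_PRECISION (type) */
  bool is_unsigned;      /* models TYPE_UNSIGNED (type) */
};

/* True for any BOOLEAN_TYPE regardless of precision, and for unsigned
   INTEGER_TYPE or ENUMERAL_TYPE with precision 1.  */
bool
vect_scalar_boolean_type_p (const struct type_model *type)
{
  if (type->code == BOOLEAN_TYPE)
    return true;

  return ((type->code == INTEGER_TYPE || type->code == ENUMERAL_TYPE)
          && type->precision == 1
          && type->is_unsigned);
}
```

Under this model a precision-8 BOOLEAN_TYPE (the Ada case mentioned in this thread) is accepted, while a signed precision-1 integer or an ordinary 8-bit unsigned integer is rejected.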


Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Richard Biener
On Thu, 2 Feb 2017, Jakub Jelinek wrote:

> On Thu, Feb 02, 2017 at 10:12:32AM -0700, Jeff Law wrote:
> > On 02/01/2017 03:45 AM, Richard Biener wrote:
> > > 
> > > I agree.  But this means we should look for a vectorizer-local fix
> > > without a new global predicate then (there seem to be subtly different
> > > needs and coming up with good names for all of them sounds difficult...).
> > Well, we could go with Jakub's INTEGRAL_BOOLEAN_TYPE as posted, but in
> > contexts where we use it and really depend on single bit objects, we add the
> > precision == 1 check back.  Jakub's patch removes the type precision check
> > in tree-vect-patterns for example.  There's likely all kinds of places where
> > we need to add that check as well.
> 
> The 3 cases in tree-vect-patterns.c where I've removed the check were
> exactly what the proposed macro does, i.e.
>   if ((TYPE_PRECISION (TREE_TYPE (rhs1)) != 1
>|| !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
>   && TREE_CODE (TREE_TYPE (rhs1)) != BOOLEAN_TYPE)
>   return false;
> i.e. bail out unless the rhs1 type is a BOOLEAN_TYPE (any precision,
> assuming it has only valid values of 0 and 1) or unless it is an unsigned
> precision 1 integer type (i.e. something that, if it also has QImode,
> forwprop etc. could have changed a BOOLEAN_TYPE with precision 1 into).
> 
> Note Fortran has been using I think precision one boolean_type_node even
> in GCC 6.x, it had:
>   boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
>   boolean_true_node = build_int_cst (boolean_type_node, 1);
>   boolean_false_node = build_int_cst (boolean_type_node, 0);
> Fortran logical(4) etc. are built using:
>   new_type = make_unsigned_type (bit_size);
>   TREE_SET_CODE (new_type, BOOLEAN_TYPE);
>   TYPE_MAX_VALUE (new_type) = build_int_cst (new_type, 1);
>   TYPE_PRECISION (new_type) = 1;
> thus I believe they have e.g. SImode or DImode rather than QImode, but still
> TYPE_PRECISION of 1.  The non-Ada boolean is also:
>   boolean_type_node = make_unsigned_type (BOOL_TYPE_SIZE);
>   TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);
>   TYPE_PRECISION (boolean_type_node) = 1;
> where BOOL_TYPE_SIZE is CHAR_TYPE_SIZE everywhere but on powerpc*-darwin*
> with some command line option.
> Ada is
>   boolean_type_node = make_unsigned_type (8);
>   TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);
> and thus it is indeed precision 8 QImode.
> 
> So, requiring precision 1 for all BOOLEAN_TYPEs in the vectorizer would
> only affect Ada.  Requiring QImode or whatever other fixed TYPE_MODE
> in the macro for precision 1 unsigned non-BOOLEAN_TYPE integers would
> actually also affect Fortran a lot, because e.g. SImode precision 1
> INTEGER_TYPE is considered compatible with SImode precision 1 BOOLEAN_TYPE.
> 
> > /me is frustrated that we have booleans with nonstandard precision, even
> > though I understand why it was done.  It creates numerous headaches.
> 
> Ditto.

I guess I'm mostly frustrated that we have BOOLEAN_TYPE/ENUMERAL_TYPE at
all given precision-1 unsigned INTEGER_TYPEs are fine to represent
0/1 valued entities (aka bools).  But then we also have BImode...

Richard.


C++ patch ping

2017-02-06 Thread Jakub Jelinek
Hi!

I'd like to ping 2 C++ patches:

- P1 PR79232 - ICEs and wrong-code with COMPOUND_EXPR on lhs of assignment
  http://gcc.gnu.org/ml/gcc-patches/2017-01/msg02341.html

- P1 PR79288 - wrong default TLS model for __thread static data members
  http://gcc.gnu.org/ml/gcc-patches/2017-01/msg02349.html

Thanks

Jakub


Re: Fix PR c++/79360

2017-02-06 Thread Jason Merrill
OK.

On Sun, Feb 5, 2017 at 11:52 PM, Patrick Palka  wrote:
> This patch fixes PR c++/79360, a regression from PR c++/70347.
>
> The TYPE_FIELDS of a type may contain TYPE_DECLs and CONST_DECLs as well
> as FIELD_DECLs, but when looking for an NSDMI we are only interested in
> the FIELD_DECLs.  Otherwise we may try to initialize the union with the
> DECL_INITIAL of a nested CONST_DECL.  Does this look OK to commit after
> bootstrap + regtest?
>
> gcc/cp/ChangeLog:
>
> PR c++/79360
> * typeck2.c (process_init_constructor_union): Consider only
> FIELD_DECLs when looking for an NSDMI.
>
> gcc/testsuite/ChangeLog:
>
> PR c++/79360
> * g++.dg/cpp1y/nsdmi-union2.C: New test.
> ---
>  gcc/cp/typeck2.c  |  3 ++-
>  gcc/testsuite/g++.dg/cpp1y/nsdmi-union2.C | 12 
>  2 files changed, 14 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1y/nsdmi-union2.C
>
> diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
> index 014de5c..1e0354d 100644
> --- a/gcc/cp/typeck2.c
> +++ b/gcc/cp/typeck2.c
> @@ -1510,7 +1510,8 @@ process_init_constructor_union (tree type, tree init,
>  {
>for (tree field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field))
>  {
> -  if (DECL_INITIAL (field))
> +  if (TREE_CODE (field) == FIELD_DECL
> +  && DECL_INITIAL (field) != NULL_TREE)
>  {
>CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (init),
>field,
> diff --git a/gcc/testsuite/g++.dg/cpp1y/nsdmi-union2.C b/gcc/testsuite/g++.dg/cpp1y/nsdmi-union2.C
> new file mode 100644
> index 000..08217d7
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp1y/nsdmi-union2.C
> @@ -0,0 +1,12 @@
> +// PR c++/79360
> +// { dg-do compile { target c++14 } }
> +
> +union U
> +{
> +  enum E { e };
> +};
> +
> +struct A
> +{
> +  U u{};
> +};
> --
> 2.10.1.456.g9cf5127
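
The filtering described in the patch can be illustrated with a small standalone model. This is a sketch under stated assumptions: struct decl_model and first_nsdmi_field are hypothetical names, and the linked list only imitates how TYPE_FIELDS chains FIELD_DECLs, TYPE_DECLs and CONST_DECLs together. It is not the code in typeck2.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for GCC's declaration nodes.  */
enum decl_kind { FIELD_DECL, TYPE_DECL, CONST_DECL };

struct decl_model
{
  enum decl_kind kind;       /* models TREE_CODE (decl) */
  const char *name;
  int has_initial;           /* models DECL_INITIAL (decl) != NULL_TREE */
  struct decl_model *chain;  /* models TREE_CHAIN (decl) */
};

/* Return the first member that is a real data member with an
   initializer, skipping nested types and enumerators even when they
   carry an initial value.  Returns NULL if there is none.  */
const struct decl_model *
first_nsdmi_field (const struct decl_model *fields)
{
  for (const struct decl_model *f = fields; f != NULL; f = f->chain)
    if (f->kind == FIELD_DECL && f->has_initial)
      return f;

  return NULL;
}
```

For a union shaped like the one in the new test, whose only members are a nested enum and its enumerator, the lookup finds nothing, so the enumerator's value is never mistaken for a member initializer.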


Re: C++ Modules branch

2017-02-06 Thread Nathan Sidwell

On 02/05/2017 02:58 PM, Gerald Pfeifer wrote:

On Tue, 24 Jan 2017, Nathan Sidwell wrote:

As some have already noticed, I created a c++-modules branch yesterday.
Don't get too excited, that doesn't mean I have an implementation to
commit there.


Are you planning to add this to svn.html (where we generally
describe all branches)?


Ah, thanks for the reminder.

First, here's a patch to collate the existing list, ok?

(It seems quite a few may be dead now, time for some pruning?)

nathan

--
Nathan Sidwell
Index: htdocs/svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.208
diff -r1.208 svn.html
154,160c154,157
<   struct-reorg-branch
<   This branch is for the development of structure reorganization
<   optimizations, including field reordering, structure splitting for
<   trees.  These optimizations are profile information driven.  This is
<   a subbranch of tree-profiling.  This branch is being maintained by
<   Caroline Tice, Dale Johannesen, Kenneth Zadeck, Stuart Hastings,
<   Mostafa Hagog.
---
>   alias-export
>   This branch contains the alias export and data dependency export patch.
>   It is used to experiment with the propagation process.  This branch is maintained
>   by Andrey Belevantsev a...@ispras.ru.
169,171c166,175
<   https://gcc.gnu.org/wiki/Graphite;>graphite-branch
<   The purpose of this branch is to develop an infrastructure for loop
<   transforms using the polyhedral model.
---
>   cxx-conversion
>   This branch hosts mini-projects that rewrite parts of the existing
>   GCC code into C++.  Each conversion project will be proposed for
>   trunk integration independently.  The branch is maintained by
>   Diego Novillo.  Patches
>   sent to this branch and discussions related to it should be marked
>   with the tag [cxx-conversion] in the subject line.  For
>   details on working with this branch, see the
>   https://gcc.gnu.org/wiki/cxx-conversion;>C++ conversion
>   page.
173,177c177,180
<   https://gcc.gnu.org/wiki/LightweightIpo;>lw-ipo
<   This branch aims to implement lightweight IPO.  Patches
<   and discussion on this branch should be marked with the tag
<   [lipo] in the subject line. The branch is maintained by
<   David Li.
---
>   https://gcc.gnu.org/wiki/Atomic/GCCMM;>cxx-mem-model
>   This branch is for the implementation of the C++ memory model.
>   Patches for this branch should be marked [cxx-mem-model]
>   in the subject line.  The branch is maintained by Aldy Hernandez.
178a182,200
>   https://gcc.gnu.org/wiki/AndrewMacLeod/debuglocus;>debuglocus
>   This branch is an experiment to see whether improved debug information 
>   can be maintained throughout the compiler by associating a user decl with an 
>   expression, statement, or insn.  The name comes from attempting to utilize
>   the ever present source location (locus) field to carry the debug info.
>   Further information can be found on the https://gcc.gnu.org/wiki/AndrewMacLeod/debuglocus;>debuglocus wiki page .
> 
>   dwarf4
>   This branch is for support of DWARF-4 features.
>   DWARF-4 is currently under development, so changes on this branch
>   will remain experimental until Version 4 is officially finalized.
> 
>   gc-improv
>   This branch is for the development of garbage collector
>   improvements.  It is the successor to the boehm-gc branch, but
>   without integration with Boehm's GC.  The branch is maintained
>   by Laurynas
>   Biveinis.  Patches for this branch should be marked with the
>   tag [gc-improv] in the subject line.
198,209c220,222
<   ra-improvements
<   This branch aims to implement several improvements to the
<   current register allocator.  Examples include implementing a
<   lower-triangular conflict matrix and register coalescing.
<   It is hoped that these improvements will not only help the
<   current allocator, but will be useful to the other register
<   allocation projects such as RABLE and YARA. This branch will
<   be merged with the dataflow-branch from time to time.
<   The patches for this branch should be marked with the tag
<   [ra-improvements] in the subject line.  The branch
<   is maintained by Peter
<   Bergner.
---
>   https://gcc.gnu.org/wiki/Graphite;>graphite-branch
>   The purpose of this branch is to develop an infrastructure for loop
>   transforms using the polyhedral model.
211,220c224,228
<   sel-sched-branch
<   This branch contains the implementation of the selective scheduling
<   approach.  The goal of the branch is to provide more aggressive scheduler 
<   implementation with support for instruction cloning, register renaming,
<   and forward substitution.   The branch is maintained by Andrey
<   Belevantsev a...@ispras.ru 
<   and Maxim Kuvyrkov 
<   mkuvyr...@ispras.ru and will be regularly merged with mainline.  
<   Patches will be marked with the tag [sel-sched] in 
<   the subject line.
---
>   https://gcc.gnu.org/wiki/LightweightIpo;>lw-ipo
>   

Re: [RFA] [PR tree-optimization/79095][PATCH 3/4] Improve ASSERT_EXPRs and simplification of overflow tests

2017-02-06 Thread Richard Biener
On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:
>
> This is the actual optimization piece of the patchseries and uses the
> overflow test detection function in patch #2.
>
> First, when we detect an overflow test, we register additional ASSERT_EXPRs
> for the given name.  So instead of an ASSERT_EXPR for an expression like A <
> B, we get an assert like A > 0xfffe  or A <= 0.
>
> Additionally, during propagation and folding, if we are presented with an
> overflow test which collapses into an equality test, we will simplify the
> test into an equality check (without changing the IL).   So A + 1 < A would
> turn into A == -1  or A + 1 > A turns into A != -1.  There's a corresponding
> equivalent for A - 1 < A and A - 1 > A.
>
> The net result is the new ASSERT_EXPRs and simplified tests allow VRP to
> eliminate more paths through the CFG and improve its constant propagation
> capabilities.  Examples can be found in the next patch which has the tests.
>
> Bootstrapped and regression tested with the other patches in this series.
> OK for the trunk?
>
> * tree-vrp.c (register_edge_assert_for_2): Register additional asserts
> if NAME is used in an overflow test.
> (vrp_evaluate_conditional_warnv_with_ops): If the ops represent an
> overflow check that can be expressed as an equality test, then adjust
> ops to be that equality test.
>
>
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index 6459c71..8d78646 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -5299,7 +5298,19 @@ register_edge_assert_for_2 (tree name, edge e, gimple_stmt_iterator bsi,
>/* Only register an ASSERT_EXPR if NAME was found in the sub-graph
>   reachable from E.  */
>if (live_on_edge (e, name))
> -register_new_assert_for (name, name, comp_code, val, NULL, e, bsi);
> +{
> +  tree x;
> +  if (overflow_comparison_p (comp_code, name, val, false, false, &x)
> + || overflow_comparison_p (swap_tree_comparison (comp_code), val, name,
> + false, true, &x))
> +   {
> + enum tree_code new_code
> +   = ((comp_code == GT_EXPR || comp_code == GE_EXPR)
> +  ? GT_EXPR : LE_EXPR);
> + register_new_assert_for (name, name, new_code, x, NULL, e, bsi);
> +   }
> +  register_new_assert_for (name, name, comp_code, val, NULL, e, bsi);
> +}
>
>/* In the case of NAME <= CST and NAME being defined as
>   NAME = (unsigned) NAME2 + CST2 we can assert NAME2 >= -CST2
> @@ -7658,6 +7669,60 @@ vrp_evaluate_conditional_warnv_with_ops (enum tree_code code, tree op0,
>&& !POINTER_TYPE_P (TREE_TYPE (op0)))
>  return NULL_TREE;
>
> +  /* If OP0 CODE OP1 is an overflow comparison, if it can be expressed
> + as a simple equality test, then prefer that over its current form
> + for evaluation.
> +
> + An overflow test which collapses to an equality test can always be
> + expressed as a comparison of one argument against zero.  Overflow
> + occurs when the chosen argument is zero and does not occur if the
> + chosen argument is not zero.  */
> +  tree x;
> +  if (overflow_comparison_p (code, op0, op1, use_equiv_p, false, &x))

This somehow feels like a hack so I'd add a comment why we do not change
the IL in the first place.

Feeding overflow_comparison_p the original and the swapped comparison
looks like it makes it more expensive given its stmt walking?  I'd see whether
returning a second output from it (whether we matched op0 or op1) would
simplify callers.

Richard.
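
The equivalences the patch description relies on can be verified exhaustively for a narrow unsigned type. The sketch below is not GCC code; the function name and the choice of a 16-bit type are assumptions made so that every value can be enumerated. For unsigned 16-bit A with wrapping arithmetic it checks that A + 1 < A holds exactly when A is the maximum value (A == -1 in the patch's notation), that A - 1 > A holds exactly when A == 0, and the two complementary forms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC.  Returns how many 16-bit
   unsigned values break any of the four overflow-test equivalences.  */
int
count_overflow_test_mismatches (void)
{
  int mismatches = 0;

  for (uint32_t i = 0; i <= UINT16_MAX; i++)
    {
      uint16_t a = (uint16_t) i;
      /* Truncate explicitly: integer promotion would otherwise compute
         a + 1 and a - 1 in int, where nothing wraps.  */
      uint16_t inc = (uint16_t) (a + 1);
      uint16_t dec = (uint16_t) (a - 1);

      if ((inc < a) != (a == UINT16_MAX))   /* A + 1 < A  <=>  A == -1 */
        mismatches++;
      if ((inc > a) != (a != UINT16_MAX))   /* A + 1 > A  <=>  A != -1 */
        mismatches++;
      if ((dec > a) != (a == 0))            /* A - 1 > A  <=>  A == 0  */
        mismatches++;
      if ((dec < a) != (a != 0))            /* A - 1 < A  <=>  A != 0  */
        mismatches++;
    }

  return mismatches;
}
```

A result of zero means each overflow test is interchangeable with the corresponding equality test over the whole 16-bit range. The sketch does not model the ASSERT_EXPR chain walking that overflow_comparison_p performs.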

> +{
> +  wide_int max = wi::max_value (TYPE_PRECISION (TREE_TYPE (op0)), UNSIGNED);
> +  /* B = A - 1; if (A < B) -> B = A - 1; if (A == 0)
> + B = A - 1; if (A > B) -> B = A - 1; if (A != 0) */
> +  if (integer_zerop (x))
> +   {
> + op1 = x;
> + code = (code == LT_EXPR || code == LE_EXPR) ? EQ_EXPR : NE_EXPR;
> +   }
> +  /* B = A + 1; if (A > B) -> B = A + 1; if (B == 0)
> + B = A + 1; if (A < B) -> B = A + 1; if (B != 0) */
> +  else if (wi::eq_p (x, max - 1))
> +   {
> + op0 = op1;
> + op1 = wide_int_to_tree (TREE_TYPE (op0), 0);
> + code = (code == GT_EXPR || code == GE_EXPR) ? EQ_EXPR : NE_EXPR;
> +   }
> +}
> +  else if (overflow_comparison_p (swap_tree_comparison (code),
> + op1, op0, use_equiv_p, true, &x))
> +{
> +  /* X holds the value if we wanted to generate an overflow check
> +for the comparison using OP1.  But we're actually going to
> +test against OP0 and we're always going to use an equality
> +test, so the constants for detection below are different
> +than the constant we pass into vrp_evaluate_... */
> +  wide_int max = wi::max_value (TYPE_PRECISION (TREE_TYPE (op0)),
> UNSIGNED);
> +  /* B = A - 1; if (B > A) -> B = A - 1; if (A == 0)
> + B = A - 

Re: [RFA] [PR tree-optimization/79095][PATCH 2/4] Add infrastructure to detect overflow checks

2017-02-06 Thread Richard Biener
On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:
>
> This is the second in the 4 part series to address 79095.  This patch
> introduces a new function into tree-vrp.c to allow for the detection of
> overflow checks of the form A OP A + CST (for unsigned/wrapping A).
>
> This is implemented by first checking for A OP B, we then conditionally walk
> the ASSERT_EXPR chain for B to produce B'.  We then look at the defining
> statement for B or B' to see if it has the form B = X + CST or B' = X + CST
> respectively.
>
> Then we conditionally walk the ASSERT_EXPR chain for A to see if it resolves
> to X at any point.  There have been cases where no walking was necessary to
> show that X resolves to A.  Other cases have required walking part or the
> entire ASSERT_EXPR chain.
>
> We do not walk during propagation, but do walk during
> folding/simplification.
>
> At this point we have an overflow check of the appropriate form.  We compute
> an updated constant so that we can check for overflow with expressions like
>
> A > 0xfffe
>
> or
>
> A <= 0
>
> Those are particularly interesting forms as they collapse into equality
> tests (next patch).  The code supports other forms, but they're not as
> useful because they don't end up generating equality tests or allow for
> constant propagation.
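
To make the equivalence described above concrete, here is a small
stand-alone C sketch. It is not code from the patch: the function names
are hypothetical, and it assumes a 32-bit unsigned A, relying only on the
wrapping behaviour C guarantees for unsigned arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The check as usually written in source: B = A + CST; if (A > B) ...
   For nonzero CST this is true exactly when A + CST wrapped around.  */
static bool
overflow_by_compare (uint32_t a, uint32_t cst)
{
  uint32_t b = (uint32_t) (a + cst);	/* wraps modulo 2^32 */
  return a > b;
}

/* The same condition as a range test on A alone: A > MAX - CST.  */
static bool
overflow_by_range (uint32_t a, uint32_t cst)
{
  return a > UINT32_MAX - cst;
}

/* With CST == 1 the range test A > MAX - 1 degenerates into an
   equality test against MAX.  */
static bool
overflow_plus_one (uint32_t a)
{
  return a == UINT32_MAX;
}

/* Compare the three forms on a few boundary values.  */
static void
check_equivalence (void)
{
  static const uint32_t samples[] =
    { 0, 1, 2, 0x7fffffffu, 0x80000000u, 0xfffffffeu, 0xffffffffu };
  const int n = (int) (sizeof samples / sizeof samples[0]);

  for (int i = 0; i < n; i++)
    {
      assert (overflow_by_compare (samples[i], 1)
	      == overflow_plus_one (samples[i]));
      for (int j = 1; j < n; j++)	/* start at 1 to skip CST == 0 */
	assert (overflow_by_compare (samples[i], samples[j])
		== overflow_by_range (samples[i], samples[j]));
    }
}
```

Calling check_equivalence () from a test driver exercises the boundary
cases; the CST == 1 case is the one that collapses to an equality test.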
>
> Bootstrapped and regression tested with the other patches in this series.
> OK for the trunk?
>
>
>
>
> * tree-vrp.c (overflow_comparison_p): New function.
>
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index 3338d8b..6459c71 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -5189,6 +5189,94 @@ masked_increment (const wide_int _in, const
> wide_int ,
>return val ^ sgnbit;
>  }
>
> +/* OP0 CODE OP1 is a comparison.  Examine the comparison and potentially
> +   OP1's defining statement to see if it ultimately has the form
> +   OP0 CODE (OP0 PLUS INTEGER_CST)
> +
> +   If so, return TRUE indicating this is an overflow test and store into
> +   *NEW_CST an updated constant that can be used in a narrowed range test.
> +
> +   REVERSED indicates if the comparison was originally:
> +
> +   OP1 CODE' OP0.
> +
> +   This affects how we build the updated constant.  */
> +
> +static bool
> +overflow_comparison_p (enum tree_code code, tree op0, tree op1,
> +  bool follow_assert_exprs, bool reversed, tree
> *new_cst)
> +{
> +  /* See if this is a relational operation between two SSA_NAMES with
> + unsigned, overflow wrapping values.  If so, check it more deeply.  */
> +  if ((code == LT_EXPR || code == LE_EXPR
> +   || code == GE_EXPR || code == GT_EXPR)
> +  && TREE_CODE (op0) == SSA_NAME
> +  && TREE_CODE (op1) == SSA_NAME
> +  && INTEGRAL_TYPE_P (TREE_TYPE (op0))
> +  && TYPE_UNSIGNED (TREE_TYPE (op0))
> +  && TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0)))
> +{
> +  gimple *op1_def = SSA_NAME_DEF_STMT (op1);
> +
> +  /* If requested, follow any ASSERT_EXPRs backwards for OP1.  */
> +  if (follow_assert_exprs)
> +   {
> + while (gimple_assign_single_p (op1_def)
> +&& TREE_CODE (gimple_assign_rhs1 (op1_def)) == ASSERT_EXPR)
> +   {
> + op1 = TREE_OPERAND (gimple_assign_rhs1 (op1_def), 0);
> + if (TREE_CODE (op1) != SSA_NAME)
> +   break;
> + op1_def = SSA_NAME_DEF_STMT (op1);
> +   }
> +   }
> +
> +  /* Now look at the defining statement of OP1 to see if it adds
> +or subtracts a nonzero constant from another operand.  */
> +  if (op1_def
> + && is_gimple_assign (op1_def)
> + && gimple_assign_rhs_code (op1_def) == PLUS_EXPR
> + && TREE_CODE (gimple_assign_rhs2 (op1_def)) == INTEGER_CST
> + && wi::ne_p (gimple_assign_rhs2 (op1_def), 0))

! integer_zerop ()

> +   {
> + tree target = gimple_assign_rhs1 (op1_def);
> +
> + /* If requested, follow ASSERT_EXPRs backwards for op0 looking
> +for one where TARGET appears on the RHS.  */
> + if (follow_assert_exprs)
> +   {
> + /* Now see if that "other operand" is op0, following the chain
> +of ASSERT_EXPRs if necessary.  */
> + gimple *op0_def = SSA_NAME_DEF_STMT (op0);
> + while (op0 != target
> +&& gimple_assign_single_p (op0_def)
> +&& TREE_CODE (gimple_assign_rhs1 (op0_def)) ==
> ASSERT_EXPR)
> +   {
> + op0 = TREE_OPERAND (gimple_assign_rhs1 (op0_def), 0);
> + if (TREE_CODE (op0) != SSA_NAME)
> +   break;
> + op0_def = SSA_NAME_DEF_STMT (op0);
> +   }
> +   }
> +
> + /* If we did not find our target SSA_NAME, then this is not
> +an overflow test.  */
> + if (op0 != target)
> +   return false;
> +
> + tree type = TREE_TYPE (op0);
> + wide_int max = wi::max_value 

Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Rainer Orth
Hi Dominik,

> On Mon, Feb 06, 2017 at 12:33:21PM +0100, Rainer Orth wrote:
>> > Copying the two guys listed as testsuite maintainers in gcc/MAINTAINERS
>> > may help; let me do that for you.
>> >
>> > That said, if this fails to fail, the patch might be considered obvious,
> >> > not requiring an approval?
>> 
>> it's not: while it may XPASS with newer glibc versions, it still XFAILs
>> e.g. on Solaris (and probably others).
>
> It's been so long that I cannot tell what the reference to
> glibc-2.18 means.  I've only ever tested this on s390 and s390x,
> and the test may or may not PASS on other targets with
> glibc-2.18+.
>
>>  So unconditionally removing the
>> xfail *-*-* trades an XPASS->PASS on some Linux versions against a
>> XFAIL->FAIL elsewhere, which isn't acceptable.
>
> Okay, so what would you suggest?
>
>   // { dg-do run { xfail !s390*-*-* } } 
>
> or
>
>   // { dg-do run { xfail *-*-solaris } } 
>
> or something else?  We'll probably only get this list right by
> trial and error anyway.

how about checking the gcc-testresults archives for XPASSes to get an
idea?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Dominik Vogt
On Mon, Feb 06, 2017 at 12:33:21PM +0100, Rainer Orth wrote:
> > Copying the two guys listed as testsuite maintainers in gcc/MAINTAINERS
> > may help; let me do that for you.
> >
> > That said, if this fails to fail, the patch might be considered obvious,
> > not requiring an approval?
> 
> it's not: while it may XPASS with newer glibc versions, it still XFAILs
> e.g. on Solaris (and probably others).

It's been so long that I cannot tell what the reference to
glibc-2.18 means.  I've only ever tested this on s390 and s390x,
and the test may or may not PASS on other targets with
glibc-2.18+.

>  So unconditionally removing the
> xfail *-*-* trades an XPASS->PASS on some Linux versions against a
> XFAIL->FAIL elsewhere, which isn't acceptable.

Okay, so what would you suggest?

  // { dg-do run { xfail !s390*-*-* } } 

or

  // { dg-do run { xfail *-*-solaris } } 

or something else?  We'll probably only get this list right by
trial and error anyway.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] S/390: Use macros from hwint.h where possible.

2017-02-06 Thread Andreas Krebbel
On 01/26/2017 09:45 PM, Dominik Vogt wrote:

* config/s390/predicates.md ("larl_operand"): Use macros from hwint.h.
* config/s390/s390.c (s390_const_operand_ok)
(s390_canonicalize_comparison, s390_extract_part)
(s390_single_part, s390_contiguous_bitmask_nowrap_p)
(s390_contiguous_bitmask_p, s390_rtx_costs)
(legitimize_pic_address): Likewise.
* config/s390/s390.md ("clzdi2", "clztidi2"): Likewise.
* config/s390/vx-builtins.md ("vec_genbytemaskv16qi")
("vec_permi", "vfae", "*vfaes", "vstrc")
("*vstrcs"): Use UINTVAL() to set unsigned HOST_WIDE_INT.
* config/s390/vector.md ("vec_vfenes"): Likewise.

Applied. Thanks!

-Andreas-



Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Rainer Orth
Hi Gerald,

> Copying the two guys listed as testsuite maintainers in gcc/MAINTAINERS
> may help; let me do that for you.
>
> That said, if this fails to fail, the patch might be considered obvious,
> not requiring an approval?

it's not: while it may XPASS with newer glibc versions, it still XFAILs
e.g. on Solaris (and probably others).  So unconditionally removing the
xfail *-*-* trades an XPASS->PASS on some Linux versions against a
XFAIL->FAIL elsewhere, which isn't acceptable.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [libstdc++,doc] Link maintenance for doc/xml/manual/documentation_hacking.xml

2017-02-06 Thread Jonathan Wakely

On 06/02/17 11:28 +0100, Gerald Pfeifer wrote:

Before GCC 7 branches, I am trying to get as many links in our
(non-wwwdocs) documentation fixed as possible.  Turns out libstdc++
has a fair number that need attention.

In documentation_hacking.xml I could not find a replacement document
for DocBook Publishing Tools.  The other two changes were more
straightforward.

(In particular, also note that "here" or "online" as names for links
should be avoided, and this patch also improves one such case.)


Thanks, I always change such link names when I come across them, but I
expect we still have a few.



Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Gerald Pfeifer
Copying the two guys listed as testsuite maintainers in gcc/MAINTAINERS
may help; let me do that for you.

That said, if this fails to fail, the patch might be considered obvious,
not requiring an approval?

Gerald

On Mon, 6 Feb 2017, Dominik Vogt wrote:
> Pinging this for eight months now.  :-/
> 
> On Mon, Jun 20, 2016 at 02:41:21PM +0100, Dominik Vogt wrote:
>> Patch:
>> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01587.html
>> 
>> On Wed, Jan 27, 2016 at 10:39:44AM +0100, Dominik Vogt wrote:
>>> g++.dg/tls/thread_local-order2.C no longer fails with Glibc-2.18 or
>>> newer since this commit:
>>> 
>>>   2014-08-01  Zifei Tong  
>>> 
>>> * libsupc++/atexit_thread.cc (HAVE___CXA_THREAD_ATEXIT_IMPL): 
>>> Add
>>> _GLIBCXX_ prefix to macro.
>>> 
>>>   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@213504 
>>> 138bc75d-0d04-0410-96
>>> 
>>> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg02091.html
>>> 
>>> So, is it time to remove the xfail from the test case?
>>>
>>> gcc/testsuite/ChangeLog
>>> 
>>> * g++.dg/tls/thread_local-order2.C: Remove xfail.
>>> 
>>> From 0b0abbd2e6d9d8b6857622065bdcbdde31b5ddb0 Mon Sep 17 00:00:00 2001
>>> From: Dominik Vogt 
>>> Date: Wed, 27 Jan 2016 09:54:07 +0100
>>> Subject: [PATCH] Remove xfail from thread_local-order2.C.
>>> 
>>> This should work with Glibc-2.18 or newer.
>>> ---
>>>  gcc/testsuite/g++.dg/tls/thread_local-order2.C | 1 -
>>>  1 file changed, 1 deletion(-)
>>> 
>>> diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order2.C 
>>> b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
>>> index f8df917..d3351e6 100644
>>> --- a/gcc/testsuite/g++.dg/tls/thread_local-order2.C
>>> +++ b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
>>> @@ -2,7 +2,6 @@
>>>  // that isn't reverse order of construction.  We need to move
>>>  // __cxa_thread_atexit into glibc to get this right.
>>>  
>>> -// { dg-do run { xfail *-*-* } }
>>>  // { dg-require-effective-target c++11 }
>>>  // { dg-add-options tls }
>>>  // { dg-require-effective-target tls_runtime }
>>> -- 
>>> 2.3.0


Re: [patch,avr] Tweak addhi3_zero_extend.

2017-02-06 Thread Denis Chertykov
2017-02-06 14:03 GMT+04:00 Georg-Johann Lay :
> This patch adds an alternative to addhi3_zero_extend for the case
> where output operand and the 8-bit addend happen to reside
> the the same register.  Without the patch this might lead
> to additional reloads to satisfy the constraints like
>
>
> uint16_t func (uint8_t x, uint16_t y)
> {
> return x + y;
> }
>
>
> Without the new alternative the code will be
>
>
> func:
> movw r18,r22 ;  18  *movhi/1[length = 1]
> add r18,r24  ;  13  *addhi3_zero_extend [length = 2]
> adc r19,__zero_reg__
> movw r24,r18 ;  19  *movhi/1[length = 1]
> /* epilogue start */
> ret
>
>
> With the change the code reads
>
>
> func:
> add r24,r22  ;  13  *addhi3_zero_extend/2   [length = 3]
> mov r25,r23
> adc r25,__zero_reg__
> /* epilogue start */
> ret
>
> which has smaller code size and lower register pressure.
>
> Ok for trunk?
>
> Johann
>
> * config/avr/avr.md (*addhi3_zero_extend): Add alternative where
> REGNO($0) == REGNO($1).

Approved.
Please apply.


Re: Ping#1: [patch,avr]: For PR78883 #3

2017-02-06 Thread Denis Chertykov
2017-02-06 14:06 GMT+04:00 Georg-Johann Lay :
> https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00926.html
>
> On 13.01.2017 12:53, Georg-Johann Lay wrote:
>>
>> This is a third way to fix PR78883, by rejecting malicious expressions
>> from the start.
>>
>> Ok for trunk?
>>
>> Johann
>>
>> gcc/
>> PR target/78883
>> * config/avr/avr.c (rtl-iter.h): Include it.
>> (TARGET_LEGITIMATE_COMBINED_INSN): New hook define...
>> (avr_legitimate_combined_insn): ...and implementation.
>>
>> gcc/testsuite/
>> PR target/78883
>> * gcc.c-torture/compile/pr78883.c: New test.
>>
>

Approved.


[libstdc++,doc] Link maintenance for doc/xml/manual/documentation_hacking.xml

2017-02-06 Thread Gerald Pfeifer
Before GCC 7 branches, I am trying to get as many links in our 
(non-wwwdocs) documentation fixed as possible.  Turns out libstdc++ 
has a fair number that need attention.
 
In documentation_hacking.xml I could not find a replacement document
for DocBook Publishing Tools.  The other two changes were more
straightforward.

(In particular, also note that "here" or "online" as names for links
should be avoided, and this patch also improves one such case.)

Applied (revision 245205).

Gerald

2017-02-06  Gerald Pfeifer  

* doc/xml/manual/documentation_hacking.xml: Remove obsolete link 
to DocBook Publishing Tools.
epubcheck has moved to GitHub.
Update URL of the DocBook Element Reference; use that term as 
link description instead of "online".

Index: doc/xml/manual/documentation_hacking.xml
===
--- doc/xml/manual/documentation_hacking.xml(revision 245203)
+++ doc/xml/manual/documentation_hacking.xml(working copy)
@@ -765,9 +765,8 @@
For PDF output, something that transforms valid Docbook XML to PDF is
required. Possible solutions include http://www.w3.org/1999/xlink; 
xlink:href="http://dblatex.sourceforge.net;>dblatex,
xmlto, or prince. Of
-   these, dblatex is the default. Other
-   options are listed on the DocBook web http://www.w3.org/1999/xlink; 
xlink:href="http://wiki.docbook.org/topic/DocBookPublishingTools;>pages. 
Please
-   consult the libstd...@gcc.gnu.org list when
+   these, dblatex is the default.
+   Please consult the libstd...@gcc.gnu.org list when
preparing printed manuals for current best practice and
suggestions.
   
@@ -778,7 +777,7 @@
   
 
   
-   For epub output, the http://www.w3.org/1999/xlink; 
xlink:href="https://sourceforge.net/projects/docbook/files/epub3/;>stylesheets
 for EPUB3 are required. These stylesheets are still in development. To 
validate the created file, http://www.w3.org/1999/xlink; 
xlink:href="https://code.google.com/p/epubcheck/;>epubcheck is necessary.
+   For epub output, the http://www.w3.org/1999/xlink; 
xlink:href="https://sourceforge.net/projects/docbook/files/epub3/;>stylesheets
 for EPUB3 are required. These stylesheets are still in development. To 
validate the created file, http://www.w3.org/1999/xlink; 
xlink:href="https://github.com/IDPF/epubcheck;>epubcheck is necessary.
   
 
 
@@ -1022,9 +1021,8 @@
   
 
   
-   Complete details on Docbook markup can be found in the DocBook
-   Element Reference,
-   http://www.w3.org/1999/xlink; 
xlink:href="http://www.docbook.org/tdg/en/html/part2.html;>online.
+   Complete details on Docbook markup can be found in the
+   http://www.w3.org/1999/xlink; 
xlink:href="http://tdg.docbook.org/tdg/5.0/ref-elements.html;>DocBook Element 
Reference.
An incomplete reference for HTML to Docbook conversion is
detailed in the table below.
   


Re: [PING 6, PATCH] Remove xfail from thread_local-order2.C.

2017-02-06 Thread Dominik Vogt
Pinging this for eight months now.  :-/

On Mon, Jun 20, 2016 at 02:41:21PM +0100, Dominik Vogt wrote:
> Patch:
> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01587.html
> 
> On Wed, Jan 27, 2016 at 10:39:44AM +0100, Dominik Vogt wrote:
> > g++.dg/tls/thread_local-order2.C no longer fails with Glibc-2.18 or
> > newer since this commit:
> > 
> >   2014-08-01  Zifei Tong  
> > 
> > * libsupc++/atexit_thread.cc (HAVE___CXA_THREAD_ATEXIT_IMPL): 
> > Add
> > _GLIBCXX_ prefix to macro.
> > 
> >   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@213504 
> > 138bc75d-0d04-0410-96
> > 
> > https://gcc.gnu.org/ml/gcc-patches/2014-07/msg02091.html
> > 
> > So, is it time to remove the xfail from the test case?

> > gcc/testsuite/ChangeLog
> > 
> > * g++.dg/tls/thread_local-order2.C: Remove xfail.
> 
> > From 0b0abbd2e6d9d8b6857622065bdcbdde31b5ddb0 Mon Sep 17 00:00:00 2001
> > From: Dominik Vogt 
> > Date: Wed, 27 Jan 2016 09:54:07 +0100
> > Subject: [PATCH] Remove xfail from thread_local-order2.C.
> > 
> > This should work with Glibc-2.18 or newer.
> > ---
> >  gcc/testsuite/g++.dg/tls/thread_local-order2.C | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order2.C 
> > b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
> > index f8df917..d3351e6 100644
> > --- a/gcc/testsuite/g++.dg/tls/thread_local-order2.C
> > +++ b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
> > @@ -2,7 +2,6 @@
> >  // that isn't reverse order of construction.  We need to move
> >  // __cxa_thread_atexit into glibc to get this right.
> >  
> > -// { dg-do run { xfail *-*-* } }
> >  // { dg-require-effective-target c++11 }
> >  // { dg-add-options tls }
> >  // { dg-require-effective-target tls_runtime }
> > -- 
> > 2.3.0

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH, PING 1] S/390: Use macros from hwint.h where possible.

2017-02-06 Thread Dominik Vogt
Ping.

On Thu, Jan 26, 2017 at 09:45:23PM +0100, Dominik Vogt wrote:
> The attached patch uses macros from hwint.h in the s390 backend
> where possible.
> 
> Bootstrapped and regression tested on a zEC12 with s390x biarch
> and s390.
> 
> Ciao
> 
> Dominik ^_^  ^_^
> 
> -- 
> 
> Dominik Vogt
> IBM Germany

> gcc/ChangeLog-macros
> 
>   * config/s390/predicates.md ("larl_operand"): Use macros from hwint.h.
>   * config/s390/s390.c (s390_const_operand_ok)
>   (s390_canonicalize_comparison, s390_extract_part)
>   (s390_single_part, s390_contiguous_bitmask_nowrap_p)
>   (s390_contiguous_bitmask_p, s390_rtx_costs)
>   (legitimize_pic_address): Likewise.
>   * config/s390/s390.md ("clzdi2", "clztidi2"): Likewise.
>   * config/s390/vx-builtins.md ("vec_genbytemaskv16qi")
>   ("vec_permi", "vfae", "*vfaes", "vstrc")
>   ("*vstrcs"): Use UINTVAL() to set unsigned HOST_WIDE_INT.
>   * config/s390/vector.md ("vec_vfenes"): Likewise.
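
As a rough illustration of the kind of substitution this ChangeLog
describes, the sketch below compares the cast-based spelling with the
macro spelling. The macro definitions here are simplified stand-ins that
assume HOST_WIDE_INT is `long long`; the real definitions live in GCC's
hwint.h and depend on the host. The helper function names are
hypothetical.

```c
#include <assert.h>

/* Simplified stand-ins for the hwint.h macros (assumption: the host's
   HOST_WIDE_INT is long long).  */
#define HOST_WIDE_INT long long
#define HOST_WIDE_INT_C(X) X ## LL
#define HOST_WIDE_INT_UC(X) HOST_WIDE_INT_C (X ## U)
#define HOST_WIDE_INT_1 HOST_WIDE_INT_C (1)
#define HOST_WIDE_INT_1U HOST_WIDE_INT_UC (1)

/* Old spelling: explicit cast of the literal 1.  BITWIDTH must be
   smaller than the width of HOST_WIDE_INT.  */
static unsigned HOST_WIDE_INT
max_unsigned_cast (int bitwidth)
{
  return ((unsigned HOST_WIDE_INT) 1 << bitwidth) - 1;
}

/* New spelling: the typed constant from the macro.  */
static unsigned HOST_WIDE_INT
max_unsigned_macro (int bitwidth)
{
  return (HOST_WIDE_INT_1U << bitwidth) - 1;
}

/* Signed lower bound for a BITWIDTH-bit field, old and new spelling.  */
static HOST_WIDE_INT
min_signed_cast (int bitwidth)
{
  return -((HOST_WIDE_INT) 1 << (bitwidth - 1));
}

static HOST_WIDE_INT
min_signed_macro (int bitwidth)
{
  return -(HOST_WIDE_INT_1 << (bitwidth - 1));
}
```

Both spellings compute the same values; the macro form simply avoids
repeating the cast at every use.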

> From 55d162e6bbbd56aa4d67e86fe029eeb21477c60a Mon Sep 17 00:00:00 2001
> From: Dominik Vogt 
> Date: Thu, 26 Jan 2017 18:21:02 +0100
> Subject: [PATCH] S/390: Use macros from hwint.h where possible.
> 
> ---
>  gcc/config/s390/predicates.md  |  4 ++--
>  gcc/config/s390/s390.c | 50 
> --
>  gcc/config/s390/s390.md|  5 ++---
>  gcc/config/s390/vector.md  |  2 +-
>  gcc/config/s390/vx-builtins.md | 14 ++--
>  5 files changed, 36 insertions(+), 39 deletions(-)
> 
> diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
> index a233aaf..0c82efc 100644
> --- a/gcc/config/s390/predicates.md
> +++ b/gcc/config/s390/predicates.md
> @@ -147,8 +147,8 @@
>if (GET_CODE (XEXP (op, 1)) != CONST_INT
>|| (INTVAL (XEXP (op, 1)) & 1) != 0)
>  return false;
> -  if (INTVAL (XEXP (op, 1)) >= (HOST_WIDE_INT)1 << 31
> -   || INTVAL (XEXP (op, 1)) < -((HOST_WIDE_INT)1 << 31))
> +  if (INTVAL (XEXP (op, 1)) >= HOST_WIDE_INT_1 << 31
> +   || INTVAL (XEXP (op, 1)) < -(HOST_WIDE_INT_1 << 31))
>  return false;
>op = XEXP (op, 0);
>  }
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index fe65846..4c7e9a7 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -750,12 +750,12 @@ s390_const_operand_ok (tree arg, int argnum, int 
> op_flags, tree decl)
>int bitwidth = bitwidths[op_flags - O_U1];
>  
>if (!tree_fits_uhwi_p (arg)
> -   || tree_to_uhwi (arg) > ((unsigned HOST_WIDE_INT)1 << bitwidth) - 1)
> +   || tree_to_uhwi (arg) > (HOST_WIDE_INT_1U << bitwidth) - 1)
>   {
> error("constant argument %d for builtin %qF is out of range (0.."
>   HOST_WIDE_INT_PRINT_UNSIGNED ")",
>   argnum, decl,
> - ((unsigned HOST_WIDE_INT)1 << bitwidth) - 1);
> + (HOST_WIDE_INT_1U << bitwidth) - 1);
> return false;
>   }
>  }
> @@ -766,15 +766,15 @@ s390_const_operand_ok (tree arg, int argnum, int 
> op_flags, tree decl)
>int bitwidth = bitwidths[op_flags - O_S2];
>  
>if (!tree_fits_shwi_p (arg)
> -   || tree_to_shwi (arg) < -((HOST_WIDE_INT)1 << (bitwidth - 1))
> -   || tree_to_shwi (arg) > (((HOST_WIDE_INT)1 << (bitwidth - 1)) - 1))
> +   || tree_to_shwi (arg) < -(HOST_WIDE_INT_1 << (bitwidth - 1))
> +   || tree_to_shwi (arg) > ((HOST_WIDE_INT_1 << (bitwidth - 1)) - 1))
>   {
> error("constant argument %d for builtin %qF is out of range ("
>   HOST_WIDE_INT_PRINT_DEC ".."
>   HOST_WIDE_INT_PRINT_DEC ")",
>   argnum, decl,
> - -((HOST_WIDE_INT)1 << (bitwidth - 1)),
> - ((HOST_WIDE_INT)1 << (bitwidth - 1)) - 1);
> + -(HOST_WIDE_INT_1 << (bitwidth - 1)),
> + (HOST_WIDE_INT_1 << (bitwidth - 1)) - 1);
> return false;
>   }
>  }
> @@ -1561,7 +1561,7 @@ s390_canonicalize_comparison (int *code, rtx *op0, rtx 
> *op1,
> && modesize <= HOST_BITS_PER_WIDE_INT)
>   {
> unsigned HOST_WIDE_INT block;
> -   block = ((unsigned HOST_WIDE_INT) 1 << len) - 1;
> +   block = (HOST_WIDE_INT_1U << len) - 1;
> block <<= modesize - pos - len;
>  
> *op0 = gen_rtx_AND (GET_MODE (inner), inner,
> @@ -1610,7 +1610,7 @@ s390_canonicalize_comparison (int *code, rtx *op0, rtx 
> *op1,
>&& INTVAL (*op1) == 0x
>&& SCALAR_INT_MODE_P (GET_MODE (*op0))
>&& (nonzero_bits (*op0, GET_MODE (*op0))
> -   & ~(unsigned HOST_WIDE_INT) 0x) == 0)
> +   & ~HOST_WIDE_INT_UC (0x)) == 0)
>  {
>*op0 = gen_lowpart (HImode, *op0);
>*op1 = constm1_rtx;
> @@ -2212,14 +2212,13 @@ s390_extract_part (rtx op, machine_mode mode, int def)
>unsigned HOST_WIDE_INT value = 0;
>int max_parts = HOST_BITS_PER_WIDE_INT / GET_MODE_BITSIZE (mode);
>int part_bits = GET_MODE_BITSIZE (mode);

Ping#1: [patch,avr]: For PR78883 #3

2017-02-06 Thread Georg-Johann Lay

https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00926.html

On 13.01.2017 12:53, Georg-Johann Lay wrote:

This is a third way to fix PR78883, by rejecting malicious expressions
from the start.

Ok for trunk?

Johann

gcc/
PR target/78883
* config/avr/avr.c (rtl-iter.h): Include it.
(TARGET_LEGITIMATE_COMBINED_INSN): New hook define...
(avr_legitimate_combined_insn): ...and implementation.

gcc/testsuite/
PR target/78883
* gcc.c-torture/compile/pr78883.c: New test.





[wwwdocs] update two links on projects/gupc.html

2017-02-06 Thread Gerald Pfeifer
The first was straightforward, via a permanent redirect, the 
second required googling since the ufl.edu service simply seems 
to have been deprovisioned.

Applied.

Gerald

Index: projects/gupc.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/gupc.html,v
retrieving revision 1.10
diff -u -r1.10 gupc.html
--- projects/gupc.html  27 May 2016 20:44:17 -  1.10
+++ projects/gupc.html  5 Feb 2017 21:52:15 -
@@ -36,11 +36,11 @@
 Features
 
 
-https://www.gwu.edu/~upc/docs/upc_specs_1.2.pdf;>
+https://www2.gwu.edu/~upc/docs/upc_specs_1.2.pdf;>
 UPC 1.2 specification compliant
 http://upc.gwu.edu/docs/UPC_Coll_Spec_V1.0.pdf;>
 UPC collectives library support
-http://gasp.hcs.ufl.edu;>GASP support; GASP is a performance
+http://upc.lbl.gov/gasp-archive/;>GASP support; GASP is a 
performance
 tool interface for Global Address Space Languages
 Fast bit packed pointer-to-shared support
 Configurable UPC pointer-to-shared representation


[patch,avr] Tweak addhi3_zero_extend.

2017-02-06 Thread Georg-Johann Lay

This patch adds an alternative to addhi3_zero_extend for the case
where the output operand and the 8-bit addend happen to reside
in the same register.  Without the patch this might lead
to additional reloads to satisfy the constraints, for code like


uint16_t func (uint8_t x, uint16_t y)
{
return x + y;
}


Without the new alternative the code will be


func:
movw r18,r22 ;  18  *movhi/1[length = 1]
add r18,r24  ;  13  *addhi3_zero_extend [length = 2]
adc r19,__zero_reg__
movw r24,r18 ;  19  *movhi/1[length = 1]
/* epilogue start */
ret


With the change the code reads


func:
add r24,r22  ;  13  *addhi3_zero_extend/2   [length = 3]
mov r25,r23
adc r25,__zero_reg__
/* epilogue start */
ret

which has smaller code size and lower register pressure.
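
For readers less familiar with AVR, the following C sketch models what
the three-instruction sequence computes byte by byte. It is only an
illustrative model, not compiler output; the function names are
hypothetical and the register comments simply mirror the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of func: zero-extend X and add it to Y,
   truncating to 16 bits.  */
static uint16_t
add_reference (uint8_t x, uint16_t y)
{
  return (uint16_t) (x + y);
}

/* Byte-wise model of the new alternative.  */
static uint16_t
add_zero_extend_model (uint8_t x, uint16_t y)
{
  uint8_t y_lo = (uint8_t) (y & 0xff);
  uint8_t y_hi = (uint8_t) (y >> 8);

  unsigned sum = (unsigned) x + y_lo;	/* add r24,r22: low bytes */
  uint8_t lo = (uint8_t) sum;
  unsigned carry = sum >> 8;		/* carry flag out of the add */

  uint8_t hi = y_hi;			/* mov r25,r23 */
  hi = (uint8_t) (hi + carry);		/* adc r25,__zero_reg__ */

  return (uint16_t) ((hi << 8) | lo);
}
```

The model relies on the mov between add and adc leaving the carry
untouched, which is what lets the high byte be copied after the low-byte
addition.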

Ok for trunk?

Johann

* config/avr/avr.md (*addhi3_zero_extend): Add alternative where
REGNO($0) == REGNO($1).
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 244001)
+++ config/avr/avr.md	(working copy)
@@ -1200,12 +1200,14 @@ (define_expand "add3"
 
 
 (define_insn "*addhi3_zero_extend"
-  [(set (match_operand:HI 0 "register_operand" "=r")
-(plus:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "r"))
- (match_operand:HI 2 "register_operand" "0")))]
+  [(set (match_operand:HI 0 "register_operand" "=r,*?r")
+(plus:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "r  ,0"))
+ (match_operand:HI 2 "register_operand" "0  ,r")))]
   ""
-  "add %A0,%1\;adc %B0,__zero_reg__"
-  [(set_attr "length" "2")
+  "@
+	add %A0,%1\;adc %B0,__zero_reg__
+	add %A0,%A2\;mov %B0,%B2\;adc %B0,__zero_reg__"
+  [(set_attr "length" "2,3")
(set_attr "cc" "set_n")])
 
 (define_insn "*addhi3_zero_extend1"


[wwwdocs] remove ftp.dti.ad.jp from mirrors.html

2017-02-06 Thread Gerald Pfeifer
The mirror at ftp.dti.ad.jp is not reachable from my local systems
nor from gcc.gnu.org itself, so for now I applied the patch below.

ftp-admin / Takahiro-san, can you please let me know if I missed
something?

Thank you,
Gerald

Index: mirrors.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/mirrors.html,v
retrieving revision 1.239
diff -u -r1.239 mirrors.html
--- mirrors.html5 Feb 2017 19:57:51 -   1.239
+++ mirrors.html5 Feb 2017 21:57:33 -
@@ -36,7 +36,6 @@
 Germany: http://gcc.cybermirror.org;>http://gcc.cybermirror.org, thanks to 
Sascha Schwarz (cm at cybermirror.org)
 Greece: ftp://ftp.ntua.gr/pub/gnu/gcc/;>ftp.ntua.gr, thanks 
to ftpadm at ntua.gr
 Hungary, Budapest: http://robotlab.itk.ppke.hu/gcc/;>robotlab.itk.ppke.hu, thanks to 
Adam Rak (neurhlp at gmail.com)
-Japan: ftp://ftp.dti.ad.jp/pub/lang/gcc/;>ftp.dti.ad.jp, 
thanks to IWAIZAKO Takahiro (ftp-admin at dti.ad.jp)
 Japan: http://ftp.tsukuba.wide.ad.jp/software/gcc/;>ftp.tsukuba.wide.ad.jp, 
thanks to Kohei Takahashi (tsukuba-ftp-servers at tsukuba.wide.ad.jp)
 The Netherlands, Amsterdam:
   http://nl.mirror.babylon.network/gcc/;>http://nl.mirror.babylon.network/gcc/
 |


Re: [Patch][ARM,AArch64] more poly64 intrinsics and tests

2017-02-06 Thread Kyrill Tkachov

Hi Christophe,

On 06/02/17 08:31, Christophe Lyon wrote:

On 4 February 2017 at 22:54, James Greenhalgh  wrote:

On Wed, Jan 11, 2017 at 11:13:07AM +0100, Christophe Lyon wrote:

Ping?

James, I'm not sure whether your comment was a request for a new
version of my patch or just FYI?

Sorry that this was unclear. I was looking for a new version of the patch
covering this comment. Otherwise we just have debt to go fix it in future.

With the suggested change, the AArch64 parts of this patch are OK - adding
missing intrinsics is very safe (even in Stage 4).

Please post an updated patch, and give Richard and Marcus a reasonable
amount of time to object to taking the patch this late. (and you need an
AArch32 OK too).

Thanks,
James


Hi James,

Thanks for the clarification, here is an updated patch.

I had to make a few changes after rebasing, and I also took the opportunity to
rewrite the existing vtst_p8, vtst_p16, vtstq_p8 and vtstq_p16 without an
asm() statement.

As before, the aarch64 and aarch32 updates to arm_neon.h are independent,
but I found it simpler to group them, as they imply updates to the same test.

Tested as usual, cross-testing on several arm* and aarch64* configurations,
no regression.

OK?


Ok for arm.

Thanks,
Kyrill


Thanks,

Christophe



On 3 January 2017 at 16:47, Christophe Lyon  wrote:

Ping?


On 14 December 2016 at 23:09, Christophe Lyon
 wrote:

On 14 December 2016 at 17:55, James Greenhalgh  wrote:

On Mon, Dec 12, 2016 at 05:03:31PM +0100, Christophe Lyon wrote:

Hi,

After the recent update from Tamar, I noticed a few discrepancies
between ARM and AArch64 regarding a few poly64 intrinsics.

This patch:
- adds vtst_p64 and vtstq_p64 to AArch64's arm_neon.h
- adds vgetq_lane_p64, vset_lane_p64 and vsetq_lane_p64 to ARM's arm_neon.h
( vget_lane_p64 was already there)
- adds the corresponding tests, and moves the vget_lane_p64 ones out
of the #ifdef __aarch64__ zone.

Cross-tested on arm* and aarch64* targets.

OK?

The AArch64 parts of this look fine to me, but I do have one question on
your inline assembly implementation for vtstq_p64:


+__extension__ extern __inline uint64x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vtstq_p64 (poly64x2_t a, poly64x2_t b)
+{
+  uint64x2_t result;
+  __asm__ ("cmtst %0.2d, %1.2d, %2.2d"
+   : "=w"(result)
+   : "w"(a), "w"(b)
+   : /* No clobbers */);
+  return result;
+}
+

Why can this not be written as many of the other vtstq intrinsics are; e.g.:

__extension__ extern __inline uint64x2_t
   __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
   vtstq_p64 (poly64x2_t __a, poly64x2_t __b)
   {
     return (uint64x2_t) ((((uint64x2_t) __a) & ((uint64x2_t) __b))
                          != __AARCH64_INT64_C (0));
   }


I don't know, I just followed the pattern used for vtstq_p8 and vtstq_p16
just above...
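
As a plain-C illustration of what the suggested vector expression is
expected to compute, here is a scalar, lane-by-lane model. It assumes the
usual GCC vector-extension behaviour that a comparison yields all-ones in
lanes where it holds and zero elsewhere; the type and function names are
hypothetical and are not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a vector of two 64-bit lanes.  */
typedef struct
{
  uint64_t lane[2];
} u64x2_model;

/* Per-lane "test bits": all-ones if A and B share any set bit,
   zero otherwise.  */
static u64x2_model
vtstq_model (u64x2_model a, u64x2_model b)
{
  u64x2_model r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = (a.lane[i] & b.lane[i]) != 0 ? UINT64_MAX : 0;
  return r;
}
```

Written this way there is no inline asm involved, so the compiler is free
to fold or schedule the operation like any other expression.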



Thanks,
James


gcc/ChangeLog:

2016-12-12  Christophe Lyon  

   * config/aarch64/arm_neon.h (vtst_p64): New.
   (vtstq_p64): New.
   * config/arm/arm_neon.h (vgetq_lane_p64): New.
   (vset_lane_p64): New.
   (vsetq_lane_p64): New.

gcc/testsuite/ChangeLog:

2016-12-12  Christophe Lyon  

   * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
   (vget_lane_expected, vset_lane_expected, vtst_expected_poly64x1):
   New.
   (vmov_n_expected0, vmov_n_expected1, vmov_n_expected2)
   (expected_vld_st2_0, expected_vld_st2_1, expected_vld_st3_0)
   (expected_vld_st3_1, expected_vld_st3_2, expected_vld_st4_0)
   (expected_vld_st4_1, expected_vld_st4_2, expected_vld_st4_3)
   (vtst_expected_poly64x2): Move to aarch64-only section.
   (vget_lane_p64, vgetq_lane_p64, vset_lane_p64, vsetq_lane_p64)
   (vtst_p64, vtstq_p64): New tests.







Re: [Patch][ARM,AArch64] more poly64 intrinsics and tests

2017-02-06 Thread Christophe Lyon
On 4 February 2017 at 22:54, James Greenhalgh  wrote:
> On Wed, Jan 11, 2017 at 11:13:07AM +0100, Christophe Lyon wrote:
>> Ping?
>>
>> James, I'm not sure whether your comment was a request for a new
>> version of my patch or just FYI?
>
> Sorry that this was unclear. I was looking for a new version of the patch
> covering this comment. Otherwise we just have debt to go fix it in future.
>
> With the suggested change, the AArch64 parts of this patch are OK - adding
> missing intrinsics is very safe (even in Stage 4).
>
> Please post an updated patch, and give Richard and Marcus a reasonable
> amount of time to object to taking the patch this late. (and you need an
> AArch32 OK too).
>
> Thanks,
> James
>

Hi James,

Thanks for the clarification, here is an updated patch.

I had to make a few changes after rebasing, and I also took the opportunity to
rewrite the existing vtst_p8, vtst_p16, vtstq_p8 and vtstq_p16 without an
asm() statement.
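
For readers less familiar with the vtst family, the lane-wise result that
both the asm() form and the plain C form are meant to produce can be sketched
in portable scalar C. This is only an illustration under assumptions: the
names u64x2_sketch and vtstq_sketch are made up here and are not part of
arm_neon.h, and a struct stands in for the real vector type.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 2-lane, 64-bit vector; this is not an
   arm_neon.h type.  */
typedef struct { uint64_t lane[2]; } u64x2_sketch;

/* Lane-wise "test bits": a lane of the result is all-ones when the
   corresponding lanes of A and B share at least one set bit, and zero
   otherwise.  */
static u64x2_sketch
vtstq_sketch (u64x2_sketch a, u64x2_sketch b)
{
  u64x2_sketch r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = (a.lane[i] & b.lane[i]) != 0 ? UINT64_MAX : 0;
  return r;
}
```

With lanes {0x0f, 0xf0} and {0x01, 0x0f} as inputs, the first result lane is
all-ones and the second is zero.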

As before, the aarch64 and aarch32 updates to arm_neon.h are independent,
but I found it simpler to group them, as they imply updates to the same test.

Tested as usual, cross-testing on several arm* and aarch64* configurations,
no regression.

OK?

Thanks,

Christophe

>>
>>
>> On 3 January 2017 at 16:47, Christophe Lyon  
>> wrote:
>> > Ping?
>> >
>> >
>> > On 14 December 2016 at 23:09, Christophe Lyon
>> >  wrote:
>> >> On 14 December 2016 at 17:55, James Greenhalgh  
>> >> wrote:
>> >>> On Mon, Dec 12, 2016 at 05:03:31PM +0100, Christophe Lyon wrote:
>> >>>> Hi,
>> >>>>
>> >>>> After the recent update from Tamar, I noticed a few discrepancies
>> >>>> between ARM and AArch64 regarding a few poly64 intrinsics.
>> >>>>
>> >>>> This patch:
>> >>>> - adds vtst_p64 and vtstq_p64 to AArch64's arm_neon.h
>> >>>> - adds vgetq_lane_p64, vset_lane_p64 and vsetq_lane_p64 to ARM's
>> >>>> arm_neon.h
>> >>>> ( vget_lane_p64 was already there)
>> >>>> - adds the corresponding tests, and moves the vget_lane_p64 ones out
>> >>>> of the #ifdef __aarch64__ zone.
>> >>>>
>> >>>> Cross-tested on arm* and aarch64* targets.
>> >>>>
>> >>>> OK?
>> >>>
>> >>> The AArch64 parts of this look fine to me, but I do have one question on
>> >>> your inline assembly implementation for vtstq_p64:
>> >>>
>> >>>> +__extension__ extern __inline uint64x2_t
>> >>>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> >>>> +vtstq_p64 (poly64x2_t a, poly64x2_t b)
>> >>>> +{
>> >>>> +  uint64x2_t result;
>> >>>> +  __asm__ ("cmtst %0.2d, %1.2d, %2.2d"
>> >>>> +   : "=w"(result)
>> >>>> +   : "w"(a), "w"(b)
>> >>>> +   : /* No clobbers */);
>> >>>> +  return result;
>> >>>> +}
>> >>>> +
>> >>>
>> >>> Why can this not be written as many of the other vtstq intrinsics are; 
>> >>> e.g.:
>> >>>
>> >>>   __extension__ extern __inline uint64x2_t
>> >>>   __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> >>>   vtstq_p64 (poly64x2_t __a, poly64x2_t __b)
>> >>>   {
>> >>>     return (uint64x2_t) ((((uint64x2_t) __a) & ((uint64x2_t) __b))
>> >>>                          != __AARCH64_INT64_C (0));
>> >>>   }
>> >>>
>> >>
>> >> I don't know, I just followed the pattern used for vtstq_p8 and vtstq_p16
>> >> just above...
>> >>
>> >>
>> >>> Thanks,
>> >>> James
>> >>>
>> >>>> gcc/ChangeLog:
>> >>>>
>> >>>> 2016-12-12  Christophe Lyon  
>> >>>>
>> >>>>   * config/aarch64/arm_neon.h (vtst_p64): New.
>> >>>>   (vtstq_p64): New.
>> >>>>   * config/arm/arm_neon.h (vgetq_lane_p64): New.
>> >>>>   (vset_lane_p64): New.
>> >>>>   (vsetq_lane_p64): New.
>> >>>>
>> >>>> gcc/testsuite/ChangeLog:
>> >>>>
>> >>>> 2016-12-12  Christophe Lyon  
>> >>>>
>> >>>>   * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
>> >>>>   (vget_lane_expected, vset_lane_expected, vtst_expected_poly64x1):
>> >>>>   New.
>> >>>>   (vmov_n_expected0, vmov_n_expected1, vmov_n_expected2)
>> >>>>   (expected_vld_st2_0, expected_vld_st2_1, expected_vld_st3_0)
>> >>>>   (expected_vld_st3_1, expected_vld_st3_2, expected_vld_st4_0)
>> >>>>   (expected_vld_st4_1, expected_vld_st4_2, expected_vld_st4_3)
>> >>>>   (vtst_expected_poly64x2): Move to aarch64-only section.
>> >>>>   (vget_lane_p64, vgetq_lane_p64, vset_lane_p64, vsetq_lane_p64)
>> >>>>   (vtst_p64, vtstq_p64): New tests.
>> >>>>
>> >>>
>> >>>
>
gcc/ChangeLog:

2017-02-06  Christophe Lyon  

* config/aarch64/arm_neon.h (vtst_p8): Rewrite without asm.
(vtst_p16): Likewise.
(vtstq_p8): Likewise.
(vtstq_p16): Likewise.
(vtst_p64): New.
(vtstq_p64): Likewise.
* config/arm/arm_neon.h (vgetq_lane_p64): New.
(vset_lane_p64): New.
(vsetq_lane_p64): New.

gcc/testsuite/ChangeLog:

2017-02-06  Christophe Lyon  

Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Jakub Jelinek
On Mon, Feb 06, 2017 at 09:14:24AM +0100, Richard Biener wrote:
> > > +/* Nonzero if TYPE represents a (scalar) boolean type or type
> > > +   in the middle-end compatible with it.  */
> > > +
> > > +#define INTEGRAL_BOOLEAN_TYPE_P(TYPE) \
> > > +  (TREE_CODE (TYPE) == BOOLEAN_TYPE\
> > > +   || ((TREE_CODE (TYPE) == INTEGER_TYPE   \
> > > +   || TREE_CODE (TYPE) == ENUMERAL_TYPE)   \
> > > +   && TYPE_PRECISION (TYPE) == 1   \
> > > +   && TYPE_UNSIGNED (TYPE)))
> > > 
> > > (just to quote what you proposed).

> > I could agree to that.  Alternately, we could restore the TYPE_PRECISION
> > checks that Jakub removed in the vectorizer.

I haven't removed any, there were just 3 changes of the kind:
-  if ((TYPE_PRECISION (TREE_TYPE (var)) != 1
-   || !TYPE_UNSIGNED (TREE_TYPE (var)))
-  && TREE_CODE (TREE_TYPE (var)) != BOOLEAN_TYPE)
+  if (!INTEGRAL_BOOLEAN_TYPE_P (TREE_TYPE (var)))
which are almost equivalent, except that the old code could let through
non-INTEGRAL_TYPE_P types with TYPE_PRECISION 1 (whatever that means
for the various kinds of types, e.g. log2 of the number of vector elements
etc.).
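
To make the difference concrete, here is a minimal sketch on a toy model of a
type node. Everything in it is hypothetical (struct type_sketch, the
code_sketch enumerators, old_check and new_check); GCC's real tree nodes and
accessor macros are not used, and the assumption that a non-integral type can
carry a precision field of 1 is modelled directly rather than taken from the
GCC sources.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified model of a type node.  */
enum code_sketch { BOOLEAN_T, INTEGER_T, ENUMERAL_T, VECTOR_T };

struct type_sketch
{
  enum code_sketch code;
  unsigned precision;   /* For VECTOR_T this models a reused field.  */
  bool is_unsigned;
};

/* Shape of the old open-coded test: accept any unsigned type whose
   precision field is 1, or any BOOLEAN_T.  */
static bool
old_check (const struct type_sketch *t)
{
  return (t->precision == 1 && t->is_unsigned) || t->code == BOOLEAN_T;
}

/* Shape of the proposed predicate: the precision/unsigned test only
   applies to INTEGER_T and ENUMERAL_T.  */
static bool
new_check (const struct type_sketch *t)
{
  return t->code == BOOLEAN_T
         || ((t->code == INTEGER_T || t->code == ENUMERAL_T)
             && t->precision == 1 && t->is_unsigned);
}
```

In this model the two checks agree on a 1-bit unsigned integer and on any
BOOLEAN_T, and differ only on a non-integral type whose precision field
happens to be 1.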

> 
> Yeah, I guess at this point I prefer a vectorizer-local fix.

So shall I move the INTEGRAL_BOOLEAN_TYPE_P macro (or change it to
another name, e.g. including VECT in it?) to tree-vectorizer.h?

Jakub


Re: New Port for RISC-V v3

2017-02-06 Thread Jakub Jelinek
On Sun, Feb 05, 2017 at 10:38:18AM -0800, Palmer Dabbelt wrote:
> There have been a handful of changes since we submitted our v2 port:
> 
>  * Some documentation formatting fixes.
> 
>  * A documentation typo fix.
> 
>  * Some changes to wwwdocs, which have been mailed to the list.
> 
>  * The port now builds via contrib/config-list.mk.  I worked around the
>warnings in other parts of the codebase with some "#pragma GCC diagnostic
>ignored" when I couldn't fix them properly, so the patches aren't useful,
>but I fixed the warnings in our port reasonably.  I can try to fix all
>these reasonably, but it might take a while.
> 
> As far as I know there are currently no outstanding problems with this port,
> so I think it's at the point where we should talk about actually getting the
> code in.  We have been accepted as maintainers of the port, and I have write
> access to the repositories, so I think we're all good to go on that end.  Of
> course if there's any remaining comments I'd love to fix them, but it seems
> the comments on our v2 were somewhat minimal.
> 
> What's the procedure for moving forward with the port?
> 
> Thanks to everyone who helped with reviewing the port!
> 
> [PATCH 1/6] RISC-V Port: gcc/config/riscv/riscv.c
> [PATCH 2/6] RISC-V Port: gcc
> [PATCH 3/6] RISC-V Port: libgcc
> [PATCH 4/6] RISC-V Port: libatomic
> [PATCH 5/6] RISC-V Port: gcc/testsuite
> [PATCH 6/6] RISC-V Port: contrib

Richard said in another mail that he is OK with the patchset, Sandra had some
notes on the documentation patch, and I have seen just 5 of these 6 patches
posted in v3 (the 2/6 patch is missing).
From the RM POV, as long as it doesn't affect other targets it is OK for trunk,
but please don't delay it too much (i.e. resolve Sandra's comments, post the
missing patch, then check it in).

Jakub


Re: [PATCH] Fix bool vs. unsigned:1 vectorization (PR tree-optimization/79284)

2017-02-06 Thread Richard Biener
On Thu, 2 Feb 2017, Jeff Law wrote:

> On 02/01/2017 01:21 AM, Richard Biener wrote:
> > 
> > +/* Nonzero if TYPE represents a (scalar) boolean type or type
> > +   in the middle-end compatible with it.  */
> > +
> > +#define INTEGRAL_BOOLEAN_TYPE_P(TYPE) \
> > +  (TREE_CODE (TYPE) == BOOLEAN_TYPE\
> > +   || ((TREE_CODE (TYPE) == INTEGER_TYPE   \
> > +   || TREE_CODE (TYPE) == ENUMERAL_TYPE)   \
> > +   && TYPE_PRECISION (TYPE) == 1   \
> > +   && TYPE_UNSIGNED (TYPE)))
> > 
> > (just to quote what you proposed).
> And my suggestion is to move the PRECISION/UNSIGNED checks so that they're
> tested for BOOLEAN_TYPEs as well.
> 
> 
> > 
> > As for useless_type_conversion_p, I don't remember why we have
> > 
> >   /* Preserve conversions to/from BOOLEAN_TYPE if types are not
> >  of precision one.  */
> >   if (((TREE_CODE (inner_type) == BOOLEAN_TYPE)
> >!= (TREE_CODE (outer_type) == BOOLEAN_TYPE))
> >   && TYPE_PRECISION (outer_type) != 1)
> > return false;
> > 
> > it came with r173854 where you see other BOOLEAN_TYPE
> > -> integral-type with precision 1 check changes, so a new predicate
> > is very welcome IMHO.
> Funny, it's your change ;-)  Looks like Kai added a gimple checker that
> verified that the operand of a TRUTH_NOT_EXPR must be compatible with
> boolean_type_node.
> 
> That check tripped (bz48989) on some Fortran code.  Your fix seems to imply
> that the Fortran front-end created non-1-precision BOOLEAN_TYPE object.

It used to override boolean_type_node which AFAIK it no longer does.

> 
> > 
> > all BOOLEAN_TYPEs but Ada's have precision one and are unsigned
> > (their TYPE_SIZE may vary though).  Ada's larger precision boolean
> > has only two valid values but needs to be able to encode some 'NaT'
> > state.
> > 
> > I think BOOLEAN_COMPATIBLE_TYPE_P would be misleading as it isn't
> > equal to types_compatible_p (boolean_type_node, t).
> > 
> > Maybe we want TWO_VALUED_UNSIGNED_INTEGRAL_TYPE_P ()? (ick)
> > I thought "BOOLEAN" covers TWO_VALUED_UNSIGNED well enough but
> > simply BOOLEAN_TYPE_P is easily confused with TREE_CODE () ==
> > BOOLEAN_TYPE.
> > 
> > I'm fine with changing the predicate to be more explicit, like
> > 
> > #define INTEGRAL_BOOLEAN_TYPE_P(TYPE) \
> >   (INTEGRAL_TYPE_P (TYPE) && TYPE_PRECISION (TYPE) == 1)
> > 
> > not sure if we really need the TYPE_UNSIGNED check?  The middle-end
> > has various places that just check for a 1-precision type when
> > asking for a boolean context.
> > 
> > So naming set aside, would you agree with the above definition?
> > (modulo a && TYPE_UNSIGNED (TYPE))?
> I could agree to that.  Alternately, we could restore the TYPE_PRECISION
> checks that Jakub removed in the vectorizer.

Yeah, I guess at this point I prefer a vectorizer-local fix.

Richard.


Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-06 Thread Richard Biener
On Sat, Feb 4, 2017 at 3:52 PM, Jeff Law  wrote:
> This is the first of a 4 part series to address the issues around 79095.
>
> This patch addresses improvements in determining ranges of binary
> expressions in three ways.
>
> First if we are otherwise unable to find a range for the result of a
> MINUS_EXPR, if we know the arguments are not equal, then we know the
> resultant range is ~[0,0].
>
> Second, for EXACT_DIV_EXPR, if the numerator has the range ~[0,0], then
> resultant range is currently [TYPE_MIN/DENOM,TYPE_MAX/DENOM].  That is
> rarely a useful range.   A resultant range of ~[0,0] is actually more useful
> since it often tells us something important about the difference of two
> pointers.
>
> Finally, when vrp2 discovers an updated range for an object that had a range
> discovered by vrp1, if the new range is ~[0,0], prefer that new range in
> some cases.  This is needed to avoid losing the newly discovered ~[0,0]
> range for EXACT_DIV_EXPR.
>
> Bootstrapped and regression tested with the other patches in this series.
> OK for the trunk?
>
> Jeff
>
> * tree-vrp.c (extract_range_from_binary_expr): For EXACT_DIV_EXPR,
> if the numerator has the range ~[0,0] make the resultant range
> ~[0,0].  For MINUS_EXPR with no derived range, if the operands are
> known to be not equal, then the resulting range is ~[0,0].
> (intersect_ranges): In some cases prefer ~[0,0].
>
> commit b7baf46ab62e28d2dbc22e9dcd4404926d59df18
> Author: Jeff Law 
> Date:   Fri Feb 3 15:45:58 2017 -0500
>
> Improved ranges
>
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index b429217..3338d8b 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -3298,6 +3298,37 @@ extract_range_from_binary_expr (value_range *vr,
>
>extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1);
>  }
> +
> +  /* EXACT_DIV_EXPR is typically used for pointer subtraction;
> + as a result a ~[0,0] may be better than what has already
> + been computed.
> +
> + In particular if numerator has the range ~[0,0], then the
> + result range is going to be something like
> + [MININT/DIVISOR,MAXINT/DIVISOR], which is rarely useful.
> +
> + So instead make the result range ~[0,0].  */
> +  if (code == EXACT_DIV_EXPR
> +  && TREE_CODE (op0) == SSA_NAME
> +  && vr0.type == VR_ANTI_RANGE
> +  && vr0.min == vr0.max
> +  && integer_zerop (vr0.min))
> +set_value_range_to_nonnull (vr, TREE_TYPE (op0));

The above belongs in extract_range_from_binary_expr_1, in principle the
cases below as well (though there's pre-existing VARYING result handling).
The _1 ones are supposed to be the actual range computations while
the routine you patched is responsible for interfacing with a lattice.  The
_1 routines can be used from code outside of VRP.
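
The arithmetic fact behind the EXACT_DIV_EXPR hunk can be sketched outside
of VRP. The code below is an illustration only: range_sketch, kind_sketch,
is_nonzero, exact_div_range and exact_quotients_nonzero are hypothetical
names, not tree-vrp.c interfaces, and the divisor is assumed to be nonzero.

```c
#include <stdbool.h>

/* Hypothetical toy value range; not GCC's value_range.  */
enum kind_sketch { K_VARYING, K_RANGE, K_ANTI_RANGE };
struct range_sketch { enum kind_sketch kind; long min, max; };

/* True for the anti-range ~[0,0], i.e. "any value except zero".  */
static bool
is_nonzero (struct range_sketch r)
{
  return r.kind == K_ANTI_RANGE && r.min == 0 && r.max == 0;
}

/* Range of N / D when the division is known to be exact.  If N is
   ~[0,0] the quotient is ~[0,0] too; otherwise give up (VARYING).  */
static struct range_sketch
exact_div_range (struct range_sketch n)
{
  struct range_sketch r = { K_VARYING, 0, 0 };
  if (is_nonzero (n))
    r.kind = K_ANTI_RANGE;
  return r;
}

/* Brute-force check on a small domain: every nonzero multiple of D
   (D assumed nonzero) has a nonzero quotient.  */
static bool
exact_quotients_nonzero (long limit, long d)
{
  for (long q = -limit; q <= limit; q++)
    {
      long n = q * d;
      if (n != 0 && n / d == 0)
        return false;
    }
  return true;
}
```

Keeping the computation in a function that takes and returns ranges, with no
lattice lookup, mirrors the split described above between the _1 routines
and their callers.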

> +  /* If we didn't derive a range for MINUS_EXPR, and
> + op1's range is ~[op0,op0] or vice-versa, then we
> + can derive a non-null range.  This happens often for
> + pointer subtraction.  */
> +  if (vr->type == VR_VARYING
> +  && code == MINUS_EXPR
> +  && TREE_CODE (op0) == SSA_NAME
> +  && ((vr0.type == VR_ANTI_RANGE
> +  && symbolic_range_based_on_p (&vr0, op1)
> +  && vr0.min == vr0.max)
> + || (vr1.type == VR_ANTI_RANGE
> + && symbolic_range_based_on_p (&vr1, op0)
> + && vr1.min == vr1.max)))
> +  set_value_range_to_nonnull (vr, TREE_TYPE (op0));
>  }
>
>  /* Extract range information from a unary operation CODE based on
> @@ -8620,6 +8651,12 @@ intersect_ranges (enum value_range_type *vr0type,
>   else if (vrp_val_is_min (vr1min)
>&& vrp_val_is_max (vr1max))
> ;
> + /* Choose the anti-range if it is ~[0,0], that range is special
> +enough to special case.  */
> + else if (*vr0type == VR_ANTI_RANGE
> +  && *vr0min == *vr0max
> +  && integer_zerop (*vr0min))
> +   ;

Huh.  If I spotted the place of the change correctly then we cannot arrive
here with vr0 == ~[0,0] as *vr0type is VR_RANGE.  In the case covered
we'd have the only case intersecting [-1, 1] and ~[0,0] that you'd change
to ~[0,0] instead of [-1,1] which generally would be a bad choice (apart
from your implementation error as vr1 is the anti-range here).
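
A small counting sketch shows why the choice matters. The exact intersection
of [-1,1] and ~[0,0] is {-1, 1}, which a single range or anti-range cannot
express, so either operand is a conservative answer. The helper names below
(in_range, count_admitted, choice_sketch) are hypothetical and only model
membership on a bounded domain; they are not VRP code.

```c
#include <stdbool.h>

enum choice_sketch { CHOOSE_RANGE, CHOOSE_ANTI_RANGE, EXACT };

static bool
in_range (long v, long lo, long hi)
{
  return lo <= v && v <= hi;
}

/* Count the values in [-limit, limit] admitted when the intersection of
   [-1,1] and ~[0,0] is approximated by the given choice.  */
static long
count_admitted (enum choice_sketch choice, long limit)
{
  long n = 0;
  for (long v = -limit; v <= limit; v++)
    {
      bool r = in_range (v, -1, 1);     /* member of [-1,1]  */
      bool a = !in_range (v, 0, 0);     /* member of ~[0,0]  */
      if ((choice == CHOOSE_RANGE && r)
          || (choice == CHOOSE_ANTI_RANGE && a)
          || (choice == EXACT && r && a))
        n++;
    }
  return n;
}
```

On the domain [-10, 10] the exact intersection admits 2 values, keeping
[-1,1] admits 3, and keeping ~[0,0] admits 20.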

Richard.

>   /* Else choose the range.  */
>   else
> {
>