Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known

2016-12-13 Thread Prathamesh Kulkarni
On 13 December 2016 at 17:54, Jakub Jelinek  wrote:
> On Tue, Dec 13, 2016 at 05:41:09PM +0530, Prathamesh Kulkarni wrote:
>> --- a/gcc/tree-ssa-strlen.c
>> +++ b/gcc/tree-ssa-strlen.c
>> @@ -,6 +,90 @@ handle_char_store (gimple_stmt_iterator *gsi)
>>return true;
>>  }
>>
>> +/* Try to fold strstr (s, t) eq/ne s to memcmp (s, t, strlen (t)) eq/ne 0.  
>> */
>> +
>> +static void
>> +fold_strstr_to_memcmp (enum tree_code code, tree rhs1, tree rhs2, gimple 
>> *stmt)
>
> You can drop code argument here, see below.  And I'd say it is better to
> do the
>   if (TREE_CODE (rhs1) != SSA_NAME || TREE_CODE (rhs2) != SSA_NAME)
> return;
> here than repeat it in all the callers.
>
>> +   if (gimple_assign_rhs_code (stmt) == COND_EXPR)
>> + {
>> +   tree cond = gimple_assign_rhs1 (stmt);
>> +   TREE_SET_CODE (cond, code);
>
> TREE_CODE (cond) is already code, so no need to set it again.
>
>> +   gcond *cond = as_a (stmt);
>> +   gimple_cond_set_lhs (cond, memcmp_lhs);
>> +   gimple_cond_set_rhs (cond, zero);
>> +   gimple_cond_set_code (cond, code);
>
> And gimple_cond_code (cond) == code here too.
>
>> +   update_stmt (cond);
>> + }
>
> You can perhaps move the update_stmt (stmt); to a single spot after
> all the 3 cases are handled.
>
>> + if (cond_code == EQ_EXPR || cond_code == NE_EXPR)
>> +   {
>> + tree rhs1 = TREE_OPERAND (cond, 0);
>> + tree rhs2 = TREE_OPERAND (cond, 1);
>
> While it is necessary to check cond_code here and in the other spots
> similarly, because otherwise you don't know if it has 2 arguments etc.,
> you can avoid the SSA_NAME tests here.
>
>> + if (TREE_CODE (rhs1) == SSA_NAME
>> + && TREE_CODE (rhs2) == SSA_NAME)
>> +   fold_strstr_to_memcmp (cond_code, rhs1, rhs2, stmt);
>> +   }
>> +   }
>> + else if (code == EQ_EXPR || code == NE_EXPR)
>> +   {
>> + tree rhs1 = gimple_assign_rhs1 (stmt);
>> + tree rhs2 = gimple_assign_rhs2 (stmt);
>> +
>> + if (TREE_CODE (rhs1) == SSA_NAME
>> + && TREE_CODE (rhs2) == SSA_NAME)
>
> And here.
>> +   fold_strstr_to_memcmp (code, rhs1, rhs2, stmt);
>> +   }
>> +  }
>> +else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))
>>   {
>> tree type = TREE_TYPE (lhs);
>> if (TREE_CODE (type) == ARRAY_TYPE)
>> @@ -2316,6 +2427,17 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
>>   }
>>   }
>>  }
>> +  else if (gcond *cond = dyn_cast (stmt))
>> +{
>> +  enum tree_code code = gimple_cond_code (cond);
>> +  tree lhs = gimple_cond_lhs (stmt);
>> +  tree rhs = gimple_cond_rhs (stmt);
>> +
>> +  if ((code == EQ_EXPR || code == NE_EXPR)
>> +   && TREE_CODE (lhs) == SSA_NAME
>> +   && TREE_CODE (rhs) == SSA_NAME)
>
> And here.
>> + fold_strstr_to_memcmp (code, lhs, rhs, stmt);
>> +}
>>
>>if (gimple_vdef (stmt))
>>  maybe_invalidate (stmt);
>
> Otherwise LGTM, but it would be nice to cover also the COND_EXPR case by a
> testcase (can be done incrementally).
Hi Jakub,
Done the changes in attached version.
Bootstrap+tested on x86_64-unknown-linux-gnu with default languages
and cross-tested on arm*-*-*, aarch64*-*-* with c,c++,fortran.
It it OK to commit ?
I am trying to come up with COND_EXPR test-case.

Thanks,
Prathamesh
>
> Jakub
2016-12-14  Jakub Jelinek  
Prathamesh Kulkarni  

* tree-ssa-strlen.c (fold_strstr_to_memcmp): New function.
(strlen_optimize_stmt): Call fold_strstr_to_memcmp.

testsuite/
* gcc.dg/strlenopt-30.c: New test-case.

diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c 
b/gcc/testsuite/gcc.dg/strlenopt-30.c
new file mode 100644
index 000..089b3a2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-30.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__((no_icf))
+_Bool f1(char *s)
+{
+  return __builtin_strstr (s, "hello") == s;
+}
+
+__attribute__((no_icf))
+_Bool f2(char *s)
+{
+  return s == __builtin_strstr (s, "hello");
+}
+
+__attribute__((no_icf))
+_Bool f3(char *s)
+{
+  return s != __builtin_strstr (s, "hello");
+}
+
+__attribute__((no_icf))
+_Bool f4()
+{
+  char *foo_f4(void);
+  char *t1 = foo_f4();
+  char *t2 = __builtin_strstr (t1, "hello");
+  _Bool t3 = t2 == t1;
+  return t3;
+}
+
+__attribute__((no_icf))
+void f5(char *s)
+{
+  char *t1 = __builtin_strstr (s, "hello");
+  void foo_f5(void);
+  if (t1 != s)
+foo_f5();
+}
+
+/* Do not perform transform, since strlen (t)
+   is unknown.  */
+
+__attribute__((no_icf))
+_Bool f6(char *s, char *t)
+{
+  return __builtin_strstr (s, t) == s;
+}
+
+/* Do not perform transform in this case, since
+   t1 doesn't have single use.  */

Re: [PATCH] Enhance analyze_brprob script

2016-12-13 Thread Jeff Law

On 12/09/2016 03:40 AM, Martin Liška wrote:

On 12/09/2016 11:02 AM, Martin Liška wrote:

Following patch enhances scripts and fixed various small issues.

Ready to be installed?

Martin


I forgot to squash commits, this is the right patch.

M.


0001-Enhance-analyze_brprob-script.patch


From c64469dcdf114ab3a432a9006be3d291df62b8a1 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 8 Dec 2016 18:01:15 +0100
Subject: [PATCH] Enhance analyze_brprob script

contrib/ChangeLog:

2016-12-09  Martin Liska  

* analyze_brprob.py: Add new column to output and new sorting
option. Fix coding style to not exceed 80 characters.
* analyze_brprob_spec.py: Add new sorting
option. Fix coding style to not exceed 80 characters.

OK.

jeff



Re: [PATCH] add missing attribute nonnull to stdio functions (PR 78673 and 17308)

2016-12-13 Thread Jeff Law

On 12/07/2016 08:24 PM, Martin Sebor wrote:



You're right!  Good chatch! I missed that there are two ways to
represent the same thing.  For example, these two declarations

  void __attribute ((nonnull (1, 2)))
  f (void);

  void __attribute ((nonnull (1))) __attribute ((nonnull (2)))
  f (void);

apply to the same arguments but each is represented differently,
as is this one:

  void __attribute ((nonnull))
  f (void);

The builtins use the first form and I didn't have tests for user-
defined attributes in the second form.  I've fixed that in the
attached updated patch (and added more tests).  It does seem,
though, that it would be better to represent these declarations
canonically.  It would simplify the processing and avoid bugs.
I'm not sure what the historical context is for having two forms of the 
same thing.  Though I would generally agree that there should be a 
canonical form.


One more high level note.  While I am in favor of adding these 
attributes, they can cause interesting codegen issues that we need to be 
on the lookout for.


Specifically when we see an SSA_NAME as an argument when the argument is 
marked as NONNULL, the optimizers will try to use that information to 
eliminate unnecessary NULL pointer checks.


The canonical example has been

memcpy (src, dst, 0)
[ stuff ]
if (src == 0)
  whatever

GCC can sometimes follow things well enough to realize that the test, in 
a conforming program, will always be false and remove the test. Some 
have claimed this is undesirable behavior on GCC's part (and they may be 
right).


Anyway, something to keep in mind.  These changes may make GCC ever so 
slightly more aggressive in its NULL pointer eliminations.





Thanks
Martin

gcc-78673.diff


PR c/78673 - sprintf missing attribute nonnull on destination argument
PR c/17308 - nonnull attribute not as useful as it could be

gcc/ChangeLog:

PR c/78673
PR c/17308
* builtin-attrs.def (ATTR_NONNULL_1_1, ATTR_NONNULL_1_2): Defined.
(ATTR_NONNULL_1_3, ATTR_NONNULL_1_4, ATTR_NONNULL_1_5): Same.
(ATTR_NOTHROW_NONNULL_1_1, ATTR_NOTHROW_NONNULL_1_2): Same.
(ATTR_NOTHROW_NONNULL_1_3, ATTR_NOTHROW_NONNULL_1_4): Same.
(ATTR_NOTHROW_NONNULL_1_5): Same.
(ATTR_NONNULL_1_FORMAT_PRINTF_1_2): Same.
(ATTR_NONNULL_1_FORMAT_PRINTF_2_0): Same.
(ATTR_NONNULL_1_FORMAT_PRINTF_2_3): Same.
(ATTR_NONNULL_1_FORMAT_PRINTF_3_0): Same.
(ATTR_NONNULL_1_FORMAT_PRINTF_3_4): Same.
(ATTR_NONNULL_1_FORMAT_PRINTF_4_0): Same.
(ATTR_NONNULL_1_FORMAT_PRINTF_4_5): Same.
* builtins.c (validate_arg): Add argument.  Treat null pointers
passed to nonnull arguments as invalid.
(validate_arglist): Same.
* builtins.def (fprintf, fprintf_unlocked): Add nonnull attribute.
(printf, printf_unlocked, sprintf. vfprintf, vsprintf): Same.
(__sprintf_chk, __vsprintf_chk, __fprintf_chk, __vfprintf_chk): Same.
* calls.c (get_nonnull_ags, maybe_warn_null_arg): New functions.
(initialize_argument_information): Diagnose null pointers passed to
arguments declared nonnull.
* calls.h (get_nonnull_args): Declared.

gcc/c-family/ChangeLog:

PR c/78673
PR c/17308
* c-common.c (check_nonnull_arg): Disable when optimization
is enabled.

gcc/testsuite/ChangeLog:

PR c/78673
PR c/17308
* gcc.dg/builtins-nonnull.c: New test.
* gcc.dg/nonnull-4.c: New test.

OK.
jeff



Re: [LRA] Fix ICE on pathological testcase

2016-12-13 Thread Jeff Law

On 12/07/2016 04:21 PM, Eric Botcazou wrote:

Presumably the MEM isn't a valid memory address, but it's allowed
through due to the use of an "X" constraint?


Yes (that was supposed to be more or less clear given the description :-).


ISTM that LRA has to be prepared to handle an arbitrary RTX, so punting
seems appropriate.


My opinion too.  Note that Bernd E. has another solution though.

I wasn't ever able to get comfortable with the change in behavior that 
Bernd E's patch introduced.  Namely that X no longer meant "anything", 
which is how it's been documented for eons.


Jeff


Re: [PATCH] fix integer overflow bugs in gimple-ssa-sprintf.c (PR 78608)

2016-12-13 Thread Jeff Law

On 12/02/2016 05:36 PM, Martin Sebor wrote:

Bug 78608 - gimple-ssa-sprintf.c:570:17: runtime error: negation
of -9223372036854775808 cannot be represented in type 'long int'
points out an integer overflow bug in the pass caught by ubsan.
The bug was due to negating a number without checking for equality
to INT_MIN.

In addition, my recent change to fix 78521 introduced a call to
abs() that broke the Solaris bootstrap:

  https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00161.html

While fixing these two problems I noticed that the rest of the pass
wasn't handling the corner case of a width with the value of INT_MIN
specified via an argument to the asterisk, such as in:

  int n = snprintf(0, 0, "%*i", INT_MIN, 0);

This is undefined behavior because negative width is supposed to be
treated as the left justification flag followed by a positive width
(thus resulting in INT_MAX + 1 bytes).  This problem affected all
integer and floating point directives.

Finally, while there, I decided to include in information messages
a bit of detail about ranges of floating point values that was
missing.  I did this to help answer questions like those raised
earlier this week by Gerald here ("where does the 317 come from?):

  https://gcc.gnu.org/ml/gcc/2016-11/msg00102.html

The attached patch adjusts the pass to handle these problems.

Martin

gcc-78608.diff


PR tree-optimization/78608 - gimple-ssa-sprintf.c:570:17: runtime error: 
negation of -9223372036854775808 cannot be represented in type 'long int'

gcc/ChangeLog:

PR tree-optimization/78608
* gimple-ssa-sprintf.c (tree_digits): Avoid negating a minimum signed
value.
(get_width_and_precision): Same.
(format_integer): Use HOST_WIDE_INT for expected sprintf return value
to allow for width being the absolte value of INT_MIN.
(format_floating): Use HOST_WIDE_INT for width and precision.  Store
argument type for use in diagnostics.  Use target INT_MAX rather than
the host constant.
(format_floating): Reuse get_width_and_precision instead of hadcoding

s/hadcoding/hardcoding


the same.
(maybe_get_value_range_from_type): New function.
(format_directive): Treat INT_MAX + 1 as a valid (if excessive) byte
count.  Call maybe_get_value_range_from_type.

gcc/testsuite/ChangeLog:

PR tree-optimization/78608
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Add test cases.
So I've been going back and forth on whether or not to suggest a slight 
change in how we're working.


Specifically the practice of lumping multiple fixes into a single patch 
-- I realize that it's usually the case that you're finding related 
issues so there's a desire to address them as a group.  Doing things 
that way also tends to avoid interdependent patches.


The problem is that makes reviewing more difficult and thus I'm much 
less likely to knock it out as soon as it flys by my inbox.   And I 
think I've seen several trivial to review, but important fixes inside 
larger, harder to review changes.


Let's try to keep patches to addressing one issue for the future.  If 
you have multiple related issues, you should consider a series of patches.


Hopefully that will cut down on the number of things are getting stalled.



Index: gcc/gimple-ssa-sprintf.c
===
--- gcc/gimple-ssa-sprintf.c(revision 243196)
+++ gcc/gimple-ssa-sprintf.c(working copy)
@@ -565,8 +565,14 @@ tree_digits (tree x, int base, bool plus, bool pre
   if (tree_fits_shwi_p (x))
{
  HOST_WIDE_INT i = tree_to_shwi (x);
- if (i < 0)
+ if (HOST_WIDE_INT_MIN == i)

nit.  I think most folks would probably prefer this as
if (i == HOST_WIDE_INT_MIN).

HOST_WIDE_INT_MIN is a constant and when we can write an expression in 
either order variable OP const is the preferred order.


You seem to be going back and forth between both styles.  Let' try to 
stick with variable OP const.  I don't think you need to go back and fix 
all of the existing const OP variable instances right now, but we may in 
the future.




@@ -1221,13 +1251,15 @@ static fmtresult
   {
/* The minimum output is "0x.p+0".  */
res.range.min = 6 + (prec > 0 ? prec : 0);
-   res.range.max = (width == INT_MIN
-? HOST_WIDE_INT_MAX
+   /* The maximum is the greater of widh and the number of bytes

s/widh/width/



@@ -1316,42 +1350,23 @@ format_floating (const conversion_spec , tree
   /* Set WIDTH to -1 when it's not specified, to INT_MIN when it is
  specified by the asterisk to an unknown value, and otherwise to
  a non-negative value corresponding to the specified width.  */
-  int width = -1;
-  int prec = -1;
+  HOST_WIDE_INT width;
+  HOST_WIDE_INT prec;
+  get_width_and_precision (spec, , );

+  /* FIXME: Handle specified precision with an unknown value.  */
+  if (prec == 

Re: [PATCH] work around MPFR undefined behavior (PR 78786)

2016-12-13 Thread Jeff Law

On 12/13/2016 09:07 PM, Martin Sebor wrote:

In a discussion of my patch for bug 78696 I mentioned I had found
a bug/limitation in MPFR that causes GCC to allocate excessive
amounts of memory on some corner cases (very large precision).

  https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01098.html

I've since raised GCC bug 78786 for the GCC problem and discussed
the MPFR issues with the library's maintainer.  The attached patch
tweaks GCC to avoid triggering the MPFR problem.

There is some overlap with the attached patch and the one posted
for bug 78608 (fix integer overflow bugs in gimple-ssa-sprintf.c)
that will need to be resolved.  The latter patch also fixes some
more bugs in some esoteric corner cases having to do with flags
that this patch doesn't attempt to handle.

Martin

gcc-78786.diff


PR middle-end/78786 - GCC hangs/out of memory calling sprintf with large 
precision

gcc/ChangeLog:

PR middle-end/78786
* gimple-ssa-sprintf.c (target_dir_max): New macro.
(get_mpfr_format_length): New function.
(format_integer): Use HOST_WIDE_INT instead of int.
(format_floating_max): Same.
(format_floating): Call get_mpfr_format_length.
(format_directive): Use target_dir_max.

gcc/testsuite/ChangeLog:

PR middle-end/78786
* gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: New test.

Index: gcc/gimple-ssa-sprintf.c
+  /* Adjust the retrun value by the difference.  */

s/retrun/return/

As you note there's some overlap between this patch and the 78608 which 
I need to take another look at.


If this patch is safe as-is and has been through the usual bootstrap and 
regression testing on its own, then it can go in now with the nit above 
fixed.


Jeff



Re: [PATCH v3,rs6000] Add built-in function support for Power9 byte instructions

2016-12-13 Thread Sandra Loosemore

On 12/12/2016 05:40 PM, Kelvin Nilsen wrote:


@@ -15105,6 +15109,24 @@ If all of the enabled test conditions are false, t
  The @code{scalar_test_neg} built-in functions return a non-zero value
  if their @code{source} argument holds a negative value.

+The @code{__builtin_byte_in_set} function requires a
+64-bit environment supporting ISA 3.0 or later.  This function returns
+a non-zero value if and only if its @code{u} argument exactly equals one of
+the eight bytes contained within its 64-bit @code{set} argument.
+
+The @code{__builtin_byte_in_range} and
+@code{__builtin_byte_in_either_range} require an environment
+supporting ISA 3.0 or later.  For these two functions, the
+@code{range} argument is encoded as 4 bytes, organized as
+@code{hi_1:lo_1:hi_2:lo_2}.
+The first of these functions returns a
+non-zero value if and only if its @code{u} argument is within the
+range bounded between @code{lo_2} and @code{hi_2} inclusive.
+The second of these functions returns non-zero if and only
+if its @code{u} argument is either within the range bounded between
+@code{lo_1} and @code{hi_1} inclusive or is within the range bounded between
+@code{lo_2} and @code{hi_2} inclusive.


I would prefer that you refer to the functions by name instead of "the 
first/second of these functions".


Also, you have a parallel construction problem in the second sentence: 
"is either within... or is within...".  I suggest rephrasing as "is 
within either...  or...".


-Sandra the nit-picky



[PATCH] work around MPFR undefined behavior (PR 78786)

2016-12-13 Thread Martin Sebor

In a discussion of my patch for bug 78696 I mentioned I had found
a bug/limitation in MPFR that causes GCC to allocate excessive
amounts of memory on some corner cases (very large precision).

  https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01098.html

I've since raised GCC bug 78786 for the GCC problem and discussed
the MPFR issues with the library's maintainer.  The attached patch
tweaks GCC to avoid triggering the MPFR problem.

There is some overlap with the attached patch and the one posted
for bug 78608 (fix integer overflow bugs in gimple-ssa-sprintf.c)
that will need to be resolved.  The latter patch also fixes some
more bugs in some esoteric corner cases having to do with flags
that this patch doesn't attempt to handle.

Martin
PR middle-end/78786 - GCC hangs/out of memory calling sprintf with large precision

gcc/ChangeLog:

	PR middle-end/78786
	* gimple-ssa-sprintf.c (target_dir_max): New macro.
	(get_mpfr_format_length): New function.
	(format_integer): Use HOST_WIDE_INT instead of int.
	(format_floating_max): Same.
	(format_floating): Call get_mpfr_format_length.
	(format_directive): Use target_dir_max.

gcc/testsuite/ChangeLog:

	PR middle-end/78786
	* gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: New test.

Index: gcc/gimple-ssa-sprintf.c
===
--- gcc/gimple-ssa-sprintf.c	(revision 243624)
+++ gcc/gimple-ssa-sprintf.c	(working copy)
@@ -84,6 +84,11 @@ along with GCC; see the file COPYING3.  If not see
to be used for optimization but it's good enough as is for warnings.  */
 #define target_mb_len_max   6
 
+/* The maximum number of bytes a single non-string directive can result
+   in.  This is the result of printf("%.*Lf", INT_MAX, -LDBL_MAX) for
+   LDBL_MAX_10_EXP of 4932.  */
+#define target_dir_max()   (target_int_max () + 4932 + 2)
+
 namespace {
 
 const pass_data pass_data_sprintf_length = {
@@ -989,7 +994,7 @@ format_integer (const conversion_spec , tree
 	  gcc_unreachable ();
 	}
 
-  int len;
+  HOST_WIDE_INT len;
 
   if ((prec == HOST_WIDE_INT_MIN || prec == 0) && integer_zerop (arg))
 	{
@@ -1214,11 +1219,73 @@ format_integer (const conversion_spec , tree
   return res;
 }
 
+/* Return the number of bytes that a format directive consisting of FLAGS,
+   PRECision, format SPECification, and MPFR rounding specifier RNDSPEC,
+   would result for argument X under ideal conditions (i.e., if PREC
+   weren't excessive).  MPFR 3.1 allocates large amounts of memory for
+   values of PREC with large magnitude and can fail (see MPFR bug #21056).
+   This function works around those problems.  */
+
+static unsigned HOST_WIDE_INT
+get_mpfr_format_length (mpfr_ptr x, const char *flags, HOST_WIDE_INT prec,
+			char spec, char rndspec)
+{
+  char fmtstr[40];
+
+  HOST_WIDE_INT len = strlen (flags);
+
+  fmtstr[0] = '%';
+  memcpy (fmtstr + 1, flags, len);
+  memcpy (fmtstr + 1 + len, ".*R", 3);
+  fmtstr[len + 4] = rndspec;
+  fmtstr[len + 5] = spec;
+  fmtstr[len + 6] = '\0';
+
+  /* Avoid passing negative precisions with larger magnitude to MPFR
+ to avoid exposing its bugs.  (A negative precision is supposed
+ to be ignored.)  */
+  if (prec < 0)
+prec = -1;
+
+  HOST_WIDE_INT p = prec;
+
+  if (TOUPPER (spec) == 'G')
+{
+  /* For G/g, precision gives the maximum number of significant
+	 digits which is bounded by LDBL_MAX_10_EXP, or, for a 128
+	 bit IEEE extended precision, 4932.  Using twice as much
+	 here should be more than sufficient for any real format.  */
+  if (9864 < prec)
+	prec = 9864;
+  p = prec;
+}
+  else
+{
+  /* Cap precision arbitrarily at 1KB and add the difference
+	 (if any) to the MPFR result.  */
+  if (1024 < prec)
+	p = 1024;
+}
+
+  len = mpfr_snprintf (NULL, 0, fmtstr, (int)p, x);
+
+  /* Handle the unlikely (impossible?) error by returning more than
+ the maximum dictated by the function's return type.  */
+  if (len < 0)
+return target_dir_max () + 1;
+
+  /* Adjust the retrun value by the difference.  */
+  if (p < prec)
+len += prec - p;
+
+  return len;
+}
+
 /* Return the number of bytes to format using the format specifier
SPEC the largest value in the real floating TYPE.  */
 
-static int
-format_floating_max (tree type, char spec, int prec = -1)
+static unsigned HOST_WIDE_INT
+format_floating_max (tree type, char spec, HOST_WIDE_INT prec)
 {
   machine_mode mode = TYPE_MODE (type);
 
@@ -1243,21 +1310,8 @@ format_integer (const conversion_spec , tree
   mpfr_init2 (x, rfmt->p);
   mpfr_from_real (x, , GMP_RNDN);
 
-  int n;
-
-  if (-1 < prec)
-{
-  const char fmt[] = { '%', '.', '*', 'R', spec, '\0' };
-  n = mpfr_snprintf (NULL, 0, fmt, prec, x);
-}
-  else
-{
-  const char fmt[] = { '%', 'R', spec, '\0' };
-  n = mpfr_snprintf (NULL, 0, fmt, x);
-}
-
   /* Return a value one greater to account for the leading minus sign.  */
-  return n + 1;
+  return 1 + get_mpfr_format_length 

Re: [PATCH v3,rs6000] Add built-in function support for Power9 byte instructions

2016-12-13 Thread Segher Boessenkool
On Tue, Dec 13, 2016 at 05:25:52PM -0700, Kelvin Nilsen wrote:
> Regarding the two test cases that are missing the scan-assembler
> directive (byte-in-set-1.c and byte-in-set-2.c), those tests are both
> expected to fail.  They are checking that the compiler rejects those
> programs with appropriate error messages.

Oh, apparently I cannot read.  Sorry about that.


Segher


Re: [PATCH] Add ISA 3.0 PowerPC support for VEXTU{B,H,W}{L,R}X instructions

2016-12-13 Thread Segher Boessenkool
On Tue, Dec 13, 2016 at 06:17:17PM -0500, Michael Meissner wrote:
> > > +   else if (mode == V8HImode)
> > > + {
> > > +   rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH
> > > + ? dest_si
> > > + : gen_rtx_REG (SImode, REGNO (tmp_gpr)));
> > 
> > I think you have these the wrong way around?
> 
> The rs6000_split_vec_extract_var function is called from several places in
> vsx.md to do a variable vector extract.  In looking at each of the cases, 
> there
> is a GPR tmp register for each of the calls, so I could modify it to move:
> 
>   gcc_assert (REG_P (tmp_gpr));
> 
> before the support for VEXTU{B,H,W}{L,R}X instructions, and leave the
> 
>   gcc_assert (REG_P (tmp_altivec));
> 
> and remove the test for SCRATCH.  In the original version of the code, the
> non-variable case also called rs6000_split_vec_extract_var, and it did not 
> have
> a scratch register.

What I am asking is: in your code, if there is a scratch you don't use it,
while if you get a reg you generate a new reg.  It looks like you have the
? and : the wrong way around.


> > You didn't address the reload_completed on all the splitters yet; is there
> > a reason for it?

[ ... ]

> Basically, until we know all of the details (i.e. after register allocator), 
> we
> can't do the split, because the code is different.

Right.

> There is also the practical case that due to the funky way the scalar parts 
> are
> not in the bottom part of the register, that SUBREG's really don't work 
> between
> 64-bit and 128-bit items that go in vector registers.  After register
> allocation, we can do gen_rtx_REG (, ) to change the types, but
> that really doesn't work before register allocator.

Hrm.  Nastiness.

Thanks for the explanation,


Segher


Re: [PATCH v3,rs6000] Add built-in function support for Power9 byte instructions

2016-12-13 Thread Kelvin Nilsen

Thanks for your quick feedback.

I'll update the comments regarding possible future enhancement to
support QImode for operands[1] as well.

Regarding the two test cases that are missing the scan-assembler
directive (byte-in-set-1.c and byte-in-set-2.c), those tests are both
expected to fail.  They are checking that the compiler rejects those
programs with appropriate error messages.

On 12/13/2016 03:14 PM, Segher Boessenkool wrote:
> Hi Kelvin,
> 
> On Mon, Dec 12, 2016 at 05:40:05PM -0700, Kelvin Nilsen wrote:
>> The patch has been bootstrapped and tested on
>> powerpc64le-unknown-linux and powerpc-unknown-linux (big-endian, with
>> both -m32 and -m64 target options) with no regressions.
>>
>> Is this ok for the trunk?
> 
> Yes it is, much better, thanks!  Two comments below, please fix the testcase
> one before commit if it is indeed a problem:
> 
>> +;; Though the instructions to which this expansion maps operate on
>> +;; 64-bit registers, the current implementation only operates on
>> +;; SI-mode operands as the high-order bits provide no information
>> +;; that is not already available in the low-order bits.  To avoid the
>> +;; costs of data widening operations, a future enhancement might add
>> +;; support for DI-mode operands.
> 
> And operands[1] could be QImode.
> 
>> +(define_expand "cmprb"
>> +  [(set (match_dup 3)
>> +(unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
>> +(match_operand:SI 2 "gpc_reg_operand" "r")]
>> + UNSPEC_CMPRB))
> 
> 
>> --- gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c (revision 0)
>> +++ gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c (working copy)
> 
> Did you forget the scan-assembler here and in the next one, or do you only
> want to test it does indeed compile?
> 
> 
> Segher
> 
> 

-- 
Kelvin Nilsen, Ph.D.  kdnil...@linux.vnet.ibm.com
home office: 801-756-4821, cell: 520-991-6727
IBM Linux Technology Center - PPC Toolchain



Re: [PATCH] fix integer overflow bugs in gimple-ssa-sprintf.c (PR 78608)

2016-12-13 Thread Jeff Law

On 12/13/2016 04:49 PM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00262.html

(I would have almost forgotten about this if it weren't for bug
78786.  While working on a fix for it I keep thinking that some
of the changes I'm making look like they should have already been
made.)

It's not forgotten on my end :-)


jeff



Re: [PATCH] fix integer overflow bugs in gimple-ssa-sprintf.c (PR 78608)

2016-12-13 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00262.html

(I would have almost forgotten about this if it weren't for bug
78786.  While working on a fix for it I keep thinking that some
of the changes I'm making look like they should have already been
made.)

Thanks
Martin

On 12/02/2016 05:36 PM, Martin Sebor wrote:

Bug 78608 - gimple-ssa-sprintf.c:570:17: runtime error: negation
of -9223372036854775808 cannot be represented in type 'long int'
points out an integer overflow bug in the pass caught by ubsan.
The bug was due to negating a number without checking for equality
to INT_MIN.

In addition, my recent change to fix 78521 introduced a call to
abs() that broke the Solaris bootstrap:

  https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00161.html

While fixing these two problems I noticed that the rest of the pass
wasn't handling the corner case of a width with the value of INT_MIN
specified via an argument to the asterisk, such as in:

  int n = snprintf(0, 0, "%*i", INT_MIN, 0);

This is undefined behavior because negative width is supposed to be
treated as the left justification flag followed by a positive width
(thus resulting in INT_MAX + 1 bytes).  This problem affected all
integer and floating point directives.

Finally, while there, I decided to include in information messages
a bit of detail about ranges of floating point values that was
missing.  I did this to help answer questions like those raised
earlier this week by Gerald here ("where does the 317 come from?):

  https://gcc.gnu.org/ml/gcc/2016-11/msg00102.html

The attached patch adjusts the pass to handle these problems.

Martin




Re: [PATCH], Add PowerPC ISA 3.0 vec_vinsert4b and vec_vextract4b built-in functions

2016-12-13 Thread Michael Meissner
On Tue, Dec 13, 2016 at 04:40:20PM -0600, Segher Boessenkool wrote:
> > +  /* If we got invalid arguments bail out before generating bad rtl.  
> > */
> 
> If what is invalid?  Just remove the first comment?

I had copied the code from DST before, and it returns (const_int 0).  However,
normal built-ins can't return (const_int 0) because the rest of the compiler
generates all sorts of errors.  Instead after raising the error, you have to
generate a normal function call.  I'll remove the comment.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] Add ISA 3.0 PowerPC support for VEXTU{B,H,W}{L,R}X instructions

2016-12-13 Thread Michael Meissner
On Tue, Dec 13, 2016 at 04:29:36PM -0600, Segher Boessenkool wrote:
> On Tue, Dec 13, 2016 at 10:15:02AM -0500, Michael Meissner wrote:
> > This patch should address the comments in the last patch.
> > 
> > I have tested this patch with bootstrap builds and make check regression 
> > tests
> > on a little endian Power8 64-bit system and a big endian Power7 32/64-bit
> > system with no regressions.  Can I check this into the trunk?
> 
> > + else if (mode == V8HImode)
> > +   {
> > + rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH
> > +   ? dest_si
> > +   : gen_rtx_REG (SImode, REGNO (tmp_gpr)));
> 
> I think you have these the wrong way around?

The rs6000_split_vec_extract_var function is called from several places in
vsx.md to do a variable vector extract.  In looking at each of the cases, there
is a GPR tmp register for each of the calls, so I could modify it to move:

gcc_assert (REG_P (tmp_gpr));

before the support for VEXTU{B,H,W}{L,R}X instructions, and leave the

gcc_assert (REG_P (tmp_altivec));

and remove the test for SCRATCH.  In the original version of the code, the
non-variable case also called rs6000_split_vec_extract_var, and it did not have
a scratch register.


> You didn't address the reload_completed on all the splitters yet; is there
> a reason for it?

Because there are 4 different cases that generate wildy different code, based
on what register class or memory is used:

1)  For DImode, DFmode, and SFmode we could be extracting to a vector
register and we would not use VEXTU{B,H,W}{L,R}X but instead do a VLSO
and other operations;

2)  Even if the result is in a GPR, DImode and DFmode have to do VLSO
because there is no VEXTUD{L,R}X instruction that deposits the value in
a GPR.  Similarly, on ISA 2.07, we don't have those instructions, so we
have to generate the more complex instructions;

3)  The variable extract code also handles variable extracts that are
stores;

4)  Finally there is the new case where we are extracting to a GPR when we
have ISA 3.0 instructions.

Basically, until we know all of the details (i.e. after register allocator), we
can't do the split, because the code is different.

There is also the practical case that due to the funky way the scalar parts are
not in the bottom part of the register, that SUBREG's really don't work between
64-bit and 128-bit items that go in vector registers.  After register
allocation, we can do gen_rtx_REG (, ) to change the types, but
that really doesn't work before register allocator.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH], Add PowerPC ISA 3.0 vec_vinsert4b and vec_vextract4b built-in functions

2016-12-13 Thread Segher Boessenkool
Hi Mike,

On Tue, Dec 13, 2016 at 01:16:56PM -0500, Michael Meissner wrote:
> I have done bootstrap builds on a 64-bit power8 little endian system and a
> 32/64-bit power7 big endian system.  There were no regressions.  Can I check
> this into the GCC trunk?

Yes, please apply.  One remark below.

> 2016-12-13   Michael Meissner  
> 
>   * config/rs6000/predicates.md (const_0_to_11_operand): New
>   predicate, match 0..11.
>   * config/rs6000/rs6000-builtin.def (BU_P9V_VSX_3): Set built-in
>   type to ternary, not binary.
>   (BU_P9V_64BIT_VSX_3): Likewise.
>   (P9V_BUILTIN_VEXTRACT4B): Add support for vec_vinsert4b and
>   vec_extract4b non-overloaded built-in functions.
>   (P9V_BUILTIN_VINSERT4B): Likewise.
>   (P9V_BUILTIN_VINSERT4B_DI): Likewise.
>   (P9V_BUILTIN_VEC_VEXTULX): Move to section that adds 2 operand ISA
>   3.0 built-in functions.
>   (P9V_BUILTIN_VEC_VEXTURX): Likewise.
>   (P9V_BUILTIN_VEC_VEXTRACT4B): Add support for overloaded
>   vec_insert4b and vec_extract4 built-in functions.
>   (P9V_BUILTIN_VEC_VINSERT4B): Likewise.
>   * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
>   overloaded support for vec_vinsert4b and vec_extract4b.
>   * config/rs6000/rs6000.c (altivec_expand_builtin): Add checks for
>   the vec_insert4b and vec_extract4b byte number being a constant in
>   the range 0..11.
>   * config/rs6000/vsx.md (UNSPEC_XXEXTRACTUW): New unspec.
>   (UNSPEC_XXINSERTW): Likewise.
>   (vextract4b): Add support for the vec_vextract4b built-in
>   function.
>   (vextract4b_internal): Likewise.
>   (vinsert4b): Add support for the vec_insert4b built-in function.
>   Include both a version that inserts element 1 from a V4SI object
>   and one that inserts a DI object.
>   (vinsert4b_internal): Likewise.
>   (vinsert4b_di): Likewise.
>   (vinsert4b_di_internal): Likewise.
>   * config/rs6000/altivec.h (vec_vinsert4b): Support vec_vinsert4b
>   and vec_extract4b built-in functions.
>   * doc/extend.doc (PowerPC VSX built-in functions): Document
>   vec_insert4b and vec_extract4b.
> 
> [gcc/testsuite]
> 2016-12-13  Michael Meissner  
> 
>   * gcc.target/powerpc/p9-vinsert4b-1.c: New test.
>   * gcc.target/powerpc/p9-vinsert4b-2.c: Likewise.

> @@ -15766,6 +15766,40 @@ altivec_expand_builtin (tree exp, rtx ta
>  case VSX_BUILTIN_VEC_EXT_V1TI:
>return altivec_expand_vec_ext_builtin (exp, target);
>  
> +case P9V_BUILTIN_VEXTRACT4B:
> +case P9V_BUILTIN_VEC_VEXTRACT4B:
> +  arg1 = CALL_EXPR_ARG (exp, 1);
> +  STRIP_NOPS (arg1);
> +
> +  /* Generate a normal call if it is invalid.  */
> +  /* If we got invalid arguments bail out before generating bad rtl.  */

If what is invalid?  Just remove the first comment?

Thanks,


Segher


Re: [PATCH] Add ISA 3.0 PowerPC support for VEXTU{B,H,W}{L,R}X instructions

2016-12-13 Thread Segher Boessenkool
On Tue, Dec 13, 2016 at 10:15:02AM -0500, Michael Meissner wrote:
> This patch should address the comments in the last patch.
> 
> I have tested this patch with bootstrap builds and make check regression tests
> on a little endian Power8 64-bit system and a big endian Power7 32/64-bit
> system with no regressions.  Can I check this into the trunk?

> +   else if (mode == V8HImode)
> + {
> +   rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH
> + ? dest_si
> + : gen_rtx_REG (SImode, REGNO (tmp_gpr)));

I think you have these the wrong way around?

You didn't address the reload_completed on all the splitters yet; is there
a reason for it?

Okay with these fixed.


Segher


Re: [PATCH v3,rs6000] Add built-in function support for Power9 byte instructions

2016-12-13 Thread Segher Boessenkool
Hi Kelvin,

On Mon, Dec 12, 2016 at 05:40:05PM -0700, Kelvin Nilsen wrote:
> The patch has been bootstrapped and tested on
> powerpc64le-unknown-linux and powerpc-unknown-linux (big-endian, with
> both -m32 and -m64 target options) with no regressions.
> 
> Is this ok for the trunk?

Yes it is, much better, thanks!  Two comments below, please fix the testcase
one before commit if it is indeed a problem:

> +;; Though the instructions to which this expansion maps operate on
> +;; 64-bit registers, the current implementation only operates on
> +;; SI-mode operands as the high-order bits provide no information
> +;; that is not already available in the low-order bits.  To avoid the
> +;; costs of data widening operations, a future enhancement might add
> +;; support for DI-mode operands.

And operands[1] could be QImode.

> +(define_expand "cmprb"
> +  [(set (match_dup 3)
> + (unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
> + (match_operand:SI 2 "gpc_reg_operand" "r")]
> +  UNSPEC_CMPRB))


> --- gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c  (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c  (working copy)

Did you forget the scan-assembler here and in the next one, or do you only
want to test it does indeed compile?


Segher


[LRA] Fix ICE for paradoxical subregs on strict-alignment platforms

2016-12-13 Thread Eric Botcazou
Hi,

the Ada runtime library doesn't build on SPARC 32-bit with LRA because of an 
ICE on a couple of files:

eric@polaris:~/build/gcc/sparc-sun-solaris2.10> gcc/gnat1 -I gcc/ada/rts -
quiet -dumpbase g-debpoo.adb -auxbase g-debpoo -O2 -gnatpg -fPIC -mlra -
mcpu=v9 g-debpoo.adb -o g-debpoo.s
+===GNAT BUG DETECTED==+
| 7.0.0 20161213 (experimental) [trunk revision 243595] (sparc-sun-
solaris2.10) GCC error:|
| in lra_set_insn_recog_data, at lra.c:965 |
| Error detected around g-debpoo.adb:1896:8 

The problem is that curr_insn_transform attempts to reload subregs before 
reloading addresses:

  if (! check_only_p)
/* Make equivalence substitution and memory subreg elimination
   before address processing because an address legitimacy can
   depend on memory mode.  */

Now, on strict-alignment platforms, simplify_operand_subreg has a special 
provision for paradoxical subregs of memory references:

 /* If we change the address for a paradoxical subreg of memory, the
address might violate the necessary alignment or the access might
be slow.  So take this into consideration.  We need not worry
about accesses beyond allocated memory for paradoxical memory
subregs as we don't substitute such equiv memory (see processing
equivalences in function lra_constraints) and because for spilled
pseudos we allocate stack memory enough for the biggest
corresponding paradoxical subreg.  */
  if (!(MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (mode)
&& SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (reg)))
  || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
  && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg
return true;

That is to say, if the adjusted memory reference is not sufficient aligned, 
the routine will reload the original memory reference instead of the adjusted 
one on strict-alignment platforms.  But there is a hitch: if the address of 
this original memory reference is invalid, the routine will nevertheless 
reload the entire memory reference, leading to the aforementioned ICE.

The proposed fix is to reload the address, if it is invalid, just before 
reloading the memory reference in simplify_operand_subreg; this yields the 
correct sequence of reloads for the case I investigated.  The patch also 
contains formatting fixes and adds a 'return' for the sake of clarity.

Tested on x86-64/Linux and SPARC/Solaris with LRA, OK for the mainline?


2016-12-13  Eric Botcazou  <ebotca...@adacore.com>

* lra-constraints.c (process_address): Add forward declaration.
(simplify_operand_subreg): In the MEM case, if the adjusted memory
reference is not sufficient aligned and the address was invalid,
reload the address before reloading the original memory reference.
Fix long lines and add a final return for the sake of clarity.

-- 
Eric BotcazouIndex: lra-constraints.c
===
--- lra-constraints.c	(revision 243595)
+++ lra-constraints.c	(working copy)
@@ -1454,6 +1454,7 @@ insert_move_for_subreg (rtx_insn **befor
 }
 
 static int valid_address_p (machine_mode mode, rtx addr, addr_space_t as);
+static bool process_address (int, bool, rtx_insn **, rtx_insn **);
 
 /* Make reloads for subreg in operand NOP with internal subreg mode
REG_MODE, add new reloads for further processing.  Return true if
@@ -1480,13 +1481,13 @@ simplify_operand_subreg (int nop, machin
   type = curr_static_id->operand[nop].type;
   if (MEM_P (reg))
 {
-  rtx subst;
-
+  const bool addr_was_valid
+	= valid_address_p (innermode, XEXP (reg, 0), MEM_ADDR_SPACE (reg));
   alter_subreg (curr_id->operand_loc[nop], false);
-  subst = *curr_id->operand_loc[nop];
+  rtx subst = *curr_id->operand_loc[nop];
   lra_assert (MEM_P (subst));
-  if (! valid_address_p (innermode, XEXP (reg, 0),
-			 MEM_ADDR_SPACE (reg))
+
+  if (!addr_was_valid
 	  || valid_address_p (GET_MODE (subst), XEXP (subst, 0),
 			  MEM_ADDR_SPACE (subst))
 	  || ((get_constraint_type (lookup_constraint
@@ -1503,10 +1504,10 @@ simplify_operand_subreg (int nop, machin
 		ADDRESS, SCRATCH)][0]],
   MEM_ADDR_SPACE (subst
 	{
-	  /* If we change address for paradoxical subreg of memory, the
+	  /* If we change the address for a paradoxical subreg of memory, the
 	 address might violate the necessary alignment or the access might
-	 be slow.  So take this into consideration.  We should not worry
-	 about access beyond allocated memory for paradoxical memory
+	 be slow.  So take this into consideration.  We need not worry
+	 about accesses beyond allocated memory for paradoxical memory
 	 subregs as we don't s

_Rb_tree regression

2016-12-13 Thread François Dumont

Hi

I have been reported privately by Christophe in copy a regression 
resulting from my recent changes to _Rb_tree. I removed a constructor 
still necessary in C++03 mode or before. Tests would have shown it if I 
had run them in C++03.


* include/bits/stl_tree.h
(_Rb_tree_impl(const _Key_compare&, const _Node_allocator&): Restore
before C++11 mode.

I prefer to restore it only before C++11 while it used to be 
available in any mode. It is only call with rvalue references. Don't 
hesitate to commit it yourself if you prefer to fix it quickly otherwise 
I'll do it after validation in about 24 hours.


François

diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index 1bfbfa7..cb2c116 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -692,7 +692,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  , _Base_key_compare(__x._M_key_compare)
 	  { }
 
-#if __cplusplus >= 201103L
+#if __cplusplus < 201103L
+	  _Rb_tree_impl(const _Key_compare& __comp, const _Node_allocator& __a)
+	  : _Node_allocator(__a), _Base_key_compare(__comp)
+	  { }
+#else
 	  _Rb_tree_impl(_Rb_tree_impl&&) = default;
 	  _Rb_tree_impl(const _Key_compare& __comp, _Node_allocator&& __a)
 	  : _Node_allocator(std::move(__a)), _Base_key_compare(__comp)


Re: C++ PATCH to reject initializating flexible array members in constructors (PR c++/72775)

2016-12-13 Thread Martin Sebor

On 12/13/2016 02:32 PM, Jason Merrill wrote:

On 12/13/2016 03:25 PM, Martin Sebor wrote:

On 12/13/2016 12:06 PM, Jason Merrill wrote:

On 12/09/2016 07:46 PM, Martin Sebor wrote:

On 12/09/2016 02:49 PM, Jason Merrill wrote:

On Fri, Dec 9, 2016 at 11:33 AM, Martin Sebor 
wrote:

For flexible array members, because they're not in C++, we get to
make up the rules that make the most sense to us.  IMO, they should
fit in well with the rest of the language.


I disagree; we should support C code, but flexible arrays don't really
fit with the C++ object model, so I don't think trying to do anything
clever with them in constructors is worthwhile.


With the suggested approach the array becomes just an ordinary member.
It's not a flexible array member anymore because its bound is deduced
from the initializer (it just looks like one).  The NSDMI char a[] =
"foo" syntax is just a shorthand for char a[4] = "foo".


That would mean that the size of the class varies with its initializer,
which again gets into not fitting with the C++ object model.


Okay, I accept the challenge of finding a case where the size
of a class depends on an initializer expression :)

Here's one:

  constexpr long f (int) { return 1; }
  constexpr long f (long) { return LONG_MAX; };

  struct A {   // sizeof (A) == 4
enum E { i = f (0) } e;
  };

  struct B {   // sizeof (B) == 8
enum E { i = f (0L) } e;
  };

It's not exactly the same but I think it's close enough to argue
that deducing the array size from its initializer wouldn't go
against the established model.


I don't see the similarity; A and B both have constant size.


Sure, just as C and D below do today

  struct C {   // sizeof (C) == 4
char a[4] = "abc";
  };

  struct D {   // sizeof (D) == 8
char a[8] = "abcdefg";
  };

and as would E and F below with what I suggest:

  struct E {   // sizeof (E) == 4
char a[] = "abc";
  };

  struct F {   // sizeof (F) == 8
char a[] = "abcdefg";
  };

Where is the disconnect?

I'm not suggesting that the size of an object of a class change
depending on how the member array is initialized in each ctor
initializer list (if that's how it came across).

Martin


Re: C++ PATCH to reject initializating flexible array members in constructors (PR c++/72775)

2016-12-13 Thread Jason Merrill

On 12/13/2016 03:25 PM, Martin Sebor wrote:

On 12/13/2016 12:06 PM, Jason Merrill wrote:

On 12/09/2016 07:46 PM, Martin Sebor wrote:

On 12/09/2016 02:49 PM, Jason Merrill wrote:

On Fri, Dec 9, 2016 at 11:33 AM, Martin Sebor  wrote:

For flexible array members, because they're not in C++, we get to
make up the rules that make the most sense to us.  IMO, they should
fit in well with the rest of the language.


I disagree; we should support C code, but flexible arrays don't really
fit with the C++ object model, so I don't think trying to do anything
clever with them in constructors is worthwhile.


With the suggested approach the array becomes just an ordinary member.
It's not a flexible array member anymore because its bound is deduced
from the initializer (it just looks like one).  The NSDMI char a[] =
"foo" syntax is just a shorthand for char a[4] = "foo".


That would mean that the size of the class varies with its initializer,
which again gets into not fitting with the C++ object model.


Okay, I accept the challenge of finding a case where the size
of a class depends on an initializer expression :)

Here's one:

  constexpr long f (int) { return 1; }
  constexpr long f (long) { return LONG_MAX; };

  struct A {   // sizeof (A) == 4
enum E { i = f (0) } e;
  };

  struct B {   // sizeof (B) == 8
enum E { i = f (0L) } e;
  };

It's not exactly the same but I think it's close enough to argue
that deducing the array size from its initializer wouldn't go
against the established model.


I don't see the similarity; A and B both have constant size.

Jason



Re: [patch, fortran] Fix ICEs with -fimplicit-none

2016-12-13 Thread Mikael Morin

Le 11/12/2016 à 23:21, Thomas Koenig a écrit :

Hello world,

the attached patch fixes a 5/6/7 regression with -fimplicit-none.

Regression-tested.  OK for all affected branches?


OK. Thanks

Mikael



[PATCH] Use the middle-end boolean_type_node

2016-12-13 Thread Janne Blomqvist
Use the boolean_type_node setup by the middle-end instead of
redefining it. boolean_type_node is not used in GFortran for any
ABI-visible stuff, only internally as the type of boolean
expressions. There appears to be one exception to this, namely the
caf_get* and caf_send* calls which have boolean_type_node
arguments. However, on the library side they seem to use C _Bool, so I
suspect this might be a case of a argument mismatch that hasn't
affected anything so far.

The practical effect of this is that the size of such variables will
be the same as a C _Bool or C++ bool, that is, on most targets a
single byte. Previously we redefined boolean_type_node to be a Fortran
default logical kind sized variable, that is 4 or 8 bytes depending on
compile options. This might enable slightly more compact code, in case
the optimizer determines that the result of such a generated
comparison expression needs to be stored in some temporary location
rather than being used immediately.

Regression tested on x86_64-pc-linux-gnu, Ok for trunk?

2016-12-13  Janne Blomqvist  

* trans-types.c (gfc_init_types): Don't redefine boolean type node.
---
 gcc/fortran/trans-types.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
index 354308f..e8dafa0 100644
--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -961,10 +961,6 @@ gfc_init_types (void)
wi::mask (n, UNSIGNED,
  TYPE_PRECISION (size_type_node)));
 
-  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
-  boolean_true_node = build_int_cst (boolean_type_node, 1);
-  boolean_false_node = build_int_cst (boolean_type_node, 0);
-
   /* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
   gfc_charlen_int_kind = 4;
   gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
-- 
2.7.4



Re: [C++ PATCH] c++/78776 fix alias template ICE

2016-12-13 Thread Jason Merrill

On 12/13/2016 03:43 PM, Nathan Sidwell wrote:

On 12/13/2016 02:03 PM, Jason Merrill wrote:


OK with those changes.


Done.



+/* Template information for an alias template type.  */
+#define TYPE_ALIAS_TEMPLATE_INFO(NODE) \
+  (DECL_LANG_SPECIFIC (TYPE_NAME (NODE))   \
+   ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE)) \
+   : NULL_TREE)
+
+/* If NODE is a specialization of an alias template, this accessor
+   returns the template info for the alias template.  Otherwise behave
+   as TYPE_TEMPLATE_INFO.  */
+#define TYPE_TEMPLATE_INFO_MAYBE_ALIAS(NODE)   \
+  (TYPE_ALIAS_P (NODE) && DECL_LANG_SPECIFIC (TYPE_NAME (NODE))
\
+   ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE)) \
+   : TYPE_TEMPLATE_INFO (NODE))


I'm still seeing the redundant DECL_LANG_SPECIFIC check.

Jason



Re: C++ PATCH to reject initializating flexible array members in constructors (PR c++/72775)

2016-12-13 Thread Martin Sebor

On 12/13/2016 12:06 PM, Jason Merrill wrote:

On 12/09/2016 07:46 PM, Martin Sebor wrote:

On 12/09/2016 02:49 PM, Jason Merrill wrote:

On Fri, Dec 9, 2016 at 11:33 AM, Martin Sebor  wrote:

For flexible array members, because they're not in C++, we get to
make up the rules that make the most sense to us.  IMO, they should
fit in well with the rest of the language.


I disagree; we should support C code, but flexible arrays don't really
fit with the C++ object model, so I don't think trying to do anything
clever with them in constructors is worthwhile.


With the suggested approach the array becomes just an ordinary member.
It's not a flexible array member anymore because its bound is deduced
from the initializer (it just looks like one).  The NSDMI char a[] =
"foo" syntax is just a shorthand for char a[4] = "foo".


That would mean that the size of the class varies with its initializer,
which again gets into not fitting with the C++ object model.


Okay, I accept the challenge of finding a case where the size
of a class depends on an initializer expression :)

Here's one:

  constexpr long f (int) { return 1; }
  constexpr long f (long) { return LONG_MAX; };

  struct A {   // sizeof (A) == 4
enum E { i = f (0) } e;
  };

  struct B {   // sizeof (B) == 8
enum E { i = f (0L) } e;
  };

It's not exactly the same but I think it's close enough to argue
that deducing the array size from its initializer wouldn't go
against the established model.

FWIW, I have a (vague) recollection of an incident where the enum
size problem caused ABI breakage in a product (it was in C and
involved simple macros).  Because an enum is normally expected
to be the same size as an int, having its size change based on
the (hidden behind a macro) value of one of its enumerators is
likely far more unexpected than having an array size depend on
its its initializer.

Martin


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-13 Thread Janne Blomqvist
On Mon, Dec 12, 2016 at 9:39 PM, Janne Blomqvist
 wrote:
> On Mon, Dec 12, 2016 at 7:39 PM, Andre Vehreschild  wrote:
>> Hi Janne,
>>
>> I found that you are favoring build_int_cst (size_type_node, 0) over
>> size_zero_node. Is there a reason to this?
>
> Yes. AFAIU size_zero_node is a zero constant for sizetype which is not
> the same as size_type_node.
>
> AFAIK the difference is that size_type_node is the C size_t type,
> whereas sizetype is a GCC internal type used for address expressions.
> On a "normal" target I understand that they are the same size, but
> there are some slight differences in semantics, e.g. size_type_node
> like C unsigned integer types is defined to wrap around on overflow
> whereas sizetype is undefined on overflow.
>
> I don't know if GCC supports some strange targets with some kind of
> segmented memory where the size of sizetype would be different from
> size_type_node, but I guess it's possible in theory at least.
>
> That being said, now that you mention in I should be using
> build_zero_cst (some_type_node) instead of
> build_int_cst(some_type_node, 0). There's also build_one_cst that I
> should use.
>
>> Furthermore did I have to patch this:
>>
>> diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
>> index 585f25d..f374558 100644
>> --- a/gcc/fortran/dump-parse-tree.c
>> +++ b/gcc/fortran/dump-parse-tree.c
>> @@ -465,7 +465,7 @@ show_expr (gfc_expr *p)
>>   break;
>>
>> case BT_HOLLERITH:
>> - fprintf (dumpfile, "%dH", p->representation.length);
>> + fprintf (dumpfile, "%zdH", p->representation.length);
>>   c = p->representation.string;
>>   for (i = 0; i < p->representation.length; i++, c++)
>> {
>>
>> to bootstrap on x86_64-linux/f23.
>
> Ah, thanks for the catch. I'll fix it by using HOST_WIDE_INT_PRINT_DEC
> since I'll have to change gfc_charlen_t to be a typedef form
> HOST_WIDE_INT (see my answer to FX).
>
>> And I have this regression:
>>
>> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03   -O1  (test for excess
>> errors)
>>
>> allocate_deferred_char_scalar_1.f03:184:0:
>>
>>  p = '12345679'
>>
>> Warning: '__builtin_memcpy' writing 8 bytes into a region of size 5 overflows
>>  the destination [-Wstringop-overflow=]
>>  allocate_deferred_char_scalar_1.f03:242:0:
>>
>>  p = 4_'12345679'
>>
>> Warning: '__builtin_memcpy' writing 32 bytes into a region of size 20 
>> overflows
>> the destination [-Wstringop-overflow=]
>
> I'm seeing that too, but I assumed they would be fixed by Paul's
> recent patch which I don't yet have in my tree yet due to the git
> mirror being stuck..
>
>> Btw, the patch for changing the ABI of the coarray-libs is already nearly 
>> done.
>> I just need to figure that what the state of regressions is with and without 
>> my
>> change.
>
> Thanks.
>
> I'll produce an updated patch with the changes discussed so far.
>
>
> --
> Janne Blomqvist

Hi,

attached is the updated patch that applies on top of the original. I
didn't do the charlen_zero_node etc, I just fixed the relatively few
places in my previous patch rather than everywhere in the entire
frontend.

Now that the git mirror is working again, I see though those warnings
with -O1 from gfortran.dg/allocate_deferred_char_scalar_1.f03 are
still there, and Paul's patch didn't get rid of them. :(. I have
looked at the tree dumps, however, and AFAICS it's a bogus warning.

Ok for trunk?

-- 
Janne Blomqvist
diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 585f25d..7aafe54 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -348,12 +348,10 @@ show_constructor (gfc_constructor_base base)
 
 
 static void
-show_char_const (const gfc_char_t *c, int length)
+show_char_const (const gfc_char_t *c, gfc_charlen_t length)
 {
-  int i;
-
   fputc ('\'', dumpfile);
-  for (i = 0; i < length; i++)
+  for (gfc_charlen_t i = 0; i < length; i++)
 {
   if (c[i] == '\'')
fputs ("''", dumpfile);
@@ -465,7 +463,8 @@ show_expr (gfc_expr *p)
  break;
 
case BT_HOLLERITH:
- fprintf (dumpfile, "%dH", p->representation.length);
+ fprintf (dumpfile, HOST_WIDE_INT_PRINT_DEC "H",
+  p->representation.length);
  c = p->representation.string;
  for (i = 0; i < p->representation.length; i++, c++)
{
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 6a05e9e..e82c320 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2067,10 +2067,12 @@ gfc_intrinsic_sym;
 typedef splay_tree gfc_constructor_base;
 
 
-/* This should be a size_t. But occasionally the string length field
-   is used as a flag with values -1 and -2, see
-   e.g. gfc_add_assign_aux_vars.  */
-typedef ptrdiff_t gfc_charlen_t;
+/* This should be an unsigned variable of type size_t.  But to handle
+   compiling to a 64-bit target from a 

Re: C++ PATCH to reject initializating flexible array members in constructors (PR c++/72775)

2016-12-13 Thread Jason Merrill

On 12/09/2016 07:46 PM, Martin Sebor wrote:

On 12/09/2016 02:49 PM, Jason Merrill wrote:

On Fri, Dec 9, 2016 at 11:33 AM, Martin Sebor  wrote:

For flexible array members, because they're not in C++, we get to
make up the rules that make the most sense to us.  IMO, they should
fit in well with the rest of the language.


I disagree; we should support C code, but flexible arrays don't really
fit with the C++ object model, so I don't think trying to do anything
clever with them in constructors is worthwhile.


With the suggested approach the array becomes just an ordinary member.
It's not a flexible array member anymore because its bound is deduced
from the initializer (it just looks like one).  The NSDMI char a[] =
"foo" syntax is just a shorthand for char a[4] = "foo".


That would mean that the size of the class varies with its initializer, 
which again gets into not fitting with the C++ object model.


Jason



Re: [C++ PATCH] c++/78776 fix alias template ICE

2016-12-13 Thread Jason Merrill

On 12/13/2016 12:49 PM, Nathan Sidwell wrote:

+/* Template information for an alias template type.  */
+#define TYPE_ALIAS_TEMPLATE_INFO(NODE) \
+  (DECL_LANG_SPECIFIC (TYPE_NAME (NODE))   \
+   ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE)) \
+   : NULL_TREE)
+
+/* If NODE is a specialization of an alias template, this accessor
+   returns the template info for the alias template.  Otherwise behave
+   as TYPE_TEMPLATE_INFO.  */
+#define TYPE_TEMPLATE_INFO_MAYBE_ALIAS(NODE)   \
+  (TYPE_ALIAS_P (NODE) && DECL_LANG_SPECIFIC (TYPE_NAME (NODE))
\
+   ? TYPE_ALIAS_TEMPLATE_INFO (NODE)   \
+   : TYPE_TEMPLATE_INFO (NODE))


Looks like this is checking DECL_LANG_SPECIFIC twice again; we should be 
able to drop the check from TYPE_TEMPLATE_INFO_MAYBE_ALIAS.



/* An alias template name is never deduced.  */
if (TYPE_ALIAS_P (arg))
  arg = strip_typedefs (arg);
-   tree argvec = INNERMOST_TEMPLATE_ARGS (TYPE_TI_ARGS (arg));
+
+   tree tinfo = TYPE_TEMPLATE_INFO_MAYBE_ALIAS (arg);


You don't need _MAYBE_ALIAS here.  And without it, we should be able to 
drop the strip_typedefs.


OK with those changes.

Jason



Re: [Patch, Fortran, cleanup] PR 78798: some int-valued functions should be bool

2016-12-13 Thread Janus Weil
2016-12-13 19:19 GMT+01:00 Janne Blomqvist :
> On Tue, Dec 13, 2016 at 8:13 PM, Janus Weil  wrote:
>> Hi all,
>>
>> here is a straightforward cleanup patch that makes a few functions
>> return a bool instead of an int. Regtests cleanly on x86_64-linux-gnu.
>> Ok for trunk?
>
> Ok, thanks.

Thanks, Janne. Committed as r243621.

Cheers,
Janus


Re: [PATCH, Fortran, pr78780, v1] [7 Regression] [Coarray] ICE in conv_caf_send, at fortran/trans-intrinsic.c:1936

2016-12-13 Thread Steve Kargl
On Tue, Dec 13, 2016 at 07:32:33PM +0100, Andre Vehreschild wrote:
> 
> attached patch fixes the issue by improving the check whether a call of the
> caf-runtime-routines needs to be generated instead of a regular assignment. 
> 
> Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk?
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-12-13  Andre Vehreschild  
> 
>   PR fortran/78780
>   * gfortran.dg/coarray/alloc_comp_5.f90: New test.
> 
> gcc/fortran/ChangeLog:
> 
> 2016-12-13  Andre Vehreschild  
> 
>   PR fortran/78780
>   * trans-expr.c (gfc_trans_assignment_1): Improve check whether detour
>   caf-runtime routines is needed.
> 

OK.

-- 
Steve


[PATCH, Fortran, pr78780, v1] [7 Regression] [Coarray] ICE in conv_caf_send, at fortran/trans-intrinsic.c:1936

2016-12-13 Thread Andre Vehreschild
Hi all,

attached patch fixes the issue by improving the check whether a call of the
caf-runtime-routines needs to be generated instead of a regular assignment. 

Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk?

- Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/testsuite/ChangeLog:

2016-12-13  Andre Vehreschild  

PR fortran/78780
* gfortran.dg/coarray/alloc_comp_5.f90: New test.


gcc/fortran/ChangeLog:

2016-12-13  Andre Vehreschild  

PR fortran/78780
* trans-expr.c (gfc_trans_assignment_1): Improve check whether detour
caf-runtime routines is needed.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 2f45d40..f908c25 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -9718,7 +9718,7 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
   bool scalar_to_array;
   tree string_length;
   int n;
-  bool maybe_workshare = false;
+  bool maybe_workshare = false, lhs_refs_comp = false, rhs_refs_comp = false;
   symbol_attribute lhs_caf_attr, rhs_caf_attr, lhs_attr;
   bool is_poly_assign;
 
@@ -9758,8 +9758,8 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
  mode.  */
   if (flag_coarray == GFC_FCOARRAY_LIB)
 {
-  lhs_caf_attr = gfc_caf_attr (expr1);
-  rhs_caf_attr = gfc_caf_attr (expr2);
+  lhs_caf_attr = gfc_caf_attr (expr1, false, _refs_comp);
+  rhs_caf_attr = gfc_caf_attr (expr2, false, _refs_comp);
 }
 
   if (lss != gfc_ss_terminator)
@@ -9959,10 +9959,19 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
 }
   else if (flag_coarray == GFC_FCOARRAY_LIB
 	   && lhs_caf_attr.codimension && rhs_caf_attr.codimension
-	   && lhs_caf_attr.alloc_comp && rhs_caf_attr.alloc_comp)
+	   && ((lhs_caf_attr.allocatable && lhs_refs_comp)
+	   || (rhs_caf_attr.allocatable && rhs_refs_comp)))
 {
+  /* Only detour to caf_send[get][_by_ref] () when the lhs or rhs is an
+	 allocatable component, because those need to be accessed via the
+	 caf-runtime.  No need to check for coindexes here, because resolve
+	 has rewritten those already.  */
   gfc_code code;
   gfc_actual_arglist a1, a2;
+  /* Clear the structures to prevent accessing garbage.  */
+  memset (, '\0', sizeof (gfc_code));
+  memset (, '\0', sizeof (gfc_actual_arglist));
+  memset (, '\0', sizeof (gfc_actual_arglist));
   a1.expr = expr1;
   a1.next = 
   a2.expr = expr2;
diff --git a/gcc/testsuite/gfortran.dg/coarray/alloc_comp_5.f90 b/gcc/testsuite/gfortran.dg/coarray/alloc_comp_5.f90
new file mode 100644
index 000..159f5a3
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray/alloc_comp_5.f90
@@ -0,0 +1,16 @@
+! { dg-do run }
+
+program Jac
+  type Domain
+integer :: n=64
+integer,allocatable :: endsi(:)
+  end type
+  type(Domain),allocatable :: D[:,:,:]
+
+  allocate(D[2,2,*])
+  allocate(D%endsi(2), source = 0)
+  D%endsi(2) = D%n
+  if (any(D%endsi /= [ 0, 64])) error stop
+  deallocate(D)
+end program
+


Re: [PATCH, rs6000] Cost model adjustment

2016-12-13 Thread Segher Boessenkool
Hi Bill,

On Tue, Dec 13, 2016 at 11:13:08AM -0600, Bill Schmidt wrote:
> This patch makes a slight adjustment to the vectorization cost model that
> was previously overlooked.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux gnu with no regressions.
> Ok for trunk?

Sure, thanks!


> 2016-12-13  Bill Schmidt  
> 
>   * config/rs6000/rs600.c (rs6000_builtin_vectorization_cost):
>   Adjust unaligned load cost.


Re: [Patch, Fortran, cleanup] PR 78798: some int-valued functions should be bool

2016-12-13 Thread Janne Blomqvist
On Tue, Dec 13, 2016 at 8:13 PM, Janus Weil  wrote:
> Hi all,
>
> here is a straightforward cleanup patch that makes a few functions
> return a bool instead of an int. Regtests cleanly on x86_64-linux-gnu.
> Ok for trunk?

Ok, thanks.



-- 
Janne Blomqvist


[PATCH], Add PowerPC ISA 3.0 vec_vinsert4b and vec_vextract4b built-in functions

2016-12-13 Thread Michael Meissner
This patch adds support for the vec_vinsert4b and vec_vextract4b built-in
functions that generate the ISA 3.0 XXINSERTW and XXEXTRACTUW/VEXTUW{L,R}X
instructions.  These functions are part of the PowerOpen 64-bit ELF V2 abi.

In doing the work, I noticed the P9V built-in ternary functions incorrectly
were declared to be binary.  I have fixed these functions.

The built-ins added are:

long long vec_vextract4b (const vector signed char, const int);
long long vec_vextract4b (const vector unsigned char, const int);

vector signed char vec_insert4b (vector int, vector signed char, const int);
vector unsigned char vec_insert4b (vector unsigned int, vector unsigned char,
   const int);
vector signed char vec_insert4b (long long, vector signed char, const int);
vector unsigned char vec_insert4b (long long, vector unsigned char, const int);

Note, the ABI only adds the form of vec_insert4b that takes a vector int as the
first argument.  On little endian systems, you have to swap double words to get
the desired element into the scalar position for the XXINSERTW instruction.

I have added a GCC extension to alternatively take a long long (or long in
64-bit) for the value to be inserted, since IMHO, it makes the built-in much
easier to use.

I have done bootstrap builds on a 64-bit power8 little endian system and a
32/64-bit power7 big endian system.  There were no regressions.  Can I check
this into the GCC trunk?

[gcc]
2016-12-13   Michael Meissner  

* config/rs6000/predicates.md (const_0_to_11_operand): New
predicate, match 0..11.
* config/rs6000/rs6000-builtin.def (BU_P9V_VSX_3): Set built-in
type to ternary, not binary.
(BU_P9V_64BIT_VSX_3): Likewise.
(P9V_BUILTIN_VEXTRACT4B): Add support for vec_vinsert4b and
vec_extract4b non-overloaded built-in functions.
(P9V_BUILTIN_VINSERT4B): Likewise.
(P9V_BUILTIN_VINSERT4B_DI): Likewise.
(P9V_BUILTIN_VEC_VEXTULX): Move to section that adds 2 operand ISA
3.0 built-in functions.
(P9V_BUILTIN_VEC_VEXTURX): Likewise.
(P9V_BUILTIN_VEC_VEXTRACT4B): Add support for overloaded
vec_insert4b and vec_extract4 built-in functions.
(P9V_BUILTIN_VEC_VINSERT4B): Likewise.
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
overloaded support for vec_vinsert4b and vec_extract4b.
* config/rs6000/rs6000.c (altivec_expand_builtin): Add checks for
the vec_insert4b and vec_extract4b byte number being a constant in
the range 0..11.
* config/rs6000/vsx.md (UNSPEC_XXEXTRACTUW): New unspec.
(UNSPEC_XXINSERTW): Likewise.
(vextract4b): Add support for the vec_vextract4b built-in
function.
(vextract4b_internal): Likewise.
(vinsert4b): Add support for the vec_insert4b built-in function.
Include both a version that inserts element 1 from a V4SI object
and one that inserts a DI object.
(vinsert4b_internal): Likewise.
(vinsert4b_di): Likewise.
(vinsert4b_di_internal): Likewise.
* config/rs6000/altivec.h (vec_vinsert4b): Support vec_vinsert4b
and vec_extract4b built-in functions.
* doc/extend.doc (PowerPC VSX built-in functions): Document
vec_insert4b and vec_extract4b.

[gcc/testsuite]
2016-12-13  Michael Meissner  

* gcc.target/powerpc/p9-vinsert4b-1.c: New test.
* gcc.target/powerpc/p9-vinsert4b-2.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md 
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 243590)
+++ gcc/config/rs6000/predicates.md (.../gcc/config/rs6000) (working copy)
@@ -210,6 +210,11 @@ (define_predicate "const_0_to_7_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 7)")))
 
+;; Match op = 0..11
+(define_predicate "const_0_to_11_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 0, 11)")))
+
 ;; Match op = 0..15
 (define_predicate "const_0_to_15_operand"
   (and (match_code "const_int")
Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 243590)
+++ gcc/config/rs6000/rs6000-builtin.def(.../gcc/config/rs6000) 
(working copy)
@@ -877,7 +877,16 @@
"__builtin_vsx_" NAME,  /* NAME */  \
RS6000_BTM_P9_VECTOR,   /* MASK */  \
(RS6000_BTC_ ## ATTR  

[Patch, Fortran, cleanup] PR 78798: some int-valued functions should be bool

2016-12-13 Thread Janus Weil
Hi all,

here is a straightforward cleanup patch that makes a few functions
return a bool instead of an int. Regtests cleanly on x86_64-linux-gnu.
Ok for trunk?

Cheers,
Janus



2016-12-13  Janus Weil  

PR fortran/78798
* gfortran.h (gfc_is_constant_expr, gfc_is_formal_arg,
gfc_is_compile_time_shape): Return bool instead of int.
* array.c (gfc_is_compile_time_shape): Ditto.
* expr.c (gfc_is_constant_expr): Ditto.
* resolve.c (gfc_is_formal_arg): Ditto. Make formal_arg_flag bool.
Index: gcc/fortran/array.c
===
--- gcc/fortran/array.c (revision 243609)
+++ gcc/fortran/array.c (working copy)
@@ -2581,18 +2581,16 @@ gfc_find_array_ref (gfc_expr *e)
 
 /* Find out if an array shape is known at compile time.  */
 
-int
+bool
 gfc_is_compile_time_shape (gfc_array_spec *as)
 {
-  int i;
-
   if (as->type != AS_EXPLICIT)
-return 0;
+return false;
 
-  for (i = 0; i < as->rank; i++)
+  for (int i = 0; i < as->rank; i++)
 if (!gfc_is_constant_expr (as->lower[i])
|| !gfc_is_constant_expr (as->upper[i]))
-  return 0;
+  return false;
 
-  return 1;
+  return true;
 }
Index: gcc/fortran/expr.c
===
--- gcc/fortran/expr.c  (revision 243609)
+++ gcc/fortran/expr.c  (working copy)
@@ -882,10 +882,9 @@ done:
 
 
 /* Determine if an expression is constant in the sense of F08:7.1.12.
- * This function expects that the expression has already been simplified.
- * FIXME: Return a bool, not an int.  */
+ * This function expects that the expression has already been simplified.  */
 
-int
+bool
 gfc_is_constant_expr (gfc_expr *e)
 {
   gfc_constructor *c;
@@ -892,7 +891,7 @@ gfc_is_constant_expr (gfc_expr *e)
   gfc_actual_arglist *arg;
 
   if (e == NULL)
-return 1;
+return true;
 
   switch (e->expr_type)
 {
@@ -902,7 +901,7 @@ gfc_is_constant_expr (gfc_expr *e)
  || gfc_is_constant_expr (e->value.op.op2)));
 
 case EXPR_VARIABLE:
-  return 0;
+  return false;
 
 case EXPR_FUNCTION:
 case EXPR_PPC:
@@ -915,7 +914,7 @@ gfc_is_constant_expr (gfc_expr *e)
{
  for (arg = e->value.function.actual; arg; arg = arg->next)
if (!gfc_is_constant_expr (arg->expr))
- return 0;
+ return false;
}
 
   if (e->value.function.isym
@@ -923,13 +922,13 @@ gfc_is_constant_expr (gfc_expr *e)
  || e->value.function.isym->pure
  || e->value.function.isym->inquiry
  || e->value.function.isym->transformational))
-   return 1;
+   return true;
 
-  return 0;
+  return false;
 
 case EXPR_CONSTANT:
 case EXPR_NULL:
-  return 1;
+  return true;
 
 case EXPR_SUBSTRING:
   return e->ref == NULL || (gfc_is_constant_expr (e->ref->u.ss.start)
@@ -943,14 +942,14 @@ gfc_is_constant_expr (gfc_expr *e)
 
   for (; c; c = gfc_constructor_next (c))
if (!gfc_is_constant_expr (c->expr))
- return 0;
+ return false;
 
-  return 1;
+  return true;
 
 
 default:
   gfc_internal_error ("gfc_is_constant_expr(): Unknown expression type");
-  return 0;
+  return false;
 }
 }
 
Index: gcc/fortran/gfortran.h
===
--- gcc/fortran/gfortran.h  (revision 243609)
+++ gcc/fortran/gfortran.h  (working copy)
@@ -3088,7 +3088,7 @@ bool gfc_check_init_expr (gfc_expr *);
 gfc_expr *gfc_build_conversion (gfc_expr *);
 void gfc_free_ref_list (gfc_ref *);
 void gfc_type_convert_binary (gfc_expr *, int);
-int gfc_is_constant_expr (gfc_expr *);
+bool gfc_is_constant_expr (gfc_expr *);
 bool gfc_simplify_expr (gfc_expr *, int);
 int gfc_has_vector_index (gfc_expr *);
 
@@ -3180,7 +3180,7 @@ bool gfc_resolve_iterator (gfc_iterator *, bool, b
 bool find_forall_index (gfc_expr *, gfc_symbol *, int);
 bool gfc_resolve_index (gfc_expr *, int);
 bool gfc_resolve_dim_arg (gfc_expr *);
-int gfc_is_formal_arg (void);
+bool gfc_is_formal_arg (void);
 void gfc_resolve_substring_charlen (gfc_expr *);
 match gfc_iso_c_sub_interface(gfc_code *, gfc_symbol *);
 gfc_expr *gfc_expr_to_initialize (gfc_expr *);
@@ -3218,7 +3218,7 @@ gfc_array_ref *gfc_find_array_ref (gfc_expr *);
 tree gfc_conv_array_initializer (tree type, gfc_expr *);
 bool spec_size (gfc_array_spec *, mpz_t *);
 bool spec_dimen_size (gfc_array_spec *, int, mpz_t *);
-int gfc_is_compile_time_shape (gfc_array_spec *);
+bool gfc_is_compile_time_shape (gfc_array_spec *);
 
 bool gfc_ref_dimen_size (gfc_array_ref *, int dimen, mpz_t *, mpz_t *);
 
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c   (revision 243609)
+++ gcc/fortran/resolve.c   (working copy)
@@ -72,9 +72,9 @@ static bool first_actual_arg = false;
 
 static int omp_workshare_flag;
 
-/* Nonzero if we 

Re: libgo patch committed: Copy hash code from Go 1.7 runtime

2016-12-13 Thread Ian Lance Taylor
On Fri, Dec 9, 2016 at 1:37 AM, Rainer Orth  
wrote:
>
> 2016-12-09  Rainer Orth  
>
> * configure.ac: Call GCC_CHECK_LINKER_HWCAP.
> * Makefile.am (AM_LDFLAGS): Initialize to HWCAP_LDFLAGS.
> [USING_SPLIT_STACK]: Add to AM_LDFLAGS.
> * aclocal.m4: Regenerate.
> * configure: Regenerate.
> * Makefile.in: Regenerate.
> * testsuite/Makefile.in: Regenerate.

Thanks.  Committed to mainline as follows.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 243459)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-ac59bb383e1b446c68465af793722dd0e84abefb
+556a546ba3c7bb14bd1b9b8469ee3b7a914909f6
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 243445)
+++ libgo/Makefile.am   (working copy)
@@ -46,8 +46,10 @@ AM_CFLAGS = -fexceptions -fnon-call-exce
-I $(srcdir)/../libgcc -I $(srcdir)/../libbacktrace \
-I $(MULTIBUILDTOP)../../gcc/include
 
+AM_LDFLAGS = $(HWCAP_LDFLAGS)
+
 if USING_SPLIT_STACK
-AM_LDFLAGS = -XCClinker $(SPLIT_STACK)
+AM_LDFLAGS += -XCClinker $(SPLIT_STACK)
 endif
 
 # Multilib support.
@@ -561,7 +563,7 @@ s-sigtab: $(srcdir)/mksigtab.sh gen-sysi
 runtime.inc: s-runtime-inc; @true
 s-runtime-inc: runtime.lo Makefile
rm -f runtime.inc.tmp2
-   grep -v "#define _" runtime.inc.tmp | grep -v "#define c[01] " > 
runtime.inc.tmp2
+   grep -v "#define _" runtime.inc.tmp | grep -v "#define [cm][01234] " > 
runtime.inc.tmp2
for pattern in '_[GP][a-z]' _Max _Lock _Sig _Trace _MHeap _Num; do \
  grep "#define $$pattern" runtime.inc.tmp >> runtime.inc.tmp2; \
done
Index: libgo/configure.ac
===
--- libgo/configure.ac  (revision 243445)
+++ libgo/configure.ac  (working copy)
@@ -421,6 +421,9 @@ case "$target" in
 esac
 AC_SUBST(OSCFLAGS)
 
+dnl Check linker hardware capability support.
+GCC_CHECK_LINKER_HWCAP
+
 dnl Use -fsplit-stack when compiling C code if available.
 AC_CACHE_CHECK([whether -fsplit-stack is supported],
 [libgo_cv_c_split_stack_supported],


Re: [C++ PATCH] c++/78776 fix alias template ICE

2016-12-13 Thread Nathan Sidwell

On 12/12/2016 10:28 PM, Jason Merrill wrote:


I suspect that most uses don't need to change.


Indeed. This addresses the ones I could find.  Both by grepping pt.c and 
fixing subsequent test fall out.


took the opportunity to make the control flow in get_underlying_template 
somewhat clearer.  I don't think there's anything unsurprising in this 
patch.


ok?

nathan

--
Nathan Sidwell
2016-12-12  Nathan Sidwell  

	PR c++/69481
	* cp-tree.h (TYPE_TEMPLATE_INFO): Remove alias type checking.
	(TYPE_ALIAS_TEMPLATE_INFO): New.
	(TYPE_TEMPLATE_INFO_MAYBE_ALIAS): New.  Use those macros.
	* error.c (dump_alias_template_specialization): Adjust.
	* pt.c (maybe_process_partial_specialization,
	iterative_has_template_arg, find_parameter_packs_r,
	alias_template_specialization_p, dependent_alias_template_spec_p,
	get_underlying_template, lookup_template_class_1, unify): Adjust
	template using decl access.

	PR c++/69481
	* g++.dg/cpp0x/pr69481.C: New.

Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 243604)
+++ cp/cp-tree.h	(working copy)
@@ -3038,23 +3038,30 @@ extern void decl_shadowed_for_var_insert
->template_info)
 
 /* Template information for an ENUMERAL_, RECORD_, UNION_TYPE, or
-   BOUND_TEMPLATE_TEMPLATE_PARM type.  Note that if NODE is a
-   specialization of an alias template, this accessor returns the
-   template info for the alias template, not the one (if any) for the
-   template of the underlying type.  */
+   BOUND_TEMPLATE_TEMPLATE_PARM type.  This ignores any alias
+   templateness of NODE.  */
 #define TYPE_TEMPLATE_INFO(NODE)	\
-  ((TYPE_ALIAS_P (NODE) && DECL_LANG_SPECIFIC (TYPE_NAME (NODE)))	\
-   ? (DECL_LANG_SPECIFIC (TYPE_NAME (NODE))\
-  ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE))\
-  : NULL_TREE)			\
-   : ((TREE_CODE (NODE) == ENUMERAL_TYPE)\
-  ? ENUM_TEMPLATE_INFO (NODE)	\
-  : ((TREE_CODE (NODE) == BOUND_TEMPLATE_TEMPLATE_PARM)		\
-	 ? TEMPLATE_TEMPLATE_PARM_TEMPLATE_INFO (NODE)			\
-	 : (CLASS_TYPE_P (NODE)		\
-	? CLASSTYPE_TEMPLATE_INFO (NODE)\
-	: NULL_TREE
+  (TREE_CODE (NODE) == ENUMERAL_TYPE	\
+   ? ENUM_TEMPLATE_INFO (NODE)		\
+   : (TREE_CODE (NODE) == BOUND_TEMPLATE_TEMPLATE_PARM			\
+  ? TEMPLATE_TEMPLATE_PARM_TEMPLATE_INFO (NODE)			\
+  : (CLASS_TYPE_P (NODE)		\
+	 ? CLASSTYPE_TEMPLATE_INFO (NODE)\
+	 : NULL_TREE)))
 
+/* Template information for an alias template type.  */
+#define TYPE_ALIAS_TEMPLATE_INFO(NODE)	\
+  (DECL_LANG_SPECIFIC (TYPE_NAME (NODE))\
+   ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE))\
+   : NULL_TREE)
+
+/* If NODE is a specialization of an alias template, this accessor
+   returns the template info for the alias template.  Otherwise behave
+   as TYPE_TEMPLATE_INFO.  */
+#define TYPE_TEMPLATE_INFO_MAYBE_ALIAS(NODE)\
+  (TYPE_ALIAS_P (NODE) && DECL_LANG_SPECIFIC (TYPE_NAME (NODE))		\
+   ? TYPE_ALIAS_TEMPLATE_INFO (NODE)	\
+   : TYPE_TEMPLATE_INFO (NODE))
 
 /* Set the template information for an ENUMERAL_, RECORD_, or
UNION_TYPE to VAL.  */
Index: cp/error.c
===
--- cp/error.c	(revision 243604)
+++ cp/error.c	(working copy)
@@ -365,15 +365,13 @@ dump_template_bindings (cxx_pretty_print
 static void
 dump_alias_template_specialization (cxx_pretty_printer *pp, tree t, int flags)
 {
-  tree name;
-
   gcc_assert (alias_template_specialization_p (t));
 
+  tree decl = TYPE_NAME (t);
   if (!(flags & TFF_UNQUALIFIED_NAME))
-dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags);
-  name = TYPE_IDENTIFIER (t);
-  pp_cxx_tree_identifier (pp, name);
-  dump_template_parms (pp, TYPE_TEMPLATE_INFO (t),
+dump_scope (pp, CP_DECL_CONTEXT (decl), flags);
+  pp_cxx_tree_identifier (pp, DECL_NAME (decl));
+  dump_template_parms (pp, DECL_TEMPLATE_INFO (decl),
 		   /*primary=*/false,
 		   flags & ~TFF_TEMPLATE_HEADER);
 }
Index: cp/pt.c
===
--- cp/pt.c	(revision 243604)
+++ cp/pt.c	(working copy)
@@ -940,10 +940,11 @@ maybe_process_partial_specialization (tr
 
   if (TYPE_ALIAS_P (type))
 {
-  if (TYPE_TEMPLATE_INFO (type)
-	  && DECL_ALIAS_TEMPLATE_P (TYPE_TI_TEMPLATE (type)))
+  tree tinfo = TYPE_ALIAS_TEMPLATE_INFO (type);
+
+  if (tinfo && DECL_ALIAS_TEMPLATE_P (TI_TEMPLATE (tinfo)))
 	error ("specialization of alias template %qD",
-	   TYPE_TI_TEMPLATE (type));
+	   TI_TEMPLATE (tinfo));
   else
 	error ("explicit specialization of non-template %qT", type);
   return error_mark_node;
@@ -1829,7 +1830,7 @@ iterative_hash_template_arg (tree arg, h
 	  // left alone, or untouched specializations because
 	  // coerce_template_parms returns the unconverted template
 	  // arguments if it sees incomplete argument packs.
-	  tree ti = TYPE_TEMPLATE_INFO (arg);
+	  tree ti = TYPE_ALIAS_TEMPLATE_INFO (arg);
 

Re: Ping #1: [patch,testsuite,avr]: Filter-out -mmcu= from options for tests that set -mmcu=

2016-12-13 Thread Denis Chertykov
2016-12-13 13:36 GMT+04:00 Georg-Johann Lay :
> Ping #1
>
> As I explained below, the solution taken be arm (pruning output)
> does not work here.

Approved.
Please apply your patch.

>
> Johann
>
> On 02.12.2016 11:21, Georg-Johann Lay wrote:
>>
>> On 01.12.2016 17:40, Mike Stump wrote:
>>>
>>> On Dec 1, 2016, at 3:54 AM, Georg-Johann Lay  wrote:


 This patch moves the compile tests that have a hard coded -mmcu=MCU
 in their dg-options to a new folder.

 The exp driver filters out -mmcu= from the command line options that
 are provided by, say, board description files or --tool-opts.

 This is needed because otherwise conflicting -mmcu= will FAIL
 respective test cases because of "specified option '-mmcu' more than
 once" errors from avr-gcc.

 Ok for trunk?
>>>
>>>
>>> So, it would be nice if different ports can use roughly similar
>>> schemes to handle the same problems.  I think arm is one of the more
>>> complex ports at this point in this regard with a lot of people and a
>>> lot of years time to contemplate and implement solutions to the
>>> problem.  They in particular don't have to move test cases around to
>>> handle the difference like this, I think it would be best to avoid
>>> that requirement if possible.
>>>
>>> Glancing around, two starting points for how the arm achieves what it
>>> does:
>>>
>>>   lappend dg_runtest_extra_prunes "warning: switch -m(cpu|arch)=.*
>>> conflicts with -m(cpu|arch)=.* switch"
>>>
>>> in arm.exp, and they use something like:
>>>
>>> /* { dg-require-effective-target arm_crypto_ok }
>>> */
>>> |crypto-vsha256hq_u32.c:2:/* { dg-require-effective-target
>>> arm_crypto_ok } */
>>> /* { dg-add-options arm_crypto }
>>> */
>>> |crypto-vsha256su0q_u32.c:2:/* { dg-require-effective-target
>>> arm_crypto_ok } */
>>>
>>> to validate the requirements of the test case, and to ensure that
>>> optional things are selected.  Nice, simple, extensible, handles
>>> multilibs, dejagnu arguments and different cpu defaults as I recall.
>>>
>>> You won't need all the hair the arm folks have, but if you stub in
>>> support in that direction, you then have simple, easy expansion room
>>> to match all complexities that the arm folks have already hit and
>>> solved.
>>
>>
>> I tried this approach, but it does not work as expected.
>>
>> Notice that avr-gcc throws an error if conflicting -mmcu options are
>> supplied.  Pruning the output will make some tests "PASS", but the
>> compiler didn't actually do anything but producing an error message...
>>
>> And one test FAILs because it should produce some specific diagnostic,
>> but again the compiler just throws a different error, the output is
>> pruned and the expected message is missing, hence fail.
>>
>> Also one test case is for ATmega8, but you won't run the whole test
>> suite against ATmega8 because that device is too restricted to give
>> reasonable results...  Hence for some tests -mmcu=atmega8 is added by
>> hand.
>>
>> Johann
>>
>>
>>
>>
>>
>


[arm-embedded] Add ARMv8-M Security Extensions support to Cortex-M23 and Cortex-M33

2016-12-13 Thread Thomas Preudhomme

Hi,

We have decided to apply the following patch to the embedded-6-branch to enable 
ARMv8-M Security Extensions to ARM Cortex-M23 and ARM Cortex-M33.


ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2016-12-13  Thomas Preud'homme  

* config/arm/arm-cores.def (cortex-m23): Add FL2_CMSE flag.
(cortex-m33): Likewise.

I've checked that gcc.target/arm/cmse/cmse-1.c compiles with -mcpu=cortex-m23 
-mcmse and -mcpu=cortex-m33 and that it does not when trying with 
-mcpu=cortex-m0 -mcmse or -mcpu=cortex-m7 -mcmse.


Best regards,

Thomas
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index e5b492897a7b39ade9286e6b8b15934277a6848e..add6b21ee5238aab5759d17f3de1aa5dff928d7f 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -166,9 +166,9 @@ ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,	7A,	ARM_FSET_MAKE_
 ARM_CORE("cortex-a17.cortex-a7", cortexa17cortexa7, cortexa7,	7A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), cortex_a12)
 
 /* V8 Architecture Processors */
-ARM_CORE("cortex-m23",	cortexm23, cortexm23,	8M_BASE, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH8M_BASE), v6m)
+ARM_CORE("cortex-m23",	cortexm23, cortexm23,	8M_BASE, ARM_FSET_MAKE (FL_LDSCHED | FL_FOR_ARCH8M_BASE, FL2_CMSE), v6m)
 ARM_CORE("cortex-a32",	cortexa32, cortexa53,	8A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35)
-ARM_CORE("cortex-m33",	cortexm33, cortexm33,	8M_MAIN, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN), v7m)
+ARM_CORE("cortex-m33",	cortexm33, cortexm33,	8M_MAIN, ARM_FSET_MAKE (FL_LDSCHED | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN, FL2_CMSE), v7m)
 ARM_CORE("cortex-a35",	cortexa35, cortexa53,	8A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35)
 ARM_CORE("cortex-a53",	cortexa53, cortexa53,	8A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a53)
 ARM_CORE("cortex-a57",	cortexa57, cortexa57,	8A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a57)


[PATCH, i386]: Fix PR78794, ~9% regression in 32-bit mode for 462.libquntum on Avoton after r243202

2016-12-13 Thread Uros Bizjak
Hello!

Attached patch fixes STV cost function to better model gains of pandn
insn on non-BMI targets. As explained in the PR, STV converts four
scalar arithmetic insns (2 * not and 2 * and) to one (pandn). The
patch increases gain for non-BMI targets for 2 * ix86_cost->add to a
total of 3 * ix86_cost->add.

2016-12-13  Uros Bizjak  

PR target/78794
* config/i386/i386.c (dimode_scalar_chain::compute_convert_gain):
Calculate additional gain for andnot for targets without BMI.

testsuite/ChangeLog:

2016-12-13  Uros Bizjak  

PR target/78794
* gcc.target/i386/pr78794.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 243611)
+++ config/i386/i386.c  (working copy)
@@ -3419,6 +3419,10 @@ dimode_scalar_chain::compute_convert_gain ()
   || GET_CODE (src) == AND)
{
  gain += ix86_cost->add;
+ /* Additional gain for andnot for targets without BMI.  */
+ if (GET_CODE (XEXP (src, 0)) == NOT
+ && !TARGET_BMI)
+   gain += 2 * ix86_cost->add;
  if (CONST_INT_P (XEXP (src, 0)))
gain -= vector_const_cost (XEXP (src, 0));
  if (CONST_INT_P (XEXP (src, 1)))
Index: testsuite/gcc.target/i386/pr78794.c
===
--- testsuite/gcc.target/i386/pr78794.c (nonexistent)
+++ testsuite/gcc.target/i386/pr78794.c (working copy)
@@ -0,0 +1,32 @@
+/* PR target/pr78794 */
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -march=slm -mno-bmi -mno-stackrealign" } */
+/* { dg-final { scan-assembler "pandn" } } */
+
+typedef unsigned long long ull;
+
+struct S1
+{
+  float x;
+  ull y;
+};
+
+
+struct S2
+{
+  int a1;
+  struct S1 *node;
+  int *a2;
+};
+
+void
+foo(int c1, int c2, int c3, struct S2 *reg)
+{
+  int i;
+  for(i=0; ia1; i++)
+if(reg->node[i].y & ((ull) 1 << c1))
+  {
+   if(reg->node[i].y & ((ull) 1 << c2))
+ reg->node[i].y ^= ((ull) 1 << c3);
+  }
+}


[PATCH, rs6000] Cost model adjustment

2016-12-13 Thread Bill Schmidt
Hi,

This patch makes a slight adjustment to the vectorization cost model that
was previously overlooked.

Bootstrapped and tested on powerpc64le-unknown-linux gnu with no regressions.
Ok for trunk?

Thanks,
Bill


2016-12-13  Bill Schmidt  

* config/rs6000/rs600.c (rs6000_builtin_vectorization_cost):
Adjust unaligned load cost.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 243578)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -5358,6 +5358,9 @@ rs6000_builtin_vectorization_cost (enum vect_cost_
 return 3;
 
   case unaligned_load:
+   if (TARGET_P9_VECTOR)
+ return 3;
+
if (TARGET_EFFICIENT_UNALIGNED_VSX)
  return 1;
 



Re: [PATCH, fortran, pr77785, v3] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990

2016-12-13 Thread Andre Vehreschild
Hi Janus,

thanks for the review. Committed as r243614.

- Andre

On Tue, 13 Dec 2016 16:08:59 +0100
Janus Weil  wrote:

> Hi Andre,
> 
> > thanks for the input on the missing testcase, Janus (btw, when you know
> > where to get a new crystal ball, let me know; I am missing mine, too). The
> > new version of the patch adds a new testcase coarray_41.f90 to test that
> > the compiler compiles correctly and the test runs ok.
> >
> > Bootstrapped and regtested on x86_64-linux/f23. Ok for trunk?  
> 
> yes, good for trunk then.
> 
> Cheers,
> Janus
> 
> 
> 
> > On Tue, 13 Dec 2016 12:11:50 +0100
> > Janus Weil  wrote:
> >  
> >> Hi Andre,
> >>  
> >> > all the sanitizer issues I fixed occur during compiling the testsuite.
> >> > So I would say, that when with the patch these errors do not occur
> >> > anymore while processing the testsuite, then those are tested for,
> >> > right?  
> >>
> >> aah, so you're saying that hunk is not actually related to the PR in
> >> the subject line, but instead fixes a testsuite failure seen with a
> >> sanitized compiler? That wasn't mentioned anywhere and sadly I forgot
> >> to bring my crystal ball ...
> >>
> >> Cheers,
> >> Janus
> >>
> >>
> >>  
> >> > On Mon, 12 Dec 2016 13:37:43 +0100
> >> > Janus Weil  wrote:
> >> >  
> >> >> Hi Andre,
> >> >>  
> >> >> > the attached patch corrects reporting of "Sorry, unimplemented yet"
> >> >> > for allocatable and pointer components in polymorphic objects
> >> >> > (BT_CLASS) thus fixing two ICEs reported in the PR.
> >> >> >
> >> >> > The next chunk fixes an ICE when the declaration containing the token
> >> >> > information is of type POINTER or REFERENCE.
> >> >> >
> >> >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk?  
> >> >>
> >> >> the resolve.c hunk is certainly ok. The trans-array.c part looks
> >> >> reasonable as well, but I wonder if it is actually covered by any of
> >> >> your test cases? Since they are all compile-only, with errors being
> >> >> thrown at resolution stage, do they even get to the translation stage?
> >> >>
> >> >> Cheers,
> >> >> Janus  
> >> >
> >> >
> >> > --
> >> > Andre Vehreschild * Email: vehre ad gmx dot de  
> >
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 243613)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,11 @@
+2016-12-13  Andre Vehreschild  
+
+	PR fortran/77785
+	* resolve.c (resolve_symbol): Correct attr lookup to the _data
+	component.
+	* trans-array.c (gfc_alloc_allocatable_for_assignment): Indirect ref
+	pointers and references before retrieving the caf-token.
+
 2016-12-13  Janus Weil  
 	Paul Thomas  
 
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(Revision 243613)
+++ gcc/fortran/resolve.c	(Arbeitskopie)
@@ -14044,8 +14044,8 @@
   if (flag_coarray == GFC_FCOARRAY_LIB && sym->ts.type == BT_CLASS
   && sym->ts.u.derived && CLASS_DATA (sym)
   && CLASS_DATA (sym)->attr.codimension
-  && (sym->ts.u.derived->attr.alloc_comp
-	  || sym->ts.u.derived->attr.pointer_comp))
+  && (CLASS_DATA (sym)->ts.u.derived->attr.alloc_comp
+	  || CLASS_DATA (sym)->ts.u.derived->attr.pointer_comp))
 {
   gfc_error ("Sorry, allocatable/pointer components in polymorphic (CLASS) "
 		 "type coarrays at %L are unsupported", >declared_at);
Index: gcc/fortran/trans-array.c
===
--- gcc/fortran/trans-array.c	(Revision 243613)
+++ gcc/fortran/trans-array.c	(Arbeitskopie)
@@ -9337,6 +9337,8 @@
   if (token == NULL_TREE)
 	{
 	  tmp = gfc_get_tree_for_caf_expr (expr1);
+	  if (POINTER_TYPE_P (TREE_TYPE (tmp)))
+	tmp = build_fold_indirect_ref (tmp);
 	  gfc_get_caf_token_offset (_se, , NULL, tmp, NULL_TREE,
 expr1);
 	  token = gfc_build_addr_expr (NULL_TREE, token);
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 243613)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,10 @@
+2016-12-13  Andre Vehreschild  
+
+	PR fortran/77785
+	* gfortran.dg/coarray_38.f90: Added expecting error message.
+	* gfortran.dg/coarray_41.f90: New test.
+	* gfortran.dg/coarray_class_2.f90: New test.
+
 2016-12-13 Carl Love  
 
 	* gcc.target/powerpc/builtins-3.c: Add new test of the test suite
Index: gcc/testsuite/gfortran.dg/coarray_38.f90
===
--- gcc/testsuite/gfortran.dg/coarray_38.f90	(Revision 243613)
+++ gcc/testsuite/gfortran.dg/coarray_38.f90	(Arbeitskopie)
@@ -92,7 +92,7 @@
 type t2
   class(t), allocatable :: 

Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2016-12-13 Thread Jeff Law

On 12/13/2016 03:52 AM, Marek Polacek wrote:

On Mon, Dec 12, 2016 at 06:36:16PM -0700, Martin Sebor wrote:

+/* Return true if the type of OP is signed, looking through any casts
+   to an unsigned type.  */
+
+static bool
+operand_signed_p (tree op)
+{
+  bitmap visited = NULL;
+  bool ret = operand_signed_p (op, );
+
+  if (visited)
+BITMAP_FREE (visited);


I think you can drop the if before BITMAP_FREE here.

Correct.
jeff


Re: [PATCH, rs6000] Add support for vec_pack and vec_sld built-ins

2016-12-13 Thread Carl E. Love
On Thu, 2016-12-08 at 21:03 -0600, Segher Boessenkool wrote:
> On Thu, Dec 08, 2016 at 08:28:22AM -0800, Carl E. Love wrote:
> > The following patch adds two more built-ins that are missing.
> > Specifically:
> > 
> > vector floatvec_packvector doublevector double
> > vector double   vec_sld vector doublevector double
> > 
> > The patch has been boot strapped and tested on
> > powerpc64le-unknown-linux-gnu (Power 8) and on 
> > powerpc64-unknown-linux-gnu (Power 7) with no regressions.
> 
> > 2016-12-08  Carl Love  
> > 
> >* config/rs6000/rs6000-c.c: Add built-in support for
> >vector float  vec_pack  vector double  vector double
> >vector double  vec_sld  vector double  vector double
> 
> Add some parens and commas here?
> 
> >* config/rs6000/rs6000.c: Add icode check for vsldoi_v2df to allow
> >4-bit unsigned literal.
> >* config/rs6000/rs6000-builtin.def: Add definition for VSLDOI_2DF
> >* doc/extend.texi: Update the built-in documentation file for the
> >new powerpc vec_pack and vec_sld built-ins.
> 
> Okay for trunk with that.  Thanks,

Segher:

The patch has been committed with the requested changes to the ChangeLog
message.

Revision 243612 - Directory Listing
Modified Tue Dec 13 16:18:36 2016 UTC (77 seconds ago) by carll

gcc/ChangeLog:

2016-12-13  Carl Love  

   * config/rs6000/rs6000-c.c: Add built-in support for
   vector float vec_pack (vector double, vector double)
   vector double vec_sld (vector double, vector double)
   * config/rs6000/rs6000.c: Add icode check for vsldoi_v2df to allow
   4-bit unsigned literal.
   * config/rs6000/rs6000-builtin.def: Add definition for VSLDOI_2DF
   * doc/extend.texi: Update the built-in documentation file for the
   new powerpc vec_pack and vec_sld built-ins.

gcc/testsuite/ChangeLog:

2016-12-13 Carl Love  

* gcc.target/powerpc/builtins-3.c: Add new test of the test suite
file.
* gcc.target/powerpc/builtins-3-p8.c: Add new test of the test suite
file.


  Carl Love



Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 04:21:56PM +0100, Martin Liška wrote:
> gcc/ChangeLog:
> 
> 2016-12-12  Martin Liska  
> 
>   * sanopt.c (sanopt_optimize_walker): Set contains_asan_mark.
>   (sanopt_optimize): Add new argument.
>   (sanitize_asan_mark_unpoison): New function.
>   (maybe_contains_asan_check): Likewise.
>   (sanitize_asan_mark_poison): Likewise.
>   (pass_sanopt::execute): Call the new functions.

Ok, thanks.

Jakub


Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions

2016-12-13 Thread Martin Liška
On 12/13/2016 01:29 PM, Jakub Jelinek wrote:
> On Tue, Dec 13, 2016 at 01:12:34PM +0100, Martin Liška wrote:
 +  gimple_stmt_iterator gsi;
 +  bool finish = false;
 +  for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev ())
 +  {
 +gimple *stmt = gsi_stmt (gsi);
 +if (maybe_contains_asan_check (stmt))
 +  {
 +bitmap_set_bit (with_check, bb->index);
 +finish = true;
>>>
>>> Why the finish var and separate if (finish) break; ?  Perhaps from the first
>>> iteration when you used a switch?  Just doing break; should be enough.
>>
>> Well, I verified that even the first iteration (of patch) needed if(finish) 
>> break;
>> as I need to jump to next iteration of FOR_EACH_BB_FN (bb, cfun).
> 
> I fail to see functional difference between
>   for (...)
> {
>   ...
>   bool finish = false;
>   for (...)
> {
>   ...
>   if (...)
>   {
> ...
> finish = true;
>   }
> if (finish)
>   break;
> }
> }
> and
>   for (...)
> {
>   ...
>   for (...)
> {
>   ...
>   if (...)
>   {
> ...
> break;
>   }
> }
> }
> just the latter is not obfuscated.  The break is in both cases in the same
> for loop.

Sorry for that, I really overlooked that. It was obfuscated.

> 
>>> Don't you need also release_defs (stmt); here (and in the other gsi_remove
>>> spot)?
>>
>> As I remove only internal function calls that do not define a SSA name then 
>> not:
>>   ASAN_MARK (UNPOISON, _char, 1);
>>   ASAN_MARK (POISON, _char, 1);
> 
> If they have a vdef, then I believe they do define a SSA name (the vdef).
> I think unlink_stmt_vdef does not release the vdef SSA_NAME if any.

Yep, fixed in the patch.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Martin
 

> 
>   Jakub
> 

>From e313f08405ef7bb6597df64773348a3d04b3ab4e Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 12 Dec 2016 15:22:05 +0100
Subject: [PATCH] Add sanopt for ASAN_MARK poison and unpoison.

gcc/ChangeLog:

2016-12-12  Martin Liska  

	* sanopt.c (sanopt_optimize_walker): Set contains_asan_mark.
	(sanopt_optimize): Add new argument.
	(sanitize_asan_mark_unpoison): New function.
	(maybe_contains_asan_check): Likewise.
	(sanitize_asan_mark_poison): Likewise.
	(pass_sanopt::execute): Call the new functions.
---
 gcc/sanopt.c | 206 +--
 1 file changed, 202 insertions(+), 4 deletions(-)

diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index 320e14e9421..7fe8d32e39a 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "backend.h"
 #include "tree.h"
 #include "gimple.h"
+#include "ssa.h"
 #include "tree-pass.h"
 #include "tree-ssa-operands.h"
 #include "gimple-pretty-print.h"
@@ -37,7 +38,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-phinodes.h"
 #include "ssa-iterators.h"
 
-
 /* This is used to carry information about basic blocks.  It is
attached to the AUX field of the standard CFG block.  */
 
@@ -160,8 +160,10 @@ struct sanopt_ctx
 
   /* Number of IFN_ASAN_CHECK statements.  */
   int asan_num_accesses;
-};
 
+  /* True when the current functions constains an ASAN_MARK.  */
+  bool contains_asan_mark;
+};
 
 /* Return true if there might be any call to free/munmap operation
on any path in between DOM (which should be imm(BB)) and BB.  */
@@ -582,6 +584,9 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx)
 	if (!remove)
 	  ctx->asan_num_accesses++;
 	break;
+	  case IFN_ASAN_MARK:
+	ctx->contains_asan_mark = true;
+	break;
 	  default:
 	break;
 	  }
@@ -620,10 +625,11 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx)
 /* Try to remove redundant sanitizer checks in function FUN.  */
 
 static int
-sanopt_optimize (function *fun)
+sanopt_optimize (function *fun, bool *contains_asan_mark)
 {
   struct sanopt_ctx ctx;
   ctx.asan_num_accesses = 0;
+  ctx.contains_asan_mark = false;
 
   /* Set up block info for each basic block.  */
   alloc_aux_for_blocks (sizeof (sanopt_info));
@@ -638,6 +644,7 @@ sanopt_optimize (function *fun)
 
   free_aux_for_blocks ();
 
+  *contains_asan_mark = ctx.contains_asan_mark;
   return ctx.asan_num_accesses;
 }
 
@@ -671,18 +678,201 @@ public:
 
 }; // class pass_sanopt
 
+/* Sanitize all ASAN_MARK unpoison calls that are not reachable by a BB
+   that contains an ASAN_MARK poison.  All these ASAN_MARK unpoison call
+   can be removed as all variables are unpoisoned in a function prologue.  */
+
+static void
+sanitize_asan_mark_unpoison (void)
+{
+  /* 1) Find all BBs that contain an ASAN_MARK poison call.  */
+  auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1);
+  bitmap_clear (with_poison);
+  

Re: [PATCH] Add ISA 3.0 PowerPC support for VEXTU{B,H,W}{L,R}X instructions

2016-12-13 Thread Michael Meissner
This patch should address the comments in the last patch.

I have tested this patch with bootstrap builds and make check regression tests
on a little endian Power8 64-bit system and a big endian Power7 32/64-bit
system with no regressions.  Can I check this into the trunk?

[gcc]
2016-12-13  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_split_vec_extract_var): On ISA
3.0/power9, add support to use the VEXTU{B,H,W}{L,R}X extract
instructions.
* config/rs6000/vsx.md (VSr2): Add IEEE 128-bit floating point
type constraint registers.
(VSr3): Likewise.
(FL_CONV): New mode iterator for binary floating types that have a
direct conversion from 64-bit integer to floating point.
(vsx_extract__p9): Add support for the ISA 3.0/power9
VEXTU{B,H,W}{L,R}X extract instructions.
(vsx_extract__p9 splitter): Add splitter to load up the
extract byte position into the GPR if we are using the
VEXTU{B,H,W}{L,R}X extract instructions.
(vsx_extract__di_p9): Support extracts to GPRs.
(vsx_extract__store_p9): Support extracting to GPRs so that
we can use reg+offset address instructions.
(vsx_extract__var): Support extracts to GPRs.
(vsx_extract___var): New combiner
insn to combine vector extracts with zero_extend.
(vsx_ext__fl_): Optimize
extracting a small integer vector element and converting it to a
floating point type.
(vsx_ext__ufl_): Likewise.

[gcc/testsuite]
2016-12-13  Michael Meissner  

* gcc/testsuite/gcc.target/powerpc/vec-extract.h: If DO_TRACE is
defined, add tracing of the various extracts to stderr.  Add
support for tests that convert the result to another type.
* gcc/testsuite/gcc.target/powerpc/vec-extract-v2df.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/vec-extract-v4sf.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/vec-extract-v4si-df.c: Add new
tests that do an extract and then convert the values double.
* gcc/testsuite/gcc.target/powerpc/vec-extract-v4siu-df.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-df.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/vec-extract-v16qi-df.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/vec-extract-v8hiu-df.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/vec-extract-v8hi-df.c: Likewise.
* gcc.target/powerpc/p9-extract-1.c: Update test to check for
VEXTU{B,H,W}{L,R}X instructions being generated by default instead
of VEXTRACTU{B,H} and XXEXTRACTUW.
* gcc.target/powerpc/p9-extract-3.c: New test for combination of
vec_extract and convert to floating point.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 243590)
+++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000) (working copy)
@@ -7519,6 +7519,52 @@ rs6000_split_vec_extract_var (rtx dest, 
 {
   int bit_shift = byte_shift + 3;
   rtx element2;
+  int dest_regno = regno_or_subregno (dest);
+  int src_regno = regno_or_subregno (src);
+  int element_regno = regno_or_subregno (element);
+
+  /* See if we want to generate VEXTU{B,H,W}{L,R}X if the destination is in
+a general purpose register.  */
+  if (TARGET_P9_VECTOR
+ && (mode == V16QImode || mode == V8HImode || mode == V4SImode)
+ && INT_REGNO_P (dest_regno)
+ && ALTIVEC_REGNO_P (src_regno)
+ && INT_REGNO_P (element_regno))
+   {
+ rtx dest_si = gen_rtx_REG (SImode, dest_regno);
+ rtx element_si = gen_rtx_REG (SImode, element_regno);
+
+ if (mode == V16QImode)
+   emit_insn (VECTOR_ELT_ORDER_BIG
+  ? gen_vextublx (dest_si, element_si, src)
+  : gen_vextubrx (dest_si, element_si, src));
+
+ else if (mode == V8HImode)
+   {
+ rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH
+   ? dest_si
+   : gen_rtx_REG (SImode, REGNO (tmp_gpr)));
+ emit_insn (gen_ashlsi3 (tmp_gpr_si, element_si, const1_rtx));
+ emit_insn (VECTOR_ELT_ORDER_BIG
+? gen_vextuhlx (dest_si, tmp_gpr_si, src)
+: gen_vextuhrx (dest_si, tmp_gpr_si, src));
+   }
+
+
+ else
+   {
+ rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH
+   ? dest_si
+   : gen_rtx_REG (SImode, REGNO (tmp_gpr)));
+ emit_insn 

Re: [PATCH, fortran, pr77785, v3] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990

2016-12-13 Thread Janus Weil
Hi Andre,

> thanks for the input on the missing testcase, Janus (btw, when you know where 
> to
> get a new crystal ball, let me know; I am missing mine, too). The new version
> of the patch adds a new testcase coarray_41.f90 to test that the compiler
> compiles correctly and the test runs ok.
>
> Bootstrapped and regtested on x86_64-linux/f23. Ok for trunk?

yes, good for trunk then.

Cheers,
Janus



> On Tue, 13 Dec 2016 12:11:50 +0100
> Janus Weil  wrote:
>
>> Hi Andre,
>>
>> > all the sanitizer issues I fixed occur during compiling the testsuite. So I
>> > would say, that when with the patch these errors do not occur anymore while
>> > processing the testsuite, then those are tested for, right?
>>
>> aah, so you're saying that hunk is not actually related to the PR in
>> the subject line, but instead fixes a testsuite failure seen with a
>> sanitized compiler? That wasn't mentioned anywhere and sadly I forgot
>> to bring my crystal ball ...
>>
>> Cheers,
>> Janus
>>
>>
>>
>> > On Mon, 12 Dec 2016 13:37:43 +0100
>> > Janus Weil  wrote:
>> >
>> >> Hi Andre,
>> >>
>> >> > the attached patch corrects reporting of "Sorry, unimplemented yet" for
>> >> > allocatable and pointer components in polymorphic objects (BT_CLASS) 
>> >> > thus
>> >> > fixing two ICEs reported in the PR.
>> >> >
>> >> > The next chunk fixes an ICE when the declaration containing the token
>> >> > information is of type POINTER or REFERENCE.
>> >> >
>> >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk?
>> >>
>> >> the resolve.c hunk is certainly ok. The trans-array.c part looks
>> >> reasonable as well, but I wonder if it is actually covered by any of
>> >> your test cases? Since they are all compile-only, with errors being
>> >> thrown at resolution stage, do they even get to the translation stage?
>> >>
>> >> Cheers,
>> >> Janus
>> >
>> >
>> > --
>> > Andre Vehreschild * Email: vehre ad gmx dot de
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de


[PATCH] Dump number of loops in a function plus fixup state

2016-12-13 Thread Richard Biener

I've found this useful several times.  Note that currently
number_of_loops returns loops plus freed ones, the patch prepares
to change that, adjusting number_of_loops callers to use
vec_safe_length if they really want the maximum allocated loop
number.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-12-13  Richard Biener  

* cfgloopmanip.c (place_new_loop): Use vec_safe_length to
get at the maximum allocated loop number.
* loop-init.c (fix_loop_structure): Likewise.
* tree-cfg.c (fixup_loop_arrays_after_move): Use place_new_loop.
(dump_function_to_file): Dump the number of loops in the function
as well as whether they need fixup.

Index: gcc/cfgloopmanip.c
===
--- gcc/cfgloopmanip.c  (revision 243599)
+++ gcc/cfgloopmanip.c  (working copy)
@@ -430,7 +430,7 @@ remove_path (edge e, bool *irred_invalid
 void
 place_new_loop (struct function *fn, struct loop *loop)
 {
-  loop->num = number_of_loops (fn);
+  loop->num = vec_safe_length (loops_for_fn (fn)->larray);
   vec_safe_push (loops_for_fn (fn)->larray, loop);
 }
 
Index: gcc/loop-init.c
===
--- gcc/loop-init.c (revision 243599)
+++ gcc/loop-init.c (working copy)
@@ -253,7 +258,7 @@ fix_loop_structure (bitmap changed_bbs)
 
   /* Remember the number of loops so we can return how many new loops
  flow_loops_find discovered.  */
-  old_nloops = number_of_loops (cfun);
+  old_nloops = vec_safe_length (current_loops->larray);
 
   /* Re-compute loop structure in-place.  */
   flow_loops_find (current_loops);
@@ -321,7 +327,7 @@ fix_loop_structure (bitmap changed_bbs)
 
   timevar_pop (TV_LOOP_INIT);
 
-  return number_of_loops (cfun) - old_nloops;
+  return vec_safe_length (current_loops->larray) - old_nloops;
 }
 
 /* The RTL loop superpass.  The actual passes are subpasses.  See passes.c for
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 243597)
+++ gcc/tree-cfg.c  (working copy)
@@ -7095,8 +7095,7 @@ fixup_loop_arrays_after_move (struct fun
   (*get_loops (fn1))[loop->num] = NULL;
 
   /* Place it in the new loop array, assigning it a new number.  */
-  loop->num = number_of_loops (fn2);
-  vec_safe_push (loops_for_fn (fn2)->larray, loop);
+  place_new_loop (fn2, loop);
 
   /* Recurse to children.  */
   for (loop = loop->inner; loop; loop = loop->next)
@@ -7561,7 +7560,14 @@ dump_function_to_file (tree fndecl, FILE
   fprintf (file, " __GIMPLE ()\n%s (", function_name (fun));
 }
   else
-fprintf (file, "%s %s(", function_name (fun), tmclone ? "[tm-clone] " : 
"");
+{
+  if (loops_for_fn (fun))
+   fprintf (file, ";; %d loops found%s\n", number_of_loops (fun),
+loops_for_fn (fun)->state & LOOPS_NEED_FIXUP
+? " (fixup required)" : "");
+  fprintf (file, "%s %s(", function_name (fun),
+  tmclone ? "[tm-clone] " : "");
+}
 
   arg = DECL_ARGUMENTS (fndecl);
   while (arg)


Re: RFC: Allow emergency EH pool size to be controlled (for stage 1)

2016-12-13 Thread Ville Voutilainen
On 13 December 2016 at 16:52, Richard Biener  wrote:
> On Tue, Dec 13, 2016 at 3:45 PM, Ville Voutilainen
>  wrote:
>> On 13 December 2016 at 16:42, Richard Biener  
>> wrote:
>>> Also maybe N should be the number of exception objects rather
>>> than bytes?  Otherwise a target independent N is hard to specify
>>> for say, a distribution that wants either a different default or a
>>> static buffer.
>>
>>
>> But how to specify the number of exception objects that may have
>> different sizes?
>
> We have EMERGENCY_OBJ_SIZE for this (yeah, some guesstimate)
> which can be easily target configured in libstdc++ itself.  We possibly
> should split it into a sizeof () of the exception header plus some
> payload guesstimate.


Ah, yes, that's what the patch does with the default-size when there
is nothing in the environment to customize
the value. Makes sense.

I don't know whether a separate byte-wise count would be useful, I
must admit I have no field experience with such
tunings.


[PATCH] Fix PR78788

2016-12-13 Thread Richard Biener

The following patch fixes PR78788.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-12-13  Richard Biener  

PR tree-optimization/78788
* tree-vrp.c (set_value_range): Allow [-INF(OVF), +INF(OVF)].
(set_and_canonicalize_value_range): Do not drop the above to
VARYING.

* gcc.dg/torture/pr78788.c: New testcase.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 243599)
+++ gcc/tree-vrp.c  (working copy)
@@ -365,10 +365,6 @@ set_value_range (value_range *vr, enum v
 
   cmp = compare_values (min, max);
   gcc_assert (cmp == 0 || cmp == -1 || cmp == -2);
-
-  if (needs_overflow_infinity (TREE_TYPE (min)))
-   gcc_assert (!is_overflow_infinity (min)
-   || !is_overflow_infinity (max));
 }
 
   if (flag_checking
@@ -506,14 +502,9 @@ set_and_canonicalize_value_range (value_
 }
 }
 
-  /* Drop [-INF(OVF), +INF(OVF)] to varying.  */
-  if (needs_overflow_infinity (TREE_TYPE (min))
-  && is_overflow_infinity (min)
-  && is_overflow_infinity (max))
-{
-  set_value_range_to_varying (vr);
-  return;
-}
+  /* Do not drop [-INF(OVF), +INF(OVF)] to varying.  (OVF) has to be sticky
+ to make sure VRP iteration terminates, otherwise we can get into
+ oscillations.  */
 
   set_value_range (vr, t, min, max, equiv);
 }

Index: gcc/testsuite/gcc.dg/torture/pr78788.c
===
--- gcc/testsuite/gcc.dg/torture/pr78788.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr78788.c  (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+
+int a;
+long b;
+long c;
+void d()
+{
+  int e = 0;
+  for (; b; b++)
+if (c)
+  {
+   e++;
+   e++;
+  }
+  while (e)
+a = e -= 2;
+}


Re: RFC: Allow emergency EH pool size to be controlled (for stage 1)

2016-12-13 Thread Richard Biener
On Tue, Dec 13, 2016 at 3:45 PM, Ville Voutilainen
 wrote:
> On 13 December 2016 at 16:42, Richard Biener  
> wrote:
>> Also maybe N should be the number of exception objects rather
>> than bytes?  Otherwise a target independent N is hard to specify
>> for say, a distribution that wants either a different default or a
>> static buffer.
>
>
> But how to specify the number of exception objects that may have
> different sizes?

We have EMERGENCY_OBJ_SIZE for this (yeah, some guesstimate)
which can be easily target configured in libstdc++ itself.  We possibly
should split it into a sizeof () of the exception header plus some
payload guesstimate.

Richard.


Re: [PATCH] omp-low.c split

2016-12-13 Thread Cesar Philippidis
On 12/13/2016 04:42 AM, Martin Jambor wrote:

>> And this as well.  But omp-grid.c is fine too.
> 
> ...I prefer omp-grid.c because I plan to use gridification also for
> GCN targets, though hopefully only as an optimization rather than a
> hard requirement ...and in fact I still think it is a good
> optimization of simple loops for execution on all CUDA-like
> environments with block/thread grids because it removes conditions
> which the run-time can handle better.

Regarding gridification, is your cauldron talk from 2015 still current,
or have there been some significant changes?

When we first started with OpenACC we were using a lot of the existing
lower_omp_for* infrastructure handle ACC LOOPs. But there was a couple
of problems with that. First, the chunk partitioning caused a lot of
overhead, and second because of OpenACC execution model it made more
sense to write our own functions (lower_oacc_head_tail /
lower_oacc_reductions). In fact, during lowering gcc only marks where
the loops are. All of those markers get replaced and the loops get
optimized during the oaccdevlow pass which runs on the target compiler.

Right now one of the significant bottlenecks we're experiencing on nvptx
targets is with I/O. First, prior to launching a PTX kernel, libgomp
transfers each data mapping individually in a synchronous manner. I'm
debating whether it makes sense to pass in all of those data mappings to
the accelerator prior to the PTX kernel launch asynchronously, obviously
with an explicit synchronization barrier just prior to launching the kernel.

Another bottleneck involves firstprivate variables. Often, those
variables are 'scalars' and consequently, they shouldn't need explicit
data mappings. I noticed that Jakub introduced a special
GOMP_MAP_FIRSTPRIVATE_INT, which omits data mappings for integral types
with less than or equal precision to pointers. It would probably be
beneficial to expand this to reals.

The last observation is that OpenMP code in general passes a struct with
all of the data mappings to the various OMP regions/offloaded code.
That's fine, but for offloading targets, particularly nvptx, it would
probably be slightly more efficient if those OMP regions took actual
function arguments instead of a single struct. At least on nvptx
targets, in order to pass that struct to the accelerator, the runtime
must first allocate device memory for it, then copy all of the struct
contents to the device each time prior to launching a PTX kernel. A lot
of this could be bypassed because cuLaunchKernel accepts a variable
number of kernel arguments. Obviously, those arguments need to be
transferred to the accelerator one way or another, so I'm not sure yet
how beneficial this optimization would end up being.

To be clear, I'm not proposing any of these changes for gcc7. Any
changes to the above will go to gomp-4_0-branch first, then we'll port
them over to gcc8.

What type of performance problems are you experiencing with HSA?

Cesar


Re: RFC: Allow emergency EH pool size to be controlled (for stage 1)

2016-12-13 Thread Ville Voutilainen
On 13 December 2016 at 16:42, Richard Biener  wrote:
> Also maybe N should be the number of exception objects rather
> than bytes?  Otherwise a target independent N is hard to specify
> for say, a distribution that wants either a different default or a
> static buffer.


But how to specify the number of exception objects that may have
different sizes?


Re: [Patch, fortran] PR78737 - [OOP] linking error with deferred, undefined user-defined derived-type I/O

2016-12-13 Thread Janus Weil
Hi Paul,

> We got there! OK for trunk.

thanks. Unfortunately there was a problem with my latest patch, so
what I now committed as r243609 is basically your fixed version of my
draft patch (with some very minor adjustments).

Phew, we finally nailed it!

(My dtio_13 fix had seemed to work at some point, but in the end
apparently it didn't, so I discarded it.)


> This was a demonstration of the corollary of the "bon mot" from Barack
> Obama at the end of the message :-)

:)

Maybe we can even "make gfortran great again" (to paraphrase another
specimen of that category)  ;P


> Many thanks for finding the right path.

Thanks for your help on the way, and of course thanks to Damian for
triggering this whole discussion in the first place.

Cheers,
Janus



> On 13 December 2016 at 10:23, Janus Weil  wrote:
>> 2016-12-13 10:58 GMT+01:00 Janus Weil :
>>> 2016-12-13 10:42 GMT+01:00 Janus Weil :
> please find attached a version of your patch that runs all the dtio
> testcases successfully.

 Great, thanks a lot. Your addition of
 gfc_find_and_cut_at_last_class_ref is just what I was looking for
 right now ...

 If you don't mind I'll write a ChangeLog, add the proper tests and
 commit to trunk. Or do you prefer to take care of it yourself?
>>>
>>> Btw, to continue the brainstorming, I think I found a slightly better
>>> solution for the dtio_13 problem, which even removes the spurious
>>> error on that test case via a small fix in resolve_transfer. The
>>> attached patch is what I'd like to commit if you're ok with it.
>>
>> Finally, here is a complete patch, including testcase and ChangeLog
>> entries. Ok for trunk?
>>
>> Cheers,
>> Janus
>>
>>
>> 2016-12-13  Janus Weil  
>> Paul Thomas  
>>
>> PR fortran/78737
>> * gfortran.h (gfc_find_typebound_dtio_proc): New prototype.
>> * interface.c (gfc_compare_interfaces): Whitespace fix.
>> (gfc_find_typebound_dtio_proc): New function.
>> (gfc_find_specific_dtio_proc): Use it.
>> * resolve.c (resolve_transfer): Improve error recovery.
>> * trans-io.c (get_dtio_proc): Implement polymorphic calls to DTIO
>> procedures.
>>
>> 2016-12-13  Janus Weil  
>> Paul Thomas  
>>
>> PR fortran/78737
>> * gfortran.dg/dtio_13.f90: Remove spurious error.
>> * gfortran.dg/dtio_19.f90: New test case.
>
>
>
> --
> If you're walking down the right path and you're willing to keep
> walking, eventually you'll make progress.
>
> Barack Obama


Re: RFC: Allow emergency EH pool size to be controlled (for stage 1)

2016-12-13 Thread Richard Biener
On Tue, Dec 13, 2016 at 2:50 PM, Jonathan Wakely  wrote:
> This patch allows the size of the emergency buffer for exception
> handling to be controlled by a build-time macro (to avoid dynamic
> allocation) or by a run-time environment variable (to allocate a
> larger or smaller buffer).
>
> This will have to wait for the next stage 1 now, as it's too late for
> GCC 7, but this shows what I'm thinking about and so might avoid
> anybody else reinventing the wheel.
>
> The patch doesn't include documentation updates for the manual, which
> would be needed to explain the use of the macro and the env var. We
> could also add a --with-libstdcxx-static-eh-pool=N configure flag to
> make it easier to set the macro.

Looks reasonable.  But I wonder whether we'd want the default
arena size to be configurable as well, thus split static-eh-pool=N
into default-eh-pool-size=N and -static-eh-pool (without =N).

Also maybe N should be the number of exception objects rather
than bytes?  Otherwise a target independent N is hard to specify
for say, a distribution that wants either a different default or a
static buffer.

Richard.

>


Re: [PATCH] Fill bitregion_{start,end} in store_constructor (PR, tree-optimization/78428).

2016-12-13 Thread Richard Biener
On Tue, Dec 13, 2016 at 10:05 AM, Martin Liška  wrote:
> On 12/12/2016 12:10 PM, Eric Botcazou wrote:
>>> Ok. I'm sending a patch that put gcc_unreachable to places where either size
>>> or (and) offset is a non-constant. This survives regression tests
>>> (including ada) on x86_64-linux-gnu. Apart from that normal bootstrap +
>>> regression tests works fine on ppc64le-redhat-linux.
>>
>> I didn't manage to break it so it is OK by me.
>>
>
> Out of curiosity, I blame the commit which added the
>
> if (offset)
>   {
> machine_mode address_mode;
> rtx offset_rtx;
>
> offset
>   = SUBSTITUTE_PLACEHOLDER_IN_EXPR (offset,
> make_tree (TREE_TYPE (exp),
>target));
>
> and it comes to commit done in 1993:
>
> commit c869557a9ccc1bd3e5474b144bcb84065db23549
> Author: kenner 
> Date:   Mon Oct 4 01:48:03 1993 +
>
> (store_expr): Use expr_size value, not size_int.
> (store_constructor): Handle case of variable position and allow it to 
> contain
> a PLACEHOLDER_EXPR.
> (get_inner_reference): Make a WITH_RECORD_EXPR if required.
> (expand_expr, case PLACEHOLDER_EXPR, WITH_RECORD_EXPR): New cases.
> (expand_expr, case ARRAY_REF): Make WITH_RECORD_EXPR expressions when 
> needed.

Yeah, I expect that we got rid of the cases at the time we introduced
GIMPLE (and gimplification).

>
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@5584 
> 138bc75d-0d04-0410-961f-82ee72b054a4
>
> May I install the patch?

OK from my POV.

Thanks,
Richard.

> Martin
>


Re: [PR78365] ICE in determine_value_range, at tree-ssa-loo p-niter.c:413

2016-12-13 Thread Richard Biener
On Mon, Dec 12, 2016 at 8:57 AM, kugan
 wrote:
> Hi Richard,
>
>>> I am fine with the new patch but you'll need an approval from Honza
>>> or Richi.
>>>
>>> I find it a bit saddening that we cannot really rely on
>>> gimple_call_fntype but at least I do not see any other way.
>>
>>
>> The patch looks somewhat backward.  It populates the param type from
>> the callee but the only thing we really know is the _originating_ type
>> from the callers DECL_ARGUMENTS (if the JF is based on a parameter
>
>
> I am not sure I understood this. I think we get param_type from callees
> DECL_ARGUMENTS.
>
>> which is the only case where we do not know the type of the actual
>> value statically -- we only know the type we expect).
>>
>> So the type would be present only on IPA_JF_PASS_THROUGH JFs
>> (? does that also cover modified params?).
>>
>> At propagation time when applying the jump-function we have to make
>> sure to first convert the actual parameter value to that expected type
>> (or drop to BOTTOM if we can't do that conversion due to excess
>> mismatches).
>
>
> From ipa vrp point of view, that is what is being done with the patch (In
> this version, I also changed the POINTER_TYPE_P to use param_type). At the
> point we create ipa_compute_jump_functions_for_edge, we are covering the VR
> for arguments to param_type.
>
> In propagate_vr_accross_jump_function also, we are doing the conversion.

Hum, I think what your patch does is only half-way correct.  I'm
looking a bit more
into how the IPA propagation works and it looks like it's not the JF
where the type
needs to be stored but the lattice itself -- we need to know at propagation time
what type we expect for the arguments.  And if there's a mismatch with the
actual type flowing through the JF then we need to convert or punt.  And looking
closer that type should get known during propagation as at that time we should
have the actual parameter types as seen in the function body.

And ipa_get_callee_param_type should go away - just use

  jfunc->param_type = TREE_TYPE (arg);

(well, and no need to store that redundant information in the jump
function -- it's
there for all but pass-through JFs by means of the JF -- and pass-through JFs
get their type at propagation time from the source lattice values)

So we just need to get propagation correct (and not fiddle strangely with types
in JF building).

Richard.

> Thanks,
> Kugan


Re: [RFC][PATCH] Speed-up use-after-scope (re-writing to SSA)

2016-12-13 Thread Richard Biener
On Thu, Dec 8, 2016 at 1:51 PM, Martin Liška  wrote:
> On 12/02/2016 01:29 PM, Richard Biener wrote:
>> On Thu, Dec 1, 2016 at 5:30 PM, Martin Liška  wrote:
>>> On 11/23/2016 03:13 PM, Jakub Jelinek wrote:
 On Wed, Nov 23, 2016 at 02:57:07PM +0100, Martin Liška wrote:
> I started review process in libsanitizer: https://reviews.llvm.org/D26965
> And I have a question that was asked in the review: can we distinguish 
> between load and store
> in case of having usage of ASAN_POISON?

 I think with ASAN_POISON it is indeed just loads from after scope that can
 be caught, a store overwrites the variable with a new value and when 
 turning
 the store after we make the var no longer addressable into SSA form, we
 loose information about the out of scope store.  Furthermore, if there is
 first a store and then a read, like:
   if (argc != 12312)
 {
   char my_char;
   ptr = _char;
 }
   *ptr = i + 26;
   return *ptr;
 we don't notice even the read.  Not sure what could be done against that
 though.  I think we'd need to hook into the into-ssa framework, there it
 should know the current value of the variable at the point of the store is
 result of ASAN_POISON and be able to instead of turning that
   my_char = _23;
 into
   my_char_35 = _23;
 turn it into:
   my_char_35 = ASAN_POISON (_23);
 which would represent after scope store into my_char.

 Not really familiar with into-ssa though to know where to do it.

   Jakub

>>>
>>> Richi, may I ask you for help with this question?
>>
>> Probably where we handle the CLOBBER case (rewrite_stmt, maybe_register_def),
>> we do this for -Wuninitialized.
>>
>> Richard.
>
> Thanks for the tip, however as the optimization of memory address store + 
> load happens
> before we rewrite my_char into SSA, it would be probably hard to guess which 
> memory
> stores and loads should be preserved:
>
> use-after-scope-20.c.032t.ccp1:
> main (int argc, char * * argv)
> {
>   int my_char;
>   int * ptr;
>   int _1;
>   int _11;
>
>[0.0%]:
>   if (argc_4(D) != 12312)
> goto ; [0.0%]
>   else
> goto ; [0.0%]
>
>[0.0%]:
>   ASAN_MARK (2, _char, 4);
>   ptr_8 = _char;
>   ASAN_MARK (1, _char, 4);
>
>[0.0%]:
>   # ptr_2 = PHI 
>   _1 = argc_4(D) + 26;
>   *ptr_2 = _1;
>   _11 = *ptr_2;
>   return _11;
>
> }

The SSA renamer sees

   my_char = ASAN_MARK;
   ptr_8 = _char;
   my_char = ASAN_MARK;

?

It does perform a DOM walk when updating the stmts so simply registering the
appropriate current def should do the trick?

> I sent updated version of patch to LLVM phabricator:
> https://reviews.llvm.org/D26965
>
> Hopefully we can cherry pick the patch very soon to our trunk.
>
> M.
>
>>
>>> Thanks,
>>> Martin
>>>
>


Re: [PATCH] Tree-level fix for PR 69526

2016-12-13 Thread Richard Biener
On Wed, Dec 7, 2016 at 5:14 PM, Robin Dapp  wrote:
>> So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type)
>> produces (uint64_t)uint32 + -1U + 1.  This simply means that we cannot ignore
>> overflow of the inner operation and for some reason your change
>> to extract_range_from_binary_expr didn't catch this.  That is _8 + 4294967295
>> overflows but we ignored that.
>
> In this case the range of _8 was [1,1] so no overflow.
> I think the problem is rather about the interpretation of the inner
> constant. I tried discerning two cases now, a range split (i.e. when the
> single range of a binop variable becomes an anti range or two ranges
> after combining it with the binop's other range) and an overflow of the
> range's min and max.
> If the range didn't split, we can perform the simplification. If there
> was a min and max overflow, we have to interpret the inner constant as
> signed and perform a sign extension before converting it to the outer
> type. Without overflow we can use TYPE_SIGN (inner_type).
> Does this make sense?

I'm not sure there is anything to "interpret" -- the operation is unsigned
and overflow is when the operation may wrap around zero.  There might
be clever ways of re-writing the expression to
(uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1)
avoiding the overflow and thus allowing the transform but I'm not sure that's
good.

A related thing would be canonicalizing unsigned X plus CST to
unsigned X minus CST'
if CST' has a smaller absolute value than CST.  I think currently we
simply canonicalize
to 'plus CST' but also only in fold-const.c, not in match.pd (ah we
do, but only in a simplified manner).

That said, can we leave that "trick" out of the patch?  I think your
more complicated
"overflows" result from extract_range_from_binary_expr_1 doesn't apply to all
ops (like MULT_EXPR where more complicated cases can arise).

Richard.


> Included the remarks and attached the new version.
>
> Regards
>  Robin


Re: [PATCH, fortran, pr77785, v3] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990

2016-12-13 Thread Andre Vehreschild
Hi Janus, hi all,

thanks for the input on the missing testcase, Janus (btw, when you know where to
get a new crystal ball, let me know; I am missing mine, too). The new version
of the patch adds a new testcase coarray_41.f90 to test that the compiler
compiles correctly and the test runs ok.

Bootstrapped and regtested on x86_64-linux/f23. Ok for trunk?

Regards,
Andre

On Tue, 13 Dec 2016 12:11:50 +0100
Janus Weil  wrote:

> Hi Andre,
> 
> > all the sanitizer issues I fixed occur during compiling the testsuite. So I
> > would say, that when with the patch these errors do not occur anymore while
> > processing the testsuite, then those are tested for, right?  
> 
> aah, so you're saying that hunk is not actually related to the PR in
> the subject line, but instead fixes a testsuite failure seen with a
> sanitized compiler? That wasn't mentioned anywhere and sadly I forgot
> to bring my crystal ball ...
> 
> Cheers,
> Janus
> 
> 
> 
> > On Mon, 12 Dec 2016 13:37:43 +0100
> > Janus Weil  wrote:
> >  
> >> Hi Andre,
> >>  
> >> > the attached patch corrects reporting of "Sorry, unimplemented yet" for
> >> > allocatable and pointer components in polymorphic objects (BT_CLASS) thus
> >> > fixing two ICEs reported in the PR.
> >> >
> >> > The next chunk fixes an ICE when the declaration containing the token
> >> > information is of type POINTER or REFERENCE.
> >> >
> >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk?  
> >>
> >> the resolve.c hunk is certainly ok. The trans-array.c part looks
> >> reasonable as well, but I wonder if it is actually covered by any of
> >> your test cases? Since they are all compile-only, with errors being
> >> thrown at resolution stage, do they even get to the translation stage?
> >>
> >> Cheers,
> >> Janus  
> >
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/fortran/ChangeLog:

2016-12-13  Andre Vehreschild  

PR fortran/77785
* resolve.c (resolve_symbol): Correct attr lookup to the _data
component.
* trans-array.c (gfc_alloc_allocatable_for_assignment): Indirect ref
pointers and references before retrieving the caf-token.

gcc/testsuite/ChangeLog:

2016-12-13  Andre Vehreschild  

PR fortran/77785
* gfortran.dg/coarray_38.f90: Added expecting error message.
* gfortran.dg/coarray_41.f90: New test.
* gfortran.dg/coarray_class_2.f90: New test.


diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index c7d872c..b610797 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -14044,8 +14044,8 @@ resolve_symbol (gfc_symbol *sym)
   if (flag_coarray == GFC_FCOARRAY_LIB && sym->ts.type == BT_CLASS
   && sym->ts.u.derived && CLASS_DATA (sym)
   && CLASS_DATA (sym)->attr.codimension
-  && (sym->ts.u.derived->attr.alloc_comp
-	  || sym->ts.u.derived->attr.pointer_comp))
+  && (CLASS_DATA (sym)->ts.u.derived->attr.alloc_comp
+	  || CLASS_DATA (sym)->ts.u.derived->attr.pointer_comp))
 {
   gfc_error ("Sorry, allocatable/pointer components in polymorphic (CLASS) "
 		 "type coarrays at %L are unsupported", >declared_at);
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 8753cbf..0cd83f4 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -9337,6 +9337,8 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
   if (token == NULL_TREE)
 	{
 	  tmp = gfc_get_tree_for_caf_expr (expr1);
+	  if (POINTER_TYPE_P (TREE_TYPE (tmp)))
+	tmp = build_fold_indirect_ref (tmp);
 	  gfc_get_caf_token_offset (_se, , NULL, tmp, NULL_TREE,
 expr1);
 	  token = gfc_build_addr_expr (NULL_TREE, token);
diff --git a/gcc/testsuite/gfortran.dg/coarray_38.f90 b/gcc/testsuite/gfortran.dg/coarray_38.f90
index c8011d4..04ef742 100644
--- a/gcc/testsuite/gfortran.dg/coarray_38.f90
+++ b/gcc/testsuite/gfortran.dg/coarray_38.f90
@@ -92,7 +92,7 @@ end type t
 type t2
   class(t), allocatable :: caf2[:]
 end type t2
-class(t), save, allocatable :: caf[:]
+class(t), save, allocatable :: caf[:] ! { dg-error "Sorry, allocatable/pointer components in polymorphic" }
 type(t) :: x
 type(t2) :: y
 
diff --git a/gcc/testsuite/gfortran.dg/coarray_41.f90 b/gcc/testsuite/gfortran.dg/coarray_41.f90
new file mode 100644
index 000..b62d8e4
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_41.f90
@@ -0,0 +1,29 @@
+! { dg-do run }
+! { dg-options "-fcoarray=lib -lcaf_single" }
+
+program coarray_41
+
+  integer, allocatable :: vec(:)[:,:]
+
+  allocate(vec(10)[2,*], source= 37)
+
+  if (.not. allocated(vec)) error stop
+
+  call foo(vec)
+
+  if (any(vec /= 42)) error stop
+
+  deallocate(vec)
+contains
+
+  subroutine foo(gv)
+
+integer, allocatable, intent(inout) :: gv(:)[:,:]
+integer, allocatable :: gvin(:)
+
+allocate(gvin, mold=gv)
+gvin = 5
+gv = gv + gvin
+ 

RFC: Allow emergency EH pool size to be controlled (for stage 1)

2016-12-13 Thread Jonathan Wakely

This patch allows the size of the emergency buffer for exception
handling to be controlled by a build-time macro (to avoid dynamic
allocation) or by a run-time environment variable (to allocate a
larger or smaller buffer).

This will have to wait for the next stage 1 now, as it's too late for
GCC 7, but this shows what I'm thinking about and so might avoid
anybody else reinventing the wheel.

The patch doesn't include documentation updates for the manual, which
would be needed to explain the use of the macro and the env var. We
could also add a --with-libstdcxx-static-eh-pool=N configure flag to
make it easier to set the macro.


commit 4e021613dcda14dfec5c88286f8faf3001afcd17
Author: Jonathan Wakely 
Date:   Tue Dec 13 13:23:48 2016 +

Allow pool size to be set by macro or env var

diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index 7d0fafa..be4679e 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -257,6 +257,7 @@ if $GLIBCXX_IS_NATIVE; then
 
   AC_CHECK_FUNCS(__cxa_thread_atexit_impl)
   AC_CHECK_FUNCS(aligned_alloc posix_memalign memalign _aligned_malloc)
+  AC_CHECK_FUNCS(secure_getenv)
 
   # For iconv support.
   AM_ICONV
diff --git a/libstdc++-v3/crossconfig.m4 b/libstdc++-v3/crossconfig.m4
index 8cc788c..c01213a 100644
--- a/libstdc++-v3/crossconfig.m4
+++ b/libstdc++-v3/crossconfig.m4
@@ -184,6 +184,7 @@ case "${host}" in
 GCC_CHECK_TLS
 AC_CHECK_FUNCS(__cxa_thread_atexit_impl)
 AC_CHECK_FUNCS(aligned_alloc posix_memalign memalign _aligned_malloc)
+AC_CHECK_FUNCS(secure_getenv)
 AM_ICONV
 ;;
   *-mingw32*)
diff --git a/libstdc++-v3/libsupc++/eh_alloc.cc b/libstdc++-v3/libsupc++/eh_alloc.cc
index d362e40..571fe6b 100644
--- a/libstdc++-v3/libsupc++/eh_alloc.cc
+++ b/libstdc++-v3/libsupc++/eh_alloc.cc
@@ -29,6 +29,7 @@
 #include 
 #if _GLIBCXX_HOSTED
 #include 
+#include 
 #endif
 #include 
 #include 
@@ -36,6 +37,14 @@
 #include 
 #include 
 
+#ifdef STATIC_EH_ALLOC_POOL_BYTES
+char* alloc_pool_arena(std::size_t& size)
+{
+  size = STATIC_EH_ALLOC_POOL_BYTES;
+  static alignas(void*) char arena[STATIC_EH_ALLOC_POOL_BYTES];
+  return arena;
+}
+#else
 #if _GLIBCXX_HOSTED
 using std::free;
 using std::malloc;
@@ -50,8 +59,6 @@ extern "C" void *memset (void *, int, std::size_t);
 
 using namespace __cxxabiv1;
 
-// ??? How to control these parameters.
-
 // Guess from the size of basic types how large a buffer is reasonable.
 // Note that the basic c++ exception header has 13 pointers and 2 ints,
 // so on a system with PSImode pointers we're talking about 56 bytes
@@ -73,6 +80,32 @@ using namespace __cxxabiv1;
 # define EMERGENCY_OBJ_COUNT	4
 #endif
 
+char *alloc_pool_arena(std::size_t& size)
+{
+  size = 0;
+#if _GLIBCXX_HOSTED
+  const char *name = "GLIBCXX_EH_ARENA_SIZE";
+  char *env = nullptr;
+# ifdef _GLIBCXX_HAVE_SECURE_GETENV
+  env = ::secure_getenv (name);
+# else
+  env = std::getenv (name);
+# endif
+  if (env)
+{
+  char *end;
+  size = strtoul(env, , 0);
+  if (*end || (size == ULONG_MAX && errno == ERANGE))
+	size = 0;
+}
+#endif // _GLIBCXX_HOSTED
+  if (size == 0)
+size = (EMERGENCY_OBJ_SIZE * EMERGENCY_OBJ_COUNT
+	+ EMERGENCY_OBJ_COUNT * sizeof (__cxa_dependent_exception));
+  return (char *)malloc (size);
+}
+#endif // STATIC_EH_ALLOC_POOL_BYTES
+
 namespace __gnu_cxx
 {
   void __freeres();
@@ -107,7 +140,7 @@ namespace
   // The free-list
   free_entry *first_free_entry;
   // The arena itself - we need to keep track of these only
-  // to implement in_pool.
+  // to implement in_pool and __freeres.
   char *arena;
   std::size_t arena_size;
 
@@ -116,11 +149,8 @@ namespace
 
   pool::pool()
 {
-  // Allocate the arena - we could add a GLIBCXX_EH_ARENA_SIZE environment
-  // to make this tunable.
-  arena_size = (EMERGENCY_OBJ_SIZE * EMERGENCY_OBJ_COUNT
-		+ EMERGENCY_OBJ_COUNT * sizeof (__cxa_dependent_exception));
-  arena = (char *)malloc (arena_size);
+  // Allocate the arena.
+  arena = alloc_pool_arena (arena_size);
   if (!arena)
 	{
 	  // If the allocation failed go without an emergency pool.
@@ -255,11 +285,13 @@ namespace __gnu_cxx
   void
   __freeres()
   {
+#ifndef STATIC_EH_ALLOC_POOL_BYTES
 if (emergency_pool.arena)
   {
 	::free(emergency_pool.arena);
 	emergency_pool.arena = 0;
   }
+#endif
   }
 }
 


RE: [PATCH, testsuite] MIPS: Skip msa-builtins-err.c if there is no assembly output.

2016-12-13 Thread Matthew Fortune
Toma Tabacu  writes:
> >
> > It's a shame this is the only way to deal with this but I see aarch64
> > have to resort to the same thing for error checking tests.
> >
> 
> I have looked into this some more and I 've found that the solution I
> proposed is incomplete.
> 
> The problem is that if no linker plugin is found, -fno-fat-lto-objects
> is not passed on, so the test isn't skipped and it will fail because -
> flto doesn't do assembly generation by default and the errors will not
> be triggered.
> 
> This can be fixed by adding -ffat-lto-objects as a test option for error
> tests, as shown in the patch below.
> 
> The thing is, this already happens for scan-assembler & friends behind-
> the-scenes, but not for dg-error, because the former are run through
> force_conventional_output, while the latter is not.
> 
> A nicer solution would be to have a new directive which would do nothing
> except for the force_conventional_output part, thus forcing assembly
> generation, but this may be overkill.
> 
> Regards,
> Toma
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/mips/msa-builtins-err.c (dg-options): Use
>-ffat-lto-objects to guarantee assembly generation.

OK, thanks for investigating. Slight tweak to the changelog we just say
what changed in the changelog not why. You covered 'why' in the code
comment. This is just:

(dg-options): Add -ffat-lto-objects option.

Thanks,
Matthew


Re: [PATCH] omp-low.c split

2016-12-13 Thread Alexander Monakov
On Tue, 13 Dec 2016, Martin Jambor wrote:
> I have bootstrapped the two patches on aarch64-linux and bootstrapped
> and tested them on x86_64-linux.  What do you think?

Sorry for my 'false alarm' about cp/parser.c conflict in the previous mail -- I
thought I was applying your patch to trunk, but now I see my tree was outdated.

In the new patch there are 7 whitespace issues that git complains about:

zcat 0001-Split-omp-low-into-multiple-files.patch.gz |
  git apply --check --whitespace=error-all -

:2779: space before tab in indent.
# BLOCK 2 (PAR_ENTRY_BB)
:5391: space before tab in indent.
true, GSI_SAME_STMT);
:8174: space before tab in indent.
 after a stmt, not before.  */
:9105: space before tab in indent.
  GOMP_atomic_start ();
:9106: space before tab in indent.
  *addr = rhs;
:9107: space before tab in indent.
  GOMP_atomic_end ();
:10327: space before tab in indent.
 region.  */
error: 7 lines add whitespace errors.


A couple of typos in the Changelog:

>   (is_combined_parallel): kMoved to omp-expand.c.

s/k//

>   * config/nvptx/nvptx.c: Include omp-generic.c.

omp-general.h :)

Alexander


Re: [PATCH] omp-low.c split

2016-12-13 Thread Martin Jambor
Hi,

On Tue, Dec 13, 2016 at 12:43:16PM +0100, Jakub Jelinek wrote:
> On Tue, Dec 13, 2016 at 12:39:01PM +0100, Thomas Schwinge wrote:
> > On Fri, 9 Dec 2016 14:08:21 +0100, Martin Jambor  wrote:
> > > this is the promised attempt at splitting omp-low.c [...]
> > 
> > Yay!  \o/
> > 
> > I have not yet had a chance to review/test this patch, but I plan to.
> > 
> > A few initial comments from the "bike shed departement"; I understand in
> > GCC sources it will not be easy to rename stuff (such as files) later, so
> > we should get the names agreed upon early:
> > 
> > Generally, I agree with your division of "omp-low.c" parts.
> > 
> > >   - move everything that is part of pass_oacc_device_lower,
> > > pass_omp_device_lower and pass_omp_target_link to a new file
> > > omp-device.h,
> > 
> > Should we call this file "omp-offload.c", as offloading is what this
> > deals with, is the term we agreed to generally use (as far as I can
> > tell)?
> 
> That would be fine with me too.

OK, will do.

> 
> > >   - move all pre-lowering gridification stuff to a new file
> > > omp-grid.c.  [...]
> > 
> > Is that code generic enough to not call this file "omp-hsa.c" or similar?
> 

Not at the moment, but...

> And this as well.  But omp-grid.c is fine too.

...I prefer omp-grid.c because I plan to use gridification also for
GCN targets, though hopefully only as an optimization rather than a
hard requirement ...and in fact I still think it is a good
optimization of simple loops for execution on all CUDA-like
environments with block/thread grids because it removes conditions
which the run-time can handle better.

> 
> > >   - I moved stuff that was used from all over the place to a new file
> > > omp-general.c (unless it would mean exposing omp_region or
> > > omp_context types).
> > 
> > I'd have called that simply "omp.c".
> 
> The problem with that is that the corresponding header can't be called
> omp.h for obvious reasons, we already have one with very different meaning.
> 

That is exactly the reason why I chose omp-general. 

Thanks,

Martin


Re: [PATCH] Add pretty printer for ASAN_MARK and add a helper fn

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 01:14:00PM +0100, Martin Liška wrote:
> On 12/13/2016 10:05 AM, Jakub Jelinek wrote:
> > Ok.  But the builtins should be renamed too (incrementally),
> > BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
> > should really be BUILT_IN_ASAN_POISON_STACK_MEMORY etc.
> > 
> > Jakub
> 
> This is follow-up that I've just tested and reg-bootstrapped.
> 
> Ready for trunk?
> Thanks,
> Martin

> >From 8fd18d8fde8f2e3a10812698c37d601c2a52aee7 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Tue, 13 Dec 2016 10:19:48 +0100
> Subject: [PATCH 1/2] Rename BUILT_IN_ASAN_CLOBBER_N to
>  BUILT_IN_ASAN_POISON_STACK_MEMORY.
> 
> gcc/ChangeLog:
> 
> 2016-12-13  Martin Liska  
> 
>   * asan.c (asan_expand_mark_ifn): Use renamed
>   BUILT_IN_ASAN_{UN}CLOBBER_N to BUILT_IN_ASAN_{UN}POISON_STACK_MEMORY.
>   * sanitizer.def: Likewise.

Ok, thanks.

Jakub


Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 01:12:34PM +0100, Martin Liška wrote:
> >> +  gimple_stmt_iterator gsi;
> >> +  bool finish = false;
> >> +  for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev ())
> >> +  {
> >> +gimple *stmt = gsi_stmt (gsi);
> >> +if (maybe_contains_asan_check (stmt))
> >> +  {
> >> +bitmap_set_bit (with_check, bb->index);
> >> +finish = true;
> > 
> > Why the finish var and separate if (finish) break; ?  Perhaps from the first
> > iteration when you used a switch?  Just doing break; should be enough.
> 
> Well, I verified that even the first iteration (of patch) needed if(finish) 
> break;
> as I need to jump to next iteration of FOR_EACH_BB_FN (bb, cfun).

I fail to see functional difference between
  for (...)
{
  ...
  bool finish = false;
  for (...)
{
  ...
  if (...)
{
  ...
  finish = true;
}
  if (finish)
break;
}
}
and
  for (...)
{
  ...
  for (...)
{
  ...
  if (...)
{
  ...
  break;
}
}
}
just the latter is not obfuscated.  The break is in both cases in the same
for loop.

> > Don't you need also release_defs (stmt); here (and in the other gsi_remove
> > spot)?
> 
> As I remove only internal function calls that do not define a SSA name then 
> not:
>   ASAN_MARK (UNPOISON, _char, 1);
>   ASAN_MARK (POISON, _char, 1);

If they have a vdef, then I believe they do define a SSA name (the vdef).
I think unlink_stmt_vdef does not release the vdef SSA_NAME if any.

Jakub


Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 05:41:09PM +0530, Prathamesh Kulkarni wrote:
> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c
> @@ -,6 +,90 @@ handle_char_store (gimple_stmt_iterator *gsi)
>return true;
>  }
>  
> +/* Try to fold strstr (s, t) eq/ne s to memcmp (s, t, strlen (t)) eq/ne 0.  
> */
> +
> +static void
> +fold_strstr_to_memcmp (enum tree_code code, tree rhs1, tree rhs2, gimple 
> *stmt)

You can drop code argument here, see below.  And I'd say it is better to
do the
  if (TREE_CODE (rhs1) != SSA_NAME || TREE_CODE (rhs2) != SSA_NAME)
return;
here than repeat it in all the callers.

> +   if (gimple_assign_rhs_code (stmt) == COND_EXPR)
> + {
> +   tree cond = gimple_assign_rhs1 (stmt);
> +   TREE_SET_CODE (cond, code);

TREE_CODE (cond) is already code, so no need to set it again.

> +   gcond *cond = as_a (stmt);
> +   gimple_cond_set_lhs (cond, memcmp_lhs);
> +   gimple_cond_set_rhs (cond, zero);
> +   gimple_cond_set_code (cond, code);

And gimple_cond_code (cond) == code here too.

> +   update_stmt (cond);
> + }

You can perhaps move the update_stmt (stmt); to a single spot after
all the 3 cases are handled.

> + if (cond_code == EQ_EXPR || cond_code == NE_EXPR)
> +   {
> + tree rhs1 = TREE_OPERAND (cond, 0);
> + tree rhs2 = TREE_OPERAND (cond, 1);

While it is necessary to check cond_code here and in the other spots
similarly, because otherwise you don't know if it has 2 arguments etc.,
you can avoid the SSA_NAME tests here.

> + if (TREE_CODE (rhs1) == SSA_NAME
> + && TREE_CODE (rhs2) == SSA_NAME)
> +   fold_strstr_to_memcmp (cond_code, rhs1, rhs2, stmt);
> +   }
> +   }
> + else if (code == EQ_EXPR || code == NE_EXPR)
> +   {
> + tree rhs1 = gimple_assign_rhs1 (stmt);
> + tree rhs2 = gimple_assign_rhs2 (stmt);
> +
> + if (TREE_CODE (rhs1) == SSA_NAME
> + && TREE_CODE (rhs2) == SSA_NAME)

And here.
> +   fold_strstr_to_memcmp (code, rhs1, rhs2, stmt);
> +   }
> +  }
> +else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))
>   {
> tree type = TREE_TYPE (lhs);
> if (TREE_CODE (type) == ARRAY_TYPE)
> @@ -2316,6 +2427,17 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
>   }
>   }
>  }
> +  else if (gcond *cond = dyn_cast (stmt))
> +{
> +  enum tree_code code = gimple_cond_code (cond);
> +  tree lhs = gimple_cond_lhs (stmt);
> +  tree rhs = gimple_cond_rhs (stmt);
> +
> +  if ((code == EQ_EXPR || code == NE_EXPR)
> +   && TREE_CODE (lhs) == SSA_NAME
> +   && TREE_CODE (rhs) == SSA_NAME)

And here.
> + fold_strstr_to_memcmp (code, lhs, rhs, stmt);
> +}
>  
>if (gimple_vdef (stmt))
>  maybe_invalidate (stmt);

Otherwise LGTM, but it would be nice to cover also the COND_EXPR case by a
testcase (can be done incrementally).

Jakub


Re: [PATCH] Add pretty printer for ASAN_MARK and add a helper fn

2016-12-13 Thread Martin Liška
On 12/13/2016 10:05 AM, Jakub Jelinek wrote:
> Ok.  But the builtins should be renamed too (incrementally),
> BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
> should really be BUILT_IN_ASAN_POISON_STACK_MEMORY etc.
> 
>   Jakub

This is follow-up that I've just tested and reg-bootstrapped.

Ready for trunk?
Thanks,
Martin
>From 8fd18d8fde8f2e3a10812698c37d601c2a52aee7 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 13 Dec 2016 10:19:48 +0100
Subject: [PATCH 1/2] Rename BUILT_IN_ASAN_CLOBBER_N to
 BUILT_IN_ASAN_POISON_STACK_MEMORY.

gcc/ChangeLog:

2016-12-13  Martin Liska  

	* asan.c (asan_expand_mark_ifn): Use renamed
	BUILT_IN_ASAN_{UN}CLOBBER_N to BUILT_IN_ASAN_{UN}POISON_STACK_MEMORY.
	* sanitizer.def: Likewise.
---
 gcc/asan.c| 5 +++--
 gcc/sanitizer.def | 6 --
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index e297784270d..53acff0a2fb 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2838,8 +2838,9 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter)
   gsi_insert_before (iter, g, GSI_SAME_STMT);
   tree sz_arg = gimple_assign_lhs (g);
 
-  tree fun = builtin_decl_implicit (is_poison ? BUILT_IN_ASAN_CLOBBER_N
-	: BUILT_IN_ASAN_UNCLOBBER_N);
+  tree fun
+	= builtin_decl_implicit (is_poison ? BUILT_IN_ASAN_POISON_STACK_MEMORY
+ : BUILT_IN_ASAN_UNPOISON_STACK_MEMORY);
   g = gimple_build_call (fun, 2, base_addr, sz_arg);
   gimple_set_location (g, loc);
   gsi_insert_after (iter, g, GSI_NEW_STMT);
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index 3db08a7b702..43e46f90e9c 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -165,9 +165,11 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
 		  "__asan_after_dynamic_init",
 		  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
-DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
+DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POISON_STACK_MEMORY,
+		  "__asan_poison_stack_memory",
 		  BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
-DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNCLOBBER_N, "__asan_unpoison_stack_memory",
+DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNPOISON_STACK_MEMORY,
+		  "__asan_unpoison_stack_memory",
 		  BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
 
 /* Thread Sanitizer */
-- 
2.11.0



Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions

2016-12-13 Thread Martin Liška
On 12/13/2016 10:17 AM, Jakub Jelinek wrote:
> On Tue, Dec 13, 2016 at 09:56:00AM +0100, Martin Liška wrote:
>> @@ -671,18 +678,203 @@ public:
>>  
>>  }; // class pass_sanopt
>>  
> 
> Please add a short function comment here...
> 
>> +static void
>> +sanitize_asan_mark_unpoison (void)
>> +{
>> +  /* 1) Find all BBs that contain an ASAN_MARK poison call.  */
>> +  auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1);
>> +  bitmap_clear (with_poison);
>> +  basic_block bb;
> 
> ... and here.

Done.

> 
>> +static void
>> +sanitize_asan_mark_poison (void)
>> +{
>> +  /* 1) Find all BBs that possibly contain an ASAN_CHECK.  */
>> +  auto_sbitmap with_check (last_basic_block_for_fn (cfun) + 1);
>> +  bitmap_clear (with_check);
>> +  basic_block bb;
> 
>> +  FOR_EACH_BB_FN (bb, cfun)
>> +{
>> +  if (bitmap_bit_p (with_check, bb->index))
>> +continue;
> 
> When would this happen?  It seems you only set it for the current bb,
> it doesn't seem that you'd try to process the same bb multiple times.

That's not needed, probably an artifact from very first version.

> 
>> +  gimple_stmt_iterator gsi;
>> +  bool finish = false;
>> +  for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev ())
>> +{
>> +  gimple *stmt = gsi_stmt (gsi);
>> +  if (maybe_contains_asan_check (stmt))
>> +{
>> +  bitmap_set_bit (with_check, bb->index);
>> +  finish = true;
> 
> Why the finish var and separate if (finish) break; ?  Perhaps from the first
> iteration when you used a switch?  Just doing break; should be enough.

Well, I verified that even the first iteration (of patch) needed if(finish) 
break;
as I need to jump to next iteration of FOR_EACH_BB_FN (bb, cfun).

> 
>> +  else if (asan_mark_p (stmt, ASAN_MARK_POISON))
>> +{
>> +  if (dump_file)
>> +fprintf (dump_file, "Removing ASAN_MARK poison\n");
>> +  unlink_stmt_vdef (stmt);
>> +  gsi_remove (, true);
> 
> Don't you need also release_defs (stmt); here (and in the other gsi_remove
> spot)?

As I remove only internal function calls that do not define a SSA name then not:
  ASAN_MARK (UNPOISON, _char, 1);
  ASAN_MARK (POISON, _char, 1);

Ready to install?
M.

> 
>   Jakub
> 

>From 9335008e783155e131ca1dec1ef468f371cfcd1e Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 12 Dec 2016 15:22:05 +0100
Subject: [PATCH] Add sanopt for ASAN_MARK poison and unpoison.

gcc/ChangeLog:

2016-12-12  Martin Liska  

	* sanopt.c (sanopt_optimize_walker): Set contains_asan_mark.
	(sanopt_optimize): Add new argument.
	(sanitize_asan_mark_unpoison): New function.
	(maybe_contains_asan_check): Likewise.
	(sanitize_asan_mark_poison): Likewise.
	(pass_sanopt::execute): Call the new functions.
---
 gcc/sanopt.c | 210 ++-
 1 file changed, 207 insertions(+), 3 deletions(-)

diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index 320e14e9421..3aebb1621c9 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -160,8 +160,10 @@ struct sanopt_ctx
 
   /* Number of IFN_ASAN_CHECK statements.  */
   int asan_num_accesses;
-};
 
+  /* True when the current functions constains an ASAN_MARK.  */
+  bool contains_asan_mark;
+};
 
 /* Return true if there might be any call to free/munmap operation
on any path in between DOM (which should be imm(BB)) and BB.  */
@@ -582,6 +584,9 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx)
 	if (!remove)
 	  ctx->asan_num_accesses++;
 	break;
+	  case IFN_ASAN_MARK:
+	ctx->contains_asan_mark = true;
+	break;
 	  default:
 	break;
 	  }
@@ -620,10 +625,11 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx)
 /* Try to remove redundant sanitizer checks in function FUN.  */
 
 static int
-sanopt_optimize (function *fun)
+sanopt_optimize (function *fun, bool *contains_asan_mark)
 {
   struct sanopt_ctx ctx;
   ctx.asan_num_accesses = 0;
+  ctx.contains_asan_mark = false;
 
   /* Set up block info for each basic block.  */
   alloc_aux_for_blocks (sizeof (sanopt_info));
@@ -638,6 +644,7 @@ sanopt_optimize (function *fun)
 
   free_aux_for_blocks ();
 
+  *contains_asan_mark = ctx.contains_asan_mark;
   return ctx.asan_num_accesses;
 }
 
@@ -671,18 +678,207 @@ public:
 
 }; // class pass_sanopt
 
+/* Sanitize all ASAN_MARK unpoison calls that are not reachable by a BB
+   that contains an ASAN_MARK poison.  All these ASAN_MARK unpoison call
+   can be removed as all variables are unpoisoned in a function prologue.  */
+
+static void
+sanitize_asan_mark_unpoison (void)
+{
+  /* 1) Find all BBs that contain an ASAN_MARK poison call.  */
+  auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1);
+  bitmap_clear (with_poison);
+  basic_block bb;
+
+  FOR_EACH_BB_FN (bb, cfun)
+{
+  if (bitmap_bit_p (with_poison, bb->index))
+	continue;
+
+  gimple_stmt_iterator gsi;
+  for (gsi = gsi_last_bb (bb); !gsi_end_p 

Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known

2016-12-13 Thread Prathamesh Kulkarni
On 13 December 2016 at 15:27, Jakub Jelinek  wrote:
> On Tue, Dec 13, 2016 at 03:08:17PM +0530, Prathamesh Kulkarni wrote:
>> Thanks for the suggestions. It didn't occur to me to check for gimple_cond.
>> I have tried to do the changes in the attached version.
>> I am not sure if I have handled cond_expr correctly.
>> IIUC, if gimple_assign has code cond_expr, then the condition is
>> stored in gimple_assign_rhs1,
>> however it's not a single operand but a tree of the form "op1 cond_code op2".
>> Is that correct ?
>
> Yes.  gimple_assign_rhs1 will be in what you are looking for EQ_EXPR or
> NE_EXPR tree, its TREE_CODE will be this code you want to check, and
> TREE_OPERAND (exp, 0) and TREE_OPERAND (exp, 1) the rhs1 and rhs2 you use
> elsewhere.
>
>> However I am not able to write a test-case that generates cond_expr in the 
>> IL.
>> I tried:
>> t1 = strstr (s, t);
>> (t1 == s)  ? foo() : bar ();
>> and other such variants but it seems the ?: operator is getting
>> lowered to gimple_cond instead.
>
> It is, but in some cases tree-if-conv.c turns them back into COND_EXPRs.
> I guess you need -ftree-loop-if-convert now, and it has to be in some loop
> where the addition of cond_expr would likely turn it into a single bb loop.
> You probably want constants or vars, not function calls in the ? :
> expressions though.
>
>> +/* Try to fold strstr (s, t) == s to memcmp (s, t, strlen (t)) == 0.  */
>> +
>> +static void
>> +fold_strstr_to_memcmp(enum tree_code code, tree rhs1, tree rhs2, gimple 
>> *stmt)
>
> Formatting, space before (.
>
>> +{
>> +  gimple *call_stmt = NULL;
>> +  for (int pass = 0; pass < 2; pass++)
>> +{
>> +  gimple *g = SSA_NAME_DEF_STMT (rhs1);
>> +  if (g
>
> I think g should be always non-NULL (except for invalid IL), so probably no
> need to check it.
Ah indeed, thanks for pointing out. I assumed if ssa-var has default definition,
then SSA_NAME_DEF_STMT would be NULL, but it's GIMPLE_NOP.
>
>> +   && gimple_call_builtin_p (g, BUILT_IN_STRSTR)
>> +   && has_single_use (rhs1)
>> +   && gimple_call_arg (as_a (g), 0) == rhs2)
>
> I think gimple_call_arg works fine even with just gimple * argument.
> So you can avoid the as_a (g) uglification and just use g.
>
>> +   if (is_gimple_assign (stmt))
>> + {
>> +   if (gimple_assign_rhs_code (stmt) == COND_EXPR)
>> + {
>> +   tree cond = gimple_assign_rhs1 (stmt);
>> +   TREE_SET_CODE (cond, EQ_EXPR);
>
> This looks weird.  You are hardcoding EQ_EXPR, while for the
> other case below you use code.  So, do you handle properly both
> EQ_EXPR and NE_EXPR for this and gimple_cond cases?
> Also, for non-COND_EXPR assign you build a new stmt instead of reusing
> the existing one, why?
>
>> +   TREE_OPERAND (cond, 0) = memcmp_lhs;
>> +   TREE_OPERAND (cond, 1) = zero;
>> +   update_stmt (stmt);
>> + }
>> +   else
>> + {
>> +   gsi = gsi_for_stmt (stmt);
>> +   tree lhs = gimple_assign_lhs (stmt);
>> +   gassign *ga = gimple_build_assign (lhs, code, memcmp_lhs,
>> +  zero);
>> +   gsi_replace (, ga, false);
>> + }
>> + }
>> +   else
>> + {
>> +   gcond *cond = as_a (stmt);
>> +   gimple_cond_set_lhs (cond, memcmp_lhs);
>> +   gimple_cond_set_rhs (cond, zero);
>> +   gimple_cond_set_code (cond, EQ_EXPR);
>
> Likewise here.
Oops, sorry about that :/
Does this version look OK ?
Bootstrap+test in progress.

Thanks,
Prathamesh
>
> Jakub
2016-12-13  Jakub Jelinek  
Prathamesh Kulkarni  

* tree-ssa-strlen.c (fold_strstr_to_memcmp): New function.
(strlen_optimize_stmt): Call fold_strstr_to_memcmp.

testsuite/
* gcc.dg/strlenopt-30.c: New test-case.

diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c 
b/gcc/testsuite/gcc.dg/strlenopt-30.c
new file mode 100644
index 000..089b3a2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-30.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__((no_icf))
+_Bool f1(char *s)
+{
+  return __builtin_strstr (s, "hello") == s;
+}
+
+__attribute__((no_icf))
+_Bool f2(char *s)
+{
+  return s == __builtin_strstr (s, "hello");
+}
+
+__attribute__((no_icf))
+_Bool f3(char *s)
+{
+  return s != __builtin_strstr (s, "hello");
+}
+
+__attribute__((no_icf))
+_Bool f4()
+{
+  char *foo_f4(void);
+  char *t1 = foo_f4();
+  char *t2 = __builtin_strstr (t1, "hello");
+  _Bool t3 = t2 == t1;
+  return t3;
+}
+
+__attribute__((no_icf))
+void f5(char *s)
+{
+  char *t1 = __builtin_strstr (s, "hello");
+  void foo_f5(void);
+  if (t1 != s)
+foo_f5();
+}
+
+/* Do not perform 

Re: [PATCH] [AArch64] Implement popcount pattern

2016-12-13 Thread Kyrill Tkachov

Hi Naveen,

On 13/12/16 11:51, Hurugalawadi, Naveen wrote:

Hi Kyrill,

Thanks for reviewing the patch and your useful comments.


looks good to me if it has gone through the normal required
bootstrap and testing, but I can't approve.

Bootstrapped and Regression Tested on aarch64-thunderx-linux.


The rest of the MD file uses the term AdvSIMD. Also, the instrurction
is CNT rather than "pop count".

Done.


__builtin_popcount takes an unsigned int, so this should be
scanning for absence of popcountsi2 instead?

Done.

Please find attached the modified patch as per review comments
and let me know if its okay for Stage-1 or current branch.


This looks much better, thanks.
I still have a minor nit about the testcase.

+long
+foo1 (int x)
+{
+  return __builtin_popcountl (x);
+}

On ILP32 systems this would still use the SImode patterns, so I suggest you use 
__builtin_popcountll and
an unsigned long long return type to ensure you always exercise the 64-bit code.

+
+/* { dg-final { scan-assembler-not "popcount" } } */


This looks ok to me otherwise, but you'll need an ok from the aarch64 folk.
Kyrill


Thanks,
Naveen




Re: [PATCH] [AArch64] Implement popcount pattern

2016-12-13 Thread Hurugalawadi, Naveen
Hi Kyrill,

Thanks for reviewing the patch and your useful comments.

>> looks good to me if it has gone through the normal required
>> bootstrap and testing, but I can't approve.

Bootstrapped and Regression Tested on aarch64-thunderx-linux.

>> The rest of the MD file uses the term AdvSIMD. Also, the instrurction
>> is CNT rather than "pop count".

Done.

>> __builtin_popcount takes an unsigned int, so this should be 
>> scanning for absence of popcountsi2 instead?

Done.

Please find attached the modified patch as per review comments
and let me know if its okay for Stage-1 or current branch.

Thanks,
Naveendiff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 65eb326..0acb3f0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3785,6 +3785,39 @@
   }
 )
 
+;; Pop count be done via the "CNT" instruction in AdvSIMD.
+;;
+;; MOV	v.1d, x0
+;; CNT	v1.8b, v.8b
+;; ADDV b2, v1.8b
+;; MOV	w0, v2.b[0]
+
+(define_expand "popcount2"
+  [(match_operand:GPI 0 "register_operand")
+   (match_operand:GPI 1 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx v = gen_reg_rtx (V8QImode);
+  rtx v1 = gen_reg_rtx (V8QImode);
+  rtx r = gen_reg_rtx (QImode);
+  rtx in = operands[1];
+  rtx out = operands[0];
+  if(mode == SImode)
+{
+  rtx tmp;
+  tmp = gen_reg_rtx (DImode);
+  /* If we have SImode, zero extend to DImode, pop count does
+ not change if we have extra zeros. */
+  emit_insn (gen_zero_extendsidi2 (tmp, in));
+  in = tmp;
+}
+  emit_move_insn (v, gen_lowpart (V8QImode, in));
+  emit_insn (gen_popcountv8qi2 (v1, v));
+  emit_insn (gen_reduc_plus_scal_v8qi (r, v1));
+  emit_insn (gen_zero_extendqi2 (out, r));
+  DONE;
+})
+
 (define_insn "clrsb2"
   [(set (match_operand:GPI 0 "register_operand" "=r")
 (clrsb:GPI (match_operand:GPI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/popcnt.c b/gcc/testsuite/gcc.target/aarch64/popcnt.c
new file mode 100644
index 000..37cf4b9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/popcnt.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int x)
+{
+  return __builtin_popcount (x);
+}
+
+long
+foo1 (int x)
+{
+  return __builtin_popcountl (x);
+}
+
+/* { dg-final { scan-assembler-not "popcount" } } */
+/* { dg-final { scan-assembler-times "cnt\t" 2 } } */


Re: [PATCH, fortran, pr77785, v2] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990

2016-12-13 Thread Andre Vehreschild
Hi Janus,

no sorry. I mixed up the context. I thought your question was on pr78534. Sorry
for getting those two PRs mixed up. Just void my answer below. It is wrong. I
will see what I can do about a better testcase for the trans-array part. The
code responsible for the error unfortunately does not have a main program. So I
need to invent something.

- Andre



On Tue, 13 Dec 2016 12:11:50 +0100
Janus Weil  wrote:

> Hi Andre,
> 
> > all the sanitizer issues I fixed occur during compiling the testsuite. So I
> > would say, that when with the patch these errors do not occur anymore while
> > processing the testsuite, then those are tested for, right?  
> 
> aah, so you're saying that hunk is not actually related to the PR in
> the subject line, but instead fixes a testsuite failure seen with a
> sanitized compiler? That wasn't mentioned anywhere and sadly I forgot
> to bring my crystal ball ...
> 
> Cheers,
> Janus
> 
> 
> 
> > On Mon, 12 Dec 2016 13:37:43 +0100
> > Janus Weil  wrote:
> >  
> >> Hi Andre,
> >>  
> >> > the attached patch corrects reporting of "Sorry, unimplemented yet" for
> >> > allocatable and pointer components in polymorphic objects (BT_CLASS) thus
> >> > fixing two ICEs reported in the PR.
> >> >
> >> > The next chunk fixes an ICE when the declaration containing the token
> >> > information is of type POINTER or REFERENCE.
> >> >
> >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk?  
> >>
> >> the resolve.c hunk is certainly ok. The trans-array.c part looks
> >> reasonable as well, but I wonder if it is actually covered by any of
> >> your test cases? Since they are all compile-only, with errors being
> >> thrown at resolution stage, do they even get to the translation stage?
> >>
> >> Cheers,
> >> Janus  
> >
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object

2016-12-13 Thread Thomas Preudhomme
Sorry, ignore the first attachment (2_combine_profile_multilib.patch). i always 
miss that Thunderbird selects the first file in the in-review folder upfront.


Best regards,

Thomas

On 13/12/16 11:43, Thomas Preudhomme wrote:

Hi,

Fix in r242869 for PR77673 (bswap loads after end of object) applies cleanly on
GCC 6 and with trivial fix for GCC 5 (gimple * in GCC 6 -> gimple in GCC 5). The
backports also bootstrap fine on x86_64-linux-gnu and testsuite shows no
regression.

ChangeLog entries are as follow:


*** gcc/ChangeLog ***

2016-12-12  Thomas Preud'homme  

Backport from mainline
2016-11-25  Thomas Preud'homme  

PR tree-optimization/77673
* tree-ssa-math-opts.c (struct symbolic_number): Add new src field.
(init_symbolic_number): Initialize src field from src parameter.
(perform_symbolic_merge): Select most dominated statement as the
source statement.  Set src field of resulting n structure from the
input src with the lowest address.
(find_bswap_or_nop): Rename source_stmt into ins_stmt.
(bswap_replace): Rename src_stmt into ins_stmt.  Initially get source
of load from src field rather than insertion statement.  Cancel
optimization if statement analyzed is not dominated by the insertion
statement.
(pass_optimize_bswap::execute): Rename src_stmt to ins_stmt.  Compute
dominance information.


*** gcc/testsuite/ChangeLog ***

2016-12-12  Thomas Preud'homme  

Backport from mainline
2016-11-25  Thomas Preud'homme  

PR tree-optimization/77673
* gcc.dg/pr77673.c: New test.


Is this ok for gcc-5-branch and gcc-6-branch?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.dg/pr77673.c b/gcc/testsuite/gcc.dg/pr77673.c
new file mode 100644
index ..e03bcb9284d1e5a0e496cfa547fdbab630bec04f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr77673.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target fpic } } */
+/* { dg-require-effective-target bswap32 } */
+/* { dg-options "-O2 -fPIC -fdump-tree-bswap" } */
+/* { dg-additional-options "-march=z900" { target s390*-*-* } } */
+
+void mach_parse_compressed(unsigned char* ptr, unsigned long int* val)
+{
+  if (ptr[0] < 0xC0U) {
+*val = ptr[0] + ptr[1];
+return;
+  }
+
+  *val = ((unsigned long int)(ptr[0]) << 24)
+| ((unsigned long int)(ptr[1]) << 16)
+| ((unsigned long int)(ptr[2]) << 8)
+| ptr[3];
+}
+
+/* { dg-final { scan-tree-dump-not "load_dst_\\d+ =.* if \\(" "bswap" } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 61b65bb824b1e90ab13e3cb3ac1b4fbbc34a3b70..3882652dcfc0e04642196243a034f1f7c1405936 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1645,6 +1645,7 @@ struct symbolic_number {
   tree base_addr;
   tree offset;
   HOST_WIDE_INT bytepos;
+  tree src;
   tree alias_set;
   tree vuse;
   unsigned HOST_WIDE_INT range;
@@ -1746,6 +1747,7 @@ init_symbolic_number (struct symbolic_number *n, tree src)
   int size;
 
   n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
+  n->src = src;
 
   /* Set up the symbolic number N by setting each byte to a value between 1 and
  the byte size of rhs1.  The highest order byte is set to n->size and the
@@ -1859,6 +1861,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
   uint64_t inc;
   HOST_WIDE_INT start_sub, end_sub, end1, end2, end;
   struct symbolic_number *toinc_n_ptr, *n_end;
+  basic_block bb1, bb2;
 
   if (!n1->base_addr || !n2->base_addr
 	  || !operand_equal_p (n1->base_addr, n2->base_addr, 0))
@@ -1872,15 +1875,20 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
 	{
 	  n_start = n1;
 	  start_sub = n2->bytepos - n1->bytepos;
-	  source_stmt = source_stmt1;
 	}
   else
 	{
 	  n_start = n2;
 	  start_sub = n1->bytepos - n2->bytepos;
-	  source_stmt = source_stmt2;
 	}
 
+  bb1 = gimple_bb (source_stmt1);
+  bb2 = gimple_bb (source_stmt2);
+  if (dominated_by_p (CDI_DOMINATORS, bb1, bb2))
+	source_stmt = source_stmt1;
+  else
+	source_stmt = source_stmt2;
+
   /* Find the highest address at which a load is performed and
 	 compute related info.  */
   end1 = n1->bytepos + (n1->range - 1);
@@ -1937,6 +1945,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
   n->vuse = n_start->vuse;
   n->base_addr = n_start->base_addr;
   n->offset = n_start->offset;
+  n->src = n_start->src;
   n->bytepos = n_start->bytepos;
   n->type = n_start->type;
   size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
@@ -2147,7 +2156,7 @@ find_bswap_or_nop (gimple stmt, struct symbolic_number *n, bool *bswap)
   uint64_t cmpxchg = CMPXCHG;
   uint64_t cmpnop = CMPNOP;
 
-  gimple source_stmt;
+  gimple ins_stmt;
   int limit;
 
   

[PATCH, GCC, gcc-5/6-branch] Improve comment for struct symbolic_number in bswap pass

2016-12-13 Thread Thomas Preudhomme

Hi,

Fix in r242869 for PR77673 was accompanied with r242870 which updated the 
description of the struct symbolic_number modified by the previous patch. It did 
so by rewriting the comment completely but I believe this patch should be still 
backported to make the comment match the code.


ChangeLog entry is as follows:


*** gcc/ChangeLog ***

2016-12-12  Thomas Preud'homme  

Backport from mainline
2016-11-25  Thomas Preud'homme  

* tree-ssa-math-opts.c (struct symbolic_number): Improve comment.


Is this ok for gcc-5-branch and gcc-6-branch?

Best regards,

Thomas
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 3882652dcfc0e04642196243a034f1f7c1405936..84c6fc80bc5b980ca8ed8cca5ad2ab7775f4f65b 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1619,25 +1619,32 @@ make_pass_cse_sincos (gcc::context *ctxt)
   return new pass_cse_sincos (ctxt);
 }
 
-/* A symbolic number is used to detect byte permutation and selection
-   patterns.  Therefore the field N contains an artificial number
-   consisting of octet sized markers:
+/* A symbolic number structure is used to detect byte permutation and selection
+   patterns of a source.  To achieve that, its field N contains an artificial
+   number consisting of BITS_PER_MARKER sized markers tracking where does each
+   byte come from in the source:
 
-   0- target byte has the value 0
-   FF   - target byte has an unknown value (eg. due to sign extension)
-   1..size - marker value is the target byte index minus one.
+   0	   - target byte has the value 0
+   FF	   - target byte has an unknown value (eg. due to sign extension)
+   1..size - marker value is the byte index in the source (0 for lsb).
 
To detect permutations on memory sources (arrays and structures), a symbolic
-   number is also associated a base address (the array or structure the load is
-   made from), an offset from the base address and a range which gives the
-   difference between the highest and lowest accessed memory location to make
-   such a symbolic number. The range is thus different from size which reflects
-   the size of the type of current expression. Note that for non memory source,
-   range holds the same value as size.
-
-   For instance, for an array char a[], (short) a[0] | (short) a[3] would have
-   a size of 2 but a range of 4 while (short) a[0] | ((short) a[0] << 1) would
-   still have a size of 2 but this time a range of 1.  */
+   number is also associated:
+   - a base address BASE_ADDR and an OFFSET giving the address of the source;
+   - a range which gives the difference between the highest and lowest accessed
+ memory location to make such a symbolic number;
+   - the address SRC of the source element of lowest address as a convenience
+ to easily get BASE_ADDR + offset + lowest bytepos.
+
+   Note 1: the range is different from size as size reflects the size of the
+   type of the current expression.  For instance, for an array char a[],
+   (short) a[0] | (short) a[3] would have a size of 2 but a range of 4 while
+   (short) a[0] | ((short) a[0] << 1) would still have a size of 2 but this
+   time a range of 1.
+
+   Note 2: for non-memory sources, range holds the same value as size.
+
+   Note 3: SRC points to the SSA_NAME in case of non-memory source.  */
 
 struct symbolic_number {
   uint64_t n;
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index ac15e8179a3257d1e190086c8089bc85ed8552bf..6413bd6d1ae17d04e276d97c088497d6d334823c 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1925,25 +1925,32 @@ make_pass_cse_sincos (gcc::context *ctxt)
   return new pass_cse_sincos (ctxt);
 }
 
-/* A symbolic number is used to detect byte permutation and selection
-   patterns.  Therefore the field N contains an artificial number
-   consisting of octet sized markers:
+/* A symbolic number structure is used to detect byte permutation and selection
+   patterns of a source.  To achieve that, its field N contains an artificial
+   number consisting of BITS_PER_MARKER sized markers tracking where does each
+   byte come from in the source:
 
-   0- target byte has the value 0
-   FF   - target byte has an unknown value (eg. due to sign extension)
-   1..size - marker value is the target byte index minus one.
+   0	   - target byte has the value 0
+   FF	   - target byte has an unknown value (eg. due to sign extension)
+   1..size - marker value is the byte index in the source (0 for lsb).
 
To detect permutations on memory sources (arrays and structures), a symbolic
-   number is also associated a base address (the array or structure the load is
-   made from), an offset from the base address and a range which gives the
-   difference between the highest and lowest accessed memory location to make
-   such a symbolic number. The range is thus different from size which reflects
-   the 

[PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object

2016-12-13 Thread Thomas Preudhomme

Hi,

Fix in r242869 for PR77673 (bswap loads after end of object) applies cleanly on 
GCC 6 and with trivial fix for GCC 5 (gimple * in GCC 6 -> gimple in GCC 5). The 
backports also bootstrap fine on x86_64-linux-gnu and testsuite shows no regression.


ChangeLog entries are as follow:


*** gcc/ChangeLog ***

2016-12-12  Thomas Preud'homme  

Backport from mainline
2016-11-25  Thomas Preud'homme  

PR tree-optimization/77673
* tree-ssa-math-opts.c (struct symbolic_number): Add new src field.
(init_symbolic_number): Initialize src field from src parameter.
(perform_symbolic_merge): Select most dominated statement as the
source statement.  Set src field of resulting n structure from the
input src with the lowest address.
(find_bswap_or_nop): Rename source_stmt into ins_stmt.
(bswap_replace): Rename src_stmt into ins_stmt.  Initially get source
of load from src field rather than insertion statement.  Cancel
optimization if statement analyzed is not dominated by the insertion
statement.
(pass_optimize_bswap::execute): Rename src_stmt to ins_stmt.  Compute
dominance information.


*** gcc/testsuite/ChangeLog ***

2016-12-12  Thomas Preud'homme  

Backport from mainline
2016-11-25  Thomas Preud'homme  

PR tree-optimization/77673
* gcc.dg/pr77673.c: New test.


Is this ok for gcc-5-branch and gcc-6-branch?

Best regards,

Thomas
diff --git a/gcc/config.gcc b/gcc/config.gcc
index bfd1127d6e8e647ca8c3a57dd2d58b586dffe4a5..d7508e89de0e1d7a3027da0b04e4fa36c1e95fd0 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3710,34 +3710,18 @@ case "${target}" in
 		# Add extra multilibs
 		if test "x$with_multilib_list" != x; then
 			arm_multilibs=`echo $with_multilib_list | sed -e 's/,/ /g'`
-			case ${arm_multilibs} in
-			aprofile)
-# Note that arm/t-aprofile is a
-# stand-alone make file fragment to be
-# used only with itself.  We do not
-# specifically use the
-# TM_MULTILIB_OPTION framework because
-# this shorthand is more
-# pragmatic.
-tmake_profile_file="arm/t-aprofile"
-;;
-			rmprofile)
-# Note that arm/t-rmprofile is a
-# stand-alone make file fragment to be
-# used only with itself.  We do not
-# specifically use the
-# TM_MULTILIB_OPTION framework because
-# this shorthand is more
-# pragmatic.
-tmake_profile_file="arm/t-rmprofile"
-;;
-			default)
-;;
-			*)
-echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2
-exit 1
-;;
-			esac
+			if test "x${arm_multilibs}" != xdefault ; then
+for arm_multilib in ${arm_multilibs}; do
+	case ${arm_multilib} in
+	aprofile|rmprofile)
+		tmake_profile_file="arm/t-multilib"
+		;;
+	*)
+		echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2
+		exit 1
+		;;
+	esac
+done
 
 			if test "x${tmake_profile_file}" != x ; then
 # arm/t-aprofile and arm/t-rmprofile are only
diff --git a/gcc/config/arm/t-aprofile b/gcc/config/arm/t-aprofile
index 90305e1206e3964e08a673e385d3198747bdffa1..2cbd8e3c8e8bcd4bed6368bfea83ece953c8dbb4 100644
--- a/gcc/config/arm/t-aprofile
+++ b/gcc/config/arm/t-aprofile
@@ -24,30 +24,13 @@
 # have their default values during the configure step.  We enforce
 # this during the top-level configury.
 
-MULTILIB_OPTIONS =
-MULTILIB_DIRNAMES=
-MULTILIB_EXCEPTIONS  =
-MULTILIB_MATCHES =
-MULTILIB_REUSE	 =
+# Arch and FPU variants to build libraries with
 
-# We have the following hierachy:
-#   ISA: A32 (.) or T32 (thumb)
-#   Architecture: ARMv7-A (v7-a), ARMv7VE (v7ve), or ARMv8-A (v8-a).
-#   FPU: VFPv3-D16 (fpv3), NEONv1 (simdv1), VFPv4-D16 (fpv4),
-#NEON-VFPV4 (simdvfpv4), NEON for ARMv8 (simdv8), or None (.).
-#   Float-abi: Soft (.), softfp (softfp), or hard (hardfp).
+MULTI_ARCH_OPTS_A   = march=armv7-a/march=armv7ve/march=armv8-a
+MULTI_ARCH_DIRS_A   = v7-a v7ve v8-a
 
-MULTILIB_OPTIONS   += mthumb
-MULTILIB_DIRNAMES  += thumb
-
-MULTILIB_OPTIONS   += march=armv7-a/march=armv7ve/march=armv8-a
-MULTILIB_DIRNAMES  += v7-a v7ve v8-a
-
-MULTILIB_OPTIONS   += mfpu=vfpv3-d16/mfpu=neon/mfpu=vfpv4-d16/mfpu=neon-vfpv4/mfpu=neon-fp-armv8
-MULTILIB_DIRNAMES  += fpv3 simdv1 fpv4 simdvfpv4 simdv8
-
-MULTILIB_OPTIONS   += mfloat-abi=softfp/mfloat-abi=hard
-MULTILIB_DIRNAMES  += softfp hard
+MULTI_FPU_OPTS_A= mfpu=vfpv3-d16/mfpu=neon/mfpu=vfpv4-d16/mfpu=neon-vfpv4/mfpu=neon-fp-armv8
+MULTI_FPU_DIRS_A= fpv3 simdv1 fpv4 simdvfpv4 simdv8
 
 
 # Option combinations to build library with
@@ -71,7 +54,11 @@ MULTILIB_REQUIRED  += *march=armv8-a
 MULTILIB_REQUIRED  += *march=armv8-a/mfpu=neon-fp-armv8/mfloat-abi=*
 
 
+# Matches
+
 # CPU Matches

Re: [PATCH] omp-low.c split

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 12:39:01PM +0100, Thomas Schwinge wrote:
> On Fri, 9 Dec 2016 14:08:21 +0100, Martin Jambor  wrote:
> > this is the promised attempt at splitting omp-low.c [...]
> 
> Yay!  \o/
> 
> I have not yet had a chance to review/test this patch, but I plan to.
> 
> A few initial comments from the "bike shed departement"; I understand in
> GCC sources it will not be easy to rename stuff (such as files) later, so
> we should get the names agreed upon early:
> 
> Generally, I agree with your division of "omp-low.c" parts.
> 
> >   - move everything that is part of pass_oacc_device_lower,
> > pass_omp_device_lower and pass_omp_target_link to a new file
> > omp-device.h,
> 
> Should we call this file "omp-offload.c", as offloading is what this
> deals with, is the term we agreed to generally use (as far as I can
> tell)?

That would be fine with me too.

> >   - move all pre-lowering gridification stuff to a new file
> > omp-grid.c.  [...]
> 
> Is that code generic enough to not call this file "omp-hsa.c" or similar?

And this as well.  But omp-grid.c is fine too.

> >   - I moved stuff that was used from all over the place to a new file
> > omp-general.c (unless it would mean exposing omp_region or
> > omp_context types).
> 
> I'd have called that simply "omp.c".

The problem with that is that the corresponding header can't be called
omp.h for obvious reasons, we already have one with very different meaning.

Jakub


Re: [PATCH] omp-low.c split

2016-12-13 Thread Thomas Schwinge
Hi!

On Fri, 9 Dec 2016 14:08:21 +0100, Martin Jambor  wrote:
> this is the promised attempt at splitting omp-low.c [...]

Yay!  \o/

I have not yet had a chance to review/test this patch, but I plan to.

A few initial comments from the "bike shed departement"; I understand in
GCC sources it will not be easy to rename stuff (such as files) later, so
we should get the names agreed upon early:

Generally, I agree with your division of "omp-low.c" parts.

>   - move everything that is part of pass_oacc_device_lower,
> pass_omp_device_lower and pass_omp_target_link to a new file
> omp-device.h,

Should we call this file "omp-offload.c", as offloading is what this
deals with, is the term we agreed to generally use (as far as I can
tell)?

>   - move all pre-lowering gridification stuff to a new file
> omp-grid.c.  [...]

Is that code generic enough to not call this file "omp-hsa.c" or similar?

>   - I moved stuff that was used from all over the place to a new file
> omp-general.c (unless it would mean exposing omp_region or
> omp_context types).

I'd have called that simply "omp.c".

> I am opened to suggestions what to do differently, names of the file
> are for example of course subject to discussion, and I absolutely
> welcome any review and checking, for one I am not going to pretend I
> understand the stuff I put into omp-device.c.  If however there is
> consensus that we should do something like this, I would like to ask
> the community to freeze omp-low.c file until this gets committed, I
> hope you understand that I am afraid of any conflicts.

I very much understand...  :-| When I had worked on the very same thing
months ago, and my changes went without review/approval for a long time,
I had spent numerous hours on keeping my patch up to date.  So, I'm happy
to see that this is now near approval!  (Even though I don't understand
what's different now from when I worked on the same thing back then...)

I hope to have time later today to review/test your actual patch.


Grüße
 Thomas


Re: [PATCH] Optimiza aggregate a = b = c = {} (PR c/78408)

2016-12-13 Thread Jakub Jelinek
Hi!

Sorry for not getting to this earlier.

On Mon, Nov 28, 2016 at 10:50:26AM +0100, Richard Biener wrote:
> > +  else if (gimple_call_builtin_p (defstmt, BUILT_IN_MEMSET)
> > +  && TREE_CODE (gimple_call_arg (defstmt, 0)) == ADDR_EXPR
> > +  && TREE_CODE (gimple_call_arg (defstmt, 1)) == INTEGER_CST
> > +  && TREE_CODE (gimple_call_arg (defstmt, 2)) == INTEGER_CST)
> > +{
> > +  HOST_WIDE_INT ssize, max_size, off;
> > +  bool reverse;
> > +  src2 = TREE_OPERAND (gimple_call_arg (defstmt, 0), 0);
> > +  get_ref_base_and_extent (src2, , , _size, );
> > +  if (ssize != max_size
> > + || (ssize % BITS_PER_UNIT) != 0
> > + || !wi::eq_p (gimple_call_arg (defstmt, 2), ssize / BITS_PER_UNIT))
> > +   src2 = NULL_TREE;
> 
> I wonder why you jump through the hoops of get_ref_base_and_extent
> given the call args will be invariant addresses and thus
> get_addr_base_and_unit_offset would be more appropriate here.

get_addr_base_and_unit_offset does not give me the size though,
which is what I wanted to compute.  Even if as you suggest I'd
accept other INTEGER_CST sizes, it would still be better to punt
if the memset is clearly invalid.  And for the case where the
memset call is followed by assignment, not memcpy (very common, as
memcpy is often folded into the assignment), the verification needs
to be done.

> Also not sure why you want to restrict the size with the wi::eq_p
> (probably for the a = b case where the size isn't given explicitely
> but then you don't check whether off is 0 ...).  I'd say passing

But I'm not comparing the result of get_ref_base_and_extent, but
the argument as is.  Perhaps where it does above the src2 = NULL_TREE;
I could save the size into one var, off into another one and set
src2 to the result of get_ref_base_and_extent in that case, then
if those vars are set require the second stmt to be a memset and do the same
stuff there?

> in base, offset and size for src and dest into this function will
> simplify things and should allow to handle
> 
>   memset (p+10, 0, 24);
>   memcpy (q, p+10, 24);
> 
> if you compare bases with operand_equal_p.
> 
> > +  if (refs_may_alias_p (dest, src))
> > +return;
> 
> Why's that?

I admit I'm not sure if GIMPLE_ASSIGN may be between overlapping objects or
not, memset can't.

Jakub


Re: [PATCH, GCC/testsuite/ARM] Skip optional_mthumb tests if GCC has a default mode

2016-12-13 Thread Christophe Lyon
On 13 December 2016 at 12:18, Thomas Preudhomme
 wrote:
> On 13/12/16 10:11, Christophe Lyon wrote:
>>
>> On 13 December 2016 at 10:54, Thomas Preudhomme
>>  wrote:
>>>
>>> On 12/12/16 21:17, Christophe Lyon wrote:


 Hi Thomas,

 Thanks for working on this,


 On 12 December 2016 at 18:52, Thomas Preudhomme
  wrote:
>
>
> Hi,
>
> The logic to make -mthumb optional for Thumb-only devices is only
> executed
> when no -marm or -mthumb is given on the command-line. This includes
> configuring GCC with --with-mode= because this makes the option to be
> passed
> before any others. The optional_mthumb-* testcases are skipped when
> -marm
> or
> -mthumb is passed on the command line but not when GCC was configured
> with
> --with-mode. Not only are the tests meaningless in these
> configurations,
> they also spuriously FAIL if --with-mode=arm was used since the test
> are
> built for Thumb-only devices, as reported by Christophe Lyon in [1].
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html
>
> This patch adds logic to target-support.exp to check how was GCC
> configured
> and changes the optional_mthumb testcases to be skipped if there is a
> default mode (--with-mode=). It also fixes a couple of typo on the
> selectors.
>

 How hard would it be to skip these tests only if --with-mode=arm,
 such that they would still pass in configurations --with-mode=thumb?

 It seems easy to extend what you propose here, doesn't it?
>>>
>>>
>>>
>>> It is but IMO it gives a false sense of quality since it does not test
>>> the
>>> optional -mthumb logic.
>>>
>>> What is the motivation to make it run with --with-mode=thumb?
>>>
>>
>> With your patch, they will appear as unsupported, right?
>> It's not a big deal.
>
>
> yes indeed.
>
OK, that's fine if that's your intent.

>>
>> Do you know how most people configure their toolchains?
>> Linaro and Ubuntu (I think) use --with-mode=thumb.
>> Do most people use no --with-mode configure flag?
>
>
> I cannot answer for people in general but in our case we use
> --with-multilib-list which will build the toolchain for several -march,
> -mfpu and -mfloat combinations. But it's good to have a mix, it finds more
> issues in the testsuite as this very examples shows.
>
>>
>> In the validations I run against trunk, I always have --with-mode,
>> except for one arm-none-eabi configuration where I also use
>> default cpu/fpu.
>
>
> For good reason because without those it will compile target libraries
> without any option and so the default would be used (ie ARMv4T in default
> mode). Either you need to specify some --with-* or you built with multilib.
>
I keep this configuration with all defaults to make sure everything
still works for the default, armv4t indeed. And there are often problems
in the testsuite as you know, because people test with more modern
configurations.

>>
>> Maybe I should remove --with-mode=thumb from my config
>> --with-cpu=cortex-m3? (but I don't remember if the new logic
>> was backported to gcc-5/gcc-6 ?)
>
>
> No it wasn't backported.
>
> Best regards,
>
> Thomas


Re: [PATCH, GCC/testsuite/ARM] Skip optional_mthumb tests if GCC has a default mode

2016-12-13 Thread Thomas Preudhomme

On 13/12/16 10:11, Christophe Lyon wrote:

On 13 December 2016 at 10:54, Thomas Preudhomme
 wrote:

On 12/12/16 21:17, Christophe Lyon wrote:


Hi Thomas,

Thanks for working on this,


On 12 December 2016 at 18:52, Thomas Preudhomme
 wrote:


Hi,

The logic to make -mthumb optional for Thumb-only devices is only
executed
when no -marm or -mthumb is given on the command-line. This includes
configuring GCC with --with-mode= because this makes the option to be
passed
before any others. The optional_mthumb-* testcases are skipped when -marm
or
-mthumb is passed on the command line but not when GCC was configured
with
--with-mode. Not only are the tests meaningless in these configurations,
they also spuriously FAIL if --with-mode=arm was used since the test are
built for Thumb-only devices, as reported by Christophe Lyon in [1].

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html

This patch adds logic to target-support.exp to check how was GCC
configured
and changes the optional_mthumb testcases to be skipped if there is a
default mode (--with-mode=). It also fixes a couple of typo on the
selectors.



How hard would it be to skip these tests only if --with-mode=arm,
such that they would still pass in configurations --with-mode=thumb?

It seems easy to extend what you propose here, doesn't it?



It is but IMO it gives a false sense of quality since it does not test the
optional -mthumb logic.

What is the motivation to make it run with --with-mode=thumb?



With your patch, they will appear as unsupported, right?
It's not a big deal.


yes indeed.



Do you know how most people configure their toolchains?
Linaro and Ubuntu (I think) use --with-mode=thumb.
Do most people use no --with-mode configure flag?


I cannot answer for people in general but in our case we use 
--with-multilib-list which will build the toolchain for several -march, -mfpu 
and -mfloat combinations. But it's good to have a mix, it finds more issues in 
the testsuite as this very examples shows.




In the validations I run against trunk, I always have --with-mode,
except for one arm-none-eabi configuration where I also use
default cpu/fpu.


For good reason because without those it will compile target libraries without 
any option and so the default would be used (ie ARMv4T in default mode). Either 
you need to specify some --with-* or you built with multilib.




Maybe I should remove --with-mode=thumb from my config
--with-cpu=cortex-m3? (but I don't remember if the new logic
was backported to gcc-5/gcc-6 ?)


No it wasn't backported.

Best regards,

Thomas


Re: [patch, doc] Move -pthread documentation to linker options

2016-12-13 Thread Rainer Orth
Hi Sandra,

> On 12/11/2016 01:28 PM, Rainer Orth wrote:
>> Hi Sandra,
>>
>>> PR 16519 notes that -pthread has only ever been documented as an RS6000 and
>>> Solaris 2 option.  In fact it's supported by most/all(?) POSIX-flavored
>>> targets, including GNU/Linux, BSD variants, Darwin, etc. It's probably best
>>> to document it as a generic option, with the expectation that GCC supports
>>> -pthread if the underlying operating system or C library provides an
>>> implementation of the POSIX threads API.
>>>
>>> After scratching my head about it, it seemed that it's best categorized as
>>> a linker option even though it also affects the preprocessor on some
>>> targets.  I'll wait a day or two before committing the attached patch, in
>>> case anybody wants to argue that this is the wrong way to categorize it.
>>
>> I don't like categorizing it as a linker option: as you say, it affects
>> the preprocessor as well (adding -D_REENTRANT on most systems), and in
>> the past (not completely sure about the present) there were subtle bugs
>> if you forgot to add -pthread during compilation.
>
> Do you have a suggestion for a better place to put it?  Preferably *one*
> place, instead of duplicating the docs in 10+ places with target-specific
> options?

I just checked and couldn't find a really good place.  It has some
similarity to C Language Options, but doesn't fit since it doesn't
extend the language.  Unless you want to introduce a whole new section,
I think the best you can do is list it separately under Preprocessor and
Link Options.  I fully agree to not list it individually for each system
that currently supports it.  The better route would be to support it for
both compilation and linking on any system that needs either for
symmetry's sake, even if one of the two is just a no-op.  Everything
else is just a nightmarish user experience.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, fortran, pr77785, v2] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990

2016-12-13 Thread Janus Weil
Hi Andre,

> all the sanitizer issues I fixed occur during compiling the testsuite. So I
> would say, that when with the patch these errors do not occur anymore while
> processing the testsuite, then those are tested for, right?

aah, so you're saying that hunk is not actually related to the PR in
the subject line, but instead fixes a testsuite failure seen with a
sanitized compiler? That wasn't mentioned anywhere and sadly I forgot
to bring my crystal ball ...

Cheers,
Janus



> On Mon, 12 Dec 2016 13:37:43 +0100
> Janus Weil  wrote:
>
>> Hi Andre,
>>
>> > the attached patch corrects reporting of "Sorry, unimplemented yet" for
>> > allocatable and pointer components in polymorphic objects (BT_CLASS) thus
>> > fixing two ICEs reported in the PR.
>> >
>> > The next chunk fixes an ICE when the declaration containing the token
>> > information is of type POINTER or REFERENCE.
>> >
>> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk?
>>
>> the resolve.c hunk is certainly ok. The trans-array.c part looks
>> reasonable as well, but I wonder if it is actually covered by any of
>> your test cases? Since they are all compile-only, with errors being
>> thrown at resolution stage, do they even get to the translation stage?
>>
>> Cheers,
>> Janus
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de


Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2016-12-13 Thread Marek Polacek
On Mon, Dec 12, 2016 at 06:36:16PM -0700, Martin Sebor wrote:
> +/* Return true if the type of OP is signed, looking through any casts
> +   to an unsigned type.  */
> +
> +static bool
> +operand_signed_p (tree op)
> +{
> +  bitmap visited = NULL;
> +  bool ret = operand_signed_p (op, );
> +
> +  if (visited)
> +BITMAP_FREE (visited);

I think you can drop the if before BITMAP_FREE here.

Marek


Re: [PATCH] S/390: Run md tests with -march=native instead of -march=z13.

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 11:18:31AM +0100, Dominik Vogt wrote:
> > IMHO you want something like x86 avx_runtime effective target
> > (z13_runtime?), which would stand for running on z13 capable hw and
> > with z13 assembler support.
> 
> Something like that, yes, but it's not so easy because the kernel
> has to support it too.  Some features are disabled in a VM
> although the hardware supports them.  What we really need is

The z13_runtime or whatever effective target routine can always
just try to compile/link a simple testcase with at least one z13 specific
instruction and try to run it.

>   Run test if the test system (not just the hardware) supports the
>   instruction set of the -march= option the test was compiled
>   with, otherwise just compile it.
> 
> I.e. derive the "effective_targt..." option from the "-march=..."
> option set by the torture test.
> 
> > Or choose what options to include based on such effective target tests,
> > and perhaps also select a default action, like we do on some targets e.g. in
> > the vectorizer.
> 
> Can you give an example test file, please?

Look e.g. at gcc.dg/vect/vect.exp and testsuite/lib/*.exp for
dg-do-what-default and the games done with it.  E.g. if you set
(and saved/restored) in addition to dg-do-what-default also EFFECTIVE_TARGETS
then perhaps you could use et-dg-runtest or similar.

Jakub


Re: [PATCH] omp-low.c split

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 11:15:43AM +0100, Martin Jambor wrote:
> I have bootstrapped the two patches on aarch64-linux and bootstrapped
> and tested them on x86_64-linux.  What do you think?

Thanks a lot for the work.  If you wouldn't mind doing a couple of further
changes (see below), I'd appreciate it, but if you want to commit it right
away, I'm ok with that too.

@@ -4321,7 +4322,7 @@ expand_cilk_for (struct omp_region *region, struct 
omp_for_data *fd)
 
   tree child_fndecl
 = gimple_omp_parallel_child_fn (
-as_a  (last_stmt (region->outer->entry)));
+   as_a  (last_stmt (region->outer->entry)));
   tree t, low_val = NULL_TREE, high_val = NULL_TREE;
   for (t = DECL_ARGUMENTS (child_fndecl); t; t = TREE_CHAIN (t))
 {

My preference for the above would be

  gomp_parallel *par_stmt
= as_a  (last_stmt (region->outer->entry));
  tree child_fndecl = gimple_omp_parallel_child_fn (par_stmt);

@@ -6428,7 +6429,7 @@ expand_omp_atomic_pipeline (basic_block load_bb, 
basic_block store_bb,
  floating point.  This allows the atomic operation to properly
  succeed even with NaNs and -0.0.  */
   stmt = gimple_build_cond_empty
-   (build2 (NE_EXPR, boolean_type_node,
+  (build2 (NE_EXPR, boolean_type_node,
new_storedi, old_vali));
   gsi_insert_before (, stmt, GSI_SAME_STMT);
 
And here

  tree ne = build2 (NE_EXPR, boolean_type_node, new_storedi, old_vali);
  stmt = gimple_build_cond_empty (ne);

@@ -442,7 +442,7 @@ omp_max_simt_vf (void)
   if (!optimize)
 return 0;
   if (ENABLE_OFFLOADING)
-for (const char *c = getenv ("OFFLOAD_TARGET_NAMES"); c; )
+for (const char *c = getenv ("OFFLOAD_TARGET_NAMES"); c;)

I admit I don't know what the coding style we have in this case.  Ok either way.

@@ -5860,7 +5860,7 @@ lower_omp_sections (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
   new_body = maybe_catch_exception (new_body);
 
   t = gimple_build_omp_return
-(!!omp_find_clause (gimple_omp_sections_clauses (stmt),
+   (!!omp_find_clause (gimple_omp_sections_clauses (stmt),
OMP_CLAUSE_NOWAIT));
   gimple_seq_add_stmt (_body, t);
   maybe_add_implicit_barrier_cancel (ctx, _body);
@@ -6023,7 +6023,7 @@ lower_omp_single (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
   bind_body = maybe_catch_exception (bind_body);
 
   t = gimple_build_omp_return
-(!!omp_find_clause (gimple_omp_single_clauses (single_stmt),
+   (!!omp_find_clause (gimple_omp_single_clauses (single_stmt),
OMP_CLAUSE_NOWAIT));
   gimple_seq_add_stmt (_body_tail, t);
   maybe_add_implicit_barrier_cancel (ctx, _body_tail);

And in the above 2 spots something like:
  bool nowait = omp_find_clause (gimple_omp_single_clauses (single_stmt),
 OMP_CLAUSE_NOWAIT) != NULL_TREE;
  t = gimple_build_omp_return (nowait);

(wonder why we use t for gimple, renaming t var to g would be also nice).  But
ok either way.

@@ -8668,7 +8669,7 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context 
*ctx)
   break;
 case GIMPLE_TRANSACTION:
   lower_omp (gimple_transaction_body_ptr (
-   as_a  (stmt)),
+  as_a  (stmt)),
 ctx);

Here it fits, so no need to wrap:
  lower_omp (gimple_transaction_body_ptr (as_a  (stmt)),
 ctx);


Jakub


Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2016-12-13 Thread Richard Biener
On Tue, Dec 13, 2016 at 2:36 AM, Martin Sebor  wrote:
> The attached patch avoids infinite recursion when traversing phi
> nodes in maybe_warn_alloc_args_overflow by using a bitmap to keep
> track of those already visited and breaking out.

It looks somewhat excessive (the whole PHI node walk looks exponential in the
number of alloca calls given a good enough testcase).

It also looks like operand_signed_p really returns only a wild guess, neither
conservatively true or false.  Is that correct?

Can you instead scrap the weird anti-range handling please?

Thanks,
Richard.

> Thanks
> Martin


Re: [Patch, fortran] PR78737 - [OOP] linking error with deferred, undefined user-defined derived-type I/O

2016-12-13 Thread Paul Richard Thomas
Dear Janus,

We got there! OK for trunk.

This was a demonstration of the corollary of the "bon mot" from Barack
Obama at the end of the message :-)

Many thanks for finding the right path.

Paul

On 13 December 2016 at 10:23, Janus Weil  wrote:
> 2016-12-13 10:58 GMT+01:00 Janus Weil :
>> 2016-12-13 10:42 GMT+01:00 Janus Weil :
 please find attached a version of your patch that runs all the dtio
 testcases successfully.
>>>
>>> Great, thanks a lot. Your addition of
>>> gfc_find_and_cut_at_last_class_ref is just what I was looking for
>>> right now ...
>>>
>>> If you don't mind I'll write a ChangeLog, add the proper tests and
>>> commit to trunk. Or do you prefer to take care of it yourself?
>>
>> Btw, to continue the brainstorming, I think I found a slightly better
>> solution for the dtio_13 problem, which even removes the spurious
>> error on that test case via a small fix in resolve_transfer. The
>> attached patch is what I'd like to commit if you're ok with it.
>
> Finally, here is a complete patch, including testcase and ChangeLog
> entries. Ok for trunk?
>
> Cheers,
> Janus
>
>
> 2016-12-13  Janus Weil  
> Paul Thomas  
>
> PR fortran/78737
> * gfortran.h (gfc_find_typebound_dtio_proc): New prototype.
> * interface.c (gfc_compare_interfaces): Whitespace fix.
> (gfc_find_typebound_dtio_proc): New function.
> (gfc_find_specific_dtio_proc): Use it.
> * resolve.c (resolve_transfer): Improve error recovery.
> * trans-io.c (get_dtio_proc): Implement polymorphic calls to DTIO
> procedures.
>
> 2016-12-13  Janus Weil  
> Paul Thomas  
>
> PR fortran/78737
> * gfortran.dg/dtio_13.f90: Remove spurious error.
> * gfortran.dg/dtio_19.f90: New test case.



-- 
If you're walking down the right path and you're willing to keep
walking, eventually you'll make progress.

Barack Obama


Re: [PATCH] omp-low.c split

2016-12-13 Thread Martin Jambor
Hi,

On Fri, Dec 09, 2016 at 07:18:54PM +0300, Alexander Monakov wrote:
> On Fri, 9 Dec 2016, Jakub Jelinek wrote:
> > Can you post an incremental patch fixing those issues?
> 
> A few small nits I found while reading the patch.
> 
> First of all, please use 'git diff --patience' (or --histogram) when
> generating such patches, without it the changes in omp-low.c look uglier than
> necessary.

OK, the new patches use --patience, the result is actually quite a bit
smaller.  Thanks for the hint.

> 
> The comment for 'struct oacc_loop' in new file omp-device.c is misplaced: for
> some reason it's above the #include block.

Oh, thanks for noticing, fixed.

> 
> This patch doesn't seem to apply to current trunk due to a conflict in
> cp/parser.c.

Hmm, I don't think so?

  $ svn st
  $ svn up
  Updating '.':
  At revision 243600.
  $ patch --dry-run -p1 < 
/tmp/misc-next/pat/0001-Split-omp-low-into-multiple-files.patch
  checking file gcc/Makefile.in
  checking file gcc/c-family/c-omp.c
  checking file gcc/c/c-parser.c
  checking file gcc/c/c-typeck.c
  checking file gcc/c/gimple-parser.c
  checking file gcc/config/nvptx/nvptx.c
  checking file gcc/cp/parser.c
  checking file gcc/cp/semantics.c
  checking file gcc/fortran/trans-openmp.c
  checking file gcc/gengtype.c
  checking file gcc/gimple-fold.c
  checking file gcc/gimplify.c
  checking file gcc/lto-cgraph.c
  checking file gcc/omp-device.c
  checking file gcc/omp-device.h
  checking file gcc/omp-expand.c
  checking file gcc/omp-expand.h
  checking file gcc/omp-general.c
  checking file gcc/omp-general.h
  checking file gcc/omp-grid.c
  checking file gcc/omp-grid.h
  checking file gcc/omp-low.c
  checking file gcc/omp-low.h
  checking file gcc/toplev.c
  checking file gcc/tree-cfg.c
  checking file gcc/tree-parloops.c
  checking file gcc/tree-ssa-loop.c
  checking file gcc/tree-vrp.c
  checking file gcc/varpool.c
  $ echo $?
  0

This was using the exact same file I have compressed and sent a while
ago to the mailing list.

> If you could create a git branch (perhaps in your personal
> namespace so you can freely rebase it) with this patchset, I'd appreciate it.
> 

Well... I do not see this as a long-term project so I do not really
see much value in this.  I have used git just because that is how I
conveniently send patches to machines where I bootstrap them.  I hope
this will get committed very early so that we avoid any big conflicts
in omp-low.c.  But if this drags on for a while and if our git mirror
recovers meanwhile, I can.


> When trying to apply the patch, git notes a few remaining whitespace issues:
> 
> $ zcat 0001-Split-omp-low-into-multiple-files.patch.gz | git apply --reject -
> 
> :2734: space before tab in indent.
> # BLOCK 2 (PAR_ENTRY_BB)
> :5346: space before tab in indent.
> true, GSI_SAME_STMT);
> :8129: space before tab in indent.
>  after a stmt, not before.  */
> :9060: space before tab in indent.
>   GOMP_atomic_start ();
> :9061: space before tab in indent.
>   *addr = rhs;
> 

I hope I have addressed this when doing what Jakub suggested, please
let me know if not.

Thanks for looking at the big patch!

Martin


Re: [PATCH] S/390: Run md tests with -march=native instead of -march=z13.

2016-12-13 Thread Dominik Vogt
On Tue, Dec 13, 2016 at 10:42:37AM +0100, Jakub Jelinek wrote:
> On Tue, Dec 13, 2016 at 10:28:29AM +0100, Dominik Vogt wrote:
> > The attached patch fixes an md test execution problem on S/390.
> > The tests would be built with -march=z13 but executed even on
> > older machines.  Build with -march=native instead, so executing
> > the tests should work on any machine generation.
> > 
> > Tested on s390x biarch, but not bootstrapped or regression tested.
> 
> I think this isn't a very good idea.  Then the testresults from one
> box to another one will mean something different.

Note that this is already the case:  Recently the setmem-long-1.c
test case has started crashing on zEC12 because -march=z13 now
generates z13 specific instructions.

> IMHO you want something like x86 avx_runtime effective target
> (z13_runtime?), which would stand for running on z13 capable hw and
> with z13 assembler support.

Something like that, yes, but it's not so easy because the kernel
has to support it too.  Some features are disabled in a VM
although the hardware supports them.  What we really need is

  Run test if the test system (not just the hardware) supports the
  instruction set of the -march= option the test was compiled
  with, otherwise just compile it.

I.e. derive the "effective_targt..." option from the "-march=..."
option set by the torture test.

> Or choose what options to include based on such effective target tests,
> and perhaps also select a default action, like we do on some targets e.g. in
> the vectorizer.

Can you give an example test file, please?

> Some tests are dg-do run by default if the hw supports
> the needed ISA, or dg-do assemble, if the hw doesn't support that, but
> at least the assembler can assemble those, otherwise dg-do compile or
> something similar.
> 
> With -march=native, you find some results in gcc-testresults and the exact
> underlying hw will be part of the needed info to see what you've actually
> tested.  While usually just some tests are UNSUPPORTED if hw or assembler
> doesn't have the needed features.

Yes.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH, GCC/testsuite/ARM] Skip optional_mthumb tests if GCC has a default mode

2016-12-13 Thread Christophe Lyon
On 13 December 2016 at 10:54, Thomas Preudhomme
 wrote:
> On 12/12/16 21:17, Christophe Lyon wrote:
>>
>> Hi Thomas,
>>
>> Thanks for working on this,
>>
>>
>> On 12 December 2016 at 18:52, Thomas Preudhomme
>>  wrote:
>>>
>>> Hi,
>>>
>>> The logic to make -mthumb optional for Thumb-only devices is only
>>> executed
>>> when no -marm or -mthumb is given on the command-line. This includes
>>> configuring GCC with --with-mode= because this makes the option to be
>>> passed
>>> before any others. The optional_mthumb-* testcases are skipped when -marm
>>> or
>>> -mthumb is passed on the command line but not when GCC was configured
>>> with
>>> --with-mode. Not only are the tests meaningless in these configurations,
>>> they also spuriously FAIL if --with-mode=arm was used since the test are
>>> built for Thumb-only devices, as reported by Christophe Lyon in [1].
>>>
>>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html
>>>
>>> This patch adds logic to target-support.exp to check how was GCC
>>> configured
>>> and changes the optional_mthumb testcases to be skipped if there is a
>>> default mode (--with-mode=). It also fixes a couple of typo on the
>>> selectors.
>>>
>>
>> How hard would it be to skip these tests only if --with-mode=arm,
>> such that they would still pass in configurations --with-mode=thumb?
>>
>> It seems easy to extend what you propose here, doesn't it?
>
>
> It is but IMO it gives a false sense of quality since it does not test the
> optional -mthumb logic.
>
> What is the motivation to make it run with --with-mode=thumb?
>

With your patch, they will appear as unsupported, right?
It's not a big deal.

Do you know how most people configure their toolchains?
Linaro and Ubuntu (I think) use --with-mode=thumb.
Do most people use no --with-mode configure flag?

In the validations I run against trunk, I always have --with-mode,
except for one arm-none-eabi configuration where I also use
default cpu/fpu.

Maybe I should remove --with-mode=thumb from my config
--with-cpu=cortex-m3? (but I don't remember if the new logic
was backported to gcc-5/gcc-6 ?)

Christophe


> Best regards,
>
> Thomas


Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 03:08:17PM +0530, Prathamesh Kulkarni wrote:
> Thanks for the suggestions. It didn't occur to me to check for gimple_cond.
> I have tried to do the changes in the attached version.
> I am not sure if I have handled cond_expr correctly.
> IIUC, if gimple_assign has code cond_expr, then the condition is
> stored in gimple_assign_rhs1,
> however it's not a single operand but a tree of the form "op1 cond_code op2".
> Is that correct ?

Yes.  gimple_assign_rhs1 will be in what you are looking for EQ_EXPR or
NE_EXPR tree, its TREE_CODE will be this code you want to check, and
TREE_OPERAND (exp, 0) and TREE_OPERAND (exp, 1) the rhs1 and rhs2 you use
elsewhere.

> However I am not able to write a test-case that generates cond_expr in the IL.
> I tried:
> t1 = strstr (s, t);
> (t1 == s)  ? foo() : bar ();
> and other such variants but it seems the ?: operator is getting
> lowered to gimple_cond instead.

It is, but in some cases tree-if-conv.c turns them back into COND_EXPRs.
I guess you need -ftree-loop-if-convert now, and it has to be in some loop
where the addition of cond_expr would likely turn it into a single bb loop.
You probably want constants or vars, not function calls in the ? :
expressions though.

> +/* Try to fold strstr (s, t) == s to memcmp (s, t, strlen (t)) == 0.  */
> +
> +static void
> +fold_strstr_to_memcmp(enum tree_code code, tree rhs1, tree rhs2, gimple 
> *stmt)

Formatting, space before (.

> +{
> +  gimple *call_stmt = NULL;
> +  for (int pass = 0; pass < 2; pass++)
> +{
> +  gimple *g = SSA_NAME_DEF_STMT (rhs1);
> +  if (g

I think g should be always non-NULL (except for invalid IL), so probably no
need to check it.

> +   && gimple_call_builtin_p (g, BUILT_IN_STRSTR)
> +   && has_single_use (rhs1)
> +   && gimple_call_arg (as_a (g), 0) == rhs2)

I think gimple_call_arg works fine even with just gimple * argument.
So you can avoid the as_a (g) uglification and just use g.

> +   if (is_gimple_assign (stmt))
> + {
> +   if (gimple_assign_rhs_code (stmt) == COND_EXPR)
> + {
> +   tree cond = gimple_assign_rhs1 (stmt);
> +   TREE_SET_CODE (cond, EQ_EXPR);

This looks weird.  You are hardcoding EQ_EXPR, while for the
other case below you use code.  So, do you handle properly both
EQ_EXPR and NE_EXPR for this and gimple_cond cases?
Also, for non-COND_EXPR assign you build a new stmt instead of reusing
the existing one, why?

> +   TREE_OPERAND (cond, 0) = memcmp_lhs;
> +   TREE_OPERAND (cond, 1) = zero;
> +   update_stmt (stmt);
> + }
> +   else
> + {
> +   gsi = gsi_for_stmt (stmt);
> +   tree lhs = gimple_assign_lhs (stmt);
> +   gassign *ga = gimple_build_assign (lhs, code, memcmp_lhs,
> +  zero);
> +   gsi_replace (, ga, false);
> + }
> + }
> +   else
> + {
> +   gcond *cond = as_a (stmt);
> +   gimple_cond_set_lhs (cond, memcmp_lhs);
> +   gimple_cond_set_rhs (cond, zero);
> +   gimple_cond_set_code (cond, EQ_EXPR);

Likewise here.

Jakub


Re: [PATCH, GCC/testsuite/ARM] Skip optional_mthumb tests if GCC has a default mode

2016-12-13 Thread Thomas Preudhomme

On 12/12/16 21:17, Christophe Lyon wrote:

Hi Thomas,

Thanks for working on this,


On 12 December 2016 at 18:52, Thomas Preudhomme
 wrote:

Hi,

The logic to make -mthumb optional for Thumb-only devices is only executed
when no -marm or -mthumb is given on the command-line. This includes
configuring GCC with --with-mode= because this makes the option to be passed
before any others. The optional_mthumb-* testcases are skipped when -marm or
-mthumb is passed on the command line but not when GCC was configured with
--with-mode. Not only are the tests meaningless in these configurations,
they also spuriously FAIL if --with-mode=arm was used since the test are
built for Thumb-only devices, as reported by Christophe Lyon in [1].

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html

This patch adds logic to target-support.exp to check how was GCC configured
and changes the optional_mthumb testcases to be skipped if there is a
default mode (--with-mode=). It also fixes a couple of typo on the
selectors.



How hard would it be to skip these tests only if --with-mode=arm,
such that they would still pass in configurations --with-mode=thumb?

It seems easy to extend what you propose here, doesn't it?


It is but IMO it gives a false sense of quality since it does not test the 
optional -mthumb logic.


What is the motivation to make it run with --with-mode=thumb?

Best regards,

Thomas


Re: [PATCH, middle-end]: Fix PR78738, unrecognized insn with -ffast-math

2016-12-13 Thread James Greenhalgh
On Tue, Dec 13, 2016 at 09:52:40AM +0100, Richard Biener wrote:
> On Sun, Dec 11, 2016 at 5:16 PM, Uros Bizjak  wrote:
> > On Fri, Dec 9, 2016 at 11:09 AM, Richard Biener
> >  wrote:
> >> On Thu, Dec 8, 2016 at 10:44 PM, Uros Bizjak  wrote:
> >>> Hello!
> >>>
> >>> Attached patch fixes fall-out from excess-precision improvements
> >>> patch. As shown in the PR, the code throughout the compiler assumes
> >>> FLAG_PRECISION_FAST when flag_unsafe_math_optimizations flag is in
> >>> effect. The patch puts back two lines, removed by excess-precision
> >>> improvements patch.
> >>>
> >>> 2016-12-08  Uros Bizjak  
> >>>
> >>> PR middle-end/78738
> >>> * toplev.c (init_excess_precision): Initialize flag_excess_precision
> >>> to EXCESS_PRECISION_FAST for flag_unsafe_math_optimizations.
> >>>
> >>> testsuite/ChangeLog:
> >>>
> >>> 2016-12-08  Uros Bizjak  
> >>>
> >>> PR middle-end/78738
> >>> * gcc.target/i386/pr78738.c: New test.
> >>>
> >>> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >>>
> >>> OK for mainline?
> >>
> >> Hmm, I think it belongs to set_unsafe_math_optimization_flags instead
> >> (and be consistent if -fexcess-precision was manually specified).
> >>
> >> Also where do we assume connection between -funsafe-math-optimizations
> >> and FLAG_PRECISION_FAST?  We have two flags so we should fix any
> >> user that looks at one but means the other.
> >
> > The failure is caused by the call to ix86_emit_i387_round in
> > "lround2" expander. This expander is
> > enabled for x87 math when flag_unsafe_math_optimizations is enabled.
> > In the called ix86_emit_i387_round, DFmode PLUS pattern is manually
> > emitted:
> >
> >   emit_insn (gen_rtx_SET (e2, gen_rtx_PLUS (inmode, e1, half)));
> >
> > but due to  the definition of X87_ENABLE_ARITH, DFmode fadd pattern
> > remains disabled.
> >
> > It is possible to fix the failure by enabling X87_ENABLE_ARITH (and
> > X87_ENABLE_FLOAT) for flag_unsafe_math_optimizations (as is the case
> > in the attached v2 patch), but since gcc-6.x does
> >
> > if (flag_unsafe_math_optimizations)
> >flag_excess_precision = EXCESS_PRECISION_FAST;
> >
> > I though it was worth mentioning the difference between gcc-6 and
> > gcc-7. Probably, x87 is the only target that cares for it, in which
> > case the attached patch is sufficient.
> 
> I think that patch is the correct correctness fix.
> 
> With respect to GCC 6 vs. GCC 7 behavior I believe it would be more
> reasonable to set EXCESS_PRECISION_FAST by -ffast-math/-Ofast
> rather than from simply -funsafe-math-optimizations (which is an option
> directly controlling "legacy" controlled stuff which should be moved
> under a more specific option umbrella).

Do note that EXCESS_PRECISION_FAST is the default under the GNU dialects,
and we're only in EXCESS_PRECISION_STANDARD in the testcase because of
-std=c99.

That said, relaxing to EXCESS_PRECISION_FAST with -Ofast does make sense
to me.

Thanks,
James



Re: [PATCH] S/390: Run md tests with -march=native instead of -march=z13.

2016-12-13 Thread Jakub Jelinek
On Tue, Dec 13, 2016 at 10:28:29AM +0100, Dominik Vogt wrote:
> The attached patch fixes an md test execution problem on S/390.
> The tests would be built with -march=z13 but executed even on
> older machines.  Build with -march=native instead, so executing
> the tests should work on any machine generation.
> 
> Tested on s390x biarch, but not bootstrapped or regression tested.

I think this isn't a very good idea.  Then the testresults from one
box to another one will mean something different.
IMHO you want something like x86 avx_runtime effective target
(z13_runtime?), which would stand for running on z13 capable hw and
with z13 assembler support.
Or choose what options to include based on such effective target tests,
and perhaps also select a default action, like we do on some targets e.g. in
the vectorizer.  Some tests are dg-do run by default if the hw supports
the needed ISA, or dg-do assemble, if the hw doesn't support that, but
at least the assembler can assemble those, otherwise dg-do compile or
something similar.

With -march=native, you find some results in gcc-testresults and the exact
underlying hw will be part of the needed info to see what you've actually
tested.  While usually just some tests are UNSUPPORTED if hw or assembler
doesn't have the needed features.

Jakub


Re: [PATCH, GCC/ARM, gcc-5/6-branch, ping2] Fix PR77933: stack corruption on ARM when using high registers and lr

2016-12-13 Thread Thomas Preudhomme



On 09/12/16 15:27, Thomas Preudhomme wrote:



On 09/12/16 14:30, Kyrill Tkachov wrote:


On 09/12/16 14:24, Thomas Preudhomme wrote:

[Seeing as an RC for GCC 6.3 was suggested on IRC for mid next week]

Ping?

backport for 6 bootstraps on Thumb-1 and testsuite shows no regression for
either 5 or 6. Bootstrap for 5 is ongoing.



Ok.
Thanks,
Kyrill


I've committed the backport for 6 but am waiting for the bootstrap of the
backport for 5 to complete before committing it as well. Will probably only be
Monday.


I've finally committed the backport to gcc-5-branch. The bootstrap failed when 
using in-tree gmp, mpfr, mpc and isl so I had to restart it. Bootstrap succeeded.


PR77933 is now fixed in all supported branches, but note that it still affects 
gcc-4_7-branch, gcc-4_8-branch and gcc-4_9-branch.


Best regards,

Thomas


Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known

2016-12-13 Thread Prathamesh Kulkarni
On 9 December 2016 at 17:59, Jakub Jelinek  wrote:
> On Fri, Dec 09, 2016 at 05:36:41PM +0530, Prathamesh Kulkarni wrote:
>> --- a/gcc/tree-ssa-strlen.c
>> +++ b/gcc/tree-ssa-strlen.c
>> @@ -2302,7 +2302,81 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
>> else if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
>>   handle_pointer_plus (gsi);
>>   }
>> -  else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))
>> +
>> +  /* Fold strstr (s, t) == s to memcmp (s, t, strlen (t)) == 0.
>> +  if strlen (t) is known and var holding return value of strstr
>> +  has single use.  */
>> +
>> +  else if (TREE_CODE (lhs) == SSA_NAME && INTEGRAL_TYPE_P (TREE_TYPE 
>> (lhs)))
>> + {
>> +   enum tree_code code = gimple_assign_rhs_code (stmt);
>> +   if (code == EQ_EXPR || code == NE_EXPR)
>
> This way you handle _8 = _5 == _7;, but not if (_5 == _7) bar ();.  Shouldn't 
> you
> also handle GIMPLE_COND similarly (of course, the rhs1 and rhs2 grabbing
> and code grabbing is different for GIMPLE_COND.  But the rest should be
> the same, except again that you don't want to replace the GIMPLE_COND, but
> adjust it.  Maybe also COND_EXPR in gimple_assign (_9 = _5 == _7 ? _10 : 
> _11;).
>
>> + {
>> +   tree rhs1 = gimple_assign_rhs1 (stmt);
>> +   tree rhs2 = gimple_assign_rhs2 (stmt);
>> +   if (TREE_CODE (rhs1) == SSA_NAME
>> +   && TREE_CODE (rhs2) == SSA_NAME)
>> + {
>> +   gcall *call_stmt = dyn_cast (SSA_NAME_DEF_STMT 
>> (rhs1));
>> +   if (!call_stmt)
>> + {
>> +   call_stmt = dyn_cast (SSA_NAME_DEF_STMT (rhs2));
>> +   tree tmp = rhs1;
>> +   rhs1 = rhs2;
>> +   rhs2 = tmp;
>
> We use std::swap (rhs1, rhs2); in this case these days.
>
>> + }
>> +
>> +   tree call_lhs;
>> +   if (call_stmt
>> +   && gimple_call_builtin_p (call_stmt, BUILT_IN_STRSTR)
>> +   && (call_lhs = gimple_call_lhs (call_stmt))
>> +   && has_single_use (call_lhs))
>
> This might not optimize if you have:
>   _5 = foo ();
>   _7 = __builtin_strstr (_5, "abcd");
>   _8 = _5 == _7;
>
> Or even you could have:
>   _5 = __builtin_strstr (...);
>   _7 = __builtin_strstr (_5, "abcd");
>   _8 = _5 == _7;
>
> So I wonder if you shouldn't do:
>   gimple *call_stmt = NULL;
>   for (int pass = 0; pass < 2; pass++)
> {
>   gimple *g = SSA_NAME_DEF_STMT (rhs1);
>   if (gimple_call_builtin_p (g, BUILT_IN_STRSTR)
>   && gimple_call_lhs (g) == rhs1
>   && has_single_use (rhs1)
>   && gimple_call_arg (g, 0) == rhs2)
> {
>   call_stmt = g;
>   break;
> }
>   std::swap (rhs1, rhs2);
> }
>   if (call_stmt)
> ...
>
> I think you don't need operand_equal_p, because SSA_NAMEs should just
> be the same pointer if they are the same thing.
> The above way you handle both orderings.  Perhaps also it is big enough to
> be done in a separate function, which you call with the code/rhs1/rhs2 and
> stmt for the EQ/NE_EXPR is_gimple_assign as well as for COND_EXPR and
> GIMPLE_COND.
Hi Jakub,
Thanks for the suggestions. It didn't occur to me to check for gimple_cond.
I have tried to do the changes in the attached version.
I am not sure if I have handled cond_expr correctly.
IIUC, if gimple_assign has code cond_expr, then the condition is
stored in gimple_assign_rhs1,
however it's not a single operand but a tree of the form "op1 cond_code op2".
Is that correct ?

However I am not able to write a test-case that generates cond_expr in the IL.
I tried:
t1 = strstr (s, t);
(t1 == s)  ? foo() : bar ();
and other such variants but it seems the ?: operator is getting
lowered to gimple_cond instead.

Bootstrap+tested on x86_64-unknown-linux-gnu
and cross-tested on arm*-*-*, aarch64*-*-*.
Does it look OK ?

Thanks,
Prathamesh
>
> Jakub
2016-12-13  Jakub Jelinek  
Prathamesh Kulkarni  

* tree-ssa-strlen.c (fold_strstr_to_memcmp): New function.
(strlen_optimize_stmt): Call fold_strstr_to_memcmp.

testsuite/
* gcc.dg/strlenopt-30.c: New test-case.

diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c 
b/gcc/testsuite/gcc.dg/strlenopt-30.c
new file mode 100644
index 000..089b3a2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-30.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__((no_icf))
+_Bool f1(char *s)
+{
+  return __builtin_strstr (s, "hello") == s;
+}
+
+__attribute__((no_icf))
+_Bool f2(char 

Ping #1: [patch,testsuite,avr]: Filter-out -mmcu= from options for tests that set -mmcu=

2016-12-13 Thread Georg-Johann Lay

Ping #1

As I explained below, the solution taken be arm (pruning output)
does not work here.

Johann

On 02.12.2016 11:21, Georg-Johann Lay wrote:

On 01.12.2016 17:40, Mike Stump wrote:

On Dec 1, 2016, at 3:54 AM, Georg-Johann Lay  wrote:


This patch moves the compile tests that have a hard coded -mmcu=MCU
in their dg-options to a new folder.

The exp driver filters out -mmcu= from the command line options that
are provided by, say, board description files or --tool-opts.

This is needed because otherwise conflicting -mmcu= will FAIL
respective test cases because of "specified option '-mmcu' more than
once" errors from avr-gcc.

Ok for trunk?


So, it would be nice if different ports can use roughly similar
schemes to handle the same problems.  I think arm is one of the more
complex ports at this point in this regard with a lot of people and a
lot of years time to contemplate and implement solutions to the
problem.  They in particular don't have to move test cases around to
handle the difference like this, I think it would be best to avoid
that requirement if possible.

Glancing around, two starting points for how the arm achieves what it
does:

  lappend dg_runtest_extra_prunes "warning: switch -m(cpu|arch)=.*
conflicts with -m(cpu|arch)=.* switch"

in arm.exp, and they use something like:

/* { dg-require-effective-target arm_crypto_ok }
*/
|crypto-vsha256hq_u32.c:2:/* { dg-require-effective-target
arm_crypto_ok } */
/* { dg-add-options arm_crypto }
*/
|crypto-vsha256su0q_u32.c:2:/* { dg-require-effective-target
arm_crypto_ok } */

to validate the requirements of the test case, and to ensure that
optional things are selected.  Nice, simple, extensible, handles
multilibs, dejagnu arguments and different cpu defaults as I recall.

You won't need all the hair the arm folks have, but if you stub in
support in that direction, you then have simple, easy expansion room
to match all complexities that the arm folks have already hit and
solved.


I tried this approach, but it does not work as expected.

Notice that avr-gcc throws an error if conflicting -mmcu options are
supplied.  Pruning the output will make some tests "PASS", but the
compiler didn't actually do anything but producing an error message...

And one test FAILs because it should produce some specific diagnostic,
but again the compiler just throws a different error, the output is
pruned and the expected message is missing, hence fail.

Also one test case is for ATmega8, but you won't run the whole test
suite against ATmega8 because that device is too restricted to give
reasonable results...  Hence for some tests -mmcu=atmega8 is added by hand.

Johann









  1   2   >