date:20180103

Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Janne Blomqvist

On Thu, Jan 4, 2018 at 4:21 AM, Jerry DeLisle  wrote:
> On 01/03/2018 03:37 AM, Janne Blomqvist wrote:
>> On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLisle  
>> wrote:
>>> On 12/30/2017 12:35 PM, Janne Blomqvist wrote:
 On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig  
 wrote:
>>> ---snip---

 I can provide that stuff as a separate patch, or merge it into the
 original megapatch and resubmit that, whichever way you prefer.
>>>
>>> I would prefer we split into two patches. This will make review of the 
>>> library
>>> I/O changes easier. The int len is used in a lot of places also where it 
>>> really
>>> happens to also be the kind (which is a length in our implementation).
>>
>> Hi,
>>
>> attached is a patch that makes the two attached testcases work. It
>> applies on top of the charlen->size_t patch. In the formatted I/O
>> stuff, I have mostly used ptrdiff_t to avoid having to deal with
>> signed/unsigned issues, as the previous code was using int.
>>
>>
>>
>
> Well I have started reviewing.  One concern I have is that in many places you
> are changing formatted field widths, like w in read_l, to ptrdiff_t.  I don't
> think this is necessary. If someone is printing a value that has a field width
> that big, it will never finish.
>
> I did not check the definitions of format data structures that use these 
> values
> all over.  So I think in the least, the patch goes too far.

I don't expect to support format widths larger than INT_MAX.  The
reason why ptrdiff_t is used is that read_block_form_* and many
similar functions take a pointer to a ptrdiff_t (to handle the one
case where a scalar variable can be wider than INT_MAX, namely a
character), and we can't pass a pointer to an incompatible type.

Now, you could argue that the internal API should be refactored so
that we pass integers by value rather than as pointers, and then only
the character case would need to care to use a larger type and the
rest could keep using plain int. And I would agree that would be good,
but I think such a refactor would be stage 1 material, not stage 3.

> Another issue I have is that the readability of the code just sucks to me. The
> type ptrdiff_t is so not intuitive in a very many places.  If this is really
> necessary lets hide it behind  a macro or something so I don't have to cringe
> every time I look at this code. (such as gfc_int)
>
> Sometimes we can get too pedantic about things like this. A lot of these
> variables have nothing to do with pointers, though they may be used as 
> modifiers
> to offsets in large memory spaces.

Well, ptrdiff_t is one of the two "pointer-sized integer" typedefs
introduced in C89, the other being size_t. So I would argue that if
one is doing stuff like pointer arithmetic, or calculating array
indices etc., and the variable might for some reason be negative,
ptrdiff_t is the natural type to use. I do agree that ptrdiff_t is a
bit of a mouthful, but I don't have any really good alternatives
either; I'm not convinced that a libgfortran-specific typedef would
make things clearer.

Or we can blame Microsoft which made win64 LLP64 and not LP64 like the
rest of the world, and thus spoiled the use of long for portable code.
:)

FWIW, we have "index_type" which is a typedef for ptrdiff_t, but
that's used mostly in code that deals with array descriptors, so I
thought it would be cleaner to not use it here.

-- 
Janne Blomqvist

Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Jerry DeLisle

On 01/03/2018 03:37 AM, Janne Blomqvist wrote:
> On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLisle  wrote:
>> On 12/30/2017 12:35 PM, Janne Blomqvist wrote:
>>> On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig  
>>> wrote:
>> ---snip---
>>>
>>> I can provide that stuff as a separate patch, or merge it into the
>>> original megapatch and resubmit that, whichever way you prefer.
>>
>> I would prefer we split into two patches. This will make review of the 
>> library
>> I/O changes easier. The int len is used in a lot of places also where it 
>> really
>> happens to also be the kind (which is a length in our implementation).
> 
> Hi,
> 
> attached is a patch that makes the two attached testcases work. It
> applies on top of the charlen->size_t patch. In the formatted I/O
> stuff, I have mostly used ptrdiff_t to avoid having to deal with
> signed/unsigned issues, as the previous code was using int.
> 
> 
> 

Well I have started reviewing.  One concern I have is that in many places you
are changing formatted field widths, like w in read_l, to ptrdiff_t.  I don't
think this is necessary. If someone is printing a value that has a field width
that big, it will never finish.

I did not check the definitions of format data structures that use these values
all over.  So I think in the least, the patch goes too far.

Another issue I have is that the readability of the code just sucks to me. The
type ptrdiff_t is so not intuitive in a very many places.  If this is really
necessary lets hide it behind  a macro or something so I don't have to cringe
every time I look at this code. (such as gfc_int)

Sometimes we can get too pedantic about things like this. A lot of these
variables have nothing to do with pointers, though they may be used as modifiers
to offsets in large memory spaces.

I need a day or two to go through all of this. Sorry if I am just a downer 
about it.

Regards,

Jerry

Re: [PATCH] avoid using %lli et al.

2018-01-03 Thread Martin Sebor


On 01/03/2018 04:47 PM, Jakub Jelinek wrote:

On Tue, Jan 02, 2018 at 04:40:50PM -0700, Jeff Law wrote:

Attached is the patch with the casts removed (still bootstrapping).

Martin

gcc-printf-lli.diff


gcc/ChangeLog:

* gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): Use
offset_int::from instead of wide_int::to_shwi.
(maybe_diag_overlap): Remove assertion.
Use HOST_WIDE_INT_PRINT_DEC instead of %lli.
* gimple-ssa-sprintf.c (format_directive): Same.
(parse_directive): Same.
(sprintf_dom_walker::compute_format_length): Same.
(try_substitute_return_value): Same.

gcc/testsuite/ChangeLog:

* gcc.dg/Wrestrict-3.c: New test.

OK.


This broke bootstrap on i686-linux, dir.len is size_t, which isn't always
appropriate for HOST_WIDE_INT_PRINT_DEC format.

Fixed thusly, committed to trunk as obvious after i686-linux bootstrap
went past the previous failure point.


Thanks.

This is an example where having a solution for bug 78014 would
be helpful.  I.e., a -Wformat checker to help enforce the use
of %zu instead of %u or %lu (or an explicit cast from size_t)
even on targets size_t is unsigned or unsigned long, respectively.

Martin



2018-01-04  Jakub Jelinek  

* gimple-ssa-sprintf.c (parse_directive): Cast second dir.len to uhwi.

--- gcc/gimple-ssa-sprintf.c.jj 2018-01-03 22:24:36.0 +0100
+++ gcc/gimple-ssa-sprintf.c2018-01-04 00:15:55.950179583 +0100
@@ -3040,7 +3040,7 @@ parse_directive (sprintf_dom_walker::cal
   "length = " HOST_WIDE_INT_PRINT_UNSIGNED "\n",
   dir.dirno,
   (unsigned HOST_WIDE_INT)(size_t)(dir.beg - info.fmtstr),
-  (int)dir.len, dir.beg, dir.len);
+  (int)dir.len, dir.beg, (unsigned HOST_WIDE_INT) dir.len);
}

   return len - !*str;


Jakub

Re: [PATCH] handle invalid calls to built-ins with no prototype (PR 83603)

2018-01-03 Thread Martin Sebor


On 01/02/2018 04:29 PM, Jeff Law wrote:

On 01/01/2018 05:58 PM, Martin Sebor wrote:

The -Wrestrict code assumes that built-ins are called with
the correct number of arguments.  When this isn't so it
crashes.  The attached patch avoids the ICE due to this
error.

There are outstanding assumptions that the type of actual
arguments to built-ins declared without a prototype matches
the expected type.  Those will be corrected in a subsequent
patch.

Martin

gcc-83603.diff


PR tree-optimization/83603 - ICE in builtin_memref at 
gcc/gimple-ssa-warn-restrict.c:238

gcc/ChangeLog:

PR tree-optimization/83603
* calls.c (maybe_warn_nonstring_arg): Avoid accessing function
arguments past the endof the argument list in functions declared
without a prototype.
* gimple-ssa-warn-restrict.c (wrestrict_dom_walker::check_call):
Avoid checking when arguments are null.

OK.

Presumably we're not getting syntax errors precisely because we're
calling these functions without an in-scope prototype?


Right.


I wouldn't at all be surprised to find other passes mis-handling that
case.  The idioms you're using to get the arguments are similar to code
I've seen elsewhere -- it's probably just harder to trigger the failures
because analysis of the mem* or str* call is conditional on surrounding
context.


Other passes may be relying on gimple_call_builtin_p to determine
whether a call is to a built-in.  The function returns false for
calls with incompatible arguments (not just when passing a pointer
where an integer is expected, for example, but also when passing
an int where a size_t is expected, such as in a call to memcpy),
and so it exempts many otherwise correct calls from checking (and
also from the benefits of optimization, which also implies less
thorough checking).

Martin

Re: [PATCH] avoid using %lli et al.

2018-01-03 Thread Jakub Jelinek

On Tue, Jan 02, 2018 at 04:40:50PM -0700, Jeff Law wrote:
> > Attached is the patch with the casts removed (still bootstrapping).
> > 
> > Martin
> > 
> > gcc-printf-lli.diff
> > 
> > 
> > gcc/ChangeLog:
> > 
> > * gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): Use
> > offset_int::from instead of wide_int::to_shwi.
> > (maybe_diag_overlap): Remove assertion.
> > Use HOST_WIDE_INT_PRINT_DEC instead of %lli.
> > * gimple-ssa-sprintf.c (format_directive): Same.
> > (parse_directive): Same.
> > (sprintf_dom_walker::compute_format_length): Same.
> > (try_substitute_return_value): Same.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.dg/Wrestrict-3.c: New test.
> OK.

This broke bootstrap on i686-linux, dir.len is size_t, which isn't always
appropriate for HOST_WIDE_INT_PRINT_DEC format.

Fixed thusly, committed to trunk as obvious after i686-linux bootstrap
went past the previous failure point.

2018-01-04  Jakub Jelinek  

* gimple-ssa-sprintf.c (parse_directive): Cast second dir.len to uhwi.

--- gcc/gimple-ssa-sprintf.c.jj 2018-01-03 22:24:36.0 +0100
+++ gcc/gimple-ssa-sprintf.c2018-01-04 00:15:55.950179583 +0100
@@ -3040,7 +3040,7 @@ parse_directive (sprintf_dom_walker::cal
   "length = " HOST_WIDE_INT_PRINT_UNSIGNED "\n",
   dir.dirno,
   (unsigned HOST_WIDE_INT)(size_t)(dir.beg - info.fmtstr),
-  (int)dir.len, dir.beg, dir.len);
+  (int)dir.len, dir.beg, (unsigned HOST_WIDE_INT) dir.len);
}
 
   return len - !*str;


Jakub

[committed] Allow the target to set MAX_BITSIZE_MODE_ANY_MODE

2018-01-03 Thread Richard Sandiford

The default value of MAX_BITSIZE_MODE_ANY_MODE is calculated
from the initial mode sizes specified in the modes.def file.
The target needs to be able to override it if ADJUST_BYTESIZE
& co. can choose a bigger size.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the before and after assembly output for at
least one target per CPU directory.  Applied (as a gen* patch).

Richard


2018-01-03  Richard Sandiford  

gcc/
* doc/rtl.texi (MAX_BITSIZE_MODE_ANY_MODE): Describe how the default
is calculated and how it can be overridden.
* genmodes.c (max_bitsize_mode_any_mode): New variable.
(create_modes): Initialize it from MAX_BITSIZE_MODE_ANY_MODE,
if defined.
(emit_max_int): Use it to set the output MAX_BITSIZE_MODE_ANY_MODE,
if nonzero.

Index: gcc/doc/rtl.texi
===
--- gcc/doc/rtl.texi2018-01-03 09:44:10.691921247 +
+++ gcc/doc/rtl.texi2018-01-03 09:44:10.854914708 +
@@ -1509,7 +1509,12 @@ compute integer values.
 
 @findex MAX_BITSIZE_MODE_ANY_MODE
 @item MAX_BITSIZE_MODE_ANY_MODE
-The bitsize of the largest mode on the target.   
+The bitsize of the largest mode on the target.  The default value is
+the largest mode size given in the mode definition file, which is
+always correct for targets whose modes have a fixed size.  Targets
+that might increase the size of a mode beyond this default should define
+@code{MAX_BITSIZE_MODE_ANY_MODE} to the actual upper limit in
+@file{@var{machine}-modes.def}.
 @end table
 
 @findex byte_mode
Index: gcc/genmodes.c
===
--- gcc/genmodes.c  2018-01-03 09:44:10.691921247 +
+++ gcc/genmodes.c  2018-01-03 09:44:10.854914708 +
@@ -792,6 +792,7 @@ #define ADJUST_FBIT(M, X)  _ADD_ADJUST (
 
 static int bits_per_unit;
 static int max_bitsize_mode_any_int;
+static int max_bitsize_mode_any_mode;
 
 static void
 create_modes (void)
@@ -811,6 +812,12 @@ create_modes (void)
 #else
   max_bitsize_mode_any_int = 0;
 #endif
+
+#ifdef MAX_BITSIZE_MODE_ANY_MODE
+  max_bitsize_mode_any_mode = MAX_BITSIZE_MODE_ANY_MODE;
+#else
+  max_bitsize_mode_any_mode = 0;
+#endif
 }
 
 #ifndef NUM_POLY_INT_COEFFS
@@ -989,12 +996,18 @@ emit_max_int (void)
   else
 printf ("#define MAX_BITSIZE_MODE_ANY_INT %d\n", max_bitsize_mode_any_int);
 
-  mmax = 0;
-  for (j = 0; j < MAX_MODE_CLASS; j++)
-for (i = modes[j]; i; i = i->next)
-  if (mmax < i->bytesize)
-   mmax = i->bytesize;
-  printf ("#define MAX_BITSIZE_MODE_ANY_MODE (%d*BITS_PER_UNIT)\n", mmax);
+  if (max_bitsize_mode_any_mode == 0)
+{
+  mmax = 0;
+  for (j = 0; j < MAX_MODE_CLASS; j++)
+   for (i = modes[j]; i; i = i->next)
+ if (mmax < i->bytesize)
+   mmax = i->bytesize;
+  printf ("#define MAX_BITSIZE_MODE_ANY_MODE (%d*BITS_PER_UNIT)\n", mmax);
+}
+  else
+printf ("#define MAX_BITSIZE_MODE_ANY_MODE %d\n",
+   max_bitsize_mode_any_mode);
 }
 
 /* Emit mode_size_inline routine into insn-modes.h header.  */

[committed] Use partial_subreg_p in curr_insn_transform

2018-01-03 Thread Richard Sandiford

Use partial_subreg_p in code that was added since the initial patch
that introduced this function.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the before and after assembly output for at
least one target per CPU directory.  Committed as obvious.

Richard


2018-01-03  Richard Sandiford  

gcc/
* lra-constraints.c (curr_insn_transform): Use partial_subreg_p.

Index: gcc/lra-constraints.c
===
--- gcc/lra-constraints.c   2018-01-03 07:18:30.797822724 +
+++ gcc/lra-constraints.c   2018-01-03 09:34:11.747070247 +
@@ -4243,8 +4243,7 @@ curr_insn_transform (bool check_only_p)
  || (simplify_subreg_regno
  (ira_class_hard_regs[goal_alt[i]][0],
   GET_MODE (reg), byte, mode) >= 0)))
- || (GET_MODE_PRECISION (mode)
- < GET_MODE_PRECISION (GET_MODE (reg))
+ || (partial_subreg_p (mode, GET_MODE (reg))
  && GET_MODE_SIZE (GET_MODE (reg)) <= UNITS_PER_WORD
  && WORD_REGISTER_OPERATIONS)))
{

Re: [PATCH] Fix gcc.dg/vect-opt-info-1.c testcase

2018-01-03 Thread Richard Sandiford

Jakub Jelinek  writes:
> On Mon, Oct 23, 2017 at 06:26:12PM +0100, Richard Sandiford wrote:
>> 2017-10-23  Richard Sandiford  
>>  Alan Hayward  
>>  David Sherwood  
> ...
>
>> --- /dev/null2017-10-21 08:51:42.385141415 +0100
>> +++ gcc/testsuite/gcc.dg/vect-opt-info-1.c 2017-10-23
>> 17:22:26.571498977 +0100
>> @@ -0,0 +1,11 @@
>> +/* { dg-options "-std=c99 -fopt-info -O3" } */
>> +
>> +void
>> +vadd (int *dst, int *op1, int *op2, int count)
>> +{
>> +  for (int i = 0; i < count; ++i)
>> +dst[i] = op1[i] + op2[i];
>> +}
>> +
>> +/* { dg-message "loop vectorized" "" { target *-*-* } 6 } */
>> +/* { dg-message "loop versioned for vectorization because of possible
>> aliasing" "" { target *-*-* } 6 } */
>
> This testcase fails e.g. on i686-linux.  The problem is
> 1) it really should be at least guarded with
> /* { dg-do compile { target vect_int } } */
> because on targets that can't vectorize even simple int operations
> this will obviously fail

Hmm, yeah.

> 2) that won't help for i686 though, because we need -msse2 added
> to options for it to work; that is normally added by
> check_vect_support_and_set_flags
> only when in vect.exp.  If it was just that target, we could add
> dg-additional-options, but I'm afraid many other targets add some options.
>
> The following works for me, calling it nodump-* ensures that
> -fdump-tree-* isn't added, which I believe is essential for the testcase;

Yeah, that's right, the bug was using dump_file when dump_enabled_p (),
which would segfault when -fopt-info was passed and -fdump-tree-vect*
wasn't.

> tested on x86_64-linux with
> RUNTESTFLAGS='--target_board=unix\{-m32,-m32/-mno-sse,-m64\} vect.exp=nodump*'
> ok for trunk?
>
> Sadly I don't have your broken development version of the patch, so can't
> verify it fails with the broken patch.

Me neither any more, but it looks good to me FWIW.

Thanks,
Richard

[PATCH] minor tweak to complete strlen fix for PR83501

2018-01-03 Thread Martin Sebor


Prathamesh's fix restores the optimization for the test case
reported in the bug (thanks!) but it isn't sufficient to bring
GCC 8 completely up to par with 7.  Prior GCC versions are able
to compute the string length in the test case below but GCC 8
cannot.

  char d[8];
  const char s[] = "1234567";

  void f (void)
  {
__builtin_strcpy (d, s);
if (__builtin_strlen (d) != sizeof s - 1)
  __builtin_abort ();
  }

The attached patch slightly tweaks Prathamesh's patch to also
handle the case above.  There still are outstanding cases where
folding into MEM_REF defeats the strlen pass (I've raised
a couple of new bugs to track those I noticed today) but, AFAICT,
those aren't recent regressions so they can be dealt with later
and separately.

Martin

PS I've renamed the function because I had initially thought it
also needed to handle PHIs that refer to strings of the same
length (and there isn't necessarily one "right" string to return)
but then realized that the lack of that handling isn't part of
the regression so I didn't include it but kept the new return
type and name.  When the pass is enhanced to handle PHIs and
some of the other cases it doesn't handle the function won't
need to change.
PR tree-optimization/83501 - strlen(a) not folded after strcpy(a, ...)

gcc/ChangeLog:

	PR tree-optimization/83501
	* tree-ssa-strlen.c (get_string_cst): Rename...
	(get_strlen): ...to this.  Handle global constants.
	(handle_char_store): Adjust.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83501
	* gcc.dg/strlenopt-39.c: New test.

Index: gcc/testsuite/gcc.dg/strlenopt-39.c
===
--- gcc/testsuite/gcc.dg/strlenopt-39.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/strlenopt-39.c	(working copy)
@@ -0,0 +1,66 @@
+/* PR tree-optimization/83444
+   { dg-do compile }
+   { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include "strlenopt.h"
+
+#define STR "1234567"
+
+const char str[] = STR;
+
+char dst[10];
+
+void copy_from_global_str (void)
+{
+  strcpy (dst, str);
+
+  if (strlen (dst) != sizeof str - 1)
+abort ();
+}
+
+void copy_from_local_str (void)
+{
+  const char s[] = STR;
+
+  strcpy (dst, s);
+
+  if (strlen (dst) != sizeof s - 1)
+abort ();
+}
+
+void copy_from_local_memstr (void)
+{
+  struct {
+char s[sizeof STR];
+  } x = { STR };
+
+  strcpy (dst, x.s);
+
+  if (strlen (dst) != sizeof x.s - 1)
+abort ();
+}
+
+void copy_to_local_str (void)
+{
+  char d[sizeof STR];
+
+  strcpy (d, str);
+
+  if (strlen (d) != sizeof str - 1)
+abort ();
+}
+
+void copy_to_local_memstr (void)
+{
+  struct {
+char d[sizeof STR];
+  } x;
+
+  strcpy (x.d, str);
+
+  if (strlen (x.d) != sizeof str- 1)
+abort ();
+}
+
+/* Verify that all calls to strlen have been eliminated.
+  { dg-final { scan-tree-dump-not "(abort|strlen) \\(\\)" "optimized" } } */
Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c	(revision 256180)
+++ gcc/tree-ssa-strlen.c	(working copy)
@@ -2772,9 +2772,11 @@ handle_pointer_plus (gimple_stmt_iterator *gsi)
 }
 }
 
-/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
-static tree
-get_string_cst (tree rhs)
+/* If RHS, either directly or indirectly, refers to a string of constant
+   length, return it.  Otherwise return a negative value.  */
+
+static int
+get_strlen (tree rhs)
 {
   if (TREE_CODE (rhs) == MEM_REF
   && integer_zerop (TREE_OPERAND (rhs, 1)))
@@ -2784,7 +2786,17 @@ handle_pointer_plus (gimple_stmt_iterator *gsi)
 	rhs = TREE_OPERAND (rhs, 0);
 }
 
-  return (TREE_CODE (rhs) == STRING_CST) ? rhs : NULL_TREE;
+  if (TREE_CODE (rhs) == VAR_DECL
+  && TREE_READONLY (rhs))
+rhs = DECL_INITIAL (rhs);
+
+  if (rhs && TREE_CODE (rhs) == STRING_CST)
+{
+  unsigned HOST_WIDE_INT ilen = strlen (TREE_STRING_POINTER (rhs));
+  return ilen <= INT_MAX ? ilen : -1;
+}
+
+  return -1;
 }
 
 /* Handle a single character store.  */
@@ -2799,6 +2811,9 @@ handle_char_store (gimple_stmt_iterator *gsi)
   tree rhs = gimple_assign_rhs1 (stmt);
   unsigned HOST_WIDE_INT offset = 0;
 
+  /* Set to the length of the string being assigned if known.  */
+  int rhslen;
+
   if (TREE_CODE (lhs) == MEM_REF
   && TREE_CODE (TREE_OPERAND (lhs, 0)) == SSA_NAME)
 {
@@ -2942,19 +2957,18 @@ handle_char_store (gimple_stmt_iterator *gsi)
 	}
 }
   else if (idx == 0
-	   && (rhs = get_string_cst (gimple_assign_rhs1 (stmt)))
+	   && (rhslen = get_strlen (gimple_assign_rhs1 (stmt))) >= 0
 	   && ssaname == NULL_TREE
 	   && TREE_CODE (TREE_TYPE (lhs)) == ARRAY_TYPE)
 {
-  size_t l = strlen (TREE_STRING_POINTER (rhs));
   HOST_WIDE_INT a = int_size_in_bytes (TREE_TYPE (lhs));
-  if (a > 0 && (unsigned HOST_WIDE_INT) a > l)
+  if (a > 0 && (unsigned HOST_WIDE_INT) a > (unsigned HOST_WIDE_INT) rhslen)
 	{
 	  int idx = new_addr_stridx (lhs);
 	  if (idx != 0)
 	{
 	  si =

Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64

2018-01-03 Thread Jeff Law

On 01/03/2018 01:49 PM, Jeff Law wrote:
> On 01/03/2018 01:36 PM, Jakub Jelinek wrote:
>> On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote:
 Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?

 2018-01-03  Jakub Jelinek  

PR target/83641
* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
and add REG_CFA_ADJUST_CFA notes in that case to both insns.
>>> I still question how useful this part is, but not enough to object given
>>
>> Reduces bloat in .eh_frame and when unwinding less work is needed.  The
>> patch isn't that large after all.
>>
>>> you've done the work.  I'll go ahead and commit both as a single unit.
>>
>> BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and
>> my patch, the only regression caused by your patch is:
>> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi
>> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi
>> on i686-linux, quite understandably, that is exactly the case you are
>> changing, there is %edi clobbered in the function and thus needs to be saved
>> and restored and so no push/pop of %esi is needed.
>> So, this testcase should be tweaked.
> I think the ASM can just be dropped.  It doesn't contribute anything to
> what Florian was trying to show.   Once we drop the ASM we should see
> those probes return.  I'll have to check that obviously.
Here's the final patch -- only changes are smashing Jakub's and my work
together, adding a comment and the stack-check-12 tweak mentioned above.

Committing to the trunk.

jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8df8f232d80..7aa0920fc65 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2017-01-03  Jakub Jelinek  
+   Jeff Law  
+
+   PR target/83641
+   * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
+   noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
+   only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
+   and add REG_CFA_ADJUST_CFA notes in that case to both insns.
+
+   PR target/83641
+   * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): Do not
+   explicitly probe *sp in a noreturn function if there were any callee
+   register saves or frame pointer is needed.
+
 2018-01-03  Jakub Jelinek  
 
PR debug/83621
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56baaa7e5e8..c363de93e02 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12217,21 +12217,39 @@ ix86_adjust_stack_and_probe_stack_clash (const 
HOST_WIDE_INT size)
  pointer could be anywhere in the guard page.  The safe thing
  to do is emit a probe now.
 
+ The probe can be avoided if we have already emitted any callee
+ register saves into the stack or have a frame pointer (which will
+ have been saved as well).  Those saves will function as implicit
+ probes.
+
  ?!? This should be revamped to work like aarch64 and s390 where
  we track the offset from the most recent probe.  Normally that
  offset would be zero.  For a noreturn function we would reset
  it to PROBE_INTERVAL - (STACK_BOUNDARY / BITS_PER_UNIT).   Then
  we just probe when we cross PROBE_INTERVAL.  */
-  if (TREE_THIS_VOLATILE (cfun->decl))
+  if (TREE_THIS_VOLATILE (cfun->decl)
+  && !(m->frame.nregs || m->frame.nsseregs || frame_pointer_needed))
 {
   /* We can safely use any register here since we're just going to push
 its value and immediately pop it back.  But we do try and avoid
 argument passing registers so as not to introduce dependencies in
 the pipeline.  For 32 bit we use %esi and for 64 bit we use %rax.  */
   rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG);
-  rtx_insn *insn = emit_insn (gen_push (dummy_reg));
-  RTX_FRAME_RELATED_P (insn) = 1;
-  ix86_emit_restore_reg_using_pop (dummy_reg);
+  rtx_insn *insn_push = emit_insn (gen_push (dummy_reg));
+  rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg));
+  m->fs.sp_offset -= UNITS_PER_WORD;
+  if (m->fs.cfa_reg == stack_pointer_rtx)
+   {
+ m->fs.cfa_offset -= UNITS_PER_WORD;
+ rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD);
+ x = gen_rtx_SET (stack_pointer_rtx, x);
+ add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x);
+ RTX_FRAME_RELATED_P (insn_push) = 1;
+ x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
+ x = gen_rtx_SET (stack_pointer_rtx, x);
+ add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x);
+ RTX_FRAME_RELATED_P (insn_pop) = 1;
+   }

[PATCH] Fix gcc.dg/vect-opt-info-1.c testcase

2018-01-03 Thread Jakub Jelinek

On Mon, Oct 23, 2017 at 06:26:12PM +0100, Richard Sandiford wrote:
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
...

> --- /dev/null 2017-10-21 08:51:42.385141415 +0100
> +++ gcc/testsuite/gcc.dg/vect-opt-info-1.c2017-10-23 17:22:26.571498977 
> +0100
> @@ -0,0 +1,11 @@
> +/* { dg-options "-std=c99 -fopt-info -O3" } */
> +
> +void
> +vadd (int *dst, int *op1, int *op2, int count)
> +{
> +  for (int i = 0; i < count; ++i)
> +dst[i] = op1[i] + op2[i];
> +}
> +
> +/* { dg-message "loop vectorized" "" { target *-*-* } 6 } */
> +/* { dg-message "loop versioned for vectorization because of possible 
> aliasing" "" { target *-*-* } 6 } */

This testcase fails e.g. on i686-linux.  The problem is
1) it really should be at least guarded with
/* { dg-do compile { target vect_int } } */
because on targets that can't vectorize even simple int operations
this will obviously fail
2) that won't help for i686 though, because we need -msse2 added
to options for it to work; that is normally added by
check_vect_support_and_set_flags
only when in vect.exp.  If it was just that target, we could add
dg-additional-options, but I'm afraid many other targets add some options.

The following works for me, calling it nodump-* ensures that
-fdump-tree-* isn't added, which I believe is essential for the testcase;
tested on x86_64-linux with
RUNTESTFLAGS='--target_board=unix\{-m32,-m32/-mno-sse,-m64\} vect.exp=nodump*'
ok for trunk?

Sadly I don't have your broken development version of the patch, so can't
verify it fails with the broken patch.

2018-01-03  Jakub Jelinek  

* gcc.dg/vect-opt-info-1.c: Moved to ...
* gcc.dg/vect/nodump-vect-opt-info-1.c: ... here.  Only run on
vect_int targets, use dg-additional-options instead of dg-options and
use relative line numbers instead of absolute.

--- gcc/testsuite/gcc.dg/vect-opt-info-1.c.jj   2018-01-03 10:04:47.568412808 
+0100
+++ gcc/testsuite/gcc.dg/vect-opt-info-1.c  2018-01-03 22:14:44.082848915 
+0100
@@ -1,11 +0,0 @@
-/* { dg-options "-std=c99 -fopt-info -O3" } */
-
-void
-vadd (int *dst, int *op1, int *op2, int count)
-{
-  for (int i = 0; i < count; ++i)
-dst[i] = op1[i] + op2[i];
-}
-
-/* { dg-message "loop vectorized" "" { target *-*-* } 6 } */
-/* { dg-message "loop versioned for vectorization because of possible 
aliasing" "" { target *-*-* } 6 } */
--- gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c.jj   2018-01-03 
22:14:49.387852927 +0100
+++ gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c  2018-01-03 
22:17:30.437974412 +0100
@@ -0,0 +1,11 @@
+/* { dg-do compile { target vect_int } } */
+/* { dg-additional-options "-std=c99 -fopt-info -O3" } */
+
+void
+vadd (int *dst, int *op1, int *op2, int count)
+{
+/* { dg-message "loop vectorized" "" { target *-*-* } .+2 } */
+/* { dg-message "loop versioned for vectorization because of possible 
aliasing" "" { target *-*-* } .+1 } */
+  for (int i = 0; i < count; ++i)
+dst[i] = op1[i] + op2[i];
+}

Jakub

[PATCH] Update crtl->has_bb_partition if NOTE_INSN_SWITCH_SECTIONS isn't emitted (PR debug/83585)

2018-01-03 Thread Jakub Jelinek

Hi!

My recent dwarf2out.c, dbxout.c and rs6000.c uses of crtl->has_bb_partition
all assume that if it is true then the function actually has 2 partitions
and NOTE_INSN_SWITCH_SECTIONS is present, but as the following testcase
shows, that isn't always the case, crtl->has_bb_partition is set during
bbpart pass if there are 2 partitions, but there are many passes in between
and all bbs of one of the partitions could be optimized away.

Are there any uses of this flag that would be interested in whether the
function had any partitions in the past (I guess that also means
BB_PARTITION is not unspecified), rather than whether the function has 2
partitions right now?  If yes, instead of this patch we could introduce
a new flag in cfun set by insert_section_boundary_note.  If not, the
following patch should be sufficient.

Bootstrapped/regtested on x86_64-linux and i686-linux.

2018-01-03  Jakub Jelinek  

PR debug/83585
* bb-reorder.c (insert_section_boundary_note): Set has_bb_partition
to switched_sections.

* gcc.dg/pr83585.c: New test.

--- gcc/bb-reorder.c.jj 2018-01-03 10:19:55.303533981 +0100
+++ gcc/bb-reorder.c2018-01-03 18:16:32.470555769 +0100
@@ -2523,6 +2523,11 @@ insert_section_boundary_note (void)
   current_partition = BB_PARTITION (bb);
}
 }
+
+  /* Make sure crtl->has_bb_partition matches reality even if bbpart finds
+ some hot and some cold basic blocks, but later one of those kinds is
+ optimized away.  */
+  crtl->has_bb_partition = switched_sections;
 }
 
 namespace {
--- gcc/testsuite/gcc.dg/pr83585.c.jj   2018-01-03 18:20:33.685641817 +0100
+++ gcc/testsuite/gcc.dg/pr83585.c  2018-01-03 18:19:29.922619064 +0100
@@ -0,0 +1,18 @@
+/* PR debug/83585 */
+/* { dg-do assemble } */
+/* { dg-options "-std=gnu89 -O2 -g -fno-tree-dce 
-fno-guess-branch-probability" } */
+
+int
+foo (int x)
+{
+  int a, b;
+  for (a = 0; a < 2; ++a)
+if (x != 0)
+  {
+for (b = 0; b < 2; ++b)
+  ;
+return 0;
+  }
+
+  return;
+}

Jakub

Re: [PATCH] Fix ICE with statement frontiers (PR debug/83645)

2018-01-03 Thread Richard Biener

On January 3, 2018 9:42:01 PM GMT+01:00, Jakub Jelinek  wrote:
>Hi!
>
>When var-tracking pass isn't done (e.g. with -O2 -g -fno-var-tracking
>or -O2 -gstatement-frontiers), rest_of_handle_final calls
>variable_tracking_main in order to perform delete_vta_debug_insns
>- replace the debug marker insns with notes.  The problem with that is
>that final pass is after freeing cfg, so using
>FOR_EACH_BB/FOR_BB_INSNS_SAFE
>is incorrect as the attached patch shows, e.g. the barriers pass swaps
>a
>barrier with the statement notes and nothing updates the bb boundaries.
>
>Fixed by doing the replacements on the raw insn stream when called from
>final.c.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
>trunk?

OK. 

Richard. 

>2018-01-03  Jakub Jelinek  
>
>   PR debug/83645
>   * var-tracking.c (delete_vta_debug_insn): New inline function.
>   (delete_vta_debug_insns): Add USE_CFG argument, if true, walk just
>   insns from get_insns () to NULL instead of each bb separately.
>   Use delete_vta_debug_insn.  No longer static.
>   (vt_debug_insns_local, variable_tracking_main_1): Adjust
>   delete_vta_debug_insns callers.
>   * rtl.h (delete_vta_debug_insns): Declare.
>   * final.c (rest_of_handle_final): Call delete_vta_debug_insns
>   instead of variable_tracking_main.
>
>   * gcc.dg/pr83645.c: New test.
>
>--- gcc/var-tracking.c.jj  2018-01-03 10:19:54.385533834 +0100
>+++ gcc/var-tracking.c 2018-01-03 16:01:48.608217538 +0100
>@@ -10271,11 +10271,40 @@ vt_initialize (void)
> 
> static int debug_label_num = 1;
> 
>+/* Remove from the insn stream a single debug insn used for
>+   variable tracking at assignments.  */
>+
>+static inline void
>+delete_vta_debug_insn (rtx_insn *insn)
>+{
>+  if (DEBUG_MARKER_INSN_P (insn))
>+{
>+  reemit_marker_as_note (insn);
>+  return;
>+}
>+
>+  tree decl = INSN_VAR_LOCATION_DECL (insn);
>+  if (TREE_CODE (decl) == LABEL_DECL
>+  && DECL_NAME (decl)
>+  && !DECL_RTL_SET_P (decl))
>+{
>+  PUT_CODE (insn, NOTE);
>+  NOTE_KIND (insn) = NOTE_INSN_DELETED_DEBUG_LABEL;
>+  NOTE_DELETED_LABEL_NAME (insn)
>+  = IDENTIFIER_POINTER (DECL_NAME (decl));
>+  SET_DECL_RTL (decl, insn);
>+  CODE_LABEL_NUMBER (insn) = debug_label_num++;
>+}
>+  else
>+delete_insn (insn);
>+}
>+
> /* Remove from the insn stream all debug insns used for variable
>-   tracking at assignments.  */
>+   tracking at assignments.  USE_CFG should be false if the cfg is no
>+   longer usable.  */
> 
>-static void
>-delete_vta_debug_insns (void)
>+void
>+delete_vta_debug_insns (bool use_cfg)
> {
>   basic_block bb;
>   rtx_insn *insn, *next;
>@@ -10283,33 +10312,20 @@ delete_vta_debug_insns (void)
>   if (!MAY_HAVE_DEBUG_INSNS)
> return;
> 
>-  FOR_EACH_BB_FN (bb, cfun)
>-{
>-  FOR_BB_INSNS_SAFE (bb, insn, next)
>+  if (use_cfg)
>+FOR_EACH_BB_FN (bb, cfun)
>+  {
>+  FOR_BB_INSNS_SAFE (bb, insn, next)
>+if (DEBUG_INSN_P (insn))
>+  delete_vta_debug_insn (insn);
>+  }
>+  else
>+for (insn = get_insns (); insn; insn = next)
>+  {
>+  next = NEXT_INSN (insn);
>   if (DEBUG_INSN_P (insn))
>-{
>-  if (DEBUG_MARKER_INSN_P (insn))
>-{
>-  reemit_marker_as_note (insn);
>-  continue;
>-}
>-
>-  tree decl = INSN_VAR_LOCATION_DECL (insn);
>-  if (TREE_CODE (decl) == LABEL_DECL
>-  && DECL_NAME (decl)
>-  && !DECL_RTL_SET_P (decl))
>-{
>-  PUT_CODE (insn, NOTE);
>-  NOTE_KIND (insn) = NOTE_INSN_DELETED_DEBUG_LABEL;
>-  NOTE_DELETED_LABEL_NAME (insn)
>-= IDENTIFIER_POINTER (DECL_NAME (decl));
>-  SET_DECL_RTL (decl, insn);
>-  CODE_LABEL_NUMBER (insn) = debug_label_num++;
>-}
>-  else
>-delete_insn (insn);
>-}
>-}
>+delete_vta_debug_insn (insn);
>+  }
> }
> 
> /* Run a fast, BB-local only version of var tracking, to take care of
>@@ -10322,7 +10338,7 @@ static void
> vt_debug_insns_local (bool skipped ATTRIBUTE_UNUSED)
> {
>   /* ??? Just skip it all for now.  */
>-  delete_vta_debug_insns ();
>+  delete_vta_debug_insns (true);
> }
> 
> /* Free the data structures needed for variable tracking.  */
>@@ -10395,7 +10411,7 @@ variable_tracking_main_1 (void)
>any pseudos at this point.  */
>   || targetm.no_register_allocation)
> {
>-  delete_vta_debug_insns ();
>+  delete_vta_debug_insns (true);
>   return 0;
> }
> 
>@@ -10423,7 +10439,7 @@ variable_tracking_main_1 (void)
> {
>   vt_finalize ();
> 
>-  delete_vta_debug_insns ();
>+  delete_vta_debug_insns (true);
> 
>   /* This is later restored by our caller.  */
>   flag_var_tracking_assignments = 0;
>--- gcc/rtl.h.jj   2018-01-03 10:19:55.184533962 +0100
>+++ gcc/rtl.h

Re: [PATCH] Fix ICE during expand_debug_expr (PR debug/83621)

2018-01-03 Thread Richard Biener

On January 3, 2018 9:46:14 PM GMT+01:00, Jakub Jelinek  wrote:
>Hi!
>
>The ICE here is when calling multiple_p with GET_MODE_BITSIZE
>(BLKmode),
>so SIGFPE because it is 0.  BLKmode can leak into the debug stmts
>through
>generic vectors without HW support, but doing say (plus:BLK ...) etc.
>just doesn't look like a valid RTL, so instead of just not trying to
>subreg
>it and not call multiple_p, this patch punts when unary/binary/ternary
>ops have BLKmode.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2018-01-03  Jakub Jelinek  
>
>   PR debug/83621
>   * cfgexpand.c (expand_debug_expr): Return NULL if mode is
>   BLKmode for ternary, binary or unary expressions.
>
>   * gcc.dg/pr83621.c: New test.
>
>--- gcc/cfgexpand.c.jj 2018-01-03 10:19:54.0 +0100
>+++ gcc/cfgexpand.c2018-01-03 16:56:28.375179714 +0100
>@@ -4208,6 +4208,8 @@ expand_debug_expr (tree exp)
> 
> binary:
> case tcc_binary:
>+  if (mode == BLKmode)
>+  return NULL_RTX;
>   op1 = expand_debug_expr (TREE_OPERAND (exp, 1));
>   if (!op1)
>   return NULL_RTX;
>@@ -4232,6 +4234,8 @@ expand_debug_expr (tree exp)
> 
> unary:
> case tcc_unary:
>+  if (mode == BLKmode)
>+  return NULL_RTX;
>   inner_mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
>   op0 = expand_debug_expr (TREE_OPERAND (exp, 0));
>   if (!op0)
>--- gcc/testsuite/gcc.dg/pr83621.c.jj  2018-01-03 17:01:21.244249712
>+0100
>+++ gcc/testsuite/gcc.dg/pr83621.c 2018-01-03 17:00:56.414243769 +0100
>@@ -0,0 +1,12 @@
>+/* PR debug/83621 */
>+/* { dg-do compile } */
>+/* { dg-options "-O -g" } */
>+
>+typedef int __attribute__ ((__vector_size__ (64))) V;
>+V v;
>+
>+void
>+foo ()
>+{
>+  V u = v >> 1;
>+}
>
>   Jakub

Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64

2018-01-03 Thread Jeff Law

On 01/03/2018 01:36 PM, Jakub Jelinek wrote:
> On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote:
>>> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?
>>>
>>> 2018-01-03  Jakub Jelinek  
>>>
>>> PR target/83641
>>> * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
>>> noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
>>> only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
>>> and add REG_CFA_ADJUST_CFA notes in that case to both insns.
>> I still question how useful this part is, but not enough to object given
> 
> Reduces bloat in .eh_frame and when unwinding less work is needed.  The
> patch isn't that large after all.
> 
>> you've done the work.  I'll go ahead and commit both as a single unit.
> 
> BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and
> my patch, the only regression caused by your patch is:
> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi
> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi
> on i686-linux, quite understandably, that is exactly the case you are
> changing, there is %edi clobbered in the function and thus needs to be saved
> and restored and so no push/pop of %esi is needed.
> So, this testcase should be tweaked.
I think the ASM can just be dropped.  It doesn't contribute anything to
what Florian was trying to show.   Once we drop the ASM we should see
those probes return.  I'll have to check that obviously.

jeff

[PATCH] Fix ICE during expand_debug_expr (PR debug/83621)

2018-01-03 Thread Jakub Jelinek

Hi!

The ICE here is when calling multiple_p with GET_MODE_BITSIZE (BLKmode),
so SIGFPE because it is 0.  BLKmode can leak into the debug stmts through
generic vectors without HW support, but doing say (plus:BLK ...) etc.
just doesn't look like a valid RTL, so instead of just not trying to subreg
it and not call multiple_p, this patch punts when unary/binary/ternary
ops have BLKmode.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-03  Jakub Jelinek  

PR debug/83621
* cfgexpand.c (expand_debug_expr): Return NULL if mode is
BLKmode for ternary, binary or unary expressions.

* gcc.dg/pr83621.c: New test.

--- gcc/cfgexpand.c.jj  2018-01-03 10:19:54.0 +0100
+++ gcc/cfgexpand.c 2018-01-03 16:56:28.375179714 +0100
@@ -4208,6 +4208,8 @@ expand_debug_expr (tree exp)
 
 binary:
 case tcc_binary:
+  if (mode == BLKmode)
+   return NULL_RTX;
   op1 = expand_debug_expr (TREE_OPERAND (exp, 1));
   if (!op1)
return NULL_RTX;
@@ -4232,6 +4234,8 @@ expand_debug_expr (tree exp)
 
 unary:
 case tcc_unary:
+  if (mode == BLKmode)
+   return NULL_RTX;
   inner_mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
   op0 = expand_debug_expr (TREE_OPERAND (exp, 0));
   if (!op0)
--- gcc/testsuite/gcc.dg/pr83621.c.jj   2018-01-03 17:01:21.244249712 +0100
+++ gcc/testsuite/gcc.dg/pr83621.c  2018-01-03 17:00:56.414243769 +0100
@@ -0,0 +1,12 @@
+/* PR debug/83621 */
+/* { dg-do compile } */
+/* { dg-options "-O -g" } */
+
+typedef int __attribute__ ((__vector_size__ (64))) V;
+V v;
+
+void
+foo ()
+{
+  V u = v >> 1;
+}

Jakub

[PATCH] Fix ICE with statement frontiers (PR debug/83645)

2018-01-03 Thread Jakub Jelinek

Hi!

When var-tracking pass isn't done (e.g. with -O2 -g -fno-var-tracking
or -O2 -gstatement-frontiers), rest_of_handle_final calls
variable_tracking_main in order to perform delete_vta_debug_insns
- replace the debug marker insns with notes.  The problem with that is
that final pass is after freeing cfg, so using FOR_EACH_BB/FOR_BB_INSNS_SAFE
is incorrect as the attached patch shows, e.g. the barriers pass swaps a
barrier with the statement notes and nothing updates the bb boundaries.

Fixed by doing the replacements on the raw insn stream when called from
final.c.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2018-01-03  Jakub Jelinek  

PR debug/83645
* var-tracking.c (delete_vta_debug_insn): New inline function.
(delete_vta_debug_insns): Add USE_CFG argument, if true, walk just
insns from get_insns () to NULL instead of each bb separately.
Use delete_vta_debug_insn.  No longer static.
(vt_debug_insns_local, variable_tracking_main_1): Adjust
delete_vta_debug_insns callers.
* rtl.h (delete_vta_debug_insns): Declare.
* final.c (rest_of_handle_final): Call delete_vta_debug_insns
instead of variable_tracking_main.

* gcc.dg/pr83645.c: New test.

--- gcc/var-tracking.c.jj   2018-01-03 10:19:54.385533834 +0100
+++ gcc/var-tracking.c  2018-01-03 16:01:48.608217538 +0100
@@ -10271,11 +10271,40 @@ vt_initialize (void)
 
 static int debug_label_num = 1;
 
+/* Remove from the insn stream a single debug insn used for
+   variable tracking at assignments.  */
+
+static inline void
+delete_vta_debug_insn (rtx_insn *insn)
+{
+  if (DEBUG_MARKER_INSN_P (insn))
+{
+  reemit_marker_as_note (insn);
+  return;
+}
+
+  tree decl = INSN_VAR_LOCATION_DECL (insn);
+  if (TREE_CODE (decl) == LABEL_DECL
+  && DECL_NAME (decl)
+  && !DECL_RTL_SET_P (decl))
+{
+  PUT_CODE (insn, NOTE);
+  NOTE_KIND (insn) = NOTE_INSN_DELETED_DEBUG_LABEL;
+  NOTE_DELETED_LABEL_NAME (insn)
+   = IDENTIFIER_POINTER (DECL_NAME (decl));
+  SET_DECL_RTL (decl, insn);
+  CODE_LABEL_NUMBER (insn) = debug_label_num++;
+}
+  else
+delete_insn (insn);
+}
+
 /* Remove from the insn stream all debug insns used for variable
-   tracking at assignments.  */
+   tracking at assignments.  USE_CFG should be false if the cfg is no
+   longer usable.  */
 
-static void
-delete_vta_debug_insns (void)
+void
+delete_vta_debug_insns (bool use_cfg)
 {
   basic_block bb;
   rtx_insn *insn, *next;
@@ -10283,33 +10312,20 @@ delete_vta_debug_insns (void)
   if (!MAY_HAVE_DEBUG_INSNS)
 return;
 
-  FOR_EACH_BB_FN (bb, cfun)
-{
-  FOR_BB_INSNS_SAFE (bb, insn, next)
+  if (use_cfg)
+FOR_EACH_BB_FN (bb, cfun)
+  {
+   FOR_BB_INSNS_SAFE (bb, insn, next)
+ if (DEBUG_INSN_P (insn))
+   delete_vta_debug_insn (insn);
+  }
+  else
+for (insn = get_insns (); insn; insn = next)
+  {
+   next = NEXT_INSN (insn);
if (DEBUG_INSN_P (insn))
- {
-   if (DEBUG_MARKER_INSN_P (insn))
- {
-   reemit_marker_as_note (insn);
-   continue;
- }
-
-   tree decl = INSN_VAR_LOCATION_DECL (insn);
-   if (TREE_CODE (decl) == LABEL_DECL
-   && DECL_NAME (decl)
-   && !DECL_RTL_SET_P (decl))
- {
-   PUT_CODE (insn, NOTE);
-   NOTE_KIND (insn) = NOTE_INSN_DELETED_DEBUG_LABEL;
-   NOTE_DELETED_LABEL_NAME (insn)
- = IDENTIFIER_POINTER (DECL_NAME (decl));
-   SET_DECL_RTL (decl, insn);
-   CODE_LABEL_NUMBER (insn) = debug_label_num++;
- }
-   else
- delete_insn (insn);
- }
-}
+ delete_vta_debug_insn (insn);
+  }
 }
 
 /* Run a fast, BB-local only version of var tracking, to take care of
@@ -10322,7 +10338,7 @@ static void
 vt_debug_insns_local (bool skipped ATTRIBUTE_UNUSED)
 {
   /* ??? Just skip it all for now.  */
-  delete_vta_debug_insns ();
+  delete_vta_debug_insns (true);
 }
 
 /* Free the data structures needed for variable tracking.  */
@@ -10395,7 +10411,7 @@ variable_tracking_main_1 (void)
 any pseudos at this point.  */
   || targetm.no_register_allocation)
 {
-  delete_vta_debug_insns ();
+  delete_vta_debug_insns (true);
   return 0;
 }
 
@@ -10423,7 +10439,7 @@ variable_tracking_main_1 (void)
 {
   vt_finalize ();
 
-  delete_vta_debug_insns ();
+  delete_vta_debug_insns (true);
 
   /* This is later restored by our caller.  */
   flag_var_tracking_assignments = 0;
--- gcc/rtl.h.jj2018-01-03 10:19:55.184533962 +0100
+++ gcc/rtl.h   2018-01-03 15:56:13.089080504 +0100
@@ -4254,6 +4254,7 @@ extern GTY(()) rtx stack_limit_rtx;
 
 /* In var-tracking.c */
 extern unsigned int variable_tracking_main (void);
+extern void

Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64

2018-01-03 Thread Jakub Jelinek

On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote:
> > Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?
> > 
> > 2018-01-03  Jakub Jelinek  
> > 
> > PR target/83641
> > * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
> > noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
> > only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
> > and add REG_CFA_ADJUST_CFA notes in that case to both insns.
> I still question how useful this part is, but not enough to object given

Reduces bloat in .eh_frame and when unwinding less work is needed.  The
patch isn't that large after all.

> you've done the work.  I'll go ahead and commit both as a single unit.

BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and
my patch, the only regression caused by your patch is:
+FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi
+FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi
on i686-linux, quite understandably, that is exactly the case you are
changing, there is %edi clobbered in the function and thus needs to be saved
and restored and so no push/pop of %esi is needed.
So, this testcase should be tweaked.

Jakub

Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64

2018-01-03 Thread Jeff Law

On 01/03/2018 04:57 AM, Jakub Jelinek wrote:
> On Tue, Jan 02, 2018 at 01:02:26PM -0700, Jeff Law wrote:
>> It's fairly obvious that the probe of *sp isn't actually necessary here
>> because the register saves in the prologue act as probe points for *sp.
>>
>> In fact, the only way this can ever cause problems is if %esi is used in
>> the body in which case it would have been callee saved in the prologue.
>> So if we detect that %esi is already callee saved in the prologue then
>> we could eliminate the explicit probe of *sp.
>>
>> But we can do even better.  If any register is saved in the prologue,
>> then that callee register save functions as an implicit probe of *sp and
>> we do not need to explicitly probe *sp.
>>
>> While this was reported with -m32, I'm pretty sure we can trigger a
>> similar issue on x86_64.
>>
>> Bootstrapped and regression tested on x86_64.  Also verified the
>> testcase behavior on -m32.  The test uses flags to hopefully ensure
>> expected behavior on x86/Solaris, but I did not explicitly test that
>> configuration.
>>
>> OK for the trunk?
>>
>> Jeff
>>
> 
> Missing
>   PR target/83641
> here.
> 
>>  * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not
> 
> adjust as Florian reported.
> 
>>  explicitly probe *sp in a noreturn function if there were any callee
>>  register saves.
> 
> " or frame pointer is needed" ?
> 
>>
>>  * gcc.target/i386/stack-check-17.c: New test
> 
> Missing full stop after New test
> 
> This is a nice optimization, ok for trunk.
> 
> I think we do not want to put anything into the CFI even if the condition
> you've added doesn't trigger, that is a pure space and runtime cycle waste,
> except for noting the sp change if it is the current CFA.  Even after the
> push the register value still holds the caller's value, and after the pop
> too.  ix86_emit_restore_reg_using_pop does a lot of stuff we IMNSHO don't
> want to do, if dummy_reg happened to be a drap reg, we'd emit really weird
> stuff, the cfa restore, etc.  There are many other spots that gen_pop
> themselves and do the appropriate thing at that spot, instead of
> all using ix86_emit_restore_reg_using_pop.
> 
> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?
> 
> 2018-01-03  Jakub Jelinek  
> 
>   PR target/83641
>   * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
>   noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
>   only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
>   and add REG_CFA_ADJUST_CFA notes in that case to both insns.
I still question how useful this part is, but not enough to object given
you've done the work.  I'll go ahead and commit both as a single unit.

Jeff

Re: [v3 PATCH] Protect optional's deduction guide with the feature macro

2018-01-03 Thread Jonathan Wakely


On 03/01/18 16:34 +0200, Ville Voutilainen wrote:

Tested partially on Linux-x64, finishing testing the full suite on
Linux-PPC64. Ok for trunk?


Yes, and gcc-7-branch please. Thanks.

Re: Fix Bug 83566 - cyl_bessel_j returns wrong result for x>1000 for high orders

2018-01-03 Thread Michele Pezzutti



Hi.

On 01/02/2018 05:43 PM, Michele Pezzutti wrote:


On 01/02/2018 02:28 AM, Ed Smith-Rowland wrote:

I like the patch.

I have a similar one in the tr29124 branch.

Anyway, I got held up and I think it's good to have new folks looking 
into this.


It looks good except that you need to uglify k.


I looked at the GSL implementation, based on same reference, and their 
loop is cleaner. What about porting that implementation here? Possible?


My implementation is also using one more term for P than for Q, which 
is discouraged in GSL, according to their comments.


Ed, do you have any comment about this point?
Regarding the 2nd line, after my observations, usually term stops 
contributing to P, Q before k > nu/2, so actually, an offset by one is 
most likely without consequences.

GSL implementation is nevertheless more elegant.



Also, keep in mind that these series are asymptotic - they'll 
eventually blow up.


You should watch the magnitude of sequential terms and bail out of 
the loop if either term starts growing.


Ed
Yes, you are right. As nu increases, the multiplying '__term' gets 
larger, and the result will lose accuracy.

This cannot be used for large nu.
I have not been able to figure out how to detect when it starts to 
lose accuracy though, other than empirically.
GSL has no checks on terms and uses this method only under certain 
conditions, on nu and x.

Otherwise, they use other methods.

Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Janne Blomqvist

On Wed, Jan 3, 2018 at 8:34 PM, Bob Deen  wrote:
> On 12/29/17 5:31 AM, Janne Blomqvist wrote:
>>
>> In order to handle large character lengths on (L)LP64 targets, switch
>> the GFortran character length from an int to a size_t.
>>
>> This is an ABI change, as procedures with character arguments take
>> hidden arguments with the character length.
>
>
> Did this change not make it into gcc 7 then?

No, it caused regressions on big endian targets, and it was so late in
the gcc 7 cycle that there was no time to fix them (at the time I
didn't have access to a big endian target to test on, and getting said
access took time), so I had to revert it. Those regressions have now
been fixed, and the ABI has been broken anyway due to other changes,
so I'm trying again for gcc 8.

>  I am one of those who still
> use these hidden arguments for Fortran <-> C interfaces.  Based on
> discussions a year ago, I added this to my code:
>
> #if defined(__GNUC__) && (__GNUC__ > 6)
> #include 
> #define FORSTR_STDARG_TYPE size_t
> #else
> #define FORSTR_STDARG_TYPE int
> #endif
>
> I infer from this thread that I should change this to __GNUC__ > 7 now. Is
> this still the correct/best way to determine the hidden argument size?

Yes, I would say so.

> (note that we're still using 4.x... ugh, don't ask... so the >6 check hasn't
> actually been used yet, I just want to future-proof things as much as
> possible without having to rewrite the entire Fortran <-> C interface.)
>
> Thanks...
>
> -Bob
>
> Bob Deen  @  NASA-JPL Multmission Image Processing Lab
> bob.d...@jpl.nasa.gov
>



-- 
Janne Blomqvist

[committed] Fix warning in gcc.dg/plugin/expensive_selftests_plugin.c with !CHECKING_P

2018-01-03 Thread David Malcolm

On Tue, 2018-01-02 at 21:25 +0100, Andreas Schwab wrote:
> /daten/gcc/gcc-
> 20180101/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c:175
> :1: warning: no return statement in function returning non-void [-
> Wreturn-type]
> 
> Andreas.

Thanks.  

I forgot to handle the --enable-checking=release case here; sorry.

I've committed the following patch to trunk to fix this (as r256183),
under the "obvious fixes" rule, borrowing the note from
toplev::run_self_tests to give the dg-regexp something to look for for
the !CHECKING_P case.

Tested with and without --enable-checking=release.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/expensive-selftests-1.c: Update regexp to handle
the !CHECKING_P case by expecting a note.
* gcc.dg/plugin/expensive_selftests_plugin.c (plugin_init): Issue
a note for the !CHECKING_P case, and move the return statement
outside of #if CHECKING_P guard.
---
 gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c  | 2 +-
 gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c 
b/gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c
index e464117..64f168d 100644
--- a/gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c
@@ -1,3 +1,3 @@
 int not_empty;
 
-/* { dg-regexp "expensive_selftests_plugin: .* pass\\(es\\) in .* seconds" } */
+/* { dg-regexp "expensive_selftests_plugin: .* pass\\(es\\) in .* seconds|not 
enabled in this build" } */
diff --git a/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c 
b/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c
index 9470764..a7c6728 100644
--- a/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c
@@ -170,6 +170,8 @@ plugin_init (struct plugin_name_args *plugin_info,
 PLUGIN_FINISH,
 selftest::expensive_tests,
 NULL); /* void *user_data */
-  return 0;
+#else
+  inform (UNKNOWN_LOCATION, "self-tests are not enabled in this build");
 #endif /* #if CHECKING_P */
+  return 0;
 }
-- 
1.8.5.3

Re: PR83648

2018-01-03 Thread Jeff Law

On 01/02/2018 11:03 PM, Prathamesh Kulkarni wrote:
> Hi,
> malloc_candidate_p() in ipa-pure-const misses detecting that a
> function is malloc-like if the  return value is result of PHI and one
> of the arguments of PHI is 0.
> For example:
> 
> void g(unsigned n)
> {
>   return (n) ? __builtin_malloc (n) : 0;
> }
> 
> The reason is that the following check:
> if (TREE_CODE (arg) != SSA_NAME)
>   DUMP_AND_RETURN ("phi arg is not SSA_NAME.")
> 
> fails for arg with constant value 0 and malloc_candidate_p returns false.
> The patch simply skips the arg if it equals null_pointer_node.
> Does it look OK ?
> 
> One concern I have is that with the patch, malloc_candidate_p will
> return true if all the args to PHI are NULL:
> retval = PHI<0, 0>
> return retval
> 
> However I expect that PHI with all 0 args would be constant folded to
> 0 earlier, so this case shouldn't occur in practice ?
> 
> Bootstrapped+tested on x86_64-unknown-linux-gnu.
> Cross-testing on arm*-*-* and aarch64*-*-* in progress.
A degenerate PHI should be folded/propagated, but you can't absolutely
depend on it -- ie, you must handle it gracefully.

I think the way to do that here is verify that at least one argument is
an SSA_NAME.

jeff

Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Bob Deen


On 12/29/17 5:31 AM, Janne Blomqvist wrote:

In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.


Did this change not make it into gcc 7 then?  I am one of those who 
still use these hidden arguments for Fortran <-> C interfaces.  Based on 
discussions a year ago, I added this to my code:


#if defined(__GNUC__) && (__GNUC__ > 6)
#include 
#define FORSTR_STDARG_TYPE size_t
#else
#define FORSTR_STDARG_TYPE int
#endif

I infer from this thread that I should change this to __GNUC__ > 7 now. 
Is this still the correct/best way to determine the hidden argument size?


(note that we're still using 4.x... ugh, don't ask... so the >6 check 
hasn't actually been used yet, I just want to future-proof things as 
much as possible without having to rewrite the entire Fortran <-> C 
interface.)


Thanks...

-Bob

Bob Deen  @  NASA-JPL Multmission Image Processing Lab
bob.d...@jpl.nasa.gov

[PATCH][committed][ PR middle-end/83654] Fix bogus probe in dynamic space when there are no residuals

2018-01-03 Thread Jeff Law

There's a of sense of deja-vu on this BZ.  There's a lot of similarities
with an issue that came up earlier on aarch64 and its special handling
of the outgoing argument area.

Basically the residual of dynamic allocation may be a compile time
constant or a runtime variant and anything in between.  In the case
where it is not a compile time constant, but has the value zero at
runtime we must not probe the non-existent residual.

So the way to address that is to emit the compare/branch around the
probe when the residual is not a compile time constant (just like we do
for the outgoing argument area on aarch64).

I took the opportunity to make the residual handling and aarch64
outgoing argument space handling have the same core style.

There's two tests.  One is the original from Florian.  It's a bit
interesting in that at expansion time we do not see the allocation size
or residual as a constant.  So we emit a probing loop and residual
handling during expansion.  We check for that.  Later optimizers come
along and prove the residual handling isn't needed which we verify as
well by checking the assembly output.  In theory we should realize the
loop iterates precisely once and remove the looping structure, but don't
(I haven't investigated that missed-optimization).

I've added a derived test where the size is not compile-time
determinable.  We verify there'll be loop and residual probing during
expansion.  We then verify the existence of both probe instructions
*and* verify there's 3 conditionals (one guarding entrance to the
probing loop, one for the probing loop backedge and one guarding the
residual probe) in the assembly output.

Bootstrapped and regression tested on x86_64.  Verified the test works
on both x86 and x86_64.

Installing on the trunk.

Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d95c297e96a..f570eb4e606 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2017-01-03  Jeff Law  
+
+   PR middle-end/83654
+   * explow.c (anti_adjust_stack_and_probe_stack_clash): Test a
+   non-constant residual for zero at runtime and avoid probing in
+   that case.  Reorganize code for trailing problem to mirror handling
+   of the residual.
+
 2018-01-03  Prathamesh Kulkarni  
 
PR tree-optimization/83501
diff --git a/gcc/explow.c b/gcc/explow.c
index b6c56602152..042e71904ec 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1997,11 +1997,27 @@ anti_adjust_stack_and_probe_stack_clash (rtx size)
 
   if (residual != CONST0_RTX (Pmode))
 {
+  rtx label = NULL_RTX;
+  /* RESIDUAL could be zero at runtime and in that case *sp could
+hold live data.  Furthermore, we do not want to probe into the
+red zone.
+
+Go ahead and just guard the probe at *sp on RESIDUAL != 0 at
+runtime if RESIDUAL is not a compile time constant.  */
+  if (!CONST_INT_P (residual))
+   {
+ label = gen_label_rtx ();
+ emit_cmp_and_jump_insns (residual, CONST0_RTX (GET_MODE (residual)),
+  EQ, NULL_RTX, Pmode, 1, label);
+   }
+
   rtx x = force_reg (Pmode, plus_constant (Pmode, residual,
   -GET_MODE_SIZE (word_mode)));
   anti_adjust_stack (residual);
   emit_stack_probe (gen_rtx_PLUS (Pmode, stack_pointer_rtx, x));
   emit_insn (gen_blockage ());
+  if (!CONST_INT_P (residual))
+   emit_label (label);
 }
 
   /* Some targets make optimistic assumptions in their prologues about
@@ -2014,28 +2030,20 @@ anti_adjust_stack_and_probe_stack_clash (rtx size)
 live data.  Furthermore, we don't want to probe into the red
 zone.
 
-Go ahead and just guard a probe at *sp on SIZE != 0 at runtime
+Go ahead and just guard the probe at *sp on SIZE != 0 at runtime
 if SIZE is not a compile time constant.  */
-
-  /* Ideally we would just probe at *sp.  However, if SIZE is not
-a compile-time constant, but is zero at runtime, then *sp
-might hold live data.  So probe at *sp if we know that
-an allocation was made, otherwise probe into the red zone
-which is obviously undesirable.  */
-  if (CONST_INT_P (size))
-   {
- emit_stack_probe (stack_pointer_rtx);
- emit_insn (gen_blockage ());
-   }
-  else
+  rtx label = NULL_RTX;
+  if (!CONST_INT_P (size))
{
- rtx label = gen_label_rtx ();
+ label = gen_label_rtx ();
  emit_cmp_and_jump_insns (size, CONST0_RTX (GET_MODE (size)),
   EQ, NULL_RTX, Pmode, 1, label);
- emit_stack_probe (stack_pointer_rtx);
- emit_insn (gen_blockage ());
- emit_label (label);
}
+
+  emit_stack_probe (stack_pointer_rtx);
+  emit_insn (gen_blockage ());
+  if (!CONST_INT_P (size))
+   emit_label (label);
 }
 }
 
diff --git

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-01-03 Thread Marek Polacek

On Wed, Jan 03, 2018 at 11:45:55AM -0500, Nathan Sidwell wrote:
> On 01/03/2018 11:31 AM, Marek Polacek wrote:
> > Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR
> > with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI.
> > 
> > The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit.
> > But that code now also uses poly_uint64 and I'm not sure if any of the
> > constexpr.c code should use it, too.  But this patch fixes the ICE.
> 
> This doesn't look right to me, but it doesn't help that the test case
> invokes UB.

Hmm, like I said, I just followed fold_indirect_ref_1 so I was pretty confident
that this is not a totally wrong thing to do ;).
 
> > +typedef int V __attribute__ ((__vector_size__ (16)));
> > +V a;
> > +
> > +int
> > +main ()
> > +{
> > +  reinterpret_cast  ()[-1] += 1;
> > +}
> 
> one could make this code well formed with (I think)
> 
> typedef int V __attribute__ ((__vector_size__ (16)));
> V a[2];
> 
> int main ()
> {
>return reinterpret_cast  ([1])[-1];
> }
> 
> That should access the final element of the a[0] vector.

Unfortunately, this doesn't trigger the ICE.

Marek

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-01-03 Thread Marek Polacek

On Wed, Jan 03, 2018 at 05:40:32PM +, Richard Sandiford wrote:
> Marek Polacek  writes:
> > Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR
> > with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI.
> >
> > The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit.
> > But that code now also uses poly_uint64 and I'm not sure if any of the
> > constexpr.c code should use it, too.
> 
> The frontend isn't supposed to see any poly_ints (yet), so it's OK to
> keep on using INTEGER_CST accessors like tree_to_shwi and tree_to_uhwi.

Thanks, that's good to know.

Marek

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-01-03 Thread Richard Sandiford

Marek Polacek  writes:
> Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR
> with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI.
>
> The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit.
> But that code now also uses poly_uint64 and I'm not sure if any of the
> constexpr.c code should use it, too.

The frontend isn't supposed to see any poly_ints (yet), so it's OK to
keep on using INTEGER_CST accessors like tree_to_shwi and tree_to_uhwi.

Thanks,
Richard

[PATCH 5/5][AArch64] fp16fml support

2018-01-03 Thread Michael Collison

Hi All,

This patch adds support for the FP16 multiply add/subtract instructions in 
Armv8.4-a.  Support for the new instructions is in the form of new ACLE 
intrinsics. A new command line feature modifier, +fp16fml, is added to enable 
the support. Enabling +fp16fml automatically enables +fp16.

Test cases were added to verify that the ACLE Intrinsics generate the 
appropriate FP16 multiply add/subtract assembly instructions.

Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all 
instructions assembly correctly.

Okay for trunk?

2017-11-10  Michael Collison  

* config/aarch64/aarch64-modes.def (V2HF): New VECTOR_MODE.
* config/aarch64/aarch64-option-extension.def: Add
AARCH64_OPT_EXTENSION of 'fp16fml'.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
(__ARM_FEATURE_FP16_FML): Define if TARGET_F16FML is true.
* config/aarch64/predicates.md (aarch64_lane_imm3): New predicate.
* config/aarch64/constraints.md (Ui7): New constraint.
* config/aarch64/iterators.md (VFMLA_W): New mode iterator.
(VFMLA_SEL_W): Ditto.
(f16quad): Ditto.
(f16mac1): Ditto.
(VFMLA16_LOW): New int iterator.
(VFMLA16_HIGH): Ditto.
(UNSPEC_FMLAL): New unspec.
(UNSPEC_FMLSL): Ditto.
(UNSPEC_FMLAL2): Ditto.
(UNSPEC_FMLSL2): Ditto.
(f16mac): New code attribute.
* config/aarch64/aarch64-simd-builtins.def
(aarch64_fmlal_lowv2sf): Ditto.
(aarch64_fmlsl_lowv2sf): Ditto.
(aarch64_fmlalq_lowv4sf): Ditto.
(aarch64_fmlslq_lowv4sf): Ditto.
(aarch64_fmlal_highv2sf): Ditto.
(aarch64_fmlsl_highv2sf): Ditto.
(aarch64_fmlalq_highv4sf): Ditto.
(aarch64_fmlslq_highv4sf): Ditto.
(aarch64_fmlal_lane_lowv2sf): Ditto.
(aarch64_fmlsl_lane_lowv2sf): Ditto.
(aarch64_fmlal_laneq_lowv2sf): Ditto.
(aarch64_fmlsl_laneq_lowv2sf): Ditto.
(aarch64_fmlalq_lane_lowv4sf): Ditto.
(aarch64_fmlsl_lane_lowv4sf): Ditto.
(aarch64_fmlalq_laneq_lowv4sf): Ditto.
(aarch64_fmlsl_laneq_lowv4sf): Ditto.
(aarch64_fmlal_lane_highv2sf): Ditto.
(aarch64_fmlsl_lane_highv2sf): Ditto.
(aarch64_fmlal_laneq_highv2sf): Ditto.
(aarch64_fmlsl_laneq_highv2sf): Ditto.
(aarch64_fmlalq_lane_highv4sf): Ditto.
(aarch64_fmlsl_lane_highv4sf): Ditto.
(aarch64_fmlalq_laneq_highv4sf): Ditto.
(aarch64_fmlsl_laneq_highv4sf): Ditto.
* config/aarch64/aarch64-simd.md:
(aarch64_fmll_low): New pattern.
(aarch64_fmll_high): Ditto.
(aarch64_simd_fmll_low): Ditto.
(aarch64_simd_fmll_high): Ditto.
(aarch64_fmll_lane_lowv2sf): Ditto.
(aarch64_fmll_lane_highv2sf): Ditto.
(aarch64_simd_fmll_lane_lowv2sf): Ditto.
(aarch64_simd_fmll_lane_highv2sf): Ditto.
(aarch64_fmllq_laneq_lowv4sf): Ditto.
(aarch64_fmllq_laneq_highv4sf): Ditto.
(aarch64_simd_fmllq_laneq_lowv4sf): Ditto.
(aarch64_simd_fmllq_laneq_highv4sf): Ditto.
(aarch64_fmll_laneq_lowv2sf): Ditto.
(aarch64_fmll_laneq_highv2sf): Ditto.
(aarch64_simd_fmll_laneq_lowv2sf): Ditto.
(aarch64_simd_fmll_laneq_highv2sf): Ditto.
(aarch64_fmllq_lane_lowv4sf): Ditto.
(aarch64_fmllq_lane_highv4sf): Ditto.
(aarch64_simd_fmllq_lane_lowv4sf): Ditto.
(aarch64_simd_fmllq_lane_highv4sf): Ditto.
* config/aarch64/arm_neon.h (vfmlal_low_u32): New intrinsic.
(vfmlsl_low_u32): Ditto.
(vfmlalq_low_u32): Ditto.
(vfmlslq_low_u32): Ditto.
(vfmlal_high_u32): Ditto.
(vfmlsl_high_u32): Ditto.
(vfmlalq_high_u32): Ditto.
(vfmlslq_high_u32): Ditto.
(vfmlal_lane_low_u32): Ditto.
(vfmlsl_lane_low_u32): Ditto.
(vfmlal_laneq_low_u32): Ditto.
(vfmlsl_laneq_low_u32): Ditto.
(vfmlalq_lane_low_u32): Ditto.
(vfmlslq_lane_low_u32): Ditto.
(vfmlalq_laneq_low_u32): Ditto.
(vfmlslq_laneq_low_u32): Ditto.
(vfmlal_lane_high_u32): Ditto.
(vfmlsl_lane_high_u32): Ditto.
(vfmlal_laneq_high_u32): Ditto.
(vfmlsl_laneq_high_u32): Ditto.
(vfmlalq_lane_high_u32): Ditto.
(vfmlslq_lane_high_u32): Ditto.
(vfmlalq_laneq_high_u32): Ditto.
(vfmlslq_laneq_high_u32): Ditto.
* config/aarch64/aarch64.h (AARCH64_FL_F16SML): New flag.
(AARCH64_FL_FOR_ARCH8_4): New.
(AARCH64_ISA_F16FML): New ISA flag.
(TARGET_F16FML): New feature flag for fp16fml.
gcc.target/aarch64/fp16_fmul_high_1.c: New testcase.
gcc.target/aarch64/fp16_fmul_high_2.c: New testcase.
gcc.target/aarch64/fp16_fmul_high_3.c: New testcase.
gcc.target/aarch64/fp16_fmul_high.h: New shared testcase.
gcc.target/aarch64/fp16_fmul_lane_high_1.c: New testcase.

[PATCH 4/5][AArch64] Crypto sha512 and sha3

2018-01-03 Thread Michael Collison

Hi All,

This patch adds support for the SHA-512 and SHA-3 instructions added in 
Armv8.4-a. Support for the new instructions is in the form of new ACLE 
intrinsics. A new command line feature modifier, +sha3, is added to enable the 
support.

Test cases were added to verify that the ACLE Intrinsics generate the 
appropriate SHA-512/SHA-3 assembly instructions.

Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all 
instructions assembly correctly.

Okay for trunk?

2017-11-10  Michael Collison  

* config/aarch64/aarch64-builtins.c:
(aarch64_types_ternopu_imm_qualifiers, TYPES_TERNOPUI): New.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
(__ARM_FEATURE_SHA3): Define if TARGET_SHA3 is true.
* config/aarch64/aarch64.h (AARCH64_FL_SHA3): New flags.
(AARCH64_ISA_SHA3): New ISA flag.
(TARGET_SHA3): New feature flag for sha3.
* config/aarch64/iterators.md (sha512_op): New int attribute.
(CRYPTO_SHA512): New int iterator.
(UNSPEC_SHA512H): New unspec.
(UNSPEC_SHA512H2): Ditto.
(UNSPEC_SHA512SU0): Ditto.
(UNSPEC_SHA512SU1): Ditto.
* config/aarch64/aarch64-simd-builtins.def
(aarch64_crypto_sha512hqv2di): New builtin.
(aarch64_crypto_sha512h2qv2di): Ditto.
(aarch64_crypto_sha512su0qv2di): Ditto.
(aarch64_crypto_sha512su1qv2di): Ditto.
(aarch64_eor3qv8hi): Ditto.
(aarch64_rax1qv2di): Ditto.
(aarch64_xarqv2di): Ditto.
(aarch64_bcaxqv8hi): Ditto.
* config/aarch64/aarch64-simd.md:
(aarch64_crypto_sha512hqv2di): New pattern.
(aarch64_crypto_sha512su0qv2di): Ditto.
(aarch64_crypto_sha512su1qv2di): Ditto.
(aarch64_eor3qv8hi): Ditto.
(aarch64_rax1qv2di): Ditto.
(aarch64_xarqv2di): Ditto.
(aarch64_bcaxqv8hi): Ditto.
* config/aarch64/arm_neon.h (vsha512hq_u64): New intrinsic.
(vsha512h2q_u64): Ditto.
(vsha512su0q_u64): Ditto.
(vsha512su1q_u64): Ditto.
(veor3q_u16): Ditto.
(vrax1q_u64): Ditto.
(vxarq_u64): Ditto.
(vbcaxq_u16): Ditto.
* config/arm/types.md (crypto_sha512): New type attribute.
(crypto_sha3): Ditto.
(doc/invoke.texi): Document new sha3 option.
gcc.target/aarch64/sha2.h: New shared testcase.
gcc.target/aarch64/sha2_1.c: New testcase.
gcc.target/aarch64/sha2_2.c: New testcase.
gcc.target/aarch64/sha2_3.c: New testcase.
gcc.target/aarch64/sha3.h: New shared testcase.
gcc.target/aarch64/sha3_1.c: New testcase.
gcc.target/aarch64/sha3_2.c: New testcase.
gcc.target/aarch64/sha3_3.c: New testcase.


crypto_sha512.patch
Description: crypto_sha512.patch

[PATCH 3/5][AArch64] Crypto SM4 Support

2018-01-03 Thread Michael Collison

Hi All,

This patch adds support for the SM3/SM4 cryptographic instructions added in 
Armv8.4-a. Support for the new instructions is in the form of new ACLE 
intrinsics. A new command line feature modifier, +sm4, is added to enable the 
support.

Test cases were added to verify that the ACLE Intrinsics generate the 
appropriate SM3/SM4 assembly instructions.

Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all 
instructions assembly correctly.

Okay for trunk?

2017-11-10  Michael Collison  

* config/aarch64/aarch64-builtins.c:
(aarch64_types_quadopu_imm_qualifiers, TYPES_QUADOPUI): New.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
(__ARM_FEATURE_SM3): Define if TARGET_SM4 is true.
(__ARM_FEATURE_SM4): Define if TARGET_SM4 is true.
* config/aarch64/aarch64.h (AARCH64_FL_SM4): New flags.
(AARCH64_ISA_SM4): New ISA flag.
(TARGET_SM4): New feature flag for sm4.
* config/aarch64/aarch64-simd-builtins.def
(aarch64_sm3ss1qv4si): Ditto.
(aarch64_sm3tt1aq4si): Ditto.
(aarch64_sm3tt1bq4si): Ditto.
(aarch64_sm3tt2aq4si): Ditto.
(aarch64_sm3tt2bq4si): Ditto.
(aarch64_sm3partw1qv4si): Ditto.
(aarch64_sm3partw2qv4si): Ditto.
(aarch64_sm4eqv4si): Ditto.
(aarch64_sm4ekeyqv4si): Ditto.
* config/aarch64/aarch64-simd.md:
(aarch64_sm3ss1qv4si): Ditto.
(aarch64_sm3ttqv4si): Ditto.
(aarch64_sm3partwqv4si): Ditto.
(aarch64_sm4eqv4si): Ditto.
(aarch64_sm4ekeyqv4si): Ditto.
* config/aarch64/iterators.md (sm3tt_op): New int iterator.
(sm3part_op): Ditto.
(CRYPTO_SM3TT): Ditto.
(CRYPTO_SM3PART): Ditto.
(UNSPEC_SM3SS1): New unspec.
(UNSPEC_SM3TT1A): Ditto.
(UNSPEC_SM3TT1B): Ditto.
(UNSPEC_SM3TT2A): Ditto.
(UNSPEC_SM3TT2B): Ditto.
(UNSPEC_SM3PARTW1): Ditto.
(UNSPEC_SM3PARTW2): Ditto.
(UNSPEC_SM4E): Ditto.
(UNSPEC_SM4EKEY): Ditto.
* config/aarch64/constraints.md (Ui2): New constraint.
* config/aarch64/predicates.md (aarch64_imm2): New predicate.
* config/arm/types.md (crypto_sm3): New type attribute.
(crypto_sm4): Ditto.
* config/aarch64/arm_neon.h (vsm3ss1q_u32): New intrinsic.
(vsm3tt1aq_u32): Ditto.
(vsm3tt1bq_u32): Ditto.
(vsm3tt2aq_u32): Ditto.
(vsm3tt2bq_u32): Ditto.
(vsm3partw1q_u32): Ditto.
(vsm3partw2q_u32): Ditto.
(vsm4eq_u32): Ditto.
(vsm4ekeyq_u32): Ditto.
(doc/invoke.texi): Document new sm4 option.
gcc.target/aarch64/sm3_sm4.c: New testcase.


crypto_sm4.patch
Description: crypto_sm4.patch

[PATCH 2/5][AArch64] Add v8.4 architecture

2018-01-03 Thread Michael Collison

Hi all,

This patch adds support for the Arm architecture v8.4. A new command line 
option, -march=armv8.4-a, is added as well as documentation.

Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all 
instructions assembly correctly.

2017-11-10  Michael Collison  

* config/aarch64/aarch64-arches.def (armv8.4-a): New architecture.
* config/aarch64/aarch64.h (AARCH64_ISA_V8_4): New ISA flag.
(AARCH64_FL_FOR_ARCH8_4): New.
(AARCH64_FL_V8_4): New flag.
(doc/invoke.texi): Document new armv8.4-a option.



v8_4_architecture.patch
Description: v8_4_architecture.patch

[PATCH 1/5][AArch64] Crypto command line split

2018-01-03 Thread Michael Collison

Hi all,

This patch adds two new command line options for the legacy cryptographic 
extensions AES (+aes) and SHA-1/SHA-2 (+sha2). Backward compatibility is 
retained by modifying the +crypto feature modifier to enable +aes and +sha2.

Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all 
instructions assembly correctly.

2017-11-10  Michael Collison  

* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
(__ARM_FEATURE_AES): Define if TARGET_AES is true.
(__ARM_FEATURE_SHA2): Define if TARGET_SHA2 is true.
* config/aarch64/aarch64-option-extension.def: Add
AARCH64_OPT_EXTENSION of 'sha2'.
(aes): Add AARCH64_OPT_EXTENSION of 'aes'.
(crypto): Disable sha2 and aes if crypto disabled.
(crypto): Enable aes and sha2 if enabled.
(simd): Disable sha2 and aes if simd disabled.
* config/aarch64/aarch64.h (AARCH64_FL_AES, AARCH64_FL_SHA2):
New flags.
(AARCH64_ISA_AES, AARCH64_ISA_SHA2): New ISA flags.
(TARGET_SHA2): New feature flag for sha2.
(TARGET_AES): New feature flag for aes.
* config/aarch64/aarch64-simd.md:
(aarch64_crypto_aesv16qi): Make pattern
conditional on TARGET_AES.
(aarch64_crypto_aesv16qi): Ditto.
(aarch64_crypto_sha1hsi): Make pattern conditional
on TARGET_SHA2.
(aarch64_crypto_sha1hv4si): Ditto.
(aarch64_be_crypto_sha1hv4si): Ditto.
(aarch64_crypto_sha1su1v4si): Ditto.
(aarch64_crypto_sha1v4si): Ditto.
(aarch64_crypto_sha1su0v4si): Ditto.
(aarch64_crypto_sha256hv4si): Ditto.
(aarch64_crypto_sha256su0v4si): Ditto.
(aarch64_crypto_sha256su1v4si): Ditto.
(doc/invoke.texi): Document new aes and sha2 options.


crypto_split.patch
Description: crypto_split.patch

[PATCH 0/5][AArch64] ARMv8.4-A support

2018-01-03 Thread Michael Collison

Hello,

The ARMv8.4-A architecture builds on ARMv8.3-A and includes optional 
cryptographic extensions supporting SHA512, SHA3, SM3 and SM4. New FP16 
multiply add/subtract instructions have been added that are mandatory in 
ARMv8.4-A and optional from ARMv8.2-A onward.  

Although the new cryptographic instructions are introduced in ARMv8.4-A, they 
may be optionally supported in any architecture implementation from ARMv8.2-A 
onward.

This patch set adds support to GCC for the ARMv8.4-A architecture and adds new 
command line options to individually select the existing cryptographic 
SHA-1/SHA-256 and AES extensions. The existing +crypto option is retained for 
backward compatibility. New command line options are added for SHA512/SHA3, 
SM3/SM4 and the FP16 multiply add/subtract extensions. The cryptographic and 
FP16 multiply add/subtract instructions are exposed as new ACLE intrinsics.

The patches in this series are:

- Add new command line options for SHA-1/SHA-256 and AES including documentation
- Add support for ARMv8.4-A
- Add support for the cryptographic SM3/SM4 extension
- Add support for the cryptographic SHA-512/SHA-3 extension
- Add support for the FP16 multiply add/subtract extension

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-01-03 Thread Nathan Sidwell


On 01/03/2018 11:31 AM, Marek Polacek wrote:

Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR
with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI.

The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit.
But that code now also uses poly_uint64 and I'm not sure if any of the
constexpr.c code should use it, too.  But this patch fixes the ICE.


This doesn't look right to me, but it doesn't help that the test case 
invokes UB.



+typedef int V __attribute__ ((__vector_size__ (16)));
+V a;
+
+int
+main ()
+{
+  reinterpret_cast  ()[-1] += 1;
+}


one could make this code well formed with (I think)

typedef int V __attribute__ ((__vector_size__ (16)));
V a[2];

int main ()
{
   return reinterpret_cast  ([1])[-1];
}

That should access the final element of the a[0] vector.


nathan
--
Nathan Sidwell

C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-01-03 Thread Marek Polacek

Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR
with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI.

The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit.
But that code now also uses poly_uint64 and I'm not sure if any of the
constexpr.c code should use it, too.  But this patch fixes the ICE.

Bootstrapped/regtested on x86_64-linux, ok for trunk/7?

2018-01-03  Marek Polacek  

PR c++/83659
* constexpr.c (cxx_fold_indirect_ref): Use unsigned HOST_WIDE_INT
when computing offsets.

* g++.dg/torture/pr83659.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 1aeacd51810..cf7c994b381 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -3109,9 +3109,10 @@ cxx_fold_indirect_ref (location_t loc, tree type, tree 
op0, bool *empty_base)
  && (same_type_ignoring_top_level_qualifiers_p
  (type, TREE_TYPE (op00type
{
- HOST_WIDE_INT offset = tree_to_shwi (op01);
+ unsigned HOST_WIDE_INT offset = tree_to_uhwi (op01);
  tree part_width = TYPE_SIZE (type);
- unsigned HOST_WIDE_INT part_widthi = tree_to_shwi 
(part_width)/BITS_PER_UNIT;
+ unsigned HOST_WIDE_INT part_widthi
+   = tree_to_uhwi (part_width) / BITS_PER_UNIT;
  unsigned HOST_WIDE_INT indexi = offset * BITS_PER_UNIT;
  tree index = bitsize_int (indexi);
 
diff --git gcc/testsuite/g++.dg/torture/pr83659.C 
gcc/testsuite/g++.dg/torture/pr83659.C
index e69de29bb2d..d9f709bb520 100644
--- gcc/testsuite/g++.dg/torture/pr83659.C
+++ gcc/testsuite/g++.dg/torture/pr83659.C
@@ -0,0 +1,11 @@
+// PR c++/83659
+// { dg-do compile }
+
+typedef int V __attribute__ ((__vector_size__ (16)));
+V a;
+
+int
+main ()
+{
+  reinterpret_cast  ()[-1] += 1;
+}

Marek

Re: update config.guess/sub

2018-01-03 Thread Jeff Law

On 01/02/2018 09:24 PM, Ben Elliston wrote:
> It's a new year, time to update these scripts.
> 
> Ben
> 
> 
> 2018-01-03  Ben Elliston  
> 
>   * config.guess: Import latest version.
>   * config.sub: Likewise.
Seems reasonable.  I did a quick looksie and nothing looked unreasonable.

jeff

Re: PR83501

2018-01-03 Thread Prathamesh Kulkarni

On 2 January 2018 at 19:29, Richard Biener  wrote:
> On Thu, Dec 28, 2017 at 8:42 AM, Prathamesh Kulkarni
>  wrote:
>> On 21 December 2017 at 12:53, Prathamesh Kulkarni
>>  wrote:
>>> Hi Jakub,
>>> Based on your suggestions in PR83501, I have updated the patch to
>>> check for integer_zerop for 2nd operand of mem_ref.
>>>
>>> With the patch, Warray-bounds started warning for the following test
>>> in Warray-bounds-3.c in line 362 and thus I removed xfail on it:
>>> TM (a5, "0123", ma.a5 + i, ma.a5);
>>>
>>> Does it look OK ?
>>> Validation in progress.
>> ping https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01415.html
>
> Ok.
Thanks, I removed the Warray-bounds-3.c hunk since that was added in
interim by Martin Sebor.
Committed the attached version in r256180 after verifying
bootstrap+test passes on x86_64-unknown-linux-gnu.

Thanks,
Prathamesh
>
> RIchard.
>
>> Thanks,
>> Prathamesh
>>>
>>> Thanks,
>>> Prathamesh
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c
new file mode 100644
index 000..d8d3bf6039a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char a[4];
+
+void f (void)
+{
+  __builtin_strcpy (a, "abc");
+
+  if (__builtin_strlen (a) != 3)
+__builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump-not "__builtin_strlen" "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 8971c7df4f3..f1d4f6ef059 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2769,6 +2769,21 @@ handle_pointer_plus (gimple_stmt_iterator *gsi)
 }
 }
 
+/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
+static tree
+get_string_cst (tree rhs)
+{
+  if (TREE_CODE (rhs) == MEM_REF
+  && integer_zerop (TREE_OPERAND (rhs, 1)))
+{
+  rhs = TREE_OPERAND (rhs, 0);
+  if (TREE_CODE (rhs) == ADDR_EXPR)
+   rhs = TREE_OPERAND (rhs, 0);
+}
+
+  return (TREE_CODE (rhs) == STRING_CST) ? rhs : NULL_TREE;
+}
+
 /* Handle a single character store.  */
 
 static bool
@@ -2924,11 +2939,11 @@ handle_char_store (gimple_stmt_iterator *gsi)
}
 }
   else if (idx == 0
-  && TREE_CODE (gimple_assign_rhs1 (stmt)) == STRING_CST
+  && (rhs = get_string_cst (gimple_assign_rhs1 (stmt)))
   && ssaname == NULL_TREE
   && TREE_CODE (TREE_TYPE (lhs)) == ARRAY_TYPE)
 {
-  size_t l = strlen (TREE_STRING_POINTER (gimple_assign_rhs1 (stmt)));
+  size_t l = strlen (TREE_STRING_POINTER (rhs));
   HOST_WIDE_INT a = int_size_in_bytes (TREE_TYPE (lhs));
   if (a > 0 && (unsigned HOST_WIDE_INT) a > l)
{

[AARCH64]Fix ldr_got_small and ldr_got_small_28k patterns to only allow DImode address.

2018-01-03 Thread Renlin Li


Hi all,

The only allowed addressing mode for aarch64 is DImode, AKA Pmode.
ptr_mode could be SImode or DImode depending on the ABI used.

This patch here fixes the addressing mode of two patterns as DImode.
If any other mode is ever used, somewhere in the compiler might go wrong.


aarch64-none-elf regression test Okay without regressions.

Okay to commit?

Regards,
Renlin

gcc/ChangeLog:

2018-01-03  Renlin Li  

* config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Make
sure address is Pmode.
* config/aarch64/aarch64.md (ldr_got_small): Fix address mode as
DImode.
(ldr_got_small_28k): Likewise.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1da313f..9e9d2ea 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1406,10 +1406,6 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
 	rtx s = gen_rtx_SYMBOL_REF (Pmode, "_GLOBAL_OFFSET_TABLE_");
 	crtl->uses_pic_offset_table = 1;
 	emit_move_insn (gp_rtx, gen_rtx_HIGH (Pmode, s));
-
-	if (mode != GET_MODE (gp_rtx))
- gp_rtx = gen_lowpart (mode, gp_rtx);
-
 	  }
 
 	if (mode == ptr_mode)
@@ -1451,13 +1447,19 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
 
 	rtx insn;
 	rtx mem;
-	rtx tmp_reg = dest;
+	rtx tmp_reg;
 	machine_mode mode = GET_MODE (dest);
 
 	if (can_create_pseudo_p ())
-	  tmp_reg = gen_reg_rtx (mode);
+	  tmp_reg = gen_reg_rtx (Pmode);
+	else
+	{
+	  gcc_assert (HARD_REGISTER_P (dest));
+	  if (mode != Pmode)
+	tmp_reg = gen_rtx_REG (Pmode, REGNO (dest));
+	}
 
-	emit_move_insn (tmp_reg, gen_rtx_HIGH (mode, imm));
+	emit_move_insn (tmp_reg, gen_rtx_HIGH (Pmode, imm));
 	if (mode == ptr_mode)
 	  {
 	if (mode == DImode)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f1e2a07..a1f2d2d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5407,9 +5407,9 @@
 
 (define_insn "ldr_got_small_"
   [(set (match_operand:PTR 0 "register_operand" "=r")
-	(unspec:PTR [(mem:PTR (lo_sum:PTR
-			  (match_operand:PTR 1 "register_operand" "r")
-			  (match_operand:PTR 2 "aarch64_valid_symref" "S")))]
+	(unspec:PTR [(mem:PTR (lo_sum:DI
+			  (match_operand:DI 1 "register_operand" "r")
+			  (match_operand:DI 2 "aarch64_valid_symref" "S")))]
 		UNSPEC_GOTSMALLPIC))]
   ""
   "ldr\\t%0, [%1, #:got_lo12:%c2]"
@@ -5430,9 +5430,9 @@
 
 (define_insn "ldr_got_small_28k_"
   [(set (match_operand:PTR 0 "register_operand" "=r")
-	(unspec:PTR [(mem:PTR (lo_sum:PTR
-			  (match_operand:PTR 1 "register_operand" "r")
-			  (match_operand:PTR 2 "aarch64_valid_symref" "S")))]
+	(unspec:PTR [(mem:PTR (lo_sum:DI
+			  (match_operand:DI 1 "register_operand" "r")
+			  (match_operand:DI 2 "aarch64_valid_symref" "S")))]
 		UNSPEC_GOTSMALLPIC28K))]
   ""
   "ldr\\t%0, [%1, #::%c2]"

[PR c++/83667] Fix tree_dump ICE

2018-01-03 Thread Nathan Sidwell

This fixes a tree dumping ICE involving static thunk fns.  Copying the 
thunked-to fn's context suffices.


nathan
--
Nathan Sidwell
2018-01-03  Nathan Sidwell  

	PR c++/83667
	* method.c (make_alias_for): Copy DECL_CONTEXT.

	PR c++/83667
	* g++.dg/ipa/pr83667.C: New.

Index: cp/method.c
===
--- cp/method.c	(revision 256175)
+++ cp/method.c	(working copy)
@@ -206,7 +206,7 @@ make_alias_for (tree target, tree newid)
 			   TREE_CODE (target), newid, TREE_TYPE (target));
   DECL_LANG_SPECIFIC (alias) = DECL_LANG_SPECIFIC (target);
   cxx_dup_lang_specific_decl (alias);
-  DECL_CONTEXT (alias) = NULL;
+  DECL_CONTEXT (alias) = DECL_CONTEXT (target);
   TREE_READONLY (alias) = TREE_READONLY (target);
   TREE_THIS_VOLATILE (alias) = TREE_THIS_VOLATILE (target);
   TREE_PUBLIC (alias) = 0;
Index: testsuite/g++.dg/ipa/pr83667.C
===
--- testsuite/g++.dg/ipa/pr83667.C	(revision 0)
+++ testsuite/g++.dg/ipa/pr83667.C	(working copy)
@@ -0,0 +1,23 @@
+/* { dg-options "-fdump-ipa-inline" } */
+// c++/83667 ICE dumping a static thunk
+
+struct a
+{
+  virtual ~a ();
+};
+
+struct b
+{
+  virtual void d (...);
+};
+
+struct c : a, b
+{
+  void d (...)
+  {
+  }
+};
+
+c c;
+
+// { dg-final { scan-ipa-dump "summary for void c::\\*.LTHUNK0" "inline" } }

Re: [PATCH] Add version to intermediate gcov file (PR gcov-profile/83669).

2018-01-03 Thread Nathan Sidwell


On 01/03/2018 08:25 AM, Martin Liška wrote:


This is small enhancement reported by users of gcov tool. I'm aware of current 
stage of GCC,
but it's really small change in code. Apart from that, a small fix to 
documentation is included.



yeah, this is useful, thanks.

nathan

--
Nathan Sidwell

Revert DECL_USER_ALIGN part of r241959

2018-01-03 Thread Richard Sandiford

r241959 included code to stop us increasing the alignment of a
"user-aligned" variable.  This wasn't the main purpose of the patch,
and I think it was just there to make the testcase work.

The documentation for the aligned attribute says:

  This attribute specifies a minimum alignment for the variable or
  structure field, measured in bytes.

The DECL_USER_ALIGN code seemed to be treating as a sort of maximum
instead, but there's not really such a thing as a maximum here: the
variable might still end up at the start of a section that has a higher
alignment, or might end up by chance on a "very aligned" boundary at
link or load time.

I think people who add alignment attributes want to ensure that
accesses to that variable are fast, so it seems counter-intuitive
for it to make the access slower.  The vect-align-4.c test is an
example of this: for targets with 128-bit vectors, we get better
code without the aligned attribute than we do with it.

Tested on aarch64-linux-gnu so far, will test more widely if OK.

Thanks,
Richard


2018-01-03  Richard Sandiford  

gcc/
* tree-vect-data-refs.c (vect_compute_data_ref_alignment): Don't
punt for user-aligned variables.

gcc/testsuite/
* gcc.dg/vect/vect-align-4.c: New test.
* gcc.dg/vect/vect-nb-iter-ub-2.c (cc): Remove alignment attribute
and redefine as a structure with an unaligned member "b".
(foo): Update accordingly.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-01-03 15:03:14.301330558 +
+++ gcc/tree-vect-data-refs.c   2018-01-03 15:03:14.454324422 +
@@ -920,19 +920,6 @@ vect_compute_data_ref_alignment (struct
  return true;
}
 
-  if (DECL_USER_ALIGN (base))
-   {
- if (dump_enabled_p ())
-   {
- dump_printf_loc (MSG_NOTE, vect_location,
-  "not forcing alignment of user-aligned "
-  "variable: ");
- dump_generic_expr (MSG_NOTE, TDF_SLIM, base);
- dump_printf (MSG_NOTE, "\n");
-   }
- return true;
-   }
-
   /* Force the alignment of the decl.
 NOTE: This is the only change to the code we make during
 the analysis phase, before deciding to vectorize the loop.  */
Index: gcc/testsuite/gcc.dg/vect/vect-align-4.c
===
--- /dev/null   2018-01-03 08:32:43.873058927 +
+++ gcc/testsuite/gcc.dg/vect/vect-align-4.c2018-01-03 15:03:14.453324462 
+
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-add-options bind_pic_locally } */
+
+__attribute__((aligned (8))) int a[2048] = {};
+
+void
+f1 (void)
+{
+  for (int i = 0; i < 2048; i++)
+a[i]++;
+}
+
+/* { dg-final { scan-tree-dump-not "Vectorizing an unaligned access" "vect" } 
} */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" 
"vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c
===
--- gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c   2018-01-03 
15:03:14.301330558 +
+++ gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c   2018-01-03 
15:03:14.454324422 +
@@ -3,18 +3,19 @@
 #include "tree-vect.h"
 
 int ii[32];
-char cc[66] __attribute__((aligned(1))) =
+struct { char a; char b[66]; } cc = { 0,
   { 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0,
 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0,
 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0,
-30, 0, 31, 0 };
+30, 0, 31, 0 }
+};
 
 void __attribute__((noinline,noclone))
 foo (int s)
 {
   int i;
for (i = 0; i < s; i++)
- ii[i] = (int) cc[i*2];
+ ii[i] = (int) cc.b[i*2];
 }
 
 int main (int argc, const char **argv)

Re: [PATCH PR82439][simplify-rtx] Simplify (x | y) == x -> (y & ~x) == 0

2018-01-03 Thread Sudakshina Das


Hi

On 03/01/18 14:38, Segher Boessenkool wrote:

Hi!

On Wed, Jan 03, 2018 at 01:57:38PM +, Sudakshina Das wrote:

This patch add support for the missing transformation of (x | y) == x ->
(y & ~x) == 0.



Testing done: Checked for regressions on bootstrapped
aarch64-none-linux-gnu and arm-none-linux-gnueabihf and added new test
cases.
Is this ok for trunk?


You forgot to include the patch :-)


(facepalm) This is the second time this has happened to me!
Sorry about this. Attaching now.

Sudi




Segher



diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index e5cfd3d..17536d0 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -5032,34 +5032,38 @@ simplify_relational_operation_1 (enum rtx_code code, machine_mode mode,
 simplify_gen_binary (XOR, cmp_mode,
 			 XEXP (op0, 1), op1));
 
-  /* (eq/ne (and x y) x) simplifies to (eq/ne (and (not y) x) 0), which
- can be implemented with a BICS instruction on some targets, or
- constant-folded if y is a constant.  */
+  /* Simplify eq/ne (and/ior x y) x/y) for targets with a BICS instruction or
+ constant folding if x/y is a constant.  */
   if ((code == EQ || code == NE)
-  && op0code == AND
-  && rtx_equal_p (XEXP (op0, 0), op1)
+  && (op0code == AND || op0code == IOR)
   && !side_effects_p (op1)
   && op1 != CONST0_RTX (cmp_mode))
 {
-  rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1), cmp_mode);
-  rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0));
+  /* Both (eq/ne (and x y) x) and (eq/ne (ior x y) y) simplify to
+	 (eq/ne (and (not y) x) 0).  */
+  if ((op0code == AND && rtx_equal_p (XEXP (op0, 0), op1))
+	  || (op0code == IOR && rtx_equal_p (XEXP (op0, 1), op1)))
+	{
+	  rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1),
+	  cmp_mode);
+	  rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0));
 
-  return simplify_gen_relational (code, mode, cmp_mode, lhs,
-  CONST0_RTX (cmp_mode));
-}
+	  return simplify_gen_relational (code, mode, cmp_mode, lhs,
+	  CONST0_RTX (cmp_mode));
+	}
 
-  /* Likewise for (eq/ne (and x y) y).  */
-  if ((code == EQ || code == NE)
-  && op0code == AND
-  && rtx_equal_p (XEXP (op0, 1), op1)
-  && !side_effects_p (op1)
-  && op1 != CONST0_RTX (cmp_mode))
-{
-  rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0), cmp_mode);
-  rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1));
+  /* Both (eq/ne (and x y) y) and (eq/ne (ior x y) x) simplify to
+	 (eq/ne (and (not x) y) 0).  */
+  if ((op0code == AND && rtx_equal_p (XEXP (op0, 1), op1))
+	  || (op0code == IOR && rtx_equal_p (XEXP (op0, 0), op1)))
+	{
+	  rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0),
+	  cmp_mode);
+	  rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1));
 
-  return simplify_gen_relational (code, mode, cmp_mode, lhs,
-  CONST0_RTX (cmp_mode));
+	  return simplify_gen_relational (code, mode, cmp_mode, lhs,
+	  CONST0_RTX (cmp_mode));
+	}
 }
 
   /* (eq/ne (bswap x) C1) simplifies to (eq/ne x C2) with C2 swapped.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/bics_5.c b/gcc/testsuite/gcc.target/aarch64/bics_5.c
new file mode 100644
index 000..b9c2c40
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bics_5.c
@@ -0,0 +1,86 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  if ((a | b) == a)
+return a;
+  else
+return c;
+}
+
+int
+bics_si_test2 (int a, int b, int c)
+{
+  if ((a | b) == b)
+return b;
+  else
+return c;
+}
+
+typedef long long s64;
+
+s64
+bics_di_test1 (s64 a, s64 b, s64 c)
+{
+  if ((a | b) == a)
+return a;
+  else
+return c;
+}
+
+s64
+bics_di_test2 (s64 a, s64 b, s64 c)
+{
+  if ((a | b) == b)
+return b;
+  else
+return c;
+}
+
+int
+main ()
+{
+  int x;
+  s64 y;
+
+  x = bics_si_test1 (0xf00d, 0xf11f, 0);
+  if (x != 0)
+abort ();
+
+  x = bics_si_test1 (0xf11f, 0xf00d, 0);
+  if (x != 0xf11f)
+abort ();
+
+  x = bics_si_test2 (0xf00d, 0xf11f, 0);
+  if (x != 0xf11f)
+abort ();
+
+  x = bics_si_test2 (0xf11f, 0xf00d, 0);
+  if (x != 0)
+abort ();
+
+  y = bics_di_test1 (0x10001000f00dll, 0x12341000f00dll, 0ll);
+  if (y != 0)
+abort ();
+
+  y = bics_di_test1 (0x12341000f00dll, 0x10001000f00dll, 0ll);
+  if (y != 0x12341000f00dll)
+abort ();
+
+  y = bics_di_test2 (0x10001000f00dll, 0x12341000f00dll, 0ll);
+  if (y != 0x12341000f00dll)
+abort ();
+
+  y = bics_di_test2 (0x12341000f00dll, 0x10001000f00dll, 0ll);
+  if (y != 0)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+" 2 } } */
+/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+" 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/bics_5.c

Re: [PATCH PR82439][simplify-rtx] Simplify (x | y) == x -> (y & ~x) == 0

2018-01-03 Thread Segher Boessenkool

Hi!

On Wed, Jan 03, 2018 at 01:57:38PM +, Sudakshina Das wrote:
> This patch add support for the missing transformation of (x | y) == x -> 
> (y & ~x) == 0.

> Testing done: Checked for regressions on bootstrapped 
> aarch64-none-linux-gnu and arm-none-linux-gnueabihf and added new test 
> cases.
> Is this ok for trunk?

You forgot to include the patch :-)


Segher

[v3 PATCH] Protect optional's deduction guide with the feature macro

2018-01-03 Thread Ville Voutilainen

Tested partially on Linux-x64, finishing testing the full suite on
Linux-PPC64. Ok for trunk?

2018-01-03  Ville Voutilainen  

Protect optional's deduction guide with the feature macro
* include/std/optional: Use the feature macro.
diff --git a/libstdc++-v3/include/std/optional 
b/libstdc++-v3/include/std/optional
index e017eed..f7c72b5 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -1038,7 +1038,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// @}
 
+#if __cpp_deduction_guides >= 201606
   template  optional(_Tp) -> optional<_Tp>;
+#endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std

Re: [C++ PATCH] Improve code generation for static_cast of pointers to reference types (PR c++/83555)

2018-01-03 Thread Jakub Jelinek

On Wed, Jan 03, 2018 at 02:02:16PM +0100, Jakub Jelinek wrote:
> Though, the above snippet reminds me I should probably also replace:
> +  expr = build_base_path (MINUS_EXPR, expr, base,
> + /*nonnull=*/!sanitize_null_p, complain);
> with:
> +  expr = build_base_path (MINUS_EXPR, expr, base,
> +   /*nonnull=*/(!sanitize_null_p
> +&& flag_delete_null_pointer_checks),
> +   complain);
> 
> Instead of checking for generated asm without sanitization, I think it will
> be easier/more portable to verify no conditionals in *.optimized dump.
> Let me repost the updated patch.

Here is an updated patch (which just uses flag_delete_null_pointer_checks,
because -fsanitize=null disables that option) and with a testcase that makes
sure there is no conditional testing b != NULL.

Queued for bootstrap/regtest.

2018-01-03  Jakub Jelinek  

PR c++/83555
* typeck.c (build_static_cast_1): For static casts to reference types,
call build_base_path with flag_delete_null_pointer_checks as nonnull
instead of always false.  When -fsanitize=null, call
ubsan_maybe_instrument_reference on the NULL reference INTEGER_CST.
* cp-gimplify.c (cp_genericize_r): Don't walk subtrees of UBSAN_NULL
call if the first argument is INTEGER_CST with REFERENCE_TYPE.

* g++.dg/opt/pr83555.C: New test.
* g++.dg/ubsan/pr83555.C: New test.

--- gcc/cp/typeck.c.jj  2018-01-03 10:20:17.457537524 +0100
+++ gcc/cp/typeck.c 2018-01-03 14:28:46.189830697 +0100
@@ -6943,8 +6943,11 @@ build_static_cast_1 (tree type, tree exp
}
 
   /* Convert from "B*" to "D*".  This function will check that "B"
-is not a virtual base of "D".  */
-  expr = build_base_path (MINUS_EXPR, expr, base, /*nonnull=*/false,
+is not a virtual base of "D".  Even if we don't have a guarantee
+that expr is NULL, if the static_cast is to a reference type,
+it is UB if it would be NULL, so omit the non-NULL check.  */
+  expr = build_base_path (MINUS_EXPR, expr, base,
+ /*nonnull=*/flag_delete_null_pointer_checks,
  complain);
 
   /* Convert the pointer to a reference -- but then remember that
@@ -6955,7 +6958,18 @@ build_static_cast_1 (tree type, tree exp
  is a variable with the same type, the conversion would get folded
  away, leaving just the variable and causing lvalue_kind to give
  the wrong answer.  */
-  return convert_from_reference (rvalue (cp_fold_convert (type, expr)));
+  expr = cp_fold_convert (type, expr);
+
+  /* When -fsanitize=null, make sure to diagnose reference binding to
+NULL even when the reference is converted to pointer later on.  */
+  if (sanitize_flags_p (SANITIZE_NULL)
+ && TREE_CODE (expr) == COND_EXPR
+ && TREE_OPERAND (expr, 2)
+ && TREE_CODE (TREE_OPERAND (expr, 2)) == INTEGER_CST
+ && TREE_TYPE (TREE_OPERAND (expr, 2)) == type)
+   ubsan_maybe_instrument_reference (_OPERAND (expr, 2));
+
+  return convert_from_reference (rvalue (expr));
 }
 
   /* "A glvalue of type cv1 T1 can be cast to type rvalue reference to
--- gcc/cp/cp-gimplify.c.jj 2018-01-03 13:15:42.799817380 +0100
+++ gcc/cp/cp-gimplify.c2018-01-03 14:27:39.383797996 +0100
@@ -1506,6 +1506,12 @@ cp_genericize_r (tree *stmt_p, int *walk
  if (sanitize_flags_p (SANITIZE_VPTR) && !is_ctor)
cp_ubsan_maybe_instrument_member_call (stmt);
}
+ else if (fn == NULL_TREE
+  && CALL_EXPR_IFN (stmt) == IFN_UBSAN_NULL
+  && TREE_CODE (CALL_EXPR_ARG (stmt, 0)) == INTEGER_CST
+  && (TREE_CODE (TREE_TYPE (CALL_EXPR_ARG (stmt, 0)))
+  == REFERENCE_TYPE))
+   *walk_subtrees = 0;
}
   break;
 
--- gcc/testsuite/g++.dg/opt/pr83555.C.jj   2018-01-03 15:17:56.294130008 
+0100
+++ gcc/testsuite/g++.dg/opt/pr83555.C  2018-01-03 15:19:06.764162730 +0100
@@ -0,0 +1,15 @@
+// PR c++/83555
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-optimized -fdelete-null-pointer-checks" }
+
+struct A { int a; };
+struct B { int b; };
+struct C : A, B { int c; };
+
+C *
+foo (B *b)
+{
+  return _cast(*b);
+}
+
+// { dg-final { scan-tree-dump-not "if \\(b_\[0-9]*\\(D\\) .= 0" "optimized" } 
}
--- gcc/testsuite/g++.dg/ubsan/pr83555.C.jj 2018-01-03 14:27:39.383797996 
+0100
+++ gcc/testsuite/g++.dg/ubsan/pr83555.C2018-01-03 14:27:39.383797996 
+0100
@@ -0,0 +1,40 @@
+// PR c++/83555
+// { dg-do run }
+// { dg-options "-fsanitize=null" }
+// { dg-output ":25:\[^\n\r]*reference binding to null pointer of type 'struct 
C'" }
+
+struct A { int a; };
+struct B { int b; };
+struct C : A, B { int c; };
+
+__attribute__((noipa)) C *
+foo (B *b)
+{
+  return

Re: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).

2018-01-03 Thread Martin Liška

On 01/03/2018 02:40 PM, Jan Hubicka wrote:
 +  if (!original->in_same_comdat_group_p (alias))
 +{
 +  if (dump_file)
 +  fprintf (dump_file, "Not unifying; alias cannot be created; "
 +   "across comdat group boundary\n\n");
 +
 +  return false;
 +}
>>>
>>> Wasn't we supposed to do the wrapper in this case?
>>>
>>> Honza
>>>
>>
>> We attempt to do a wrapper, but even with wrapper we cannot introduce such 
>> call
>> crossing the boundary. Proper message should be probably:
>>
>> "Not unifying; alias nor wrapper cannot be created; across comdat group 
>> boundary"
> 
> If the symbol we are calling is one exported from comdat, it shoud work fine
> as long as we produce gimple thunk and it the symbol comdat exports.
> Of course producing call from one comdat to another may close comdat loop
> (which we handle elsewhere, so i assume original is comdat and alias is not)
> 
> I see in the PR is the split part of comdat which is comdat local, so perhaps
> we want to check that original is not comdat local in addition to your current
> check?

Ok, you understand problematic of comdats more than me. I will test and install
patch with the check added.

Thanks,
Martin

> 
> Honza
>>
>> Martin

>From 707e88333778b306f66c1bf683838b1a7bb4f737 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 3 Jan 2018 11:10:18 +0100
Subject: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).

gcc/ChangeLog:

2018-01-03  Martin Liska  

	PR ipa/82352
	* ipa-icf.c (sem_function::merge): Do not cross comdat boundary.

gcc/testsuite/ChangeLog:

2018-01-03  Martin Liska  

	PR ipa/82352
	* g++.dg/ipa/pr82352.C: New test.
---
 gcc/ipa-icf.c  | 11 +
 gcc/testsuite/g++.dg/ipa/pr82352.C | 93 ++
 2 files changed, 104 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr82352.C

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index a8d3b800318..e17c4292392 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -1113,6 +1113,17 @@ sem_function::merge (sem_item *alias_item)
   return false;
 }
 
+  if (!original->in_same_comdat_group_p (alias)
+  || original->comdat_local_p ())
+{
+  if (dump_file)
+	fprintf (dump_file,
+		 "Not unifying; alias nor wrapper cannot be created; "
+		 "across comdat group boundary\n\n");
+
+  return false;
+}
+
   /* See if original is in a section that can be discarded if the main
  symbol is not used.  */
 
diff --git a/gcc/testsuite/g++.dg/ipa/pr82352.C b/gcc/testsuite/g++.dg/ipa/pr82352.C
new file mode 100644
index 000..c044345a486
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr82352.C
@@ -0,0 +1,93 @@
+// PR ipa/82352
+// { dg-do compile }
+// { dg-options "-O2" }
+
+typedef long unsigned int size_t;
+
+class A
+{
+public :
+  typedef enum { Zero = 0, One = 1 } tA;
+  A(tA a) { m_a = a; }
+
+private :
+  tA m_a;
+};
+
+class B
+{
+public :
+  void *operator new(size_t t) { return (void*)(42); };
+};
+
+class C
+{
+public:
+  virtual void  () = 0;
+};
+
+class D
+{
+ public :
+  virtual void g() = 0;
+  virtual void h() = 0;
+};
+
+template class : public T, public D
+{
+public:
+ void ()
+ {
+   if (!m_i2) throw A(A::One);
+ };
+
+ void h()
+ {
+  if (m_i2) throw A(A::Zero);
+ }
+
+protected:
+ virtual void g()
+ {
+  if (m_i1 !=0) throw A(A::Zero);
+ };
+
+private :
+ int m_i1;
+ void *m_i2;
+};
+
+class E
+{
+private:
+size_t m_e;
+static const size_t Max;
+
+public:
+E& i(size_t a, size_t b, size_t c)
+{
+if ((a > Max) || (c > Max)) throw A(A::Zero );
+if (a + b > m_e) throw A(A::One );
+return (*this);
+}
+
+  inline E& j(const E )
+{
+  return i(0,0,s.m_e);
+}
+};
+
+class F : public C { };
+class G : public C { };
+class  : public B, public F, public G { };
+
+void k()
+{
+new ();
+}
+
+void l()
+{
+  E e1, e2;
+  e1.j(e2);
+}
-- 
2.14.3

Re: [PATCH] Do not inline variadic thunks (PR ipa/83549).

2018-01-03 Thread Martin Liška

On 01/03/2018 02:41 PM, Jan Hubicka wrote:
>> Hi.
>>
>> As mentioned in the PR, we should bail out inlining of thunks with variadic
>> arguments. It's problematic for cgraph_node::expand_thunk function that
>> does not support variadic functions.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-01-03  Martin Liska  
>>
>>  PR ipa/83549
>>  * ipa-fnsummary.c (compute_fn_summary): Do not inline variadic
>>  thunks.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-01-03  Martin Liska  
>>
>>  PR ipa/83549
>>  * g++.dg/ipa/pr83549.C: New test.
> 
> OK, but please introduce new CIF_CODE for this case.  MISMATCHED arguments
> is a kitchen sink for various issues and it is very hard to analyze what
> happened when it triggers.

Fully agree, I'm attaching patch that I'll commit soon.

Martin

> 
> Thanks,
> Honza
>> ---
>>  gcc/ipa-fnsummary.c| 5 +
>>  gcc/testsuite/g++.dg/ipa/pr83549.C | 8 
>>  2 files changed, 13 insertions(+)
>>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr83549.C
>>
>>
> 
>> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
>> index 94150312105..274bd8c6758 100644
>> --- a/gcc/ipa-fnsummary.c
>> +++ b/gcc/ipa-fnsummary.c
>> @@ -2422,6 +2422,11 @@ compute_fn_summary (struct cgraph_node *node, bool 
>> early)
>>info->inlinable = false;
>>node->callees->inline_failed = CIF_CHKP;
>>  }
>> +  else if (stdarg_p (TREE_TYPE (node->decl)))
>> +{
>> +  info->inlinable = false;
>> +  node->callees->inline_failed = CIF_MISMATCHED_ARGUMENTS;
>> +}
>>else
>>  info->inlinable = true;
>>  }
>> diff --git a/gcc/testsuite/g++.dg/ipa/pr83549.C 
>> b/gcc/testsuite/g++.dg/ipa/pr83549.C
>> new file mode 100644
>> index 000..90cf8fe7e0d
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/ipa/pr83549.C
>> @@ -0,0 +1,8 @@
>> +// PR ipa/83549
>> +// { dg-do compile }
>> +// { dg-options "-O2" }
>> +
>> +struct A { virtual ~A (); };
>> +struct B { virtual void foo (...); };
>> +struct C : A, B { void foo (...) {} };
>> +C c;
>>
> 

>From d890dbc95bdc57d13f26759888ef48f4b492f53e Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 3 Jan 2018 13:23:46 +0100
Subject: [PATCH] Do not inline variadic thunks (PR ipa/83549).

gcc/ChangeLog:

2018-01-03  Martin Liska  

	PR ipa/83549
	* cif-code.def (VARIADIC_THUNK): New enum value.
	* ipa-fnsummary.c (compute_fn_summary): Do not inline variadic
	thunks.

gcc/testsuite/ChangeLog:

2018-01-03  Martin Liska  

	PR ipa/83549
	* g++.dg/ipa/pr83549.C: New test.
---
 gcc/cif-code.def   | 4 
 gcc/ipa-fnsummary.c| 5 +
 gcc/testsuite/g++.dg/ipa/pr83549.C | 8 
 3 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr83549.C

diff --git a/gcc/cif-code.def b/gcc/cif-code.def
index 92d81d30a49..a587e712fdf 100644
--- a/gcc/cif-code.def
+++ b/gcc/cif-code.def
@@ -95,6 +95,10 @@ DEFCIFCODE(MISMATCHED_ARGUMENTS, CIF_FINAL_ERROR,
 DEFCIFCODE(LTO_MISMATCHED_DECLARATIONS, CIF_FINAL_ERROR,
 	   N_("mismatched declarations during linktime optimization"))
 
+/* Caller is variadic thunk.  */
+DEFCIFCODE(VARIADIC_THUNK, CIF_FINAL_ERROR,
+	   N_("variadic thunk call"))
+
 /* Call was originally indirect.  */
 DEFCIFCODE(ORIGINALLY_INDIRECT_CALL, CIF_FINAL_NORMAL,
 	   N_("originally indirect function call not considered for inlining"))
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 94150312105..9b1b7daca2e 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -2422,6 +2422,11 @@ compute_fn_summary (struct cgraph_node *node, bool early)
   info->inlinable = false;
   node->callees->inline_failed = CIF_CHKP;
 	}
+  else if (stdarg_p (TREE_TYPE (node->decl)))
+	{
+	  info->inlinable = false;
+	  node->callees->inline_failed = CIF_VARIADIC_THUNK;
+	}
   else
 info->inlinable = true;
 }
diff --git a/gcc/testsuite/g++.dg/ipa/pr83549.C b/gcc/testsuite/g++.dg/ipa/pr83549.C
new file mode 100644
index 000..90cf8fe7e0d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr83549.C
@@ -0,0 +1,8 @@
+// PR ipa/83549
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct A { virtual ~A (); };
+struct B { virtual void foo (...); };
+struct C : A, B { void foo (...) {} };
+C c;
-- 
2.14.3

[PATCH PR82439][simplify-rtx] Simplify (x | y) == x -> (y & ~x) == 0

2018-01-03 Thread Sudakshina Das


Hi

This patch add support for the missing transformation of (x | y) == x -> 
(y & ~x) == 0.
The transformation for (x & y) == x case already exists in 
simplify-rtx.c since 2014 as of r218503 and this patch only adds a 
couple of extra patterns for the IOR case.


Citing the example given in PR82439 :

Simple testcase (f1 should generate the same as f2):

int f1(int x, int y) { return (x | y) == x; }
int f2(int x, int y) { return (y & ~x) == 0; }

f1:
orr w1, w0, w1
cmp w1, w0
csetw0, eq
ret
f2:
bicswzr, w1, w0
csetw0, eq
ret

This benefits targets that have the BICS instruction to generate better 
code. Wilco helped me in showing that even in targets that do not have 
the BICS instruction, this is no worse and gives out 2 instructions.

For example in x86:

 :
   0:   09 fe   or %edi,%esi
   2:   31 c0   xor%eax,%eax
   4:   39 fe   cmp%edi,%esi
   6:   0f 94 c0sete   %al
   9:   c3  retq

0010 :
  10:   f7 d7   not%edi
  12:   31 c0   xor%eax,%eax
  14:   85 f7   test   %esi,%edi
  16:   0f 94 c0sete   %al
  19:   c3  retq

Testing done: Checked for regressions on bootstrapped 
aarch64-none-linux-gnu and arm-none-linux-gnueabihf and added new test 
cases.

Is this ok for trunk?

Sudi

ChangeLog Entries:

*** gcc/ChangeLog ***

2017-01-03  Sudakshina Das  

PR target/82439
* simplify-rtx.c (simplify_relational_operation_1): Add
simplifications of (x|y) == x for BICS pattern.

*** gcc/testsuite/ChangeLog ***

2017-01-03  Sudakshina Das  

PR target/82439
* gcc.target/aarch64/bics_5.c: New test.
* gcc.target/arm/bics_5.c: Likewise.

Re: [PATCH] Do not inline variadic thunks (PR ipa/83549).

2018-01-03 Thread Jan Hubicka

> Hi.
> 
> As mentioned in the PR, we should bail out inlining of thunks with variadic
> arguments. It's problematic for cgraph_node::expand_thunk function that
> does not support variadic functions.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-01-03  Martin Liska  
> 
>   PR ipa/83549
>   * ipa-fnsummary.c (compute_fn_summary): Do not inline variadic
>   thunks.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-01-03  Martin Liska  
> 
>   PR ipa/83549
>   * g++.dg/ipa/pr83549.C: New test.

OK, but please introduce new CIF_CODE for this case.  MISMATCHED arguments
is a kitchen sink for various issues and it is very hard to analyze what
happened when it triggers.

Thanks,
Honza
> ---
>  gcc/ipa-fnsummary.c| 5 +
>  gcc/testsuite/g++.dg/ipa/pr83549.C | 8 
>  2 files changed, 13 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr83549.C
> 
> 

> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index 94150312105..274bd8c6758 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -2422,6 +2422,11 @@ compute_fn_summary (struct cgraph_node *node, bool 
> early)
>info->inlinable = false;
>node->callees->inline_failed = CIF_CHKP;
>   }
> +  else if (stdarg_p (TREE_TYPE (node->decl)))
> + {
> +   info->inlinable = false;
> +   node->callees->inline_failed = CIF_MISMATCHED_ARGUMENTS;
> + }
>else
>  info->inlinable = true;
>  }
> diff --git a/gcc/testsuite/g++.dg/ipa/pr83549.C 
> b/gcc/testsuite/g++.dg/ipa/pr83549.C
> new file mode 100644
> index 000..90cf8fe7e0d
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ipa/pr83549.C
> @@ -0,0 +1,8 @@
> +// PR ipa/83549
> +// { dg-do compile }
> +// { dg-options "-O2" }
> +
> +struct A { virtual ~A (); };
> +struct B { virtual void foo (...); };
> +struct C : A, B { void foo (...) {} };
> +C c;
>

Re: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).

2018-01-03 Thread Jan Hubicka

> >> +  if (!original->in_same_comdat_group_p (alias))
> >> +{
> >> +  if (dump_file)
> >> +  fprintf (dump_file, "Not unifying; alias cannot be created; "
> >> +   "across comdat group boundary\n\n");
> >> +
> >> +  return false;
> >> +}
> > 
> > Wasn't we supposed to do the wrapper in this case?
> > 
> > Honza
> > 
> 
> We attempt to do a wrapper, but even with wrapper we cannot introduce such 
> call
> crossing the boundary. Proper message should be probably:
> 
> "Not unifying; alias nor wrapper cannot be created; across comdat group 
> boundary"

If the symbol we are calling is one exported from comdat, it shoud work fine
as long as we produce gimple thunk and it the symbol comdat exports.
Of course producing call from one comdat to another may close comdat loop
(which we handle elsewhere, so i assume original is comdat and alias is not)

I see in the PR is the split part of comdat which is comdat local, so perhaps
we want to check that original is not comdat local in addition to your current
check?

Honza
> 
> Martin

[PATCH] Do not inline variadic thunks (PR ipa/83549).

2018-01-03 Thread Martin Liška

Hi.

As mentioned in the PR, we should bail out inlining of thunks with variadic
arguments. It's problematic for cgraph_node::expand_thunk function that
does not support variadic functions.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2018-01-03  Martin Liska  

PR ipa/83549
* ipa-fnsummary.c (compute_fn_summary): Do not inline variadic
thunks.

gcc/testsuite/ChangeLog:

2018-01-03  Martin Liska  

PR ipa/83549
* g++.dg/ipa/pr83549.C: New test.
---
 gcc/ipa-fnsummary.c| 5 +
 gcc/testsuite/g++.dg/ipa/pr83549.C | 8 
 2 files changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr83549.C


diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 94150312105..274bd8c6758 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -2422,6 +2422,11 @@ compute_fn_summary (struct cgraph_node *node, bool early)
   info->inlinable = false;
   node->callees->inline_failed = CIF_CHKP;
 	}
+  else if (stdarg_p (TREE_TYPE (node->decl)))
+	{
+	  info->inlinable = false;
+	  node->callees->inline_failed = CIF_MISMATCHED_ARGUMENTS;
+	}
   else
 info->inlinable = true;
 }
diff --git a/gcc/testsuite/g++.dg/ipa/pr83549.C b/gcc/testsuite/g++.dg/ipa/pr83549.C
new file mode 100644
index 000..90cf8fe7e0d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr83549.C
@@ -0,0 +1,8 @@
+// PR ipa/83549
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct A { virtual ~A (); };
+struct B { virtual void foo (...); };
+struct C : A, B { void foo (...) {} };
+C c;

Re: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).

2018-01-03 Thread Martin Liška

On 01/03/2018 02:24 PM, Jan Hubicka wrote:
>> Hi.
>>
>> This patch is follow-up of r246848. This time ICF creates an edge between 2 
>> functions,
>> where one is inside a comdat group and second is not. I've got patch that is 
>> conservative
>> about the comdat groups (in_same_comdat_group_p).
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-01-03  Martin Liska  
>>
>>  PR ipa/82352
>>  * ipa-icf.c (sem_function::merge): Do not cross comdat boundary.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-01-03  Martin Liska  
>>
>>  PR ipa/82352
>>  * g++.dg/ipa/pr82352.C: New test.
>> ---
>>  gcc/ipa-icf.c  |  9 
>>  gcc/testsuite/g++.dg/ipa/pr82352.C | 93 
>> ++
>>  2 files changed, 102 insertions(+)
>>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr82352.C
>>
>>
> 
>> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
>> index a8d3b800318..a56c7306201 100644
>> --- a/gcc/ipa-icf.c
>> +++ b/gcc/ipa-icf.c
>> @@ -1113,6 +1113,15 @@ sem_function::merge (sem_item *alias_item)
>>return false;
>>  }
>>  
>> +  if (!original->in_same_comdat_group_p (alias))
>> +{
>> +  if (dump_file)
>> +fprintf (dump_file, "Not unifying; alias cannot be created; "
>> + "across comdat group boundary\n\n");
>> +
>> +  return false;
>> +}
> 
> Wasn't we supposed to do the wrapper in this case?
> 
> Honza
> 

We attempt to do a wrapper, but even with wrapper we cannot introduce such call
crossing the boundary. Proper message should be probably:

"Not unifying; alias nor wrapper cannot be created; across comdat group 
boundary"

Martin

Re: PR83648

2018-01-03 Thread Jan Hubicka


> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
> index 09ca3590039..0406d5588d2 100644
> --- a/gcc/ipa-pure-const.c
> +++ b/gcc/ipa-pure-const.c
> @@ -910,7 +910,8 @@ malloc_candidate_p (function *fun, bool ipa)
>  #define DUMP_AND_RETURN(reason)  \
>  {  \
>if (dump_file && (dump_flags & TDF_DETAILS))  \
> -fprintf (dump_file, "%s", (reason));  \
> +fprintf (dump_file, "\n%s is not a malloc candidate, reason: %s\n", \
> +  (node->name()), (reason));  \
>return false;  \
>  }
>  
> @@ -961,11 +962,14 @@ malloc_candidate_p (function *fun, bool ipa)
>   for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
> {
>   tree arg = gimple_phi_arg_def (phi, i);
> + if (arg == null_pointer_node)
> +   continue;

I think you want operand_equal_p here and also check for 
flag_delete_null_pointer_checks
because otherwise 0 can be legal memory block addrss.
> +
>   if (TREE_CODE (arg) != SSA_NAME)
> -   DUMP_AND_RETURN("phi arg is not SSA_NAME.")
> - if (!(arg == null_pointer_node || check_retval_uses (arg, phi)))
> -   DUMP_AND_RETURN("phi arg has uses outside phi"
> -   " and comparisons against 0.")
> +   DUMP_AND_RETURN ("phi arg is not SSA_NAME.");
> + if (!check_retval_uses (arg, phi))
> +   DUMP_AND_RETURN ("phi arg has uses outside phi"
> +" and comparisons against 0.")

So the case of phi(0,0) is not really correctness issue just missed 
optimization?
I would still add code to handle it (it is easy).

Do you also handle a case where parameter of phi is another phi?

Honza
>  
>   gimple *arg_def = SSA_NAME_DEF_STMT (arg);
>   gcall *call_stmt = dyn_cast (arg_def);
> diff --git a/gcc/testsuite/gcc.dg/ipa/pr83648.c 
> b/gcc/testsuite/gcc.dg/ipa/pr83648.c
> new file mode 100644
> index 000..03b45de671b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ipa/pr83648.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-local-pure-const-details" } */
> +
> +void *g(unsigned n)
> +{
> +  return n ? __builtin_malloc (n) : 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "Function found to be malloc: g" 
> "local-pure-const1" } } */

[PATCH] Add version to intermediate gcov file (PR gcov-profile/83669).

2018-01-03 Thread Martin Liška

Hi.

This is small enhancement reported by users of gcov tool. I'm aware of current 
stage of GCC,
but it's really small change in code. Apart from that, a small fix to 
documentation is included.

Survives gcov.exp, may I install it?
Thanks,
Martin

gcc/ChangeLog:

2018-01-03  Martin Liska  <mli...@suse.cz>

PR gcov-profile/83669
* gcov.c (output_intermediate_file): Add version to intermediate
gcov file.
* doc/gcov.texi: Document new field 'version' in intermediate
file format. Fix location of '-k' option of gcov command.
---
 gcc/doc/gcov.texi | 10 ++
 gcc/gcov.c|  1 +
 2 files changed, 7 insertions(+), 4 deletions(-)


diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index 8bf422e58d8..be7364d9da9 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -187,11 +187,8 @@ be used by @command{lcov} or other tools. The output is a single
 The format of the intermediate @file{.gcov} file is plain text with
 one entry per line
 
-@item -j
-@itemx --human-readable
-Write counts in human readable format (like 24k).
-
 @smallexample
+version:@var{gcc_version}
 file:@var{source_file_name}
 function:@var{start_line_number},@var{end_line_number},@var{execution_count},@var{function_name}
 lcount:@var{line number},@var{execution_count},@var{has_unexecuted_block}
@@ -212,6 +209,7 @@ times.
 Here is a sample when @option{-i} is used in conjunction with @option{-b} option:
 
 @smallexample
+version: 8.1.0 20180103
 file:tmp.cpp
 function:7,7,0,_ZN3FooIcEC2Ev
 function:7,7,1,_ZN3FooIiEC2Ev
@@ -252,6 +250,10 @@ branch:35,nottaken
 lcount:36,1,0
 @end smallexample
 
+@item -j
+@itemx --human-readable
+Write counts in human readable format (like 24k).
+
 @item -k
 @itemx --use-colors
 
diff --git a/gcc/gcov.c b/gcc/gcov.c
index 24e6da09fcf..3c7881b13fa 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -1035,6 +1035,7 @@ file 'foo.cc.gcov' similar to the above example. */
 static void
 output_intermediate_file (FILE *gcov_file, source_info *src)
 {
+  fprintf (gcov_file, "version:%s\n", version_string);
   fprintf (gcov_file, "file:%s\n", src->name);/* source file name */
 
   std::sort (src->functions.begin (), src->functions.end (),

Re: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).

2018-01-03 Thread Jan Hubicka

> Hi.
> 
> This patch is follow-up of r246848. This time ICF creates an edge between 2 
> functions,
> where one is inside a comdat group and second is not. I've got patch that is 
> conservative
> about the comdat groups (in_same_comdat_group_p).
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-01-03  Martin Liska  
> 
>   PR ipa/82352
>   * ipa-icf.c (sem_function::merge): Do not cross comdat boundary.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-01-03  Martin Liska  
> 
>   PR ipa/82352
>   * g++.dg/ipa/pr82352.C: New test.
> ---
>  gcc/ipa-icf.c  |  9 
>  gcc/testsuite/g++.dg/ipa/pr82352.C | 93 
> ++
>  2 files changed, 102 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr82352.C
> 
> 

> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index a8d3b800318..a56c7306201 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -1113,6 +1113,15 @@ sem_function::merge (sem_item *alias_item)
>return false;
>  }
>  
> +  if (!original->in_same_comdat_group_p (alias))
> +{
> +  if (dump_file)
> + fprintf (dump_file, "Not unifying; alias cannot be created; "
> +  "across comdat group boundary\n\n");
> +
> +  return false;
> +}

Wasn't we supposed to do the wrapper in this case?

Honza

Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).

2018-01-03 Thread Jakub Jelinek

On Wed, Jan 03, 2018 at 02:07:36PM +0100, Martin Liška wrote:
> 2018-01-03  Martin Liska  
> 
>   PR tree-optimization/83593
>   * tree-ssa-strlen.c (strlen_check_and_optimize_stmt): Clean-up
>   EH gimple statements.
>   (strlen_dom_walker::before_dom_children): Call
>   gimple_purge_dead_eh_edges.
>   (pass_strlen::execute): Return TODO_cleanup_cfg if needed.

The ChangeLog entry doesn't contain all the changes, like:
> @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-iterator.h"
>  #include "gimplify-me.h"
>  #include "expr.h"
> +#include "tree-cfg.h"
>  #include "tree-dfa.h"
>  #include "domwalk.h"
>  #include "tree-ssa-alias.h"

the above one.

>  static bool
> -strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi)
> +strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh)

This one (i.e. addition of a new argument).

> @@ -3318,10 +3337,16 @@ do_invalidate (basic_block dombb, gimple *phi, bitmap 
> visited, int *count)
>  class strlen_dom_walker : public dom_walker
>  {
>  public:
> -  strlen_dom_walker (cdi_direction direction) : dom_walker (direction) {}
> +  strlen_dom_walker (cdi_direction direction)
> +: dom_walker (direction), m_cleanup_cfg (false)
> +  {}

This one.
>  
>virtual edge before_dom_children (basic_block);
>virtual void after_dom_children (basic_block);
> +
> +  /* Flag that will trigger TODO_cleanup_cfg to be returned in strlen
> + execute function.  */
> +  bool m_cleanup_cfg;

This one too.

> +  bool cleanup_eh = false;
> +
>/* Attempt to optimize individual statements.  */
>for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
> -if (strlen_check_and_optimize_stmt ())
> +if (strlen_check_and_optimize_stmt (, _eh))
>gsi_next ();

And the fact that strlen_check_and_optimize_stmt caller has been adjusted.

>  
> +  if (cleanup_eh)
> +{
> +  gimple_purge_dead_eh_edges (bb);
> +  m_cleanup_cfg = true;
> +}

This should be
  if (cleanup_eh && gimple_purge_dead_eh_edges (bb))
m_cleanup_cfg = true;

Ok with that change and updated ChangeLog entry if it passes
bootstrap/regtest.

Jakub

Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).

2018-01-03 Thread Martin Liška

On 01/03/2018 01:50 PM, Jakub Jelinek wrote:
> If gimple_purge_dead_eh_edges returns true, you want to arrange for the
> pass to return TODO_cleanup_cfg (probably needs to use some global static
> variable to propagate that).
> 
>   Jakub

Hi.

Sending v2. I'm suggesting to propagate that in 
strlen_dom_walker::m_cleanup_cfg.

I've just triggered tests.
Ready to install after it finishes?
>From 217982bae6969865051f12798e4da444dd4aaa3f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 3 Jan 2018 12:15:40 +0100
Subject: [PATCH] Clean-up EH after strlen transformation (PR
 tree-optimization/83593).

gcc/ChangeLog:

2018-01-03  Martin Liska  

	PR tree-optimization/83593
	* tree-ssa-strlen.c (strlen_check_and_optimize_stmt): Clean-up
	EH gimple statements.
	(strlen_dom_walker::before_dom_children): Call
	gimple_purge_dead_eh_edges.
	(pass_strlen::execute): Return TODO_cleanup_cfg if needed.

gcc/testsuite/ChangeLog:

2018-01-03  Martin Liska  

	PR tree-optimization/83593
	* gcc.dg/pr83593.c: New test.
---
 gcc/testsuite/gcc.dg/pr83593.c | 15 +
 gcc/tree-ssa-strlen.c  | 48 --
 2 files changed, 56 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr83593.c

diff --git a/gcc/testsuite/gcc.dg/pr83593.c b/gcc/testsuite/gcc.dg/pr83593.c
new file mode 100644
index 000..eddecc0606a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr83593.c
@@ -0,0 +1,15 @@
+/* PR tree-optimization/83593 */
+/* { dg-options "-O2 -fno-tree-dominator-opts -fnon-call-exceptions -fno-tree-pre -fexceptions -fno-code-hoisting -fno-tree-fre" } */
+
+void
+hr (int *ed, signed char *ju)
+{
+  int kc;
+{
+  int xj;
+  int *q2 = (*ed == 0) ?  : 
+
+  *ju = 0;
+  kc = *ju;
+}
+}
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index be6ab9f1e1b..7ad6ff8228c 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimplify-me.h"
 #include "expr.h"
+#include "tree-cfg.h"
 #include "tree-dfa.h"
 #include "domwalk.h"
 #include "tree-ssa-alias.h"
@@ -3051,10 +3052,12 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
 }
 
 /* Attempt to check for validity of the performed access a single statement
-   at *GSI using string length knowledge, and to optimize it.  */
+   at *GSI using string length knowledge, and to optimize it.
+   If the given basic block needs clean-up of EH, CLEANUP_EH is set to
+   true.  */
 
 static bool
-strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi)
+strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh)
 {
   gimple *stmt = gsi_stmt (*gsi);
 
@@ -3201,11 +3204,27 @@ strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi)
 		if (w1 == w2
 			&& si->full_string_p)
 		  {
+			if (dump_file && (dump_flags & TDF_DETAILS) != 0)
+			  {
+			fprintf (dump_file, "Optimizing: ");
+			print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+			  }
+
 			/* Reading the final '\0' character.  */
 			tree zero = build_int_cst (TREE_TYPE (lhs), 0);
 			gimple_set_vuse (stmt, NULL_TREE);
 			gimple_assign_set_rhs_from_tree (gsi, zero);
-			update_stmt (gsi_stmt (*gsi));
+			*cleanup_eh
+			  |= maybe_clean_or_replace_eh_stmt (stmt,
+			 gsi_stmt (*gsi));
+			stmt = gsi_stmt (*gsi);
+			update_stmt (stmt);
+
+			if (dump_file && (dump_flags & TDF_DETAILS) != 0)
+			  {
+			fprintf (dump_file, "into: ");
+			print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+			  }
 		  }
 		else if (w1 > w2)
 		  {
@@ -3318,10 +3337,16 @@ do_invalidate (basic_block dombb, gimple *phi, bitmap visited, int *count)
 class strlen_dom_walker : public dom_walker
 {
 public:
-  strlen_dom_walker (cdi_direction direction) : dom_walker (direction) {}
+  strlen_dom_walker (cdi_direction direction)
+: dom_walker (direction), m_cleanup_cfg (false)
+  {}
 
   virtual edge before_dom_children (basic_block);
   virtual void after_dom_children (basic_block);
+
+  /* Flag that will trigger TODO_cleanup_cfg to be returned in strlen
+ execute function.  */
+  bool m_cleanup_cfg;
 };
 
 /* Callback for walk_dominator_tree.  Attempt to optimize various
@@ -3399,11 +3424,19 @@ strlen_dom_walker::before_dom_children (basic_block bb)
 	}
 }
 
+  bool cleanup_eh = false;
+
   /* Attempt to optimize individual statements.  */
   for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
-if (strlen_check_and_optimize_stmt ())
+if (strlen_check_and_optimize_stmt (, _eh))
   gsi_next ();
 
+  if (cleanup_eh)
+{
+  gimple_purge_dead_eh_edges (bb);
+  m_cleanup_cfg = true;
+}
+
   bb->aux = stridx_to_strinfo;
   if (vec_safe_length (stridx_to_strinfo) && !strinfo_shared ())
 (*stridx_to_strinfo)[0] = (strinfo *) bb;
@@ -3477,7 +3510,8 @@ pass_strlen::execute (function

Re: [C++ PATCH] Improve code generation for static_cast of pointers to reference types (PR c++/83555)

2018-01-03 Thread Jakub Jelinek

On Wed, Jan 03, 2018 at 01:49:11PM +0100, Richard Biener wrote:
> On January 3, 2018 1:21:40 PM GMT+01:00, Nathan Sidwell  
> wrote:
> >On 01/02/2018 04:12 PM, Jakub Jelinek wrote:
> >> Hi!
> >> 
> >> This patch improves code generated for:
> >> struct A { int a; };
> >> struct B { int b; };
> >> struct C : A, B { int c; };
> >> C *bar (B *b) { return _cast(*b); }
> >> Unlike return static_cast(b); where b can be validly NULL, the
> >> reference shouldn't bind to NULL, but we still emit
> >> b ? b - 4 : 0.  The following patch omits the non-NULL check except
> >when
> >> -fsanitize=null (or undefined) and when sanitizing makes sure such
> >bugs are
> >> diagnosed.
> >
> >It's sad the optimizers don't know REFERENCE_TYPE (x) means x != NULL. 
> >(or perhaps that's just a C++ semantic of REFERENCE_TYPE?).  
> 
> Given we treat reference and pointer types as interchangeable we indeed don't 
> know that. 

Yeah, within functions that is quickly lost, sometimes even during folding as
in this case.
In some cases the optimizers can infer that, e.g. nonnull_arg_p has:
  /* Values passed by reference are always non-NULL.  */
  if (TREE_CODE (TREE_TYPE (arg)) == REFERENCE_TYPE
  && flag_delete_null_pointer_checks)
return true;
because we don't change randomly types of function arguments...

Though, the above snippet reminds me I should probably also replace:
+  expr = build_base_path (MINUS_EXPR, expr, base,
+ /*nonnull=*/!sanitize_null_p, complain);
with:
+  expr = build_base_path (MINUS_EXPR, expr, base,
+ /*nonnull=*/(!sanitize_null_p
+  && flag_delete_null_pointer_checks),
+ complain);

Instead of checking for generated asm without sanitization, I think it will
be easier/more portable to verify no conditionals in *.optimized dump.
Let me repost the updated patch.

Jakub

Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).

2018-01-03 Thread Martin Liška

On 01/03/2018 01:49 PM, Marc Glisse wrote:
> On Wed, 3 Jan 2018, Martin Liška wrote:
> 
> +    *cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt,
> +  stmt);
> 
> Do you mean *cleanup_eh |= ... ?
> 

Yes. Thanks!

Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).

2018-01-03 Thread Jakub Jelinek

On Wed, Jan 03, 2018 at 01:27:01PM +0100, Martin Liška wrote:
>   /* Reading the final '\0' character.  */
>   tree zero = build_int_cst (TREE_TYPE (lhs), 0);
>   gimple_set_vuse (stmt, NULL_TREE);
>   gimple_assign_set_rhs_from_tree (gsi, zero);
> + *cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt,
> +   stmt);

The second stmt should be gsi_stmt (*gsi) just in case
gimple_assign_set_rhs_from_tree can't modify in-place, and probably you need
*cleanup_eh
  = maybe_clean_or_replace_eh_stmt (stmt,
gsi_stmt (*gsi));
then to get formatting right.

>   update_stmt (gsi_stmt (*gsi));
> +
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +   {
> + fprintf (dump_file, "into: ");
> + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);

Again, should be gsi_stmt (*gsi);.  Or do:
stmt = gsi_stmt (*gsi);
above update_stmt.  As stmt is used several times later, I think that is my
preference (though, in maybe_clean_or_replace_eh_stmt call you IMHO still want
gsi_stmt repeated).

> +   }
> }
>   else if (w1 > w2)
> {
> @@ -3399,11 +3416,16 @@ strlen_dom_walker::before_dom_children (basic_block 
> bb)
>   }
>  }
>  
> +  bool cleanup_eh = false;
> +
>/* Attempt to optimize individual statements.  */
>for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
> -if (strlen_check_and_optimize_stmt ())
> +if (strlen_check_and_optimize_stmt (, _eh))
>gsi_next ();
>  
> +  if (cleanup_eh)
> +gimple_purge_dead_eh_edges (bb);

If gimple_purge_dead_eh_edges returns true, you want to arrange for the
pass to return TODO_cleanup_cfg (probably needs to use some global static
variable to propagate that).

Jakub

Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).

2018-01-03 Thread Marc Glisse


On Wed, 3 Jan 2018, Martin Liška wrote:

+   *cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt,
+ stmt);

Do you mean *cleanup_eh |= ... ?

--
Marc Glisse

Re: [C++ PATCH] Improve code generation for static_cast of pointers to reference types (PR c++/83555)

2018-01-03 Thread Richard Biener

On January 3, 2018 1:21:40 PM GMT+01:00, Nathan Sidwell  wrote:
>On 01/02/2018 04:12 PM, Jakub Jelinek wrote:
>> Hi!
>> 
>> This patch improves code generated for:
>> struct A { int a; };
>> struct B { int b; };
>> struct C : A, B { int c; };
>> C *bar (B *b) { return _cast(*b); }
>> Unlike return static_cast(b); where b can be validly NULL, the
>> reference shouldn't bind to NULL, but we still emit
>> b ? b - 4 : 0.  The following patch omits the non-NULL check except
>when
>> -fsanitize=null (or undefined) and when sanitizing makes sure such
>bugs are
>> diagnosed.
>
>It's sad the optimizers don't know REFERENCE_TYPE (x) means x != NULL. 
>(or perhaps that's just a C++ semantic of REFERENCE_TYPE?).  

Given we treat reference and pointer types as interchangeable we indeed don't 
know that. 

Do we 
>manage to elide the check if we eventually dereference the pointer? 

We eventually should via path isolation of the null dereference. But better 
check that  ;) 

>(Not that that'd be an easy fix, but maybe worth a (new?) bug report.)
>
>Your patch is fine, but could you add a test case to make sure the null
>
>check is not there in the output assembly -- it'd be $cpu-of-choice 
>specific, of course.
>
>nathan

[PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).

2018-01-03 Thread Martin Liška

Hi.

Strlen pass does following transformation:

Optimizing: _7 = *ju_5(D);
into: _7 = 0;

which leads to need of removal of EH for the gimple statement. That's done via 
maybe_clean_or_replace_eh_stmt
and then we need to call gimple_purge_dead_eh_edges. Last question I have is 
whether it's also needed to
return TODO_cleanup_cfg or not?

Patch as is can bootstrap on ppc64le-redhat-linux and survives regression tests.

Martin

gcc/ChangeLog:

2018-01-03  Martin Liska  

PR tree-optimization/83593
* tree-ssa-strlen.c (strlen_check_and_optimize_stmt): Clean-up
EH gimple statements.
(strlen_dom_walker::before_dom_children): Call
gimple_purge_dead_eh_edges.

gcc/testsuite/ChangeLog:

2018-01-03  Martin Liska  

PR tree-optimization/83593
* gcc.dg/pr83593.c: New test.
---
 gcc/testsuite/gcc.dg/pr83593.c | 15 +++
 gcc/tree-ssa-strlen.c  | 28 +---
 2 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr83593.c


diff --git a/gcc/testsuite/gcc.dg/pr83593.c b/gcc/testsuite/gcc.dg/pr83593.c
new file mode 100644
index 000..eddecc0606a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr83593.c
@@ -0,0 +1,15 @@
+/* PR tree-optimization/83593 */
+/* { dg-options "-O2 -fno-tree-dominator-opts -fnon-call-exceptions -fno-tree-pre -fexceptions -fno-code-hoisting -fno-tree-fre" } */
+
+void
+hr (int *ed, signed char *ju)
+{
+  int kc;
+{
+  int xj;
+  int *q2 = (*ed == 0) ?  : 
+
+  *ju = 0;
+  kc = *ju;
+}
+}
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index be6ab9f1e1b..293aeceacce 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimplify-me.h"
 #include "expr.h"
+#include "tree-cfg.h"
 #include "tree-dfa.h"
 #include "domwalk.h"
 #include "tree-ssa-alias.h"
@@ -3051,10 +3052,12 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
 }
 
 /* Attempt to check for validity of the performed access a single statement
-   at *GSI using string length knowledge, and to optimize it.  */
+   at *GSI using string length knowledge, and to optimize it.
+   If the given basic block needs clean-up of EH, CLEANUP_EH is set to
+   true.  */
 
 static bool
-strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi)
+strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh)
 {
   gimple *stmt = gsi_stmt (*gsi);
 
@@ -3201,11 +3204,25 @@ strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi)
 		if (w1 == w2
 			&& si->full_string_p)
 		  {
+			if (dump_file && (dump_flags & TDF_DETAILS) != 0)
+			  {
+			fprintf (dump_file, "Optimizing: ");
+			print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+			  }
+
 			/* Reading the final '\0' character.  */
 			tree zero = build_int_cst (TREE_TYPE (lhs), 0);
 			gimple_set_vuse (stmt, NULL_TREE);
 			gimple_assign_set_rhs_from_tree (gsi, zero);
+			*cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt,
+  stmt);
 			update_stmt (gsi_stmt (*gsi));
+
+			if (dump_file && (dump_flags & TDF_DETAILS) != 0)
+			  {
+			fprintf (dump_file, "into: ");
+			print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+			  }
 		  }
 		else if (w1 > w2)
 		  {
@@ -3399,11 +3416,16 @@ strlen_dom_walker::before_dom_children (basic_block bb)
 	}
 }
 
+  bool cleanup_eh = false;
+
   /* Attempt to optimize individual statements.  */
   for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
-if (strlen_check_and_optimize_stmt ())
+if (strlen_check_and_optimize_stmt (, _eh))
   gsi_next ();
 
+  if (cleanup_eh)
+gimple_purge_dead_eh_edges (bb);
+
   bb->aux = stridx_to_strinfo;
   if (vec_safe_length (stridx_to_strinfo) && !strinfo_shared ())
 (*stridx_to_strinfo)[0] = (strinfo *) bb;

Re: C++ PATCH to fix -Wparentheses with MVP (PR c++/83592)

2018-01-03 Thread Nathan Sidwell


On 01/03/2018 06:59 AM, Marek Polacek wrote:

Since the -Wparentheses extension regarding Most Vexing Parse we started
warning on e.g. "reinterpret_cast()".  I don't think the warning was
meant to warn in this situation, since in reinterpret_cast there's no decl/expr
disambiguation, is it?

It seems we should simply disable the warning in TYPENAME context.


ok thanks ('' looks weird to me, but if the warning is 
annoying, so be it.)


nathan

--
Nathan Sidwell

Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Janne Blomqvist

On Wed, Jan 3, 2018 at 2:10 PM, Thomas Koenig  wrote:
> Hi Janne,
>
>> attached is a patch that makes the two attached testcases work. It
>> applies on top of the charlen->size_t patch. In the formatted I/O
>> stuff, I have mostly used ptrdiff_t to avoid having to deal with
>> signed/unsigned issues, as the previous code was using int.
>
>
> Did you regression-test?

Ah yes, I forgot to mention that. Yes, I did, though only on x86_64-linux-gnu

> If yes, I'd say this patch is OK (all the changes look obvious
> enough).
>
> With this, your character length patch is also OK. We can then
> open individual PRs for the other issues.
>
> However, I'd ask you to wait for a day or so with committing
> so that other people also have a chance for a (final) look
> at this.

Thanks! Dominique mentioned on IRC about some incoming comments, so I
guess it makes sense to wait a few days.



-- 
Janne Blomqvist

Re: [C++ PATCH] Improve code generation for static_cast of pointers to reference types (PR c++/83555)

2018-01-03 Thread Nathan Sidwell


On 01/02/2018 04:12 PM, Jakub Jelinek wrote:

Hi!

This patch improves code generated for:
struct A { int a; };
struct B { int b; };
struct C : A, B { int c; };
C *bar (B *b) { return _cast(*b); }
Unlike return static_cast(b); where b can be validly NULL, the
reference shouldn't bind to NULL, but we still emit
b ? b - 4 : 0.  The following patch omits the non-NULL check except when
-fsanitize=null (or undefined) and when sanitizing makes sure such bugs are
diagnosed.


It's sad the optimizers don't know REFERENCE_TYPE (x) means x != NULL. 
(or perhaps that's just a C++ semantic of REFERENCE_TYPE?).  Do we 
manage to elide the check if we eventually dereference the pointer? 
(Not that that'd be an easy fix, but maybe worth a (new?) bug report.)


Your patch is fine, but could you add a test case to make sure the null 
check is not there in the output assembly -- it'd be $cpu-of-choice 
specific, of course.


nathan

--
Nathan Sidwell

Re: [C++ PATCH] Fix ICE in ~macro_use_before_def (PR preprocessor/83602)

2018-01-03 Thread Nathan Sidwell


On 01/02/2018 04:07 PM, Jakub Jelinek wrote:

Hi!

If lookup_name_fuzzy finds an exact match with a macro, it later in the dtor
uses node->value.macro->line in libcpp.  The problem is that for builtin
macros node->value.macro contains garbage, we use node->value.builtin union
member in those cases instead.  It doesn't make any sense to look up
location of a builtin macro definition anyway, those are very special
entities.  So, this patch just ignores those, because we can't do anything
useful with them.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-02  Jakub Jelinek  

PR preprocessor/83602
* name-lookup.c (lookup_name_fuzzy): Don't use macro_use_before_def
for builtin macros.

* g++.dg/cpp/pr83602.C: New test.


ok


--
Nathan Sidwell

Re: [C++ PATCH] Avoid NOP_EXPRs with error_mark_node operand (PR c++/83634)

2018-01-03 Thread Nathan Sidwell


On 01/02/2018 04:03 PM, Jakub Jelinek wrote:

Hi!

The gimplifier uses in several places STRIP_USELESS_TYPE_CONVERSION and
that, being primarily a middle-end predicate, doesn't like error_mark_nodes
appearing in conversion operands.  I've talked about it with Richi on IRC
and he'd prefer not to change it.



2018-01-02  Jakub Jelinek  

PR c++/83634
* cp-gimplify.c (cp_fold) : If the operand folds to
error_mark_node, return error_mark_node.

* g++.dg/parse/pr83634.C: New test.



ok thanks


--
Nathan Sidwell

Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Thomas Koenig


Hi Janne,


attached is a patch that makes the two attached testcases work. It
applies on top of the charlen->size_t patch. In the formatted I/O
stuff, I have mostly used ptrdiff_t to avoid having to deal with
signed/unsigned issues, as the previous code was using int.


Did you regression-test?

If yes, I'd say this patch is OK (all the changes look obvious
enough).

With this, your character length patch is also OK. We can then
open individual PRs for the other issues.

However, I'd ask you to wait for a day or so with committing
so that other people also have a chance for a (final) look
at this.

Regards

Thomas

C++ PATCH to fix -Wparentheses with MVP (PR c++/83592)

2018-01-03 Thread Marek Polacek

Since the -Wparentheses extension regarding Most Vexing Parse we started
warning on e.g. "reinterpret_cast()".  I don't think the warning was
meant to warn in this situation, since in reinterpret_cast there's no decl/expr
disambiguation, is it?

It seems we should simply disable the warning in TYPENAME context.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-01-03  Marek Polacek  

PR c++/83592
* decl.c (grokdeclarator): Don't warn about MVP in typename context.

* g++.dg/warn/mvp2.C: New test.

diff --git gcc/cp/decl.c gcc/cp/decl.c
index f7b03e13303..b1c50961169 100644
--- gcc/cp/decl.c
+++ gcc/cp/decl.c
@@ -10866,10 +10866,11 @@ grokdeclarator (const cp_declarator *declarator,
 
   inner_declarator = declarator->declarator;
 
-  /* We don't want to warn in parmeter context because we don't
+  /* We don't want to warn in parameter context because we don't
 yet know if the parse will succeed, and this might turn out
 to be a constructor call.  */
   if (decl_context != PARM
+ && decl_context != TYPENAME
  && declarator->parenthesized != UNKNOWN_LOCATION
  /* If the type is class-like and the inner name used a
 global namespace qualifier, we need the parens.
diff --git gcc/testsuite/g++.dg/warn/mvp2.C gcc/testsuite/g++.dg/warn/mvp2.C
index e69de29bb2d..9b2793c5622 100644
--- gcc/testsuite/g++.dg/warn/mvp2.C
+++ gcc/testsuite/g++.dg/warn/mvp2.C
@@ -0,0 +1,24 @@
+// PR c++/83592
+// { dg-do compile }
+// { dg-options "-Wparentheses" }
+
+// Test that -Wparentheses does not give bogus warnings in
+// typename context.
+
+int *
+foo (long )
+{
+  return reinterpret_cast ();
+}
+
+int *
+bar (long )
+{
+  return (int (*)) 
+}
+
+int *
+baz (int )
+{
+  return static_cast ();
+}

Marek

Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64

2018-01-03 Thread Jakub Jelinek

On Tue, Jan 02, 2018 at 01:02:26PM -0700, Jeff Law wrote:
> It's fairly obvious that the probe of *sp isn't actually necessary here
> because the register saves in the prologue act as probe points for *sp.
> 
> In fact, the only way this can ever cause problems is if %esi is used in
> the body in which case it would have been callee saved in the prologue.
> So if we detect that %esi is already callee saved in the prologue then
> we could eliminate the explicit probe of *sp.
> 
> But we can do even better.  If any register is saved in the prologue,
> then that callee register save functions as an implicit probe of *sp and
> we do not need to explicitly probe *sp.
> 
> While this was reported with -m32, I'm pretty sure we can trigger a
> similar issue on x86_64.
> 
> Bootstrapped and regression tested on x86_64.  Also verified the
> testcase behavior on -m32.  The test uses flags to hopefully ensure
> expected behavior on x86/Solaris, but I did not explicitly test that
> configuration.
> 
> OK for the trunk?
> 
> Jeff
> 

Missing
PR target/83641
here.

>   * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not

adjust as Florian reported.

>   explicitly probe *sp in a noreturn function if there were any callee
>   register saves.

" or frame pointer is needed" ?

> 
>   * gcc.target/i386/stack-check-17.c: New test

Missing full stop after New test

This is a nice optimization, ok for trunk.

I think we do not want to put anything into the CFI even if the condition
you've added doesn't trigger, that is a pure space and runtime cycle waste,
except for noting the sp change if it is the current CFA.  Even after the
push the register value still holds the caller's value, and after the pop
too.  ix86_emit_restore_reg_using_pop does a lot of stuff we IMNSHO don't
want to do, if dummy_reg happened to be a drap reg, we'd emit really weird
stuff, the cfa restore, etc.  There are many other spots that gen_pop
themselves and do the appropriate thing at that spot, instead of
all using ix86_emit_restore_reg_using_pop.

Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?

2018-01-03  Jakub Jelinek  

PR target/83641
* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
and add REG_CFA_ADJUST_CFA notes in that case to both insns.

--- gcc/config/i386/i386.c.jj   2018-01-03 10:20:06.495535771 +0100
+++ gcc/config/i386/i386.c  2018-01-03 12:32:01.747649506 +0100
@@ -12229,9 +12229,21 @@ ix86_adjust_stack_and_probe_stack_clash
 argument passing registers so as not to introduce dependencies in
 the pipeline.  For 32 bit we use %esi and for 64 bit we use %rax.  */
   rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG);
-  rtx_insn *insn = emit_insn (gen_push (dummy_reg));
-  RTX_FRAME_RELATED_P (insn) = 1;
-  ix86_emit_restore_reg_using_pop (dummy_reg);
+  rtx_insn *insn_push = emit_insn (gen_push (dummy_reg));
+  rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg));
+  m->fs.sp_offset -= UNITS_PER_WORD;
+  if (m->fs.cfa_reg == stack_pointer_rtx)
+   {
+ m->fs.cfa_offset -= UNITS_PER_WORD;
+ rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD);
+ x = gen_rtx_SET (stack_pointer_rtx, x);
+ add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x);
+ RTX_FRAME_RELATED_P (insn_push) = 1;
+ x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
+ x = gen_rtx_SET (stack_pointer_rtx, x);
+ add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x);
+ RTX_FRAME_RELATED_P (insn_pop) = 1;
+   }
   emit_insn (gen_blockage ());
 }

Jakub

Re: [PATCH] PR 78534 Change character length from int to size_t

2018-01-03 Thread Janne Blomqvist

On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLisle  wrote:
> On 12/30/2017 12:35 PM, Janne Blomqvist wrote:
>> On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig  wrote:
> ---snip---
>>
>> I can provide that stuff as a separate patch, or merge it into the
>> original megapatch and resubmit that, whichever way you prefer.
>
> I would prefer we split into two patches. This will make review of the library
> I/O changes easier. The int len is used in a lot of places also where it 
> really
> happens to also be the kind (which is a length in our implementation).

Hi,

attached is a patch that makes the two attached testcases work. It
applies on top of the charlen->size_t patch. In the formatted I/O
stuff, I have mostly used ptrdiff_t to avoid having to deal with
signed/unsigned issues, as the previous code was using int.



-- 
Janne Blomqvist
From 72799c5587ee1e830bb424278286cff309d0c4a7 Mon Sep 17 00:00:00 2001
From: Janne Blomqvist 
Date: Tue, 2 Jan 2018 14:17:57 +0200
Subject: [PATCH] Use ptrdiff_t for formatted I/O sizes

---
 libgfortran/io/fbuf.c  | 44 ++---
 libgfortran/io/fbuf.h  | 16 +--
 libgfortran/io/io.h| 16 +--
 libgfortran/io/list_read.c | 15 +-
 libgfortran/io/read.c  | 70 +++---
 libgfortran/io/transfer.c  | 36 
 libgfortran/io/unix.c  | 20 ++---
 libgfortran/io/unix.h  | 12 
 libgfortran/io/write.c | 21 +++---
 9 files changed, 126 insertions(+), 124 deletions(-)

diff --git a/libgfortran/io/fbuf.c b/libgfortran/io/fbuf.c
index 944469d..d38d003 100644
--- a/libgfortran/io/fbuf.c
+++ b/libgfortran/io/fbuf.c
@@ -33,7 +33,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 
 
 void
-fbuf_init (gfc_unit *u, int len)
+fbuf_init (gfc_unit *u, ptrdiff_t len)
 {
   if (len == 0)
 len = 512;			/* Default size.  */
@@ -64,9 +64,9 @@ fbuf_debug (gfc_unit *u, const char *format, ...)
   va_start(args, format);
   vfprintf(stderr, format, args);
   va_end(args);
-  fprintf (stderr, "fbuf_debug pos: %d, act: %d, buf: ''", 
-   u->fbuf->pos, u->fbuf->act);
-  for (int ii = 0; ii < u->fbuf->act; ii++)
+  fprintf (stderr, "fbuf_debug pos: %ld, act: %ld, buf: ''",
+   (long) u->fbuf->pos, (long) u->fbuf->act);
+  for (ptrdiff_t ii = 0; ii < u->fbuf->act; ii++)
 {
   putc (u->fbuf->buf[ii], stderr);
 }
@@ -84,10 +84,10 @@ fbuf_debug (gfc_unit *u __attribute__ ((unused)),
underlying device.  Returns how much the physical position was
modified.  */
 
-int
+ptrdiff_t
 fbuf_reset (gfc_unit *u)
 {
-  int seekval = 0;
+  ptrdiff_t seekval = 0;
 
   if (!u->fbuf)
 return 0;
@@ -99,7 +99,7 @@ fbuf_reset (gfc_unit *u)
   if (u->mode == READING && u->fbuf->act > u->fbuf->pos)
 {
   seekval = - (u->fbuf->act - u->fbuf->pos);
-  fbuf_debug (u, "fbuf_reset seekval %d, ", seekval);
+  fbuf_debug (u, "fbuf_reset seekval %ld, ", (long) seekval);
 }
   u->fbuf->act = u->fbuf->pos = 0;
   return seekval;
@@ -111,11 +111,11 @@ fbuf_reset (gfc_unit *u)
reallocating if necessary.  */
 
 char *
-fbuf_alloc (gfc_unit *u, int len)
+fbuf_alloc (gfc_unit *u, ptrdiff_t len)
 {
-  int newlen;
+  ptrdiff_t newlen;
   char *dest;
-  fbuf_debug (u, "fbuf_alloc len %d, ", len);
+  fbuf_debug (u, "fbuf_alloc len %ld, ", (long) len);
   if (u->fbuf->pos + len > u->fbuf->len)
 {
   /* Round up to nearest multiple of the current buffer length.  */
@@ -138,7 +138,7 @@ fbuf_alloc (gfc_unit *u, int len)
 int
 fbuf_flush (gfc_unit *u, unit_mode mode)
 {
-  int nwritten;
+  ptrdiff_t nwritten;
 
   if (!u->fbuf)
 return 0;
@@ -177,7 +177,7 @@ fbuf_flush (gfc_unit *u, unit_mode mode)
 int
 fbuf_flush_list (gfc_unit *u, unit_mode mode)
 {
-  int nwritten;
+  ptrdiff_t nwritten;
 
   if (!u->fbuf)
 return 0;
@@ -206,8 +206,8 @@ fbuf_flush_list (gfc_unit *u, unit_mode mode)
 }
 
 
-int
-fbuf_seek (gfc_unit *u, int off, int whence)
+ptrdiff_t
+fbuf_seek (gfc_unit *u, ptrdiff_t off, int whence)
 {
   if (!u->fbuf)
 return -1;
@@ -226,7 +226,7 @@ fbuf_seek (gfc_unit *u, int off, int whence)
   return -1;
 }
 
-  fbuf_debug (u, "fbuf_seek, off %d ", off);
+  fbuf_debug (u, "fbuf_seek, off %ld ", (long) off);
   /* The start of the buffer is always equal to the left tab
  limit. Moving to the left past the buffer is illegal in C and
  would also imply moving past the left tab limit, which is never
@@ -248,21 +248,21 @@ fbuf_seek (gfc_unit *u, int off, int whence)
of bytes actually processed. */
 
 char *
-fbuf_read (gfc_unit *u, int *len)
+fbuf_read (gfc_unit *u, ptrdiff_t *len)
 {
   char *ptr;
-  int oldact, oldpos;
-  int readlen = 0;
+  ptrdiff_t oldact, oldpos;
+  ptrdiff_t readlen = 0;
 
-  fbuf_debug (u, "fbuf_read, len %d: ", *len);
+  fbuf_debug (u, "fbuf_read, len %ld: ", (long) *len);

Re: PR83648

2018-01-03 Thread Jan Hubicka

> On Wed, Jan 03, 2018 at 10:05:30AM +0100, Richard Biener wrote:
> > >One concern I have is that with the patch, malloc_candidate_p will
> > >return true if all the args to PHI are NULL:
> > >retval = PHI<0, 0>
> > >return retval
> > >
> > >However I expect that PHI with all 0 args would be constant folded to
> > >0 earlier, so this case shouldn't occur in practice ?
> > 
> > You may not rely on folding for correctness. 

.. and at this level i would say even for code quality. Early optimizers are
facing a lot of garbage and they are not repeated, so we get code at various
intermediate levels of optimizations thorugh the IPA queue.

Honza
> 
> Yeah.  Will the patch handle (I mean punt on) also unfolded
>   if (n) ? 0 : __builtin_malloc (n);
> and similar?
> 
>   Jakub

[PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).

2018-01-03 Thread Martin Liška

Hi.

This patch is follow-up of r246848. This time ICF creates an edge between 2 
functions,
where one is inside a comdat group and second is not. I've got patch that is 
conservative
about the comdat groups (in_same_comdat_group_p).

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2018-01-03  Martin Liska  

PR ipa/82352
* ipa-icf.c (sem_function::merge): Do not cross comdat boundary.

gcc/testsuite/ChangeLog:

2018-01-03  Martin Liska  

PR ipa/82352
* g++.dg/ipa/pr82352.C: New test.
---
 gcc/ipa-icf.c  |  9 
 gcc/testsuite/g++.dg/ipa/pr82352.C | 93 ++
 2 files changed, 102 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr82352.C


diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index a8d3b800318..a56c7306201 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -1113,6 +1113,15 @@ sem_function::merge (sem_item *alias_item)
   return false;
 }
 
+  if (!original->in_same_comdat_group_p (alias))
+{
+  if (dump_file)
+	fprintf (dump_file, "Not unifying; alias cannot be created; "
+		 "across comdat group boundary\n\n");
+
+  return false;
+}
+
   /* See if original is in a section that can be discarded if the main
  symbol is not used.  */
 
diff --git a/gcc/testsuite/g++.dg/ipa/pr82352.C b/gcc/testsuite/g++.dg/ipa/pr82352.C
new file mode 100644
index 000..c044345a486
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr82352.C
@@ -0,0 +1,93 @@
+// PR ipa/82352
+// { dg-do compile }
+// { dg-options "-O2" }
+
+typedef long unsigned int size_t;
+
+class A
+{
+public :
+  typedef enum { Zero = 0, One = 1 } tA;
+  A(tA a) { m_a = a; }
+
+private :
+  tA m_a;
+};
+
+class B
+{
+public :
+  void *operator new(size_t t) { return (void*)(42); };
+};
+
+class C
+{
+public:
+  virtual void  () = 0;
+};
+
+class D
+{
+ public :
+  virtual void g() = 0;
+  virtual void h() = 0;
+};
+
+template class : public T, public D
+{
+public:
+ void ()
+ {
+   if (!m_i2) throw A(A::One);
+ };
+
+ void h()
+ {
+  if (m_i2) throw A(A::Zero);
+ }
+
+protected:
+ virtual void g()
+ {
+  if (m_i1 !=0) throw A(A::Zero);
+ };
+
+private :
+ int m_i1;
+ void *m_i2;
+};
+
+class E
+{
+private:
+size_t m_e;
+static const size_t Max;
+
+public:
+E& i(size_t a, size_t b, size_t c)
+{
+if ((a > Max) || (c > Max)) throw A(A::Zero );
+if (a + b > m_e) throw A(A::One );
+return (*this);
+}
+
+  inline E& j(const E )
+{
+  return i(0,0,s.m_e);
+}
+};
+
+class F : public C { };
+class G : public C { };
+class  : public B, public F, public G { };
+
+void k()
+{
+new ();
+}
+
+void l()
+{
+  E e1, e2;
+  e1.j(e2);
+}

Re: [patch, fortran] Fix PR 83664, missing check on boundary argument for eoshift

2018-01-03 Thread Janne Blomqvist

On Wed, Jan 3, 2018 at 1:13 PM, Thomas Koenig  wrote:
> Hello world,
>
> the attached patch fixes a missing check for eoshift.
>
> If you're wondering what the "else" belongs to - it is
>
>   if (boundary != NULL)
>
> Regression-tested. OK for trunk?
>
> Regards
>
> Thomas
>
> 2018-01-03  Thomas Koenig  
>
> PR fortran/83664
> * check.c (gfc_check_eoshift): Error for missing boundary if array
> is not one of the standard types.
>
> 2018-01-03  Thomas Koenig  
>
> PR fortran/83664
> * gfortran.dg/eoshift_7.f90: New test.

Ok for trunk, thanks!

-- 
Janne Blomqvist

[patch, fortran] Fix PR 83664, missing check on boundary argument for eoshift

2018-01-03 Thread Thomas Koenig


Hello world,

the attached patch fixes a missing check for eoshift.

If you're wondering what the "else" belongs to - it is

  if (boundary != NULL)

Regression-tested. OK for trunk?

Regards

Thomas

2018-01-03  Thomas Koenig  

PR fortran/83664
* check.c (gfc_check_eoshift): Error for missing boundary if array
is not one of the standard types.

2018-01-03  Thomas Koenig  

PR fortran/83664
* gfortran.dg/eoshift_7.f90: New test.
Index: check.c
===
--- check.c	(Revision 255788)
+++ check.c	(Arbeitskopie)
@@ -2270,6 +2270,26 @@ gfc_check_eoshift (gfc_expr *array, gfc_expr *shif
 	  return false;
 	}
 }
+  else
+{
+  switch (array->ts.type)
+	{
+	case BT_INTEGER:
+	case BT_LOGICAL:
+	case BT_REAL:
+	case BT_COMPLEX:
+	case BT_CHARACTER:
+	  break;
+	  
+	default:
+	  gfc_error ("Missing %qs argument to %qs intrinsic at %L for %qs "
+		 "of type %qs", gfc_current_intrinsic_arg[2]->name,
+		 gfc_current_intrinsic, >where,
+		 gfc_current_intrinsic_arg[0]->name,
+		 gfc_typename (>ts));
+	  return false;
+	}
+}
 
   return true;
 }
! { dg-do compile }
program main
  type t
integer :: x
  end type t
  type(t), dimension(2) :: a, b
  a(1)%x = 1
  a(2)%x = 2
  b = eoshift(a,1) ! { dg-error "Missing 'boundary' argument to 'eoshift' intrinsic" }
  print *,b%x
end program main

Re: [AARCH64] implements neon vld1_*_x2 intrinsics

2018-01-03 Thread Christophe Lyon

Hi Kugan,


On 15 November 2017 at 12:23, James Greenhalgh  wrote:
> On Wed, Nov 15, 2017 at 09:58:28AM +, Kyrill Tkachov wrote:
>> Hi Kugan,
>>
>> On 07/11/17 04:10, Kugan Vivekanandarajah wrote:
>> > Hi,
>> >
>> > Attached patch implements the  vld1_*_x2 intrinsics as defined by the
>> > neon document.
>> >
>> > Bootstrap for the latest patch is ongoing on aarch64-linux-gnu. Is
>> > this OK for trunk if no regressions?
>> >
>>
>> This looks mostly ok to me (though I cannot approve) modulo a couple of
>> minor type issues below.
>
> Thanks for the review Kyrill!
>
> I'm happy to trust Kyrill's knowledge of the back-end here, so the patch
> is OK with the changes Kyrill requested.
>
> Thanks for the patch!
>
> James
>
>> > gcc/ChangeLog:
>> >
>> > 2017-11-06  Kugan Vivekanandarajah 
>> >
>> > * config/aarch64/aarch64-simd.md (aarch64_ld1x2): New.
>> > (aarch64_ld1x2): Likewise.
>> > (aarch64_simd_ld1_x2): Likewise.
>> > (aarch64_simd_ld1_x2): Likewise.
>> > * config/aarch64/arm_neon.h (vld1_u8_x2): New.
>> > (vld1_s8_x2): Likewise.
>> > (vld1_u16_x2): Likewise.
>> > (vld1_s16_x2): Likewise.
>> > (vld1_u32_x2): Likewise.
>> > (vld1_s32_x2): Likewise.
>> > (vld1_u64_x2): Likewise.
>> > (vld1_s64_x2): Likewise.
>> > (vld1_f16_x2): Likewise.
>> > (vld1_f32_x2): Likewise.
>> > (vld1_f64_x2): Likewise.
>> > (vld1_p8_x2): Likewise.
>> > (vld1_p16_x2): Likewise.
>> > (vld1_p64_x2): Likewise.
>> > (vld1q_u8_x2): Likewise.
>> > (vld1q_s8_x2): Likewise.
>> > (vld1q_u16_x2): Likewise.
>> > (vld1q_s16_x2): Likewise.
>> > (vld1q_u32_x2): Likewise.
>> > (vld1q_s32_x2): Likewise.
>> > (vld1q_u64_x2): Likewise.
>> > (vld1q_s64_x2): Likewise.
>> > (vld1q_f16_x2): Likewise.
>> > (vld1q_f32_x2): Likewise.
>> > (vld1q_f64_x2): Likewise.
>> > (vld1q_p8_x2): Likewise.
>> > (vld1q_p16_x2): Likewise.
>> > (vld1q_p64_x2): Likewise.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> > 2017-11-06  Kugan Vivekanandarajah 
>> >
>> > * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: New test.
>>

Sorry for not seeing this before you committed this patch, but the new
test fails to compile on arm targets.
Can you add the proper guard, as there is in other tests in the same dir?

Other question: why do you force -O3? The harness iterates on O0, O1, 

Thanks,

Christophe


>> +__extension__ extern __inline int8x8x2_t
>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> +vld1_s8_x2 (const uint8_t *__a)
>>
>> This should be "const int8_t *"
>>
>>   +{
>> +  int8x8x2_t ret;
>> +  __builtin_aarch64_simd_oi __o;
>> +  __o = __builtin_aarch64_ld1x2v8qi ((const __builtin_aarch64_simd_qi *) 
>> __a);
>> +  ret.val[0] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0);
>> +  ret.val[1] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1);
>> +  return ret;
>> +}
>>
>> ...
>>
>> +__extension__ extern __inline int32x2x2_t
>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> +vld1_s32_x2 (const uint32_t *__a)
>>
>> Likewise, this should be "const int32_t *"
>>
>> +{
>> +  int32x2x2_t ret;
>> +  __builtin_aarch64_simd_oi __o;
>> +  __o = __builtin_aarch64_ld1x2v2si ((const __builtin_aarch64_simd_si *) 
>> __a);
>> +  ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0);
>> +  ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1);
>> +  return ret;
>> +}
>> +
>>
>>

[committed] Tweak update-copyright.py so that it doesn't fail on pdt_5.f03

2018-01-03 Thread Jakub Jelinek

Hi!

This testcase contains:

! Third, complete example from the PGInsider article:
! "Object-Oriented Programming in Fortran 2003 Part 3: Parameterized Derived 
Types"
! by Mark Leair
!
! Copyright (c) 2013, NVIDIA CORPORATION.  All rights reserved.
!
! NVIDIA CORPORATION and its licensors retain all intellectual property
! and proprietary rights in and to this software, related documentation
! and any modifications thereto.  Any use, reproduction, disclosure or
! distribution of this software and related documentation without an express
! license agreement from NVIDIA CORPORATION is strictly prohibited.
!

!  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT
!   WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT
!   NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR
!   FITNESS FOR A PARTICULAR PURPOSE.
!
! Note that modification had to be made all of which are commented.

which makes me wonder if it shouldn't be removed.  So that I could update
the copyright years, I've committed following workaround for that, but if
the testcase is removed or rewritten that can go away as well.

2018-01-03  Jakub Jelinek  

* update-copyright.py: Skip pdt-5.f03 in gfortran.dg subdir.

--- contrib/update-copyright.py (revision 256167)
+++ contrib/update-copyright.py (revision 256168)
@@ -591,6 +591,8 @@ class TestsuiteFilter (GenericFilter):
 # Similarly params/README.
 if filename == 'README' and os.path.basename (dir) == 'params':
 return True
+if filename == 'pdt_5.f03' and os.path.basename (dir) == 'gfortran.dg':
+   return True
 return GenericFilter.skip_file (self, dir, filename)
 
 class LibCppFilter (GenericFilter):

Jakub

Re: [patch AArch64] Do not perform a vector splat for vector initialisation if it is not useful

2018-01-03 Thread Christophe Lyon

On 19 December 2017 at 00:36, Jeff Law  wrote:
> On 12/11/2017 08:44 AM, James Greenhalgh wrote:
>> Hi,
>>
>> In the testcase in this patch we create an SLP vector with only two
>> elements. Our current vector initialisation code will first duplicate
>> the first element to both lanes, then overwrite the top lane with a new
>> value.
>>
>> This duplication can be clunky and wasteful.
>>
>> Better would be to simply use the fact that we will always be overwriting
>> the remaining bits, and simply move the first element to the corrcet place
>> (implicitly zeroing all other bits).
>>
>> This reduces the code generation for this case, and can allow more
>> efficient addressing modes, and other second order benefits for AArch64
>> code which has been vectorized to V2DI mode.
>>
>> Note that the change is generic enough to catch the case for any vector
>> mode, but is expected to be most useful for 2x64-bit vectorization.
>>
>> Unfortunately, on its own, this would cause failures in
>> gcc.target/aarch64/load_v2vec_lanes_1.c and
>> gcc.target/aarch64/store_v2vec_lanes.c , which expect to see many more
>> vec_merge and vec_duplicate for their simplifications to apply. To fix this,
>> add a special case to the AArch64 code if we are loading from two memory
>> addresses, and use the load_pair_lanes patterns directly.
>>
>> We also need a new pattern in simplify-rtx.c:simplify_ternary_operation , to
>> catch:
>>
>>   (vec_merge:OUTER
>>  (vec_duplicate:OUTER x:INNER)
>>  (subreg:OUTER y:INNER 0)
>>  (const_int N))
>>
>> And simplify it to:
>>
>>   (vec_concat:OUTER x:INNER y:INNER) or (vec_concat y x)
>>
>> This is similar to the existing patterns which are tested in this function,
>> without requiring the second operand to also be a vec_duplicate.
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu and tested on
>> aarch64-none-elf.
>>
>> Note that this requires 
>> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00614.html
>> if we don't want to ICE creating broken vector zero extends.
>>
>> Are the non-AArch64 parts OK?
>>
>> Thanks,
>> James
>>
>> ---
>> 2017-12-11  James Greenhalgh  
>>
>>   * config/aarch64/aarch64.c (aarch64_expand_vector_init): Modify code
>>   generation for cases where splatting a value is not useful.
>>   * simplify-rtx.c (simplify_ternary_operation): Simplify vec_merge
>>   across a vec_duplicate and a paradoxical subreg forming a vector
>>   mode to a vec_concat.
>>
>> 2017-12-11  James Greenhalgh  
>>
>>   * gcc.target/aarch64/vect-slp-dup.c: New.
>>
>>
>> 0001-patch-AArch64-Do-not-perform-a-vector-splat-for-vect.patch
>>
>>
>
>> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
>> index 806c309..ed16f70 100644
>> --- a/gcc/simplify-rtx.c
>> +++ b/gcc/simplify-rtx.c
>> @@ -5785,6 +5785,36 @@ simplify_ternary_operation (enum rtx_code code, 
>> machine_mode mode,
>>   return simplify_gen_binary (VEC_CONCAT, mode, newop0, newop1);
>>   }
>>
>> +   /* Replace:
>> +
>> +   (vec_merge:outer (vec_duplicate:outer x:inner)
>> +(subreg:outer y:inner 0)
>> +(const_int N))
>> +
>> +  with (vec_concat:outer x:inner y:inner) if N == 1,
>> +  or (vec_concat:outer y:inner x:inner) if N == 2.
>> +
>> +  Implicitly, this means we have a paradoxical subreg, but such
>> +  a check is cheap, so make it anyway.
> I'm going to assume that N == 1 and N == 3 are handled elsewhere and do
> not show up here in practice.
>
>
> So is it advisable to handle the case where the VEC_DUPLICATE and SUBREG
> show up in the opposite order?  Or is there some canonicalization that
> prevents that?
>
> simplify-rtx bits are OK as-is if we're certain we're not going to get
> the alternate ordering of the VEC_MERGE operands.  ALso OK if you either
> generalize this chunk of code or duplicate & twiddle it to handle the
> alternate order.
>
> I didn't look at the aarch64 specific bits.
>

Hi James,

Your patch (r255946) introduced regressions on aarch64_be:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255946/report-build-info.html

I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83663
to track this.

Thanks,

Christophe

> jeff
>

Re: [Patch, fortran] PR83076 - [8 Regression] ICE in gfc_deallocate_scalar_with_status, at fortran/trans.c:1598

2018-01-03 Thread Paul Richard Thomas

Many thanks, Damian. I will commit soonish; probably tomorrow.

Paul


On 3 January 2018 at 00:22, Damian Rouson  wrote:
> I have now confirmed that the patch works the same for the 7 branch: it
> doesn’t break any previously passing tests.
>
> Damian
>
> On January 1, 2018 at 9:44:59 AM, Paul Richard Thomas
> (paul.richard.tho...@gmail.com) wrote:
>
> Committed to trunk as revision 256065.
>
> Damian, it would be good if you would confirm that there are no issues
> with applying the patch to 7-branch.
>
> Thanks for all the help.
>
> Paul
>
> On 28 December 2017 at 19:58, Damian Rouson
>  wrote:
>> I applied the patch the trunk and confirmed that it doesn’t break any
>> previously
>> passing OpenCoarrays tests. Is that sufficient or should I also try
>> applying the
>> patch to the 7 branch?
>>
>> Damian
>>
>>
>
>
>
> --
> "If you can't explain it simply, you don't understand it well enough"
> - Albert Einstein



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

[committed] Update copyright years

2018-01-03 Thread Jakub Jelinek

Hi!

Happy New Year to everyone!

2018-01-03  Jakub Jelinek  

gcc/
* gcc.c (process_command): Update copyright notice dates.
* gcov-dump.c (print_version): Ditto.
* gcov.c (print_version): Ditto.
* gcov-tool.c (print_version): Ditto.
* gengtype.c (create_file): Ditto.
* doc/cpp.texi: Bump @copying's copyright year.
* doc/cppinternals.texi: Ditto.
* doc/gcc.texi: Ditto.
* doc/gccint.texi: Ditto.
* doc/gcov.texi: Ditto.
* doc/install.texi: Ditto.
* doc/invoke.texi: Ditto.
gcc/fortran/
* gfortranspec.c (lang_specific_driver): Update copyright notice
dates.
* gfc-internals.texi: Bump @copying's copyright year.
* gfortran.texi: Ditto.
* intrinsic.texi: Ditto.
* invoke.texi: Ditto.
gcc/ada/
* gnat_ugn.texi: Bump @copying's copyright year.
* gnat_rm.texi: Likewise.
gcc/go/
* gccgo.texi: Bump @copyrights-go year.
libitm/
* libitm.texi: Bump @copying's copyright year.
libgomp/
* libgomp.texi: Bump @copying's copyright year.
libquadmath/
* libquadmath.texi: Bump @copying's copyright year.

--- libitm/libitm.texi  (revision 243991)
+++ libitm/libitm.texi  (revision 243992)
@@ -7,7 +7,7 @@
 
 
 @copying
-Copyright @copyright{} 2011-2017 Free Software Foundation, Inc.
+Copyright @copyright{} 2011-2018 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.2 or
--- libgomp/libgomp.texi(revision 243991)
+++ libgomp/libgomp.texi(revision 243992)
@@ -7,7 +7,7 @@
 
 
 @copying
-Copyright @copyright{} 2006-2017 Free Software Foundation, Inc.
+Copyright @copyright{} 2006-2018 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- libquadmath/libquadmath.texi(revision 243991)
+++ libquadmath/libquadmath.texi(revision 243992)
@@ -6,7 +6,7 @@
 @c %**end of header
 
 @copying
-Copyright @copyright{} 2010-2017 Free Software Foundation, Inc.
+Copyright @copyright{} 2010-2018 Free Software Foundation, Inc.
 
 @quotation
 Permission is granted to copy, distribute and/or modify this document
--- gcc/doc/cpp.texi(revision 243991)
+++ gcc/doc/cpp.texi(revision 243992)
@@ -10,7 +10,7 @@
 
 @copying
 @c man begin COPYRIGHT
-Copyright @copyright{} 1987-2017 Free Software Foundation, Inc.
+Copyright @copyright{} 1987-2018 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/gcc.texi(revision 243991)
+++ gcc/doc/gcc.texi(revision 243992)
@@ -40,7 +40,7 @@
 @c %**end of header
 
 @copying
-Copyright @copyright{} 1988-2017 Free Software Foundation, Inc.
+Copyright @copyright{} 1988-2018 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/cppinternals.texi   (revision 243991)
+++ gcc/doc/cppinternals.texi   (revision 243992)
@@ -18,7 +18,7 @@
 @ifinfo
 This file documents the internals of the GNU C Preprocessor.
 
-Copyright (C) 2000-2017 Free Software Foundation, Inc.
+Copyright (C) 2000-2018 Free Software Foundation, Inc.
 
 Permission is granted to make and distribute verbatim copies of
 this manual provided the copyright notice and this permission notice
@@ -47,7 +47,7 @@ into another language, under the above c
 @page
 @vskip 0pt plus 1filll
 @c man begin COPYRIGHT
-Copyright @copyright{} 2000-2017 Free Software Foundation, Inc.
+Copyright @copyright{} 2000-2018 Free Software Foundation, Inc.
 
 Permission is granted to make and distribute verbatim copies of
 this manual provided the copyright notice and this permission notice
--- gcc/doc/gccint.texi (revision 243991)
+++ gcc/doc/gccint.texi (revision 243992)
@@ -26,7 +26,7 @@
 @c %**end of header
 
 @copying
-Copyright @copyright{} 1988-2017 Free Software Foundation, Inc.
+Copyright @copyright{} 1988-2018 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/invoke.texi (revision 243991)
+++ gcc/doc/invoke.texi (revision 243992)
@@ -8,7 +8,7 @@
 @c man end
 
 @c man begin COPYRIGHT
-Copyright @copyright{} 1988-2017 Free Software Foundation, Inc.
+Copyright @copyright{} 1988-2018 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/gcov.texi   (revision 243991)
+++ gcc/doc/gcov.texi   (revision 243992)
@@ -4,7 +4,7 @@
 
 @ignore
 @c man begin COPYRIGHT
-Copyright

Re: PR83648

2018-01-03 Thread Jakub Jelinek

On Wed, Jan 03, 2018 at 10:05:30AM +0100, Richard Biener wrote:
> >One concern I have is that with the patch, malloc_candidate_p will
> >return true if all the args to PHI are NULL:
> >retval = PHI<0, 0>
> >return retval
> >
> >However I expect that PHI with all 0 args would be constant folded to
> >0 earlier, so this case shouldn't occur in practice ?
> 
> You may not rely on folding for correctness. 

Yeah.  Will the patch handle (I mean punt on) also unfolded
  if (n) ? 0 : __builtin_malloc (n);
and similar?

Jakub

Re: PR83648

2018-01-03 Thread Richard Biener

On January 3, 2018 7:03:26 AM GMT+01:00, Prathamesh Kulkarni 
 wrote:
>Hi,
>malloc_candidate_p() in ipa-pure-const misses detecting that a
>function is malloc-like if the  return value is result of PHI and one
>of the arguments of PHI is 0.
>For example:
>
>void g(unsigned n)
>{
>  return (n) ? __builtin_malloc (n) : 0;
>}
>
>The reason is that the following check:
>if (TREE_CODE (arg) != SSA_NAME)
>  DUMP_AND_RETURN ("phi arg is not SSA_NAME.")
>
>fails for arg with constant value 0 and malloc_candidate_p returns
>false.
>The patch simply skips the arg if it equals null_pointer_node.
>Does it look OK ?

Please use integer_zerop instead of a literal compare. I'm not sure how to 
handle targets where address zero is valid (flag_no_delete_null_pointer_chekcs) 

>One concern I have is that with the patch, malloc_candidate_p will
>return true if all the args to PHI are NULL:
>retval = PHI<0, 0>
>return retval
>
>However I expect that PHI with all 0 args would be constant folded to
>0 earlier, so this case shouldn't occur in practice ?

You may not rely on folding for correctness. 

>Bootstrapped+tested on x86_64-unknown-linux-gnu.
>Cross-testing on arm*-*-* and aarch64*-*-* in progress.
>
>Thanks,
>Prathamesh

[committed] [100.2/nnn] poly_int: vector_builder element count

2018-01-03 Thread Richard Sandiford

This patch changes the number of elements in a vector being built
by a vector_builder from unsigned int to poly_uint64.  The case
in which it isn't a constant is the one that motivated adding
the vector encoding in the first place.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the before and after assembly output for at
least one target per CPU directory.  Committed as pre-approved by
Jeff here: https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01409.html .

Thanks,
Richard


2018-01-03  Richard Sandiford  

gcc/
* vector-builder.h (vector_builder::m_full_nelts): Change from
unsigned int to poly_uint64.
(vector_builder::full_nelts): Update prototype accordingly.
(vector_builder::new_vector): Likewise.
(vector_builder::encoded_full_vector_p): Handle polynomial full_nelts.
(vector_builder::operator ==): Likewise.
(vector_builder::finalize): Likewise.
* int-vector-builder.h (int_vector_builder::int_vector_builder):
Take the number of elements as a poly_uint64 rather than an
unsigned int.
* vec-perm-indices.h (vec_perm_indices::m_nelts_per_input): Change
from unsigned int to poly_uint64.
(vec_perm_indices::vec_perm_indices): Update prototype accordingly.
(vec_perm_indices::new_vector): Likewise.
(vec_perm_indices::length): Likewise.
(vec_perm_indices::nelts_per_input): Likewise.
(vec_perm_indices::input_nelts): Likewise.
* vec-perm-indices.c (vec_perm_indices::new_vector): Take the
number of elements per input as a poly_uint64 rather than an
unsigned int.  Use the original encoding for variable-length
vectors, rather than clamping each individual element.
For the second and subsequent elements in each pattern,
clamp the step and base before clamping their sum.
(vec_perm_indices::series_p): Handle polynomial element counts.
(vec_perm_indices::all_in_range_p): Likewise.
(vec_perm_indices_to_tree): Likewise.
(vec_perm_indices_to_rtx): Likewise.
* tree-vect-stmts.c (vect_gen_perm_mask_any): Likewise.
* tree-vector-builder.c (tree_vector_builder::new_unary_operation)
(tree_vector_builder::new_binary_operation): Handle polynomial
element counts.  Return false if we need to know the number
of elements at compile time.
* fold-const.c (fold_vec_perm): Punt if the number of elements
isn't known at compile time.

Index: gcc/vector-builder.h
===
--- gcc/vector-builder.h2018-01-03 08:46:30.347663443 +
+++ gcc/vector-builder.h2018-01-03 08:54:24.909508898 +
@@ -90,7 +90,7 @@ #define GCC_VECTOR_BUILDER_H
 public:
   vector_builder ();
 
-  unsigned int full_nelts () const { return m_full_nelts; }
+  poly_uint64 full_nelts () const { return m_full_nelts; }
   unsigned int npatterns () const { return m_npatterns; }
   unsigned int nelts_per_pattern () const { return m_nelts_per_pattern; }
   unsigned int encoded_nelts () const;
@@ -103,7 +103,7 @@ #define GCC_VECTOR_BUILDER_H
   void finalize ();
 
 protected:
-  void new_vector (unsigned int, unsigned int, unsigned int);
+  void new_vector (poly_uint64, unsigned int, unsigned int);
   void reshape (unsigned int, unsigned int);
   bool repeating_sequence_p (unsigned int, unsigned int, unsigned int);
   bool stepped_sequence_p (unsigned int, unsigned int, unsigned int);
@@ -115,7 +115,7 @@ #define GCC_VECTOR_BUILDER_H
   Derived *derived () { return static_cast (this); }
   const Derived *derived () const;
 
-  unsigned int m_full_nelts;
+  poly_uint64 m_full_nelts;
   unsigned int m_npatterns;
   unsigned int m_nelts_per_pattern;
 };
@@ -152,7 +152,7 @@ vector_builder::encoded_nelt
 inline bool
 vector_builder::encoded_full_vector_p () const
 {
-  return m_npatterns * m_nelts_per_pattern == m_full_nelts;
+  return known_eq (m_npatterns * m_nelts_per_pattern, m_full_nelts);
 }
 
 /* Start building a vector that has FULL_NELTS elements.  Initially
@@ -160,7 +160,7 @@ vector_builder::encoded_full
 
 template
 void
-vector_builder::new_vector (unsigned int full_nelts,
+vector_builder::new_vector (poly_uint64 full_nelts,
unsigned int npatterns,
unsigned int nelts_per_pattern)
 {
@@ -178,7 +178,7 @@ vector_builder::new_vector (
 bool
 vector_builder::operator == (const Derived ) const
 {
-  if (m_full_nelts != other.m_full_nelts
+  if (maybe_ne (m_full_nelts, other.m_full_nelts)
   || m_npatterns != other.m_npatterns
   || m_nelts_per_pattern != other.m_nelts_per_pattern)
 return false;
@@ -356,14 +356,16 @@ vector_builder::finalize ()
 {
   /* The encoding requires

[committed] [100.1/nnn] poly_int: vec_perm_indices element type

2018-01-03 Thread Richard Sandiford

This patch changes the vec_perm_indices element type from HOST_WIDE_INT
to poly_int64, so that it can represent indices into a variable-length
vector.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the before and after assembly output for at
least one target per CPU directory.  Committed as pre-approved by
Jeff here: https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01409.html .

Thanks,
Richard


2018-01-03  Richard Sandiford  

gcc/
* vec-perm-indices.h (vec_perm_builder): Change element type
from HOST_WIDE_INT to poly_int64.
(vec_perm_indices::element_type): Update accordingly.
(vec_perm_indices::clamp): Handle polynomial element_types.
* vec-perm-indices.c (vec_perm_indices::series_p): Likewise.
(vec_perm_indices::all_in_range_p): Likewise.
(tree_to_vec_perm_builder): Check for poly_int64 trees rather
than shwi trees.
* vector-builder.h (vector_builder::stepped_sequence_p): Handle
polynomial vec_perm_indices element types.
* int-vector-builder.h (int_vector_builder::equal_p): Likewise.
* fold-const.c (fold_vec_perm): Likewise.
* optabs.c (shift_amt_for_vec_perm_mask): Likewise.
* tree-vect-generic.c (lower_vec_perm): Likewise.
* tree-vect-slp.c (vect_transform_slp_perm_load): Likewise.
* config/aarch64/aarch64.c (aarch64_evpc_tbl): Cast d->perm
element type to HOST_WIDE_INT.

Index: gcc/vec-perm-indices.h
===
--- gcc/vec-perm-indices.h  2018-01-02 18:27:07.346639775 +
+++ gcc/vec-perm-indices.h  2018-01-03 08:46:30.347663443 +
@@ -25,7 +25,7 @@ #define GCC_VEC_PERN_INDICES_H 1
 /* A vector_builder for building constant permutation vectors.
The elements do not need to be clamped to a particular range
of input elements.  */
-typedef int_vector_builder vec_perm_builder;
+typedef int_vector_builder vec_perm_builder;
 
 /* This class represents a constant permutation vector, such as that used
as the final operand to a VEC_PERM_EXPR.
@@ -49,7 +49,7 @@ typedef int_vector_builder= size)
+if (!known_in_range_p (m_encoding[i], start, size))
   return false;
 
   /* For stepped encodings, check the full range of the series.  */
@@ -174,8 +174,11 @@ vec_perm_indices::all_in_range_p (elemen
 wide enough for overflow not to be a problem.  */
  element_type headroom_down = base1 - start;
  element_type headroom_up = size - headroom_down - 1;
- if (headroom_up < step * step_nelts
- && headroom_down < (limit - step) * step_nelts)
+ HOST_WIDE_INT diff;
+ if ((!step.is_constant ()
+  || maybe_lt (headroom_up, diff * step_nelts))
+ && (!(limit - step).is_constant ()
+ || maybe_lt (headroom_down, diff * step_nelts)))
return false;
}
 }
@@ -191,14 +194,14 @@ tree_to_vec_perm_builder (vec_perm_build
 {
   unsigned int encoded_nelts = vector_cst_encoded_nelts (cst);
   for (unsigned int i = 0; i < encoded_nelts; ++i)
-if (!tree_fits_shwi_p (VECTOR_CST_ENCODED_ELT (cst, i)))
+if (!tree_fits_poly_int64_p (VECTOR_CST_ENCODED_ELT (cst, i)))
   return false;
 
   builder->new_vector (TYPE_VECTOR_SUBPARTS (TREE_TYPE (cst)),
   VECTOR_CST_NPATTERNS (cst),
   VECTOR_CST_NELTS_PER_PATTERN (cst));
   for (unsigned int i = 0; i < encoded_nelts; ++i)
-builder->quick_push (tree_to_shwi (VECTOR_CST_ENCODED_ELT (cst, i)));
+builder->quick_push (tree_to_poly_int64 (VECTOR_CST_ENCODED_ELT (cst, i)));
   return true;
 }
 
Index: gcc/vector-builder.h
===
--- gcc/vector-builder.h2018-01-02 18:27:07.346639775 +
+++ gcc/vector-builder.h2018-01-03 08:46:30.347663443 +
@@ -284,7 +284,8 @@ vector_builder::stepped_sequ
  || !derived ()->integral_p (elt3))
return false;
 
-  if (derived ()->step (elt1, elt2) != derived ()->step (elt2, elt3))
+  if (maybe_ne (derived ()->step (elt1, elt2),
+   derived ()->step (elt2, elt3)))
return false;
 
   if (!derived ()->can_elide_p (elt3))
Index: gcc/int-vector-builder.h
===
--- gcc/int-vector-builder.h2018-01-02 18:26:37.619823164 +
+++ gcc/int-vector-builder.h2018-01-03 08:46:30.345663760 +
@@ -66,7 +66,7 @@ int_vector_builder::int_vector_builde
 inline bool
 int_vector_builder::equal_p (T elt1, T elt2) const
 {
-  return elt1 == elt2;
+  return known_eq (elt1, elt2);
 }
 
 /* Return the value of element ELT2 minus the value of element ELT1.  */
Index: gcc/fold-const.c
===
--- gcc/fold-const.c2018-01-03 07:17:07.401983352

Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64

2018-01-03 Thread Jeff Law

On 01/02/2018 04:22 PM, Jeff Law wrote:
> On 01/02/2018 03:05 PM, Florian Weimer wrote:
>> On 01/02/2018 09:02 PM, Jeff Law wrote:
>>> * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not
>>
>> Typo: “adjut”.
>>
>>> explicitly probe *sp in a noreturn function if there were any callee
>>> register saves.
>>
>> I recompiled glibc with this patch, and I can confirm it fixes the new
>> glibc test:
>>
>>   https://sourceware.org/ml/libc-alpha/2017-12/msg00987.html
>>
>> However, I would appreciate if it were possible to avoid emitting the
>> .cfi_offset/.cfi_register annotations and only record the change of
>> frame address.  The other CFI notes aren't needed, and it would avoid
>> reintroducing this bug if the way the prologue is constructed changes
>> and the condition for emitting the probe is not completely correct anymore.
> I'm not aware of a way to do that.  I'm not even sure having the ability
> to tell the CFI machinery to avoid that stuff is a good idea from a
> design/implementation standpoint.
So Jakub pointed out in IRC we can use a REG_FRAME_RELATED_EXPR note to
control more precisely what cfis are emitted.  But that introduces its
own set of issues.   I suspect we're more likely to muck up generating
the note explicitly than we are to muck up the condition controlling
whether or not we need the explicit *sp probe.

The way we run into problems is if the CFI notes do not reflect reality
-- for example, indicating a register is saved at a location where it
isn't actually saved, indicating a register was restored to its entry
value when that's not true throughout the function, or mucking up the
CFA offset.

The way to get into those first two states is if we have a redundant
push/pop pair and its associated notes for the *sp explicit probe,
particularly for %esi on x86 or %rax on x86_64.  An additional comment
seems warranted though.  And testing that we don't have a cfi restore in
the regression test also seems warranted.  Consider those added :-)
I'll come up with the exact comment text in the AM.  While avoiding the
cfi notes seems useful, I doubt it's really going to improve things in
practice.

In theory if someone completely rewrote the prologue code they could
potentially introduce all kinds of problems, but I don't see how your
suggested change would really help there either.  Comments in the code
as well as tests verifying the output seem to be the way to go here as well.

Jeff

88 matches

Mail list logo