Re: [PATCH] Fix switchconv ICE (PR tree-optimization/84740)

2018-03-07 Thread Richard Biener
On March 7, 2018 10:51:08 PM GMT+01:00, Jakub Jelinek  wrote:
>Hi!
>
>On the following testcase info.final_bb has only one PHI, the virtual
>one,
>so info.phi_count is 0.  For info.phi_count == 0, we don't really
>create any
>arrays etc.; just returning early (before create_temp_arrays) would
>work,
>but would leave the useless GIMPLE_SWITCH in the IL and it would take
>many
>passes to optimize it properly.  We know that all but the default: bbs
>are
>really empty and the default: can do something arbitrary, if all the
>ranges
>are consecutive, it is better to just let gen_inbound_check do its work
>and convert the switch into if (cond) { former_default_bb }.
>All but build_constructors function can handle 0 info.phi_count, but
>build_constructors can't, on the other side doesn't need to do anything
>if it is 0.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2018-03-07  Jakub Jelinek  
>
>   PR tree-optimization/84740
>   * tree-switch-conversion.c (process_switch): Call build_constructors
>   only if info.phi_count is non-zero.
>
>   * gcc.dg/torture/pr84740.c: New test.
>
>--- gcc/tree-switch-conversion.c.jj2018-01-09 08:58:15.0 +0100
>+++ gcc/tree-switch-conversion.c   2018-03-07 12:03:42.228246408 +0100
>@@ -1563,7 +1563,8 @@ process_switch (gswitch *swtch)
>   gather_default_values (info.default_case_nonstandard
>? gimple_switch_label (swtch, 1)
>: gimple_switch_default_label (swtch), );
>-  build_constructors (swtch, );
>+  if (info.phi_count)
>+build_constructors (swtch, );
> 
>build_arrays (swtch, ); /* Build the static arrays and
>assignments.   */
>   gen_inbound_check (swtch, );   /* Build the bounds check.  */
>--- gcc/testsuite/gcc.dg/torture/pr84740.c.jj  2018-03-07
>12:13:02.153271986 +0100
>+++ gcc/testsuite/gcc.dg/torture/pr84740.c 2018-03-07
>12:12:42.643270368 +0100
>@@ -0,0 +1,24 @@
>+/* PR tree-optimization/84740 */
>+
>+void
>+frobulate_for_gcc (unsigned int v)
>+{
>+  const char *s;
>+  switch (v)
>+{
>+case 0:
>+  s = "foo";
>+  break;
>+case 1:
>+  s = "bar";
>+  break;
>+case 2:
>+  s = "spam";
>+  break;
>+default:
>+  s = (const char *) 0;
>+  break;
>+}
>+  if (!s)
>+__builtin_printf ("%s\n", s);
>+}
>
>   Jakub



[PATCH, rs6000] Fix PR83969: ICE in final_scan_insn, at final.c:2997

2018-03-07 Thread Peter Bergner
PR83969 shows another bug in mem_operand_gpr() (which implements the "Y"
constraint) accepting reg+reg addresses.  This was fixed by adding a call
to rs6000_offsettable_memref_p() to verify the address is a valid offsettable
address.  Fixing that exposed a problem in the *movdi_internal64 pattern
which was using the "Y" constraint for the integer load/store alternatives,
which allow either reg+offset or reg+reg addresses.  This worked before
only because of the buggy mem_operand_gpr.  The fix for that was to use
the "YZ" constraint which allows both reg+offset and reg+reg addresses.

This passed bootstrap and regtesting on powerpc64-linux, running the
testsuite in both 32-bit and 64-bit modes with no regressions.
Ok for trunk?

Peter

gcc/
    PR target/83969
    * config/rs6000/rs6000.c (rs6000_offsettable_memref_p): New prototype.
    Add strict argument and use it.
    (rs6000_split_multireg_move): Update for new strict argument.
    (mem_operand_gpr): Disallow all non-offsettable addresses.
    * config/rs6000/rs6000.md (*movdi_internal64): Use YZ constraint.

gcc/testsuite/
    PR target/83969
    * gcc.target/powerpc/pr83969.c: New test.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c    (revision 258268)
+++ gcc/config/rs6000/rs6000.c    (working copy)
@@ -1378,6 +1378,7 @@ static rtx rs6000_debug_legitimize_reloa
                        int, int, int *);
 static bool rs6000_mode_dependent_address (const_rtx);
 static bool rs6000_debug_mode_dependent_address (const_rtx);
+static bool rs6000_offsettable_memref_p (rtx, machine_mode, bool);
 static enum reg_class rs6000_secondary_reload_class (enum reg_class,
                      machine_mode, rtx);
 static enum reg_class rs6000_debug_secondary_reload_class (enum reg_class,
@@ -8207,10 +8208,8 @@ mem_operand_gpr (rtx op, machine_mode mo
   int extra;
   rtx addr = XEXP (op, 0);
 
-  /* Don't allow altivec type addresses like (mem (and (plus ...))).
- See PR target/84279.  */
-
-  if (GET_CODE (addr) == AND)
+  /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
+  if (!rs6000_offsettable_memref_p (op, mode, false))
 return false;
 
   op = address_offset (addr);
@@ -9956,7 +9955,7 @@ rs6000_find_base_term (rtx op)
    in 32-bit mode, that the recog predicate rejects.  */
 
 static bool
-rs6000_offsettable_memref_p (rtx op, machine_mode reg_mode)
+rs6000_offsettable_memref_p (rtx op, machine_mode reg_mode, bool strict)
 {
   bool worst_case;
 
@@ -9964,7 +9963,7 @@ rs6000_offsettable_memref_p (rtx op, mac
 return false;
 
   /* First mimic offsettable_memref_p.  */
-  if (offsettable_address_p (true, GET_MODE (op), XEXP (op, 0)))
+  if (offsettable_address_p (strict, GET_MODE (op), XEXP (op, 0)))
 return true;
 
   /* offsettable_address_p invokes rs6000_mode_dependent_address, but
@@ -9978,7 +9977,7 @@ rs6000_offsettable_memref_p (rtx op, mac
   worst_case = ((TARGET_POWERPC64 && GET_MODE_CLASS (reg_mode) == MODE_INT)
     || GET_MODE_SIZE (reg_mode) == 4);
   return rs6000_legitimate_offset_address_p (GET_MODE (op), XEXP (op, 0),
-                     true, worst_case);
+                     strict, worst_case);
 }
 
 /* Determine the reassociation width to be used in reassociate_bb.
@@ -24063,7 +24062,7 @@ rs6000_split_multireg_move (rtx dst, rtx
   emit_insn (gen_add3_insn (breg, breg, delta_rtx));
   src = replace_equiv_address (src, breg);
     }
-      else if (! rs6000_offsettable_memref_p (src, reg_mode))
+      else if (! rs6000_offsettable_memref_p (src, reg_mode, true))
     {
   if (GET_CODE (XEXP (src, 0)) == PRE_MODIFY)
     {
@@ -24130,7 +24129,7 @@ rs6000_split_multireg_move (rtx dst, rtx
     emit_insn (gen_add3_insn (breg, breg, delta_rtx));
   dst = replace_equiv_address (dst, breg);
     }
-      else if (!rs6000_offsettable_memref_p (dst, reg_mode)
+      else if (!rs6000_offsettable_memref_p (dst, reg_mode, true)
        && GET_CODE (XEXP (dst, 0)) != LO_SUM)
     {
   if (GET_CODE (XEXP (dst, 0)) == PRE_MODIFY)
@@ -24169,7 +24168,7 @@ rs6000_split_multireg_move (rtx dst, rtx
     }
     }
   else if (GET_CODE (XEXP (dst, 0)) != LO_SUM)
-        gcc_assert (rs6000_offsettable_memref_p (dst, reg_mode));
+        gcc_assert (rs6000_offsettable_memref_p (dst, reg_mode, true));
 }
 
   for (i = 0; i < nregs; i++)
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md    (revision 258268)
+++ gcc/config/rs6000/rs6000.md    (working copy)
@@ -8549,14 +8549,14 @@ (define_split
 ;;  FPR->GPR   GPR->FPR   VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
-   "=Y,    r, r, r, r,  r,
+   "=YZ,   r, 

Re: [PATCH] Fix tail recursion (PR tree-optimization/84739)

2018-03-07 Thread Jeff Law
On 03/07/2018 02:42 PM, Jakub Jelinek wrote:
> Hi!
> 
> Before Honza introduced recursive_call_p, this tree-tailcall.c snipped
> has been guarded with if (func == current_function_decl), so what we used
> for DECL_ARGUMENTS didn't really matter.  But as it can now be some alias
> to it, we really want to check that the current function's arguments
> match the arguments of the call, rather than just that the call's arguments
> match its function type.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2018-03-07  Jakub Jelinek  
> 
>   PR tree-optimization/84739
>   * tree-tailcall.c (find_tail_calls): Check call arguments against
>   DECL_ARGUMENTS (current_function_decl) rather than
>   DECL_ARGUMENTS (func) when checking for tail recursion.
> 
>   * gcc.dg/pr84739.c: New test.
OK.
jeff


Re: [RFA][PATCH][PR middle-end/61118] Improve tree CFG accuracy for setjmp/longjmp

2018-03-07 Thread Peter Bergner
On 3/7/18 12:01 AM, Jeff Law wrote:
> I believe so by nature that the setjmp dominates the longjmp sites and
> thus also dominates the dispatcher.  But it's something I want to
> explicitly check before resubmitting.

Are we sure a setjmp has to dominate its longjmp sites?  Couldn't you
have something like:

bb(a): bb(b):
  ......
  setjmp (env)   setjmp (env)
  \ /
   \   /
\ /
 \   /
  \ /
   \   /
bb(c):
  ...
  longjmp (env)

...or:

bb(a):
  ...
  setjmp (env)
  |\
  | \
  |  \
  |   \
  |   bb(b):
  | ...
  | setjmp (env)
  |   /
  |  /
  | /
  v
bb(c):
  ...
  longjmp (env)

If so, then the setjmp calls might not dominate the longjmp call.

Peter



Re: [PATCH] fix ICE in generic_overlap (PR 84526)

2018-03-07 Thread Martin Sebor

On 03/07/2018 04:04 PM, Richard Sandiford wrote:

Martin Sebor  writes:

@@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)
   base = get_inner_reference (expr, , , _off,
  , , , );

+  /* get_inner_reference is not expected to return null.  */
+  gcc_assert (base != NULL);
+
   poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

-  HOST_WIDE_INT const_off;
-  if (!base || !bytepos.is_constant (_off))
+  /* Convert the poly_int64 offset to to offset_int.  The offset
+ should be constant but be prepared for it not to be just in
+ case.  */


I know it's just repeating what I said in the other reply, but I think
we should drop this comment.  It doesn't add anything and it will quickly
become out of date once the frontend supports variable-length types.


For some reason I never got the email you're referring to (I just
found it in the archives, after some head scratching as to what
you meant above).

I added the comment based on your feedback that it can't happen
yet (as an explanation for not calling to_constant() instead),
but I'll remove it before I commit the patch.

Otherwise, I only made these changes here because Jakub asked
me to.  They are unrelated to the fix for the ICE and so they
should have been made independently of it.  But since they are
correct and any difference in efficiency between HOST_WIDE_INT
and offset_int here is surely insignificant (and since you
also said you were fine either way) I'll leave the offset_int
in place.

Thanks
Martin



[PATCH] Fix tail recursion (PR tree-optimization/84739)

2018-03-07 Thread Jakub Jelinek
Hi!

Before Honza introduced recursive_call_p, this tree-tailcall.c snipped
has been guarded with if (func == current_function_decl), so what we used
for DECL_ARGUMENTS didn't really matter.  But as it can now be some alias
to it, we really want to check that the current function's arguments
match the arguments of the call, rather than just that the call's arguments
match its function type.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-03-07  Jakub Jelinek  

PR tree-optimization/84739
* tree-tailcall.c (find_tail_calls): Check call arguments against
DECL_ARGUMENTS (current_function_decl) rather than
DECL_ARGUMENTS (func) when checking for tail recursion.

* gcc.dg/pr84739.c: New test.

--- gcc/tree-tailcall.c.jj  2018-01-12 11:35:51.470222838 +0100
+++ gcc/tree-tailcall.c 2018-03-06 23:05:39.779048759 +0100
@@ -481,7 +481,7 @@ find_tail_calls (basic_block bb, struct
 {
   tree arg;
 
-  for (param = DECL_ARGUMENTS (func), idx = 0;
+  for (param = DECL_ARGUMENTS (current_function_decl), idx = 0;
   param && idx < gimple_call_num_args (call);
   param = DECL_CHAIN (param), idx ++)
{
--- gcc/testsuite/gcc.dg/pr84739.c.jj   2018-03-07 09:43:13.961110879 +0100
+++ gcc/testsuite/gcc.dg/pr84739.c  2018-03-07 09:43:19.856109830 +0100
@@ -0,0 +1,26 @@
+/* PR tree-optimization/84739 */
+/* { dg-do compile } */
+/* { dg-require-weak "" } */
+/* { dg-options "-O2" } */
+
+static void baz (void) __attribute__((weakref("bar")));/* { dg-warning 
"alias between functions of incompatible types" } */
+
+int
+foo (int x, int y)
+{
+  if (x)
+y = 0;
+  if (y)
+goto lab;
+  y = 0;
+lab:
+  return y;
+}
+
+void
+bar (int x, int y) /* { dg-message "aliased declaration here" } */
+{
+  y = foo (x, y);
+  if (y != 0)
+baz ();
+}

Jakub


Re: [PATCH] fix ICE in generic_overlap (PR 84526)

2018-03-07 Thread Richard Sandiford
Martin Sebor  writes:
> @@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)
>base = get_inner_reference (expr, , , _off,
> , , , );
>  
> +  /* get_inner_reference is not expected to return null.  */
> +  gcc_assert (base != NULL);
> +
>poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
>  
> -  HOST_WIDE_INT const_off;
> -  if (!base || !bytepos.is_constant (_off))
> +  /* Convert the poly_int64 offset to to offset_int.  The offset
> + should be constant but be prepared for it not to be just in
> + case.  */

I know it's just repeating what I said in the other reply, but I think
we should drop this comment.  It doesn't add anything and it will quickly
become out of date once the frontend supports variable-length types.

> +  offset_int cstoff;
> +  if (bytepos.is_constant ())

Also, same comment about this being less efficient than using the
natural type of HOST_WIDE_INT.  I think this and...

>  {
> -  base = get_base_address (TREE_OPERAND (expr, 0));
> -  return;
> +  offrange[0] += cstoff;
> +  offrange[1] += cstoff;
> +
> +  /* Besides the reference saved above, also stash the offset
> +  for validation.  */
> +  if (TREE_CODE (expr) == COMPONENT_REF)
> + refoff = cstoff;
>  }
> +  else
> +offrange[1] += maxobjsize;
>  
> -  offrange[0] += const_off;
> -  offrange[1] += const_off;
> -
>if (var_off)
>  {
>if (TREE_CODE (var_off) == INTEGER_CST)
>   {
> -   offset_int cstoff = wi::to_offset (var_off);
> +   cstoff = wi::to_offset (var_off);
> offrange[0] += cstoff;
> offrange[1] += cstoff;

...this are logically separate variables, so there's no need for
them to have the same type and name.

Richard


Re: Fix some _GLIBCXX_DEBUG pretty printer errors

2018-03-07 Thread Jonathan Wakely
On 7 March 2018 at 17:39, François Dumont wrote:
> On 06/03/2018 22:21, Jonathan Wakely wrote:
>>
>>
>>> @@ -575,10 +586,12 @@ class StdDebugIteratorPrinter:
>>>  # and return the wrapped iterator value.
>>>  def to_string (self):
>>>  base_type = gdb.lookup_type('__gnu_debug::_Safe_iterator_base')
>>> +itype = self.val.type.template_argument(0)
>>>  safe_seq = self.val.cast(base_type)['_M_sequence']
>>> -if not safe_seq or self.val['_M_version'] !=
>>> safe_seq['_M_version']:
>>> +if not safe_seq:
>>> +return str(self.val.cast(itype))
>>
>> So what's the effect of this change when we get a value-initialized
>> debug iterator? It prints the wrapped (value-initialized) non-debug
>> iterator instead?
>>
> Yes, for instance this test in smiple11.cc:
>
>
>   std::deque::iterator deqiter0;
> // { dg-final { note-test deqiter0 {non-dereferenceable iterator for
> std::deque} } }
>
> Was failing when running 'make check-debug
> RUNTESTFLAGS=prettyprinters.exp' because it was displaying 'invalid
> iterator' rather than non-dereferenceable. Now it is succeeded.

OK, thanks for confirming.

This is OK for trunk, thanks. If anybody complains about the missing
printers for the __norm types we can add them back again.


[PATCH] avoid warning for memcpy to self (PR 83456)

2018-03-07 Thread Martin Sebor

I have become convinced that issuing -Wrestrict in gimple-fold
for calls to memcpy() where the source pointer is the same as
the destination causes more trouble than it's worth, especially
when inlining is involved, as in:

  inline void bar (void *d, void *s, unsigned N)
  {
if (s != d)
  memcpy (d, s, N);
  }

  void foo (void* src)
  {
bar (src, src, 1);
  }

It seems that there should be a way to teach GCC to avoid
folding statements in dead blocks (e.g., in a block controlled
by 'if (0 != 0)' as the one below), and that it might even speed
up compilation, but in the meantime it leads to false positive
-Wrestrict warnings.

The attached patch removes this instance of the warning and
adjusts tests not to expect it.

Martin
PR tree-optimization/83456 - -Wrestrict false positive on a non-overlapping memcpy in an inline function

gcc/ChangeLog:

	PR tree-optimization/83456
	* gimple-fold.c (gimple_fold_builtin_memory_op): Avoid warning
	for perfectly overlapping calls to memcpy.
	(gimple_fold_builtin_memory_chk): Same.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83456
	* c-c++-common/Wrestrict-2.c: Remove test cases.
	* c-c++-common/Wrestrict.c: Same.
	* gcc.dg/Wrestrict-12.c: New test.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c	(revision 258339)
+++ gcc/gimple-fold.c	(working copy)
@@ -713,13 +713,7 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterato
 {
   /* Avoid diagnosing exact overlap in calls to __builtin_memcpy.
 	 It's safe and may even be emitted by GCC itself (see bug
-	 32667).  However, diagnose it in explicit calls to the memcpy
-	 function.  */
-  if (check_overlap && *IDENTIFIER_POINTER (DECL_NAME (func)) != '_')
-  	warning_at (loc, OPT_Wrestrict,
-  		"%qD source argument is the same as destination",
-  		func);
-
+	 32667).  */
   unlink_stmt_vdef (stmt);
   if (gimple_vdef (stmt) && TREE_CODE (gimple_vdef (stmt)) == SSA_NAME)
 	release_ssa_name (gimple_vdef (stmt));
@@ -2499,15 +2493,6 @@ gimple_fold_builtin_memory_chk (gimple_stmt_iterat
  (resp. DEST+LEN for __mempcpy_chk).  */
   if (fcode != BUILT_IN_MEMSET_CHK && operand_equal_p (src, dest, 0))
 {
-  if (fcode != BUILT_IN_MEMMOVE && fcode != BUILT_IN_MEMMOVE_CHK)
-	{
-	  tree func = gimple_call_fndecl (stmt);
-
-	  warning_at (loc, OPT_Wrestrict,
-		  "%qD source argument is the same as destination",
-		  func);
-	}
-
   if (fcode != BUILT_IN_MEMPCPY_CHK)
 	{
 	  replace_call_with_value (gsi, dest);
Index: gcc/testsuite/c-c++-common/Wrestrict-2.c
===
--- gcc/testsuite/c-c++-common/Wrestrict-2.c	(revision 258339)
+++ gcc/testsuite/c-c++-common/Wrestrict-2.c	(working copy)
@@ -12,13 +12,13 @@
 
 static void wrap_memcpy (void *d, const void *s, size_t n)
 {
-  memcpy (d, s, n);   /* { dg-warning "source argument is the same as destination" "memcpy" } */
+  memcpy (d, s, n);   /* { dg-warning "accessing 2 bytes at offsets 0 and 1 overlaps 1 byte at offset 1" "memcpy" } */
 }
 
-void call_memcpy (void *d, size_t n)
+void call_memcpy (char *d)
 {
-  const void *s = d;
-  wrap_memcpy (d, s, n);
+  const void *s = d + 1;
+  wrap_memcpy (d, s, 2);
 }
 
 
Index: gcc/testsuite/c-c++-common/Wrestrict.c
===
--- gcc/testsuite/c-c++-common/Wrestrict.c	(revision 258339)
+++ gcc/testsuite/c-c++-common/Wrestrict.c	(working copy)
@@ -52,7 +52,6 @@ void test_memcpy_cst (void *d, const void *s)
   } while (0)
 
   T (a, a, 0);
-  T (a, s = a, 3);   /* { dg-warning "\\\[-Wrestrict" "memcpy" } */
 
   /* This isn't detected because memcpy calls with small power-of-2 sizes
  are intentionally folded into safe copies equivalent to memmove.
@@ -64,19 +63,6 @@ void test_memcpy_cst (void *d, const void *s)
   T (a, a + 3, 5);   /* { dg-warning "\\\[-Wrestrict" "memcpy" } */
 
   {
-char a[3] = { 1, 2, 3 };
-
-/* Verify that a call to memcpy with an exact overlap is diagnosed
-   (also tested above) but an excplicit one to __builtin_memcpy is
-   not.  See bug 32667 for the rationale.  */
-(memcpy)(a, a, sizeof a);   /* { dg-warning "source argument is the same as destination" "memcpy" } */
-sink (a);
-
-__builtin_memcpy (a, a, sizeof a);
-sink (a);
-  }
-
-  {
 char a[3][7];
 sink (a);
 
@@ -116,11 +102,6 @@ void test_memcpy_cst (void *d, const void *s)
 memcpy (d, s, sizeof x.a);
 sink ();
 
-d = x.a;
-s = x.a;
-memcpy (d, s, sizeof x.a);/* { dg-warning "\\\[-Wrestrict" "memcpy" } */
-sink ();
-
 d = x.a + 4;
 s = x.b;
 memcpy (d, s, sizeof x.a);/* { dg-warning "\\\[-Wrestrict" "memcpy" } */
@@ -450,19 +431,6 @@ void test_memcpy_var (char *d, const char *s)
   memcpy (d, d, 0);
   sink (d);
 
-  memcpy (d, d, n);   /* { dg-warning "source argument is the same as 

Re: [PATCH] fix ICE in generic_overlap (PR 84526)

2018-03-07 Thread Martin Sebor

On 03/07/2018 12:37 PM, Jeff Law wrote:

On 02/26/2018 10:32 AM, Martin Sebor wrote:


PR tree-optimization/84526 - ICE in generic_overlap at 
gcc/gimple-ssa-warn-restrict.c:927 since r257860

gcc/ChangeLog:

PR tree-optimization/84526
* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
Remove dead code.
(builtin_access::generic_overlap): Be prepared to handle non-array
base objects.

gcc/testsuite/ChangeLog:

PR tree-optimization/84526
* gcc.dg/Wrestrict-10.c: New test.

Index: gcc/gimple-ssa-warn-restrict.c
===
--- gcc/gimple-ssa-warn-restrict.c  (revision 257963)
+++ gcc/gimple-ssa-warn-restrict.c  (working copy)
@@ -396,6 +396,9 @@ builtin_memref::set_base_and_offset (tree expr)
   if (TREE_CODE (expr) == ADDR_EXPR)
 expr = TREE_OPERAND (expr, 0);

+  /* Stash the reference for offset validation.  */
+  ref = expr;
+
   poly_int64 bitsize, bitpos;
   tree var_off;
   machine_mode mode;
@@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)
   base = get_inner_reference (expr, , , _off,
  , , , );

+  /* get_inner_reference is not expected to return null.  */
+  gcc_assert (base != NULL);
+
   poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

-  HOST_WIDE_INT const_off;
-  if (!base || !bytepos.is_constant (_off))
+  /* Convert the poly_int64 offset to to offset_int.  The offset
+ should be constant but be prepared for it not to be just in
+ case.  */
+  offset_int cstoff;
+  if (bytepos.is_constant ())
 {
-  base = get_base_address (TREE_OPERAND (expr, 0));
-  return;
+  offrange[0] += cstoff;
+  offrange[1] += cstoff;
+
+  /* Besides the reference saved above, also stash the offset
+for validation.  */
+  if (TREE_CODE (expr) == COMPONENT_REF)
+   refoff = cstoff;
 }
+  else
+offrange[1] += maxobjsize;

So is this assignment to offrange[1] really correct?

ISTM that it potentially overflows (relative to MAXOBJSIZE) and that
you'd likely be better off just assigning offrange[1] to MAXOBJSIZE.

Or alternately signal to the callers that we can't really analyze the
memory address and inhibit further analysis of the potential overlap of
the objects.


The offsets are stored in offset_int and there is code to deal
with values outside the [-MAXOBJSIZE, MAXOBJSIZE] range, either
by capping them to the bounds of the array/object being accessed
if it's known, or to the MAXOBJSIZE range.  There are a number of
places where offsets with unknown values are being added together
and where their sum might end up exceeding that range (e.g., when
dealing with multiple offsets, each in some range) but as long
as the offsets are only added and not multiplied they should never
overflow offset_int.  It would be okay to assign MAXOBJSIZE here
instead of adding it, but there are other places with the same
addition (see the bottom of builtin_memref::extend_offset_range)
so doing it here alone would be inconsistent.


@@ -918,12 +924,18 @@ builtin_access::generic_overlap ()
   if (!overlap_certain)
 {
   if (!dstref->strbounded_p && !depends_p)
+   /* Memcpy only considers certain overlap.  */
return false;

   /* There's no way to distinguish an access to the same member
 of a structure from one to two distinct members of the same
 structure.  Give up to avoid excessive false positives.  */
-  tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));
+  tree basetype = TREE_TYPE (dstref->base);
+
+  if (POINTER_TYPE_P (basetype)
+ || TREE_CODE (basetype) == ARRAY_TYPE)
+   basetype = TREE_TYPE (basetype);
+
   if (RECORD_OR_UNION_TYPE_P (basetype))
return false;
 }

This is probably fairly reasonable.  My only concern would be that we
handle multi-dimensional arrays correctly.  Do you need to handle
ARRAY_TYPEs with a loop?


Yes, good catch, thanks!  It's unrelated to this bug (the ICE) but
it seems simple enough to handle at the same time so I've added
the loop and a test to verify it works as expected.


I do have a more general concern.  Given that we walk backwards through
pointer casts and ADDR_EXPRs we potentially lose information.  For
example we might walk backwards through a cast from a small array to a
larger array type.

Thus later we use the smaller array type when computing the bounds and
subsequent offset from BASE.  This could lead to a missed warning as the
computed offset would be too small.


There are many false negatives here.  A lot of those I noticed
during testing are because of limitations in the strlen pass
(I must have raised a dozen bugs to track and, hopefully, fix
in stage 1).  There are also deficiencies in the restrict pass
itself.  For instance, using builtin_object_size to compute
the size of member arrays leads to many because bos computes
the size of the larger 

[PATCH] Fix switchconv ICE (PR tree-optimization/84740)

2018-03-07 Thread Jakub Jelinek
Hi!

On the following testcase info.final_bb has only one PHI, the virtual one,
so info.phi_count is 0.  For info.phi_count == 0, we don't really create any
arrays etc.; just returning early (before create_temp_arrays) would work,
but would leave the useless GIMPLE_SWITCH in the IL and it would take many
passes to optimize it properly.  We know that all but the default: bbs are
really empty and the default: can do something arbitrary, if all the ranges
are consecutive, it is better to just let gen_inbound_check do its work
and convert the switch into if (cond) { former_default_bb }.
All but build_constructors function can handle 0 info.phi_count, but
build_constructors can't, on the other side doesn't need to do anything
if it is 0.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-03-07  Jakub Jelinek  

PR tree-optimization/84740
* tree-switch-conversion.c (process_switch): Call build_constructors
only if info.phi_count is non-zero.

* gcc.dg/torture/pr84740.c: New test.

--- gcc/tree-switch-conversion.c.jj 2018-01-09 08:58:15.0 +0100
+++ gcc/tree-switch-conversion.c2018-03-07 12:03:42.228246408 +0100
@@ -1563,7 +1563,8 @@ process_switch (gswitch *swtch)
   gather_default_values (info.default_case_nonstandard
 ? gimple_switch_label (swtch, 1)
 : gimple_switch_default_label (swtch), );
-  build_constructors (swtch, );
+  if (info.phi_count)
+build_constructors (swtch, );
 
   build_arrays (swtch, ); /* Build the static arrays and assignments.   */
   gen_inbound_check (swtch, );/* Build the bounds check.  */
--- gcc/testsuite/gcc.dg/torture/pr84740.c.jj   2018-03-07 12:13:02.153271986 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr84740.c  2018-03-07 12:12:42.643270368 
+0100
@@ -0,0 +1,24 @@
+/* PR tree-optimization/84740 */
+
+void
+frobulate_for_gcc (unsigned int v)
+{
+  const char *s;
+  switch (v)
+{
+case 0:
+  s = "foo";
+  break;
+case 1:
+  s = "bar";
+  break;
+case 2:
+  s = "spam";
+  break;
+default:
+  s = (const char *) 0;
+  break;
+}
+  if (!s)
+__builtin_printf ("%s\n", s);
+}

Jakub


[PATCH] __builtin_early_constant_p (PR c++/78420)

2018-03-07 Thread Jakub Jelinek
Hi!

This is the implementation of __builtin_early_constant_p builtin which
is at all optimization levels quite similar to __builtin_constant_p
at -O0, except that the FE folding might be slightly different between -O0
and -O1+.  In any case, the builtin is folded to 0 already during the FEs
if the argument can't be proven constant or non-constant, like with
__builtin_constant_p at -O0.
The users can use it for stuff where __builtin_constant_p doesn't really
work (e.g. the kernel ilog2 case), for e.g. C macro constexpr-like
programming.

Is this ok for stage1, or do you find it not useful at all?

In any case, bootstrapped/regtested on x86_64-linux and i686-linux.

2018-03-07  Jakub Jelinek  

PR c++/78420
* builtins.def (BUILT_IN_EARLY_CONSTANT_P): New built-in.
* builtins.h (force_folding_builtin_constant_p): Change from bool to
int.
* builtins.c (force_folding_builtin_constant_p): Likewise.
(expand_builtin): Handle BUILT_IN_EARLY_CONSTANT_P like
BUILT_IN_CONSTANT_P.
(fold_builtin_constant_p): Check force_folding_builtin_constant_p > 0
instead of force_folding_builtin_constant_p.
(fold_builtin_1) : Don't return
integer_zero_node if force_folding_builtin_constant_p < 0.
: New case.
(is_simple_builtin): Handle BUILT_IN_EARLY_CONSTANT_P like
BUILT_IN_CONSTANT_P.
* doc/extend.texi (__builtin_early_constant_p): Document.
c-family/
* c-common.c (check_builtin_function_arguments): Handle
BUILT_IN_EARLY_CONSTANT_P like BUILT_IN_CONSTANT_P.
c/
* c-parser.c (c_parser_get_builtin_args): Adjust for
force_folding_builtin_constant_p now being int rather than bool.
cp/
* cp-tree.h (DECL_IS_BUILTIN_CONSTANT_P): Handle
BUILT_IN_EARLY_CONSTANT_P like BUILT_IN_CONSTANT_P.
* call.c (magic_varargs_p): Likewise.
* tree.c (builtin_valid_in_constant_expr_p): Likewise.
* constexpr.c (cxx_eval_builtin_function_call): Likewise.
Adjust for force_folding_builtin_constant_p now being int rather than
bool.
* cp-gimplify.c: Include builtins.h.
(cp_fold): For __builtin*_constant_p adjust temporarily
force_folding_builtin_constant_p rather than optimize.
testsuite/
* g++.dg/delayedfold/builtin-constant4.C: New test.
* g++.dg/delayedfold/builtin-constant3.C: New test.
* g++.dg/ext/builtin14.C: New test.
* g++.dg/ext/builtin12.C: New test.
* g++.dg/ext/builtin13.C: New test.
* g++.dg/cpp1y/constexpr-78420.C: New test.
* g++.dg/cpp0x/constexpr-builtin5.C: New test.
* g++.dg/cpp0x/constexpr-builtin4.C: New test.
* gcc.dg/torture/pr78420.c: New test.

--- gcc/builtins.def.jj 2018-01-03 10:19:55.103533949 +0100
+++ gcc/builtins.def2018-03-07 15:55:20.756617250 +0100
@@ -857,6 +857,7 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_DCGETTE
 DEF_EXT_LIB_BUILTIN(BUILT_IN_DGETTEXT, "dgettext", 
BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
 DEF_GCC_BUILTIN(BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL)
 DEF_GCC_BUILTIN(BUILT_IN_DWARF_SP_COLUMN, "dwarf_sp_column", 
BT_FN_UINT, ATTR_NULL)
+DEF_GCC_BUILTIN(BUILT_IN_EARLY_CONSTANT_P, "early_constant_p", 
BT_FN_INT_VAR, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_EH_RETURN, "eh_return", 
BT_FN_VOID_PTRMODE_PTR, ATTR_NORETURN_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_EH_RETURN_DATA_REGNO, "eh_return_data_regno", 
BT_FN_INT_INT, ATTR_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_EXECL, "execl", 
BT_FN_INT_CONST_STRING_CONST_STRING_VAR, ATTR_SENTINEL_NOTHROW_LIST)
--- gcc/builtins.h.jj   2018-01-04 00:43:16.103702766 +0100
+++ gcc/builtins.h  2018-03-07 16:31:54.898348803 +0100
@@ -46,8 +46,10 @@ extern struct target_builtins *this_targ
 #define this_target_builtins (_target_builtins)
 #endif
 
-/* Non-zero if __builtin_constant_p should be folded right away.  */
-extern bool force_folding_builtin_constant_p;
+/* 1 if __builtin_{,early_}constant_p should be folded right away,
+   -1 if __builtin_{,early_}constant_p should not be folded to 0 when
+   fold_builtin_constant_p returned NULL.  */
+extern int force_folding_builtin_constant_p;
 
 extern bool is_builtin_fn (tree);
 extern bool called_as_built_in (tree);
--- gcc/builtins.c.jj   2018-02-19 20:35:54.418015330 +0100
+++ gcc/builtins.c  2018-03-07 16:32:12.326353626 +0100
@@ -91,8 +91,10 @@ const char * built_in_names[(int) END_BU
initialized to NULL_TREE.  */
 builtin_info_type builtin_info[(int)END_BUILTINS];
 
-/* Non-zero if __builtin_constant_p should be folded right away.  */
-bool force_folding_builtin_constant_p;
+/* 1 if __builtin_{,early_}constant_p should be folded right away,
+   -1 if __builtin_{,early_}constant_p should not be folded to 0 when
+   fold_builtin_constant_p returned NULL.  */
+int force_folding_builtin_constant_p;
 
 

Re: [PATCH] Fix ICE for static vars in offloaded functions

2018-03-07 Thread Jakub Jelinek
On Wed, Mar 07, 2018 at 02:20:26PM +0100, Tom de Vries wrote:
> Fix ICE for static vars in offloaded functions
> 
> 2018-03-06  Tom de Vries  
> 
>   PR lto/84592
>   * varpool.c (varpool_node::get_create): Mark static variables in
>   offloaded functions as offloadable.
> 
>   * testsuite/libgomp.c/pr84592-2.c: New test.
>   * testsuite/libgomp.c/pr84592.c: New test.
>   * testsuite/libgomp.oacc-c-c++-common/pr84592-3.c: New test.

Ok, thanks

Jakub


Re: C++ PATCH for c++/84036, ICE with variadic lambda capture

2018-03-07 Thread Jason Merrill
On Sun, Feb 11, 2018 at 8:21 PM, Jason Merrill  wrote:
> The old lambda model handled variadic capture by focusing on the
> FIELD_DECL rather than trying to map between capture proxies.  The new
> model relies more on capture proxies, so it makes sense to use them
> more for variadic capture as well.  So with this patch we treat a
> variadic capture proxy as a pack, rather than the field.

Which makes the recently added is_capture_proxy_with_ref redundant.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 4f59367cbb4fe82c68f9df30e49af8a2a19dd6cf
Author: Jason Merrill 
Date:   Tue Mar 6 17:31:44 2018 -0500

* lambda.c (is_capture_proxy_with_ref): Remove.

* constexpr.c, expr.c, cp-tree.h, semantics.c: Adjust.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index bd53bfbfe47..2c5a71f3ee5 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -5429,7 +5429,7 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
 case VAR_DECL:
   if (DECL_HAS_VALUE_EXPR_P (t))
 	{
-	  if (now && is_capture_proxy_with_ref (t))
+	  if (now && is_normal_capture_proxy (t))
 	{
 	  /* -- in a lambda-expression, a reference to this or to a
 		 variable with automatic storage duration defined outside that
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 8f3ec86e8ce..3a6f8f33e8d 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6896,7 +6896,6 @@ extern void insert_capture_proxy		(tree);
 extern void insert_pending_capture_proxies	(void);
 extern bool is_capture_proxy			(tree);
 extern bool is_normal_capture_proxy (tree);
-extern bool is_capture_proxy_with_ref   (tree);
 extern void register_capture_members		(tree);
 extern tree lambda_expr_this_capture(tree, bool);
 extern void maybe_generic_this_capture		(tree, tree);
diff --git a/gcc/cp/expr.c b/gcc/cp/expr.c
index b2c8cfaf88c..2e679868970 100644
--- a/gcc/cp/expr.c
+++ b/gcc/cp/expr.c
@@ -111,7 +111,7 @@ mark_use (tree expr, bool rvalue_p, bool read_p,
 {
 case VAR_DECL:
 case PARM_DECL:
-  if (rvalue_p && is_capture_proxy_with_ref (expr))
+  if (rvalue_p && is_normal_capture_proxy (expr))
 	{
 	  /* Look through capture by copy.  */
 	  tree cap = DECL_CAPTURED_VARIABLE (expr);
@@ -154,7 +154,7 @@ mark_use (tree expr, bool rvalue_p, bool read_p,
 	{
 	  /* Try to look through the reference.  */
 	  tree ref = TREE_OPERAND (expr, 0);
-	  if (rvalue_p && is_capture_proxy_with_ref (ref))
+	  if (rvalue_p && is_normal_capture_proxy (ref))
 	{
 	  /* Look through capture by reference.  */
 	  tree cap = DECL_CAPTURED_VARIABLE (ref);
diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 345b210e89c..094979e81a3 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -291,24 +291,13 @@ is_normal_capture_proxy (tree decl)
   return DECL_NORMAL_CAPTURE_P (val);
 }
 
-/* Returns true iff DECL is a capture proxy for which we can use
-   DECL_CAPTURED_VARIABLE.  In effect, this is a normal proxy other than a
-   nested capture of a function parameter pack.  */
-
-bool
-is_capture_proxy_with_ref (tree var)
-{
-  return (is_normal_capture_proxy (var) && DECL_LANG_SPECIFIC (var)
-	  && DECL_CAPTURED_VARIABLE (var));
-}
-
 /* VAR is a capture proxy created by build_capture_proxy; add it to the
current function, which is the operator() for the appropriate lambda.  */
 
 void
 insert_capture_proxy (tree var)
 {
-  if (is_capture_proxy_with_ref (var))
+  if (is_normal_capture_proxy (var))
 {
   tree cap = DECL_CAPTURED_VARIABLE (var);
   if (CHECKING_P)
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 8a0096ddf92..bb8b5953539 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -3332,7 +3332,7 @@ process_outer_var_ref (tree decl, tsubst_flags_t complain, bool odr_use)
 {
   /* Check whether we've already built a proxy.  */
   tree var = decl;
-  while (is_capture_proxy_with_ref (var))
+  while (is_normal_capture_proxy (var))
 	var = DECL_CAPTURED_VARIABLE (var);
   tree d = retrieve_local_specialization (var);
 


Re: [C++ Patch/RFC] PR 71169 ("[7/8 Regression] ICE on invalid C++ code in pop_nested_class"), PR 71832 and more

2018-03-07 Thread Paolo Carlini

Hi,

On 07/03/2018 21:23, Jason Merrill wrote:

On Wed, Mar 7, 2018 at 12:18 PM, Paolo Carlini  wrote:

Hi,

[snip the various clarifications]

Il 7 Marzo 2018 17:57:07 CET, Jason Merrill  ha scritto:

My thought was that this patch adds a lot of managing of the flag in
different places in the parser, whereas looking for error_mark_node in
the template parms here would be just in one place.  But if you prefer
the current approach, that's fine, it's straightforward enough.

Thanks a lot for the various clarifications above, where essentially turns out 
that some details of my patch are correct essentially by chance ;) Seriously, 
I'm thinking the following: since 8 is getting real close, what if, for 8, for 
the known mild regressions, we go ahead with my super safe Plan B which I 
mentioned at beginning of the thread, then as soon as trunk branches we 
immediately apply my patch and we give it a serious spin, say we rebuild 
distros with it, and see what happens?

This is what I was suggesting, what do you think?
Eh, eh, certainly I don't have anything to say from the correctness 
point of view. As I already tried to explain, what I find annoying in 
this kind of approach, no matter how well is implemented, is that at 
parsing time we have to go anyway over all the parameters of all the 
template lists and then we know once and for all, for the corresponding 
class, whether there are erroneous parameters or not. In this kind of 
approach we go again through all the parameters, and, for example, 
multiple times when there are nested classes for example, I'm pretty 
sure in some (other) cases too (I should think more about that to be 
more specific). If you ask my opinion, at the moment I still believe 
that the best solution would be doing something at parsing time, save 
the information, in a more elegant way, maybe adding a special 
"erroneous TREE_VEC" flag to the TREE_VECs. I don't know exactly. Even 
better a unique flag for all the template parameter lists of each class. 
That said, if you don't foresee immediate performance-related issues, 
it's of course your call ;) If/when I will have a more concrete 
alternate proposal I will speak up ;) Nice anyway that we agree about 
the basic idea :-)


Paolo.


Re: [C++ Patch/RFC] PR 71169 ("[7/8 Regression] ICE on invalid C++ code in pop_nested_class"), PR 71832 and more

2018-03-07 Thread Jason Merrill
On Wed, Mar 7, 2018 at 12:18 PM, Paolo Carlini  wrote:
> Hi,
>
> [snip the various clarifications]
>
> Il 7 Marzo 2018 17:57:07 CET, Jason Merrill  ha scritto:
>>My thought was that this patch adds a lot of managing of the flag in
>>different places in the parser, whereas looking for error_mark_node in
>>the template parms here would be just in one place.  But if you prefer
>>the current approach, that's fine, it's straightforward enough.
>
> Thanks a lot for the various clarifications above, where essentially turns 
> out that some details of my patch are correct essentially by chance ;) 
> Seriously, I'm thinking the following: since 8 is getting real close, what 
> if, for 8, for the known mild regressions, we go ahead with my super safe 
> Plan B which I mentioned at beginning of the thread, then as soon as trunk 
> branches we immediately apply my patch and we give it a serious spin, say we 
> rebuild distros with it, and see what happens?

This is what I was suggesting, what do you think?
commit 706f372b52e65694161b9dff0272481d23fa898a
Author: Jason Merrill 
Date:   Wed Mar 7 14:45:19 2018 -0500

PR c++/71832 - ICE with ill-formed template parameter.

* pt.c (any_erroneous_template_args_p): New.
* parser.c (cp_parser_class_specifier_1): Use it.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 268be0fd543..8f3ec86e8ce 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6558,6 +6558,7 @@ extern int processing_template_parmlist;
 extern bool dependent_type_p			(tree);
 extern bool dependent_scope_p			(tree);
 extern bool any_dependent_template_arguments_p  (const_tree);
+extern bool any_erroneous_template_args_p	(const_tree);
 extern bool dependent_template_p		(tree);
 extern bool dependent_template_id_p		(tree, tree);
 extern bool type_dependent_expression_p		(tree);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index a19bbe1e1d0..f7a8be50b79 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -22682,6 +22682,8 @@ cp_parser_class_specifier_1 (cp_parser* parser)
   FOR_EACH_VEC_SAFE_ELT (unparsed_funs_with_default_args, ix, e)
 	{
 	  decl = e->decl;
+	  if (any_erroneous_template_args_p (decl))
+	continue;
 	  /* If there are default arguments that have not yet been processed,
 	 take care of them now.  */
 	  if (class_type != e->class_type)
@@ -22704,6 +22706,8 @@ cp_parser_class_specifier_1 (cp_parser* parser)
   save_ccr = current_class_ref;
   FOR_EACH_VEC_SAFE_ELT (unparsed_nsdmis, ix, decl)
 	{
+	  if (any_erroneous_template_args_p (decl))
+	continue;
 	  if (class_type != DECL_CONTEXT (decl))
 	{
 	  if (pushed_scope)
@@ -27642,6 +27646,9 @@ cp_parser_late_parsing_for_member (cp_parser* parser, tree member_function)
   if (DECL_FUNCTION_TEMPLATE_P (member_function))
 member_function = DECL_TEMPLATE_RESULT (member_function);
 
+  if (any_erroneous_template_args_p (member_function))
+return;
+
   /* There should not be any class definitions in progress at this
  point; the bodies of members are only parsed outside of all class
  definitions.  */
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 89024c10fe2..1ce04aaabc7 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -25048,6 +25048,38 @@ any_dependent_template_arguments_p (const_tree args)
   return false;
 }
 
+/* Returns true if ARGS contains any errors.  */
+
+bool
+any_erroneous_template_args_p (const_tree args)
+{
+  int i;
+  int j;
+
+  if (args && TREE_CODE (args) != TREE_VEC)
+{
+  if (tree ti = get_template_info (args))
+	args = TI_ARGS (ti);
+  else
+	args = NULL_TREE;
+}
+
+  if (!args)
+return false;
+  if (args == error_mark_node)
+return true;
+
+  for (i = 0; i < TMPL_ARGS_DEPTH (args); ++i)
+{
+  const_tree level = TMPL_ARGS_LEVEL (args, i + 1);
+  for (j = 0; j < TREE_VEC_LENGTH (level); ++j)
+	if (error_operand_p (TREE_VEC_ELT (level, j)))
+	  return true;
+}
+
+  return false;
+}
+
 /* Returns TRUE if the template TMPL is type-dependent.  */
 
 bool


[PATCH] rs6000: -mreadonly-in-sdata (PR82411)

2018-03-07 Thread Segher Boessenkool
This adds a new option -mreadonly-in-sdata (on by default) that
controls whether readonly data can be put in sdata.  (For EABI this
does nothing, readonly data is put in sdata2 as usual).

Tested etc.; committing.


Segher


2018-03-07  Segher Boessenkool  

* config/rs6000/rs6000.c (rs6000_elf_in_small_data_p): Don't put
readonly data in sdata, if that is disabled.
* config/rs6000/sysv4.opt (mreadonly-in-sdata): New option.
* doc/invoke.texi (RS/6000 and PowerPC Options): Document
-mreadonly-in-sdata option.

gcc/testsuite/
* gcc.target/powerpc/ppc-sdata-2.c: Skip if -mno-readonly-in-sdata.

---
 gcc/config/rs6000/rs6000.c | 5 +
 gcc/config/rs6000/sysv4.opt| 4 
 gcc/doc/invoke.texi| 9 -
 gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c | 1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index fd96fb8..5cc116f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -32590,6 +32590,11 @@ rs6000_elf_in_small_data_p (const_tree decl)
 }
   else
 {
+  /* If we are told not to put readonly data in sdata, then don't.  */
+  if (TREE_READONLY (decl) && rs6000_sdata != SDATA_EABI
+ && !rs6000_readonly_in_sdata)
+   return false;
+
   HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (decl));
 
   if (size > 0
diff --git a/gcc/config/rs6000/sysv4.opt b/gcc/config/rs6000/sysv4.opt
index 9534c1c..fb03c0a 100644
--- a/gcc/config/rs6000/sysv4.opt
+++ b/gcc/config/rs6000/sysv4.opt
@@ -27,6 +27,10 @@ msdata=
 Target RejectNegative Joined Var(rs6000_sdata_name)
 Select method for sdata handling.
 
+mreadonly-in-sdata
+Target Report Var(rs6000_readonly_in_sdata) Init(1) Save
+Allow readonly data in sdata.
+
 mtls-size=
 Target RejectNegative Joined Var(rs6000_tls_size) Enum(rs6000_tls_size)
 Specify bit size of immediate TLS offsets.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 82db9d6..ac5a198 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1085,7 +1085,7 @@ See RS/6000 and PowerPC Options.
 -mdlmzb  -mno-dlmzb @gol
 -mprototype  -mno-prototype @gol
 -msim  -mmvme  -mads  -myellowknife  -memb  -msdata @gol
--msdata=@var{opt}  -mvxworks  -G @var{num} @gol
+-msdata=@var{opt}  -mreadonly-in-sdata  -mvxworks  -G @var{num} @gol
 -mrecip  -mrecip=@var{opt}  -mno-recip  -mrecip-precision @gol
 -mno-recip-precision @gol
 -mveclibabi=@var{type}  -mfriz  -mno-friz @gol
@@ -24081,6 +24081,13 @@ On embedded PowerPC systems, put all initialized 
global and static data
 in the @code{.data} section, and all uninitialized data in the
 @code{.bss} section.
 
+@item -mreadonly-in-sdata
+@itemx -mreadonly-in-sdata
+@opindex mreadonly-in-sdata
+@opindex mno-readonly-in-sdata
+Put read-only objects in the @code{.sdata} section as well.  This is the
+default.
+
 @item -mblock-move-inline-limit=@var{num}
 @opindex mblock-move-inline-limit
 Inline all block moves (such as calls to @code{memcpy} or structure
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c 
b/gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c
index 570c81f..ee77456 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c
@@ -5,6 +5,7 @@
 /* { dg-final { scan-assembler-not "\\.section\[ \t\]\\.sdata2," } } */
 /* { dg-final { scan-assembler "sdat@sdarel\\(13\\)" } } */
 /* { dg-final { scan-assembler "sdat2@sdarel\\(13\\)" } } */
+/* { dg-skip-if "" { *-*-* } { "-mno-readonly-in-sdata" } { "" } } */
 
 
 int sdat = 2;
-- 
1.8.3.1



Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-03-07 Thread Jeff Law
On 02/21/2018 03:11 AM, Alexandre Oliva wrote:
> On Feb 15, 2018, Szabolcs Nagy  wrote:
> 
>> i see assembler slow downs with these location view patches
>> i opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84408
> 
> 
> [LVU] reset view at function entry, omit views at line zero
> 
> Location views might be associated with locations that lack line
> number information (line number zero), but since we omit .loc
> directives that would have been issued with line number zero, we also
> omit the symbolic view numbers that would have been issued at such
> points.
> 
> Resetting views at function entry points address some of these issues,
> and alleviate the huge chains of symbolic views that have burdened
> assemblers since we disabled -ginternal-reset-location-views by
> default, but other problems of undefined views remain when it's not
> the whole function that lacks line number info, just parts of it.
> 
> So, when we encounter a request to output a view that may have been
> referenced, but we decide to omit the .loc because the line is zero,
> we will now omit the view as well, i.e., we will internally regard
> that view as zero-numbered.
> 
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?
> 
> Uros, could you please confirm whether this fixes the 84404 go problem
> you reported on alpha?  I'm guessing it's the same issue.  TIA,
> 
> for  gcc/ChangeLog
> 
>   PR debug/84404
>   PR debug/84408
>   * dwarf2out.c (struct dw_line_info_table): Update comments for
>   view == -1.
>   (FORCE_RESET_NEXT_VIEW): New.
>   (FORCE_RESETTING_VIEW_P): New.
>   (RESETTING_VIEW_P): Check for -1 too.
>   (ZERO_VIEW_P): Likewise.
>   (new_line_info_table): Force-reset next view.
>   (dwarf2out_begin_function): Likewise.
>   (dwarf2out_source_line): Simplify zero_view_p initialization.
>   Test FORCE_RESETTING_VIEW_P and RESETTING_VIEW_P instead of
>   view directly.  Omit view when omitting .loc at line 0.
> 
> for  gcc/testsuite/ChangeLog
> 
>   PR debug/84404
>   PR debug/84408
>   * gcc.dg/graphite/pr84404.c: New.
OK.

jeff


Re: [PATCH] fix ICE in generic_overlap (PR 84526)

2018-03-07 Thread Jeff Law
On 02/26/2018 10:32 AM, Martin Sebor wrote:
> 
> PR tree-optimization/84526 - ICE in generic_overlap at 
> gcc/gimple-ssa-warn-restrict.c:927 since r257860
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/84526
>   * gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
>   Remove dead code.
>   (builtin_access::generic_overlap): Be prepared to handle non-array
>   base objects.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/84526
>   * gcc.dg/Wrestrict-10.c: New test.
> 
> Index: gcc/gimple-ssa-warn-restrict.c
> ===
> --- gcc/gimple-ssa-warn-restrict.c(revision 257963)
> +++ gcc/gimple-ssa-warn-restrict.c(working copy)
> @@ -396,6 +396,9 @@ builtin_memref::set_base_and_offset (tree expr)
>if (TREE_CODE (expr) == ADDR_EXPR)
>  expr = TREE_OPERAND (expr, 0);
>  
> +  /* Stash the reference for offset validation.  */
> +  ref = expr;
> +
>poly_int64 bitsize, bitpos;
>tree var_off;
>machine_mode mode;
> @@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)
>base = get_inner_reference (expr, , , _off,
> , , , );
>  
> +  /* get_inner_reference is not expected to return null.  */
> +  gcc_assert (base != NULL);
> +
>poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
>  
> -  HOST_WIDE_INT const_off;
> -  if (!base || !bytepos.is_constant (_off))
> +  /* Convert the poly_int64 offset to to offset_int.  The offset
> + should be constant but be prepared for it not to be just in
> + case.  */
> +  offset_int cstoff;
> +  if (bytepos.is_constant ())
>  {
> -  base = get_base_address (TREE_OPERAND (expr, 0));
> -  return;
> +  offrange[0] += cstoff;
> +  offrange[1] += cstoff;
> +
> +  /* Besides the reference saved above, also stash the offset
> +  for validation.  */
> +  if (TREE_CODE (expr) == COMPONENT_REF)
> + refoff = cstoff;
>  }
> +  else
> +offrange[1] += maxobjsize;
So is this assignment to offrange[1] really correct?

ISTM that it potentially overflows (relative to MAXOBJSIZE) and that
you'd likely be better off just assigning offrange[1] to MAXOBJSIZE.

Or alternately signal to the callers that we can't really analyze the
memory address and inhibit further analysis of the potential overlap of
the objects.

> @@ -918,12 +924,18 @@ builtin_access::generic_overlap ()
>if (!overlap_certain)
>  {
>if (!dstref->strbounded_p && !depends_p)
> + /* Memcpy only considers certain overlap.  */
>   return false;
>  
>/* There's no way to distinguish an access to the same member
>of a structure from one to two distinct members of the same
>structure.  Give up to avoid excessive false positives.  */
> -  tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));
> +  tree basetype = TREE_TYPE (dstref->base);
> +
> +  if (POINTER_TYPE_P (basetype)
> +   || TREE_CODE (basetype) == ARRAY_TYPE)
> + basetype = TREE_TYPE (basetype);
> +
>if (RECORD_OR_UNION_TYPE_P (basetype))
>   return false;
>  }
This is probably fairly reasonable.  My only concern would be that we
handle multi-dimensional arrays correctly.  Do you need to handle
ARRAY_TYPEs with a loop?

I do have a more general concern.  Given that we walk backwards through
pointer casts and ADDR_EXPRs we potentially lose information.  For
example we might walk backwards through a cast from a small array to a
larger array type.

Thus later we use the smaller array type when computing the bounds and
subsequent offset from BASE.  This could lead to a missed warning as the
computed offset would be too small.

I'm inclined to ack after fixing offrange[1] computation noted above as
I don't think the patch makes things any worse.  As I noted in a prior
message, I don't see concerns with consistency of BASE that Jakub sees here.

jeff


Re: [PATCH] fix ICE in generic_overlap (PR 84526)

2018-03-07 Thread Jeff Law
On 02/23/2018 02:46 PM, Martin Sebor wrote:
>> This doesn't address any of my concerns that it is completely random
>> what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer
>> (e.g. the argument of the function), sometimes the ADDR_EXPR operand,
>> sometimes the base of the reference, sometimes again address (if the
>> base of the reference is MEM_REF).  By the lack of consistency in what
>> it is, just deciding on its type whether you take TREE_TYPE or
>> TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g
>> call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has
>> pointer
>> type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)
>> would be true.
> 
> I think I understand what you're saying but this block is only
> used for string functions (not for memcpy), and only as a stopgap
> to avoid false positives.  Being limited to (a subset of) string
> functions the case I think you're concerned about, namely calling
> strcpy with a pointer to a pointer, shouldn't come up in valid
> code.  It's not bullet-proof but I don't think there is
> a fundamental problem, either with this patch or with the one
> that introduced it.  The fundamental problem is that MEM_REF
> can be ambiguous and that's what this code is trying to combat.
> See the example I gave here:
> https://gcc.gnu.org/ml/gcc/2018-02/msg00136.html
> 
> If you think I'm missing something please put together a small
> test case to help me see the problem.  I'm not nearly as
> comfortable thinking in terms of the internal representation
> to visualize all the possibilities here.
I think at least part of the issue here was that previously you had this
code in set_base_and_offset:

 HOST_WIDE_INT const_off;
  if (!base || !bytepos.is_constant (_off))
{
  base = get_base_address (TREE_OPERAND (expr, 0));
  return;
}

That is stripping away part of EXPR in a way I don't think was safe.
With this gone in the most recent patches I'm a lot less concerned.

Jeff




Re: [PATCH] Add a few new contrib.texi entries, especially for people who perform GCC fuzzy testing and report bugs (take 2)

2018-03-07 Thread Volker Reichelt

On 03/07/2018 03:24 PM, Jakub Jelinek wrote:


On Wed, Mar 07, 2018 at 02:06:33PM +0100, Jakub Jelinek wrote:

As appreciation of the hard work of people doing fuzzy testing of
GCC and reporting high quality bugs, I'd like to list them in contrib.texi.
This patch just lists them in the GCC testing group, shall we have a special
sub-list for the fuzzy testing?
The patch also adds or updates a couple of other entries, but I'm sure I've
missed many people in both the testing and GCC development categories, for
which I apologize.  We can always have incremental patches, contrib.texi
is helplessly obsolete anyway and for many doesn't reflect last 10-15 years
of work at all.  We should also add entries for people reporting lots of
bugs even if it is not from fuzzy testing.

Anyway, ok for trunk?  Are all people on the CC ok with them being listed in
there (reply privately if needed)?




Thanks Jakub, for the appreciation!

My way of saying thank you, is (of course) to file an new 
ice-on-valid-code regression:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84752  ;-)

Thanks again,
Volker



Re: [PATCH] fix ICE in generic_overlap (PR 84526)

2018-03-07 Thread Jeff Law
On 02/23/2018 01:13 PM, Jakub Jelinek wrote:
> On Fri, Feb 23, 2018 at 12:57:14PM -0700, Martin Sebor wrote:
>> +  /* get_inner_reference is not expected to return null.  */
>> +  gcc_assert (base != NULL);
>> +
>>poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
>>  
>> -  HOST_WIDE_INT const_off;
>> -  if (!base || !bytepos.is_constant (_off))
>> -{
>> -  base = get_base_address (TREE_OPERAND (expr, 0));
>> -  return;
>> -}
>> -
>> +  /* There is no conversion from poly_int64 to offset_int even
>> + though the latter is wider, so go through HOST_WIDE_INT.
>> + The offset is expected to always be constant.  */
>> +  HOST_WIDE_INT const_off = bytepos.to_constant ();
> 
> The assert is ok, but removing the bytepos.is_constant (_off)
> is wrong, I'm sure Richard S. can come up with some SVE testcase
> where it will not be constant.  If it is not constant, you can handle
> it like var_off (which as I said on IRC or in the PR also seems to be
> incorrect, because if the base is not a decl the variable offset could be
> negative).
> 
>>offrange[0] += const_off;
>>offrange[1] += const_off;
>>  
>> @@ -923,7 +923,11 @@ builtin_access::generic_overlap ()
>>/* There's no way to distinguish an access to the same member
>>   of a structure from one to two distinct members of the same
>>   structure.  Give up to avoid excessive false positives.  */
>> -  tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));
>> +  tree basetype = TREE_TYPE (dstref->base);
>> +  if (POINTER_TYPE_P (basetype)
>> +  || TREE_CODE (basetype) == ARRAY_TYPE)
>> +basetype = TREE_TYPE (basetype);
> 
> This doesn't address any of my concerns that it is completely random
> what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer
> (e.g. the argument of the function), sometimes the ADDR_EXPR operand,
> sometimes the base of the reference, sometimes again address (if the
> base of the reference is MEM_REF).  By the lack of consistency in what
> it is, just deciding on its type whether you take TREE_TYPE or
> TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g
> call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has pointer
> type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)
> would be true.
The intent of the code is that BASE is a pointer to a memory location
that can be compared to another BASE for the purposes of trying to
determine if there's an overlap between two arguments.

Note that BASE may have a different type than what actually appeared in
the call.  Martin's code is presented with the actual argument, but then
tries to walk backwards to refine that value.  In particular it might
look through pointer typecasts or pointer arithmetic.  The latter
results in a memory address plus an offset.

It may also look at an ADDR_EXPR and strip away its various _REF nodes.
The result in this case is similar to pointer arithmetic in that we will
have a memory address plus an offset.

Note that the memory address computed in here and stored into BASE may
differ in its type from what is actually passed to the function for
which we are checking for invalid argument overlap.  In fact, I think

The key here is boil arguments down to a memory location plus an
optional offset, which in turn allows us to compare them against each
other for overlap.  In some ways it's like what we do in alias.c for RTL
(which isn't necessary a shining example of good code).

I don't think the lack of consistency is really a major issue inside
builtin_memref::set_base_and_offset -- as long as clients use the
information just for warnings and are aware that the underlying types
may differ from what appeared in the call itself.

Removal of the undesirable base = get_base_address (TREE_OPERAND (expr,
0)) definitely helps clarify IMHO.

I'll have more comments as a follow-up to one of Martin's messages.

Jeff


[PATCH 2/2] ifcvt: unfixed testcase for 2nd issue within PR tree-optimization/84178

2018-03-07 Thread David Malcolm
Here's a testcase for the 2nd issue within PR tree-optimization/84178.

With -03 -fno-tree-forwprop, trunk and gcc 7 segfault inside update_ssa
when called from version_loop_for_if_conversion when a loop is versioned.

During loop_version, some blocks are duplicated, and this can duplicate
SSA names, leading to the duplicated SSA names being added to
tree-into-ssa.c's old_ssa_names.

For example, _117 is an SSA name defined in one of these to-be-duplicated
blocks, and hence is added to old_ssa_names when duplicated.

One of the BBs has this gimplified predicate (again, these stmts are *not*
yet in a BB):
  _173 = h4.1_112 != 0;
  _171 = (signed char) _117;
  _172 = _171 >= 0;
  _170 = ~_172;
  _169 = _170 & _173;

Note the reference to SSA name _117 in the above.

When update_ssa runs later on in version_loop_for_if_conversion,
SSA name _117 is in the old_ssa_names bitmap, and thus has
prepare_use_sites_for called on it:

2771  /* If an old name is in NAMES_TO_RELEASE, we cannot remove it from
2772 OLD_SSA_NAMES, but we have to ignore its definition site.  */
2773  EXECUTE_IF_SET_IN_BITMAP (old_ssa_names, 0, i, sbi)
2774{
2775  if (names_to_release == NULL || !bitmap_bit_p (names_to_release, 
i))
2776prepare_def_site_for (ssa_name (i), insert_phi_p);
2777  prepare_use_sites_for (ssa_name (i), insert_phi_p);
2778}

prepare_use_sites_for iterates over the immediate users, which includes
the:
  _171 = (signed char) _117;
in the gimplified predicate.

This tried to call "mark_block_for_update" on the BB of this def_stmt,
leading to a read-through-NULL segfault, since this statement isn't in a
BB yet.

In an earlier attempt to fix this I wrote:
> The following patch fixes the 2nd ICE by inserting the gimplified predicates
> earlier: immediately before the possible version_loop_for_if_conversion,
> rather than within combine_blocks.  That way they're in their BBs before
> we attempt any further manipulation of the CFG.

to which Richard responded:
> Hum, but that will alter both copies of the loops, no?  I think the fix is
> to instead delay the update_ssa call to _after_ combine_blocks ()
> (and remember if it is necessary just in case we didn't version).

I attempted this, but didn't come up with something that worked; so am
posting this in the hope that Richard (or someone else) can finish the
fix.

gcc/testsuite/ChangeLog:
PR tree-optimization/84178
* gcc.c-torture/compile/pr84178-2.c: New test.
---
 gcc/testsuite/gcc.c-torture/compile/pr84178-2.c | 18 ++
 1 file changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr84178-2.c

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr84178-2.c 
b/gcc/testsuite/gcc.c-torture/compile/pr84178-2.c
new file mode 100644
index 000..b2548e9
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr84178-2.c
@@ -0,0 +1,18 @@
+/* { dg-options "-fno-tree-forwprop" } */
+
+int zy, h4;
+
+void
+r8 (long int mu, int *jr, int *fi, short int dv)
+{
+  do
+{
+  int tx;
+
+  tx = !!h4 ? (zy + h4) : 1;
+  mu = tx;
+  *jr = (((unsigned char) mu > (254 >> dv)) ? 0 : (unsigned char) tx) + 
*fi;
+} while (*jr == 0);
+
+  r8 (mu, jr, fi, 1);
+}
-- 
1.8.5.3



[PATCH 1/2] tree-if-conv.c: fix ICE seen with -fno-tree-forwprop (PR tree-optimization/84178)

2018-03-07 Thread David Malcolm
PR tree-optimization/84178 reports a couple of source files that ICE inside
ifcvt when compiled with -03 -fno-tree-forwprop (trunk and gcc 7).

Both cases involve problems with ifcvt's per-BB gimplified predicates.

Testcase 1 fails this assertion within release_bb_predicate during cleanup:

283   if (flag_checking)
284 for (gimple_stmt_iterator i = gsi_start (stmts);
285  !gsi_end_p (i); gsi_next ())
286   gcc_assert (! gimple_use_ops (gsi_stmt (i)));

The testcase contains a division in the loop, which leads to
if_convertible_loop_p returning false (due to gimple_could_trap_p being true
for the division).  This happens *after* the per-BB gimplified predicates
have been created in predicate_bbs (loop).
Hence tree_if_conversion bails out to "cleanup", but the gimplified predicates
exist and make use of SSA names; for example this conjunction for two BB
conditions:

  _4 = h4.1_112 != 0;
  _175 = (signed char) _117;
  _176 = _175 >= 0;
  _174 = _4 & _176;

is using SSA names.

This assertion was added in r236498 (aka 
c3deca2519d97c55876869c57cf11ae1e5c6cf8b):

2016-05-20  Richard Biener  

* tree-if-conv.c (add_bb_predicate_gimplified_stmts): Use
gimple_seq_add_seq_without_update.
(release_bb_predicate): Assert we have no operands to free.
(if_convertible_loop_p_1): Calculate post dominators later.
Do not free BB predicates here.
(combine_blocks): Do not recompute BB predicates.
(version_loop_for_if_conversion): Save BB predicates around
loop versioning.

* gcc.dg/tree-ssa/ifc-cd.c: Adjust.

The following patch fixes this by adding a call to gimple_seq_discard
to release_bb_predicate.  It also updates the assertion, so that
instead of asserting the stmts have no imm uses, instead assert that
they weren't added to a bb before discarding them (otherwise discarding
them would be a bug).  We know this is the case because
insert_gimplified_predicates has:

  /* Once the sequence is code generated, set it to NULL.  */
  set_bb_predicate_gimplified_stmts (bb, NULL);

but asserting it seems appropriate as a double-check.

The patch doesn't address the 2nd issue within PR tree-optimization/84178.

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
PR tree-optimization/84178
* tree-if-conv.c (release_bb_predicate): Remove the
the assertion that the stmts have NULL use_ops.
Discard the statements, asserting that they haven't
yet been added to a BB.

gcc/testsuite/ChangeLog:
PR tree-optimization/84178
* gcc.c-torture/compile/pr84178-1.c: New test.
---
 gcc/testsuite/gcc.c-torture/compile/pr84178-1.c | 18 ++
 gcc/tree-if-conv.c  |  8 +---
 2 files changed, 23 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr84178-1.c

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr84178-1.c 
b/gcc/testsuite/gcc.c-torture/compile/pr84178-1.c
new file mode 100644
index 000..49f2c89
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr84178-1.c
@@ -0,0 +1,18 @@
+/* { dg-options "-fno-tree-forwprop" } */
+
+int zy, h4;
+
+void
+r8 (long int mu, int *jr, int *fi, short int dv)
+{
+  do
+{
+  int tx;
+
+  tx = !!h4 ? (zy / h4) : 1;
+  mu = tx;
+  *jr = (((unsigned char) mu > (254 >> dv)) ? 0 : (unsigned char) tx) + 
*fi;
+} while (*jr == 0);
+
+  r8 (mu, jr, fi, 1);
+}
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index cac3fd7..5467f3f 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -271,8 +271,7 @@ init_bb_predicate (basic_block bb)
   set_bb_predicate (bb, boolean_true_node);
 }
 
-/* Release the SSA_NAMEs associated with the predicate of basic block BB,
-   but don't actually free it.  */
+/* Release the SSA_NAMEs associated with the predicate of basic block BB.  */
 
 static inline void
 release_bb_predicate (basic_block bb)
@@ -280,11 +279,14 @@ release_bb_predicate (basic_block bb)
   gimple_seq stmts = bb_predicate_gimplified_stmts (bb);
   if (stmts)
 {
+  /* Ensure that these stmts haven't yet been added to a bb.  */
   if (flag_checking)
for (gimple_stmt_iterator i = gsi_start (stmts);
 !gsi_end_p (i); gsi_next ())
- gcc_assert (! gimple_use_ops (gsi_stmt (i)));
+ gcc_assert (! gimple_bb (gsi_stmt (i)));
 
+  /* Discard them.  */
+  gimple_seq_discard (stmts);
   set_bb_predicate_gimplified_stmts (bb, NULL);
 }
 }
-- 
1.8.5.3



Re: [PATCH/RFC] tree-if-conv.c: fix two ICEs seen with -fno-tree-forwprop (PR tree-optimization/84178)

2018-03-07 Thread David Malcolm
On Fri, 2018-02-16 at 12:48 +0100, Richard Biener wrote:
> On Thu, Feb 15, 2018 at 11:07 PM, David Malcolm 
> wrote:
> > On Fri, 2018-02-09 at 12:02 +0100, Richard Biener wrote:
> > > On Thu, Feb 8, 2018 at 11:23 PM, David Malcolm  > > om>
> > > wrote:
> > > > PR tree-optimization/84178 reports a couple of source files
> > > > that
> > > > ICE inside
> > > > ifcvt when compiled with -03 -fno-tree-forwprop (trunk and gcc
> > > > 7).
> > > > 
> > > > Both cases involve problems with ifcvt's per-BB gimplified
> > > > predicates.
> > > > 
> > > > Testcase 1 fails this assertion within release_bb_predicate
> > > > during
> > > > cleanup:
> > > > 
> > > > 283   if (flag_checking)
> > > > 284 for (gimple_stmt_iterator i = gsi_start
> > > > (stmts);
> > > > 285  !gsi_end_p (i); gsi_next ())
> > > > 286   gcc_assert (! gimple_use_ops (gsi_stmt (i)));
> > > > 
> > > > The testcase contains a division in the loop, which leads to
> > > > if_convertible_loop_p returning false (due to
> > > > gimple_could_trap_p
> > > > being true
> > > > for the division).  This happens *after* the per-BB gimplified
> > > > predicates
> > > > have been created in predicate_bbs (loop).
> > > > Hence tree_if_conversion bails out to "cleanup", but the
> > > > gimplified
> > > > predicates
> > > > exist and make use of SSA names; for example this conjunction
> > > > for
> > > > two BB
> > > > conditions:
> > > > 
> > > >   _4 = h4.1_112 != 0;
> > > >   _175 = (signed char) _117;
> > > >   _176 = _175 >= 0;
> > > >   _174 = _4 & _176;
> > > > 
> > > > is using SSA names.
> > > 
> > > But then this shouldn't cause any stmt operands to be created -
> > > who
> > > is calling
> > > update_stmt () on a stmt using the SSA names?  Maybe something
> > > calls
> > > force_gimple_operand_gsi to add to the sequence?
> > 
> > 
> > The immediate use is created deep within folding when the
> > gimplified
> > predicate is created.
> > 
> > Here's the backtrace of exactly where:
> > 
> > (gdb) bt
> > #0  link_imm_use_stmt (linknode=0x71a0b8d0, def= > 0x71a196c0>, stmt=)
> > at ../../src/gcc/ssa-iterators.h:307
> > #1  0x012531c5 in add_use_op (fn=0x71a03000,
> > stmt=, op=0x71a236d8,
> > last=0x7fffcb10) at ../../src/gcc/tree-ssa-operands.c:307
> > #2  0x01253607 in finalize_ssa_uses (fn=0x71a03000,
> > stmt=)
> > at ../../src/gcc/tree-ssa-operands.c:410
> > #3  0x0125368b in finalize_ssa_stmt_operands
> > (fn=0x71a03000, stmt=)
> > at ../../src/gcc/tree-ssa-operands.c:436
> > #4  0x01254b62 in build_ssa_operands (fn=0x71a03000,
> > stmt=)
> > at ../../src/gcc/tree-ssa-operands.c:948
> > #5  0x012550df in update_stmt_operands (fn=0x71a03000,
> > stmt=)
> > at ../../src/gcc/tree-ssa-operands.c:1081
> > #6  0x00c10642 in update_stmt_if_modified (s= > 0x71a23690>) at ../../src/gcc/gimple-ssa.h:185
> > #7  0x00c10e82 in update_modified_stmts
> > (seq=0x71a23690) at ../../src/gcc/gimple-iterator.c:58
> > #8  0x00c111f1 in gsi_insert_seq_before (i=0x7fffcfb0,
> > seq=0x71a23690, mode=GSI_SAME_STMT)
> > at ../../src/gcc/gimple-iterator.c:217
> > #9  0x00c241d0 in replace_stmt_with_simplification
> > (gsi=0x7fffcfb0, rcode=..., ops=0x7fffcdb0,
> > seq=0x7fffcdd8, inplace=false) at ../../src/gcc/gimple-
> > fold.c:4473
> > #10 0x00c25a63 in fold_stmt_1 (gsi=0x7fffcfb0,
> > inplace=false, valueize=0xc2663b )
> > at ../../src/gcc/gimple-fold.c:4775
> > #11 0x00c266b7 in fold_stmt (gsi=0x7fffcfb0) at
> > ../../src/gcc/gimple-fold.c:4996
> > #12 0x00c552b1 in maybe_fold_stmt (gsi=0x7fffcfb0) at
> > ../../src/gcc/gimplify.c:3193
> > #13 0x00c5f1e9 in gimplify_modify_expr
> > (expr_p=0x7fffd328, pre_p=0x7fffd910,
> > post_p=0x7fffd1e0,
> > want_value=false) at ../../src/gcc/gimplify.c:5803
> > #14 0x00c7b461 in gimplify_expr (expr_p=0x7fffd328,
> > pre_p=0x7fffd910, post_p=0x7fffd1e0,
> > gimple_test_f=0xc5d723 , fallback=0) at
> > ../../src/gcc/gimplify.c:11434
> > #15 0x00c62661 in gimplify_stmt (stmt_p=0x7fffd328,
> > seq_p=0x7fffd910) at ../../src/gcc/gimplify.c:6658
> > #16 0x00c4c449 in gimplify_and_add (t= > 0x71a26230>, seq_p=0x7fffd910) at
> > ../../src/gcc/gimplify.c:441
> > #17 0x00c4cc89 in internal_get_tmp_var (val= > 0x71a26140>, pre_p=0x7fffd910, post_p=0x0, is_formal=true,
> > allow_ssa=true) at ../../src/gcc/gimplify.c:597
> > #18 0x00c4ccd2 in get_formal_tmp_var (val= > 0x71a26140>, pre_p=0x7fffd910) at
> > ../../src/gcc/gimplify.c:618
> > #19 0x00c7ee6a in gimplify_expr (expr_p=0x71a261b0,
> > pre_p=0x7fffd910, post_p=0x7fffd790,
> > gimple_test_f=0xc0f6d0 

Re: [PING] [PATCH] consider successor blocks when avoiding -Wstringop-truncation (PR 84468)

2018-03-07 Thread Jeff Law
On 03/01/2018 02:17 PM, Martin Sebor wrote:
> 
> I read you last reply as asking me to handle multiple edges.
Sorry, I should have been clearer.

You need to be prepared for the possibility of multiple edges and handle
them in a conservatively correct way.  The most likely way to get
multiple outgoing edges is going to be via exception handling.

In this code I think that's easily accomplished by making the code which
walks into the successor block(s) conditional on single_succ_p (bb) and
that the edge is not marked as EDGE_ABNORMAL.

> The original patch handled just one edge and didn't bother
> with EDGE_ABNORMAL because I wanted to keep it simple.
Understood.  But that's not safe in the sense that once you have
multiple successors, you don't know which one you will transfer control
to -- thus you have to check all of them for a suitable store.  If any
doesn't have a suitable store, then you'd have to warn.

Essentially the question you need to answer is "given I'm in block X, do
all paths from block X have a suitable store to terminate the string
prior to a use".  You can handle that in the trivial case with the code
you're proposing in this patch, and that's fine for gcc-8.

But the "right" way to answer that question is to look at the virtual
operand chains.  Though as we've discussed that may not necessarily be a
good thing.





  But
> it sounds like you want me to go back to the first version
> and just add a check for EDGE_ABNORMAL.  The attached patch
> does that (plus it skips over any non-debug statements in
> the successor block). 
Correct.  I think your original patch with a check for single_succ_p is
the direction I think we want for gcc-8.



 Besides the usual I retested it with
> the Linux kernel.  There are just three instances of
> the warning in my default configuration but I know there
> are a lot more in more extensive builds.
I've actually got a fair amount of data here on kernel builds using the
trunk on a variety of architectures.  I haven't gone through it all yet,
but there's certainly some string-op truncation warnings and a few
others.  I haven't figured out how best to aggregate that info and I
don't want to duplicate Arnd's work.




> 
> 
>> I glanced over the tests and I didn't see any that would benefit from
>> handling multiple edges or the recursion (every one of the dg-bogus
>> markers should be immediately transferring control to the null
>> termination statement AFAICT).
> 
> I've added some more test cases involving C++ exceptions (and
> found one false positive(*)) but I don't have a test to exercise
> blocks with multiple outgoing edges.  For future reference, how
> would I create one?
EH is the best bet to create them.  You might also be able to get one if
we've got fake edges in the CFG (necessary for things like
post-dominator computation).  async exceptions could likely create them
fairly easily.


My comment was more about not seeing the need to look into successor
blocks if there's more than one.


Can you file a bug report on the false positive so that we have a
reminder to revisit looking at the virtual operand chains as a better
solution here?

I think this version is OK.

Thanks, and again sorry for the confusion.


Jeff


[PATCH][AArch64] PR target/84748: Mark *compare_cstore_insn as clobbering CC reg

2018-03-07 Thread Kyrill Tkachov

Hi all,

In this wrong-code PR the combine pass ends up moving a CC-using instruction past a 
*compare_cstore_insn
insn_and_split. After reload the *compare_cstore_insn splitter ends up 
generating a SUBS instruction that
clobbers the condition flags, and things go bad.

The solution is simple, the *compare_cstore_insn pattern should specify 
that it clobbers the CC register
so that combine (or any other pass) does not assume that it can move CC-using 
patterns across it.

This patch does that and fixes the testcase.

The testcase FAILs on GCC 8 only, but the buggy pattern is in GCC 6 onwards, so 
we should backport this as
a latent bug fix after it's had some time to bake in trunk.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk and later backports?

Thanks,
Kyrill

2018-03-07  Kyrylo Tkachov  

PR target/84748
* config/aarch64/aarch64.md (*compare_cstore_insn): Mark pattern
as clobbering CC_REGNUM.

2018-03-07  Kyrylo Tkachov  

PR target/84748
* gcc.c-torture/execute/pr84748.c: New test.
commit 9c008e450e1cd95f2905071e6c8a0ff92b856358
Author: Kyrylo Tkachov 
Date:   Wed Mar 7 15:05:25 2018 +

[AArch64] PR target/84748: Mark *compare_cstore_insn as clobbering CC reg

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 69ff5ca..4463a13 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3242,7 +3242,8 @@ (define_insn "aarch64_cstore"
 (define_insn_and_split "*compare_cstore_insn"
   [(set (match_operand:GPI 0 "register_operand" "=r")
 	 (EQL:GPI (match_operand:GPI 1 "register_operand" "r")
-		  (match_operand:GPI 2 "aarch64_imm24" "n")))]
+		  (match_operand:GPI 2 "aarch64_imm24" "n")))
+   (clobber (reg:CC CC_REGNUM))]
   "!aarch64_move_imm (INTVAL (operands[2]), mode)
&& !aarch64_plus_operand (operands[2], mode)
&& !reload_completed"
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr84748.c b/gcc/testsuite/gcc.c-torture/execute/pr84748.c
new file mode 100644
index 000..9572ab2
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr84748.c
@@ -0,0 +1,34 @@
+/* { dg-require-effective-target int128 } */
+
+typedef unsigned __int128 u128;
+
+int a, c, d;
+u128 b;
+
+unsigned long long g0, g1;
+
+void
+store (unsigned long long a0, unsigned long long a1)
+{
+  g0 = a0;
+  g1 = a1;
+}
+
+void
+foo (void)
+{
+  b += a;
+  c = d != 84347;
+  b /= c;
+  u128 x = b;
+  store (x >> 0, x >> 64);
+}
+
+int
+main (void)
+{
+  foo ();
+  if (g0 != 0 || g1 != 0)
+__builtin_abort ();
+  return 0;
+}


Re: Fix some _GLIBCXX_DEBUG pretty printer errors

2018-03-07 Thread François Dumont

On 06/03/2018 22:21, Jonathan Wakely wrote:



@@ -575,10 +586,12 @@ class StdDebugIteratorPrinter:
 # and return the wrapped iterator value.
 def to_string (self):
 base_type = gdb.lookup_type('__gnu_debug::_Safe_iterator_base')
+itype = self.val.type.template_argument(0)
 safe_seq = self.val.cast(base_type)['_M_sequence']
-if not safe_seq or self.val['_M_version'] != safe_seq['_M_version']:
+if not safe_seq:
+return str(self.val.cast(itype))

So what's the effect of this change when we get a value-initialized
debug iterator? It prints the wrapped (value-initialized) non-debug
iterator instead?


Yes, for instance this test in smiple11.cc:


  std::deque::iterator deqiter0;
// { dg-final { note-test deqiter0 {non-dereferenceable iterator for 
std::deque} } }


    Was failing when running 'make check-debug 
RUNTESTFLAGS=prettyprinters.exp' because it was displaying 'invalid 
iterator' rather than non-dereferenceable. Now it is succeeded.


François



Re: [C++ Patch/RFC] PR 71169 ("[7/8 Regression] ICE on invalid C++ code in pop_nested_class"), PR 71832 and more

2018-03-07 Thread Paolo Carlini


Hi,

[snip the various clarifications]

Il 7 Marzo 2018 17:57:07 CET, Jason Merrill  ha scritto:
>My thought was that this patch adds a lot of managing of the flag in
>different places in the parser, whereas looking for error_mark_node in
>the template parms here would be just in one place.  But if you prefer
>the current approach, that's fine, it's straightforward enough.

Thanks a lot for the various clarifications above, where essentially turns out 
that some details of my patch are correct essentially by chance ;) Seriously, 
I'm thinking the following: since 8 is getting real close, what if, for 8, for 
the known mild regressions, we go ahead with my super safe Plan B which I 
mentioned at beginning of the thread, then as soon as trunk branches we 
immediately apply my patch and we give it a serious spin, say we rebuild 
distros with it, and see what happens?

Thanks,
Paolo


Re: [PATCH] Fix reg-stack error-recovery ICE (PR inline-asm/84683)

2018-03-07 Thread Jeff Law
On 03/06/2018 01:03 AM, Uros Bizjak wrote:
> On Mon, Mar 5, 2018 at 9:42 PM, Jakub Jelinek  wrote:
>> Hi!
>>
>> If we discover some bad inline-asm during reg-stack processing and we
>> error on those, we replace that inline-asm with a (use (const_int 0))
>> and therefore the various assumptions of reg-stack pass may not hold.
>> Seems we already have a couple of spots which are more permissive if
>> any_malformed_asm is true, this patch just adds another one.
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>
>> 2018-03-05  Jakub Jelinek  
>>
>> PR inline-asm/84683
>> * reg-stack.c (move_for_stack_reg): If any_malformed_asm, avoid
>> assertion failure.
>>
>> * g++.dg/ext/pr84683.C: New test.
> 
> LGTM.
Likewise.
jeff


Re: [C++ Patch/RFC] PR 71169 ("[7/8 Regression] ICE on invalid C++ code in pop_nested_class"), PR 71832 and more

2018-03-07 Thread Jason Merrill
On Wed, Mar 7, 2018 at 10:55 AM, Paolo Carlini  wrote:
> Hi,
>
> On 07/03/2018 16:38, Jason Merrill wrote:
>>
>> On Tue, Mar 6, 2018 at 5:00 PM, Paolo Carlini 
>> wrote:
>>>
>>> On 06/03/2018 21:33, Jason Merrill wrote:

 Interesting, that seems like a promising idea.  I'm not sure we need
 to do this based on an error in a default template arg, though; can we
 drop

 +  || error_operand_p (TREE_PURPOSE (parameter)))

 ?
>>>
>>> Good point, yes, I believe we can, isn't necessary for all the snippets I
>>> have around neither apparently to pass the testsuite (but this is rather
>>> straightforward, right?). As I said, I tried many different things, some
>>> also fiddled with TREE_PURPOSE, in pt.c too, but in what I sent I only
>>> added
>>> the check out of a reflex. Anyway, the below is in libstdc++, so far so
>>> good.
>>
>> What if, instead of adding another flag to cp_parser, we look for
>> errors in the template parms for a particular member before we do any
>> late parsing for it?
>
> Thus do the check before:
>
> FOR_EACH_VEC_SAFE_ELT (unparsed_funs_with_definitions, ix, decl)
>   cp_parser_late_parsing_for_member (parser, decl);
>
> and therefore skip the whole set of unparsed_funs? Because the template
> parameters to be considered is the same for all of them, right?

No; member templates have more template parameters; the error might be
in a nested class, and not affect functions from an enclosing class.

> (otherwise something is seriously wrong with my very idea!)

Not that seriously; once we've seen an error all bets are off, and
we're free to decide that enclosing class functions aren't worth
parsing in that case, either.

> And what about the other
> entities in the 'else if (--parser->num_classes_being_defined == 0)' body?

They also have associated decls where we can look at template parameters.

> On one hand I'm certainly concerned that we may be too heavy handed, on the
> other hand we may risk inconsistencies, with some definitions available
> during error recovery other not, where all of them have broken outer
> template parameters anyway. Well, in general, I must say that - assuming the
> possibly erroneous template parameters are shared by all the entities in the
> 'else if (--parser->num_classes_being_define == 0)' body - it would be too
> bad going through again all the template parameters where with just a bool
> we could save the work that we already did anyway, if see what I mean...

My thought was that this patch adds a lot of managing of the flag in
different places in the parser, whereas looking for error_mark_node in
the template parms here would be just in one place.  But if you prefer
the current approach, that's fine, it's straightforward enough.

Jason


Re: [PATCH] Add a few new contrib.texi entries, especially for people who perform GCC fuzzy testing and report bugs (take 2)

2018-03-07 Thread Gerald Pfeifer
On Wed, 7 Mar 2018, Jakub Jelinek wrote:
> Ok for trunk?
> 
> 2018-03-07  Jakub Jelinek  
> 
>   * doc/contrib.texi: Add entries for Martin Liska, David Malcolm,
>   Marek Polacek, extend Vladimir Makarov's, Jonathan Wakely's and
>   Volker Reichelt's entry and add entries for people that perform
>   GCC fuzzy testing and report numerous bugs.

Yes!  Thank you very much for doing this, Jakub!

(I wasn't familiar with all the folks helping with bug reports,
but am more than happy to defer to your judgement.)

Gerald


Re: [C++ Patch/RFC] PR 71169 ("[7/8 Regression] ICE on invalid C++ code in pop_nested_class"), PR 71832 and more

2018-03-07 Thread Paolo Carlini

Hi,

On 07/03/2018 16:38, Jason Merrill wrote:

On Tue, Mar 6, 2018 at 5:00 PM, Paolo Carlini  wrote:

On 06/03/2018 21:33, Jason Merrill wrote:

Interesting, that seems like a promising idea.  I'm not sure we need
to do this based on an error in a default template arg, though; can we
drop

+  || error_operand_p (TREE_PURPOSE (parameter)))

?

Good point, yes, I believe we can, isn't necessary for all the snippets I
have around neither apparently to pass the testsuite (but this is rather
straightforward, right?). As I said, I tried many different things, some
also fiddled with TREE_PURPOSE, in pt.c too, but in what I sent I only added
the check out of a reflex. Anyway, the below is in libstdc++, so far so
good.

What if, instead of adding another flag to cp_parser, we look for
errors in the template parms for a particular member before we do any
late parsing for it?

Thus do the check before:

    FOR_EACH_VEC_SAFE_ELT (unparsed_funs_with_definitions, ix, decl)
      cp_parser_late_parsing_for_member (parser, decl);

and therefore skip the whole set of unparsed_funs? Because the template 
parameters to be considered is the same for all of them, right? 
(otherwise something is seriously wrong with my very idea!) And what 
about the other entities in the 'else if 
(--parser->num_classes_being_defined == 0)' body? On one hand I'm 
certainly concerned that we may be too heavy handed, on the other hand 
we may risk inconsistencies, with some definitions available during 
error recovery other not, where all of them have broken outer template 
parameters anyway. Well, in general, I must say that - assuming the 
possibly erroneous template parameters are shared by all the entities in 
the 'else if (--parser->num_classes_being_define == 0)' body - it would 
be too bad going through again all the template parameters where with 
just a bool we could save the work that we already did anyway, if see 
what I mean...


Paolo.




Re: [C++ Patch/RFC] PR 71169 ("[7/8 Regression] ICE on invalid C++ code in pop_nested_class"), PR 71832 and more

2018-03-07 Thread Jason Merrill
On Tue, Mar 6, 2018 at 5:00 PM, Paolo Carlini  wrote:
> On 06/03/2018 21:33, Jason Merrill wrote:
>>
>> Interesting, that seems like a promising idea.  I'm not sure we need
>> to do this based on an error in a default template arg, though; can we
>> drop
>>
>> +  || error_operand_p (TREE_PURPOSE (parameter)))
>>
>> ?
>
> Good point, yes, I believe we can, isn't necessary for all the snippets I
> have around neither apparently to pass the testsuite (but this is rather
> straightforward, right?). As I said, I tried many different things, some
> also fiddled with TREE_PURPOSE, in pt.c too, but in what I sent I only added
> the check out of a reflex. Anyway, the below is in libstdc++, so far so
> good.

What if, instead of adding another flag to cp_parser, we look for
errors in the template parms for a particular member before we do any
late parsing for it?

Jason


libgo patch committed: push arena on AIX higher due to clashes

2018-03-07 Thread Ian Lance Taylor
This libgo patch by Tony Reix changes the arena position on AIX higher
to avoid clashes.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 258336)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-2a07cd31927ac943104f55d2b696e53e7cd073b3
+112623c89ee42b42bc748f12d9c704615634501b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/malloc.go
===
--- libgo/go/runtime/malloc.go  (revision 258052)
+++ libgo/go/runtime/malloc.go  (working copy)
@@ -308,9 +308,9 @@ func mallocinit() {
p = uintptr(i)<<40 | uintptrMask&(0x0040<<32)
case GOOS == "aix":
if i == 0 {
-   p = uintptrMask&(1<<32) | 
uintptrMask&(0xa0<<52)
+   p = uintptrMask&(1<<42) | 
uintptrMask&(0xa0<<52)
} else {
-   p = uintptr(i)<<32 | 
uintptrMask&(0x70<<52)
+   p = uintptr(i)<<42 | 
uintptrMask&(0x70<<52)
}
default:
p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32)


Re: [PATCH] fix ICE in generic_overlap (PR 84526)

2018-03-07 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01441.html

Jeff, I know this is in your queue.  I ping it because a bug
with the same root cause was reported in RHBZ #1552639.

On 02/26/2018 11:13 AM, Martin Sebor wrote:

On 02/26/2018 10:39 AM, Jakub Jelinek wrote:

On Mon, Feb 26, 2018 at 10:32:27AM -0700, Martin Sebor wrote:

That attached revision updates the pass to directly convert
the poly64_int into offset_int using the API suggested by
Richard (thanks again).  As discussed below, I've made no
other changes.

Jakub, if you still have concerns that the false positive
suppression logic isn't sufficiently effective please post
a test case (or open a new bug).  I will look into it when
I get a chance.


Yes, I still have major concerns, I explained what I'd like to see
(differentiate clearly between what base is, either by adding a flag or
having separate base and base_addr, then in the users you can easily find
out what is what and can decide based on that).


I'm open to making these improvements but without a test case
or a way to validate them I'm not comfortable making intrusive
changes in this area at this late stage.  As I explained, the
code cannot be reached under the conditions you described.
In any event, the improvements you suggest are independent of
the fix for the ICE.


Until then, I'd like to get the ICE fix committed. Please
confirm that the attached patch is good to go in.


Due to the above concerns, I don't think the patch is good to go in.
If you find somebody else who thinks the patch is ok, I won't fight
against
it though.


Jeff, when you have a chance, can you please review/approve
the patch?

https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01441.html

Martin




Re: C++ PATCH for c++/84684, wrong caching when evaluating a constexpr function

2018-03-07 Thread Jason Merrill
On Tue, Mar 6, 2018 at 3:48 PM, Marek Polacek  wrote:
> On Tue, Mar 06, 2018 at 03:39:36PM -0500, Jason Merrill wrote:
>> On Tue, Mar 6, 2018 at 1:13 PM, Marek Polacek  wrote:
>> > But I'm also wondering about massage_init_elt.  It has
>> >   tree t = fold_non_dependent_expr (init);
>> >   t = maybe_constant_init (t);
>> > but given that fold_non_dependent_expr now calls maybe_constant_value, 
>> > which
>> > then causes that we try to cache the calls above, this seems excessive,
>> > wouldn't we be better off with just calling fold_non_dependent_init as
>> > discussed recently?
>>
>> Probably.
>
> Do you want me to try it for GCC 8 or should we table it for GCC 9?
> I would think the latter since it's not a regression, just an optimization.

Agreed.

Jason


Re: [PR84620] output symbolic entry_view as data2, not addr

2018-03-07 Thread Jason Merrill
On Tue, Mar 6, 2018 at 2:28 AM, Jakub Jelinek  wrote:
> On Tue, Mar 06, 2018 at 03:13:11AM -0300, Alexandre Oliva wrote:
>> Jakub wrote (in the BZ):
>>
>> > I meant to say:
>> > The char * GTY ((tag ("dw_val_class_symview"))) val_symbolic_view; line
>> > should come at the and of the union, not before the other classes.
>>
>> I've moved the union field down in the revised patch below, but I don't
>> see the point, and I thought it would be better to keep it close to
>> logically-similar entries.  If the point is just to make it parallel to
>> the order of the enum (which manh other entries don't seem to have cared
>> about), maybe moving the enum would be better?
>
> I think the order should match the order of the dw_val_class entries and
> should be sorted from the most commonly used ones (ones used by most
> different attributes etc.), so that somebody trying to learn dwarf2out
> stuff can learn it more easily (say the dw_val_class_const,
> dw_val_class_const_unsigned, dw_val_class_flag first etc.), but apparently
> it is completely random already, I'll defer to Jason what he wants.

I don't think the order of the GTY tags matters much, it's just boilerplate.

Jason


Re: [PATCH] Fix ICE for static vars in offloaded functions

2018-03-07 Thread Richard Biener
On Wed, 7 Mar 2018, Tom de Vries wrote:

> On 03/07/2018 02:29 PM, Richard Biener wrote:
> > On Wed, 7 Mar 2018, Jakub Jelinek wrote:
> > 
> > > On Wed, Mar 07, 2018 at 02:20:26PM +0100, Tom de Vries wrote:
> > > > Fix ICE for static vars in offloaded functions
> > > > 
> > > > 2018-03-06  Tom de Vries  
> > > > 
> > > > PR lto/84592
> > > > * varpool.c (varpool_node::get_create): Mark static variables in
> > > > offloaded functions as offloadable.
> > > > 
> > > > * testsuite/libgomp.c/pr84592-2.c: New test.
> > > > * testsuite/libgomp.c/pr84592.c: New test.
> > > > * testsuite/libgomp.oacc-c-c++-common/pr84592-3.c: New test.
> > > 
> > > Ok, thanks
> > 
> > +  bool in_offload_func
> > +   = (cfun
> > +  && TREE_STATIC (decl)
> > +  && (lookup_attribute ("omp target entr
> > 
> > I think you want to use decl_function_context (decl) here,
> > not rely on magic cfun being set.  The whole varpool.c file
> > doesn't mention cfun yet and you shoudln't either.
> > 
> 
> decl_function_context (decl) returns main:
> ...
> (gdb) call debug_generic_expr (decl)
> test
> (gdb) call  decl_function_context (decl)
> $2 = (tree_node *) 0x76978c00
> (gdb) call debug_generic_expr ($2)
> main
> ...
> while the function annotated as being an offload function is main._omp_fn.0.

Well, that's because the static isn't duplicated (it can't be) so it 
retains the original context.

> The varpool_node::get_create is called during cgraph_edge::rebuild_edges here
> in expand_omp_target:

But at this point it's not created but just looked up, right?

I think the fix is to mark the decl as offloaded when we walk the IL
of the outlined function.  The current point looks like a hack.

Richard.

> ...
> 7087  /* Fix the callgraph edges for child_cfun.  Those for cfun will
> be
> 7088 fixed in a following pass.  */
> 7089  push_cfun (child_cfun);
> 7090  if (need_asm)
> 7091assign_assembler_name_if_needed (child_fn);
> 7092  cgraph_edge::rebuild_edges ();
> ...
> 
> Thanks,
> - Tom
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH] Fix ICE for static vars in offloaded functions

2018-03-07 Thread Tom de Vries

On 03/07/2018 02:29 PM, Richard Biener wrote:

On Wed, 7 Mar 2018, Jakub Jelinek wrote:


On Wed, Mar 07, 2018 at 02:20:26PM +0100, Tom de Vries wrote:

Fix ICE for static vars in offloaded functions

2018-03-06  Tom de Vries  

PR lto/84592
* varpool.c (varpool_node::get_create): Mark static variables in
offloaded functions as offloadable.

* testsuite/libgomp.c/pr84592-2.c: New test.
* testsuite/libgomp.c/pr84592.c: New test.
* testsuite/libgomp.oacc-c-c++-common/pr84592-3.c: New test.


Ok, thanks


+  bool in_offload_func
+   = (cfun
+  && TREE_STATIC (decl)
+  && (lookup_attribute ("omp target entr

I think you want to use decl_function_context (decl) here,
not rely on magic cfun being set.  The whole varpool.c file
doesn't mention cfun yet and you shoudln't either.



decl_function_context (decl) returns main:
...
(gdb) call debug_generic_expr (decl)
test
(gdb) call  decl_function_context (decl)
$2 = (tree_node *) 0x76978c00
(gdb) call debug_generic_expr ($2)
main
...
while the function annotated as being an offload function is main._omp_fn.0.

The varpool_node::get_create is called during cgraph_edge::rebuild_edges 
here in expand_omp_target:

...
7087  /* Fix the callgraph edges for child_cfun.  Those for cfun 
will be

7088 fixed in a following pass.  */
7089  push_cfun (child_cfun);
7090  if (need_asm)
7091assign_assembler_name_if_needed (child_fn);
7092  cgraph_edge::rebuild_edges ();
...

Thanks,
- Tom


Re: [PATCH] Document gcov-io (PR gcov-profile/84735).

2018-03-07 Thread Nathan Sidwell

On 03/07/2018 09:31 AM, Martin Liška wrote:

Hi.

Attempt of the patch is to explain that consumers of gcov tool should
use the intermediate format instead of reading of the profile files.
The format change is documented.


ok


--
Nathan Sidwell


[PATCH] Document gcov-io (PR gcov-profile/84735).

2018-03-07 Thread Martin Liška
Hi.

Attempt of the patch is to explain that consumers of gcov tool should
use the intermediate format instead of reading of the profile files.
The format change is documented.

Ready for trunk?
Thanks,
Martin

gcc/ChangeLog:

2018-03-07  Martin Liska  

PR gcov-profile/84735
* doc/gcov.texi: Document usage of profile files.
* gcov-io.h: Document changes in the format.
---
 gcc/doc/gcov.texi | 6 +++---
 gcc/gcov-io.h | 8 ++--
 2 files changed, 9 insertions(+), 5 deletions(-)


diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index d4c7806bc23..59235876aaa 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -816,9 +816,9 @@ A separate @file{.gcda} file is created for each object file compiled with
 this option.  It contains arc transition counts, value profile counts, and
 some summary information.
 
-The full details of the file format is specified in @file{gcov-io.h},
-and functions provided in that header file should be used to access the
-coverage files.
+It is not recommended to access the coverage files directly.
+Consumers should use the intermediate format that is provided
+by @command{gcov} tool via @option{--intermediate-format} option.
 
 @node Cross-profiling
 @section Data File Relocation to Support Cross-Profiling
diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h
index 18937e35474..d6389c48908 100644
--- a/gcc/gcov-io.h
+++ b/gcc/gcov-io.h
@@ -48,7 +48,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 	padding: | char:0 | char:0 char:0 | char:0 char:0 char:0
 	item: int32 | int64 | string
 
-   The basic format of the files is
+   The basic format of the notes file is
+
+	file : int32:magic int32:version int32:stamp int32:support_unexecuted_blocks record*
+
+   The basic format of the data file is
 
	file : int32:magic int32:version int32:stamp record*
 
@@ -104,7 +108,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 	function-graph: announce_function basic_blocks {arcs | lines}*
 	announce_function: header int32:ident
 		int32:lineno_checksum int32:cfg_checksum
-		string:name string:source int32:lineno
+		string:name string:source int32:start_lineno int32:start_column int32:end_lineno
 	basic_block: header int32:flags*
 	arcs: header int32:block_no arc*
 	arc:  int32:dest_block int32:flags



libgo patch committed: Use fence instruction before rdtsc

2018-03-07 Thread Ian Lance Taylor
This libgo patch uses an appropriate fence instruction before the
rdtsc instruction.  This is required on some multicore implementations
in order to consistent cycle counts on different cores.  Bootstrapped
and ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 258259)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-3287064c24cbf0c50776cdb87a720d29130b4363
+2a07cd31927ac943104f55d2b696e53e7cd073b3
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/runtime/runtime_c.c
===
--- libgo/runtime/runtime_c.c   (revision 257914)
+++ libgo/runtime/runtime_c.c   (working copy)
@@ -33,13 +33,47 @@ runtime_atoi(const byte *p, intgo len)
return n;
 }
 
+#if defined(__i386__) || defined(__x86_64__) || defined (__s390__) || defined 
(__s390x__)
+
+// When cputicks is just asm instructions, skip the split stack
+// prologue for speed.
+
+int64 runtime_cputicks(void) __attribute__((no_split_stack));
+
+#endif
+
+// Whether the processor supports SSE2.
+#if defined (__i386__)
+static _Bool hasSSE2;
+
+// Force appropriate CPU level so that we can call the lfence/mfence
+// builtins.
+
+#pragma GCC push_options
+#pragma GCC target("sse2")
+
+#elif defined(__x86_64__)
+#define hasSSE2 true
+#endif
+
+#if defined(__i386__) || defined(__x86_64__)
+// Whether to use lfence, as opposed to mfence.
+// Set based on cpuid.
+static _Bool lfenceBeforeRdtsc;
+#endif // defined(__i386__) || defined(__x86_64__)
+
 int64
 runtime_cputicks(void)
 {
-#if defined(__386__) || defined(__x86_64__)
-  uint32 low, high;
-  asm("rdtsc" : "=a" (low), "=d" (high));
-  return (int64)(((uint64)high << 32) | (uint64)low);
+#if defined(__i386__) || defined(__x86_64__)
+  if (hasSSE2) {
+if (lfenceBeforeRdtsc) {
+  __builtin_ia32_lfence();
+} else {
+  __builtin_ia32_mfence();
+}
+  }
+  return __builtin_ia32_rdtsc();
 #elif defined (__s390__) || defined (__s390x__)
   uint64 clock = 0;
   /* stckf may not write the return variable in case of a clock error, so make
@@ -56,6 +90,10 @@ runtime_cputicks(void)
 #endif
 }
 
+#if defined(__i386__)
+#pragma GCC pop_options
+#endif
+
 void
 runtime_signalstack(byte *p, uintptr n)
 {
@@ -146,8 +184,21 @@ runtime_cpuinit()
 #if defined(__i386__) || defined(__x86_64__)
unsigned int eax, ebx, ecx, edx;
 
+   if (__get_cpuid(0, , , , )) {
+   if (eax != 0
+   && ebx == 0x756E6547// "Genu"
+   && edx == 0x49656E69// "ineI"
+   && ecx == 0x6C65746E) { // "ntel"
+   lfenceBeforeRdtsc = true;
+   }
+   }
if (__get_cpuid(1, , , , )) {
setCpuidECX(ecx);
+#if defined(__i386__)
+   if ((edx & bit_SSE2) != 0) {
+   hasSSE2 = true;
+   }
+#endif
}
 
 #if defined(HAVE_AS_X86_AES)


[PATCH] Add a few new contrib.texi entries, especially for people who perform GCC fuzzy testing and report bugs (take 2)

2018-03-07 Thread Jakub Jelinek
On Wed, Mar 07, 2018 at 02:06:33PM +0100, Jakub Jelinek wrote:
> As appreciation of the hard work of people doing fuzzy testing of
> GCC and reporting high quality bugs, I'd like to list them in contrib.texi.
> This patch just lists them in the GCC testing group, shall we have a special
> sub-list for the fuzzy testing?
> The patch also adds or updates a couple of other entries, but I'm sure I've
> missed many people in both the testing and GCC development categories, for
> which I apologize.  We can always have incremental patches, contrib.texi
> is helplessly obsolete anyway and for many doesn't reflect last 10-15 years
> of work at all.  We should also add entries for people reporting lots of
> bugs even if it is not from fuzzy testing.
> 
> Anyway, ok for trunk?  Are all people on the CC ok with them being listed in
> there (reply privately if needed)?

Zdenek Sojka reported that he already has an entry for the fuzzing in a
different section and with more details, so this adjusted patch moves
all other people doing the fuzzing into the same section and with the same
wording (and extends Volker's entry).

Ok for trunk?

2018-03-07  Jakub Jelinek  

* doc/contrib.texi: Add entries for Martin Liska, David Malcolm,
Marek Polacek, extend Vladimir Makarov's, Jonathan Wakely's and
Volker Reichelt's entry and add entries for people that perform
GCC fuzzy testing and report numerous bugs.

--- gcc/doc/contrib.texi.jj 2018-01-03 10:20:21.699538202 +0100
+++ gcc/doc/contrib.texi2018-03-07 15:18:21.036146284 +0100
@@ -586,6 +586,11 @@ Chen Liqin for various S+core related fi
 maintaining the S+core port.
 
 @item
+Martin Liska for his work on identical code folding, the sanitizers,
+HSA, general bug fixing and for running automated regression testing of GCC
+and reporting numerous bugs.
+
+@item
 Weiwen Liu for testing and various bug fixes.
 
 @item
@@ -615,8 +620,13 @@ various code generation improvements, wo
 @item
 Vladimir Makarov for hacking some ugly i960 problems, PowerPC hacking
 improvements to compile-time performance, overall knowledge and
-direction in the area of instruction scheduling, and design and
-implementation of the automaton based instruction scheduler.
+direction in the area of instruction scheduling, design and
+implementation of the automaton based instruction scheduler and
+design and implementation of the integrated and local register allocators.
+
+@item
+David Malcolm for his work on improving GCC diagnostics, JIT, self-tests
+and unit testing.
 
 @item
 Bob Manson for his behind the scenes work on dejagnu.
@@ -736,6 +746,10 @@ engine setup, various documentation fixe
 Geoff Noer for his work on getting cygwin native builds working.
 
 @item
+Vegard Nossum for running automated regression testing of GCC and reporting
+numerous bugs.
+
+@item
 Diego Novillo for his work on Tree SSA, OpenMP, SPEC performance
 tracking web pages, GIMPLE tuples, and assorted fixes.
 
@@ -784,6 +798,10 @@ out lots of problems we need to solve, m
 taking care of documentation maintenance in general.
 
 @item
+Marek Polacek for his work on the C front end, the sanitizers and general
+bug fixing.
+
+@item
 Andrew Pinski for processing bug reports by the dozen.
 
 @item
@@ -805,7 +823,12 @@ David Reese of Sun Microsystems contribu
 port.
 
 @item
-Volker Reichelt for keeping up with the problem reports.
+John Regehr for running automated regression testing of GCC and reporting
+numerous bugs.
+
+@item
+Volker Reichelt for running automated regression testing of GCC and reporting
+numerous bugs and for keeping up with the problem reports.
 
 @item
 Joern Rennecke for maintaining the sh port, loop, regmove & reload
@@ -947,6 +970,10 @@ Zdenek Sojka for running automated regre
 numerous bugs.
 
 @item
+Arseny Solokha for running automated regression testing of GCC and reporting
+numerous bugs.
+
+@item
 Jayant Sonar for contributing the CR16 port.
 
 @item
@@ -960,6 +987,10 @@ Jan Stein of the Chalmers Computer Socie
 Genix, as well as part of the 32000 machine description.
 
 @item
+Gerhard Steinmetz for running automated regression testing of GCC and reporting
+numerous bugs.
+
+@item
 Nigel Stephens for various mips16 related fixes/improvements.
 
 @item
@@ -979,6 +1010,14 @@ recently his vxworks contributions
 Jeff Sturm for Java porting help, bug fixes, and encouragement.
 
 @item
+Zhendong Su for running automated regression testing of GCC and reporting
+numerous bugs.
+
+@item
+Chengnian Sun for running automated regression testing of GCC and reporting
+numerous bugs.
+
+@item
 Shigeya Suzuki for this fixes for the bsdi platforms.
 
 @item
@@ -1050,7 +1089,7 @@ Andrew Waterman for contributing the RIS
 
 @item
 Jonathan Wakely for contributing libstdc++ Doxygen notes and XHTML
-guidance.
+guidance and maintaining libstdc++.
 
 @item
 Dean Wakerley for converting the install documentation from HTML to texinfo
@@ -1128,6 +1167,10 @@ Kevin 

Re: [PATCH] Fix ICE for static vars in offloaded functions

2018-03-07 Thread Jakub Jelinek
On Wed, Mar 07, 2018 at 02:29:48PM +0100, Richard Biener wrote:
> On Wed, 7 Mar 2018, Jakub Jelinek wrote:
> 
> > On Wed, Mar 07, 2018 at 02:20:26PM +0100, Tom de Vries wrote:
> > > Fix ICE for static vars in offloaded functions
> > > 
> > > 2018-03-06  Tom de Vries  
> > > 
> > >   PR lto/84592
> > >   * varpool.c (varpool_node::get_create): Mark static variables in
> > >   offloaded functions as offloadable.
> > > 
> > >   * testsuite/libgomp.c/pr84592-2.c: New test.
> > >   * testsuite/libgomp.c/pr84592.c: New test.
> > >   * testsuite/libgomp.oacc-c-c++-common/pr84592-3.c: New test.
> > 
> > Ok, thanks
> 
> +  bool in_offload_func
> +   = (cfun
> +  && TREE_STATIC (decl)
> +  && (lookup_attribute ("omp target entr
> 
> I think you want to use decl_function_context (decl) here, 
> not rely on magic cfun being set.  The whole varpool.c file
> doesn't mention cfun yet and you shoudln't either.

Oops, sure, thanks for catching it.

Jakub


Re: [PATCH] Fix ICE for static vars in offloaded functions

2018-03-07 Thread Richard Biener
On Wed, 7 Mar 2018, Jakub Jelinek wrote:

> On Wed, Mar 07, 2018 at 02:20:26PM +0100, Tom de Vries wrote:
> > Fix ICE for static vars in offloaded functions
> > 
> > 2018-03-06  Tom de Vries  
> > 
> > PR lto/84592
> > * varpool.c (varpool_node::get_create): Mark static variables in
> > offloaded functions as offloadable.
> > 
> > * testsuite/libgomp.c/pr84592-2.c: New test.
> > * testsuite/libgomp.c/pr84592.c: New test.
> > * testsuite/libgomp.oacc-c-c++-common/pr84592-3.c: New test.
> 
> Ok, thanks

+  bool in_offload_func
+   = (cfun
+  && TREE_STATIC (decl)
+  && (lookup_attribute ("omp target entr

I think you want to use decl_function_context (decl) here, 
not rely on magic cfun being set.  The whole varpool.c file
doesn't mention cfun yet and you shoudln't either.

please fix if you already committed the fix.

Thanks,
Richard.

>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PATCH] Fix ICE for static vars in offloaded functions

2018-03-07 Thread Tom de Vries

Hi,

if we compile the testcase pr84592-2.c from the patch:
...
#include 

int
main (void)
{
  int n[1];

  n[0] = 3;

#pragma omp target
  {
static int test[4] = { 1, 2, 3, 4 };
n[0] += test[n[0]];
  }

  if (n[0] != 7)
abort ();

  return 0;
}
...

for nvptx offloading, we run into an assert:
...
lto1: internal compiler error: in input_varpool_node, at lto-cgraph.c:1424
0x959ebb input_varpool_node
gcc/lto-cgraph.c:1422
0x959ebb input_cgraph_1
gcc/lto-cgraph.c:1544
0x959ebb input_symtab()
gcc/lto-cgraph.c:1858
0x5aceac read_cgraph_and_symbols
gcc/lto/lto.c:2891
0x5aceac lto_main()
gcc/lto/lto.c:3356
...

The assert we run into is:
...
1422  gcc_assert (flag_ltrans
1423  || (!node->in_other_partition
1424  && !node->used_from_other_partition));
...

where node is:
...
(gdb) call debug_generic_expr (node.decl)
test
...

and the reason the assert triggers is:
...
(gdb) p node.in_other_partition
$1 = 1
...

AFAIU, what this means is that the variable test is placed in a 
different partition than the offloading function main._omp_fn.0 that 
uses the variable.



I looked at where global variables are put into offload_vars, and found 
that that happens in varpool_node::get_create:

...
  if ((flag_openacc || flag_openmp)
  && lookup_attribute ("omp declare target",
   DECL_ATTRIBUTES (decl)))
{
  node->offloadable = 1;
  if (ENABLE_OFFLOADING && !DECL_EXTERNAL (decl))
{
  g->have_offload = true;
  if (!in_lto_p)
vec_safe_push (offload_vars, decl);
}
}

...

The patch fixes the ICE there by marking the varpool_node test as 
offloadable as well.


Build and reg-tested libgomp on x86_64 with nvptx accelerator.
Bootstrapped and reg-tested on x86_64.

OK for stage4 trunk?

Thanks,
- Tom
Fix ICE for static vars in offloaded functions

2018-03-06  Tom de Vries  

	PR lto/84592
	* varpool.c (varpool_node::get_create): Mark static variables in
	offloaded functions as offloadable.

	* testsuite/libgomp.c/pr84592-2.c: New test.
	* testsuite/libgomp.c/pr84592.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/pr84592-3.c: New test.

---
 gcc/varpool.c  | 18 +---
 libgomp/testsuite/libgomp.c/pr84592-2.c| 20 ++
 libgomp/testsuite/libgomp.c/pr84592.c  | 32 ++
 .../libgomp.oacc-c-c++-common/pr84592-3.c  | 32 ++
 4 files changed, 98 insertions(+), 4 deletions(-)

diff --git a/gcc/varpool.c b/gcc/varpool.c
index 418753cca2a..a4fd892ca4d 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -151,11 +151,21 @@ varpool_node::get_create (tree decl)
   node = varpool_node::create_empty ();
   node->decl = decl;
 
-  if ((flag_openacc || flag_openmp)
-  && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
+  if (flag_openacc || flag_openmp)
 {
-  node->offloadable = 1;
-  if (ENABLE_OFFLOADING && !DECL_EXTERNAL (decl))
+  bool offload_var
+	= lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl));
+  bool in_offload_func
+	= (cfun
+	   && TREE_STATIC (decl)
+	   && (lookup_attribute ("omp target entrypoint",
+ DECL_ATTRIBUTES (cfun->decl))
+	   || lookup_attribute ("omp declare target",
+DECL_ATTRIBUTES (cfun->decl;
+  if (offload_var || in_offload_func)
+	node->offloadable = 1;
+
+  if (offload_var && ENABLE_OFFLOADING && !DECL_EXTERNAL (decl))
 	{
 	  g->have_offload = true;
 	  if (!in_lto_p)
diff --git a/libgomp/testsuite/libgomp.c/pr84592-2.c b/libgomp/testsuite/libgomp.c/pr84592-2.c
new file mode 100644
index 000..021497b28ff
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr84592-2.c
@@ -0,0 +1,20 @@
+#include 
+
+int
+main (void)
+{
+  int n[1];
+
+  n[0] = 3;
+
+#pragma omp target
+  {
+static int test[4] = { 1, 2, 3, 4 };
+n[0] += test[n[0]];
+  }
+
+  if (n[0] != 7)
+abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/pr84592.c b/libgomp/testsuite/libgomp.c/pr84592.c
new file mode 100644
index 000..197fd19bacc
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr84592.c
@@ -0,0 +1,32 @@
+/* { dg-additional-options "-ftree-switch-conversion" } */
+
+#include 
+
+int
+main (void)
+{
+  int n[1];
+
+  n[0] = 4;
+
+#pragma omp target
+  {
+int a = n[0];
+
+switch (a & 3)
+  {
+  case 0: a = 4; break;
+  case 1: a = 3; break;
+  case 2: a = 2; break;
+  default:
+	a = 1; break;
+  }
+
+n[0] = a;
+  }
+
+  if (n[0] != 4)
+abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84592-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84592-3.c
new file mode 100644
index 000..afcc1de7635
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84592-3.c
@@ -0,0 +1,32 @@
+/* { dg-additional-options 

Re: [PATCH] Ada: Fix s-oscons.ads generation

2018-03-07 Thread Arnaud Charlet
> >>The $(GNATLIBCFLAGS) are already included in $(GNATLIBCFLAGS_FOR_C).
> >>
> >>We must call the C compiler with the right machine flags.  So, add
> >>$(GNATLIBCFLAGS_FOR_C) to $(OSCONS_EXTRACT).  For example, on a
> >>bi-arch
> >>compiler supporting 32-bit and 64-bit instruction sets we pick
> >>otherwise
> >>only one variant due to a missing -m32 or -m64 flag.
> >>
> >>gcc/ada
> >>* gcc-interface/Makefile.in (OSCONS_CPP): Remove redundant
> >>$(GNATLIBCFLAGS).
> >>(OSCONS_EXTRACT): Add $(GNATLIBCFLAGS_FOR_C).
> >OK, thanks.
> 
> Thanks for the quick review. I would like to back port this to GCC 7.

Seems fine to me if it doesn't cause troubles on trunk.

Arno


[Committed] GCC 6: S/390: Disable branch prediction

2018-03-07 Thread Andreas Krebbel
Bootstrapped and regression tested with z900 and z10 on s390x.

gcc/ChangeLog:

2018-03-07  Andreas Krebbel  

Backport from mainline
2018-02-08  Andreas Krebbel  

* config/s390/s390-opts.h (enum indirect_branch): Define.
* config/s390/s390-protos.h (s390_return_addr_from_memory)
(s390_indirect_branch_via_thunk)
(s390_indirect_branch_via_inline_thunk): Add function prototypes.
(enum s390_indirect_branch_type): Define.
* config/s390/s390.c (struct s390_frame_layout, struct
machine_function): Remove.
(indirect_branch_prez10thunk_mask, indirect_branch_z10thunk_mask)
(indirect_branch_table_label_no, indirect_branch_table_name):
Define variables.
(INDIRECT_BRANCH_NUM_OPTIONS): Define macro.
(enum s390_indirect_branch_option): Define.
(s390_return_addr_from_memory): New function.
(s390_handle_string_attribute): New function.
(s390_attribute_table): Add new attribute handler.
(s390_execute_label): Handle UNSPEC_EXECUTE_JUMP patterns.
(s390_indirect_branch_via_thunk): New function.
(s390_indirect_branch_via_inline_thunk): New function.
(s390_function_ok_for_sibcall): When jumping via thunk disallow
sibling call optimization for non z10 compiles.
(s390_emit_call): Force indirect branch target to be a single
register.  Add r1 clobber for non-z10 compiles.
(s390_emit_epilogue): Emit return jump via return_use expander.
(s390_reorg): Handle JUMP_INSNs as execute targets.
(s390_option_override_internal): Perform validity checks for the
new command line options.
(s390_indirect_branch_attrvalue): New function.
(s390_indirect_branch_settings): New function.
(s390_set_current_function): Invoke s390_indirect_branch_settings.
(s390_output_indirect_thunk_function):  New function.
(s390_code_end): Implement target hook.
(s390_case_values_threshold): Implement target hook.
(TARGET_ASM_CODE_END, TARGET_CASE_VALUES_THRESHOLD): Define target
macros.
* config/s390/s390.h (struct s390_frame_layout)
(struct machine_function): Move here from s390.c.
(TARGET_INDIRECT_BRANCH_NOBP_RET)
(TARGET_INDIRECT_BRANCH_NOBP_JUMP)
(TARGET_INDIRECT_BRANCH_NOBP_JUMP_THUNK)
(TARGET_INDIRECT_BRANCH_NOBP_JUMP_INLINE_THUNK)
(TARGET_INDIRECT_BRANCH_NOBP_CALL)
(TARGET_DEFAULT_INDIRECT_BRANCH_TABLE)
(TARGET_INDIRECT_BRANCH_THUNK_NAME_EXRL)
(TARGET_INDIRECT_BRANCH_THUNK_NAME_EX)
(TARGET_INDIRECT_BRANCH_TABLE): Define macros.
* config/s390/s390.md (UNSPEC_EXECUTE_JUMP)
(INDIRECT_BRANCH_THUNK_REGNUM): Define constants.
(mnemonic attribute): Add values which aren't recognized
automatically.
("*cjump_long", "*icjump_long", "*basr", "*basr_r"): Disable
pattern for branch conversion.  Fix mnemonic attribute.
("*c", "*sibcall_br", "*sibcall_value_br", "*return"): Emit
indirect branch via thunk if requested.
("indirect_jump", ""): Expand patterns for branch conversion.
("*indirect_jump"): Disable for branch conversion using out of
line thunks.
("indirect_jump_via_thunk_z10")
("indirect_jump_via_thunk")

2018-03-07  Andreas Krebbel  

Backport from mainline
2018-02-09  Andreas Krebbel  

PR target/PR84295
* config/s390/s390.c (s390_set_current_function): Invoke
s390_indirect_branch_settings also if fndecl didn't change.

gcc/testsuite/ChangeLog:

2018-03-07  Andreas Krebbel  

Backport from mainline
2018-02-08  Andreas Krebbel  

* gcc.target/s390/nobp-function-pointer-attr.c: New test.
* gcc.target/s390/nobp-function-pointer-nothunk.c: New test.
* gcc.target/s390/nobp-function-pointer-z10.c: New test.
* gcc.target/s390/nobp-function-pointer-z900.c: New test.
* gcc.target/s390/nobp-indirect-jump-attr.c: New test.
* gcc.target/s390/nobp-indirect-jump-inline-attr.c: New test.
* gcc.target/s390/nobp-indirect-jump-inline-z10.c: New test.
* gcc.target/s390/nobp-indirect-jump-inline-z900.c: New test.
* gcc.target/s390/nobp-indirect-jump-nothunk.c: New test.
* gcc.target/s390/nobp-indirect-jump-z10.c: New test.
* gcc.target/s390/nobp-indirect-jump-z900.c: New test.
* gcc.target/s390/nobp-return-attr-all.c: New test.
* gcc.target/s390/nobp-return-attr-neg.c: New test.
* gcc.target/s390/nobp-return-mem-attr.c: New test.
* gcc.target/s390/nobp-return-mem-nothunk.c: New test.
* gcc.target/s390/nobp-return-mem-z10.c: New test.
* 

[PATCH] Add a few new contrib.texi entries, especially for people who perform GCC fuzzy testing and report bugs

2018-03-07 Thread Jakub Jelinek
Hi!

As appreciation of the hard work of people doing fuzzy testing of
GCC and reporting high quality bugs, I'd like to list them in contrib.texi.
This patch just lists them in the GCC testing group, shall we have a special
sub-list for the fuzzy testing?
The patch also adds or updates a couple of other entries, but I'm sure I've
missed many people in both the testing and GCC development categories, for
which I apologize.  We can always have incremental patches, contrib.texi
is helplessly obsolete anyway and for many doesn't reflect last 10-15 years
of work at all.  We should also add entries for people reporting lots of
bugs even if it is not from fuzzy testing.

Anyway, ok for trunk?  Are all people on the CC ok with them being listed in
there (reply privately if needed)?

2018-03-07  Jakub Jelinek  

* doc/contrib.texi: Add entries for Martin Liska, David Malcolm,
Marek Polacek, extend Vladimir Makarov's and Jonathan Wakely's entry
and add entries for people that perform GCC fuzzy testing and report
numerous bugs.

--- gcc/doc/contrib.texi.jj 2018-01-03 10:20:21.699538202 +0100
+++ gcc/doc/contrib.texi2018-03-07 13:44:15.155724282 +0100
@@ -586,6 +586,10 @@ Chen Liqin for various S+core related fi
 maintaining the S+core port.
 
 @item
+Martin Liska for his work on identical code folding, the sanitizers,
+HSA and general bug fixing.
+
+@item
 Weiwen Liu for testing and various bug fixes.
 
 @item
@@ -615,8 +619,13 @@ various code generation improvements, wo
 @item
 Vladimir Makarov for hacking some ugly i960 problems, PowerPC hacking
 improvements to compile-time performance, overall knowledge and
-direction in the area of instruction scheduling, and design and
-implementation of the automaton based instruction scheduler.
+direction in the area of instruction scheduling, design and
+implementation of the automaton based instruction scheduler and
+design and implementation of the integrated and local register allocators.
+
+@item
+David Malcolm for his work on improving GCC diagnostics, JIT, self-tests
+and unit testing.
 
 @item
 Bob Manson for his behind the scenes work on dejagnu.
@@ -784,6 +793,10 @@ out lots of problems we need to solve, m
 taking care of documentation maintenance in general.
 
 @item
+Marek Polacek for his work on the C front end, the sanitizers and general
+bug fixing.
+
+@item
 Andrew Pinski for processing bug reports by the dozen.
 
 @item
@@ -1050,7 +1063,7 @@ Andrew Waterman for contributing the RIS
 
 @item
 Jonathan Wakely for contributing libstdc++ Doxygen notes and XHTML
-guidance.
+guidance and maintaining libstdc++.
 
 @item
 Dean Wakerley for converting the install documentation from HTML to texinfo
@@ -1653,6 +1666,9 @@ Pekka Nikander
 Rick Niles
 
 @item
+Vegard Nossum
+
+@item
 Jon Olson
 
 @item
@@ -1671,6 +1687,12 @@ Derk Reefman
 David Rees
 
 @item
+John Regehr
+
+@item
+Volker Reichelt
+
+@item
 Paul Reilly
 
 @item
@@ -1698,12 +1720,27 @@ David Schuler
 Vin Shelton
 
 @item
+Zdenek Sojka
+
+@item
+Arseny Solokha
+
+@item
 Tim Souder
 
 @item
+Gerhard Steinmetz
+
+@item
+Zhendong Su
+
+@item
 Adam Sulmicki
 
 @item
+Chengnian Sun
+
+@item
 Bill Thorson
 
 @item
@@ -1722,6 +1759,9 @@ Ian Watson
 David E. Young
 
 @item
+Qirun Zhang
+
+@item
 And many others
 @end itemize
 


Jakub


Re: [PATCH] Ada: Fix s-oscons.ads generation

2018-03-07 Thread Sebastian Huber

On 07/03/18 11:33, Arnaud Charlet wrote:

The $(GNATLIBCFLAGS) are already included in $(GNATLIBCFLAGS_FOR_C).

We must call the C compiler with the right machine flags.  So, add
$(GNATLIBCFLAGS_FOR_C) to $(OSCONS_EXTRACT).  For example, on a bi-arch
compiler supporting 32-bit and 64-bit instruction sets we pick otherwise
only one variant due to a missing -m32 or -m64 flag.

gcc/ada
* gcc-interface/Makefile.in (OSCONS_CPP): Remove redundant
$(GNATLIBCFLAGS).
(OSCONS_EXTRACT): Add $(GNATLIBCFLAGS_FOR_C).

OK, thanks.


Thanks for the quick review. I would like to back port this to GCC 7.

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.



[PATCH] [ARC] Add/update combiner patterns.

2018-03-07 Thread Claudiu Zissulescu
From: claziss 

Hi Andrew,

Please find the following patch which improves generating more instructions 
with .f flag (i.e., compare with zero).

Ok to apply?
Claudiu

gcc/
2018-01-26  Claudiu Zissulescu  

* config/arc/arc.md (add_shift): New pattern.
(add_shift2): Likewise.
(sub_shift): Likewise.
(sub_shift_cmp0_noout): Likewise.
(compare_si_ashiftsi): Likewise.
(xbfu_cmp0_noout): New combine pattern.
(xbfu_cmp0"): Likewise.
(movsi_set_cc_insn): Place the predicable variant first.
(commutative_binary_cmp0_noout): Remove clobber.
(commutative_binary_cmp0): New pattern.
(noncommutative_binary_cmp0): Likewise.
(noncommutative_binary_cmp0_noout): Likewise.
(noncommutative_binary_comparison_result_used): Removed.
(rsub_cmp0): New pattern.
(rsub_cmp0_noout): Likewise.
(extzvsi): Changed, keep only meaningful variants.
(SQH, SEZ): New iterators.
(SQH_postfix): New mode attribute.
(SEZ_prefix): New code attribute.
(xt_cmp0_noout): New instruction pattern.
(xt_cmp0): Likewise.
* config/arc/predicates.md (cc_set_register): Use CC_REG instead
of numerical value.
(noncommutative_operator): Check the availability of barrel
shifter option.
---
 gcc/config/arc/arc.md| 274 ---
 gcc/config/arc/predicates.md |   6 +-
 2 files changed, 233 insertions(+), 47 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 7633d36b5a6..fb3432964ac 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -810,20 +810,90 @@ archs4x, archs4xd, archs4xd_slow"
   "st%U0 %1,%0\;st%U0.di %1,%0"
   [(set_attr "type" "store")])
 
+;; Combiner patterns for compare with zero
+(define_mode_iterator SQH [QI HI])
+(define_mode_attr SQH_postfix [(QI "b") (HI "%_")])
+
+(define_code_iterator SEZ [sign_extend zero_extend])
+(define_code_attr SEZ_prefix [(sign_extend "sex") (zero_extend "ext")])
+
+(define_insn "*xt_cmp0_noout"
+  [(set (match_operand 0 "cc_set_register" "")
+   (compare:CC_ZN (SEZ:SI (match_operand:SQH 1 "register_operand" "r"))
+  (const_int 0)))]
+  ""
+  ".f\\t0,%1"
+  [(set_attr "type" "compare")
+   (set_attr "cond" "set_zn")])
+
+(define_insn "*xt_cmp0"
+  [(set (match_operand 0 "cc_set_register" "")
+   (compare:CC_ZN (SEZ:SI (match_operand:SQH 1 "register_operand" "r"))
+  (const_int 0)))
+   (set (match_operand:SI 2 "register_operand" "=r")
+   (SEZ:SI (match_dup 1)))]
+  ""
+  ".f\\t%2,%1"
+  [(set_attr "type" "compare")
+   (set_attr "cond" "set_zn")])
+
+(define_insn "*xbfu_cmp0_noout"
+  [(set (match_operand 0 "cc_set_register" "")
+   (compare:CC_Z
+(zero_extract:SI
+ (match_operand:SI 1 "register_operand"  "  r,r")
+ (match_operand:SI 2 "const_int_operand" "C3p,n")
+ (match_operand:SI 3 "const_int_operand" "  n,n"))
+(const_int 0)))]
+  "TARGET_HS && TARGET_BARREL_SHIFTER"
+  {
+   int assemble_op2 = (((INTVAL (operands[2]) - 1) & 0x1f) << 5) | (INTVAL 
(operands[3]) & 0x1f);
+   operands[2] = GEN_INT (assemble_op2);
+   return "xbfu%?.f\\t0,%1,%2";
+  }
+  [(set_attr "type"   "shift")
+   (set_attr "iscompact"  "false")
+   (set_attr "length" "4,8")
+   (set_attr "predicable" "no")
+   (set_attr "cond"   "set_zn")])
+
+(define_insn "*xbfu_cmp0"
+  [(set (match_operand 4 "cc_set_register" "")
+   (compare:CC_Z
+(zero_extract:SI
+ (match_operand:SI 1 "register_operand"  "0  ,r,0")
+ (match_operand:SI 2 "const_int_operand" "C3p,n,n")
+ (match_operand:SI 3 "const_int_operand" "n  ,n,n"))
+(const_int 0)))
+   (set (match_operand:SI 0 "register_operand"  "=r,r,r")
+   (zero_extract:SI (match_dup 1) (match_dup 2) (match_dup 3)))]
+  "TARGET_HS && TARGET_BARREL_SHIFTER"
+  {
+   int assemble_op2 = (((INTVAL (operands[2]) - 1) & 0x1f) << 5) | (INTVAL 
(operands[3]) & 0x1f);
+   operands[2] = GEN_INT (assemble_op2);
+   return "xbfu%?.f\\t%0,%1,%2";
+  }
+  [(set_attr "type"   "shift")
+   (set_attr "iscompact"  "false")
+   (set_attr "length" "4,8,8")
+   (set_attr "predicable" "yes,no,yes")
+   (set_attr "cond"   "set_zn")])
+
+; splitting to 'tst' allows short insns and combination into brcc.
 (define_insn_and_split "*movsi_set_cc_insn"
-  [(set (match_operand:CC_ZN 2 "cc_set_register" "")
-   (match_operator:CC_ZN 3 "zn_compare_operator"
- [(match_operand:SI 1 "nonmemory_operand" "cI,cL,Cal") (const_int 0)]))
-   (set (match_operand:SI 0 "register_operand" "=w,w,w")
+  [(set (match_operand 2 "cc_set_register" "")
+   (match_operator 3 "zn_compare_operator"
+   [(match_operand:SI 1 "nonmemory_operand" "rL,rI,Cal")
+(const_int 0)]))
+   (set (match_operand:SI 0 "register_operand" 

Re: [RFC][PR82479] missing popcount builtin detection

2018-03-07 Thread Bin.Cheng
On Wed, Mar 7, 2018 at 8:26 AM, Richard Biener
 wrote:
> On Tue, Mar 6, 2018 at 5:20 PM, Bin.Cheng  wrote:
>> On Mon, Mar 5, 2018 at 3:24 PM, Richard Biener
>>  wrote:
>>> On Thu, Feb 8, 2018 at 1:41 AM, Kugan Vivekanandarajah
>>>  wrote:
 Hi Richard,

 On 1 February 2018 at 23:21, Richard Biener  
 wrote:
> On Thu, Feb 1, 2018 at 5:07 AM, Kugan Vivekanandarajah
>  wrote:
>> Hi Richard,
>>
>> On 31 January 2018 at 21:39, Richard Biener  
>> wrote:
>>> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah
>>>  wrote:
 Hi Richard,

 Thanks for the review.
 On 25 January 2018 at 20:04, Richard Biener 
  wrote:
> On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah
>  wrote:
>> Hi All,
>>
>> Here is a patch for popcount builtin detection similar to LLVM. I
>> would like to queue this for review for next stage 1.
>>
>> 1. This is done part of loop-distribution and effective for -O3 and 
>> above.
>> 2. This does not distribute loop to detect popcount (like
>> memcpy/memmove). I dont think that happens in practice. Please 
>> correct
>> me if I am wrong.
>
> But then it has no business inside loop distribution but instead is
> doing final value
> replacement, right?  You are pattern-matching the whole loop after 
> all.  I think
> final value replacement would already do the correct thing if you
> teached number of
> iteration analysis that niter for
>
>[local count: 955630224]:
>   # b_11 = PHI 
>   _1 = b_11 + -1;
>   b_8 = _1 & b_11;
>   if (b_8 != 0)
> goto ; [89.00%]
>   else
> goto ; [11.00%]
>
>[local count: 850510900]:
>   goto ; [100.00%]

 I am looking into this approach. What should be the scalar evolution
 for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me
 and can this be represented with the scev?
>>>
>>> No, it's not affine and thus cannot be represented.  You only need the
>>> scalar evolution of the counting IV which is already handled and
>>> the number of iteration analysis needs to handle the above IV - this
>>> is the missing part.
>> Thanks for the clarification. I am now matching this loop pattern in
>> number_of_iterations_exit when number_of_iterations_exit_assumptions
>> fails. If the pattern matches, I am inserting the _builtin_popcount in
>> the loop preheater and setting the loop niter with this. This will be
>> used by the final value replacement. Is this what you wanted?
>
> No, you shouldn't insert a popcount stmt but instead the niter
> GENERIC tree should be a CALL_EXPR to popcount with the
> appropriate argument.

 Thats what I tried earlier but ran into some ICEs. I wasn't sure if
 niter in tree_niter_desc can take such.

 Attached patch now does this. Also had to add support for CALL_EXPR in
 few places to handle niter with CALL_EXPR. Does this look OK?
>>>
>>> Overall this looks ok - the patch includes changes in places that I don't 
>>> think
>>> need changes such as chrec_convert_1 or extract_ops_from_tree.
>>> The expression_expensive_p change should be more specific than making
>>> all calls inexpensive as well.
>>>
>>> The verify_ssa change looks bogus, you do
>>>
>>> +  dest = gimple_phi_result (count_phi);
>>> +  tree var = make_ssa_name (TREE_TYPE (dest), NULL);
>>> +  tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
>>> +
>>> +  var = build_call_expr (fn, 1, src);
>>> +  *niter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest), var,
>>> +   build_int_cst (TREE_TYPE (dest), 1));
>>>
>>> why do you allocate a new SSA name here?  It seems unused
>>> as you overwrive 'var' with the CALL_EXPR immediately.
>>>
>>> I didn't review the pattern matching thoroughly nor the exact place you
>>> call it.  But
>>>
>>> +  if (check_popcount_pattern (loop, ))
>>> +   {
>>> + niter->assumptions = boolean_false_node;
>>> + niter->control.base = NULL_TREE;
>>> + niter->control.step = NULL_TREE;
>>> + niter->control.no_overflow = false;
>>> + niter->niter = count;
>>> + niter->assumptions = boolean_true_node;
>>> + niter->may_be_zero = boolean_false_node;
>>> + niter->max = -1;
>>> + niter->bound = NULL_TREE;
>>> + niter->cmp = ERROR_MARK;
>>> + return true;
>>> 

[PATCH][AARCH64]Fix immediate alternative of movhf_aarch64 pattern.

2018-03-07 Thread Renlin Li

Hi all,

For the immediate alternative, the constraint checks the operand with HF mode
while SImode is provided to the output template generation function.

Before the change, this inconsistency causes an ICE compiling the new test case 
in the patch.


aarch64-none-elf regression test Okay. Okay to commit the patch?

Regards,
Renlin

gcc/ChangeLog:

2018-03-07  Renlin Li  

* config/aarch64/aarch64.md (movhf_aarch64): Fix mode argument to
aarch64_output_scalar_simd_mov_immediate.

gcc/testsuite/ChangeLog:

2018-03-07  Renlin Li  

* gcc.target/aarch64/movi_hf.c: New.
* gcc.target/aarch64/f16_mov_immediate_1.c: Update.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a2a9309a3bbbfad6fcb6db07422d774909f0ba1..391fdd07e52f4d165a0109e3baa82571bafa37de 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1145,7 +1145,7 @@
umov\\t%w0, %1.h[0]
mov\\t%0.h[0], %1.h[0]
fmov\\t%h0, %1
-   * return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);
+   * return aarch64_output_scalar_simd_mov_immediate (operands[1], HImode);
ldr\\t%h0, %1
str\\t%h1, %0
ldrh\\t%w0, %1
diff --git a/gcc/testsuite/gcc.target/aarch64/f16_mov_immediate_1.c b/gcc/testsuite/gcc.target/aarch64/f16_mov_immediate_1.c
index 1ed3831e139745227487eafa3ccfdc05c99deb34..3d22d225851af653f17e04ce7c7cc65ee1c86172 100644
--- a/gcc/testsuite/gcc.target/aarch64/f16_mov_immediate_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/f16_mov_immediate_1.c
@@ -45,5 +45,5 @@ __fp16 f5 ()
 }
 
 /* { dg-final { scan-assembler-times "mov\tw\[0-9\]+, #?19520"   3 } } */
-/* { dg-final { scan-assembler-times "movi\tv\[0-9\]+\\\.2s, 0xbc, lsl 8"  1 } } */
-/* { dg-final { scan-assembler-times "movi\tv\[0-9\]+\\\.2s, 0x4c, lsl 8"  1 } } */
+/* { dg-final { scan-assembler-times "movi\tv\[0-9\]+\\\.4h, 0xbc, lsl 8"  1 } } */
+/* { dg-final { scan-assembler-times "movi\tv\[0-9\]+\\\.4h, 0x4c, lsl 8"  1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/movi_hf.c b/gcc/testsuite/gcc.target/aarch64/movi_hf.c
new file mode 100644
index ..9521b9b09c87bd5f19cb6b62b1228bae685d8667
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/movi_hf.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -std=c99" } */
+
+__fp16
+foo ()
+{
+  /* { dg-final { scan-assembler "movi\tv\[0-9\]+\.8b" } } */
+  return 0x1.544p5;
+}


Re: [PATCH] Fix aarch64_simd_reg_or_zero predicate (PR fortran/84565)

2018-03-07 Thread James Greenhalgh
On Tue, Feb 27, 2018 at 02:22:22PM +, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > Hi!
> >
> > The following testcase ICEs, because the aarch64_cmeqdf instruction
> > starting with r256612 no longer accepts CONST0_RTX (E_DFmode) as
> > valid argument, but the expander generates it anyway.
> >
> > The bug has been introduced during the addition of aarch64_simd_imm_zero
> > and aarch64_simd_or_scalar_imm_zero predicates, before the predicate
> > used to be:
> > (define_predicate "aarch64_simd_reg_or_zero"
> >   (and (match_code "reg,subreg,const_int,const_double,const_vector")
> >(ior (match_operand 0 "register_operand")
> >(ior (match_test "op == const0_rtx")
> > (match_test "aarch64_simd_imm_zero_p (op, mode)")
> > with
> > bool
> > aarch64_simd_imm_zero_p (rtx x, machine_mode mode)
> > {
> >   return x == CONST0_RTX (mode);
> > }
> > and so matched not just const,const_vector zeros, but also
> > const_int zero (that is through op == const0_rtx) and const_double
> > zero too.
> 
> Thanks for fixing this.
> 
> > Bootstrapped/regtested on aarch64-linux (scratch Fedora gcc 8 package build
> > with the patch applied), ok for trunk?
> >
> > 2018-02-27  Jakub Jelinek  
> >
> > PR fortran/84565
> > * config/aarch64/predicates.md (aarch64_simd_reg_or_zero): Use
> > aarch64_simd_or_scalar_imm_zero rather than aarch64_simd_imm_zero.
> >
> > * gfortran.dg/pr84565.f90: New test.
> >
> > --- gcc/config/aarch64/predicates.md.jj 2018-02-06 13:13:06.305751221 
> > +0100
> > +++ gcc/config/aarch64/predicates.md2018-02-26 16:30:01.902195208 
> > +0100
> > @@ -395,7 +395,7 @@ (define_predicate "aarch64_simd_reg_or_z
> >(and (match_code "reg,subreg,const_int,const_double,const,const_vector")
> > (ior (match_operand 0 "register_operand")
> > (match_test "op == const0_rtx")
> > -   (match_operand 0 "aarch64_simd_imm_zero"
> > +   (match_operand 0 "aarch64_simd_or_scalar_imm_zero"
> 
> I think this makes the match_test on the line above redundant.
> 
> LGTM otherwise (but I can't approve).  We should probably clean up
> the predicates so that the zero checks are more consistent.  genrecog
> could probably help by ignoring predicate codes that obviously don't
> apply (CONST_INT for vectors, CONST_VECTOR for ints, etc.).  But that's
> obviously all stage 1 stuff.

OK.

Thanks,
James



Re: [AARCH64] Disable pc relative literal load irrespective of TARGET_FIX_ERR_A53_84341

2018-03-07 Thread James Greenhalgh
On Wed, Mar 07, 2018 at 01:58:40AM +, Kugan Vivekanandarajah wrote:
> Hi James,
> 
> This patch has to be backported to gcc-7 branch as the build error for
> 521.wrf  with LTO is there too (for the same reason). This patch has
> been on trunk for some time now. So, is this  OK to backport this
> patch gcc-7 branch?

OK.

Thanks,
James

> On 30 August 2017 at 15:19, Kugan Vivekanandarajah
>  wrote:
> > Hi James,
> >
> > On 29 August 2017 at 21:31, James Greenhalgh  
> > wrote:
> >> On Tue, Jun 27, 2017 at 11:20:02AM +1000, Kugan Vivekanandarajah wrote:
> >>> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00614.html  added this
> >>> workaround to get kernel building with when TARGET_FIX_ERR_A53_843419
> >>> is enabled.
> >>>
> >>> This was added to support building kernel loadable modules. In kernel,
> >>> when CONFIG_ARM64_ERRATUM_843419 is selected, the relocation needed
> >>> for ADRP/LDR (R_AARCH64_ADR_PREL_PG_HI21 and
> >>> R_AARCH64_ADR_PREL_PG_HI21_NC are removed from the kernel to avoid
> >>> loading objects with possibly offending sequence). Thus, it could only
> >>> support pc relative literal loads.
> >>>
> >>> However, the following patch was posted to kernel to add
> >>> -mpc-relative-literal-loads
> >>> http://www.spinics.net/lists/arm-kernel/msg476149.html
> >>>
> >>> -mpc-relative-literal-loads is unconditionally added to the kernel
> >>> build as can be seen from:
> >>> https://github.com/torvalds/linux/blob/master/arch/arm64/Makefile
> >>>
> >>> Therefore this patch removes the hunk so that applications like
> >>> SPECcpu2017's 521/621.wrf can be built (with LTO in this case) without
> >>> -mno-pc-relative-literal-loads
> >>>
> >>> Bootstrapped and regression tested on aarch64-linux-gnu with no new 
> >>> regressions.
> >>>
> >>> Is this OK for trunk?
> >>
> >> Hi Kugan,
> >>
> >> I'm struggling a little to convince myself that this is correct. I think
> >> the argument is that this was a workaround for one very particular issue
> >> with the linux kernel module loader, which hard faults by refusing to
> >> handle these relocations when in a workaround mode for Erratum 843419.
> >
> > Yes.
> >
> >> Most of these relocations won't occur because the kernel builds with
> >> -mcmodel=large, but literals would always use a small model style
> >> adrp/load, unless we were using -mpc-relative-literal-loads . So, this
> >> workaround enabled -mpc-relative-literal-loads always when we were in
> >> a workaround mode, thus allowing the kernel loader to continue.
> >>
> >> The argument for removing it then, is that with the kernel now always using
> >> -mpc-relative-literal-loads there is no reason for this workaround to stay
> >> in place. The linkers which we will use will apply the workaround if 
> >> needed.
> >
> > Yes.
> >
> >> Testcases and a detailed problem report of the build failures you had seen 
> >> in
> >> 521.wrf would have helped me get closer to understanding this, and made
> >> review substantially easier.
> >
> > Sorry for not being clear with this. Unfortunately 521.wrf  was
> > showing this error with LTO so I could not reproduce with a small
> > enough test case.
> >
> >> Am I on the right track?
> >>
> >> If so, this is OK for trunk. If not, please can you expand on what is going
> >> on in this patch.
> >
> > Thanks for the review.
> > Kugan
> >
> >>
> >> Thanks,
> >> James
> >>
> >>
> >>>
> >>> Thanks,
> >>> Kugan
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>> 2017-06-27  Kugan Vivekanandarajah  
> >>>
> >>> * gcc.target/aarch64/pr63304_1.c: Remove-mno-fix-cortex-a53-843419.
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>> 2017-06-27  Kugan Vivekanandarajah  
> >>>
> >>> * config/aarch64/aarch64.c (aarch64_override_options_after_change_1):
> >>> Disable pc relative literal load irrespective of 
> >>> TARGET_FIX_ERR_A53_84341
> >>> for default.
> >>
> >>


Re: [PATCH] Ada: Fix s-oscons.ads generation

2018-03-07 Thread Arnaud Charlet
> The $(GNATLIBCFLAGS) are already included in $(GNATLIBCFLAGS_FOR_C).
> 
> We must call the C compiler with the right machine flags.  So, add
> $(GNATLIBCFLAGS_FOR_C) to $(OSCONS_EXTRACT).  For example, on a bi-arch
> compiler supporting 32-bit and 64-bit instruction sets we pick otherwise
> only one variant due to a missing -m32 or -m64 flag.
> 
> gcc/ada
>   * gcc-interface/Makefile.in (OSCONS_CPP): Remove redundant
>   $(GNATLIBCFLAGS).
>   (OSCONS_EXTRACT): Add $(GNATLIBCFLAGS_FOR_C).

OK, thanks.

Arno


Re: [patch] Fix PR target/84277

2018-03-07 Thread Richard Biener
On Wed, Mar 7, 2018 at 10:28 AM, Eric Botcazou  wrote:
>> For the middle-end part I'd like to see output_function_exception_table take
>> an enum with two members with appropriate name - I see there wasn't
>> documentation for the function but your change doesn't make semantics more
>> clear, esp.
>>
>> -  output_one_function_exception_table (0);
>> -  if (crtl->eh.call_site_record_v[1])
>> -output_one_function_exception_table (1);
>> +  output_one_function_exception_table (section);
>>
>> looks like we now might output only once while we did twice before...
>>
>> Looking at the two changed callers doesn't shed enough light on this
>> either...
>
> See output_one_function_exception_table itself which has the same argument and
> the other functions in the file: 0 corresponds to the table for hot part while
> 1 corresponds to the table for the cold part (if any).  Before the change,
> output_function_exception_table outputs both parts at once whereas, after the
> change, it outputs one part at a time, and thus is called twice from final.c.

I see.

> I can certainly add the missing documentation for this whole section stuff.

Please.  Ok with that change.

Richard.

> --
> Eric Botcazou


GCC 6 backports

2018-03-07 Thread Martin Liška
Hi.

Sending GCC 6 branch backports.
Patches can bootstrap on ppc64le-redhat-linux and survives regression tests.
I'm going to install the patches.

Martin
>From e65a740ebed1d27132352c4a55e0a2554eb34820 Mon Sep 17 00:00:00 2001
From: hubicka 
Date: Fri, 13 Oct 2017 13:44:05 +
Subject: [PATCH 01/15] Backport r253729

gcc/lto/ChangeLog:

2017-10-13  Jan Hubicka  

	* lto-lang.c (lto_post_options): Clean shlib flag when not doing PIC.
---
 gcc/lto/lto-lang.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/lto/lto-lang.c b/gcc/lto/lto-lang.c
index ebe908a3e17..0b0db619562 100644
--- a/gcc/lto/lto-lang.c
+++ b/gcc/lto/lto-lang.c
@@ -837,11 +837,13 @@ lto_post_options (const char **pfilename ATTRIBUTE_UNUSED)
  flag_pie is 2.  */
   flag_pie = MAX (flag_pie, flag_pic);
   flag_pic = flag_pie;
+  flag_shlib = 0;
   break;
 
 case LTO_LINKER_OUTPUT_EXEC: /* Normal executable */
   flag_pic = 0;
   flag_pie = 0;
+  flag_shlib = 0;
   break;
 
 case LTO_LINKER_OUTPUT_UNKNOWN:
-- 
2.16.2

>From ced3b528c81165680c7b60bedefe2fa640210f91 Mon Sep 17 00:00:00 2001
From: hubicka 
Date: Tue, 30 Jan 2018 13:17:40 +
Subject: [PATCH 02/15] Backport r257183

gcc/lto/ChangeLog:

2018-01-30  Jan Hubicka  

	PR lto/83954
	* lto-symtab.c (warn_type_compatibility_p): Silence false positive
	for type match warning on arrays of pointers.

gcc/testsuite/ChangeLog:

2018-01-30  Jan Hubicka  

	PR lto/83954
	* gcc.dg/lto/pr83954.h: New testcase.
	* gcc.dg/lto/pr83954_0.c: New testcase.
	* gcc.dg/lto/pr83954_1.c: New testcase.
---
 gcc/lto/lto-symtab.c | 19 +++
 gcc/testsuite/gcc.dg/lto/pr83954.h   |  3 +++
 gcc/testsuite/gcc.dg/lto/pr83954_0.c |  8 
 gcc/testsuite/gcc.dg/lto/pr83954_1.c |  7 +++
 4 files changed, 33 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr83954.h
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr83954_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr83954_1.c

diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
index 94b919b53e6..395dca47929 100644
--- a/gcc/lto/lto-symtab.c
+++ b/gcc/lto/lto-symtab.c
@@ -280,11 +280,22 @@ warn_type_compatibility_p (tree prevailing_type, tree type,
   alias_set_type set1 = get_alias_set (type);
   alias_set_type set2 = get_alias_set (prevailing_type);
 
-  if (set1 && set2 && set1 != set2 
-  && (!POINTER_TYPE_P (type) || !POINTER_TYPE_P (prevailing_type)
+  if (set1 && set2 && set1 != set2)
+	{
+  tree t1 = type, t2 = prevailing_type;
+
+	  /* Alias sets of arrays are the same as alias sets of the inner
+	 types.  */
+	  while (TREE_CODE (t1) == ARRAY_TYPE && TREE_CODE (t2) == ARRAY_TYPE)
+	{
+	  t1 = TREE_TYPE (t1);
+	  t2 = TREE_TYPE (t2);
+	}
+  if ((!POINTER_TYPE_P (t1) || !POINTER_TYPE_P (t2))
 	  || (set1 != TYPE_ALIAS_SET (ptr_type_node)
-		  && set2 != TYPE_ALIAS_SET (ptr_type_node
-lev |= 5;
+		  && set2 != TYPE_ALIAS_SET (ptr_type_node)))
+ lev |= 5;
+	}
 }
 
   return lev;
diff --git a/gcc/testsuite/gcc.dg/lto/pr83954.h b/gcc/testsuite/gcc.dg/lto/pr83954.h
new file mode 100644
index 000..e0155402504
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr83954.h
@@ -0,0 +1,3 @@
+struct foo;
+extern struct foo *FOO_PTR_ARR[1];
+
diff --git a/gcc/testsuite/gcc.dg/lto/pr83954_0.c b/gcc/testsuite/gcc.dg/lto/pr83954_0.c
new file mode 100644
index 000..065a31dab80
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr83954_0.c
@@ -0,0 +1,8 @@
+/* { dg-lto-do link } */
+#include "pr83954.h"
+
+int main() {
+  // just to prevent symbol removal
+  FOO_PTR_ARR[1] = 0;
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr83954_1.c b/gcc/testsuite/gcc.dg/lto/pr83954_1.c
new file mode 100644
index 000..61b40fc7759
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr83954_1.c
@@ -0,0 +1,7 @@
+#include "pr83954.h"
+
+struct foo {
+ int x;
+};
+struct foo *FOO_PTR_ARR[1] = { 0 };
+
-- 
2.16.2

>From e3ff25c3be1c1767b987c0280c67173d179341ee Mon Sep 17 00:00:00 2001
From: ebotcazou 
Date: Fri, 2 Feb 2018 18:09:32 +
Subject: [PATCH 03/15] Backport r257343

gcc/lto/ChangeLog:

2018-02-02  Eric Botcazou  

	PR lto/83954
	* lto-symtab.c (warn_type_compatibility_p): Do not recurse into the
	component type of array types with non-aliased component.
---
 gcc/lto/lto-symtab.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
index 395dca47929..e6fb95fe372 100644
--- a/gcc/lto/lto-symtab.c
+++ b/gcc/lto/lto-symtab.c
@@ -284,9 +284,12 @@ warn_type_compatibility_p (tree prevailing_type, tree type,
 	{
   tree t1 = type, t2 = prevailing_type;
 
-	  /* Alias sets of arrays 

[PATCH] Ada: Fix s-oscons.ads generation

2018-03-07 Thread Sebastian Huber
The $(GNATLIBCFLAGS) are already included in $(GNATLIBCFLAGS_FOR_C).

We must call the C compiler with the right machine flags.  So, add
$(GNATLIBCFLAGS_FOR_C) to $(OSCONS_EXTRACT).  For example, on a bi-arch
compiler supporting 32-bit and 64-bit instruction sets we pick otherwise
only one variant due to a missing -m32 or -m64 flag.

gcc/ada
* gcc-interface/Makefile.in (OSCONS_CPP): Remove redundant
$(GNATLIBCFLAGS).
(OSCONS_EXTRACT): Add $(GNATLIBCFLAGS_FOR_C).
---
 gcc/ada/gcc-interface/Makefile.in | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/gcc-interface/Makefile.in 
b/gcc/ada/gcc-interface/Makefile.in
index 50213c7520e..ebb955ebce5 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -2381,9 +2381,9 @@ OSCONS_CC=$(subst ./xgcc,../../xgcc,$(subst -B./, 
-B../../,$(GCC_FOR_TARGET)))
 # ada/types.h does not conflict with a same-named system header (VxWorks
 # has a  header).
 
-OSCONS_CPP=$(OSCONS_CC) $(GNATLIBCFLAGS) $(GNATLIBCFLAGS_FOR_C) -E -C \
+OSCONS_CPP=$(OSCONS_CC) $(GNATLIBCFLAGS_FOR_C) -E -C \
   -DTARGET=\"$(target)\" -iquote $(fsrcpfx)ada $(fsrcpfx)ada/s-oscons-tmplt.c 
> s-oscons-tmplt.i
-OSCONS_EXTRACT=$(OSCONS_CC) -S s-oscons-tmplt.i
+OSCONS_EXTRACT=$(OSCONS_CC) $(GNATLIBCFLAGS_FOR_C) -S s-oscons-tmplt.i
 
 # Note: if you need to build with a non-GNU compiler, you could adapt the
 # following definitions (written for VMS DEC-C)
-- 
2.12.3



Re: [patch] Fix PR target/84277

2018-03-07 Thread Eric Botcazou
> For the middle-end part I'd like to see output_function_exception_table take
> an enum with two members with appropriate name - I see there wasn't
> documentation for the function but your change doesn't make semantics more
> clear, esp.
> 
> -  output_one_function_exception_table (0);
> -  if (crtl->eh.call_site_record_v[1])
> -output_one_function_exception_table (1);
> +  output_one_function_exception_table (section);
> 
> looks like we now might output only once while we did twice before...
> 
> Looking at the two changed callers doesn't shed enough light on this
> either...

See output_one_function_exception_table itself which has the same argument and 
the other functions in the file: 0 corresponds to the table for hot part while 
1 corresponds to the table for the cold part (if any).  Before the change, 
output_function_exception_table outputs both parts at once whereas, after the 
change, it outputs one part at a time, and thus is called twice from final.c.

I can certainly add the missing documentation for this whole section stuff.

-- 
Eric Botcazou


Re: [PATCH] Reject target_clones attribute in functions that can't be cloned (PR middle-end/84723)

2018-03-07 Thread Richard Biener
On Tue, 6 Mar 2018, Jakub Jelinek wrote:

> Hi!
> 
> The following testcases show various reasons why a function definition
> with target_clones attribute can't be cloned; instead of ICEing because
> node->create_version_clone_with_body returns NULL and we dereference it,
> this patch just diagnoses those and ignores the attribute.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard.

> Martin Sebor said he wants to work on attribute exclusion, in that case
> the error messages in pr84723-{1,4,5}.c would be adjusted if needed and
>   const char *reason = NULL;
>   if (lookup_attribute ("noclone", DECL_ATTRIBUTES (node->decl)))
>   reason = G_("function %q+F can never be copied "
>   "because it has % attribute");
>   else
>   reason = copy_forbidden (DECL_STRUCT_FUNCTION (node->decl));
> could be replaced just with
>   const char *reason = copy_forbidden (DECL_STRUCT_FUNCTION (node->decl));
> but attribute exclusions can't really help for the cases where cloning
> is refused because of other reasons.
> 
> 2018-03-06  Jakub Jelinek  
> 
>   PR middle-end/84723
>   * multiple_target.c: Include tree-inline.h and intl.h.
>   (expand_target_clones): Diagnose and fail if node->definition and
>   !tree_versionable_function_p (node->decl).
> 
>   * gcc.target/i386/pr84723-1.c: New test.
>   * gcc.target/i386/pr84723-2.c: New test.
>   * gcc.target/i386/pr84723-3.c: New test.
>   * gcc.target/i386/pr84723-4.c: New test.
>   * gcc.target/i386/pr84723-5.c: New test.
> 
> --- gcc/multiple_target.c.jj  2018-01-03 10:19:54.956533925 +0100
> +++ gcc/multiple_target.c 2018-03-06 11:46:09.327627941 +0100
> @@ -36,6 +36,8 @@ along with GCC; see the file COPYING3.
>  #include "pretty-print.h"
>  #include "gimple-iterator.h"
>  #include "gimple-walk.h"
> +#include "tree-inline.h"
> +#include "intl.h"
>  
>  /* Walker callback that replaces all FUNCTION_DECL of a function that's
> going to be versioned.  */
> @@ -312,6 +314,22 @@ expand_target_clones (struct cgraph_node
>return false;
>  }
>  
> +  if (node->definition
> +  && !tree_versionable_function_p (node->decl))
> +{
> +  error_at (DECL_SOURCE_LOCATION (node->decl),
> + "clones for % attribute cannot be created");
> +  const char *reason = NULL;
> +  if (lookup_attribute ("noclone", DECL_ATTRIBUTES (node->decl)))
> + reason = G_("function %q+F can never be copied "
> + "because it has % attribute");
> +  else
> + reason = copy_forbidden (DECL_STRUCT_FUNCTION (node->decl));
> +  if (reason)
> + inform (DECL_SOURCE_LOCATION (node->decl), reason, node->decl);
> +  return false;
> +}
> +
>char *attr_str = XNEWVEC (char, attr_len);
>int attrnum = get_attr_str (arglist, attr_str);
>char **attrs = XNEWVEC (char *, attrnum);
> --- gcc/testsuite/gcc.target/i386/pr84723-1.c.jj  2018-03-06 
> 12:00:17.533367738 +0100
> +++ gcc/testsuite/gcc.target/i386/pr84723-1.c 2018-03-06 11:52:15.831511845 
> +0100
> @@ -0,0 +1,11 @@
> +/* PR middle-end/84723 */
> +/* { dg-do compile } */
> +/* { dg-require-ifunc } */
> +/* { dg-options "-O2" } */
> +
> +__attribute__((target_clones ("avx", "default")))
> +__attribute__((noclone))
> +void
> +foo (void)   /* { dg-error "clones for .target_clones. attribute cannot be 
> created" } */
> +{/* { dg-message "function .foo. can never be copied because it 
> has .noclone. attribute" "" { target *-*-* } .-1 } */
> +}
> --- gcc/testsuite/gcc.target/i386/pr84723-2.c.jj  2018-03-06 
> 12:00:21.019367631 +0100
> +++ gcc/testsuite/gcc.target/i386/pr84723-2.c 2018-03-06 11:50:30.458545226 
> +0100
> @@ -0,0 +1,13 @@
> +/* PR middle-end/84723 */
> +/* { dg-do compile } */
> +/* { dg-require-ifunc } */
> +/* { dg-options "-O2" } */
> +
> +__attribute__((target_clones ("avx", "default")))
> +void
> +foo (void)   /* { dg-error "clones for .target_clones. attribute cannot be 
> created" } */
> +{/* { dg-message "function .foo. can never be copied because it 
> saves address of local label in a static variable" "" { target *-*-* } .-1 } 
> */
> +  static void *p = &
> +  asm volatile ("" : "+m" (p) : : "memory");
> +lab:;
> +}
> --- gcc/testsuite/gcc.target/i386/pr84723-3.c.jj  2018-03-06 
> 12:00:24.397367531 +0100
> +++ gcc/testsuite/gcc.target/i386/pr84723-3.c 2018-03-06 11:50:57.274536736 
> +0100
> @@ -0,0 +1,17 @@
> +/* PR middle-end/84723 */
> +/* { dg-do compile } */
> +/* { dg-require-ifunc } */
> +/* { dg-options "-O2" } */
> +
> +__attribute__((target_clones ("avx", "default")))
> +int
> +foo (int x)  /* { dg-error "clones for .target_clones. attribute cannot be 
> created" } */
> +{/* { dg-message "function .foo. can never be copied because it 
> receives a non-local goto" "" { target *-*-* } .-1 } */
> +  __label__ lab;
> +  

Re: [patch] Fix PR target/84277

2018-03-07 Thread Richard Biener
On Tue, Mar 6, 2018 at 6:29 PM, Eric Botcazou  wrote:
>> this is the breakage of SEH on 64-bit Windows introduced by defaulting to
>> the -freorder-blocks-and-partition optimization.  It is fixed by:
>>   1. Defining ASM_DECLARE_COLD_FUNCTION_NAME &
>> ASM_DECLARE_COLD_FUNCTION_SIZE, 2. Emitting a nop in one more case for SEH,
>>   3. Splitting the exception table into hot and cold parts; that's necessary
>> because the LSDA is referenced directly in the SEH scheme.
>>
>> Tested on x86-64/Windows, x86-64/Linux and IA-64/Linux, OK for mainline?
>
> Also tested on x86/Windows to verify that it doesn't break anything.
>
> Can anyone approve the middle-end bits?  The rest effectively affects only the
> -freorder-blocks-and-partition optimization on x86-64/Windows, which has never
> worked in C++ or Ada, so I don't think that it can do any harm...

For the middle-end part I'd like to see output_function_exception_table take
an enum with two members with appropriate name - I see there wasn't
documentation for the function but your change doesn't make semantics
more clear, esp.

-  output_one_function_exception_table (0);
-  if (crtl->eh.call_site_record_v[1])
-output_one_function_exception_table (1);
+  output_one_function_exception_table (section);

looks like we now might output only once while we did twice before...

Looking at the two changed callers doesn't shed enough light on this either...

Richard.

> --
> Eric Botcazou


Re: [RFC][PR82479] missing popcount builtin detection

2018-03-07 Thread Richard Biener
On Tue, Mar 6, 2018 at 5:20 PM, Bin.Cheng  wrote:
> On Mon, Mar 5, 2018 at 3:24 PM, Richard Biener
>  wrote:
>> On Thu, Feb 8, 2018 at 1:41 AM, Kugan Vivekanandarajah
>>  wrote:
>>> Hi Richard,
>>>
>>> On 1 February 2018 at 23:21, Richard Biener  
>>> wrote:
 On Thu, Feb 1, 2018 at 5:07 AM, Kugan Vivekanandarajah
  wrote:
> Hi Richard,
>
> On 31 January 2018 at 21:39, Richard Biener  
> wrote:
>> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah
>>  wrote:
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>> On 25 January 2018 at 20:04, Richard Biener 
>>>  wrote:
 On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah
  wrote:
> Hi All,
>
> Here is a patch for popcount builtin detection similar to LLVM. I
> would like to queue this for review for next stage 1.
>
> 1. This is done part of loop-distribution and effective for -O3 and 
> above.
> 2. This does not distribute loop to detect popcount (like
> memcpy/memmove). I dont think that happens in practice. Please correct
> me if I am wrong.

 But then it has no business inside loop distribution but instead is
 doing final value
 replacement, right?  You are pattern-matching the whole loop after 
 all.  I think
 final value replacement would already do the correct thing if you
 teached number of
 iteration analysis that niter for

[local count: 955630224]:
   # b_11 = PHI 
   _1 = b_11 + -1;
   b_8 = _1 & b_11;
   if (b_8 != 0)
 goto ; [89.00%]
   else
 goto ; [11.00%]

[local count: 850510900]:
   goto ; [100.00%]
>>>
>>> I am looking into this approach. What should be the scalar evolution
>>> for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me
>>> and can this be represented with the scev?
>>
>> No, it's not affine and thus cannot be represented.  You only need the
>> scalar evolution of the counting IV which is already handled and
>> the number of iteration analysis needs to handle the above IV - this
>> is the missing part.
> Thanks for the clarification. I am now matching this loop pattern in
> number_of_iterations_exit when number_of_iterations_exit_assumptions
> fails. If the pattern matches, I am inserting the _builtin_popcount in
> the loop preheater and setting the loop niter with this. This will be
> used by the final value replacement. Is this what you wanted?

 No, you shouldn't insert a popcount stmt but instead the niter
 GENERIC tree should be a CALL_EXPR to popcount with the
 appropriate argument.
>>>
>>> Thats what I tried earlier but ran into some ICEs. I wasn't sure if
>>> niter in tree_niter_desc can take such.
>>>
>>> Attached patch now does this. Also had to add support for CALL_EXPR in
>>> few places to handle niter with CALL_EXPR. Does this look OK?
>>
>> Overall this looks ok - the patch includes changes in places that I don't 
>> think
>> need changes such as chrec_convert_1 or extract_ops_from_tree.
>> The expression_expensive_p change should be more specific than making
>> all calls inexpensive as well.
>>
>> The verify_ssa change looks bogus, you do
>>
>> +  dest = gimple_phi_result (count_phi);
>> +  tree var = make_ssa_name (TREE_TYPE (dest), NULL);
>> +  tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
>> +
>> +  var = build_call_expr (fn, 1, src);
>> +  *niter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest), var,
>> +   build_int_cst (TREE_TYPE (dest), 1));
>>
>> why do you allocate a new SSA name here?  It seems unused
>> as you overwrive 'var' with the CALL_EXPR immediately.
>>
>> I didn't review the pattern matching thoroughly nor the exact place you
>> call it.  But
>>
>> +  if (check_popcount_pattern (loop, ))
>> +   {
>> + niter->assumptions = boolean_false_node;
>> + niter->control.base = NULL_TREE;
>> + niter->control.step = NULL_TREE;
>> + niter->control.no_overflow = false;
>> + niter->niter = count;
>> + niter->assumptions = boolean_true_node;
>> + niter->may_be_zero = boolean_false_node;
>> + niter->max = -1;
>> + niter->bound = NULL_TREE;
>> + niter->cmp = ERROR_MARK;
>> + return true;
>> +   }
>>
>> simply setting may_be_zero to false looks fishy.  Try
>> with -fno-tree-loop-ch.  Also max should not be negative,
>> it should be the number of bits in the IV type?
>>
>> A related testcase