Re: [PATCH] Fix tree-emutls ADDR_EXPR handling (PR middle-end/83945)

2018-01-19 Thread Richard Biener
On January 19, 2018 10:52:08 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>Before emutls lowering, expressions like &MEM_REF[(whatever)&e].a
>for static __thread e are considered gimple invariant (after all, for
>native TLS we do that too) and gimple invariants are shareable trees.
>lower_emutls* then has code to update the VAR_DECL in there with an
>SSA_NAME and, if needed, split it apart into another stmt (i.e. manually
>regimplify), but doesn't take into account the possible tree sharing,
>which can result in 1) invalid tree sharing, because after changing the
>VAR_DECL for a SSA_NAME it is no longer gimple invariant 2) might not
>break it apart anymore the second and further times 3) might use
>SSA_NAME that doesn't dominate it.
>
>Fixed by checking if we'll need to update the ADDR_EXPR operand or
>anything
>in there and unsharing if so before actually changing it.
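
As a rough illustration of the copy-before-modify idea described above, here is a toy model in C. It is not GCC code: `struct node`, `contains_tls`, `unshare` and `lower` are hypothetical names that stand in for trees, the `lower_emutls_2` walk, `unshare_expr` and the lowering step respectively.

```c
#include <stdlib.h>

/* Toy expression node; a stand-in for a GCC tree, not the real thing.  */
struct node
{
  int is_tls;            /* 1 if this node stands for a TLS variable.  */
  struct node *operand;  /* Single operand, or NULL.  */
};

/* Return 1 if any node reachable from N is marked as TLS.  */
static int
contains_tls (const struct node *n)
{
  for (; n != NULL; n = n->operand)
    if (n->is_tls)
      return 1;
  return 0;
}

/* Deep-copy the chain starting at N so the copy shares nothing with it.  */
static struct node *
unshare (const struct node *n)
{
  if (n == NULL)
    return NULL;
  struct node *copy = malloc (sizeof *copy);
  if (copy == NULL)
    abort ();
  copy->is_tls = n->is_tls;
  copy->operand = unshare (n->operand);
  return copy;
}

/* Rewrite TLS nodes, but only in a private copy, so other users of the
   shared expression still see the original.  Returns the expression the
   caller should use from now on.  */
static struct node *
lower (struct node *expr)
{
  if (!contains_tls (expr))
    return expr;                 /* Nothing to change: keep sharing.  */
  struct node *copy = unshare (expr);
  for (struct node *n = copy; n != NULL; n = n->operand)
    n->is_tls = 0;               /* Stand-in for VAR_DECL -> SSA_NAME.  */
  return copy;
}
```

Calling `lower` twice on the same shared expression yields two independent copies, which is the property the three failure modes listed above depend on.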
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2018-01-19  Jakub Jelinek  
>
>   PR middle-end/83945
>   * tree-emutls.c: Include gimplify.h.
>   (lower_emutls_2): New function.
>   (lower_emutls_1): If ADDR_EXPR is a gimple invariant and walk_tree
>   with lower_emutls_2 callback finds some TLS decl in it, unshare_expr
>   it before further processing.
>
>   * gcc.dg/tls/pr83945.c: New test.
>
>--- gcc/tree-emutls.c.jj   2018-01-03 10:19:54.790533899 +0100
>+++ gcc/tree-emutls.c  2018-01-19 19:11:14.849321308 +0100
>@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.
> #include "gimple-walk.h"
> #include "langhooks.h"
> #include "tree-iterator.h"
>+#include "gimplify.h"
> 
>/* Whenever a target does not support thread-local storage (TLS)
>natively,
>  we can emulate it with some run-time support in libgcc.  This will in
>@@ -429,6 +430,20 @@ gen_emutls_addr (tree decl, struct lower
>   return addr;
> }
> 
>+/* Callback for lower_emutls_1, return non-NULL if there is any TLS
>+   VAR_DECL in the subexpressions.  */
>+
>+static tree
>+lower_emutls_2 (tree *ptr, int *walk_subtrees, void *)
>+{
>+  tree t = *ptr;
>+  if (TREE_CODE (t) == VAR_DECL)
>+    return DECL_THREAD_LOCAL_P (t) ? t : NULL_TREE;
>+  else if (!EXPR_P (t))
>+    *walk_subtrees = 0;
>+  return NULL_TREE;
>+}
>+
>/* Callback for walk_gimple_op.  D = WI->INFO is a struct
>lower_emutls_data.
>  Given an operand *PTR within D->STMT, if the operand references a TLS
>   variable, then lower the reference to a call to the runtime.  Insert
>@@ -455,6 +470,13 @@ lower_emutls_1 (tree *ptr, int *walk_sub
>   {
> bool save_changed;
> 
>+/* Gimple invariants are shareable trees, so before changing
>+   anything in them if we will need to change anything, unshare
>+   them.  */
>+if (is_gimple_min_invariant (t)
>+    && walk_tree (&TREE_OPERAND (t, 0), lower_emutls_2, NULL, NULL))
>+  *ptr = t = unshare_expr (t);
>+
> /* If we're allowed more than just is_gimple_val, continue.  */
> if (!wi->val_only)
>   {
>--- gcc/testsuite/gcc.dg/tls/pr83945.c.jj  2018-01-19 19:16:52.376346273
>+0100
>+++ gcc/testsuite/gcc.dg/tls/pr83945.c 2018-01-19 19:16:25.186344529
>+0100
>@@ -0,0 +1,21 @@
>+/* PR middle-end/83945 */
>+/* { dg-do compile { target tls } } */
>+/* { dg-options "-O2" } */
>+
>+struct S { int a[1]; };
>+__thread struct T { int c; } e;
>+int f;
>+void bar (int);
>+
>+void
>+foo (int f, int x)
>+{
>+  struct S *h = (struct S *) 
>+  for (;;)
>+{
>+  int *a = h->a, i;
>+  for (i = x; i; i--)
>+  bar (a[f]);
>+  bar (a[f]);
>+}
>+}
>
>   Jakub



Re: [PATCH] Fix UMOD rtx simplification (PR target/83930)

2018-01-19 Thread Richard Biener
On January 19, 2018 10:29:36 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>This bug has been introduced 16.5 years ago.  If
>simplify_binary_operation_1
>is called with op1 (MEM) on which avoid_constant_pool_reference returns
>something simpler, for UMOD simplification we check that trueop1 is
>CONST_INT and power of 2, but then we use INTVAL of the op1 (MEM).
>
>Fixed by using INTVAL on what we've tested.  Additionally, if the value
>has msb set and all other bits clear, exact_log2 will return > 0, but
>INTVAL (trueop1) - 1 will invoke UB.  Bootstrapped/regtested on
>x86_64-linux
>and i686-linux, ok for trunk?
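
To make the arithmetic concrete, here is a small standalone sketch (not GCC code; `is_power_of_two` and `umod_pow2` are hypothetical helpers) of the identity the simplification relies on: for unsigned x and a power-of-two d, x % d equals x & (d - 1). Doing the subtraction in an unsigned type keeps the case where only the most significant bit is set well defined, which is the reason for using UINTVAL rather than INTVAL.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if D has exactly one bit set.  */
static int
is_power_of_two (uint64_t d)
{
  return d != 0 && (d & (d - 1)) == 0;
}

/* Compute X % D for a power-of-two D as an AND with D - 1.  The
   subtraction is done in an unsigned type, so D == 1 << 63 does not
   overflow the way a signed "value - 1" could.  */
static uint64_t
umod_pow2 (uint64_t x, uint64_t d)
{
  assert (is_power_of_two (d));
  return x & (d - 1);
}
```

For example, `umod_pow2 (13, 4)` masks with 3, the same value `13 % 4` produces.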

OK. 

Richard. 

>2018-01-19  Jakub Jelinek  
>
>   PR target/83930
>   * simplify-rtx.c (simplify_binary_operation_1) <case UMOD>: Use
>   UINTVAL (trueop1) instead of INTVAL (op1).
>
>   * gcc.dg/pr83930.c: New test.
>
>--- gcc/simplify-rtx.c.jj  2018-01-14 17:16:55.657836138 +0100
>+++ gcc/simplify-rtx.c 2018-01-19 10:24:03.389017997 +0100
>@@ -3411,7 +3411,8 @@ simplify_binary_operation_1 (enum rtx_co
>   if (CONST_INT_P (trueop1)
> && exact_log2 (UINTVAL (trueop1)) > 0)
>   return simplify_gen_binary (AND, mode, op0,
>-  gen_int_mode (INTVAL (op1) - 1, mode));
>+  gen_int_mode (UINTVAL (trueop1) - 1,
>+mode));
>   break;
> 
> case MOD:
>--- gcc/testsuite/gcc.dg/pr83930.c.jj  2018-01-19 10:33:16.657831745
>+0100
>+++ gcc/testsuite/gcc.dg/pr83930.c 2018-01-19 10:31:55.383859102 +0100
>@@ -0,0 +1,17 @@
>+/* PR target/83930 */
>+/* { dg-do compile } */
>+/* { dg-options "-Og -fno-tree-ccp -w" } */
>+
>+unsigned __attribute__ ((__vector_size__ (16))) v;
>+
>+static inline void
>+bar (unsigned char d)
>+{
>+  v /= d;
>+}
>+
>+__attribute__ ((always_inline)) void
>+foo (void)
>+{
>+  bar (4);
>+}
>
>   Jakub



[Committed] PR fortran/83900 -- Remove bogus assert

2018-01-19 Thread Steve Kargl
The patch and testcase are sufficient to describe the problem.
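
For readers who do not read Fortran, the following C sketch (the helper `matmul_int_real` is hypothetical and written only for illustration) mirrors what the testcase below asks the compiler to fold at compile time: an integer 3x2 matrix of ones times a real 2x3 matrix of twos, giving a 3x3 result whose every element is 1*2 + 1*2 = 4. The operand types differ, which is the situation the removed assertion rejected.

```c
#define ROWS 3
#define INNER 2
#define COLS 3

/* c = a * b, where a holds integers and b holds reals; each product is
   formed after converting the integer operand to float.  */
static void
matmul_int_real (int a[ROWS][INNER], float b[INNER][COLS],
                 float c[ROWS][COLS])
{
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      {
        float sum = 0.0f;
        for (int k = 0; k < INNER; k++)
          sum += (float) a[i][k] * b[k][j];
        c[i][j] = sum;
      }
}
```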

2018-01-19  Steven G. Kargl  

PR fortran/83900
* simplify.c (gfc_simplify_matmul): Delete bogus assertion.

2018-01-19  Steven G. Kargl  

PR fortran/83900
* gfortran.dg/matmul_17.f90: New test.

Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c  (revision 256905)
+++ gcc/fortran/simplify.c  (working copy)
@@ -4590,7 +4590,6 @@ gfc_simplify_matmul (gfc_expr *matrix_a, gfc_expr *mat
   || !is_constant_array_expr (matrix_b))
 return NULL;
 
-  gcc_assert (gfc_compare_types (&matrix_a->ts, &matrix_b->ts));
   result = gfc_get_array_expr (matrix_a->ts.type,
   matrix_a->ts.kind,
   &matrix_a->where);
Index: gcc/testsuite/gfortran.dg/matmul_17.f90
===
--- gcc/testsuite/gfortran.dg/matmul_17.f90 (nonexistent)
+++ gcc/testsuite/gfortran.dg/matmul_17.f90 (working copy)
@@ -0,0 +1,9 @@
+! { dg-do run }
+! PR Fortran/83900
+! Contributed by Gerhard Steinmetz  
+program p
+   integer, parameter :: a(3,2) = 1
+   real, parameter :: b(2,3) = 2
+   real, parameter :: c(3,3) = matmul(a, b)
+   if (any(c /= 4.)) call abort
+end

-- 
Steve


[PATCH v3, rs6000] Use $ instead of . for PC, use "crset 2" instead of "crset eq"

2018-01-19 Thread Bill Schmidt
Hi,

Here's another version of this patch incorporating the late-breaking news
that the AIX assembler doesn't comprehend the "eq" symbol.  Same as 
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01787.html but adding the
change to use "crset 2" instead.
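
For reference, the numeric operand follows from how condition register bits are numbered: each CR field has four bits (lt, gt, eq, so), and bit n of field f is 4*f + n, so eq in CR0 is 2. A small C sketch of that arithmetic, using the hypothetical names `cr_bit` and `cr_bit_number`:

```c
/* Bit positions within one 4-bit condition register field.  */
enum cr_bit { CR_LT = 0, CR_GT = 1, CR_EQ = 2, CR_SO = 3 };

/* Number of a bit in the 32-bit condition register: four bits per
   field, fields numbered 0 through 7.  */
static int
cr_bit_number (int field, enum cr_bit bit)
{
  return 4 * field + (int) bit;
}
```

With this numbering, `cr_bit_number (0, CR_EQ)` is 2, and the eq bit of CR7 is 30, which matches the "crset 30" that the safe-indirect-jump tests already scan for.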

This one is still regstrapping on BE/LE, trunk/7.  If these all complete
successfully, is this okay for trunk and backport?

Thanks,
Bill


[gcc]

2018-01-19  Bill Schmidt  
David Edelsohn 

PR target/83946
* config/rs6000/rs6000.md (*call_indirect_nonlocal_sysv):
Change "crset eq" to "crset 2".
(*call_value_indirect_nonlocal_sysv): Likewise.
(*call_indirect_aix_nospec): Likewise.
(*call_value_indirect_aix_nospec): Likewise.
(*call_indirect_elfv2_nospec): Likewise.
(*call_value_indirect_elfv2_nospec): Likewise.
(*sibcall_nonlocal_sysv): Change "crset eq" to "crset 2";
change assembly output from . to $.
(*sibcall_value_nonlocal_sysv): Likewise.
(indirect_jump_nospec): Change assembly output from . to $.
(*tablejump_internal1_nospec): Likewise.

[gcc/testsuite]

2018-01-19  Bill Schmidt  
David Edelsohn 

PR target/83946
* gcc.target/powerpc/safe-indirect-jump-1.c: Change expected
assembly output from "crset eq" to "crset 2".
* gcc.target/powerpc/safe-indirect-jump-2.c: Change expected
assembly output from . to $.
* gcc.target/powerpc/safe-indirect-jump-3.c: Likewise.
* gcc.target/powerpc/safe-indirect-jump-1.c: Change expected
assembly output from "crset eq" to "crset 2".
* gcc.target/powerpc/safe-indirect-jump-8.c: Change expected
assembly output from "crset eq" to "crset 2", and from . to $.


Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 256894)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -10457,7 +10457,7 @@
   || which_alternative == 1 || which_alternative == 3)
 return "b%T0l";
   else
-return "crset eq\;beq%T0l-";
+return "crset 2\;beq%T0l-";
 }
   [(set_attr "type" "jmpreg,jmpreg,jmpreg,jmpreg")
(set (attr "length")
@@ -10570,7 +10570,7 @@
   || which_alternative == 1 || which_alternative == 3)
 return "b%T1l";
   else
-return "crset eq\;beq%T1l-";
+return "crset 2\;beq%T1l-";
 }
   [(set_attr "type" "jmpreg,jmpreg,jmpreg,jmpreg")
(set (attr "length")
@@ -10731,7 +10731,7 @@
(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" 
"n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
   "DEFAULT_ABI == ABI_AIX && !rs6000_speculate_indirect_jumps"
-  "crset eq\;<ptrload> 2,%2\;beq%T0l-\;<ptrload> 2,%3(1)"
+  "crset 2\;<ptrload> 2,%2\;beq%T0l-\;<ptrload> 2,%3(1)"
   [(set_attr "type" "jmpreg")
(set_attr "length" "16")])
 
@@ -10755,7 +10755,7 @@
(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 4 "const_int_operand" 
"n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
   "DEFAULT_ABI == ABI_AIX && !rs6000_speculate_indirect_jumps"
-  "crset eq\;<ptrload> 2,%3\;beq%T1l-\;<ptrload> 2,%4(1)"
+  "crset 2\;<ptrload> 2,%3\;beq%T1l-\;<ptrload> 2,%4(1)"
   [(set_attr "type" "jmpreg")
(set_attr "length" "16")])
 
@@ -10780,7 +10780,7 @@
(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 2 "const_int_operand" 
"n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
   "DEFAULT_ABI == ABI_ELFv2 && !rs6000_speculate_indirect_jumps"
-  "crset eq\;beq%T0l-\;<ptrload> 2,%2(1)"
+  "crset 2\;beq%T0l-\;<ptrload> 2,%2(1)"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
@@ -10803,7 +10803,7 @@
(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" 
"n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
   "DEFAULT_ABI == ABI_ELFv2 && !rs6000_speculate_indirect_jumps"
-  "crset eq\;beq%T1l-\;<ptrload> 2,%3(1)"
+  "crset 2\;beq%T1l-\;<ptrload> 2,%3(1)"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
@@ -10987,7 +10987,7 @@
return \"b%T0\";
   else
/* Can use CR0 since it is volatile across sibcalls.  */
-   return \"crset eq\;beq%T0-\;b .\";
+   return \"crset 2\;beq%T0-\;b $\";
 }
   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
 {
@@ -11044,7 +11044,7 @@
return \"b%T1\";
   else
/* Can use CR0 since it is volatile across sibcalls.  */
-   return \"crset eq\;beq%T1-\;b .\";
+   return \"crset 2\;beq%T1-\;b $\";
 }
   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
 {
@@ -12566,7 +12566,7 @@
   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))
(clobber (match_operand:CC 1 "cc_reg_operand" "=y,y"))]
   "!rs6000_speculate_indirect_jumps"
-  "crset %E1\;beq%T0- %1\;b ."
+  "crset %E1\;beq%T0- %1\;b $"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
@@ -12672,7 +12672,7 @@
(use (label_ref (match_operand 1)))
(clobber (match_operand:CC 2 

Re: [PATCH] C/C++: Add -Waddress-of-packed-member

2018-01-19 Thread Martin Sebor

On 01/19/2018 10:14 AM, Martin Sebor wrote:

On 01/14/2018 07:29 AM, H.J. Lu wrote:

When the address of a packed member of a struct or union is taken, it may
result in an unaligned pointer value.  This patch adds
-Waddress-of-packed-member to warn about it:

$ cat x.i
struct pair_t
{
  char c;
  int i;
} __attribute__ ((packed));

extern struct pair_t p;
int *addr = &p.i;
$ gcc -O2 -S x.i
x.i:8:13: warning: initialization of 'int *' from address of packed
member of 'struct pair_t' may result in an unaligned pointer value
[-Waddress-of-packed-member]
 int *addr = &p.i;
 ^
$

This warning is enabled by default.
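
As a sketch of how code can avoid forming such a pointer in the first place, the packed member can be copied in and out with memcpy instead of being accessed through an `int *`. The helper names `load_i` and `store_i` are hypothetical and not part of the patch; `__attribute__ ((packed))` is the GCC/Clang extension used in the example above.

```c
#include <stddef.h>
#include <string.h>

struct pair_t
{
  char c;
  int i;
} __attribute__ ((packed));

/* Read p->i without creating an int * to a possibly unaligned
   address: copy the bytes into a properly aligned local.  */
static int
load_i (const struct pair_t *p)
{
  int value;
  memcpy (&value, (const char *) p + offsetof (struct pair_t, i),
          sizeof value);
  return value;
}

/* Write p->i the same way, byte-wise from an aligned local.  */
static void
store_i (struct pair_t *p, int value)
{
  memcpy ((char *) p + offsetof (struct pair_t, i), &value, sizeof value);
}
```

Direct member access such as `p->i` is also fine, because the compiler knows the member is packed; the problem only arises once the address escapes as a plain `int *`.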


I like this enhancement.  It would be useful for data types,
packed or not, such as casting int* to long*.

I noticed some differences from Clang for the test case below.
It seems that GCC should warn on all the cases Clang does.

Also, since converting the address of a struct to that of its
first member is common (especially in C and when the member
itself is a struct) I wonder if the warning should trigger
for those conversions as well.

struct A {
  int i;
} __attribute__ ((packed));

long* f8 (struct A *p) { return &p->i; }   // Clang only
int* f4 (struct A *p) { return &p->i; }    // Clang, GCC
short* f2 (struct A *p) { return &p->i; }  // Clang only
char* f1 (struct A *p) { return &p->i; }
void* f0 (struct A *p) { return &p->i; }

struct B { int i; };
struct C { struct B b; } __attribute__ ((packed));

long* g8 (struct C *p) { return p; }        // should warn?
int* g4 (struct C *p) { return &p->b; }     // Clang only

int* h4 (struct C *p) { return &p->b.i; }   // Clang only



After reading the Clang code review for the warning
(https://reviews.llvm.org/D20561) and experimenting with a few
more test cases I noticed some additional false negatives that
I think would be worthwhile diagnosing:

  struct A {
    int i;
  } __attribute__ ((packed));

  int f (struct A *p)
  {
    return *&p->i;
  }

An example similar to one of those discussed in the review is
one involving a conditional expression (Clang diagnoses this):

  int* f (struct A *p, int *q)
  {
    return q ? q : &p->i;   // missing warning
  }

Clang doesn't diagnose the conversion of a packed member array
to a pointer to its element.  It seems to me that it should be
diagnosed:

  struct B {
int a[1];
  } __attribute__ ((packed));

  int* f (struct B *p)
  {
return p->a;   // missing warning
  }

Similarly, in C++, binding to more strictly aligned references
should probably be diagnosed (Clang misses it):

int* g (struct A &r)
{
  int &ir = r.i;   // missing warning here
  return &ir;      // regardless of how ir is used
}

Martin



[PATCH, rs6000] Requested cleanups for BE handling of -mno-speculate-indirect-jumps

2018-01-19 Thread Bill Schmidt
Hi,

Segher had previously requested some cleanups in 
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01605.html.
Due to time pressures, I delayed those, but they are ready now.  Here they are,
bootstrapped and tested on powerpc64le-linux-gnu and powerpc64-linux-gnu.  Is
this okay for trunk?  I don't intend to backport these.

Thanks,
Bill


2018-01-19  Bill Schmidt  

* config/rs6000/rs6000.md (*call_indirect_nonlocal_sysv):
Simplify the clause that sets the length attribute.
(*call_value_indirect_nonlocal_sysv): Likewise.
(*sibcall_nonlocal_sysv): Clean up code block; simplify the
clause that sets the length attribute.
(*sibcall_value_nonlocal_sysv): Likewise.


Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 256894)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -10463,17 +10463,11 @@
(set (attr "length")
(cond [(and (eq (symbol_ref "which_alternative") (const_int 0))
(eq (symbol_ref "rs6000_speculate_indirect_jumps")
-   (const_int 1)))
- (const_string "4")
-  (and (eq (symbol_ref "which_alternative") (const_int 0))
-   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
(const_int 0)))
  (const_string "8")
-  (eq (symbol_ref "which_alternative") (const_int 1))
- (const_string "4")
   (and (eq (symbol_ref "which_alternative") (const_int 2))
-   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
-   (const_int 1)))
+   (ne (symbol_ref "rs6000_speculate_indirect_jumps")
+   (const_int 0)))
  (const_string "8")
   (and (eq (symbol_ref "which_alternative") (const_int 2))
(eq (symbol_ref "rs6000_speculate_indirect_jumps")
@@ -10576,17 +10570,11 @@
(set (attr "length")
(cond [(and (eq (symbol_ref "which_alternative") (const_int 0))
(eq (symbol_ref "rs6000_speculate_indirect_jumps")
-   (const_int 1)))
- (const_string "4")
-  (and (eq (symbol_ref "which_alternative") (const_int 0))
-   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
(const_int 0)))
  (const_string "8")
-  (eq (symbol_ref "which_alternative") (const_int 1))
- (const_string "4")
   (and (eq (symbol_ref "which_alternative") (const_int 2))
-   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
-   (const_int 1)))
+   (ne (symbol_ref "rs6000_speculate_indirect_jumps")
+   (const_int 0)))
  (const_string "8")
   (and (eq (symbol_ref "which_alternative") (const_int 2))
(eq (symbol_ref "rs6000_speculate_indirect_jumps")
@@ -10973,47 +10961,40 @@
   "(DEFAULT_ABI == ABI_DARWIN
 || DEFAULT_ABI == ABI_V4)
&& (INTVAL (operands[2]) & CALL_LONG) == 0"
-  "*
 {
   if (INTVAL (operands[2]) & CALL_V4_SET_FP_ARGS)
-output_asm_insn (\"crxor 6,6,6\", operands);
+output_asm_insn ("crxor 6,6,6", operands);
 
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
-output_asm_insn (\"creqv 6,6,6\", operands);
+output_asm_insn ("creqv 6,6,6", operands);
 
   if (which_alternative >= 2)
 {
   if (rs6000_speculate_indirect_jumps)
-   return \"b%T0\";
+   return "b%T0";
   else
/* Can use CR0 since it is volatile across sibcalls.  */
-   return \"crset eq\;beq%T0-\;b $\";
+   return "crset eq;beq%T0-;b $";
 }
   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
 {
   gcc_assert (!TARGET_SECURE_PLT);
-  return \"b %z0@plt\";
+  return "b %z0@plt";
 }
   else
-return \"b %z0\";
-}"
+return "b %z0";
+}
   [(set_attr "type" "branch")
(set (attr "length")
-   (cond [(eq (symbol_ref "which_alternative") (const_int 0))
- (const_string "4")
-  (eq (symbol_ref "which_alternative") (const_int 1))
+   (cond [(eq (symbol_ref "which_alternative") (const_int 1))
  (const_string "8")
   (and (eq (symbol_ref "which_alternative") (const_int 2))
(eq (symbol_ref "rs6000_speculate_indirect_jumps")
-   (const_int 1)))
- (const_string "4")
-  (and (eq (symbol_ref "which_alternative") (const_int 2))
-   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
(const_int 0)))
  (const_string "12")
   (and (eq (symbol_ref "which_alternative") (const_int 3))
-   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
-   

Re: [PATCH v2, rs6000] Use $ instead of . for PC

2018-01-19 Thread Bill Schmidt
Forgot to mention, these have also been tested successfully as backports in
gcc-7-branch.  Okay to fix there as well?

Thanks,
Bill

> On Jan 19, 2018, at 9:31 PM, Bill Schmidt  wrote:
> 
> Hi,
> 
> Here's the same patch with corrected test cases.  This has now passed
> bootstrap and testing on powerpc64-linux-gnu and powerpc64le-linux-gnu
> with no regressions.  Is this okay for trunk?
> 
> Hopefully early tomorrow I will have a separate patch also changing
> "crset eq" to "crset 2" to accommodate the AIX assembler.
> 
> Thanks,
> Bill
> 
> 
> [gcc]
> 
> 2018-01-19  Bill Schmidt  
>   David Edelsohn 
> 
>   PR target/83946
>   * config/rs6000/rs6000.md (*sibcall_nonlocal_sysv): Change
>   assembly output from . to $.
>   (*sibcall_value_nonlocal_sysv): Likewise.
>   (indirect_jump_nospec): Likewise.
>   (*tablejump_internal1_nospec): Likewise.
> 
> [gcc/testsuite]
> 
> 2018-01-19  Bill Schmidt  
>   David Edelsohn 
> 
>   PR target/83946
>   * gcc.target/powerpc/safe-indirect-jump-2.c: Change expected
>   assembly output from . to $.
>   * gcc.target/powerpc/safe-indirect-jump-3.c: Likewise.
>   * gcc.target/powerpc/safe-indirect-jump-8.c: Likewise.
> 
> 
> Index: gcc/config/rs6000/rs6000.md
> ===
> --- gcc/config/rs6000/rs6000.md   (revision 256894)
> +++ gcc/config/rs6000/rs6000.md   (working copy)
> @@ -10987,7 +10987,7 @@
>   return \"b%T0\";
>   else
>   /* Can use CR0 since it is volatile across sibcalls.  */
> - return \"crset eq\;beq%T0-\;b .\";
> + return \"crset eq\;beq%T0-\;b $\";
> }
>   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
> {
> @@ -11044,7 +11044,7 @@
>   return \"b%T1\";
>   else
>   /* Can use CR0 since it is volatile across sibcalls.  */
> - return \"crset eq\;beq%T1-\;b .\";
> + return \"crset eq\;beq%T1-\;b $\";
> }
>   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
> {
> @@ -12566,7 +12566,7 @@
>   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))
>(clobber (match_operand:CC 1 "cc_reg_operand" "=y,y"))]
>   "!rs6000_speculate_indirect_jumps"
> -  "crset %E1\;beq%T0- %1\;b ."
> +  "crset %E1\;beq%T0- %1\;b $"
>   [(set_attr "type" "jmpreg")
>(set_attr "length" "12")])
> 
> @@ -12672,7 +12672,7 @@
>(use (label_ref (match_operand 1)))
>(clobber (match_operand:CC 2 "cc_reg_operand" "=y,y"))]
>   "!rs6000_speculate_indirect_jumps"
> -  "crset %E2\;beq%T0- %2\;b ."
> +  "crset %E2\;beq%T0- %2\;b $"
>   [(set_attr "type" "jmpreg")
>(set_attr "length" "12")])
> 
> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   (working copy)
> @@ -30,4 +30,4 @@ int foo (int x)
> 
> /* { dg-final { scan-assembler "crset 30" } } */
> /* { dg-final { scan-assembler "beqctr- 7" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler {b \$} } } */
> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c   (working copy)
> @@ -49,4 +49,4 @@ int foo (int x)
> 
> /* { dg-final { scan-assembler "crset 30" } } */
> /* { dg-final { scan-assembler "beqctr- 7" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler {b \$} } } */
> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c   (working copy)
> @@ -12,4 +12,4 @@ int bar ()
> 
> /* { dg-final { scan-assembler "crset eq" } } */
> /* { dg-final { scan-assembler "beqctr-" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler {b \$} } } */
> 



[PATCH v2, rs6000] Use $ instead of . for PC

2018-01-19 Thread Bill Schmidt
Hi,

Here's the same patch with corrected test cases.  This has now passed
bootstrap and testing on powerpc64-linux-gnu and powerpc64le-linux-gnu
with no regressions.  Is this okay for trunk?

Hopefully early tomorrow I will have a separate patch also changing
"crset eq" to "crset 2" to accommodate the AIX assembler.

Thanks,
Bill


[gcc]

2018-01-19  Bill Schmidt  
David Edelsohn 

PR target/83946
* config/rs6000/rs6000.md (*sibcall_nonlocal_sysv): Change
assembly output from . to $.
(*sibcall_value_nonlocal_sysv): Likewise.
(indirect_jump_nospec): Likewise.
(*tablejump_internal1_nospec): Likewise.

[gcc/testsuite]

2018-01-19  Bill Schmidt  
David Edelsohn 

PR target/83946
* gcc.target/powerpc/safe-indirect-jump-2.c: Change expected
assembly output from . to $.
* gcc.target/powerpc/safe-indirect-jump-3.c: Likewise.
* gcc.target/powerpc/safe-indirect-jump-8.c: Likewise.


Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 256894)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -10987,7 +10987,7 @@
return \"b%T0\";
   else
/* Can use CR0 since it is volatile across sibcalls.  */
-   return \"crset eq\;beq%T0-\;b .\";
+   return \"crset eq\;beq%T0-\;b $\";
 }
   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
 {
@@ -11044,7 +11044,7 @@
return \"b%T1\";
   else
/* Can use CR0 since it is volatile across sibcalls.  */
-   return \"crset eq\;beq%T1-\;b .\";
+   return \"crset eq\;beq%T1-\;b $\";
 }
   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
 {
@@ -12566,7 +12566,7 @@
   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))
(clobber (match_operand:CC 1 "cc_reg_operand" "=y,y"))]
   "!rs6000_speculate_indirect_jumps"
-  "crset %E1\;beq%T0- %1\;b ."
+  "crset %E1\;beq%T0- %1\;b $"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
@@ -12672,7 +12672,7 @@
(use (label_ref (match_operand 1)))
(clobber (match_operand:CC 2 "cc_reg_operand" "=y,y"))]
   "!rs6000_speculate_indirect_jumps"
-  "crset %E2\;beq%T0- %2\;b ."
+  "crset %E2\;beq%T0- %2\;b $"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c
===
--- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c (revision 
256894)
+++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c (working copy)
@@ -30,4 +30,4 @@ int foo (int x)
 
 /* { dg-final { scan-assembler "crset 30" } } */
 /* { dg-final { scan-assembler "beqctr- 7" } } */
-/* { dg-final { scan-assembler "b ." } } */
+/* { dg-final { scan-assembler {b \$} } } */
Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c
===
--- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c (revision 
256894)
+++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c (working copy)
@@ -49,4 +49,4 @@ int foo (int x)
 
 /* { dg-final { scan-assembler "crset 30" } } */
 /* { dg-final { scan-assembler "beqctr- 7" } } */
-/* { dg-final { scan-assembler "b ." } } */
+/* { dg-final { scan-assembler {b \$} } } */
Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c
===
--- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c (revision 
256894)
+++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c (working copy)
@@ -12,4 +12,4 @@ int bar ()
 
 /* { dg-final { scan-assembler "crset eq" } } */
 /* { dg-final { scan-assembler "beqctr-" } } */
-/* { dg-final { scan-assembler "b ." } } */
+/* { dg-final { scan-assembler {b \$} } } */



Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Bill Schmidt

> On Jan 19, 2018, at 7:53 PM, David Edelsohn  wrote:
> 
> On Fri, Jan 19, 2018 at 3:58 PM, Bill Schmidt
>  wrote:
>> Hi,
>> 
>> My recent patches to trunk and gcc-7-branch for avoiding speculation of
>> indirect branches have a flaw, pointed out by David.  Usage of "." to
>> represent the program counter is not portable across all POWER
>> assemblers, particularly not being accepted on AIX.  "$" is the
>> universally accepted alternative.  So change the code and the test
>> cases to use $ instead of . for this purpose.
>> 
>> Regstrap is in progress on powerpc64-linux-gnu and powerpc64le-linux-gnu.
>> Assuming no issues are found, is this okay for trunk and backport to 7?
> 
> Once I got past the "." issue, I have discovered that the AIX
> assembler also doesn't like
> 
> crset eq
> 
> It doesn't like the symbolic name for the operand, it wants a numeric
> operand for "eq".

This seems to me to be a different matter.  The ISA defines this syntax
in Appendix C and specifies that assemblers should provide everything
listed there.  The use of eq standing alone is shown in the examples in
section C.3.  So the AIX assembler is noncompliant in this case.  I
realize that as a practical matter we have to deal with that, but this
needs to be fixed at some point.  Where do we file the bug report?

I'm having a lot of heartburn over this because my test machine is
experiencing disk slowdowns, so it's taking me up to 4 hours to complete
a bootstrap and regression test.  So whenever my current test finishes
I plan to repost the patch in existing form so that it can be committed
tomorrow.

I'll try to set up another patch later tonight if I don't completely run out
of gas so it can burn overnight.  But frankly I may not manage it.

Bill
> 
> Thanks, David
> 



Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Segher Boessenkool
On Fri, Jan 19, 2018 at 08:53:52PM -0500, David Edelsohn wrote:
> On Fri, Jan 19, 2018 at 3:58 PM, Bill Schmidt
>  wrote:
> > Hi,
> >
> > My recent patches to trunk and gcc-7-branch for avoiding speculation of
> > indirect branches have a flaw, pointed out by David.  Usage of "." to
> > represent the program counter is not portable across all POWER
> > assemblers, particularly not being accepted on AIX.  "$" is the
> > universally accepted alternative.  So change the code and the test
> > cases to use $ instead of . for this purpose.
> >
> > Regstrap is in progress on powerpc64-linux-gnu and powerpc64le-linux-gnu.
> > Assuming no issues are found, is this okay for trunk and backport to 7?
> 
> Once I got past the "." issue, I have discovered that the AIX
> assembler also doesn't like
> 
> crset eq
> 
> It doesn't like the symbolic name for the operand, it wants a numeric
> operand for "eq".

Wow, nasty.  It is the very first thing in the "extended mnemonics"
appendix in the ISA.

"crset 2" works (I tried it out on gcc111).


Segher


Re: [C++ Patch] PR 83921 ("[7/8 Regression] GCC rejects constexpr initialization of empty aggregate")

2018-01-19 Thread Paolo Carlini

Hi again,

On 19/01/2018 23:55, Paolo Carlini wrote:
...Therefore it seems to me that a way to more cleanly solve the bug 
would be moving something like || !DECL_NONTRIVIALLY_INITIALIZED_P to 
the above check in check_for_uninitialized_const_var, and 
therefore remove completely the uninitialized case from 
potential_constant_expression_1, ...

Of course this doesn't work. check_for_uninitialized_const_var would really 
need to know that the VAR_DECL appears in a statement expression which 
is initializing a constexpr variable, nothing to do with 
DECL_NONTRIVIALLY_INITIALIZED_P. I'll give the issue more thought over 
the weekend, but removing completely the check from 
potential_constant_expression_1 seems very tough to me, assuming of 
course we really care about diagnosing the uninitialized inner in 
stmtexpr20.C, which is my invention and isn't in the testsuite yet ;)


Paolo.


Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread David Edelsohn
On Fri, Jan 19, 2018 at 3:58 PM, Bill Schmidt
 wrote:
> Hi,
>
> My recent patches to trunk and gcc-7-branch for avoiding speculation of
> indirect branches have a flaw, pointed out by David.  Usage of "." to
> represent the program counter is not portable across all POWER
> assemblers, particularly not being accepted on AIX.  "$" is the
> universally accepted alternative.  So change the code and the test
> cases to use $ instead of . for this purpose.
>
> Regstrap is in progress on powerpc64-linux-gnu and powerpc64le-linux-gnu.
> Assuming no issues are found, is this okay for trunk and backport to 7?

Once I got past the "." issue, I have discovered that the AIX
assembler also doesn't like

crset eq

It doesn't like the symbolic name for the operand, it wants a numeric
operand for "eq".

Thanks, David


Re: [PATCH, x86, libgcc] PR target/83917 Correct debug for -mcall-ms2sysv-xlogues stubs

2018-01-19 Thread Jakub Jelinek
On Fri, Jan 19, 2018 at 05:33:10PM -0600, Daniel Santos wrote:
> When stepping through tail-call restore stubs the debugger has to assume
> that rsp - 8 is the CFA, although it is not.  This is because I did not
> explicitly add any .cfi directives.  This patch adds them to the
> tail-call restore stubs, but this is new territory for me, so I would
> appreciate feedback.
> 
> I've reg-tested on x86_64, but I still need to test on Solaris and
> Darwin.  OK to commit after those tests?

I think you can't assume that the assembler supports .cfi_* directives.
While e.g. libgcc/config/i386/morestack.S uses them unconditionally,
it is guarded with:
if test "$libgcc_cv_cfi" = "yes"; then
tmake_file="${tmake_file} t-stack i386/t-stack-i386"
fi
in config.host.  E.g. cygwin.S has:
#ifdef HAVE_GAS_CFI_SECTIONS_DIRECTIVE
.cfi_sections   .debug_frame
# define cfi_startproc()            .cfi_startproc
# define cfi_endproc()              .cfi_endproc
# define cfi_adjust_cfa_offset(X)   .cfi_adjust_cfa_offset X
# define cfi_def_cfa_register(X)    .cfi_def_cfa_register X
# define cfi_register(D,S)          .cfi_register D, S
# ifdef __x86_64__
#  define cfi_push(X)   .cfi_adjust_cfa_offset 8; .cfi_rel_offset X, 0
#  define cfi_pop(X)    .cfi_adjust_cfa_offset -8; .cfi_restore X
# else
#  define cfi_push(X)   .cfi_adjust_cfa_offset 4; .cfi_rel_offset X, 0
#  define cfi_pop(X)    .cfi_adjust_cfa_offset -4; .cfi_restore X
# endif
#else
# define cfi_startproc()
# define cfi_endproc()
# define cfi_adjust_cfa_offset(X)
# define cfi_def_cfa_register(X)
# define cfi_register(D,S)
# define cfi_push(X)
# define cfi_pop(X)
#endif /* HAVE_GAS_CFI_SECTIONS_DIRECTIVE */
perhaps you need something similar or commonize that (though, without
.cfi_sections, you want the default).

Jakub


[Committed] PR fortran/80768 -- testcase

2018-01-19 Thread Steve Kargl
Jakub fixed a typo in

---
r250734 | jakub | 2017-07-31 02:32:02 -0700 (Mon, 31 Jul 2017) | 2 lines

* check.c (gfc_check_num_images): Fix a pasto.

but he seemed to be unaware of PR fortran/80768.  I've converted
the code from the PR into a testcase.  I plan to fix the typo
in the 6 and 7-branch as it prevents an ICE.

2018-01-19  Steven G. Kargl  

PR fortran/80768
* gfortran.dg/num_images_1.f90:  New test that tests fix in r250734.

Index: gcc/testsuite/gfortran.dg/num_images_1.f90
===
--- gcc/testsuite/gfortran.dg/num_images_1.f90  (nonexistent)
+++ gcc/testsuite/gfortran.dg/num_images_1.f90  (working copy)
@@ -0,0 +1,9 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=single -std=f2008" }
+! PR  Fortran/80768
+! Reported by Vittorio Zecca.
+program foo
+   implicit none
+   integer k5
+   k5 = num_images(failed=.false.) ! { dg-error "argument to NUM_IMAGES" }
+end program foo

-- 
Steve


[PATCH, x86, libgcc] PR target/83917 Correct debug for -mcall-ms2sysv-xlogues stubs

2018-01-19 Thread Daniel Santos
When stepping through tail-call restore stubs the debugger has to assume
that rsp - 8 is the CFA, although it is not.  This is because I did not
explicitly add any .cfi directives.  This patch adds them to the
tail-call restore stubs, but this is new territory for me, so I would
appreciate feedback.

I've reg-tested on x86_64, but I still need to test on Solaris and
Darwin.  OK to commit after those tests?

Thanks,
Daniel

Signed-off-by: Daniel Santos 
---
 libgcc/config/i386/resms64fx.h | 19 +++
 libgcc/config/i386/resms64x.h  | 22 ++
 2 files changed, 41 insertions(+)

diff --git a/libgcc/config/i386/resms64fx.h b/libgcc/config/i386/resms64fx.h
index c5f63d879fe..7dc8c7d89ed 100644
--- a/libgcc/config/i386/resms64fx.h
+++ b/libgcc/config/i386/resms64fx.h
@@ -34,21 +34,40 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 
.text
 MS2SYSV_STUB_BEGIN(resms64fx_17)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x68(%rsi),%r15
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_16)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x60(%rsi),%r14
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_15)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x58(%rsi),%r13
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_14)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x50(%rsi),%r12
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_13)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x48(%rsi),%rbx
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_12)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x40(%rsi),%rdi
SSE_RESTORE
mov -0x38(%rsi),%rsi
leaveq
+.cfi_def_cfa %rsp, 8
ret
+.cfi_endproc
 MS2SYSV_STUB_END(resms64fx_12)
 MS2SYSV_STUB_END(resms64fx_13)
 MS2SYSV_STUB_END(resms64fx_14)
diff --git a/libgcc/config/i386/resms64x.h b/libgcc/config/i386/resms64x.h
index 1b44938ae7c..753be1f4c52 100644
--- a/libgcc/config/i386/resms64x.h
+++ b/libgcc/config/i386/resms64x.h
@@ -33,23 +33,45 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 
.text
 MS2SYSV_STUB_BEGIN(resms64x_18)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x70(%rsi),%r15
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_17)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x68(%rsi),%r14
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_16)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x60(%rsi),%r13
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_15)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x58(%rsi),%r12
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_14)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x50(%rsi),%rbp
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_13)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x48(%rsi),%rbx
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_12)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x40(%rsi),%rdi
SSE_RESTORE
mov -0x38(%rsi),%rsi
mov %r10,%rsp
+.cfi_def_cfa_register %rsp
ret
+.cfi_endproc
 MS2SYSV_STUB_END(resms64x_12)
 MS2SYSV_STUB_END(resms64x_13)
 MS2SYSV_STUB_END(resms64x_14)
-- 
2.15.0



Re: [C++ Patch] PR 83921 ("[7/8 Regression] GCC rejects constexpr initialization of empty aggregate")

2018-01-19 Thread Paolo Carlini

Hi,

On 19/01/2018 23:28, Jason Merrill wrote:
> On Thu, Jan 18, 2018 at 5:53 PM, Paolo Carlini  wrote:
>> Hi,
>>
>> I'm finishing testing on x86_64-linux the below - which anyway seems very
>> unlikely to cause regressions because we aren't really stress testing the
>> relevant checks in potential_constant_expression_1 much, if at all (surely
>> stmtexpr19.C tests static).
>>
>> Anyway, the issue is the following. In 239268 aka "Implement C++17 constexpr
>> lambda" Jason added some checks to potential_constant_expression_1 covering
>> static, thread_local and uninitialized var declaration in constexpr function
>> context. Then extended to constexpr context more generally in 249382 aka
>> "constexpr and static var in statement-expression", with ext/stmtexpr19.C
>> covering the static case. Now, it looks like the check for uninitialized
>> vars in constexpr functions context is more correctly carried out by
>> check_for_uninitialized_const_var instead, because the simple check in
>> potential_constant_expression_1 as-is causes the regression pointed out by
>> this bug. Thus the fix below which just restricts the check in
>> potential_constant_expression_1, and the testcases, one for this bug proper,
>> plus one, very similar to stmtexpr19.C, double checking that we are still
>> diagnosing in the statement-expression context. I also verified under the
>> debugger how for constexpr-83921.C we are actually running
>> check_for_uninitialized_const_var on 'f' - which obviously passes.
>
> Seems like this code should either be removed because it's covered by
> check_for_uninitialized_const_var, or it should be fixed to check
> default_init_uninitialized_part.
>
> So this code is still needed for stmtexpr19.C?  Why doesn't
> check_for_... handle that case?

Well, stmtexpr19.C exists to exercise 'static', which definitely 
check_for_uninitialized_const_var doesn't cover. The same would be true 
for a testcase exercising 'thread_local'. As regards my new 
stmtexpr20.C, check_for_uninitialized_const_var can't diagnose anything 
for a very simple reason:


  if (VAR_P (decl)
  && TREE_CODE (type) != REFERENCE_TYPE
  && (CP_TYPE_CONST_P (type) || var_in_constexpr_fn (decl))
  && !DECL_INITIAL (decl))

thus doesn't do anything for non-const vars outside a constexpr 
function. On the other hand, !DECL_NONTRIVIALLY_INITIALIZED_P is true 
for such vars in constexpr context which actually belong to a statement 
expression (always?). Therefore it seems to me that a way to more 
cleanly solve the bug would be moving something like || 
!DECL_NONTRIVIALLY_INITIALIZED_P to the above check in 
check_for_uninitialized_const_var, and therefore remove completely the 
uninitialized case from potential_constant_expression_1, but I'm not 
sure it would always work because at the moment it isn't entirely clear to 
me how DECL_NONTRIVIALLY_INITIALIZED_P reflects the statement-expression 
context bit (that explains the conservative slant of my patch). What do 
you think?


Paolo.


[C++] Deprecate old for-scope handling

2018-01-19 Thread Nathan Sidwell

Jason,
what do you think about deprecating the ARM-era for-scope handling that 
allows:

  void f ()
  {
for (int i = 0;;);
i = 2;
  }

we noisily accept that in c++98 mode with -fpermissive.  It wasn't even 
well formed then.  Implementing this has some unique requirements in the 
name-lookup machinery, which I ran into again today.


Option A: rip out now because it's a c++98 ARM-compatibility crutch
Option B: deprecate in gcc-8 and remove in gcc-9.

nathan

--
Nathan Sidwell


Re: [PATCH, fortran] Support Fortran 2018 teams

2018-01-19 Thread Alessandro Fanfarillo
I can confirm that the little change suggested by Steve passes the
regtests (on x86_64-pc-linux-gnu) and the regular tests using
OpenCoarrays.

On Fri, Jan 19, 2018 at 10:33 AM, Steve Kargl
 wrote:
> On Fri, Jan 19, 2018 at 09:18:14AM -0800, Damian Rouson wrote:
>> Thanks for catching that, Steve, and for responding, Alessandro.
>>
>> Anything else?
>>
>
> I've only just started to look at the patch.   Unfortunately,
> I know zero about teams, so need to read the patch and F2018
> standard simultaneously.
>
> --
> Steve



-- 

Alessandro Fanfarillo, Ph.D.
Postdoctoral Researcher
National Center for Atmospheric Research
Mesa Lab, Boulder, CO, USA
303-497-1229


Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Bill Schmidt

> On Jan 19, 2018, at 3:58 PM, Segher Boessenkool  
> wrote:
> 
> On Fri, Jan 19, 2018 at 10:20:23PM +0100, Andreas Schwab wrote:
>> On Jan 19 2018, Bill Schmidt  wrote:
>> 
>>>> On Jan 19, 2018, at 3:09 PM, Jakub Jelinek  wrote:
>>>> 
>>>> On Fri, Jan 19, 2018 at 02:58:07PM -0600, Bill Schmidt wrote:
>>>>> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   
>>>>> (revision 256894)
>>>>> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   
>>>>> (working copy)
>>>>> @@ -30,4 +30,4 @@ int foo (int x)
>>>>> 
>>>>> /* { dg-final { scan-assembler "crset 30" } } */
>>>>> /* { dg-final { scan-assembler "beqctr- 7" } } */
>>>>> -/* { dg-final { scan-assembler "b ." } } */
>>>>> +/* { dg-final { scan-assembler "b $" } } */
>>>> 
>>>> Does $ in scan-assembler really match a literal $ and not end of line?
>>>> Looking around, most of scan-assembler patterns that want to match a $ use
>>>> \\\$
>>> 
>>> Right.  Working on getting the right number of backslashes in here...
>>> I can never remember which ones need one and which need 3.
>> 
>> Use braces.
> 
> Yes, either
> 
> +/* { dg-final { scan-assembler "b \\\$" } } */
> 
> or (preferably)
> 
> +/* { dg-final { scan-assembler {b \$} } } */

Thanks, going with the latter suggestion.  My environment crapped out so I'm
restarting the regstrap.  Will repost later this evening after it passes.

Thanks,
Bill
> 
> 
> Segher



Re: [C++ Patch] PR 83921 ("[7/8 Regression] GCC rejects constexpr initialization of empty aggregate")

2018-01-19 Thread Jason Merrill
On Thu, Jan 18, 2018 at 5:53 PM, Paolo Carlini  wrote:
> Hi,
>
> I'm finishing testing on x86_64-linux the below - which anyway seems very
> unlikely to cause regressions because we aren't really stress testing the
> relevant checks in potential_constant_expression_1 much, if at all (surely
> stmtexpr19.C tests static).
>
> Anyway, the issue is the following. In 239268 aka "Implement C++17 constexpr
> lambda" Jason added some checks to potential_constant_expression_1 covering
> static, thread_local and uninitialized var declaration in constexpr function
> context. Then extended to constexpr context more generally in 249382 aka
> "constexpr and static var in statement-expression", with ext/stmtexpr19.C
> covering the static case. Now, it looks like the check for uninitialized
> vars in constexpr functions context is more correctly carried out by
> check_for_uninitialized_const_var instead, because the simple check in
> potential_constant_expression_1 as-is causes the regression pointed out by
> this bug. Thus the fix below which just restricts the check in
> potential_constant_expression_1, and the testcases, one for this bug proper,
> plus one, very similar to stmtexpr19.C, double checking that we are still
> diagnosing in the statement-expression context. I also verified under the
> debugger how for constexpr-83921.C we are actually running
> check_for_uninitialized_const_var on 'f' - which obviously passes.

Seems like this code should either be removed because it's covered by
check_for_uninitialized_const_var, or it should be fixed to check
default_init_uninitialized_part.

So this code is still needed for stmtexpr19.C?  Why doesn't
check_for_... handle that case?

Jason


Re: [C++ PATCH] Avoid spurious -Wignored-qualifiers warning on artificial C++17 P0138R2 cast (PR c++/83919)

2018-01-19 Thread Jason Merrill
OK.

On Thu, Jan 18, 2018 at 6:17 PM, Jakub Jelinek  wrote:
> Hi!
>
> These casts for P0138R2 aren't something the user typed in their code,
> so diagnosing -Wignored-qualifiers on these looks wrong.
> -Wuseless-cast has been handled similarly in the past already.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2018-01-18  Jakub Jelinek  
>
> PR c++/83919
> * typeck.c (convert_for_assignment): Suppress warn_ignored_qualifiers
> for direct enum init.
> * decl.c (reshape_init): Likewise.
>
> * g++.dg/cpp0x/pr83919.C: New test.
>
> --- gcc/cp/typeck.c.jj  2018-01-17 22:00:06.863228592 +0100
> +++ gcc/cp/typeck.c 2018-01-18 10:58:39.976333499 +0100
> @@ -8689,6 +8689,7 @@ convert_for_assignment (tree type, tree
>if (check_narrowing (ENUM_UNDERLYING_TYPE (type), elt, complain))
> {
>   warning_sentinel w (warn_useless_cast);
> + warning_sentinel w2 (warn_ignored_qualifiers);
>   rhs = cp_build_c_cast (type, elt, complain);
> }
>else
> --- gcc/cp/decl.c.jj2018-01-18 00:41:40.564015137 +0100
> +++ gcc/cp/decl.c   2018-01-18 11:10:10.579713182 +0100
> @@ -6091,6 +6091,7 @@ reshape_init (tree type, tree init, tsub
>if (check_narrowing (ENUM_UNDERLYING_TYPE (type), elt, complain))
> {
>   warning_sentinel w (warn_useless_cast);
> + warning_sentinel w2 (warn_ignored_qualifiers);
>   return cp_build_c_cast (type, elt, tf_warning_or_error);
> }
>else
> --- gcc/testsuite/g++.dg/cpp0x/pr83919.C.jj 2018-01-18 11:03:08.048295967 
> +0100
> +++ gcc/testsuite/g++.dg/cpp0x/pr83919.C2018-01-18 10:59:57.465322655 
> +0100
> @@ -0,0 +1,10 @@
> +// PR c++/83919
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-Wignored-qualifiers" }
> +
> +enum class Conf;
> +struct foo
> +{
> +  foo (const Conf& conf) : x{conf} {}  // { dg-bogus "type qualifiers 
> ignored on cast result type" }
> +  const Conf x;
> +};
>
> Jakub


Re: [PATCH] Fix unwind info in x86 interrupt functions (PR debug/83728)

2018-01-19 Thread Jason Merrill
On Thu, Jan 18, 2018 at 6:28 PM, Jakub Jelinek  wrote:
> Last summer i386 INCOMING_FRAME_SP_OFFSET macro has been changed, so that it
> is one word for most of the functions (as previously always), but 2 words
> for functions with interrupt attribute.
> Unfortunately this breaks the unwind info, as can be seen on the
> gcc.dg/guality/pr68037-1.c testcase.  Our infrastructure assumes we have
> just one set of cfi instructions in the CIE (we can have multiple CIEs, but
> only for various flags, not for different cfi instructions), so if
> the first function being assembled is an interrupt function, we start
> with the 2*word initial offset in the CIE we emit (when
> -fno-dwarf2-cfi-asm), or just assume the CIE has that (with -fdwarf2-cfi-asm
> when it is GAS that emits it) even when it doesn't.  For -fdwarf2-cfi-asm
> the effect is that until first CFA adjustment the offset is off by a word,
> after that correct.  In both cases all CFA offsets are off by a word in
> non-interrupt functions.
> Or, if the first function doesn't have interrupt attribute, CFI in
> non-interrupt functions is correct, but interrupt functions are wrong.
>
> I've looked around and it seems stormy16 is the only other target that
> bases the INCOMING_FRAME_SP_OFFSET value not just on ABI changing switches,
> but on properties of the current function.
>
> For these 2 targets the following patch introduces another macro,
> DEFAULT_INCOMING_FRAME_SP_OFFSET, that is meant to be the same for all
> functions of the same ABI and where GAS supports .cfi_startproc it should
> also match what GAS does.  For the hopefully minority of functions that
> need something else (i.e. interrupt functions) we emit a CFI instruction
> right at the start of the function.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, fixes the pr68037-1.c
> FAILs, ok for trunk?

OK, thanks.

Jason


patch to fix PR83147

2018-01-19 Thread Vladimir Makarov

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83147

The patch was successfully bootstrapped and tested on x86_64.

There is no test for the patch because a test from the PR does not 
reproduce the bug anymore.


Committed as rev. 256902.

Index: ChangeLog
===
--- ChangeLog	(revision 256901)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2018-01-19  Andreas Krebbel  
+
+	PR rtl-optimization/83147
+	* lra-constraints.c (remove_inheritance_pseudos): Use
+	lra_substitute_pseudo_within_insn.
+
 2018-01-19  Tom de Vries  
 	Cesar Philippidis  
 
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 256891)
+++ lra-constraints.c	(working copy)
@@ -6719,10 +6719,12 @@ remove_inheritance_pseudos (bitmap remov
 		{
 		  lra_assert (GET_MODE (SET_SRC (prev_set))
   == GET_MODE (regno_reg_rtx[sregno]));
-		  if (GET_CODE (SET_SRC (set)) == SUBREG)
-			SUBREG_REG (SET_SRC (set)) = SET_SRC (prev_set);
-		  else
-			SET_SRC (set) = SET_SRC (prev_set);
+		  /* Although we have a single set, the insn can
+			 contain more one sregno register occurrence
+			 as a source.  Change all occurrences.  */
+		  lra_substitute_pseudo_within_insn (curr_insn, sregno,
+			 SET_SRC (prev_set),
+			 false);
 		  /* As we are finishing with processing the insn
 			 here, check the destination too as it might
 			 inheritance pseudo for another pseudo.  */


Re: [C++ PATCH] Fix ICE in joust with -Wconversion (PR c++/81167)

2018-01-19 Thread Jason Merrill
On Thu, Jan 18, 2018 at 6:35 PM, Jakub Jelinek  wrote:
> Hi!
>
> As mentioned in the PR, we ICE on this testcase because
> w->fn is a conversion operator, w->convs[0]->type is a reference
> to a class type, but because that conversion is ck_ref_bind,
> source_type looks through it and finds ck_identity with
> the class type.  Then we because w->fn is not a constructor
> do source = TREE_TYPE (source);, assuming we got a pointer
> type like on the other 5 testcases in check-c++-all with -Wconversion
> that cover this code, so source is NULL and we die in calling
> warning with bogus arguments.
>
> The following patch fixes it by only using TREE_TYPE on pointer/reference
> types.  I must say I don't understand this fully, but conversion
> operators should be used on class types, so that is what we are looking for
> with source, right?  No idea about the ! DECL_CONSTRUCTOR_P (w->fn)
> check though and nothing in the testsuite covers that.

I think that the test for POINTER_TYPE_P can replace the
DECL_CONSTRUCTOR_P test.

Jason


[og7,nvptx] Backport CUDA 9 support from trunk.

2018-01-19 Thread Cesar Philippidis
I've committed this patch to openacc-gcc-7-branch which backports the
recent CUDA 9 changes I applied to trunk. OG7 already
had an earlier version of this patch, but Tom requested some changes
which made their way into trunk. This patch keeps both trunk and
og7 consistent.

Cesar
[nvptx] Backport CUDA 9 support from trunk.

2018-01-19  Cesar Philippidis  

	Backport from mainline:
	2018-01-19  Cesar Philippidis  

	PR target/83790
	gcc/
	* config/nvptx/nvptx.c (output_init_frag):


diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 55c7e3cbf90..90d87bafda6 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -1894,14 +1894,14 @@ output_init_frag (rtx sym)
   
   if (sym)
 {
-  bool function = SYMBOL_REF_DECL (sym)
-	&& (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL);
+  bool function = (SYMBOL_REF_DECL (sym)
+		   && (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL));
   if (!function)
 	fprintf (asm_out_file, "generic(");
   output_address (VOIDmode, sym);
   if (!function)
-	fprintf (asm_out_file, val ? ") + " : ")");
-  else if (val)
+	fprintf (asm_out_file, ")");
+  if (val)
 	fprintf (asm_out_file, " + ");
 }
 


Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Segher Boessenkool
On Fri, Jan 19, 2018 at 10:20:23PM +0100, Andreas Schwab wrote:
> On Jan 19 2018, Bill Schmidt  wrote:
> 
> >> On Jan 19, 2018, at 3:09 PM, Jakub Jelinek  wrote:
> >> 
> >> On Fri, Jan 19, 2018 at 02:58:07PM -0600, Bill Schmidt wrote:
> >>> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   
> >>> (revision 256894)
> >>> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   
> >>> (working copy)
> >>> @@ -30,4 +30,4 @@ int foo (int x)
> >>> 
> >>> /* { dg-final { scan-assembler "crset 30" } } */
> >>> /* { dg-final { scan-assembler "beqctr- 7" } } */
> >>> -/* { dg-final { scan-assembler "b ." } } */
> >>> +/* { dg-final { scan-assembler "b $" } } */
> >> 
> >> Does $ in scan-assembler really match a literal $ and not end of line?
> >> Looking around, most of scan-assembler patterns that want to match a $ use
> >> \\\$
> >
> > Right.  Working on getting the right number of backslashes in here...
> > I can never remember which ones need one and which need 3.
> 
> Use braces.

Yes, either

+/* { dg-final { scan-assembler "b \\\$" } } */

or (preferably)

+/* { dg-final { scan-assembler {b \$} } } */


Segher
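
The question above, whether an unescaped $ matches a literal dollar sign or the end of a line, can be tried out in isolation.  The sketch below is only an illustration: it uses POSIX extended regular expressions via regcomp, not the Tcl regexp engine that scan-assembler actually runs on, and the helper name re_matches is made up for the example.  The assumption is that both engines treat a bare $ as an anchor and \$ as a literal.

```c
#include <assert.h>
#include <stddef.h>
#include <regex.h>

/* Hypothetical helper: return 1 if PATTERN (a POSIX extended regex)
   matches somewhere in TEXT, 0 otherwise.  REG_NEWLINE makes $ match
   before a newline as well as at the end of the string.  */
int
re_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NEWLINE | REG_NOSUB);
  assert (rc == 0);
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With an assembler line such as "b $", the pattern b \$ (written "b \\$" as a C string) matches, while the unescaped pattern b $ only matches a "b " that sits at the end of a line.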


[PATCH] Fix tree-emutls ADDR_EXPR handling (PR middle-end/83945)

2018-01-19 Thread Jakub Jelinek
Hi!

Before emutls lowering, expressions like &MEM_REF[(whatever)&e].a
for static __thread e are considered gimple invariant (after all, for
native TLS we do that too) and gimple invariants are shareable trees.
lower_emutls* then has code to update the VAR_DECL in there with an SSA_NAME
and, if needed, split it apart into another stmt (i.e. manually regimplify),
but doesn't take into account the possible tree sharing,
which can result in 1) invalid tree sharing, because after changing the
VAR_DECL for a SSA_NAME it is no longer gimple invariant 2) might not
break it apart anymore the second and further times 3) might use SSA_NAME
that doesn't dominate it.

Fixed by checking if we'll need to update the ADDR_EXPR operand or anything
in there and unsharing if so before actually changing it.
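
To illustrate the check-then-unshare idea outside of GCC, here is a toy sketch.  None of it is GCC code: struct node, unshare, contains_code and rewrite_unshared are hypothetical stand-ins for tree, unshare_expr, the lower_emutls_2 walk and the rewriting step, and the copy is never freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node; GCC's real tree type is far richer.  */
struct node
{
  int code;          /* stand-in for TREE_CODE */
  struct node *op;   /* single operand, may be NULL */
};

/* Deep-copy N so that later edits cannot affect other users of N.  */
struct node *
unshare (const struct node *n)
{
  if (!n)
    return NULL;
  struct node *c = malloc (sizeof *c);
  if (!c)
    abort ();
  c->code = n->code;
  c->op = unshare (n->op);
  return c;
}

/* Return nonzero if any node reachable from N has code CODE.  */
int
contains_code (const struct node *n, int code)
{
  for (; n; n = n->op)
    if (n->code == code)
      return 1;
  return 0;
}

/* Change every node with code FROM to code TO in *SLOT, but only after
   replacing *SLOT with a private copy, and only if a change is needed.  */
void
rewrite_unshared (struct node **slot, int from, int to)
{
  assert (slot != NULL);
  if (!contains_code (*slot, from))
    return;
  *slot = unshare (*slot);
  for (struct node *n = *slot; n; n = n->op)
    if (n->code == from)
      n->code = to;
}
```

If two statements hold the same pointer and only one of them is rewritten through rewrite_unshared, the other keeps seeing the original, unchanged expression.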

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-19  Jakub Jelinek  

PR middle-end/83945
* tree-emutls.c: Include gimplify.h.
(lower_emutls_2): New function.
(lower_emutls_1): If ADDR_EXPR is a gimple invariant and walk_tree
with lower_emutls_2 callback finds some TLS decl in it, unshare_expr
it before further processing.

* gcc.dg/tls/pr83945.c: New test.

--- gcc/tree-emutls.c.jj2018-01-03 10:19:54.790533899 +0100
+++ gcc/tree-emutls.c   2018-01-19 19:11:14.849321308 +0100
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.
 #include "gimple-walk.h"
 #include "langhooks.h"
 #include "tree-iterator.h"
+#include "gimplify.h"
 
 /* Whenever a target does not support thread-local storage (TLS) natively,
we can emulate it with some run-time support in libgcc.  This will in
@@ -429,6 +430,20 @@ gen_emutls_addr (tree decl, struct lower
   return addr;
 }
 
+/* Callback for lower_emutls_1, return non-NULL if there is any TLS
+   VAR_DECL in the subexpressions.  */
+
+static tree
+lower_emutls_2 (tree *ptr, int *walk_subtrees, void *)
+{
+  tree t = *ptr;
+  if (TREE_CODE (t) == VAR_DECL)
+return DECL_THREAD_LOCAL_P (t) ? t : NULL_TREE;
+  else if (!EXPR_P (t))
+*walk_subtrees = 0;
+  return NULL_TREE;
+}
+
 /* Callback for walk_gimple_op.  D = WI->INFO is a struct lower_emutls_data.
Given an operand *PTR within D->STMT, if the operand references a TLS
variable, then lower the reference to a call to the runtime.  Insert
@@ -455,6 +470,13 @@ lower_emutls_1 (tree *ptr, int *walk_sub
{
  bool save_changed;
 
+ /* Gimple invariants are shareable trees, so before changing
+anything in them if we will need to change anything, unshare
+them.  */
+ if (is_gimple_min_invariant (t)
+ && walk_tree (&TREE_OPERAND (t, 0), lower_emutls_2, NULL, NULL))
+   *ptr = t = unshare_expr (t);
+
  /* If we're allowed more than just is_gimple_val, continue.  */
  if (!wi->val_only)
{
--- gcc/testsuite/gcc.dg/tls/pr83945.c.jj   2018-01-19 19:16:52.376346273 
+0100
+++ gcc/testsuite/gcc.dg/tls/pr83945.c  2018-01-19 19:16:25.186344529 +0100
@@ -0,0 +1,21 @@
+/* PR middle-end/83945 */
+/* { dg-do compile { target tls } } */
+/* { dg-options "-O2" } */
+
+struct S { int a[1]; };
+__thread struct T { int c; } e;
+int f;
+void bar (int);
+
+void
+foo (int f, int x)
+{
+  struct S *h = (struct S *) &e;
+  for (;;)
+{
+  int *a = h->a, i;
+  for (i = x; i; i--)
+   bar (a[f]);
+  bar (a[f]);
+}
+}

Jakub


Re: [PATCH, rs6000] Testcase fix-ups for gimple-folding intrinsic tests.

2018-01-19 Thread Segher Boessenkool
Hi!

On Fri, Jan 19, 2018 at 01:14:08PM -0600, Will Schmidt wrote:
>   Some testcase fix-ups affecting the gimple-folding tests, including:
> - Add xxspltib as a valid instruction for vec-abs tests (power9).
> - Fix up mismatched dg-require and dg-option values for the
> fold-vec-shift-left-longlong tests (fixes/eliminates a test failure on p6).
> 
> Sniff tested across P6,P8,P9.
> 
> OK for trunk?
> Thanks,
> -Will

This looks fine, okay for trunk, thanks!  One typo:

>   2018-01-19  Will Schmidt 
> 
>   * gcc.target/powerpc/fold-vec-abs-short-fwrap.c: Add xxspltib to
>   scan-asembler valid instructions list.
>   * gcc.target/powerpc/fold-vec-abs-short.c: Same.
>   * gcc.target/powerpc/fold-vec-shift-left-longlong.c: clean up
>   power8-vector requirement and option.
>   * gcc.target/powerpc/fold-vec-shift-left-fwrapv.c: Same.


Segher


[PATCH] Fix branch probability computation in do_compare_rtx_and_jump (PR tree-optimization/83081)

2018-01-19 Thread Jakub Jelinek
Hi!

This PR is about a certain test FAILing on arm, because it scans for
"Invalid sum ..." message in the *.ira dump, but recent GCC versions have
that message in there; not introduced by IRA though, but all the way from
expansion.  We are expanding:
   [local count: 1073741825]:
  _1 = *x_3(D);
  if (_1 u>= 0.0)
goto ; [99.95%]
  else
goto ; [0.05%]

   [local count: 536864]:
  sqrtf (_1);
and do_compare_rtx_and_jump decides to split the single _1 u>= 0.0
comparison into two.  The expectation is that the basic block counts stay
the same, so if bb 3's count is 0.05% times bb 2's count, the probabilities
need to be computed on both jumps so that this is preserved.
We want to expand essentially to:
   [local count: 1073741825]:
...
  if (cond1)
goto ; [first_prob]
  else
goto ; [first_prob.invert ()]

  :
  if (cond2)
goto ; [prob]
  else
goto ; [prob.invert ()]

   [local count: 536864]:
  sqrtf (_1);
and compute first_prob and prob from the original prob such that the bb
counts match.  The code used to hardcode first_prob to 1% or 99% depending
on condition, and leave the second probability the original one.

That IMHO can't work and the Invalid sum message verifies that.  If we want
the first jump to hit 99 times more often than the second one or vice versa,
I believe first_prob should be .99 * prob or .01 * prob respectively, and
the second probability then should be (0.01 * prob) / (1 - 0.99 * prob)
or (0.99 * prob) / (1 - 0.01 * prob) respectively.

With this change the Invalid sum message disappears.
predict-8.c testcase was apparently trying to match the hardcoded 0.99
probability rather than .99 * 65% emitted now.
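
As a sanity check of the arithmetic above, here is a small sketch in plain doubles.  It is not the profile_probability API; split_jump_prob and combined_prob are hypothetical helpers, and share stands for the 0.01 or 0.99 factor.

```c
#include <assert.h>

/* Probabilities of the two jumps that replace a single jump.  */
struct split_prob
{
  double first;    /* probability of the first jump being taken */
  double second;   /* probability of the second one, given fall-through */
};

/* Split a jump taken with probability PROB into two jumps to the same
   target, giving the fraction SHARE of PROB to the first jump.  */
struct split_prob
split_jump_prob (double prob, double share)
{
  struct split_prob r;
  assert (prob >= 0.0 && prob <= 1.0 && share >= 0.0 && share <= 1.0);
  r.first = share * prob;
  /* The second jump is only reached when the first one falls through,
     so the remaining probability is rescaled by (1 - first).  */
  r.second = r.first < 1.0 ? (prob - r.first) / (1.0 - r.first) : 0.0;
  return r;
}

/* Overall probability of reaching the target through either jump.  */
double
combined_prob (struct split_prob p)
{
  return p.first + (1.0 - p.first) * p.second;
}
```

For prob = 65% and share = 0.99 this gives first = 64.35%, which is in line with the 64.[34] pattern in the adjusted predict-8.c, and the combined probability stays at 65%.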

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

If this patch is right, I think do_jump_by_parts* are buggy similarly too,
there we emit prob or prob.invert () for all the N jumps we emit instead of
the original single conditional jump with probability prob.  I think we'd
need to decide what relative probabilities we want to use for the different
jumps, e.g. all have even relative likelihood, and then adjust the
probability of each branch, computing the following probabilities
from it similarly to this patch.

2018-01-19  Jakub Jelinek  

PR tree-optimization/83081
* dojump.c (do_compare_rtx_and_jump): Fix adjustment of probabilities
when splitting a single conditional jump into 2.

* gcc.dg/predict-8.c: Adjust expected probability.

--- gcc/dojump.c.jj 2018-01-03 10:19:55.0 +0100
+++ gcc/dojump.c2018-01-19 17:07:25.238927314 +0100
@@ -1122,11 +1122,30 @@ do_compare_rtx_and_jump (rtx op0, rtx op
{
  profile_probability first_prob = prob;
  if (first_code == UNORDERED)
-   first_prob = profile_probability::guessed_always ().apply_scale
-(1, 100);
+   {
+ /* We want to split:
+if (x) goto t; // prob;
+into
+if (a) goto t; // first_prob;
+if (b) goto t; // prob;
+such that the overall probability of jumping to t
+remains the same, but the first jump jumps
+much less often than the second jump.  */
+ first_prob = prob.guessed ().apply_scale (1, 100);
+ prob = (prob.guessed () - first_prob) / first_prob.invert ();
+   }
  else if (first_code == ORDERED)
-   first_prob = profile_probability::guessed_always ().apply_scale
-(99, 100);
+   {
+ /* See above, except the first jump should jump much more
+often than the second one.  */
+ first_prob = prob.guessed ().apply_scale (99, 100);
+ prob = (prob.guessed () - first_prob) / first_prob.invert ();
+   }
+ else
+   {
+ first_prob = prob.guessed ().apply_scale (50, 100);
+ prob = first_prob;
+   }
  if (and_them)
{
  rtx_code_label *dest_label;
--- gcc/testsuite/gcc.dg/predict-8.c.jj 2017-11-21 23:17:43.149093787 +0100
+++ gcc/testsuite/gcc.dg/predict-8.c2018-01-19 22:24:09.949249810 +0100
@@ -8,4 +8,4 @@ int foo(float a, float b) {
 return 2;
 }
 
-/* { dg-final { scan-rtl-dump-times "99.0. .guessed" 2 "expand"} } */
+/* { dg-final { scan-rtl-dump-times "64.[34]. .guessed" 2 "expand"} } */

Jakub


[PATCH] Fix UMOD rtx simplification (PR target/83930)

2018-01-19 Thread Jakub Jelinek
Hi!

This bug has been introduced 16.5 years ago.  If simplify_binary_operation_1
is called with op1 (MEM) on which avoid_constant_pool_reference returns
something simpler, for UMOD simplification we check that trueop1 is
CONST_INT and power of 2, but then we use INTVAL of the op1 (MEM).

Fixed by using INTVAL on what we've tested.  Additionally, if the value
has msb set and all other bits clear, exact_log2 will return > 0, but
INTVAL (trueop1) - 1 will invoke UB.  Bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?
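
For reference, the identity behind this simplification can be sketched in plain C.  The helper name umod_pow2 is made up for the example and uint64_t merely stands in for the unsigned view that UINTVAL gives; this is not the simplify-rtx.c code.

```c
#include <assert.h>
#include <stdint.h>

/* For unsigned X and a power-of-two divisor D, X % D equals
   X & (D - 1).  D - 1 is computed in an unsigned type, so it is well
   defined even when D has only its most significant bit set; the same
   subtraction on the signed value would overflow.  */
uint64_t
umod_pow2 (uint64_t x, uint64_t d)
{
  assert (d != 0 && (d & (d - 1)) == 0);
  return x & (d - 1);
}
```

Computing the mask from the unsigned value is what sidesteps the undefined behaviour mentioned above for a divisor with only the msb set.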

2018-01-19  Jakub Jelinek  

PR target/83930
* simplify-rtx.c (simplify_binary_operation_1) : Use
UINTVAL (trueop1) instead of INTVAL (op1).

* gcc.dg/pr83930.c: New test.

--- gcc/simplify-rtx.c.jj   2018-01-14 17:16:55.657836138 +0100
+++ gcc/simplify-rtx.c  2018-01-19 10:24:03.389017997 +0100
@@ -3411,7 +3411,8 @@ simplify_binary_operation_1 (enum rtx_co
   if (CONST_INT_P (trueop1)
  && exact_log2 (UINTVAL (trueop1)) > 0)
return simplify_gen_binary (AND, mode, op0,
-   gen_int_mode (INTVAL (op1) - 1, mode));
+   gen_int_mode (UINTVAL (trueop1) - 1,
+ mode));
   break;
 
 case MOD:
--- gcc/testsuite/gcc.dg/pr83930.c.jj   2018-01-19 10:33:16.657831745 +0100
+++ gcc/testsuite/gcc.dg/pr83930.c  2018-01-19 10:31:55.383859102 +0100
@@ -0,0 +1,17 @@
+/* PR target/83930 */
+/* { dg-do compile } */
+/* { dg-options "-Og -fno-tree-ccp -w" } */
+
+unsigned __attribute__ ((__vector_size__ (16))) v;
+
+static inline void
+bar (unsigned char d)
+{
+  v /= d;
+}
+
+__attribute__ ((always_inline)) void
+foo (void)
+{
+  bar (4);
+}

Jakub


Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Andreas Schwab
On Jan 19 2018, Bill Schmidt  wrote:

>> On Jan 19, 2018, at 3:09 PM, Jakub Jelinek  wrote:
>> 
>> On Fri, Jan 19, 2018 at 02:58:07PM -0600, Bill Schmidt wrote:
>>> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c (revision 
>>> 256894)
>>> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c (working copy)
>>> @@ -30,4 +30,4 @@ int foo (int x)
>>> 
>>> /* { dg-final { scan-assembler "crset 30" } } */
>>> /* { dg-final { scan-assembler "beqctr- 7" } } */
>>> -/* { dg-final { scan-assembler "b ." } } */
>>> +/* { dg-final { scan-assembler "b $" } } */
>> 
>> Does $ in scan-assembler really match a literal $ and not end of line?
>> Looking around, most of scan-assembler patterns that want to match a $ use
>> \\\$
>
> Right.  Working on getting the right number of backslashes in here...
> I can never remember which ones need one and which need 3.

Use braces.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Bill Schmidt

> On Jan 19, 2018, at 3:09 PM, Jakub Jelinek  wrote:
> 
> On Fri, Jan 19, 2018 at 02:58:07PM -0600, Bill Schmidt wrote:
>> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c  (revision 
>> 256894)
>> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c  (working copy)
>> @@ -30,4 +30,4 @@ int foo (int x)
>> 
>> /* { dg-final { scan-assembler "crset 30" } } */
>> /* { dg-final { scan-assembler "beqctr- 7" } } */
>> -/* { dg-final { scan-assembler "b ." } } */
>> +/* { dg-final { scan-assembler "b $" } } */
> 
> Does $ in scan-assembler really match a literal $ and not end of line?
> Looking around, most of scan-assembler patterns that want to match a $ use
> \\\$

Right.  Working on getting the right number of backslashes in here...
I can never remember which ones need one and which need 3.

Thanks...

Bill
> 
>> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c
>> ===
>> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c  (revision 
>> 256894)
>> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c  (working copy)
>> @@ -49,4 +49,4 @@ int foo (int x)
>> 
>> /* { dg-final { scan-assembler "crset 30" } } */
>> /* { dg-final { scan-assembler "beqctr- 7" } } */
>> -/* { dg-final { scan-assembler "b ." } } */
>> +/* { dg-final { scan-assembler "b $" } } */
>> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c
>> ===
>> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c  (revision 
>> 256894)
>> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c  (working copy)
>> @@ -12,4 +12,4 @@ int bar ()
>> 
>> /* { dg-final { scan-assembler "crset eq" } } */
>> /* { dg-final { scan-assembler "beqctr-" } } */
>> -/* { dg-final { scan-assembler "b ." } } */
>> +/* { dg-final { scan-assembler "b $" } } */
> 
>   Jakub



Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Jakub Jelinek
On Fri, Jan 19, 2018 at 02:58:07PM -0600, Bill Schmidt wrote:
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   (working copy)
> @@ -30,4 +30,4 @@ int foo (int x)
>  
>  /* { dg-final { scan-assembler "crset 30" } } */
>  /* { dg-final { scan-assembler "beqctr- 7" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler "b $" } } */

Does $ in scan-assembler really match a literal $ and not end of line?
Looking around, most of scan-assembler patterns that want to match a $ use
\\\$

> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c   (working copy)
> @@ -49,4 +49,4 @@ int foo (int x)
>  
>  /* { dg-final { scan-assembler "crset 30" } } */
>  /* { dg-final { scan-assembler "beqctr- 7" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler "b $" } } */
> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c   (working copy)
> @@ -12,4 +12,4 @@ int bar ()
>  
>  /* { dg-final { scan-assembler "crset eq" } } */
>  /* { dg-final { scan-assembler "beqctr-" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler "b $" } } */

Jakub
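
On the question of whether "$" matches a literal dollar sign: in
regular-expression syntax an unescaped "$" anchors at the end of the input and
an unescaped "." matches any character, so neither "b ." nor "b $" checks for
the intended literal.  A minimal sketch in C using POSIX regcomp/regexec, which
is an assumption made for illustration only; scan-assembler goes through Tcl's
regexp, whose quoting layers differ, and regex_matches is a hypothetical helper:

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regex PATTERN matches anywhere in TEXT,
   0 if it does not, and -1 if PATTERN fails to compile.  */
static int
regex_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With the assembler line "\tb $" as input, the pattern "b $" does not match,
the pattern "b ." matches (but would also match "b" followed by anything), and
the escaped pattern, written "b \\$" in a C string, matches the literal.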


Re: [PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Bill Schmidt
I see that David already proposed this same patch in PR83946.  Sorry, I've 
gotten behind on my email.

Two changes I need:  The scan-assembler patterns should have \$ rather than $ in
them, and I should add PR83946 to the ChangeLog.

Sorry for the noise.

-- Bill

Bill Schmidt, Ph.D.
STSM, GCC Architect for Linux on POWER
Linux on Power Toolchain
IBM Linux Technology Center
wschm...@linux.vnet.ibm.com

> On Jan 19, 2018, at 2:58 PM, Bill Schmidt  wrote:
> 
> Hi,
> 
> My recent patches to trunk and gcc-7-branch for avoiding speculation of
> indirect branches have a flaw, pointed out by David.  Usage of "." to
> represent the program counter is not portable across all POWER
> assemblers, particularly not being accepted on AIX.  "$" is the 
> universally accepted alternative.  So change the code and the test
> cases to use $ instead of . for this purpose.
> 
> Regstrap is in progress on powerpc64-linux-gnu and powerpc64le-linux-gnu.
> Assuming no issues are found, is this okay for trunk and backport to 7?
> 
> Thanks,
> Bill
> 
> 
> [gcc]
> 
> 2018-01-19  Bill Schmidt  
> 
>   * config/rs6000/rs6000.md (*sibcall_nonlocal_sysv): Change
>   assembly output from . to $.
>   (*sibcall_value_nonlocal_sysv): Likewise.
>   (indirect_jump_nospec): Likewise.
>   (*tablejump_internal1_nospec): Likewise.
> 
> [gcc/testsuite]
> 
> 2018-01-19  Bill Schmidt  
> 
>   * gcc.target/powerpc/safe-indirect-jump-2.c: Change expected
>   assembly output from . to $.
>   * gcc.target/powerpc/safe-indirect-jump-3.c: Likewise.
>   * gcc.target/powerpc/safe-indirect-jump-8.c: Likewise.
> 
> 
> Index: gcc/config/rs6000/rs6000.md
> ===
> --- gcc/config/rs6000/rs6000.md   (revision 256894)
> +++ gcc/config/rs6000/rs6000.md   (working copy)
> @@ -10987,7 +10987,7 @@
>   return \"b%T0\";
>   else
>   /* Can use CR0 since it is volatile across sibcalls.  */
> - return \"crset eq\;beq%T0-\;b .\";
> + return \"crset eq\;beq%T0-\;b $\";
> }
>   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
> {
> @@ -11044,7 +11044,7 @@
>   return \"b%T1\";
>   else
>   /* Can use CR0 since it is volatile across sibcalls.  */
> - return \"crset eq\;beq%T1-\;b .\";
> + return \"crset eq\;beq%T1-\;b $\";
> }
>   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
> {
> @@ -12566,7 +12566,7 @@
>   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))
>(clobber (match_operand:CC 1 "cc_reg_operand" "=y,y"))]
>   "!rs6000_speculate_indirect_jumps"
> -  "crset %E1\;beq%T0- %1\;b ."
> +  "crset %E1\;beq%T0- %1\;b $"
>   [(set_attr "type" "jmpreg")
>(set_attr "length" "12")])
> 
> @@ -12672,7 +12672,7 @@
>(use (label_ref (match_operand 1)))
>(clobber (match_operand:CC 2 "cc_reg_operand" "=y,y"))]
>   "!rs6000_speculate_indirect_jumps"
> -  "crset %E2\;beq%T0- %2\;b ."
> +  "crset %E2\;beq%T0- %2\;b $"
>   [(set_attr "type" "jmpreg")
>(set_attr "length" "12")])
> 
> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c   (working copy)
> @@ -30,4 +30,4 @@ int foo (int x)
> 
> /* { dg-final { scan-assembler "crset 30" } } */
> /* { dg-final { scan-assembler "beqctr- 7" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler "b $" } } */
> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c   (working copy)
> @@ -49,4 +49,4 @@ int foo (int x)
> 
> /* { dg-final { scan-assembler "crset 30" } } */
> /* { dg-final { scan-assembler "beqctr- 7" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler "b $" } } */
> Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c   (revision 
> 256894)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c   (working copy)
> @@ -12,4 +12,4 @@ int bar ()
> 
> /* { dg-final { scan-assembler "crset eq" } } */
> /* { dg-final { scan-assembler "beqctr-" } } */
> -/* { dg-final { scan-assembler "b ." } } */
> +/* { dg-final { scan-assembler "b $" } } */

[PATCH, rs6000] Use $ instead of . for PC

2018-01-19 Thread Bill Schmidt
Hi,

My recent patches to trunk and gcc-7-branch for avoiding speculation of
indirect branches have a flaw, pointed out by David.  Usage of "." to
represent the program counter is not portable across all POWER
assemblers, particularly not being accepted on AIX.  "$" is the 
universally accepted alternative.  So change the code and the test
cases to use $ instead of . for this purpose.

Regstrap is in progress on powerpc64-linux-gnu and powerpc64le-linux-gnu.
Assuming no issues are found, is this okay for trunk and backport to 7?

Thanks,
Bill


[gcc]

2018-01-19  Bill Schmidt  

* config/rs6000/rs6000.md (*sibcall_nonlocal_sysv): Change
assembly output from . to $.
(*sibcall_value_nonlocal_sysv): Likewise.
(indirect_jump_nospec): Likewise.
(*tablejump_internal1_nospec): Likewise.

[gcc/testsuite]

2018-01-19  Bill Schmidt  

* gcc.target/powerpc/safe-indirect-jump-2.c: Change expected
assembly output from . to $.
* gcc.target/powerpc/safe-indirect-jump-3.c: Likewise.
* gcc.target/powerpc/safe-indirect-jump-8.c: Likewise.


Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 256894)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -10987,7 +10987,7 @@
return \"b%T0\";
   else
/* Can use CR0 since it is volatile across sibcalls.  */
-   return \"crset eq\;beq%T0-\;b .\";
+   return \"crset eq\;beq%T0-\;b $\";
 }
   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
 {
@@ -11044,7 +11044,7 @@
return \"b%T1\";
   else
/* Can use CR0 since it is volatile across sibcalls.  */
-   return \"crset eq\;beq%T1-\;b .\";
+   return \"crset eq\;beq%T1-\;b $\";
 }
   else if (DEFAULT_ABI == ABI_V4 && flag_pic)
 {
@@ -12566,7 +12566,7 @@
   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))
(clobber (match_operand:CC 1 "cc_reg_operand" "=y,y"))]
   "!rs6000_speculate_indirect_jumps"
-  "crset %E1\;beq%T0- %1\;b ."
+  "crset %E1\;beq%T0- %1\;b $"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
@@ -12672,7 +12672,7 @@
(use (label_ref (match_operand 1)))
(clobber (match_operand:CC 2 "cc_reg_operand" "=y,y"))]
   "!rs6000_speculate_indirect_jumps"
-  "crset %E2\;beq%T0- %2\;b ."
+  "crset %E2\;beq%T0- %2\;b $"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c
===
--- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c (revision 
256894)
+++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-2.c (working copy)
@@ -30,4 +30,4 @@ int foo (int x)
 
 /* { dg-final { scan-assembler "crset 30" } } */
 /* { dg-final { scan-assembler "beqctr- 7" } } */
-/* { dg-final { scan-assembler "b ." } } */
+/* { dg-final { scan-assembler "b $" } } */
Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c
===
--- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c (revision 
256894)
+++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-3.c (working copy)
@@ -49,4 +49,4 @@ int foo (int x)
 
 /* { dg-final { scan-assembler "crset 30" } } */
 /* { dg-final { scan-assembler "beqctr- 7" } } */
-/* { dg-final { scan-assembler "b ." } } */
+/* { dg-final { scan-assembler "b $" } } */
Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c
===
--- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c (revision 
256894)
+++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-8.c (working copy)
@@ -12,4 +12,4 @@ int bar ()
 
 /* { dg-final { scan-assembler "crset eq" } } */
 /* { dg-final { scan-assembler "beqctr-" } } */
-/* { dg-final { scan-assembler "b ." } } */
+/* { dg-final { scan-assembler "b $" } } */

Re: [PATCH v2, rs6000] Implement 32- and 64-bit BE handling for -mno-speculate-indirect-jumps

2018-01-19 Thread Bill Schmidt
On Jan 19, 2018, at 12:32 PM, David Edelsohn  wrote:
> 
> This patch is incorrect for AIX.  Which also means that the backport
> to GCC 7 branch is incorrect for AIX and must be corrected before the
> release.
> 
> AIX assembler does not accept "." (period) as the current address.
> 
> "b ." is incorrect.  And testing for "b ." is incorrect.  I am going
> to try testing with "$" in trunk.
> 
> GCC 7.3 must be re-spun.

Thanks, David, patch in progress.

FYI, I missed the 7.3 RC deadline, so this will only get picked up on a respin
anyway.  So I should have the fix in place before that happens.

Bill
> 
> Thanks, David
> 
> On Tue, Jan 16, 2018 at 9:08 PM, Bill Schmidt
>  wrote:
>> Hi,
>> 
>> This patch supersedes and extends 
>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01479.html,
>> adding the remaining big-endian support for -mno-speculate-indirect-jumps.
>> This includes 32-bit support for indirect calls and sibling calls, and
>> 64-bit support for indirect calls.  The endian-neutral switch handling has
>> already been committed.
>> 
>> Using -m32 -O2 on safe-indirect-jumps-1.c results in a test for a sibling
>> call, so this has been added as safe-indirect-jumps-8.c.  Also,
>> safe-indirect-jumps-7.c adds a variant that will not generate a sibling
>> call for -m32, so we still get indirect call coverage.
>> 
>> Bootstrapped and tested on powerpc64-linux-gnu and powerpc64le-linux-gnu
>> with no regressions.  Is this okay for trunk?
>> 
>> Thanks,
>> Bill
>> 
>> 
>> [gcc]
>> 
>> 2018-01-16  Bill Schmidt  
>> 
>>* config/rs6000/rs6000.md (*call_indirect_nonlocal_sysv):
>>Generate different code for -mno-speculate-indirect-jumps.
>>(*call_value_indirect_nonlocal_sysv): Likewise.
>>(*call_indirect_aix): Disable for
>>-mno-speculate-indirect-jumps.
>>(*call_indirect_aix_nospec): New define_insn.
>>(*call_value_indirect_aix): Disable for
>>-mno-speculate-indirect-jumps.
>>(*call_value_indirect_aix_nospec): New define_insn.
>>(*sibcall_nonlocal_sysv): Generate different code for
>>-mno-speculate-indirect-jumps.
>>(*sibcall_value_nonlocal_sysv): Likewise.
>> 
>> [gcc/testsuite]
>> 
>> 2018-01-16  Bill Schmidt  
>> 
>>* gcc.target/powerpc/safe-indirect-jump-1.c: Remove endian
>>restriction, but still restrict to 64-bit.
>>* gcc.target/powerpc/safe-indirect-jump-7.c: New file.
>>* gcc.target/powerpc/safe-indirect-jump-8.c: New file.
>> 
>> 
>> Index: gcc/config/rs6000/rs6000.md
>> ===
>> --- gcc/config/rs6000/rs6000.md (revision 256753)
>> +++ gcc/config/rs6000/rs6000.md (working copy)
>> @@ -10453,10 +10453,35 @@
>>   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
>> output_asm_insn ("creqv 6,6,6", operands);
>> 
>> -  return "b%T0l";
>> +  if (rs6000_speculate_indirect_jumps
>> +  || which_alternative == 1 || which_alternative == 3)
>> +return "b%T0l";
>> +  else
>> +return "crset eq\;beq%T0l-";
>> }
>>   [(set_attr "type" "jmpreg,jmpreg,jmpreg,jmpreg")
>> -   (set_attr "length" "4,4,8,8")])
>> +   (set (attr "length")
>> +   (cond [(and (eq (symbol_ref "which_alternative") (const_int 0))
>> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
>> +   (const_int 1)))
>> + (const_string "4")
>> +  (and (eq (symbol_ref "which_alternative") (const_int 0))
>> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
>> +   (const_int 0)))
>> + (const_string "8")
>> +  (eq (symbol_ref "which_alternative") (const_int 1))
>> + (const_string "4")
>> +  (and (eq (symbol_ref "which_alternative") (const_int 2))
>> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
>> +   (const_int 1)))
>> + (const_string "8")
>> +  (and (eq (symbol_ref "which_alternative") (const_int 2))
>> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
>> +   (const_int 0)))
>> + (const_string "12")
>> +  (eq (symbol_ref "which_alternative") (const_int 3))
>> + (const_string "8")]
>> + (const_string "4")))])
>> 
>> (define_insn_and_split "*call_nonlocal_sysv"
>>   [(call (mem:SI (match_operand:P 0 "symbol_ref_operand" "s,s"))
>> @@ -10541,10 +10566,35 @@
>>   else if (INTVAL (operands[3]) & CALL_V4_CLEAR_FP_ARGS)
>> output_asm_insn ("creqv 6,6,6", operands);
>> 
>> -  return "b%T1l";
>> +  if (rs6000_speculate_indirect_jumps
>> +  || which_alternative == 1 || which_alternative == 3)
>> +return "b%T1l";
>> +  else
>> +return "crset eq\;beq%T1l-";
>> }
>>   [(set_attr "type" 

Re: [PATCH v3, rs6000] Add -mspeculate-indirect-jumps option and implement non-speculating bctr / bctrl

2018-01-19 Thread Segher Boessenkool
On Fri, Jan 19, 2018 at 10:01:50AM -0700, Sandra Loosemore wrote:
> 
> I see no documentation for the new option here  :-(

+;; -mno-speculate-indirect-jumps adds deliberate misprediction to indirect
+;; branches via the CTR.
+mspeculate-indirect-jumps
+Target Undocumented Var(rs6000_speculate_indirect_jumps) Init(1) Save

It's undocumented on purpose.  I don't know if that is the best idea.


Segher


Re: [C++ PATCH] Speed up inplace_merge algorithm & fix inefficient logic(PR c++/83938)

2018-01-19 Thread Jason Merrill
This is a libstdc++ bug and patch, not the C++ front end.  So I'm
adding the libstdc++ list to CC.

On Fri, Jan 19, 2018 at 3:02 AM, chang jc  wrote:
> Current std::inplace_merge() suffers from performance issue by inefficient
> logic under limited memory,
>
> It leads to performance downgrade.
>
> Please help to review it.
>
> Index: include/bits/stl_algo.h
> ===
> --- include/bits/stl_algo.h (revision 256871)
> +++ include/bits/stl_algo.h (working copy)
> @@ -2437,7 +2437,7 @@
>   _BidirectionalIterator __second_cut = __middle;
>   _Distance __len11 = 0;
>   _Distance __len22 = 0;
> - if (__len1 > __len2)
> + if (__len1 < __len2)
> {
>   __len11 = __len1 / 2;
>   std::advance(__first_cut, __len11);
> @@ -2539,9 +2539,15 @@
>const _DistanceType __len1 = std::distance(__first, __middle);
>const _DistanceType __len2 = std::distance(__middle, __last);
>
> +
>typedef _Temporary_buffer<_BidirectionalIterator, _ValueType> _TmpBuf;
> -  _TmpBuf __buf(__first, __last);
> -
> +  _BidirectionalIterator __start, __end;
> +  if (__len1 < __len2) {
> +   __start = __first; __end = __middle;
> +  } else {
> +   __start = __middle; __end = __last;
> +  }
> +  _TmpBuf __buf(__start, __end);
>if (__buf.begin() == 0)
> std::__merge_without_buffer
>   (__first, __middle, __last, __len1, __len2, __comp);
> Index: include/bits/stl_tempbuf.h
> ===
> --- include/bits/stl_tempbuf.h  (revision 256871)
> +++ include/bits/stl_tempbuf.h  (working copy)
> @@ -95,7 +95,7 @@
> std::nothrow));
>   if (__tmp != 0)
> return std::pair<_Tp*, ptrdiff_t>(__tmp, __len);
> - __len /= 2;
> + __len = (__len + 1) / 2;
> }
>return std::pair<_Tp*, ptrdiff_t>(static_cast<_Tp*>(0), 0);
>  }
>
>
>
>
> Thanks
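
One point worth checking in the stl_tempbuf.h hunk is how the retry length
shrinks after a failed allocation.  The hunk does not show the enclosing loop;
the sketch below assumes it keeps retrying while the length is greater than
zero.  Under that assumption, "__len /= 2" reaches zero after a few failures,
whereas "(__len + 1) / 2" stays at 1 once it gets there.  attempts_until_zero is
a hypothetical helper that models every allocation failing:

```c
#include <stddef.h>

/* Count the failed attempts made before LEN reaches zero when every
   allocation fails, giving up after CAP attempts.  ROUND_UP selects the
   proposed (len + 1) / 2 step instead of the existing len / 2 step.  */
static size_t
attempts_until_zero (size_t len, int round_up, size_t cap)
{
  size_t attempts = 0;

  while (len > 0 && attempts < cap)
    {
      attempts++;
      len = round_up ? (len + 1) / 2 : len / 2;
    }
  return attempts;
}
```

Starting from 10, the truncating step tries 10, 5, 2, 1 and then stops; the
rounding-up step tries 10, 5, 3, 2, 1, 1, ... and only ends because of the cap.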


Re: Small C++ tweak to fold_simple

2018-01-19 Thread Jason Merrill
On Fri, Jan 19, 2018 at 8:28 AM, Marek Polacek  wrote:
> fold_simple struck me as odd, certainly the NULL_TREE assignment is
> unnecessary, so this patch simplifies it a bit.

OK.

> I've been confused about the commentary too, what is the difference between
> "constant-expressions" and "constexpressions"?  We should clarify that.

I have no idea, I think it's probably an editing mistake that can be removed.

Jason


[PATCH, rs6000] Testcase fix-ups for gimple-folding intrinsic tests.

2018-01-19 Thread Will Schmidt
Hi,
  Some testcase fix-ups affecting the gimple-folding tests, including:
- Add xxspltib as a valid instruction for vec-abs tests (power9).
- Fix up mismatched dg-require and dg-option values for the
fold-vec-shift-left-longlong tests (fixes/eliminates a test failure on p6).

Sniff tested across P6,P8,P9.

OK for trunk?
Thanks,
-Will

[testsuite]

2018-01-19  Will Schmidt 

* gcc.target/powerpc/fold-vec-abs-short-fwrapv.c: Add xxspltib to
scan-assembler valid instructions list.
* gcc.target/powerpc/fold-vec-abs-short.c: Same.
* gcc.target/powerpc/fold-vec-shift-left-longlong.c: Clean up
power8-vector requirement and option.
* gcc.target/powerpc/fold-vec-shift-left-longlong-fwrapv.c: Same.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-short-fwrapv.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-short-fwrapv.c
index 2562179..705bbe9 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-short-fwrapv.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-short-fwrapv.c
@@ -11,8 +11,8 @@ vector signed short
 test3 (vector signed short x)
 {
   return vec_abs (x);
 }
 
-/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vspltisw|xxspltib|vxor" 1 } } */
 /* { dg-final { scan-assembler-times "vsubuhm" 1 } } */
 /* { dg-final { scan-assembler-times "vmaxsh" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-short.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-short.c
index d312000..0ad850f 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-short.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-short.c
@@ -11,8 +11,8 @@ vector signed short
 test3 (vector signed short x)
 {
   return vec_abs (x);
 }
 
-/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vspltisw|xxspltib|vxor" 1 } } */
 /* { dg-final { scan-assembler-times "vsubuhm" 1 } } */
 /* { dg-final { scan-assembler-times "vmaxsh" 1 } } */
diff --git 
a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong-fwrapv.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong-fwrapv.c
index b776683..486426a 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong-fwrapv.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong-fwrapv.c
@@ -1,11 +1,11 @@
 /* Verify that overloaded built-ins for vec_sl produce the right results.  */
 /* This test covers the shift left tests with the -fwrapv option. */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target powerpc_altivec_ok } */
-/* { dg-options "-maltivec -O2 -mpower8-vector -fwrapv" } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2 -fwrapv" } */
 
 #include <altivec.h>
 
 vector signed long long
 testsl_signed_longlong (vector signed long long x, vector unsigned long long y)
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong.c
index f040486..4116dbc 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong.c
@@ -1,11 +1,11 @@
 /* cross section of shift tests specific for shift-left.
  * This is a counterpart to the fold-vec-shift-left-frwapv test.  */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target powerpc_altivec_ok } */
-/* { dg-options "-maltivec -mpower8-vector -O2" } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2" } */
 
 #include <altivec.h>
 
 vector signed long long
 testsl_signed_longlong (vector signed long long x, vector unsigned long long y)




Re: [PATCH v2, rs6000] Implement 32- and 64-bit BE handling for -mno-speculate-indirect-jumps

2018-01-19 Thread David Edelsohn
This patch is incorrect for AIX.  Which also means that the backport
to GCC 7 branch is incorrect for AIX and must be corrected before the
release.

AIX assembler does not accept "." (period) as the current address.

"b ." is incorrect.  And testing for "b ." is incorrect.  I am going
to try testing with "$" in trunk.

GCC 7.3 must be re-spun.

Thanks, David

On Tue, Jan 16, 2018 at 9:08 PM, Bill Schmidt
 wrote:
> Hi,
>
> This patch supersedes and extends 
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01479.html,
> adding the remaining big-endian support for -mno-speculate-indirect-jumps.
> This includes 32-bit support for indirect calls and sibling calls, and
> 64-bit support for indirect calls.  The endian-neutral switch handling has
> already been committed.
>
> Using -m32 -O2 on safe-indirect-jumps-1.c results in a test for a sibling
> call, so this has been added as safe-indirect-jumps-8.c.  Also,
> safe-indirect-jumps-7.c adds a variant that will not generate a sibling
> call for -m32, so we still get indirect call coverage.
>
> Bootstrapped and tested on powerpc64-linux-gnu and powerpc64le-linux-gnu
> with no regressions.  Is this okay for trunk?
>
> Thanks,
> Bill
>
>
> [gcc]
>
> 2018-01-16  Bill Schmidt  
>
> * config/rs6000/rs6000.md (*call_indirect_nonlocal_sysv):
> Generate different code for -mno-speculate-indirect-jumps.
> (*call_value_indirect_nonlocal_sysv): Likewise.
> (*call_indirect_aix): Disable for
> -mno-speculate-indirect-jumps.
> (*call_indirect_aix_nospec): New define_insn.
> (*call_value_indirect_aix): Disable for
> -mno-speculate-indirect-jumps.
> (*call_value_indirect_aix_nospec): New define_insn.
> (*sibcall_nonlocal_sysv): Generate different code for
> -mno-speculate-indirect-jumps.
> (*sibcall_value_nonlocal_sysv): Likewise.
>
> [gcc/testsuite]
>
> 2018-01-16  Bill Schmidt  
>
> * gcc.target/powerpc/safe-indirect-jump-1.c: Remove endian
> restriction, but still restrict to 64-bit.
> * gcc.target/powerpc/safe-indirect-jump-7.c: New file.
> * gcc.target/powerpc/safe-indirect-jump-8.c: New file.
>
>
> Index: gcc/config/rs6000/rs6000.md
> ===
> --- gcc/config/rs6000/rs6000.md (revision 256753)
> +++ gcc/config/rs6000/rs6000.md (working copy)
> @@ -10453,10 +10453,35 @@
>else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
>  output_asm_insn ("creqv 6,6,6", operands);
>
> -  return "b%T0l";
> +  if (rs6000_speculate_indirect_jumps
> +  || which_alternative == 1 || which_alternative == 3)
> +return "b%T0l";
> +  else
> +return "crset eq\;beq%T0l-";
>  }
>[(set_attr "type" "jmpreg,jmpreg,jmpreg,jmpreg")
> -   (set_attr "length" "4,4,8,8")])
> +   (set (attr "length")
> +   (cond [(and (eq (symbol_ref "which_alternative") (const_int 0))
> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
> +   (const_int 1)))
> + (const_string "4")
> +  (and (eq (symbol_ref "which_alternative") (const_int 0))
> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
> +   (const_int 0)))
> + (const_string "8")
> +  (eq (symbol_ref "which_alternative") (const_int 1))
> + (const_string "4")
> +  (and (eq (symbol_ref "which_alternative") (const_int 2))
> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
> +   (const_int 1)))
> + (const_string "8")
> +  (and (eq (symbol_ref "which_alternative") (const_int 2))
> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
> +   (const_int 0)))
> + (const_string "12")
> +  (eq (symbol_ref "which_alternative") (const_int 3))
> + (const_string "8")]
> + (const_string "4")))])
>
>  (define_insn_and_split "*call_nonlocal_sysv"
>[(call (mem:SI (match_operand:P 0 "symbol_ref_operand" "s,s"))
> @@ -10541,10 +10566,35 @@
>else if (INTVAL (operands[3]) & CALL_V4_CLEAR_FP_ARGS)
>  output_asm_insn ("creqv 6,6,6", operands);
>
> -  return "b%T1l";
> +  if (rs6000_speculate_indirect_jumps
> +  || which_alternative == 1 || which_alternative == 3)
> +return "b%T1l";
> +  else
> +return "crset eq\;beq%T1l-";
>  }
>[(set_attr "type" "jmpreg,jmpreg,jmpreg,jmpreg")
> -   (set_attr "length" "4,4,8,8")])
> +   (set (attr "length")
> +   (cond [(and (eq (symbol_ref "which_alternative") (const_int 0))
> +   (eq (symbol_ref "rs6000_speculate_indirect_jumps")
> +   (const_int 1)))
> + (const_string "4")
> +  (and (eq (symbol_ref "which_alternative") (const_int 0))
> + 
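
The "length" attribute in the quoted hunk above is easier to read as a small
decision function.  A minimal sketch in C that restates the cond from the
patch; call_insn_length is a hypothetical name, and the two parameters stand
in for which_alternative and rs6000_speculate_indirect_jumps:

```c
/* Instruction length in bytes for *call_indirect_nonlocal_sysv, as given by
   the cond in the patch: alternatives 0 and 2 grow by one instruction
   (4 bytes) when speculation of indirect jumps is disabled.  */
static int
call_insn_length (int which_alternative, int speculate_indirect_jumps)
{
  switch (which_alternative)
    {
    case 0:
      return speculate_indirect_jumps ? 4 : 8;
    case 1:
      return 4;
    case 2:
      return speculate_indirect_jumps ? 8 : 12;
    case 3:
      return 8;
    default:
      /* Fallback value of the cond.  */
      return 4;
    }
}
```

Alternatives 1 and 3 keep their old lengths because the output template still
emits the plain "b%T0l" for them.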

Re: [PATCH, rs6000] Add 128-bit support for vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be() builtins.

2018-01-19 Thread Carl Love
On Fri, 2018-01-19 at 10:13 -0600, Segher Boessenkool wrote:
> On Thu, Jan 18, 2018 at 04:51:47PM -0600, Segher Boessenkool wrote:
> > > +(define_insn "vsx_ld_elemrev_v1ti"
> > > +  [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> > > +(vec_select:V1TI
> > > +   (match_operand:V1TI 1 "memory_operand" "Z")
> > > +   (parallel [(const_int 0)])))]
> > > +  "VECTOR_MEM_VSX_P (V1TImode) && !BYTES_BIG_ENDIAN"
> > > +{
> > > +  return "lxvd2x %x0,%y1; xxpermdi %x0,%x0,%x0,2";
> > > +}
> > We currently have exactly as many xxpermdi,2 as xxswapdi (147 each)
> > but the latter is more readable, please prefer that.
> 
> Ignore this part; I managed to fumble my grep commands.  We have *no*
> xxswapd in the source currently (well, one in comments, and 11
> xxswapdi
> but that is a misspelling); stage 4 is not the time to start using it
> (do all supported assemblers implement it, implement it correctly,
> etc.)
> 
> So your xxpermdi is the best for now.
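
For context on why "xxpermdi %x0,%x0,%x0,2" acts as a doubleword swap: the
sketch below models the instruction in C, assuming the Power ISA definition in
which the high bit of the 2-bit selector picks the doubleword taken from the
first source and the low bit picks the one taken from the second.  vec128 and
permdi are hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit register viewed as two doublewords; dw[0] is doubleword 0.  */
typedef struct
{
  uint64_t dw[2];
} vec128;

/* Model of xxpermdi: result doubleword 0 comes from A, result doubleword 1
   comes from B, each selected by one bit of DM.  */
static vec128
permdi (vec128 a, vec128 b, unsigned dm)
{
  vec128 r;

  assert (dm < 4);
  r.dw[0] = a.dw[(dm >> 1) & 1];
  r.dw[1] = b.dw[dm & 1];
  return r;
}
```

With both sources equal and a selector of 2, the result holds the input's
doubleword 1 followed by its doubleword 0, i.e. the two halves swapped.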


Segher:

Here are the key changes that I am testing now for vsx.md
and powerpc.exp.  Just making sure we are on the same page here.


diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 03f8ec2d6..6ea05e46e 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1093,6 +1093,18 @@ (define_insn "vsx_ld_elemrev_v2di"
   "lxvd2x %x0,%y1"
   [(set_attr "type" "vecload")])
 
+(define_insn "vsx_ld_elemrev_v1ti"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
+(vec_select:V1TI
+ (match_operand:V1TI 1 "memory_operand" "Z")
+ (parallel [(const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V1TImode) && !BYTES_BIG_ENDIAN"
+{
+   return "lxvd2x %x0,%y1\;xxpermdi %x0,%x0,%x0,2";   <<- Reverted change
+}
+  [(set_attr "type" "vecload")
+   (set_attr "length" "8")])
+
 (define_insn "vsx_ld_elemrev_v2df"
   [(set (match_operand:V2DF 0 "vsx_register_operand" "=wa")
 (vec_select:V2DF
@@ -1222,6 +1234,18 @@ (define_insn "*vsx_ld_elemrev_v16qi_internal"
   "lxvb16x %x0,%y1"
   [(set_attr "type" "vecload")])
 
+(define_insn "vsx_st_elemrev_v1ti"
+  [(set (match_operand:V1TI 0 "memory_operand" "=Z")
+(vec_select:V1TI
+  (match_operand:V1TI 1 "vsx_register_operand" "+wa") <<--- Fix RTL to mention operand 1 change
+  (parallel [(const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V2DImode) && !BYTES_BIG_ENDIAN"
+{
+  return "xxpermdi %x1,%x1,%x1,2\;stxvd2x %x1,%y0";   <<-- Reverted change
+}
+  [(set_attr "type" "vecstore")
+   (set_attr "length" "8")])
+
 (define_insn "vsx_st_elemrev_v2df"
   [(set (match_operand:V2DF 0 "memory_operand" "=Z")
 (vec_select:V2DF
@@ -1272,7 +1296,7 @@ (define_expand "vsx_st_elemrev_v8hi"
 {
   if (!TARGET_P9_VECTOR)
 {
-  rtx subreg, perm[16], pcv;
+  rtx mem_subreg, subreg, perm[16], pcv;
   rtx tmp = gen_reg_rtx (V8HImode);
   /* 2 is leftmost element in register */
   unsigned int reorder[16] = {13,12,15,14,9,8,11,10,5,4,7,6,1,0,3,2};
@@ -1287,11 +1311,21 @@ (define_expand "vsx_st_elemrev_v8hi"
   emit_insn (gen_altivec_vperm_v8hi_direct (tmp, operands[1],
 operands[1], pcv));
   subreg = simplify_gen_subreg (V4SImode, tmp, V8HImode, 0);
-  emit_insn (gen_vsx_st_elemrev_v4si (subreg, operands[0]));
+  mem_subreg = simplify_gen_subreg (V4SImode, operands[0], V8HImode, 0);
+  emit_insn (gen_vsx_st_elemrev_v4si (mem_subreg, subreg));
   DONE;
 }
 })
 
+(define_insn "*vsx_st_elemrev_v2di_internal"
+  [(set (match_operand:V2DI 0 "memory_operand" "=Z")
+(vec_select:V2DI
+  (match_operand:V2DI 1 "vsx_register_operand" "wa")
+  (parallel [(const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V2DImode) && !BYTES_BIG_ENDIAN && TARGET_P9_VECTOR"
+  "stxvd2x %x1,%y0"
+  [(set_attr "type" "vecstore")])
+
 (define_insn "*vsx_st_elemrev_v8hi_internal"
   [(set (match_operand:V8HI 0 "memory_operand" "=Z")
 (ve
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c
index ed37424ca..de9b916de 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c
@@ -1,10 +1,13 @@
 /* { dg-do run } */
 /* { dg-require-effective-target vsx_hw } */
-/* { dg-options "-maltivec -mvsx" } */  
+/* { dg-options "-maltivec -mvsx" } */
 
 #include 
 #include  // vector
+
+#ifdef DEBUG
 #include 
+#endif
 
 void abort (void);
 
@@ -24,9 +27,11 @@ int main() {
    float data_f[100];
    double data_d[100];
-   
+  __uint128_t data_u128[100];
+  __int128_t data_128[100];
+
    signed long long disp;
-   
+
    vector signed char vec_c_expected1, vec_c_expected2, vec_c_result1, vec_c_result2;
    vector unsigned char vec_uc_expected1, vec_uc_expected2, vec_uc_result1, vec_uc_result2;
@@ -42,11 +47,13 @@ int main() {
    vec_sll_result1, vec_sll_result2;

Re: [PATCH] Add gcc.dg/stack-check-16.c

2018-01-19 Thread Jeff Law
On 01/19/2018 04:22 AM, Jakub Jelinek wrote:
> Hi!
> 
> This patch adds a new testcase; I'm not exactly sure what its exact
> origin is or what the problem was.  The changes I've done are:
> 1) macroize, so that the test is just a few lines rather than 160KB,
>verified -fdump-tree-gimple printf call is identical between this
>and the original test
> 2) remove optimize(0) attribute, the test is compiled with -O0
> 3) use __builtin_alloca instead of alloca and add mtrace prototype,
>remove -w because no warnings are emitted any longer
> 
> The test passes at least on x86_64-linux with -m32/-m64.  Ok for trunk?
> 
> 2018-01-19  Jeff Law  
>   Jakub Jelinek  
> 
>   * gcc.dg/stack-check-16.c: New test.
OK.

I've been wandering my mailboxes to find the discussion which led to
this test, but can't seem to find it.  Based on the structure of the
test I'm pretty sure it is related to a large outgoing argument area.

The fact that it wasn't submitted upstream, but was in the aarch64
bundle means it was probably related to the need to probe the outgoing
argument area which is specific to aarch64.

Without any scanning of dump files or assembly output it must have been
an ICE in that code.  The thing that worries me is I can't recall fixing
such an ICE!

But again, OK for the trunk.

Jeff



Re: [PATCH, rs6000] Add 128-bit support for vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be() builtins.

2018-01-19 Thread Carl Love
On Fri, 2018-01-19 at 10:13 -0600, Segher Boessenkool wrote:
> On Thu, Jan 18, 2018 at 04:51:47PM -0600, Segher Boessenkool wrote:
> > > +(define_insn "vsx_ld_elemrev_v1ti"
> > > +  [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> > > +(vec_select:V1TI
> > > +   (match_operand:V1TI 1 "memory_operand" "Z")
> > > +   (parallel [(const_int 0)])))]
> > > +  "VECTOR_MEM_VSX_P (V1TImode) && !BYTES_BIG_ENDIAN"
> > > +{
> > > +  return "lxvd2x %x0,%y1; xxpermdi %x0,%x0,%x0,2";
> > > +}
> > We currently have exactly as many xxpermdi,2 as xxswapdi (147 each)
> > but the latter is more readable, please prefer that.
> 
> Ignore this part; I managed to fumble my grep commands.  We have *no*
> xxswapd in the source currently (well, one in comments, and 11
> xxswapdi
> but that is a misspelling); stage 4 is not the time to start using it
> (do all supported assemblers implement it, implement it correctly,
> etc.)
> 
> So your xxpermdi is the best for now.
> 

I was going to ask you about that again.  I seem to be getting
regressions with it for gcc -O0 builtins-4-runnable.c.  Will revert and
retest.  

Carl 



Re: [PATCH] rl78 anddi3 improvement

2018-01-19 Thread DJ Delorie

Jeff Law  writes:
> So I think you're ultimately far better off determining why GCC does not
> generate efficient code for 64bit logicals on the rl78 target.

In thinking about this more, one possible reason is that rl78 has an
8-bit WORD_MODE.  Which means DImode operations are not reduced to
SImode, they're reduced to QImode.  If you want SImode instead, you need
to intervene.
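
To make the point above concrete, here is a rough sketch in plain C. It is
not rl78 or GCC code, and both function names are made up for illustration.
It only shows the difference in operation count: with an 8-bit word, a
64-bit AND decomposes into eight byte-sized ANDs, while splitting by hand
gives two 32-bit ANDs.

```c
#include <stdint.h>

/* Roughly what lowering to an 8-bit word amounts to: one AND per byte
   of the 64-bit value (eight operations in total).  */
uint64_t and64_by_bytes (uint64_t a, uint64_t b)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      uint8_t ab = (uint8_t) (a >> (8 * i));
      uint8_t bb = (uint8_t) (b >> (8 * i));
      r |= (uint64_t) (uint8_t) (ab & bb) << (8 * i);
    }
  return r;
}

/* The manual alternative: split into two 32-bit halves (two operations).  */
uint64_t and64_by_halves (uint64_t a, uint64_t b)
{
  uint32_t lo = (uint32_t) a & (uint32_t) b;
  uint32_t hi = (uint32_t) (a >> 32) & (uint32_t) (b >> 32);
  return ((uint64_t) hi << 32) | lo;
}
```

Both functions compute the same value as a & b; only the granularity of the
individual operations differs.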


Re: FW: [PATCH] rl78 umaxdi3 improvement

2018-01-19 Thread DJ Delorie

"Sebastian Perta"  writes:
>   * config/rl78/rl78.md: New define_expand "umaxdi3".
>   * config/rl78/rl78.md: New define_expand "smaxdi3".
>   * config/rl78/rl78.md: New define_expand "smindi3".
>   * config/rl78/rl78.md: New define_expand "umindi3".
>   * config/rl78/rl78.md: New define_expand "anddi3".
>
>   * config/rl78/rl78-protos.h: New function declaration rl78_split_movdi
>   * config/rl78/rl78.md: New define_expand "movdi"
>   * config/rl78/rl78.c: New function definition rl78_split_movdi
>
>   * config/rl78/rl78-expand.md: New define_expand "bswaphi2"
>   * config/rl78/rl78-virt.md: New define_insn "*bswaphi2_virt"
>   * config/rl78/rl78-real.md: New define_insn "*bswaphi2_real"

These are OK.  thanks!


Re: [PATCH] RL78 UNUSED note setting bug fix in rl78_note_reg_set

2018-01-19 Thread DJ Delorie

Sebastian Perta  writes:
> * config/rl78/rl78.c (rl78_note_reg_set): fixed dead reg check
> for non-QImode registers

This is OK.  Thanks!

Note: in the future, ChangeLog entries should be provided separately from
the patch; they rarely apply cleanly anyway.

> Index: config/rl78/rl78.c
> ===
> --- config/rl78/rl78.c(revision 256590)
> +++ config/rl78/rl78.c(working copy)
> @@ -3792,7 +3792,7 @@
>  rl78_note_reg_set (char *dead, rtx d, rtx insn)
>  {
>int r, i;
> -
> +  bool is_dead;
>if (GET_CODE (d) == MEM)
>  rl78_note_reg_uses (dead, XEXP (d, 0), insn);
>
> @@ -3799,9 +3799,15 @@
>if (GET_CODE (d) != REG)
>  return;
>
> + /* Do not mark the reg unused unless all QImode parts of it are dead.  */
>r = REGNO (d);
> -  if (dead [r])
> -add_reg_note (insn, REG_UNUSED, gen_rtx_REG (GET_MODE (d), r));
> +  is_dead = true;
> +  for (i = 0; i < GET_MODE_SIZE (GET_MODE (d)); i ++)
> +  if (!dead [r + i])
> +  is_dead = false;
> +  if(is_dead)
> +add_reg_note (insn, REG_UNUSED, gen_rtx_REG (GET_MODE (d), r));
>if (dump_file)
>  fprintf (dump_file, "note set reg %d size %d\n", r, GET_MODE_SIZE 
> (GET_MODE (d)));
>for (i = 0; i < GET_MODE_SIZE (GET_MODE (d)); i ++)
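
The check added by the patch can be sketched on its own, outside GCC. This
is a simplified model under stated assumptions: DEAD is a plain array with
one entry per 8-bit hard register, and a value of SIZE bytes in register R
occupies R .. R + SIZE - 1. The helper name all_parts_dead is hypothetical
and does not appear in the patch.

```c
#include <stdbool.h>

/* Return true only if every 8-bit part of the SIZE-byte value starting
   at hard register R is marked dead.  */
bool all_parts_dead (const char *dead, int r, int size)
{
  for (int i = 0; i < size; i++)
    if (!dead[r + i])
      return false;
  return true;
}
```

With dead = {1, 1, 0, 1}, a 2-byte value in register 0 counts as unused, but
a 4-byte value in register 0 does not, because its third byte is still live.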


Re: [PATCH][ARM] Fix test fail with conflicting -mfloat-abi

2018-01-19 Thread Kyrill Tkachov


On 16/01/18 10:31, Sudakshina Das wrote:

Hi Christophe

On 12/01/18 18:32, Christophe Lyon wrote:

On 12 Jan 2018 at 15:26, "Sudakshina Das" wrote:

Hi

This patch fixes my earlier test case, which fails for arm-none-eabi
when an explicit user option for -mfloat-abi conflicts with
the test case options. I have added a guard to skip the test
in those cases.

@Christophe:
Sorry about this. I think this should fix the test case.
Can you please confirm if this works for you?


Yes it does thanks


Thanks for checking that. I have added one more directive for armv5t as well to 
avoid any conflicts for mcpu options.



I agree with what Sudi said in 
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01422.html
I'd rather keep the test in the generic torture suite as long as we get the 
directives right.

So this is ok for trunk (as the changes are arm-specific directives) with one 
change below:

Thanks,
Kyrill


Sudi




Thanks
Sudi

gcc/testsuite/ChangeLog

2018-01-12  Sudakshina Das  

 * gcc.c-torture/compile/pr82096.c: Add dg-skip-if
 directive.






diff --git a/gcc/testsuite/gcc.c-torture/compile/pr82096.c b/gcc/testsuite/gcc.c-torture/compile/pr82096.c
index 9fed28c..35551f5 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr82096.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr82096.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target arm_arch_v5t_ok } */

Please also guard this on { target arm*-*-* }.
That way the test will still run on other targets, so that they can
benefit from it as well.

+/* { dg-skip-if "Do not combine float-abi values" { arm*-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-additional-options "-march=armv5t -mthumb -mfloat-abi=soft" { target arm*-*-* } } */




Re: [PATCH,AIX] Optimize parsing of include files.

2018-01-19 Thread Ian Lance Taylor
On Fri, Jan 19, 2018 at 8:36 AM, David Edelsohn  wrote:
> On Thu, Jan 18, 2018 at 9:56 AM, REIX, Tony  wrote:
>>
>> Description:
>>  * This patch optimizes the time required for parsing the include files.
>>
>> Tests:
>>  * AIX: Build: SUCCESS
>>- build made by means of gmake on AIX.
>>
>> ChangeLog:
>>   * xcoff.c: Optimize parsing of include files.
>
> Hi, Tony
>
> The ChangeLog should be more detailed.
>
> * xcoff.c (xcoff_incl_compare): New function.
> (xcoff_incl_search): New function.
> (xcoff_process_linenos): Use bsearch to find include file.
> (xcoff_initialize_fileline): Sort include file information.
>
>
> The rest is okay, although the calls to bsearch and backtrace_qsort
> don't follow the style of other files.
>
> This is okay.

I committed the patch with that ChangeLog entry.

Ian
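
For readers who have not seen the patch itself, the sort-then-bsearch idea
named in the ChangeLog can be sketched as follows. This is only a sketch
under assumptions: the struct layout, the field names and the helpers
incl_sort/incl_find are hypothetical, the ranges are assumed not to overlap,
and standard qsort stands in for libbacktrace's backtrace_qsort.

```c
#include <stdlib.h>

/* Hypothetical record: an include file covering lines BEGIN..END.  */
struct incl
{
  const char *filename;
  unsigned begin;
  unsigned end;
};

/* qsort comparator: order records by their first line.  */
static int incl_compare (const void *a, const void *b)
{
  const struct incl *x = a, *y = b;
  return (x->begin > y->begin) - (x->begin < y->begin);
}

/* bsearch comparator: KEY points to a line number; a record matches
   when the line falls inside its BEGIN..END range.  */
static int incl_search (const void *key, const void *elt)
{
  unsigned line = *(const unsigned *) key;
  const struct incl *e = elt;
  if (line < e->begin)
    return -1;
  if (line > e->end)
    return 1;
  return 0;
}

/* Sort once, when the table is built.  */
void incl_sort (struct incl *v, size_t n)
{
  qsort (v, n, sizeof *v, incl_compare);
}

/* Look up the include file containing LINE; NULL if there is none.  */
const struct incl *incl_find (const struct incl *v, size_t n, unsigned line)
{
  return bsearch (&line, v, n, sizeof *v, incl_search);
}
```

Sorting once and then using a binary search per line number replaces a
linear scan of the table for every lookup.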


[PATCH PR82604]Fix regression in ftree-parallelize-loops

2018-01-19 Thread Bin Cheng
Hi,
This patch is supposed to fix a regression caused by loop distribution when
-ftree-parallelize-loops is used.  The reason is that a distributed memset
call can't be understood/analyzed in data reference analysis; as a result,
parloop can only parallelize the innermost 2-level loop nest.  Before the
distribution change, parloop could parallelize the innermost 3-level loop
nest, i.e., more parallelization.
As commented in the PR, ideally loop distribution should be able to
distribute a memset call for the 3-level loop nest.  Unfortunately this
requires sophisticated work proving equality between tree expressions, which
GCC is not good at now.
Another fix is to improve data reference analysis so that memset calls
can be supported.  We don't know how big this change is, and it's definitely
not a GCC 8 task.

So this patch fixes the regression in a somewhat hacky way.  First, it enables
3-level loop nest distribution when flag_tree_parloops > 1.  Second, it
supports 3-level loop nest distribution for a ZERO-ing stmt which can only
be distributed as a loop (nest) of memset calls, but can't be distributed as
a single memset.  The overall effect is that the ZERO-ing stmt will be
distributed one loop deeper than now, so parloop can parallelize as before.
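
To illustrate what "a loop (nest) of memset, but not a single memset" can
look like at source level, here is a small sketch. It is my own example, not
taken from the PR or the patch: each innermost loop zeroes only the first M
of N elements in a row, so the zeroed chunks are not adjacent in memory and
cannot be merged into one call. Note that the patch does not emit memset
calls for this case; it keeps such a partition as an ordinary loop nest. The
second function only shows the equivalence.

```c
#include <string.h>

enum { N = 4, M = 3 };  /* M < N: each row is only partly zeroed */

/* Original shape: a 3-level nest of zeroing stores.  */
void zero_nest (int arr[N][N][N])
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      for (int k = 0; k < M; k++)
        arr[i][j][k] = 0;
}

/* Equivalent "loop of memset": the innermost loop is one contiguous
   block, but consecutive blocks leave a gap of N - M elements.  */
void zero_memset_loop (int arr[N][N][N])
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      memset (arr[i][j], 0, M * sizeof (int));
}
```

If M were equal to N, the whole nest would cover one contiguous block and a
single memset over the array would be enough.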

Bootstrap and test on x86_64 and AArch64 ongoing.  Is it OK if no errors?

Thanks,
bin
2018-01-19  Bin Cheng  

PR tree-optimization/82604
* tree-loop-distribution.c (enum partition_kind): New enum item
PKIND_PARTIAL_MEMSET.
(partition_builtin_p): Support above new enum item.
(generate_code_for_partition): Ditto.
(compute_access_range): Differentiate cases that equality can be
proven at all loops, the innermost loops or no loops.
(classify_builtin_st, classify_builtin_ldst): Adjust call to above
function.  Set PKIND_PARTIAL_MEMSET for partition appropriately.
(finalize_partitions, distribute_loop): Don't fuse partition of
PKIND_PARTIAL_MEMSET kind when distributing 3-level loop nest.
(prepare_perfect_loop_nest): Distribute 3-level loop nest only if
parloop is enabled.

diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index a3d76e4..807fd07 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -584,7 +584,19 @@ build_rdg (struct loop *loop, control_dependences *cd)
 
 /* Kind of distributed loop.  */
 enum partition_kind {
-PKIND_NORMAL, PKIND_MEMSET, PKIND_MEMCPY, PKIND_MEMMOVE
+PKIND_NORMAL,
+/* Partial memset stands for a paritition can be distributed into a loop
+   of memset calls, rather than a single memset call.  It's handled just
+   like a normal parition, i.e, distributed as separate loop, no memset
+   call is generated.
+
+   Note: This is a hacking fix trying to distribute ZERO-ing stmt in a
+   loop nest as deep as possible.  As a result, parloop achieves better
+   parallelization by parallelizing deeper loop nest.  This hack should
+   be unnecessary and removed once distributed memset can be understood
+   and analyzed in data reference analysis.  See PR82604 for more.  */
+PKIND_PARTIAL_MEMSET,
+PKIND_MEMSET, PKIND_MEMCPY, PKIND_MEMMOVE
 };
 
 /* Type of distributed loop.  */
@@ -659,7 +671,7 @@ partition_free (partition *partition)
 static bool
 partition_builtin_p (partition *partition)
 {
-  return partition->kind != PKIND_NORMAL;
+  return partition->kind > PKIND_PARTIAL_MEMSET;
 }
 
 /* Returns true if the partition contains a reduction.  */
@@ -1127,6 +1139,7 @@ generate_code_for_partition (struct loop *loop,
   switch (partition->kind)
 {
 case PKIND_NORMAL:
+case PKIND_PARTIAL_MEMSET:
   /* Reductions all have to be in the last partition.  */
   gcc_assert (!partition_reduction_p (partition)
  || !copy_p);
@@ -1399,17 +1412,22 @@ find_single_drs (struct loop *loop, struct graph *rdg, 
partition *partition,
 
 /* Given data reference DR in LOOP_NEST, this function checks the enclosing
loops from inner to outer to see if loop's step equals to access size at
-   each level of loop.  Return true if yes; record access base and size in
-   BASE and SIZE; save loop's step at each level of loop in STEPS if it is
-   not null.  For example:
+   each level of loop.  Return 2 if we can prove this at all level loops;
+   record access base and size in BASE and SIZE; save loop's step at each
+   level of loop in STEPS if it is not null.  For example:
 
  int arr[100][100][100];
  for (i = 0; i < 100; i++)   ;steps[2] = 4
for (j = 100; j > 0; j--) ;steps[1] = -400
 for (k = 0; k < 100; k++)   ;steps[0] = 4
-  arr[i][j - 1][k] = 0; ;base = , size = 400.  */
+  arr[i][j - 1][k] = 0; ;base = , size = 400
 
-static bool
+   Return 1 if we can prove the equality at the innermost loop, but not all
+   level loops.  In this case, no information is recorded.
+
+   Return 0 if no 

Re: [PATCH, fortran] Support Fortran 2018 teams

2018-01-19 Thread Steve Kargl
On Fri, Jan 19, 2018 at 09:18:14AM -0800, Damian Rouson wrote:
> Thanks for catching that, Steve, and for responding, Alessandro.
> 
> Anything else?
> 

I've only just started to look at the patch.   Unfortunately,
I know zero about teams, so need to read the patch and F2018
standard simultaneously.

-- 
Steve


Re: [PATCH, fortran] Support Fortran 2018 teams

2018-01-19 Thread Damian Rouson
Thanks for catching that, Steve, and for responding, Alessandro.

Anything else?

Damian

On January 19, 2018 at 9:17:03 AM, Alessandro Fanfarillo (elfa...@ucar.edu) 
wrote:

Yes, definitively ar->team. 

On Fri, Jan 19, 2018 at 8:36 AM, Steve Kargl 
 wrote: 
> index 882fe577b76..b4baf5be554 100644 
> --- a/gcc/fortran/array.c 
> +++ b/gcc/fortran/array.c 
> @@ -158,6 +158,7 @@ gfc_match_array_ref (gfc_array_ref *ar, gfc_array_spec 
> *as, int init, 
> bool matched_bracket = false; 
> gfc_expr *tmp; 
> bool stat_just_seen = false; 
> + bool team_just_seen = false; 
> 
> memset (ar, '\0', sizeof (*ar)); 
> 
> @@ -230,8 +231,21 @@ coarray: 
> if (m == MATCH_ERROR) 
> return MATCH_ERROR; 
> 
> + team_just_seen = false; 
> stat_just_seen = false; 
> - if (gfc_match(" , stat = %e",) == MATCH_YES && ar->stat == NULL) 
> + if (gfc_match (" , team = %e", ) == MATCH_YES && ar->stat == NULL) 
> 
> 
> Is the 2nd ar->stat supposed to be ar->team? 
> 
> -- 
> Steve 



-- 

Alessandro Fanfarillo, Ph.D. 
Postdoctoral Researcher 
National Center for Atmospheric Research 
Mesa Lab, Boulder, CO, USA 
303-497-1229 


Re: [PATCH, fortran] Support Fortran 2018 teams

2018-01-19 Thread Alessandro Fanfarillo
Yes, definitively ar->team.

On Fri, Jan 19, 2018 at 8:36 AM, Steve Kargl
 wrote:
> index 882fe577b76..b4baf5be554 100644
> --- a/gcc/fortran/array.c
> +++ b/gcc/fortran/array.c
> @@ -158,6 +158,7 @@ gfc_match_array_ref (gfc_array_ref *ar, gfc_array_spec
> *as, int init,
>bool matched_bracket = false;
>gfc_expr *tmp;
>bool stat_just_seen = false;
> +  bool team_just_seen = false;
>
>memset (ar, '\0', sizeof (*ar));
>
> @@ -230,8 +231,21 @@ coarray:
>if (m == MATCH_ERROR)
> return MATCH_ERROR;
>
> +  team_just_seen = false;
>stat_just_seen = false;
> -  if (gfc_match(" , stat = %e",) == MATCH_YES && ar->stat == NULL)
> +  if (gfc_match (" , team = %e", ) == MATCH_YES && ar->stat == NULL)
>
>
> Is the 2nd ar->stat supposed to be ar->team?
>
> --
> Steve



-- 

Alessandro Fanfarillo, Ph.D.
Postdoctoral Researcher
National Center for Atmospheric Research
Mesa Lab, Boulder, CO, USA
303-497-1229


Re: [PATCH] C/C++: Add -Waddress-of-packed-member

2018-01-19 Thread Martin Sebor

On 01/14/2018 07:29 AM, H.J. Lu wrote:

When the address of a packed member of a struct or union is taken, it may
result in an unaligned pointer value.  This patch adds
-Waddress-of-packed-member to warn about it:

$ cat x.i
struct pair_t
{
  char c;
  int i;
} __attribute__ ((packed));

extern struct pair_t p;
int *addr = &p.i;
$ gcc -O2 -S x.i
x.i:8:13: warning: initialization of 'int *' from address of packed member of 'struct pair_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
 int *addr = &p.i;
             ^
$

This warning is enabled by default.
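
Why the pointer can end up unaligned is easy to check with offsetof. The
following is a minimal sketch, assuming a compiler that accepts the GNU
packed attribute; the helper names offset_of_i and read_i are made up for
illustration and are not part of the patch.

```c
#include <stddef.h>
#include <string.h>

struct pair_t
{
  char c;
  int i;
} __attribute__ ((packed));

/* With the packed attribute there is no padding after C, so I starts
   at byte offset 1 rather than at a multiple of the alignment of int.  */
size_t offset_of_i (void)
{
  return offsetof (struct pair_t, i);
}

/* Read the member without ever forming an int * to it: copy its bytes
   out through a character pointer instead.  */
int read_i (const struct pair_t *p)
{
  int v;
  memcpy (&v, (const char *) p + offsetof (struct pair_t, i), sizeof v);
  return v;
}
```

Accessing p->i directly is also fine, since the compiler knows the member
may be misaligned; the problem only arises once its address escapes as a
plain int *.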


I like this enhancement.  It would be useful for data types,
packed or not, such as casting int* to long*.

I noticed some differences from Clang for the test case below.
It seems that GCC should warn on all the cases Clang does.

Also, since converting the address of a struct to that of its
first member is common (especially in C and when the member
itself is a struct) I wonder if the warning should trigger
for those conversions as well.

struct A {
  int i;
} __attribute__ ((packed));

long* f8 (struct A *p) { return &p->i; }   // Clang only
int* f4 (struct A *p) { return &p->i; }    // Clang, GCC
short* f2 (struct A *p) { return &p->i; }  // Clang only
char* f1 (struct A *p) { return &p->i; }
void* f0 (struct A *p) { return &p->i; }

struct B { int i; };
struct C { struct B b; } __attribute__ ((packed));

long* g8 (struct C *p) { return p; }        // should warn?
int* g4 (struct C *p) { return &p->b; }     // Clang only

int* h4 (struct C *p) { return &p->b.i; }   // Clang only


Martin


RE:[PATCH,AIX] Optimize parsing of include files.

2018-01-19 Thread REIX, Tony
Thanks David,

I've saved your comments in our Wiki, so I hope I'll remember and do better
next time.

Regards,

Tony Reix

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net


From: David Edelsohn [dje@gmail.com]
Sent: Friday, January 19, 2018 17:36
To: REIX, Tony
Cc: gcc-patches@gcc.gnu.org; Ian Lance Taylor; BERGAMINI, DAMIEN
Subject: Re: [PATCH,AIX] Optimize parsing of include files.

On Thu, Jan 18, 2018 at 9:56 AM, REIX, Tony  wrote:
>
> Description:
>  * This patch optimizes the time required for parsing the include files.
>
> Tests:
>  * AIX: Build: SUCCESS
>- build made by means of gmake on AIX.
>
> ChangeLog:
>   * xcoff.c: Optimize parsing of include files.

Hi, Tony

The ChangeLog should be more detailed.

* xcoff.c (xcoff_incl_compare): New function.
(xcoff_incl_search): New function.
(xcoff_process_linenos): Use bsearch to find include file.
(xcoff_initialize_fileline): Sort include file information.


The rest is okay, although the calls to bsearch and backtrace_qsort
don't follow the style of other files.

This is okay.

Thanks, David


[og7] backport fix for PR83920

2018-01-19 Thread Cesar Philippidis
I've backported the patch Tom committed to trunk to fix PR83920 to
openacc-gcc-7-branch in revision
d0a1e0fa43ca4004fde33707cb6a93c01cb11507. No changes were required for
og7. The original email can be found here
.

Cesar


Re: [PATCH v3, rs6000] Add -mspeculate-indirect-jumps option and implement non-speculating bctr / bctrl

2018-01-19 Thread Sandra Loosemore

On 01/15/2018 04:09 PM, Bill Schmidt wrote:


[gcc]

2018-01-15  Bill Schmidt  

* config/rs6000/rs6000.c (rs6000_opt_vars): Add entry for
-mspeculate-indirect-jumps.
* config/rs6000/rs6000.md (*call_indirect_elfv2): Disable
for -mno-speculate-indirect-jumps.
(*call_indirect_elfv2_nospec): New define_insn.
(*call_value_indirect_elfv2): Disable for
-mno-speculate-indirect-jumps.
(*call_value_indirect_elfv2_nospec): New define_insn.
(indirect_jump): Emit different RTL for
-mno-speculate-indirect-jumps.
(*indirect_jump): Disable for
-mno-speculate-indirect-jumps.
(*indirect_jump_nospec): New define_insn.
(tablejump): Emit different RTL for
-mno-speculate-indirect-jumps.
(tablejumpsi): Disable for -mno-speculate-indirect-jumps.
(tablejumpsi_nospec): New define_expand.
(tablejumpdi): Disable for -mno-speculate-indirect-jumps.
(tablejumpdi_nospec): New define_expand.
(*tablejump_internal1): Disable for
-mno-speculate-indirect-jumps.
(*tablejump_internal1_nospec): New define_insn.
* config/rs6000/rs6000.opt (mspeculate-indirect-jumps): New
option.


I see no documentation for the new option here  :-(

-Sandra


Re: [PATCH,AIX] Optimize parsing of include files.

2018-01-19 Thread David Edelsohn
On Thu, Jan 18, 2018 at 9:56 AM, REIX, Tony  wrote:
>
> Description:
>  * This patch optimizes the time required for parsing the include files.
>
> Tests:
>  * AIX: Build: SUCCESS
>- build made by means of gmake on AIX.
>
> ChangeLog:
>   * xcoff.c: Optimize parsing of include files.

Hi, Tony

The ChangeLog should be more detailed.

* xcoff.c (xcoff_incl_compare): New function.
(xcoff_incl_search): New function.
(xcoff_process_linenos): Use bsearch to find include file.
(xcoff_initialize_fileline): Sort include file information.


The rest is okay, although the calls to bsearch and backtrace_qsort
don't follow the style of other files.

This is okay.

Thanks, David


Re: [PATCH, rs6000] Add 128-bit support for vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be() builtins.

2018-01-19 Thread Segher Boessenkool
On Thu, Jan 18, 2018 at 04:51:47PM -0600, Segher Boessenkool wrote:
> > +(define_insn "vsx_ld_elemrev_v1ti"
> > +  [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> > +(vec_select:V1TI
> > +     (match_operand:V1TI 1 "memory_operand" "Z")
> > +     (parallel [(const_int 0)])))]
> > +  "VECTOR_MEM_VSX_P (V1TImode) && !BYTES_BIG_ENDIAN"
> > +{
> > +  return "lxvd2x %x0,%y1; xxpermdi %x0,%x0,%x0,2";
> > +}

> We currently have exactly as many xxpermdi,2 as xxswapdi (147 each)
> but the latter is more readable, please prefer that.

Ignore this part; I managed to fumble my grep commands.  We have *no*
xxswapd in the source currently (well, one in comments, and 11 xxswapdi
but that is a misspelling); stage 4 is not the time to start using it
(do all supported assemblers implement it, implement it correctly, etc.)

So your xxpermdi is the best for now.


Segher
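
For context on why the two spellings are interchangeable: as far as I
understand the ISA, xxswapd is an extended mnemonic for xxpermdi with both
sources equal and a selector of 2. A small C model of that selection is
sketched below. The struct and function names are hypothetical, and dw[0]
and dw[1] simply stand for the two doublewords of a 128-bit register.

```c
#include <stdint.h>

struct v128 { uint64_t dw[2]; };  /* the two doublewords of a register */

/* Model of xxpermdi T,A,B,DM: the high bit of DM picks the doubleword
   of A that becomes T.dw[0], the low bit picks the doubleword of B that
   becomes T.dw[1].  */
struct v128 permdi (struct v128 a, struct v128 b, unsigned dm)
{
  struct v128 t;
  t.dw[0] = a.dw[(dm >> 1) & 1];
  t.dw[1] = b.dw[dm & 1];
  return t;
}

/* With A == B and DM == 2 the result is A with its halves swapped.  */
struct v128 swapd (struct v128 a)
{
  return permdi (a, a, 2);
}
```

So "xxpermdi %x0,%x0,%x0,2" in the patterns above performs the doubleword
swap that turns the lxvd2x result into the element order the pattern wants.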


Re: New code merge optimization?

2018-01-19 Thread Georg-Johann Lay

On 18.01.2018 16:08, Sebastian Perta wrote:

Hello,

I am interested in implementing a new pass in gcc to merge identical
sequences of code in GCC to be used mainly for RL78.
The commercial RL78 compilers have such algorithms implemented and they make
quite good use of it.
Opportunities arise from the limited capabilities of RL78, for other targets
this might be a lot less useful.

A while ago I found the following:
https://www.gnu.org/software/gcc/projects/cfo.html
And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and
this is what I found out:
For RX: no visible improvements with any of them
For RL78: some minor improvements only with -frtl-seqabstr:
Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c  with
"-Os" and "-Os  -frtl-seqabstr" (using the modified gcc 4.9.2)
The algorithm was effective only in 60 files (out of 1643 files, that's only
about 3.7% of the files currently present in gcc/testsuite/gcc.c-torture/execute)
On those 60 files I got an average of 6.5% improvement with the best
improvement for pr58574.c (36.4%).

What do you think: is it worthwhile porting this to the trunk or I will just
waste my time?


For that particular implementation, it makes no sense to put time into 
it, IMO, for the following reasons:


* It causes compile-time hogs. I saw modules that normally take seconds to 
compile consume ~30min with that feature activated.  Presumably, it's 
just a proof of concept, but a brute-force approach with quadratic or even 
exponential run-time complexity.


* Parts of it jump in after register allocation, hence when register 
allocation uses different registers, the gains were sometimes 
negligible, even though the code had good factoring opportunities.


* Adding calls after reload turned out to be problematic and would 
regularly ICE (e.g. for avr).


I spent some time testing this for avr and some 32-bit target, but 
gave up soon.


IMO without a sound analysis, e.g. of expected host complexity, any such 
attempts are doomed to fail.


From a general perspective, it is desirable to add the respective pass 
before passes that might shred factoring opportunities, in particular: 
instruction scheduling, forward propagation (same code ending up in 
different chunks depending on context), instruction combination (ditto), 
and likely many other passes.  But this makes it harder to precisely 
estimate costs, in particular register pressure effects or turning leaf 
functions into non-leaf ones.


Johann


Re: [PATCH,NVPTX] Fix PR83920

2018-01-19 Thread Richard Biener
On January 19, 2018 3:15:45 PM GMT+01:00, Tom de Vries  
wrote:
>On 01/18/2018 02:27 PM, Tom de Vries wrote:
>> On 01/18/2018 12:40 AM, Cesar Philippidis wrote:
>>> In PR83920, I encountered a nvptx bug where live predicate variables
>>> were clobbered before their value was broadcasted. 
>> 
>> Hi,
>> 
>> I've managed to reproduce the problem based on the description in the
>PR.
>
>> I think the way to address it is using a tmp .pred reg like so:
>> ...
>> {
>>    .reg .u32 %x;
>>    mov.u32 %x,%tid.x;
>>    setp.ne.u32 %rnotvzero,%x,0;
>> }
>> 
>> {
>>    .reg .pred %rcond2;
>>    setp.eq.u32 %rcond2, 1, 0; // workaround
>> 
>>    @%rnotvzero bra Lskip;
>>    ...
>>    setp.. %rcond,op1,op2; // could be here, could be
>earlier
>>    mov.b1 %rcond2, %rcond; // used pseudo opcode mov.b1 for
>convenience
>>   Lskip:
>>    selp.u32 %rcondu32,1,0,%rcond2;
>>    shfl.idx.b32 %rcondu32,%rcondu32,0,31;
>>    setp.ne.u32 %rcond,%rcondu32,0;
>> }
>> ...
>> 
>
>Hi,
>
>this is the fix that I plan to commit (similar to the scheme listed 
>above, but modified to keep the selp.u32 using rcond, which is easier
>in 
>code generation).
>
>Build and reg-tested on x86_64 with nvptx accelerator.
>
>Richard, this is an 8 regression for the nvptx target. OK for stage 4
>or 
>defer to stage1?

OK for stage 4.

Richard. 

>Thanks,
>- Tom



Re: [PATCH, fortran] Support Fortran 2018 teams

2018-01-19 Thread Steve Kargl
index 882fe577b76..b4baf5be554 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -158,6 +158,7 @@ gfc_match_array_ref (gfc_array_ref *ar, gfc_array_spec
*as, int init,
   bool matched_bracket = false;
   gfc_expr *tmp;
   bool stat_just_seen = false;
+  bool team_just_seen = false;

   memset (ar, '\0', sizeof (*ar));

@@ -230,8 +231,21 @@ coarray:
   if (m == MATCH_ERROR)
return MATCH_ERROR;

+  team_just_seen = false;
   stat_just_seen = false;
-  if (gfc_match(" , stat = %e",) == MATCH_YES && ar->stat == NULL)
+  if (gfc_match (" , team = %e", ) == MATCH_YES && ar->stat == NULL)


Is the 2nd ar->stat supposed to be ar->team?

-- 
Steve


Re: [PATCH] Fix profile_quality sanity check.

2018-01-19 Thread Tom de Vries

On 01/19/2018 04:08 PM, Martin Liška wrote:

On 01/19/2018 02:21 PM, Tom de Vries wrote:

How about keeping profile_uninitialized at the zero value location and 
asserting m_quality != profile_uninitialized ?

Thanks,
- Tom


Yes, that would be possible.

Can you please test that the patch does not generate warnings?



Confirmed, the warnings are gone.

Thanks,
- Tom


I'm running regression tests.

Martin


0001-Fix-profile_quality-sanity-check.patch


 From e1159c2404947f675200efc4476e7e0994b81101 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 19 Jan 2018 15:27:40 +0100
Subject: [PATCH] Fix profile_quality sanity check.

gcc/ChangeLog:

2018-01-18  Martin Liska  

* profile-count.h (enum profile_quality): Add
profile_uninitialized as the first value. Do not number values
as they are zero based.
(profile_count::verify): Update sanity check.
(profile_probability::verify): Likewise.
---
  gcc/profile-count.h | 22 +++---
  1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 7a43917ebbc..828d6d0ee4b 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -26,34 +26,36 @@ struct function;
  /* Quality of the profile count.  Because gengtype does not support enums
 inside of classes, this is in global namespace.  */
  enum profile_quality {
+  /* Uninitialized value.  */
+  profile_uninitialized,
/* Profile is based on static branch prediction heuristics and may
   or may not match reality.  It is local to function and can not be 
compared
   inter-procedurally.  Never used by probabilities (they are always local).
 */
-  profile_guessed_local = 1,
+  profile_guessed_local,
/* Profile was read by feedback and was 0, we used local heuristics to guess
   better.  This is the case of functions not run in profile fedback.
   Never used by probabilities.  */
-  profile_guessed_global0 = 2,
+  profile_guessed_global0,
  
/* Same as profile_guessed_global0 but global count is adjusted 0.  */
-  profile_guessed_global0adjusted = 3,
+  profile_guessed_global0adjusted,
  
/* Profile is based on static branch prediction heuristics.  It may or may
   not reflect the reality but it can be compared interprocedurally
   (for example, we inlined function w/o profile feedback into function
    with feedback and propagated from that).
   Never used by probablities.  */
-  profile_guessed = 4,
+  profile_guessed,
/* Profile was determined by autofdo.  */
-  profile_afdo = 5,
+  profile_afdo,
/* Profile was originally based on feedback but it was adjusted
   by code duplicating optimization.  It may not precisely reflect the
   particular code path.  */
-  profile_adjusted = 6,
+  profile_adjusted,
/* Profile was read from profile feedback or determined by accurate static
   method.  */
-  profile_precise = 7
+  profile_precise
  };
  
  /* The base value for branch probability notes and edge probabilities.  */

@@ -505,8 +507,7 @@ public:
/* Return false if profile_probability is bogus.  */
bool verify () const
  {
-  gcc_checking_assert (profile_guessed_local <= m_quality
-  && m_quality <= profile_precise);
+  gcc_checking_assert (m_quality != profile_uninitialized);
if (m_val == uninitialized_probability)
return m_quality == profile_guessed;
else if (m_quality < profile_guessed)
@@ -786,8 +787,7 @@ public:
/* Return false if profile_count is bogus.  */
bool verify () const
  {
-  gcc_checking_assert (profile_guessed_local <= m_quality
-  && m_quality <= profile_precise);
+  gcc_checking_assert (m_quality != profile_uninitialized);
return m_val != uninitialized_count || m_quality == profile_guessed_local;
  }
  





[PATCH] Fix missing profiles with PGO (PR tree-optimization/83051).

2018-01-19 Thread Martin Liška
Hello.

The following ICE can be seen with -fprofile-generate, or with -fprofile-use and a
missing gcda file. I hope the proper fix is to check for a reliable profile.
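
To illustrate the idea of only trusting caller counts that are reliable, here is a minimal standalone sketch. It does not use the real GCC classes: `count_sketch`, the reduced `quality` enum, and `sum_reliable_caller_counts` are hypothetical names, and the definition of "reliable" (quality at least "adjusted") is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reduced, hypothetical quality scale; ordering goes from least to most trusted.
enum quality { q_uninitialized, q_guessed_local, q_guessed, q_adjusted, q_precise };

// Hypothetical stand-in for a profile count: a value plus its quality.
struct count_sketch {
  std::int64_t val;
  quality q;

  bool initialized_p() const { return q != q_uninitialized; }
  // Assumption for this sketch: a count is reliable when it comes from
  // (possibly adjusted) feedback data rather than from static guesses.
  bool reliable_p() const { return q >= q_adjusted; }
};

// Sum only the caller counts that are both initialized and reliable,
// mirroring the shape of the condition used in the patch.
std::int64_t sum_reliable_caller_counts(const std::vector<count_sketch>& callers) {
  std::int64_t total = 0;
  for (const count_sketch& c : callers)
    if (c.initialized_p() && c.reliable_p())
      total += c.val;
  return total;
}
```

With callers of quality precise (10), guessed-local (5), adjusted (7) and uninitialized (3), only the precise and adjusted counts contribute, so the guessed values never enter the sum.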

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2018-01-19  Martin Liska  

PR tree-optimization/83051
* predict.c (handle_missing_profiles): Consider profile only
if it's reliable.

gcc/testsuite/ChangeLog:

2018-01-19  Martin Liska  

PR tree-optimization/83051
* gcc.dg/torture/pr83055.c: New test.
---
 gcc/predict.c  |  2 +-
 gcc/testsuite/gcc.dg/torture/pr83055.c | 13 +
 2 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr83055.c


diff --git a/gcc/predict.c b/gcc/predict.c
index 4c1e4489b55..c5144e49c8d 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -3357,7 +3357,7 @@ handle_missing_profiles (void)
   if (!(node->count == profile_count::zero ()))
 continue;
   for (e = node->callers; e; e = e->next_caller)
-	if (e->count.initialized_p () && e->count > 0)
+	if (e->count.initialized_p () && e->count.reliable_p ())
 	  {
 call_count = call_count + e->count;
 
diff --git a/gcc/testsuite/gcc.dg/torture/pr83055.c b/gcc/testsuite/gcc.dg/torture/pr83055.c
new file mode 100644
index 000..9bc71c6cddf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr83055.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fprofile-generate" } */
+
+void __attribute__ ((__cold__)) a (void);
+void b (void);
+void __attribute__ ((noinline)) c (void) { a (); }
+
+void
+d (void)
+{
+  b ();
+  c ();
+}



Re: [PATCH] Fix profile_quality sanity check.

2018-01-19 Thread Martin Liška
On 01/19/2018 02:21 PM, Tom de Vries wrote:
> How about keeping profile_uninitialized at the zero value location and 
> asserting m_quality != profile_uninitialized ?
> 
> Thanks,
> - Tom

Yes, that would be possible.

Can you please test that the patch does not generate warnings?

I'm running regression tests.

Martin
From e1159c2404947f675200efc4476e7e0994b81101 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 19 Jan 2018 15:27:40 +0100
Subject: [PATCH] Fix profile_quality sanity check.

gcc/ChangeLog:

2018-01-18  Martin Liska  

	* profile-count.h (enum profile_quality): Add
	profile_uninitialized as the first value. Do not number values
	as they are zero based.
	(profile_count::verify): Update sanity check.
	(profile_probability::verify): Likewise.
---
 gcc/profile-count.h | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 7a43917ebbc..828d6d0ee4b 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -26,34 +26,36 @@ struct function;
 /* Quality of the profile count.  Because gengtype does not support enums
inside of classes, this is in global namespace.  */
 enum profile_quality {
+  /* Uninitialized value.  */
+  profile_uninitialized,
   /* Profile is based on static branch prediction heuristics and may
  or may not match reality.  It is local to function and can not be compared
  inter-procedurally.  Never used by probabilities (they are always local).
*/
-  profile_guessed_local = 1,
+  profile_guessed_local,
   /* Profile was read by feedback and was 0, we used local heuristics to guess
  better.  This is the case of functions not run in profile fedback.
  Never used by probabilities.  */
-  profile_guessed_global0 = 2,
+  profile_guessed_global0,
 
   /* Same as profile_guessed_global0 but global count is adjusted 0.  */
-  profile_guessed_global0adjusted = 3,
+  profile_guessed_global0adjusted,
 
   /* Profile is based on static branch prediction heuristics.  It may or may
  not reflect the reality but it can be compared interprocedurally
  (for example, we inlined function w/o profile feedback into function
   with feedback and propagated from that).
  Never used by probablities.  */
-  profile_guessed = 4,
+  profile_guessed,
   /* Profile was determined by autofdo.  */
-  profile_afdo = 5,
+  profile_afdo,
   /* Profile was originally based on feedback but it was adjusted
  by code duplicating optimization.  It may not precisely reflect the
  particular code path.  */
-  profile_adjusted = 6,
+  profile_adjusted,
   /* Profile was read from profile feedback or determined by accurate static
  method.  */
-  profile_precise = 7
+  profile_precise
 };
 
 /* The base value for branch probability notes and edge probabilities.  */
@@ -505,8 +507,7 @@ public:
   /* Return false if profile_probability is bogus.  */
   bool verify () const
 {
-  gcc_checking_assert (profile_guessed_local <= m_quality
-			   && m_quality <= profile_precise);
+  gcc_checking_assert (m_quality != profile_uninitialized);
   if (m_val == uninitialized_probability)
 	return m_quality == profile_guessed;
   else if (m_quality < profile_guessed)
@@ -786,8 +787,7 @@ public:
   /* Return false if profile_count is bogus.  */
   bool verify () const
 {
-  gcc_checking_assert (profile_guessed_local <= m_quality
-			   && m_quality <= profile_precise);
+  gcc_checking_assert (m_quality != profile_uninitialized);
   return m_val != uninitialized_count || m_quality == profile_guessed_local;
 }
 
-- 
2.14.3



Re: [PATCH,NVPTX] Fix PR83920

2018-01-19 Thread Tom de Vries

On 01/18/2018 02:27 PM, Tom de Vries wrote:

On 01/18/2018 12:40 AM, Cesar Philippidis wrote:

In PR83920, I encountered an nvptx bug where live predicate variables
were clobbered before their value was broadcast.


Hi,

I've managed to reproduce the problem based on the description in the PR.



I think the way to address it is using a tmp .pred reg like so:
...
{
   .reg .u32 %x;
   mov.u32 %x,%tid.x;
   setp.ne.u32 %rnotvzero,%x,0;
}

{
   .reg .pred %rcond2;
   setp.eq.u32 %rcond2, 1, 0; // workaround

   @%rnotvzero bra Lskip;
   ...
   setp.. %rcond,op1,op2; // could be here, could be earlier
   mov.b1 %rcond2, %rcond; // used pseudo opcode mov.b1 for convenience
  Lskip:
   selp.u32 %rcondu32,1,0,%rcond2;
   shfl.idx.b32 %rcondu32,%rcondu32,0,31;
   setp.ne.u32 %rcond,%rcondu32,0;
}
...



Hi,

this is the fix that I plan to commit (similar to the scheme listed 
above, but modified to keep the selp.u32 using rcond, which is easier in 
code generation).


Built and reg-tested on x86_64 with nvptx accelerator.

Richard, this is an 8 regression for the nvptx target. OK for stage 4 or 
defer to stage1?


Thanks,
- Tom
[nvptx] Fix bug in jit bug workaround

2018-01-19  Tom de Vries  
	Cesar Philippidis  

	PR target/83920

	* config/nvptx/nvptx.c (nvptx_single): Fix jit workaround.

	* testsuite/libgomp.oacc-c-c++-common/pr83920.c: New test.
	* testsuite/libgomp.oacc-fortran/pr83920.f90: New test.

---
 gcc/config/nvptx/nvptx.c   | 28 +--
 .../testsuite/libgomp.oacc-c-c++-common/pr83920.c  | 32 ++
 libgomp/testsuite/libgomp.oacc-fortran/pr83920.f90 | 28 +++
 3 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 86fc13f4fc0..afb0e4dd185 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4096,9 +4096,33 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
 
 	 There is nothing in the PTX spec to suggest that this is wrong, or
 	 to explain why the extra initialization is needed.  So, we classify
-	 it as a JIT bug, and the extra initialization as workaround.  */
-	  emit_insn_before (gen_movbi (pvar, const0_rtx),
+	 it as a JIT bug, and the extra initialization as workaround:
+
+		{
+		.reg .u32 %x;
+		mov.u32 %x,%tid.x;
+		setp.ne.u32 %rnotvzero,%x,0;
+		}
+
+		+.reg .pred %rcond2;
+		+setp.eq.u32 %rcond2, 1, 0;
+
+		 @%rnotvzero bra Lskip;
+		 setp.. %rcond,op1,op2;
+		+mov.pred %rcond2, %rcond;
+		 Lskip:
+		+mov.pred %rcond, %rcond2;
+		 selp.u32 %rcondu32,1,0,%rcond;
+		 shfl.idx.b32 %rcondu32,%rcondu32,0,31;
+		 setp.ne.u32 %rcond,%rcondu32,0;
+	  */
+	  rtx_insn *label = PREV_INSN (tail);
+	  gcc_assert (label && LABEL_P (label));
+	  rtx tmp = gen_reg_rtx (BImode);
+	  emit_insn_before (gen_movbi (tmp, const0_rtx),
 			bb_first_real_insn (from));
+	  emit_insn_before (gen_rtx_SET (tmp, pvar), label);
+	  emit_insn_before (gen_rtx_SET (pvar, tmp), tail);
 #endif
 	  emit_insn_before (nvptx_gen_vcast (pvar), tail);
 	}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr83920.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr83920.c
new file mode 100644
index 000..6cd3b5d6f06
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr83920.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+
+#include 
+
+#define n 10
+
+static void __attribute__((noinline)) __attribute__((noclone))
+foo (int beta, int *c)
+{
+  #pragma acc parallel copy(c[0:(n * n) - 1]) num_gangs(2)
+  #pragma acc loop gang
+  for (int j = 0; j < n; ++j)
+if (beta != 1)
+  {
+#pragma acc loop vector
+	for (int i = 0; i < n; ++i)
+	  c[i + (j * n)] = 0;
+  }
+}
+
+int
+main (void)
+{
+  int c[n * n];
+
+  c[0] = 1;
+  foo (0, c);
+  if (c[0] != 0)
+abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr83920.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr83920.f90
new file mode 100644
index 000..34ad001abcd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr83920.f90
@@ -0,0 +1,28 @@
+! { dg-do run }
+
+subroutine foo (BETA, C)
+  real ::  C(100,100)
+  integer :: i, j, l
+  real, parameter :: one = 1.0
+  real :: beta
+
+  !$acc parallel copy(c(1:100,1:100)) num_gangs(2)
+  !$acc loop gang
+  do j = 1, 100
+ if (beta /= one) then
+!$acc loop vector
+do i = 1, 100
+   C(i,j) = 0.0
+end do
+ end if
+  end do
+  !$acc end parallel
+end subroutine foo
+
+program test_foo
+  real :: c(100,100), beta
+  beta = 0.0
+  c(:,:) = 1.0
+  call foo (beta, c)
+  if (c(1,1) /= 0.0) call abort ()
+end program test_foo


Re: [PATCH] Fix pr83619.C (was Re: Fix ICE with profile info mismatch)

2018-01-19 Thread Jakub Jelinek
On Fri, Jan 19, 2018 at 02:33:09PM +0100, Rainer Orth wrote:
> however, the test now FAILs everywhere with
> 
> +FAIL: g++.dg/torture/pr83619.C   -O0  (test for excess errors)
> +FAIL: g++.dg/torture/pr83619.C   -O1  (test for excess errors)
> +FAIL: g++.dg/torture/pr83619.C   -O2  (test for excess errors)
> +FAIL: g++.dg/torture/pr83619.C   -O2 -flto  (test for excess errors)
> +FAIL: g++.dg/torture/pr83619.C   -O2 -flto -flto-partition=none  (test for excess errors)
> +FAIL: g++.dg/torture/pr83619.C   -O3 -g  (test for excess errors)
> +FAIL: g++.dg/torture/pr83619.C   -Os  (test for excess errors)
> 
> from
> 
>   g->c (); // { dg-message "incomplete" }
> 
> Removing the dg-message cures this.  Tested with the appropriate runtest
> invocation on i386-pc-solaris2.11.

Oops, no idea how I've missed it.

> 2018-01-19  Rainer Orth  
> 
>   * g++.dg/torture/pr83619.C: Remove dg-message.

Ok, thanks.

> diff --git a/gcc/testsuite/g++.dg/torture/pr83619.C 
> b/gcc/testsuite/g++.dg/torture/pr83619.C
> --- a/gcc/testsuite/g++.dg/torture/pr83619.C
> +++ b/gcc/testsuite/g++.dg/torture/pr83619.C
> @@ -24,7 +24,7 @@ public:
>  static void
>  c (e *g)
>  {
> -  g->c ();   // { dg-message "incomplete" }
> +  g->c ();
>  }
>};
>  };


Jakub


RE: [PATCH] RL78 UNUSED note setting bug fix in rl78_note_reg_set

2018-01-19 Thread Sebastian Perta
HI DJ,

>>Do you have checkin privs yet?
>> This is ok aside from.. ... + /* Do not mark the reg unused unless all
QImode parts of it are dead.  */
Can I checkin this patch? Thank you!
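
The rule in the comment (a multi-byte register is unused only when every byte-sized part of it is dead) can be sketched in isolation. This is not the RL78 backend code: `all_parts_dead` and its parameters are hypothetical, and the `dead` vector stands in for the per-QImode-register liveness array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dead[i] is true when byte-sized (QImode-like) register i is dead.
// Returns true only if every part of the register starting at regno
// and spanning size_in_bytes parts is dead.
// Assumption: regno + size_in_bytes does not exceed dead.size().
bool all_parts_dead(const std::vector<bool>& dead,
                    std::size_t regno, std::size_t size_in_bytes) {
  for (std::size_t i = 0; i < size_in_bytes; ++i)
    if (!dead[regno + i])
      return false;  // one live part keeps the whole register in use
  return true;
}
```

Checking only `dead[regno]`, as the code did before the patch, would wrongly report a two-byte register as unused whenever its low byte is dead but its high byte is still live.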

Best Regards,
Sebastian


> -Original Message-
> From: Sebastian Perta
> Sent: 12 January 2018 18:42
> To: 'DJ Delorie' 
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH] RL78 UNUSED note setting bug fix in rl78_note_reg_set
> 
> Hi DJ,
> 
> >>Do you have checkin privs yet?
> I have filled out the form. "Thanks for your request. It must be approved
by
> the person you named as approver ...
> 
> >> This is ok aside from..
> Sorry about this. I will keep this in mind in future.
> I corrected the patch with your second suggestion.
> 
> Best Regards,
> Sebastian
> 
> Index: ChangeLog
> ==
> =
> --- ChangeLog (revision 256590)
> +++ ChangeLog (working copy)
> @@ -1,3 +1,8 @@
> +2018-01-12  Sebastian Perta  
> +
> + * config/rl78/rl78.c (rl78_note_reg_set): fixed dead reg check
> + for non-QImode registers
> +
>  2018-01-12  Vladimir Makarov  
> 
>   PR rtl-optimization/80481
> Index: config/rl78/rl78.c
> ==
> =
> --- config/rl78/rl78.c(revision 256590)
> +++ config/rl78/rl78.c(working copy)
> @@ -3792,7 +3792,7 @@
>  rl78_note_reg_set (char *dead, rtx d, rtx insn)
>  {
>int r, i;
> -
> +  bool is_dead;
>if (GET_CODE (d) == MEM)
>  rl78_note_reg_uses (dead, XEXP (d, 0), insn);
> 
> @@ -3799,9 +3799,15 @@
>if (GET_CODE (d) != REG)
>  return;
> 
> + /* Do not mark the reg unused unless all QImode parts of it are dead.
*/
>r = REGNO (d);
> -  if (dead [r])
> -add_reg_note (insn, REG_UNUSED, gen_rtx_REG (GET_MODE (d), r));
> +  is_dead = true;
> +  for (i = 0; i < GET_MODE_SIZE (GET_MODE (d)); i ++)
> +   if (!dead [r + i])
> +   is_dead = false;
> +  if(is_dead)
> + add_reg_note (insn, REG_UNUSED, gen_rtx_REG (GET_MODE (d),
> r));
>if (dump_file)
>  fprintf (dump_file, "note set reg %d size %d\n", r, GET_MODE_SIZE
> (GET_MODE (d)));
>for (i = 0; i < GET_MODE_SIZE (GET_MODE (d)); i ++)
> 
> > -Original Message-
> > From: DJ Delorie [mailto:d...@redhat.com]
> > Sent: 12 January 2018 18:12
> > To: Sebastian Perta 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] RL78 UNUSED note setting bug fix in
> rl78_note_reg_set
> >
> >
> > "Sebastian Perta"  writes:
> > > Please let me know if this is OK. Thank you!
> >
> > Do you have checkin privs yet?
> >
> > This is ok aside from..
> >
> > > +  /* 'dead' keeps track of the QImode registers if r is of different
size
> > > +  we need to check the other subparts as well  */
> >
> > Missing period at the end of a sentence; should capitalize first word
> > but it's a variable, which should be block caps anyway, and it reads
> > better as two sentences:
> >
> > > +  /* DEAD keeps track of the QImode registers.  If R is of different
size
> > > +  we need to check the other subparts as well.  */
> >
> > Or rewrite to not mention variables?
> >
> > > + /* Do not mark the reg unused unless all QImode parts of it are
dead.
> */



Re: [PATCH] Fix pr83619.C (was Re: Fix ICE with profile info mismatch)

2018-01-19 Thread Rainer Orth
Hi Jan,

>> On Thu, Jan 18, 2018 at 04:59:01PM +0100, Jan Hubicka wrote:
>> > this patch fixes an ICE where the profile in cgraph mismatches the profile in BB.
>> > This is because of expansion of speculative devirtualization where we get
>> > some roundoff issues.
>> > 
>> > Bootstrapped/regtested x86_64-linux, committed.
>> > Honza
>> > 
>> >PR ipa/83619
>> >* g++.dg/torture/pr83619.C: New testcase.
>> >* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge
>> >frequencies.
>> > --- testsuite/g++.dg/torture/pr83619.C (revision 0)
>> > +++ testsuite/g++.dg/torture/pr83619.C (working copy)
>> ...
>> 
>> This testcase FAILs everywhere:
>> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C: In static member
>> function 'static void i::j<  >::c(e*)':
>> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid
>> use of incomplete type 'class e'
>> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:8:7: note: forward
>> declaration of 'class e'
>> FAIL: g++.dg/torture/pr83619.C   -O0  (test for excess errors)
>> Excess errors:
>> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid
>> use of incomplete type 'class e'
>> 
>> The following patch tweaks it so that it doesn't emit the warning, yet still
>> ICEs before your cgraph.c change and PASSes after it.
>> Tested on x86_64-linux and i686-linux, ok for trunk?
>> 
>> 2018-01-18  Jakub Jelinek  
>> 
>>  PR ipa/83619
>>  * g++.dg/torture/pr83619.C (e): Define before first use instead of
>>  forward declaration.
>
> Oops, sorry. I had a corrected version somewhere. The change is OK.

However, the test now FAILs everywhere with

+FAIL: g++.dg/torture/pr83619.C   -O0  (test for excess errors)
+FAIL: g++.dg/torture/pr83619.C   -O1  (test for excess errors)
+FAIL: g++.dg/torture/pr83619.C   -O2  (test for excess errors)
+FAIL: g++.dg/torture/pr83619.C   -O2 -flto  (test for excess errors)
+FAIL: g++.dg/torture/pr83619.C   -O2 -flto -flto-partition=none  (test for excess errors)
+FAIL: g++.dg/torture/pr83619.C   -O3 -g  (test for excess errors)
+FAIL: g++.dg/torture/pr83619.C   -Os  (test for excess errors)

from

  g->c (); // { dg-message "incomplete" }

Removing the dg-message cures this.  Tested with the appropriate runtest
invocation on i386-pc-solaris2.11.

Ok for mainline?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-01-19  Rainer Orth  

* g++.dg/torture/pr83619.C: Remove dg-message.

diff --git a/gcc/testsuite/g++.dg/torture/pr83619.C b/gcc/testsuite/g++.dg/torture/pr83619.C
--- a/gcc/testsuite/g++.dg/torture/pr83619.C
+++ b/gcc/testsuite/g++.dg/torture/pr83619.C
@@ -24,7 +24,7 @@ public:
 static void
 c (e *g)
 {
-  g->c ();			// { dg-message "incomplete" }
+  g->c ();
 }
   };
 };


Small C++ tweak to fold_simple

2018-01-19 Thread Marek Polacek
fold_simple struck me as odd, certainly the NULL_TREE assignment is
unnecessary, so this patch simplifies it a bit.
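
As a rough illustration of why the two shapes are equivalent, here is a standalone sketch with stand-in types. `node`, `simplify_or_null`, `fold_old` and `fold_new` are hypothetical names that only mimic the control flow; the real function also has a `processing_template_decl` early return that is left out here.

```cpp
#include <cassert>

// Hypothetical stand-in for a tree node.
struct node { int value; bool foldable; };

// Stand-in for the helper: returns a simplified node, or nullptr
// when no simplification applies.
const node* simplify_or_null(const node* t) {
  static const node folded{0, false};
  return t->foldable ? &folded : nullptr;
}

// Old shape: initialize to null, assign, then fall back to the input.
const node* fold_old(const node* t) {
  const node* r = nullptr;
  r = simplify_or_null(t);
  if (!r)
    r = t;
  return r;
}

// New shape: declare at first use and return early.
const node* fold_new(const node* t) {
  const node* r = simplify_or_null(t);
  if (r)
    return r;
  return t;
}
```

Both versions return the simplified node when there is one and the original node otherwise; the initial null assignment in the old shape is overwritten before it is ever read.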

I've been confused about the commentary too, what is the difference between
"constant-expressions" and "constexpressions"?  We should clarify that.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-01-19  Marek Polacek  

* constexpr.c (fold_simple): Simplify.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 9a548d29bbc..ca7f369f7e9 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -4931,22 +4931,21 @@ fold_simple_1 (tree t)
 }
 
 /* If T is a simple constant expression, returns its simplified value.
-   Otherwise returns T.  In contrast to maybe_constant_value do we
+   Otherwise returns T.  In contrast to maybe_constant_value we
simplify only few operations on constant-expressions, and we don't
try to simplify constexpressions.  */
 
 tree
 fold_simple (tree t)
 {
-  tree r = NULL_TREE;
   if (processing_template_decl)
 return t;
 
-  r = fold_simple_1 (t);
-  if (!r)
-r = t;
+  tree r = fold_simple_1 (t);
+  if (r)
+return r;
 
-  return r;
+  return t;
 }
 
 /* If T is a constant expression, returns its reduced value.

Marek


Re: [PATCH] Fix profile_quality sanity check.

2018-01-19 Thread Tom de Vries

On 01/19/2018 01:11 PM, Martin Liška wrote:

On 01/18/2018 04:57 PM, Tom de Vries wrote:

On 01/18/2018 03:59 PM, Martin Liška wrote:

Hi.

Following patch adds a new enum value so that we don't see following warning:
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01211.html



Hi,

with the patch, I still see the same warning.

And not surprisingly, given that profile_precise is still 7 and m_quality is 
still a 3 bits wide bitfield.


Hi.

Sorry I was too eager and I haven't realized that 2^4 - 1 can't fit 8 values ;)
Thus I'm suggesting simply removing the sanity check, as it does not make
sense to enlarge the bitfield. I'm also suggesting not numbering the values
of the enum.



How about keeping profile_uninitialized at the zero value location and 
asserting m_quality != profile_uninitialized ?
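
A small sketch of this suggestion, under the assumption that the quality is stored in a 3-bit field as described above. The enumerator names and the `quality_holder` struct are hypothetical stand-ins, not the real `profile_count` class.

```cpp
#include <cassert>

// Stand-in enum: the uninitialized marker sits at zero and the
// remaining values follow with default numbering (1..7).
enum profile_quality_sketch {
  sketch_uninitialized,
  sketch_guessed_local,
  sketch_guessed_global0,
  sketch_guessed_global0adjusted,
  sketch_guessed,
  sketch_afdo,
  sketch_adjusted,
  sketch_precise
};

constexpr unsigned quality_bits = 3;

// Eight enumerators (0..7) exactly fill a 3-bit field.
static_assert(static_cast<unsigned>(sketch_precise) == (1u << quality_bits) - 1,
              "largest quality value must fit in the bitfield");

struct quality_holder {
  unsigned m_quality : quality_bits;

  explicit quality_holder(profile_quality_sketch q) : m_quality(q) {}

  // The only value that can be rejected is the zero marker; a range
  // check against the largest enumerator would always be true here.
  bool verify() const { return m_quality != sketch_uninitialized; }
};
```

Because every bit pattern of the field is a valid enumerator, a check against the upper bound cannot fail, whereas comparing against the zero marker still catches objects that were never given a quality.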


Thanks,
- Tom



0001-Remove-profile_quality-sanity-check.patch


 From 0656d0dce5c26cf206ad4fcb21809a4aeb02ec42 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 18 Jan 2018 13:26:27 +0100
Subject: [PATCH] Remove profile_quality sanity check.

gcc/ChangeLog:

2018-01-18  Martin Liska  

* profile-count.h (enum profile_quality): Do not number values
as they are zero based.
(profile_count::verify): Remove sanity check.
(profile_probability::verify): Remove sanity check.
---
  gcc/profile-count.h | 18 +++---
  1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 7a43917ebbc..74ec9b465d3 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -30,30 +30,30 @@ enum profile_quality {
   or may not match reality.  It is local to function and can not be compared
   inter-procedurally.  Never used by probabilities (they are always local).
 */
-  profile_guessed_local = 1,
+  profile_guessed_local,
/* Profile was read by feedback and was 0, we used local heuristics to guess
   better.  This is the case of functions not run in profile fedback.
   Never used by probabilities.  */
-  profile_guessed_global0 = 2,
+  profile_guessed_global0,
  
/* Same as profile_guessed_global0 but global count is adjusted 0.  */
-  profile_guessed_global0adjusted = 3,
+  profile_guessed_global0adjusted,
  
/* Profile is based on static branch prediction heuristics.  It may or may
   not reflect the reality but it can be compared interprocedurally
   (for example, we inlined function w/o profile feedback into function
    with feedback and propagated from that).
   Never used by probablities.  */
-  profile_guessed = 4,
+  profile_guessed,
/* Profile was determined by autofdo.  */
-  profile_afdo = 5,
+  profile_afdo,
/* Profile was originally based on feedback but it was adjusted
   by code duplicating optimization.  It may not precisely reflect the
   particular code path.  */
-  profile_adjusted = 6,
+  profile_adjusted,
/* Profile was read from profile feedback or determined by accurate static
   method.  */
-  profile_precise = 7
+  profile_precise
  };
  
  /* The base value for branch probability notes and edge probabilities.  */

@@ -505,8 +505,6 @@ public:
/* Return false if profile_probability is bogus.  */
bool verify () const
  {
-  gcc_checking_assert (profile_guessed_local <= m_quality
-  && m_quality <= profile_precise);
if (m_val == uninitialized_probability)
return m_quality == profile_guessed;
else if (m_quality < profile_guessed)
@@ -786,8 +784,6 @@ public:
/* Return false if profile_count is bogus.  */
bool verify () const
  {
-  gcc_checking_assert (profile_guessed_local <= m_quality
-  && m_quality <= profile_precise);
return m_val != uninitialized_count || m_quality == profile_guessed_local;
  }
  





Re: [PATCH] Fix profile_quality sanity check.

2018-01-19 Thread Martin Liška
On 01/18/2018 04:57 PM, Tom de Vries wrote:
> On 01/18/2018 03:59 PM, Martin Liška wrote:
>> Hi.
>>
>> Following patch adds a new enum value so that we don't see following warning:
>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01211.html
>>
> 
> Hi,
> 
> with the patch, I still see the same warning.
> 
> And not surprisingly, given that profile_precise is still 7 and m_quality is 
> still a 3 bits wide bitfield.

Hi.

Sorry I was too eager and I haven't realized that 2^4 - 1 can't fit 8 values ;)
Thus I'm suggesting simply removing the sanity check, as it does not make
sense to enlarge the bitfield. I'm also suggesting not numbering the values
of the enum.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

> 
> So, I don't quite understand why you think that the patch would address the 
> warning.
> 
> Thanks,
> - Tom
> 
>> Apart from that I decided not to number the values of the enum, as it uses
>> default numbering. Is it welcome?
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-01-18  Martin Liska  
>>
>> * profile-count.h (enum profile_quality): Add
>> profile_uninitialized as the first value. Do not number values
>> as they are zero based.
>> ---
>>   gcc/profile-count.h | 16 +---
>>   1 file changed, 9 insertions(+), 7 deletions(-)
>>
>>
>>
>> 0001-Fix-profile_quality-sanity-check.patch
>>
>>
>> diff --git a/gcc/profile-count.h b/gcc/profile-count.h
>> index 7a43917ebbc..e899963118b 100644
>> --- a/gcc/profile-count.h
>> +++ b/gcc/profile-count.h
>> @@ -26,34 +26,36 @@ struct function;
>>   /* Quality of the profile count.  Because gengtype does not support enums
>>  inside of classes, this is in global namespace.  */
>>   enum profile_quality {
>> +  /* Uninitialized value.  */
>> +  profile_uninitialized,
>>     /* Profile is based on static branch prediction heuristics and may
>>    or may not match reality.  It is local to function and can not be 
>> compared
>>    inter-procedurally.  Never used by probabilities (they are always 
>> local).
>>  */
>> -  profile_guessed_local = 1,
>> +  profile_guessed_local,
>>     /* Profile was read by feedback and was 0, we used local heuristics to 
>> guess
>>    better.  This is the case of functions not run in profile fedback.
>>    Never used by probabilities.  */
>> -  profile_guessed_global0 = 2,
>> +  profile_guessed_global0,
>>       /* Same as profile_guessed_global0 but global count is adjusted 0.  */
>> -  profile_guessed_global0adjusted = 3,
>> +  profile_guessed_global0adjusted,
>>       /* Profile is based on static branch prediction heuristics.  It may or 
>> may
>>    not reflect the reality but it can be compared interprocedurally
>>    (for example, we inlined function w/o profile feedback into function
>>     with feedback and propagated from that).
>>    Never used by probablities.  */
>> -  profile_guessed = 4,
>> +  profile_guessed,
>>     /* Profile was determined by autofdo.  */
>> -  profile_afdo = 5,
>> +  profile_afdo,
>>     /* Profile was originally based on feedback but it was adjusted
>>    by code duplicating optimization.  It may not precisely reflect the
>>    particular code path.  */
>> -  profile_adjusted = 6,
>> +  profile_adjusted,
>>     /* Profile was read from profile feedback or determined by accurate 
>> static
>>    method.  */
>> -  profile_precise = 7
>> +  profile_precise
>>   };
>>     /* The base value for branch probability notes and edge probabilities.  
>> */
>>
> 

From 0656d0dce5c26cf206ad4fcb21809a4aeb02ec42 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 18 Jan 2018 13:26:27 +0100
Subject: [PATCH] Remove profile_quality sanity check.

gcc/ChangeLog:

2018-01-18  Martin Liska  

	* profile-count.h (enum profile_quality): Do not number values
	as they are zero based.
	(profile_count::verify): Remove sanity check.
	(profile_probability::verify): Remove sanity check.
---
 gcc/profile-count.h | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 7a43917ebbc..74ec9b465d3 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -30,30 +30,30 @@ enum profile_quality {
  or may not match reality.  It is local to function and can not be compared
  inter-procedurally.  Never used by probabilities (they are always local).
*/
-  profile_guessed_local = 1,
+  profile_guessed_local,
   /* Profile was read by feedback and was 0, we used local heuristics to guess
  better.  This is the case of functions not run in profile fedback.
  Never used by probabilities.  */
-  profile_guessed_global0 = 2,
+  profile_guessed_global0,
 
   /* Same as profile_guessed_global0 but global count is adjusted 0.  */
-  

Re: [PATCH 4/5] Remove predictors that are unrealiable.

2018-01-19 Thread Martin Liška
On 01/11/2018 11:39 AM, Jan Hubicka wrote:
>> These predictors are in my opinion not reliable and thus I decided to remove them:
>>
>> 1) PRED_NEGATIVE_RETURN: probability is ~51%
>> 2) PRED_RECURSIVE_CALL: there are 2 dominant edges that influence value to 63%;
>> w/o these edges it goes down to 52%
>> 3) PRED_POLYMORPHIC_CALL: having very low coverage, probability is ~51%
>> 4) PRED_INDIR_CALL: likewise
>>
>> Question here is whether we want to remove them, or to predict them with a 'ignored'
>> flag? Doing that, we can measure statistics of the predictor in the future?
> 
> I believe that recursive call was introduced to help the exchange2 benchmark.  I think it does
> make sense globally because a function simply can not contain a hot self-recursive call and thus
> I would not care about low benchmark coverage.

Yes it probably was. However according to numbers I have it does not influence the jumpy
benchmarks ;)

> 
> For polymorphic/indir call I think they are going to grow in importance in the future
> (especially the first one) and thus I would like to keep them tracked.  If you simply
> set the probability to PROB_EVEN, they won't affect branch prediction outcome and
> we still will get data on how common they are.

Sure, let's make them PROB_EVEN.
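
A minimal sketch of why an even probability leaves the outcome unchanged, assuming predictions are combined with a Dempster-Shafer style formula (my understanding of how predict.c combines heuristics; the real code works on fixed-point values, and `combine_predictions` is a hypothetical name for this floating-point illustration).

```cpp
#include <cassert>
#include <cmath>

// Combine two independent branch-taken probabilities, both in (0, 1).
// Evidence that agrees is multiplied, then normalized against the
// evidence for the opposite outcome.
double combine_predictions(double p1, double p2) {
  double agree = p1 * p2;
  double disagree = (1.0 - p1) * (1.0 - p2);
  return agree / (agree + disagree);
}
```

Under this formula, combining any probability p with 0.5 gives p back, since the 0.5 factor cancels in the numerator and denominator. A predictor set to an even probability is therefore still recorded in the statistics without shifting the combined prediction.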

> 
> For PRED_NEGATIVE_RETURN, can you take a look why it changed from 98 to 51%?
> The idea is that negative values are often used to report error codes and that seems
> reasonable.  Perhaps it can be made more specific so it remains working for spec2k16?

Yes, there's a huge difference between CPU 2006 and 2017. The former has 63% w/ dominant edges,
and the latter only 11%. It's caused by these 2 benchmarks with a high coverage:

500.perlbench_r: regexec.c.065i.profile:
  negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)

and 523.xalancbmk_r:
build/build_peak_gcc7-m64./NameDatatypeValidator.cpp.065i.profile:
  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)

Any ideas what to do with the predictor for the GCC 8 release?
Martin

> 
> Honza
>>
>> Martin
> 
>> >From afbc86cb72eab37bcf6325954d0bf306b301f76e Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Thu, 28 Dec 2017 10:23:48 +0100
>> Subject: [PATCH 4/5] Remove predictors that are unrealiable.
>>
>> gcc/ChangeLog:
>>
>> 2017-12-28  Martin Liska  
>>
>>  * predict.c (return_prediction): Do not predict
>>  PRED_NEGATIVE_RETURN.
>>  (tree_bb_level_predictions): Do not predict PRED_RECURSIVE_CALL.
>>  (tree_estimate_probability_bb): Do not predict
>>  PRED_POLYMORPHIC_CALL and PRED_INDIR_CALL.
>>  * predict.def (PRED_INDIR_CALL): Remove unused predictors.
>>  (PRED_POLYMORPHIC_CALL): Likewise.
>>  (PRED_RECURSIVE_CALL): Likewise.
>>  (PRED_NEGATIVE_RETURN): Likewise.
>> ---
>>  gcc/predict.c   | 17 ++---
>>  gcc/predict.def | 13 -
>>  2 files changed, 2 insertions(+), 28 deletions(-)
>>
>> diff --git a/gcc/predict.c b/gcc/predict.c
>> index 51fd14205c2..f53724792e9 100644
>> --- a/gcc/predict.c
>> +++ b/gcc/predict.c
>> @@ -2632,14 +2632,6 @@ return_prediction (tree val, enum prediction 
>> *prediction)
>>  }
>>else if (INTEGRAL_TYPE_P (TREE_TYPE (val)))
>>  {
>> -  /* Negative return values are often used to indicate
>> - errors.  */
>> -  if (TREE_CODE (val) == INTEGER_CST
>> -  && tree_int_cst_sgn (val) < 0)
>> -{
>> -  *prediction = NOT_TAKEN;
>> -  return PRED_NEGATIVE_RETURN;
>> -}
>>/* Constant return values seems to be commonly taken.
>>   Zero/one often represent booleans so exclude them from the
>>   heuristics.  */
>> @@ -2820,9 +2812,6 @@ tree_bb_level_predictions (void)
>> DECL_ATTRIBUTES (decl)))
>>  predict_paths_leading_to (bb, PRED_COLD_FUNCTION,
>>NOT_TAKEN);
>> -  if (decl && recursive_call_p (current_function_decl, decl))
>> -predict_paths_leading_to (bb, PRED_RECURSIVE_CALL,
>> -  NOT_TAKEN);
>>  }
>>else if (gimple_code (stmt) == GIMPLE_PREDICT)
>>  {
>> @@ -2880,12 +2869,10 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>>   something exceptional.  */
>>&& gimple_has_side_effects (stmt))
>>  {
>> +  /* Consider just normal function calls, skip indirect and
>> +  polymorphic calls as these tend to be unreliable.  */
>>if (gimple_call_fndecl (stmt))
>>  predict_edge_def (e, PRED_CALL, NOT_TAKEN);
>> -  else if (virtual_method_call_p (gimple_call_fn (stmt)))
>> -predict_edge_def (e, PRED_POLYMORPHIC_CALL, NOT_TAKEN);
>> -  else
>> -predict_edge_def (e, PRED_INDIR_CALL, TAKEN);
>>

Re: Check whether any statements need masking (PR 83922)

2018-01-19 Thread Richard Biener
On Fri, Jan 19, 2018 at 10:55 AM, Richard Sandiford
 wrote:
> This PR is an odd case in which, due to the low optimisation level,
> we enter vectorisation with:
>
>   outer1:
> x_1 = PHI ;
> ...
>
>   inner:
> x_2 = 0;
> ...
>
>   outer2:
> x_3 = PHI ;
>
> These statements are tentatively treated as a double reduction by
> vect_force_simple_reduction, but in the end only x_3 and x_2 are marked
> as relevant.  vect_analyze_loop_operations skips over x_3, leaving the
> vectorizable_reduction check to a presumed future test of x_1, which
> in this case never happens.  We therefore end up vectorising x_2 only
> (complete with peeling for niters!) and leave the scalar x_3 in place.
>
> This caused a segfault in the support for fully-masked loops,
> since there were no statements that needed masking.  Fixed by
> checking for that.
>
> But I think this is also a flaw in vect_analyze_loop_operations.
> Outer loop vectorisation reduces the number of times that the
> inner loop is executed, so it wouldn't necessarily be valid
> to leave the scalar x_3 in place for all vectorisable x_2.
> There's already code to forbid that when x_1 isn't present:
>
>   /* FORNOW: we currently don't support the case that these phis
>  are not used in the outerloop (unless it is double reduction,
>  i.e., this phi is vect_reduction_def), cause this case
>  requires to actually do something here.  */
>
> I think we need to do the same if x_1 is present but not relevant.

Hmm, yeah...

> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> OK to install?

Ok.

Richard.

> Richard
>
>
> 2018-01-19  Richard Sandiford  
>
> gcc/
> PR tree-optimization/83922
> * tree-vect-loop.c (vect_verify_full_masking): Return false if
> there are no statements that need masking.
> (vect_active_double_reduction_p): New function.
> (vect_analyze_loop_operations): Use it when handling phis that
> are not in the loop header.
>
> gcc/testsuite/
> PR tree-optimization/83922
> * gcc.dg/pr83922.c: New test.
>
> Index: gcc/tree-vect-loop.c
> ===
> --- gcc/tree-vect-loop.c2018-01-19 09:36:33.409191362 +
> +++ gcc/tree-vect-loop.c2018-01-19 09:52:00.681330865 +
> @@ -1294,6 +1294,12 @@ vect_verify_full_masking (loop_vec_info
>struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>unsigned int min_ni_width;
>
> +  /* Use a normal loop if there are no statements that need masking.
> + This only happens in rare degenerate cases: it means that the loop
> + has no loads, no stores, and no live-out values.  */
> +  if (LOOP_VINFO_MASKS (loop_vinfo).is_empty ())
> +return false;
> +
>/* Get the maximum number of iterations that is representable
>   in the counter type.  */
>tree ni_type = TREE_TYPE (LOOP_VINFO_NITERSM1 (loop_vinfo));
> @@ -1739,6 +1745,33 @@ vect_update_vf_for_slp (loop_vec_info lo
>  }
>  }
>
> +/* Return true if STMT_INFO describes a double reduction phi and if
> +   the other phi in the reduction is also relevant for vectorization.
> +   This rejects cases such as:
> +
> +  outer1:
> +   x_1 = PHI ;
> +   ...
> +
> +  inner:
> +   x_2 = ...;
> +   ...
> +
> +  outer2:
> +   x_3 = PHI ;
> +
> +   if nothing in x_2 or elsewhere makes x_1 relevant.  */
> +
> +static bool
> +vect_active_double_reduction_p (stmt_vec_info stmt_info)
> +{
> +  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def)
> +return false;
> +
> +  gimple *other_phi = STMT_VINFO_REDUC_DEF (stmt_info);
> +  return STMT_VINFO_RELEVANT_P (vinfo_for_stmt (other_phi));
> +}
> +
>  /* Function vect_analyze_loop_operations.
>
> Scan the loop stmts and make sure they are all vectorizable.  */
> @@ -1786,8 +1819,7 @@ vect_analyze_loop_operations (loop_vec_i
>   i.e., this phi is vect_reduction_def), cause this case
>   requires to actually do something here.  */
>if (STMT_VINFO_LIVE_P (stmt_info)
> -  && STMT_VINFO_DEF_TYPE (stmt_info)
> - != vect_double_reduction_def)
> + && !vect_active_double_reduction_p (stmt_info))
>  {
>if (dump_enabled_p ())
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> Index: gcc/testsuite/gcc.dg/pr83922.c
> ===
> --- /dev/null   2018-01-19 09:30:49.543814408 +
> +++ gcc/testsuite/gcc.dg/pr83922.c  2018-01-19 09:52:00.680331041 +
> @@ -0,0 +1,21 @@
> +/* { dg-options "-O -ftree-vectorize" } */
> +
> +int j4;
> +
> +void
> +k1 (int ak)
> +{
> +  while (ak < 1)
> +{
> +  

Re: Avoid ICE for nested inductions (PR 83914)

2018-01-19 Thread Richard Biener
On Fri, Jan 19, 2018 at 10:45 AM, Richard Sandiford
 wrote:
> This testcase ICEd because we converted the initial value of an
> induction to the vector element type even for nested inductions.
> This isn't necessary because the initial expression is vectorised
> normally, and it meant that init_expr was no longer the original
> statement operand by the time we called vect_get_vec_def_for_operand.
>
> Also, adding the conversion code here made the existing SLP conversion
> redundant.
>
> It looks like something went wrong when rebasing the peeling via masks
> work on top of the support for SLP reductions, sorry about that.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> OK to install?

Ok.

Richard.

> Richard
>
>
> 2018-01-19  Richard Sandiford  
>
> gcc/
> PR tree-optimization/83914
> * tree-vect-loop.c (vectorizable_induction): Don't convert
> init_expr or apply the peeling adjustment for inductions
> that are nested within the vectorized loop.  Remove redundant
> conversion code.
>
> gcc/testsuite/
> PR tree-optimization/83914
> * gcc.dg/vect/pr83914.c: New test.
>
> Index: gcc/tree-vect-loop.c
> ===
> --- gcc/tree-vect-loop.c2018-01-16 15:13:24.938622242 +
> +++ gcc/tree-vect-loop.c2018-01-19 09:36:33.409191362 +
> @@ -7678,28 +7678,33 @@ vectorizable_induction (gimple *phi,
>init_expr = PHI_ARG_DEF_FROM_EDGE (phi,
>  loop_preheader_edge (iv_loop));
>
> -  /* Convert the initial value and step to the desired type.  */
>stmts = NULL;
> -  init_expr = gimple_convert (&stmts, TREE_TYPE (vectype), init_expr);
> -  step_expr = gimple_convert (&stmts, TREE_TYPE (vectype), step_expr);
> -
> -  /* If we are using the loop mask to "peel" for alignment then we need
> - to adjust the start value here.  */
> -  tree skip_niters = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo);
> -  if (skip_niters != NULL_TREE)
> +  if (!nested_in_vect_loop)
>  {
> -  if (FLOAT_TYPE_P (vectype))
> -   skip_niters = gimple_build (&stmts, FLOAT_EXPR, TREE_TYPE (vectype),
> -   skip_niters);
> -  else
> -   skip_niters = gimple_convert (&stmts, TREE_TYPE (vectype),
> - skip_niters);
> -  tree skip_step = gimple_build (&stmts, MULT_EXPR, TREE_TYPE (vectype),
> -skip_niters, step_expr);
> -  init_expr = gimple_build (&stmts, MINUS_EXPR, TREE_TYPE (vectype),
> -   init_expr, skip_step);
> +  /* Convert the initial value to the desired type.  */
> +  tree new_type = TREE_TYPE (vectype);
> +  init_expr = gimple_convert (&stmts, new_type, init_expr);
> +
> +  /* If we are using the loop mask to "peel" for alignment then we need
> +to adjust the start value here.  */
> +  tree skip_niters = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo);
> +  if (skip_niters != NULL_TREE)
> +   {
> + if (FLOAT_TYPE_P (vectype))
> +   skip_niters = gimple_build (&stmts, FLOAT_EXPR, new_type,
> +   skip_niters);
> + else
> +   skip_niters = gimple_convert (&stmts, new_type, skip_niters);
> + tree skip_step = gimple_build (&stmts, MULT_EXPR, new_type,
> +skip_niters, step_expr);
> + init_expr = gimple_build (&stmts, MINUS_EXPR, new_type,
> +   init_expr, skip_step);
> +   }
>  }
>
> +  /* Convert the step to the desired type.  */
> +  step_expr = gimple_convert (&stmts, TREE_TYPE (vectype), step_expr);
> +
>if (stmts)
>  {
>new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
> @@ -7718,15 +7723,6 @@ vectorizable_induction (gimple *phi,
>/* Enforced above.  */
>unsigned int const_nunits = nunits.to_constant ();
>
> -  /* Convert the init to the desired type.  */
> -  stmts = NULL;
> -  init_expr = gimple_convert (&stmts, TREE_TYPE (vectype), init_expr);
> -  if (stmts)
> -   {
> - new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
> - gcc_assert (!new_bb);
> -   }
> -
>/* Generate [VF*S, VF*S, ... ].  */
>if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
> {
> Index: gcc/testsuite/gcc.dg/vect/pr83914.c
> ===
> --- /dev/null   2018-01-19 09:30:49.543814408 +
> +++ gcc/testsuite/gcc.dg/vect/pr83914.c 2018-01-19 09:36:33.405191141 +
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +
> +struct s { struct s *ptrs[16]; } *a, *b;
> +int c;
> +void
> +foo (int n)
> +{
> +  for (; n; a = b, n--)
> +{
> +  b = a + 1;
> +  for (c = 8; c; c--)
> +   a->ptrs[c] = b;
> +}
> +}


[PATCH] Add gcc.dg/stack-check-16.c

2018-01-19 Thread Jakub Jelinek
Hi!

This patch adds a new testcase; I'm not exactly sure of its origin or what
the original problem was.  The changes I've made are:
1) macroize, so that the test is just a few lines rather than 160KB,
   verified -fdump-tree-gimple printf call is identical between this
   and the original test
2) remove optimize(0) attribute, the test is compiled with -O0
3) use __builtin_alloca instead of alloca and add mtrace prototype,
   remove -w because no warnings are emitted any longer

The test passes at least on x86_64-linux with -m32/-m64.  Ok for trunk?
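
The macroization in 1) relies on stringifying a pasted token to build
positional "%N$s" conversions.  A reduced sketch of the same trick with just
two digits; demo_format is a hypothetical helper and not part of the test:

```c
/* Same stringify-and-paste scheme as the test, reduced to two digits.
   B forwards to A so that the pasted token is what gets stringified.  */
#define A(a) "%" #a "$s"
#define B(a) A(a)
#define C(a,b) B(a##b)

/* Hypothetical helper: the adjacent string literals concatenate into
   "%1$s%2$s%10$s%11$s".  An empty first argument yields a single digit.  */
static const char *
demo_format (void)
{
  return C(,1) C(,2) C(1,0) C(1,1);
}
```

The real test nests this several levels deeper, so that a handful of macro
definitions expand to a very long format string.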

2018-01-19  Jeff Law  
Jakub Jelinek  

* gcc.dg/stack-check-16.c: New test.

--- gcc/testsuite/gcc.dg/stack-check-16.c.jj   2018-01-19 11:58:39.325950389 +0100
+++ gcc/testsuite/gcc.dg/stack-check-16.c   2018-01-19 11:50:34.856026594 +0100
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-fstack-clash-protection" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+int printf (const char *, ...);
+void blah (char *space) { }
+void mtrace (void);
+
+int do_test (void)
+{
+blah (__builtin_alloca (10));
+mtrace ();
+printf (
+#define A(a) "%" #a "$s"
+#define B(a) A(a)
+#define C(a,b,c,d) B(a##b##c##d)
+#define D(a,b,c) C(a,b,c,1) C(a,b,c,2) C(a,b,c,3) C(a,b,c,4) C(a,b,c,5) \
+C(a,b,c,6) C(a,b,c,7) C(a,b,c,8) C(a,b,c,9)
+#define E(a,b,c) C(a,b,c,0) D(a,b,c)
+#define F(a,b) E(a,b,1) E(a,b,2) E(a,b,3) E(a,b,4) E(a,b,5) \
+  E(a,b,6) E(a,b,7) E(a,b,8) E(a,b,9)
+#define G(a,b) E(a,b,0) F(a,b)
+#define H(a) G(a,1) G(a,2) G(a,3) G(a,4) G(a,5) G(a,6) G(a,7) G(a,8) G(a,9)
+#define I(a) G(a,0) H(a)
+#define J I(1) I(2) I(3) I(4) I(5) I(6) I(7) I(8) I(9)
+   D(,,)
+   F(,)
+   H()
+   J
+   C(10,0,0,0) C(10,0,0,1),
+#undef A
+#define A(a) "a",
+   I(0) J
+   "\n");
+  return 0;
+}

Jakub


Re: [PATCH][arm] XFAIL advsimd-intrinsics/vld1x2.c

2018-01-19 Thread Kyrill Tkachov

Ping.

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00913.html

Thanks,
Kyrill

On 11/01/18 15:33, Kyrill Tkachov wrote:

Hi all,

This recently added test fails on arm. We haven't implemented these
intrinsics for arm (any volunteers?) so for now let's XFAIL these on that
target.
Also, the float64 versions of these intrinsics are not supposed to be
available on arm, so this patch slightly adjusts the test to not include
them for aarch32. In any case the entire test is XFAILed on arm, so this
doesn't have any noticeable effect.

The same number of PASSes still occurs on aarch64, while on arm the tests
now appear as XFAIL rather than FAIL.

Ok for trunk? (from an aarch64 perspective).

Thanks,
Kyrill

2018-01-11  Kyrylo Tkachov  

 * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Make float64
 tests specific to aarch64.  XFAIL test on arm.




[PATCH][arm] Fix gcc.target/arm/negdi-[12].c

2018-01-19 Thread Kyrill Tkachov

Hi all,

These tests are failing for a silly reason. They scan for an occurrence of
the NEGS instruction.
NEGS (and NEG in general) is a pre-UAL alias of RSB with an immediate of 0
and we only emit it in one pattern: *thumb2_negsi2_short in thumb2.md. In
all other instances of negation we emit the modern RSB mnemonic. This causes
needless differences in assembly output.
For example, for these testcases we emit NEG when compiling for
-march=armv7-a, but for armv7ve we emit RSB, causing the scan-assembler
tests to fail.

This patch updates the *thumb2_negsi2_short pattern to use the RSB mnemonic and
fixes the flaky scan-assembler directives.
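
To illustrate why the two mnemonics are interchangeable, here is a small C
model of the arithmetic (the function names are made up for illustration
and only model the result value, not the flags):

```c
#include <stdint.h>

/* Model of RSB Rd, Rn, #imm: reverse subtract, Rd = imm - Rn.  */
static uint32_t
model_rsb (uint32_t rn, uint32_t imm)
{
  return imm - rn;
}

/* Model of the pre-UAL NEG Rd, Rn alias: two's complement negation.  */
static uint32_t
model_neg (uint32_t rn)
{
  return ~rn + 1u;
}
```

For every input, model_neg (x) equals model_rsb (x, 0), so emitting RSB with
#0 everywhere loses nothing.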

These tests now pass for my compiler configured with:
--with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard --with-mode=thumb

Bootstrapped and tested on arm-none-linux-gnueabihf as well.

Committing to trunk.
Thanks,
Kyrill

2018-01-19  Kyrylo Tkachov  

* config/arm/thumb2.md (*thumb2_negsi2_short): Use RSB mnemonic
instead of NEG.

2018-01-19  Kyrylo Tkachov  

* gcc.target/arm/negdi-1.c: Remove bogus assembler scan for negs.
* gcc.target/arm/negdi-2.c: Likewise.
* gcc.target/arm/thumb-16bit-ops.c: Replace scan for NEGS with RSBS.
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index e2e2298957a4b1790008a59d5e6cf9092ad2b00f..8eb20003ab2f6df847d9344ea78e0175ce6dd902 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1420,7 +1420,7 @@ (define_insn "*thumb2_negsi2_short"
 	(neg:SI (match_operand:SI 1 "low_register_operand" "l")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_THUMB2 && reload_completed"
-  "neg%!\t%0, %1"
+  "rsb%!\t%0, %1, #0"
   [(set_attr "predicable" "yes")
(set_attr "length" "2")
(set_attr "type" "alu_sreg")]
diff --git a/gcc/testsuite/gcc.target/arm/negdi-1.c b/gcc/testsuite/gcc.target/arm/negdi-1.c
index c9bef049c4a467288e90c96abe1a34e2d1e028fc..efa49ad62800378ba3ee810931680c159479c7b9 100644
--- a/gcc/testsuite/gcc.target/arm/negdi-1.c
+++ b/gcc/testsuite/gcc.target/arm/negdi-1.c
@@ -12,6 +12,5 @@ Expected output:
 	rsb	r0, r0, #0
 	mov	r1, r0, asr #31
 */
-/* { dg-final { scan-assembler-times "rsb" 1 { target { arm_nothumb } } } } */
-/* { dg-final { scan-assembler-times "negs\\t" 1 { target { ! { arm_nothumb } } } } } */
+/* { dg-final { scan-assembler-times "rsbs?\\t...?, ...?, #0" 1 } } */
 /* { dg-final { scan-assembler-times "asr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/negdi-2.c b/gcc/testsuite/gcc.target/arm/negdi-2.c
index c20ea9c010e7f5de67a953730efa680146b3..38dffeddd54c14d64c39b35992bbb7225efbeb7f 100644
--- a/gcc/testsuite/gcc.target/arm/negdi-2.c
+++ b/gcc/testsuite/gcc.target/arm/negdi-2.c
@@ -11,6 +11,5 @@ Expected output:
 	rsb	r0, r0, #0
 	mov	r1, #0
 */
-/* { dg-final { scan-assembler-times "rsb\\t...?, ...?, #0" 1 { target { arm_nothumb } } } } */
-/* { dg-final { scan-assembler-times "negs\\t...?, ...?" 1 { target { ! arm_nothumb } } } } */
+/* { dg-final { scan-assembler-times "rsbs?\\t...?, ...?, #0" 1 } } */
 /* { dg-final { scan-assembler-times "mov" 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c b/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
index 90407eb6872efe9c545c5945de17a2eead91eead..9f4f659b35c062c9c576e9bb2d6aae74f91f6a86 100644
--- a/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
+++ b/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
@@ -200,4 +200,4 @@ s (int a, int b)
   return -b;
 }
 
-/* { dg-final { scan-assembler "negs	r0, r1" } } */
+/* { dg-final { scan-assembler "rsbs	r0, r1, #0" } } */


[PATCH][arm] Fix gcc.target/arm/pr40956.c

2018-01-19 Thread Kyrill Tkachov

Hi all,

The scan-assembler tests here check for MOVS for Thumb1 and MOV for Thumb2,
but in fact there's no reason why we wouldn't generate MOVS for Thumb2 as well,
it really depends on a lot of optimisation decisions.
The only behaviour we want to test is that we move a 0 constant into a register
only once, which can be achieved with either MOV or MOVS.
Simplify the check by always checking for either MOV or MOVS.
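
A rough way to try out the relaxed pattern outside the testsuite, assuming
POSIX regcomp/regexec as a stand-in for DejaGnu's Tcl regexps (the two
dialects agree for this simple pattern, but that is an assumption, and
line_matches is a hypothetical helper):

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches the extended regex PATTERN, 0 otherwise
   (also 0 if PATTERN fails to compile).  */
static int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  int found = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

Calling it with "movs?[\t ]*r., #0" accepts both "mov r3, #0" and
"movs r3, #0", which is the point of the change.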

Committing to trunk.
Thanks,
Kyrill

2018-01-19  Kyrylo Tkachov  

* gcc.target/arm/pr40956.c: Adjust scan-assembler pattern.
diff --git a/gcc/testsuite/gcc.target/arm/pr40956.c b/gcc/testsuite/gcc.target/arm/pr40956.c
index 4fefa49a5878a90e3a78db7d3df1015a36d0897f..7429272a8c26a00ed40ead45d8d5737986e4bb0a 100644
--- a/gcc/testsuite/gcc.target/arm/pr40956.c
+++ b/gcc/testsuite/gcc.target/arm/pr40956.c
@@ -1,8 +1,7 @@
 /* { dg-options "-Os -fpic" }  */
 /* { dg-require-effective-target fpic } */
 /* Make sure the constant "0" is loaded into register only once.  */
-/* { dg-final { scan-assembler-times "movs\[\\t \]*r., #0" 1 { target arm_thumb1 } } } */
-/* { dg-final { scan-assembler-times "mov\[\\t \]*r., #0" 1 { target { ! arm_thumb1 } } } } */
+/* { dg-final { scan-assembler-times "movs?\[\\t \]*r., #0" 1 } } */
 
 int foo(int p, int* q)
 {


Re: [libstdc++][testsuite] Fix dg-options/dg-add-options order

2018-01-19 Thread Christophe Lyon
On 19 January 2018 at 11:12, Jonathan Wakely  wrote:
> On 19/01/18 09:44 +0100, Christophe Lyon wrote:
>>
>> Hi,
>>
>> While checking dg directives order, I've noticed a few libstdc++ tests
>> where the current order is:
>> // { dg-add-options ieee }
>> // { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__" }
>> while dg-add-options should be after dg-options.
>
>
> Yes, those are wrong. I have a script to check for this but have
> forgotten to run it recently:
>
Me too :)

> ./testsuite/ext/special_functions/airy_ai/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/ext/special_functions/hyperg/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/ext/special_functions/conf_hyperg/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/ext/special_functions/airy_bi/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/02_assoc_legendre/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/14_expint/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/12_ellint_2/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/09_cyl_bessel_k/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/21_sph_neumann/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/15_hermite/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/19_sph_bessel/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/05_comp_ellint_2/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/11_ellint_1/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/17_legendre/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/10_cyl_neumann/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/06_comp_ellint_3/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/06_comp_ellint_3/pr66689.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/01_assoc_laguerre/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/16_laguerre/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/13_ellint_3/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/13_ellint_3/pr66689.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/07_cyl_bessel_i/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/18_riemann_zeta/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/08_cyl_bessel_j/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/04_comp_ellint_1/check_nan.cc has dg-options
> after dg-add-options
> ./testsuite/special_functions/03_beta/check_nan.cc has dg-options after
> dg-add-options
> ./testsuite/special_functions/20_sph_legendre/check_nan.cc has dg-options
> after dg-add-options
>
>
>> The attached patch fixes that, is that OK for stage4?
>
>
> Yes, testsuite changes are OK. Thanks!
>
OK, thanks


Re: [libstdc++][testsuite] Fix dg-options/dg-add-options order

2018-01-19 Thread Jonathan Wakely

On 19/01/18 09:44 +0100, Christophe Lyon wrote:

Hi,

While checking dg directives order, I've noticed a few libstdc++ tests
where the current order is:
// { dg-add-options ieee }
// { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__" }
while dg-add-options should be after dg-options.


Yes, those are wrong. I have a script to check for this but have
forgotten to run it recently:

./testsuite/ext/special_functions/airy_ai/check_nan.cc has dg-options after 
dg-add-options
./testsuite/ext/special_functions/hyperg/check_nan.cc has dg-options after 
dg-add-options
./testsuite/ext/special_functions/conf_hyperg/check_nan.cc has dg-options after 
dg-add-options
./testsuite/ext/special_functions/airy_bi/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/02_assoc_legendre/check_nan.cc has dg-options 
after dg-add-options
./testsuite/special_functions/14_expint/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/12_ellint_2/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/09_cyl_bessel_k/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/21_sph_neumann/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/15_hermite/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/19_sph_bessel/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/05_comp_ellint_2/check_nan.cc has dg-options 
after dg-add-options
./testsuite/special_functions/11_ellint_1/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/17_legendre/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/10_cyl_neumann/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/06_comp_ellint_3/check_nan.cc has dg-options 
after dg-add-options
./testsuite/special_functions/06_comp_ellint_3/pr66689.cc has dg-options after 
dg-add-options
./testsuite/special_functions/01_assoc_laguerre/check_nan.cc has dg-options 
after dg-add-options
./testsuite/special_functions/16_laguerre/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/13_ellint_3/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/13_ellint_3/pr66689.cc has dg-options after 
dg-add-options
./testsuite/special_functions/07_cyl_bessel_i/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/18_riemann_zeta/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/08_cyl_bessel_j/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/04_comp_ellint_1/check_nan.cc has dg-options 
after dg-add-options
./testsuite/special_functions/03_beta/check_nan.cc has dg-options after 
dg-add-options
./testsuite/special_functions/20_sph_legendre/check_nan.cc has dg-options after 
dg-add-options
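
The kind of check that produces a list like the one above can be sketched
in a few lines of C. This is not the actual script, just an illustration
that only looks at the first occurrence of each directive in the file
contents; dg_order_is_wrong is a hypothetical name:

```c
#include <string.h>

/* Return 1 if TEXT contains a dg-add-options directive that comes
   before the first dg-options directive, 0 otherwise.  Note that
   "dg-options" is not a substring of "dg-add-options", so the two
   searches do not interfere.  */
static int
dg_order_is_wrong (const char *text)
{
  const char *add = strstr (text, "dg-add-options");
  const char *opt = strstr (text, "dg-options");
  return add != NULL && opt != NULL && add < opt;
}
```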



The attached patch fixes that, is that OK for stage4?


Yes, testsuite changes are OK. Thanks!



Check whether any statements need masking (PR 83922)

2018-01-19 Thread Richard Sandiford
This PR is an odd case in which, due to the low optimisation level,
we enter vectorisation with:

  outer1:
x_1 = PHI ;
...

  inner:
x_2 = 0;
...

  outer2:
x_3 = PHI ;

These statements are tentatively treated as a double reduction by
vect_force_simple_reduction, but in the end only x_3 and x_2 are marked
as relevant.  vect_analyze_loop_operations skips over x_3, leaving the
vectorizable_reduction check to a presumed future test of x_1, which
in this case never happens.  We therefore end up vectorising x_2 only
(complete with peeling for niters!) and leave the scalar x_3 in place.

This caused a segfault in the support for fully-masked loops,
since there were no statements that needed masking.  Fixed by
checking for that.

But I think this is also a flaw in vect_analyze_loop_operations.
Outer loop vectorisation reduces the number of times that the
inner loop is executed, so it wouldn't necessarily be valid
to leave the scalar x_3 in place for all vectorisable x_2.
There's already code to forbid that when x_1 isn't present:

  /* FORNOW: we currently don't support the case that these phis
 are not used in the outerloop (unless it is double reduction,
 i.e., this phi is vect_reduction_def), cause this case
 requires to actually do something here.  */

I think we need to do the same if x_1 is present but not relevant.
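
As a toy model of that rule (the types below are simplified stand-ins, not
GCC's stmt_vec_info or its accessors): a phi only counts as an active
double reduction if the other phi of the pair is itself relevant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the vectorizer's per-statement data.  */
enum toy_def_type { TOY_INTERNAL_DEF, TOY_REDUCTION_DEF, TOY_DOUBLE_REDUCTION_DEF };

struct toy_stmt
{
  enum toy_def_type def_type;  /* How the statement was classified.  */
  bool relevant;               /* Whether it was marked as relevant.  */
  struct toy_stmt *reduc_def;  /* The other phi of the reduction, if any.  */
};

/* Return true if STMT is a double reduction phi whose partner phi
   is also relevant.  */
static bool
toy_active_double_reduction_p (const struct toy_stmt *stmt)
{
  if (stmt->def_type != TOY_DOUBLE_REDUCTION_DEF)
    return false;
  return stmt->reduc_def != NULL && stmt->reduc_def->relevant;
}
```

In the PR's situation x_3 is classified as a double reduction but x_1 is not
relevant, so the predicate is false and the existing FORNOW rejection
applies.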

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2018-01-19  Richard Sandiford  

gcc/
PR tree-optimization/83922
* tree-vect-loop.c (vect_verify_full_masking): Return false if
there are no statements that need masking.
(vect_active_double_reduction_p): New function.
(vect_analyze_loop_operations): Use it when handling phis that
are not in the loop header.

gcc/testsuite/
PR tree-optimization/83922
* gcc.dg/pr83922.c: New test.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-01-19 09:36:33.409191362 +
+++ gcc/tree-vect-loop.c2018-01-19 09:52:00.681330865 +
@@ -1294,6 +1294,12 @@ vect_verify_full_masking (loop_vec_info
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   unsigned int min_ni_width;
 
+  /* Use a normal loop if there are no statements that need masking.
+ This only happens in rare degenerate cases: it means that the loop
+ has no loads, no stores, and no live-out values.  */
+  if (LOOP_VINFO_MASKS (loop_vinfo).is_empty ())
+return false;
+
   /* Get the maximum number of iterations that is representable
  in the counter type.  */
   tree ni_type = TREE_TYPE (LOOP_VINFO_NITERSM1 (loop_vinfo));
@@ -1739,6 +1745,33 @@ vect_update_vf_for_slp (loop_vec_info lo
 }
 }
 
+/* Return true if STMT_INFO describes a double reduction phi and if
+   the other phi in the reduction is also relevant for vectorization.
+   This rejects cases such as:
+
+  outer1:
+   x_1 = PHI ;
+   ...
+
+  inner:
+   x_2 = ...;
+   ...
+
+  outer2:
+   x_3 = PHI ;
+
+   if nothing in x_2 or elsewhere makes x_1 relevant.  */
+
+static bool
+vect_active_double_reduction_p (stmt_vec_info stmt_info)
+{
+  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def)
+return false;
+
+  gimple *other_phi = STMT_VINFO_REDUC_DEF (stmt_info);
+  return STMT_VINFO_RELEVANT_P (vinfo_for_stmt (other_phi));
+}
+
 /* Function vect_analyze_loop_operations.
 
Scan the loop stmts and make sure they are all vectorizable.  */
@@ -1786,8 +1819,7 @@ vect_analyze_loop_operations (loop_vec_i
  i.e., this phi is vect_reduction_def), cause this case
  requires to actually do something here.  */
   if (STMT_VINFO_LIVE_P (stmt_info)
-  && STMT_VINFO_DEF_TYPE (stmt_info)
- != vect_double_reduction_def)
+ && !vect_active_double_reduction_p (stmt_info))
 {
   if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
Index: gcc/testsuite/gcc.dg/pr83922.c
===
--- /dev/null   2018-01-19 09:30:49.543814408 +
+++ gcc/testsuite/gcc.dg/pr83922.c  2018-01-19 09:52:00.680331041 +
@@ -0,0 +1,21 @@
+/* { dg-options "-O -ftree-vectorize" } */
+
+int j4;
+
+void
+k1 (int ak)
+{
+  while (ak < 1)
+{
+  int ur;
+
+  for (ur = 0; ur < 2; ++ur)
+{
+  ++j4;
+  if (j4 != 0)
+j4 = 0;
+}
+
+  ++ak;
+}
+}


[PATCH][arm] Fix gcc.target/arm/pr79058.c

2018-01-19 Thread Kyrill Tkachov

Hi all,

This testcase tests 32-bit ARM state functionality, so add -marm to make
that explicit and to avoid Thumb1 hard-float errors for certain toolchain
configurations.

Committing to trunk.
Thanks,
Kyrill

2018-01-19  Kyrylo Tkachov  

* gcc.target/arm/pr79058.c: Add arm_arm_ok check and -marm to options.
diff --git a/gcc/testsuite/gcc.target/arm/pr79058.c b/gcc/testsuite/gcc.target/arm/pr79058.c
index f2841f514df36c2f56f23cb690d56a9a13fb9184..54a1d8aa0072de8387973285a1a9a9adb91abf9f 100644
--- a/gcc/testsuite/gcc.target/arm/pr79058.c
+++ b/gcc/testsuite/gcc.target/arm/pr79058.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arm_ok } */
 /* { dg-skip-if "do not override -mcpu" { *-*-* } { "-mcpu=*" } { "-mcpu=arm7tdmi" } } */
-/* { dg-options "-Os -mbig-endian -mcpu=arm7tdmi" } */
+/* { dg-options "-Os -mbig-endian -marm -mcpu=arm7tdmi" } */
 
 enum { NILFS_SEGMENT_USAGE_ACTIVE, NILFS_SEGMENT_USAGE_DIRTY } a;
 


Avoid ICE for nested inductions (PR 83914)

2018-01-19 Thread Richard Sandiford
This testcase ICEd because we converted the initial value of an
induction to the vector element type even for nested inductions.
This isn't necessary because the initial expression is vectorised
normally, and it meant that init_expr was no longer the original
statement operand by the time we called vect_get_vec_def_for_operand.

Also, adding the conversion code here made the existing SLP conversion
redundant.

It looks like something went wrong when rebasing the peeling via masks
work on top of the support for SLP reductions, sorry about that.
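
For readers unfamiliar with the peeling adjustment that the patch restricts
to non-nested inductions: when the loop mask skips some leading iterations,
the start value is moved back by skip * step so that the first active
iteration still sees the original initial value. A scalar sketch, with
hypothetical helper names and plain long arithmetic rather than trees:

```c
/* Start value adjusted for SKIP masked-off leading iterations,
   mirroring init_expr - skip_niters * step_expr.  */
static long
adjusted_init (long init, long step, long skip)
{
  return init - skip * step;
}

/* Value of the induction in iteration LANE when starting from START.  */
static long
lane_value (long start, long step, long lane)
{
  return start + lane * step;
}
```

With init 10, step 3 and 2 skipped iterations, the adjusted start is 4 and
iteration 2 (the first active one) gets 10 again.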

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2018-01-19  Richard Sandiford  

gcc/
PR tree-optimization/83914
* tree-vect-loop.c (vectorizable_induction): Don't convert
init_expr or apply the peeling adjustment for inductions
that are nested within the vectorized loop.  Remove redundant
conversion code.

gcc/testsuite/
PR tree-optimization/83914
* gcc.dg/vect/pr83914.c: New test.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-01-16 15:13:24.938622242 +
+++ gcc/tree-vect-loop.c2018-01-19 09:36:33.409191362 +
@@ -7678,28 +7678,33 @@ vectorizable_induction (gimple *phi,
   init_expr = PHI_ARG_DEF_FROM_EDGE (phi,
 loop_preheader_edge (iv_loop));
 
-  /* Convert the initial value and step to the desired type.  */
   stmts = NULL;
-  init_expr = gimple_convert (&stmts, TREE_TYPE (vectype), init_expr);
-  step_expr = gimple_convert (&stmts, TREE_TYPE (vectype), step_expr);
-
-  /* If we are using the loop mask to "peel" for alignment then we need
- to adjust the start value here.  */
-  tree skip_niters = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo);
-  if (skip_niters != NULL_TREE)
+  if (!nested_in_vect_loop)
 {
-  if (FLOAT_TYPE_P (vectype))
-   skip_niters = gimple_build (&stmts, FLOAT_EXPR, TREE_TYPE (vectype),
-   skip_niters);
-  else
-   skip_niters = gimple_convert (&stmts, TREE_TYPE (vectype),
- skip_niters);
-  tree skip_step = gimple_build (&stmts, MULT_EXPR, TREE_TYPE (vectype),
-skip_niters, step_expr);
-  init_expr = gimple_build (&stmts, MINUS_EXPR, TREE_TYPE (vectype),
-   init_expr, skip_step);
+  /* Convert the initial value to the desired type.  */
+  tree new_type = TREE_TYPE (vectype);
+  init_expr = gimple_convert (&stmts, new_type, init_expr);
+
+  /* If we are using the loop mask to "peel" for alignment then we need
+to adjust the start value here.  */
+  tree skip_niters = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo);
+  if (skip_niters != NULL_TREE)
+   {
+ if (FLOAT_TYPE_P (vectype))
+   skip_niters = gimple_build (&stmts, FLOAT_EXPR, new_type,
+   skip_niters);
+ else
+   skip_niters = gimple_convert (&stmts, new_type, skip_niters);
+ tree skip_step = gimple_build (&stmts, MULT_EXPR, new_type,
+skip_niters, step_expr);
+ init_expr = gimple_build (&stmts, MINUS_EXPR, new_type,
+   init_expr, skip_step);
+   }
 }
 
+  /* Convert the step to the desired type.  */
+  step_expr = gimple_convert (&stmts, TREE_TYPE (vectype), step_expr);
+
   if (stmts)
 {
   new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
@@ -7718,15 +7723,6 @@ vectorizable_induction (gimple *phi,
   /* Enforced above.  */
   unsigned int const_nunits = nunits.to_constant ();
 
-  /* Convert the init to the desired type.  */
-  stmts = NULL;
-  init_expr = gimple_convert (&stmts, TREE_TYPE (vectype), init_expr);
-  if (stmts)
-   {
- new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
- gcc_assert (!new_bb);
-   }
-
   /* Generate [VF*S, VF*S, ... ].  */
   if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
{
Index: gcc/testsuite/gcc.dg/vect/pr83914.c
===
--- /dev/null   2018-01-19 09:30:49.543814408 +
+++ gcc/testsuite/gcc.dg/vect/pr83914.c 2018-01-19 09:36:33.405191141 +
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+struct s { struct s *ptrs[16]; } *a, *b;
+int c;
+void
+foo (int n)
+{
+  for (; n; a = b, n--)
+{
+  b = a + 1;
+  for (c = 8; c; c--)
+   a->ptrs[c] = b;
+}
+}
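
The start-value adjustment in the hunk above (init_expr minus
skip_niters times step_expr) can be checked with a small scalar model.
This is only a sketch: first_vector_lanes and its parameters are
hypothetical names, and it assumes the first skip_niters lanes of the
first vector iteration are the ones masked off, so the first active lane
should see the original initial value.

```cpp
#include <cassert>
#include <vector>

// Hypothetical scalar model, not GCC code: returns the induction values
// held by each lane of the first vector iteration after the adjustment.
std::vector<long> first_vector_lanes(long init, long step,
                                     long skip_niters, int nunits) {
  // Same arithmetic as the MULT_EXPR / MINUS_EXPR pair in the patch.
  const long skip_step = skip_niters * step;
  const long adjusted_init = init - skip_step;

  std::vector<long> lanes;
  for (int lane = 0; lane < nunits; ++lane)
    lanes.push_back(adjusted_init + lane * step);
  return lanes;
}

// Usage example: with two skipped lanes, lane 2 holds the original init.
void example_usage() {
  const std::vector<long> lanes = first_vector_lanes(10, 3, 2, 4);
  assert(lanes[2] == 10);
}
```

With init 10, step 3 and two skipped lanes, the adjusted start is 4, so
the lanes hold 4, 7, 10 and 13 and the first unmasked lane starts at 10.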


Re: [PATCH] Fix pr83619.C (was Re: Fix ICE with profile info mismatch)

2018-01-19 Thread Jan Hubicka
> On Thu, Jan 18, 2018 at 04:59:01PM +0100, Jan Hubicka wrote:
> > this patch fixes an ICE where the profile in cgraph mismatches the profile
> > in the BB.  This is because of expansion of speculative devirtualization,
> > where we get some roundoff issues.
> > 
> > Bootstrapped/regtested x86_64-linux, committed.
> > Honza
> > 
> > PR ipa/83619
> > * g++.dg/torture/pr83619.C: New testcase.
> > * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge
> > frequencies.
> > --- testsuite/g++.dg/torture/pr83619.C  (revision 0)
> > +++ testsuite/g++.dg/torture/pr83619.C  (working copy)
> ...
> 
> This testcase FAILs everywhere:
> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C: In static member function 
> 'static void i::j<  >::c(e*)':
> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of 
> incomplete type 'class e'
> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:8:7: note: forward 
> declaration of 'class e'
> FAIL: g++.dg/torture/pr83619.C   -O0  (test for excess errors)
> Excess errors:
> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of 
> incomplete type 'class e'
> 
> The following patch tweaks it so that it doesn't emit the warning, yet still
> ICEs before your cgraph.c change and PASSes after it.
> Tested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2018-01-18  Jakub Jelinek  
> 
>   PR ipa/83619
>   * g++.dg/torture/pr83619.C (e): Define before first use instead of
>   forward declaration.

Oops, sorry. I had a corrected version somewhere. The change is OK.

Honza
> 
> --- gcc/testsuite/g++.dg/torture/pr83619.C.jj 2018-01-18 21:11:54.865206861 
> +0100
> +++ gcc/testsuite/g++.dg/torture/pr83619.C2018-01-18 23:51:32.693482293 
> +0100
> @@ -5,7 +5,9 @@ class d
>  public:
>virtual unsigned c ();
>  };
> -class e;
> +class e : public d
> +{
> +};
>  class i
>  {
>void h ();
> @@ -33,9 +35,6 @@ public:
>l (int);
>k *operator-> ();
>  };
> -class e : public d
> -{
> -};
>  class m final : e
>  {
>unsigned c ();
> 
>   Jakub


Re: [PATCH] -Warray-bounds: Fix false positive in some "switch" stmts (PR tree-optimization/83510)

2018-01-19 Thread Richard Biener
On Fri, Jan 19, 2018 at 12:36 AM, David Malcolm  wrote:
> PR tree-optimization/83510 reports that r255649 (for
> PR tree-optimization/83312) introduced a false positive for
> -Warray-bounds for array accesses within certain switch statements:
> those for which value-ranges allow more than one case to be reachable,
> but for which one or more of the VR-unreachable cases contain
> out-of-range array accesses.
>
> In the reproducer, after the switch in f is inlined into g, we have 3 cases
> for the switch (case 9, case 10-19, and default), within a loop that
> ranges from 0..9.
>
> With both the old and new code, vr_values::simplify_switch_using_ranges clears
> the EDGE_EXECUTABLE flag on the edge to the "case 10-19" block.  This
> happens during the dom walk within the substitute_and_fold_engine.
>
> With the old code, the clearing of that EDGE_EXECUTABLE flag led to the
>   /* Skip blocks that were found to be unreachable.  */
> code in the old implementation of vrp_prop::check_all_array_refs skipping
> the "case 10-19" block.
>
> With the new code, we have a second dom walk, and that dom_walker's ctor
> sets all edges to be EDGE_EXECUTABLE, losing that information.
>
> Then, dom_walker::before_dom_children (here, the subclass'
> check_array_bounds_dom_walker::before_dom_children) can return one edge, if
> there's a unique successor edge, and dom_walker::walk filters the dom walk
> to just that edge.
>
> Here we have two VR-valid edges (case 9 and default), and a VR-invalid
> successor edge (case 10-19).  There's no *unique* valid successor edge,
> and hence taken_edge is NULL, and the filtering in dom_walker::walk
> doesn't fire.
>
> Hence we've lost the filtering of the "case 10-19" BB, hence the false
> positive.
>
> The issue is that we have two dom walks: first within vr_values'
> substitute_and_fold_dom_walker (which has skip_unreachable_blocks == false),
> then another within vrp_prop::check_all_array_refs (with
> skip_unreachable_blocks == true).
>
> Each has different "knowledge" about ruling out edges due to value-ranges,
> but we aren't combining that information.  The former "knows" about
> out-edges at a particular control construct (e.g. at a switch), the latter
> "knows" about dominance, but only about unique successors (hence the
> problem when two out of three switch cases are valid).
>
> This patch combines the information by preserving the EDGE_EXECUTABLE
> flags from the first dom walk, and using it in the second dom walk,
> potentially rejecting additional edges.
>
> Doing so fixes the false positive.
>
> I attempted an alternative fix, merging the two dom walks into one, but
> that led to crashes in identify_jump_threads, so I went with this, as
> a less invasive fix.
>
> Successfully bootstrapped on x86_64-pc-linux-gnu.
> OK for trunk?

Ok, but I think you need to update the domwalk construction in
graphite-scop-detection.c as well - did you test w/o graphite?

grep might be your friend...

Thanks,
Richard.
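
The flag-preservation idea described in the quoted message can be
modeled with a toy CFG.  This is a minimal sketch, not GCC code: Edge,
Block, reachable_blocks and make_switch_cfg are hypothetical stand-ins,
and plain reachability replaces the real dominator walk.  Only the
preserve_flags parameter and the role of set_all_edges_as_executable
follow the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for GCC's edge and basic_block.
struct Edge {
  std::size_t dest;  // index of the successor block
  bool executable;   // plays the role of EDGE_EXECUTABLE
};
struct Block {
  std::vector<Edge> succs;
};

// Same job as set_all_edges_as_executable in the patch.
void set_all_edges_executable(std::vector<Block>& cfg) {
  for (Block& bb : cfg)
    for (Edge& e : bb.succs)
      e.executable = true;
}

// Second walk: visits blocks from block 0 along executable edges only.
// Without preserve_flags it first resets every flag, discarding what an
// earlier walk learned.
std::vector<bool> reachable_blocks(std::vector<Block>& cfg,
                                   bool preserve_flags) {
  if (cfg.empty())
    return {};
  if (!preserve_flags)
    set_all_edges_executable(cfg);

  std::vector<bool> seen(cfg.size(), false);
  std::vector<std::size_t> worklist{0};
  seen[0] = true;
  while (!worklist.empty()) {
    const std::size_t b = worklist.back();
    worklist.pop_back();
    for (const Edge& e : cfg[b].succs)
      if (e.executable && !seen[e.dest]) {
        seen[e.dest] = true;
        worklist.push_back(e.dest);
      }
  }
  return seen;
}

// Block 0 is the switch; 1 = case 9, 2 = case 10-19, 3 = default.  The
// edge to block 2 was already cleared by the first walk.
std::vector<Block> make_switch_cfg() {
  std::vector<Block> cfg(4);
  cfg[0].succs = {{1, true}, {2, false}, {3, true}};
  return cfg;
}

// Usage example: the "case 10-19" block is skipped only when the flags
// from the first walk are preserved.
void example_usage() {
  std::vector<Block> kept = make_switch_cfg();
  assert(!reachable_blocks(kept, true)[2]);
  std::vector<Block> reset = make_switch_cfg();
  assert(reachable_blocks(reset, false)[2]);
}
```

In this model the array-bounds check would visit block 2 only in the
second case, which corresponds to the false positive in the report.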

> gcc/ChangeLog:
> PR tree-optimization/83510
> * domwalk.c (set_all_edges_as_executable): New function.
> (dom_walker::dom_walker): Add new param "preserve_flags".  Move
> setup of edge flags to set_all_edges_as_executable and guard it
> with !preserve_flags.
> * domwalk.h (dom_walker::dom_walker): Add new param
> "preserve_flags", defaulting to false.
> (set_all_edges_as_executable): New decl.
> * tree-vrp.c
> (check_array_bounds_dom_walker::check_array_bounds_dom_walker):
> Pass "true" for new param of dom_walker's ctor.
> (vrp_prop::vrp_finalize): Call set_all_edges_as_executable
> if check_all_array_refs will be called.
>
> gcc/testsuite/ChangeLog:
> PR tree-optimization/83510
> * gcc.c-torture/compile/pr83510.c: New test case.
> ---
>  gcc/domwalk.c |  30 +++--
>  gcc/domwalk.h |  11 ++
>  gcc/testsuite/gcc.c-torture/compile/pr83510.c | 172 
> ++
>  gcc/tree-vrp.c|  21 +++-
>  4 files changed, 224 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr83510.c
>
> diff --git a/gcc/domwalk.c b/gcc/domwalk.c
> index 102a293..988ff71 100644
> --- a/gcc/domwalk.c
> +++ b/gcc/domwalk.c
> @@ -169,12 +169,29 @@ sort_bbs_postorder (basic_block *bbs, int n)
>  qsort (bbs, n, sizeof *bbs, cmp_bb_postorder);
>  }
>
> +/* Set EDGE_EXECUTABLE on every edge within FN's CFG.  */
> +
> +void
> +set_all_edges_as_executable (function *fn)
> +{
> +  basic_block bb;
> +  FOR_ALL_BB_FN (bb, fn)
> +{
> +  edge_iterator ei;
> +  edge e;
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> +   e->flags |= EDGE_EXECUTABLE;
> +}
> +}
> +
>  /* Constructor for a dom walker.
>
> If SKIP_UNREACHBLE_BLOCKS is true, then we need to set
> -   EDGE_EXECUTABLE on every edge in the CFG. */
> +  

[libstdc++][testsuite] Fix dg-options/dg-add-options order

2018-01-19 Thread Christophe Lyon
Hi,

While checking dg directives order, I've noticed a few libstdc++ tests
where the current order is:
// { dg-add-options ieee }
// { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__" }
while dg-add-options should be after dg-options.

The attached patch fixes that.  Is it OK for stage4?

I've tested on arm/aarch64, where it makes no difference. My reading
of proc add_options_for_ieee in gcc/testsuite/lib/target-supports.exp
tells me that it would only have an impact on alpha, sh and rx targets.

Christophe
libstdc++-v3/ChangeLog:

2018-01-19  Christophe Lyon  

* testsuite/ext/special_functions/airy_ai/check_nan.cc: Fix
  dg-options and dg-add-options order.
* testsuite/ext/special_functions/airy_bi/check_nan.cc: Likewise.
* testsuite/ext/special_functions/conf_hyperg/check_nan.cc:
Likewise.
* testsuite/ext/special_functions/hyperg/check_nan.cc: Likewise.
* testsuite/special_functions/01_assoc_laguerre/check_nan.cc:
Likewise.
* testsuite/special_functions/02_assoc_legendre/check_nan.cc:
Likewise.
* testsuite/special_functions/03_beta/check_nan.cc: Likewise.
* testsuite/special_functions/04_comp_ellint_1/check_nan.cc:
Likewise.
* testsuite/special_functions/05_comp_ellint_2/check_nan.cc:
Likewise.
* testsuite/special_functions/06_comp_ellint_3/check_nan.cc:
Likewise.
* testsuite/special_functions/06_comp_ellint_3/pr66689.cc:
Likewise.
* testsuite/special_functions/07_cyl_bessel_i/check_nan.cc:
Likewise.
* testsuite/special_functions/08_cyl_bessel_j/check_nan.cc:
Likewise.
* testsuite/special_functions/09_cyl_bessel_k/check_nan.cc:
Likewise.
* testsuite/special_functions/10_cyl_neumann/check_nan.cc:
Likewise.
* testsuite/special_functions/11_ellint_1/check_nan.cc: Likewise.
* testsuite/special_functions/12_ellint_2/check_nan.cc: Likewise.
* testsuite/special_functions/13_ellint_3/check_nan.cc: Likewise.
* testsuite/special_functions/13_ellint_3/pr66689.cc: Likewise.
* testsuite/special_functions/14_expint/check_nan.cc: Likewise.
* testsuite/special_functions/15_hermite/check_nan.cc: Likewise.
* testsuite/special_functions/16_laguerre/check_nan.cc: Likewise.
* testsuite/special_functions/17_legendre/check_nan.cc: Likewise.
* testsuite/special_functions/18_riemann_zeta/check_nan.cc:
Likewise.
* testsuite/special_functions/19_sph_bessel/check_nan.cc:
Likewise.
* testsuite/special_functions/20_sph_legendre/check_nan.cc:
Likewise.
* testsuite/special_functions/21_sph_neumann/check_nan.cc:
Likewise.

diff --git a/libstdc++-v3/testsuite/ext/special_functions/airy_ai/check_nan.cc 
b/libstdc++-v3/testsuite/ext/special_functions/airy_ai/check_nan.cc
index eef2f74..c5473a3 100644
--- a/libstdc++-v3/testsuite/ext/special_functions/airy_ai/check_nan.cc
+++ b/libstdc++-v3/testsuite/ext/special_functions/airy_ai/check_nan.cc
@@ -1,7 +1,7 @@
 // { dg-do run { target c++11 } }
 // { dg-require-c-std "" }
-// { dg-add-options ieee }
 // { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__" }
+// { dg-add-options ieee }
 
 // Copyright (C) 2016-2018 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/ext/special_functions/airy_bi/check_nan.cc 
b/libstdc++-v3/testsuite/ext/special_functions/airy_bi/check_nan.cc
index dafa271..4b85288 100644
--- a/libstdc++-v3/testsuite/ext/special_functions/airy_bi/check_nan.cc
+++ b/libstdc++-v3/testsuite/ext/special_functions/airy_bi/check_nan.cc
@@ -1,7 +1,7 @@
 // { dg-do run { target c++11 } }
 // { dg-require-c-std "" }
-// { dg-add-options ieee }
 // { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__" }
+// { dg-add-options ieee }
 
 // Copyright (C) 2016-2018 Free Software Foundation, Inc.
 //
diff --git 
a/libstdc++-v3/testsuite/ext/special_functions/conf_hyperg/check_nan.cc 
b/libstdc++-v3/testsuite/ext/special_functions/conf_hyperg/check_nan.cc
index 7af46ff..0da38b1 100644
--- a/libstdc++-v3/testsuite/ext/special_functions/conf_hyperg/check_nan.cc
+++ b/libstdc++-v3/testsuite/ext/special_functions/conf_hyperg/check_nan.cc
@@ -1,7 +1,7 @@
 // { dg-do run { target c++11 } }
 // { dg-require-c-std "" }
-// { dg-add-options ieee }
 // { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__" }
+// { dg-add-options ieee }
 
 // Copyright (C) 2016-2018 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_nan.cc 
b/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_nan.cc
index 78ff9e8..ee3b679 100644
--- a/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_nan.cc
+++ b/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_nan.cc
@@ -1,7 +1,7 @@
 // { dg-do run { target c++11 } }
 // { dg-require-c-std "" }
-// { dg-add-options ieee }
 // { dg-options 

Re: [PATCH,PTX] Add support for CUDA 9

2018-01-19 Thread Tom de Vries

On 01/19/2018 01:59 AM, Cesar Philippidis wrote:

Here's the updated patch with the changes that you requested. There are
no new regressions in trunk. I tested it on my desktop running driver
387.34 on a Pascal GPU.

Is this OK for trunk?


OK with 'PR target/83790' added to the changelog entry.

Thanks,
- Tom



trunk-cuda9.diff


2018-01-18  Cesar Philippidis  

gcc/
* config/nvptx/nvptx.c (output_init_frag): Don't use generic address
spaces for function labels.

gcc/testsuite/
* gcc.target/nvptx/indirect_call.c: New test.
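
The effect of the output_init_frag change below on the emitted text can
be sketched as a pure string function.  This is an illustration under
assumptions: init_frag_text is a hypothetical helper, the symbol is
printed verbatim in place of output_address, and the bool parameters
stand for the FUNCTION_DECL test and for a non-zero val.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the text emitted for a symbol after the patch:
// data symbols keep the generic() wrapper, function symbols do not.
std::string init_frag_text(const std::string& symbol, bool is_function,
                           bool has_offset) {
  std::string out;
  if (!is_function)
    out += "generic(";
  out += symbol;  // stands in for output_address (VOIDmode, sym)
  if (!is_function)
    out += ")";
  if (has_offset)
    out += " + ";  // the offset value itself is printed afterwards
  return out;
}

// Usage example: f1 is a function label, buf a data symbol.
void example_usage() {
  assert(init_frag_text("f1", true, false) == "f1");
  assert(init_frag_text("buf", false, true) == "generic(buf) + ");
}
```

Before the patch both kinds of symbol took the generic() path, which is
what the new indirect_call.c test exercises through the f2 initializer.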

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 86fc13f4fc0..4cb87c8ad07 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -1899,9 +1899,15 @@ output_init_frag (rtx sym)

if (sym)

  {
-  fprintf (asm_out_file, "generic(");
+  bool function = (SYMBOL_REF_DECL (sym)
+  && (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL));
+  if (!function)
+   fprintf (asm_out_file, "generic(");
output_address (VOIDmode, sym);
-  fprintf (asm_out_file, val ? ") + " : ")");
+  if (!function)
+   fprintf (asm_out_file, ")");
+  if (val)
+   fprintf (asm_out_file, " + ");
  }
  
if (!sym || val)

diff --git a/gcc/testsuite/gcc.target/nvptx/indirect_call.c 
b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
new file mode 100644
index 000..39992a7137b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
@@ -0,0 +1,19 @@
+/* { dg-options "-O2 -msoft-stack" } */
+/* { dg-do run } */
+
+int
+f1 (int a)
+{
+  return a + 1;
+}
+
+int (*f2)(int) = f1;
+
+int
+main ()
+{
+  if (f2 (100) != 101)
+__builtin_abort();
+
+  return 0;
+}





Re: [PATCH, i386]: Backport a small retpoline improvement to gcc-7 branch

2018-01-19 Thread Richard Biener
On Thu, 18 Jan 2018, Uros Bizjak wrote:

> Hello!
> 
> I'd like to backport a small improvement to retpoline functionality to
> gcc-7 branch.
> 
> 2018-01-17  Uros Bizjak  
> 
> * config/i386/i386.c (indirect_thunk_name): Declare regno
> as unsigned int.  Compare regno with INVALID_REGNUM.
> (output_indirect_thunk): Ditto.
> (output_indirect_thunk_function): Ditto.
> (ix86_code_end): Declare regno as unsigned int.  Use INVALID_REGNUM
> in the call to output_indirect_thunk_function.
> 
> OK for branch?

Please wait until after the release of 7.3 if this doesn't fix an actual 
bug.

Thanks,
Richard.

> Uros.
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[C++ PATCH] Speed up inplace_merge algorithm & fix inefficient logic(PR c++/83938)

2018-01-19 Thread chang jc
The current std::inplace_merge() suffers from a performance issue caused
by inefficient logic when memory is limited, which leads to a
performance downgrade.

Please help to review it.
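
The stl_algo.h hunk below sizes the temporary buffer from the shorter of
the two runs rather than from the whole range.  A minimal sketch of why
that is enough, assuming int elements in a std::vector and a
hypothetical helper merge_with_small_buffer (this is not the libstdc++
implementation): only the shorter run is copied out, and the merge runs
forward or backward so no unread element is overwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merges the sorted runs [0, mid) and [mid, v.size()) in place, using a
// buffer no larger than the shorter run.  Requires mid <= v.size().
void merge_with_small_buffer(std::vector<int>& v, std::size_t mid) {
  const std::ptrdiff_t m = static_cast<std::ptrdiff_t>(mid);
  if (mid <= v.size() - mid) {
    // Left run is shorter: buffer it and merge forward.
    const std::vector<int> buf(v.begin(), v.begin() + m);
    std::size_t i = 0, j = mid, out = 0;
    while (i < buf.size() && j < v.size())
      v[out++] = (v[j] < buf[i]) ? v[j++] : buf[i++];
    while (i < buf.size())
      v[out++] = buf[i++];
    // Any remaining right-run elements are already in place.
  } else {
    // Right run is shorter: buffer it and merge backward.
    const std::vector<int> buf(v.begin() + m, v.end());
    std::size_t i = mid, j = buf.size(), out = v.size();
    while (i > 0 && j > 0)
      v[--out] = (buf[j - 1] < v[i - 1]) ? v[--i] : buf[--j];
    while (j > 0)
      v[--out] = buf[--j];
    // Any remaining left-run elements are already in place.
  }
}

// Usage example covering both branches.
void example_usage() {
  std::vector<int> a{1, 4, 9, 2, 3, 5, 7, 8};
  merge_with_small_buffer(a, 3);
  assert((a == std::vector<int>{1, 2, 3, 4, 5, 7, 8, 9}));
  std::vector<int> b{0, 2, 4, 6, 8, 1, 3};
  merge_with_small_buffer(b, 5);
  assert((b == std::vector<int>{0, 1, 2, 3, 4, 6, 8}));
}
```

The buffer here holds min(len1, len2) elements, so a request of that
size is more likely to succeed under memory pressure than one covering
the full range.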

Index: include/bits/stl_algo.h
===
--- include/bits/stl_algo.h (revision 256871)
+++ include/bits/stl_algo.h (working copy)
@@ -2437,7 +2437,7 @@
  _BidirectionalIterator __second_cut = __middle;
  _Distance __len11 = 0;
  _Distance __len22 = 0;
- if (__len1 > __len2)
+ if (__len1 < __len2)
{
  __len11 = __len1 / 2;
  std::advance(__first_cut, __len11);
@@ -2539,9 +2539,15 @@
   const _DistanceType __len1 = std::distance(__first, __middle);
   const _DistanceType __len2 = std::distance(__middle, __last);

+
   typedef _Temporary_buffer<_BidirectionalIterator, _ValueType> _TmpBuf;
-  _TmpBuf __buf(__first, __last);
-
+  _BidirectionalIterator __start, __end;
+  if (__len1 < __len2) {
+   __start = __first; __end = __middle;
+  } else {
+   __start = __middle; __end = __last;
+  }
+  _TmpBuf __buf(__start, __end);
   if (__buf.begin() == 0)
std::__merge_without_buffer
  (__first, __middle, __last, __len1, __len2, __comp);
Index: include/bits/stl_tempbuf.h
===
--- include/bits/stl_tempbuf.h  (revision 256871)
+++ include/bits/stl_tempbuf.h  (working copy)
@@ -95,7 +95,7 @@
std::nothrow));
  if (__tmp != 0)
return std::pair<_Tp*, ptrdiff_t>(__tmp, __len);
- __len /= 2;
+ __len = (__len + 1) / 2;
}
   return std::pair<_Tp*, ptrdiff_t>(static_cast<_Tp*>(0), 0);
 }
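
A note on the arithmetic in the stl_tempbuf.h hunk: the two shrink
rules behave differently near 1.  The sketch below uses a hypothetical
helper, shrink_sequence, and assumes the retry loop around the shown
lines keeps going while __len is greater than zero, which the hunk
context does not show.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: lists the request sizes tried when every
// allocation fails, stopping at zero or after max_steps attempts.
std::vector<long> shrink_sequence(long len, bool round_up, int max_steps) {
  std::vector<long> seq;
  for (int step = 0; step < max_steps && len > 0; ++step) {
    seq.push_back(len);
    len = round_up ? (len + 1) / 2 : len / 2;
  }
  return seq;
}

// Usage example starting from a request of 5 elements.
void example_usage() {
  // len /= 2 reaches zero after three attempts.
  assert((shrink_sequence(5, false, 8) == std::vector<long>{5, 2, 1}));
  // (len + 1) / 2 settles at 1 and uses all eight allowed attempts.
  assert((shrink_sequence(5, true, 8) ==
          std::vector<long>{5, 3, 2, 1, 1, 1, 1, 1}));
}
```

Under that assumption the rounded-up rule never produces zero from a
positive length, so a loop that relies on reaching zero would need
another exit condition when allocation keeps failing.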




Thanks