[PATCH, lower-subreg] Fix 70355

2016-03-28 Thread Richard Henderson
The ICE comes fixing up a debug_insn, and the debug info incoming to the pass 
seems reasonable.  Just recognizing that this situation is possible and not 
asserting appears to work.


Ok?


r~
* lower-subreg.c (simplify_subreg_concatn): Reject paradoxical subregs.


diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index 5432d05..f7b3ac4 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -614,7 +614,8 @@ simplify_subreg_concatn (machine_mode outermode, rtx op,
 
   innermode = GET_MODE (op);
   gcc_assert (byte < GET_MODE_SIZE (innermode));
-  gcc_assert (GET_MODE_SIZE (outermode) <= GET_MODE_SIZE (innermode));
+  if (GET_MODE_SIZE (outermode) > GET_MODE_SIZE (innermode))
+return NULL_RTX;
 
   inner_size = GET_MODE_SIZE (innermode) / XVECLEN (op, 0);
   part = XVECEXP (op, 0, byte / inner_size);
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr70355.c 
b/gcc/testsuite/gcc.c-torture/compile/pr70355.c
new file mode 100644
index 000..4749427
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr70355.c
@@ -0,0 +1,14 @@
+/* { dg-require-effective-target int128 } */
+/* { dg-additional-options "-g" } */
+
+typedef unsigned __int128 v2ti __attribute__ ((vector_size (32)));
+
+unsigned
+foo (unsigned i, v2ti v)
+{
+  do {
+i--;
+v %= ~v;
+  } while (i);
+  return v[0] + v[1];
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr70355.c 
b/gcc/testsuite/gcc.target/i386/pr70355.c
new file mode 100644
index 000..b55f6fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70355.c
@@ -0,0 +1,14 @@
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O2 -fno-tree-ter -funroll-loops -mavx512f -g" } */
+
+typedef unsigned __int128 v2ti __attribute__ ((vector_size (32)));
+
+unsigned
+foo (unsigned i, v2ti v)
+{
+  do {
+i--;
+v %= ~v;
+  } while (i);
+  return v[0] + v[1];
+}


Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-28 Thread Mike Stump
On Mar 28, 2016, at 5:38 PM, Bill Schmidt  wrote:
> For a long time we've had hundreds of failing guality tests.  These
> failures don't seem to have any correlation with gdb functionality for
> POWER, which is working fine.

> Verified to remove hundreds of failure messages on
> powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?

So, I think this should be up to the target maintainer and/or the quality 
people, but I think it is reasonable to do this.  If all or most target 
maintainers do this, then we flip to not run, unless the target triplet asks 
for them.

On my (random) target, I don’t fail any quality tests:

newlib/newlib/libc/posix/popen.c:138: undefined reference to `vfork’
newlib/newlib/libc/posix/popen.c:218: undefined reference to `waitpid’

so they all turn off.  I have a fork (it forks my simulator) and a wait, but no 
vfork or waitpid.  This might explain why some people don’t trip on the issue.

Re: [PATCH ARM v2] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg

2016-03-28 Thread Kugan


On 25/03/16 08:02, Charles Baylis wrote:
> When compiling with -mlong-calls and -pg, calls to the __gnu_mcount_nc
> function are not generated as long calls.
> 
> This is the sequel to this patch
> https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00881.html
> 
> This patch fixes the following problems with the previous patch.
> . Nested functions now work (thanks to Richard E for spotting this)
> . Thumb-1 now works
> 
> This patch works by adding new patterns (one for ARM/Thumb-2 and one
> for Thumb-1) which are placed in the prologue as a placeholder for
> some RTL which describes the address. This is either a SYMBOL_REF for
> targets with MOVW/MOVT, or a literal pool reference for other targets.
> The implementation of ARM_FUNCTION_PROFILER is changed to search for
> this insn so that the the address of the __gnu_mcount_nc function can
> be loaded using an appropriate sequence for the target.
> 
> I also tried generating the profiling call sequence directly in the
> prologue, but this requires some unpleasant hacks to prevent spurious
> register pushes from ASM_OUTPUT_REG_PUSH.
> 
> Tested with no new regressions on arm-unknown-linux-gnueabihf on QEMU.
> The generated code sequences have been inspected for normal and nested
> functions on ARM v6, ARM v7, Thumb-1, and Thumb-2 targets.
> 
> This does not fix a regression, so I don't expect to apply it for
> GCC6, is it OK for when stage 1 re-opens.
> 

Hi Charles,

+static void
+arm_emit_long_call_profile_insn ()
+{
+  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
+  /* if movt/movw are not available, use a constant pool */
+  if (!arm_arch_thumb2)

Should this be !TARGET_USE_MOVT?

Thanks,
Kugan


[patch commited FT32] Add -nodiv option

2016-03-28 Thread James Bowman
The attached patches add an "-mnodiv" option to FT32, preventing use of the 
divide and modulo instructions. This required additional software div/mod 
implemtations in FT32's libgcc.

2016-03-28  James Bowman  

* config/ft32/ft32.opt (mnodiv): New.
* config/ft32/ft32.md (*divsi3, *modsi3): Qualify with
TARGET_NODIV.
* doc/invoke.texi (FT32 Options -mnodiv): New.

2016-03-28  James Bowman  

* libgcc/config/ft32/lib1funcs.S (*divsi3, *modsi3): New.

Index: gcc/config/ft32/ft32.md
===
--- gcc/config/ft32/ft32.md (revision 234405)
+++ gcc/config/ft32/ft32.md (working copy)
@@ -101,7 +101,7 @@
   (div:SI
(match_operand:SI 1 "register_operand" "r,r")
(match_operand:SI 2 "ft32_rimm_operand" "r,KA")))]
-  ""
+  "!TARGET_NODIV"
   "div.l  %0,%1,%2")
 
 (define_insn "modsi3"
@@ -109,7 +109,7 @@
   (mod:SI
(match_operand:SI 1 "register_operand" "r,r")
(match_operand:SI 2 "ft32_rimm_operand" "r,KA")))]
-  ""
+  "!TARGET_NODIV"
   "mod.l  %0,%1,%2")
 
 (define_insn "udivsi3"
@@ -117,7 +117,7 @@
   (udiv:SI
(match_operand:SI 1 "register_operand" "r,r")
(match_operand:SI 2 "ft32_rimm_operand" "r,KA")))]
-  ""
+  "!TARGET_NODIV"
   "udiv.l %0,%1,%2")
 
 (define_insn "umodsi3"
@@ -125,7 +125,7 @@
   (umod:SI
(match_operand:SI 1 "register_operand" "r,r")
(match_operand:SI 2 "register_operand" "r,KA")))]
-  ""
+  "!TARGET_NODIV"
   "umod.l %0,%1,%2")
 
 (define_insn "extvsi"
Index: gcc/config/ft32/ft32.opt
===
--- gcc/config/ft32/ft32.opt(revision 234405)
+++ gcc/config/ft32/ft32.opt(working copy)
@@ -25,3 +25,7 @@ target the software simulator.
 mlra
 Target Report Var(ft32_lra_flag) Init(0) Save
 Use LRA instead of reload.
+
+mnodiv
+Target Report Mask(NODIV)
+Avoid use of the DIV and MOD instructions

Index: libgcc/config/ft32/lib1funcs.S
===
--- libgcc/config/ft32/lib1funcs.S  (revision 234405)
+++ libgcc/config/ft32/lib1funcs.S  (working copy)
@@ -25,8 +25,8 @@ see the files COPYING3 and COPYING.RUNTI
 # for implementation details of all except division which is detailed below
 #
 
+#ifdef L_fp_tools
 // .global __cmpsf2_
-
 nan:.long 0x7FFF# also abs mask
 inf:.long 0x7F80
 sign_mask:  .long 0x8000
@@ -37,6 +37,14 @@ smallest_norm:  .long 0x0080# im
 high_FF:.long 0xFF00
 high_uint:  .long 0x
 
+ntz_table:
+.byte   32,0,1,12,2,6,0,13,3,0,7,0,0,0,0,14
+.byte   10,4,0,0,8,0,0,25,0,0,0,0,0,21,27,15
+.byte   31,11,5,0,0,0,0,0,9,0,0,24,0,0,20,26
+.byte   30,0,0,0,0,23,0,19,29,0,22,18,28,17,16,0
+
+#endif
+
 # Supply a few 'missing' instructions
 
 # not
@@ -87,12 +95,6 @@ high_uint:  .long 0x
 lpmi.b  \x, \x, 0
 .endm
 
-ntz_table:
-.byte   32,0,1,12,2,6,0,13,3,0,7,0,0,0,0,14
-.byte   10,4,0,0,8,0,0,25,0,0,0,0,0,21,27,15
-.byte   31,11,5,0,0,0,0,0,9,0,0,24,0,0,20,26
-.byte   30,0,0,0,0,23,0,19,29,0,22,18,28,17,16,0
-
 # calculate leading zero count
 .macro  nlz x, scr
 flip\x, \x, 31
@@ -503,6 +505,9 @@ mul_z0:
 ## for implementation details
 
 
+
+
+#ifdef  L_divsf3
 dc_1: .long 0xe7d7
 dc_2: .long 0xffe8
 dc_3: .long 0xffbad86f
@@ -517,9 +522,6 @@ dc_11: .long0x0452b1bf
 dc_12: .long0xFFC0
 spec_val_test:  .long   0x7F7F
 
-
-
-#ifdef  L_divsf3
 .global __divsf3
 __divsf3:
 push$r13
@@ -869,6 +871,7 @@ float_not_zero2:
 return
 #endif
 
+#if 0
 ##
 ##
 ## float compare
@@ -913,7 +916,74 @@ cmp_is_gt:
 cmp_is_eq:
 ldk $r0, 0
 return
+#endif
 
+#ifdef  L_udivsi3
+.global __udivsi3
+__udivsi3:
+   # $r0 is dividend
+   # $r1 is divisor
+   ldk $r2,0
+   push$r28
+   ldk $r28,-32
+0:
+   lshr$r3,$r0,31  # Shift $r2:$r0 left one
+   ashl$r0,$r0,1
+   ashl$r2,$r2,1
+   or  $r2,$r2,$r3
+   cmp $r2,$r1
+   jmpcb,1f
+2:
+   sub $r2,$r2,$r1
+   add $r0,$r0,1
+1:
+   add $r28,$r28,1
+   jmpx31,$r28,1,0b
+   pop $r28
+   # $r0: quotient
+   # $r2: remainder
+   return
+#endif
 
+#ifdef L_umodsi3
+.global__umodsi3
+__umodsi3:
+   call__udivsi3
+   move$r0,$r2
+   return
+#endif
 
+#ifdef L_divsi3
+.global__divsi3
+__divsi3:
+   xor $r5,$r0,$r1 # $r5 is sign of result
+   ashr$r2,$r0,31  # $r0 = abs($r0)
+   xor $r0,$r0,$r2
+ 

[PATCH] Disable guality tests for powerpc*-linux*

2016-03-28 Thread Bill Schmidt
Hi,

For a long time we've had hundreds of failing guality tests.  These
failures don't seem to have any correlation with gdb functionality for
POWER, which is working fine.  At this point the value of these tests to
us seems questionable.  Fixing these is such low priority that it is
unlikely we will ever get around to it.  In the meanwhile, the failures
simply clutter up our regression test reports.  Thus I'd like to disable
them, and that's what this test does.

Verified to remove hundreds of failure messages on
powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?

Thanks,
Bill


2016-03-28  Bill Schmidt  

* g++.dg/guality/guality.exp: Disable for powerpc*-linux*.
* gcc.dg/guality/guality.exp: Likewise.


Index: gcc/testsuite/g++.dg/guality/guality.exp
===
--- gcc/testsuite/g++.dg/guality/guality.exp(revision 234476)
+++ gcc/testsuite/g++.dg/guality/guality.exp(working copy)
@@ -13,6 +13,11 @@ if { [istarget "powerpc-ibm-aix*"] } {
 return
 }
 
+if { [istarget "powerpc*-linux*"] } {
+set torture_execute_xfail "powerpc*-linux*"
+return
+}
+
 proc check_guality {args} {
 # Don't count check_guality as PASS, or FAIL etc., that would make
 # the total PASS count dependent on how many parallel runtest invocations
Index: gcc/testsuite/gcc.dg/guality/guality.exp
===
--- gcc/testsuite/gcc.dg/guality/guality.exp(revision 234476)
+++ gcc/testsuite/gcc.dg/guality/guality.exp(working copy)
@@ -13,6 +13,11 @@ if { [istarget "powerpc-ibm-aix*"] } {
 return
 }
 
+if { [istarget "powerpc*-linux*"] } {
+set torture_execute_xfail "powerpc*-linux*"
+return
+}
+
 proc check_guality {args} {
 # Don't count check_guality as PASS, or FAIL etc., that would make
 # the total PASS count dependent on how many parallel runtest invocations




[Bug target/70355] [5/6 Regression] ICE: in simplify_subreg_concatn, at lower-subreg.c:617 with -funroll-loops -mavx512f

2016-03-28 Thread rth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70355

--- Comment #2 from Richard Henderson  ---
Created attachment 38113
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38113=edit
proposed patch

Testing the following, which works on the reduced test case.

As a missed-optimization, we really ought to be handling logic operations on
these wide types via normal sse logic insns.  Perhaps not for V1TI, but
definitely for V2TI and V4TI.  There's no point in breaking them down into
4 and 8 DImode operations, respectively.

[Bug rtl-optimization/70432] New: Move operation across function call to reduce save / restore

2016-03-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70432

Bug ID: 70432
   Summary: Move operation across function call to reduce save /
restore
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---

I have no idea how one would do such a transformation, and there is likely a
dup of this somewhere... The compiler generates much smaller code for f2 than
f1.

void g(void);
int f1(int a,int b,int c){
  g();
  return a+b+c;
}
int f2(int a,int b,int c){
  int r=a+b+c;
  g();
  return r;
}

f1:
.LFB0:
.cfi_startproc
pushq   %r12
.cfi_def_cfa_offset 16
.cfi_offset 12, -16
pushq   %rbp
.cfi_def_cfa_offset 24
.cfi_offset 6, -24
movl%esi, %r12d
pushq   %rbx
.cfi_def_cfa_offset 32
.cfi_offset 3, -32
movl%edi, %ebx
movl%edx, %ebp
callg
leal(%rbx,%r12), %eax
popq%rbx
.cfi_def_cfa_offset 24
addl%ebp, %eax
popq%rbp
.cfi_def_cfa_offset 16
popq%r12
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE0:
.size   f1, .-f1
.p2align 4,,15
.globl  f2
.type   f2, @function
f2:
.LFB1:
.cfi_startproc
addl%esi, %edi
pushq   %rbx
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
leal(%rdi,%rdx), %ebx
callg
movl%ebx, %eax
popq%rbx
.cfi_def_cfa_offset 8
ret
.cfi_endproc

Patches to fix optimizer bug & C++ exceptions for GCC VAX backend

2016-03-28 Thread Jake Hamby
Amazingly enough, my patch worked well enough that my NetBSD VAX kernel built 
with GCC 5.3 is no longer crashing. I feel pretty good about what I have so far 
so here's the complete diff for both the C++ exception fix and the bad 
condition codes optimizer bug. It should be good enough to check into NetBSD 
current, at least, and I believe it does fix most, if not all, of the bad code 
generation bugs with optimization on VAX.

-Jake

Index: gcc/except.c
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/except.c,v
retrieving revision 1.3
diff -u -r1.3 except.c
--- gcc/except.c23 Mar 2016 15:51:36 -  1.3
+++ gcc/except.c28 Mar 2016 23:24:40 -
@@ -2288,7 +2288,8 @@
 #endif
 {
 #ifdef EH_RETURN_HANDLER_RTX
-  emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+  rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+  RTX_FRAME_RELATED_P (insn) = 1;  // XXX FIXME in VAX backend?
 #else
   error ("__builtin_eh_return not supported on this target");
 #endif
Index: gcc/config/vax/builtins.md
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/builtins.md,v
retrieving revision 1.5
diff -u -r1.5 builtins.md
--- gcc/config/vax/builtins.md  24 Jan 2016 09:43:34 -  1.5
+++ gcc/config/vax/builtins.md  28 Mar 2016 23:24:40 -
@@ -50,7 +50,8 @@
(ctz:SI (match_operand:SI 1 "general_operand" "nrmT")))
(set (cc0) (match_dup 0))]
   ""
-  "ffs $0,$32,%1,%0")
+  "ffs $0,$32,%1,%0"
+  [(set_attr "cc" "clobber")])
 
 (define_expand "sync_lock_test_and_set"
   [(set (match_operand:VAXint 0 "nonimmediate_operand" "=")
@@ -88,7 +89,8 @@
   (match_dup 1))
  (const_int 1))])]
   ""
-  "jbssi %1,%0,%l2")
+  "jbssi %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_insn "jbbssihi"
   [(parallel
@@ -105,7 +107,8 @@
   (match_dup 1))
  (const_int 1))])]
   ""
-  "jbssi %1,%0,%l2")
+  "jbssi %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_insn "jbbssisi"
   [(parallel
@@ -122,8 +125,8 @@
   (match_dup 1))
  (const_int 1))])]
   ""
-  "jbssi %1,%0,%l2")
-
+  "jbssi %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_expand "sync_lock_release"
   [(set (match_operand:VAXint 0 "memory_operand" "+m")
@@ -160,7 +163,8 @@
   (match_dup 1))
  (const_int 0))])]
   ""
-  "jbcci %1,%0,%l2")
+  "jbcci %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_insn "jbbccihi"
   [(parallel
@@ -177,7 +181,8 @@
   (match_dup 1))
  (const_int 0))])]
   ""
-  "jbcci %1,%0,%l2")
+  "jbcci %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_insn "jbbccisi"
   [(parallel
@@ -194,4 +199,5 @@
   (match_dup 1))
  (const_int 0))])]
   ""
-  "jbcci %1,%0,%l2")
+  "jbcci %1,%0,%l2"
+  [(set_attr "cc" "none")])
Index: gcc/config/vax/elf.h
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/elf.h,v
retrieving revision 1.6
diff -u -r1.6 elf.h
--- gcc/config/vax/elf.h23 Mar 2016 15:51:37 -  1.6
+++ gcc/config/vax/elf.h28 Mar 2016 23:24:40 -
@@ -26,7 +26,7 @@
 #define REGISTER_PREFIX "%"
 #define REGISTER_NAMES \
   { "%r0", "%r1",  "%r2",  "%r3", "%r4", "%r5", "%r6", "%r7", \
-"%r8", "%r9", "%r10", "%r11", "%ap", "%fp", "%sp", "%pc", }
+"%r8", "%r9", "%r10", "%r11", "%ap", "%fp", "%sp", "%pc", "%psw", }
 
 #undef SIZE_TYPE
 #define SIZE_TYPE "long unsigned int"
@@ -45,18 +45,8 @@
count pushed by the CALLS and before the start of the saved registers.  */
 #define INCOMING_FRAME_SP_OFFSET 0
 
-/* Offset from the frame pointer register value to the top of the stack.  */
-#define FRAME_POINTER_CFA_OFFSET(FNDECL) 0
-
-/* We use R2-R5 (call-clobbered) registers for exceptions.  */
-#define EH_RETURN_DATA_REGNO(N) ((N) < 4 ? (N) + 2 : INVALID_REGNUM)
-
-/* Place the top of the stack for the DWARF2 EH stackadj value.  */
-#define EH_RETURN_STACKADJ_RTX \
-  gen_rtx_MEM (SImode, \
-  plus_constant (Pmode,\
- gen_rtx_REG (Pmode, FRAME_POINTER_REGNUM),\
- -4))
+/* We use R2-R3 (call-clobbered) registers for exceptions.  */
+#define EH_RETURN_DATA_REGNO(N) ((N) < 2 ? (N) + 2 : INVALID_REGNUM)
 
 /* Simple store the return handler into the call frame.  */
 #define EH_RETURN_HANDLER_RTX  \
@@ -66,10 +56,6 @@
  16))
 
 
-/* Reserve the top of the stack for exception handler stackadj value.  */
-#undef STARTING_FRAME_OFFSET
-#define STARTING_FRAME_OFFSET -4
-
 /* The VAX wants no space 

Re: [ARM] Add support for overflow add, sub, and neg operations

2016-03-28 Thread Michael Collison
An updated patch that resolves the issues with thumb2 support and adds 
test cases as requested. Looking to check this into GCC 7 stage1 when it 
opens.


2016-02-24  Michael Collison  

* config/arm/arm-modes.def: Add new condition code mode CC_V
to represent the overflow bit.
* config/arm/arm.c (maybe_get_arm_condition_code):
Add support for CC_Vmode.
* config/arm/arm.md (addv4, add3_compareV,
addsi3_compareV_upper): New patterns to support signed
builtin overflow add operations.
(uaddv4, add3_compareC, addsi3_compareV_upper):
New patterns to support unsigned builtin add overflow operations.
(subv4, sub3_compare1): New patterns to support signed
builtin overflow subtract operations,
(usubv4): New patterns to support unsigned builtin subtract
overflow operations.
(negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New 
patterns

to support builtin overflow negate operations.
* gcc.target/arm/builtin_saddl.c: New testcase.
* gcc.target/arm/builtin_saddll.c: New testcase.
* gcc.target/arm/builtin_uaddl.c: New testcase.
* gcc.target/arm/builtin_uaddll.c: New testcase.
* gcc.target/arm/builtin_ssubl.c: New testcase.
* gcc.target/arm/builtin_ssubll.c: New testcase.
* gcc.target/arm/builtin_usubl.c: New testcase.
* gcc.target/arm/builtin_usubll.c: New testcase.

On 02/29/2016 04:13 AM, Kyrill Tkachov wrote:


On 26/02/16 10:32, Michael Collison wrote:



On 02/25/2016 02:51 AM, Kyrill Tkachov wrote:

Hi Michael,

On 24/02/16 23:02, Michael Collison wrote:
This patch adds support for builtin overflow of add, subtract and 
negate. This patch is targeted for gcc 7 stage 1. It was tested 
with no regressions in arm and thumb modes on the following targets:


arm-non-linux-gnueabi
arm-non-linux-gnuabihf
armeb-none-linux-gnuabihf
arm-non-eabi



I'll have a deeper look once we're closer to GCC 7 development.
I've got a few comments in the meantime.


2016-02-24 Michael Collison 

* config/arm/arm-modes.def: Add new condition code mode CC_V
to represent the overflow bit.
* config/arm/arm.c (maybe_get_arm_condition_code):
Add support for CC_Vmode.
* config/arm/arm.md (addv4, add3_compareV,
addsi3_compareV_upper): New patterns to support signed
builtin overflow add operations.
(uaddv4, add3_compareC, addsi3_compareV_upper):
New patterns to support unsigned builtin add overflow operations.
(subv4, sub3_compare1): New patterns to support signed
builtin overflow subtract operations,
(usubv4): New patterns to support unsigned builtin subtract
overflow operations.
(negvsi3, negvdi3, negdi2_compre, negsi2_carryin_compare): New 
patterns

to support builtin overflow negate operations.




Can you please summarise what sequences are generated for these 
operations, and how

they are better than the default fallback sequences.


Sure for a simple test case such as:

int
fn3 (int x, int y, int *ovf)
{
  int res;
  *ovf = __builtin_sadd_overflow (x, y, );
  return res;
}

Current trunk at -O2 generates

fn3:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
cmp r1, #0
mov r3, #0
add r1, r0, r1
blt .L4
cmp r1, r0
blt .L3
.L2:
str r3, [r2]
mov r0, r1
bx  lr
.L4:
cmp r1, r0
ble .L2
.L3:
mov r3, #1
b   .L2

With the overflow patch this now generates:

   addsr0, r0, r1
   movvs   r3, #1
   movvc   r3, #0
   str r3, [r2]
   bx  lr



Thanks! That looks much better.

Also, we'd need tests for each of these overflow operations, since 
these are pretty complex

patterns that are being added.


The patterns are tested now most notably by tests in:

c-c++-common/torture/builtin-arith-overflow*.c

I had a few failures I resolved so the builtin overflow arithmetic 
functions are definitely being exercised.


Great, that gives me more confidence on the correctness aspects but...



Also, you may want to consider splitting this into a patch series, 
each adding a single
overflow operation, together with its tests. That way it will be 
easier to keep track of
which pattern applies to which use case and they can go in 
independently of each other.


Let me know if you still fell the same way given the existing test 
cases.




... I'd like us to still have scan-assembler tests. The torture tests 
exercise the correctness,
but we'd want tests to catch regressions where we stop generating the 
new patterns due to other

optimisation changes, which would lead to code quality regressions.
So I'd like us to have scan-assembler tests for these sequences to 
make sure we generate the right

instructions.

Thanks,
Kyrill



+(define_expand "uaddv4"
+  [(match_operand:SIDI 0 

Re: Should a disabled warning be allowed to be promoted to an error(Bugzilla PR 70275)?

2016-03-28 Thread Martin Sebor

On 03/28/2016 01:56 PM, Florian Weimer wrote:

In Bugzilla PR # 70275, Manuel López-Ibáñez reports that even though
he provides the "-Werror=return-type" option, the compiler doesn't
flag the warning/error about a control reaching the end of a non-void
function, due to the presence of the "-w" option.  He points out that
clang++ wtill flags the promoted warning even though warnings are
inhibited.


I think -w is ordered with respect to the other warning obtions, and
-w inhibits previously requested warnings, and future -W flags may
enable other warnings.  With this in mind, I agree that the current
GCC behavior is consistent and probably not a bug.


The general rule of thumb documented in the manual is that more
specific options take precedence over more general ones, regardless
of where they appear on the command line:

  The combined effect of positive and negative forms [of warning
  options] is that more specific options have priority over less
  specific ones, independently of their position in the command-
  line.

The manual doesn't give an exhaustive list of the more general
options but it mentions -Wall and -Wextra as examples.  I would
tend to view -w as another such example and, like the reporter,
expect "-Werror=return-type -w" to disable all warnings but treat
-Wreturn-type as an error.

Martin


Bad CC0 optimizer in VAX backend (was Re: Patches to fix GCC’s C++ exception handling on NetBSD/VAX)

2016-03-28 Thread Jake Hamby
I have some bad news and some good news. The bad news is that there has been a 
nasty optimizer bug lurking in the VAX backend for GCC for many years, which 
has to do with over-optimistic removal of necessary tst/cmp instructions under 
certain circumstances. This manifests at -O or higher and the symptoms are 
strange behavior like stack overflows because of branches going the wrong way.

The good news is that my suspicions about the CC0 handler appear to be correct, 
and better yet, I'm currently testing a patch that isn't too big and I believe 
will fix this bug. The bad behavior is in vax_notice_update_cc (), which is 
supposed to call CC_STATUS_INIT if the CC flags have been reset, or set 
cc_status.value1 and possibly cc_status.value2 with the destination of the 
current (set...) instruction, whose signed comparison to 0 will be stored in 
the N and Z flags as a side effect (most VAX instructions do this).

The first bug is that most, but not all of the define_insn patterns in vax.md 
actually do this. The paths that emit movz* instructions will not set N and Z, 
and there are some code paths that can't be trusted because they handle a 
64-bit integer in two 32-bit pieces, for example, so N and Z won't reflect the 
desired result (comparison of operand 0 to 0).

The current version of vax_notice_update_cc has a specific check for this: it 
won't set the cc_status cache if GET_CODE (SET_DEST (exp)) != ZERO_EXTRACT. The 
problem is that this is checking the RTL expression for (zero_extract...) and 
not whether what was actually emitted is a movz* or not. That's why all of the 
define_insn()'s have to be annotated with their actual behavior vis-a-vis 
compare to 0, and then the change to vax.c to read the CC attribute makes it 
really much easier to get the correct behavior.

I need to do a lot of testing today to see if this really does help fix GCC's 
VAX backend, but I'm hopeful that this is at least a real bug that needs to be 
fixed, and that I have a fix for it that should work. The other good news is 
that you only need three cc enums: "none" (doesn't alter Z or N flags), 
"compare" (Z and N reflect comparison of operand 0 to literal 0), and "clobber" 
(call CC_STATUS_INIT to reset cc_status cache).

One thing to note is that the current version of vax.c only sets CC_NO_OVERFLOW 
on certain paths, while it should actually always set CC_NO_OVERFLOW so as to 
avoid emitting any unsigned branch instructions that would use an incorrect 
value of the C flag (which most VAX instructions do NOT set). The code that 
handles CC_NO_OVERFLOW is in final.c, and what it does is change signed 
comparisons with 0 into unsigned ones that do the same thing. (a < 0) is always 
false (for unsigned ints), (a >= 0) is always true, (a <= 0) turns into (a == 
0), and (a > 0) turns into (a != 0). Here's the patch to vax.c. I'll send the 
rest after I've tested that it does what I think it should do. The diff to 
vax.md will be larger, but it's mostly adding the "cc" attributes and not code.

Index: gcc/config/vax/vax.c
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/vax.c,v
retrieving revision 1.15
diff -u -r1.15 vax.c
--- gcc/config/vax/vax.c24 Mar 2016 04:27:29 -  1.15
+++ gcc/config/vax/vax.c28 Mar 2016 22:28:10 -
@@ -1131,61 +1136,51 @@
 /* Worker function for NOTICE_UPDATE_CC.  */
 
 void
-vax_notice_update_cc (rtx exp, rtx insn ATTRIBUTE_UNUSED)
+vax_notice_update_cc (rtx exp, rtx_insn *insn)
 {
+  attr_cc cc = get_attr_cc (insn);
+
   if (GET_CODE (exp) == SET)
 {
   if (GET_CODE (SET_SRC (exp)) == CALL)
CC_STATUS_INIT;
-  else if (GET_CODE (SET_DEST (exp)) != ZERO_EXTRACT
-  && GET_CODE (SET_DEST (exp)) != PC)
+  else if (GET_CODE (SET_DEST (exp)) != PC
+  && cc == CC_COMPARE)
{
- cc_status.flags = 0;
- /* The integer operations below don't set carry or
+ /* Some of the integer operations don't set carry or
 set it in an incompatible way.  That's ok though
 as the Z bit is all we need when doing unsigned
 comparisons on the result of these insns (since
 they're always with 0).  Set CC_NO_OVERFLOW to
 generate the correct unsigned branches.  */
- switch (GET_CODE (SET_SRC (exp)))
-   {
-   case NEG:
- if (GET_MODE_CLASS (GET_MODE (exp)) == MODE_FLOAT)
-   break;
-   case AND:
-   case IOR:
-   case XOR:
-   case NOT:
-   case CTZ:
-   case MEM:
-   case REG:
- cc_status.flags = CC_NO_OVERFLOW;
- break;
-   default:
- break;
-   }
+ cc_status.flags = CC_NO_OVERFLOW;
  cc_status.value1 = SET_DEST (exp);
  cc_status.value2 = SET_SRC (exp);
}
+  else if (cc != CC_NONE)
+   

[Bug libgcc/66382] POWER8 Vector optimized implementation of __float128 (IEEE754 128-bit Binary Floating Point)

2016-03-28 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382

--- Comment #7 from Michael Meissner  ---
I do not think it is worthwhile to expand the IEEE 128-bit software emulation
routines at the point of call in ISA 2.07 (power8).  This is due to the fact
that a lot of processing goes on in the emulation library.

Now if people were motivated, we could replace the soft-fp functions with tuned
versions written for the Power8 ISA.

However, I don't think it is going to be an easy task.  I'll list the things
that come to mind first.  They might be able to be done, but it is a
time/effort calculation of whether the return on investment is worth it.

The first issue is the next ISA (3.0) has support in it already for doing IEEE
128-bit floating point in hardware, including supporting the various rounding
modes, etc. If you configure and build gcc with an assembler that supports the
ISA 3.0 instruction set, it will add in IFUNC support so that when a program is
run on ISA 3.0 hardware, it will automatically use a version of __addkf3 that
uses the xsaddqp instruction instead of doing the emulation. So this effort
would only be for the current generation of hardware.

The next issue is right now you cannot do 64-bit scalar int arithmetic in the
VSX unit. At present, the compiler does not allow DImode into the Altivec
registers, but the only support for 64-bit integer arithmetic uses an Altivec
encoding and only uses Altivec registers (vaddudm, etc.).

I am working on patches to allow DImode variables in Altivec registers (and
later SImode, HImode, QImode). My first run with the patch shows 1 benchmark 2%
faster (perlbench) and one 2% slower (omnetpp), but I feel it needs a lot more
tuning.  At the moment, that work is lower priority, as I am trying to make
__float128 _Complex work as my highest priority (obviously other people could
take up this work).

After allowing DImode into Altivec registers, and perhaps doing 64-bit
arithmetic via the 64-bit integer vector instructions, another issue is that
the cycle time of the vector unit is 1/2 that of the GPR unit, so it will need
a lot of tuning.

I don't think the ISA 2.07 instruction set is general enough to do inserts and
extracts of 128-bit values that you would need for packing and unpacking the
IEEE 128-bit floating point values. ISA 3.0 has all of this support, including
specialized instructions to extract/set the exponent or mantissa, but then it
also has the hardware support for IEEE 128-bit floating point.

Load/stores are also problematical in ISA 2.07, given there is no d-form
addressing for Altivec registers.  So if you spill a DImode value in an Altivec
register, you need to load up the offset in a GPR to do the memory operation.

Re: [PATCH] c++/67376 Comparison with pointer to past-the-end, of array fails inside constant expression

2016-03-28 Thread Martin Sebor

I think let's defer the fix for c++/60760 (i.e. the nullptr_p bits)
until stage 1, when it can be combined with the POINTER_PLUS_EXPR fix,
and put the rest of this patch in now.


I can split up the patch into two and post the subset without
the fix for c++/60760, though I don't expect to be done with
it after I get back (next week).

I'd like to understand your concern with the fix for c++/60760.
Is it that it's incomplete (doesn't reject taking the address
of the first member of a struct, as in >first_member),
or are you worried that the changes may not be stable enough?


More the latter; it seems like significant new code and doesn't fix a
regression.


Attached is an updated patch without the fix for c++/60760, retested
on x86_64.

Martin
PR c++/67376 - [5/6 regression] Comparison with pointer to past-the-end
	of array fails inside constant expression
PR c++/70170 - [6 regression] bogus not a constant expression error comparing
	pointer to array to null
PR c++/70172 - incorrect reinterpret_cast from integer to pointer error
	on invalid constexpr initialization
PR c++/70228 - insufficient detail in diagnostics for a constexpr out of bounds
	array subscript

gcc/testsuite/ChangeLog:
2016-03-18  Martin Sebor  

	PR c++/67376
	PR c++/70170
	PR c++/70172
	PR c++/70228
	* g++.dg/cpp0x/constexpr-array-ptr10.C: New test.
	* g++.dg/cpp0x/constexpr-array-ptr9.C: New test.
	* g++.dg/cpp0x/constexpr-array5.C: Adjust text of expected diagnostic.
	* g++.dg/cpp0x/constexpr-string.C: Same.
	* g++.dg/cpp0x/constexpr-wstring2.C: Same.
	* g++.dg/cpp0x/pr65398.C: Same.
	* g++.dg/ext/constexpr-vla1.C: Same.
	* g++.dg/ext/constexpr-vla2.C: Same.
	* g++.dg/ext/constexpr-vla3.C: Same.
	* g++.dg/ubsan/pr63956.C: Same.

gcc/cp/ChangeLog:
2016-03-18  Martin Sebor  

	PR c++/67376
	PR c++/70170
	PR c++/70172
	PR c++/70228
	* constexpr.c (diag_array_subscript): New function.
	(cxx_eval_array_reference): Detect out of bounds array indices.

gcc/ChangeLog:
2016-03-18  Martin Sebor  

	PR c++/67376
	* fold-const.c (maybe_nonzero_address): New function.
	(fold_comparison): Call it.  Fold equality and relational
	expressions involving null pointers.
	(tree_single_nonzero_warnv_p): Call maybe_nonzero_address.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 7776cac..c900080 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1837,6 +1837,30 @@ find_array_ctor_elt (tree ary, tree dindex, bool insert = false)
   return -1;
 }
 
+/* Under the control of CTX, issue a detailed diagnostic for
+   an out-of-bounds subscript INDEX into the expression ARRAY.  */
+
+static void
+diag_array_subscript (const constexpr_ctx *ctx, tree array, tree index)
+{
+  if (!ctx->quiet)
+{
+  tree arraytype = TREE_TYPE (array);
+
+  /* Convert the unsigned array subscript to a signed integer to avoid
+	 printing huge numbers for small negative values.  */
+  tree sidx = fold_convert (ssizetype, index);
+  if (DECL_P (array))
+	{
+	  error ("array subscript value %qE is outside the bounds "
+		 "of array %qD of type %qT", sidx, array, arraytype);
+	  inform (DECL_SOURCE_LOCATION (array), "declared here");
+	}
+  else
+	error ("array subscript value %qE is outside the bounds "
+	   "of array type %qT", sidx, arraytype);
+}
+}
 
 /* Subroutine of cxx_eval_constant_expression.
Attempt to reduce a reference to an array slot.  */
@@ -1861,6 +1885,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
 	false,
 	non_constant_p, overflow_p);
   VERIFY_CONSTANT (index);
+
   if (lval && ary == oldary && index == oldidx)
 return t;
   else if (lval)
@@ -1885,8 +1910,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   if (!tree_fits_shwi_p (index)
   || (i = tree_to_shwi (index)) < 0)
 {
-  if (!ctx->quiet)
-	error ("negative array subscript");
+  diag_array_subscript (ctx, ary, index);
   *non_constant_p = true;
   return t;
 }
@@ -1898,8 +1922,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   VERIFY_CONSTANT (nelts);
   if (!tree_int_cst_lt (index, nelts))
 {
-  if (!ctx->quiet)
-	error ("array subscript out of bound");
+  diag_array_subscript (ctx, ary, index);
   *non_constant_p = true;
   return t;
 }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 44fe2a2..3f65243 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -8336,6 +8336,20 @@ pointer_may_wrap_p (tree base, tree offset, HOST_WIDE_INT bitpos)
   return total.to_uhwi () > (unsigned HOST_WIDE_INT) size;
 }
 
+/* Return a positive integer when the symbol DECL is known to have
+   a nonzero address, zero when it's known not to (e.g., it's a weak
+   symbol), and a negative integer when the symbol is not yet in the
+   symbol table and so whether or not its address is zero is unknown.  */
+static int
+maybe_nonzero_address (tree decl)
+{
+  if (DECL_P (decl) && decl_in_symtab_p 

[Bug c++/64266] Can GCC produce local mergeable symbols for *.__FUNCTION__ and *.__PRETTY_FUNCTION__ functions?

2016-03-28 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64266

--- Comment #8 from Jason Merrill  ---
(In reply to Jason Merrill from comment #7)
> until the C++ committee decides something about whether __func__ can be 
> shared.

Actually, C++11 already says "It is unspecified whether such a variable has an
address distinct from that of any other object in the program."  So it's
definitely sharable.

RFA: PATCH to tree-inline.c:remap_decls for c++/70353 (ICE with __func__ and constexpr)

2016-03-28 Thread Jason Merrill
The constexpr evaluation code uses the inlining code to remap the 
constexpr function body for evaluation so that recursion works properly. 
 In this testcase __func__ is declared as a static local variable, so 
rather than remap it, remap_decls tries to add it to the local_decls 
list for the function we're inlining into.  But there is no such 
function in this case, so we crash.


Avoid the add_local_decl call when cfun is null avoids the ICE (thanks 
Jakub), but results in an undefined symbol.  Calling 
varpool_node::finalize_decl instead allows cgraph to handle the 
reference from 'c' properly.


OK if testing passes?
commit d875b432434be92ab345c3851fa3ba4bef738399
Author: Jason Merrill 
Date:   Mon Mar 28 17:14:43 2016 -0400

	PR c++/70353
	* tree-inline.c (remap_decls): Use varpool_node::finalize_decl if
	cfun is null.

diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
new file mode 100644
index 000..e678290
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
@@ -0,0 +1,13 @@
+// PR c++/70353
+// { dg-do link { target c++11 } }
+
+constexpr const char* ce ()
+{
+  return __func__;
+}
+
+const char *c = ce();
+
+int main()
+{
+}
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 9d4f8f7..74b0ca9 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -614,10 +614,16 @@ remap_decls (tree decls, vec **nonlocalized_list,
   if (can_be_nonlocal (old_var, id))
 	{
 	  /* We need to add this variable to the local decls as otherwise
-	 nothing else will do so.  */
+	 nothing else will do so.  If we're not in a function,
+	 tell cgraph about it.  */
 	  if (TREE_CODE (old_var) == VAR_DECL
 	  && ! DECL_EXTERNAL (old_var))
-	add_local_decl (cfun, old_var);
+	{
+	  if (cfun)
+		add_local_decl (cfun, old_var);
+	  else
+		varpool_node::finalize_decl (old_var);
+	}
 	  if ((!optimize || debug_info_level > DINFO_LEVEL_TERSE)
 	  && !DECL_IGNORED_P (old_var)
 	  && nonlocalized_list)


[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-28 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

--- Comment #7 from Jim Wilson  ---
The function alloc_entries calls fancy_abort with __func__, causing the string
to get into the output.  This function is inlined into the hash_table expand
function, which is then in turn inlined into other functions.

Without debug info, there is a single reference to this string, in the
hash_table expand function, so it gets emitted into the comdat rodata section
for that function.

With debug info, we have multiple references to this string.  One is in the
code for the expand function.  The others are symbol table entries for the
inlined instances of the alloc_entries and/of hash_table expand functions.

dwarf2out calls output_constant_def to get the rtl for a string. 
output_constant_def calls build_constant_desc which generates the rtl for a
string.

Without flag_section_anchors, build_constant_desc generates a SYMBOL_REF, and
the placement to a section happens later.

With flag_section_anchors, build_constant_desc generates a label in a specific
section.  It calls get_constant_section to get the section, which calls
select_section, which is default_elf_select_section, which calls
mergeable_constant_section, which calls function_mergeable_rodata_prefix, which
uses current_function_decl.  So the section placement depends on which function
we happen to be in when we see the first reference to the string, which in this
case is curr_statistics_hash.

So the string gets emitted into different sections with and without debug info
when flag_sections_anchors is true.

It seems like there are some conflicting goals here.  We want to share the
string across functions, but we also want to put it in function specific comdat
sections.  We want the same code with and without debug info, but we have more
references to the string with debug info than without.  It isn't clear what the
solution is.

It does appear that the fundamental problem is that the section anchors support
in build_constant_desc should not be choosing a section when it is called from
dwarf2out.c.  The section should be chosen later somehow.  Or perhaps just not
for debug info references.  Not clear how to handle this as the moment.

[Bug c++/70353] [5/6 regression] ICE on __PRETTY_FUNCTION__ in a constexpr function

2016-03-28 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70353

--- Comment #16 from Martin Sebor  ---
I was just about commit the following patch for the failure (false positive) in
the test.

Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 234510)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2016-03-28  Martin Sebor  
+
+   * g++.dg/ext/fnname5.C: Adjust ERE pattern to anticipate any
+   whitespace character after the .string assembler directive.
+
 2016-03-28  Dominique d'Humieres  

g++.dg/ext/fnname5.C: Update the test for Darwin.
Index: gcc/testsuite/g++.dg/ext/fnname5.C
===
--- gcc/testsuite/g++.dg/ext/fnname5.C  (revision 234510)
+++ gcc/testsuite/g++.dg/ext/fnname5.C  (working copy)
@@ -29,5 +29,5 @@ main ()
 /* { dg-final { scan-assembler-not "_ZZN1A3fooEvE12__FUNCTION__" } } */
 /* { dg-final { scan-assembler-not "_ZZN1A3fooEiE12__FUNCTION__" } } */
 /* { dg-final { scan-assembler-not "_ZZN1A3fooEiE19__PRETTY_FUNCTION__" } } */
-/* { dg-final { scan-assembler ".(string|ascii) \"void A::foo\\(int\\)(.0)?\""
} } */
-/* { dg-final { scan-assembler ".(string|ascii) \"foo(.0)?\"" } } */
+/* { dg-final { scan-assembler ".(string|ascii)\[\[:space:\]\]*\"void
A::foo\\(int\\)(.0)?\"" } } */
+/* { dg-final { scan-assembler ".(string|ascii)\[\[:space:\]\]*\"foo(.0)?\"" }
} */

[Bug target/64971] [5/6 Regression] gcc.c-torture/compile/pr37433.c ICEs with -mabi=ilp32

2016-03-28 Thread rth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64971

--- Comment #8 from Richard Henderson  ---
Created attachment 38112
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38112=edit
proposed patch

Andrew's approach to force the SYMBOL_REF to DImode is certainly one way
to approach it; another is to accept the SImode SYMBOL_REF.  Which can be
done relatively easily with a define_special_predicate.

Whichever solution is chosen, there appears to be a disconnect between 
the call and sibcall patterns:

 (1) sibcalls fail to check for aarch64_islong_call_p.
 (2) sibcalls use a combined pattern with aarch64_call_insn_operand
 and Usf constraint, whereas calls use two separate patterns.

This disconnect should be fixed at the same time.

[Bug c++/64266] Can GCC produce local mergeable symbols for *.__FUNCTION__ and *.__PRETTY_FUNCTION__ functions?

2016-03-28 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64266

Jason Merrill  changed:

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
 Resolution|FIXED   |---

--- Comment #7 from Jason Merrill  ---
Reverted the patch due to bootstrap problems on some targets.

Martin, have you tried just building with -fmerge-all-constants?  That should
provide the effect you are looking for, until the C++ committee decides
something about whether __func__ can be shared.

[Bug c++/55004] [meta-bug] constexpr issues

2016-03-28 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55004
Bug 55004 depends on bug 70353, which changed state.

Bug 70353 Summary: [5/6 regression] ICE on __PRETTY_FUNCTION__ in a constexpr 
function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70353

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
 Resolution|FIXED   |---

[Bug c++/70353] [5/6 regression] ICE on __PRETTY_FUNCTION__ in a constexpr function

2016-03-28 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70353

Jason Merrill  changed:

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
 Resolution|FIXED   |---

--- Comment #15 from Jason Merrill  ---
Reverted the fix because it broke bootstrap on some targets.

[Bug c++/70353] [5/6 regression] ICE on __PRETTY_FUNCTION__ in a constexpr function

2016-03-28 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70353

--- Comment #14 from Jason Merrill  ---
Author: jason
Date: Mon Mar 28 20:16:21 2016
New Revision: 234511

URL: https://gcc.gnu.org/viewcvs?rev=234511=gcc=rev
Log:
PR c++/70422

PR c++/64266
PR c++/70353
* decl.c, pt.c, constexpr.c: Revert last patch.

Added:
trunk/gcc/testsuite/g++.old-deja/g++.ext/pretty4.C
Removed:
trunk/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
trunk/gcc/testsuite/g++.dg/ext/fnname5.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/constexpr.c
trunk/gcc/cp/decl.c
trunk/gcc/cp/pt.c

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-28 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

--- Comment #6 from Jason Merrill  ---
Author: jason
Date: Mon Mar 28 20:16:21 2016
New Revision: 234511

URL: https://gcc.gnu.org/viewcvs?rev=234511=gcc=rev
Log:
PR c++/70422

PR c++/64266
PR c++/70353
* decl.c, pt.c, constexpr.c: Revert last patch.

Added:
trunk/gcc/testsuite/g++.old-deja/g++.ext/pretty4.C
Removed:
trunk/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
trunk/gcc/testsuite/g++.dg/ext/fnname5.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/constexpr.c
trunk/gcc/cp/decl.c
trunk/gcc/cp/pt.c

[Bug c++/64266] Can GCC produce local mergeable symbols for *.__FUNCTION__ and *.__PRETTY_FUNCTION__ functions?

2016-03-28 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64266

--- Comment #6 from Jason Merrill  ---
Author: jason
Date: Mon Mar 28 20:16:21 2016
New Revision: 234511

URL: https://gcc.gnu.org/viewcvs?rev=234511=gcc=rev
Log:
PR c++/70422

PR c++/64266
PR c++/70353
* decl.c, pt.c, constexpr.c: Revert last patch.

Added:
trunk/gcc/testsuite/g++.old-deja/g++.ext/pretty4.C
Removed:
trunk/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
trunk/gcc/testsuite/g++.dg/ext/fnname5.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/constexpr.c
trunk/gcc/cp/decl.c
trunk/gcc/cp/pt.c

Fwd: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible

2016-03-28 Thread Bin.Cheng
Sorry, Should have replied to gcc-patches list.

Thanks,
bin

-- Forwarded message --
From: "Bin.Cheng" 
Date: Tue, 29 Mar 2016 03:55:04 +0800
Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
DR against its innermost loop bahavior if possible
To: Richard Biener 

On 3/17/16, Richard Biener  wrote:
> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng  wrote:
>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>  wrote:
>>>
>>> Hmm.
>> Hi,
>> Thanks for reviewing.
>>>
>>> +  equal_p = true;
>>> +  if (e1->base_address && e2->base_address)
>>> +equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>> +  if (e1->offset && e2->offset)
>>> +equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>
>>> surely better to return false early.
>>>
>>> I think we don't want this in tree-data-refs.h also because of ...
>>>
>>> @@ -615,15 +619,29 @@
>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>> (data_reference_p a)
>>>data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>> creating the DR (or adjust the equality function
>> and hashing
>>>tree ref = DR_REF (a);
>>>tree base_ref = DR_BASE_OBJECT (a);
>>> +  innermost_loop_behavior *innermost = _INNERMOST (a);
>>>tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>bool exist1, exist2;
>>>
>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>> -|| TREE_CODE (ref) == IMAGPART_EXPR
>>> -|| TREE_CODE (ref) == REALPART_EXPR)
>>> -ref = TREE_OPERAND (ref, 0);
>>> +  /* If reference in DR has innermost loop behavior and it is not
>>> + a compound memory reference, we store it to innermost_DR_map,
>>> + otherwise to ref_DR_map.  */
>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>> +  || TREE_CODE (ref) == IMAGPART_EXPR
>>> +  || TREE_CODE (ref) == REALPART_EXPR
>>> +  || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>> +  || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>> +{
>>> +  while (TREE_CODE (ref) == COMPONENT_REF
>>> +|| TREE_CODE (ref) == IMAGPART_EXPR
>>> +|| TREE_CODE (ref) == REALPART_EXPR)
>>> +   ref = TREE_OPERAND (ref, 0);
>>> +
>>> +  master_dr = _DR_map->get_or_insert (ref, );
>>> +}
>>> +  else
>>> +master_dr = _DR_map->get_or_insert (innermost, );
>>>
>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>> need to
>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>> and REALPART) before creating the DR (or adjust the equality function
>>> and hashing
>>> to disregard them which means subtracting their offset from DR_INIT.
>> I am not sure if I understand correctly.  But for component reference,
>> it is the base object that we want to record/track.  For example,
>>
>>   for (i = 0; i < N; i++) {
>> m = *data++;
>>
>> m1 = p1->x - m;
>> m2 = p2->x + m;
>>
>> p3->y = (m1 >= m2) ? p1->y : p2->y;
>>
>> p1++;
>> p2++;
>> p3++;
>>   }
>> We want to infer that reads of p1/p2 in condition statement won't trap
>> because there are unconditional reads of the structures, though the
>> unconditional reads are actual of other sub-objects.  Here it is the
>> invariant part of address that we want to track.
>
> Well, the variant parts - we want to strip invariant parts as far as we can
> (offsetof (x) and offsetof (y))
>
>> Also illustrated by this example, we can't rely on data-ref analyzer
>> here.  Because in gathering/scattering cases, the address could be not
>> affine at all.
>
> Sure, but that's a different issue.
>
>>>
>>> To adjust the references we collect you'd maybe could use a callback
>>> to get_references_in_stmt
>>> to adjust them.
>>>
>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>> as
>> Is this a part of the method you suggested above, or is it an
>> alternative one?  If it's the latter, then I have below questions
>> embedded.
>
> It is an alternative to adding a hook to get_references_in_stmt and
> probably "easier".
>
>>>
>>> Index: tree-if-conv.c
>>> ===
>>> --- tree-if-conv.c  (revision 234215)
>>> +++ tree-if-conv.c  (working copy)
>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>
>>>for (i = 0; refs->iterate (i, ); i++)
>>>  {
>>> +  tree *refp = _REF (dr);
>>> +  while ((TREE_CODE (*refp) == COMPONENT_REF
>>> + && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>> +|| TREE_CODE (*refp) == IMAGPART_EXPR
>>> +|| TREE_CODE (*refp) == REALPART_EXPR)
>>> +   refp = _OPERAND (*refp, 0);
>>> +  if (refp != _REF (dr))
>>> +   {
>>> + tree saved_base = *refp;
>>> + *refp = integer_zero_node;
>>> +
>>> + if (DR_INIT (dr))
>>> +   {
>>> + 

Re: Should a disabled warning be allowed to be promoted to an error(Bugzilla PR 70275)?

2016-03-28 Thread Florian Weimer
> In Bugzilla PR # 70275, Manuel López-Ibáñez reports that even though
> he provides the "-Werror=return-type" option, the compiler doesn't
> flag the warning/error about a control reaching the end of a non-void
> function, due to the presence of the "-w" option.  He points out that
> clang++ wtill flags the promoted warning even though warnings are
> inhibited.

I think -w is ordered with respect to the other warning obtions, and
-w inhibits previously requested warnings, and future -W flags may
enable other warnings.  With this in mind, I agree that the current
GCC behavior is consistent and probably not a bug.


[Bug c++/70431] New: [C++11] Out of line defaulted copy constructor of a union does not compile.

2016-03-28 Thread vmpstr at chromium dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70431

Bug ID: 70431
   Summary: [C++11] Out of line defaulted copy constructor of a
union does not compile.
   Product: gcc
   Version: 5.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vmpstr at chromium dot org
  Target Milestone: ---

Given:

union Foo {
  Foo() = default;
  Foo(const Foo& other);

  int a;
  float b;
};

Foo::Foo(const Foo& other) = default;

I get the following compilation error:

$ g++ -std=c++11 -c foo.cpp
foo.cpp: In copy constructor ‘Foo::Foo(const Foo&)’:
foo.cpp:1:7: error: initializations for multiple members of ‘Foo’
 union Foo {
   ^
foo.cpp: At global scope:
foo.cpp:9:30: note: synthesized method ‘Foo::Foo(const Foo&)’ first required
here 
 Foo::Foo(const Foo& other) = default;
  ^

$ g++ --version
g++ (GCC) 5.3.0

The expected behavior is that this would compile. Note that if the constructor
is inline, then it compiles as expected.

Re: Compute all gimple statements in which a var is used

2016-03-28 Thread Richard Biener
On March 28, 2016 7:23:26 PM GMT+02:00, Cristina Georgiana Opriceana 
 wrote:
>Hello,
>
>In order to compute all the statements where a variable is used, is it
>enough to rely on the SSA analysis? I tried to do something like this:
>
>FOR_EACH_LOCAL_DECL (cfun, i, var) {
>for (unsigned int i = 0; i < num_ssa_names; i++) {
>if (ssa_name(i) && SSA_NAME_VAR (ssa_name(i)) == var) {
>tree ssa_node = ssa_name(i);
>FOR_EACH_IMM_USE_FAST (use_op, imm_iter, ssa_node) {
>gimple * use_stmt = USE_STMT(use_op);
>...
>}

No, you'll miss variables not in SSA form.

Richard.

>Thanks,
>Cristina




[Bug fortran/70235] [4.9/5/6 Regression] Incorrect output with PF format

2016-03-28 Thread jvdelisle at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70235

--- Comment #23 from Jerry DeLisle  ---
(In reply to Dominique d'Humieres from comment #22)
> Created attachment 38107 [details]
> New patch with test.
> 
> With the patch we now get for y=6431.25
> 
> ru,-8pf18.2 y=  0.01
> 
> IMO this is the correct rounding. Does someone disagree with that?
> 
> What tests should be removed/added from gfortran.dg/fmt_pf.f90?

I think good.

With round up, Any x,  .01 >= x > 0.0 is .01
   .1 >= x > 0.0 is .1

Should a disabled warning be allowed to be promoted to an error(Bugzilla PR 70275)?

2016-03-28 Thread kevin-tucker
In Bugzilla PR # 70275, Manuel López-Ibáñez reports that even though he 
provides the "-Werror=return-type" option, the compiler doesn't flag the 
warning/error about a control reaching the end of a non-void function, due to 
the presence of the "-w" option.  He points out that clang++ wtill flags the 
promoted warning even though warnings are inhibited.

I'm thinking that not reporting the promoted error is the correct and desired 
behavior, based on two factors:
1) Warnings are inhibited, so there is no warning to promote.

2) The comment in diagnostic.c which reads:
/* Give preference to being able to inhibit warnings, before they
 get reclassified to something else.  */

I was looking at this issue as one I could use to get my feet wet in GCC 
maintenance , and would be glad to take on if it is decided that it is 
desired, which I would argue that it is not, but I thought I'd open up 
discussion on the topic.

I have worked on the MIcrochip PIC18 C compiler, and worked for DDC-I, 
supporting their compiler suites,(Ada, C/C++) as well as some work on the ASIS 
standard, and thought it might be fun to work on GCC.


Looking forward to reading discussion on the topic!

Kevin Tucker



Re: [PATCH] Add security_sensitive attribute to clean function stack and regs.

2016-03-28 Thread Marcos Díaz
(Ping and changelog fix)

On Tue, Mar 22, 2016 at 11:15 AM, Marcos Díaz
 wrote:
> Hi,
>the attached patch adds a new attribute 'security_sensitive' for functions.
> The idea was discussed in PR middle-end/69976.
> This attribute makes gcc to emit clean up code at the function's epilogue.
> This clean-up code cleans the stack used by this function and that isn't
> needed anymore. It also cleans used registers. It only works in x86_64.
> Please, review the discussion here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69976
> since we had some doubts with the implementation.
>
> We also added some test-cases and ran all tests in x86_64.
> We think this isn't a bug-fix but a new feature.
>
> Changelog
> 2016-03-21  Marcos Diaz  
>Andres Tiraboschi  
>
> PR tree-optimization/69820
This line should've been  PR middle-end/69976
> * config/i386/i386-protos.h: Add ix86_clear_regs_emit and
> ix86_sec_sensitive_attr_p
> * config/i386/i386.c: (ix86_sec_sensitive_attr_p): New function
> (ix86_using_red_zone): now take into account if the function has the new
> attribute.
> (is_preserved_reg): New function.
> (is_integer_reg): New function.
> (is_used_as_ret): New function.
> (reg_to_string): New function.
> (clear_reg_emit): New function.
> (ix86_clear_regs_emit): New function.
> (ix86_expand_epilogue): Added code to emit clean up code only when
> security_sensitive attribute is set.
> (ix86_handle_security_sensitive_attribute): New function.
> (ix86_attribute_table): Added new attribute.
> * config/i386/i386.md: (UNSPECV_CLRSTACK): New unspecv.
> (UNSPECV_CLRREGS): New unspecv.
> (return): Conditionally emit cleaning regs code.
> (simple_return): Likewise
> (clear_regs): New insn.
> (clear_stack): New insn.
> * doc/extend.texi: Added description for new security_sensitive attribute.

Changelog
2016-03-21  Marcos Diaz  
   Andres Tiraboschi  

PR middle-end/69976
* config/i386/i386-protos.h: Add ix86_clear_regs_emit and
ix86_sec_sensitive_attr_p
* config/i386/i386.c: (ix86_sec_sensitive_attr_p): New function
(ix86_using_red_zone): now take into account if the function has the new
attribute.
(is_preserved_reg): New function.
(is_integer_reg): New function.
(is_used_as_ret): New function.
(reg_to_string): New function.
(clear_reg_emit): New function.
(ix86_clear_regs_emit): New function.
(ix86_expand_epilogue): Added code to emit clean up code only when
security_sensitive attribute is set.
(ix86_handle_security_sensitive_attribute): New function.
(ix86_attribute_table): Added new attribute.
* config/i386/i386.md: (UNSPECV_CLRSTACK): New unspecv.
(UNSPECV_CLRREGS): New unspecv.
(return): Conditionally emit cleaning regs code.
(simple_return): Likewise
(clear_regs): New insn.
(clear_stack): New insn.
* doc/extend.texi: Added description for new security_sensitive attribute.


Re: rs6000 stack_tie mishap again

2016-03-28 Thread Richard Henderson

On 03/23/2016 09:10 PM, Alan Modra wrote:

On Wed, Mar 23, 2016 at 05:04:39PM +0100, Olivier Hainque wrote:

The reason why 894 is not accounted in the base ref computation is because it
is part of the epilogue sequence, and init_alias_analysis has:

   /* Walk the insns adding values to the new_reg_base_value array.  */
   for (i = 0; i < rpo_cnt; i++)
{ ...
  if (could_be_prologue_epilogue
  && prologue_epilogue_contains (insn))
continue;

The motivation for this is unclear to me.


https://gcc.gnu.org/ml/gcc-patches/1999-08n/msg00048.html


My rough understanding is that we probably really care about frame_related
insns only here, at least on targets where the flag is supposed to be set
accurately.


Also, possibly just prologue insns.  So you might be able to modify
init_alias_analysis just to ignore the prologue and skip any need for
yet another hook.

Let's see what rth thinks.  He did say the patch might need to be
redone.  :)
https://gcc.gnu.org/ml/gcc-patches/1999-08n/msg00072.html


I be surprised if this is works as expected without side-effects.  You've now 
exposed the restore of the frame pointer to alias analysis, and it's probably 
not seen as constant anymore.  As you reference, I expect that any patch that 
opens the epilogue to such scrutiny is going to have to special-case the frame 
pointer as well.


That said, as Segher points out later in the thread, one can arrange for hard 
regs within the body to bleed into temporaries used within the epilogue, which 
is bad.  So perhaps this is exactly what's needed longer-term.  More 
investigation is required.


But I expect for stage4, the best solution is to strengthen the stack_tie 
pattern to block all memory.  Early scheduling of the stack frame deallocation 
(a simple logic insn) can't really be that important to performance.



r~


[Bug target/70421] [5/6 Regression] wrong code with v16si vector and useless cast at -O -mavx512f

2016-03-28 Thread zsojka at seznam dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70421

--- Comment #3 from Zdenek Sojka  ---
(In reply to Jakub Jelinek from comment #2)
> Untested fix:
> --- gcc/config/i386/i386.c(revision 234449)
> +++ gcc/config/i386/i386.c(working copy)
> @@ -46930,7 +46930,7 @@ half:
>  {
>tmp = gen_reg_rtx (mode);
>emit_insn (gen_rtx_SET (tmp, gen_rtx_VEC_DUPLICATE (mode, val)));
> -  emit_insn (gen_blendm (target, tmp, target,
> +  emit_insn (gen_blendm (target, target, tmp,
>force_reg (mmode,
>   gen_int_mode (1 << elt, mmode;
>  }

Thanks, that fixes the testcase and some other miscompiles. I haven't run the
testsuite though.

Is the fact that the useless cast in:
  x ^= ((v16si)v)[u[0]];
changes the generated code (makes it significantly worse at all opt levels)
worth creating a PR?

Compute all gimple statements in which a var is used

2016-03-28 Thread Cristina Georgiana Opriceana
Hello,

In order to compute all the statements where a variable is used, is it
enough to rely on the SSA analysis? I tried to do something like this:

FOR_EACH_LOCAL_DECL (cfun, i, var) {
for (unsigned int i = 0; i < num_ssa_names; i++) {
if (ssa_name(i) && SSA_NAME_VAR (ssa_name(i)) == var) {
tree ssa_node = ssa_name(i);
FOR_EACH_IMM_USE_FAST (use_op, imm_iter, ssa_node) {
gimple * use_stmt = USE_STMT(use_op);
...
}

Thanks,
Cristina


[Bug c++/70430] Incorrect result for logical "and" operation with mixed vector and scalar

2016-03-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70430

Marc Glisse  changed:

   What|Removed |Added

   Keywords||wrong-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-03-28
 Ever confirmed|0   |1

--- Comment #1 from Marc Glisse  ---
cp/typeck.c:4367
  op1 = build_conditional_expr (location, op1, z, m1, complain);

Argh!!!

I don't dare run svn blame, but I am pretty sure that's my fault :-(

[Bug testsuite/64177] Various cilk+ testsuite failures

2016-03-28 Thread iverbin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64177

iverbin at gcc dot gnu.org changed:

   What|Removed |Added

 CC||iverbin at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

--- Comment #1 from iverbin at gcc dot gnu.org ---
This issue was discussed here:
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01817.html

We need __cilkrts_set_param ("nworkers", "2"); in such tests.

Re: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C

2016-03-28 Thread Ilya Verbin
Hi Thomas!

Do you plan to commit this patch? :)

On Mon, Sep 29, 2014 at 09:24:40 -0600, Jeff Law wrote:
> On 09/29/14 08:26, Thomas Schwinge wrote:
> >On Mon, 29 Sep 2014 13:58:31 +, "Tannenbaum, Barry M" 
> > wrote:
> >>In a nutshell, add the following code to main() before the call to f3():
> >>
> >> int status = __cilkrts_set_param("nworkers", "2");
> >> if (0 != status) {
> >> // Failed to set the number of Cilk workers
> >> return status;
> >> }
> >
> >Yeah, that's what I had proposed with the patch at the end of my previous
> >email,
> >.
> >I'm sorry if I didn't make it obvious that more text and the patch were
> >following after the full-quote of the original issue description.
> >
> >>Here's the details: [...]
> >
> >Thanks again for your helpful comments; that's appreciated.
> >
> >Here's again my proposed patch.  Note, that the include paths in GCC
> >compiler testing (gcc/testsuite/) are not set up to pick up the
> > include file, so I've manually added a propotype for
> >the __cilkrts_set_param function to the three files.  I can change that,
> >if requested.
> >
> >commit ee7138e451d1f3284d6fa0f61fe517c82db94060
> >Author: Thomas Schwinge 
> >Date:   Mon Sep 29 12:47:34 2014 +0200
> >
> > Audit Cilk Plus tests for CILK_NWORKERS=1.
> >
> > gcc/testsuite/
> > * c-c++-common/cilk-plus/CK/spawning_arg.c (main): Call
> > __cilkrts_set_param to set two workers.
> > * c-c++-common/cilk-plus/CK/steal_check.c (main): Likewise.
> > * g++.dg/cilk-plus/CK/catch_exc.cc (main): Likewise.
> OK.
> Jeff

  -- Ilya


Re: [Patch, Fortran, pr70397, gcc-5, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-28 Thread Andre Vehreschild
Hi Paul,

thanks for the quick review. Committed to gcc-5-branch as r234507. The
patch for trunk needs more polishing than expected. I hope to present
it soon.

Regards,
Andre

On Sun, 27 Mar 2016 19:19:11 +0200
Paul Richard Thomas  wrote:

> Hi Andre,
> 
> The patch looks to be fine to me for both trunk and 5-branch.
> 
> Thanks for the patch.
> 
> Paul
> 
> On 27 March 2016 at 18:53, Andre Vehreschild  wrote:
> > Hi all,
> >
> > and here is already the follow-up. In the initial patch a safe wasn't 
> > commenced
> > before pulling the patch, which lead to a refactoring of the new functions 
> > node
> > to be partial only. Sorry for the noise.
> >
> > - Andre
> >
> > Am Sun, 27 Mar 2016 18:49:18 +0200
> > schrieb Andre Vehreschild :
> >  
> >> Hi all,
> >>
> >> attached is a patch to fix an ICE on allocating an unlimited polymorphic
> >> entity from a non-poly class or type without an length component. The 
> >> routine
> >> gfc_copy_class_to_class() assumed that both the source and destination
> >> object's type is unlimited polymorphic, but in this case it is true for the
> >> destination only, which made gfortran look for a non-existent _len 
> >> component
> >> in the source object and therefore ICE. This is fixed by the patch by 
> >> adding
> >> a function to return either the _len component, when it exists, or a 
> >> constant
> >> zero node to init the destination object's _len component with.
> >>
> >> Bootstrapped and regtested ok on x86_64-linux-gnu/F23. (Might have some
> >> line deltas, because my git is a bit older. Sorry, only have limited/slow
> >> net-access currently.)
> >>
> >> The same patch should be adaptable to trunk. To come...
> >>
> >> Ok for 5-trunk?
> >>
> >> Regards,
> >>   Andre  
> >
> >
> >



-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 234506)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,12 @@
+2016-03-28  Andre Vehreschild  
+
+	PR fortran/70397
+	* trans-expr.c (gfc_class_len_or_zero_get): Add function to return a
+	constant zero tree, when the class to get the _len component from is
+	not unlimited polymorphic.
+	(gfc_copy_class_to_class): Use the new function.
+	* trans.h: Added interface of new function gfc_class_len_or_zero_get.
+
 2016-03-28  Alessandro Fanfarillo  
 
 	Backport from trunk.
Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c	(Revision 234506)
+++ gcc/fortran/trans-expr.c	(Arbeitskopie)
@@ -173,6 +173,24 @@
 }
 
 
+/* Try to get the _len component of a class.  When the class is not unlimited
+   poly, i.e. no _len field exists, then return a zero node.  */
+
+tree
+gfc_class_len_or_zero_get (tree decl)
+{
+  tree len;
+  if (POINTER_TYPE_P (TREE_TYPE (decl)))
+decl = build_fold_indirect_ref_loc (input_location, decl);
+  len = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (decl)),
+			   CLASS_LEN_FIELD);
+  return len != NULL_TREE ? fold_build3_loc (input_location, COMPONENT_REF,
+	 TREE_TYPE (len), decl, len,
+	 NULL_TREE)
+			  : integer_zero_node;
+}
+
+
 /* Get the specified FIELD from the VPTR.  */
 
 static tree
@@ -250,6 +268,7 @@
 
 #undef CLASS_DATA_FIELD
 #undef CLASS_VPTR_FIELD
+#undef CLASS_LEN_FIELD
 #undef VTABLE_HASH_FIELD
 #undef VTABLE_SIZE_FIELD
 #undef VTABLE_EXTENDS_FIELD
@@ -1070,7 +1089,7 @@
   if (unlimited)
 {
   if (from_class_base != NULL_TREE)
-	from_len = gfc_class_len_get (from_class_base);
+	from_len = gfc_class_len_or_zero_get (from_class_base);
   else
 	from_len = integer_zero_node;
 }
Index: gcc/fortran/trans.h
===
--- gcc/fortran/trans.h	(Revision 234506)
+++ gcc/fortran/trans.h	(Arbeitskopie)
@@ -356,6 +356,7 @@
 tree gfc_class_data_get (tree);
 tree gfc_class_vptr_get (tree);
 tree gfc_class_len_get (tree);
+tree gfc_class_len_or_zero_get (tree);
 gfc_expr * gfc_find_and_cut_at_last_class_ref (gfc_expr *);
 /* Get an accessor to the class' vtab's * field, when a class handle is
available.  */
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 234506)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,9 @@
+2016-03-28  Andre Vehreschild  
+
+	PR fortran/70397
+	* gfortran.dg/unlimited_polymorphic_25.f90: New test.
+	* gfortran.dg/unlimited_polymorphic_26.f90: New test.
+
 2016-03-28  Kirill Yukhin  
 
 	PR target/70406
Index: gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
===
--- gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90	(nicht existent)
+++ 

[Bug fortran/70397] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-28 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70397

--- Comment #2 from vehre at gcc dot gnu.org ---
Author: vehre
Date: Mon Mar 28 16:29:14 2016
New Revision: 234507

URL: https://gcc.gnu.org/viewcvs?rev=234507=gcc=rev
Log:
gcc/fortran/ChangeLog:

2016-03-28  Andre Vehreschild  

PR fortran/70397
* trans-expr.c (gfc_class_len_or_zero_get): Add function to return a
constant zero tree, when the class to get the _len component from is
not unlimited polymorphic.
(gfc_copy_class_to_class): Use the new function.
* trans.h: Added interface of new function gfc_class_len_or_zero_get.

gcc/testsuite/ChangeLog:

2016-03-28  Andre Vehreschild  

PR fortran/70397
* gfortran.dg/unlimited_polymorphic_25.f90: New test.
* gfortran.dg/unlimited_polymorphic_26.f90: New test.



Added:
   
branches/gcc-5-branch/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
   
branches/gcc-5-branch/gcc/testsuite/gfortran.dg/unlimited_polymorphic_26.f90
Modified:
branches/gcc-5-branch/gcc/fortran/ChangeLog
branches/gcc-5-branch/gcc/fortran/trans-expr.c
branches/gcc-5-branch/gcc/fortran/trans.h
branches/gcc-5-branch/gcc/testsuite/ChangeLog

Re: Spurious register spill with volatile function argument

2016-03-28 Thread Florian Weimer
* Paul Koning:

>> On Mar 28, 2016, at 8:11 AM, Florian Weimer  wrote:
>> 
>> ...
>> The problem is that “reading” is either not defined, or the existing
>> flatly contradicts existing practice.
>> 
>> For example, if p is a pointer to a struct, will the expression >m
>> read *p?
>
> Presumably the offset of m is substantially larger than 0?  If so, my
> answer would be "it had better not".  Does any compiler treat that
> statement as an access to *p ?

As I tried to explain, GCC does, for aliasing purposes.  For the
expression p->m, there is an implicit read of *p, asserting that the
static and dynamic types match.


[Bug target/70107] ICE: in emit_move_insn, at expr.c:3546 with -mcpu=power8

2016-03-28 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70107

--- Comment #4 from Markus Trippelsdorf  ---
You need to add "-std=c++11 -fpermissive" to gcc < 6.
Trunk seems to be fixed, but gcc-5 and gcc-4.9 still ICE.

[Bug target/70107] ICE: in emit_move_insn, at expr.c:3546 with -mcpu=power8

2016-03-28 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70107

--- Comment #3 from Bill Schmidt  ---
Also, on latest GCC 5 and GCC 4.9, the front end objects:

wschmidt@genoa:~/src$ $GCC_INSTALL/bin/g++ -w -c -mcpu=power8 pr70107.ii
pr70107.ii:3:46: error: expected type-specifier before 'decltype'
   auto operator()(T Base::*, Derived ref) -> decltype(ref);
  ^
pr70107.ii:3:46: error: expected initializer before 'decltype'
pr70107.ii: In function 'void test_b12(Functor)':
pr70107.ii:8:9: error: expected nested-name-specifier before 'ClassFunc'
   using ClassFunc = Signature TestClass::*;
 ^
pr70107.ii:9:3: error: 'ClassFunc' was not declared in this scope
   ClassFunc b;
   ^
pr70107.ii:10:5: error: 'b' was not declared in this scope
   a(b, f);
 ^
wschmidt@genoa:~/src$

[Bug target/70107] ICE: in emit_move_insn, at expr.c:3546 with -mcpu=power8

2016-03-28 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70107

--- Comment #2 from Bill Schmidt  ---
Could not confirm with trunk r234476 on powerpc64le dated 2016-03-24.  Can you
please retry with something at least that recent?

[Bug c++/70430] New: Incorrect result for logical "and" operation with mixed vector and scalar

2016-03-28 Thread mjh at edg dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70430

Bug ID: 70430
   Summary: Incorrect result for logical "and" operation with
mixed vector and scalar
   Product: gcc
   Version: 5.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mjh at edg dot com
  Target Milestone: ---

I'm seeing surprising results at run time when mixing a vector and a scalar in
a logical "and" operation:

$ cat foo.C
extern "C" int printf (const char *, ...);
typedef int v4si __attribute__ ((vector_size (16)));
int main() {
  v4si a = {1,1,1,1}, b = {1,0,-1,2}, x = {-1,-1,-1,-1},  z = {0,0,0,0}, c;
  c = a && b;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = x && b;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = (v4si){1,1,1,1} && b;printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = 1 && b;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = b && a;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = b && x;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = b && (v4si){1,1,1,1};printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  printf("bug:\n");
  c = b && 1;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]); //?

  c = z && b;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = (v4si){0,0,0,0} && b;printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = 0 && b;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = b && z;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  c = b && (v4si){0,0,0,0};printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  printf("bug:\n");
  c = b && 0;  printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]); //?

  return 0;
}
$ g++520 foo.C && a.out
-1 0 -1 -1
-1 0 -1 -1
-1 0 -1 -1
-1 0 -1 -1
-1 0 -1 -1
-1 0 -1 -1
-1 0 -1 -1
bug:
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
bug:
-1 0 -1 -1
$

These two cases (i.e., with an initial vector and a subsequent scalar) seem to
give the opposite result of what I would expect.  I didn't check to see if
logical "or" has similar issues.

Re: Spurious register spill with volatile function argument

2016-03-28 Thread Paul_Koning

> On Mar 28, 2016, at 8:11 AM, Florian Weimer  wrote:
> 
> ...
> The problem is that “reading” is either not defined, or the existing
> flatly contradicts existing practice.
> 
> For example, if p is a pointer to a struct, will the expression >m
> read *p?

Presumably the offset of m is substantially larger than 0?  If so, my answer 
would be "it had better not".  Does any compiler treat that statement as an 
access to *p ?

paul


[Bug target/70416] [SH]: error: 'asm' operand requires impossible reload when building ruby2.3

2016-03-28 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70416

--- Comment #17 from Oleg Endo  ---
(In reply to John Paul Adrian Glaubitz from comment #16)
> 
> Wow, great to hear :).

I'm now testing this patch on sh-elf...

[Bug target/70416] [SH]: error: 'asm' operand requires impossible reload when building ruby2.3

2016-03-28 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70416

--- Comment #16 from John Paul Adrian Glaubitz  ---
(In reply to Oleg Endo from comment #15)
> (In reply to Oleg Endo from comment #14)
> > 
> > Maybe at that stage in the reload code it will end up using the last *addsi3
> > pattern and not try to look for a new pattern in the .md when it wants to
> > change it.  In other words, maybe it'll help if the *addsi3 patterns are
> > merged into a single pattern somehow.  I'll give it a try...
> 
> The following seems to fix the problem:

Wow, great to hear :).

[Bug target/70429] Wrong code with -O1.

2016-03-28 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70429

Kirill Yukhin  changed:

   What|Removed |Added

 CC||kyukhin at gcc dot gnu.org

--- Comment #4 from Kirill Yukhin  ---
Seems like combiner performs invalid reassociation. This trivial addition to
Jakub's PR70222 fix makes test work:

--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -10526,7 +10526,7 @@ simplify_shift_const_1 (enum rtx_code code,
machine_mode result_mode,
{
  /* For ((unsigned) (cstULL >> count)) >> cst2 we have to make
 sure the result will be masked.  See PR70222.  */
- if (code == LSHIFTRT
+ if ((code == LSHIFTRT || code == ASHIFTRT)
  && mode != result_mode
  && !merge_outer_ops (_op, _const, AND,
   GET_MODE_MASK (result_mode)

[Bug c++/69515] partial specialization of variable templates is broken

2016-03-28 Thread tom at honermann dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69515

--- Comment #2 from Tom Honermann  ---
(In reply to Tom Honermann from comment #1)

Actually, the test case in comment 1 seems to be a different issue; its failure
is a regression introduced in r234231 via bug 70095.

As of r234231 (and up through at least r234502), the test case in comment 0
triggers an ICE.

$ g++ -std=c++14 t.cpp -o t
t.cpp:9:31: error: Two symbols with same comdat_group are not linked by the
same_comdat_group list.
 auto&& b = foo>;
   ^
foo/3 (A foo) @0x7f6c0f987000
  Type: variable definition analyzed
  Visibility: public weak comdat comdat_group:foo one_only
  previous sharing asm name: 2
  References: 
  Referring: b/1 (addr)_Z41__static_initialization_and_destruction_0ii/5 (addr)
  Availability: not-ready
  Varpool flags:
foo/2 (A foo) @0x7f6c0f97cf80
  Type: variable definition analyzed
  Visibility: public weak comdat comdat_group:foo one_only
  next sharing asm name: 3
  References: 
  Referring: a/0 (addr)_Z41__static_initialization_and_destruction_0ii/5 (addr)
  Availability: not-ready
  Varpool flags:
t.cpp:9:31: internal compiler error: symtab_node::verify failed
0x928ee1 symtab_node::verify_symtab_nodes()
../../gcc-trunk/gcc/symtab.c:1219
0x93ba14 symtab_node::checking_verify_symtab_nodes()
../../gcc-trunk/gcc/cgraph.h:602
0x93ba14 symbol_table::compile()
../../gcc-trunk/gcc/cgraphunit.c:2380
0x93df87 symbol_table::compile()
../../gcc-trunk/gcc/cgraphunit.c:2536
0x93df87 symbol_table::finalize_compilation_unit()
../../gcc-trunk/gcc/cgraphunit.c:2562

Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-28 Thread Dominique d'Humières

> Le 28 mars 2016 à 15:11, Jason Merrill  a écrit :
> 
> On 03/28/2016 09:02 AM, Dominique d'Humières wrote:
>> 
>>> Le 28 mars 2016 à 14:55, Jason Merrill  a écrit :
>>> 
>>> OK, thanks.
>>> 
>>> Jason
>> 
>> Does it mean that I should commit the patch?
> 
> Please.

Revision 234504.

Dominique

> Jason
> 



[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-28 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

--- Comment #5 from Jason Merrill  ---
(In reply to Jim Wilson from comment #4)
> The broken targets all define flag_section_anchors at -O1 and up.

Probably the -O1 matters because it implies -fmerge-constants.

Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-28 Thread Jason Merrill

On 03/28/2016 09:02 AM, Dominique d'Humières wrote:



Le 28 mars 2016 à 14:55, Jason Merrill  a écrit :

OK, thanks.

Jason


Does it mean that I should commit the patch?


Please.

Jason




Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-28 Thread Dominique d'Humières

> Le 28 mars 2016 à 14:55, Jason Merrill  a écrit :
> 
> OK, thanks.
> 
> Jason

Does it mean that I should commit the patch?

Dominique



regression in C++ parsing performance between 4.9.3 and 5.3.1

2016-03-28 Thread Pádraig Brady

The attached shell script will generate a larger version
of the following:

constexpr bool static_str_equal(const char* x, const char* y) {
  return (*x == 0 || *y == 0) ?
(*x == *y) :
((*x == *y) && static_str_equal(x + 1, y + 1));
}
int main(void)
{
 static_assert( !static_str_equal("unspecified1", "unspecified"), "");
}

There is a significant slow down and increase in ram usage
compiling that, between gcc 4.9.3 and 5.3.1 (20151207):

$ sh cstrcmp.sh > cstrcmp.cpp
$ ~/gcc-4.9.3/g++ -ftime-report --std=gnu++11 cstrcmp.cpp

Execution times (seconds)
 phase setup :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 1%) wall  
  1445 kB ( 1%) ggc
 phase parsing   :   0.91 (100%) usr   0.16 (100%) sys   1.08 (99%) 
wall  149089 kB (99%) ggc
 |name lookup:   0.03 ( 3%) usr   0.00 ( 0%) sys   0.02 ( 2%) wall  
89 kB ( 0%) ggc
 |overload resolution:   0.01 ( 1%) usr   0.00 ( 0%) sys   0.02 ( 2%) wall  
   628 kB ( 0%) ggc
 preprocessing   :   0.02 ( 2%) usr   0.05 (31%) sys   0.04 ( 4%) wall  
  2048 kB ( 1%) ggc
 parser (global) :   0.02 ( 2%) usr   0.03 (19%) sys   0.08 ( 7%) wall  
 16178 kB (11%) ggc
 parser function body:   0.87 (96%) usr   0.08 (50%) sys   0.96 (88%) wall  
130855 kB (87%) ggc
 TOTAL :   0.91 0.16 1.09 
150594 kB

$ g++ -ftime-report --std=gnu++11 cstrcmp.cpp  # default fedora version

Execution times (seconds)
 phase setup :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall  
  1408 kB ( 0%) ggc
 phase parsing   :   1.90 (99%) usr   0.26 (100%) sys   2.16 (100%) 
wall  469089 kB (100%) ggc
 phase opt and generate  :   0.01 ( 1%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall  
53 kB ( 0%) ggc
 |name lookup:   0.04 ( 2%) usr   0.00 ( 0%) sys   0.05 ( 2%) wall  
93 kB ( 0%) ggc
 |overload resolution:   0.01 ( 1%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall  
   628 kB ( 0%) ggc
 garbage collection  :   0.20 (10%) usr   0.02 ( 8%) sys   0.21 (10%) wall  
 0 kB ( 0%) ggc
 preprocessing   :   0.01 ( 1%) usr   0.05 (19%) sys   0.08 ( 4%) wall  
  2048 kB ( 0%) ggc
 parser (global) :   0.02 ( 1%) usr   0.04 (15%) sys   0.05 ( 2%) wall  
 16181 kB ( 3%) ggc
 parser function body:   1.67 (87%) usr   0.15 (58%) sys   1.82 (84%) wall  
450852 kB (96%) ggc
 integrated RA   :   0.01 ( 1%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall  
24 kB ( 0%) ggc
 TOTAL :   1.91 0.26 2.17 
470569 kB

Note I compiled the gcc 5 branch at 20160302 with the same results.

thanks,
Pádraig


cstrcmp.sh
Description: Bourne shell script


[Bug c++/70095] [C++14] Link error on partially specialized variable template

2016-03-28 Thread tom at honermann dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70095

Tom Honermann  changed:

   What|Removed |Added

 CC||tom at honermann dot net

--- Comment #2 from Tom Honermann  ---
The change in comment 1 introduced a regression.  The following test passes
with r234230, but fails with r234231:

$ cat t.cpp
template 
constexpr bool b = false;
template
constexpr bool b = true;
int main() {
b;
b;
}

$ g++ -std=c++14 t.cpp -o t
/tmp/ccJTAQId.s: Assembler messages:
/tmp/ccJTAQId.s:27: Error: symbol `_ZL1b' is already defined

Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-28 Thread Jason Merrill

OK, thanks.

Jason


[PATCH] Fix in-tree gmp/mpfr/mpc generation (PR 67728)

2016-03-28 Thread Bernd Edlinger

Hi,

as described in the tracker we have bootstrap problems with in-tree gmp-6.1.0
on certain targets, and also a linker issue with check-mpc due to the changed
mpfr library path.

These are triggered by overriding CFLAGS and LDFLAGS in in-tree builds.
It did not happen with the gmp/mpfr/mpc versions that download_prerequisites
installs, but the currently latest version of these libraries use CFLAGS to pass
-DNO_ASM which is overridden by gcc and causes the gmp-6.1.0 to be
mis-compiled.  And the mpc issue is triggered by overriding LDFLAGS
and the changed mpfr library path.  So this started with mpfr v3.1.0 which
moved the sources into a src sub-directory.

The proposed patch fixes these problems by passing -DNO_ASM in AM_CFLAGS,
and adding both possible mpfr library paths to HOST_LIB_PATH_mpfr.
I've also adjusted HOST_LIB_PATH_mpc although it did not yet create problems.

Boot-strapped and regression tested on x86_64-pc-linux-gnu, with different
gmp versions including the latest snapshot.
I have additionally built arm cross compilers, which was not working before.

Is this OK for trunk?


Thanks
Bernd.2016-03-28  Bernd Edlinger  

PR bootstrap/67728
* Makefile.def (gmp): Explicitly disable assembler.
(mpfr): Add src/.libs to lib_path2.
(mpc): Adjust lib_path.
* Makefile.tpi (HOST_LIB_PATH_): Use lib_path2 if defined.
* Makefile.in: Regenerated.
Index: Makefile.def
===
--- Makefile.def	(revision 234490)
+++ Makefile.def	(working copy)
@@ -50,6 +50,7 @@ host_modules= { module= gcc; bootstrap=true;
 host_modules= { module= gmp; lib_path=.libs; bootstrap=true;
 		// Work around in-tree gmp configure bug with missing flex.
 		extra_configure_flags='--disable-shared LEX="touch lex.yy.c"';
+		extra_make_flags='AM_CFLAGS="-DNO_ASM"';
 		no_install= true;
 		// none-*-* disables asm optimizations, bootstrap-testing
 		// the compiler more thoroughly.
@@ -57,11 +58,11 @@ host_modules= { module= gmp; lib_path=.libs; boots
 		// gmp's configure will complain if given anything
 		// different from host for target.
 	target="none-${host_vendor}-${host_os}"; };
-host_modules= { module= mpfr; lib_path=.libs; bootstrap=true;
+host_modules= { module= mpfr; lib_path=.libs; lib_path2=src/.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared @extra_mpfr_configure_flags@';
 		extra_make_flags='AM_CFLAGS="-DNO_ASM"';
 		no_install= true; };
-host_modules= { module= mpc; lib_path=.libs; bootstrap=true;
+host_modules= { module= mpc; lib_path=src/.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared @extra_mpc_gmp_configure_flags@ @extra_mpc_mpfr_configure_flags@';
 		no_install= true; };
 host_modules= { module= isl; lib_path=.libs; bootstrap=true;
Index: Makefile.in
===
--- Makefile.in	(revision 234490)
+++ Makefile.in	(working copy)
@@ -639,12 +639,12 @@ HOST_LIB_PATH_gmp = \
 
 @if mpfr
 HOST_LIB_PATH_mpfr = \
-  $$r/$(HOST_SUBDIR)/mpfr/.libs:$$r/$(HOST_SUBDIR)/prev-mpfr/.libs:
+  $$r/$(HOST_SUBDIR)/mpfr/src/.libs:$$r/$(HOST_SUBDIR)/mpfr/.libs:$$r/$(HOST_SUBDIR)/prev-mpfr/.libs:
 @endif mpfr
 
 @if mpc
 HOST_LIB_PATH_mpc = \
-  $$r/$(HOST_SUBDIR)/mpc/.libs:$$r/$(HOST_SUBDIR)/prev-mpc/.libs:
+  $$r/$(HOST_SUBDIR)/mpc/src/.libs:$$r/$(HOST_SUBDIR)/prev-mpc/src/.libs:
 @endif mpc
 
 @if isl
@@ -11299,7 +11299,7 @@ all-gmp: configure-gmp
 	s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
 	$(HOST_EXPORTS)  \
 	(cd $(HOST_SUBDIR)/gmp && \
-	  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) $(STAGE1_FLAGS_TO_PASS)  \
+	  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) $(STAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" \
 		$(TARGET-gmp))
 @endif gmp
 
@@ -11328,7 +11328,7 @@ all-stage1-gmp: configure-stage1-gmp
 		CXXFLAGS_FOR_TARGET="$(CXXFLAGS_FOR_TARGET)" \
 		LIBCFLAGS_FOR_TARGET="$(LIBCFLAGS_FOR_TARGET)" \
 		$(EXTRA_HOST_FLAGS)  \
-		$(STAGE1_FLAGS_TO_PASS)  \
+		$(STAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" \
 		TFLAGS="$(STAGE1_TFLAGS)" \
 		$(TARGET-stage1-gmp)
 
@@ -11343,7 +11343,7 @@ clean-stage1-gmp:
 	fi; \
 	cd $(HOST_SUBDIR)/gmp && \
 	$(MAKE) $(EXTRA_HOST_FLAGS)  \
-	$(STAGE1_FLAGS_TO_PASS)  clean
+	$(STAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" clean
 @endif gmp-bootstrap
 
 
@@ -11370,7 +11370,7 @@ all-stage2-gmp: configure-stage2-gmp
 		CFLAGS_FOR_TARGET="$(CFLAGS_FOR_TARGET)" \
 		CXXFLAGS_FOR_TARGET="$(CXXFLAGS_FOR_TARGET)" \
 		LIBCFLAGS_FOR_TARGET="$(LIBCFLAGS_FOR_TARGET)" \
-		$(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS)  \
+		$(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" \
 		TFLAGS="$(STAGE2_TFLAGS)" \
 		$(TARGET-stage2-gmp)
 
@@ -11384,7 +11384,7 @@ clean-stage2-gmp:
 	  $(MAKE) stage2-start; \
 	fi; \
 	cd $(HOST_SUBDIR)/gmp && \
-	$(MAKE) $(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS)  clean
+	$(MAKE) $(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS) 

[Bug target/70416] [SH]: error: 'asm' operand requires impossible reload when building ruby2.3

2016-03-28 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70416

--- Comment #15 from Oleg Endo  ---
(In reply to Oleg Endo from comment #14)
> 
> Maybe at that stage in the reload code it will end up using the last *addsi3
> pattern and not try to look for a new pattern in the .md when it wants to
> change it.  In other words, maybe it'll help if the *addsi3 patterns are
> merged into a single pattern somehow.  I'll give it a try...

The following seems to fix the problem:

Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md (revision 234258)
+++ gcc/config/sh/sh.md (working copy)
@@ -2248,7 +2248,7 @@
(plus:SI (match_operand:SI 1 "arith_reg_operand" "r")
 (match_operand:SI 2 "arith_or_int_operand" "rn")))]
   "TARGET_SH1 && !sh_lra_p ()
-   && reload_completed
+   && (reload_completed || reload_in_progress)
&& !reg_overlap_mentioned_p (operands[0], operands[1])"
   "#"
   "&& 1"

[Bug rtl-optimization/70134] combine misses jump optimization on powerpc64le

2016-03-28 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70134

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #1 from Segher Boessenkool  ---
Right.  But why is such awful code generated before combine?
Either the tree passes missed something, or expand is doing
a shitty job, I bet.

[Bug target/70429] Wrong code with -O1.

2016-03-28 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70429

Segher Boessenkool  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Segher Boessenkool  ---
This is a lot like PR70222, but that was fixed.  Adding cc:.

Re: Spurious register spill with volatile function argument

2016-03-28 Thread Florian Weimer
* Andrew Haley:

> "volatile" doesn't really mean very much, formally speaking.  Sure, the
> standard says "accesses to volatile objects are evaluated
> strictly according to the rules of the abstract machine," but nowhere
> is it specified exactly what constitutes an access.

Reading or modifying an object is defined as “access”.

The problem is that “reading” is either not defined, or the existing
flatly contradicts existing practice.

For example, if p is a pointer to a struct, will the expression >m
read *p?

Previously, this wasn't very interesting.  But under the model memory,
it's suddenly quite relevant.  If reading p->m implies a read of the
entire object *p, you cannot use a member to synchronize access to
other members of the struct.  For example, if m is a mutex, and
carefully acquire the mutex before you read or write other members,
you still have data race between a write to some other member and the
acquisition of the mutex because the mutex acquisition reads the
entire struct (including the member written to ).

One possible cure is to take the address of the mutex and keep track
of it separately.  Or you could construct a pointer using offsetof.
But no one is doing that, obviously.

This is not entirely hypothetical.  Even today, GCC's aliasing
analysis requires that those implicit whole-object reads take place,
to make certain forms of type-punning invalid which would otherwise be
well-defined (and for which GCC would generate invalid code).


[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f

2016-03-28 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406

Kirill Yukhin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Kirill Yukhin  ---
Done.

Re: [Patch, Fortran] STOP issue with coarrays

2016-03-28 Thread Alessandro Fanfarillo
Thanks for the quick feedback.

The patch has been committed on trunk as rev. 234502 and on
gcc-5-branch as rev. 234503 .

In attachment three simple test cases; coarray_stop_str.f90 fails
because of a syntax error in the check string.
I guess the problem is due to the presence of the "STOPPED" string
inside the check string; how can I manage this situation?

Thanks,

Alessandro

PS: Paul, I see this as a good exercise before working on Failed Images :-).


2016-03-27 19:14 GMT+02:00 Paul Richard Thomas :
> Hi Alessandro,
>
> The patch is fine for trunk and 5-branch - going on trivial, I would
> say! Are you going to add the testcase?
>
> Thanks a lot! I am impressed that you are doing these between
> celebrating your doctorate and preparing for your move :-)
>
> Paul
>
> On 27 March 2016 at 17:10, Alessandro Fanfarillo
>  wrote:
>> Dear all,
>>
>> the attached patch fixes the issue reported by Anton Shterenlikht
>> (https://gcc.gnu.org/ml/fortran/2016-03/msg00037.html). The compiler
>> delegates the external library to manage the STOP statement in case
>> -fcoarray=lib is used.
>>
>> Built and regtested on x86_64-pc-linux-gnu.
>>
>> Ok for trunk and gcc-5-branch?
>
>
>
> --
> The difference between genius and stupidity is; genius has its limits.
>
> Albert Einstein
! { dg-do compile }
! { dg-options "-fdump-tree-original -fcoarray=lib" }
!
! STOP managed by the external library
!
implicit none
  STOP
end

! { dg-final { scan-tree-dump-times "_gfortran_caf_stop_str \\(0B, 0\\);" 1 "original" } }
! { dg-do compile }
! { dg-options "-fdump-tree-original -fcoarray=lib" }
!
! STOP managed by the external library
!
implicit none
  STOP 100
end

! { dg-final { scan-tree-dump-times "_gfortran_caf_stop_numeric \\(100\\);" 1 "original" } }
! { dg-do compile }
! { dg-options "-fdump-tree-original -fcoarray=lib" }
!
! STOP managed by the external library
!
implicit none
  STOP "STOPPED"
end

! { dg-final { scan-tree-dump-times "_gfortran_caf_stop_str \\(&\\"STOPPED\\"[1]\\{lb: 1 sz: 1\\}, 7\\);" 1 "original" } }


Re: rs6000 stack_tie mishap again

2016-03-28 Thread Segher Boessenkool
On Mon, Mar 28, 2016 at 12:45:03PM +0200, Olivier Hainque wrote:
> > The normal -m32 compiler here generates code like
> > 
> > lwz 11,0(1)
> > 
> > and try as I might I cannot get it to fail.  Maybe because the GPR11
> > setup here involves a load?
> 
> You need to have had r11 last used to designate a global
> symbol as part of the function body (in order to have base_term
> designate a symbol_ref etc), and then have the scheduler
> decide that moving across is a good idea. It's certainly not
> an easy combination to trigger.

Yes, I did that (with some asm's).  Like this:

===
void g(int, char *);

int dum;

void f(int x)
{
char big[20];
g(x, big);
g(x, big);
register void *p asm("r11") = 
asm("" : : "r"(p));
}
===

and the end of that becomes

===
bl g # 11   *call_nonlocal_sysvsi/1 [length = 4]
lis 11,dum@ha# 12   elf_high[length = 4]
la 11,dum@l(11)  # 13   elf_low [length = 4]
lwz 11,0(1)  # 33   *movsi_internal1/3  [length = 4]
lwz 0,4(11)  # 34   *movsi_internal1/3  [length = 4]
lwz 31,-4(11)# 36   *movsi_internal1/3  [length = 4]
mr 1,11  # 38   *movsi_internal1/1  [length = 4]
mtlr 0   # 35   *movsi_internal1/9  [length = 4]
blr  # 39   *return_internal_si [length = 4]
===

> > It seems clear that just considering the
> > prologue is enough to fix the original problem (frame pointer setup),
> > and problems like yours cannot happen in the prologue.
> 
> Right. What is unclear is if it's correct to consider only
> prologues here. ISTM we arrange to produce CFI for epilogues
> as well today, at least in some circumstances, and maybe the
> issue Richard had with prologues at the time would need to
> be addressed for epilogues as well today.

Correct is to do alias analysis in both the prologue and epilogue.
If we don't, ports have to do all kinds of stack ties etc. to fake it.

Currently we do neither, so doing just one is a step in the right
direction.

> > Better would be not to have this hack at all.
> 
> Sure.
> 
> >> My rough understanding is that we probably really care about frame_related
> >> insns only here, at least on targets where the flag is supposed to be set
> >> accurately.
> > 
> > On targets with DWARF2 unwind info the flag should be set on those insns
> > that need unwind info.  This does not include all insns in the epilogue
> > that access the frame, so I don't think this will help you?
> 
> My idea was that, maybe, the insns we need to see for alias analysis
> are precisely those for which the bit should not be set, which happened
> to be exactly the case in the problematic situation we hit. But as I said,
> I'm not 100% convinced the reasoning is globally correct.

All the register restores do not have the flag set, in most cases.
But they can in others (say, a target that does not optimise the CFI
stuff very well).

> >> This is the basis of the proposed patch, which essentially disconnects the
> >> shortcut above for !frame_related insns on targets which need precise alias
> >> analysis within epilogues, as indicated by a new target hook.
> > 
> > Eww.  Isn't that really all targets that schedule the epilogue at all?
> 
> Yes. Most of them use stronger barriers which the dependency analyser
> knows about, so aren't affected by this. 
> 
> That's a possible alternative approach for rs6000.

A clobber of scratch should work yes.  It's really heavy handed though,
we can move the GPR1 restore quite freely otherwise (in shrink-wrap),
but perhaps allowing scheduling to move it doesn't buy us much at all.

This doesn't solve the problem however that other dependencies in the
epilogue can be messed up in similar ways.


Segher


[Bug target/70429] Wrong code with -O1.

2016-03-28 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70429

Markus Trippelsdorf  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org,
   ||trippels at gcc dot gnu.org

--- Comment #2 from Markus Trippelsdorf  ---
Adding CC.

Re: rs6000 stack_tie mishap again

2016-03-28 Thread Olivier Hainque

> On Mar 28, 2016, at 05:01 , Segher Boessenkool  
> wrote:
> 
> The normal -m32 compiler here generates code like
> 
>   lwz 11,0(1)
> 
> and try as I might I cannot get it to fail.  Maybe because the GPR11
> setup here involves a load?

You need to have had r11 last used to designate a global
symbol as part of the function body (in order to have base_term
designate a symbol_ref etc), and then have the scheduler
decide that moving across is a good idea. It's certainly not
an easy combination to trigger.

>> We have observed this with a gcc 4.7 back-end and weren't able to reproduce
>> with a more recent version.
> 
> This makes it not a regression and thus out of scope for GCC 6.  We're
> supposed to stabilise things now ;-)

Sure, no problem if this is only for gcc 7. I posted the message
while the details were still fresh in my mind.

>>if (could_be_prologue_epilogue
>>&& prologue_epilogue_contains (insn))
>>  continue;
>> 
>> The motivation for this is unclear to me.
> 
> Alan linked to the history.

Right


>  It seems clear that just considering the
> prologue is enough to fix the original problem (frame pointer setup),
> and problems like yours cannot happen in the prologue.

Right. What is unclear is if it's correct to consider only
prologues here. ISTM we arrange to produce CFI for epilogues
as well today, at least in some circumstances, and maybe the
issue Richard had with prologues at the time would need to
be addressed for epilogues as well today.

> Better would be not to have this hack at all.

Sure.

>> My rough understanding is that we probably really care about frame_related
>> insns only here, at least on targets where the flag is supposed to be set
>> accurately.
> 
> On targets with DWARF2 unwind info the flag should be set on those insns
> that need unwind info.  This does not include all insns in the epilogue
> that access the frame, so I don't think this will help you?

My idea was that, maybe, the insns we need to see for alias analysis
are precisely those for which the bit should not be set, which happened
to be exactly the case in the problematic situation we hit. But as I said,
I'm not 100% convinced the reasoning is globally correct.

>> This is the basis of the proposed patch, which essentially disconnects the
>> shortcut above for !frame_related insns on targets which need precise alias
>> analysis within epilogues, as indicated by a new target hook.
> 
> Eww.  Isn't that really all targets that schedule the epilogue at all?

Yes. Most of them use stronger barriers which the dependency analyser
knows about, so aren't affected by this. 

That's a possible alternative approach for rs6000.

Thanks for your feedback,

Olivier



[Bug target/70429] Wrong code with -O1.

2016-03-28 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70429

Ilya Enkovich  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-03-28
 CC||ienkovich at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Ilya Enkovich  ---
Confirm.

Looks like combine issue.  Before combine:

6: r94:DI=0x14ff6e2207db5d1f
7: {r93:DI=r94:DI>>r92:SI#0;clobber flags:CC;}
  REG_DEAD r94:DI
  REG_DEAD r92:SI
  REG_UNUSED flags:CC
  REG_EQUAL 0x14ff6e2207db5d1f>>r92:SI#0
8: {r95:SI=r93:DI#0>>0x4;clobber flags:CC;}

Combine log:

allowing combination of insns 6, 7 and 8  
original costs 3 + 4 + 4 = 11 
replacement costs 3 + 4 = 7   
deferring deletion of insn with uid = 6.  
modifying insn i2 7: r93:DI=0x14ff6e2207db5d1 
deferring rescan insn with uid = 7.   
modifying insn i3 8: {r95:SI#0=r93:DI>>r92:SI#0;clobber flags:CC;}

This affects 4 high bits if the result.

[PATCH, libgomp] Fix deadlock in acc_set_device_type

2016-03-28 Thread Chung-Lin Tang
Hi Jakub, there's a path for deadlock on acc_device_lock when going
through the acc_set_device_type() OpenACC library function.
Basically, the gomp_init_targets_once() function should not be
called with that held. The attached patch moves it appropriately.

Also in this patch, there are several cases in acc_* functions
where gomp_init_targets_once() is guarded by a test of
!cached_base_dev. Since that function already uses pthread_once() to
call gomp_target_init(), and technically cached_base_dev
is protected by acc_device_lock, the cleanest way should be to
simply drop those "if(!cached_base_dev)" tests.

Tested libgomp without regressions on an nvptx offloaded system,
is this okay for trunk?

Thanks,
Chung-Lin

2016-03-28  Chung-Lin Tang  

* oacc-init.c (acc_init): Remove !cached_base_dev condition on call to
gomp_init_targets_once().
(acc_set_device_type): Remove !cached_base_dev condition on call to
gomp_init_targets_once(), move call to before acc_device_lock acquire,
to avoid deadlock.
(acc_get_device_num): Remove !cached_base_dev condition on call to
gomp_init_targets_once().
(acc_set_device_num): Likewise.
Index: oacc-init.c
===
--- oacc-init.c (revision 234502)
+++ oacc-init.c (working copy)
@@ -433,8 +433,7 @@ goacc_attach_host_thread_to_device (int ord)
 void
 acc_init (acc_device_t d)
 {
-  if (!cached_base_dev)
-gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   gomp_mutex_lock (_device_lock);
 
@@ -498,11 +497,10 @@ acc_set_device_type (acc_device_t d)
   struct gomp_device_descr *base_dev, *acc_dev;
   struct goacc_thread *thr = goacc_thread ();
 
+  gomp_init_targets_once ();
+
   gomp_mutex_lock (_device_lock);
 
-  if (!cached_base_dev)
-gomp_init_targets_once ();
-
   cached_base_dev = base_dev = resolve_device (d, true);
   acc_dev = _dev[goacc_device_num];
 
@@ -563,8 +561,7 @@ acc_get_device_num (acc_device_t d)
   if (d >= _ACC_device_hwm)
 gomp_fatal ("unknown device type %u", (unsigned) d);
 
-  if (!cached_base_dev)
-gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   gomp_mutex_lock (_device_lock);
   dev = resolve_device (d, true);
@@ -584,8 +581,7 @@ acc_set_device_num (int ord, acc_device_t d)
   struct gomp_device_descr *base_dev, *acc_dev;
   int num_devices;
 
-  if (!cached_base_dev)
-gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   if (ord < 0)
 ord = goacc_device_num;


Re: Proposed Patch for Bug 69687

2016-03-28 Thread Marcel Böhme
Hi Bernd: I submitted the filled disclaimer form on March 4th. Now, a 
representative of FSF mentioned that the copyright assignment is ready on their 
end.

Can someone please go ahead and review the patch?

Best regards,
- Marcel


> On 4 Mar 2016, at 1:43 AM, Bernd Schmidt  wrote:
> 
> On 03/03/2016 04:18 PM, Mike Stump wrote:
>> On Mar 3, 2016, at 6:55 AM, Marcel Böhme  wrote:
>>> I have revised the patch and removed the limits.
>> 
>> I looked at the patch, I can find no more unreasonable limits!  Wonderful.  
>> Hope someone will finish off the review and approve.
> 
> Marcel, if you haven't contributed before, we'll need a copyright assignment 
> from you before we can accept the patch. Do you already have one on file?
> 
> 
> Bernd
> 



Re: Spurious register spill with volatile function argument

2016-03-28 Thread Andrew Haley
On 27/03/16 06:57, Michael Clark wrote:

> GCC, Clang folk, any ideas on why there is a stack spill for a
> volatile register argument passed in esi? Does volatile force the
> argument to have storage allocated on the stack? Is this a corner
> case in the C standard? This argument in the x86_64 calling
> convention only has a register, so technically it can’t change
> outside the control of the C "virtual machine” so volatile has a
> vague meaning here.

"volatile" doesn't really mean very much, formally speaking.  Sure, the
standard says "accesses to volatile objects are evaluated
strictly according to the rules of the abstract machine," but nowhere
is it specified exactly what constitutes an access.  (To be precise,
"what constitutes an access to an object that has volatile-qualified
type is implementation-defined.")

So, we have to fall back to tradition.  Traditionally, all volatile
objects are allocated stack slots and all accesses to them are memory
accesses.  This is consistent behaviour, and has been for a long time.
It is also extremely useful when debugging optimized code.

> volatile for scalar function arguments seems to mean: “make this
> volatile and subject to change outside of the compiler” rather than
> being a qualifier for its storage (which is a register).

No, arguments are not necessarily stored in registers: they're passed
in registers, but after function entry function they're just auto
variables and are stored wherever the compiler likes.

Andrew.


Re: Constexpr in intrinsics?

2016-03-28 Thread Marc Glisse

On Mon, 28 Mar 2016, Allan Sandfeld Jensen wrote:


On Sunday 27 March 2016, Marc Glisse wrote:

On Sun, 27 Mar 2016, Allan Sandfeld Jensen wrote:

Would it be possible to add constexpr to the intrinsics headers?

For instance _mm_set_XX and _mm_setzero intrinsics.


Already suggested here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65197

A patch would be welcome (I started doing it at some point, I don't
remember if it was functional, the patch is attached).


That looks very similar to the patch I experimented with, and that at least
works for using them in C++11 constexpr functions.


Ideally it could also be added all intrinsics that can be evaluated at
compile time, but it is harder to tell which those are.

Does gcc have a C extension we can use to set constexpr?


What for?


To have similar functionality in C. For instance to explicitly allow those
functions to be evaluated at compile time, and values with similar attributes
be optimized completely out.


Those intrinsics that are implemented without builtins can already be 
evaluated at compile time.


#include 

__m128d f(){
  __m128d a=_mm_set_pd(1,2);
  __m128d b=_mm_setr_pd(4,3);
  return _mm_add_pd(a, b);
}

The generated asm is just

movapd  .LC0(%rip), %xmm0
ret

For the more esoteric intrinsics, what is missing is not in the parser, it 
is a folder that understands the behavior of each particular intrinsic.



And of course avoid using precompiler noise, in
shared C/C++ headers like these are.


--
Marc Glisse


[Bug c++/69515] partial specialization of variable templates is broken

2016-03-28 Thread tom at honermann dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69515

Tom Honermann  changed:

   What|Removed |Added

 CC||tom at honermann dot net

--- Comment #1 from Tom Honermann  ---
I'm also experiencing this; gcc r234493.

$ cat t.cpp
template 
constexpr bool b = false;
template
constexpr bool b = true;
int main() {
b;
b;
}

$ g++ -std=c++14 t.cpp -o t
/tmp/ccYIejpd.s: Assembler messages:
/tmp/ccYIejpd.s:27: Error: symbol `_ZL1b' is already defined

$ gcc --version
gcc (GCC) 6.0.0 20160327 (experimental)
...

[Bug target/70429] New: Wrong code with -O1.

2016-03-28 Thread vsevolod.livinskij at frtk dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70429

Bug ID: 70429
   Summary: Wrong code with -O1.
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vsevolod.livinskij at frtk dot ru
  Target Milestone: ---

Created attachment 38111
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38111=edit
Reproducer.

Test case produces incorrect result with -O1 and all x86 arch. Everything works
fine with -O0 option.

Output:
> g++ -O0 repr.cpp -mavx2; ./a.out
4119272
> g++ -O1 repr.cpp -mavx2; ./a.out
272554728

GCC version:
 > g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/export/users/vlivinsk/gcc-trunk/bin/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /export/users/vlivinsk/gcc-trunk/gcc/configure
--with-arch=corei7 --with-cpu=corei7 --enable-clocale=gnu --with-system-zlib
--enable-shared --with-demangler-in-ld --enable-cloog-backend=isl
--with-fpmath=sse --enable-checking=release --enable-languages=c,c++,lto
--with-gmp=/export/users/vlivinsk/gcc-trunk/gmp-6.1.0/bin
--with-mpfr=/export/users/vlivinsk/gcc-trunk/mpfr-3.1.3/bin
--with-mpc=/export/users/vlivinsk/gcc-trunk/mpc-1.0.3/bin
--prefix=/export/users/vlivinsk/gcc-trunk/bin
Thread model: posix
gcc version 6.0.0 20160327 (experimental) (Revision=234496)

Test:
#include 
#include 

bool a = 1;
int b = 612010083;

void foo () {
b = (int)((~7710322944595239647L + 9223372036854775807L) >> a) >> 4;
}

int main () {
foo ();
if (b != 4119272)
abort();
return 0;
}

[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f

2016-03-28 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406

--- Comment #5 from Kirill Yukhin  ---
Author: kyukhin
Date: Mon Mar 28 08:01:56 2016
New Revision: 234501

URL: https://gcc.gnu.org/viewcvs?rev=234501=gcc=rev
Log:
PR target/70406.

gcc/
 * config/i386/i386.md (define_split, andn): Fix modes.

gcc/testsuite/
 * gcc.target/i386/pr70406.c: New test.

Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70406.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/i386.md
branches/gcc-5-branch/gcc/testsuite/ChangeLog

[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f

2016-03-28 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406

--- Comment #4 from Kirill Yukhin  ---
Author: kyukhin
Date: Mon Mar 28 07:59:44 2016
New Revision: 234500

URL: https://gcc.gnu.org/viewcvs?rev=234500=gcc=rev
Log:
PR target/70406

gcc/
 * config/i386/i386.md (define_split, andn): Fix modes.

gcc/testsuite/
 * gcc.target/i386/pr70406.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr70406.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.md
trunk/gcc/testsuite/ChangeLog