Re: [patch] Fix Unwind support on DragonFly BSD after sigtramp move

2017-07-18 Thread Jeff Law
On 07/07/2017 05:17 PM, John Marino wrote:
> Right after DragonFly 4.8 was released (27 Mar 2017), the signal
> trampoline was moved (twice) in response to a Ryzen bug.  This broke
> GCC's unwind support for DragonFly.
> 
> To avoid hardcoding the sigtramp location and prevent issues like this in
> the future, a new sysctl was added to DragonFly to return the signal
> trampoline address range (FreeBSD has a similar sysctl for similar
> reasons).  The attached patch fixes DragonFly unwind support for current
> DragonFly, and maintains support for Release 4.8 and earlier.
> 
> This patch has been in use for a few months and works fine.  It is
> similar in function to the FreeBSD AArch64 unwind support I submitted
> through Andreas T. a few months ago.
> 
> I believe the patch can be applied to trunk and release 7 branch.
> I am the closest thing to a maintainer for DragonFly, so I don't know if
> additional approval is needed.  This patch is purely DragonFly-specific
> and cannot affect other platforms in any way.
> 
> If agreed, it would be great if somebody could commit this for me
> against the trunk and GCC-7-branch.
> 
> Thanks!
> John
> 
> P.S.  Yes, my copyright assignment is on file (I've contributed a few
> patches already).
> 
> suggested log entry of libgcc/ChangeLog:
> 
> 2017-07-XX  John Marino  
>* config/i386/dragonfly-unwind.h: Handle sigtramp relocation.

This is fine.  Sorry it's taken so long for me to get to this.

jeff


[PATCH][RFA/RFC] Stack clash mitigation patch 03/08 V2 -- right patch attached

2017-07-18 Thread Jeff Law

Oops, I clearly attached the wrong file.

--

I don't think this patch changed in any significant way since V1.
--

One of the painful aspects of all this code is the amount of target
dependent bits that have to be written and tested.

I didn't want to be scanning assembly code or RTL for prologues.  Each
target would have to have its own scanner which was too painful to
contemplate.

So instead I settled on having a routine that the target dependent
prologue expanders could call to dump information about what they were
doing.

This greatly simplifies the testing side of things by having a standard
way to dump decisions.  When combined with the dejagnu routines from
patch #1 which describe key attributes of the target's prologue
generation I can write tests in a fairly generic way.

This will be used by every target dependent prologue expander in this
series.

OK for the trunk?

* function.c (dump_stack_clash_frame_info): New function.
* function.h (dump_stack_clash_frame_info): Prototype.
(enum stack_clash_probes): New enum.

diff --git a/gcc/function.c b/gcc/function.c
index f625489..ca48b3f 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5695,6 +5695,58 @@ get_arg_pointer_save_area (void)
   return ret;
 }
 
+
+/* If debugging dumps are requested, dump information about how the
+   target handled -fstack-check=clash for the prologue.
+
+   PROBES describes what if any probes were emitted.
+
+   RESIDUALS indicates if the prologue had any residual allocation
+   (i.e. total allocation was not a multiple of PROBE_INTERVAL).  */
+
+void
+dump_stack_clash_frame_info (enum stack_clash_probes probes, bool residuals)
+{
+  if (!dump_file)
+    return;
+
+  switch (probes)
+    {
+    case NO_PROBE_NO_FRAME:
+      fprintf (dump_file,
+	       "Stack clash no probe no stack adjustment in prologue.\n");
+      break;
+    case NO_PROBE_SMALL_FRAME:
+      fprintf (dump_file,
+	       "Stack clash no probe small stack adjustment in prologue.\n");
+      break;
+    case PROBE_INLINE:
+      fprintf (dump_file, "Stack clash inline probes in prologue.\n");
+      break;
+    case PROBE_LOOP:
+      fprintf (dump_file, "Stack clash probe loop in prologue.\n");
+      break;
+    }
+
+  if (residuals)
+    fprintf (dump_file, "Stack clash residual allocation in prologue.\n");
+  else
+    fprintf (dump_file, "Stack clash no residual allocation in prologue.\n");
+
+  if (frame_pointer_needed)
+    fprintf (dump_file, "Stack clash frame pointer needed.\n");
+  else
+    fprintf (dump_file, "Stack clash no frame pointer needed.\n");
+
+  if (TREE_THIS_VOLATILE (cfun->decl))
+    fprintf (dump_file,
+	     "Stack clash noreturn prologue, assuming no implicit"
+	     " probes in caller.\n");
+  else
+    fprintf (dump_file,
+	     "Stack clash not noreturn prologue.\n");
+}
+
 /* Add a list of INSNS to the hash HASHP, possibly allocating HASHP
for the first time.  */
 
diff --git a/gcc/function.h b/gcc/function.h
index 0f34bcd..87dac80 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -553,6 +553,14 @@ do {   \
   ((TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn)  \
? MAX (FUNCTION_BOUNDARY, 2 * BITS_PER_UNIT) : FUNCTION_BOUNDARY)
 
+enum stack_clash_probes {
+  NO_PROBE_NO_FRAME,
+  NO_PROBE_SMALL_FRAME,
+  PROBE_INLINE,
+  PROBE_LOOP
+};
+
+extern void dump_stack_clash_frame_info (enum stack_clash_probes, bool);
 
 
 extern void push_function_context (void);


[PATCH] [RFA/RFC] Stack clash mitigation patch 02/08 V2

2017-07-18 Thread Jeff Law

This time with the patch attached.



 Forwarded Message 
Subject: [PATCH] [RFA/RFC] Stack clash mitigation patch 02/08 V2
Date: Tue, 18 Jul 2017 23:17:23 -0600
From: Jeff Law 
To: gcc-patches 


The biggest change since V1 of this patch is dropping the changes to
STACK_CHECK_MOVING_SP.  They're not needed.

This patch also refactors a bit of the new code in explow.c.  In
particular it pulls out 3 chunks of code for protecting dynamic stack
adjustments so they can be re-used by backends that have their own
allocation routines for dynamic stack data.

--
The key goal of this patch is to introduce stack clash protections for
dynamically allocated stack space and indirect uses of
STACK_CHECK_PROTECT via get_stack_check_protect.

Those two changes accomplish two things.  First, they give most targets
protection of dynamically allocated space (the exceptions are targets
with their own expanders to allocate dynamic stack space, such as ppc).

Second, targets which do not get -fstack-clash-protection prologue
support later in this series, but which are compiled with
-fstack-clash-protection, still get a fair amount of protection.

We essentially vector into a totally different routine to allocate/probe
the dynamic stack space when -fstack-clash-protection is active.  It
differs from the existing routine in that it allocates PROBE_INTERVAL
chunks and probes them as they are allocated.  The existing code would
allocate the entire space as a single hunk, then probe PROBE_INTERVAL
chunks within the hunk.
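
The contrast between the two schemes can be sketched with a toy model
(names and numbers here are illustrative, not GCC internals): track the
largest distance the stack pointer ever gets from the most recent probe.

```c
#include <assert.h>

#define PROBE_INTERVAL 4096

/* Toy model: sp_off is how far the stack has been extended, last_probe
   the deepest offset known to have been touched, worst the largest
   unprobed extension ever observed.  */
struct sim { long sp_off, last_probe, worst; };

static void extend (struct sim *s, long bytes)
{
  s->sp_off += bytes;
  if (s->sp_off - s->last_probe > s->worst)
    s->worst = s->sp_off - s->last_probe;
}

static void probe_at (struct sim *s, long off)
{
  if (off > s->last_probe)
    s->last_probe = off;
}

/* Existing routine: allocate the entire space as a single hunk, then
   probe PROBE_INTERVAL chunks within it.  */
static long worst_gap_old (long size)
{
  struct sim s = { 0, 0, 0 };
  extend (&s, size);
  for (long off = PROBE_INTERVAL; off <= size; off += PROBE_INTERVAL)
    probe_at (&s, off);
  return s.worst;
}

/* New routine: allocate PROBE_INTERVAL chunks, probing each as it is
   allocated; the residual is allocated last and needs no probe.  */
static long worst_gap_new (long size)
{
  struct sim s = { 0, 0, 0 };
  long rounded = size - size % PROBE_INTERVAL;
  for (long off = 0; off < rounded; off += PROBE_INTERVAL)
    {
      extend (&s, PROBE_INTERVAL);
      probe_at (&s, s.sp_off);
    }
  extend (&s, size - rounded);
  return s.worst;
}
```

For a 10000-byte request the old shape briefly leaves all 10000 bytes
unprobed, while the chunked shape never exceeds one PROBE_INTERVAL.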

That routine is never presented with constant allocations on x86, but is
presented with constant allocations on other architectures.  It will
optimize cases when it knows it does not need the loop or the residual
allocation after the loop.   It does not have an unrolled loop mode, but
one could be added -- it didn't seem worth the effort.

The test will check that the loop is avoided for one case where it makes
sense.  It does not check for avoiding the residual allocation, but it
could probably be made to do so.


The indirection for STACK_CHECK_PROTECT via get_stack_check_protect is worth
some further discussion as well.

Early in the development of the stack-clash mitigation patches we
thought we could get away with re-using much of the existing target code
for -fstack-check=specific.

Essentially that code starts a probing loop at STACK_CHECK_PROTECT and
probes 2-3 pages beyond the current function's needs.  The problem was
that starting at STACK_CHECK_PROTECT would skip probes in the first
couple pages leaving the code vulnerable.

So the idea was to avoid using STACK_CHECK_PROTECT directly.  Instead we
would indirect through a new function (get_stack_check_protect) which
would return either 0 or STACK_CHECK_PROTECT depending on whether
-fstack-clash-protection or -fstack-check=specific is in effect.
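
A minimal sketch of that indirection (the flag variable and the
STACK_CHECK_PROTECT value below are stand-ins for GCC's real option
global and target macro, not the actual definitions):

```c
#include <assert.h>

#define STACK_CHECK_PROTECT (3 * 4096)   /* illustrative value */

static int flag_stack_clash_protection;  /* stand-in for the option global */

/* For -fstack-clash-protection, probing must start at offset 0 so no
   page in the first STACK_CHECK_PROTECT bytes is skipped; for
   -fstack-check=specific the historical skip is preserved.  */
static unsigned long get_stack_check_protect (void)
{
  return flag_stack_clash_protection ? 0 : STACK_CHECK_PROTECT;
}
```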

That scheme works reasonably well.  Except that it will tend to allocate
a large (larger than PROBE_INTERVAL) chunk of memory at once, then go
back and probe regions of PROBE_INTERVAL size.  That introduces an
unfortunate race condition with asynch signals and also crashes valgrind
on ppc and aarch64.

Rather than throw that code away, it may still be valuable to those
targets with -fstack-check=specific support, but without
-fstack-clash-protection support.  So I'm including it here.


OK for the trunk?



* explow.c (anti_adjust_stack_and_probe_stack_clash): New function.
(get_stack_check_protect): Likewise.
(compute_stack_clash_protection_loop_data): Likewise.
(emit_stack_clash_protection_loop_start): Likewise.
(emit_stack_clash_protection_loop_end): Likewise.
(allocate_dynamic_stack_space):
Use anti_adjust_stack_and_probe_stack_clash.
* explow.h (compute_stack_clash_protection_loop_data): Prototype.
(emit_stack_clash_protection_loop_start): Likewise.
(emit_stack_clash_protection_loop_end): Likewise.
* rtl.h (get_stack_check_protect): Prototype.
* config/aarch64/aarch64.c (aarch64_expand_prologue): Use
get_stack_check_protect.
* config/alpha/alpha.c (alpha_expand_prologue): Likewise.
* config/arm/arm.c (arm_expand_prologue): Likewise.
* config/i386/i386.c (ix86_expand_prologue): Likewise.
* config/ia64/ia64.c (ia64_expand_prologue): Likewise.
* config/mips/mips.c (mips_expand_prologue): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_emit_prologue): Likewise.
* config/rs6000/rs6000.c (rs6000_emit_prologue): Likewise.
* config/sparc/sparc.c (sparc_expand_prologue): Likewise.


testsuite

* gcc.dg/stack-check-3.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ef1b5a8..0a8b40a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3676,12 +3676,14 @@ aarch64_expand_prologue (void)
 {
   if (crtl->is_leaf && !cfun->calls_alloca)
 

[Bug libstdc++/81480] New: Assertion `ec' failed

2017-07-18 Thread akhilesh.k at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81480

Bug ID: 81480
   Summary: Assertion `ec'   failed
   Product: gcc
   Version: 6.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: akhilesh.k at samsung dot com
  Target Milestone: ---

While verifying the libstdc++ tests I found the summary report below.

Is this expected, or has it been fixed in later versions?


=== libstdc++ Summary ===

# of expected passes        10170
# of unexpected failures    6
# of expected failures      66
# of unsupported tests      612
runtest completed at Tue Jul 18 10:22:55 2017


--Fail logs 
Line 8625: FAIL: abi/header_cxxabi.c (test for excess errors)
Line 8710: FAIL: experimental/filesystem/iterators/directory_iterator.cc execution test
Line 8714: FAIL: experimental/filesystem/iterators/recursive_directory_iterator.cc execution test
Line 8734: FAIL: experimental/filesystem/operations/exists.cc execution test
Line 8738: FAIL: experimental/filesystem/operations/is_empty.cc execution test
Line 8750: FAIL: experimental/filesystem/operations/temp_directory_path.cc execution test



testsuite/abi/header_cxxabi.c:21:20: fatal error: cxxabi.h: No such file or directory

The other failures are all of the form:
void test01(): Assertion `ec' failed.

[PATCH][RFA/RFC] Stack clash mitigation patch 08/08 V2

2017-07-18 Thread Jeff Law
I don't think this patch has changed in any significant way since the V1
patch.

I have tested a slightly different version which punts stack clash
protection for very large static stack frames -- otherwise tests which
have *huge* frames will timeout, run out of memory during compilation, etc.

--
s390's most interesting property is that the caller allocates space for
the callee to save registers into.

So we start with a very conservative assumption about the offset between
SP and the most recent stack probe.  As we encounter those register
saves we may be able to decrease that offset.  And like aarch64 as we
allocate space, the offset increases.  If the offset crosses
PROBE_INTERVAL, we must emit probes.
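
The offset bookkeeping described above can be sketched like this (an
illustrative toy model, not the s390 backend code):

```c
#include <assert.h>

#define PROBE_INTERVAL 4096

/* OFFSET is the distance from SP to the most recent known probe.
   A register save at SAVE_OFFSET bytes above SP acts as an implicit
   probe and may shrink that distance.  */
static long note_register_save (long offset, long save_offset)
{
  return save_offset < offset ? save_offset : offset;
}

/* An allocation grows the distance; crossing PROBE_INTERVAL demands
   an explicit probe, which then lands at the new SP.  */
static long note_allocation (long offset, long size, int *need_probe)
{
  offset += size;
  *need_probe = (offset >= PROBE_INTERVAL);
  return *need_probe ? 0 : offset;
}
```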

For large frames, I did not implement an allocate/probe in a loop.
Someone with a better understanding of the architecture is better suited
for that work.  I'll note that you're going to need another scratch
register.  This is the cause of the xfail of one test which expects to
see a prologue allocate/probe loop.

s390 has a -mbackchain option.  I'm not sure where it's used, but we do
try to handle it in the initial offset computation.  However, we don't
handle it in the actual allocations that occur when
-fstack-clash-protection is active.

Other than the xfail noted above, the s390 uses the same tests as the
x86, ppc and aarch64 ports.

I suspect we're going to need further iteration here.

* config/s390/s390.c (PROBE_INTERVAL): Define.
(allocate_stack_space): New function, partially extracted from
s390_emit_prologue.
(s390_emit_prologue): Track offset to most recent stack probe.
Code to allocate space moved into allocate_stack_space.
Dump actions when no stack is allocated.

testsuite/

* gcc.dg/stack-check-6.c: xfail for s390*-*-*.

commit 0d2fdca4d86238f2fc095c7d91013e927c6ecf0c
Author: Jeff Law 
Date:   Fri Jul 7 17:25:35 2017 +

S390 implementation

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 958ee3b..7d4020c 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -10999,6 +10999,107 @@ pass_s390_early_mach::execute (function *fun)
 
 } // anon namespace
 
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+
+/* Allocate SIZE bytes of stack space, using TEMP_REG as a temporary
+   if necessary.  LAST_PROBE_OFFSET contains the offset of the closest
+   probe relative to the stack pointer.
+
+   Note that SIZE is negative. 
+
+   TMP_REG_IS_LIVE indicates that TEMP_REG actually holds a live
+   value and must be restored if we clobber it.  */
+static void
+allocate_stack_space (rtx size, HOST_WIDE_INT last_probe_offset,
+ rtx temp_reg, bool temp_reg_is_live)
+{
+  rtx insn;
+
+  /* If we are emitting stack probes and a SIZE allocation would cross
+ the PROBE_INTERVAL boundary, then we need significantly different
+ sequences to allocate and probe the stack.  */
+  if (flag_stack_clash_protection
+  && last_probe_offset + -INTVAL (size) < PROBE_INTERVAL)
+dump_stack_clash_frame_info (NO_PROBE_SMALL_FRAME, true);
+  else if (flag_stack_clash_protection
+  && last_probe_offset + -INTVAL (size) >= PROBE_INTERVAL)
+{
+  rtx memref;
+
+  HOST_WIDE_INT rounded_size = -INTVAL (size) & -PROBE_INTERVAL;
+
+  emit_move_insn (temp_reg, GEN_INT (PROBE_INTERVAL - 8));
+
+  /* We really should have a runtime loop version as well.  */
+  for (unsigned int i = 0; i < rounded_size; i += PROBE_INTERVAL)
+   {
+ insn = emit_insn (gen_add2_insn (stack_pointer_rtx,
+  GEN_INT (-PROBE_INTERVAL)));
+ RTX_FRAME_RELATED_P (insn) = 1;
+
+ /* We just allocated PROBE_INTERVAL bytes of stack space.  Thus,
+a probe is mandatory here, but LAST_PROBE_OFFSET does not
+change.  */
+ memref = gen_rtx_MEM (Pmode, gen_rtx_PLUS (Pmode, temp_reg,
+stack_pointer_rtx));
+ MEM_VOLATILE_P (memref) = 1;
+ emit_move_insn (memref, temp_reg);
+   }
+
+  /* Handle any residual allocation request.  */
+  HOST_WIDE_INT residual = -INTVAL (size) - rounded_size;
+  insn = emit_insn (gen_add2_insn (stack_pointer_rtx,
+  GEN_INT (-residual)));
+  RTX_FRAME_RELATED_P (insn) = 1;
+  last_probe_offset += residual;
+  if (last_probe_offset >= PROBE_INTERVAL)
+   {
+ emit_move_insn (temp_reg, GEN_INT (residual
+- GET_MODE_SIZE (word_mode)));
+ memref = gen_rtx_MEM (Pmode, gen_rtx_PLUS (Pmode, temp_reg,
+stack_pointer_rtx));
+ MEM_VOLATILE_P (memref) = 1;
+ emit_move_insn (memref, temp_reg);
+   }
+
+  /* We clobbered TEMP_REG, but it really isn't a temporary at this point,
+restore its 

[PATCH][RFA/RFC] Stack clash mitigation patch 04/08 V2

2017-07-18 Thread Jeff Law
I don't think this has changed in any significant way since V1.

--

This patch introduces x86 target specific bits to mitigate stack clash
attacks.

The key differences relative to the -fstack-check=specific expander are
that it never allocates more than PROBE_INTERVAL bytes at a time, it
probes into each allocated hunk immediately after allocation, and it
exploits the fact that a call instruction generates an implicit probe at
*sp for the callee and uses that fact to avoid many explicit probes.


The highlights:

  1. If the size of the local frame is < PROBE_INTERVAL, then no
 probing is needed.

  2. For up to 4 * PROBE_INTERVAL sized frames the allocation and
 probe are emitted inline.

  3. Anything larger we implement as an allocate/probe loop where
 each iteration handles a region of PROBE_INTERVAL size.

  4. Residuals need not be probed.

  5. CFIs should be correct, even for the loop case.

  6. Introduces several new tests.  Many of which are used unmodified
 on ppc, aarch64 and s390 later.
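
The size-based strategy selection in items 1-4 can be sketched as
follows (a simplification of the real expander; the constants are
illustrative):

```c
#include <assert.h>

#define PROBE_INTERVAL 4096

enum probe_strategy { NO_PROBE_SMALL_FRAME, PROBE_INLINE, PROBE_LOOP };

/* Small frames skip probing entirely, medium frames get inline
   allocate/probe pairs, anything larger uses a runtime loop.
   Residuals (item 4) are never probed in any case.  */
static enum probe_strategy choose_strategy (long size)
{
  if (size < PROBE_INTERVAL)
    return NO_PROBE_SMALL_FRAME;
  if (size <= 4 * PROBE_INTERVAL)
    return PROBE_INLINE;
  return PROBE_LOOP;
}
```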


This implementation should be very efficient for the most common cases.

OK for the trunk?


* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): New.
(ix86_expand_prologue): Dump stack clash info as needed.
Call ix86_adjust_stack_and_probe_stack_clash as needed.

testsuite/

* gcc.dg/stack-check-4.c: New test.
* gcc.dg/stack-check-5.c: New test.
* gcc.dg/stack-check-6.c: New test.
* gcc.dg/stack-check-7.c: New test.
* gcc.dg/stack-check-8.c: New test.
* gcc.dg/stack-check-9.c: New test.
* gcc.dg/stack-check-10.c: New test.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0947b3c..fa10c28 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13779,6 +13779,140 @@ release_scratch_register_on_entry (struct scratch_reg *sr)
 
 #define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
 
+/* Emit code to adjust the stack pointer by SIZE bytes while probing it.
+
+   This differs from the next routine in that it tries hard to prevent
+   attacks that jump the stack guard.  Thus it is never allowed to allocate
+   more than PROBE_INTERVAL bytes of stack space without a suitable
+   probe.  */
+
+static void
+ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size)
+{
+  struct machine_function *m = cfun->machine;
+
+  /* If this function does not statically allocate stack space, then
+ no probes are needed.  */
+  if (!size)
+{
+  dump_stack_clash_frame_info (NO_PROBE_NO_FRAME, false);
+  return;
+}
+
+  /* If we are a noreturn function, then we have to consider the
+ possibility that we're called via a jump rather than a call.
+
+ Thus we don't have the implicit probe generated by saving the
+ return address into the stack at the call.  Thus, the stack
+ pointer could be anywhere in the guard page.  The safe thing
+ to do is emit a probe now.
+
+ ?!? This should be revamped to work like aarch64 and s390 where
+ we track the offset from the most recent probe.  Normally that
     offset would be zero.  For a noreturn function we would reset
+ it to PROBE_INTERVAL - (STACK_BOUNDARY / BITS_PER_UNIT).   Then
+ we just probe when we cross PROBE_INTERVAL.  */
+  if (TREE_THIS_VOLATILE (cfun->decl))
+emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
+-GET_MODE_SIZE (word_mode)));
+
+  /* If we allocate less than PROBE_INTERVAL bytes statically,
+ then no probing is necessary, but we do need to allocate
+ the stack.  */
+  if (size < PROBE_INTERVAL)
+{
+  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+GEN_INT (-size), -1,
+m->fs.cfa_reg == stack_pointer_rtx);
+  dump_stack_clash_frame_info (NO_PROBE_SMALL_FRAME, true);
+  return;
+}
+
+  /* We're allocating a large enough stack frame that we need to
+ emit probes.  Either emit them inline or in a loop depending
+ on the size.  */
+  if (size <= 4 * PROBE_INTERVAL)
+{
+  HOST_WIDE_INT i;
+  for (i = PROBE_INTERVAL; i <= size; i += PROBE_INTERVAL)
+   {
+ /* Allocate PROBE_INTERVAL bytes.  */
+ pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+GEN_INT (-PROBE_INTERVAL), -1,
+m->fs.cfa_reg == stack_pointer_rtx);
+
+ /* And probe at *sp.  */
+ emit_stack_probe (stack_pointer_rtx);
+   }
+
+  /* We need to allocate space for the residual, but we do not need
+to probe the residual.  */
+  HOST_WIDE_INT residual = (i - PROBE_INTERVAL - size);
+  if (residual)
+   pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+  GEN_INT (residual), -1,
+  m->fs.cfa_reg == 

[PATCH][RFA/RFC] Stack clash mitigation patch 07/08 V2

2017-07-18 Thread Jeff Law

So this patch has changed considerably since V1 as well.

First, we no longer track the bulk of the register stores in the
prologue.  Those may be separately shrink wrapped and thus not executed
on all paths and as such are not candidates for implicit probes.

Second, per the discussions we've had on-list, we're less aggressive at
probing.  We assume the caller has not pushed us more than 1kbyte into
the stack guard.  Thus stacks of < 3kbytes in the callee need no probes.
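
The 1k/3k arithmetic is consistent with a 4kB guard region (an
assumption on my part; the text does not state the guard size): if the
caller may already be 1kB into the guard, the callee can safely touch
guard minus 1kB before its first probe.

```c
#include <assert.h>

/* How much the callee may allocate without probing, given the guard
   size and the worst-case depth the caller may already have reached
   into the guard.  */
static long probe_free_allocation (long guard_size, long caller_slop)
{
  return guard_size - caller_slop;
}
```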

Third, the implicit probe tracking is simplified.  I'm exceedingly happy
to find out that we can never have a nonzero initial_adjust and
callee_adjust at the same time.  That's a significant help.

We still use the save of lr/fp as an implicit probe.

This ought to be much more efficient than the prior version.


Hopefully this is closer to something the aarch64 maintainers are
comfortable with.

--

* config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Handle
-fstack-clash-protection probing too.
(aarch64_allocate_and_probe_stack_space): New function.
(aarch64_expand_prologue): Assert we never have both an initial
adjustment and callee save adjustment.  Track distance between SP and
most recent probe.  Use aarch64_allocate_and_probe_stack_space
when -fstack-clash-protection is enabled rather than just adjusting sp.
Dump actions via dump_stack_clash_frame_info.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0a8b40a..8764d62 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2830,6 +2830,9 @@ aarch64_output_probe_stack_range (rtx reg1, rtx reg2)
   char loop_lab[32];
   rtx xops[2];
 
+  if (flag_stack_clash_protection)
+reg1 = stack_pointer_rtx;
+
   ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno++);
 
   /* Loop.  */
@@ -2841,7 +2844,14 @@ aarch64_output_probe_stack_range (rtx reg1, rtx reg2)
   output_asm_insn ("sub\t%0, %0, %1", xops);
 
   /* Probe at TEST_ADDR.  */
-  output_asm_insn ("str\txzr, [%0]", xops);
+  if (flag_stack_clash_protection)
+{
+  gcc_assert (xops[0] == stack_pointer_rtx);
+  xops[1] = GEN_INT (PROBE_INTERVAL - 8);
+  output_asm_insn ("str\txzr, [%0, %1]", xops);
+}
+  else
+output_asm_insn ("str\txzr, [%0]", xops);
 
   /* Test if TEST_ADDR == LAST_ADDR.  */
   xops[1] = reg2;
 static void
 aarch64_save_callee_saves (machine_mode mode, HOST_WIDE_INT start_offset,
@@ -3605,6 +3617,68 @@ aarch64_set_handled_components (sbitmap components)
   cfun->machine->reg_is_wrapped_separately[regno] = true;
 }
 
+/* Allocate SIZE bytes of stack space using SCRATCH_REG as a scratch
+   register.
+
+   LAST_PROBE_OFFSET contains the offset between the stack pointer and
+   the last known probe.  As LAST_PROBE_OFFSET crosses PROBE_INTERVAL
+   emit a probe and adjust LAST_PROBE_OFFSET.  */
+static void
+aarch64_allocate_and_probe_stack_space (int scratchreg, HOST_WIDE_INT size,
+   HOST_WIDE_INT *last_probe_offset)
+{
+  rtx temp = gen_rtx_REG (word_mode, scratchreg);
+
+  HOST_WIDE_INT rounded_size = size & -PROBE_INTERVAL;
+  HOST_WIDE_INT residual = size - rounded_size;
+
+  /* We can handle a small number of allocations/probes inline.  Otherwise
+ punt to a loop.  */
+  if (rounded_size && rounded_size <= 4 * PROBE_INTERVAL)
+{
+  for (HOST_WIDE_INT i = 0; i < rounded_size; i += PROBE_INTERVAL)
+   {
+ /* We should never need a scratch register for this adjustment.  */
+ aarch64_sub_sp (-1, PROBE_INTERVAL, true);
+
+ /* We just allocated PROBE_INTERVAL bytes.  Thus, a probe is
+mandatory.  Note that LAST_PROBE_OFFSET does not change here.  */
+ emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
+  (PROBE_INTERVAL
+   - GET_MODE_SIZE (word_mode))));
+   }
+  dump_stack_clash_frame_info (PROBE_INLINE, size != rounded_size);
+}
+  else if (rounded_size)
+{
+  /* Compute the ending address.  */
+  emit_move_insn (temp, GEN_INT (-rounded_size));
+  emit_insn (gen_add3_insn (temp, stack_pointer_rtx, temp));
+
+  /* This allocates and probes the stack.  Like the inline version above
+it does not need to change LAST_PROBE_OFFSET.
+
+It almost certainly does not update CFIs correctly.  */
+  emit_insn (gen_probe_stack_range (temp, temp, temp));
+  dump_stack_clash_frame_info (PROBE_LOOP, size != rounded_size);
+}
+
+  /* Handle any residuals.  */
+  if (residual)
+{
+  aarch64_sub_sp (-1, residual, true);
+  *last_probe_offset += residual;
+  if (*last_probe_offset >= PROBE_INTERVAL)
+   {
+ *last_probe_offset -= PROBE_INTERVAL;
+ emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
+  (residual
+   - 

[PATCH][RFA/RFC] Stack clash mitigation patch 06/08 V2

2017-07-18 Thread Jeff Law


These are the PPC bits for stack clash mitigation.



As noted before the PPC bits were larger/more complex than other ports.
 Part of that was due to the PPC defining its own dynamic stack
allocation expander -- which in turn means we weren't using any of the
generic code in explow.c for stack clash mitigation and instead had a
virtual copy in rs6000.md.

This patch removes that virtual copy and instead uses the slightly
refactored bits and is clearly simpler.

--

PPC is interesting in that its ABIs requires *sp to always contain the
backchain.  That implicit probe is very useful in eliminating many
explicit probes.  In fact, from the standpoint of avoiding explicit
probes it's probably the most ideal situation.

We honor the requirement that store-with-base-register-modification
instructions are the only way to allocate stack in the new code.   This
means we have to keep a copy of the backchain handy as a source operand
for that instruction.  This implies a special case for a stack frame of
precisely PROBE_INTERVAL bytes, which need not copy the backchain into a
temporary.
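
The backchain requirement can be modeled as follows: every stack
extension is a store-with-update that writes the old SP (the backchain)
at the new SP, so allocation and probe are a single instruction.  This
is a toy model, not the rs6000 backend code:

```c
#include <assert.h>

struct ppc_stack { long sp; long word_at_sp; };

/* Model of a store-with-update allocation (e.g. stdu): move SP down
   by BYTES and store the backchain at the new *sp in one step -- the
   store itself is the implicit probe.  */
static void alloc_with_backchain (struct ppc_stack *s, long bytes)
{
  long backchain = s->sp;
  s->sp -= bytes;
  s->word_at_sp = backchain;   /* *sp always holds the backchain */
}
```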

I'm pretty sure the CFIs are not right for the loop case.


We select between two probing loop styles for the probe_stack_range insn.

The PPC port also has its own insn to allocate dynamic stack space.  So
there's a chunk of code in that expander to handle -fstack-clash-protection.


You'll note there are no new tests -- the tests added in patch #5 are
used by the PPC port as-is.

Ok for the trunk?


* config/rs6000/rs6000-protos.h (output_probe_stack_range): Update
prototype for new argument.
* config/rs6000/rs6000.c (wrap_frame_mem): New function extracted
from rs6000_emit_allocate_stack.
(PROBE_INTERVAL): Define.
(handle_residual): New function. 
(rs6000_emit_probe_stack_range_stack_clash): New function.
(rs6000_emit_allocate_stack): Use wrap_frame_mem.
Call rs6000_emit_probe_stack_range_stack_clash as needed.
(rs6000_emit_probe_stack_range): Add additional argument
to call to gen_probe_stack_range{si,di}.
(output_probe_stack_range): New.
(output_probe_stack_range_1):  Renamed from output_probe_stack_range.
(output_probe_stack_range_stack_clash): New.
(rs6000_emit_prologue): Emit notes into dump file as requested.
* rs6000.md (allocate_stack): Handle -fstack-clash-protection
(probe_stack_range): Operand 0 is now early-clobbered.
Add additional operand and pass it to output_probe_stack_range.


diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index aeec9b2..451c442 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -134,7 +134,7 @@ extern void rs6000_emit_sISEL (machine_mode, rtx[]);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
 extern void rs6000_emit_cbranch (machine_mode, rtx[]);
 extern char * output_cbranch (rtx, const char *, int, rtx_insn *);
-extern const char * output_probe_stack_range (rtx, rtx);
+extern const char * output_probe_stack_range (rtx, rtx, rtx);
 extern void rs6000_emit_dot_insn (rtx dst, rtx src, int dot, rtx ccreg);
 extern bool rs6000_emit_set_const (rtx, rtx);
 extern int rs6000_emit_cmove (rtx, rtx, rtx, rtx);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index aa70e30..d270edf 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -25618,6 +25618,213 @@ rs6000_emit_stack_tie (rtx fp, bool hard_frame_needed)
   emit_insn (gen_stack_tie (gen_rtx_PARALLEL (VOIDmode, p)));
 }
 
+/* INSN allocates SIZE bytes on the stack (STACK_REG) using a store
+   with update style insn.
+
+   Set INSN's alias set/attributes and add suitable flags and notes
+   for the dwarf CFI machinery.  */
+static void
+wrap_frame_mem (rtx insn, rtx stack_reg, HOST_WIDE_INT size)
+{
+  rtx par = PATTERN (insn);
+  gcc_assert (GET_CODE (par) == PARALLEL);
+  rtx set = XVECEXP (par, 0, 0);
+  gcc_assert (GET_CODE (set) == SET);
+  rtx mem = SET_DEST (set);
+  gcc_assert (MEM_P (mem));
+  MEM_NOTRAP_P (mem) = 1;
+  set_mem_alias_set (mem, get_frame_alias_set ());
+
+  RTX_FRAME_RELATED_P (insn) = 1;
+  add_reg_note (insn, REG_FRAME_RELATED_EXPR,
+   gen_rtx_SET (stack_reg, gen_rtx_PLUS (Pmode, stack_reg,
+ GEN_INT (-size))));
+}
+
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+
+#if PROBE_INTERVAL > 32768
+#error Cannot use indexed addressing mode for stack probing
+#endif
+
+/* Allocate ROUNDED_SIZE - ORIG_SIZE bytes on the stack, storing
+   ORIG_SP into *sp after the allocation.
+
+   ROUNDED_SIZE will be a multiple of PROBE_INTERVAL and
+   ORIG_SIZE - ROUNDED_SIZE will be less than PROBE_INTERVAL.
+
+   Return the insn that allocates the residual space.  */
+static rtx_insn *
+handle_residual (HOST_WIDE_INT orig_size,
+HOST_WIDE_INT rounded_size,
+rtx 

[PATCH][RFA/RFC] Stack clash mitigation patch 05/08

2017-07-18 Thread Jeff Law

I don't think this has changed in any significant way since V1.

--

The prior patch introduced -fstack-clash-protection prologues for the
x86.  And yet we still saw large allocations in our testing.

It turns out combine-stack-adjustments would take

allocate PROBE_INTERVAL
probe
allocate PROBE_INTERVAL
probe
allocate PROBE_INTERVAL
probe
allocate RESIDUAL

And turn that into

allocate (3 * PROBE_INTERVAL) + residual
probe
probe
probe

Adjusting the address of the probes appropriately.  Ugh.
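
Why the merged form is dangerous can be seen from the largest momentary
unprobed extension in each shape (illustrative arithmetic, not compiler
code):

```c
#include <assert.h>

/* After c-s-a merges the adjustments, the entire frame is live before
   the first probe executes.  */
static long worst_unprobed_merged (int chunks, long interval, long residual)
{
  return chunks * interval + residual;
}

/* In the original shape each chunk is probed immediately after its
   allocation, so only one interval (or a sub-interval residual) is
   ever unprobed at once.  */
static long worst_unprobed_split (int chunks, long interval, long residual)
{
  return chunks ? interval : residual;
}
```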

This patch introduces a new note that the backend can attach to a stack
adjustment which essentially tells c-s-a to not merge it into other
adjustments.  There's an x86-specific test to verify the behavior.

OK for the trunk?


* combine-stack-adj.c (combine_stack_adjustments_for_block): Do
nothing for stack adjustments with REG_STACK_CHECK.
* config/i386/i386.c (pro_epilogue_adjust_stack): Return insn.
(ix86_adjust_stack_and_probe_stack_clash): Add REG_STACK_CHECK notes.
* reg-notes.def (STACK_CHECK): New note.

testsuite/

* gcc.target/i386/stack-check-11.c: New test.


diff --git a/gcc/combine-stack-adj.c b/gcc/combine-stack-adj.c
index 9ec14a3..82d6dba 100644
--- a/gcc/combine-stack-adj.c
+++ b/gcc/combine-stack-adj.c
@@ -508,6 +508,8 @@ combine_stack_adjustments_for_block (basic_block bb)
continue;
 
   set = single_set_for_csa (insn);
+  if (set && find_reg_note (insn, REG_STACK_CHECK, NULL_RTX))
+   set = NULL_RTX;
   if (set)
{
  rtx dest = SET_DEST (set);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fa10c28..d297e8a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13405,7 +13405,7 @@ ix86_add_queued_cfa_restore_notes (rtx insn)
zero if %r11 register is live and cannot be freely used and positive
otherwise.  */
 
-static void
+static rtx
 pro_epilogue_adjust_stack (rtx dest, rtx src, rtx offset,
   int style, bool set_cfa)
 {
@@ -13496,6 +13496,7 @@ pro_epilogue_adjust_stack (rtx dest, rtx src, rtx offset,
   m->fs.sp_valid = valid;
   m->fs.sp_realigned = realigned;
 }
+  return insn;
 }
 
 /* Find an available register to be used as dynamic realign argument
@@ -13837,9 +13838,11 @@ ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size)
   for (i = PROBE_INTERVAL; i <= size; i += PROBE_INTERVAL)
{
  /* Allocate PROBE_INTERVAL bytes.  */
- pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+ rtx insn
+   = pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
 GEN_INT (-PROBE_INTERVAL), -1,
 m->fs.cfa_reg == stack_pointer_rtx);
+ add_reg_note (insn, REG_STACK_CHECK, const0_rtx);
 
  /* And probe at *sp.  */
  emit_stack_probe (stack_pointer_rtx);
diff --git a/gcc/reg-notes.def b/gcc/reg-notes.def
index 8734d26..18cf7e3 100644
--- a/gcc/reg-notes.def
+++ b/gcc/reg-notes.def
@@ -223,6 +223,10 @@ REG_NOTE (ARGS_SIZE)
pseudo reg.  */
 REG_NOTE (RETURNED)
 
+/* Indicates the instruction is a stack check probe that should not
+   be combined with other stack adjustments.  */
+REG_NOTE (STACK_CHECK)
+
 /* Used to mark a call with the function decl called by the call.
The decl might not be available in the call due to splitting of the call
insn.  This note is a SYMBOL_REF.  */
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-11.c b/gcc/testsuite/gcc.target/i386/stack-check-11.c
new file mode 100644
index 000..183103f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/stack-check-11.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+extern void arf (unsigned long int *, unsigned long int *);
+void
+frob ()
+{
+  unsigned long int num[859];
+  unsigned long int den[859];
+  arf (den, num);
+}
+
+/* { dg-final { scan-assembler-times "subq" 4 } } */
+/* { dg-final { scan-assembler-times "orq" 3 } } */
+


[PATCH][RFA/RFC] Stack clash mitigation patch 03/08 V2

2017-07-18 Thread Jeff Law

I don't think this patch changed in any significant way since V1.
--

One of the painful aspects of all this code is the amount of target
dependent bits that have to be written and tested.

I didn't want to be scanning assembly code or RTL for prologues.  Each
target would have to have its own scanner which was too painful to
contemplate.

So instead I settled on having a routine that the target dependent
prologue expanders could call to dump information about what they were
doing.

This greatly simplifies the testing side of things by having a standard
way to dump decisions.  When combined with the dejagnu routines from
patch #1 which describe key attributes of the target's prologue
generation I can write tests in a fairly generic way.
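The idea can be modeled roughly like this (the function name, enum values, and dump strings below are assumptions for illustration, not the patch's actual interface): each prologue expander reports its probing decision through a single routine, and the dejagnu tests then scan the dump for fixed strings instead of target-specific assembly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative model only -- the real patch defines its own names.  */
enum stack_clash_probes { NO_PROBE, PROBE_INLINE, PROBE_LOOP };

/* Map a probing decision to a fixed, target-independent string.  */
static const char *
stack_clash_dump_message (enum stack_clash_probes probes, int residuals)
{
  switch (probes)
    {
    case PROBE_INLINE:
      return residuals ? "Stack clash inline probes with residuals"
                       : "Stack clash inline probes";
    case PROBE_LOOP:
      return residuals ? "Stack clash probe loop with residuals"
                       : "Stack clash probe loop";
    default:
      return "Stack clash no probes needed";
    }
}

/* Called once per prologue expansion; tests scan the dump for the
   string rather than parsing RTL or assembly.  */
static void
dump_stack_clash_frame_info (FILE *dump_file, enum stack_clash_probes probes,
                             int residuals)
{
  if (dump_file)
    fprintf (dump_file, "%s.\n", stack_clash_dump_message (probes, residuals));
}
```

A test can then simply check for the presence or absence of one of these strings regardless of which target produced the dump.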

This will be used by every target dependent prologue expander in this
series.

OK for the trunk?
commit ff99b9cb21a195fb2b2c0e4d580db2b1e806ec97
Author: Kyrylo Tkachov 
Date:   Thu Apr 14 10:52:45 2016 +0100

[AArch64] From Andrew Pinski: Work around PR target/64971

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index da85a7f..a9e811e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -855,6 +855,13 @@ (define_expand "sibcall"
 	   || aarch64_is_noplt_call_p (callee)))
   XEXP (operands[0], 0) = force_reg (Pmode, callee);
 
+/* FIXME: This is a band-aid.  Need to analyze why expand_expr_addr_expr
+   is generating an SImode symbol reference.  See PR 64971.  */
+if (TARGET_ILP32
+	&& GET_CODE (XEXP (operands[0], 0)) == SYMBOL_REF
+	&& GET_MODE (XEXP (operands[0], 0)) == SImode)
+  XEXP (operands[0], 0) = convert_memory_address (Pmode,
+		  XEXP (operands[0], 0));
 if (operands[2] == NULL_RTX)
   operands[2] = const0_rtx;
 
@@ -886,6 +893,14 @@ (define_expand "sibcall_value"
 	   || aarch64_is_noplt_call_p (callee)))
   XEXP (operands[1], 0) = force_reg (Pmode, callee);
 
+/* FIXME: This is a band-aid.  Need to analyze why expand_expr_addr_expr
+   is generating an SImode symbol reference.  See PR 64971.  */
+if (TARGET_ILP32
+	&& GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF
+	&& GET_MODE (XEXP (operands[1], 0)) == SImode)
+  XEXP (operands[1], 0) = convert_memory_address (Pmode,
+		  XEXP (operands[1], 0));
+
 if (operands[3] == NULL_RTX)
   operands[3] = const0_rtx;
 
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr37433-1.c b/gcc/testsuite/gcc.c-torture/compile/pr37433-1.c
new file mode 100644
index 000..322c167
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr37433-1.c
@@ -0,0 +1,11 @@
+void regex_subst(void)
+{
+  const void *subst = "";
+  (*(void (*)(int))subst) (0);
+}
+
+void foobar (void)
+{
+  int x;
+  (*(void (*)(void))) ();
+}


[PATCH] [RFA/RFC] Stack clash mitigation patch 02/08 V2

2017-07-18 Thread Jeff Law

The biggest change since V1 of this patch is dropping the changes to
STACK_CHECK_MOVING_SP.  They're not needed.

This patch also refactors a bit of the new code in explow.c.  In
particular it pulls out 3 chunks of code for protecting dynamic stack
adjustments so they can be re-used by backends that have their own
allocation routines for dynamic stack data.

--
The key goal of this patch is to introduce stack clash protections for
dynamically allocated stack space and indirect uses of
STACK_CHECK_PROTECT via get_stack_check_protect.

Those two changes accomplish two things.  First it gives most targets
protection of dynamically allocated space (exceptions are targets with
their own expanders to allocate dynamic stack space, such as ppc).

Second, targets which do not get stack-clash protected prologues later
in this series, but which honor -fstack-clash-protection here, get a
fair amount of protection.

We essentially vector into a totally different routine to allocate/probe
the dynamic stack space when -fstack-clash-protection is active.  It
differs from the existing routine in that it allocates PROBE_INTERVAL
chunks and probes them as they are allocated.  The existing code would
allocate the entire space as a single hunk, then probe PROBE_INTERVAL
chunks within the hunk.

That routine is never presented with constant allocations on x86, but is
presented with constant allocations on other architectures.  It will
optimize cases when it knows it does not need the loop or the residual
allocation after the loop.   It does not have an unrolled loop mode, but
one could be added -- it didn't seem worth the effort.
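Schematically, the clash-protected allocator behaves like this plain-C model (the names and the downward-growing stack are illustrative; the real code emits RTL, not C):

```c
#include <assert.h>
#include <stddef.h>

#define PROBE_INTERVAL 4096   /* illustrative page-sized probe interval */

/* Model: allocate SIZE bytes of "stack" downward from SP in
   PROBE_INTERVAL chunks, touching each chunk as it is allocated, so
   no single adjustment can jump over a guard page.  */
static char *
allocate_and_probe (char *sp, size_t size)
{
  size_t rounded = size / PROBE_INTERVAL * PROBE_INTERVAL;
  for (size_t done = 0; done < rounded; done += PROBE_INTERVAL)
    {
      sp -= PROBE_INTERVAL;   /* allocate one PROBE_INTERVAL chunk */
      *sp = 0;                /* and probe it immediately */
    }
  if (size > rounded)
    {
      sp -= size - rounded;   /* residual allocation after the loop */
      *sp = 0;
    }
  return sp;
}
```

The existing -fstack-check routine, by contrast, subtracts the whole SIZE first and only then walks back probing, which is where the async-signal race comes from.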

The test will check that the loop is avoided for one case where it makes
sense.  It does not check for avoiding the residual allocation, but it
could probably be made to do so.


The indirection for STACK_CHECK_PROTECT via get_stack_check_protect is worth
some further discussion as well.

Early in the development of the stack-clash mitigation patches we
thought we could get away with re-using much of the existing target code
for -fstack-check=specific.

Essentially that code starts a probing loop at STACK_CHECK_PROTECT and
probes 2-3 pages beyond the current function's needs.  The problem was
that starting at STACK_CHECK_PROTECT would skip probes in the first
couple pages leaving the code vulnerable.

So the idea was to avoid using STACK_CHECK_PROTECT directly.  Instead we
would indirect through a new function (get_stack_check_protect) which
would return either 0 or STACK_CHECK_PROTECT depending on whether or not
we wanted -fstack-clash-protection or -fstack-check=specific respectively.
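The indirection itself is tiny; modeled in plain C (the constant's value is illustrative, and the real function lives in GCC's middle end):

```c
#include <assert.h>

static int flag_stack_clash_protection;

#define STACK_CHECK_PROTECT (3 * 4096)   /* illustrative value */

/* Callers that previously read STACK_CHECK_PROTECT directly now call
   this instead.  */
static long
get_stack_check_protect (void)
{
  /* -fstack-clash-protection: start probing at offset zero so the
     pages nearest the stack pointer are not skipped.  */
  if (flag_stack_clash_protection)
    return 0;
  /* -fstack-check=specific keeps its historical starting offset.  */
  return STACK_CHECK_PROTECT;
}
```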

That scheme works reasonably well.  Except that it will tend to allocate
a large (larger than PROBE_INTERVAL) chunk of memory at once, then go
back and probe regions of PROBE_INTERVAL size.  That introduces an
unfortunate race condition with asynch signals and also crashes valgrind
on ppc and aarch64.

Rather than throw that code away, it may still be valuable to those
targets with -fstack-check=specific support, but without
-fstack-clash-protection support.  So I'm including it here.


OK for the trunk?




[PATCH][RFA/RFC] Stack clash mitigation patch 02b/08 V2

2017-07-18 Thread Jeff Law

-fstack-clash-protection is now separate from -fstack-check=.  But we
still want targets without stack-clash specific prologue support to be
able to get partial coverage from -fstack-clash-protection.  This adds
the necessary checks for flag_stack_clash_protection to the appropriate
targets so we can re-use -fstack-check=specific to give some stack clash
protection.

I did not add it to the targets where I've done stack-clash-protection
support.  It'd be a waste of time (x86, ppc, aarch64, s390).

OK for the trunk?

* config/alpha/alpha.c (alpha_expand_prologue): Also check
flag_stack_clash_protection.
* config/arm/arm.c (arm_compute_static_chain_stack_bytes): Likewise.
(arm_expand_prologue, thumb1_expand_prologue): Likewise.
(arm_frame_pointer_required): Likewise.
* config/ia64/ia64.c (ia64_compute_frame_size): Likewise.
(ia64_expand_prologue): Likewise.
* config/mips/mips.c (mips_expand_prologue): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_expand_prologue): Likewise.
* config/sparc/sparc.c (sparc_expand_prologue): Likewise.
(sparc_flat_expand_prologue): Likewise.
* config/spu/spu.c (spu_expand_prologue): Likewise.



diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 91f3d7c..36e78a0 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -7740,7 +7740,7 @@ alpha_expand_prologue (void)
  Note that we are only allowed to adjust sp once in the prologue.  */
 
   probed_size = frame_size;
-  if (flag_stack_check)
+  if (flag_stack_check || flag_stack_clash_protection)
 probed_size += get_stack_check_protect ();
 
   if (probed_size <= 32768)
@@ -7755,7 +7755,7 @@ alpha_expand_prologue (void)
  /* We only have to do this probe if we aren't saving registers or
 if we are probing beyond the frame because of -fstack-check.  */
  if ((sa_size == 0 && probed_size > probed - 4096)
- || flag_stack_check)
+ || flag_stack_check || flag_stack_clash_protection)
emit_insn (gen_probe_stack (GEN_INT (-probed_size)));
}
 
@@ -7785,7 +7785,8 @@ alpha_expand_prologue (void)
 late in the compilation, generate the loop as a single insn.  */
   emit_insn (gen_prologue_stack_probe_loop (count, ptr));
 
-  if ((leftover > 4096 && sa_size == 0) || flag_stack_check)
+  if ((leftover > 4096 && sa_size == 0)
+ || flag_stack_check || flag_stack_clash_protection)
{
  rtx last = gen_rtx_MEM (DImode,
  plus_constant (Pmode, ptr, -leftover));
@@ -7793,7 +7794,7 @@ alpha_expand_prologue (void)
  emit_move_insn (last, const0_rtx);
}
 
-  if (flag_stack_check)
+  if (flag_stack_check || flag_stack_clash_protection)
{
  /* If -fstack-check is specified we have to load the entire
 constant into a register and subtract from the sp in one go,
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9822ca7..4a93767 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19151,7 +19151,8 @@ arm_compute_static_chain_stack_bytes (void)
   /* See the defining assertion in arm_expand_prologue.  */
   if (IS_NESTED (arm_current_func_type ())
   && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
- || (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+ || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+  || flag_stack_clash_protection)
  && !df_regs_ever_live_p (LR_REGNUM)))
   && arm_r3_live_at_start_p ()
   && crtl->args.pretend_args_size == 0)
@@ -21453,7 +21454,8 @@ arm_expand_prologue (void)
  clobbered when creating the frame, we need to save and restore it.  */
   clobber_ip = IS_NESTED (func_type)
   && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
-  || (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+  || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+   || flag_stack_clash_protection)
   && !df_regs_ever_live_p (LR_REGNUM)
   && arm_r3_live_at_start_p ()));
 
@@ -21667,7 +21669,8 @@ arm_expand_prologue (void)
  stack checking.  We use IP as the first scratch register, except for the
  non-APCS nested functions if LR or r3 are available (see clobber_ip).  */
   if (!IS_INTERRUPT (func_type)
-  && flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+  && (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+ || flag_stack_clash_protection))
 {
   unsigned int regno;
 
@@ -24959,7 +24962,9 @@ thumb1_expand_prologue (void)
 current_function_static_stack_size = size;
 
   /* If we have a frame, then do stack checking.  FIXME: not implemented.  */
-  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && size)
+  if ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+  

[PATCH][RFA/RFC] Stack clash mitigation patch 01/08 V2

2017-07-18 Thread Jeff Law

The biggest change in this update to patch 01/08 is moving of stack
clash protection out of -fstack-check= and into its own option,
-fstack-clash-protection.  I believe other issues raised by reviewers
have been addressed as well.

--

This patch introduces a new option -fstack-clash-protection designed to
protect the code GCC generates against stack-clash style attacks.

This patch also introduces dejagnu bits that are later used in tests.
The idea was to introduce dejagnu functions which describe aspects of
the target and have the tests adjust their expectations based on those
dejagnu functions rather than on a target name.

Finally, this patch introduces one new test of note.  Some targets have
call instructions that store a return pointer into the stack and we take
advantage of that ISA feature to avoid some explicit probes.

This optimization is restricted to cases where the caller does not have
a frame of its own (because there's no reasonable way to tear that frame
down on the return path).

However, a sufficiently smart compiler could realize that a call to a
noreturn function could be converted into a jump, even if the caller has
a frame because that frame need not be torn down.  Thus it would be
possible for a function calling a noreturn function to advance the stack
into the guard without actually touching the guard page, which breaks
the assumption that the call instruction would touch the guard
triggering a fault for that case.

GCC doesn't currently optimize that case for various reasons, but it
seemed prudent to go ahead and explicitly verify that with a test.
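A sketch of the pattern the test covers (function and variable names are made up for illustration):

```c
#include <stdlib.h>

/* The caller has a frame of its own.  If a compiler converted the
   noreturn call into a jump, the frame allocation could advance the
   stack pointer past the guard page without the implicit probe that a
   call instruction's return-address store provides.  */
__attribute__ ((noreturn)) static void
give_up (void)
{
  exit (0);
}

void
caller (void)
{
  char frame[8192];             /* caller's own frame */
  frame[0] = 0;                 /* keep the array live */
  give_up ();                   /* must stay a call, not become a jump */
}
```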


Thoughts?  OK for the trunk?




* common.opt (-fstack-clash-protection): New option.
* flag-types.h (enum stack_check_type): Note difference between
-fstack-check= and -fstack-clash-protection.
* toplev.c (process_options): Handle -fstack-clash-protection.
* doc/invoke.texi (-fstack-clash-protection): Document new option.
(-fstack-check): Note additional problem with -fstack-check=generic.
Note that -fstack-check is primarily for Ada and refer users
to -fstack-clash-protection for stack-clash-protection.
(-fstack-clash-protection): Document.

testsuite/

* gcc.dg/stack-check-2.c: New test.
* lib/target-supports.exp
(check_effective_target_supports_stack_clash_protection): New function.
(check_effective_target_frame_pointer_for_non_leaf): Likewise.
(check_effective_target_caller_implicit_probes): Likewise.



diff --git a/gcc/common.opt b/gcc/common.opt
index e81165c..cfaf2bc 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2306,6 +2306,11 @@ fstack-check
 Common Alias(fstack-check=, specific, no)
 Insert stack checking code into the program.  Same as -fstack-check=specific.
 
+fstack-clash-protection
+Common Report Var(flag_stack_clash_protection)
+Insert code to probe each page of stack space as it is allocated to protect
+from stack-clash style attacks
+
 fstack-limit
 Common Var(common_deferred_options) Defer
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 3e5cee8..bf1298d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11333,7 +11333,8 @@ target support in the compiler but comes with the following drawbacks:
 @enumerate
 @item
 Modified allocation strategy for large objects: they are always
-allocated dynamically if their size exceeds a fixed threshold.
+allocated dynamically if their size exceeds a fixed threshold.  Note this
+may change the semantics of some code.
 
 @item
 Fixed limit on the size of the static frame of functions: when it is
@@ -11348,6 +11349,19 @@ generic implementation, code performance is hampered.
 Note that old-style stack checking is also the fallback method for
 @samp{specific} if no target support has been added in the compiler.
 
+@samp{-fstack-check=} is designed for Ada's needs to detect infinite recursion
+and stack overflows.  @samp{specific} is an excellent choice when compiling
+Ada code.  It is not generally sufficient to protect against stack-clash
+attacks.  To protect against those you want @samp{-fstack-clash-protection}.
+
+@item -fstack-clash-protection
+@opindex fstack-clash-protection
+Generate code to prevent stack clash style attacks.  When this option is
+enabled, the compiler will only allocate one page of stack space at a time
+and each page is accessed immediately after allocation.  Thus, it prevents
+allocations from jumping over any stack guard page provided by the
+operating system.
+
 @item -fstack-limit-register=@var{reg}
 @itemx -fstack-limit-symbol=@var{sym}
 @itemx -fno-stack-limit
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 5faade5..8874cba 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -166,7 +166,14 @@ enum permitted_flt_eval_methods
   PERMITTED_FLT_EVAL_METHODS_C11
 };
 
-/* Type of stack check.  */
+/* Type of stack check.
+
+   Stack checking is designed to detect infinite recursion for Ada
+   

[PATCH][RFA/RFC] Stack clash mitigation patch 00/08 V2

2017-07-18 Thread Jeff Law

So later than I wanted, here's the V2 of the stack clash mitigation work.

Probably the biggest change in this version was moving the protection
out of -fstack-check= and into its own option (-fstack-clash-protection)

This has been bootstrapped and regression tested on the same set of
targets {x86_64, powerpc, powerpc64le, aarch64, s390x}-linux-gnu.  I've
also enabled -fstack-clash-protection and eyeballed test results
relative to the baselines to ensure nothing unexpected was failing.

Since this patch hits other targets that are not protected from
stack-clash a little harder, I also tested things like {alpha, mips,
ia64, hppa}-linux-gnu through building all-gcc.

As with the prior patch, comments, flames, questions are welcomed.

Jeff


Re: [PATCH] Fix pr80044, -static and -pie insanity, and pr81170

2017-07-18 Thread Alan Modra
On Tue, Jul 18, 2017 at 07:49:48AM -0700, H.J. Lu wrote:
> The difference is with --enable-default-pie, the gcc driver doesn't pass
> both -pie and -static to ld when "-static -pie" is used.   Does your change
> pass both -pie and -static to ld when "-static -pie" is used?

Again, as I said in the original post: "In both cases you now will
have -static completely overriding -pie".

That means "gcc -pie -static" and "gcc -static -pie" just pass
"-static" to ld, and select the appropriate startup files for a static
executable, when configured with --disable-default-pie.  Which is what
happens currently for --enable-default-pie.

None of this is rocket science.  I know what I'm doing where the
linker and startup files are concerned, and I'm comfortable with the
gcc specs language.  The patch is simple!  It should be easy to
review, except for trying to understand the "-" lines.  Yet it has sat
unreviewed for nearly four weeks.  And it fixes a powerpc
--enable-default-pie bootstrap failure (pr81295).

Joseph, would you please take a look?
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01678.html

I know there is more to do in this area, for example, it seems to me
that the HAVE_LD_PIE definition of GNU_USER_TARGET_STARTFILE_SPEC is
good for !HAVE_LD_PIE, and similarly for GNU_USE_TARGET_ENDFILE_SPEC.
And yes, I propagated that duplication into rs6000/sysv4.h, which
needs some serious tidying.  rs6000/sysv4.h linux support ought to be
using the gnu-user.h defines rather than copying them, something I've
told Segher I'll look at after this patch goes in.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PING] [PATCH] Transform (m1 > m2) * d into m1 > m2 ? d : 0

2017-07-18 Thread Hurugalawadi, Naveen
Hi,  

Please consider this as a personal reminder to review the patch
at following link and let me know your comments on the same.  

https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00178.html

Thanks,
Naveen




Re: [PING 5] [PATCH] [AArch64] vec_pack_trunc_ should split after register allocator

2017-07-18 Thread Hurugalawadi, Naveen
Hi,  

Please consider this as a personal reminder to review the patch
at following link and let me know your comments on the same.  

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01334.html

Thanks,
Naveen



    

[Bug middle-end/81478] By default, GCC emits a function call for complex multiplication

2017-07-18 Thread smcallis at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81478

--- Comment #2 from Sean McAllister  ---
*** Bug 81479 has been marked as a duplicate of this bug. ***

[Bug c++/81479] By default, GCC emits a function call for complex multiplication

2017-07-18 Thread smcallis at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81479

Sean McAllister  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Sean McAllister  ---
Sorry submitted twice somehow

*** This bug has been marked as a duplicate of bug 81478 ***

[Bug middle-end/81478] By default, GCC emits a function call for complex multiplication

2017-07-18 Thread smcallis at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81478

--- Comment #1 from Sean McAllister  ---
Created attachment 41785
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41785&action=edit
cfloat class

[Bug c++/81479] New: By default, GCC emits a function call for complex multiplication

2017-07-18 Thread smcallis at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81479

Bug ID: 81479
   Summary: By default, GCC emits a function call for complex
multiplication
   Product: gcc
   Version: 7.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: smcallis at gmail dot com
  Target Milestone: ---

I've seen this in gcc 4.4.7, 4.7.4 4.8.4, 5.4.1, 6.3.0 and 7.1.0

When compiling some simple complex arithmetic:

template <typename cx>
void __attribute__((noinline)) benchcore(const std::vector<cx> &aa, const
std::vector<cx> &bb, const std::vector<cx> &cc, std::vector<cx> &dd, cx uu,
cx vv, size_t nn) {
   for (ssize_t ii=0; ii < nn; ii++) {
dd[ii] = (
aa[ii]*uu +
bb[ii]*vv +
cc[ii]
);
}
}

> g++ -I. test.cc -O3 -o test

The assembly generated is very unfriendly; it just basically unconditionally
branches to the __mulsc3 function every time.

   0x00402a78 <+104>:   movss  0x4(%r12,%rbx,8),%xmm3
   0x00402a7f <+111>:   movss  (%r12,%rbx,8),%xmm2
   0x00402a85 <+117>:   movss  0x18(%rsp),%xmm0
   0x00402a8b <+123>:   movss  0x1c(%rsp),%xmm1
   0x00402a91 <+129>:   callq  0x400af0 <__mulsc3@plt>
   0x00402a96 <+134>:   movq   %xmm0,0x28(%rsp)
   0x00402a9c <+140>:   movss  0x14(%rsp),%xmm3
   0x00402aa2 <+146>:   movss  0x28(%rsp),%xmm5
   0x00402aa8 <+152>:   movss  0x2c(%rsp),%xmm4
   0x00402aae <+158>:   movss  0x4(%rbp,%rbx,8),%xmm1
   0x00402ab4 <+164>:   movss  0x0(%rbp,%rbx,8),%xmm0
   0x00402aba <+170>:   movss  0x10(%rsp),%xmm2
   0x00402ac0 <+176>:   movss  %xmm5,0xc(%rsp)
   0x00402ac6 <+182>:   movss  %xmm4,0x8(%rsp)
   0x00402acc <+188>:   callq  0x400af0 <__mulsc3@plt>
   0x00402ad1 <+193>:   movq   %xmm0,0x20(%rsp)
   0x00402ad7 <+199>:   movss  0x8(%rsp),%xmm4
   0x00402add <+205>:   movss  0xc(%rsp),%xmm5
   0x00402ae3 <+211>:   addss  0x24(%rsp),%xmm4
   0x00402ae9 <+217>:   addss  0x20(%rsp),%xmm5
   0x00402aef <+223>:   addss  0x4(%r13,%rbx,8),%xmm4
   0x00402af6 <+230>:   addss  0x0(%r13,%rbx,8),%xmm5
   0x00402afd <+237>:   movss  %xmm4,0x4(%r14,%rbx,8)
   0x00402b04 <+244>:   movss  %xmm5,(%r14,%rbx,8)

Which implements the spec in Annex G of the ANSI C spec.  Though this isn't
a "bug" per se, a very nice enhancement would be to recognize that one can
compute the complex multiply first, then check the results for NaN and _then_
call a function to correct the results.  This would allow the main
multiplication code to be inlined, at the cost of two compares, an and, and a
jump.  The unlikely path of having to fix the result will almost never be taken.
This would make default complex multiplies without -fcx-limited-range be much
better by default, if not ideal.
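The suggested fast-path/slow-path shape can be sketched like this (this is the reporter's proposed enhancement, not what GCC currently emits; the `a * b` in the slow path stands in for the Annex G corrective call, which is `__mulsc3` under GCC):

```c
#include <assert.h>
#include <complex.h>
#include <math.h>

/* Inline the naive multiply; only take the slow path when both parts
   come out NaN, which is when the Annex G infinity recovery can
   change the answer.  */
static float _Complex
mul_fast_path (float _Complex a, float _Complex b)
{
  float ar = crealf (a), ai = cimagf (a);
  float br = crealf (b), bi = cimagf (b);
  float re = ar * br - ai * bi;
  float im = ar * bi + ai * br;
  if (isnan (re) && isnan (im))
    return a * b;               /* unlikely: run the full Annex G fixup */
  return re + im * I;
}
```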

[Bug c++/81478] New: By default, GCC emits a function call for complex multiplication

2017-07-18 Thread smcallis at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81478

Bug ID: 81478
   Summary: By default, GCC emits a function call for complex
multiplication
   Product: gcc
   Version: 7.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: smcallis at gmail dot com
  Target Milestone: ---

Created attachment 41784
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41784&action=edit
Benchmark code

I've seen this in gcc 4.4.7, 4.7.4 4.8.4, 5.4.1, 6.3.0 and 7.1.0

When compiling some simple complex arithmetic:

template <typename cx>
void __attribute__((noinline)) benchcore(const std::vector<cx> &aa, const
std::vector<cx> &bb, const std::vector<cx> &cc, std::vector<cx> &dd, cx uu,
cx vv, size_t nn) {
   for (ssize_t ii=0; ii < nn; ii++) {
dd[ii] = (
aa[ii]*uu +
bb[ii]*vv +
cc[ii]
);
}
}

> g++ -I. test.cc -O3 -o test

The assembly generated is very unfriendly; it just basically unconditionally
branches to the __mulsc3 function every time.

   0x00402a78 <+104>:   movss  0x4(%r12,%rbx,8),%xmm3
   0x00402a7f <+111>:   movss  (%r12,%rbx,8),%xmm2
   0x00402a85 <+117>:   movss  0x18(%rsp),%xmm0
   0x00402a8b <+123>:   movss  0x1c(%rsp),%xmm1
   0x00402a91 <+129>:   callq  0x400af0 <__mulsc3@plt>
   0x00402a96 <+134>:   movq   %xmm0,0x28(%rsp)
   0x00402a9c <+140>:   movss  0x14(%rsp),%xmm3
   0x00402aa2 <+146>:   movss  0x28(%rsp),%xmm5
   0x00402aa8 <+152>:   movss  0x2c(%rsp),%xmm4
   0x00402aae <+158>:   movss  0x4(%rbp,%rbx,8),%xmm1
   0x00402ab4 <+164>:   movss  0x0(%rbp,%rbx,8),%xmm0
   0x00402aba <+170>:   movss  0x10(%rsp),%xmm2
   0x00402ac0 <+176>:   movss  %xmm5,0xc(%rsp)
   0x00402ac6 <+182>:   movss  %xmm4,0x8(%rsp)
   0x00402acc <+188>:   callq  0x400af0 <__mulsc3@plt>
   0x00402ad1 <+193>:   movq   %xmm0,0x20(%rsp)
   0x00402ad7 <+199>:   movss  0x8(%rsp),%xmm4
   0x00402add <+205>:   movss  0xc(%rsp),%xmm5
   0x00402ae3 <+211>:   addss  0x24(%rsp),%xmm4
   0x00402ae9 <+217>:   addss  0x20(%rsp),%xmm5
   0x00402aef <+223>:   addss  0x4(%r13,%rbx,8),%xmm4
   0x00402af6 <+230>:   addss  0x0(%r13,%rbx,8),%xmm5
   0x00402afd <+237>:   movss  %xmm4,0x4(%r14,%rbx,8)
   0x00402b04 <+244>:   movss  %xmm5,(%r14,%rbx,8)

Which implements the spec in Annex G of the ANSI C spec.  Though this isn't
a "bug" per se, a very nice enhancement would be to recognize that one can
compute the complex multiply first, then check the results for NaN and _then_
call a function to correct the results.  This would allow the main
multiplication code to be inlined, at the cost of two compares, an and, and a
jump.  The unlikely path of having to fix the result will almost never be taken.
This would make default complex multiplies without -fcx-limited-range be much
better by default, if not ideal.

Re: whereis PLUGIN_REGISTER_GGC_CACHES? how to migrate it for GCC v6.x?

2017-07-18 Thread Leslie Zhai



On 2017-07-19 04:54, Trevor Saunders wrote:

On Tue, Jul 18, 2017 at 03:52:55PM +0800, Leslie Zhai wrote:

Hi Trevor,

Thanks for your kind response!

On 2017-07-17 19:51, Trevor Saunders wrote:

On Wed, Jul 12, 2017 at 10:12:03AM +0800, Leslie Zhai wrote:

PS: Trevor's email is not available? thanks!

Sorry about that, I've left Mozilla and been vacationing for a month, so
didn't get to updating MAINTAINERS yet.  Here's a patch doing that.

Could I use PLUGIN_REGISTER_GGC_ROOTS to take place of
PLUGIN_REGISTER_GGC_CACHES? https://gcc.gnu.org/ml/gcc/2017-07/msg00052.html

If you are ok with your plugin keeping things alive that would otherwise
be garbage collected because they are no longer needed sure I think that
will work.  Though some things are ggc_free()ed and so you might want to
be careful touching those things that should be collected but you are
keeping alive.  Does that help?

Very helpful :) thank you so much!




Trev


Trev

commit ff900f40d23f765fd59047a90a7e3ff18cbcbf5a
Author: Trevor Saunders 
Date:   Mon Jul 17 07:44:50 2017 -0400

  update my entry in MAINTAINERS
  ChangeLog:
  * MAINTAINERS: Update my email address.

diff --git a/MAINTAINERS b/MAINTAINERS
index a2f24742374..6a314049c42 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -557,7 +557,7 @@ Hariharan Sandanagobalane   

   Iain Sandoe  
   Duncan Sands 
   Sujoy Saraswati  

-Trevor Saunders
+Trevor Saunders

   Aaron Sawdey 
   Roger Sayle  
   Will Schmidt 

--
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/





--
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/





Re: [PATCH] PR libstdc++/81395 fix crash when write follows large read

2017-07-18 Thread Jonathan Wakely

On 19/07/17 01:17 +0100, Jonathan Wakely wrote:

This fixes a crash that happens in std::filebuf when a large read
consumes the entire get area and is followed by a write, which is then
synced to the file by a call to overflow.

The problem is that xsgetn calls _M_set_buffer(0) after reading from
the file (i.e. when in 'read' mode). As the comments on _M_set_buffer
say, an argument of 0 is used for 'write' mode. This causes the
filebuf to have an active put area while in 'read' mode, so that the
next write inserts straight into that put area, rather than performing
the required seek to leave 'read' mode.

The next overflow then tries to leave 'read' mode by doing a seek, but
that then tries to flush the non-empty put area by calling overflow,
which goes into a loop until we overflow the stack.

The solution is to simply remove the call to _M_set_buffer(0). It's
not needed because the buffers are already set up appropriately after
xsgetn has read from the file: there's no active putback, no put area,
and setg(eback(), egptr(), egptr()) has been called so there's nothing
available in the get area. All we need to do is set _M_reading = true
so that a following write knows it needs to perform a seek.

The new testcase passes with GCC 4.5, so this is technically a
regression. However, I have a more demanding test that fails even with
GCC 4.5, so I don't think mixing reads and writes without intervening
seeks was ever working completely. I hope it is now.

I spent a LOT of time checking the make check-performance results
before and after this patch (and with various other attempted fixes)
and any difference seemed to be noise.
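A reduced model of the failing sequence looks roughly like this (the file name and sizes are illustrative; the committed 81395.cc testcase is the authoritative version):

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// A read that drains the entire get area, followed by a write with no
// intervening seek.  On an affected libstdc++ the sputc lands in a
// stale put area and the eventual flush recurses in overflow; on a
// fixed library this runs to completion.
bool read_then_write ()
{
  const char *path = "pr81395-repro.tmp";
  {
    std::ofstream out (path);
    out << std::string (1 << 16, 'x');   // larger than the default buffer
  }
  std::filebuf fb;
  if (!fb.open (path, std::ios::in | std::ios::out))
    return false;
  std::string buf (1 << 16, '\0');
  if (fb.sgetn (&buf[0], buf.size ()) != (std::streamsize) buf.size ())
    return false;                        // consume the whole get area
  fb.sputc ('y');                        // write without an intervening seek
  fb.close ();                           // flush; previously overflowed stack
  std::remove (path);
  return true;
}
```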

PR libstdc++/81395
* include/bits/fstream.tcc (basic_filebuf::xsgetn): Don't set buffer
pointers for write mode after reading.
* testsuite/27_io/basic_filebuf/sgetn/char/81395.cc: New.


The new test needs this dg-require so it doesn't FAIL on target boards
with no file I/O, and the dg-do is redundant.

Committed to trunk.


commit e868d4e4a67faa9b889720b5fcdd10f5eb0f4fa8
Author: Jonathan Wakely 
Date:   Wed Jul 19 01:19:20 2017 +0100

Use dg-require-fileio in new test

	* testsuite/27_io/basic_filebuf/sgetn/char/81395.cc: Add dg-require.

diff --git a/libstdc++-v3/testsuite/27_io/basic_filebuf/sgetn/char/81395.cc b/libstdc++-v3/testsuite/27_io/basic_filebuf/sgetn/char/81395.cc
index 4985628..ea8dbc1 100644
--- a/libstdc++-v3/testsuite/27_io/basic_filebuf/sgetn/char/81395.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_filebuf/sgetn/char/81395.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-do run }
+// { dg-require-fileio "" }
 
 // PR libstdc++/81395
 


[PATCH] PR libstdc++/81395 fix crash when write follows large read

2017-07-18 Thread Jonathan Wakely

This fixes a crash that happens in std::filebuf when a large read
consumes the entire get area and is followed by a write, which is then
synced to the file by a call to overflow.

The problem is that xsgetn calls _M_set_buffer(0) after reading from
the file (i.e. when in 'read' mode). As the comments on _M_set_buffer
say, an argument of 0 is used for 'write' mode. This causes the
filebuf to have an active put area while in 'read' mode, so that the
next write inserts straight into that put area, rather than performing
the required seek to leave 'read' mode.

The next overflow then tries to leave 'read' mode by doing a seek, but
that then tries to flush the non-empty put area by calling overflow,
which goes into a loop until we overflow the stack.

The solution is to simply remove the call to _M_set_buffer(0). It's
not needed because the buffers are already set up appropriately after
xsgetn has read from the file: there's no active putback, no put area,
and setg(eback(), egptr(), egptr()) has been called so there's nothing
available in the get area. All we need to do is set _M_reading = true
so that a following write knows it needs to perform a seek.

The new testcase passes with GCC 4.5, so this is technically a
regression. However, I have a more demanding test that fails even with
GCC 4.5, so I don't think mixing reads and writes without intervening
seeks was ever working completely. I hope it is now.

I spent a LOT of time checking the make check-performance results
before and after this patch (and with various other attempted fixes)
and any difference seemed to be noise.

PR libstdc++/81395
* include/bits/fstream.tcc (basic_filebuf::xsgetn): Don't set buffer
pointers for write mode after reading.
* testsuite/27_io/basic_filebuf/sgetn/char/81395.cc: New.

Tested powerpc64le-linux, committed to trunk.

commit 535a7ea29b4d6724519c0f472bcfe3eb9d79070a
Author: Jonathan Wakely 
Date:   Tue Jul 18 15:20:25 2017 +0100

PR libstdc++/81395 fix crash when write follows large read

PR libstdc++/81395
* include/bits/fstream.tcc (basic_filebuf::xsgetn): Don't set buffer
pointers for write mode after reading.
* testsuite/27_io/basic_filebuf/sgetn/char/81395.cc: New.

diff --git a/libstdc++-v3/include/bits/fstream.tcc 
b/libstdc++-v3/include/bits/fstream.tcc
index b1beff86..ef51a84 100644
--- a/libstdc++-v3/include/bits/fstream.tcc
+++ b/libstdc++-v3/include/bits/fstream.tcc
@@ -699,7 +699,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  
   if (__n == 0)
 {
-  _M_set_buffer(0);
+  // Set _M_reading. Buffer is already in initial 'read' mode.
   _M_reading = true;
 }
   else if (__len == 0)
diff --git a/libstdc++-v3/testsuite/27_io/basic_filebuf/sgetn/char/81395.cc 
b/libstdc++-v3/testsuite/27_io/basic_filebuf/sgetn/char/81395.cc
new file mode 100644
index 000..4985628
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_filebuf/sgetn/char/81395.cc
@@ -0,0 +1,46 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run }
+
+// PR libstdc++/81395
+
+#include <fstream>
+#include <cstring>  // for std::memset
+#include <cstdio>   // For BUFSIZ
+
+using std::memset;
+
+int main()
+{
+  {
+std::filebuf fb;
+fb.open("test.txt", std::ios::out);
+char data[BUFSIZ];
+memset(data, 'A', sizeof(data));
+fb.sputn(data, sizeof(data));
+  }
+
+  std::filebuf fb;
+  fb.open("test.txt", std::ios::in|std::ios::out);
+  char buf[BUFSIZ];
+  memset(buf, 0, sizeof(buf));
+  fb.sgetn(buf, sizeof(buf));
+  // Switch from reading to writing without seeking first:
+  fb.sputn("B", 1);
+  fb.pubsync();
+}


[Bug libstdc++/81395] [5/6/7/8 Regression] basic_filebuf::overflow recurses and overflows stack

2017-07-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81395

--- Comment #14 from Jonathan Wakely  ---
Author: redi
Date: Tue Jul 18 23:39:34 2017
New Revision: 250328

URL: https://gcc.gnu.org/viewcvs?rev=250328&root=gcc&view=rev
Log:
PR libstdc++/81395 fix crash when write follows large read

PR libstdc++/81395
* include/bits/fstream.tcc (basic_filebuf::xsgetn): Don't set buffer
pointers for write mode after reading.
* testsuite/27_io/basic_filebuf/sgetn/char/81395.cc: New.

Added:
trunk/libstdc++-v3/testsuite/27_io/basic_filebuf/sgetn/char/81395.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/fstream.tcc

Go patch committed: Insert type conversion for closure function value

2017-07-18 Thread Ian Lance Taylor
This patch by Than McIntosh changes the Go frontend, in
Func_expression::do_get_backend, when creating the backend
representation for a closure, to create a backend type conversion to
account for potential differences between the closure struct type
(where the number of fields is dependent on the number of values
referenced in the closure) and the generic function descriptor type
(struct with single function pointer field).  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 250326)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-c49ddc84f3ce89310585aad23ab6e51ef5523748
+3d9ff9bc339942922f1be3bef07c6fe2978ad81a
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 249799)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -1204,7 +1204,14 @@ Func_expression::do_get_backend(Translat
   // expression.  It is a pointer to a struct whose first field points
   // to the function code and whose remaining fields are the addresses
   // of the closed-over variables.
-  return this->closure_->get_backend(context);
+  Bexpression *bexpr = this->closure_->get_backend(context);
+
+  // Introduce a backend type conversion, to account for any differences
+  // between the argument type (function descriptor, struct with a
+  // single field) and the closure (struct with multiple fields).
+  Gogo* gogo = context->gogo();
+  Btype *btype = this->type()->get_backend(gogo);
+  return gogo->backend()->convert_expression(btype, bexpr, this->location());
 }
 
 // Ast dump for function.


[Bug go/81451] missing futex check - libgo/runtime/thread-linux.c:12:0 futex.h:13:12: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘long’

2017-07-18 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81451

Ian Lance Taylor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Ian Lance Taylor  ---
Should be fixed now on trunk.

[Bug go/81451] missing futex check - libgo/runtime/thread-linux.c:12:0 futex.h:13:12: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘long’

2017-07-18 Thread ian at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81451

--- Comment #4 from ian at gcc dot gnu.org  ---
Author: ian
Date: Tue Jul 18 23:14:29 2017
New Revision: 250326

URL: https://gcc.gnu.org/viewcvs?rev=250326&root=gcc&view=rev
Log:
PR go/81451
runtime: inline runtime_osinit

We had two identical copies of runtime_osinit. They set runtime_ncpu,
a variable that is no longer used. Removing that leaves us with two lines.
Inline those two lines in the two places the function was called.

This fixes GCC PR 81451.

Reviewed-on: https://go-review.googlesource.com/48862

Removed:
trunk/libgo/runtime/thread-linux.c
trunk/libgo/runtime/thread-sema.c
Modified:
trunk/gcc/go/gofrontend/MERGE
trunk/libgo/Makefile.am
trunk/libgo/Makefile.in
trunk/libgo/go/runtime/stubs.go
trunk/libgo/runtime/go-libmain.c
trunk/libgo/runtime/go-main.c
trunk/libgo/runtime/proc.c
trunk/libgo/runtime/runtime.h

libgo patch committed: Inline runtime.osinit

2017-07-18 Thread Ian Lance Taylor
Libgo had two identical copies of runtime_osinit. They set
runtime_ncpu, a variable that is no longer used. Removing that leaves
us with two lines. Inline those two lines in the two places the
function was called.  This fixes GCC PR 81451.  Bootstrapped and ran
Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 250325)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-0036bd04d077f8bbe5aa9a62fb8830c53068209e
+c49ddc84f3ce89310585aad23ab6e51ef5523748
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 250217)
+++ libgo/Makefile.am   (working copy)
@@ -399,12 +399,6 @@ rtems_task_variable_add_file =
 endif
 
 if LIBGO_IS_LINUX
-runtime_thread_files = runtime/thread-linux.c
-else
-runtime_thread_files = runtime/thread-sema.c
-endif
-
-if LIBGO_IS_LINUX
 runtime_getncpu_file = runtime/getncpu-linux.c
 else
 if LIBGO_IS_DARWIN
@@ -469,7 +463,6 @@ runtime_files = \
runtime/runtime_c.c \
runtime/stack.c \
runtime/thread.c \
-   $(runtime_thread_files) \
runtime/yield.c \
$(rtems_task_variable_add_file) \
$(runtime_getncpu_file)
Index: libgo/go/runtime/stubs.go
===
--- libgo/go/runtime/stubs.go   (revision 249799)
+++ libgo/go/runtime/stubs.go   (working copy)
@@ -422,13 +422,13 @@ func getPanicking() uint32 {
return panicking
 }
 
-// Temporary for gccgo until we initialize ncpu in Go.
+// Called by C code to set the number of CPUs.
 //go:linkname setncpu runtime.setncpu
 func setncpu(n int32) {
ncpu = n
 }
 
-// Temporary for gccgo until we reliably initialize physPageSize in Go.
+// Called by C code to set the page size.
 //go:linkname setpagesize runtime.setpagesize
 func setpagesize(s uintptr) {
if physPageSize == 0 {
Index: libgo/runtime/go-libmain.c
===
--- libgo/runtime/go-libmain.c  (revision 249799)
+++ libgo/runtime/go-libmain.c  (working copy)
@@ -105,7 +105,8 @@ gostart (void *arg)
 
   runtime_check ();
   runtime_args (a->argc, (byte **) a->argv);
-  runtime_osinit ();
+  setncpu (getproccount ());
+  setpagesize (getpagesize ());
   runtime_sched = runtime_getsched();
   runtime_schedinit ();
   __go_go (runtime_main, NULL);
Index: libgo/runtime/go-main.c
===
--- libgo/runtime/go-main.c (revision 249799)
+++ libgo/runtime/go-main.c (working copy)
@@ -51,7 +51,8 @@ main (int argc, char **argv)
   runtime_cpuinit ();
   runtime_check ();
   runtime_args (argc, (byte **) argv);
-  runtime_osinit ();
+  setncpu (getproccount ());
+  setpagesize (getpagesize ());
   runtime_sched = runtime_getsched();
   runtime_schedinit ();
   __go_go (runtime_main, NULL);
Index: libgo/runtime/proc.c
===
--- libgo/runtime/proc.c(revision 249799)
+++ libgo/runtime/proc.c(working copy)
@@ -370,7 +370,6 @@ extern G* allocg(void)
   __asm__ (GOSYM_PREFIX "runtime.allocg");
 
 Sched* runtime_sched;
-int32  runtime_ncpu;
 
 bool   runtime_isarchive;
 
Index: libgo/runtime/runtime.h
===
--- libgo/runtime/runtime.h (revision 249799)
+++ libgo/runtime/runtime.h (working copy)
@@ -217,7 +217,6 @@ extern  M*  runtime_getallm(void)
 extern Sched*  runtime_sched;
 extern uint32  runtime_panicking(void)
   __asm__ (GOSYM_PREFIX "runtime.getPanicking");
-extern int32   runtime_ncpu;
 extern struct debugVars runtime_debug;
 
 extern boolruntime_isstarted;
@@ -237,7 +236,6 @@ voidruntime_gogo(G*)
 struct __go_func_type;
 void   runtime_args(int32, byte**)
   __asm__ (GOSYM_PREFIX "runtime.args");
-void   runtime_osinit();
 void   runtime_alginit(void)
   __asm__ (GOSYM_PREFIX "runtime.alginit");
 void   runtime_goargs(void)
Index: libgo/runtime/thread-linux.c
===
--- libgo/runtime/thread-linux.c(revision 249799)
+++ libgo/runtime/thread-linux.c(working copy)
@@ -1,20 +0,0 @@
-// Copyright 2009 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-#include "runtime.h"
-#include "defs.h"
-
-// Linux futex.
-
-#include 
-#include 
-#include 
-
-void
-runtime_osinit(void)
-{
-   runtime_ncpu = getproccount();
-   setncpu(runtime_ncpu);
-   setpagesize(getpagesize());
-}
Index: libgo/runtime/thread-sema.c

gcc-5-20170718 is now available

2017-07-18 Thread gccadmin
Snapshot gcc-5-20170718 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/5-20170718/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-5-branch 
revision 250325

You'll find:

 gcc-5-20170718.tar.xz       Complete GCC

  SHA256=bd54b008e0e948d84a8a60a4a2dc97de5b7ea252577cc1d7b3d4c7f1988b14c5
  SHA1=78f7e8fbe2a6ce8d07f54a99ad4d6ebad4244eb0

Diffs from 5-20170711 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Go patch committed: Pass correct 'function' flag to circular_pointer_type

2017-07-18 Thread Ian Lance Taylor
The code in Named_type::do_get_backend in the Go frontend was not
passing the correct flag value for circular function types to
Backend::circular_pointer_type (it was always setting this flag to
false). This patch by Than McIntosh passes a true value if the type
being converted is a function type.  Bootstrapped and ran Go testsuite
on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 250324)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-21775ae119830810d9e415a02e85349f4190c68c
+0036bd04d077f8bbe5aa9a62fb8830c53068209e
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/types.cc
===
--- gcc/go/gofrontend/types.cc  (revision 249799)
+++ gcc/go/gofrontend/types.cc  (working copy)
@@ -10994,13 +10994,13 @@ Named_type::do_get_backend(Gogo* gogo)
   if (this->seen_in_get_backend_)
{
  this->is_circular_ = true;
- return gogo->backend()->circular_pointer_type(bt, false);
+ return gogo->backend()->circular_pointer_type(bt, true);
}
   this->seen_in_get_backend_ = true;
   bt1 = Type::get_named_base_btype(gogo, base);
   this->seen_in_get_backend_ = false;
   if (this->is_circular_)
-   bt1 = gogo->backend()->circular_pointer_type(bt, false);
+   bt1 = gogo->backend()->circular_pointer_type(bt, true);
   if (!gogo->backend()->set_placeholder_pointer_type(bt, bt1))
bt = gogo->backend()->error_type();
   return bt;


Re: [PATCH, rs6000] Rev 2, 1/2 Add x86 MMX <mmintrin.h> intrinsics to GCC PPC64LE target

2017-07-18 Thread Segher Boessenkool
Hi!

On Mon, Jul 17, 2017 at 02:15:00PM -0500, Steven Munroe wrote:
> Correct the problems Segher found in review and added a changes to deal
> with the fallout from the __builtin_cpu_supports warning for older
> distros.
> 
> Tested on P8 LE and P6/P7/P8 BE. No new tests failures.
> 
> ./gcc/ChangeLog:
> 
> 2017-07-17  Steven Munroe  
> 
>   * config.gcc (powerpc*-*-*): Add mmintrin.h.
>   * config/rs6000/mmintrin.h: New file.
>   * config/rs6000/x86intrin.h [__ALTIVEC__]: Include mmintrin.h.

Okay for trunk.  Thanks,


Segher


[Bug go/81324] libgo does not build with glibc 2.18

2017-07-18 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81324

Ian Lance Taylor  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Ian Lance Taylor  ---
Should be fixed now, I hope.

Re: [PATCH rs6000] Fix up BMI/BMI2 intrinsic DG tests

2017-07-18 Thread Steven Munroe
On Tue, 2017-07-18 at 16:54 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Jul 17, 2017 at 01:28:20PM -0500, Steven Munroe wrote:
> > After a recent GCC change the previously submitted BMI/BMI2 intrinsic
> > test started to fail with the following warning/error.
> > 
> > ppc_cpu_supports_hw_available122373.c: In function 'main':
> > ppc_cpu_supports_hw_available122373.c:9:10: warning:
> > __builtin_cpu_supports needs GLIBC (2.23 and newer) that exports
> > hardware capability bits
> > 
> > This does not occur on systems with the newer (2.23) GLIBC but is common
> > on older (stable) distros.
> > 
> > As this is coming from the bmi-check.h and bmi2-check.h includes (and
> > not the tests directly) it seems simpler to simply skip the test unless
> > __BUILTIN_CPU_SUPPORTS__ is defined.
> 
> So this will skip on most current systems; is there no reasonable
> way around that?
> 
The workaround would be to add an #else leg where we obtain the address
of the auxv and then scan for the AT_PLATFORM, AT_HWCAP, and AT_HWCAP2
entries. Then perform the required string compares and / or bit tests.

> Okay otherwise.  One typo thing:
> 
> > 2017-07-17  Steven Munroe  
> > 
> > *gcc.target/powerpc/bmi-check.h (main): Skip unless
> > __BUILTIN_CPU_SUPPORTS__ defined.
> > *gcc.target/powerpc/bmi2-check.h (main): Skip unless
> > __BUILTIN_CPU_SUPPORTS__ defined.
> 
> There should be a space after the asterisks.
> 
> 
> Segher
> 




[Bug go/81324] libgo does not build with glibc 2.18

2017-07-18 Thread ian at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81324

--- Comment #2 from ian at gcc dot gnu.org  ---
Author: ian
Date: Tue Jul 18 22:06:31 2017
New Revision: 250324

URL: https://gcc.gnu.org/viewcvs?rev=250324&root=gcc&view=rev
Log:
PR go/81324
sysinfo.c: ignore ptrace_peeksiginfo_args from <linux/ptrace.h>

With some versions of glibc and GNU/Linux ptrace_peeksiginfo_args is
defined in both <sys/ptrace.h> and <linux/ptrace.h>. We don't actually
care about the struct, so use a #define to avoid a redefinition error.

This fixes https://gcc.gnu.org/PR81324.

Reviewed-on: https://go-review.googlesource.com/49290

Modified:
trunk/gcc/go/gofrontend/MERGE
trunk/libgo/sysinfo.c

libgo patch committed: Ignore ptrace_peeksiginfo_args from <linux/ptrace.h>

2017-07-18 Thread Ian Lance Taylor
This patch should fix PR 81324 filed against libgo.  With some
versions of glibc and GNU/Linux ptrace_peeksiginfo_args is defined in
both <sys/ptrace.h> and <linux/ptrace.h>. We don't actually care about
the struct, so use a #define to avoid a redefinition error.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 250217)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-2ae6bf76f97f7d4c63a1f0ad0683b9ba62baaf06
+21775ae119830810d9e415a02e85349f4190c68c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/sysinfo.c
===
--- libgo/sysinfo.c (revision 249799)
+++ libgo/sysinfo.c (working copy)
@@ -106,9 +106,13 @@
 /* Avoid https://sourceware.org/bugzilla/show_bug.cgi?id=762 .  */
 #define ia64_fpreg pt_ia64_fpreg
 #define pt_all_user_regs pt_ia64_all_user_regs
+/* Avoid redefinition of ptrace_peeksiginfo from <linux/ptrace.h>.
+   https://gcc.gnu.org/PR81324 .  */
+#define ptrace_peeksiginfo_args ignore_ptrace_peeksiginfo_args
 #include <sys/ptrace.h>
 #undef ia64_fpreg
 #undef pt_all_user_regs
+#undef ptrace_peeksiginfo_args
 #endif
 #if defined(HAVE_LINUX_RTNETLINK_H)
 #include <linux/rtnetlink.h>


Re: [PATCH rs6000] Fix up BMI/BMI2 intrinsic DG tests

2017-07-18 Thread Segher Boessenkool
Hi!

On Mon, Jul 17, 2017 at 01:28:20PM -0500, Steven Munroe wrote:
> After a recent GCC change the previously submitted BMI/BMI2 intrinsic
> test started to fail with the following warning/error.
> 
> ppc_cpu_supports_hw_available122373.c: In function 'main':
> ppc_cpu_supports_hw_available122373.c:9:10: warning:
> __builtin_cpu_supports needs GLIBC (2.23 and newer) that exports
> hardware capability bits
> 
> This does not occur on systems with the newer (2.23) GLIBC but is common
> on older (stable) distros.
> 
> As this is coming from the bmi-check.h and bmi2-check.h includes (and
> not the tests directly) it seems simpler to simply skip the test unless
> __BUILTIN_CPU_SUPPORTS__ is defined.

So this will skip on most current systems; is there no reasonable
way around that?

Okay otherwise.  One typo thing:

> 2017-07-17  Steven Munroe  
> 
>   *gcc.target/powerpc/bmi-check.h (main): Skip unless
>   __BUILTIN_CPU_SUPPORTS__ defined.
>   *gcc.target/powerpc/bmi2-check.h (main): Skip unless
>   __BUILTIN_CPU_SUPPORTS__ defined.

There should be a space after the asterisks.


Segher


[Bug target/81471] [5/6/7/8 Regression] internal compiler error: in curr_insn_transform, at lra-constraints.c:3495

2017-07-18 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81471

Uroš Bizjak  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Uroš Bizjak  ---
Fixed everywhere.

[Bug target/81471] [5/6/7/8 Regression] internal compiler error: in curr_insn_transform, at lra-constraints.c:3495

2017-07-18 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81471

--- Comment #10 from uros at gcc dot gnu.org ---
Author: uros
Date: Tue Jul 18 21:44:03 2017
New Revision: 250322

URL: https://gcc.gnu.org/viewcvs?rev=250322&root=gcc&view=rev
Log:
PR target/81471
* config/i386/i386.md (rorx_immediate_operand): New mode attribute.
(*bmi2_rorx<mode>3_1): Use rorx_immediate_operand as
operand 2 predicate.
(*bmi2_rorxsi3_1_zext): Use const_0_to_31_operand as
operand 2 predicate.
(ror,rol -> rorx splitters): Use const_int_operand as
operand 2 predicate.

testsuite/ChangeLog:

PR target/81471
* gcc.target/i386/pr81471.c: New test.


Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr81471.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/i386.md
branches/gcc-5-branch/gcc/testsuite/ChangeLog

Re: whereis PLUGIN_REGISTER_GGC_CACHES? how to migrate it for GCC v6.x?

2017-07-18 Thread Trevor Saunders
On Tue, Jul 18, 2017 at 03:52:55PM +0800, Leslie Zhai wrote:
> Hi Trevor,
> 
> Thanks for your kind response!
> 
On 2017-07-17 19:51, Trevor Saunders wrote:
> > On Wed, Jul 12, 2017 at 10:12:03AM +0800, Leslie Zhai wrote:
> > > PS: Trevor's email is not available? thanks!
> > Sorry about that, I've left Mozilla and been vacationing for a month, so
> > didn't get to updating MAINTAINERS yet.  Here's a patch doing that.
> Could I use PLUGIN_REGISTER_GGC_ROOTS to take place of
> PLUGIN_REGISTER_GGC_CACHES? https://gcc.gnu.org/ml/gcc/2017-07/msg00052.html

If you are ok with your plugin keeping things alive that would otherwise
be garbage collected because they are no longer needed, then sure, I think
that will work.  Though some things are ggc_free()ed, so you might want to
be careful touching those things that should be collected but you are
keeping alive.  Does that help?

Trev

> 
> > 
> > Trev
> > 
> > commit ff900f40d23f765fd59047a90a7e3ff18cbcbf5a
> > Author: Trevor Saunders 
> > Date:   Mon Jul 17 07:44:50 2017 -0400
> > 
> >  update my entry in MAINTAINERS
> >  ChangeLog:
> >  * MAINTAINERS: Update my email address.
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index a2f24742374..6a314049c42 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -557,7 +557,7 @@ Hariharan Sandanagobalane   
> > 
> >   Iain Sandoe   
> >   Duncan Sands  
> >   Sujoy Saraswati   
> > 
> > -Trevor Saunders
> > +Trevor Saunders
> > 
> >   Aaron Sawdey  
> > 
> >   Roger Sayle   
> >   Will Schmidt  
> > 
> 
> -- 
> Regards,
> Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/
> 
> 
> 


[PATCH], Update cpu-builtin-1.c test on PowerPC

2017-07-18 Thread Michael Meissner
This patch modifies the change I made on July 12th.  It modifies the test for
the __builtin_cpu_is and __builtin_cpu_supports built-in functions to use an
#ifdef instead of target-requires for doing the tests.  One motivation is to
make the back port to GCC 6/7 easier, as I won't have to back port the change
to add the target option ppc_cpu_supports_hw.

I've checked the trunk with compilers built with a new GLIBC and without, and
it passes both compilers.  I also checked the back port to GCC 6/7 and both
work fine as well.

Can I check this patch into the trunk and backports to GCC 6 and 7?

2017-07-18  Michael Meissner  

PR target/81193
* gcc.target/powerpc/cpu-builtin-1.c: Change test to use #ifdef
__BUILTIN_CPU_SUPPORTS to see if the GLIBC is new enough that
__builtin_cpu_is and __builtin_cpu_supports are supported.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/testsuite/gcc.target/powerpc/cpu-builtin-1.c
===
--- gcc/testsuite/gcc.target/powerpc/cpu-builtin-1.c(revision 250316)
+++ gcc/testsuite/gcc.target/powerpc/cpu-builtin-1.c(working copy)
@@ -1,10 +1,14 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-require-effective-target ppc_cpu_supports_hw } */
 
 void
 use_cpu_is_builtins (unsigned int *p)
 {
+  /* If GCC was configured to use an old GLIBC (before 2.23), the
+ __builtin_cpu_is and __builtin_cpu_supports built-in functions return 0,
+ and the compiler issues a warning that you need a newer glibc to use them.
+ Use #ifdef to avoid the warning.  */
+#ifdef __BUILTIN_CPU_SUPPORTS__
   p[0] = __builtin_cpu_is ("power9");
   p[1] = __builtin_cpu_is ("power8");
   p[2] = __builtin_cpu_is ("power7");
@@ -20,11 +24,15 @@ use_cpu_is_builtins (unsigned int *p)
   p[12] = __builtin_cpu_is ("ppc440");
   p[13] = __builtin_cpu_is ("ppc405");
   p[14] = __builtin_cpu_is ("ppc-cell-be");
+#else
+  p[0] = 0;
+#endif
 }
 
 void
 use_cpu_supports_builtins (unsigned int *p)
 {
+#ifdef __BUILTIN_CPU_SUPPORTS__
   p[0] = __builtin_cpu_supports ("4xxmac");
   p[1] = __builtin_cpu_supports ("altivec");
   p[2] = __builtin_cpu_supports ("arch_2_05");
@@ -63,4 +71,7 @@ use_cpu_supports_builtins (unsigned int
   p[35] = __builtin_cpu_supports ("ucache");
   p[36] = __builtin_cpu_supports ("vcrypto");
   p[37] = __builtin_cpu_supports ("vsx");
+#else
+  p[0] = 0;
+#endif
 }


[Bug target/81471] [5/6/7/8 Regression] internal compiler error: in curr_insn_transform, at lra-constraints.c:3495

2017-07-18 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81471

--- Comment #9 from uros at gcc dot gnu.org ---
Author: uros
Date: Tue Jul 18 20:16:47 2017
New Revision: 250319

URL: https://gcc.gnu.org/viewcvs?rev=250319&root=gcc&view=rev
Log:
PR target/81471
* config/i386/i386.md (rorx_immediate_operand): New mode attribute.
(*bmi2_rorx<mode>3_1): Use rorx_immediate_operand as
operand 2 predicate.
(*bmi2_rorxsi3_1_zext): Use const_0_to_31_operand as
operand 2 predicate.
(ror,rol -> rorx splitters): Use const_int_operand as
operand 2 predicate.

testsuite/ChangeLog:

PR target/81471
* gcc.target/i386/pr81471.c: New test.


Added:
branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr81471.c
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/config/i386/i386.md
branches/gcc-6-branch/gcc/testsuite/ChangeLog

[Bug rtl-optimization/68988] reload_pseudo_compare_func violates qsort requirements

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68988

Yury Gribov  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-18
 CC||ygribov at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Yury Gribov  ---
Relevant upstream commit:
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00898.html

[Bug sanitizer/80027] ASAN breaks DT_RPATH $ORIGIN in dlopen()

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80027

Yury Gribov  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
URL||https://bugs.llvm.org//show
   ||_bug.cgi?id=27790
 CC||ygribov at gcc dot gnu.org
 Resolution|--- |MOVED

--- Comment #9 from Yury Gribov  ---
Closing, the bug is in upstream libasan and should be fixed there.

[Bug sanitizer/78654] ubsan can lead to excessive stack usage

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78654

Yury Gribov  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ygribov at gcc dot gnu.org
 Resolution|--- |INVALID

--- Comment #6 from Yury Gribov  ---
Closing, the overhead is unavoidable.

[Bug sanitizer/55316] gcc/libsanitizer/asan/asan_linux.cc:70:3: error: #error "Unsupported arch"

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55316

Yury Gribov  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ygribov at gcc dot gnu.org
 Resolution|--- |INVALID

--- Comment #5 from Yury Gribov  ---
Closing, support for hppa should be added upstream first.

[Bug sanitizer/61693] [asan] is not intercepting aligned_alloc

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61693

Yury Gribov  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 CC||ygribov at gcc dot gnu.org
  Known to work||5.0
 Resolution|--- |FIXED
   Target Milestone|--- |5.0

--- Comment #6 from Yury Gribov  ---
Fixed.

[Bug sanitizer/63245] renderMemorySnippet shouldn't show more bytes than the underlying type

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63245

Yury Gribov  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ygribov at gcc dot gnu.org
  Known to work||5.0
 Resolution|--- |FIXED
   Target Milestone|--- |5.0

--- Comment #4 from Yury Gribov  ---
Fixed.

[Bug middle-end/41992] ICE on invalid dereferencing of void *

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41992

Yury Gribov  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||ygribov at gcc dot gnu.org
 Resolution|--- |FIXED
   Target Milestone|--- |4.7.0

--- Comment #5 from Yury Gribov  ---
Fixed.

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-07-18 Thread Florian Weimer
* Jeff Law:

> On 06/28/2017 12:45 AM, Florian Weimer wrote:
>> * Richard Earnshaw:
>> 
>>> I can't help but feel there's a bit of a goode olde mediaeval witch hunt
>>> going on here.  As Wilco points out, we can never defend against a
>>> function that is built without probe operations but skips the entire
>>> guard zone.  The only defence there is a larger guard zone, but how big
>>> do you make it?
>> 
>> Right.  And in the exploitable cases we have seen, there is a
>> dynamically sized allocation which the attacker can influence, so it
>> seems fairly likely that in a partially hardended binary, there could
>> be another call stack which is exploitable, with a non-hardened
>> function at the top.
>> 
>> I think a probing scheme which assumes that if the caller moves the
>> stack pointer into more than half of the guard area, that's the
>> callers fault would be totally appropriate in practice.  If possible,
>> callee-only probing for its own stack usage is preferable, but not if
>> it means instrumenting all functions which use the stack.

> That position is a surprise Florian :-)  I would have expected a full
> protection position, particularly after the discussions we've had about
> noreturn functions.

I might have gotten carried away on that one.

I really want stack probing to be enabled by default across the board,
so this becomes a non-issue because the caller has been compiled with
probing as well.  However, in order to get there, we need extremely
cheap instrumentation, and if we cover any hypothetical corner case,
this might force us to instrument all functions, and that again
defeats the goal of enabling it by default.

Does that make sense?

> I guess the difference in your position is driven by the relatively high
> frequency of probing worst case assumptions are going to have on aarch64
> with a relatively small vulnerability surface?

Right, and I expect that the limited form of probing can be enabled by
default, so that eventually, the caller will take care of its share of
probing (i.e., it has never moved the stack pointer more than half
into the guard page, or whatever caller/callee split of
responsibilities we come up with).

> Which is a fairly stark contrast to the noreturn situation where it
> rarely, if ever comes up in practice and never on a hot path?

I've since researched the noreturn situation a bit more.  We never
turn noreturn functions into tail calls because the intent is to
preserve the call stack, in the expectation that either the noreturn
function itself performs a backtrace, or that someone later looks at
the coredump.  So the noreturn risk just doesn't seem to be there.


Re: [PATCH 4/6] lra-assigns.c: give up on qsort checking in assign_by_spills

2017-07-18 Thread Yuri Gribov
On Sat, Jul 15, 2017 at 9:47 PM, Alexander Monakov  wrote:
> The reload_pseudo_compare_func comparator, when used from assign_by_spills,
> can be non-transitive, indicating A < B < C < A if both A and C satisfy
> !bitmap_bit_p (&non_reload_pseudos, rAC), but B does not.
>
> This function was originally a proper comparator, and the problematic
> clause was added to fix PR 57878:
> https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00732.html
>
> That the comparator is invalid implies that that PR, if it still exists,
> can reappear (but probably under more complicated circumstances).
>
> This looks like a sensitive area, so disabling checking is the only
> obvious approach.

May make sense to add PR rtl-optimization/68988 annotation to changelog.

> * lra-assigns.c (reload_pseudo_compare_func): Add a FIXME.
> (assign_by_spills): Use non-checking qsort.
> ---
>  gcc/lra-assigns.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
> index 2aadeef..a67d1a6 100644
> --- a/gcc/lra-assigns.c
> +++ b/gcc/lra-assigns.c
> @@ -217,6 +217,7 @@ reload_pseudo_compare_func (const void *v1p, const void *v2p)
>/* The code below executes rarely as nregs == 1 in most cases.
>  So we should not worry about using faster data structures to
>  check reload pseudos.  */
> +  /* FIXME this makes comparator non-transitive and thus invalid.  */
>   && ! bitmap_bit_p (&non_reload_pseudos, r1)
>   && ! bitmap_bit_p (&non_reload_pseudos, r2))
>  return diff;
> @@ -1384,7 +1385,7 @@ assign_by_spills (void)
>   bitmap_ior_into (&non_reload_pseudos, &lra_optional_reload_pseudos);
>for (iter = 0; iter <= 1; iter++)
>  {
> -  qsort (sorted_pseudos, n, sizeof (int), reload_pseudo_compare_func);
> +  qsort_nochk (sorted_pseudos, n, sizeof (int), reload_pseudo_compare_func);
>nfails = 0;
>for (i = 0; i < n; i++)
> {
> --
> 1.8.3.1
>


[Bug driver/67425] -frandom-seed documentation doesn't match code, incomplete

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67425

Yury Gribov  changed:

   What|Removed |Added

   Target Milestone|--- |6.0

[Bug sanitizer/59600] no_sanitize_address mishandled when function is inlined

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600

Yury Gribov  changed:

   What|Removed |Added

   Target Milestone|--- |4.9.0

[PATCH 2/2] combine: Fix for PR81423

2017-07-18 Thread Segher Boessenkool
We here have an AND of a SUBREG of an LSHIFTRT.  If that SUBREG is
paradoxical, the extraction we form is the length of the size of the
inner mode, which includes some bits that should not be in the result.
Just give up in that case.

Tested on powerpc64-linux {-m32,-m64} and on x86_64-linux.  Committing
to trunk.


Segher


2017-07-18  Segher Boessenkool  

PR rtl-optimization/81423
* combine.c (make_compound_operation_int): Don't try to optimize
the AND of a SUBREG of an LSHIFTRT if that SUBREG is paradoxical.

---
 gcc/combine.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index c5200db..c486f12 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7990,18 +7990,9 @@ make_compound_operation_int (machine_mode mode, rtx *x_ptr,
 XEXP (inner_x0, 1),
 i, 1, 0, in_code == COMPARE);
 
- if (new_rtx)
-   {
- /* If we narrowed the mode when dropping the subreg, then
-we must zero-extend to keep the semantics of the AND.  */
- if (GET_MODE_SIZE (inner_mode) >= GET_MODE_SIZE (mode))
-   ;
- else if (SCALAR_INT_MODE_P (inner_mode))
-   new_rtx = simplify_gen_unary (ZERO_EXTEND, mode,
- new_rtx, inner_mode);
- else
-   new_rtx = NULL;
-   }
+ /* If we narrowed the mode when dropping the subreg, then we lose.  */
+ if (GET_MODE_SIZE (inner_mode) < GET_MODE_SIZE (mode))
+   new_rtx = NULL;
 
  /* If that didn't give anything, see if the AND simplifies on
 its own.  */
-- 
1.9.3



[PATCH 1/2] simplify-rtx: The truncation of an IOR can have all bits set (PR81423)

2017-07-18 Thread Segher Boessenkool
... if it is an IOR with a constant with all bits set in the mode
that is truncated to, for example.  Handle that case.

With this patch the problematic situation for the PR81423 testcase
isn't even reached; but the next patch fixes that anyway.

Bootstrapped and tested on powerpc64-linux {-m32,-m64} and on
x86_64-linux.  Is this okay for trunk?


Segher


2017-07-18  Segher Boessenkool  

PR rtl-optimization/81423
* simplify-rtx.c (simplify_truncation): Handle truncating an IOR
with a constant that is -1 in the truncated to mode.

---
 gcc/simplify-rtx.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 3bce329..ef41479 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -857,6 +857,15 @@ simplify_truncation (machine_mode mode, rtx op,
 return simplify_gen_unary (TRUNCATE, mode, XEXP (op, 0),
   GET_MODE (XEXP (op, 0)));
 
+  /* (truncate:A (ior X C)) is (const_int -1) if C is equal to that already,
+ in mode A.  */
+  if (GET_CODE (op) == IOR
+  && SCALAR_INT_MODE_P (mode)
+  && SCALAR_INT_MODE_P (op_mode)
+  && CONST_INT_P (XEXP (op, 1))
+  && trunc_int_for_mode (INTVAL (XEXP (op, 1)), mode) == -1)
+return constm1_rtx;
+
   return NULL_RTX;
 }
 
-- 
1.9.3



[Bug sanitizer/59600] no_sanitize_address mishandled when function is inlined

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600

Yury Gribov  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ygribov at gcc dot gnu.org
  Known to work||4.9.0
 Resolution|--- |FIXED

--- Comment #12 from Yury Gribov  ---
Fixed.

[Bug driver/67425] -frandom-seed documentation doesn't match code, incomplete

2017-07-18 Thread ygribov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67425

ygribov at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
  Known to work||5.3.1
 Resolution|--- |FIXED

--- Comment #6 from ygribov at gcc dot gnu.org ---
Fixed.

[Bug middle-end/81464] [8 Regression] ICE in expand_omp_for_static_chunk, at omp-expand.c:4236

2017-07-18 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81464

Tom de Vries  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
  Component|web |middle-end
 Resolution|--- |FIXED

--- Comment #7 from Tom de Vries  ---
Patch with test-case committed, marking resolved fixed.

Re: [PATCH, PR81464] Handle equal-argument loop exit phi in expand_omp_for_static_chunk

2017-07-18 Thread Tom de Vries

On 07/18/2017 06:59 PM, Jakub Jelinek wrote:

On Tue, Jul 18, 2017 at 06:48:56PM +0200, Tom de Vries wrote:

Hi,

this patch fixes PR81464, an ICE in ompexpssa.

The ICE occurs in expand_omp_for_static_chunk when we're trying to fix up a
loop exit phi:
...
   # .MEM_88 = PHI <.MEM_86(46), .MEM_86(71)>
...


That is something that should be cleaned up by some phi opt, but if it has
been introduced during the parloops pass or too early before that, we
probably should deal with it.



I checked, it's introduced during the parloops pass.


--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -4206,6 +4206,10 @@ expand_omp_for_static_chunk (struct omp_region *region,
  source_location locus;
  
  	  phi = psi.phi ();

+ if (operand_equal_p (gimple_phi_arg_def (phi, 0),
+  redirect_edge_var_map_def (vm), 0))
+ continue;


Wrong formatting, please remove 2 spaces before continue;

Otherwise LGTM.



Updated and committed.

Thanks,
- Tom


Re: [PATCH] PR libstdc++/81064 fix versioned namespace

2017-07-18 Thread François Dumont

On 18/07/2017 16:03, Ville Voutilainen wrote:

On 18 July 2017 at 16:31, Jonathan Wakely  wrote:

This is quite a huge change, so I'd like to wait and see if anyone
else has any opinion on it.

Personally I think it's necessary (assuming I understand the PR
correctly) and so if nobody objects I think we should go with this
change for GCC 8. Let's give it a few days for comments (and I'll
finish going through the patch carefully).


Looks like the right approach to me. I haven't looked at the patch in
detail, but the main gist
of it is something that we should certainly do for GCC 8. The Elf says "aye".


Thanks for the feedback.

However I've been a little bit too confident regarding its validation: 
there are unexpected failures when the versioned namespace is activated.


Most of them are related to its usage with the experimental namespace. 
I haven't yet fully considered it, but just in case: do we really need 
to have the versioned namespace on top of the experimental namespace?


François




[Bug target/81471] [5/6/7/8 Regression] internal compiler error: in curr_insn_transform, at lra-constraints.c:3495

2017-07-18 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81471

--- Comment #8 from uros at gcc dot gnu.org ---
Author: uros
Date: Tue Jul 18 18:28:12 2017
New Revision: 250317

URL: https://gcc.gnu.org/viewcvs?rev=250317&root=gcc&view=rev
Log:
PR target/81471
* config/i386/i386.md (rorx_immediate_operand): New mode attribute.
(*bmi2_rorx<mode>3_1): Use rorx_immediate_operand as
operand 2 predicate.
(*bmi2_rorxsi3_1_zext): Use const_0_to_31_operand as
operand 2 predicate.
(ror,rol -> rorx splitters): Use const_int_operand as
operand 2 predicate.

testsuite/ChangeLog:

PR target/81471
* gcc.target/i386/pr81471.c: New test.


Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr81471.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/i386/i386.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog

Re: [PATCH] Kill TYPE_METHODS debug 1/9

2017-07-18 Thread Jim Wilson

On 07/14/2017 09:48 AM, Nathan Sidwell wrote:

This changes dbxout and dwarf2out.



Oh, the patch series survived a bootstrap on x86_64-linux.


Changes to the debug info files require a gdb make check with and 
without the patch to check for regressions.  Since you are changing both 
dbxout and dwarf2out, you would need to do this twice, once for each 
debug info type.  Testing dbxout may be a little tricky, since few 
systems still use this by default.  Maybe you can hack an x86_64-linux 
build to use dbxout by default to test it.


Otherwise, this looks OK.

Jim




Re: [PATCH 1/3] matching tokens: c-family parts

2017-07-18 Thread Marek Polacek
On Tue, Jul 11, 2017 at 11:24:43AM -0400, David Malcolm wrote:
> OK for trunk? (assuming the rest is approved)
 
This is ok.  I'll have to play with this some more before I approve the C part.

Thanks,

Marek


Re: [PATCH] match.pd: reassociate multiplications with constants

2017-07-18 Thread Alexander Monakov
On Mon, 17 Jul 2017, Alexander Monakov wrote:
> On Mon, 17 Jul 2017, Marc Glisse wrote:
> > > +/* Combine successive multiplications.  Similar to above, but handling
> > > +   overflow is different.  */
> > > +(simplify
> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> > > + (with {
> > > +   bool overflow_p;
> > > +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
> > > +  }
> > > +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
> > 
> > I wonder if there are cases where this would cause trouble for saturating
> > integers. The only case I can think of is when @2 is -1, but that's likely
> > simplified to NEGATE_EXPR first.
> 
> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this
> transform for either saturating or sanitized types, just like in the
> first patch.  I think wrapping the 'with' with
> 'if (!integer_minus_onep (@2) && !integer_zerop (@2))' works, since as
> you say it should become a negate/zero anyway?

Updated patch:

* match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2).
testsuite:
* gcc.dg/tree-ssa/assoc-2.c: Enhance.
* gcc.dg/tree-ssa/slsr-4.c: Adjust.

diff --git a/gcc/match.pd b/gcc/match.pd
index 36045f1..0bb5541 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
  { build_zero_cst (type); })

+/* Combine successive multiplications.  Similar to above, but handling
+   overflow is different.  */
+(simplify
+ (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
+ /* More specific rules can handle 0 and -1; skip them here to avoid
+wrong transformations for sanitized and saturating types.  */
+ (if (!integer_zerop (@2) && !integer_minus_onep (@2))
+  (with {
+bool overflow_p;
+    wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
+   }
+   (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
+    (mult @0 { wide_int_to_tree (type, mul); })))))
+
 /* Optimize A / A to 1.0 if we don't care about
NaNs or Infinities.  */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
index a92c882..cc0e9d4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
@@ -5,4 +5,15 @@ int f0(int a, int b){
   return a * 33 * b * 55;
 }

-/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
+int f1(int a){
+  a *= 33;
+  return a * 55;
+}
+
+int f2(int a, int b){
+  a *= 33;
+  return a * b * 55;
+}
+
+/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
index 17d7b4c..1e943b7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
@@ -23,13 +23,9 @@ f (int i)
   foo (y);
 }
 
-/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
+/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
 /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */


Re: [PATCH, PR81464] Handle equal-argument loop exit phi in expand_omp_for_static_chunk

2017-07-18 Thread Jakub Jelinek
On Tue, Jul 18, 2017 at 06:48:56PM +0200, Tom de Vries wrote:
> Hi,
> 
> this patch fixes PR81464, an ICE in ompexpssa.
> 
> The ICE occurs in expand_omp_for_static_chunk when we're trying to fix up a
> loop exit phi:
> ...
>   # .MEM_88 = PHI <.MEM_86(46), .MEM_86(71)>
> ...

That is something that should be cleaned up by some phi opt, but if it has
been introduced during the parloops pass or too early before that, we
probably should deal with it.

> --- a/gcc/omp-expand.c
> +++ b/gcc/omp-expand.c
> @@ -4206,6 +4206,10 @@ expand_omp_for_static_chunk (struct omp_region *region,
> source_location locus;
>  
> phi = psi.phi ();
> +   if (operand_equal_p (gimple_phi_arg_def (phi, 0),
> +redirect_edge_var_map_def (vm), 0))
> +   continue;

Wrong formatting, please remove 2 spaces before continue;

Otherwise LGTM.

Jakub


Re: [PATCH] Implement one optimization from build_range_check in match.pd (PR tree-optimization/81346)

2017-07-18 Thread Jakub Jelinek
On Tue, Jul 18, 2017 at 10:47:37AM -0600, Martin Sebor wrote:
> On 07/18/2017 09:43 AM, Jakub Jelinek wrote:
> > On Tue, Jul 18, 2017 at 09:31:11AM -0600, Martin Sebor wrote:
> > > > --- gcc/match.pd.jj 2017-07-17 16:25:20.0 +0200
+++ gcc/match.pd	2017-07-18 12:32:52.896924558 +0200
> > > > @@ -1125,6 +1125,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > > && wi::neg_p (@1, TYPE_SIGN (TREE_TYPE (@1
> > > >  (cmp @2 @0))
> > > > 
> > > > +/* (X - 1U) <= INT_MAX-1U into (int) X > 0.  */
> > > 
> > > Since the transformation applies to other types besides int I
> > > suggest to make it clear in the comment.  E.g., something like:
> > > 
> > >   /* (X - 1U) <= TYPE_MAX - 1U into (TYPE) X > 0 for any integer
> > >  TYPE.  */
> > > 
> > > (with spaces around all the operators as per GCC coding style).
> > 
> > I think many of the match.pd comments are also not fully generic
> > to describe what it does, just to give an idea what it does.
> ...
> > Examples of other comments that "suffer" from similar lack of sufficient
> > genericity, but perhaps are good enough to let somebody understand it
> > quickly:
> 
> Sure, but that doesn't make them a good example to follow.  As
> someone pointed out to me in code reviews, existing deviations
> from the preferred style, whether documented or not, or lack of
> clarity, aren't a license to add more.  Please take my suggestion
> here in the same constructive spirit.

The point I'm trying to make is that in order to make the
comments generic enough they will be too large and too hard to parse.
IMHO sometimes it is better to just give an example of what it
does, and those who want to read all the details on what exactly it does,
there is the simplify below it with all the details.
Consider another randomly chosen comment:
 /* hypot(x,x) -> fabs(x)*sqrt(2).  */
This also isn't describing generically what it does, because
it handles not just hypot -> fabs*sqrt, but also hypotl -> fabsl*sqrtl,
hypotf -> fabsf*sqrtf, maybe others.

In the end, it is Richard's call on what he wants to have in match.pd
comments.

Jakub


[Bug libstdc++/81476] severe slow-down with range-v3 library compared to clang

2017-07-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81476

Jonathan Wakely  changed:

   What|Removed |Added

 Status|WAITING |NEW
  Component|c++ |libstdc++

--- Comment #9 from Jonathan Wakely  ---
Thanks. I see similar performance using gcc or clang, so the difference is
libstdc++ vs libc++ not gcc vs clang.

[Bug web/81464] [8 Regression] ICE in expand_omp_for_static_chunk, at omp-expand.c:4236

2017-07-18 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81464

Tom de Vries  changed:

   What|Removed |Added

   Keywords||patch
  Component|middle-end  |web

--- Comment #6 from Tom de Vries  ---
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01087.html

[PATCH, PR81464] Handle equal-argument loop exit phi in expand_omp_for_static_chunk

2017-07-18 Thread Tom de Vries

Hi,

this patch fixes PR81464, an ICE in ompexpssa.

The ICE occurs in expand_omp_for_static_chunk when we're trying to fix 
up a loop exit phi:

...
  # .MEM_88 = PHI <.MEM_86(46), .MEM_86(71)>
...

It's a loop exit phi with equal arguments, which means that the variable 
has the same value whether the loop is executed or skipped; in other 
words, it's not modified in the loop.


The fixup code ICEs when it cannot find a loop header phi corresponding 
to the loop exit phi. But it's expected that there's no loop header phi, 
given that the variable is not modified in the loop.


The patch fixes the ICE by not trying to fix up this particular kind of 
loop exit phi.


Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom
Handle equal-argument loop exit phi in expand_omp_for_static_chunk

2017-07-18  Tom de Vries  

	PR middle-end/81464
	* omp-expand.c (expand_omp_for_static_chunk): Handle equal-argument loop
	exit phi.

	* gfortran.dg/pr81464.f90: New test.

---
 gcc/omp-expand.c                      |  4 ++++
 gcc/testsuite/gfortran.dg/pr81464.f90 | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index 929c530..63b91d7 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -4206,6 +4206,10 @@ expand_omp_for_static_chunk (struct omp_region *region,
 	  source_location locus;
 
 	  phi = psi.phi ();
+	  if (operand_equal_p (gimple_phi_arg_def (phi, 0),
+			   redirect_edge_var_map_def (vm), 0))
+	  continue;
+
 	  t = gimple_phi_result (phi);
 	  gcc_assert (t == redirect_edge_var_map_result (vm));
 
diff --git a/gcc/testsuite/gfortran.dg/pr81464.f90 b/gcc/testsuite/gfortran.dg/pr81464.f90
new file mode 100644
index 0000000..425cae9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr81464.f90
@@ -0,0 +1,19 @@
+! { dg-do compile }
+! { dg-options "--param parloops-chunk-size=2 -ftree-parallelize-loops=2 -O1" }
+
+program main
+  implicit none
+  real, dimension(:,:),allocatable :: a, b, c
+  real :: sm
+
+  allocate (a(2,2), b(2,2), c(2,2))
+
+  call random_number(a)
+  call random_number(b)
+
+  c = matmul(a,b)
+  sm = sum(c)
+
+  deallocate(a,b,c)
+
+end program main


[Bug c++/81476] severe slow-down with range-v3 library compared to clang

2017-07-18 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81476

--- Comment #8 from Alexander Monakov  ---
Yeah, the

  target.insert(target.cbegin(), ranges::begin(concatenated),
ranges::end(concatenated));

appears to cause a bad case of Schlemiel-The-Painter, for each inserted char
the tail of target is memmove'd by 1 byte to the right...

Re: [PATCH] Implement one optimization from build_range_check in match.pd (PR tree-optimization/81346)

2017-07-18 Thread Martin Sebor

On 07/18/2017 09:43 AM, Jakub Jelinek wrote:

On Tue, Jul 18, 2017 at 09:31:11AM -0600, Martin Sebor wrote:

--- gcc/match.pd.jj 2017-07-17 16:25:20.0 +0200
+++ gcc/match.pd	2017-07-18 12:32:52.896924558 +0200
@@ -1125,6 +1125,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& wi::neg_p (@1, TYPE_SIGN (TREE_TYPE (@1
 (cmp @2 @0))

+/* (X - 1U) <= INT_MAX-1U into (int) X > 0.  */


Since the transformation applies to other types besides int I
suggest to make it clear in the comment.  E.g., something like:

  /* (X - 1U) <= TYPE_MAX - 1U into (TYPE) X > 0 for any integer
 TYPE.  */

(with spaces around all the operators as per GCC coding style).


I think many of the match.pd comments are also not fully generic
to describe what it does, just to give an idea what it does.

...

Examples of other comments that "suffer" from similar lack of sufficient
genericity, but perhaps are good enough to let somebody understand it
quickly:


Sure, but that doesn't make them a good example to follow.  As
someone pointed out to me in code reviews, existing deviations
from the preferred style, whether documented or not, or lack of
clarity, aren't a license to add more.  Please take my suggestion
here in the same constructive spirit.

Martin


Bug in lto-plugin.c ?

2017-07-18 Thread Georg-Johann Lay

Hi, I tried to build a canadian cross, configured with
--build=x86_64-linux-gnu
--host=i686-w64-mingw32
--target=avr

While the result appears to work under wine, I am getting the
following error from ld in a non-LTO compile + link:

e:/winavr/8.0_2017-07-18/bin/../lib/gcc/avr/8.0.0/../../../../avr/bin/ld.exe: 
error: asprintf failed


After playing around I found that -fno-use-linker-plugin avoids that
message, and I speculated that the error is emitted by lto-plugin.c

In claim_file_handler() we have:


  /* We pass the offset of the actual file, not the archive header.
     Can't use PRIx64, because that's C99, so we have to print the
     64-bit hex int as two 32-bit ones. */
  int lo, hi, t;
  lo = file->offset & 0xffffffff;
  hi = ((int64_t)file->offset >> 32) & 0xffffffff;
  t = hi ? asprintf (&objname, "%s@0x%x%08x", file->name, lo, hi)
         : asprintf (&objname, "%s@0x%x", file->name, lo);
  check (t >= 0, LDPL_FATAL, "asprintf failed");


If hi != 0, then why is hi printed at the low end? Shouldn't hi and lo
be swapped like so

  t = hi ? asprintf (&objname, "%s@0x%x%08x", file->name, hi, lo)

if this is supposed to yield a 64-bit print?

What else could lead to an "asprintf failed"?  Unfortunately I have
no idea how to debug that on the host...

Johann



Re: [PATCH] Add self as maintainer of D front-end and libphobos

2017-07-18 Thread Gerald Pfeifer
On Thu, 13 Jul 2017, Iain Buclaw wrote:
>> As per message on the D language being accepted, this adds myself as a
>> maintainer of the D front-end and libphobos runtime library.
> I may have to request a ping here.

I would commit this when the first bits of D go in.  (Technically you 
could commit this now, though, I guess; that's just my recommendation.)

Gerald


[Bug c++/81476] severe slow-down with range-v3 library compared to clang

2017-07-18 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81476

--- Comment #7 from Markus Trippelsdorf  ---
Ah:

-   99.79% 0.02%  a.out  a.out[.] std::vector::_M_range_insert::_M_range_insert::insert
   99.63% memcpy@@GLIBC_2.14

[Bug c++/81476] severe slow-down with range-v3 library compared to clang

2017-07-18 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81476

Markus Trippelsdorf  changed:

   What|Removed |Added

 CC||trippels at gcc dot gnu.org

--- Comment #6 from Markus Trippelsdorf  ---
Almost all the time is spent in memcpy.
Clang with libstdc++ is equally bad.
Gcc with "g++ -c -nostdinc++ -nodefaultlibs -lc -isystem /usr/include/c++/v1/
-lm -lc++ -lgcc_s -O3 -I./ test.cpp" is fine.

Not sure where the memcpy calls come from.

[PATCH, i386]: Fix PR 81471, ICE in curr_insn_transform

2017-07-18 Thread Uros Bizjak
Hello!

Attached patch tightens the rorx operand 2 predicate to allow only
const_int RTXes that are also allowed by the operand constraint.  This
prevents combine from propagating unsupported const_ints into the pattern.

2017-07-18  Uros Bizjak  

PR target/81471
* config/i386/i386.md (rorx_immediate_operand): New mode attribute.
(*bmi2_rorx<mode>3_1): Use rorx_immediate_operand as
operand 2 predicate.
(*bmi2_rorxsi3_1_zext): Use const_0_to_31_operand as
operand 2 predicate.
(ror,rol -> rorx splitters): Use const_int_operand as
operand 2 predicate.

testsuite/ChangeLog:

2017-07-18  Uros Bizjak  

PR target/81471
* gcc.target/i386/pr81471.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN, will be backported to other release branches.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 250278)
+++ config/i386/i386.md (working copy)
@@ -10732,10 +10732,15 @@
  split_double_mode (<DWI>mode, &operands[0], 1, &operands[4], &operands[5]);
 })
 
+(define_mode_attr rorx_immediate_operand
+   [(SI "const_0_to_31_operand")
+(DI "const_0_to_63_operand")])
+
(define_insn "*bmi2_rorx<mode>3_1"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
-   (rotatert:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "rm")
-   (match_operand:QI 2 "immediate_operand" "")))]
+   (rotatert:SWI48
+ (match_operand:SWI48 1 "nonimmediate_operand" "rm")
+	  (match_operand:QI 2 "<rorx_immediate_operand>" "")))]
   "TARGET_BMI2"
   "rorx\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "rotatex")
@@ -10778,7 +10783,7 @@
 (define_split
   [(set (match_operand:SWI48 0 "register_operand")
(rotate:SWI48 (match_operand:SWI48 1 "nonimmediate_operand")
- (match_operand:QI 2 "immediate_operand")))
+ (match_operand:QI 2 "const_int_operand")))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI2 && reload_completed"
   [(set (match_dup 0)
@@ -10792,7 +10797,7 @@
 (define_split
   [(set (match_operand:SWI48 0 "register_operand")
(rotatert:SWI48 (match_operand:SWI48 1 "nonimmediate_operand")
-   (match_operand:QI 2 "immediate_operand")))
+   (match_operand:QI 2 "const_int_operand")))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI2 && reload_completed"
   [(set (match_dup 0)
@@ -10802,7 +10807,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
(zero_extend:DI
  (rotatert:SI (match_operand:SI 1 "nonimmediate_operand" "rm")
-		       (match_operand:QI 2 "immediate_operand" "I"))))]
+		       (match_operand:QI 2 "const_0_to_31_operand" "I"))))]
   "TARGET_64BIT && TARGET_BMI2"
   "rorx\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "rotatex")
@@ -10846,7 +10851,7 @@
   [(set (match_operand:DI 0 "register_operand")
(zero_extend:DI
  (rotate:SI (match_operand:SI 1 "nonimmediate_operand")
-		     (match_operand:QI 2 "immediate_operand"))))
+		     (match_operand:QI 2 "const_int_operand"))))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && TARGET_BMI2 && reload_completed"
   [(set (match_dup 0)
@@ -10861,7 +10866,7 @@
   [(set (match_operand:DI 0 "register_operand")
(zero_extend:DI
  (rotatert:SI (match_operand:SI 1 "nonimmediate_operand")
-		       (match_operand:QI 2 "immediate_operand"))))
+		       (match_operand:QI 2 "const_int_operand"))))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && TARGET_BMI2 && reload_completed"
   [(set (match_dup 0)
Index: testsuite/gcc.target/i386/pr81471.c
===
--- testsuite/gcc.target/i386/pr81471.c (nonexistent)
+++ testsuite/gcc.target/i386/pr81471.c (working copy)
@@ -0,0 +1,13 @@
+/* PR target/81471 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mbmi2" } */
+
+static inline unsigned int rotl (unsigned int x, int k)
+{
+  return (x << k) | (x >> (32 - k));
+}
+
+unsigned long long test (unsigned int z)
+{
+  return rotl (z, 55);
+}


[Bug target/81471] [5/6/7/8 Regression] internal compiler error: in curr_insn_transform, at lra-constraints.c:3495

2017-07-18 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81471

--- Comment #7 from uros at gcc dot gnu.org ---
Author: uros
Date: Tue Jul 18 16:10:20 2017
New Revision: 250315

URL: https://gcc.gnu.org/viewcvs?rev=250315&root=gcc&view=rev
Log:
PR target/81471
* config/i386/i386.md (rorx_immediate_operand): New mode attribute.
(*bmi2_rorx<mode>3_1): Use rorx_immediate_operand as
operand 2 predicate.
(*bmi2_rorxsi3_1_zext): Use const_0_to_31_operand as
operand 2 predicate.
(ror,rol -> rorx splitters): Use const_int_operand as
operand 2 predicate.

testsuite/ChangeLog:

PR target/81471
* gcc.target/i386/pr81471.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/i386/pr81471.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.md
trunk/gcc/testsuite/ChangeLog

[PATCH] Fix infinite recursion with div-by-zero (PR middle-end/70992)

2017-07-18 Thread Marek Polacek
We ended up in infinite recursion between extract_muldiv_1 and
fold_plusminus_mult_expr, because one turns this expression into the other
and the other does the reverse:

((2147483648 / 0) * 2) + 2 <-> 2 * (2147483648 / 0 + 1)

I tried (unsuccessfully) to fix it in either extract_muldiv_1 or
fold_plusminus_mult_expr, but in the end I went with just turning (x / 0) + A
to x / 0 (and similarly for %), because with that undefined division we can do
anything and this fixes the issue.  Any better ideas?

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-07-18  Marek Polacek  

PR middle-end/70992
* fold-const.c (fold_binary_loc): Fold (x / 0) + A to x / 0,
and (x % 0) + A to x % 0.

* gcc.dg/torture/pr70992.c: New test.
* gcc.dg/torture/pr70992-2.c: New test.

diff --git gcc/fold-const.c gcc/fold-const.c
index 1bcbbb58154..9abdc9a8c20 100644
--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -9387,6 +9387,12 @@ fold_binary_loc (location_t loc,
  TREE_TYPE (arg0), arg0,
  cst0));
}
+ /* Adding anything to a division-by-zero makes no sense and
+	     can confuse extract_muldiv and fold_plusminus_mult_expr.  */
+ else if ((TREE_CODE (arg0) == TRUNC_DIV_EXPR
+   || TREE_CODE (arg0) == TRUNC_MOD_EXPR)
+  && integer_zerop (TREE_OPERAND (arg0, 1)))
+   return fold_convert_loc (loc, type, arg0);
}
 
   /* Handle (A1 * C1) + (A2 * C2) with A1, A2 or C1, C2 being the same or
diff --git gcc/testsuite/gcc.dg/torture/pr70992-2.c gcc/testsuite/gcc.dg/torture/pr70992-2.c
index e69de29bb2d..c5d2c5f2683 100644
--- gcc/testsuite/gcc.dg/torture/pr70992-2.c
+++ gcc/testsuite/gcc.dg/torture/pr70992-2.c
@@ -0,0 +1,9 @@
+/* PR middle-end/70992 */
+/* { dg-do compile } */
+
+unsigned int *od;
+int
+fn (void)
+{
+  return (0 % 0 + 1) * *od * 2; /* { dg-warning "division by zero" } */
+}
diff --git gcc/testsuite/gcc.dg/torture/pr70992.c gcc/testsuite/gcc.dg/torture/pr70992.c
index e69de29bb2d..56728e09d1b 100644
--- gcc/testsuite/gcc.dg/torture/pr70992.c
+++ gcc/testsuite/gcc.dg/torture/pr70992.c
@@ -0,0 +1,41 @@
+/* PR middle-end/70992 */
+/* { dg-do compile } */
+
+typedef unsigned int uint32_t;
+typedef int int32_t;
+
+uint32_t
+fn (uint32_t so)
+{
+  return (so + so) * (0x80000000 / 0 + 1); /* { dg-warning "division by zero" } */
+}
+
+uint32_t
+fn5 (uint32_t so)
+{
+  return (0x80000000 / 0 + 1) * (so + so); /* { dg-warning "division by zero" } */
+}
+
+uint32_t
+fn6 (uint32_t so)
+{
+  return (0x80000000 / 0 - 1) * (so + so); /* { dg-warning "division by zero" } */
+}
+
+uint32_t
+fn2 (uint32_t so)
+{
+  return (so + so) * (0x80000000 / 0 - 1); /* { dg-warning "division by zero" } */
+}
+
+int32_t
+fn3 (int32_t so)
+{
+  return (so + so) * (0x80000000 / 0 + 1); /* { dg-warning "division by zero" } */
+}
+
+int32_t
+fn4 (int32_t so)
+{
+  return (so + so) * (0x80000000 / 0 - 1); /* { dg-warning "division by zero" } */
+}

Marek


Re: [PATCH] Implement one optimization from build_range_check in match.pd (PR tree-optimization/81346)

2017-07-18 Thread Marc Glisse

On Tue, 18 Jul 2017, Jakub Jelinek wrote:

> On Tue, Jul 18, 2017 at 05:35:54PM +0200, Marc Glisse wrote:
>> On Tue, 18 Jul 2017, Jakub Jelinek wrote:
>>> In the PR Marc noted that the optimization might be useful even for
>>> constants other than 1, by transforming
>>> x+C1 <= C2 if unsigned and C2-C1==INT_MAX into (int)x > (int)(-1-C1).
>>
>> (int)x >= (int)(-C1) might be easier (and more valid, except that the only
>> case where that makes a difference seems to be when C2==UINT_MAX, in which
>> case we could hope not to reach this transformation).
>
> Don't we canonicalize that (int)x >= (int)(-C1) to (int)x > (int)(-1-C1)
> immediately though?

We probably don't canonicalize (int)x >= INT_MIN to (int)x > INT_MAX ;-)
(what I was suggesting essentially delegates the check for INT_MIN or
overflow to the canonicalization code)


--
Marc Glisse


[Bug sanitizer/81281] [6/7/8 Regression] UBSAN: false positive, dropped promotion to long type.

2017-07-18 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81281

--- Comment #3 from Marek Polacek  ---
Before that rev we had

int a = (int) (unsigned int) ll - (unsigned int) ci) - (unsigned int) i) +
unsigned int) ci - (unsigned int) ll) + (unsigned int) i) - (unsigned int)
ci)) + 2270794745);

and now just

int a = -2024172551(OVF) - (int) ci;

[Bug c++/81476] severe slow-down with range-v3 library compared to clang

2017-07-18 Thread h2+bugs at fsfe dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81476

--- Comment #5 from Hannes Hauswedell  ---
> Please clarify if the problem is in compilation time (how long the compiler 
> takes to compile the file), or in performance of generated code.

Performance of code.

> Please attach preprocessed testcase obtained with -E or -save-temps.

I have added intermediate code.

> For clang are you using GCC's libstdc++ or llvm's libc++?

For clang I am using libc++ (FreeBSD). I had no immediate luck trying to use
libstdc++ with clang (to exclude library issues), but I will give it another
shot later.

Re: [PATCH] Implement one optimization from build_range_check in match.pd (PR tree-optimization/81346)

2017-07-18 Thread Jakub Jelinek
On Tue, Jul 18, 2017 at 05:35:54PM +0200, Marc Glisse wrote:
> On Tue, 18 Jul 2017, Jakub Jelinek wrote:
> 
> > In the PR Marc noted that the optimization might be useful even for
> > constants other than 1, by transforming
> > x+C1 <= C2 if unsigned and C2-C1==INT_MAX into (int)x > (int)(-1-C1).
> 
> (int)x >= (int)(-C1) might be easier (and more valid, except that the only
> case where that makes a difference seems to be when C2==UINT_MAX, in which
> case we could hope not to reach this transformation).

Don't we canonicalize that (int)x >= (int)(-C1) to (int)x > (int)(-1-C1)
immediately though?

> > +   && TYPE_PRECISION (TREE_TYPE (@0)) > 1
> 
> I see you've been bitten in the past ;-)

Many times ;)

Jakub


Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-18 Thread Jeff Law
On 07/13/2017 03:26 AM, Christophe Lyon wrote:
> I have executed a validation of your patch series on aarch64 and arm
> targets, and I have minor comments.
> 
> On arm, all new tests are unsupported, as expected.
Good.

> On aarch64-linux, the new tests pass, but they fail on aarch64-elf:
>   - FAIL appears  [ => FAIL]:
That's really strange.  I just tried that here and the only two failures
I got were stack-check-7 and stack-check-8 which failed because I didn't
have a cross assembler installed.

> 
> 
> As I noticed that you used dg-require-effective-target
> stack_clash_protected instead of
> dg-require-stack-check "clash" that I recently committed, I also tried
> with the latter.
Yea.  Ultimately I decided that unless the target had explicitly added
support for stack clash protection that the tests should be considered
UNRESOLVED, even if the port had partial protection (as is the case with
ARM).  Thus I ended up with a new effective target test.  I should have
mentioned that in the cover letter.

Thanks,


Jeff


[Bug c++/81476] severe slow-down with range-v3 library compared to clang

2017-07-18 Thread h2+bugs at fsfe dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81476

--- Comment #4 from Hannes Hauswedell  ---
Created attachment 41783
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41783&action=edit
intermediate code

Re: [PATCH] Implement one optimization from build_range_check in match.pd (PR tree-optimization/81346)

2017-07-18 Thread Jakub Jelinek
On Tue, Jul 18, 2017 at 09:31:11AM -0600, Martin Sebor wrote:
> > --- gcc/match.pd.jj 2017-07-17 16:25:20.0 +0200
> > +++ gcc/match.pd2017-07-18 12:32:52.896924558 +0200
> > @@ -1125,6 +1125,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > && wi::neg_p (@1, TYPE_SIGN (TREE_TYPE (@1
> >  (cmp @2 @0))
> > 
> > +/* (X - 1U) <= INT_MAX-1U into (int) X > 0.  */
> 
> Since the transformation applies to other types besides int I
> suggest to make it clear in the comment.  E.g., something like:
> 
>   /* (X - 1U) <= TYPE_MAX - 1U into (TYPE) X > 0 for any integer
>  TYPE.  */
> 
> (with spaces around all the operators as per GCC coding style).

I think many of the match.pd comments are also not fully generic
to describe what it does, just to give an idea what it does.
The above isn't correct either, because it isn't for any integer TYPE,
there needs to be a signed and corresponding unsigned type involved,
X is of the unsigned type, so is the 1.  And TYPE_MAX is actually
the signed type's maximum cast to unsigned type.  And the reason for not putting
spaces around - in the second case was an attempt to give a hint that
it is comparison against a INT_MAX-1U constant, not another subtraction.
After all, the pattern doesn't handle subtraction, because that isn't
what is in the IL, but addition, i.e. X + -1U.
And, the <= -> > is just one possibility, the pattern also handles
> -> <=.

Examples of other comments that "suffer" from similar lack of sufficient
genericity, but perhaps are good enough to let somebody understand it
quickly:
/* Avoid this transformation if C is INT_MIN, i.e. C == -C.  */
  /* Avoid this transformation if X might be INT_MIN or
 Y might be -1, because we would then change valid
 INT_MIN % -(-1) into invalid INT_MIN % -1.  */
   /* If the constant operation overflows we cannot do the transform
  directly as we would introduce undefined overflow, for example
  with (a - 1) + INT_MIN.  */
  /* X+INT_MAX+1 is X-INT_MIN.  */

Jakub


[Bug sanitizer/81281] [6/7/8 Regression] UBSAN: false positive, dropped promotion to long type.

2017-07-18 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81281

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #2 from Marek Polacek  ---
Started with r229167.  Would be nice to have a one-file testcase, but that
doesn't seem to be that easy...

[Bug c++/81476] severe slow-down with range-v3 library compared to clang

2017-07-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81476

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2017-07-18
 Ever confirmed|0   |1

--- Comment #3 from Jonathan Wakely  ---
In summary, please read https://gcc.gnu.org/bugs/

Re: [PATCH] Implement one optimization from build_range_check in match.pd (PR tree-optimization/81346)

2017-07-18 Thread Marc Glisse

On Tue, 18 Jul 2017, Jakub Jelinek wrote:

> In the PR Marc noted that the optimization might be useful even for
> constants other than 1, by transforming
> x+C1 <= C2 if unsigned and C2-C1==INT_MAX into (int)x > (int)(-1-C1).

(int)x >= (int)(-C1) might be easier (and more valid, except that the only
case where that makes a difference seems to be when C2==UINT_MAX, in which
case we could hope not to reach this transformation).

> Shall I do that immediately, or incrementally?

I vote for "incremental", unless someone finds an issue with your current
patch.

> Shall we also change build_range_check to do that (i.e. drop the
> integer_onep above and use right etype constant?

I would rather consider build_range_check legacy and avoid modifying it
too much, but if you are motivated...

> +   && TYPE_PRECISION (TREE_TYPE (@0)) > 1

I see you've been bitten in the past ;-)

--
Marc Glisse


Re: Killing old dead bugs

2017-07-18 Thread Jonathan Wakely
On 18 July 2017 at 16:32, Yuri Gribov wrote:
> Jonathan also mentioned something not immediately obvious in IRC:
> logging into BZ with gcc.gnu.org account provides elevated privileges.
> So if you have write access, you should get extra BZ rights for free.

We should document this at https://gcc.gnu.org/bugs/management.html
and maybe also somewhere more visible (I'd forgotten that page even
existed).


Re: [PATCH] Move fold_div_compare optimization to match.pd (PR tree-optimization/81346)

2017-07-18 Thread Jakub Jelinek
On Tue, Jul 18, 2017 at 05:21:42PM +0200, Marc Glisse wrote:
> On Tue, 18 Jul 2017, Jakub Jelinek wrote:
> 
> > +/* X / C1 op C2 into a simple range test.  */
> > +(for cmp (simple_comparison)
> > + (simplify
> > +  (cmp (trunc_div:s @0 INTEGER_CST@1) INTEGER_CST@2)
> > +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > +   && integer_nonzerop (@1)
> > +   && !TREE_OVERFLOW (@1)
> > +   && !TREE_OVERFLOW (@2))
> 
> (not specific to this patch)
> I wonder if we should check TREE_OVERFLOW for the input that way in many
> more transformations in match.pd, or never, or how to decide in which
> cases to do it...

The reason for putting it here was that: 1) fold_div_compare did that
2) it relies on TREE_OVERFLOW to detect if the optimization is ok or not,
if there are some TREE_OVERFLOW on the inputs, then it might misbehave.

> > + (with
> > +  {
> > +   tree etype = range_check_type (TREE_TYPE (@0));
> > +   if (etype)
> > + {
> > +   if (! TYPE_UNSIGNED (etype))
> > + etype = unsigned_type_for (etype);
> 
> Now that you enforce unsignedness, can you think of cases where going
> through range_check_type is useful compared to
>   tree etype = unsigned_type_for (TREE_TYPE (@0));
> ? I can propose that trivial patch as a follow-up if you like.

I couldn't convince myself it is safe.  While enums and bool are handled
early, aren't there e.g. Ada integral types with weirdo min/max values and
similar stuff where the range_check_type test would fail?  If it never
fails, why do we do it there?  The reason I've added unsigned_type_for
afterwards is that build_range_check actually does something like
that too.
With -fwrapv it will perform the subtraction of lo in whatever type
range_check_type returns (could be e.g. int for -fwrapv) but then
recurses and runs into:
  if (integer_zerop (low))
{
  if (! TYPE_UNSIGNED (etype))
{
  etype = unsigned_type_for (etype);
  high = fold_convert_loc (loc, etype, high);
  exp = fold_convert_loc (loc, etype, exp);
}
  return build_range_check (loc, type, exp, 1, 0, high);
}
I was thinking whether e.g. range_check_type shouldn't have an extra
argument which would be false for build_range_check and true for
the use in match.pd, and if that arg is false, it would use
!TYPE_OVERFLOW_WRAPS (etype) and if that arg is true, it would
use !TYPE_UNSIGNED (etype) instead.

> > +   hi = fold_convert (etype, hi);
> > +   lo = fold_convert (etype, lo);
> > +   hi = const_binop (MINUS_EXPR, etype, hi, lo);
> > + }
> > +  }
> > +  (if (etype && hi && !TREE_OVERFLOW (hi))
> 
> I don't think you can have an overflow here anymore, now that etype is
> always unsigned and since you check the input (doesn't hurt though).

If const_binop for unsigned etype will never return NULL nor TREE_OVERFLOW
on the result, then that can surely go.  But again, I'm not 100% sure.
> 
> > +   (if (code == EQ_EXPR)
> > +   (le (minus (convert:etype @0) { lo; }) { hi; })
> > +   (gt (minus (convert:etype @0) { lo; }) { hi; })

Jakub


[Bug c++/81476] severe slow-down with range-v3 library compared to clang

2017-07-18 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81476

--- Comment #2 from Andrew Pinski  ---
For clang are you using GCC's libstdc++ or llvm's libc++?

[Bug libgomp/81386] [8 regression] libgomp.fortran/appendix-a/a.16.1.f90 fails starting with 249424

2017-07-18 Thread carll at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81386

--- Comment #9 from Carl Love  ---
Commit 250295 reverts commit 249424, which was causing issues.  Commit 249424
is actually a fix for the vec_mule and vec_mulo support that was added in
commit 248125.  The original commit used the wrong word size for the half-word
builtin, hence the need for commit 249424 to fix that.

Re: Killing old dead bugs

2017-07-18 Thread Yuri Gribov
On Tue, Jul 18, 2017 at 3:54 PM, Martin Sebor wrote:
> On 07/17/2017 02:25 PM, Yuri Gribov wrote:
>>
>> On Mon, Jul 17, 2017 at 4:23 PM, Martin Sebor wrote:
>>>
>>> On 07/17/2017 02:14 AM, Yuri Gribov wrote:
>>>>
>>>> Hi Mikhail,
>>>>
>>>> On Sun, Jul 2, 2017 at 6:39 PM, Mikhail Maltsev wrote:
>>>>>
>>>>> Hi. Yes, bug maintenance is appreciated. See this message and replies
>>>>> to it: https://gcc.gnu.org/ml/gcc/2016-04/msg00258.html .
>>>>
>>>> Replies in your link suggest to leave a final comment in bugs with
>>>> explanatory suggestion to close them so that maintainers who read
>>>> gcc-bugs list hopefully notice them and act appropriately.
>>>> Unfortunately I found this to _not_ work in practice. Below you can
>>>> see a list of bugs I've investigated (and often bisected) in the past
>>>> weeks - none of them were closed by maintainers (or at least
>>>> commented).
>>>>
>>>> So I'm afraid we have to conclude that there's no working process to
>>>> close stale bugs in place (which may be one of the reasons of bugcount
>>>> growth).
>>>
>>> The informal process that some (most?) of us have been following
>>> is to close them with a comment explaining our rationale.
>>> It's good to fill in the Known to fail/Known to work versions if they
>>> can be determined.  Mentioning the commit that fixed the bug as
>>> you did for the bugs below is ideal.  Adding a test case if one
>>> doesn't exist in the test suite is also very useful, though quite
>>> a bit more work.  In my experience, if a bug is closed that should
>>> stay open, someone usually notices pretty quickly and reopens it,
>>> so I wouldn't be too worried about doing something wrong.
>>
>> Martin,
>>
>> Firstly, thanks for detailed explanation.
>>
>> What to do about bugs originating in upstream packages?  I noticed
>> they sometimes get closed with "RESOLVED MOVED" resolution
>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58841) but often this
>> does not happen and they just hang in tracker forever for no good
>> reason.
>>
>> Actually what I tried to emphasize is that it's impossible for a
>> casual committer (who does not have maintainer access to Bugzilla i.e.
>> rights to close bugs) to contribute to project by cleaning stale bugs,
>> because requests to close them are mostly ignored (because
>> maintainers, obviously, have much more interesting work to do).
>
> I take your point.  I didn't realize closing bugs was restricted.
> Given the work you've done on the bugs below (and elsewhere) you
> should be able to close them.  If you aren't and would like to be
> able to, please request it by emailing overse...@gcc.gnu.org (at
> least I think that's the right way to go about it), or follow up
> here and I'm sure someone with the right karma will make it happen.

Jonathan also mentioned something not immediately obvious in IRC:
logging into BZ with gcc.gnu.org account provides elevated privileges.
So if you have write access, you should get extra BZ rights for free.

>>> The process for managing bugs is in more detail described here:
>>>
>>>   https://gcc.gnu.org/bugs/management.html
>>>
>>> If you think it should be clarified in some way please feel free
>>> to send in a patch.
>>>
>>> Martin
>>>
>>>> * Bug 41992 - ICE on invalid dereferencing of void *
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00860.html)
>>>> * Bug 63245 - renderMemorySnippet shouldn't show more bytes than the
>>>>   underlying type
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00645.html)
>>>> * Bug 61693 - [asan] is not intercepting aligned_alloc
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00643.html)
>>>> * Bug 61771 - Test failures in ASan testsuite on ARM Linux due to FP
>>>>   format mismatch between libasan and GCC
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00646.html)
>>>> * Bug 78028 - ASAN doesn't find memory leak
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00653.html)
>>>> * Bug 55316 - gcc/libsanitizer/asan/asan_linux.cc:70:3: error: #error
>>>>   "Unsupported arch"
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00636.html)
>>>> * Bug 78654 - ubsan can lead to excessive stack usage
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00640.html)
>>>> * Bug 60892 - GCC (libsanitizer) fails to build with Linux 2.6.21
>>>>   headers (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00649.html)
>>>> * Bug 61995 - gcc 4.9.1 fails to compile with error in libsanitizer
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00648.html)
>>>> * Bug 80027 - ASAN breaks DT_RPATH $ORIGIN in dlopen()
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg00787.html)
>>>> * Bug 54123 - inline functions not optimized as well as static inline
>>>>   (https://gcc.gnu.org/ml/gcc-bugs/2017-07/msg01321.html)
>>>>
>>>> -Y


Re: [PATCH] Implement one optimization from build_range_check in match.pd (PR tree-optimization/81346)

2017-07-18 Thread Martin Sebor

> --- gcc/match.pd.jj	2017-07-17 16:25:20.0 +0200
> +++ gcc/match.pd	2017-07-18 12:32:52.896924558 +0200
> @@ -1125,6 +1125,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> && wi::neg_p (@1, TYPE_SIGN (TREE_TYPE (@1
>  (cmp @2 @0))
>
> +/* (X - 1U) <= INT_MAX-1U into (int) X > 0.  */


Since the transformation applies to other types besides int I
suggest to make it clear in the comment.  E.g., something like:

  /* (X - 1U) <= TYPE_MAX - 1U into (TYPE) X > 0 for any integer
 TYPE.  */

(with spaces around all the operators as per GCC coding style).

Martin



[Bug hsa/81477] New: HSA offloading regressions: "function cannot be cloned"

2017-07-18 Thread tschwinge at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81477

Bug ID: 81477
   Summary: HSA offloading regressions: "function cannot be
cloned"
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: hsa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: hubicka at gcc dot gnu.org, jamborm at gcc dot gnu.org,
marxin at gcc dot gnu.org
  Target Milestone: ---

I noticed that with HSA offloading enabled, as of r250048 ("Avoid global
optimize flag checks in LTO"), a handful of test cases FAIL "for excess
errors": "c-c++-common/gomp/pr61486-1.c" (C, C++),
"c-c++-common/gomp/pr63249.c" (C, C++), "g++.dg/gomp/pr63249.C",
"g++.dg/gomp/pr66571-1.C".

cc1: warning: could not emit HSAIL for function foo._omp_fn.0: function
cannot be cloned [-Whsa]

gcc/ipa-hsa.c:

/* If NODE is not versionable, warn about not emitting HSAIL and return false.
   Otherwise return true.  */

static bool
check_warn_node_versionable (cgraph_node *node)
{
  if (!node->local.versionable)
{
      warning_at (EXPR_LOCATION (node->decl), OPT_Whsa,
		  "could not emit HSAIL for function %s: function cannot be "
		  "cloned", node->name ());
  return false;
}
  return true;
}

Re: [PATCH] Move fold_div_compare optimization to match.pd (PR tree-optimization/81346)

2017-07-18 Thread Marc Glisse

On Tue, 18 Jul 2017, Jakub Jelinek wrote:

> +/* X / C1 op C2 into a simple range test.  */
> +(for cmp (simple_comparison)
> + (simplify
> +  (cmp (trunc_div:s @0 INTEGER_CST@1) INTEGER_CST@2)
> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +   && integer_nonzerop (@1)
> +   && !TREE_OVERFLOW (@1)
> +   && !TREE_OVERFLOW (@2))

(not specific to this patch)
I wonder if we should check TREE_OVERFLOW for the input that way in many
more transformations in match.pd, or never, or how to decide in which
cases to do it...

> +   (with { tree lo, hi; bool neg_overflow;
> +  enum tree_code code = fold_div_compare (cmp, @1, @2, &lo, &hi,
> +  &neg_overflow); }
> +(switch
> + (if (code == LT_EXPR || code == GE_EXPR)
> +   (if (TREE_OVERFLOW (lo))
> +   { build_int_cst (type, (code == LT_EXPR) ^ neg_overflow); }
> +   (if (code == LT_EXPR)
> +(lt @0 { lo; })
> +(ge @0 { lo; }
> + (if (code == LE_EXPR || code == GT_EXPR)
> +   (if (TREE_OVERFLOW (hi))
> +   { build_int_cst (type, (code == LE_EXPR) ^ neg_overflow); }
> +   (if (code == LE_EXPR)
> +(le @0 { hi; })
> +(gt @0 { hi; }
> + (if (!lo && !hi)
> +  { build_int_cst (type, code == NE_EXPR); })
> + (if (code == EQ_EXPR && !hi)
> +  (ge @0 { lo; }))
> + (if (code == EQ_EXPR && !lo)
> +  (le @0 { hi; }))
> + (if (code == NE_EXPR && !hi)
> +  (lt @0 { lo; }))
> + (if (code == NE_EXPR && !lo)
> +  (gt @0 { hi; }))
> + (if (GENERIC)
> +  { build_range_check (UNKNOWN_LOCATION, type, @0, code == EQ_EXPR,
> +  lo, hi); })
> + (with
> +  {
> +   tree etype = range_check_type (TREE_TYPE (@0));
> +   if (etype)
> + {
> +   if (! TYPE_UNSIGNED (etype))
> + etype = unsigned_type_for (etype);

Now that you enforce unsignedness, can you think of cases where going
through range_check_type is useful compared to
  tree etype = unsigned_type_for (TREE_TYPE (@0));
? I can propose that trivial patch as a follow-up if you like.

> +   hi = fold_convert (etype, hi);
> +   lo = fold_convert (etype, lo);
> +   hi = const_binop (MINUS_EXPR, etype, hi, lo);
> + }
> +  }
> +  (if (etype && hi && !TREE_OVERFLOW (hi))

I don't think you can have an overflow here anymore, now that etype is
always unsigned and since you check the input (doesn't hurt though).

> +   (if (code == EQ_EXPR)
> +   (le (minus (convert:etype @0) { lo; }) { hi; })
> +   (gt (minus (convert:etype @0) { lo; }) { hi; })


--
Marc Glisse


[PATCH] Rename TYPE_{MIN,MAX}VAL

2017-07-18 Thread Nathan Sidwell
As I mentioned in my previous patch, we currently have 
TYPE_{MIN,MAX}_VALUES for numeric types and TYPE_{MIN,MAX}VAL for 
type-agnostic access.


This patch renames the latter to TYPE_{MIN,MAX}VAL_RAW, matching 
TYPE_VALUES_RAW, which had a similar problem.


While renaming the macros, I reordered them in tree.h to be grouped by 
the underlying field.  I think that makes more sense here, as the only 
case when grouping as min/max makes most sense is for the numeric types. 
 And mostly when looking at this, I want to discover what things might 
use this field.


Because of that reordering, I'm hesitant to apply the obvious rule.  I'd 
appreciate review.  thanks.


(This patch is not dependent on the TYPE_METHODS removal patch)

nathan

--
Nathan Sidwell
2017-07-18  Nathan Sidwell  

	gcc/ 
	* tree.h (TYPE_MINVAL, TYPE_MAXVAL): Rename to ...
	(TYPE_MINVAL_RAW, TYPE_MAXVAL_RAW): ... these.
	* tree.c (find_decls_types_r, verify_type): Use TYPE_{MIN,MAX}VAL_RAW.
	* lto-streamer-out.c (DFS::DFS_write_tree_body): Likewise.
	(hash_tree): Likewise.
	* tree-streamer-in.c (lto_input_ts_type_non_common_tree_pointers):
	Likewise.
	* tree-streamer-out.c (write_ts_type_non_common_tree_pointers):
	Likewise.

	cp/
	* cp-tree.h (PACK_EXPANSION_PARAMETER_PACKS,
	PACK_EXPANSION_EXTRA_ARGS): Use TYPE_{MIN,MAX}VAL_RAW.

	lto/
	* lto.c (mentions_vars_p_type): Use TYPE_{MIN,MAX}VAL_RAW.
	(compare_tree_sccs_1, lto_fixup_prevailing_decls): Likewise.

	objc/
	* objc-act.h (CLASS_NST_METHODS, CLASS_CLS_METHODS): Use
	TYPE_{MIN,MAX}VAL_RAW.

Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 250309)
+++ cp/cp-tree.h	(working copy)
@@ -3522,13 +3522,13 @@ extern void decl_shadowed_for_var_insert
 #define PACK_EXPANSION_PARAMETER_PACKS(NODE)		\
   *(TREE_CODE (NODE) == EXPR_PACK_EXPANSION		\
     ? &TREE_OPERAND (NODE, 1)			\
-    : &TYPE_MINVAL (TYPE_PACK_EXPANSION_CHECK (NODE)))
+    : &TYPE_MINVAL_RAW (TYPE_PACK_EXPANSION_CHECK (NODE)))
 
 /* Any additional template args to be applied when substituting into
the pattern, set by tsubst_pack_expansion for partial instantiations.  */
 #define PACK_EXPANSION_EXTRA_ARGS(NODE)		\
   *(TREE_CODE (NODE) == TYPE_PACK_EXPANSION	\
-    ? &TYPE_MAXVAL (NODE)			\
+    ? &TYPE_MAXVAL_RAW (NODE)			\
     : &TREE_OPERAND ((NODE), 2))
 
 /* True iff this pack expansion is within a function context.  */
Index: lto/lto.c
===
--- lto/lto.c	(revision 250309)
+++ lto/lto.c	(working copy)
@@ -646,8 +646,8 @@ mentions_vars_p_type (tree t)
   CHECK_NO_VAR (TYPE_ATTRIBUTES (t));
   CHECK_NO_VAR (TYPE_NAME (t));
 
-  CHECK_VAR (TYPE_MINVAL (t));
-  CHECK_VAR (TYPE_MAXVAL (t));
+  CHECK_VAR (TYPE_MINVAL_RAW (t));
+  CHECK_VAR (TYPE_MAXVAL_RAW (t));
 
   /* Accessor is for derived node types only. */
   CHECK_NO_VAR (t->type_non_common.binfo);
@@ -1414,9 +1414,10 @@ compare_tree_sccs_1 (tree t1, tree t2, t
   else if (code == FUNCTION_TYPE
 	   || code == METHOD_TYPE)
 	compare_tree_edges (TYPE_ARG_TYPES (t1), TYPE_ARG_TYPES (t2));
+
   if (!POINTER_TYPE_P (t1))
-	compare_tree_edges (TYPE_MINVAL (t1), TYPE_MINVAL (t2));
-  compare_tree_edges (TYPE_MAXVAL (t1), TYPE_MAXVAL (t2));
+	compare_tree_edges (TYPE_MINVAL_RAW (t1), TYPE_MINVAL_RAW (t2));
+  compare_tree_edges (TYPE_MAXVAL_RAW (t1), TYPE_MAXVAL_RAW (t2));
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_LIST))
@@ -2580,8 +2581,8 @@ lto_fixup_prevailing_decls (tree t)
   LTO_NO_PREVAIL (TYPE_ATTRIBUTES (t));
   LTO_NO_PREVAIL (TYPE_NAME (t));
 
-  LTO_SET_PREVAIL (TYPE_MINVAL (t));
-  LTO_SET_PREVAIL (TYPE_MAXVAL (t));
+  LTO_SET_PREVAIL (TYPE_MINVAL_RAW (t));
+  LTO_SET_PREVAIL (TYPE_MAXVAL_RAW (t));
   LTO_NO_PREVAIL (t->type_non_common.binfo);
 
   LTO_SET_PREVAIL (TYPE_CONTEXT (t));
Index: lto-streamer-out.c
===
--- lto-streamer-out.c	(revision 250309)
+++ lto-streamer-out.c	(working copy)
@@ -835,8 +835,8 @@ DFS::DFS_write_tree_body (struct output_
 	DFS_follow_tree_edge (TYPE_ARG_TYPES (expr));
 
   if (!POINTER_TYPE_P (expr))
-	DFS_follow_tree_edge (TYPE_MINVAL (expr));
-  DFS_follow_tree_edge (TYPE_MAXVAL (expr));
+	DFS_follow_tree_edge (TYPE_MINVAL_RAW (expr));
+  DFS_follow_tree_edge (TYPE_MAXVAL_RAW (expr));
   if (RECORD_OR_UNION_TYPE_P (expr))
 	DFS_follow_tree_edge (TYPE_BINFO (expr));
 }
@@ -1271,8 +1271,8 @@ hash_tree (struct streamer_tree_cache_d
 	   || code == METHOD_TYPE)
 	visit (TYPE_ARG_TYPES (t));
   if (!POINTER_TYPE_P (t))
-	visit (TYPE_MINVAL (t));
-  visit (TYPE_MAXVAL (t));
+	visit (TYPE_MINVAL_RAW (t));
+  visit (TYPE_MAXVAL_RAW (t));
   if (RECORD_OR_UNION_TYPE_P (t))
 	visit (TYPE_BINFO (t));
 }
Index: objc/objc-act.h
===
--- objc/objc-act.h	(revision 250309)
+++ objc/objc-act.h	(working copy)

[Bug other/67552] [meta-bug] x86 interrupt attribute

2017-07-18 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67552

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |7.0

--- Comment #8 from H.J. Lu  ---
Implemented for GCC 7.

[Bug target/66960] Add interrupt attribute to x86 backend

2017-07-18 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66960

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |7.0

--- Comment #21 from H.J. Lu  ---
Implemented for GCC 7.
