Re: [patch part, libgcc] Add AVX-specific matmul

2016-11-29 Thread Uros Bizjak
> the patch at https://gcc.gnu.org/ml/fortran/2016-11/msg00246.html
> (the one going to gcc-patches was rejected due to size of
> regenerated files) contains one libgcc change, which exposes
> the __cpu_model interface for i386 to libgfortran.
>
> The Fortran bits are OKd, but I need an approval from a libgcc
> maintainer (or some hint how to do this better :-).
>
> 2016-11-27  Thomas Koenig  
>
>PR fortran/78379
>* config/i386/cpuinfo.c:  Move enums for processor vendors,
>processor type, processor subtypes and declaration of
>struct __processor_model into
>* config/i386/cpuinfo.h:  New header file.

The above x86 specific part is OK.

Uros.


Re: [PATCH v2] Fix PR78588 - rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 64-bit type

2016-11-29 Thread Markus Trippelsdorf
On 2016.11.29 at 15:25 -0600, Segher Boessenkool wrote:
> On Tue, Nov 29, 2016 at 05:00:05PM +0100, Markus Trippelsdorf wrote:
> > Building gcc with -fsanitize=undefined shows:
> >  rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large 
> > for 64-bit type 'long unsigned int'
> > 
> > This happens because if_then_else_cond() in combine.c calls
> > num_sign_bit_copies() in rtlanal.c with mode==BLKmode.
> > 
> > 5205   bitwidth = GET_MODE_PRECISION (mode);
> > 5206   if (bitwidth > HOST_BITS_PER_WIDE_INT)
> > 5207 return 1;
> > 5208
> > 5209   nonzero = nonzero_bits (x, mode);
> > 5210   return nonzero & (HOST_WIDE_INT_1U << (bitwidth - 1))
> > 5211  ? 1 : bitwidth - floor_log2 (nonzero) - 1;
> > 
> > This causes (bitwidth - 1) to wrap around.
> 
> Could you also add a gcc_assert here?
> 
> > PR rtl-optimization/78588 
> > * combine.c (if_then_else_cond): Also guard against BLKmode.
> 
> Approved, please apply.  Thanks,

Because it can only happen when mode==BLKmode, this is what I checked
in:

diff --git a/gcc/combine.c b/gcc/combine.c
index 22fb7a9..a32a0ec 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -9176,7 +9176,7 @@ if_then_else_cond (rtx x, rtx *ptrue, rtx *pfalse)
   /* If X is known to be either 0 or -1, those are the true and
  false values when testing X.  */
   else if (x == constm1_rtx || x == const0_rtx
-  || (mode != VOIDmode
+  || (mode != VOIDmode && mode != BLKmode
   && num_sign_bit_copies (x, mode) == GET_MODE_PRECISION (mode)))
 {
   *ptrue = constm1_rtx, *pfalse = const0_rtx;
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 4e4eb2e..60550ad 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -4840,6 +4840,8 @@ num_sign_bit_copies1 (const_rtx x, machine_mode mode, const_rtx known_x,
   if (mode == VOIDmode)
 mode = GET_MODE (x);
 
+  gcc_checking_assert (mode != BLKmode);
+
   if (mode == VOIDmode || FLOAT_MODE_P (mode) || FLOAT_MODE_P (GET_MODE (x))
   || VECTOR_MODE_P (GET_MODE (x)) || VECTOR_MODE_P (mode))
 return 1;

-- 
Markus
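
For readers following along, here is a minimal sketch of the wrap-around described above. It assumes that bitwidth is an unsigned int (which the reported exponent 4294967295 suggests) and that the precision of BLKmode is 0. The names shift_exponent and sign_bit_mask are hypothetical and are not GCC functions.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: the exponent that the shift on line 5210 would use.
   With unsigned arithmetic, 0 - 1 wraps to UINT_MAX, which is 4294967295
   when unsigned int is 32 bits wide.  */
unsigned int
shift_exponent (unsigned int bitwidth)
{
  return bitwidth - 1;
}

/* Hypothetical guarded variant: refuse to shift when the exponent would
   be out of range for a 64-bit type, instead of invoking undefined
   behavior.  Returns 0 in that case.  */
uint64_t
sign_bit_mask (unsigned int bitwidth)
{
  if (bitwidth == 0 || bitwidth > 64)
    return 0;
  return UINT64_C (1) << (bitwidth - 1);
}
```

The checked-in patch takes a different route: it keeps BLKmode from reaching the function at all, and adds a checking assert to catch future callers.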


Re: [Patch, Fortran] PR 78573: [7 Regression] [OOP] ICE in resolve_component, at fortran/resolve.c:13405

2016-11-29 Thread Janus Weil
2016-11-29 23:21 GMT+01:00 Steve Kargl :
> On Tue, Nov 29, 2016 at 10:58:35PM +0100, Janus Weil wrote:
>>
>> here is a rather straightforward patch for an ice-on-invalid
>> regression. Regtests cleanly on x86_64-linux-gnu. Ok for trunk?
>>
>
> Yes.

Thanks, Steve. Committed as r242996.

Cheers,
Janus


[patch part, libgcc] Add AVX-specific matmul

2016-11-29 Thread Thomas Koenig

Hello world,

the patch at https://gcc.gnu.org/ml/fortran/2016-11/msg00246.html
(the one going to gcc-patches was rejected due to size of
regenerated files) contains one libgcc change, which exposes
the __cpu_model interface for i386 to libgfortran.

The Fortran bits are OKd, but I need an approval from a libgcc
maintainer (or some hint how to do this better :-).

I have attached the libgcc-specific part of the patch.

OK for trunk?

Regards

Thomas

2016-11-27  Thomas Koenig  

PR fortran/78379
* config/i386/cpuinfo.c:  Move enums for processor vendors,
processor type, processor subtypes and declaration of
struct __processor_model into
* config/i386/cpuinfo.h:  New header file.
* Makefile.am:  Add dependence of m4/matmul_internal_m4 to
matmul files.
* Makefile.in:  Regenerated.
* acinclude.m4:  Check for AVX, AVX2 and AVX512F.
* config.h.in:  Add HAVE_AVX, HAVE_AVX2 and HAVE_AVX512F.
* configure:  Regenerated.
* configure.ac:  Use checks for AVX, AVX2 and AVX512F.
* m4/matmul_internal.m4:  New file, the working part of matmul.m4.
* m4/matmul.m4:  Implement architecture-specific switching
for AVX, AVX2 and AVX512F by including matmul_internal.m4
multiple times.
* generated/matmul_c10.c: Regenerated.
* generated/matmul_c16.c: Regenerated.
* generated/matmul_c4.c: Regenerated.
* generated/matmul_c8.c: Regenerated.
* generated/matmul_i1.c: Regenerated.
* generated/matmul_i16.c: Regenerated.
* generated/matmul_i2.c: Regenerated.
* generated/matmul_i4.c: Regenerated.
* generated/matmul_i8.c: Regenerated.
* generated/matmul_r10.c: Regenerated.
* generated/matmul_r16.c: Regenerated.
* generated/matmul_r4.c: Regenerated.
* generated/matmul_r8.c: Regenerated.
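
To illustrate the architecture-specific switching mentioned in the ChangeLog, here is a rough sketch of runtime dispatch. Note that it is not the mechanism in the patch: the patch reads __cpu_model directly from libgfortran, while this sketch substitutes GCC's documented x86 builtins __builtin_cpu_init and __builtin_cpu_supports. The names matmul_fn, matmul_vanilla, matmul_avx2 and select_matmul are made up for the example.

```c
#include <stddef.h>

typedef void (*matmul_fn) (const double *a, const double *b, double *c,
                           size_t n);

/* Portable fallback: c = a * b for n x n row-major matrices.  */
static void
matmul_vanilla (const double *a, const double *b, double *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        double sum = 0.0;
        for (size_t k = 0; k < n; k++)
          sum += a[i * n + k] * b[k * n + j];
        c[i * n + j] = sum;
      }
}

#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
/* Same loop, but the compiler may use AVX2 instructions here.  */
__attribute__ ((target ("avx2")))
static void
matmul_avx2 (const double *a, const double *b, double *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        double sum = 0.0;
        for (size_t k = 0; k < n; k++)
          sum += a[i * n + k] * b[k * n + j];
        c[i * n + j] = sum;
      }
}
#endif

/* Pick an implementation once, based on what the running CPU supports.  */
matmul_fn
select_matmul (void)
{
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("avx2"))
    return matmul_avx2;
#endif
  return matmul_vanilla;
}
```

A caller would store the result of select_matmul () in a function pointer and reuse it, so the CPU check runs once rather than on every call.
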
Index: config/i386/cpuinfo.c
===
--- config/i386/cpuinfo.c	(Revision 242477)
+++ config/i386/cpuinfo.c	(Arbeitskopie)
@@ -26,6 +26,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include "cpuid.h"
 #include "tsystem.h"
 #include "auto-target.h"
+#include "cpuinfo.h"
 
 #ifdef HAVE_INIT_PRIORITY
 #define CONSTRUCTOR_PRIORITY (101)
@@ -36,97 +37,9 @@ see the files COPYING3 and COPYING.RUNTIME respect
 int __cpu_indicator_init (void)
   __attribute__ ((constructor CONSTRUCTOR_PRIORITY));
 
-/* Processor Vendor and Models. */
+struct __processor_model __cpu_model = { };
 
-enum processor_vendor
-{
-  VENDOR_INTEL = 1,
-  VENDOR_AMD,
-  VENDOR_OTHER,
-  VENDOR_MAX
-};
 
-/* Any new types or subtypes have to be inserted at the end. */
-
-enum processor_types
-{
-  INTEL_BONNELL = 1,
-  INTEL_CORE2,
-  INTEL_COREI7,
-  AMDFAM10H,
-  AMDFAM15H,
-  INTEL_SILVERMONT,
-  INTEL_KNL,
-  AMD_BTVER1,
-  AMD_BTVER2,  
-  AMDFAM17H,
-  CPU_TYPE_MAX
-};
-
-enum processor_subtypes
-{
-  INTEL_COREI7_NEHALEM = 1,
-  INTEL_COREI7_WESTMERE,
-  INTEL_COREI7_SANDYBRIDGE,
-  AMDFAM10H_BARCELONA,
-  AMDFAM10H_SHANGHAI,
-  AMDFAM10H_ISTANBUL,
-  AMDFAM15H_BDVER1,
-  AMDFAM15H_BDVER2,
-  AMDFAM15H_BDVER3,
-  AMDFAM15H_BDVER4,
-  AMDFAM17H_ZNVER1,
-  INTEL_COREI7_IVYBRIDGE,
-  INTEL_COREI7_HASWELL,
-  INTEL_COREI7_BROADWELL,
-  INTEL_COREI7_SKYLAKE,
-  INTEL_COREI7_SKYLAKE_AVX512,
-  CPU_SUBTYPE_MAX
-};
-
-/* ISA Features supported. New features have to be inserted at the end.  */
-
-enum processor_features
-{
-  FEATURE_CMOV = 0,
-  FEATURE_MMX,
-  FEATURE_POPCNT,
-  FEATURE_SSE,
-  FEATURE_SSE2,
-  FEATURE_SSE3,
-  FEATURE_SSSE3,
-  FEATURE_SSE4_1,
-  FEATURE_SSE4_2,
-  FEATURE_AVX,
-  FEATURE_AVX2,
-  FEATURE_SSE4_A,
-  FEATURE_FMA4,
-  FEATURE_XOP,
-  FEATURE_FMA,
-  FEATURE_AVX512F,
-  FEATURE_BMI,
-  FEATURE_BMI2,
-  FEATURE_AES,
-  FEATURE_PCLMUL,
-  FEATURE_AVX512VL,
-  FEATURE_AVX512BW,
-  FEATURE_AVX512DQ,
-  FEATURE_AVX512CD,
-  FEATURE_AVX512ER,
-  FEATURE_AVX512PF,
-  FEATURE_AVX512VBMI,
-  FEATURE_AVX512IFMA
-};
-
-struct __processor_model
-{
-  unsigned int __cpu_vendor;
-  unsigned int __cpu_type;
-  unsigned int __cpu_subtype;
-  unsigned int __cpu_features[1];
-} __cpu_model = { };
-
-
 /* Get the specific type of AMD CPU.  */
 
 static void
Index: config/i386/cpuinfo.h
===
--- config/i386/cpuinfo.h	(Revision 0)
+++ config/i386/cpuinfo.h	(Arbeitskopie)
@@ -0,0 +1,90 @@
+
+/* Processor Vendor and Models. */
+
+enum processor_vendor
+{
+  VENDOR_INTEL = 1,
+  VENDOR_AMD,
+  VENDOR_OTHER,
+  VENDOR_MAX
+};
+
+/* Any new types or subtypes have to be inserted at the end. */
+
+enum processor_types
+{
+  INTEL_BONNELL = 1,
+  INTEL_CORE2,
+  INTEL_COREI7,
+  AMDFAM10H,
+  AMDFAM15H,
+  INTEL_SILVERMONT,
+  INTEL_KNL,
+  AMD_BTVER1,
+  AMD_BTVER2,  
+  AMDFAM17H,
+  CPU_TYPE_MAX
+};
+
+enum processor_subtypes
+{
+  INTEL_COREI7_NEHALEM = 1,
+  INTEL_COREI7_WESTMERE,
+  

Re: [PATCH] Another debug info stv fix (PR rtl-optimization/78547)

2016-11-29 Thread Uros Bizjak
On Tue, Nov 29, 2016 at 8:44 PM, Jakub Jelinek  wrote:
> Hi!
>
> The following testcase ICEs because DECL_RTL/DECL_INCOMING_RTL are adjusted
> by the stv pass through the PUT_MODE modifications, which means that for
> var-tracking.c they contain a bogus mode.
>
> Fixed by wrapping those into TImode subreg or adjusting the MEMs to have the
> correct mode.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-11-29  Jakub Jelinek  
>
> PR rtl-optimization/78547
> * config/i386/i386.c (convert_scalars_to_vector): If any
> insns have been converted, adjust all parameters' DECL_RTL and
> DECL_INCOMING_RTL back from V1TImode to TImode if the parameters have
> TImode.

LGTM.

Thanks,
Uros.
> --- gcc/config/i386/i386.c.jj   2016-11-29 08:31:58.0 +0100
> +++ gcc/config/i386/i386.c  2016-11-29 12:21:36.867323776 +0100
> @@ -4075,6 +4075,39 @@ convert_scalars_to_vector ()
> crtl->stack_alignment_needed = 128;
>if (crtl->stack_alignment_estimated < 128)
> crtl->stack_alignment_estimated = 128;
> +  /* Fix up DECL_RTL/DECL_INCOMING_RTL of arguments.  */
> +  if (TARGET_64BIT)
> +   for (tree parm = DECL_ARGUMENTS (current_function_decl);
> +parm; parm = DECL_CHAIN (parm))
> + {
> +   if (TYPE_MODE (TREE_TYPE (parm)) != TImode)
> + continue;
> +   if (DECL_RTL_SET_P (parm)
> +   && GET_MODE (DECL_RTL (parm)) == V1TImode)
> + {
> +   rtx r = DECL_RTL (parm);
> +   if (REG_P (r))
> + SET_DECL_RTL (parm, gen_rtx_SUBREG (TImode, r, 0));
> +   else
> + {
> +   gcc_assert (MEM_P (r));
> +   SET_DECL_RTL (parm, adjust_address_nv (r, TImode, 0));
> + }
> + }
> +   if (DECL_INCOMING_RTL (parm)
> +   && GET_MODE (DECL_INCOMING_RTL (parm)) == V1TImode)
> + {
> +   rtx r = DECL_INCOMING_RTL (parm);
> +   if (REG_P (r))
> + DECL_INCOMING_RTL (parm) = gen_rtx_SUBREG (TImode, r, 0);
> +   else
> + {
> +   gcc_assert (MEM_P (r));
> +   DECL_INCOMING_RTL (parm)
> + = change_address (r, TImode, NULL_RTX);
> + }
> + }
> + }
>  }
>
>return 0;
> --- gcc/testsuite/gcc.dg/pr78547.c.jj   2016-11-29 12:26:26.544662630 +0100
> +++ gcc/testsuite/gcc.dg/pr78547.c  2016-11-29 12:26:09.0 +0100
> @@ -0,0 +1,18 @@
> +/* PR rtl-optimization/78547 */
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-Os -g -freorder-blocks-algorithm=simple -Wno-psabi" } */
> +/* { dg-additional-options "-mstringop-strategy=libcall" { target i?86-*-* x86_64-*-* } } */
> +
> +typedef unsigned __int128 u128;
> +typedef unsigned __int128 V __attribute__ ((vector_size (64)));
> +
> +V
> +foo (u128 a, u128 b, u128 c, V d)
> +{
> +  V e = (V) {a};
> +  V f = e & 1;
> +  e = 0 != e;
> +  c = c;
> +  f = f << ((V) {c} & 7);
> +  return f + e;
> +}
>
> Jakub


Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-29 Thread Paul Richard Thomas
Dear Andre,

This all looks OK to me. The only comment that I have that you might
deal with before committing is that some of the Boolean expressions,
eg:
+  int caf_dereg_mode
+  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
+  || c->attr.codimension)
+  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
+  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
+  : GFC_CAF_COARRAY_DEREGISTER)
+  : GFC_CAF_COARRAY_NOCOARRAY;

are getting to be sufficiently convoluted that a small, appropriately
named, helper function might be clearer. Of course, this is true of
many parts of gfortran but it is not too late to start making the code
a bit clearer.
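
As a sketch of what such a helper might look like: the constant names below are taken from the quoted hunk, but their numeric values are placeholders, and the function name caf_dereg_mode_for and its parameters are hypothetical. In gfortran the second argument would correspond to c->attr.codimension.

```c
#include <stdbool.h>

/* Placeholder values; the real definitions live in the gfortran sources.  */
enum
{
  GFC_STRUCTURE_CAF_MODE_IN_COARRAY = 1 << 0,
  GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY = 1 << 1
};

enum
{
  GFC_CAF_COARRAY_NOCOARRAY,
  GFC_CAF_COARRAY_DEREGISTER,
  GFC_CAF_COARRAY_DEALLOCATE_ONLY
};

/* Hypothetical helper computing the same value as the nested
   conditional expression quoted above.  */
int
caf_dereg_mode_for (int caf_mode, bool codimension)
{
  bool in_coarray
    = (caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0 || codimension;

  if (!in_coarray)
    return GFC_CAF_COARRAY_NOCOARRAY;
  if ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0)
    return GFC_CAF_COARRAY_DEALLOCATE_ONLY;
  return GFC_CAF_COARRAY_DEREGISTER;
}
```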

You can commit to the present trunk as far as I am concerned. I know
that the caf enthusiasts will test it to bits before release!

Regards

Paul


On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
> PING!
>
> I know it's a lengthy patch, but comments would be nice anyway.
>
> - Andre
>
> On Tue, 22 Nov 2016 20:46:50 +0100
> Andre Vehreschild  wrote:
>
>> Hi all,
>>
>> The attached patch addresses the need to extend the API of the caf-libs to
>> enable asynchronous allocation of allocatable components. Allocatable
>> components in derived type coarrays are different from regular coarrays or
>> coarrayed components. The latter have to be allocated on all images or on
>> none. Furthermore, the allocation is a point of synchronisation.
>>
>> For allocatable components, F2008 allows some to be allocated on some
>> images and not on others. Furthermore, registering with the caf-lib that an
>> allocatable component is present in a derived type coarray is no longer a
>> synchronisation point. To implement these features, two new types of coarray
>> registration have been introduced: the first just registers the component
>> with the caf-lib, and the second does the allocation. Furthermore, the
>> caf-API has been extended to provide a query function to learn about the
>> allocation status of a component on a remote image.
>>
>> Sorry that the patch is rather lengthy. Most of this is due to the change in
>> the signature of structure_alloc_comps. The routine and its wrappers are
>> used rather often, and every call needed the corresponding change.
>>
>> I know I left two or three TODOs in the patch to remind me of things I have
>> to investigate further. In the current state these TODOs are no reason to
>> hold back the patch. The third-party library opencoarrays implements the
>> mpi-part of the caf-model and will change in sync. It would of course be
>> advantageous to just be able to say: with gcc-7, gfortran implements
>> allocatable components in derived coarrays nearly completely.
>>
>> I know we are in stage 3. But the patch bootstraps and regtests ok on
>> x86_64-linux/F23. So, is it ok for trunk or shall it go to 7.2?
>>
>> Regards,
>>   Andre
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein


Re: Ping: Re: [patch, avr] Add flash size to device info and make wrap around default

2016-11-29 Thread Pitchumani Sivanupandi

On Tuesday 29 November 2016 10:06 PM, Denis Chertykov wrote:

2016-11-28 10:17 GMT+03:00 Pitchumani Sivanupandi:

On Saturday 26 November 2016 12:11 AM, Denis Chertykov wrote:

I'm sorry for delay.

I have a problem with the patch:
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-arch.h
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-devices.c
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-mcus.def
Hunk #1 FAILED at 62.
1 out of 1 hunk FAILED -- saving rejects to file avr-mcus.def.rej
(Stripping trailing CRs from patch; use --binary to disable.)
patching file gen-avr-mmcu-specs.c
Hunk #1 succeeded at 215 (offset 5 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file specs.h
Hunk #1 succeeded at 58 (offset 1 line).
Hunk #2 succeeded at 66 (offset 1 line).


There were changes in avr-mcus.def after this patch was submitted.
I have now incorporated those changes and attached the resolved patch.

Regards,
Pitchumani

gcc/ChangeLog

2016-11-09  Pitchumani Sivanupandi 

 * config/avr/avr-arch.h (avr_mcu_t): Add flash_size member.
 * config/avr/avr-devices.c (avr_mcu_types): Add flash size info.
 * config/avr/avr-mcus.def: Likewise.
 * config/avr/gen-avr-mmcu-specs.c (print_mcu): Remove hard-coded prefix
 check to find wrap-around value, instead use MCU flash size. For 8k flash
 devices, update link_pmem_wrap spec string to add --pmem-wrap-around=8k.
 * config/avr/specs.h: Remove link_pmem_wrap from LINK_RELAX_SPEC and
 add to linker specs (LINK_SPEC) directly.

Committed.

It looks like only avr-mcus.def and ChangeLog are committed.
Without the other changes trunk build is broken.

Regards,
Pitchumani


[PATCH] Fix prs 78602 & 78560 on PowerPC (vec_extract/vec_sec)

2016-11-29 Thread Michael Meissner
These two patches to fix PRs 78602 and 78560 fix aspects of the vector set and
extract code I've been working on in the last couple of months.

The two symptoms were essentially the same thing, one on vector set and the
other on vector extract.  The core issue was that neither set nor extract
verified that an argument that should have been in a register actually was in
a register, so they generated code that raised an insn not found message.

PR 78602 was an error that I found in working with the next set of patches for
vector extract, where it would generate the insn not found message if the test
cases were compiled without optimization.  The solution was to call force_reg
if the element number wasn't a register or constant.

PR 78560 was the opposite, in that it was on vector set, and it showed up with
-O3 optimization level.  Like 78602, the answer was to call force_reg to force
the value being set into a vector element into a register.

Once the initial bug in 78560 was fixed, a secondary bug reared its head, in
that the calculation for the element being set was a bit offset, when instead it
should have been a byte offset.  The assembler complained when the offset field
was not between 0..15.

I have done full bootstraps and make check with no regressions on a little
endian power8 (64-bit only), a big endian power8 (64-bit only), and a big
endian power7 (both 32-bit and 64-bit).  Can I install both patches to the
trunk?

2016-11-29  Michael Meissner  

PR target/78602
* config/rs6000/rs6000.c (rs6000_expand_vector_extract): If the
element is not a constant or in a register, force it to a
register.

2016-11-29  Michael Meissner  

PR target/78560
* config/rs6000/rs6000.c (rs6000_expand_vector_set): Force value
that will be set to a vector element to be in a register.
* config/rs6000/vsx.md (vsx_set__p9): Fix thinko that used
the wrong multiplier to convert the element number to a byte
offset.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 242972)
+++ gcc/config/rs6000/rs6000.c  (revision 242973)
@@ -7257,6 +7257,8 @@ rs6000_expand_vector_extract (rtx target
  convert_move (tmp, elt, 0);
  elt = tmp;
}
+  else if (!REG_P (elt))
+   elt = force_reg (DImode, elt);
 
   switch (mode)
{
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 242973)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -7105,6 +7105,8 @@ rs6000_expand_vector_set (rtx target, rt
   int width = GET_MODE_SIZE (inner_mode);
   int i;
 
+  val = force_reg (GET_MODE (val), val);
+
   if (VECTOR_MEM_VSX_P (mode))
 {
   rtx insn = NULL_RTX;
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 242968)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -2904,7 +2904,7 @@ (define_insn "vsx_set__p9"
   if (!VECTOR_ELT_ORDER_BIG)
 ele = nunits - 1 - ele;
 
-  operands[3] = GEN_INT (nunits * ele);
+  operands[3] = GEN_INT (GET_MODE_SIZE (mode) * ele);
   if (mode == V4SImode)
 return "xxinsertw %x0,%x2,%3";
   else
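
As a rough illustration of the byte offset that the vsx.md hunk computes, here is a small sketch. It assumes the multiplier in the patched line is the element size in bytes. The function insert_byte_offset and its parameters are hypothetical; elt_order_big stands in for VECTOR_ELT_ORDER_BIG.

```c
#include <stdbool.h>

/* Hypothetical helper: byte offset of element ELE within a vector of
   NUNITS elements that are ELT_SIZE bytes each.  When the element order
   is not big-endian, the element number is mirrored first, as in the
   context lines of the hunk above.  */
int
insert_byte_offset (int nunits, int elt_size, int ele, bool elt_order_big)
{
  if (!elt_order_big)
    ele = nunits - 1 - ele;
  return elt_size * ele;
}
```

For a 16-byte vector of four 4-byte elements the result is always one of 0, 4, 8 or 12, which stays within the 0..15 range the assembler accepts.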


Re: [PATCH][AArch64] PR target/71112: Properly create lowpart of pic_offset_table_rtx with -fpie

2016-11-29 Thread Andrew Pinski
On Tue, Nov 29, 2016 at 1:09 AM, Kyrill Tkachov wrote:
> Hi all,
>
> This ICE only occurs on big-endian ILP32 -fpie code. The expansion code
> generates the invalid load:
> (insn 6 5 7 (set (reg/f:SI 76)
> (unspec:SI [
> (mem/u/c:SI (lo_sum:SI (nil)
> (symbol_ref:SI ("dbs") [flags 0x40]  0x7f6e387c0ab0 dbs>)) [0  S4 A8])
> ] UNSPEC_GOTSMALLPIC28K))
>  (expr_list:REG_EQUAL (symbol_ref:SI ("dbs") [flags 0x40]  0x7f6e387c0ab0 dbs>)
> (nil)))
>
> to load the symbol. Note the (nil) argument to lo_sum.
> The buggy hunk meant to take the lowpart of the pic_offset_table_rtx
> register, but it did so by explicitly constructing a subreg, for which the
> offset is wrong for big-endian. The right way is to use gen_lowpart, which
> knows exactly what to do. With this patch we emit:
> (insn 6 5 7 (set (reg/f:SI 76)
> (unspec:SI [
> (mem/u/c:SI (lo_sum:SI (subreg:SI (reg:DI 73) 4)
> (symbol_ref:SI ("dbs") [flags 0x40]  0x7ffb097e6ab0 dbs>)) [0  S4 A8])
> ] UNSPEC_GOTSMALLPIC28K))
>  (expr_list:REG_EQUAL (symbol_ref:SI ("dbs") [flags 0x40]  0x7ffb097e6ab0 dbs>)
> (nil)))
>
> and everything works fine.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Also tested on aarch64_be-none-elf.
> Ok for trunk?

Naveen posted the exact same patch:
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02305.html
Though with a slightly different testcase :).

Thanks,
Andrew

>
> Thanks,
> Kyrill
>
> 2016-11-29  Kyrylo Tkachov  
>
> PR target/71112
> * config/aarch64/aarch64.c (aarch64_load_symref_appropriately,
> case SYMBOL_SMALL_GOT_28K): Use gen_lowpart rather than constructing
> subreg directly.
>
> 2016-11-29  Kyrylo Tkachov  
>
> PR target/71112
> * gcc.c-torture/compile/pr71112.c: New test.
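
To show why a hand-built subreg with offset 0 goes wrong on big-endian, here is a simplified sketch of the lowpart offset calculation. It ignores targets where word and byte endianness differ, and lowpart_byte_offset is a hypothetical name, not the GCC routine.

```c
/* Hypothetical helper: byte offset of the low INNER_SIZE bytes inside an
   OUTER_SIZE-byte value.  Requires inner_size <= outer_size.  On a
   little-endian target the low part starts at byte 0; on a big-endian
   target it sits at the end of the value.  */
unsigned int
lowpart_byte_offset (unsigned int outer_size, unsigned int inner_size,
                     int big_endian)
{
  return big_endian ? outer_size - inner_size : 0;
}
```

For the SImode lowpart of a DImode register on big-endian this gives 4, matching the (subreg:SI (reg:DI 73) 4) in the corrected insn quoted above.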


Re: Go patch committed: Merge to gccgo branch

2016-11-29 Thread Ian Lance Taylor
Now I've merged GCC trunk revision 242992 to the gccgo branch.

Ian


RE: [PATCH] PR fortran/77505 -- Treat negative character length as LEN=0

2016-11-29 Thread Punnoose, Elizebeth
Please excuse the messy formatting in my initial mail. Resending with proper 
formatting.

This patch checks for negative character length in the array constructor, and 
treats it as LEN=0. 

A warning message is also printed if bounds checking is enabled.
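
In plain C terms, the effect of the generated code is roughly the following. This is only a sketch of the intended semantics, not the tree-building code in the patch; effective_char_len and was_negative are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: a negative requested length is treated as 0.
   If WAS_NEGATIVE is non-null, it reports whether clamping happened,
   which is where a bounds-checking build would print its warning.  */
long
effective_char_len (long requested_len, bool *was_negative)
{
  bool negative = requested_len < 0;

  if (was_negative != NULL)
    *was_negative = negative;
  return negative ? 0 : requested_len;
}
```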

Bootstrapped and regression tested the patch on x86_64-linux-gnu and 
aarch64-linux-gnu.

Index: ChangeLog
===
--- ChangeLog   (revision 242906)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2016-11-30  Elizebeth Punnoose 
+
+   PR fortran/77505
+   * trans-array.c (trans_array_constructor): Treat negative character
+   length as LEN=0.
+
 2016-11-27  Paul Thomas  
 
PR fortran/78474

Index: trans-array.c
===
--- trans-array.c   (revision 242906)
+++ trans-array.c   (working copy)
@@ -2226,6 +2226,8 @@ trans_array_constructor (gfc_ss * ss, lo
   gfc_ss_info *ss_info;
   gfc_expr *expr;
   gfc_ss *s;
+  tree neg_len;
+  char *msg;
 
   /* Save the old values for nested checking.  */
   old_first_len = first_len;
@@ -2271,6 +2273,28 @@ trans_array_constructor (gfc_ss * ss, lo
  gfc_conv_expr_type (&length_se, expr->ts.u.cl->length,
 gfc_charlen_type_node);
  ss_info->string_length = length_se.expr;
+
+ /* Check if the character length is negative,
+   if so consider it as LEN=0.  */
+ neg_len = fold_build2_loc (input_location, LT_EXPR,
+  boolean_type_node, ss_info->string_length,
+  build_int_cst (gfc_charlen_type_node, 0));
+ /* Print a warning if bounds checking is enabled.  */
+ if (gfc_option.rtcheck & GFC_RTCHECK_BOUNDS)
+ {
+msg = xasprintf ("Negative character length will be treated"
+" as LEN=0");
+gfc_trans_runtime_check (false, true, neg_len, &length_se.pre,
+where, msg);
+free (msg);
+ }
+ ss_info->string_length = fold_build3_loc (input_location, COND_EXPR,
+  gfc_charlen_type_node, neg_len,
+  build_int_cst (gfc_charlen_type_node, 0),
+  ss_info->string_length);
+ ss_info->string_length = gfc_evaluate_now (ss_info->string_length,
+   &length_se.pre);
+
  gfc_add_block_to_block (&outer_loop->pre, &length_se.pre);
  gfc_add_block_to_block (&outer_loop->post, &length_se.post);
}


Index: ChangeLog
===
--- ChangeLog   (revision 242906)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2016-11-30 Elizebeth Punnoose 
+
+   PR fortran/77505
+   * gfortran.dg/pr77505_1.f90: New test.
+   * gfortran.dg/pr77505_2.f90: New test.
+
 2016-11-27  Paul Thomas  
 
PR fortran/78474


Index: pr77505_1.f90
===
--- pr77505_1.f90 (nonexistent)
+++ pr77505_1.f90  (working copy)
@@ -0,0 +1,13 @@
+! { dg-do run }
+program rabbithole
+implicit none
+character(len=:),allocatable    :: text_block(:)
+integer :: i
+integer :: ii
+character(len=10)   :: cten='abcdefghij'
+character(len=20)   :: ctwenty='abcdefghijabcdefghij'
+ii=-6
+text_block=[ character(len=ii) :: cten, ctwenty ]
+write(*,*)'WRITE IT'
+write(*,'(a)')(trim(text_block(i)),i=1,size(text_block))
+end program rabbithole


Index: pr77505_2.f90
===
--- pr77505_2.f90 (nonexistent)
+++ pr77505_2.f90  (working copy)
@@ -0,0 +1,14 @@
+! { dg-options "-fcheck=bounds" }
+! { dg-do run }
+program rabbithole
+implicit none
+character(len=:),allocatable    :: text_block(:)
+integer :: i
+integer :: ii
+character(len=10)   :: cten='abcdefghij'
+character(len=20)   :: ctwenty='abcdefghijabcdefghij'
+ii=-6
+text_block=[ character(len=ii) :: cten, ctwenty ]
+write(*,*)'WRITE IT'
+write(*,'(a)')(trim(text_block(i)),i=1,size(text_block))
+end program rabbithole


Thanks,
Elizebeth


[PATCH] PR fortran/77505 -- Treat negative character length as LEN=0

2016-11-29 Thread Punnoose, Elizebeth
This patch checks for negative character length in the array constructor,
and treats it as LEN=0. A warning message is also printed if bounds checking is 
enabled.

Bootstrapped and regression tested the patch on x86_64-linux-gnu and 
aarch64-linux-gnu.

Index: ChangeLog
===
--- ChangeLog    (revision 242906)
+++ ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2016-11-30  Elizebeth Punnoose 
+
+ PR fortran/77505
+ * trans-array.c (trans_array_constructor): Treat negative character
+ length as LEN=0.
+
2016-11-27  Paul Thomas  

    PR fortran/78474


Index: trans-array.c
===
--- trans-array.c (revision 242906)
+++ trans-array.c (working copy)
@@ -2226,6 +2226,8 @@ trans_array_constructor (gfc_ss * ss, lo
   gfc_ss_info *ss_info;
   gfc_expr *expr;
   gfc_ss *s;
+  tree neg_len;
+  char *msg;

   /* Save the old values for nested checking.  */
   old_first_len = first_len;
@@ -2271,6 +2273,28 @@ trans_array_constructor (gfc_ss * ss, lo
     gfc_conv_expr_type (&length_se, expr->ts.u.cl->length,
     gfc_charlen_type_node);
     ss_info->string_length = length_se.expr;
+
+   /* Check if the character length is negative,
+  if so consider it as LEN=0.  */
+   neg_len = fold_build2_loc (input_location, LT_EXPR,
+  boolean_type_node, ss_info->string_length,
+  build_int_cst (gfc_charlen_type_node, 0));
+   /* Print a warning if bounds checking is enabled.  */
+   if (gfc_option.rtcheck & GFC_RTCHECK_BOUNDS)
+   {
+     msg = xasprintf ("Negative character length will be treated"
+  " as LEN=0");
+     gfc_trans_runtime_check (false, true, neg_len, &length_se.pre,
+  where, msg);
+     free (msg);
+   }
+   ss_info->string_length = fold_build3_loc (input_location, COND_EXPR,
+  gfc_charlen_type_node, neg_len,
+  build_int_cst (gfc_charlen_type_node, 0),
+  ss_info->string_length);
+   ss_info->string_length = gfc_evaluate_now (ss_info->string_length,
+  &length_se.pre);
+
     gfc_add_block_to_block (&outer_loop->pre, &length_se.pre);
     gfc_add_block_to_block (&outer_loop->post, &length_se.post);
   }


Index: ChangeLog
===
--- ChangeLog    (revision 242906)
+++ ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2016-11-30 Elizebeth Punnoose 
+
+ PR fortran/77505
+ * gfortran.dg/pr77505_1.f90: New test.
+ * gfortran.dg/pr77505_2.f90: New test.
+
2016-11-27  Paul Thomas  

    PR fortran/78474


Index: pr77505_1.f90
===
--- pr77505_1.f90 (nonexistent)
+++ pr77505_1.f90  (working copy)
@@ -0,0 +1,13 @@
+! { dg-do run }
+program rabbithole
+implicit none
+character(len=:),allocatable    :: text_block(:)
+integer :: i
+integer :: ii
+character(len=10)   :: cten='abcdefghij'
+character(len=20)   :: ctwenty='abcdefghijabcdefghij'
+ii=-6
+text_block=[ character(len=ii) :: cten, ctwenty ]
+write(*,*)'WRITE IT'
+write(*,'(a)')(trim(text_block(i)),i=1,size(text_block))
+end program rabbithole


Index: pr77505_2.f90
===
--- pr77505_2.f90 (nonexistent)
+++ pr77505_2.f90  (working copy)
@@ -0,0 +1,14 @@
+! { dg-options "-fcheck=bounds" }
+! { dg-do run }
+program rabbithole
+implicit none
+character(len=:),allocatable    :: text_block(:)
+integer :: i
+integer :: ii
+character(len=10)   :: cten='abcdefghij'
+character(len=20)   :: ctwenty='abcdefghijabcdefghij'
+ii=-6
+text_block=[ character(len=ii) :: cten, ctwenty ]
+write(*,*)'WRITE IT'
+write(*,'(a)')(trim(text_block(i)),i=1,size(text_block))
+end program rabbithole


Thanks,
Elizebeth


Fix arc builds

2016-11-29 Thread Jeff Law


There are a couple of unused variables in arc_handle_option.  This patch
removes them.  Verified the arc port builds again.


Installed on the trunk.

Jeff
commit 0177a97d002107d99f82be0861ac0052285ccc0a
Author: law 
Date:   Wed Nov 30 04:37:10 2016 +

* common/config/arc/arc-common.c (arc_handle_option): Remove unused
variables.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@242994 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index df787e1..a5b191b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2016-11-29  Jeff Law  
 
+   * common/config/arc/arc-common.c (arc_handle_option): Remove unused
+   variables.
+
* lra-constraints.c (check_and_process_move): Constrain the
range of DCLASS and SCLASS to avoid false positive out of bounds
array index warning.
diff --git a/gcc/common/config/arc/arc-common.c b/gcc/common/config/arc/arc-common.c
index 1dbddae..9f87122 100644
--- a/gcc/common/config/arc/arc-common.c
+++ b/gcc/common/config/arc/arc-common.c
@@ -69,9 +69,7 @@ arc_handle_option (struct gcc_options *opts,
 {
   size_t code = decoded->opt_index;
   int value = decoded->value;
-  const char *arg = decoded->arg;
   static int mcpu_seen = PROCESSOR_NONE;
-  char *p;
 
   switch (code)
 {


Re: [RFA] Fix false positive out of bounds array index warning in LRA

2016-11-29 Thread Jeff Law

On 11/02/2016 01:20 PM, Bernd Schmidt wrote:

On 10/29/2016 06:21 PM, Jeff Law wrote:


On a small number of ports, we only have 2 defined register classes.
NO_REGS and ALL_REGS.  Examples would include nvptx and vax.

So let's look at check_and_process_move from lra-constraints.c:

  sclass = dclass = NO_REGS;
  if (REG_P (dreg))
dclass = get_reg_class (REGNO (dreg));
  if (dclass == ALL_REGS)
/* ALL_REGS is used for new pseudos created by transformations
   like reload of SUBREG_REG (see function
   simplify_operand_subreg).  We don't know their class yet.  We
   should figure out the class from processing the insn
   constraints not in this fast path function.  Even if ALL_REGS
   were a right class for the pseudo, secondary_... hooks usually
   are not define for ALL_REGS.  */
return false;
  [ ... ]
  /* Set up hard register for a reload pseudo for hook
 secondary_reload because some targets just ignore unassigned
 pseudos in the hook.  */
  if (dclass != NO_REGS && lra_get_regno_hard_regno (REGNO (dreg)) < 0)
{
  dregno = REGNO (dreg);
  reg_renumber[dregno] = ira_class_hard_regs[dclass][0];
}


The reference to ira_class_hard_regs is flagged by VRP as having an
out-of-bounds array reference.  WTF!?  Of course it's pretty obvious
once you look at the dumps...

On most targets dclass in this code will have a VR_VARYING range and as
a result no warning will be issued.  But on the ptx and vax ports VRP is
able to give us the range ~[NO_REGS,ALL_REGS], aka ~[0,1].  The
out-of-range array index is now obvious.


So I tried to look up the rules for enum values and it seems like we
can't prove that the code in the if statement is dead. However, can't we
at least prove that it is "dead enough" for warning purposes?
Thinking more about this, suppressing the warning in the case where we 
might have an out of bounds enum seems wrong -- that would seem like an 
important case we would want to catch.  (ie, some path shoves an 
out-of-range value into the enum object which is then used in an array 
reference).


That does make me wonder whether we would want a warning when we identify
an assignment to an enum object where the RHS is out of the range of the
enum.  It'd probably trigger too many false positives in practice :(







Ok for the trunk?


Hmm, seems like a deficiency in the warning code TBH, but I guess this
patch can't hurt. Maybe open a PR for improving the warning.
I'm going to go ahead and install the patch.  I'm still wondering what, 
if any, BZ to open around improving the existing warning, adding a new 
one or converting GCC to use safe enums :-)


jeff


Re: [RFC PATCH] avoid printing type suffix with %E (PR c/78165)

2016-11-29 Thread Jeff Law

On 11/19/2016 02:04 PM, Martin Sebor wrote:

On 10/26/2016 02:46 PM, Joseph Myers wrote:

On Wed, 26 Oct 2016, Martin Sebor wrote:


The attached patch implements one such approach by having the pretty
printer recognize the space format flag to suppress the type suffix,
so "%E" still prints the suffix but "% E" does not.  I did this to
preserve the existing output but I think it would be nicer to avoid
printing the suffix with %E and treat (for instance) the pound sign
as a request to add the suffix.  I have tested the attached patch
but not the alternative.


I think printing the suffixes is a relic of %E being used to print full
expressions.

It's established by now that printing expressions reconstructed from trees
is a bad idea; we can get better results by having precise location ranges
and underlining the relevant part of the source.  So if we could make sure
nowhere is trying to use %E (or %qE, etc.) with expressions that might
not be constants, where the type might be relevant, then we'd have
confidence that stopping printing the suffix is safe.  But given the low
quality of the reconstructed expressions, it's probably safe anyway.

(Most %qE uses are for identifiers not expressions.  If we give
identifiers a different static type from "tree" - and certainly there
isn't much reason for them to have the same type as expressions - then
we'll need to change the format for either identifiers or expressions.)


Attached is a trivial patch to remove the suffix.  I didn't see
any failures in the test suite as a result.  I didn't attempt to
remove the type suffix from any tests (nor did my relatively
superficial search find any) but it will help simplify the tests
for my patches that are still in the review queue.

I should add to the rationale for the change I gave in my reply
to Jeff:

  https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01692.html

that the print_dec[su] function that's sometimes used to format
integers in warning messages (e.g., by the -Walloca-larger-than
pass) doesn't add the suffix because it doesn't have knowledge
of the argument's type (it operates on wide_int).  That further
adds to the inconsistency in diagnostics.  This patch makes all
integers in diagnostics consistent regardless of their type.
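
As an illustration of the inconsistency described above, here is a small sketch that formats the same value with and without a C type suffix.  This is not the pretty-printer code: format_constant and its parameters are hypothetical, and the "UL" spelling is just an assumption chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write VALUE into BUF, optionally followed by a
   C suffix for unsigned long ("UL").  Returns BUF.  */
char *
format_constant (char *buf, size_t size, unsigned long value, int with_suffix)
{
  snprintf (buf, size, "%lu%s", value, with_suffix ? "UL" : "");
  return buf;
}
```

A diagnostic that mixes both styles would show the same constant in two spellings, which is the inconsistency the patch removes by always omitting the suffix.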

Thanks
Martin

gcc-78165.diff


PR c/78165 - avoid printing type suffix for constants in %E output

gcc/c-family/ChangeLog:

PR c/78165
* c-pretty-print (pp_c_integer_constant): Avoid formatting type
suffix.
I think you and Joseph have made a practical case for dropping the type 
suffix.


Ok for the trunk.

jeff



Re: [PATCH] avoid calling alloca(0)

2016-11-29 Thread Jeff Law

On 11/26/2016 05:52 PM, Martin Sebor wrote:

On 11/25/2016 12:51 PM, Jeff Law wrote:

On 11/23/2016 06:15 PM, Martin Sebor wrote:


gcc_assert works only in some instances (e.g., in c-ada-spec.c:191)
but not in others because some actually do make the alloca(0) call
at runtime: at a minimum, lto.c:3285, reg-stack.c:2008, and
tree-ssa-threadedge.c:344 assert during bootstrap.

You might have the wrong line number of reg-stack.c and lto.  You've
pointed to the start of subst_asm_stack_regs and lto_main respectively.
It'd probably be better if you posted the line with a bit of context.


I must have copied the wrong line numbers or had stale sources
in my tree.  Sorry about that.  In lto.c, there are two calls
to XALLOCAVEC.  I believe the first one is the one where the
alloca(0) call takes place:

  1580
  1581  tree *map = XALLOCAVEC (tree, 2 * len);
  1582  for (tree_scc *pscc = *slot; pscc; pscc = pscc->next)
--
  1610{
  1611  tree *map2 = XALLOCAVEC (tree, 2 * len);
  1612  for (unsigned i = 0; i < len; ++i)
I'm not at all familiar with this code, but something looks real fishy 
here.  Essentially if pscc->entry_len is >= 1 and len == 0, then we'll 
read map[0] and map[1] which were never allocated (see compare_tree_sccs 
and compare_tree_sccs_1).


It may be the case that  pscc->entry_len and len are related in such a 
way that never happens, but I can't easily prove it.  I'd really like 
Richi to chime in on how this stuff is supposed to work.





In reg-stack.c it's these three:

  2052
  2053  note_reg = XALLOCAVEC (rtx, i);
  2054  note_loc = XALLOCAVEC (rtx *, i);
  2055  note_kind = XALLOCAVEC (enum reg_note, i);
  2056
So for reg-stack.c I think we move the n_notes initialization before the 
XALLOCAVEC, then wrap the XALLOCAVEC calls and the subsequent loop over 
the notes inside an if (i > 0) conditional.


Damn you for making me look at reg-stack.c.  It's been years and 
hopefully it'll be years before I have to do it again :-)



n_notes = 0;
if (i > 0)
  {
note_reg =
note_loc =
note_kind =

for (note = REG_NOTES (insn); ...)
  {
 ...
  }
  }
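
Spelled out as something compilable, the shape I have in mind is roughly the following.  This is only a sketch under assumptions: ALLOCAVEC is a hypothetical stand-in for XALLOCAVEC, __builtin_alloca is the GCC/Clang builtin, and sum_positive is a made-up example function rather than anything in reg-stack.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for XALLOCAVEC (T, N); __builtin_alloca is the
   GCC/Clang builtin, used here to avoid depending on <alloca.h>.  */
#define ALLOCAVEC(T, N) ((T *) __builtin_alloca (sizeof (T) * (N)))

/* Sum the positive entries of VALS[0..N-1], staging them in a stack
   buffer.  The buffer is only allocated when N > 0, so alloca (0) is
   never called.  */
int
sum_positive (const int *vals, size_t n)
{
  int total = 0;
  size_t kept = 0;              /* Initialized before the guard, like n_notes.  */

  if (n > 0)
    {
      int *tmp = ALLOCAVEC (int, n);

      for (size_t i = 0; i < n; i++)
        if (vals[i] > 0)
          tmp[kept++] = vals[i];

      for (size_t i = 0; i < kept; i++)
        total += tmp[i];
    }

  assert (kept <= n);
  return total;
}
```

The counter is initialized outside the conditional so that code after the block can still rely on it when nothing was allocated.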


I'm pretty sure I can twiddle the tree-ssa-threadedge code to avoid the 
problem in there.




To find all such calls I modified GCC to emit an inform call for
every XALLOCAVEC invocation with a zero argument, configured the
patched GCC on x86_64 with all languages (including lto),
bootstrapped it, ran the full test suite, and extracted the set
of unique notes from the logs.  Attached in the .log file is
the output along with counts of each.  Curiously, neither of
the two above shows up, even though adding asserts for them
broke bootstrap.  I haven't investigated why.
Thanks.  That's interesting data -- every one of those should be deeply 
investigated.  I suspect we'd probably trip even more if we did a test 
with config-list.mk and perhaps even more if we took those cross 
compilers and built the target libraries.


What I think this tells us is that we're not at a place where we're 
clean.  But we can incrementally get there.  The warning is only 
catching a fairly small subset of the cases AFAICT.  That's not unusual 
and analyzing why it didn't trigger on those cases might be useful as well.


So where does this leave us for gcc-7?  I'm wondering if we drop the 
warning in, but not enable it by default anywhere.  We fix the cases we 
can (such as reg-stack.c, tree-ssa-threadedge.c, maybe others) before 
stage3 closes, and shoot for the rest in gcc-8, including improving the 
warning (if there's something we can clearly improve), and enabling the 
warning in -Wall or -Wextra.



Jeff


Re: [PATCH] enable -Wformat-length for dynamically allocated buffers (pr 78245)

2016-11-29 Thread Martin Sebor

That said, I defer to you on how to proceed here.  I'm prepared
to do the work(*) but I do worry about jeopardizing the chances
of this patch and the others making it into 7.0.

So would it make sense to just init/fini the b_o_s framework in your
pass and for builtin expansion?


I think that should work for the sprintf checking.  Let me test it.
We can deal with the memxxx and strxxx patch (53562) independently
if you prefer.


Attached is a modified patch that calls {init,fini}_object_sizes()
from the gimple-ssa-sprintf pass instead.

While this works fine, I do like the approach of making the calls
in a single function better because it makes for a more robust API.
Decoupling the init/fini calls from the compute_object_size()
function that depends on them having been made makes the API easier
to accidentally misuse by calling one while forgetting to call one
or both of the other two.
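
To illustrate the API shape I would prefer, here is a minimal sketch with hypothetical names (init_sizes, fini_sizes, compute_size and the cache are made up and are not the tree-object-size.c code).  The compute function sets up the state it depends on, so a caller cannot forget the init call, and both init and fini are safe to call repeatedly:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache standing in for the object-size tables.  */
static long *size_cache;
enum { CACHE_SLOTS = 8 };

/* Allocate the cache if it does not exist yet.  Safe to call repeatedly.  */
void
init_sizes (void)
{
  if (!size_cache)
    {
      size_cache = calloc (CACHE_SLOTS, sizeof *size_cache);
      assert (size_cache);
    }
}

/* Release the cache.  Safe to call when nothing was allocated.  */
void
fini_sizes (void)
{
  free (size_cache);
  size_cache = NULL;
}

/* Return a (made-up) size for slot IDX.  The function initializes the
   state it needs, so callers cannot forget the init call.  */
long
compute_size (unsigned idx)
{
  init_sizes ();
  assert (idx < CACHE_SLOTS);
  if (size_cache[idx] == 0)
    size_cache[idx] = 16L * (idx + 1);
  return size_cache[idx];
}
```

The caller still owns the fini call, but forgetting it only leaks the cache rather than reading uninitialized state.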

Martin

PR middle-end/78245 - missing -Wformat-length on an overflow of a dynamically allocated buffer

gcc/testsuite/ChangeLog:

	PR middle-end/78245
	* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Add tests.

gcc/ChangeLog:

	PR middle-end/78245
	* gimple-ssa-sprintf.c (get_destination_size): Call
	compute_object_size.
	* tree-object-size.c (addr_object_size): Adjust.
	(pass_through_call): Adjust.
	(internal_object_size): New function.
	(compute_builtin_object_size): Call internal_object_size.
	(pass_object_sizes::execute): Adjust.
	* tree-object-size.h (fini_object_sizes): Declare.

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index ead8b0e..34b3723 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -2466,6 +2466,9 @@ get_destination_size (tree dest)
  a member array as opposed to the whole enclosing object), otherwise
  use type-zero object size to determine the size of the enclosing
  object (the function fails without optimization in this type).  */
+
+  init_object_sizes ();
+
   int ost = optimize > 0;
   unsigned HOST_WIDE_INT size;
   if (compute_builtin_object_size (dest, ost, &size))
@@ -2800,6 +2803,8 @@ pass_sprintf_length::execute (function *fun)
 	}
 }
 
+  fini_object_sizes ();
+
   return 0;
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
index 8d97fa8..9874332 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
@@ -1,5 +1,10 @@
 /* { dg-do compile } */
-/* { dg-options "-std=c99 -O2 -Wformat -Wformat-length=1 -ftrack-macro-expansion=0" } */
+/* { dg-options "-O2 -Wformat -Wformat-length=1 -ftrack-macro-expansion=0" } */
+/* Verify that all sprintf built-ins detect overflow involving directives
+   with non-constant arguments known to be constrained by some range of
+   values, and even when writing into dynamically allocated buffers.
+   -O2 (-ftree-vrp) is necessary for the tests involving ranges to pass,
+   otherwise -O1 is sufficient.  */
 
 #ifndef LINE
 #  define LINE 0
@@ -7,18 +12,26 @@
 
 #define bos(x) __builtin_object_size (x, 0)
 
-#define T(bufsize, fmt, ...)		\
-do {\
-  if (!LINE || __LINE__ == LINE)	\
-	{\
-	  char *d = (char *)__builtin_malloc (bufsize);			\
-	  __builtin___sprintf_chk (d, 0, bos (d), fmt, __VA_ARGS__);	\
-	  sink (d);			\
-	}\
-} while (0)
+/* Defined (and redefined) to the allocation function to use, either
+   malloc, or alloca, or a VLA.  */
+#define ALLOC(p, n)   (p) = __builtin_malloc (n)
 
-void
-sink (void*);
+/* Defined (and redefined) to the sprintf function to exercise.  */
+#define TEST_SPRINTF(d, maxsize, objsize, fmt, ...)		\
+  __builtin___sprintf_chk (d, 0, objsize, fmt, __VA_ARGS__)
+
+#define T(bufsize, fmt, ...)\
+  do {			\
+if (!LINE || __LINE__ == LINE)			\
+  {			\
+	char *d;	\
+	ALLOC (d, bufsize);\
+	TEST_SPRINTF (d, 0, bos (d), fmt, __VA_ARGS__);	\
+	sink (d);	\
+  }			\
+  } while (0)
+
+void sink (void*);
 
 /* Identity function to verify that the checker figures out the value
of the operand even when it's not constant (i.e., makes use of
@@ -232,3 +245,88 @@ void test_sprintf_chk_range_sshort (signed short *a, signed short *b)
   T ( 4, "%i",  Ra (998,  999));
   T ( 4, "%i",  Ra (999, 1000)); /* { dg-warning "may write a terminating nul past the end of the destination" } */
 }
+
+/* Exercise ordinary sprintf with malloc.  */
+#undef TEST_SPRINTF
+#define TEST_SPRINTF(d, maxsize, objsize, fmt, ...)	\
+  __builtin_sprintf (d, fmt, __VA_ARGS__)
+
+void test_sprintf_malloc (const char *s, const char *t)
+{
+#define x x ()
+
+  T (1, "%-s", x ? "" : "1");   /* { dg-warning "nul past the end" } */
+  T (1, "%-s", x ? "1" : "");   /* { dg-warning "nul past the end" } */
+  T (1, "%-s", x ? s : "1");/* { dg-warning "nul past the end" } */
+  T (1, "%-s", x ? "1" : s);/* { dg-warning "nul past the end" } */
+  T (1, "%-s", x ? s : 

Re: [PATCH] Support nested functions (PR sanitize/78541).

2016-11-29 Thread Jeff Law

On 11/29/2016 03:44 AM, Martin Liška wrote:

Currently we have an assert that prevents proper use-after-scope sanitization
in nested functions. With the attached patch, we are able to do so.
I'm adding 2 test-cases, first one is the ICE reported in PR and the second
one tests proper report of use-after-scope passed by FRAME belonging to a
nested function call.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin


0001-Support-nested-functions-PR-sanitize-78541.patch


From 8e02ebdf64a82f0dfc7be531a38702497dece26b Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 28 Nov 2016 13:05:33 +0100
Subject: [PATCH] Support nested functions (PR sanitize/78541).

gcc/testsuite/ChangeLog:

2016-11-28  Martin Liska  

PR sanitize/78541
* gcc.dg/asan/pr78541-2.c: New test.
* gcc.dg/asan/pr78541.c: New test.

gcc/ChangeLog:

2016-11-28  Martin Liska  

PR sanitize/78541
* asan.c (asan_expand_mark_ifn): Properly
select a VAR_DECL from FRAME.* component reference.

OK.
jeff



Re: [PING] [PATCH] Fix PR31096

2016-11-29 Thread Hurugalawadi, Naveen
Hi Jeff,

>> I believe Richi asked for a small change after which you can consider 
>> the patch approved:

Yeah. Thanks for all the comments and reviews.
Patch committed after the modification as:-

https://gcc.gnu.org/ml/gcc-cvs/2016-11/msg01019.html

Thanks,
Naveen

libgo patch committed: Some fixes for c-archive/c-shared mode

2016-11-29 Thread Ian Lance Taylor
This patch to libgo fixes a couple of problems that arise when
building Go code into an archive or shared library that is linked into
a C program.

In archive mode, initsig is called before the memory allocator has
been initialized.  The code was doing a memory allocation because of
the call to funcPC(sigtramp).  When escape analysis is fully
implemented, that call should not allocate.  For now, finesse the
issue by calling a C function to get the C function pointer value of
sigtramp.

When returning from a call from C to a Go function, a deferred
function is run to go back to syscall mode.  When the call occurs on a
non-Go thread, that call sets g to nil, making it impossible to add
the _defer struct back to the pool.  Just drop it and let the garbage
collector clean it up.
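
The second fix boils down to a guard at the top of the function that returns a record to a per-thread pool.  A rough sketch of that shape, written in C rather than Go and using hypothetical names (struct thread_ctx, current_ctx, free_record); it is not the libgo code:

```c
#include <assert.h>
#include <stddef.h>

struct record { struct record *next; };

/* Hypothetical per-thread context owning a free list of records.  */
struct thread_ctx { struct record *pool; size_t pooled; };

/* Stand-in for the runtime's notion of "current g"; NULL models a
   non-Go thread after the callback has finished.  */
struct thread_ctx *current_ctx;

/* Return R to the current context's pool.  Returns 1 if pooled, 0 if
   there is no context and R was not attached to any pool.  */
int
free_record (struct record *r)
{
  if (current_ctx == NULL)
    return 0;                   /* Nothing to attach R to; skip pooling.  */

  r->next = current_ctx->pool;
  current_ctx->pool = r;
  current_ctx->pooled++;
  return 1;
}
```

In Go the skipped record is reclaimed by the garbage collector; in this C sketch the caller would remain responsible for it.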

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 242726)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-4d8e00e730897cc7e73b1582522ecab031cfcaf2
+1d3e0ceee45012a1c3b4ff7f5119a72f90bfcf6a
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/panic.go
===
--- libgo/go/runtime/panic.go   (revision 242724)
+++ libgo/go/runtime/panic.go   (working copy)
@@ -141,6 +141,15 @@ func freedefer(d *_defer) {
if d.special {
return
}
+
+   // When C code calls a Go function on a non-Go thread, the
+   // deferred call to cgocallBackDone will set g to nil.
+   // Don't crash trying to put d on the free list; just let it
+   // be garbage collected.
+   if getg() == nil {
+   return
+   }
+
mp := acquirem()
pp := mp.p.ptr()
if len(pp.deferpool) == cap(pp.deferpool) {
Index: libgo/go/runtime/signal1_unix.go
===
--- libgo/go/runtime/signal1_unix.go(revision 242724)
+++ libgo/go/runtime/signal1_unix.go(working copy)
@@ -93,7 +93,7 @@ func initsig(preinit bool) {
}
 
t.flags |= _SigHandling
-   setsig(i, funcPC(sigtramp), true)
+   setsig(i, getSigtramp(), true)
}
 }
 
@@ -137,7 +137,7 @@ func sigenable(sig uint32) {
if t.flags&_SigHandling == 0 {
t.flags |= _SigHandling
fwdSig[sig] = getsig(int32(sig))
-   setsig(int32(sig), funcPC(sigtramp), true)
+   setsig(int32(sig), getSigtramp(), true)
}
}
 }
@@ -265,7 +265,7 @@ func raisebadsignal(sig int32, c *sigctx
// We may receive another instance of the signal before we
// restore the Go handler, but that is not so bad: we know
// that the Go program has been ignoring the signal.
-   setsig(sig, funcPC(sigtramp), true)
+   setsig(sig, getSigtramp(), true)
 }
 
 func crash() {
Index: libgo/go/runtime/stubs.go
===
--- libgo/go/runtime/stubs.go   (revision 242724)
+++ libgo/go/runtime/stubs.go   (working copy)
@@ -502,8 +502,8 @@ func goexit1()
 func schedtrace(bool)
 func freezetheworld()
 
-// Signal trampoline, written in C.
-func sigtramp()
+// Get signal trampoline, written in C.
+func getSigtramp() uintptr
 
 // The sa_handler field is generally hidden in a union, so use C accessors.
 func getSigactionHandler(*_sigaction) uintptr
Index: libgo/runtime/go-signal.c
===
--- libgo/runtime/go-signal.c   (revision 242724)
+++ libgo/runtime/go-signal.c   (working copy)
@@ -140,6 +140,15 @@ sigtramp(int sig, siginfo_t *info, void
 
 #endif // USING_SPLIT_STACK
 
+// C function to return the address of the sigtramp function.
+uintptr getSigtramp(void) __asm__ (GOSYM_PREFIX "runtime.getSigtramp");
+
+uintptr
+getSigtramp()
+{
+  return (uintptr)(void*)sigtramp;
+}
+
 // C code to manage the sigaction sa_sigaction field, which is
 // typically a union and so hard for mksysinfo.sh to handle.
 


Re: [PATCH] Fix format_integer (PR tree-optimization/78586)

2016-11-29 Thread Jeff Law

On 11/29/2016 12:48 PM, Jakub Jelinek wrote:

Hi!

As mentioned in the PR, the LSHIFT_EXPR computation of values that
will need longest or shortest string is both incorrect (it shifts
integer_one_node left, so for precisions above precision of integer
it returns 0 (not to mention that it is invalid GENERIC, because the types
of first operand and result have to match)) and unnecessary - every integral
type already has TYPE_MIN_VALUE and TYPE_MAX_VALUE readily available.
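
A plain C analogue of the first problem, offered only as a sketch and not as the GCC code: shifting an int-typed 1 left by (precision - 1) is undefined once the shift count reaches the width of int, so the shift has to be done on an operand at least as wide as the target type.  The function name signed_max_for_precision is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of a signed integer type with PREC bits including the
   sign bit (1 <= PREC <= 64).  The shift is done on a 64-bit unsigned
   operand, so it stays in range even when PREC exceeds the width of
   int.  */
uint64_t
signed_max_for_precision (unsigned prec)
{
  assert (prec >= 1 && prec <= 64);
  return (UINT64_C (1) << (prec - 1)) - 1;
}
```

When the type is known up front, reading its limit directly (the <limits.h> constants in C, TYPE_MAX_VALUE in the patch) avoids the computation altogether.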

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, in the PR I've raised various further questions, Martin, can you look
at them?

2016-11-29  Jakub Jelinek  

PR tree-optimization/78586
* gimple-ssa-sprintf.c (format_integer): Use TYPE_MAX_VALUE or
TYPE_MIN_VALUE or build_all_ones_cst instead of folding LSHIFT_EXPR.
Don't build_int_cst min/max twice.  Formatting fix.

* gcc.c-torture/execute/pr78586.c: New test.
This is OK.  Note there's a goodly amount of code duplicated in that 
last hunk.  Your call whether or not to try and unify any of that.


jeff



Re: [Patch 4/5] OpenACC tile clause support, Fortran front-end parts

2016-11-29 Thread Cesar Philippidis
On 11/18/2016 03:24 AM, Jakub Jelinek wrote:
> On Sat, Nov 12, 2016 at 08:51:00AM -0800, Cesar Philippidis wrote:
>> On 11/11/2016 02:34 AM, Jakub Jelinek wrote:
>>> On Thu, Nov 10, 2016 at 06:46:46PM +0800, Chung-Lin Tang wrote:
>>
>> And here's the patch.
> 
> The patch doesn't look like OpenACC tile clause fortran support,
> but bind/nohost clause C/C++ support.

I think I got that patch mixed up with the acc routines patch. Here is
the fortran tile clause patch.

One notable difference between the trunk and gomp4 implementation of the
tile clause is that gomp4 errors on negative value tile arguments,
whereas trunk issues warnings. Is there a reason why the fortran FE
generally emits a warning, on say num_threads(-5), instead of an error?

Chung-Lin, I noticed in your source tree that you included a change to
gfortranspec.c. Is that necessary for trunk? I've included in this patch
just in case the other tile patches require it.

Cesar
2016-11-29  Cesar Philippidis  
	Joseph Myers  

	gcc/fortran/
	* gfortranspec.c (lang_specific_pre_link): Update call to do_spec.
	* openmp.c (resolve_omp_clauses): Error on directives
	containing both tile and collapse clauses.
	(resolve_oacc_loop_blocks): Represent '*' tile arguments as zero.
	* trans-openmp.c (gfc_trans_omp_do): Lower tiled loops like
 	collapsed loops.

	gcc/testsuite/
	* gfortran.dg/goacc/combined-directives.f90: Remove xfail.
	* gfortran.dg/goacc/tile-1.f90: New test.
	* gfortran.dg/goacc/tile-2.f90: New test.
	* gfortran.dg/goacc/tile-lowering.f95: New test.


diff --git a/gcc/fortran/gfortranspec.c b/gcc/fortran/gfortranspec.c
index 8a0e19a..6dc6fbe 100644
--- a/gcc/fortran/gfortranspec.c
+++ b/gcc/fortran/gfortranspec.c
@@ -439,7 +439,7 @@ int
 lang_specific_pre_link (void)
 {
   if (library)
-do_spec ("%:include(libgfortran.spec)");
+do_spec ("%:include(libgfortran.spec)", 0);
 
   return 0;
 }
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 11ffb5d..81f758e 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -4757,6 +4757,8 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 if (omp_clauses->wait_list)
   for (el = omp_clauses->wait_list; el; el = el->next)
 	resolve_scalar_int_expr (el->expr, "WAIT");
+  if (omp_clauses->collapse && omp_clauses->tile_list)
+gfc_error ("Incompatible use of TILE and COLLAPSE at %L", &code->loc);
   if (omp_clauses->depend_source && code->op != EXEC_OMP_ORDERED)
 gfc_error ("SOURCE dependence type only allowed "
 	   "on ORDERED directive at %L", &code->loc);
@@ -5903,11 +5905,11 @@ resolve_oacc_loop_blocks (gfc_code *code)
 	  if (el->expr == NULL)
 	{
 	  /* NULL expressions are used to represent '*' arguments.
-		 Convert those to a -1 expressions.  */
+		 Convert those to a 0 expressions.  */
 	  el->expr = gfc_get_constant_expr (BT_INTEGER,
 		gfc_default_integer_kind,
 		&code->loc);
-	  mpz_set_si (el->expr->value.integer, -1);
+	  mpz_set_si (el->expr->value.integer, 0);
 	}
 	  else
 	{
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 59fd6b3..d38bebc 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -3455,6 +3455,17 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
   dovar_init *di;
   unsigned ix;
   vec *saved_doacross_steps = doacross_steps;
+  gfc_expr_list *tile = do_clauses ? do_clauses->tile_list : clauses->tile_list;
+
+  /* Both collapsed and tiled loops are lowered the same way.  In
+ OpenACC, those clauses are not compatible, so prioritize the tile
+ clause, if present.  */
+  if (tile)
+{
+  collapse = 0;
+  for (gfc_expr_list *el = tile; el; el = el->next)
+	collapse++;
+}
 
   doacross_steps = NULL;
   if (clauses->orderedc)
diff --git a/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 b/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
index abb5e6b..42a447a 100644
--- a/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
@@ -143,8 +143,7 @@ end subroutine test
 ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. vector" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. seq" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. auto" 2 "gimple" } }
-! XFAILed: OpenACC tile clauses are discarded during gimplification.
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. tile.2, 3" 2 "gimple" { xfail *-*-* } } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. tile.2, 3" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "acc loop private.i. independent" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "omp target oacc_\[^ \]+ map.force_tofrom:y" 2 "gimple" } }
diff 

Re: Remove stray '@' from install.texi (was Re: [PATCH] Delete GCJ)

2016-11-29 Thread David Malcolm
On Tue, 2016-11-29 at 18:20 -0700, Sandra Loosemore wrote:
> On 11/29/2016 06:10 PM, David Malcolm wrote:
> > [snip]
> > 
> > r242985 seems to have broken the build, for me at least (with
> > texinfo
> > 5.1):
> > 
> > ../../src/gcc/doc/install.texi:2199: use braces to give a command
> > as an argument to @=
> > make[2]: *** [doc/gccinstall.info] Error 1
> > 
> > The attached patch fixes it.
> > 
> > OK to commit?
> 
> OK.  (This is so trivial it would qualify under the obvious patch
> rule 
> anyway.)
> 
> -Sandra

My texinfo skills aren't as strong as yours, so I was half-wondering if
this was some syntax I wasn't aware of, or maybe a version issue. 

Thanks for the confirmation; committed to trunk as r242991.

Dave


Re: Remove stray '@' from install.texi (was Re: [PATCH] Delete GCJ)

2016-11-29 Thread Sandra Loosemore

On 11/29/2016 06:10 PM, David Malcolm wrote:

[snip]

r242985 seems to have broken the build, for me at least (with texinfo
5.1):

../../src/gcc/doc/install.texi:2199: use braces to give a command as an 
argument to @=
make[2]: *** [doc/gccinstall.info] Error 1

The attached patch fixes it.

OK to commit?


OK.  (This is so trivial it would qualify under the obvious patch rule 
anyway.)


-Sandra



[committed] substring locations and # line directives (PR preprocessor/78569)

2016-11-29 Thread David Malcolm
The ICE in PR preprocessor/78569 appears to be due to an attempt to
generate substring locations in a .i file where the underlying .c file
has changed since the .i file was generated.

This can't work, so it seems safest for the on-demand substring
locations to be unavailable for such files, falling back to
"whole string" locations for such cases.

Successfully bootstrapped on x86_64-pc-linux-gnu;
adds 6 PASS results to gcc.sum.

Committed to trunk as r242990.
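
The fallback described above can be sketched in isolation as follows.  This is a simplified illustration under assumptions: struct line_table_sketch and substring_locations_unavailable are hypothetical names, and only the "return a reason string, or NULL on success" convention and the two checks are taken from the patch context below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down line table: only the flag that matters.  */
struct line_table_sketch { int seen_line_directive; };

/* Returns NULL when substring locations may be computed, otherwise a
   short reason string explaining why they are unavailable.  */
const char *
substring_locations_unavailable (const struct line_table_sketch *table,
                                 int track_macro_expansion)
{
  if (track_macro_expansion != 2)
    return "track_macro_expansion != 2";
  if (table->seen_line_directive)
    return "seen line directive";
  return NULL;
}
```

A caller that gets a non-NULL reason would then use the location of the whole string instead.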

gcc/ChangeLog:
PR preprocessor/78569
* input.c (get_substring_ranges_for_loc): Fail gracefully if
line directives were present.

gcc/testsuite/ChangeLog:
PR preprocessor/78569
* gcc.dg/format/pr78569.c: New test case.
---
 gcc/input.c   | 10 ++
 gcc/testsuite/gcc.dg/format/pr78569.c | 24 
 2 files changed, 34 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/format/pr78569.c

diff --git a/gcc/input.c b/gcc/input.c
index 611e18b..1c7228a 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -1331,6 +1331,16 @@ get_substring_ranges_for_loc (cpp_reader *pfile,
   if (cpp_get_options (pfile)->track_macro_expansion != 2)
 return "track_macro_expansion != 2";
 
+  /* If #line or # 44 "file"-style directives are present, then there's
+ no guarantee that the line numbers we have can be used to locate
+ the strings.  For example, we might have a .i file with # directives
+ pointing back to lines within a .c file, but the .c file might
+ have been edited since the .i file was created.
+ In such a case, the safest course is to disable on-demand substring
+ locations.  */
+  if (line_table->seen_line_directive)
+return "seen line directive";
+
   /* If string concatenation has occurred at STRLOC, get the locations
  of all of the literal tokens making up the compound string.
  Otherwise, just use STRLOC.  */
diff --git a/gcc/testsuite/gcc.dg/format/pr78569.c 
b/gcc/testsuite/gcc.dg/format/pr78569.c
new file mode 100644
index 000..e827087
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/format/pr78569.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-Wformat-length" } */
+
+/* A run of blank lines, so that we would fail the assertion in input.c:1388:
+   gcc_assert (line_width >= (start.column - 1 + literal_length));  */
+
+
+
+
+
+void test (void)
+{
+  char tmp[128];
+  /* Point to the run of blank lines, so that the components of the overlong
+ string appear to be present within the run of blank lines.  */
+# 6 "../../../../src/gcc/testsuite/gcc.dg/format/pr78569.c"
+  __builtin_snprintf (tmp, sizeof(tmp),
+ "The Base Band sends this value as a response to a "
+ "request for IMSI detach sent over the control "
+ "channel uplink (see section 7.6.1).");
+
+  /* { dg-warning "output truncated" "" { target *-*-* } 7 } */
+  /* { dg-message "format output" "" { target *-*-* } 6 } */
+}
-- 
1.8.5.3



Remove stray '@' from install.texi (was Re: [PATCH] Delete GCJ)

2016-11-29 Thread David Malcolm
On Tue, 2016-11-29 at 14:23 -0700, Jeff Law wrote:
> On 11/21/2016 04:23 PM, Matthias Klose wrote:
> > On 21.11.2016 18:16, Rainer Orth wrote:
> > > Hi Matthias,
> > > 
> > > > ahh, didn't see that :-/ Now fixed, is this clearer now?
> > > > 
> > > > The options @option{--with-target-bdw-gc-include} and
> > > > @option{--with-target-bdw-gc-lib} must always specified
> > > > together for
> > >^ be
> > 
> > thanks to all sorting out the documentation issues. Now attaching
> > the updated
> > diff. Ok to commit?
> > 
> > Matthias
> > 
> > 

[...]

> > gcc/
> > 
> > 2016-11-19  Matthias Klose  
> > 
> > * doc/install.texi: Document configure options --enable
> > -objc-gc
> > and --with-target-bdw-gc.

[...]

r242985 seems to have broken the build, for me at least (with texinfo
5.1):

../../src/gcc/doc/install.texi:2199: use braces to give a command as an 
argument to @=
make[2]: *** [doc/gccinstall.info] Error 1

The attached patch fixes it.

OK to commit?
Dave
From 60ec369e17c898fb3cdcbca43ed77d450a30d074 Mon Sep 17 00:00:00 2001
From: David Malcolm 
Date: Tue, 29 Nov 2016 20:38:53 -0500
Subject: [PATCH] Remove stray character from install.texi

gcc/ChangeLog:
	* doc/install.texi (--with-target-bdw-gc): Remove stray '@'.
---
 gcc/doc/install.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 5d96e5f..140ff80 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2196,7 +2196,7 @@ continues.
 @itemx --with-target-bdw-gc-lib=@var{list}
 Specify search directories for the garbage collector header files and
 libraries. @var{list} is a comma separated list of key value pairs of the
-form @samp{@var{multilibdir}@=@var{path}}, where the default multilib key
+form @samp{@var{multilibdir}=@var{path}}, where the default multilib key
 is named as @samp{.} (dot), or is omitted (e.g.
 @samp{--with-target-bdw-gc=/opt/bdw-gc,32=/opt-bdw-gc32}).
 
-- 
1.8.5.3



Re: [PATCH] Fix format_integer (PR tree-optimization/78586)

2016-11-29 Thread Martin Sebor

On 11/29/2016 12:48 PM, Jakub Jelinek wrote:

Hi!

As mentioned in the PR, the LSHIFT_EXPR computation of values that
will need longest or shortest string is both incorrect (it shifts
integer_one_node left, so for precisions above precision of integer
it returns 0 (not to mention that it is invalid GENERIC, because the types
of first operand and result have to match)) and unnecessary - every integral
type already has TYPE_MIN_VALUE and TYPE_MAX_VALUE readily available.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, in the PR I've raised various further questions, Martin, can you look
at them?


Sure.  I replied in the bug.

Thanks for fixing this!  My only comment is that it would be nice
to have the test in the builtin-sprintf-*.c series as a compile-only
test.  I often check them with my cross-compilers which I can't do
with tests that require running.  If you prefer to keep it as is
I'll go ahead and add one like it that's compile-only.

Martin



Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-29 Thread Andre Vehreschild
Hi Jerry,

Tests with multiple images go into the opencoarrays testsuite. Still to push 
though. The tests I have so far all pass.

- Andre

On 30 November 2016 00:06:22 CET, Jerry DeLisle wrote:
>On 11/28/2016 10:33 AM, Andre Vehreschild wrote:
>> PING!
>>
>> I know it's a lengthy patch, but comments would be nice anyway.
>>
>> - Andre
>>
>> On Tue, 22 Nov 2016 20:46:50 +0100
>> Andre Vehreschild  wrote:
>>
>>> Hi all,
>>>
>>> The attached patch extends the API of the caf-libs to enable
>>> asynchronous allocation of allocatable components. Allocatable
>>> components in derived type coarrays are different from regular
>>> coarrays or coarrayed components. The latter have to be allocated
>>> on all images or on none. Furthermore, their allocation is a point
>>> of synchronisation.
>>>
>>> For allocatable components, F2008 allows some to be allocated on
>>> some images and not on others. Furthermore, registering with the
>>> caf-lib that an allocatable component is present in a derived type
>>> coarray is no longer a synchronisation point. To implement these
>>> features, two new types of coarray registration have been
>>> introduced: the first just registers the component with the
>>> caf-lib, and the second does the allocate. The caf-API has also
>>> been extended with a query function to learn about the allocation
>>> status of a component on a remote image.
>>>
>>> Sorry that the patch is rather lengthy. Most of this is due to the
>>> signature change of structure_alloc_comps. The routine and its
>>> wrappers are used rather often, and all of those uses needed the
>>> corresponding changes.
>>>
>>> I know I left two or three TODOs in the patch to remind me of things
>>> I have to investigate further. For the current state these TODOs are
>>> no reason to hold back the patch. The third-party library
>>> opencoarrays implements the MPI part of the caf-model and will
>>> change in sync. It would of course be advantageous to be able to
>>> say: with gcc-7, gfortran implements allocatable components in
>>> derived coarrays nearly completely.
>>>
>>> I know we are in stage 3. But the patch bootstraps and regtests ok on
>>> x86_64-linux/F23. So, is it ok for trunk or shall it go to 7.2?
>>>
>>> Regards,
>>> Andre
>>
>>
>
>Patch applies OK, regression tested OK here, test cases look
>reasonable. Have 
>you been able to test with multiple images?
>
>Jerry

-- 
Andre Vehreschild * Kreuzherrenstr. 8 * 52062 Aachen
Tel.: +49 241 929 10 18 * ve...@gmx.de


Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-29 Thread Jerry DeLisle

On 11/28/2016 10:33 AM, Andre Vehreschild wrote:

PING!

I know it's a lengthy patch, but comments would be nice anyway.

- Andre

On Tue, 22 Nov 2016 20:46:50 +0100
Andre Vehreschild  wrote:


Hi all,

attached patch addresses the need of extending the API of the caf-libs to
enable allocatable components asynchronous allocation. Allocatable components
in derived type coarrays are different from regular coarrays or coarrayed
components. The latter have to be allocated on all images or on none.
Furthermore is the allocation a point of synchronisation.

For allocatable components the F2008 allows to have some allocated on some
images and on others not. Furthermore is the registration with the caf-lib,
that an allocatable component is present in a derived type coarray no longer a
synchronisation point. To implement these features two new types of coarray
registration have been introduced. The first one just registering the
component with the caf-lib and the latter doing the allocate. Furthermore has
the caf-API been extended to provide a query function to learn about the
allocation status of a component on a remote image.

Sorry, that the patch is rather lengthy. Most of this is due to the
structure_alloc_comps' signature change. The routine and its wrappers are used
rather often which needed the appropriate changes.

I know I left two or three TODOs in the patch to remind me of things I have to
investigate further. For the current state these TODOs are no reason to hold
back the patch. The third party library opencoarrays implements the mpi-part
of the caf-model and will change in sync. It would of course be advantageous
to just have to say: With gcc-7 gfortran implements allocatable components in
derived coarrays nearly completely.

I know we are in stage 3. But the patch bootstraps and regtests ok on
x86_64-linux/F23. So, is it ok for trunk or shall it go to 7.2?

Regards,
Andre





Patch applies OK, regression tested OK here, test cases look reasonable. Have 
you been able to test with multiple images?


Jerry


Re: Add a mem_alias_size helper class

2016-11-29 Thread Jeff Law

On 11/29/2016 03:51 PM, Richard Sandiford wrote:

Jeff Law  writes:

On 11/15/2016 09:04 AM, Richard Sandiford wrote:

alias.c encodes memory sizes as follows:

size > 0: the exact size is known
size == 0: the size isn't known
size < 0: the exact size of the reference itself is known,
  but the address has been aligned via AND.  In this case
  "-size" includes the size of the reference and the worst-case
  number of bytes traversed by the AND.

This patch wraps this up in a helper class and associated
functions.  The new routines fix what seems to be a hole
in the old logic: if the size of a reference A was unknown,
offset_overlap_p would assume that it could conflict with any
other reference B, even if we could prove that B comes before A.

The fallback CONSTANT_P (x) && CONSTANT_P (y) case looked incorrect.
Either "c" is trustworthy as a distance between the two constants,
in which case the alignment handling should work as well there as
elsewhere, or "c" isn't trustworthy, in which case offset_overlap_p
is unsafe.  I think the latter's true; AFAICT we have no evidence
that "c" really is the distance between the two references, so using
it in the check doesn't make sense.

At this point we've excluded cases for which:

(a) the base addresses are the same
(b) x and y are SYMBOL_REFs, or SYMBOL_REF-based constants
wrapped in a CONST
(c) x and y are both constant integers

No useful cases should be left.  As things stood, we would
assume that:

  (mem:SI (const_int X))

could overlap:

  (mem:SI (symbol_ref Y))

but not:

  (mem:SI (const (plus (symbol_ref Y) (const_int 4))))

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


[ This patch is part of the SVE series posted here:
  https://gcc.gnu.org/ml/gcc/2016-11/msg00030.html ]

gcc/
2016-11-15  Richard Sandiford  
Alan Hayward  
David Sherwood  

* alias.c (mem_alias_size): New class.
(mem_alias_size::mode): New function.
(mem_alias_size::exact_p): Likewise.
(mem_alias_size::max_size_known_p): Likewise.
(align_to): Likewise.
(alias_may_gt): Likewise.
(addr_side_effect_eval): Change type of size argument to
mem_alias_size.  Use plus_constant.
(offset_overlap_p): Change type of xsize and ysize to
mem_alias_size.  Use alias_may_gt.  Don't assume an overlap
between an access of unknown size and an access that's known
to be earlier than it.
(memrefs_conflict_p): Change type of xsize and ysize to
mem_alias_size.  Remove fallback CONSTANT_P (x) && CONSTANT_P (y)
handling.

OK.  One possible nit below you might want to consider changing.


+/* Represents the size of a memory reference during alias analysis.
+   There are three possibilities:

-/* Set up all info needed to perform alias analysis on memory references.  */
+   (1) the size needs to be treated as completely unknown
+   (2) the size is known exactly and no alignment is applied to the address
+   (3) the size is known exactly but an alignment is applied to the address
+
+   (3) is used for aligned addresses of the form (and X (const_int -N)),
+   which can subtract something in the range [0, N) from the original
+   address X.  We handle this by subtracting N - 1 from X and adding N - 1
+   to the size, so that the range spans all possible bytes.  */
+class mem_alias_size {
+public:
+  /* Return an unknown size (case (1) above).  */
+  static mem_alias_size unknown () { return (HOST_WIDE_INT) 0; }
+
+  /* Return an exact size (case (2) above).  */
+  static mem_alias_size exact (HOST_WIDE_INT size) { return size; }
+
+  /* Return a worst-case size after alignment (case (3) above).
+ SIZE includes the maximum adjustment applied by the alignment.  */
+  static mem_alias_size aligned (HOST_WIDE_INT size) { return -size; }
+
+  /* Return the size of memory reference X.  */
+  static mem_alias_size mem (const_rtx x) { return MEM_SIZE (x); }
+
+  static mem_alias_size mode (machine_mode m);
+
+  /* Return true if the exact size of the memory is known.  */
+  bool exact_p () const { return m_value > 0; }
+  bool exact_p (HOST_WIDE_INT *) const;
+
+  /* Return true if an upper bound on the memory size is known;
+ i.e. not case (1) above.  */
+  bool max_size_known_p () const { return m_value != 0; }
+  bool max_size_known_p (HOST_WIDE_INT *) const;
+
+  /* Return true if the size is subject to alignment.  */
+  bool aligned_p () const { return m_value < 0; }
+
+private:
+  mem_alias_size (HOST_WIDE_INT value) : m_value (value) {}
+
+  HOST_WIDE_INT m_value;
+};

If I were to see a call to the aligned_p method, my first thought is
testing if an object is properly aligned.  This method actually tells us
something different -- was the size adjusted to account for alignment
issues.

In fact, when I was reading the memrefs_conflict_p 

Re: Add a mem_alias_size helper class

2016-11-29 Thread Richard Sandiford
Jeff Law  writes:
> On 11/15/2016 09:04 AM, Richard Sandiford wrote:
>> alias.c encodes memory sizes as follows:
>>
>> size > 0: the exact size is known
>> size == 0: the size isn't known
>> size < 0: the exact size of the reference itself is known,
>>   but the address has been aligned via AND.  In this case
>>   "-size" includes the size of the reference and the worst-case
>>   number of bytes traversed by the AND.
>>
>> This patch wraps this up in a helper class and associated
>> functions.  The new routines fix what seems to be a hole
>> in the old logic: if the size of a reference A was unknown,
>> offset_overlap_p would assume that it could conflict with any
>> other reference B, even if we could prove that B comes before A.
>>
>> The fallback CONSTANT_P (x) && CONSTANT_P (y) case looked incorrect.
>> Either "c" is trustworthy as a distance between the two constants,
>> in which case the alignment handling should work as well there as
>> elsewhere, or "c" isn't trustworthy, in which case offset_overlap_p
>> is unsafe.  I think the latter's true; AFAICT we have no evidence
>> that "c" really is the distance between the two references, so using
>> it in the check doesn't make sense.
>>
>> At this point we've excluded cases for which:
>>
>> (a) the base addresses are the same
>> (b) x and y are SYMBOL_REFs, or SYMBOL_REF-based constants
>> wrapped in a CONST
>> (c) x and y are both constant integers
>>
>> No useful cases should be left.  As things stood, we would
>> assume that:
>>
>>   (mem:SI (const_int X))
>>
>> could overlap:
>>
>>   (mem:SI (symbol_ref Y))
>>
>> but not:
>>
>>   (mem:SI (const (plus (symbol_ref Y) (const_int 4))))
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Thanks,
>> Richard
>>
>>
>> [ This patch is part of the SVE series posted here:
>>   https://gcc.gnu.org/ml/gcc/2016-11/msg00030.html ]
>>
>> gcc/
>> 2016-11-15  Richard Sandiford  
>>  Alan Hayward  
>>  David Sherwood  
>>
>>  * alias.c (mem_alias_size): New class.
>>  (mem_alias_size::mode): New function.
>>  (mem_alias_size::exact_p): Likewise.
>>  (mem_alias_size::max_size_known_p): Likewise.
>>  (align_to): Likewise.
>>  (alias_may_gt): Likewise.
>>  (addr_side_effect_eval): Change type of size argument to
>>  mem_alias_size.  Use plus_constant.
>>  (offset_overlap_p): Change type of xsize and ysize to
>>  mem_alias_size.  Use alias_may_gt.  Don't assume an overlap
>>  between an access of unknown size and an access that's known
>>  to be earlier than it.
>>  (memrefs_conflict_p): Change type of xsize and ysize to
>>  mem_alias_size.  Remove fallback CONSTANT_P (x) && CONSTANT_P (y)
>>  handling.
> OK.  One possible nit below you might want to consider changing.
>
>> +/* Represents the size of a memory reference during alias analysis.
>> +   There are three possibilities:
>>
>> -/* Set up all info needed to perform alias analysis on memory references.  */
>> +   (1) the size needs to be treated as completely unknown
>> +   (2) the size is known exactly and no alignment is applied to the address
>> +   (3) the size is known exactly but an alignment is applied to the address
>> +
>> +   (3) is used for aligned addresses of the form (and X (const_int -N)),
>> +   which can subtract something in the range [0, N) from the original
>> +   address X.  We handle this by subtracting N - 1 from X and adding N - 1
>> +   to the size, so that the range spans all possible bytes.  */
>> +class mem_alias_size {
>> +public:
>> +  /* Return an unknown size (case (1) above).  */
>> +  static mem_alias_size unknown () { return (HOST_WIDE_INT) 0; }
>> +
>> +  /* Return an exact size (case (2) above).  */
>> +  static mem_alias_size exact (HOST_WIDE_INT size) { return size; }
>> +
>> +  /* Return a worst-case size after alignment (case (3) above).
>> + SIZE includes the maximum adjustment applied by the alignment.  */
>> +  static mem_alias_size aligned (HOST_WIDE_INT size) { return -size; }
>> +
>> +  /* Return the size of memory reference X.  */
>> +  static mem_alias_size mem (const_rtx x) { return MEM_SIZE (x); }
>> +
>> +  static mem_alias_size mode (machine_mode m);
>> +
>> +  /* Return true if the exact size of the memory is known.  */
>> +  bool exact_p () const { return m_value > 0; }
>> +  bool exact_p (HOST_WIDE_INT *) const;
>> +
>> +  /* Return true if an upper bound on the memory size is known;
>> + i.e. not case (1) above.  */
>> +  bool max_size_known_p () const { return m_value != 0; }
>> +  bool max_size_known_p (HOST_WIDE_INT *) const;
>> +
>> +  /* Return true if the size is subject to alignment.  */
>> +  bool aligned_p () const { return m_value < 0; }
>> +
>> +private:
>> +  mem_alias_size (HOST_WIDE_INT value) : m_value (value) {}
>> +
>> +  HOST_WIDE_INT m_value;
>> +};
> If I were to see a call to the 

Re: [PATCH] xtensa: Fix PR target/78603

2016-11-29 Thread Max Filippov
On Tue, Nov 29, 2016 at 2:16 PM, augustine.sterl...@gmail.com
 wrote:
> On Tue, Nov 29, 2016 at 2:08 PM, Max Filippov  wrote:
>> 2016-11-29  Max Filippov  
>> gcc/
>> * config/xtensa/xtensa.c (hwloop_optimize): Don't emit zero
>> overhead loop start between a call and its CALL_ARG_LOCATION
>> note.
>
> Approved. Please apply.

Applied to trunk. Thank you!

-- Max


Re: [PATCH] Fix x86_64 fix_debug_reg_uses (PR rtl-optimization/78575)

2016-11-29 Thread Jakub Jelinek
On Tue, Nov 29, 2016 at 03:20:08PM -0700, Jeff Law wrote:
> On 11/29/2016 12:41 PM, Jakub Jelinek wrote:
> >The x86_64 stv pass uses PUT_MODE to change REGs and MEMs in place to affect
> >all setters and users, but that is undesirable in debug insns which are
> >intentionally ignored during the analysis and we should keep using correct
> >modes (TImode) instead of the new one (V1TImode).
> Note that MEMs are not shared, so twiddling the mode on any given MEM
> changes one and only one object.

I thought they shouldn't be shared.  Which means I'll debug tomorrow why
they are shared (the DECL_INCOMING_RTL is shared with a REG_EQUAL note
content).

Jakub


Re: [Patch, Fortran] PR 78573: [7 Regression] [OOP] ICE in resolve_component, at fortran/resolve.c:13405

2016-11-29 Thread Steve Kargl
On Tue, Nov 29, 2016 at 10:58:35PM +0100, Janus Weil wrote:
> 
> here is a rather straightforward patch for an ice-on-invalid
> regression. Regtests cleanly on x86_64-linux-gnu. Ok for trunk?
> 

Yes.

-- 
Steve


Re: [PATCH] Fix x86_64 fix_debug_reg_uses (PR rtl-optimization/78575)

2016-11-29 Thread Jeff Law

On 11/29/2016 12:41 PM, Jakub Jelinek wrote:

Hi!

The x86_64 stv pass uses PUT_MODE to change REGs and MEMs in place to affect
all setters and users, but that is undesirable in debug insns which are
intentionally ignored during the analysis and we should keep using correct
modes (TImode) instead of the new one (V1TImode).
Note that MEMs are not shared, so twiddling the mode on any given MEM 
changes one and only one object.


Jeff



Re: [PATCH] xtensa: Fix PR target/78603

2016-11-29 Thread augustine.sterl...@gmail.com
On Tue, Nov 29, 2016 at 2:08 PM, Max Filippov  wrote:
> 2016-11-29  Max Filippov  
> gcc/
> * config/xtensa/xtensa.c (hwloop_optimize): Don't emit zero
> overhead loop start between a call and its CALL_ARG_LOCATION
> note.

Approved. Please apply.


Re: Add a mem_alias_size helper class

2016-11-29 Thread Jeff Law

On 11/15/2016 09:04 AM, Richard Sandiford wrote:

alias.c encodes memory sizes as follows:

size > 0: the exact size is known
size == 0: the size isn't known
size < 0: the exact size of the reference itself is known,
  but the address has been aligned via AND.  In this case
  "-size" includes the size of the reference and the worst-case
  number of bytes traversed by the AND.
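
As a rough illustration of the encoding above (a sketch only, not code from
the patch; `encoded_size`, `classify` and `max_bytes` are names made up for
this example), the three cases can be told apart by the sign alone:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the signed size value that alias.c passes around.
typedef int64_t encoded_size;

enum size_kind { SIZE_UNKNOWN, SIZE_EXACT, SIZE_ALIGNED };

// Classify an encoded size by its sign, following the list above.
size_kind classify (encoded_size size)
{
  if (size == 0)
    return SIZE_UNKNOWN;
  return size > 0 ? SIZE_EXACT : SIZE_ALIGNED;
}

// Upper bound on the number of bytes that may be touched.  Only meaningful
// when the size is known; in the aligned case "-size" is the worst-case span.
int64_t max_bytes (encoded_size size)
{
  assert (size != 0);
  return size > 0 ? size : -size;
}
```

Under this reading, a 4-byte access whose address is masked with -16 would
carry the value -(4 + 15) = -19.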

This patch wraps this up in a helper class and associated
functions.  The new routines fix what seems to be a hole
in the old logic: if the size of a reference A was unknown,
offset_overlap_p would assume that it could conflict with any
other reference B, even if we could prove that B comes before A.
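
The hole can be pictured with a simplified model (an illustration under
assumptions, not the code from the patch). Let `c` be the byte distance from
the start of reference X to the start of reference Y, and let a size of 0
mean "unknown". `old_overlap_p` and `new_overlap_p` are hypothetical names,
and alignment-adjusted (negative) sizes are left out to keep the sketch small.

```cpp
#include <cassert>
#include <cstdint>

// Old-style check: an unknown size on either side forces a conflict.
bool old_overlap_p (int64_t c, int64_t xsize, int64_t ysize)
{
  assert (xsize >= 0 && ysize >= 0);
  if (xsize == 0 || ysize == 0)
    return true;
  return c >= 0 ? xsize > c : ysize > -c;
}

// Refined check: only the size of the reference that starts first matters,
// because the later reference cannot reach backwards.
bool new_overlap_p (int64_t c, int64_t xsize, int64_t ysize)
{
  assert (xsize >= 0 && ysize >= 0);
  if (c >= 0)
    return xsize == 0 || xsize > c;   // X starts first (or at the same byte)
  return ysize == 0 || ysize > -c;    // Y starts first
}
```

With X of unknown size and a 4-byte Y that starts 8 bytes before it
(`c == -8`), the old check reports a conflict while the refined one does not.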

The fallback CONSTANT_P (x) && CONSTANT_P (y) case looked incorrect.
Either "c" is trustworthy as a distance between the two constants,
in which case the alignment handling should work as well there as
elsewhere, or "c" isn't trustworthy, in which case offset_overlap_p
is unsafe.  I think the latter's true; AFAICT we have no evidence
that "c" really is the distance between the two references, so using
it in the check doesn't make sense.

At this point we've excluded cases for which:

(a) the base addresses are the same
(b) x and y are SYMBOL_REFs, or SYMBOL_REF-based constants
wrapped in a CONST
(c) x and y are both constant integers

No useful cases should be left.  As things stood, we would
assume that:

  (mem:SI (const_int X))

could overlap:

  (mem:SI (symbol_ref Y))

but not:

  (mem:SI (const (plus (symbol_ref Y) (const_int 4))))

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


[ This patch is part of the SVE series posted here:
  https://gcc.gnu.org/ml/gcc/2016-11/msg00030.html ]

gcc/
2016-11-15  Richard Sandiford  
Alan Hayward  
David Sherwood  

* alias.c (mem_alias_size): New class.
(mem_alias_size::mode): New function.
(mem_alias_size::exact_p): Likewise.
(mem_alias_size::max_size_known_p): Likewise.
(align_to): Likewise.
(alias_may_gt): Likewise.
(addr_side_effect_eval): Change type of size argument to
mem_alias_size.  Use plus_constant.
(offset_overlap_p): Change type of xsize and ysize to
mem_alias_size.  Use alias_may_gt.  Don't assume an overlap
between an access of unknown size and an access that's known
to be earlier than it.
(memrefs_conflict_p): Change type of xsize and ysize to
mem_alias_size.  Remove fallback CONSTANT_P (x) && CONSTANT_P (y)
handling.

OK.  One possible nit below you might want to consider changing.


+/* Represents the size of a memory reference during alias analysis.
+   There are three possibilities:

-/* Set up all info needed to perform alias analysis on memory references.  */
+   (1) the size needs to be treated as completely unknown
+   (2) the size is known exactly and no alignment is applied to the address
+   (3) the size is known exactly but an alignment is applied to the address
+
+   (3) is used for aligned addresses of the form (and X (const_int -N)),
+   which can subtract something in the range [0, N) from the original
+   address X.  We handle this by subtracting N - 1 from X and adding N - 1
+   to the size, so that the range spans all possible bytes.  */
+class mem_alias_size {
+public:
+  /* Return an unknown size (case (1) above).  */
+  static mem_alias_size unknown () { return (HOST_WIDE_INT) 0; }
+
+  /* Return an exact size (case (2) above).  */
+  static mem_alias_size exact (HOST_WIDE_INT size) { return size; }
+
+  /* Return a worst-case size after alignment (case (3) above).
+ SIZE includes the maximum adjustment applied by the alignment.  */
+  static mem_alias_size aligned (HOST_WIDE_INT size) { return -size; }
+
+  /* Return the size of memory reference X.  */
+  static mem_alias_size mem (const_rtx x) { return MEM_SIZE (x); }
+
+  static mem_alias_size mode (machine_mode m);
+
+  /* Return true if the exact size of the memory is known.  */
+  bool exact_p () const { return m_value > 0; }
+  bool exact_p (HOST_WIDE_INT *) const;
+
+  /* Return true if an upper bound on the memory size is known;
+ i.e. not case (1) above.  */
+  bool max_size_known_p () const { return m_value != 0; }
+  bool max_size_known_p (HOST_WIDE_INT *) const;
+
+  /* Return true if the size is subject to alignment.  */
+  bool aligned_p () const { return m_value < 0; }
+
+private:
+  mem_alias_size (HOST_WIDE_INT value) : m_value (value) {}
+
+  HOST_WIDE_INT m_value;
+};
If I were to see a call to the aligned_p method, my first thought is 
testing if an object is properly aligned.  This method actually tells us 
something different -- was the size adjusted to account for alignment 
issues.


In fact, when I was reading the memrefs_conflict_p changes that's the 
mistake I nearly called the code out as wrong.  Then I went back 

[PATCH] xtensa: Fix PR target/78603

2016-11-29 Thread Max Filippov
2016-11-29  Max Filippov  
gcc/
* config/xtensa/xtensa.c (hwloop_optimize): Don't emit zero
overhead loop start between a call and its CALL_ARG_LOCATION
note.
---
 gcc/config/xtensa/xtensa.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c
index eb039ba..1236151 100644
--- a/gcc/config/xtensa/xtensa.c
+++ b/gcc/config/xtensa/xtensa.c
@@ -4165,7 +4165,10 @@ hwloop_optimize (hwloop_info loop)
   entry_after = BB_END (entry_bb);
   while (DEBUG_INSN_P (entry_after)
  || (NOTE_P (entry_after)
- && NOTE_KIND (entry_after) != NOTE_INSN_BASIC_BLOCK))
+ && NOTE_KIND (entry_after) != NOTE_INSN_BASIC_BLOCK
+/* Make sure we don't split a call and its corresponding
+   CALL_ARG_LOCATION note.  */
+ && NOTE_KIND (entry_after) != NOTE_INSN_CALL_ARG_LOCATION))
 entry_after = PREV_INSN (entry_after);
 
   emit_insn_after (seq, entry_after);
-- 
2.1.4



Re: [PATCH] correct handling of non-constant width and precision (pr 78521)

2016-11-29 Thread Martin Sebor

On 11/29/2016 01:04 AM, Christophe Lyon wrote:

On 29 November 2016 at 03:59, Martin Sebor  wrote:

On 11/28/2016 06:35 PM, David Edelsohn wrote:


Martin,

I am seeing a number of new failures with the testcases on AIX.

FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)

Excess errors:

/nasfarm/edelsohn/src/src/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1485:3:
warning: specified destination size 2147483647 is too large
[-Wformat-length=]


Also, a number of errors like

FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c  target lp64  (test for
warnings, line 256)
PASS: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c  (test for warnings, line
256)



Thanks.   The DejaGnu directives in the tests likely need adjusting.
Let me look into it tomorrow.

Martin



Probably. I'm seeing errors on arm*:
FAIL:  gcc.dg/tree-ssa/builtin-sprintf-warn-3.c  target lp64  (test
for warnings, line 256)
FAIL:  gcc.dg/tree-ssa/builtin-sprintf-warn-3.c  target lp64  (test
for warnings, line 260)
FAIL:  gcc.dg/tree-ssa/builtin-sprintf-warn-3.c  target lp64  (test
for warnings, line 264)


I committed r242977 to resolve the failures.  My AIX 7.1 ILP32
cross-build and test run is clean but let me know if some persist
elsewhere.

Thanks
Martin


[Patch, Fortran] PR 78573: [7 Regression] [OOP] ICE in resolve_component, at fortran/resolve.c:13405

2016-11-29 Thread Janus Weil
Hi all,

here is a rather straightforward patch for an ice-on-invalid
regression. Regtests cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus



2016-11-29  Janus Weil  

PR fortran/78573
* decl.c (build_struct): On error, return directly and do not build
class symbol.

2016-11-29  Janus Weil  

PR fortran/78573
* gfortran.dg/class_61.f90: New test case.
Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c  (revision 242960)
+++ gcc/fortran/decl.c  (working copy)
@@ -1850,7 +1850,6 @@ build_struct (const char *name, gfc_charlen *cl, g
 {
   gfc_state_data *s;
   gfc_component *c;
-  bool t = true;
 
   /* F03:C438/C439. If the current symbol is of the same derived type that we're
  constructing, it must have the pointer attribute.  */
@@ -1952,7 +1951,7 @@ build_struct (const char *name, gfc_charlen *cl, g
{
  gfc_error ("Pointer array component of structure at %C must have a "
 "deferred shape");
- t = false;
+ return false;
}
 }
   else if (c->attr.allocatable)
@@ -1961,7 +1960,7 @@ build_struct (const char *name, gfc_charlen *cl, g
{
  gfc_error ("Allocatable component of structure at %C must have a "
 "deferred shape");
- t = false;
+ return false;
}
 }
   else
@@ -1970,20 +1969,15 @@ build_struct (const char *name, gfc_charlen *cl, g
{
  gfc_error ("Array component of structure at %C must have an "
 "explicit shape");
- t = false;
+ return false;
}
 }
 
 scalar:
   if (c->ts.type == BT_CLASS)
-{
-  bool t2 = gfc_build_class_symbol (&c->ts, &c->attr, &c->as);
+return gfc_build_class_symbol (&c->ts, &c->attr, &c->as);
 
-  if (t)
-   t = t2;
-}
-
-  return t;
+  return true;
 }
 
 
! { dg-do compile }
!
! PR 78573: [7 Regression] [OOP] ICE in resolve_component, at fortran/resolve.c:13405
!
! Contributed by Gerhard Steinmetz 

program p
  type t1
class(t2), pointer :: q(2)  ! { dg-error "must have a deferred shape" }
  end type
end


Re: [PATCH] libiberty: avoid reading past end of buffer in strndup/xstrndup (PR c/78498)

2016-11-29 Thread Ian Lance Taylor
On Tue, Nov 29, 2016 at 2:08 PM, David Malcolm  wrote:
>
> gcc/ChangeLog:
> PR c/78498
> * selftest.c (selftest::assert_strndup_eq): New function.
> (selftest::test_strndup): New function.
> (selftest::test_libiberty): New function.
> (selftest::selftest_c_tests): Call test_libiberty.
>
> gcc/testsuite/ChangeLog:
> PR c/78498
> * gcc.dg/format/pr78494.c: New test case.
>
> libiberty/ChangeLog:
> PR c/78498
> * strndup.c (strlen): Delete decl.
> (strnlen): Add decl.
> (strndup): Call strnlen rather than strlen.
> * xstrndup.c: Likewise.

The libiberty changes are fine.

Thanks.

Ian


Re: [PATCH], PR 78594, Fix storing QImode/HImode on ISA 3.0/power9

2016-11-29 Thread Segher Boessenkool
On Tue, Nov 29, 2016 at 02:14:17PM -0500, Michael Meissner wrote:
> I was developing the next round of ISA 3.0 code changes to use the vector
> extract byte, half word, and word instructions (VEXTU{B,H,W}{R,L}X) that
> deposit the value into a general purpose register instead of a vector 
> register,
> and I was running the changes through the simulator.  I discovered that my
> previous change to allow QImode/HImode did not work if the value was in a
> traditional Altivec register.
> 
> This fixes the problem that I noticed.  I didn't bother doing the full
> bootstrap and check, since it only affects the power9 target.  Can I check 
> this
> in?

Certainly, please do.  Thanks,


Segher


> 2016-11-29  Michael Meissner  
> 
>   PR target/78594
>   * config/rs6000/rs6000.md (mov_internal, QHI iterator): Add
>   'x' to stxsix print pattern, so that QImode and HImode values
>   residing in traditional altivec registers can be stored
>   correctly.


Re: [PR middle-end/78566] Fix uninit regressions caused by previous -Wmaybe-uninit change

2016-11-29 Thread Christophe Lyon
On 29 November 2016 at 17:49, Christophe Lyon
 wrote:
> On 29 November 2016 at 17:33, Aldy Hernandez  wrote:
>> This fixes the gcc.dg/uninit-pred-6* failures I seem to have caused on some
>> non x86 platforms. Sorry for the delay.
>>
>> The problem is that my fix for PR61409 had the logic backwards.  I was
>> proving that all the uses of a PHI are invalidated by any one undefined PHI
>> path, whereas what we want is to prove that EVERY uninitialized path is
>> invalidated by some factor in the PHI use.
>>
>> The attached patch fixes this without causing any regressions on x86-64
>> Linux.  I also verified that at least on [arm-none-linux-gnueabihf
>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16], there are no
>> gcc.dg/*uninit* regressions.
>>
>> There is still one regression at large involving a double free in PR78548
>> which I will look at next/independently.
>>
> Thanks for working on this.
> I've submitted a validation with your patch, I'll let you know if I find any
> regressions.
>
> Christophe
>

The results are OK:
gcc.dg/uninit-pred-6_[abc].c now pass on cortex-a5/cortex-m3.

Thanks

Christophe

>> OK for trunk?
>> Aldy


Re: [PATCH] add common CPP_SPECS for bfin

2016-11-29 Thread Jeff Law

On 11/28/2016 10:51 PM, Waldemar Brodkorb wrote:

Hi,

add common defines _REENTRANT and _POSIX_SOURCE for bfin.
Patch is used in Buildroot for a while to fix issues compiling
some software.
See here, why this should be always enabled:
https://lists.gnu.org/archive/html/autoconf-archive-maintainers/2016-06/msg1.html

2016-11-29  Waldemar Brodkorb 

   gcc/
   * gcc/config/bfin/linux.h: add common CPP_SPEC.

Thanks.  I've adjusted the ChangeLog entry and committed this patch for you.

Jeff



Re: [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2016-11-29 Thread Bernd Edlinger
On 11/29/16 16:06, Wilco Dijkstra wrote:
> Bernd Edlinger wrote:
>
> -  "TARGET_32BIT && reload_completed
> +  "TARGET_32BIT && ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)
> && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"
>
> This is equivalent to "&& (!TARGET_IWMMXT || reload_completed)" since we're
> already excluding NEON.
>

Aehm, no.  This would split the addi_neon insn before it is clear
if the reload pass will assign a VFP register.

With this change the stack usage with -mfpu=neon increases
from 2300 to around 2600 bytes.
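
To see that the two conditions are not interchangeable, they can be compared
as plain boolean functions.  This is only a sketch: the four flags below are
hypothetical stand-ins for TARGET_NEON, TARGET_IWMMXT, reload_completed and
the IS_VFP_REGNUM test on operand 0, and TARGET_32BIT is assumed to be true.

```cpp
#include <cassert>

// Split condition as written in the patch.
bool patch_cond (bool neon, bool iwmmxt, bool reload_done, bool vfp_reg)
{
  return ((!neon && !iwmmxt) || reload_done) && !(neon && vfp_reg);
}

// Proposed simplification that drops the !TARGET_NEON test.
bool simplified_cond (bool neon, bool iwmmxt, bool reload_done, bool vfp_reg)
{
  return (!iwmmxt || reload_done) && !(neon && vfp_reg);
}

// Count the flag combinations on which the two conditions disagree.
int count_disagreements ()
{
  int n = 0;
  for (int bits = 0; bits < 16; bits++)
    {
      bool neon = bits & 1, iwmmxt = bits & 2;
      bool reload_done = bits & 4, vfp_reg = bits & 8;
      if (patch_cond (neon, iwmmxt, reload_done, vfp_reg)
          != simplified_cond (neon, iwmmxt, reload_done, vfp_reg))
        n++;
    }
  assert (n <= 16);  // sanity: only 16 combinations exist
  return n;
}
```

The one combination where they differ is NEON enabled, iWMMXt disabled,
reload not completed, and operand 0 not (yet) a VFP register: the simplified
condition allows the split there, the patch's condition does not.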

> This patch expands ADD and SUB earlier, so shouldn't we do the same obvious
> change for the similar instructions CMP and NEG?
>

Good question.  I think the cmp and neg pattern are more complicated
and do typically have a more complicated data flow than the other
patterns.

I tried to create a test case which expands cmpdi and negdi patterns
as follows:

--- pr77308-1.c 2016-11-25 17:53:20.379141465 +0100
+++ pr77308-2.c 2016-11-29 20:46:51.266948631 +0100
@@ -68,10 +68,10 @@
  #define B(x,j)(((SHA_LONG64)(*(((const unsigned char *)(&x))+j)))<<((7-j)*8))
  #define PULL64(x) (B(x,0)|B(x,1)|B(x,2)|B(x,3)|B(x,4)|B(x,5)|B(x,6)|B(x,7))
  #define ROTR(x,s)   (((x)>>s) | (x)<<(64-s))
-#define Sigma0(x)   ~(ROTR((x),28) ^ ROTR((x),34) ^ ROTR((x),39))
-#define Sigma1(x)   ~(ROTR((x),14) ^ ROTR((x),18) ^ ROTR((x),41))
-#define sigma0(x)   ~(ROTR((x),1)  ^ ROTR((x),8)  ^ ((x)>>7))
-#define sigma1(x)   ~(ROTR((x),19) ^ ROTR((x),61) ^ ((x)>>6))
+#define Sigma0(x)   (ROTR((x),28) ^ ROTR((x),34) ^ ROTR((x),39) == (x) ? -(x) : (x))
+#define Sigma1(x)   (ROTR((x),14) ^ ROTR(-(x),18) ^ ROTR((x),41) < (x) ? -(x) : (x))
+#define sigma0(x)   (ROTR((x),1)  ^ ROTR((x),8)  ^ ((x)>>7) <= (x) ? ~(x) : (x))
+#define sigma1(x)   ((long long)(ROTR((x),19) ^ ROTR((x),61) ^ ((x)>>6)) < (long long)(x) ? -(x) : (x))
  #define Ch(x,y,z)   (((x) & (y)) ^ ((~(x)) & (z)))
  #define Maj(x,y,z)  (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))


This expands *arm_negdi2, *arm_cmpdi_unsigned, *arm_cmpdi_insn.
The stack usage is around 1900 bytes with previous patch,
and 2300 bytes without.

I tried to split *arm_negdi2 and *arm_cmpdi_unsigned early, and it
gives indeed smaller stack sizes in the test case above (~400 bytes).
But when I make *arm_cmpdi_insn split early, it ICEs:

--- arm.md.orig 2016-11-27 09:22:41.794790123 +0100
+++ arm.md  2016-11-29 21:51:51.438163078 +0100
@@ -7432,7 +7432,7 @@
 (clobber (match_scratch:SI 2 "=r"))]
"TARGET_32BIT"
"#"   ; "cmp\\t%Q0, %Q1\;sbcs\\t%2, %R0, %R1"
-  "&& reload_completed"
+  "&& ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)"
[(set (reg:CC CC_REGNUM)
  (compare:CC (match_dup 0) (match_dup 1)))
 (parallel [(set (reg:CC CC_REGNUM)

on top of the latest patch, I got:

gcc -S -Os pr77308-2.c -fdump-rtl-all-verbose
pr77308-2.c: In function 'sha512_block_data_order':
pr77308-2.c:169:1: error: unrecognizable insn:
  }
  ^
(insn 4870 4869 1636 87 (set (scratch:SI)
 (minus:SI (minus:SI (subreg:SI (reg:DI 2261) 4)
 (subreg:SI (reg:DI 473 [ X$14 ]) 4))
 (ltu:SI (reg:CC_C 100 cc)
 (const_int 0 [0])))) "pr77308-2.c":140 -1
  (nil))
pr77308-2.c:169:1: internal compiler error: in extract_insn, at recog.c:2311
0xaf4cd8 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
../../gcc-trunk/gcc/rtl-error.c:108
0xaf4d09 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
../../gcc-trunk/gcc/rtl-error.c:116
0xac74ef extract_insn(rtx_insn*)
../../gcc-trunk/gcc/recog.c:2311
0x122427a decompose_multiword_subregs
../../gcc-trunk/gcc/lower-subreg.c:1467
0x122550d execute
../../gcc-trunk/gcc/lower-subreg.c:1734


So it is certainly possible, but not really simple to improve the
stack size even further.  But I would prefer to do that in a
separate patch.

BTW: there are also negd2_compare, *negdi_extendsidi,
*negdi_zero_extendsidi, *thumb2_negdi2.

I think it would be a precondition to have test cases that exercise
each of these patterns before we try to split these instructions.


Bernd.


[PATCH] libiberty: avoid reading past end of buffer in strndup/xstrndup (PR c/78498)

2016-11-29 Thread David Malcolm
libiberty's implementations of strndup and xstrndup call strlen on
the input string, and hence can read past the end of the input buffer
if it isn't zero-terminated (such as is the case in PR c/78498, where
the input string is from the input.c line cache).

This patch converts them to use strnlen instead (as glibc's
implementation of them does), avoiding reading more than n bytes
from the input buffer.  strnlen is provided by libiberty.
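
The idea can be sketched as follows.  This is a minimal illustration and
not the libiberty source: bounded_strlen and sketch_strndup are
hypothetical names, and the length scan is written with memchr so that the
sketch does not depend on the host providing strnlen.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: length of S, looking at no more than N bytes.
   Behaves like strnlen, written with memchr for portability.  */
static size_t
bounded_strlen (const char *s, size_t n)
{
  const char *end = (const char *) memchr (s, '\0', n);
  return end ? (size_t) (end - s) : n;
}

/* Hypothetical sketch of a strndup that reads at most N bytes of S,
   so S does not have to be NUL-terminated.  */
static char *
sketch_strndup (const char *s, size_t n)
{
  assert (s != NULL);
  size_t len = bounded_strlen (s, n);
  char *result = (char *) malloc (len + 1);
  if (!result)
    return NULL;
  memcpy (result, s, len);
  result[len] = '\0';
  return result;
}
```

The only difference from a strlen-based version is where the scan stops:
the length is capped while scanning rather than after the fact.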

Successfully bootstrapped on x86_64-pc-linux-gnu;
adds 6 PASS results to gcc.sum.

The patch also adds some selftests for this case, which showed
the problem and the fix nicely via "make selftest-valgrind".
Unfortunately I had to put these selftests within the gcc
subdirectory, rather than libiberty, since selftest.h is C++ and
is itself in the gcc subdirectory.  If that's unacceptable, I can
just drop the selftest.c part of the patch (or we somehow support
selftests from within libiberty itself, though I'm not sure how to
do that, if libiberty is meant as a cross-platform compat library,
rather than as a base support layer; the simplest thing to do seemed
to be to put them in the "gcc" subdir).

gcc/ChangeLog:
PR c/78498
* selftest.c (selftest::assert_strndup_eq): New function.
(selftest::test_strndup): New function.
(selftest::test_libiberty): New function.
(selftest::selftest_c_tests): Call test_libiberty.

gcc/testsuite/ChangeLog:
PR c/78498
* gcc.dg/format/pr78494.c: New test case.

libiberty/ChangeLog:
PR c/78498
* strndup.c (strlen): Delete decl.
(strnlen): Add decl.
(strndup): Call strnlen rather than strlen.
* xstrndup.c: Likewise.
---
 gcc/selftest.c| 48 +++
 gcc/testsuite/gcc.dg/format/pr78494.c | 12 +
 libiberty/strndup.c   |  7 ++---
 libiberty/xstrndup.c  |  5 +---
 4 files changed, 63 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/format/pr78494.c

diff --git a/gcc/selftest.c b/gcc/selftest.c
index 2a729be..6df73c2 100644
--- a/gcc/selftest.c
+++ b/gcc/selftest.c
@@ -198,6 +198,53 @@ read_file (const location , const char *path)
   return result;
 }
 
+/* Selftests for libiberty.  */
+
+/* Verify that both strndup and xstrndup generate EXPECTED
+   when called on SRC and N.  */
+
+static void
+assert_strndup_eq (const char *expected, const char *src, size_t n)
+{
+  char *buf = strndup (src, n);
+  if (buf)
+ASSERT_STREQ (expected, buf);
+  free (buf);
+
+  buf = xstrndup (src, n);
+  ASSERT_STREQ (expected, buf);
+  free (buf);
+}
+
+/* Verify that strndup and xstrndup work as expected.  */
+
+static void
+test_strndup ()
+{
+  assert_strndup_eq ("", "test", 0);
+  assert_strndup_eq ("t", "test", 1);
+  assert_strndup_eq ("te", "test", 2);
+  assert_strndup_eq ("tes", "test", 3);
+  assert_strndup_eq ("test", "test", 4);
+  assert_strndup_eq ("test", "test", 5);
+
+  /* Test on a string without zero termination.  */
+  const char src[4] = {'t', 'e', 's', 't'};
+  assert_strndup_eq ("", src, 0);
+  assert_strndup_eq ("t", src, 1);
+  assert_strndup_eq ("te", src, 2);
+  assert_strndup_eq ("tes", src, 3);
+  assert_strndup_eq ("test", src, 4);
+}
+
+/* Run selftests for libiberty.  */
+
+static void
+test_libiberty ()
+{
+  test_strndup ();
+}
+
 /* Selftests for the selftest system itself.  */
 
 /* Sanity-check the ASSERT_ macros with various passing cases.  */
@@ -245,6 +292,7 @@ test_read_file ()
 void
 selftest_c_tests ()
 {
+  test_libiberty ();
   test_assertions ();
   test_named_temp_file ();
   test_read_file ();
diff --git a/gcc/testsuite/gcc.dg/format/pr78494.c b/gcc/testsuite/gcc.dg/format/pr78494.c
new file mode 100644
index 000..4b53a68
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/format/pr78494.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wall -Wextra -fdiagnostics-show-caret" } */
+
+void f (void)
+{
+  __builtin_printf ("%i", ""); /* { dg-warning "expects argument of type" } */
+/* { dg-begin-multiline-output "" }
+   __builtin_printf ("%i", "");
+  ~^   ~~
+  %s
+   { dg-end-multiline-output "" } */
+}
diff --git a/libiberty/strndup.c b/libiberty/strndup.c
index 9e9b4e2..4556b96 100644
--- a/libiberty/strndup.c
+++ b/libiberty/strndup.c
@@ -33,7 +33,7 @@ memory was available.  The result is always NUL terminated.
 #include "ansidecl.h"
 #include 
 
-extern size_t  strlen (const char*);
+extern size_t  strnlen (const char *s, size_t maxlen);
 extern PTR malloc (size_t);
 extern PTR memcpy (PTR, const PTR, size_t);
 
@@ -41,10 +41,7 @@ char *
 strndup (const char *s, size_t n)
 {
   char *result;
-  size_t len = strlen (s);
-
-  if (n < len)
-len = n;
+  size_t len = strnlen (s, n);
 
   result = (char *) malloc (len + 1);
   if (!result)
diff --git a/libiberty/xstrndup.c b/libiberty/xstrndup.c
index 0a41f60..c3d2d83 100644
--- a/libiberty/xstrndup.c

Re: [PATCH] Introduce -fdump-ipa-clones dump output

2016-11-29 Thread Jeff Law

On 11/11/2016 07:30 AM, Martin Liška wrote:

Hello.

The motivation for the patch is to dump the IPA clones that were created
by all inter-procedural optimizations.  Such output can be used to track
the set of functions in which code from another function can eventually occur.
An example use of the dump file can be seen here: [1].

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

[1] https://github.com/marxin/kgraft-analysis-tool


0001-Introduce-fdump-ipa-clones-dump-output.patch


From 700b9833771a5b646d3db44014af81c007dd48f4 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 9 Nov 2016 14:23:30 +0100
Subject: [PATCH] Introduce -fdump-ipa-clones dump output

gcc/ChangeLog:

2016-11-11  Martin Liska  

* cgraph.c (symbol_table::initialize): Initialize
ipa_clones_dump_file.
(cgraph_node::remove): Report to ipa_clones_dump_file.
* cgraph.h: Add new argument (suffix) to cloning methods.
* cgraphclones.c (dump_callgraph_transformation): New function.
(cgraph_node::create_clone): New argument.
(cgraph_node::create_virtual_clone): Likewise.
(cgraph_node::create_version_clone): Likewise.
* dumpfile.c: Add .ipa-clones dump file.
* dumpfile.h (enum tree_dump_index): Add TDI_clones
* ipa-inline-transform.c (clone_inlined_nodes): Report operation
to dump_callgraph_transformation.
---
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index cc730d2..2d59291 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -906,13 +906,14 @@ public:
  If the new node is being inlined into another one, NEW_INLINED_TO should be
  the outline function the new one is (even indirectly) inlined to.
  All hooks will see this in node's global.inlined_to, when invoked.
- Can be NULL if the node is not inlined.  */
+ Can be NULL if the node is not inlined.  SUFFIX is string that is appended
+ to the original name.  */
   cgraph_node *create_clone (tree decl, gcov_type count, int freq,
 bool update_original,
 vec redirect_callers,
 bool call_duplication_hook,
 cgraph_node *new_inlined_to,
-bitmap args_to_skip);
+bitmap args_to_skip, const char *sufix = NULL);

s/sufix/suffix/
?

OK with that nit fixed.

Sorry for the delays getting to this.

jeff



Re: Import libcilkrts Build 4467 (PR target/68945)

2016-11-29 Thread Jeff Law

On 11/17/2016 06:06 AM, Rainer Orth wrote:

I happened to notice that my libcilkrts SPARC port has been applied
upstream.  So to reach closure on this issue for the GCC 7 release, I'd
like to import upstream into mainline which seems to be covered by the
free-for-all clause in https://gcc.gnu.org/svnwrite.html#policies, even
though https://gcc.gnu.org/codingconventions.html#upstream lists nothing
specific and we have no listed maintainer.

A few issues are worth mentioning:

* Upstream still has a typo in the git URL in many files.  I've
  corrected that during the import to avoid a massive diff:

-#  https://bitbucket.org/intelcilkruntime/itnel-cilk-runtime.git are
+#  https://bitbucket.org/intelcilkruntime/intel-cilk-runtime.git are

* libcilkrts.spec.in is missing upstream.  I've no idea if this is
  intentional.

* A few of my changes have been lost and I can't tell if this is by
  accident:

** Lost whitespace:

--- libcilkrts-old/Makefile.am  2016-05-04 16:44:24.0 +
+++ libcilkrts-new/Makefile.am  2016-11-17 11:35:33.782987017 +
@@ -54,7 +54,7 @@ GENERAL_FLAGS = -I$(top_srcdir)/include
 # Enable Intel Cilk Plus extension
 GENERAL_FLAGS += -fcilkplus

-# Always generate unwind tables
+#Always generate unwind tables
 GENERAL_FLAGS += -funwind-tables

 AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99

** Lost alphabetical order of targets:

diff -rup libcilkrts-old/configure.ac libcilkrts-new/configure.ac
--- libcilkrts-old/configure.ac 2016-11-16 18:34:28.0 +
+++ libcilkrts-new/configure.ac 2016-11-17 11:35:33.800015570 +
@@ -143,14 +145,14 @@ esac
 # contains information on what's needed
 case "${target}" in

-  arm-*-*)
-config_dir="arm"
-;;
-
   i?86-*-* | x86_64-*-*)
 config_dir="x86"
 ;;

+  arm-*-*)
+config_dir="arm"
+;;
+
   sparc*-*-*)
 config_dir="sparc"
 ;;
diff -rup libcilkrts-old/configure.tgt libcilkrts-new/configure.tgt
--- libcilkrts-old/configure.tgt2016-11-16 18:34:28.0 +
+++ libcilkrts-new/configure.tgt2016-11-17 11:35:33.807873451 +
@@ -44,10 +44,10 @@

 # Disable Cilk Runtime library for unsupported architectures.
 case "${target}" in
-  arm-*-*)
-;;
   i?86-*-* | x86_64-*-*)
 ;;
+  arm-*-*)
+;;
   sparc*-*-*)
 ;;
   *-*-*)

  I've done nothing about those, just wanted to point them out.

The following patch has passed x86_64-pc-linux-gnu bootstrap without
regressions; i386-pc-solaris2.12 and sparc-sun-solaris2.12 bootstraps
are currently running.

Ok for mainline if they pass?

Yes.  Sorry for not getting back to you sooner.

jeff


Re: [PATCH v2] Fix PR78588 - rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 64-bit type

2016-11-29 Thread Segher Boessenkool
On Tue, Nov 29, 2016 at 05:00:05PM +0100, Markus Trippelsdorf wrote:
> Building gcc with -fsanitize=undefined shows:
>  rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 
> 64-bit type 'long unsigned int'
> 
> This happens because if_then_else_cond() in combine.c calls
> num_sign_bit_copies() in rtlanal.c with mode==BLKmode.
> 
> 5205   bitwidth = GET_MODE_PRECISION (mode);
> 5206   if (bitwidth > HOST_BITS_PER_WIDE_INT)
> 5207 return 1;
> 5208
> 5209   nonzero = nonzero_bits (x, mode);
> 5210   return nonzero & (HOST_WIDE_INT_1U << (bitwidth - 1))
> 5211  ? 1 : bitwidth - floor_log2 (nonzero) - 1;
> 
> This causes (bitwidth - 1) to wrap around.

Could you also add a gcc_assert here?
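
The wrap-around quoted above can be reproduced in isolation.  The sketch
below is not the rtlanal.c code: top_bit_mask and wrapped_exponent are
hypothetical names, and it assumes the mode in question reports a
precision of zero, which is what the reported exponent (4294967295, i.e.
0 - 1 in 32-bit unsigned arithmetic) suggests.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The shift exponent as computed in unsigned arithmetic.  For a
   BITWIDTH of zero this wraps around to UINT_MAX.  */
static unsigned int
wrapped_exponent (unsigned int bitwidth)
{
  return bitwidth - 1u;
}

/* Hypothetical stand-in for the mask built at rtlanal.c:5210.
   Returns 0 when BITWIDTH cannot be used as a basis for the shift.  */
static uint64_t
top_bit_mask (unsigned int bitwidth)
{
  /* Reject both ends: zero would wrap, more than 64 would overshift.  */
  if (bitwidth == 0 || bitwidth > 64)
    return 0;

  uint64_t mask = (uint64_t) 1 << (bitwidth - 1);
  assert (mask != 0);
  return mask;
}
```

Whether the zero case is rejected by the caller, by a guard like this one,
or by an assertion is a design choice; the sketch only shows that the
existing upper-bound check alone does not cover it.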

>   PR rtl-optimization/78588 
>   * combine.c (if_then_else_cond): Also guard against BLKmode.

Approved, please apply.  Thanks,


Segher


Re: [PATCH] Delete GCJ

2016-11-29 Thread Jeff Law

On 11/21/2016 04:23 PM, Matthias Klose wrote:

On 21.11.2016 18:16, Rainer Orth wrote:

Hi Matthias,


ahh, didn't see that :-/ Now fixed, is this clearer now?

The options @option{--with-target-bdw-gc-include} and
@option{--with-target-bdw-gc-lib} must always specified together for

   ^ be


Thanks to all for sorting out the documentation issues.  Now attaching the
updated diff.  Ok to commit?

Matthias



2016-11-19  Matthias Klose  

* Makefile.def: Remove reference to boehm-gc target module.
* configure.ac: Include pkg.m4, check for --with-target-bdw-gc
options and for the bdw-gc pkg-config module.
* configure: Regenerate.
* Makefile.in: Regenerate.

gcc/

2016-11-19  Matthias Klose  

* doc/install.texi: Document configure options --enable-objc-gc
and --with-target-bdw-gc.

config/

2016-11-19  Matthias Klose  

* pkg.m4: New file.

libobjc/

2016-11-19  Matthias Klose  

* configure.ac (--enable-objc-gc): Allow to configure with a
system provided boehm-gc.
* configure: Regenerate.
* Makefile.in (OBJC_BOEHM_GC_LIBS): Get value from configure.
* gc.c: Include system bdw-gc headers.
* memory.c: Likewise
* objects.c: Likewise

boehm-gc/

2016-11-19  Matthias Klose  

Remove

OK.

Jeff


Re: [PATCH] combine: Tweak change_zero_ext

2016-11-29 Thread Christophe Lyon
On 29 November 2016 at 20:38, Uros Bizjak  wrote:
>> 2016-11-26  Segher Boessenkool  
>>
>> * combine.c (change_zero_ext): Also handle extends from a subreg
>> to a mode bigger than that of the operand of the subreg.
>
> This patch introduced:
>
> FAIL: gcc.target/i386/pr44578.c (internal compiler error)
>
> on i686 (or x86_64 32bit multi-lib).
>
> ./cc1 -O2 -mtune=athlon64 -m32 -quiet pr44578.c
> pr44578.c: In function ‘test’:
> pr44578.c:18:1: internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:908
>  }
>  ^
> 0x81493b gen_rtx_SUBREG(machine_mode, rtx_def*, int)
> /home/uros/gcc-svn/trunk/gcc/emit-rtl.c:908
> 0x122609f change_zero_ext
> /home/uros/gcc-svn/trunk/gcc/combine.c:11260
> 0x1226207 recog_for_combine
> /home/uros/gcc-svn/trunk/gcc/combine.c:11346
> 0x1236db3 try_combine
> /home/uros/gcc-svn/trunk/gcc/combine.c:3501
> 0x123a3e0 combine_instructions
> /home/uros/gcc-svn/trunk/gcc/combine.c:1265
> 0x123a3e0 rest_of_handle_combine
> /home/uros/gcc-svn/trunk/gcc/combine.c:14581
> 0x123a3e0 execute
> /home/uros/gcc-svn/trunk/gcc/combine.c:14626
>
> Uros.

Hi,

I'm seeing a similar error on aarch64:
FAIL: gcc.target/aarch64/advsimd-intrinsics/vduph_lane.c   -O1
(internal compiler error)
with the same backtrace.

Christophe


Re: [PATCH 7/9] Add RTL-error-handling to host

2016-11-29 Thread Bernd Schmidt

On 11/29/2016 07:53 PM, David Malcolm wrote:


Would you prefer that I went with approach (B), or is approach (A)
acceptable?


Well, I was hoping there'd be an approach (C) where the read-rtl code 
uses whatever diagnostics framework that is available. Maybe it'll turn 
out that's too hard. Somehow the current patch looked strange to me, but 
if there's no easy alternative maybe we'll have to go with it.



Bernd


Go patch committed: Merge to gccgo branch

2016-11-29 Thread Ian Lance Taylor
I merged GCC trunk revision 242967 to the gccgo branch.

Ian


[PATCH] Fix x86_64 fix_debug_reg_uses (PR rtl-optimization/78575)

2016-11-29 Thread Jakub Jelinek
Hi!

The x86_64 stv pass uses PUT_MODE to change REGs and MEMs in place to affect
all setters and users, but that is undesirable in debug insns which are
intentionally ignored during the analysis and we should keep using correct
modes (TImode) instead of the new one (V1TImode).

The current fix_debug_reg_uses implementation just assumes such a pseudo
can appear only directly in the VAR_LOCATION's second operand, but it can of
course appear anywhere in the expression, the whole expression doesn't have
to be TImode either (e.g. on the testcase it is a QImode comparison of
originally TImode pseudo with CONST_INT, which stv incorrectly changes into
comparison of V1TImode with CONST_INT).

The following patch fixes that, and also fixes an issue where, if the pseudo
appears multiple times in the debug info, the rescan could break the traversal
of further uses.
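
The traversal concern can be illustrated on a plain linked list.  This is
only a sketch under stated assumptions, not the DF infrastructure: struct
use_ref and fix_uses are hypothetical, and the sketch assumes that all uses
belonging to one insn sit next to each other in the chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a register use: which insn it belongs to
   and whether its location still needs to be rewritten.  */
struct use_ref
{
  int insn_id;
  int needs_fix;
  struct use_ref *next;
};

/* Walk the chain, handling all consecutive uses of one insn as a group.
   NEXT is computed before the group is touched, so an operation that
   invalidates the group's own nodes cannot derail the outer loop.
   Returns how many insns would have been rescanned.  */
static int
fix_uses (struct use_ref *head)
{
  int rescans = 0;
  struct use_ref *ref, *next;

  for (ref = head; ref; ref = next)
    {
      int insn = ref->insn_id;
      int changed = 0;

      /* Find the first use that belongs to a different insn.  */
      next = ref->next;
      while (next && next->insn_id == insn)
        next = next->next;

      /* Rewrite every use of this insn that still needs it.  */
      for (; ref != next; ref = ref->next)
        if (ref->needs_fix)
          {
            ref->needs_fix = 0;
            changed = 1;
          }
      assert (ref == next);

      if (changed)
        rescans++;	/* Stands in for one rescan per insn.  */
    }
  return rescans;
}
```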

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-29  Jakub Jelinek  

PR rtl-optimization/78575
* config/i386/i386.c (timode_scalar_chain::fix_debug_reg_uses): Use
DF infrastructure to wrap all V1TImode reg uses into TImode subreg
if not already wrapped in a subreg.  Make sure df_insn_rescan does not
affect further iterations.

* gcc.dg/pr78575.c: New test.

--- gcc/config/i386/i386.c.jj   2016-11-28 10:59:08.0 +0100
+++ gcc/config/i386/i386.c  2016-11-29 08:31:58.061278522 +0100
@@ -3831,30 +3831,32 @@ timode_scalar_chain::fix_debug_reg_uses
   if (!flag_var_tracking)
 return;
 
-  df_ref ref;
-  for (ref = DF_REG_USE_CHAIN (REGNO (reg));
-   ref;
-   ref = DF_REF_NEXT_REG (ref))
+  df_ref ref, next;
+  for (ref = DF_REG_USE_CHAIN (REGNO (reg)); ref; ref = next)
 {
   rtx_insn *insn = DF_REF_INSN (ref);
+  /* Make sure the next ref is for a different instruction,
+ so that we're not affected by the rescan.  */
+  next = DF_REF_NEXT_REG (ref);
+  while (next && DF_REF_INSN (next) == insn)
+   next = DF_REF_NEXT_REG (next);
+
   if (DEBUG_INSN_P (insn))
{
  /* It may be a debug insn with a TImode variable in
 register.  */
- rtx val = PATTERN (insn);
- if (GET_MODE (val) != TImode)
-   continue;
- gcc_assert (GET_CODE (val) == VAR_LOCATION);
- rtx loc = PAT_VAR_LOCATION_LOC (val);
- /* It may have been converted to TImode already.  */
- if (GET_MODE (loc) == TImode)
-   continue;
- gcc_assert (REG_P (loc)
- && GET_MODE (loc) == V1TImode);
- /* Convert V1TImode register, which has been updated by a SET
-insn before, to SUBREG TImode.  */
- PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0);
- df_insn_rescan (insn);
+ bool changed = false;
+ for (; ref != next; ref = DF_REF_NEXT_REG (ref))
+   {
+ rtx *loc = DF_REF_LOC (ref);
+ if (REG_P (*loc) && GET_MODE (*loc) == V1TImode)
+   {
+ *loc = gen_rtx_SUBREG (TImode, *loc, 0);
+ changed = true;
+   }
+   }
+ if (changed)
+   df_insn_rescan (insn);
}
 }
 }
--- gcc/testsuite/gcc.dg/pr78575.c.jj   2016-11-29 08:36:25.821932436 +0100
+++ gcc/testsuite/gcc.dg/pr78575.c  2016-11-29 08:35:35.0 +0100
@@ -0,0 +1,16 @@
+/* PR rtl-optimization/78575 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -g -Wno-psabi" } */
+
+typedef unsigned __int128 V __attribute__((vector_size(64)));
+
+V g;
+
+void
+foo (V v)
+{
+  unsigned __int128 x = 1;
+  int c = v[1] <= ~x;
+  v &= v[1];
+  g = v;
+}

Jakub


Re: [v3 PATCH] Implement LWG 2534, Constrain rvalue stream operators.

2016-11-29 Thread Jonathan Wakely

On 27/11/16 20:50 +0200, Ville Voutilainen wrote:

   Implement LWG 2534, Constrain rvalue stream operators.
   * include/std/istream (__is_convertible_to_basic_istream): New.
   (__is_extractable): Likewise.
   (operator>>(basic_istream<_CharT, _Traits>&&, _Tp&&)):
   Turn the stream parameter into a template parameter
   and constrain.
   * include/std/ostream (__is_convertible_to_basic_ostream): New.
   (__is_insertable): Likewise.
   (operator<<(basic_ostream<_CharT, _Traits>&&, const _Tp&)):
   Turn the stream parameter into a template parameter
   and constrain.
   * testsuite/27_io/basic_istream/extractors_other/char/4.cc: New.
   * testsuite/27_io/basic_istream/extractors_other/wchar_t/4.cc:
   Likewise.
   * testsuite/27_io/basic_ostream/inserters_other/char/6.cc: Likewise.
   * testsuite/27_io/basic_ostream/inserters_other/wchar_t/6.cc: Likewise.


OK, thanks.




Re: [PATCH][AArch64] Separate shrink wrapping hooks implementation

2016-11-29 Thread Segher Boessenkool
Hi James, Kyrill,

On Tue, Nov 29, 2016 at 10:57:33AM +, James Greenhalgh wrote:
> > +static sbitmap
> > +aarch64_components_for_bb (basic_block bb)
> > +{
> > +  bitmap in = DF_LIVE_IN (bb);
> > +  bitmap gen = &DF_LIVE_BB_INFO (bb)->gen;
> > +  bitmap kill = &DF_LIVE_BB_INFO (bb)->kill;
> > +
> > +  sbitmap components = sbitmap_alloc (V31_REGNUM + 1);
> > +  bitmap_clear (components);
> > +
> > +  /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets.  */
> > +  for (unsigned regno = R0_REGNUM; regno <= V31_REGNUM; regno++)
> 
> The use of R0_REGNUM and V31_REGNUM scares me a little bit, as we're hardcoding
> where the end of the register file is (does this, for example, fall apart
> with the SVE work that was recently posted?). Something like a
> LAST_HARDREG_NUM might work?

Components and registers aren't the same thing (you can have components
for things that aren't just a register save, e.g. the frame setup, stack
alignment, save of some non-GPR via a GPR, PIC register setup, etc.)
The loop here should really only cover the non-volatile registers, and
there should be some translation from register number to component number
(it of course is convenient to have a 1-1 translation for the GPRs and
floating point registers).  For rs6000 many things in the backend already
use non-symbolic numbers for the FPRs and GPRs, so that is easier there.

> > +static void
> > +aarch64_disqualify_components (sbitmap, edge, sbitmap, bool)
> > +{
> > +}
> 
> Is there no default "do nothing" hook for this?

I can make the shrink-wrap code do nothing here if this hook isn't
defined, if you want?


Segher


Re: [libstdc++, testsuite] Add dg-require-thread-fence

2016-11-29 Thread Jonathan Wakely

On 16/11/16 22:18 +0100, Christophe Lyon wrote:

On 15 November 2016 at 12:50, Jonathan Wakely  wrote:

On 14/11/16 14:32 +0100, Christophe Lyon wrote:


On 20 October 2016 at 19:40, Jonathan Wakely  wrote:


On 20/10/16 10:33 -0700, Mike Stump wrote:



On Oct 20, 2016, at 9:34 AM, Jonathan Wakely  wrote:




On 20/10/16 09:26 -0700, Mike Stump wrote:



On Oct 20, 2016, at 5:20 AM, Jonathan Wakely wrote:




I am considering leaving this in the ARM backend to force people to
think what they want to do about thread safety with statics and C++
on bare-metal systems.




The quoting makes it look like those are my words, but I was quoting
Ramana from https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02751.html


Not quite in the GNU spirit?  The port people should decide the best way
to get as much functionality as possible and everything should just work,
no sharp edges.

Forcing people to think sounds like a sharp edge?




I'm inclined to agree, but we are talking about bare metal systems,




So?  gcc has been doing bare metal systems for more than 2 years now.  It
is pretty good at it.  All my primary targets today are themselves bare
metal systems (I test with newlib).


where there is no one-size-fits-all solution.




Configurations are like ice cream cones.  Everyone gets their flavor no
matter how weird or strange.  Putting nails in a cone because you don't
know if they like vanilla or chocolate isn't reasonable.  If you want,
make two flavors, and vend two, if you want to just do one, pick the
flavor and vend it.  Put an enum #define default_flavor vanilla, and you
then have support for any flavor you want.  Want to add a configure option
for the flavor select, add it.  You want to make a -mflavor=chocolate
option, add it.  gcc is literally littered with these things.




Like I said, you can either build the library with
-fno-threadsafe-statics or you can provide a definition of the missing
symbol.


I gave this a try (using CXXFLAGS_FOR_TARGET=-fno-threadsafe-statics).
It seems to do the trick indeed: almost all tests now pass, the flag is
added to testcase compilation.

Among the 6 remaining failures, I noticed these two:
- experimental/type_erased_allocator/2.cc: still complains about the
missing __sync_synchronize. Does it need dg-require-thread-fence?



Yes, I think that test actually uses atomics directly, so does depend
on the fence.


I've attached the patch to achieve this.
Is it OK?


Yes, OK, thanks.


- abi/header_cxxabi.c complains because the option is not valid for C.
I can see the test is already skipped for other C++-only options: is it OK
if I submit a patch to skip it if -fno-threadsafe-statics is used?



Yes, it makes sense there too.


This one is not as obvious as I hoped. I tried:
-// { dg-skip-if "invalid options for C" { *-*-* } { "-std=c++??" "-std=gnu++??" } }
+// { dg-skip-if "invalid options for C" { *-*-* } { "-std=c++??" "-std=gnu++??" "-fno-threadsafe-statics" } }

but it does not work.

I set CXXFLAGS_FOR_TARGET=-fno-threadsafe-statics
before running GCC's configure.

This results in -fno-threadsafe-statics being used when compiling the tests,
but dg-skip-if does not consider it: it would if I passed it via
runtestflags/target-board, but then it would mean passing this flag
to all tests, not only the c++ ones, leading to errors everywhere.

Am I missing something?


I'm not sure how to deal with that.



Re: Pretty printers for versioned namespace

2016-11-29 Thread Jonathan Wakely

On 28/11/16 22:19 +0100, François Dumont wrote:

Hi

   Here is a patch to fix pretty printers when versioned namespace is
activated.


   You will see that I have hesitated in making the fix independent
of the version being used. In source files you will find (__7::)?
patterns while in xmethods.py I chose (__\d+::)? making it ready for
__8 and forward. Do you want to generalize one option? If so, which
one?


I don't really mind, but I note that the point of the path
libstdcxx/v6/printers.py was that we'd have different printers for v7,
v8 etc. ... I think it's simpler to keep everything in one place
though. 

   At the moment the versioned namespace is visible within gdb, it displays
for instance 'std::__7::string'. I am pretty sure we could hide it, is
it preferable? I would need some time to do so as I am neither a
python nor regex expert.


It's fine to display it.

   I am not fully happy with the replication in printers.py of
StdRbtreeIteratorPrinter and StdExpAnyPrinter(SingleObjContainerPrinter)
as StdVersionedRbtreeIteratorPrinter and
StdExpVerAnyPrinter(SingleObjContainerPrinter) respectively, just to adapt
2 lines where regex is not an option. We could surely keep only one and
pass it '' or '__7'. But as I said I am not a python expert so any help
would be appreciated.


We definitely want to avoid that duplication. For
StdRbtreeIteratorPrinter you can just look at 'typename' and see
whether it starts with "std::__7" or not. If it does, you need to lookup
std::__7::_Rb_tree_node<...>, otherwise you need to lookup
std::_Rb_tree_node<...> instead.

For StdExpAnyPrinter just do two replacements: first replace
std::string with the result of gdb.lookup_type('std::string') and then
replace std::__7::string with the result of looking that up. Are you
sure that's even needed though? Does std::__7::string actually appear
in the manager function's name? I would expect it to appear as
std::__7::basic_string >
which doesn't need to be expanded anyway. So I think you can just
remove your StdExpVerAnyPrinter.



--- a/libstdc++-v3/testsuite/lib/gdb-test.exp
+++ b/libstdc++-v3/testsuite/lib/gdb-test.exp
@@ -74,6 +74,14 @@ proc whatis-test {var result} {
lappend gdb_tests $var $result whatis
}

+# A test of 'whatis'.  This tests a type rather than a variable through a
+# regexp.


Please use "regular expression" here rather than "regexp".


+proc whatis-regexp-test {var result} {
+global gdb_tests
+
+lappend gdb_tests $var $result whatisrexp
+}
+


And something other than "whatisrexp" e.g. "whatis_regexp" would be
OK, but "rexp" is not a conventional abbreviation.



Re: [PATCH] correct handling of non-constant width and precision (pr 78521)

2016-11-29 Thread Martin Sebor

On 11/29/2016 09:56 AM, Martin Sebor wrote:

On 11/28/2016 05:42 PM, Joseph Myers wrote:

On Sun, 27 Nov 2016, Martin Sebor wrote:


Finally, the patch also tightens up the constraint on the upper bound
of bounded functions like snprintf to be INT_MAX.  The functions cannot
produce output in excess of INT_MAX + 1 bytes and some implementations
(e.g., Solaris) fail with EINVAL when the bound is INT_MAX or more.
This is the subject of PR 78520.


Note that failing with large bounds is questionable (there is an apparent
conflict between ISO C, where passing a large bound seems valid, and
POSIX, where large bounds require errors; see
; I'm not sure if any liaison
issue for this ever got passed to WG14).


Thanks!  That's useful background.  Let me check with Nick to see
if he (as the POSIX/WG14 liaison) plans to submit it.  I can also
write it up for the next WG14 meeting if we or the Austin Group
feel like WG14 should clarify or change things.


I've been looking at the original BSD sources where snprintf came
from (AFAICT).  The first implementation I could find is in Net/2
from 1988.  It returns EOF when the size after conversion to int
is less than 1.  The same code is still in 4.4BSD.

Early UNIX implementations also have the limitation that the buffer
size maintained by struct FILE is an int.  Since snprintf on these
early implementations usually uses vfprintf to do the work (with
the count being set to the snprintf bound), it can't store more than
INT_MAX bytes without overflowing the counter.

http://minnie.tuhs.org/cgi-bin/utree.pl?file=Net2/usr/src/lib/libc/stdio/snprintf.c

It looks to me like the POSIX spec is faithful to the historical
implementations and C should consider either tightening up its
constraints or making the behavior implementation-defined to allow
for more modern implementations that don't have this restriction.
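
For code that has to run on both kinds of implementation, one option is to
check the bound before calling.  The wrapper below is only a sketch:
checked_snprintf is a hypothetical name, and the cut-off it uses (reject
anything above INT_MAX) is the sketch's own choice rather than something
either standard mandates for callers.

```c
#include <assert.h>
#include <limits.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: refuse bounds that an implementation with an
   int-sized counter could not represent.  Returns -1 without
   formatting anything when SIZE exceeds INT_MAX.  */
static int
checked_snprintf (char *buf, size_t size, const char *fmt, ...)
{
  va_list ap;
  int n;

  assert (fmt != NULL);
  if (size > (size_t) INT_MAX)
    return -1;

  va_start (ap, fmt);
  n = vsnprintf (buf, size, fmt, ap);
  va_end (ap);
  return n;
}
```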

Martin


Re: [PATCH] Remove uninitialized reads of is_leaf

2016-11-29 Thread Wilco Dijkstra
Jeff Law wrote:
> On 11/29/2016 11:39 AM, Wilco Dijkstra wrote:
> > I forgot to ask, would it be reasonable to add an assert to check we're
> > not in a sequence in leaf_function_p? I guess this will trigger on
> > several targets (leaf_function_p is used in several backends) but it's
> > a real bug if crtl->is_leaf is true.
> Can it wait for the next stage1?  I'd hate to start tripping the assert
> all over the place at this point in the release cycle.

Yes I don't think it is urgent as the incorrect value returned would likely
make a leaf function save/restore the return address unnecessarily. It starts
to generate incorrect code on ARM if you remove the if (reload_completed) test
in arm_get_frame_offsets (which should just be an optimization to avoid
recomputing the frame layout repeatedly, not essential for correctness).

Wilco

[PATCH] Fix format_integer (PR tree-optimization/78586)

2016-11-29 Thread Jakub Jelinek
Hi!

As mentioned in the PR, the LSHIFT_EXPR computation of the values that
will need the longest or shortest string is both incorrect (it shifts
integer_one_node left, so for precisions above that of int it returns 0,
not to mention that it is invalid GENERIC, because the types of the first
operand and the result have to match) and unnecessary: every integral
type already has TYPE_MIN_VALUE and TYPE_MAX_VALUE readily available.
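
Why the minimum value is the interesting one for signed types can be seen
with a small experiment.  This is a sketch and not code from
gimple-ssa-sprintf.c; formatted_length is a hypothetical helper that asks
snprintf how many bytes a value would take.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Number of bytes "%lld" produces for V, not counting the
   terminating NUL.  */
static int
formatted_length (long long v)
{
  int n = snprintf (NULL, 0, "%lld", v);
  assert (n > 0);
  return n;
}
```

With this helper, formatted_length (LLONG_MIN) comes out one byte longer
than formatted_length (LLONG_MAX): both have the same number of digits,
and the minimum carries a sign in addition.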

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, in the PR I've raised various further questions, Martin, can you look
at them?

2016-11-29  Jakub Jelinek  

PR tree-optimization/78586
* gimple-ssa-sprintf.c (format_integer): Use TYPE_MAX_VALUE or
TYPE_MIN_VALUE or build_all_ones_cst instead of folding LSHIFT_EXPR.
Don't build_int_cst min/max twice.  Formatting fix.

* gcc.c-torture/execute/pr78586.c: New test.

--- gcc/gimple-ssa-sprintf.c.jj 2016-11-28 23:50:20.0 +0100
+++ gcc/gimple-ssa-sprintf.c2016-11-29 15:54:17.605892667 +0100
@@ -1068,7 +1068,8 @@ format_integer (const conversion_spec 
   tree argmin = NULL_TREE;
   tree argmax = NULL_TREE;
 
-  if (arg && TREE_CODE (arg) == SSA_NAME
+  if (arg
+  && TREE_CODE (arg) == SSA_NAME
   && TREE_CODE (argtype) == INTEGER_TYPE)
 {
   /* Try to determine the range of values of the integer argument
@@ -1090,12 +1091,8 @@ format_integer (const conversion_spec 
 the upper bound for %i but -3 for %u.  */
  if (wi::neg_p (min) && !wi::neg_p (max))
{
- argmin = build_int_cst (argtype, wi::fits_uhwi_p (min)
- ? min.to_uhwi () : min.to_shwi ());
-
- argmax = build_int_cst (argtype, wi::fits_uhwi_p (max)
- ? max.to_uhwi () : max.to_shwi ());
-
+ argmin = res.argmin;
+ argmax = res.argmax;
  int minbytes = format_integer (spec, res.argmin).range.min;
  int maxbytes = format_integer (spec, res.argmax).range.max;
  if (maxbytes < minbytes)
@@ -1154,21 +1151,25 @@ format_integer (const conversion_spec 
   int typeprec = TYPE_PRECISION (dirtype);
   int argprec = TYPE_PRECISION (argtype);
 
-  if (argprec < typeprec || POINTER_TYPE_P (argtype))
+  if (argprec < typeprec)
{
- if (TYPE_UNSIGNED (argtype))
+ if (POINTER_TYPE_P (argtype))
argmax = build_all_ones_cst (argtype);
+ else if (TYPE_UNSIGNED (argtype))
+   argmax = TYPE_MAX_VALUE (argtype);
  else
-   argmax = fold_build2 (LSHIFT_EXPR, argtype, integer_one_node,
- build_int_cst (integer_type_node,
-argprec - 1));
+   argmax = TYPE_MIN_VALUE (argtype);
}
   else
{
- argmax = fold_build2 (LSHIFT_EXPR, dirtype, integer_one_node,
-   build_int_cst (integer_type_node,
-  typeprec - 1));
+ if (POINTER_TYPE_P (dirtype))
+   argmax = build_all_ones_cst (dirtype);
+ else if (TYPE_UNSIGNED (dirtype))
+   argmax = TYPE_MAX_VALUE (dirtype);
+ else
+   argmax = TYPE_MIN_VALUE (dirtype);
}
+
   res.argmin = argmin;
   res.argmax = argmax;
 }
--- gcc/testsuite/gcc.c-torture/execute/pr78586.c.jj 2016-11-29 16:11:35.283742461 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr78586.c 2016-11-29 16:11:16.0 +0100
@@ -0,0 +1,17 @@
+/* PR tree-optimization/78586 */
+
+void
+foo (unsigned long x)
+{
+  char a[30];
+  unsigned long b = __builtin_sprintf (a, "%lu", x);
+  if (b != 4)
+__builtin_abort ();
+}
+
+int
+main ()
+{
+  foo (1000);
+  return 0;
+}

Jakub
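
As a rough illustration of why the choice of bound matters for unsigned
types, the sketch below counts decimal digits for two candidate "largest"
values.  For a 64-bit unsigned type, 1 << 63 formats to 19 digits while the
all-ones value formats to 20.  The helper names are hypothetical and are not
part of gimple-ssa-sprintf.c; this only mirrors the arithmetic, not the
pass itself.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes a "%lu"-style conversion emits for VALUE in base 10.  */
static int
decimal_digits (uint64_t value)
{
  int digits = 1;
  while (value >= 10)
    {
      value /= 10;
      digits++;
    }
  return digits;
}

/* Longest decimal output for any value of an unsigned type that is
   PRECISION bits wide.  The widest value is the all-ones pattern,
   not 1 << (PRECISION - 1).  */
static int
max_unsigned_digits (int precision)
{
  assert (precision >= 1 && precision <= 64);
  uint64_t all_ones = (precision == 64
		       ? UINT64_MAX
		       : (UINT64_C (1) << precision) - 1);
  return decimal_digits (all_ones);
}
```

Calling decimal_digits (1000) gives 4, which is the value the new testcase
expects __builtin_sprintf to return for "%lu" with x == 1000.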


[PATCH] Another debug info stv fix (PR rtl-optimization/78547)

2016-11-29 Thread Jakub Jelinek
Hi!

The following testcase ICEs because DECL_RTL/DECL_INCOMING_RTL are adjusted
by the stv pass through the PUT_MODE modifications, which means that for
var-tracking.c they contain a bogus mode.

Fixed by wrapping those into TImode subreg or adjusting the MEMs to have the
correct mode.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-29  Jakub Jelinek  

PR rtl-optimization/78547
* config/i386/i386.c (convert_scalars_to_vector): If any
insns have been converted, adjust all parameters' DECL_RTL and
DECL_INCOMING_RTL back from V1TImode to TImode if the parameters have
TImode.

--- gcc/config/i386/i386.c.jj   2016-11-29 08:31:58.0 +0100
+++ gcc/config/i386/i386.c  2016-11-29 12:21:36.867323776 +0100
@@ -4075,6 +4075,39 @@ convert_scalars_to_vector ()
crtl->stack_alignment_needed = 128;
   if (crtl->stack_alignment_estimated < 128)
crtl->stack_alignment_estimated = 128;
+  /* Fix up DECL_RTL/DECL_INCOMING_RTL of arguments.  */
+  if (TARGET_64BIT)
+   for (tree parm = DECL_ARGUMENTS (current_function_decl);
+parm; parm = DECL_CHAIN (parm))
+ {
+   if (TYPE_MODE (TREE_TYPE (parm)) != TImode)
+ continue;
+   if (DECL_RTL_SET_P (parm)
+   && GET_MODE (DECL_RTL (parm)) == V1TImode)
+ {
+   rtx r = DECL_RTL (parm);
+   if (REG_P (r))
+ SET_DECL_RTL (parm, gen_rtx_SUBREG (TImode, r, 0));
+   else
+ {
+   gcc_assert (MEM_P (r));
+   SET_DECL_RTL (parm, adjust_address_nv (r, TImode, 0));
+ }
+ }
+   if (DECL_INCOMING_RTL (parm)
+   && GET_MODE (DECL_INCOMING_RTL (parm)) == V1TImode)
+ {
+   rtx r = DECL_INCOMING_RTL (parm);
+   if (REG_P (r))
+ DECL_INCOMING_RTL (parm) = gen_rtx_SUBREG (TImode, r, 0);
+   else
+ {
+   gcc_assert (MEM_P (r));
+   DECL_INCOMING_RTL (parm)
+ = change_address (r, TImode, NULL_RTX);
+ }
+ }
+ }
 }
 
   return 0;
--- gcc/testsuite/gcc.dg/pr78547.c.jj   2016-11-29 12:26:26.544662630 +0100
+++ gcc/testsuite/gcc.dg/pr78547.c  2016-11-29 12:26:09.0 +0100
@@ -0,0 +1,18 @@
+/* PR rtl-optimization/78547 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-Os -g -freorder-blocks-algorithm=simple -Wno-psabi" } */
+/* { dg-additional-options "-mstringop-strategy=libcall" { target i?86-*-* x86_64-*-* } } */
+
+typedef unsigned __int128 u128;
+typedef unsigned __int128 V __attribute__ ((vector_size (64)));
+
+V
+foo (u128 a, u128 b, u128 c, V d)
+{
+  V e = (V) {a};
+  V f = e & 1;
+  e = 0 != e;
+  c = c;
+  f = f << ((V) {c} & 7);
+  return f + e;
+}

Jakub


Re: [PATCH] combine: Tweak change_zero_ext

2016-11-29 Thread Uros Bizjak
> 2016-11-26  Segher Boessenkool  
>
> * combine.c (change_zero_ext): Also handle extends from a subreg
> to a mode bigger than that of the operand of the subreg.

This patch introduced:

FAIL: gcc.target/i386/pr44578.c (internal compiler error)

on i686 (or x86_64 32bit multi-lib).

./cc1 -O2 -mtune=athlon64 -m32 -quiet pr44578.c
pr44578.c: In function ‘test’:
pr44578.c:18:1: internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:908
 }
 ^
0x81493b gen_rtx_SUBREG(machine_mode, rtx_def*, int)
/home/uros/gcc-svn/trunk/gcc/emit-rtl.c:908
0x122609f change_zero_ext
/home/uros/gcc-svn/trunk/gcc/combine.c:11260
0x1226207 recog_for_combine
/home/uros/gcc-svn/trunk/gcc/combine.c:11346
0x1236db3 try_combine
/home/uros/gcc-svn/trunk/gcc/combine.c:3501
0x123a3e0 combine_instructions
/home/uros/gcc-svn/trunk/gcc/combine.c:1265
0x123a3e0 rest_of_handle_combine
/home/uros/gcc-svn/trunk/gcc/combine.c:14581
0x123a3e0 execute
/home/uros/gcc-svn/trunk/gcc/combine.c:14626

Uros.


[PATCH, i386]: Move mask ops from i386.md to sse.md ...

2016-11-29 Thread Uros Bizjak
... and fix gcc.target/i386/avx512f-kmovw-1.c scan-asm failure.

2016-11-29  Uros Bizjak  

* config/i386/sse.md (UNSPEC_MASKOP): Move from i386.md.
(mshift): Ditto.
(SWI1248_AVX512BWDQ): Ditto.
(SWI1248_AVX512BW): Ditto.
(k): Ditto.
(kandn): Ditto.
(kxnor): Ditto.
(knot): Ditto.
(*k): Ditto.
(kortestzhi, kortestchi): Ditto.
(kunpckhi, kunpcksi, kunpckdi): Ditto.

testsuite/ChangeLog:

2016-11-29  Uros Bizjak  

* gcc.target/i386/avx512f-kmovw-1.c (avx512f_test):
Force value through k register.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 242963)
+++ config/i386/i386.md (working copy)
@@ -186,9 +186,6 @@
   UNSPEC_PDEP
   UNSPEC_PEXT
 
-  ;; For AVX512F support
-  UNSPEC_KMASKOP
-
   UNSPEC_BNDMK
   UNSPEC_BNDMK_ADDR
   UNSPEC_BNDSTX
@@ -921,9 +918,6 @@
 (define_code_attr shift [(ashift "sll") (lshiftrt "shr") (ashiftrt "sar")])
 (define_code_attr vshift [(ashift "sll") (lshiftrt "srl") (ashiftrt "sra")])
 
-;; Mask variant left right mnemonics
-(define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])
-
 ;; Mapping of rotate operators
 (define_code_iterator any_rotate [rotate rotatert])
 
@@ -966,15 +960,6 @@
 ;; All integer modes.
 (define_mode_iterator SWI1248x [QI HI SI DI])
 
-;; All integer modes with AVX512BW/DQ.
-(define_mode_iterator SWI1248_AVX512BWDQ
-  [(QI "TARGET_AVX512DQ") HI (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
-
-;; All integer modes with AVX512BW, where HImode operation
-;; can be used instead of QImode.
-(define_mode_iterator SWI1248_AVX512BW
-  [QI HI (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
-
 ;; All integer modes without QImode.
 (define_mode_iterator SWI248x [HI SI DI])
 
@@ -2489,11 +2474,6 @@
   ]
   (const_string "SI")))])
 
-(define_expand "kmovw"
-  [(set (match_operand:HI 0 "nonimmediate_operand")
-   (match_operand:HI 1 "nonimmediate_operand"))]
-  "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
-
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r ,r ,m ,k,k ,r,m")
(match_operand:HI 1 "general_operand"  "r ,rn,rm,rn,r,km,k,k"))]
@@ -8061,28 +8041,6 @@
   operands[3] = gen_lowpart (QImode, operands[3]);
 })
 
-(define_insn "k"
-  [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k")
-   (any_logic:SWI1248_AVX512BW
- (match_operand:SWI1248_AVX512BW 1 "register_operand" "k")
- (match_operand:SWI1248_AVX512BW 2 "register_operand" "k")))
-   (unspec [(const_int 0)] UNSPEC_KMASKOP)]
-  "TARGET_AVX512F"
-{
-  if (get_attr_mode (insn) == MODE_HI)
-return "kw\t{%2, %1, %0|%0, %1, %2}";
-  else
-return "k\t{%2, %1, %0|%0, %1, %2}";
-}
-  [(set_attr "type" "msklog")
-   (set_attr "prefix" "vex")
-   (set (attr "mode")
- (cond [(and (match_test "mode == QImode")
-(not (match_test "TARGET_AVX512DQ")))
-  (const_string "HI")
-  ]
-  (const_string "")))])
-
 ;; %%% This used to optimize known byte-wide and operations to memory,
 ;; and sometimes to QImode registers.  If this is considered useful,
 ;; it should be done with splitters.
@@ -8576,29 +8534,6 @@
   operands[2] = gen_lowpart (QImode, operands[2]);
 })
 
-(define_insn "kandn"
-  [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k")
-   (and:SWI1248_AVX512BW
- (not:SWI1248_AVX512BW
-   (match_operand:SWI1248_AVX512BW 1 "register_operand" "k"))
- (match_operand:SWI1248_AVX512BW 2 "register_operand" "k")))
-   (unspec [(const_int 0)] UNSPEC_KMASKOP)]
-  "TARGET_AVX512F"
-{
-  if (get_attr_mode (insn) == MODE_HI)
-return "kandnw\t{%2, %1, %0|%0, %1, %2}";
-  else
-return "kandn\t{%2, %1, %0|%0, %1, %2}";
-}
-  [(set_attr "type" "msklog")
-   (set_attr "prefix" "vex")
-   (set (attr "mode")
- (cond [(and (match_test "mode == QImode")
-(not (match_test "TARGET_AVX512DQ")))
- (const_string "HI")
-  ]
-  (const_string "")))])
-
 (define_insn_and_split "*andndi3_doubleword"
   [(set (match_operand:DI 0 "register_operand" "=r")
(and:DI
@@ -8987,92 +8922,6 @@
(set_attr "type" "alu")
(set_attr "modrm" "1")
(set_attr "mode" "QI")])
-
-(define_insn "kxnor"
-  [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k")
-   (not:SWI1248_AVX512BW
- (xor:SWI1248_AVX512BW
-   (match_operand:SWI1248_AVX512BW 1 "register_operand" "k")
-   (match_operand:SWI1248_AVX512BW 2 "register_operand" "k"
-   (unspec [(const_int 0)] UNSPEC_KMASKOP)]
-  "TARGET_AVX512F"
-{
-  if (get_attr_mode (insn) == MODE_HI)
-return "kxnorw\t{%2, %1, %0|%0, %1, %2}";
-  else
-return "kxnor\t{%2, %1, %0|%0, %1, %2}";
-}
-  [(set_attr "type" "msklog")
-  

[PATCH], PR 78594, Fix storing QImode/HImode on ISA 3.0/power9

2016-11-29 Thread Michael Meissner
I was developing the next round of ISA 3.0 code changes to use the vector
extract byte, half word, and word instructions (VEXTU{B,H,W}{R,L}X) that
deposit the value into a general purpose register instead of a vector register,
and I was running the changes through the simulator.  I discovered that my
previous change to allow QImode/HImode did not work if the value was in a
traditional Altivec register.

This fixes the problem that I noticed.  I didn't bother doing the full
bootstrap and check, since it only affects the power9 target.  Can I check this
in?

2016-11-29  Michael Meissner  

PR target/78594
* config/rs6000/rs6000.md (mov_internal, QHI iterator): Add
'x' to stxsix print pattern, so that QImode and HImode values
residing in traditional altivec registers can be stored
correctly.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 242942)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -6863,7 +6863,7 @@ (define_insn "*mov_internal"
lz%U1%X1 %0,%1
lxsizx %x0,%y1
st%U0%X0 %1,%0
-   stxsix %1,%y0
+   stxsix %x1,%y0
li %0,%1
xxlor %x0,%x1,%x1
xxspltib %x0,0

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH 7/9] Add RTL-error-handling to host

2016-11-29 Thread David Malcolm
On Tue, 2016-11-29 at 18:23 +0100, Bernd Schmidt wrote:
> On 11/29/2016 06:20 PM, David Malcolm wrote:
> > 
> > if that distinction makes sense.  Clearly we already have a
> > diagnostics subsystem on the host; what this patch is adding is the
> > separate, rtl-specific diagnostic subsystem to cc1 on the host.
> 
> So that still seems odd to me. Why not use the normal diagnostics 
> subsystem, and add whatever you need from it to errors.c for use from
> the generator programs? What exactly makes it "rtl-specific"?

The main issue is that the normal diagnostics subsystem tracks
locations using location_t (aka libcpp's source_location), rather than
read-md.h's struct file_location, so we'd need to start using libcpp
from the generator programs, porting the location tracking to using
libcpp (e.g. creating linemaps for the files).

There would also be various Makefile.in tweaking to build various files
twice; hopefully that wouldn't lead to any unexpected issues.

Quoting from:
  https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00648.html

> There seem to be two ways to do this:
>
>   (A) build the "light" diagnostics system (errors.c) for the host as
> well as build machine, and link it with the RTL reader there, so
> there are two parallel diagnostics subsystems.
>
>   (B) build the "real" diagnostics system (diagnostics*) for the
> *build* machine as well as the host, and use it from the gen* tools,
> eliminating the "light" system, and porting the gen* tools to use
> libcpp for location tracking.
>
> Approach (A) seems to be simpler, which is what this part of the
> patch does.
>
> I've experimented with approach (B).  I think it's doable, but it's
> much more invasive (perhaps needing a libdiagnostics.a and a
> build/libdiagnostics.a in gcc/Makefile.in), so I hope this can be
> followup work.
>
> I can split the relevant parts out into a separate patch, but I was
> wondering if either of you had a strong opinion on (A) vs (B) before
> I do so?

This patch implements approach (A).

Would you prefer that I went with approach (B), or is approach (A)
acceptable?

Thanks
Dave


Re: [PATCH] Remove uninitialized reads of is_leaf

2016-11-29 Thread Jeff Law

On 11/29/2016 11:39 AM, Wilco Dijkstra wrote:
> Jeff Law wrote:
>> On 11/29/2016 04:10 AM, Wilco Dijkstra wrote:
>>> GCC caches whether a function is a leaf in crtl->is_leaf. Using this
>>> in the backend is best as leaf_function_p may not work correctly (eg. while
>>> emitting prolog or epilog code).
>
> I forgot to ask, would it be reasonable to add an assert to check we're not in
> a sequence in leaf_function_p? I guess this will trigger on several targets
> (leaf_function_p is used in several backends) but it's a real bug if
> crtl->is_leaf is true.

Can it wait for the next stage1?  I'd hate to start tripping the assert
all over the place at this point in the release cycle.


jeff


Re: [PATCH] Remove uninitialized reads of is_leaf

2016-11-29 Thread Wilco Dijkstra
Jeff Law wrote:
> On 11/29/2016 04:10 AM, Wilco Dijkstra wrote:
> > GCC caches whether a function is a leaf in crtl->is_leaf. Using this
> > in the backend is best as leaf_function_p may not work correctly (eg. while
> > emitting prolog or epilog code). 

I forgot to ask, would it be reasonable to add an assert to check we're not in
a sequence in leaf_function_p? I guess this will trigger on several targets
(leaf_function_p is used in several backends) but it's a real bug if 
crtl->is_leaf is true.

Wilco


Re: [PING] [PATCH] Fix PR31096

2016-11-29 Thread Jeff Law

On 11/22/2016 10:25 PM, Hurugalawadi, Naveen wrote:
> Hi,
>
> Please consider this as a personal reminder to review the patch
> at the following link and let me know your comments on the same.
>
> https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01049.html

I believe Richi asked for a small change after which you can consider
the patch approved:

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02320.html

jeff


Re: [PATCH] Remove uninitialized reads of is_leaf

2016-11-29 Thread Jeff Law

On 11/29/2016 04:10 AM, Wilco Dijkstra wrote:
> GCC caches whether a function is a leaf in crtl->is_leaf. Using this
> in the backend is best as leaf_function_p may not work correctly (eg. while
> emitting prolog or epilog code).  There are many reads of crtl->is_leaf
> before it is initialized.  Many targets do so in targetm.frame_pointer_required
> (eg. arm, aarch64, i386, mips, sparc), which is called before register
> allocation by ira_setup_eliminable_regset and sched_init.
>
> Additionally, SHRINK_WRAPPING_ENABLED calls targetm.have_simple_return,
> which evaluates the condition of the simple_return instruction.  On ARM
> this results in a call to use_simple_return_p which requires crtl->is_leaf
> to be set correctly.
>
> To fix this, initialize crtl->is_leaf in ira_setup_eliminable_regset and
> early on in ira.  A bootstrap did not find any uninitialized reads of
> crtl->is_leaf on Thumb-2.  A follow-up patch will remove incorrect uses
> of leaf_function_p from the ARM backend.
>
> Bootstrap OK (verified all reads of is_leaf in ARM backend are now after
> initialization), OK for commit?
>
> ChangeLog:
> 2016-11-29  Wilco Dijkstra
>
> * gcc/ira.c (ira_setup_eliminable_regset): Initialize crtl->is_leaf.
> (ira): Move initialization of crtl->is_leaf earlier.

OK.
jeff



Re: [Ping][PATCH 0/6][ARM] Implement support for ACLE Coprocessor Intrinsics

2016-11-29 Thread Andre Vieira (lists)
On 29/11/16 10:37, Kyrill Tkachov wrote:
> 
> On 29/11/16 10:35, Andre Vieira (lists) wrote:
>> On 21/11/16 08:42, Christophe Lyon wrote:
>>> Hi,
>>>
>>>
>>> On 17 November 2016 at 11:45, Kyrill Tkachov
>>>  wrote:
>>>> On 17/11/16 10:31, Andre Vieira (lists) wrote:
>>>>> Hi Kyrill,
>>>>>
>>>>> On 17/11/16 10:11, Kyrill Tkachov wrote:
>>>>>> Hi Andre,
>>>>>>
>>>>>> On 09/11/16 10:00, Andre Vieira (lists) wrote:
>>>>>>> Tested the series by bootstrapping arm-none-linux-gnueabihf and
>>>>>>> found no
>>>>>>> regressions, also did a normal build for arm-none-eabi and ran the
>>>>>>> acle.exp tests for a Cortex-M3.
>>>>>> Can you please also do a full testsuite run on
>>>>>> arm-none-linux-gnueabihf.
>>>>>> Patches have to be tested by the whole testsuite.
>>>>> That's what I have done and meant to say with "Tested the series by
>>>>> bootstrapping arm-none-linux-gnueabihf and found no regressions". I
>>>>> compared gcc/g++/libstdc++ tests on a bootstrap with and without the
>>>>> patches.
>>>>
>>>> Ah ok, great.
>>>>
>>>>> I'm happy to rerun the tests after a rebase when the patches get
>>>>> approved.
>>> FWIW, I ran a validation with the 6 patches applied, and saw no
>>> regression.
>>> Given the large number of new tests, I didn't check the full details.
>>>
>>> If you want to check that each configuration has the PASSes you expect,
>>> you can have a look at:
>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/242581-acle/report-build-info.html
>>>
>>>
>>> Thanks,
>>>
>>> Christophe
>>>
>>>
>>>> Thanks,
>>>> Kyrill
>>>>
>>>>> Cheers,
>>>>> Andre

> 
> Hi Andre,
> 
>> Ping. (For the patch series).
> 
> Have you seen my review at:
> https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01778.html ?
> It might require some minor rework of some parts of the series.
> 
> Thanks,
> Kyrill
> 
> 
Hmm no I had not, must have accidentally marked it as read...
I'll go work on the comments. Sorry for the ping.


[PATCH] Fix PR68838

2016-11-29 Thread David Edelsohn
Separate from ulimit, 32 bit AIX processes have the concept of memory
segments.  By default, AIX devotes one 256MB segment to the data
section of an executable.  Some libstdc++ testcases allocate more than
that amount of memory.  Instead of individually fixing tests, this
patch always adds the AIX linker option to allocate two segments for
data, fixing a number of wchar testcases.

PR libstdc++/68838
* testsuite/lib/libstdc++.exp (DEFAULT_CXXFLAGS): Add -Wl,-bmaxdata on AIX.
* testsuite/23_containers/vector/profile/vector.cc: Remove
dg-additional-options.

Index: libstdc++.exp
===
--- libstdc++.exp   (revision 242964)
+++ libstdc++.exp   (working copy)
@@ -136,6 +136,9 @@
if { [string match "powerpc-*-darwin*" $target_triplet] } {
append DEFAULT_CXXFLAGS " -multiply_defined suppress"
}
+   if { [string match "powerpc-ibm-aix*" $target_triplet] } {
+   append DEFAULT_CXXFLAGS " -Wl,-bmaxdata:0x2000"
+   }
 }

v3track DEFAULT_CXXFLAGS 2

Index: 23_containers/vector/profile/vector.cc
===
--- 23_containers/vector/profile/vector.cc  (revision 242964)
+++ 23_containers/vector/profile/vector.cc  (working copy)
@@ -2,8 +2,6 @@
 // Advice: set tmp as 1

 // { dg-options "-DITERATIONS=20" { target simulator } }
-// AIX requires higher memory limit
-// { dg-additional-options "-Wl,-bmaxdata:0x2000" { target { powerpc-ibm-aix* } } }

 #ifndef ITERATIONS
 #define ITERATIONS 2000


Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-11-29 Thread Jeff Law

On 11/29/2016 07:02 AM, Andrew Burgess wrote:

* Jeff Law  [2016-11-28 15:08:46 -0700]:


On 11/24/2016 02:40 PM, Andrew Burgess wrote:

* Christophe Lyon  [2016-11-21 13:47:09 +0100]:


On 20 November 2016 at 18:27, Mike Stump  wrote:

On Nov 19, 2016, at 1:59 PM, Andrew Burgess  wrote:

So, your new test fails on arm* targets:


After a little digging I think the problem might be that
-freorder-blocks-and-partition is not supported on arm.

This should be detected as the new tests include:

   /* { dg-require-effective-target freorder } */

however this test passed on arm as -freorder-blocks-and-partition does
not issue any warning unless -fprofile-use is also passed.

The patch below extends check_effective_target_freorder to check using
-fprofile-use.  With this change in place the tests are skipped on
arm.



All feedback welcome,


Seems reasonable, unless a -freorder-blocks-and-partition/-fprofile-use person 
thinks this is the wrong solution.



Hi,

As promised, I tested this patch: it makes
gcc.dg/tree-prof/section-attr-[123].c
unsupported on arm*, and thus they are not failing anymore :-)

However, it also makes other tests unsupported, while they used to pass:

  gcc.dg/pr33648.c
  gcc.dg/pr46685.c
  gcc.dg/tree-prof/20041218-1.c
  gcc.dg/tree-prof/bb-reorg.c
  gcc.dg/tree-prof/cold_partition_label.c
  gcc.dg/tree-prof/comp-goto-1.c
  gcc.dg/tree-prof/pr34999.c
  gcc.dg/tree-prof/pr45354.c
  gcc.dg/tree-prof/pr50907.c
  gcc.dg/tree-prof/pr52027.c
  gcc.dg/tree-prof/va-arg-pack-1.c

and failures are now unsupported:
  gcc.dg/tree-prof/cold_partition_label.c
  gcc.dg/tree-prof/section-attr-1.c
  gcc.dg/tree-prof/section-attr-2.c
  gcc.dg/tree-prof/section-attr-3.c

So, maybe this patch is too strong?


In all of the cases that used to pass the tests are compile only tests
(except for cold_partition_label, which I discuss below).

On ARM passing -fprofile-use and -freorder-blocks-and-partition
results in a warning, and the -freorder-blocks-and-partition flag is
ignored.  However, disabling -freorder-blocks-and-partition doesn't
stop any of the tests compiling, hence the passes.

All the tests include:

  /* { dg-require-effective-target freorder } */

which I understand to mean that the test requires the 'freorder' feature
to be supported (which corresponds to -freorder-blocks-and-partition).

For cold_partition_label and my new tests it seems clear that the
lack of support for -freorder-blocks-and-partition on ARM is the cause
of the test failures.

So, is it reasonable to give up the other tests as "unsupported"?  I'd
be inclined to say yes, but I'm happy to rework the patch if anyone has
a suggestion for an alternative approach.

It is reasonable.  It's not uncommon to have to drop various tests to
UNSUPPORTED, particularly things which depend on assembler/linker
capabilities, the target runtime system, etc.


OK, I'm going to take that as approval for my patch[1].  I'll wait a
couple of days to give people a chance to correct me, then I'll push
the change.  This should resolve the test regressions I introduced for
ARM.

I'll just go ahead and explicitly ACK this.

Thanks,
jeff



Re: [PATCH] remove %p handling from gimple-ssa-sprintf (pr78512)

2016-11-29 Thread Jeff Law

On 11/28/2016 07:57 PM, Martin Sebor wrote:

PR 78512 - r242674 miscompiles Linux kernel observes that the Linux
kernel fails to boot as a result of enabling the -fprintf-return-value
optimization in GCC.  This is likely because the kernel has its own
sprintf with a large set of extensions to the %p directive that
conflict with the optimization.  Ordinarily, programs that define
their own versions of C library functions that differ from what C
specifies are expected to disable GCC's built-ins (e.g., by
-fno-builtin, or for freestanding environments like the Linux kernel,
by -ffreestanding).  But the Linux kernel doesn't do that and hence
the conflict.

After discussing a few possible options (handling the kernel extensions
in GCC, providing a new GCC option to disable the %p handling, and
disabling both the optimization and the warning for calls involving
the %p directive), the last was viewed as the best alternative.  The
attached patch removes the %p handling from GCC.

And just to give a little more information here.

The fundamental issue is that %p handling is implementation defined and 
the implementations can (of course) change over time.   Handling of %p 
essentially turns into coding GCC to an implementation rather than a 
real specification.


The details of those implementations would have to be baked into GCC 
itself.  That wasn't terrible when we were just trying to support the 
rather limited cases found in glibc, uClibc, aix & solaris.  But when we 
add the linux kernel and its extensions into the mix, it didn't seem 
wise to continue bake that knowledge into GCC.   The need to support 
multiple %p implementations from a single compiler just makes things 
even worse.


After deliberating those issues, Jakub, Martin and I ultimately 
decided that supporting %p for warnings and optimization was unwise.
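
To make the "implementation defined" point concrete, here is a minimal
sketch that asks the C library at run time how many bytes "%p" produces.
The function name is hypothetical.  The only call used is standard
snprintf with a null buffer and zero size, which returns the length that
would have been written.  The answer can differ between C libraries, which
is why a compiler cannot safely assume one.

```c
#include <assert.h>
#include <stdio.h>

/* Ask the C library how many bytes "%p" produces for P.
   The result depends on the library in use, which is the point:
   nothing in the C standard fixes the format.  */
static int
pointer_format_length (const void *p)
{
  /* "%p" expects a void *, so drop the qualifier explicitly.  */
  int n = snprintf (NULL, 0, "%p", (void *) p);
  assert (n >= 0);
  return n;
}
```

A caller that needs the length has to query it like this rather than rely
on a constant folded in at compile time.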


It's unfortunate because the kernel makes extensive use of %p.  I guess 
one could create a plug-in to check %p for the kernel if they wanted to 
take advantage of the checking capabilities.


Approved for the trunk.

Thanks,

jeff


Re: Calling 'abort' on bounds violations in libmpx

2016-11-29 Thread Ilya Enkovich
2016-11-29 17:43 GMT+03:00 Alexander Ivchenko :
> Hi,
>
> Attached patch is addressing PR67520. Would that approach work for the
> problem? Should I also change the version of the library?

Hi!

Overall the patch is OK, but you need to change the version because you
change the default behavior. How did you test it? Did you check that the
default behavior change doesn't affect existing runtime MPX tests? Can we
add new ones?

Thanks,
Ilya

>
> 2016-11-29  Alexander Ivchenko  
>
> * mpxrt/mpxrt-utils.c (set_mpx_rt_stop_handler): New function.
> (print_help): Add help for CHKP_RT_STOP_HANDLER environment
> variable.
> (__mpxrt_init_env_vars): Add initialization of stop_handler.
> (__mpxrt_stop_handler): New function.
> (__mpxrt_stop): Ditto.
> * mpxrt/mpxrt-utils.h (mpx_rt_stop_mode_handler_t): New enum.
>
>
>
> diff --git a/libmpx/mpxrt/mpxrt-utils.c b/libmpx/mpxrt/mpxrt-utils.c
> index 057a355..63ee7c6 100644
> --- a/libmpx/mpxrt/mpxrt-utils.c
> +++ b/libmpx/mpxrt/mpxrt-utils.c
> @@ -60,6 +60,9 @@
>  #define MPX_RT_MODE "CHKP_RT_MODE"
>  #define MPX_RT_MODE_DEFAULT MPX_RT_COUNT
>  #define MPX_RT_MODE_DEFAULT_STR "count"
> +#define MPX_RT_STOP_HANDLER "CHKP_RT_STOP_HANDLER"
> +#define MPX_RT_STOP_HANDLER_DEFAULT MPX_RT_STOP_HANDLER_ABORT
> +#define MPX_RT_STOP_HANDLER_DEFAULT_STR "abort"
>  #define MPX_RT_HELP "CHKP_RT_HELP"
>  #define MPX_RT_ADDPID "CHKP_RT_ADDPID"
>  #define MPX_RT_BNDPRESERVE "CHKP_RT_BNDPRESERVE"
> @@ -84,6 +87,7 @@ typedef struct {
>  static int summary;
>  static int add_pid;
>  static mpx_rt_mode_t mode;
> +static mpx_rt_stop_mode_handler_t stop_handler;
>  static env_var_list_t env_var_list;
>  static verbose_type verbose_val;
>  static FILE *out;
> @@ -226,6 +230,23 @@ set_mpx_rt_mode (const char *env)
>}
>  }
>
> +static mpx_rt_stop_mode_handler_t
> +set_mpx_rt_stop_handler (const char *env)
> +{
> +  if (env == 0)
> +return MPX_RT_STOP_HANDLER_DEFAULT;
> +  else if (strcmp (env, "abort") == 0)
> +return MPX_RT_STOP_HANDLER_ABORT;
> +  else if (strcmp (env, "exit") == 0)
> +return MPX_RT_STOP_HANDLER_EXIT;
> +  {
> +__mpxrt_print (VERB_ERROR, "Illegal value '%s' for %s. Legal values are"
> +   " [abort | exit]\nUsing default value %s\n",
> +   env, MPX_RT_STOP_HANDLER, MPX_RT_STOP_HANDLER_DEFAULT_STR);
> +return MPX_RT_STOP_HANDLER_DEFAULT;
> +  }
> +}
> +
>  static void
>  print_help (void)
>  {
> @@ -244,6 +265,11 @@ print_help (void)
>fprintf (out, "%s \t\t set MPX runtime behavior on #BR exception."
> " [stop | count]\n"
> "\t\t\t [default: %s]\n", MPX_RT_MODE, MPX_RT_MODE_DEFAULT_STR);
> +  fprintf (out, "%s \t set the handler function MPX runtime will call\n"
> +   "\t\t\t on #BR exception when %s is set to \'stop\'."
> +   " [abort | exit]\n"
> +   "\t\t\t [default: %s]\n", MPX_RT_STOP_HANDLER, MPX_RT_MODE,
> +   MPX_RT_STOP_HANDLER_DEFAULT_STR);
>fprintf (out, "%s \t\t generate out,err file for each process.\n"
> "\t\t\t generated file will be MPX_RT_{OUT,ERR}_FILE.pid\n"
> "\t\t\t [default: no]\n", MPX_RT_ADDPID);
> @@ -357,6 +383,10 @@ __mpxrt_init_env_vars (int* bndpreserve)
>env_var_list_add (MPX_RT_MODE, env);
>mode = set_mpx_rt_mode (env);
>
> +  env = secure_getenv (MPX_RT_STOP_HANDLER);
> +  env_var_list_add (MPX_RT_STOP_HANDLER, env);
> +  stop_handler = set_mpx_rt_stop_handler (env);
> +
>env = secure_getenv (MPX_RT_BNDPRESERVE);
>env_var_list_add (MPX_RT_BNDPRESERVE, env);
>validate_bndpreserve (env, bndpreserve);
> @@ -487,6 +517,22 @@ __mpxrt_mode (void)
>return mode;
>  }
>
> +mpx_rt_stop_mode_handler_t
> +__mpxrt_stop_handler (void)
> +{
> +  return stop_handler;
> +}
> +
> +void __attribute__ ((noreturn))
> +__mpxrt_stop (void)
> +{
> +  if (__mpxrt_stop_handler () == MPX_RT_STOP_HANDLER_ABORT)
> +abort ();
> +  else if (__mpxrt_stop_handler () == MPX_RT_STOP_HANDLER_EXIT)
> +exit (255);
> +  __builtin_unreachable ();
> +}
> +
>  void
>  __mpxrt_print_summary (uint64_t num_brs, uint64_t l1_size)
>  {
> diff --git a/libmpx/mpxrt/mpxrt-utils.h b/libmpx/mpxrt/mpxrt-utils.h
> index d62937d..6da12cc 100644
> --- a/libmpx/mpxrt/mpxrt-utils.h
> +++ b/libmpx/mpxrt/mpxrt-utils.h
> @@ -54,6 +54,11 @@ typedef enum {
>MPX_RT_STOP
>  } mpx_rt_mode_t;
>
> +typedef enum {
> +  MPX_RT_STOP_HANDLER_ABORT,
> +  MPX_RT_STOP_HANDLER_EXIT
> +} mpx_rt_stop_mode_handler_t;
> +
>  void __mpxrt_init_env_vars (int* bndpreserve);
>  void __mpxrt_write_uint (verbose_type vt, uint64_t val, unsigned base);
>  void __mpxrt_write (verbose_type vt, const char* str);
> diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
> index b52906b..0bc069c 100644
> --- a/libmpx/mpxrt/mpxrt.c
> +++ b/libmpx/mpxrt/mpxrt.c
> @@ -252,7 +251,7 @@ handler (int sig __attribute__ ((unused)),
>uctxt->uc_mcontext.gregs[REG_IP_IDX] =
>  (greg_t)get_next_inst_ip ((uint8_t *)ip);
>if (__mpxrt_mode () == MPX_RT_STOP)
> -exit (255);
> +__mpxrt_stop ();
>return;
>
>   default:
> 
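
On the question of new tests: one possible starting point is to exercise
the environment-value parsing in isolation.  The sketch below is a
standalone approximation of that logic, not the libmpx code itself.  The
type and function names are hypothetical, and it reports unrecognized
values through an out parameter instead of printing, so that a test can
check both the chosen handler and whether the input was accepted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the stop-handler selection.  */
typedef enum {
  STOP_HANDLER_ABORT,
  STOP_HANDLER_EXIT
} stop_handler_t;

/* Map the environment string ENV to a handler.  An unset variable
   (ENV == NULL) selects the default, abort.  *RECOGNIZED is set to 0
   only when ENV is set to something other than "abort" or "exit",
   in which case the default is used as well.  */
static stop_handler_t
parse_stop_handler (const char *env, int *recognized)
{
  assert (recognized != NULL);
  *recognized = 1;
  if (env == NULL || strcmp (env, "abort") == 0)
    return STOP_HANDLER_ABORT;
  if (strcmp (env, "exit") == 0)
    return STOP_HANDLER_EXIT;
  *recognized = 0;
  return STOP_HANDLER_ABORT;
}
```

A runtime test would still be needed to confirm that a bounds violation
actually ends in abort () or exit (255) depending on the setting.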

Re: [PATCH 7/9] Add RTL-error-handling to host

2016-11-29 Thread Bernd Schmidt

On 11/29/2016 06:20 PM, David Malcolm wrote:
> if that distinction makes sense.  Clearly we already have a diagnostics
> subsystem on the host; what this patch is adding is the separate,
> rtl-specific diagnostic subsystem to cc1 on the host.

So that still seems odd to me. Why not use the normal diagnostics
subsystem, and add whatever you need from it to errors.c for use from
the generator programs? What exactly makes it "rtl-specific"?



Bernd


Re: [PATCH 7/9] Add RTL-error-handling to host

2016-11-29 Thread David Malcolm
On Mon, 2016-11-28 at 14:47 +0100, Bernd Schmidt wrote:
> Been looking at this off and on, and I'm still not sure I entirely
> get 
> it - sorry.
> 
> On 11/11/2016 10:15 PM, David Malcolm wrote:
> > > > Implementing an RTL frontend by using the RTL reader from
> > > > read-rtl.c means that we now need a diagnostics subsystem on the
> > > > *host* for handling errors in RTL files, rather than just on the
> > > > build machine.
> 
> So, there are two things that bother me about this patch description:
>   - The host already has the full diagnostic subsystem.

Maybe I worded this poorly.

I meant to say:

"we now need
  ((a diagnostics subsystem for handling errors in RTL files)
   on the *host*),
rather than just on the build machine."

rather than:

"we now need a
  ((diagnostics subsystem on the *host*)
   for handling
errors in RTL files),
 rather than just on the build machine."

if that distinction makes sense.  Clearly we already have a diagnostics
subsystem on the host; what this patch is adding is the separate,
rtl-specific diagnostic subsystem to cc1 on the host.

> The fact that
> you're commenting out some of the functions in errors.c suggests
> that errors.c is conflicting with the full one.

It doesn't conflict: C++ overloading allows both to co-exist.  However
I wanted to make sure that we don't accidentally use the RTL-specific
error-handling within other parts of the compiler.

>   - We already compile errors.c for both build and host.

Aha, yes, we do, it's linked into gengtype on the host to allow plugins
to support GTY.  The patch adds it to OBJS so that it is available
within cc1.

> Is there a problem with using both the full and the light errors
> system 
> for read-rtl, as available? Mismatches in function signatures or 
> something like this?

As noted above, C++ overloading allows this.

> > -#ifdef HOST_GENERATOR_FILE
> > -#include "config.h"
> > -#define GENERATOR_FILE 1
> > -#else
> > +/* This file is compiled twice: once for the generator programs
> > +   once for the compiler.  */
> > +#ifdef GENERATOR_FILE
> >  #include "bconfig.h"
> > +#else
> > +#include "config.h"
> >  #endif
> >  #include "system.h"
> >  #include "errors.h"
> 
> The Makefile still has a HOST_GENERATOR_FILE definition for errors.c 
> after this.

Will remove.


Re: [RFC] Assert DECL_ABSTRACT_ORIGIN is different from the decl itself

2016-11-29 Thread Jeff Law

On 11/29/2016 03:13 AM, Richard Biener wrote:

On Mon, Nov 28, 2016 at 6:28 PM, Martin Jambor  wrote:

Hi Jeff,

On Mon, Nov 28, 2016 at 08:46:05AM -0700, Jeff Law wrote:

On 11/28/2016 07:27 AM, Martin Jambor wrote:

Hi,

one of a number of symptoms of an otherwise unrelated HSA bug I've
been debugging today is gcc crashing or hanging in the C++ pretty
printer when attempting to emit a warning because dump_decl() ended up
in an infinite recursion calling itself on the DECL_ABSTRACT_ORIGIN of
the decl it was looking at, which was however the same thing.  (It was
set to itself on purpose in set_decl_origin_self as a part of final
pass, the decl was being printed because it was itself an abstract
origin of another one).

If someone ever faces a similar problem, the following (untested)
patch might save them a bit of time.  I have eventually decided not to
make it a checking-only assert because it is on a cold path and
because at release-build optimization levels, the tail-call is
optimized to a jump and thus an infinite loop if the described
situation happens, and I suppose an informative ICE is better than that
even for users.

What do you think?  Would it be reasonable for trunk even now or
should I queue it for the next stage1?

Thanks,

Martin


gcc/cp/

2016-11-28  Martin Jambor  

* error.c (dump_decl): Add an assert that DECL_ABSTRACT_ORIGIN
is not the decl itself.

Given it's on an error/debug path it ought to be plenty safe for now. What's
more interesting is whether or not DECL_ABSTRACT_ORIGIN can legitimately
point to itself and if so, how is that happening.


Well, I tried to explain it in my original email but I also wanted to
be as brief as possible, so perhaps it is necessary to elaborate a bit:

There is a function set_decl_origin_self() in dwarf2out.c that does
just that, sets DECL_ABSTRACT_ORIGIN to the decl itself, and its
comment makes it clear that is intended (according to git blame, the
whole comment and much of the implementation come from 1992, though ;-)
The function is called from the "final" pass through dwarf2out_decl(),
and gen_decl_die().

So, for one reason or another, this is the intended behavior.
Apparently, after that one is not supposed to be printing the decl
name of such a "finished" function.  It is too bad however that this
can happen if a "finished" function is itself an abstract origin of a
different one, which is optimized and expanded only afterwards and you
attempt to print its decl name, because it triggers printing the decl
name of the finished function, in turn triggering the infinite
recursion/loop.  I am quite surprised that we have not hit this
earlier (e.g. with warnings in IPA-CP clones) but perhaps there is a
reason.

I will append the patch to some bootstrap and testing run and commit
it afterwards if it passes.


Other users explicitly check for the self-reference when walking origins.
I think that makes it pretty clear that we have to handle
self-reference.  So it seems that, rather than adding an assert, we should
just not walk down a self-referencing DECL_ABSTRACT_ORIGIN.
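
A minimal sketch of that approach, assuming a hypothetical miniature
decl type rather than GCC's real tree nodes: the walk stops when a decl
has no origin or names itself as its own origin, so a self-reference
can neither recurse nor loop.

```cpp
#include <cassert>

// Hypothetical miniature of a declaration node; GCC's real tree type
// is far richer than this.
struct decl
{
  const char *name;
  decl *abstract_origin;   // null, another decl, or the decl itself
};

// Follow abstract_origin links to the ultimate origin.  Stop when a
// decl has no origin or is its own origin.
const decl *ultimate_origin (const decl *d)
{
  while (d->abstract_origin && d->abstract_origin != d)
    d = d->abstract_origin;
  return d;
}
```

This only guards against a direct self-reference, which is the case
described in this thread; longer cycles are not handled.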


jeff



Re: [PATCH] correct handling of non-constant width and precision (pr 78521)

2016-11-29 Thread Martin Sebor

On 11/28/2016 05:42 PM, Joseph Myers wrote:

On Sun, 27 Nov 2016, Martin Sebor wrote:


Finally, the patch also tightens up the constraint on the upper bound
of bounded functions like snprintf to be INT_MAX.  The functions cannot
produce output in excess of INT_MAX + 1 bytes and some implementations
(e.g., Solaris) fail with EINVAL when the bound is INT_MAX or more.
This is the subject of PR 78520.


Note that failing with large bounds is questionable (there is an apparent
conflict between ISO C, where passing a large bound seems valid, and
POSIX, where large bounds require errors; see
; I'm not sure if any liaison
issue for this ever got passed to WG14).


Thanks!  That's useful background.  Let me check with Nick to see
if he (as the POSIX/WG14 liaison) plans to submit it.  I can also
write it up for the next WG14 meeting if we or the Austin Group
feel like WG14 should clarify or change things.

Martin



Re: [PATCH] Fix PR78306

2016-11-29 Thread Jeff Law

On 11/29/2016 12:47 AM, Richard Biener wrote:

Balaji added this check explicitly. There should be tests in the testsuite
(spawnee_inline, spawner_inline) which exercise that code.


Yes he did, but no, nothing in the testsuite.

I believe the tests are:

c-c++-common/cilk-plus/CK/spawnee_inline.c
c-c++-common/cilk-plus/CK/spawner_inline.c

But as I mentioned, they don't check for proper behaviour




It is _nowhere_ documented _why_ the checks were added.  Why is
inlining a transform that can do anything bad to a function using
cilk_spawn?

I know, it's disappointing.  Even the tests mentioned above don't shed
any real light on the issue.



Jeff


Re: [PR middle-end/78566] Fix uninit regressions caused by previous -Wmaybe-uninit change

2016-11-29 Thread Christophe Lyon
On 29 November 2016 at 17:33, Aldy Hernandez  wrote:
> This fixes the gcc.dg/uninit-pred-6* failures I seem to have caused on some
> non x86 platforms. Sorry for the delay.
>
> The problem is that my fix for PR61409 had the logic backwards.  I was
> proving that all the uses of a PHI are invalidated by any one undefined PHI
> path, whereas what we want is to prove that EVERY uninitialized path is
> invalidated by some factor in the PHI use.
>
> The attached patch fixes this without causing any regressions on x86-64
> Linux.  I also verified that at least on [arm-none-linux-gnueabihf
> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16], there are no
> gcc.dg/*uninit* regressions.
>
> There is still one regression at large involving a double free in PR78548
> which I will look at next/independently.
>
Thanks for working on this.
I've submitted a validation with your patch, I'll let you know if I find any
regressions.

Christophe

> OK for trunk?
> Aldy


Re: Ping: Re: [patch, avr] Add flash size to device info and make wrap around default

2016-11-29 Thread Denis Chertykov
2016-11-28 10:17 GMT+03:00 Pitchumani Sivanupandi
:
> On Saturday 26 November 2016 12:11 AM, Denis Chertykov wrote:
>>
>> I'm sorry for the delay.
>>
>> I have a problem with the patch:
>> (Stripping trailing CRs from patch; use --binary to disable.)
>> patching file avr-arch.h
>> (Stripping trailing CRs from patch; use --binary to disable.)
>> patching file avr-devices.c
>> (Stripping trailing CRs from patch; use --binary to disable.)
>> patching file avr-mcus.def
>> Hunk #1 FAILED at 62.
>> 1 out of 1 hunk FAILED -- saving rejects to file avr-mcus.def.rej
>> (Stripping trailing CRs from patch; use --binary to disable.)
>> patching file gen-avr-mmcu-specs.c
>> Hunk #1 succeeded at 215 (offset 5 lines).
>> (Stripping trailing CRs from patch; use --binary to disable.)
>> patching file specs.h
>> Hunk #1 succeeded at 58 (offset 1 line).
>> Hunk #2 succeeded at 66 (offset 1 line).
>
>
> There have been changes in avr-mcus.def since this patch was submitted.
> I have now incorporated the changes and attached the resolved patch.
>
> Regards,
> Pitchumani
>
> gcc/ChangeLog
>
> 2016-11-09  Pitchumani Sivanupandi 
>
> * config/avr/avr-arch.h (avr_mcu_t): Add flash_size member.
> * config/avr/avr-devices.c(avr_mcu_types): Add flash size info.
> * config/avr/avr-mcus.def: Likewise.
> * config/avr/gen-avr-mmcu-specs.c (print_mcu): Remove hard-coded prefix
> check to find wrap-around value, instead use MCU flash size. For 8k
> flash
> devices, update link_pmem_wrap spec string to add --pmem-wrap-around=8k.
> * config/avr/specs.h: Remove link_pmem_wrap from LINK_RELAX_SPEC and
> add to linker specs (LINK_SPEC) directly.

Committed.


[PR middle-end/78566] Fix uninit regressions caused by previous -Wmaybe-uninit change

2016-11-29 Thread Aldy Hernandez
This fixes the gcc.dg/uninit-pred-6* failures I seem to have caused on 
some non x86 platforms. Sorry for the delay.


The problem is that my fix for PR61409 had the logic backwards.  I was 
proving that all the uses of a PHI are invalidated by any one undefined 
PHI path, whereas what we want is to prove that EVERY uninitialized path 
is invalidated by some factor in the PHI use.
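
As a rough illustration of the corrected direction, the sketch below
models predicates with simplified stand-in types (pred, cond_id and the
function names are hypothetical, not the real pred_info machinery in
tree-ssa-uninit.c).  It follows the shape of the new check in the patch:
an empty set of uninitialized-path predicates proves nothing, and every
predicate on every uninitialized path must be contradicted by some
factor of the use guard.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins for the predicate types in tree-ssa-uninit.c.
struct pred { int cond_id; bool negated; };
using pred_chain = std::vector<pred>;              // p1 && p2 && ...
using pred_chain_union = std::vector<pred_chain>;  // c1 || c2 || ...

// Only an exact opposite of the same condition counts as a negation.
bool pred_neg_p (const pred &a, const pred &b)
{
  return a.cond_id == b.cond_id && a.negated != b.negated;
}

// True if some factor of USE_GUARD is the exact opposite of P.
bool one_pred_invalidated_p (const pred &p, const pred_chain &use_guard)
{
  for (const pred &g : use_guard)
    if (pred_neg_p (p, g))
      return true;
  return false;
}

// True if every predicate in UNINIT_PRED is invalidated by USE_GUARD.
bool chain_union_invalidated_p (const pred_chain_union &uninit_pred,
                                const pred_chain &use_guard)
{
  if (uninit_pred.empty ())
    return false;
  for (const pred_chain &c : uninit_pred)
    for (const pred &p : c)
      if (!one_pred_invalidated_p (p, use_guard))
        return false;
  return true;
}
```

With a guard of [cond 1], an uninitialized path guarded by [!cond 1] is
invalidated, but adding a second uninitialized path guarded by an
unrelated condition makes the whole check fail.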


The attached patch fixes this without causing any regressions on x86-64 
Linux.  I also verified that at least on [arm-none-linux-gnueabihf
--with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16], there are no 
gcc.dg/*uninit* regressions.


There is still one regression at large involving a double free in 
PR78548 which I will look at next/independently.


OK for trunk?
Aldy
commit 469f4c38a48bc284c268b40f5d5511f015844ea2
Author: Aldy Hernandez 
Date:   Tue Nov 29 05:59:53 2016 -0500

PR middle-end/78566
* tree-ssa-uninit.c (can_one_predicate_be_invalidated_p): Change
argument type to a pred_chain.
(can_chain_union_be_invalidated_p): Use pred_chain instead of a
worklist.
(flatten_out_predicate_chains): Remove.
(uninit_uses_cannot_happen): Rename from
uninit_ops_invalidate_phi_use.
Change logic so that we are checking that the PHI use will
invalidate _ALL_ possibly uninitialized operands.
(is_use_properly_guarded): Rename call to
uninit_ops_invalidate_phi_use into uninit_uses_cannot_happen.

diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
index 4557403..a648995 100644
--- a/gcc/tree-ssa-uninit.c
+++ b/gcc/tree-ssa-uninit.c
@@ -2155,115 +2155,66 @@ normalize_preds (pred_chain_union preds, gimple 
*use_or_def, bool is_use)
 
 static bool
 can_one_predicate_be_invalidated_p (pred_info predicate,
-   vec worklist)
+   pred_chain use_guard)
 {
-  for (size_t i = 0; i < worklist.length (); ++i)
+  for (size_t i = 0; i < use_guard.length (); ++i)
 {
-  pred_info *p = worklist[i];
-
   /* NOTE: This is a very simple check, and only understands an
 exact opposite.  So, [i == 0] is currently only invalidated
 by [.NOT. i == 0] or [i != 0].  Ideally we should also
 invalidate with say [i > 5] or [i == 8].  There is certainly
 room for improvement here.  */
-  if (pred_neg_p (predicate, *p))
+  if (pred_neg_p (predicate, use_guard[i]))
return true;
 }
   return false;
 }
 
-/* Return TRUE if all USE_PREDS can be invalidated by some predicate
-   in WORKLIST.  */
+/* Return TRUE if all predicates in UNINIT_PRED are invalidated by
+   USE_GUARD being true.  */
 
 static bool
-can_chain_union_be_invalidated_p (pred_chain_union use_preds,
- vec worklist)
+can_chain_union_be_invalidated_p (pred_chain_union uninit_pred,
+ pred_chain use_guard)
 {
-  /* Remember:
-   PRED_CHAIN_UNION = PRED_CHAIN1 || PRED_CHAIN2 || PRED_CHAIN3
-   PRED_CHAIN = PRED_INFO1 && PRED_INFO2 && PRED_INFO3, etc.
-
-   We need to invalidate the entire PRED_CHAIN_UNION, which means,
-   invalidating every PRED_CHAIN in this union.  But to invalidate
-   an individual PRED_CHAIN, all we need to invalidate is _any_ one
-   PRED_INFO, by boolean algebra !PRED_INFO1 || !PRED_INFO2...  */
-  for (size_t i = 0; i < use_preds.length (); ++i)
+  if (uninit_pred.is_empty ())
+return false;
+  for (size_t i = 0; i < uninit_pred.length (); ++i)
 {
-  pred_chain c = use_preds[i];
-  bool entire_pred_chain_invalidated = false;
+  pred_chain c = uninit_pred[i];
   for (size_t j = 0; j < c.length (); ++j)
-   if (can_one_predicate_be_invalidated_p (c[j], worklist))
- {
-   entire_pred_chain_invalidated = true;
-   break;
- }
-  if (!entire_pred_chain_invalidated)
-   return false;
+   if (!can_one_predicate_be_invalidated_p (c[j], use_guard))
+ return false;
 }
   return true;
 }
 
-/* Flatten out all the factors in all the pred_chain_union's in PREDS
-   into a WORKLIST of individual PRED_INFO's.
+/* Return TRUE if none of the uninitialized operands in UNINT_OPNDS
+   can actually happen if we arrived at a use for PHI.
 
-   N is the number of pred_chain_union's in PREDS.
+   PHI_USE_GUARDS are the guard conditions for the use of the PHI.  */
 
-   Since we are interested in the inverse of the PRED_CHAIN's, by
-   boolean algebra, an inverse turns those PRED_CHAINS into unions,
-   which means we can flatten all the factors out for easy access.  */
-
-static void
-flatten_out_predicate_chains (pred_chain_union preds[], size_t n,
- vec *worklist)
+static bool
+uninit_uses_cannot_happen (gphi *phi, unsigned uninit_opnds,
+  pred_chain_union phi_use_guards)
 {
-  for (size_t i = 0; 

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-29 Thread Christophe Lyon
On 18 November 2016 at 16:54, Christophe Lyon
 wrote:
> On 18 November 2016 at 16:46, Yuri Rumyantsev  wrote:
>> It is very strange that this test failed on arm, since it requires
>> target avx2 to check vectorizer dumps:
>>
>> /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" {
>> target avx2_runtime } } } */
>> /* { dg-final { scan-tree-dump-times "LOOP EPILOGUE VECTORIZED
>> \\(VS=16\\)" 2 "vect" { target avx2_runtime } } } */
>>
>> Could you please clarify what is the reason of the failure?
>
> It's not the scan-dumps that fail, but the execution.
> The test calls abort() for some reason.
>
> It will take me a while to rebuild the test manually in the right
> debug environment to provide you with more traces.
>
>
Sorry for the delay... This problem is not directly related to your patch.

The tests in gcc.dg/vect are compiled with -mfpu=neon
-mfloat-abi=softfp -march=armv7-a
and thus cannot be executed on older versions of the architecture.

This is another instance of what I discussed with Jakub several months ago:
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00666.html
but the thread died.

Basically, check_vect_support_and_set_flags sets
dg-do-what-default to compile, but some tests in gcc.dg/vect have
dg-do run hardcoded.

Jakub was not happy with my patch that was removing all these dg-do
run directives :-)

Christophe


>
>>
>> Thanks.
>>
>> 2016-11-18 16:20 GMT+03:00 Christophe Lyon :
>>> On 15 November 2016 at 15:41, Yuri Rumyantsev  wrote:
 Hi All,

 Here is the patch for non-masked epilogue vectorization.

 Bootstrap and regression testing did not show any new failures.

 Is it OK for trunk?

 Thanks.
 Changelog:

 2016-11-15  Yuri Rumyantsev  

 * params.def (PARAM_VECT_EPILOGUES_NOMASK): New.
 * tree-if-conv.c (tree_if_conversion): Make public.
 * tree-if-conv.h: New file.
 * tree-vect-data-refs.c (vect_analyze_data_ref_dependences): Avoid
 dynamic alias checks for epilogues.
 * tree-vect-loop-manip.c (vect_do_peeling): Return created epilog.
 * tree-vect-loop.c: include tree-if-conv.h.
 (new_loop_vec_info): Add zeroing orig_loop_info field.
 (vect_analyze_loop_2): Don't try to enhance alignment for epilogues.
 (vect_analyze_loop): Add argument ORIG_LOOP_INFO which is not NULL
 if epilogue is vectorized, set up orig_loop_info field of loop_vinfo
 using passed argument.
 (vect_transform_loop): Check if created epilogue should be returned
 for further vectorization with less vf.  If-convert epilogue if
 required. Print vectorization success for epilogue.
 * tree-vectorizer.c (vectorize_loops): Add epilogue vectorization
 if it is required, pass loop_vinfo produced during vectorization of
 loop body to vect_analyze_loop.
 * tree-vectorizer.h (struct _loop_vec_info): Add new field
 orig_loop_info.
 (LOOP_VINFO_ORIG_LOOP_INFO): New.
 (LOOP_VINFO_EPILOGUE_P): New.
 (LOOP_VINFO_ORIG_VECT_FACTOR): New.
 (vect_do_peeling): Change prototype to return epilogue.
 (vect_analyze_loop): Add argument of loop_vec_info type.
 (vect_transform_loop): Return created loop.

 gcc/testsuite/

 * lib/target-supports.exp (check_avx2_hw_available): New.
 (check_effective_target_avx2_runtime): New.
 * gcc.dg/vect/vect-tail-nomask-1.c: New test.

>>>
>>> Hi,
>>>
>>> This new test fails on arm-none-eabi (using default cpu/fpu/mode):
>>>   gcc.dg/vect/vect-tail-nomask-1.c -flto -ffat-lto-objects execution test
>>>   gcc.dg/vect/vect-tail-nomask-1.c execution test
>>>
>>> It does pass on the same target if configured --with-cpu=cortex-a9.
>>>
>>> Christophe
>>>
>>>
>>>

 2016-11-14 20:04 GMT+03:00 Richard Biener :
> On November 14, 2016 4:39:40 PM GMT+01:00, Yuri Rumyantsev 
>  wrote:
>>Richard,
>>
>>I checked one of the tests designed for epilogue vectorization using
>>patches 1 - 3 and found out that the built compiler performs vectorization
>>of epilogues with --param vect-epilogues-nomask=1 passed:
>>
>>$ gcc -Ofast -mavx2 t1.c -S --param vect-epilogues-nomask=1 -o
>>t1.new-nomask.s -fdump-tree-vect-details
>>$ grep VECTORIZED -c t1.c.156t.vect
>>4
>> Without param only 2 loops are vectorized.
>>
>>Should I simply add a part of tests related to this feature or I must
>>delete all not necessary changes also?
>
> Please remove all not necessary changes.
>
> Richard.
>
>>Thanks.
>>Yuri.
>>
>>2016-11-14 16:40 GMT+03:00 Richard Biener :
>>> On Mon, 14 Nov 2016, Yuri Rumyantsev wrote:
>>>
 Richard,

 In my previous patch I forgot to remove couple lines related to aux
>>field.
 Here is the correct updated patch.

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-29 Thread Jeff Law

On 11/29/2016 03:16 AM, Richard Biener wrote:

On Mon, Nov 28, 2016 at 7:41 PM, Jeff Law  wrote:

On 11/28/2016 06:10 AM, Paolo Bonzini wrote:




On 27/11/2016 00:28, Marc Glisse wrote:


On Sat, 26 Nov 2016, Paolo Bonzini wrote:


--- match.pd(revision 242742)
+++ match.pd(working copy)
@@ -2554,6 +2554,19 @@
  (cmp (bit_and@2 @0 integer_pow2p@1) @1)
  (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))

+/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
+   convert this into a shift of (A & C).  */
+(simplify
+ (cond
+  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
+  integer_pow2p@3 integer_zerop)
+ (with {
+int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
+  }
+  (if (shift > 0)
+   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
+   (convert (rshift @2 { build_int_cst (integer_type_node, -shift);
})



What happens if @1 is the sign bit, in a signed type? Do we get an
arithmetic shift right?



It shouldn't happen because the canonical form of a sign bit test is A <
0 (that's the pattern immediately after).  However I can add an "if" if
preferred, or change the pattern to do the AND after the shift.


But are we absolutely sure it'll be in canonical form every time?


No, of course not (though it would be a bug).  If the pattern generates wrong
code when the non-canonical form is met that would be bad, if it merely
does not optimize (or optimize non-optimally) then that's not too bad.

Agreed.  I managed to convince myself that for a signed type with the
sign bit set we'd generate incorrect code.  But that was from a
quick review of the pattern.
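
For reference, a small sketch of the transformation under discussion,
written for an unsigned 32-bit type so that the right shift is always
logical.  The function names are made up for illustration and this is
not the code the pattern generates.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (a & c) != 0 ? d : 0, with c and d powers of two.
uint32_t select_bit (uint32_t a, uint32_t c, uint32_t d)
{
  return (a & c) != 0 ? d : 0;
}

// Base-2 logarithm of a power of two.
int log2_of_pow2 (uint32_t x)
{
  int n = 0;
  while (x > 1)
    {
      x >>= 1;
      ++n;
    }
  return n;
}

// Folded form: move the single tested bit to the position of d.
uint32_t shift_bit (uint32_t a, uint32_t c, uint32_t d)
{
  int shift = log2_of_pow2 (d) - log2_of_pow2 (c);
  uint32_t masked = a & c;
  return shift > 0 ? masked << shift : masked >> -shift;
}
```

In a signed type, right-shifting a masked value whose sign bit is set
would typically be an arithmetic shift and smear the sign bit, which is
the concern raised above for a non-canonical sign-bit test.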


Jeff


Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c

2016-11-29 Thread Jeff Law

On 11/29/2016 03:23 AM, Richard Biener wrote:

On Mon, Nov 28, 2016 at 10:23 PM, Jeff Law  wrote:



I was digging into issues around the patches for 78120 when I stumbled upon
undesirable bb copying in bb-reorder.c on the m68k.

The core issue is that the m68k does not define a length attribute and
therefore generic code assumes that the length of all insns is 0 bytes.


What other targets behave like this?

ft32, nvptx, mmix, mn10300, m68k, c6x, rl78, vax, ia64, m32c

cris has a hack to define a length, even though no attempt is made to 
make it accurate.  The hack specifically calls out that it's to make 
bb-reorder happy.





That in turn makes bb-reorder think it is infinitely cheap to copy basic
blocks.  In the two codebases I looked at (GCC's runtime libraries and
newlib) this leads to a 10% and 15% undesirable increase in code size.
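
A toy model of why zero lengths make copying look free.  The function
name, its parameters and the numbers in the usage note are hypothetical,
and this is not the actual bb-reorder.c code: it only shows a size
check that sums per-insn lengths and compares the total to a budget.

```cpp
#include <cassert>
#include <vector>

// Each entry is the length in bytes reported for one insn of a block.
// The block is considered cheap to copy if the total fits the budget.
bool cheap_to_copy_p (const std::vector<int> &insn_lengths, int max_size)
{
  int size = 0;
  for (int len : insn_lengths)
    size += len;
  return size <= max_size;
}
```

With 50 insns of 4 bytes each and a budget of 8 bytes the block is
rejected, but if every insn reports length 0 the same block always
passes, whatever the budget.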

I've taken a slight variant of this patch and bootstrapped/regression tested
it on x86_64-linux-gnu to verify sanity as well as built the m68k target
libraries noted above.

OK for the trunk?


I wonder if it isn't better to default to a length of 1 instead of zero when
there is no length attribute.  There are more users of the length attribute
in bb-reorder.c (and elsewhere as well I suppose).

I pondered that as well, but felt it was riskier given we've had a
default length of 0 for ports that don't define lengths since the early
90s.  It's certainly easy enough to change that default if you'd prefer.
I don't have a strong preference either way.


Jeff


Re: [PATCH v2] Fix PR78588 - rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 64-bit type

2016-11-29 Thread Markus Trippelsdorf
Here is v2 of the fix.

Building gcc with -fsanitize=undefined shows:
 rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 
64-bit type 'long unsigned int'

This happens because if_then_else_cond() in combine.c calls
num_sign_bit_copies() in rtlanal.c with mode==BLKmode.

5205   bitwidth = GET_MODE_PRECISION (mode);
5206   if (bitwidth > HOST_BITS_PER_WIDE_INT)
5207 return 1;
5208
5209   nonzero = nonzero_bits (x, mode);
5210   return nonzero & (HOST_WIDE_INT_1U << (bitwidth - 1))
5211  ? 1 : bitwidth - floor_log2 (nonzero) - 1;

This causes (bitwidth - 1) to wrap around.

Fix by also guarding against BLKmode.
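
A standalone sketch of the wraparound, using made-up helper names
rather than the rtlanal.c code: with an unsigned precision of zero,
subtracting one yields the largest unsigned value, which is then far
too large to use as a shift count.  The guarded variant simply refuses
a zero (or oversized) precision before shifting.

```cpp
#include <cassert>
#include <climits>

// Shift count computed from an unsigned precision; a precision of 0
// models VOIDmode/BLKmode.  Wraps to UINT_MAX when bitwidth is 0.
unsigned shift_amount (unsigned bitwidth)
{
  return bitwidth - 1;
}

// Guarded variant: only build the sign-bit mask when the precision is
// non-zero and fits in the host integer type.
bool sign_bit_mask (unsigned bitwidth, unsigned long long *mask)
{
  if (bitwidth == 0 || bitwidth > sizeof (unsigned long long) * CHAR_BIT)
    return false;
  *mask = 1ULL << (bitwidth - 1);
  return true;
}
```

The actual fix takes the other route and avoids the call altogether
when the mode is BLKmode, as the patch below shows.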

Tested on ppc64le.
OK for trunk?

Thanks.

PR rtl-optimization/78588 
* combine.c (if_then_else_cond): Also guard against BLKmode.

diff --git a/gcc/combine.c b/gcc/combine.c
index 22fb7a976538..a32a0ecc72fb 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -9176,7 +9176,7 @@ if_then_else_cond (rtx x, rtx *ptrue, rtx *pfalse)
   /* If X is known to be either 0 or -1, those are the true and
  false values when testing X.  */
   else if (x == constm1_rtx || x == const0_rtx
-  || (mode != VOIDmode
+  || (mode != VOIDmode && mode != BLKmode
   && num_sign_bit_copies (x, mode) == GET_MODE_PRECISION (mode)))
 {
   *ptrue = constm1_rtx, *pfalse = const0_rtx;

--
Markus


Re: [PATCH 4/4] S/390: Disable peeling for alignment.

2016-11-29 Thread Andreas Krebbel
And again with the costs for unaligned loads/stores actually changed:

gcc/ChangeLog:

2016-11-29  Andreas Krebbel  

* gcc/config/s390/s390.c (s390_builtin_vectorization_cost): New
function.
(TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Define target
macro.

gcc/testsuite/ChangeLog:

2016-11-29  Andreas Krebbel  

* gcc.target/s390/vector/vec-nopeel-1.c: New test.
---
 gcc/config/s390/s390.c | 37 ++
 .../gcc.target/s390/vector/vec-nopeel-1.c  | 17 ++
 2 files changed, 54 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index dab4f43..767666e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -3674,6 +3674,40 @@ s390_address_cost (rtx addr, machine_mode mode 
ATTRIBUTE_UNUSED,
   return ad.indx? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (1);
 }
 
+/* Implement targetm.vectorize.builtin_vectorization_cost.  */
+static int
+s390_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
+tree vectype,
+int misalign ATTRIBUTE_UNUSED)
+{
+  switch (type_of_cost)
+{
+  case scalar_stmt:
+  case scalar_load:
+  case scalar_store:
+  case vector_stmt:
+  case vector_load:
+  case vector_store:
+  case vec_to_scalar:
+  case scalar_to_vec:
+  case cond_branch_not_taken:
+  case vec_perm:
+  case vec_promote_demote:
+  case unaligned_load:
+  case unaligned_store:
+   return 1;
+
+  case cond_branch_taken:
+   return 3;
+
+  case vec_construct:
+   return TYPE_VECTOR_SUBPARTS (vectype) - 1;
+
+  default:
+   gcc_unreachable ();
+}
+}
+
 /* If OP is a SYMBOL_REF of a thread-local symbol, return its TLS mode,
otherwise return 0.  */
 
@@ -15428,6 +15462,9 @@ s390_excess_precision (enum excess_precision_type type)
 #define TARGET_REGISTER_MOVE_COST s390_register_move_cost
 #undef TARGET_MEMORY_MOVE_COST
 #define TARGET_MEMORY_MOVE_COST s390_memory_move_cost
+#undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
+#define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
+  s390_builtin_vectorization_cost
 
 #undef TARGET_MACHINE_DEPENDENT_REORG
 #define TARGET_MACHINE_DEPENDENT_REORG s390_reorg
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c 
b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c
new file mode 100644
index 000..581c371
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+/* { dg-require-effective-target vector } */
+
+int
+foo (int * restrict a, int n)
+{
+  int i, result = 0;
+
+  for (i = 0; i < n * 4; i++)
+result += a[i];
+  return result;
+}
+
+/* We do NOT want this loop to get peeled.  Without peeling no scalar
+   memory add should appear.  */
+/* { dg-final { scan-assembler-not "\ta\t" } } */
-- 
2.9.1



[libiberty] demangler formatting

2016-11-29 Thread Nathan Sidwell
In working on pr78252 I noticed a source formatting nit.  Fixed thusly 
and committed.


nathan
--
Nathan Sidwell
2016-11-29  Nathan Sidwell  

	* cp-demangle.c (d_print_comp_inner): Fix parameter indentation.

Index: cp-demangle.c
===
--- cp-demangle.c	(revision 242959)
+++ cp-demangle.c	(working copy)
@@ -4564,7 +4564,7 @@ d_maybe_print_fold_expression (struct d_
 
 static void
 d_print_comp_inner (struct d_print_info *dpi, int options,
-		  const struct demangle_component *dc)
+		const struct demangle_component *dc)
 {
   /* Magic variable to let reference smashing skip over the next modifier
  without needing to modify *dc.  */


Re: [PATCH] Fix PR78588 - rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 64-bit type

2016-11-29 Thread Markus Trippelsdorf
On 2016.11.29 at 16:01 +0100, Markus Trippelsdorf wrote:
> On 2016.11.29 at 15:21 +0100, Markus Trippelsdorf wrote:
> > On 2016.11.29 at 15:14 +0100, Jakub Jelinek wrote:
> > > On Tue, Nov 29, 2016 at 03:08:15PM +0100, Markus Trippelsdorf wrote:
> > > > Building gcc with -fsanitize=undefined shows:
> > > >  rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too
> > > >  large for 64-bit type 'long unsigned int'
> > > >
> > > > 5210   return nonzero & (HOST_WIDE_INT_1U << (bitwidth - 1))
> > > > 5211  ? 1 : bitwidth - floor_log2 (nonzero) - 1;
> > > >
> > > > Here (bitwidth - 1) wraps around because bitwidth is zero and unsigned.
> > >
> > > Which modes have precision of 0?  I'd expect just VOIDmode and BLKmode, 
> > > any
> > > others?  And for those I'd say it is a bug to call num_sign_bit_copies*.
> > 
> > Yes, only VOIDmode and BLKmode:
> > 
> >  233 const unsigned short mode_precision[NUM_MACHINE_MODES] =
> >  234 {
> >  235   0,   /* VOID */
> >  236   0,   /* BLK */
> 
> markus@x4 libsupc++ % cat cp-demangle.i
> d_demangle_callback_mangled() {
>   if (strncmp(d_demangle_callback_mangled, "", 1))
> d_type();
> }
> 
> markus@x4 libsupc++ % UBSAN_OPTIONS=print_stacktrace=1:halt_on_error=1 
> /var/tmp/gcc_build_dir_/./gcc/cc1 -w -fpreprocessed cp-demangle.i -quiet 
> -dumpbase cp-demangle.i -mtune=generic -march=x86-64 -auxbase cp-demangle -O2 
> -version -o /dev/null
> GNU C11 (GCC) version 7.0.0 20161129 (experimental) (x86_64-pc-linux-gnu)
> compiled by GNU C version 7.0.0 20161129 (experimental), GMP version 
> 6.1.1, MPFR version 3.1.5, MPC version 1.0.3, isl version none
> GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
> GNU C11 (GCC) version 7.0.0 20161129 (experimental) (x86_64-pc-linux-gnu)
> compiled by GNU C version 7.0.0 20161129 (experimental), GMP version 
> 6.1.1, MPFR version 3.1.5, MPC version 1.0.3, isl version none
> GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
> Compiler executable checksum: 7cca725773f8a0693a2905f8af7b733c
> ../../gcc/gcc/rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is 
> too large for 64-bit type 'long unsigned int'
> #0 0x1b40fe1 in num_sign_bit_copies1 ../../gcc/gcc/rtlanal.c:5210
> #1 0x35ef5f1 in if_then_else_cond ../../gcc/gcc/combine.c:9180
> #2 0x35ef199 in if_then_else_cond ../../gcc/gcc/combine.c:9034
> #3 0x35ef199 in if_then_else_cond ../../gcc/gcc/combine.c:9034
> #4 0x3625f98 in combine_simplify_rtx ../../gcc/gcc/combine.c:5604
> #5 0x3632525 in subst ../../gcc/gcc/combine.c:5487
> #6 0x36327d6 in subst ../../gcc/gcc/combine.c:5425
> #7 0x3632bd7 in subst ../../gcc/gcc/combine.c:5354
> #8 0x3641a74 in try_combine ../../gcc/gcc/combine.c:3347
> #9 0x365727b in combine_instructions ../../gcc/gcc/combine.c:1421
> #10 0x365727b in rest_of_handle_combine ../../gcc/gcc/combine.c:14581
> #11 0x365727b in execute ../../gcc/gcc/combine.c:14626
> #12 0x195ad18 in execute_one_pass(opt_pass*) ../../gcc/gcc/passes.c:2370
> #13 0x195cbab in execute_pass_list_1 ../../gcc/gcc/passes.c:2459
> #14 0x195cbd4 in execute_pass_list_1 ../../gcc/gcc/passes.c:2460
> #15 0x195cc64 in execute_pass_list(function*, opt_pass*) 
> ../../gcc/gcc/passes.c:2470
> #16 0xc75deb in cgraph_node::expand() ../../gcc/gcc/cgraphunit.c:2001
> #17 0xc7b2fa in expand_all_functions ../../gcc/gcc/cgraphunit.c:2137
> #18 0xc7b2fa in symbol_table::compile() ../../gcc/gcc/cgraphunit.c:2494
> #19 0xc854b7 in symbol_table::compile() ../../gcc/gcc/cgraphunit.c:2587
> #20 0xc854b7 in symbol_table::finalize_compilation_unit() 
> ../../gcc/gcc/cgraphunit.c:2584
> #21 0x1d3ea10 in compile_file ../../gcc/gcc/toplev.c:488
> #22 0x629a14 in do_compile ../../gcc/gcc/toplev.c:1983
> #23 0x629a14 in toplev::main(int, char**) ../../gcc/gcc/toplev.c:2117
> #24 0x62c046 in main ../../gcc/gcc/main.c:39
> #25 0x7f4b6600f310 in __libc_start_main ../csu/libc-start.c:286
> #26 0x62c469 in _start (/var/tmp/gcc_build_dir_/gcc/cc1+0x62c469)

(gdb) p mode
$1 = BLKmode

#6  0x035ef5f2 in if_then_else_cond (x=0x760d3888,
ptrue=ptrue@entry=0x7fffd940, pfalse=pfalse@entry=0x7fffd950) at 
../../gcc/gcc/combine.c:9180
9180   && num_sign_bit_copies (x, mode) == GET_MODE_PRECISION 
(mode)))
(gdb) l
9175
9176  /* If X is known to be either 0 or -1, those are the true and
9177 false values when testing X.  */
9178  else if (x == constm1_rtx || x == const0_rtx
9179   || (mode != VOIDmode
9180   && num_sign_bit_copies (x, mode) == GET_MODE_PRECISION 
(mode)))
9181{
9182  *ptrue = constm1_rtx, *pfalse = const0_rtx;
9183  return x;
9184}


-- 
Markus


Re: [PATCH 4/4] S/390: Disable peeling for alignment.

2016-11-29 Thread Andreas Krebbel
On Tue, Nov 29, 2016 at 11:38:15AM +0100, Richard Biener wrote:
> So - please instead of setting this param provide
> TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST.

Right, that's way better.

gcc/ChangeLog:

2016-11-29  Andreas Krebbel  

* gcc/config/s390/s390.c (s390_builtin_vectorization_cost): New
function.
(TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Define target
macro.

gcc/testsuite/ChangeLog:

2016-11-29  Andreas Krebbel  

* gcc.target/s390/vector/vec-nopeel-1.c: New test.
---
 gcc/config/s390/s390.c | 38 ++
 .../gcc.target/s390/vector/vec-nopeel-1.c  | 17 ++
 2 files changed, 55 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index dab4f43..82aca3f 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -3674,6 +3674,41 @@ s390_address_cost (rtx addr, machine_mode mode 
ATTRIBUTE_UNUSED,
   return ad.indx? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (1);
 }
 
+/* Implement targetm.vectorize.builtin_vectorization_cost.  */
+static int
+s390_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
+tree vectype,
+int misalign ATTRIBUTE_UNUSED)
+{
+  switch (type_of_cost)
+{
+  case scalar_stmt:
+  case scalar_load:
+  case scalar_store:
+  case vector_stmt:
+  case vector_load:
+  case vector_store:
+  case vec_to_scalar:
+  case scalar_to_vec:
+  case cond_branch_not_taken:
+  case vec_perm:
+  case vec_promote_demote:
+   return 1;
+  case unaligned_load:
+  case unaligned_store:
+   return 2;
+
+  case cond_branch_taken:
+   return 3;
+
+  case vec_construct:
+   return TYPE_VECTOR_SUBPARTS (vectype) - 1;
+
+  default:
+   gcc_unreachable ();
+}
+}
+
 /* If OP is a SYMBOL_REF of a thread-local symbol, return its TLS mode,
otherwise return 0.  */
 
@@ -15428,6 +15463,9 @@ s390_excess_precision (enum excess_precision_type type)
 #define TARGET_REGISTER_MOVE_COST s390_register_move_cost
 #undef TARGET_MEMORY_MOVE_COST
 #define TARGET_MEMORY_MOVE_COST s390_memory_move_cost
+#undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
+#define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
+  s390_builtin_vectorization_cost
 
 #undef TARGET_MACHINE_DEPENDENT_REORG
 #define TARGET_MACHINE_DEPENDENT_REORG s390_reorg
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c
new file mode 100644
index 000..5f370a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+/* { dg-require-effective-target vector } */
+
+int
+foo (int * restrict a, int n)
+{
+  int i, result = 0;
+
+  for (i = 0; i < n * 4; i++)
+result += a[i];
+  return result;
+}
+
+/* We do NOT want this loop to get peeled to reach better alignment.
+   Without peeling no scalar memory add should appear.  */
+/* { dg-final { scan-assembler-not "\ta\t" } } */
-- 
2.9.1



Re: [PATCH] Fix PR78588 - rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 64-bit type

2016-11-29 Thread Markus Trippelsdorf
On 2016.11.29 at 15:21 +0100, Markus Trippelsdorf wrote:
> On 2016.11.29 at 15:14 +0100, Jakub Jelinek wrote:
> > On Tue, Nov 29, 2016 at 03:08:15PM +0100, Markus Trippelsdorf wrote:
> > > Building gcc with -fsanitize=undefined shows:
> > >  rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too
> > >  large for 64-bit type 'long unsigned int'
> > >
> > > 5210   return nonzero & (HOST_WIDE_INT_1U << (bitwidth - 1))
> > > 5211  ? 1 : bitwidth - floor_log2 (nonzero) - 1;
> > >
> > > Here (bitwidth - 1) wraps around because bitwidth is zero and unsigned.
> >
> > Which modes have precision of 0?  I'd expect just VOIDmode and BLKmode, any
> > others?  And for those I'd say it is a bug to call num_sign_bit_copies*.
> 
> Yes, only VOIDmode and BLKmode:
> 
>  233 const unsigned short mode_precision[NUM_MACHINE_MODES] =
>  234 {
>  235   0,   /* VOID */
>  236   0,   /* BLK */

markus@x4 libsupc++ % cat cp-demangle.i
d_demangle_callback_mangled() {
  if (strncmp(d_demangle_callback_mangled, "", 1))
d_type();
}

markus@x4 libsupc++ % UBSAN_OPTIONS=print_stacktrace=1:halt_on_error=1 
/var/tmp/gcc_build_dir_/./gcc/cc1 -w -fpreprocessed cp-demangle.i -quiet 
-dumpbase cp-demangle.i -mtune=generic -march=x86-64 -auxbase cp-demangle -O2 
-version -o /dev/null
GNU C11 (GCC) version 7.0.0 20161129 (experimental) (x86_64-pc-linux-gnu)
compiled by GNU C version 7.0.0 20161129 (experimental), GMP version 
6.1.1, MPFR version 3.1.5, MPC version 1.0.3, isl version none
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C11 (GCC) version 7.0.0 20161129 (experimental) (x86_64-pc-linux-gnu)
compiled by GNU C version 7.0.0 20161129 (experimental), GMP version 
6.1.1, MPFR version 3.1.5, MPC version 1.0.3, isl version none
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 7cca725773f8a0693a2905f8af7b733c
../../gcc/gcc/rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is 
too large for 64-bit type 'long unsigned int'
#0 0x1b40fe1 in num_sign_bit_copies1 ../../gcc/gcc/rtlanal.c:5210
#1 0x35ef5f1 in if_then_else_cond ../../gcc/gcc/combine.c:9180
#2 0x35ef199 in if_then_else_cond ../../gcc/gcc/combine.c:9034
#3 0x35ef199 in if_then_else_cond ../../gcc/gcc/combine.c:9034
#4 0x3625f98 in combine_simplify_rtx ../../gcc/gcc/combine.c:5604
#5 0x3632525 in subst ../../gcc/gcc/combine.c:5487
#6 0x36327d6 in subst ../../gcc/gcc/combine.c:5425
#7 0x3632bd7 in subst ../../gcc/gcc/combine.c:5354
#8 0x3641a74 in try_combine ../../gcc/gcc/combine.c:3347
#9 0x365727b in combine_instructions ../../gcc/gcc/combine.c:1421
#10 0x365727b in rest_of_handle_combine ../../gcc/gcc/combine.c:14581
#11 0x365727b in execute ../../gcc/gcc/combine.c:14626
#12 0x195ad18 in execute_one_pass(opt_pass*) ../../gcc/gcc/passes.c:2370
#13 0x195cbab in execute_pass_list_1 ../../gcc/gcc/passes.c:2459
#14 0x195cbd4 in execute_pass_list_1 ../../gcc/gcc/passes.c:2460
#15 0x195cc64 in execute_pass_list(function*, opt_pass*) 
../../gcc/gcc/passes.c:2470
#16 0xc75deb in cgraph_node::expand() ../../gcc/gcc/cgraphunit.c:2001
#17 0xc7b2fa in expand_all_functions ../../gcc/gcc/cgraphunit.c:2137
#18 0xc7b2fa in symbol_table::compile() ../../gcc/gcc/cgraphunit.c:2494
#19 0xc854b7 in symbol_table::compile() ../../gcc/gcc/cgraphunit.c:2587
#20 0xc854b7 in symbol_table::finalize_compilation_unit() 
../../gcc/gcc/cgraphunit.c:2584
#21 0x1d3ea10 in compile_file ../../gcc/gcc/toplev.c:488
#22 0x629a14 in do_compile ../../gcc/gcc/toplev.c:1983
#23 0x629a14 in toplev::main(int, char**) ../../gcc/gcc/toplev.c:2117
#24 0x62c046 in main ../../gcc/gcc/main.c:39
#25 0x7f4b6600f310 in __libc_start_main ../csu/libc-start.c:286
#26 0x62c469 in _start (/var/tmp/gcc_build_dir_/gcc/cc1+0x62c469)

-- 
Markus


Calling 'abort' on bounds violations in libmpx

2016-11-29 Thread Alexander Ivchenko
Hi,

The attached patch addresses PR67520. Would this approach work for the
problem? Should I also change the version of the library?
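
For readers skimming the diff, the idea can be sketched outside libmpx as
follows. This is only an illustration under assumptions: the names
stop_handler_t, parse_stop_handler and stop_with are hypothetical and do
not appear in the patch, and plain fprintf stands in for the runtime's own
printing routine. An unset or unrecognized value falls back to "abort".

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, not the ones used in libmpx.  */
typedef enum { STOP_HANDLER_ABORT, STOP_HANDLER_EXIT } stop_handler_t;

/* Map the environment string to a handler; NULL means "not set".  */
stop_handler_t
parse_stop_handler (const char *env)
{
  if (env == NULL || strcmp (env, "abort") == 0)
    return STOP_HANDLER_ABORT;
  if (strcmp (env, "exit") == 0)
    return STOP_HANDLER_EXIT;
  /* Unknown value: warn and fall back to the default.  */
  fprintf (stderr, "Illegal value '%s'. Legal values are [abort | exit]\n",
           env);
  return STOP_HANDLER_ABORT;
}

/* Terminate the process the way the selected handler asks for.  */
void
stop_with (stop_handler_t handler)
{
  if (handler == STOP_HANDLER_EXIT)
    exit (255);
  abort ();
}
```

With abort as the default, a bounds violation raises SIGABRT (and can
produce a core dump) instead of exiting quietly with status 255.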

2016-11-29  Alexander Ivchenko  

* mpxrt/mpxrt-utils.c (set_mpx_rt_stop_handler): New function.
(print_help): Add help for CHKP_RT_STOP_HANDLER environment
variable.
(__mpxrt_init_env_vars): Add initialization of stop_handler.
(__mpxrt_stop_handler): New function.
(__mpxrt_stop): Ditto.
* mpxrt/mpxrt-utils.h (mpx_rt_stop_mode_handler_t): New enum.



diff --git a/libmpx/mpxrt/mpxrt-utils.c b/libmpx/mpxrt/mpxrt-utils.c
index 057a355..63ee7c6 100644
--- a/libmpx/mpxrt/mpxrt-utils.c
+++ b/libmpx/mpxrt/mpxrt-utils.c
@@ -60,6 +60,9 @@
 #define MPX_RT_MODE "CHKP_RT_MODE"
 #define MPX_RT_MODE_DEFAULT MPX_RT_COUNT
 #define MPX_RT_MODE_DEFAULT_STR "count"
+#define MPX_RT_STOP_HANDLER "CHKP_RT_STOP_HANDLER"
+#define MPX_RT_STOP_HANDLER_DEFAULT MPX_RT_STOP_HANDLER_ABORT
+#define MPX_RT_STOP_HANDLER_DEFAULT_STR "abort"
 #define MPX_RT_HELP "CHKP_RT_HELP"
 #define MPX_RT_ADDPID "CHKP_RT_ADDPID"
 #define MPX_RT_BNDPRESERVE "CHKP_RT_BNDPRESERVE"
@@ -84,6 +87,7 @@ typedef struct {
 static int summary;
 static int add_pid;
 static mpx_rt_mode_t mode;
+static mpx_rt_stop_mode_handler_t stop_handler;
 static env_var_list_t env_var_list;
 static verbose_type verbose_val;
 static FILE *out;
@@ -226,6 +230,23 @@ set_mpx_rt_mode (const char *env)
   }
 }

+static mpx_rt_stop_mode_handler_t
+set_mpx_rt_stop_handler (const char *env)
+{
+  if (env == 0)
+return MPX_RT_STOP_HANDLER_DEFAULT;
+  else if (strcmp (env, "abort") == 0)
+return MPX_RT_STOP_HANDLER_ABORT;
+  else if (strcmp (env, "exit") == 0)
+return MPX_RT_STOP_HANDLER_EXIT;
+  {
+__mpxrt_print (VERB_ERROR, "Illegal value '%s' for %s. Legal values are "
+   "[abort | exit]\nUsing default value %s\n",
+   env, MPX_RT_STOP_HANDLER, MPX_RT_STOP_HANDLER_DEFAULT_STR);
+return MPX_RT_STOP_HANDLER_DEFAULT;
+  }
+}
+
 static void
 print_help (void)
 {
@@ -244,6 +265,11 @@ print_help (void)
   fprintf (out, "%s \t\t set MPX runtime behavior on #BR exception."
" [stop | count]\n"
"\t\t\t [default: %s]\n", MPX_RT_MODE, MPX_RT_MODE_DEFAULT_STR);
+  fprintf (out, "%s \t set the handler function MPX runtime will call\n"
+   "\t\t\t on #BR exception when %s is set to \'stop\'."
+   " [abort | exit]\n"
+   "\t\t\t [default: %s]\n", MPX_RT_STOP_HANDLER, MPX_RT_MODE,
+   MPX_RT_STOP_HANDLER_DEFAULT_STR);
   fprintf (out, "%s \t\t generate out,err file for each process.\n"
"\t\t\t generated file will be MPX_RT_{OUT,ERR}_FILE.pid\n"
"\t\t\t [default: no]\n", MPX_RT_ADDPID);
@@ -357,6 +383,10 @@ __mpxrt_init_env_vars (int* bndpreserve)
   env_var_list_add (MPX_RT_MODE, env);
   mode = set_mpx_rt_mode (env);

+  env = secure_getenv (MPX_RT_STOP_HANDLER);
+  env_var_list_add (MPX_RT_STOP_HANDLER, env);
+  stop_handler = set_mpx_rt_stop_handler (env);
+
   env = secure_getenv (MPX_RT_BNDPRESERVE);
   env_var_list_add (MPX_RT_BNDPRESERVE, env);
   validate_bndpreserve (env, bndpreserve);
@@ -487,6 +517,22 @@ __mpxrt_mode (void)
   return mode;
 }

+mpx_rt_stop_mode_handler_t
+__mpxrt_stop_handler (void)
+{
+  return stop_handler;
+}
+
+void __attribute__ ((noreturn))
+__mpxrt_stop (void)
+{
+  if (__mpxrt_stop_handler () == MPX_RT_STOP_HANDLER_ABORT)
+abort ();
+  else if (__mpxrt_stop_handler () == MPX_RT_STOP_HANDLER_EXIT)
+exit (255);
+  __builtin_unreachable ();
+}
+
 void
 __mpxrt_print_summary (uint64_t num_brs, uint64_t l1_size)
 {
diff --git a/libmpx/mpxrt/mpxrt-utils.h b/libmpx/mpxrt/mpxrt-utils.h
index d62937d..6da12cc 100644
--- a/libmpx/mpxrt/mpxrt-utils.h
+++ b/libmpx/mpxrt/mpxrt-utils.h
@@ -54,6 +54,11 @@ typedef enum {
   MPX_RT_STOP
 } mpx_rt_mode_t;

+typedef enum {
+  MPX_RT_STOP_HANDLER_ABORT,
+  MPX_RT_STOP_HANDLER_EXIT
+} mpx_rt_stop_mode_handler_t;
+
 void __mpxrt_init_env_vars (int* bndpreserve);
 void __mpxrt_write_uint (verbose_type vt, uint64_t val, unsigned base);
 void __mpxrt_write (verbose_type vt, const char* str);
diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
index b52906b..0bc069c 100644
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -252,7 +251,7 @@ handler (int sig __attribute__ ((unused)),
   uctxt->uc_mcontext.gregs[REG_IP_IDX] =
 (greg_t)get_next_inst_ip ((uint8_t *)ip);
   if (__mpxrt_mode () == MPX_RT_STOP)
-exit (255);
+__mpxrt_stop ();
   return;

  default:
@@ -269,7 +268,7 @@ handler (int sig __attribute__ ((unused)),
   __mpxrt_write (VERB_ERROR, ", ip = 0x");
   __mpxrt_write_uint (VERB_ERROR, ip, 16);
   __mpxrt_write (VERB_ERROR, "\n");
-  exit (255);
+  __mpxrt_stop ();
 }
   else
 {
@@ -278,7 +277,7 @@ handler (int sig __attribute__ ((unused)),
   __mpxrt_write (VERB_ERROR, "! at 0x");
   __mpxrt_write_uint (VERB_ERROR, ip, 16);
   __mpxrt_write (VERB_ERROR, "\n");
-  exit (255);
+  __mpxrt_stop ();
 }
 }

thanks,
Alexander


RE: [PATCH] [ARC] Fix compact casesi option.

2016-11-29 Thread Claudiu Zissulescu
> Approved.
> 

Committed, thank you for your review,
Claudiu


Re: [PATCH] Fix PR78588 - rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 64-bit type

2016-11-29 Thread Markus Trippelsdorf
On 2016.11.29 at 15:14 +0100, Jakub Jelinek wrote:
> On Tue, Nov 29, 2016 at 03:08:15PM +0100, Markus Trippelsdorf wrote:
> > Building gcc with -fsanitize=undefined shows:
> >  rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too
> >  large for 64-bit type 'long unsigned int'
> >
> > 5210   return nonzero & (HOST_WIDE_INT_1U << (bitwidth - 1))
> > 5211  ? 1 : bitwidth - floor_log2 (nonzero) - 1;
> >
> > Here (bitwidth - 1) wraps around because bitwidth is zero and unsigned.
>
> Which modes have precision of 0?  I'd expect just VOIDmode and BLKmode, any
> others?  And for those I'd say it is a bug to call num_sign_bit_copies*.

Yes, only VOIDmode and BLKmode:

 233 const unsigned short mode_precision[NUM_MACHINE_MODES] =
 234 {
 235   0,   /* VOID */
 236   0,   /* BLK */


--
Markus


Re: [Patch, Fortran, OOP] PR 58175: Incorrect warning message on scalar finalizer

2016-11-29 Thread Janus Weil
Committed as r242960.



2016-11-28 14:36 GMT+01:00 Janus Weil :
> Hi all,
>
> the attached patch was posted on bugzilla by Tobias three years ago,
> but left unattended since then. It is simple, works well (fixing a
> bogus warning) and regtests cleanly on x86_64-linux-gnu.
>
> If no one objects, I will commit this to trunk by tomorrow.
>
> Cheers,
> Janus
>
>
>
> 2016-11-28  Tobias Burnus  
>
> PR fortran/58175
> * resolve.c (gfc_resolve_finalizers): Properly detect scalar finalizers.
>
> 2016-11-28  Janus Weil  
>
> PR fortran/58175
> * gfortran.dg/finalize_30.f90: New test case.


Re: [PATCH] Fix PR78588 - rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 64-bit type

2016-11-29 Thread Jakub Jelinek
On Tue, Nov 29, 2016 at 03:08:15PM +0100, Markus Trippelsdorf wrote:
> Building gcc with -fsanitize=undefined shows:
>  rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too
>  large for 64-bit type 'long unsigned int'
> 
> 5210   return nonzero & (HOST_WIDE_INT_1U << (bitwidth - 1))
> 5211  ? 1 : bitwidth - floor_log2 (nonzero) - 1;
> 
> Here (bitwidth - 1) wraps around because bitwidth is zero and unsigned. 

Which modes have precision of 0?  I'd expect just VOIDmode and BLKmode, any
others?  And for those I'd say it is a bug to call num_sign_bit_copies*.

> Tested on ppc64le.
> OK for trunk?
> 
> Thanks.
> 
>   * rtlanal.c (num_sign_bit_copies1): Check for zero bitwidth.
> 
> diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
> index 4e4eb2ef3458..918088a0db8e 100644
> --- a/gcc/rtlanal.c
> +++ b/gcc/rtlanal.c
> @@ -5203,7 +5203,7 @@ num_sign_bit_copies1 (const_rtx x, machine_mode mode, 
> const_rtx known_x,
>   safely compute the mask for this mode, always return BITWIDTH.  */
> 
>bitwidth = GET_MODE_PRECISION (mode);
> -  if (bitwidth > HOST_BITS_PER_WIDE_INT)
> +  if (bitwidth == 0 || bitwidth > HOST_BITS_PER_WIDE_INT)
>  return 1;
> 
>nonzero = nonzero_bits (x, mode);
> 
> --
> Markus

Jakub


[PATCH] Fix PR78588 - rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too large for 64-bit type

2016-11-29 Thread Markus Trippelsdorf
Building gcc with -fsanitize=undefined shows:
 rtlanal.c:5210:38: runtime error: shift exponent 4294967295 is too
 large for 64-bit type 'long unsigned int'

5210   return nonzero & (HOST_WIDE_INT_1U << (bitwidth - 1))
5211  ? 1 : bitwidth - floor_log2 (nonzero) - 1;

Here (bitwidth - 1) wraps around because bitwidth is zero and unsigned. 

Fix by returning earlier if bitwidth is zero.
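
The wraparound is easy to reproduce outside GCC. The following is a
minimal sketch, not the rtlanal.c code: HOST_BITS, floor_log2_sketch,
wrapped_shift_count and sign_bit_copies_sketch are hypothetical stand-ins,
and the nonzero mask is passed in directly instead of being computed by
nonzero_bits.

```c
#include <limits.h>

/* Hypothetical stand-in for HOST_BITS_PER_WIDE_INT.  */
#define HOST_BITS 64

/* Index of the highest set bit, or -1 for zero.  */
int
floor_log2_sketch (unsigned long long x)
{
  int n = -1;
  while (x)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* The shift count the unguarded code computes.  For bitwidth == 0 the
   unsigned subtraction wraps to UINT_MAX, which is 4294967295 when
   unsigned int is 32 bits wide.  */
unsigned int
wrapped_shift_count (unsigned int bitwidth)
{
  return bitwidth - 1;
}

/* Toy version of the tail of the function with the early return.  */
unsigned int
sign_bit_copies_sketch (unsigned int bitwidth, unsigned long long nonzero)
{
  if (bitwidth == 0 || bitwidth > HOST_BITS)
    return 1;
  if (nonzero & (1ULL << (bitwidth - 1)))
    return 1;
  return (unsigned int) ((int) bitwidth - floor_log2_sketch (nonzero) - 1);
}
```

With the guard in place the shift is only evaluated for counts between 0
and HOST_BITS - 1, so it is always well defined.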

Tested on ppc64le.
OK for trunk?

Thanks.

  * rtlanal.c (num_sign_bit_copies1): Check for zero bitwidth.

diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 4e4eb2ef3458..918088a0db8e 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -5203,7 +5203,7 @@ num_sign_bit_copies1 (const_rtx x, machine_mode mode, 
const_rtx known_x,
  safely compute the mask for this mode, always return BITWIDTH.  */

   bitwidth = GET_MODE_PRECISION (mode);
-  if (bitwidth > HOST_BITS_PER_WIDE_INT)
+  if (bitwidth == 0 || bitwidth > HOST_BITS_PER_WIDE_INT)
 return 1;

   nonzero = nonzero_bits (x, mode);

--
Markus


Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-11-29 Thread Andrew Burgess
* Jeff Law  [2016-11-28 15:08:46 -0700]:

> On 11/24/2016 02:40 PM, Andrew Burgess wrote:
> > * Christophe Lyon  [2016-11-21 13:47:09 +0100]:
> > 
> > > On 20 November 2016 at 18:27, Mike Stump  wrote:
> > > > On Nov 19, 2016, at 1:59 PM, Andrew Burgess 
> > > >  wrote:
> > > > > > So, your new test fails on arm* targets:
> > > > > 
> > > > > After a little digging I think the problem might be that
> > > > > -freorder-blocks-and-partition is not supported on arm.
> > > > > 
> > > > > This should be detected as the new tests include:
> > > > > 
> > > > >/* { dg-require-effective-target freorder } */
> > > > > 
> > > > > however this test passed on arm as -freorder-blocks-and-partition does
> > > > > not issue any warning unless -fprofile-use is also passed.
> > > > > 
> > > > > The patch below extends check_effective_target_freorder to check using
> > > > > -fprofile-use.  With this change in place the tests are skipped on
> > > > > arm.
> > > > 
> > > > > All feedback welcome,
> > > > 
> > > > Seems reasonable, unless a -freorder-blocks-and-partition/-fprofile-use 
> > > > person thinks this is the wrong solution.
> > > > 
> > > 
> > > Hi,
> > > 
> > > As promised, I tested this patch: it makes
> > > gcc.dg/tree-prof/section-attr-[123].c
> > > unsupported on arm*, and thus they are not failing anymore :-)
> > > 
> > > However, it also makes other tests unsupported, while they used to pass:
> > > 
> > >   gcc.dg/pr33648.c
> > >   gcc.dg/pr46685.c
> > >   gcc.dg/tree-prof/20041218-1.c
> > >   gcc.dg/tree-prof/bb-reorg.c
> > >   gcc.dg/tree-prof/cold_partition_label.c
> > >   gcc.dg/tree-prof/comp-goto-1.c
> > >   gcc.dg/tree-prof/pr34999.c
> > >   gcc.dg/tree-prof/pr45354.c
> > >   gcc.dg/tree-prof/pr50907.c
> > >   gcc.dg/tree-prof/pr52027.c
> > >   gcc.dg/tree-prof/va-arg-pack-1.c
> > > 
> > > and failures are now unsupported:
> > >   gcc.dg/tree-prof/cold_partition_label.c
> > >   gcc.dg/tree-prof/section-attr-1.c
> > >   gcc.dg/tree-prof/section-attr-2.c
> > >   gcc.dg/tree-prof/section-attr-3.c
> > > 
> > > So, maybe this patch is too strong?
> > 
> > In all of the cases that used to pass the tests are compile only tests
> > (except for cold_partition_label, which I discuss below).
> > 
> > On ARM passing -fprofile-use and -freorder-blocks-and-partition
> > results in a warning, and the -freorder-blocks-and-partition flag is
> > ignored.  However, disabling -freorder-blocks-and-partition doesn't
> > stop any of the tests compiling, hence the passes.
> > 
> > All the tests include:
> > 
> >   /* { dg-require-effective-target freorder } */
> > 
> > which I understand to mean, the tests require the 'freorder' feature
> > to be supported (which corresponds to -freorder-blocks-and-partition).
> > 
> > For cold_partition_label and my new tests it seems clear that the
> > lack of support for -freorder-blocks-and-partition on ARM is the cause
> > of the test failures.
> > 
> > So, is it reasonable to give up the other tests as "unsupported"?  I'd
> > be inclined to say yes, but I'm happy to rework the patch if anyone has
> > a suggestion for an alternative approach.
> It is reasonable.  It's not uncommon to have to drop various tests to
> UNSUPPORTED, particularly things which depend on assembler/linker
> capabilities, the target runtime system, etc.

OK, I'm going to take that as approval for my patch[1].  I'll wait a
couple of days to give people a chance to correct me, then I'll push
the change.  This should resolve the test regressions I introduced for
ARM.

Thanks,
Andrew

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02050.html


[PATCH] Avoid compile-time overhead of GIMPLE FE in CFG construction

2016-11-29 Thread Richard Biener

$subject

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-11-29  Richard Biener  

* tree-cfg.c (lower_phi_internal_fn): Do not look for further
PHIs after a regular stmt.
(stmt_starts_bb_p): PHIs not preceded by a PHI or a label
start a new BB.
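
The second rule can be illustrated with a toy model. This is only a
sketch under assumptions: enum toy_kind and phi_starts_block are
hypothetical and merely mirror the condition added to stmt_starts_bb_p,
with TOY_NONE standing for "no previous statement".

```c
#include <stdbool.h>

/* Hypothetical statement kinds; TOY_NONE means there is no previous
   statement.  */
enum toy_kind { TOY_NONE, TOY_LABEL, TOY_PHI, TOY_OTHER };

/* Return true if a __PHI call whose predecessor has kind PREV should
   start a new basic block: the predecessor exists and is neither a label
   nor another PHI.  */
bool
phi_starts_block (enum toy_kind prev)
{
  return prev != TOY_NONE && prev != TOY_LABEL && prev != TOY_PHI;
}
```

Because every run of PHIs then sits at the start of its block, right
after any labels, the lowering loop can stop at the first statement that
is not a PHI instead of scanning the whole block.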

Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 242953)
+++ gcc/tree-cfg.c  (working copy)
@@ -361,14 +361,11 @@ lower_phi_internal_fn ()
   /* After edge creation, handle __PHI function from GIMPLE FE.  */
   FOR_EACH_BB_FN (bb, cfun)
 {
-  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
{
  stmt = gsi_stmt (gsi);
  if (! gimple_call_internal_p (stmt, IFN_PHI))
-   {
- gsi_next ();
- continue;
-   }
+   break;
 
  lhs = gimple_call_lhs (stmt);
  phi_node = create_phi_node (lhs, bb);
@@ -2604,11 +2601,21 @@ stmt_starts_bb_p (gimple *stmt, gimple *
   else
return true;
 }
-  else if (gimple_code (stmt) == GIMPLE_CALL
-  && gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
-/* setjmp acts similar to a nonlocal GOTO target and thus should
-   start a new block.  */
-return true;
+  else if (gimple_code (stmt) == GIMPLE_CALL)
+{
+  if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
+   /* setjmp acts similar to a nonlocal GOTO target and thus should
+  start a new block.  */
+   return true;
+  if (gimple_call_internal_p (stmt, IFN_PHI)
+ && prev_stmt
+ && gimple_code (prev_stmt) != GIMPLE_LABEL
+ && (gimple_code (prev_stmt) != GIMPLE_CALL
+ || ! gimple_call_internal_p (prev_stmt, IFN_PHI)))
+   /* PHI nodes start a new block unless preceded by a label
+  or another PHI.  */
+   return true;
+}
 
   return false;
 }


