date:20190605

[PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-05 Thread Hongtao Liu

Hi Uros and all:
  This patch enables support for AVX512_VP2INTERSECT, which will be in
Willow Cove. AVX512_VP2INTERSECT has two instructions: VP2INTERSECTD and
VP2INTERSECTQ. For more details, please refer to
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

  Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
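
  For readers unfamiliar with the instruction, here is a rough scalar model
of what VP2INTERSECTD computes on 512-bit operands, as I read the ISA
reference: each of the two result masks gets a bit set for every lane whose
value also occurs somewhere in the other operand. The function name and
signature are hypothetical and are not the intrinsics added by this patch.

```c
#include <stdint.h>

/* Hypothetical scalar model of VP2INTERSECTD for 16 x 32-bit lanes.
   Bit i of *k_a is set when a[i] equals some element of b;
   bit j of *k_b is set when b[j] equals some element of a.  */
void
vp2intersectd_model (const int32_t a[16], const int32_t b[16],
                     uint16_t *k_a, uint16_t *k_b)
{
  uint16_t mask_a = 0;
  uint16_t mask_b = 0;

  for (int i = 0; i < 16; i++)
    for (int j = 0; j < 16; j++)
      if (a[i] == b[j])
        {
          mask_a |= (uint16_t) (1u << i);
          mask_b |= (uint16_t) (1u << j);
        }

  *k_a = mask_a;
  *k_b = mask_b;
}
```

  The two masks are produced together, which is why the patch introduces the
P2QI/P2HI register-pair modes.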

Changelog:

gcc/
+2019-06-06  Hongtao Liu  
+ H.J. Lu  
+ Olga Makhotina  
+
+ * common/config/i386/i386-common.c
+ (OPTION_MASK_ISA_AVX512VP2INTERSECT_SET,
+ OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET): New macros.
+ (OPTION_MASK_ISA2_AVX512F_UNSET): Add
+ OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET.
+ (ix86_handle_option): Handle -mavx512vp2intersect.
+ * config/i386/avx512vp2intersectintrin.h: New.
+ * config/i386/avx512vp2intersectvlintrin.h: New.
+ * config/i386/cpuid.h (bit_AVX512VP2INTERSECT): New.
+ * config/i386/driver-i386.c (host_detect_local_cpu): Detect
+ AVX512VP2INTERSECT.
+ * config/i386/i386-builtin-types.def: Add new types.
+ * config/i386/i386-builtin.def: Add new builtins.
+ * config/i386/i386-builtins.c: (enum processor_features): Add
+ F_AVX512VP2INTERSECT.
+ (static const _isa_names_table isa_names_table): Ditto.
+ * config/i386/i386-c.c (ix86_target_macros_internal): Define
+ __AVX512VP2INTERSECT__.
+ * config/i386/i386-expand.c (ix86_expand_builtin): Expand
+ IX86_BUILTIN_2INTERSECTD512, IX86_BUILTIN_2INTERSECTQ512,
+ IX86_BUILTIN_2INTERSECTD256, IX86_BUILTIN_2INTERSECTQ256,
+ IX86_BUILTIN_2INTERSECTD128, IX86_BUILTIN_2INTERSECTQ128.
+ * config/i386/i386-modes.def (P2QI, P2HI): New modes.
+ * config/i386/i386-options.c (ix86_target_string): Add
+ -mavx512vp2intersect.
+ (ix86_option_override_internal): Handle AVX512VP2INTERSECT.
+ * config/i386/i386.c (ix86_hard_regno_nregs): Allocate two regs for
+ P2HImode and P2QImode.
+ (ix86_hard_regno_mode_ok): Register pair only starts at even hardreg
+ number for P2QImode and P2HImode.
+ * config/i386/i386.h (TARGET_AVX512VP2INTERSECT,
+ TARGET_AVX512VP2INTERSECT_P): New.
+ (PTA_AVX512VP2INTERSECT): Ditto.
+ * config/i386/i386.opt: Add -mavx512vp2intersect.
+ * config/i386/immintrin.h: Include avx512vp2intersectintrin.h and
+ avx512vp2intersectvlintrin.h.
+ * config/i386/sse.md (define_c_enum "unspec"): Add UNSPEC_VP2INTERSECT.
+ (define_mode_iterator VI48_AVX512VP2VL): New.
+ (avx512vp2intersect_2intersect,
+ avx512vp2intersect_2intersectv16si): New define_insn patterns.
+ (*vec_extractp2hi, *vec_extractp2qi): New define_insn_and_split
+ patterns.
+ * config.gcc: Add avx512vp2intersectvlintrin.h and
+ avx512vp2intersectintrin.h to extra_headers.
+ * doc/invoke.texi: Document -mavx512vp2intersect.
+

gcc/testsuite/
+2019-06-06  Hongtao Liu  
+ Olga Makhotina  
+
+ * gcc.target/i386/avx512-check.h: Handle bit_AVX512VP2INTERSECT.
+ * gcc.target/i386/avx512vp2intersect-2intersect-1a.c: New test.
+ * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Likewise.
+ * gcc.target/i386/avx512vp2intersect-2intersectvl-1a.c: Likewise.
+ * gcc.target/i386/avx512vp2intersect-2intersectvl-1b.c: Likewise.
+ * gcc.target/i386/sse-12.c: Add -mavx512vp2intersect.
+ * gcc.target/i386/sse-13.c: Likewise.
+ * gcc.target/i386/sse-14.c: Likewise.
+ * gcc.target/i386/sse-22.c: Likewise.
+ * gcc.target/i386/sse-23.c: Likewise.
+ * g++.dg/other/i386-2.C: Likewise.
+ * g++.dg/other/i386-3.C: Likewise.
+

-- 
BR,
Hongtao
Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 271984)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,53 @@
+2019-06-06  Hongtao Liu  
+	H.J. Lu  
+	Olga Makhotina  
+
+	* common/config/i386/i386-common.c
+	(OPTION_MASK_ISA_AVX512VP2INTERSECT_SET,
+	OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET): New macros.
+	(OPTION_MASK_ISA2_AVX512F_UNSET): Add
+	OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET.
+	(ix86_handle_option): Handle -mavx512vp2intersect.
+	* config/i386/avx512vp2intersectintrin.h: New.
+	* config/i386/avx512vp2intersectvlintrin.h: New.
+	* config/i386/cpuid.h (bit_AVX512VP2INTERSECT): New.
+	* config/i386/driver-i386.c (host_detect_local_cpu): Detect
+	AVX512VP2INTERSECT.
+	* config/i386/i386-builtin-types.def: Add new types.
+	* config/i386/i386-builtin.def: Add new builtins.
+	* config/i386/i386-builtins.c: (enum processor_features): Add
+	F_AVX512VP2INTERSECT.
+	(static const _isa_names_table isa_names_table): Ditto.
+	* config/i386/i386-c.c (ix86_target_macros_internal): Define
+	__AVX512VP2INTERSECT__.
+	* config/i386/i386-expand.c (ix86_expand_builtin): Expand
+	IX86_BUILTIN_2INTERSECTD512, IX86_BUILTIN_2INTERSECTQ512,
+	IX86_BUILTIN_2INTERSECTD256, IX86_BUILTIN_2INTERSECTQ256,
+	IX86_BUILTIN_2INTERSECTD128, IX86_BUILTIN_2INTERSECTQ128.
+	* config/i386/i386-modes.def (P2QI, P2HI): New modes.
+	* config/i386/i386-options.c (ix86_target_string): Add
+	-mavx512vp2intersect.
+	(ix86_option_override_internal): Handle AVX512VP2INTERSECT.
+	*

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-05 Thread Kugan Vivekanandarajah

Hi Richard,

Thanks for the review. Attached is the latest patch.

For testcases like cond_arith_1.c, GCC ICEs in fwprop with the patch
applied. I am limiting fwprop in cases like this. Is there a better fix for this?
index cf2c9de..2c99285 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -1358,6 +1358,15 @@ forward_propagate_and_simplify (df_ref use,
rtx_insn *def_insn, rtx def_set)
   else
 mode = GET_MODE (*loc);

+  /* TODO. We can't get the mode for
+ (set (reg:VNx16BI 109)
+  (unspec:VNx16BI [
+(reg:SI 131)
+(reg:SI 106)
+   ] UNSPEC_WHILE_LO))
+ Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
+  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
+return false;
   new_rtx = propagate_rtx (*loc, mode, reg, src,
  optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));

Thanks,
Kugan

On Mon, 3 Jun 2019 at 19:08, Richard Sandiford
 wrote:
>
> Kugan Vivekanandarajah  writes:
> > diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
> > index b3fae5b..ad838dd 100644
> > --- a/gcc/tree-vect-loop-manip.c
> > +++ b/gcc/tree-vect-loop-manip.c
> > @@ -415,6 +415,7 @@ vect_set_loop_masks_directly (struct loop *loop, 
> > loop_vec_info loop_vinfo,
> > bool might_wrap_p)
> >  {
> >tree compare_type = LOOP_VINFO_MASK_COMPARE_TYPE (loop_vinfo);
> > +  tree iv_type = LOOP_VINFO_MASK_IV_TYPE (loop_vinfo);
> >tree mask_type = rgm->mask_type;
> >unsigned int nscalars_per_iter = rgm->max_nscalars_per_iter;
> >poly_uint64 nscalars_per_mask = TYPE_VECTOR_SUBPARTS (mask_type);
> > @@ -445,11 +446,16 @@ vect_set_loop_masks_directly (struct loop *loop, 
> > loop_vec_info loop_vinfo,
> >tree index_before_incr, index_after_incr;
> >gimple_stmt_iterator incr_gsi;
> >bool insert_after;
> > -  tree zero_index = build_int_cst (compare_type, 0);
> > >standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > > -  create_iv (zero_index, nscalars_step, NULL_TREE, loop, &incr_gsi,
> > +
> > +  tree zero_index = build_int_cst (iv_type, 0);
> > +  tree step = build_int_cst (iv_type,
> > +  LOOP_VINFO_VECT_FACTOR (loop_vinfo));
> > +  /* Creating IV of iv_type.  */
>
> s/Creating/Create/
>
> > > +  create_iv (zero_index, step, NULL_TREE, loop, &incr_gsi,
> > >insert_after, &index_before_incr, &index_after_incr);
> >
> > +  zero_index = build_int_cst (compare_type, 0);
> >tree test_index, test_limit, first_limit;
> >gimple_stmt_iterator *test_gsi;
> >if (might_wrap_p)
> > [...]
> > @@ -1066,11 +1077,17 @@ vect_verify_full_masking (loop_vec_info loop_vinfo)
> > if (this_type
> > && can_produce_all_loop_masks_p (loop_vinfo, this_type))
> >   {
> > -   /* Although we could stop as soon as we find a valid mode,
> > -  it's often better to continue until we hit Pmode, since the
> > +   /* See whether zero-based IV would ever generate all-false masks
> > +  before wrapping around.  */
> > +   bool might_wrap_p = (iv_precision > cmp_bits);
> > +   /* Stop as soon as we find a valid mode.  If we decided to use
> > +  cmp_type which is less than Pmode precision, it is often 
> > better
> > +  to use iv_type corresponding to Pmode, since the
> >operands to the WHILE are more likely to be reusable in
> > -  address calculations.  */
> > -   cmp_type = this_type;
> > +  address calculations in this case.  */
>
> We're not stopping as soon as we find a valid mode though.  Any type
> that satisfies the if condition above is valid, but we pick wider
> cmp_types and iv_types for optimisation reasons.  How about:
>
>   /* Although we could stop as soon as we find a valid mode,
>  there are at least two reasons why that's not always the
>  best choice:
>
>  - An IV that's Pmode or wider is more likely to be reusable
>in address calculations than an IV that's narrower than
>Pmode.
>
>  - Doing the comparison in IV_PRECISION or wider allows
>a natural 0-based IV, whereas using a narrower comparison
>type requires mitigations against wrap-around.
>
>  Conversely, if the IV limit is variable, doing the comparison
>  in a wider type than the original type can introduce
>  unnecessary extensions, so picking the widest valid mode
>  is not always a good choice either.
>
>  Here we prefer the first IV type that's Pmode or wider,
>  and the first comparison type that's IV_PRECISION or wider.
>  (The comparison type must be no wider than the IV type,
>  to avoid extensions in the vector loop.)
>
>  ??? We might want to try continuing beyond Pmode

Re: [PATCH] Enable memory operand for vfpclass[p,s][s,d] patterns.

2019-06-05 Thread Hongtao Liu

On Thu, Jun 6, 2019 at 6:18 AM Jeff Law  wrote:
>
> On 6/5/19 1:39 AM, Hongtao Liu wrote:
> > Hi Jeff and Jakub:
> > >   When adding new intrinsics (PR target/89803), I found that the
> > > vfpclassp[sd] and vfpclasss[sd] patterns didn't support a memory operand,
> > > which the instructions do support. This patch enables memory
> > > operands for the vfpclassp[s,d]/vfpclasss[s,d] patterns.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux (on skylake-avx512),
> > ok for trunk?
> >
> > Changelog
> > gcc/
> > 2019-06-05  Hongtao Liu  
> >
> > * config/i386/sse.md (define_mode_suffix vecmemsuffix): New.
> > (define_insn "avx512dq_fpclass"):
> > Enable memory operand for it.
> > (define_insn "avx512dq_vmfpclass"): Ditto.
> >
> > gcc/testsuite/
> > 2019-06-05  Hongtao Liu  
> >
> > * gcc.target/i386/avx512dq-vfpclasspd-1.c:
> > > Adjust scan assembler for {x,y,z} suffix.
> > * gcc.target/i386/avx512dq-vfpclassps-1.c: Ditto.
> OK, but make sure you fix the ChangeLog formatting as you commit.
>
> jeff

Thanks.

Committed:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=271984.

-- 
BR,
Hongtao

Go patch committed: Use specialized fast map routines

2019-06-05 Thread Ian Lance Taylor

In the Go runtime there are specialized fast map routines for certain
key types.  This Go frontend patch by Cherry Zhang lets the compiler
make use of these functions, instead of always using the generic ones.

As we now generate multiple versions of map delete calls, to make
things easier we delay the expansion of the built-in delete function
to the flatten phase.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 271976)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-2609f9b8420e2341fbbe40d7cf6af42b0fba7293
+bc7374913367fba9b10dc284af87eb539fb6c5b2
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/escape.cc
===
--- gcc/go/gofrontend/escape.cc (revision 271669)
+++ gcc/go/gofrontend/escape.cc (working copy)
@@ -1622,6 +1622,16 @@ Escape_analysis_assign::expression(Expre
 }
 break;
 
+  case Runtime::MAPASSIGN_FAST32PTR:
+  case Runtime::MAPASSIGN_FAST64PTR:
+  case Runtime::MAPASSIGN_FASTSTR:
+{
+  // Map key escapes. The last argument is the key.
+  Node* key_node = Node::make_node(call->args()->back());
+  this->assign(this->context_->sink(), key_node);
+}
+break;
+
   case Runtime::IFACEE2T2:
   case Runtime::IFACEI2T2:
 {
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 271976)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -7843,63 +7843,33 @@ Builtin_call_expression::do_lower(Gogo*,
 
 case BUILTIN_DELETE:
   {
-   // Lower to a runtime function call.
-   const Expression_list* args = this->args();
-   if (args == NULL || args->size() < 2)
- this->report_error(_("not enough arguments"));
-   else if (args->size() > 2)
- this->report_error(_("too many arguments"));
-   else if (args->front()->type()->map_type() == NULL)
- this->report_error(_("argument 1 must be a map"));
-   else
- {
-   // Since this function returns no value it must appear in
-   // a statement by itself, so we don't have to worry about
-   // order of evaluation of values around it.  Evaluate the
-   // map first to get order of evaluation right.
-   Map_type* mt = args->front()->type()->map_type();
-   Temporary_statement* map_temp =
- Statement::make_temporary(mt, args->front(), loc);
-   inserter->insert(map_temp);
-
-   Temporary_statement* key_temp =
- Statement::make_temporary(mt->key_type(), args->back(), loc);
-   inserter->insert(key_temp);
-
-   Expression* e1 = Expression::make_type_descriptor(mt, loc);
-   Expression* e2 = Expression::make_temporary_reference(map_temp,
- loc);
-   Expression* e3 = Expression::make_temporary_reference(key_temp,
- loc);
-
-   // If the call to delete is deferred, and is in a loop,
-   // then the loop will only have a single instance of the
-   // temporary variable.  Passing the address of the
-   // temporary variable here means that the deferred call
-   // will see the last value in the loop, not the current
-   // value.  So for this unusual case copy the value into
-   // the heap.
-   if (!this->is_deferred())
- e3 = Expression::make_unary(OPERATOR_AND, e3, loc);
-   else
- {
-   Expression* a = Expression::make_allocation(mt->key_type(),
-   loc);
-   Temporary_statement* atemp =
- Statement::make_temporary(NULL, a, loc);
-   inserter->insert(atemp);
-
-   a = Expression::make_temporary_reference(atemp, loc);
-   a = Expression::make_dereference(a, NIL_CHECK_NOT_NEEDED, loc);
-   Statement* s = Statement::make_assignment(a, e3, loc);
-   inserter->insert(s);
-
-   e3 = Expression::make_temporary_reference(atemp, loc);
- }
-
-   return Runtime::make_call(Runtime::MAPDELETE, this->location(),
- 3, e1, e2, e3);
- }
+const Expression_list* args = this->args();
+if (args == NULL || args->size() < 2)
+  this->report_error(_("not enough arguments"));
+else if

Re: [PATCH] RX: Add rx-*-linux target

2019-06-05 Thread Jeff Law

On 6/3/19 11:01 AM, Yoshinori Sato wrote:
> On Sun, 02 Jun 2019 22:12:37 +0900,
> Oleg Endo wrote:
>>
>> On Sun, 2019-06-02 at 20:26 +0900, Yoshinori Sato wrote:
>>> On Fri, 31 May 2019 09:16:18 +0900,
>>> Jeff Law wrote:

>>>> On 5/29/19 12:27 PM, Jeff Law wrote:
>>>>> On 5/23/19 6:05 AM, Yoshinori Sato wrote:
>>>>>> I ported linux kernel to Renesas RX.
>>>>>>
>>>>>> rx-*-elf target output a binary different from the standard
>>>>>> ELF.
>>>>>> It has the same format as the Renesas compiler.
>>>>>>
>>>>>> But the linux kernel requires the standard ELF format.
>>>>>> I want to define a rx-*-linux target so that I can generate
>>>>>> a standard ELF binary.
>>>>>
>>>>> Presumably you're resubmitting after your assignment got recorded
>>>>> (I
>>>>> think I saw that fly by recently).
>>>>>
>>>>> I'll construct a ChangeLog and install this on the trunk.
>>>>
>>>> So this is causing libgcc to fail to build for rx-elf.  The problem
>>>> is
>>>> the DF=SF #define.  I think you need to split that out so that it's
>>>> only
>>>> used for rx-linux.
>>>>
>>>> Jeff
>>>
>>> OK. fix it.
>>> I tried build rx-elf target. it success.
>>>
>>
>> Setting DF=SF is the wrong thing to do IMHO.  RX can do DF just fine in
>> software.  If this is hardcoded like that in the roots of the
>> toolchain, it will make compiling packages that actually need real DF
>> completely impossible, won't it?  We also don't set DI = SI just
>> because the hardware is bad at SI ... 
>>
>> Just my 2 cents.
>>
>> Cheers,
>> Oleg
>>
> 
> OK.
> I was misunderstood.
> I think this is not a problem.
[ ... ]
Thanks.  Given we already had your original patch in the source tree, I
extracted just the new bits and committed them.  I'm attaching the
changes for archival purposes.

Jeff

commit aafb499477a6f23d01c4c048c20c8833a801396a
Author: law 
Date:   Wed Jun 5 23:20:27 2019 +

* config.host (rx-*-linux*): Add t-fdpbit to tmake_file
Add appropriate tm_file clause as well.
* config/rx/t-rx (HOST_LIBGCC2_CFLAGS): Remove.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@271978 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index 82c124f8ece..93cdcda3284 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,9 @@
+2019-06-05  Yoshinori Sato  
+
+   * config.host (rx-*-linux*): Add t-fdpbit to tmake_file
+   Add appropriate tm_file clause as well.
+   * config/rx/t-rx (HOST_LIBGCC2_CFLAGS): Remove.
+
 2019-06-05  James Clarke  
 
* config/ia64/crtbegin.S (__dso_handle): Put in .sdata/.sbss
diff --git a/libgcc/config.host b/libgcc/config.host
index ff2f0fa1e58..d75e2b5d7aa 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1212,7 +1212,8 @@ rx-*-elf)
tm_file="$tm_file rx/rx-abi.h rx/rx-lib.h"
;;
 rx-*-linux*)
-   tmake_file="rx/t-rx"
+   tmake_file="rx/t-rx t-fdpbit"
+   tm_file="$tm_file rx/rx-lib.h"
;;
 s390-*-linux*)
tmake_file="${tmake_file} s390/t-crtstuff s390/t-linux 
s390/32/t-floattodi t-stack s390/t-stack-s390"
diff --git a/libgcc/config/rx/t-rx b/libgcc/config/rx/t-rx
index 34cdc97cc03..ace3b16f39e 100644
--- a/libgcc/config/rx/t-rx
+++ b/libgcc/config/rx/t-rx
@@ -23,7 +23,6 @@
 # the normal libgcc sources:
 
 LIB2ADD = $(srcdir)/config/rx/rx-abi-functions.c
-HOST_LIBGCC2_CFLAGS += -DDF=SF
 
 
 # We need special handling of the floating point conversion

Re: [PATCH] RX: Add rx-*-linux target

2019-06-05 Thread Jeff Law

On 6/2/19 5:26 AM, Yoshinori Sato wrote:
> On Fri, 31 May 2019 09:16:18 +0900,
> Jeff Law wrote:
>>
>> On 5/29/19 12:27 PM, Jeff Law wrote:
>>> On 5/23/19 6:05 AM, Yoshinori Sato wrote:
>>>> I ported linux kernel to Renesas RX.
>>>>
>>>> rx-*-elf target output a binary different from the standard ELF.
>>>> It has the same format as the Renesas compiler.
>>>>
>>>> But the linux kernel requires the standard ELF format.
>>>> I want to define a rx-*-linux target so that I can generate
>>>> a standard ELF binary.
>>> Presumably you're resubmitting after your assignment got recorded (I
>>> think I saw that fly by recently).
>>>
>>> I'll construct a ChangeLog and install this on the trunk.
>> So this is causing libgcc to fail to build for rx-elf.  The problem is
>> the DF=SF #define.  I think you need to split that out so that it's only
>> used for rx-linux.
>>
>> Jeff
> 
> OK. fix it.
> I tried build rx-elf target. it success.
What parts did you build?  For me it fails building libgcc:


> /home/tmp/rx/./gcc/xgcc -B/home/tmp/rx/./gcc/ -B/tmp/rx/rx-elf/bin/ 
> -B/tmp/rx/rx-elf/lib/ -isystem /tmp/rx/rx-elf/include -isystem 
> /tmp/rx/rx-elf/sys-include-g -O2 -m64bit-doubles -O2  -g -O2 -DIN_GCC  
> -DCROSS_DIRECTORY_STRUCTURE  -W -Wall -Wno-narrowing -Wwrite-strings 
> -Wcast-qual -Wstrict-prototypes -Wold-style-definition  -isystem ./include   
> -DDF=SF -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -Dinhibit_libc 
>  -DDF=SF -I. -I. -I../../.././gcc -I/home/gcc/GIT-3/gcc/libgcc 
> -I/home/gcc/GIT-3/gcc/libgcc/. -I/home/gcc/GIT-3/gcc/libgcc/../gcc 
> -I/home/gcc/GIT-3/gcc/libgcc/../include  -DHAVE_CC_TLS -DUSE_EMUTLS -o 
> _muldc3.o -MT _muldc3.o -MD -MP -MF _muldc3.dep -DL_muldc3 -c 
> /home/gcc/GIT-3/gcc/libgcc/libgcc2.c -fvisibility=hidden -DHIDE_EXPORTS
> In file included from /home/gcc/GIT-3/gcc/libgcc/libgcc2.c:56:
> /home/gcc/GIT-3/gcc/libgcc/libgcc2.h:256:16: warning: conflicting types for 
> built-in function ‘__divdc3’; expected ‘_Complex double(double,  double,  
> double,  double)’ [-Wbuiltin-declaration-mismatch]
>   256 | #define __N(a) __ ## a
>   |^~
> /home/gcc/GIT-3/gcc/libgcc/libgcc2.h:366:19: note: in expansion of macro ‘__N’
>   366 | #define __divdc3  __N(divdc3)
>   |   ^~~
> /home/gcc/GIT-3/gcc/libgcc/libgcc2.h:461:15: note: in expansion of macro 
> ‘__divdc3’
>   461 | extern DCtype __divdc3 (DFtype, DFtype, DFtype, DFtype);
>   |   ^~~~


[ ... and so on ... ]

Jeff

[PATCH] handle vla plus offset in strlen (PR 90662)

2019-06-05 Thread Martin Sebor


One of my new tests for the strlen/sprintf integration tripped
over an incomplete handling of VLAs by the strlen pass.  Where
it can determine the length of a substring at some offset with
other kinds of arrays, the pass fails with VLAs because they
are represented as pointers to arrays.

The attached patch adds the missing handling so that code like
the following can be fully folded even for VLAs.

  int f (int n)
  {
char a[n];
strcpy (a, "12345");
if (strlen (&a[2]) != 3)
  abort ();
  }

Tested on x86_64-linux.
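
For anyone who wants to try this outside the testsuite, a self-contained
variant of the example above might look like the sketch below.  The function
name is made up for illustration, and the caller is assumed to pass n >= 6
so that the copy fits in the VLA.

```c
#include <string.h>

/* Hypothetical stand-alone variant of the example above.
   Returns 1 when strlen of the VLA at offset 2 yields 3.
   The caller must pass n >= 6 so that "12345" plus the NUL fits.  */
int
vla_suffix_length_ok (int n)
{
  char a[n];
  strcpy (a, "12345");
  /* "345" starts at offset 2, so the expected length is 3.  */
  return strlen (&a[2]) == 3;
}
```

Whether the strlen call is actually folded away can be checked with
-O2 -fdump-tree-optimized, as the new tests do.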

Martin
PR tree-optimization/90662 - strlen of a string in a vla plus offset not folded

gcc/ChangeLog:

	PR tree-optimization/90662
	* tree-ssa-strlen.c (get_stridx): Handle simple VLAs and pointers
	to arrays.

gcc/testsuite/ChangeLog:

	PR tree-optimization/90662
	* gcc.dg/strlenopt-62.c: New test.
	* gcc.dg/strlenopt-63.c: New test.

diff --git a/gcc/testsuite/gcc.dg/strlenopt-62.c b/gcc/testsuite/gcc.dg/strlenopt-62.c
new file mode 100644
index 000..644bdee2c1b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-62.c
@@ -0,0 +1,89 @@
+/* PR tree-optimization/90662 - strlen of a string in a vla plus offset
+   not folded
+   { dg-do compile }
+   { dg-options "-O2 -Wall -fdump-tree-gimple -fdump-tree-optimized" } */
+
+#include "strlenopt.h"
+
+#define CONCAT(x, y) x ## y
+#define CAT(x, y) CONCAT (x, y)
+#define FAILNAME(name, counter) \
+  CAT (CAT (CAT (call_ ## name ##_on_line_, __LINE__), _), counter)
+
+#define FAIL(name, counter) do {			\
+extern void FAILNAME (name, counter) (void);	\
+FAILNAME (name, counter)();\
+  } while (0)
+
+/* Macro to emit a call to function named
+ call_in_true_branch_not_eliminated_on_line_NNN()
+   for each call that's expected to be eliminated.  The dg-final
+   scan-tree-dump-time directive at the bottom of the test verifies
+   that no such call appears in output.  */
+#define ELIM(expr) \
+  if (!(expr)) FAIL (in_true_branch_not_eliminated, __COUNTER__); else (void)0
+
+#define ARGS(...) __VA_ARGS__
+
+void sink (void*, ...);
+
+
+#define T(expect, init, p)\
+  do {			\
+char vla[n];	\
+char *ptr = strcpy (vla, init);			\
+ELIM (expect == strlen (p));			\
+sink (ptr);		\
+  } while (0)
+
+void test_vla_local (int n)
+{
+  T (0, "", ptr);
+  T (0, "\0", ptr);
+  T (1, "1", ptr);
+  T (2, "12", ptr);
+  T (3, "123", ptr);
+
+  T (2, "123", ptr + 1);
+  T (1, "123", &ptr[2]);
+  T (0, "123", &ptr[1] + 2);
+}
+
+
+#undef T
+#define T(expect, parr, init, p)			\
+  do {			\
+char (*parray)[] = *ppa++;\
+char *ptr = strcpy (parr, init);			\
+(void)&ptr;		\
+ELIM (expect == strlen (p));			\
+  } while (0)
+
+/* Have the function take a pointer to pointers to arrays so that each
+   test case can use its own pointer to avoid interference between.  */
+
+void test_array_ptr (char (**ppa)[])
+{
+  T (0, *parray, "", *parray);
+  T (0, *parray, "", &(*parray)[0]);
+
+  T (1, *parray, "1", &(*parray)[0]);
+  T (0, *parray, "1", &(*parray)[1]);
+
+  T (2, *parray, "12", &(*parray)[0]);
+  T (1, *parray, "12", &(*parray)[1]);
+  T (0, *parray, "12", &(*parray)[2]);
+
+  T (3, *parray, "123", &(*parray)[0]);
+  T (2, *parray, "123", &(*parray)[1]);
+  T (1, *parray, "123", &(*parray)[2]);
+  T (0, *parray, "123", &(*parray)[3]);
+
+  T (3, *parray, "123", ptr);
+  T (2, *parray, "123", &ptr[1]);
+  T (1, *parray, "123", &ptr[2]);
+  T (0, *parray, "123", &ptr[3]);
+}
+
+/* { dg-final { scan-tree-dump-times "strlen" 0 "optimized" } }
+   { dg-final { scan-tree-dump-times "not_eliminated" 0 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-63.c b/gcc/testsuite/gcc.dg/strlenopt-63.c
new file mode 100644
index 000..ca3e55fd9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-63.c
@@ -0,0 +1,158 @@
+/* PR tree-optimization/90662 - strlen of a string in a vla plus offset
+   not folded
+   Verify that strlen of pointers to arrays are computed correctly
+   (whether folded or not).
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+#include "strlenopt.h"
+
+#define A(expr) \
+  ((expr)   \
+   ? (void)0\
+   : (__builtin_printf ("assertion failed on line %i: %s\n",\
+__LINE__, #expr),   \
+  __builtin_abort ()))
+
+typedef char A5[5];
+
+A5 a5[5];
+A5* p[5] = { &a5[4], &a5[3], &a5[2], &a5[1], &a5[0] };
+
+__attribute__ ((noclone, noinline, noipa))
+void deref_deref (void)
+{
+  strcpy (**p, "12345");
+  A (strlen (**p) == 5);
+}
+
+__attribute__ ((noclone, noinline, noipa))
+void deref_idx_0 (void)
+{
+  strcpy (*p[0], "");
+  A (strlen (*p[0]) == 0);
+}
+
+__attribute__ ((noclone, noinline, noipa))
+void deref_idx_1 (void)
+{
+  strcpy (*p[1], "1");
+  A (strlen (*p[1]) == 1);
+  A (strlen (&(*p[1])[1]) == 0);
+
+  A (strlen (*p[0]) == 0);
+}
+
+__attribute__ ((noclone, noinline, noipa))

Re: [PATCH] [MIPS] Inhibit trailing .insn if pool is not followed by code

2019-06-05 Thread Maciej W. Rozycki

On Wed, 5 Jun 2019, Jeff Law wrote:

> >  Thanks for looking into it.  FWIW I think the `__pend' symbol will best 
> > be still emitted for consistency, however as STT_OBJECT and consequently 
> > with no trailing `.insn'.
> If I understand correctly we'd still want to call
> mips_set_text_contents_type in all the cases we did before, but that
> we'd pass in false for the FUNCTION_P argument?

 Yes, I think it would be the most straightforward implementation.

  Maciej

Re: [PATCH] preserve integer value of local addresses returned from functions (PR 90737)

2019-06-05 Thread Jeff Law

On 6/3/19 3:24 PM, Martin Sebor wrote:
> While testing a different -Wreturn-local-addr bug fix/enhancement
> I noticed that in functions that return integers as opposed to
> pointers such as:
> 
>   intptr_t f (int i) { return (intptr_t)&i; }
> 
> the converted address is folded to zero.  This can be detected
> by strictly conforming programs so it's not really correct.
> Such statements also trigger the warning.
> 
> The attached patch adjusts the C and C++ front-ends to avoid
> the folding.
> 
> The patch also avoids the warning but I'm on the fence about that.
> There is some value in diagnosing it since it could be masking
> a bug.  Would anyone like to argue in favor of keeping it?
> 
> Martin
> 
> gcc-90737.diff
> 
> PR c/90737 - [8/9/10 Regression] inconsistent address of a local converted to 
> intptr_t between callee and caller
> 
> gcc/c/ChangeLog:
> 
>   PR c/90737
>   * c-typeck.c (c_finish_return): Only consider functions returning
>   pointers as candidates for -Wreturn-local-addr.
> 
> gcc/cp/ChangeLog:
> 
>   PR c/90737
>   * typeck.c (maybe_warn_about_returning_address_of_local): Only
>   consider functions returning pointers as candidates for
>   -Wreturn-local-addr.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/90737
>   * c-c++-common/Wreturn-local-addr.c: New test.
>   * g++.dg/warn/Wreturn-local-addr-6.C: New test.
OK
jeff

Re: [PATCH][MSP430][4/4] Implement 64-bit shifts in assembly code

2019-06-05 Thread Jeff Law

On 6/4/19 7:17 AM, Jozef Lawrynowicz wrote:
> This patch implements 64-bit shifts in assembly code. Previously, generic C
> library code from libgcc would be used to perform the shifts, which was much
> more costly in terms of code size.
> 
> I observed 700 PASS->FAIL regressions from the GCC testsuite alone when these
> 64-bit shifts were implemented incorrectly, hence I've assumed there is
> already adequate test coverage that shifts operate correctly, and I have not
> added new tests to verify their correct execution.
> 
> For the following program, the below code size reduction is observed:
>   long long a;
> 
>   int
>   main (void)
>   {
> a = a >> 4;
> return 0;
>   }
> 
> With shift patch 3:
>textdata bss dec hex filename
> 670  12  26 708 2c4 a.out
> With new patch:
>textdata bss dec hex filename
> 512  12  26 550 226 a.out
> 
> Ok for trunk?
> 
> 
> 0004-MSP430-Implement-64-bit-shifts-in-assembly-code.patch
> 
> From 3b34b3d005ea63b37cf6a277395a048e55d854b2 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 13 May 2019 17:55:27 +0100
> Subject: [PATCH 4/4] MSP430: Implement 64-bit shifts in assembly code
> 
> gcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.c (msp430_expand_helper): Setup arguments which
>   describe how to perform MSPABI compliant 64-bit shift.
>   * config/msp430/msp430.md (ashldi3): New define_expand.
>   (ashrdi3): New define_expand.
>   (lshrdi3): New define_expand.
> 
> libgcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/slli.S (__mspabi_s): New library function for
>   performing a logical left shift of a 64-bit value.
>   (__mspabi_srall): New library function for
>   performing an arithmetic right shift of a 64-bit value.
>   (__mspabi_srlll): New library function for
>   performing a logical right shift of a 64-bit value.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * gcc.target/msp430/mspabi_s.c: New test.
>   * gcc.target/msp430/mspabi_srall.c: New test.
>   * gcc.target/msp430/mspabi_srlll.c: New test.
Going to assume your assembly routines are correct :-)

OK
jeff
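
For reference, the behaviour expected from a 64-bit arithmetic right shift
helper on a 16-bit target can be modelled in C roughly as below.  This is
only a sketch, not the assembly from the patch: the function name is
hypothetical, and shifting one bit per iteration across four 16-bit words is
an assumption about how such a routine might be structured.

```c
#include <stdint.h>

/* Hypothetical C model of a 64-bit arithmetic right shift built from
   16-bit word operations.  COUNT is assumed to be in the range 0..63.  */
int64_t
model_srall (int64_t value, unsigned count)
{
  uint64_t bits = (uint64_t) value;
  uint16_t w[4];

  /* Split the value into four 16-bit words, least significant first.  */
  for (int i = 0; i < 4; i++)
    w[i] = (uint16_t) (bits >> (16 * i));

  while (count-- > 0)
    {
      /* Remember the sign bit so it can be replicated into the top word.  */
      uint16_t sign = (uint16_t) (w[3] & 0x8000u);

      /* Each lower word receives bit 0 of the word above it in bit 15.  */
      for (int i = 0; i < 3; i++)
        w[i] = (uint16_t) ((w[i] >> 1) | ((w[i + 1] & 1u) << 15));
      w[3] = (uint16_t) ((w[3] >> 1) | sign);
    }

  /* Reassemble the four words into a 64-bit result.  */
  bits = 0;
  for (int i = 0; i < 4; i++)
    bits |= (uint64_t) w[i] << (16 * i);
  return (int64_t) bits;
}
```

A model like this is handy for cross-checking the results of the new
routines on a host machine.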

Re: [PATCH] [MIPS] Inhibit trailing .insn if pool is not followed by code

2019-06-05 Thread Jeff Law

On 6/4/19 3:07 PM, Maciej W. Rozycki wrote:
> On Mon, 3 Jun 2019, Faraz Shahbazker wrote:
> 
>> The __pool and __pend symbols are used to mark the beginning and end of
>> inline constant pools in MIPS16 code regions.  However if the pool occurs
>> at the boundary of a code region and is not followed by further code,
>> presence of the __pend symbol can confuse the disassembler into treating
>> subsequent non-MIPS16 code block as MIPS16.
> 
>  Thanks for looking into it.  FWIW I think the `__pend' symbol will best 
> be still emitted for consistency, however as STT_OBJECT and consequently 
> with no trailing `.insn'.
If I understand correctly we'd still want to call
mips_set_text_contents_type in all the cases we did before, but that
we'd pass in false for the FUNCTION_P argument?

jeff

Re: [PATCH] Put __dso_handle in .sdata/.sbss on ia64

2019-06-05 Thread Jeff Law

On 6/1/19 6:30 PM, James Clarke wrote:
> The symbol is exposed to C by dso_handle.h, and since it's a single
> 8-byte pointer, it is just within the threshold for being in the small
> data (or bss) section, so code accessing it will use GP-relative
> addressing. Therefore we must put it in .sdata/.sbss in case our other
> data sections grow too big and we overflow the 22-bit relocation.
> 
> libgcc/
>   * config/ia64/crtbegin.S (__dso_handle): Put in .sdata/.sbss
>   rather than .data/.bss so it can be accessed via gp-relative
>   addressing.
Thanks.  I've installed this on the trunk.

jeff
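
To put a number on the overflow concern, the sketch below checks whether an
offset fits in a signed field of a given width.  It assumes the 22-bit
relocation holds a signed byte offset from gp, which would give a reach of
roughly +/-2 MiB; the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when OFFSET is representable in a signed
   field of BITS bits.  BITS is assumed to be in the range 1..63.  */
bool
fits_signed_field (int64_t offset, unsigned bits)
{
  int64_t limit = (int64_t) 1 << (bits - 1);
  return offset >= -limit && offset < limit;
}
```

Under that assumption, fits_signed_field (off, 22) fails once an object
sits 2097152 bytes or more above gp, which is the situation the patch
avoids by keeping __dso_handle in the small data sections.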

Re: [PATCH][MSP430][3/4] Disable performance optimal library code shifts when optimizing for size

2019-06-05 Thread Jeff Law

On 6/4/19 7:14 AM, Jozef Lawrynowicz wrote:
> This patch reduces code size by disabling the performance optimized,
> "const_variant" of shift library functions when optimization for size is
> enabled.
> 
> For the following program, the below code size reduction is observed:
>   long a;
> 
>   int
>   main (void)
>   {
> a = a >> 4;
> return 0;
>   }
> 
> With shift patch 2:
>textdata bss dec hex filename
> 522  12  22 556 22c a.out
> New patch:
>textdata bss dec hex filename
> 474  12  22 508 1fc a.out
> 
> Ok for trunk?
> 
> 
> 0003-MSP430-Do-not-use-the-performance-optimized-variant-.patch
> 
> From 894b6809822ba3a3a1bab3750abe29e03f2a3ad6 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 13 May 2019 17:52:19 +0100
> Subject: [PATCH 3/4] MSP430: Do not use the performance optimized variant of a
>  shift by constant amount when optimizing for size
> 
> gcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.md (ashlhi3): Use the const_variant of shift
>   library functions only when not optimizing for size.
>   (ashlsi3): Likewise.
>   (ashrhi3): Likewise.
>   (ashrsi3): Likewise.
>   (lshrhi3): Likewise.
>   (lshrsi3): Likewise.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * gcc.target/msp430/size-optimized-shifts.c: New test.
OK
jeff

Re: [PATCH][MSP430][2/4] Emulate 16-bit shifts with rotate insn when src operand is originally in memory

2019-06-05 Thread Jeff Law

On 6/4/19 7:11 AM, Jozef Lawrynowicz wrote:
> This patch reduces code size by enabling the emulation of some 16-bit shift
> instructions with the native rotate instructions, when the source operand is
> in memory. This is achieved by forcing the source operand into a register.
> 
> For the following program, the below code size reduction is observed:
>   int a;
> 
>   int
>   main (void)
>   {
>     a = a << 4;
>     return 0;
>   }
> 
> With shift patch 1:
>    text    data     bss     dec     hex filename
>     484      12      20     516     204 a.out
> With new patch:
>    text    data     bss     dec     hex filename
>     452      12      20     484     1e4 a.out
> 
> Ok for trunk?
> 
> 
> 0002-MSP430-Force-the-src-operand-of-a-HImode-shift-into-.patch
> 
> From e609f63d49227ce385316896dde6a476f5f27db7 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 13 May 2019 17:48:00 +0100
> Subject: [PATCH 2/4] MSP430: Force the src operand of a HImode shift into a
>  register if it is in memory
> 
> gcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.md (ashlhi3): Force shift src operand into a
>   register if it is in memory, so the shift can be emulated with a rotate
>   instruction.
>   (ashrhi3): Likewise.
>   (lshrhi3): Likewise.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * gcc.target/msp430/emulate-slli.c: New test.
>   * gcc.target/msp430/emulate-srai.c: New test.
>   * gcc.target/msp430/emulate-srli.c: New test.
OK
jeff

Re: [PATCH][MSP430][1/4] Put libgcc shift functions in their own sections

2019-06-05 Thread Jeff Law

On 6/4/19 7:07 AM, Jozef Lawrynowicz wrote:
> This patch reduces code size by putting each of the shift library functions
> from libgcc in their own section. This means that, for example, a 16-bit
> logical left shift does not result in code to perform a 32-bit logical left
> shift being included in the final executable, as the linker can now garbage
> collect unused sections.
> 
> For the following program, the below code size reduction is observed:
>   int a, b;
> 
>   int
>   main (void)
>   {
>     a = a << b;
>     return 0;
>   }
> 
> Current trunk:
>    text    data     bss     dec     hex filename
>     572      12      22     606     25e a.out
> With patch:
>    text    data     bss     dec     hex filename
>     466      12      22     500     1f4 a.out
> 
> Ok for trunk?
> 
> 
> 0001-MSP430-Put-the-library-functions-for-bitwise-shifts-.patch
> 
> From 8017a4b453ae1b07bbeb75f7f7613a5bc5605159 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 13 May 2019 17:42:08 +0100
> Subject: [PATCH 1/4] MSP430: Put the library functions for bitwise shifts in
>  their own sections
> 
> libgcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/slli.S (__mspabi_slli_n): Put function in its own
>   section.
>   (__mspabi_slli): Likewise.
>   (__mspabi_slll_n): Likewise.
>   (__mspabi_slll): Likewise.
>   * config/msp430/srai.S (__mspabi_srai_n): Likewise.
>   (__mspabi_srai): Likewise.
>   (__mspabi_sral_n): Likewise.
>   (__mspabi_sral): Likewise.
>   * config/msp430/srli.S (__mspabi_srli_n): Likewise.
>   (__mspabi_srli): Likewise.
>   (__mspabi_srll_n): Likewise.
>   (__mspabi_srll): Likewise.
OK.
jeff

Re: [PATCH] Enable memory operand for vfpclass[p,s][s,d] patterns.

2019-06-05 Thread Jeff Law

On 6/5/19 1:39 AM, Hongtao Liu wrote:
> Hi Jeff and Jakub:
>   When adding new intrinsics (PR target/89803), I found that the
> vfpclassp[sd] and vfpclasss[sd] patterns didn't support a memory
> operand, although the instructions do. This patch enables memory
> operands for the vfpclassp[s,d]/vfpclasss[s,d] patterns.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux (on skylake-avx512),
> ok for trunk?
> 
> Changelog
> gcc/
> 2019-06-05  Hongtao Liu  
> 
> * config/i386/sse.md (define_mode_suffix vecmemsuffix): New.
> (define_insn "avx512dq_fpclass"):
> Enable memory operand for it.
> (define_insn "avx512dq_vmfpclass"): Ditto.
> 
> gcc/testsuite/
> 2019-06-05  Hongtao Liu  
> 
> * gcc.target/i386/avx512dq-vfpclasspd-1.c:
> Adjust scan assembler for {x,y,z} suffix.
> * gcc.target/i386/avx512dq-vfpclassps-1.c: Ditto.
OK, but make sure you fix the ChangeLog formatting as you commit.

jeff

Re: [PATCH] IPA ICF: enhance dump output

2019-06-05 Thread Jeff Law

On 6/4/19 8:38 AM, Martin Liška wrote:
> Hi.
> 
> The patch simplifies the dump output. It also prints the file
> in which the dump message was emitted.
> 
> The patch bootstraps on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-06-04  Martin Liska  
> 
>   * ipa-icf-gimple.h (dump_message_1): Remove.
>   (dump_message): Likewise.
>   (return_false_with_message_1): Print also file.
>   (return_false_with_msg): Likewise.
>   (return_with_result): Likewise.
>   (return_with_debug): Likewise.
>   * ipa-icf.c (sem_function::equals_private): Remove call
>   to dump_message.
OK.
jeff

Re: [PATCH] Fix HTML headers and titles for 'Installing GCC' pages (PR web/87933).

2019-06-05 Thread Jeff Law

On 6/5/19 3:09 AM, Martin Liška wrote:
> Hi.
> 
> The patch fixes wrong titles/header in 'Installing GCC' pages.
> 
> Tested with make html.
> Ready for trunk?
> 
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-06-05  Martin Liska  
> 
>   PR web/87933
>   * doc/install.texi: Fix HTML headers and
>   titles for 'Installing GCC' pages.
> ---
>  gcc/doc/install.texi | 1 -
>  1 file changed, 1 deletion(-)
> 
> 
OK
jeff

Re: undefined behavior in value_range::equiv_add()?

2019-06-05 Thread Jeff Law

On 6/4/19 9:04 AM, Richard Biener wrote:
> On Tue, Jun 4, 2019 at 3:40 PM Jeff Law  wrote:
>> 
>> On 6/4/19 5:23 AM, Richard Biener wrote:
>>> On Tue, Jun 4, 2019 at 12:30 AM Jeff Law  wrote:
 
 On 6/3/19 7:13 AM, Aldy Hernandez wrote:
> On 5/31/19 5:00 AM, Richard Biener wrote:
>> On Fri, May 31, 2019 at 2:27 AM Jeff Law 
>> wrote:
>>> 
>>> On 5/29/19 10:20 AM, Aldy Hernandez wrote:
 On 5/29/19 12:12 PM, Jeff Law wrote:
> On 5/29/19 9:58 AM, Aldy Hernandez wrote:
>> On 5/29/19 9:24 AM, Richard Biener wrote:
>>> On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez
>>>  wrote:
 
 As per the API, and the original documentation
 to value_range, VR_UNDEFINED and VR_VARYING
 should never have equivalences. However, 
 equiv_add is tacking on equivalences blindly,
 and there are various regressions that happen
 if I fix this oversight.
 
 void value_range::equiv_add (const_tree var,
                              const value_range *var_vr,
                              bitmap_obstack *obstack)
 {
   if (!m_equiv)
     m_equiv = BITMAP_ALLOC (obstack);
   unsigned ver = SSA_NAME_VERSION (var);
   bitmap_set_bit (m_equiv, ver);
   if (var_vr && var_vr->m_equiv)
     bitmap_ior_into (m_equiv, var_vr->m_equiv);
 }
 
 Is this a bug in the documentation / API, or is
 equiv_add incorrect and we should fix the
 fall-out elsewhere?
>>> 
>>> I think this must have crept in during the
>>> classification. If you go back to say GCC 7 you
>>> shouldn't see value-ranges with UNDEFINED/VARYING
>>> state in the lattice that have equivalences.
>>> 
>>> It may not be easy to avoid with the new classy
>>> interface but we're certainly not tacking on them
>>> "blindly".  At least we're not supposed to.  As
>>> usual the intermediate state might be "broken"
>>> but intermediateness is not something the new class
>>> "likes".
>> 
>> It looks like extract_range_from_stmt (by virtue
>> of vrp_visit_assignment_or_call and then
>> extract_range_from_ssa_name) returns one of these
>> intermediate ranges.  It would seem to me that an 
>> outward looking API method like
>> vr_values::extract_range_from_stmt shouldn't be
>> returning inconsistent ranges.  Or are there no 
>> guarantees for value_ranges from within all of
>> vr_values?
> ISTM that if we have an implementation constraint
> that says a VR_VARYING or VR_UNDEFINED range can't
> have equivalences, then we need to honor that at the
> minimum for anything returned by an external API. 
> Returning an inconsistent state is bad.  I'd even
> state that we should try damn hard to avoid it in
> internal APIs as well.
 
 Agreed * 2.
 
> 
>> 
>> Perhaps I should give a little background.  As part
>> of your value_range_base re-factoring last year,
>> you mentioned that you didn't split out intersect
>> like you did union because of time or oversight.
>> I have code to implement intersect (attached), for
>> which I've noticed that I must leave equivalences
>> intact, even when transitioning to VR_UNDEFINED:
>> 
>> [from the attached patch]
>> +  /* If THIS is varying we want to pick up equivalences from OTHER.
>> +     Just special-case this here rather than trying to fixup after the
>> +     fact.  */
>> +  if (this->varying_p ())
>> +    this->deep_copy (other);
>> +  else if (this->undefined_p ())
>> +    /* ?? Leave any equivalences already present in an undefined.
>> +       This is technically not allowed, but we may get an in-flight
>> +       value_range in an intermediate state.  */
> Where/when does this happen?
 
 The above snippet is not currently in mainline.  It's
 in the patch I'm proposing to clean up intersect.  It's
 just that while cleaning up intersect I noticed that if
 we keep to the value_range API, we end up clobbering an
 equivalence to a VR_UNDEFINED that we depend upon further
 up the call chain.
 
 The reason it doesn't happen in mainline is because
 intersect_helper bails early on an undefined, thus
 leaving the problematic equivalence intact.
 
 You can see it in mainline though, with the following
 testcase:
 
 int f(int x) {
   if (x != 0 && x != 1)
     return -2;

   return !x;
 }
 
 Break in evrp_range_analyzer::record_ranges_from_stmt()
 and see that the call to extract_range_from_stmt()

Go patch committed: Make calls and functions inlinable

2019-06-05 Thread Ian Lance Taylor

This patch to the Go frontend makes call expressions and function
reference expressions inlinable.

We now scan inlinable methods for references to global variables and
functions (I forgot to do that earlier).

We now track all packages mentioned by exports (that should have been
done earlier too).

We record assembler names in the export data, so that we can inline
calls to non-Go functions.  We modify the gccgoimporter code to skip
the assembler name.

This increases the number of inlinable functions in the standard
library from 215 to 439.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 271945)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-949c3b7aa603bc09e650d62e82c600b3463802f0
+2609f9b8420e2341fbbe40d7cf6af42b0fba7293
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/export.cc
===
--- gcc/go/gofrontend/export.cc (revision 271891)
+++ gcc/go/gofrontend/export.cc (working copy)
@@ -133,6 +133,11 @@ Collect_references_from_inline::expressi
   if (fe != NULL)
 {
   Named_object* no = fe->named_object();
+
+  if (no->is_function_declaration()
+ && no->func_declaration_value()->type()->is_builtin())
+   return TRAVERSE_CONTINUE;
+
   std::pair ins =
this->exports_->insert(no);
 
@@ -247,6 +252,22 @@ Export::export_globals(const std::string
  if ((*p)->is_function()
  && (*p)->func_value()->export_for_inlining())
check_inline_refs.push_back(*p);
+ else if ((*p)->is_type())
+   {
+ const Bindings* methods = (*p)->type_value()->local_methods();
+ if (methods != NULL)
+   {
+ for (Bindings::const_definitions_iterator pm =
+methods->begin_definitions();
+  pm != methods->end_definitions();
+  ++pm)
+   {
+ Function* fn = (*pm)->func_value();
+ if (fn->export_for_inlining())
+   check_inline_refs.push_back(*pm);
+   }
+   }
+   }
}
 }
 
@@ -282,6 +303,9 @@ Export::export_globals(const std::string
}
 }
 
+  // Track all imported packages mentioned in export data.
+  Unordered_set(const Package*) all_imports;
+
   // Export the symbols in sorted order.  That will reduce cases where
   // irrelevant changes to the source code affect the exported
   // interface.
@@ -291,15 +315,20 @@ Export::export_globals(const std::string
   for (Unordered_set(Named_object*)::const_iterator p = exports.begin();
p != exports.end();
++p)
-sorted_exports.push_back(*p);
+{
+  sorted_exports.push_back(*p);
+
+  const Package* pkg = (*p)->package();
+  if (pkg != NULL)
+   all_imports.insert(pkg);
+}
 
   std::sort(sorted_exports.begin(), sorted_exports.end(), Sort_bindings());
 
   // Assign indexes to all exported types and types referenced by
   // exported types, and collect all packages mentioned.
-  Unordered_set(const Package*) type_imports;
   int unexported_type_index = this->prepare_types(&sorted_exports,
- &type_imports);
+ &all_imports);
 
   // Although the export data is readable, at least this version is,
   // it is conceptually a binary format.  Start with a four byte
@@ -327,7 +356,7 @@ Export::export_globals(const std::string
 
   this->write_packages(packages);
 
-  this->write_imports(imports, type_imports);
+  this->write_imports(imports, all_imports);
 
   this->write_imported_init_fns(package_name, import_init_fn,
imported_init_fns);
@@ -693,7 +722,7 @@ import_compare(const std::pair& imports,
- const Unordered_set(const Package*)& type_imports)
+ const Unordered_set(const Package*)& all_imports)
 {
   // Sort the imports for more consistent output.
   Unordered_set(const Package*) seen;
@@ -729,8 +758,8 @@ Export::write_imports(const std::map indirect_imports;
   for (Unordered_set(const Package*)::const_iterator p =
-type_imports.begin();
-   p != type_imports.end();
+all_imports.begin();
+   p != all_imports.end();
++p)
 {
   if (seen.find(*p) == seen.end())
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 271945)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -1356,6 +1356,29 @@ Func_expression::do_get_backend(Translat
   return gogo->backend()->convert_expression(btype, bexpr,

Re: [PATCH] Improve PTA flow-sensitivity (for the return stmt)

2019-06-05 Thread Jeff Law

On 6/5/19 6:51 AM, Richard Biener wrote:
> 
> The following was inspired by Martin's work on escapes of locals
> and the discussion there.  It teaches points-to analysis that
> the point of function return is special and thus escapes through
> that a) do not influence other points-to solutions, b) can be
> pruned of all locals.
> 
> This is one example of reasonably simple "post-processing".
> 
> The effects are small.  I've gathered statistics, counting the number
> of variables that we no longer mark as escaped with this patch.  This
> number is usually zero, sometimes one, and in a few cases more
> (but never more than 11) during bootstrap:
> 
> 0 95830
> 1 19268
> 2 19
> 3 2
> 5 2
> 6 1
> 8 1
> 11 1
> 
> so not sure if it is worth all the effort.  It does allow us
> to do more DSE but that requires the accesses to be indirect
> which is not often true for locals.
> 
> Bootstrapped / tested on x86_64-unknown-linux-gnu.
> 
> Martin, does this help you at all?  Does anybody think this is
> worth the trouble?
> 
> Thanks,
> Richard.
> 
> 2019-06-05  Richard Biener  
> 
>   * tree-ssa-structalias.c: Include tree-cfg.h.
>   (make_heapvar): Do not make heap vars artificial.
>   (find_func_aliases_for_builtin_call): Handle stack allocation
>   functions.
>   (find_func_aliases): Delay processing of simple enough returns
>   in non-IPA mode.
>   (set_uids_in_ptset): Adjust.
>   (find_what_var_points_to): Likewise.
>   (solve_constraints): Do not dump points-to sets here.
>   (compute_points_to_sets): Post-process return statements,
>   amending the escaped solution.  Dump points-to sets afterwards.
>   (ipa_pta_execute): Dump points-to sets.
> 
>   * gcc.dg/tree-ssa/alias-37.c: New testcase.
>   * gcc.dg/torture/20190604-1.c: Likewise.
>   * gcc.dg/tree-ssa/pta-callused.c: Adjust.
It's a fair amount of code for a corner case, _but_ I think it's a step
in the right direction.  I suspect we ought to be using the alias oracle
solutions for more than we are today and improved precision certainly
doesn't hurt.

Jeff

Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-06-05 Thread Eric Botcazou

> This issue exists, not just for targets that can have their
> MAX_FIXED_MODE_SIZE more-or-less easily tweaked higher, but also
> for the 'bit-container' targets where it *can't* be set higher.
> 
> Let's please DTRT and correct the code here in the middle-end,
> so we don't ICE for those targets and this line (gcc.dg/pr69973.c):
>  typedef int v4si __attribute__ ((vector_size (1 << 29)));
> (all listed targets happen to have Pmode == SImode)
> 
> So, considering that: ok to commit?

You'd need to audit the effects on other targets though.  Are we sure that we 
want to do bitsizetype calculations in a larger mode on very embedded targets?

-- 
Eric Botcazou

Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-06-05 Thread Hans-Peter Nilsson

> From: Richard Biener 
> Date: Wed, 29 May 2019 15:04:42 +0200

> On Tue, May 28, 2019 at 5:43 PM Hans-Peter Nilsson
>  wrote:
> >
> > TL;DR: instead of capping TYPE_PRECISION of bitsizetype at
> > MAX_FIXED_MODE_SIZE, search for the largest fitting size from
> > scalar_int_mode modes supported by the target using
> > targetm.scalar_mode_supported_p.
> >
> > -
> > In initialize_sizetypes, MAX_FIXED_MODE_SIZE is used as an upper
> > limit to the *precision* of the bit size of the size-type
> > (typically address length) of the target, which is wrong.
> >
> > The effect is that if a 32-bit target says "please don't cook up
> > pieces larger than a register size", then we don't get more
> > precision in address-related calculations than that, while the
> > bit-precision needs to be at least (precision +
> > LOG2_BITS_PER_UNIT + 1) with precision being the size of the
> > address, to diagnose overflows.  There are gcc_asserts that
> > guard this, causing ICE when broken.
> >
> > This MAX_FIXED_MODE_SIZE usage comes from r118977 (referencing
> > PR27885 and PR28176) and was introduced as if
> > MAX_FIXED_MODE_SIZE is the size of the largest supported type
> > for the target (where "supported" is in the most trivial sense
> > as in can move and add).  But it's not.
> >
> > MAX_FIXED_MODE_SIZE is arguably a bit vague, but documented as
> > "the size in bits of the largest integer machine mode that
> > should actually be used.  All integer machine modes of this size
> > or smaller can be used for structures and unions with the
> > appropriate sizes."
> 
> I read it as the machine may not have ways to do basic
> things like add two numbers in modes bigger than this
> but you can use larger modes as simple bit "containers".

Quick dismissal of the "should actually" in the documentation and
of my code-digging findings noted.

Either way, for expediency, it sounds like you accept that
MAX_FIXED_MODE_SIZE can validly be set to just the bitsize of
Pmode for a target (for example, targets where Pmode=SImode and
DImode is just a 'bit-container'), so let's skip forward to...

> >  While in general the documentation
> > sometimes differs from reality, that's mostly right, with
> > "should actually be" meaning "is preferably": it's the largest
> > size that the target indicates as beneficial of use besides that
> > directly mapped from types used in the source code; sort-of a
> > performance knob.  (I did a static reality code check looking
> > for direct and indirect uses before imposing my own
> > interpretation and recollection.)  Typical use is when gcc finds
> > that some operations can be combined and synthesized to
> > optionally use a wider mode than seen in the source (but mostly
> > copying).  Then this macro sets an upper limit to those
> > operations, whether to be done at all or the chunk-size.
> > Unfortunately some of the effects are unintuitive and I wouldn't
> > be surprised if this de-facto affects ABI corners.  It's not
> > something you tweak more than once for a target.
> >
> > Two tests pass with this fixed for cris-elf (MAX_FIXED_MODE_SIZE
> > 32): gcc.dg/attr-vector_size.c and gcc.dg/pr69973.c, where the
> > lack of precision (32 bits instead of 64 bits for bitsizetype)
> > caused a consistency check to ICE, from where I tracked this.
> 
> So why does cris-elf have 32 as MAX_FIXED_MODE_SIZE when it
> can apparently do DImode arithmetic just fine?

(For performance reasons, by choice rather than necessity.  Long
time ago; this was in the initial commit.  That's all incidental.)

>  On x86_64
> we end up with TImode which is MAX_FIXED_MODE_SIZE, on
> 32bit x86 it is DImode.
> 
> So - fix cris instead?

...here:

Umm no, this is not a CRIS-specific issue, that's just one
target where the issue was spotted.  See last for more targets.
Please don't suggest sweeping this bug under the carpet, for the
CRIS port or other targets, by tweaking their
MAX_FIXED_MODE_SIZE.  Also, as I mentioned, this can have other
unwanted effects; the macro is used elsewhere.  IMHO its usage
should be replaced by more specific target settings.

This issue exists, not just for targets that can have their
MAX_FIXED_MODE_SIZE more-or-less easily tweaked higher, but also
for the 'bit-container' targets where it *can't* be set higher.

Let's please DTRT and correct the code here in the middle-end,
so we don't ICE for those targets and this line (gcc.dg/pr69973.c):
 typedef int v4si __attribute__ ((vector_size (1 << 29)));
(all listed targets happen to have Pmode == SImode)

So, considering that: ok to commit?
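
To make the arithmetic behind the ICE concrete, here is a small worked sketch.  It assumes a target with 32-bit addresses and 8-bit units; the constant and function names are illustrative only and do not come from the GCC sources.

```cpp
#include <cstdint>

// Assumed values for a target where Pmode == SImode and bytes are 8 bits.
constexpr unsigned address_precision = 32;   // bits in an address
constexpr unsigned log2_bits_per_unit = 3;   // log2 (8)

// Minimum bitsizetype precision as stated in the thread:
// precision + LOG2_BITS_PER_UNIT + 1.
constexpr unsigned min_bitsize_precision(unsigned prec, unsigned log2_bpu)
{
  return prec + log2_bpu + 1;
}

// Size in bits of the vector type from gcc.dg/pr69973.c: (1 << 29) bytes.
constexpr std::uint64_t vector_bits = (std::uint64_t{1} << 29) * 8;

// 36 bits are needed, so a bitsizetype capped at 32 bits is too narrow.
static_assert(min_bitsize_precision(address_precision,
                                    log2_bits_per_unit) == 36,
              "32 + 3 + 1");
// The vector is 2^32 bits wide, one more than a 32-bit value can hold.
static_assert(vector_bits == (std::uint64_t{1} << 32), "2^29 bytes in bits");
static_assert(vector_bits > UINT32_MAX, "does not fit in 32 bits");
```

With these assumed numbers the smallest usual integer mode wide enough is a 64-bit one, which matches the "32 bits instead of 64 bits for bitsizetype" observation quoted below.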

> > Regarding the change, MAX_FIXED_MODE_SIZE is still mentioned but
> > just to initialize the fall-back largest-supported precision.
> > Sometimes the target supports no larger mode than that of the
> > address, like for a 64-bit target lacking support for larger
> > sizes (e.g. TImode), as in the motivating PR27885.  I guess they
> > can still get ICE for overflowing address-calculation checks,
> > but

Re: Review Hashtable extract node API

2019-06-05 Thread Jonathan Wakely


On 05/06/19 17:43 +0100, Jonathan Wakely wrote:

On 05/06/19 17:22 +0100, Jonathan Wakely wrote:

On 04/06/19 19:19 +0200, François Dumont wrote:

@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__node_base*
_M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no rehash (assumes
-  // no element with its key already present). Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket __bkt if no
+  // rehash (assumes no element with its key already present).
+  template<typename _NodeAccessor>
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-   __node_type* __n, size_type __n_elt = 1);
+   _M_insert_unique_node(const key_type& __k, size_type __bkt,
+ __hash_code __code, const _NodeAccessor&,
+ size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template<typename _NodeAccessor>
iterator
-  _M_insert_multi_node(__node_type* __hint,
-  __hash_code __code, __node_type* __n);
+   _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.
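
A minimal illustration of the point about lambda types, using hypothetical names (insert_with, get_a, get_b, insert_value) rather than anything from hashtable.h:

```cpp
#include <type_traits>

// Hypothetical stand-in for a function template that takes a callable.
template<typename NodeGetter>
int insert_with(const NodeGetter& get) { return get(); }

// Two lambdas with identical bodies...
auto get_a = [] { return 1; };
auto get_b = [] { return 1; };

// ...still have distinct closure types, so insert_with<decltype(get_a)>
// and insert_with<decltype(get_b)> are separate instantiations.
static_assert(!std::is_same<decltype(get_a), decltype(get_b)>::value,
              "each lambda expression has its own type");

// Taking a plain value instead gives one function shared by all callers.
inline int insert_value(int node) { return node; }
```

Whether the duplicated instantiations survive into the final binary depends on inlining and on any identical-code folding done by the toolchain, so this only shows why the duplication can arise.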


Also I didn't really like the name NodeAccessor. It's not an accessor,
because it performs ownership transfer. Invoking __node_accessor()
returns a __node_type* by releasing it from the previous owner (by
setting the owner's pointer member to null).

Passing a const reference to something called NodeAccessor does not
make it clear that it performs a mutating operation like that! If the
_M_insert_unique_node and _M_insert_multi_node functions did the
__node_accessor() call *before* rehashing, and rehashing threw an
exception, then they would leak. So it's important that the
__node_accessor() call happens at the right time, and so it's important
to name it well.

In my suggested patch the naming isn't misleading, because we just
pass a raw __node_type* and have a new comment saying:

// Takes ownership of __n if insertion succeeds, throws otherwise.

The function doesn't have a callable with non-local effects that
modifies an object outside the function. Because the caller sets the
previous owner's pointer to null there's no danger of it happening at
the wrong time; it can only happen after the function has returned and
ownership transfer has completed.
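
The ownership rule described above can be sketched as follows.  This is a simplified model with made-up names (Node, ScopedNode, insert_node, live_nodes), not the libstdc++ code: the guard keeps the node until the insertion function has returned, and only then does the caller null the guard's pointer.

```cpp
#include <cassert>
#include <stdexcept>

struct Node { int value; };

static int live_nodes = 0;  // allocations that have not been freed yet

// Simplified guard: frees the node unless ownership was handed over.
struct ScopedNode
{
  explicit ScopedNode(int v) : node(new Node{v}) { ++live_nodes; }
  ~ScopedNode() { if (node) { delete node; --live_nodes; } }
  ScopedNode(const ScopedNode&) = delete;
  ScopedNode& operator=(const ScopedNode&) = delete;
  Node* node;
};

// Takes ownership of n if insertion succeeds, throws otherwise.
Node* insert_node(Node* n, bool rehash_fails)
{
  if (rehash_fails)
    throw std::runtime_error("rehash failed");
  return n;  // the "table" now owns n
}

// Returns the number of leaked nodes after one insertion attempt.
int leaked_after_insert(bool rehash_fails)
{
  Node* stored = nullptr;
  try
    {
      ScopedNode guard(42);
      stored = insert_node(guard.node, rehash_fails);
      guard.node = nullptr;  // release only after insertion has succeeded
    }
  catch (const std::runtime_error&)
    {
      // The guard's destructor has already freed the node.
    }
  assert(stored == nullptr || !rehash_fails);
  if (stored) { delete stored; --live_nodes; }  // stand-in for table teardown
  return live_nodes;
}
```

Calling leaked_after_insert with either argument should leave no node outstanding: on failure the guard frees it, on success the table does.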


As a further evolution that simplifies some uses of _Scoped_node we
could give it a constructor that allocates a node and constructs an
element, as in the attached patch.


commit cba81aa3c4b41bef140de30e9651c5134ee2f5ef
Author: Jonathan Wakely 
Date:   Wed Jun 5 20:12:01 2019 +0100

with _Scoped_node constructor that creates the node

diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 87a15c8d037..ab579a7059e 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -259,9 +259,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Simple RAII type for managing a node containing an element
   struct _Scoped_node
   {
-	_Scoped_node(__hashtable_alloc* __h, __node_type* __n)
+	// Take ownership of a node with a constructed element.
+	_Scoped_node(__node_type* __n, __hashtable_alloc* __h)
 	: _M_h(__h), _M_node(__n) { }
 
+	// Allocate a node and construct an element within it.
+	template<typename... _Args>
+	  _Scoped_node(__hashtable_alloc* __h, _Args&&... __args)
+	  : _M_h(__h),
+	_M_node(__h->_M_allocate_node(std::forward<_Args>(__args)...))
+	  { }
+
+	// Destroy element and deallocate node.
 	~_Scoped_node() { if (_M_node) _M_h->_M_deallocate_node(_M_node); };
 
 	_Scoped_node(const _Scoped_node&) = delete;
@@ -1653,9 +1662,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   -> pair<iterator, bool>
   {
 	// First build the node to get access to the hash code
-	_Scoped_node __node {
-	this, this->_M_allocate_node(std::forward<_Args>(__args)...)
-	};
+	_Scoped_node __node { this, std::forward<_Args>(__args)...  };
 	const key_type& __k = this->_M_extract()(__node._M_node->_M_v());
 	__hash_code __code = this->_M_hash_code(__k);
 	size_type __bkt = _M_bucket_index(__k, __code);
@@ -1681,9 +1688,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   -> iterator
   {
 	// First build the node to get its

Re: PR C++/63149

2019-06-05 Thread Nina Dinka Ranns

On Wed, 5 Jun 2019 at 19:19, Jason Merrill  wrote:

> On 6/5/19 1:29 PM, Nina Dinka Ranns wrote:
> > Ack. Amended change log is below. Changes are :
> > * changed C++ -> c++
> > * fixed the name of added test
> >
> > There are no changes in the diff, but I attached it to this e-mail for
> > reference.
>
> Applied, thanks!
>
> For future reference it's also customary to add a bit of discussion of
> the rationale for the patch.  Also, please include the word "PATCH" on
> the subject line.


Noted.
Thank you,
Nina


>
> Jason
>

Re: [wwwdocs] Document existence of openacc-gcc-9-branch

2019-06-05 Thread Julian Brown

On Wed, 5 Jun 2019 10:30:41 +0200
Thomas Schwinge  wrote:

> Hi Julian!
> 
> On Tue, 4 Jun 2019 23:05:53 +0100, Julian Brown
>  wrote:
> > I've pushed a new branch "openacc-gcc-9-branch" to the Git
> > mirror (i.e. as a Git-only branch), for development of OpenACC and
> > related functionality on top of the GCC 9 branch. It's currently
> > based off the gcc-9_1_0-release tag, and contains a number of
> > patches mainly merged from either the openacc-gcc-8-branch, or from
> > further-developed versions of those patches that have been
> > submitted for upstream review.
> > 
> > This patch updates the svn.html page to point to the new branch
> > rather than the old openacc-gcc-8-branch, which is retired now.
> > 
> > OK to commit?  
> 
> As obvious, but please also add an "openacc-gcc-8-branch" stanza next
> to "openacc-gcc-7-branch" in the "Merged Development Branches"
> section, and update the "gomp-4_0-branch" and "openacc-gcc-7-branch"
> stanzas accordingly.
> 
> Well, actually please move "gomp-4_0-branch", "openacc-gcc-7-branch",
> and "openacc-gcc-8-branch" into the "Inactive Development Branches"
> section, for all "These branches are inactive and contain work that
> might not been merged": they all contain some changes that have not
> been forward-ported to their later instances.

I've committed this version.

Thanks,

Julian
Index: htdocs/svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.225
diff -u -p -r1.225 svn.html
--- htdocs/svn.html	30 Sep 2018 14:38:47 -	1.225
+++ htdocs/svn.html	5 Jun 2019 18:39:19 -
@@ -291,19 +291,19 @@ the command svn log --stop-on-copy
   Patches should be marked with the tag [no-undefined-overflow]
   in the subject line.  The branch is maintained by Richard Biener.
 
-  https://gcc.gnu.org/wiki/OpenACC;>openacc-gcc-8-branch
+  https://gcc.gnu.org/wiki/OpenACC;>openacc-gcc-9-branch
   This https://gcc.gnu.org/wiki/GitMirror;>Git-only branch is
   used for collaborative development
   of https://gcc.gnu.org/wiki/OpenACC;>OpenACC support and related
   functionality, such
   as https://gcc.gnu.org/wiki/Offloading;>offloading support.  The
-  branch is based on gcc-8-branch.  Find it
+  branch is based on gcc-9-branch.  Find it
   at git://gcc.gnu.org/git/gcc.git,
-  https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-8-branch;>https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-8-branch,
+  https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-9-branch;>https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-9-branch,
   or
-  https://github.com/gcc-mirror/gcc/tree/openacc-gcc-8-branch;>https://github.com/gcc-mirror/gcc/tree/openacc-gcc-8-branch.
-  Please send email with a short-hand [og8] tag in the subject
-  line, and use ChangeLog.openacc files.
+  https://github.com/gcc-mirror/gcc/tree/openacc-gcc-9-branch;>https://github.com/gcc-mirror/gcc/tree/openacc-gcc-9-branch.
+  Please send patch emails with a short-hand [og9] tag in the
+  subject line, and use ChangeLog.openacc files.
 
   https://gcc.gnu.org/wiki/plugins;>plugins
   This branch adds plugin functionality to GCC.  See the 
   https://gcc.gnu.org/wiki/tuples/;>gimple-tuples-branch
   gomp-20050608-branch
   gomp-3_0-branch
-  gomp-4_0-branch
-  This branch was used to update
-  the https://gcc.gnu.org/wiki/openmp;>OpenMP support to version
-  4.0, including development
-  of https://gcc.gnu.org/wiki/Offloading;>offloading support in
-  GCC as well as support
-  for https://gcc.gnu.org/wiki/OpenACC;>OpenACC.  These features
-  got merged into trunk.  Based on gcc-6-branch then, this branch was used for
-  on-going development of OpenACC support and related functionality, which then
-  moved to openacc-gcc-7-branch, and now openacc-gcc-8-branch.
 
   java-gui-20050128-branch
   This was a temporary branch for development of java GUI libraries
@@ -820,11 +810,6 @@ inactive.
   ea...@eagercon.com.
   All changes have been merged into mainline.
 
-  openacc-gcc-7-branch
-  Based on gcc-7-branch, this branch was used for on-going development
-  of https://gcc.gnu.org/wiki/OpenACC;>OpenACC support and related
-  functionality, which now moved openacc-gcc-8-branch.
-
   pch-branch
   tree-ssa-20020619-branch
   https://gcc.gnu.org/wiki/Var_Tracking_Assignments;>var-tracking-assignments*-branch
@@ -978,6 +963,25 @@ merged.
   OpenMP support in GCC.  They were never properly maintained and
   have now been superseded by gomp-20050608-branch.
 
+  gomp-4_0-branch
+  This branch was based on gcc-6-branch, and was used to update
+  the https://gcc.gnu.org/wiki/openmp;>OpenMP support to version
+  4.0, including development
+  of https://gcc.gnu.org/wiki/Offloading;>offloading support in
+  GCC as well as support
+  for https://gcc.gnu.org/wiki/OpenACC;>OpenACC.  These features
+  got merged into trunk.  The branch was then used for on-going

Re: [C++ Patch] Use declarator->id_loc in three additional places

2019-06-05 Thread Jason Merrill


On 6/5/19 2:09 PM, Paolo Carlini wrote:

Hi,

On 05/06/19 19:45, Jason Merrill wrote:

On 6/4/19 11:57 AM, Paolo Carlini wrote:

Hi,

On 04/06/19 16:50, Jason Merrill wrote:

On 6/4/19 10:31 AM, Paolo Carlini wrote:

+  permerror (loc, "member functions are implicitly "
+ "friends of their class");


Wouldn't it be better to use the location of "friend" in this 
diagnostic?


Yes, however doing that fully correctly seems a bit tricky


Why tricky?  Doesn't declspecs->locations[ds_friend] work?


To be honest, here I wasn't considering ds_friend at all. Indeed it 
gives us the location of 'friend', but, say, for a testcase like 
parse/friend4.C:


struct A
{
   friend void A::foo();  // { dg-error "implicitly friends" }
   friend A::~A();    // { dg-error "implicitly friends" }
};

I thought we wanted a precise caret under the 'f' of 'foo' and the '~' 
of the destructor - both clang and icc do that - I wasn't even 
considering pointing at 'friend'. If you think that would be an 
improvement wrt the current closed parenthesis - I agree it would! - I 
can do that!


Please do.

Jason

Re: [PATCH] fix more -Wformat-diag issues

2019-06-05 Thread Martin Sebor


On 5/31/19 12:20 PM, Jeff Law wrote:

On 5/31/19 9:56 AM, Martin Sebor wrote:

On 5/30/19 5:49 PM, Jeff Law wrote:

So in several places there's a comment which indicates that debugging
dumps and the like do not follow conventions.  Presumably you've tried
to keep a narrow scope on the diagnostic push/pops.  I'm also concerned
that the comments mention that we trigger an ICE.

So while I'll ack this patch, I would like to know more about the ICE
that's triggered in the checker and what the plans are for fixing it.


Sorry, I didn't word the comment (copied below) very clearly.
What I meant to say is that the calls to error() in these files
that don't follow the convention are ultimately followed by
an ICE triggered either by an assert (as in cfgloop.c) or a call
to internal_error (cgraph.h).  The diagnostics themselves don't
cause an ICE.

OK.  Thanks for the clarification.



In a comment on one of the i18n bugs raised for these strings
Richard suggests these error calls should probably be replaced by
direct calls to the pretty printer.  That would let us avoid
suppressing the warnings and also presumably make it clear to
translators the format strings aren't meant to be translated.
It seemed like too big of a change for this patch so I simply
suppressed the warnings but I agree it's worth considering at
some point.

Agreed.



I'll adjust the comment before I check in the patch (I'm hoping
to commit it at the same time as the checker itself once it's
approved).

Your call on when to commit :-)


I just committed it in r271971 with a few minor tweaks.  As before
I expect some minor fallout in the test suite, and more fixes to
follow once the checker itself is approved and committed.

Martin

Re: PR C++/63149

2019-06-05 Thread Jason Merrill


On 6/5/19 1:29 PM, Nina Dinka Ranns wrote:

Ack. Amended change log is below. Changes are :
* changed C++ -> c++
* fixed the name of added test

There are no changes in the diff, but I attached it to this e-mail for
reference.


Applied, thanks!

For future reference it's also customary to add a bit of discussion of 
the rationale for the patch.  Also, please include the word "PATCH" on 
the subject line.


Jason

Re: [PATCH] alpha: Use TARGET_COMPUTE_FRAME_LAYOUT

2019-06-05 Thread Uros Bizjak

> At the same time, merge several related frame computing functions.
> Recall that HWI is now always 64-bit, so merge IMASK and FMASK,
> which allows merging of several loops within prologue and epilogue.
>
> Full regression testing will take some time, but a quick browse
> suggests no change in generated code.
>
>
> r~
>
>
> * config/alpha/alpha.c (direct_return): Move down after
> struct machine_function definition; use saved frame_size;
> return bool.
> (struct machine_function): Add sa_mask, sa_size, frame_size.
> (alpha_sa_mask, alpha_sa_size, compute_frame_size): Merge into ...
> (alpha_compute_frame_layout): ... new function.
> (TARGET_COMPUTE_FRAME_LAYOUT): New.
> (alpha_initial_elimination_offset): Use saved sa_size.
> (alpha_vms_initial_elimination_offset): Likewise.
> (alpha_vms_can_eliminate): Remove alpha_sa_size call.
> (alpha_expand_prologue): Use saved frame data.  Merge integer
> and fp register save loops.
> (alpha_expand_epilogue): Likewise.
> (alpha_start_function): Use saved frame data.
> * config/alpha/alpha-protos.h (direct_return): Update.
> (alpha_sa_size): Remove.

I have updated, bootstrapped and regression tested this patch on
native alphaev68-linux-gnu and committed it to mainline SVN.

Uros.
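
The IMASK/FMASK merge described in the quoted message can be pictured with a small standalone sketch. This is not the alpha.c code: reg_bit and merge_masks are hypothetical helpers, and std::uint64_t stands in for the 64-bit unsigned HOST_WIDE_INT. It assumes, as the old code did, that hard registers 0-31 are the integer registers and 32-63 the floating-point ones.

```cpp
#include <cstdint>

// Bit for hard register REGNO in a single 64-bit save-area mask
// (hypothetical stand-in for HOST_WIDE_INT_1U << regno).
constexpr std::uint64_t reg_bit(unsigned regno)
{
  return std::uint64_t{1} << regno;
}

// Combine the old pair of 32-bit masks into one 64-bit mask:
// integer registers keep bits 0-31, FP registers move to bits 32-63.
constexpr std::uint64_t merge_masks(std::uint32_t imask, std::uint32_t fmask)
{
  return std::uint64_t{imask} | (std::uint64_t{fmask} << 32);
}

// Old scheme: integer register 9 in imask, FP register 2 (hard reg 34) in fmask.
static_assert(merge_masks(1u << 9, 1u << 2) == (reg_bit(9) | reg_bit(34)),
              "merged mask addresses both register files by hard regno");
```

With a single mask indexed by hard register number, one loop over all registers can replace the separate integer and FP loops.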
diff --git a/gcc/config/alpha/alpha-protos.h b/gcc/config/alpha/alpha-protos.h
index 07c970760eb5..7d340927c58d 100644
--- a/gcc/config/alpha/alpha-protos.h
+++ b/gcc/config/alpha/alpha-protos.h
@@ -21,9 +21,8 @@ extern int alpha_next_sequence_number;
 
 extern void literal_section (void);
 extern int zap_mask (HOST_WIDE_INT);
-extern int direct_return (void);
+extern bool direct_return (void);
 
-extern int alpha_sa_size (void);
 extern HOST_WIDE_INT alpha_initial_elimination_offset (unsigned int,
   unsigned int);
 extern void alpha_expand_prologue (void);
diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 2e1de397e937..db17f7c06e25 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -731,19 +731,6 @@ alpha_vector_mode_supported_p (machine_mode mode)
   return mode == V8QImode || mode == V4HImode || mode == V2SImode;
 }
 
-/* Return 1 if this function can directly return via $26.  */
-
-int
-direct_return (void)
-{
-  return (TARGET_ABI_OSF
- && reload_completed
- && alpha_sa_size () == 0
- && get_frame_size () == 0
- && crtl->outgoing_args_size == 0
- && crtl->args.pretend_args_size == 0);
-}
-
 /* Return the TLS model to use for SYMBOL.  */
 
 static enum tls_model
@@ -4840,6 +4827,10 @@ struct GTY(()) alpha_links;
 
 struct GTY(()) machine_function
 {
+  unsigned HOST_WIDE_INT sa_mask;
+  HOST_WIDE_INT sa_size;
+  HOST_WIDE_INT frame_size;
+
   /* For flag_reorder_blocks_and_partition.  */
   rtx gp_save_rtx;
 
@@ -7271,83 +7262,59 @@ static int vms_save_fp_regno;
 /* Register number used to reference objects off our PV.  */
 static int vms_base_regno;
 
-/* Compute register masks for saved registers.  */
-
+/* Compute register masks for saved registers, register save area size,
+   and total frame size.  */
 static void
-alpha_sa_mask (unsigned long *imaskP, unsigned long *fmaskP)
+alpha_compute_frame_layout (void)
 {
-  unsigned long imask = 0;
-  unsigned long fmask = 0;
-  unsigned int i;
+  unsigned HOST_WIDE_INT sa_mask = 0;
+  HOST_WIDE_INT frame_size;
+  int sa_size;
 
   /* When outputting a thunk, we don't have valid register life info,
  but assemble_start_function wants to output .frame and .mask
  directives.  */
-  if (cfun->is_thunk)
+  if (!cfun->is_thunk)
 {
-  *imaskP = 0;
-  *fmaskP = 0;
-  return;
-}
-
-  if (TARGET_ABI_OPEN_VMS && alpha_procedure_type == PT_STACK)
-imask |= (1UL << HARD_FRAME_POINTER_REGNUM);
+  if (TARGET_ABI_OPEN_VMS && alpha_procedure_type == PT_STACK)
+   sa_mask |= HOST_WIDE_INT_1U << HARD_FRAME_POINTER_REGNUM;
 
-  /* One for every register we have to save.  */
-  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-if (! fixed_regs[i] && ! call_used_regs[i]
-   && df_regs_ever_live_p (i) && i != REG_RA)
-  {
-   if (i < 32)
- imask |= (1UL << i);
-   else
- fmask |= (1UL << (i - 32));
-  }
+  /* One for every register we have to save.  */
+  for (unsigned i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+   if (! fixed_regs[i] && ! call_used_regs[i]
+   && df_regs_ever_live_p (i) && i != REG_RA)
+ sa_mask |= HOST_WIDE_INT_1U << i;
 
-  /* We need to restore these for the handler.  */
-  if (crtl->calls_eh_return)
-{
-  for (i = 0; ; ++i)
+  /* We need to restore these for the handler.  */
+  if (crtl->calls_eh_return)
{
- unsigned regno = EH_RETURN_DATA_REGNO (i);
- if (regno == INVALID_REGNUM)
-   break;
- imask |= 1UL << regno;
+ for (unsigned i = 0; ; ++i)
+   {
+ unsigned regno = EH_RETURN_DATA_REGNO (i);
+ if (regno ==

Re: [C++ Patch] Use declarator->id_loc in three additional places

2019-06-05 Thread Paolo Carlini


Hi,

On 05/06/19 19:45, Jason Merrill wrote:

On 6/4/19 11:57 AM, Paolo Carlini wrote:

Hi,

On 04/06/19 16:50, Jason Merrill wrote:

On 6/4/19 10:31 AM, Paolo Carlini wrote:

+  permerror (loc, "member functions are implicitly "
+ "friends of their class");


Wouldn't it be better to use the location of "friend" in this 
diagnostic?


Yes, however doing that fully correctly seems a bit tricky


Why tricky?  Doesn't declspecs->locations[ds_friend] work?


To be honest, here I wasn't considering ds_friend at all. Indeed it 
gives us the location of 'friend', but, say, for a testcase like 
parse/friend4.C:


struct A
{
  friend void A::foo();  // { dg-error "implicitly friends" }
  friend A::~A();    // { dg-error "implicitly friends" }
};

I thought we wanted a precise caret under the 'f' of 'foo' and the '~' 
of the destructor - both clang and icc do that - I wasn't even 
considering pointing at 'friend'. If you think that would be an 
improvement wrt the current closed parenthesis - I agree it would! - I 
can do that!


Paolo.

Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-06-05 Thread Jason Merrill


On 6/3/19 6:33 PM, Joseph Myers wrote:

On Sun, 2 Jun 2019, Segher Boessenkool wrote:


Git has an identity (well, two) _per commit_, and there is no way you can
reconstruct people's preferred name and email address (at any point in time,
for every commit separately) correctly.  IMO it is much better to not even
try.  We already *have* enough info for anyone to trivially look up who wrote
what, and what might be that person's email address at the time.  But
pretending that is more than a guess is just wrong.


I think not doing a best-effort identification (name+email) is just as


And I think guessing is not a "best effort", but just wrong.


It's 100% accurate about the identity of the person who was the committer
(modulo the one username from the gcc2 period where it was clear who the
author of the commits by that username was, and so that went in the author
map, but not clear that was the same as the committer, who did not commit
patches for any other author).  So it's as accurate as any case where
someone committing natively in git for someone else failed to use --author
(and if the CVS/SVN commit included a ChangeLog entry, we have credit
given from there via the "changelogs" feature).

I think failing to credit (by name and email address) the person implied
by the commit metadata, in the absence of positive evidence (such as a
ChangeLog entry) for the change being authored by someone else, is just
wrong, in the same way it's wrong not to use --author when committing for
someone else in git.


It's wrong, but it's not importantly wrong.  If we're doing a 
reposurgeon conversion, this adjustment makes sense.  If we're starting 
from the git-svn mirror, it doesn't justify breaking everyone's copies 
by rewriting branches.  And the bird in the hand looks more and more 
appealing as time goes by.



Where a person used different names over time, there's no generally
applicable rule for whether they'd prefer the latest version or the
version used at the time to be used in reference to past commits, and I
think using the most current version known is most appropriate, in the
absence of a ChangeLog entry added in the commit, unless they've specified
a preference for some other rule for which commits get what name.
Likewise for email addresses.


For email addresses, I think that using @gcc.gnu.org would be the best 
approach for people that have such accounts, rather than an employer 
address from an arbitrary point in time.


Jason

Re: [C++ Patch] Improve check_special_function_return_type locations

2019-06-05 Thread Jason Merrill


On 6/5/19 11:06 AM, Paolo Carlini wrote:

Hi,

here certainly we can do better than using input_location. In principle 
we could also pass the location of the entity (constructor, destructor, 
etc) itself or something else but I think it makes a lot of sense to 
simply include locations[ds_type_spec] in the computation, seems 
consistent with the existing case of spurious qualifiers (ICC does 
something similar AFAICS). Tested x86_64-linux.


OK.

Jason

Re: [C++ Patch] Use declarator->id_loc in three additional places

2019-06-05 Thread Jason Merrill


On 6/4/19 11:57 AM, Paolo Carlini wrote:

Hi,

On 04/06/19 16:50, Jason Merrill wrote:

On 6/4/19 10:31 AM, Paolo Carlini wrote:

+  permerror (loc, "member functions are implicitly "
+ "friends of their class");


Wouldn't it be better to use the location of "friend" in this diagnostic?


Yes, however doing that fully correctly seems a bit tricky


Why tricky?  Doesn't declspecs->locations[ds_friend] work?

Jason

Re: PR C++/63149

2019-06-05 Thread Nina Dinka Ranns

Ack. Amended change log is below. Changes are :
* changed C++ -> c++
* fixed the name of added test

There are no changes in the diff, but I attached it to this e-mail for
reference.

Thanks,
Nina

2019-06-04  Nina Dinka Ranns  
gcc/cp

PR c++/63149
* pt.c (listify_autos): Use non cv qualified auto_node in
std::initializer_list.

testsuite/

PR c++/63149
* g++.dg/cpp0x/initlist-deduce2.C: New test.
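
The C++ -> c++ change matters to tooling that extracts PR references from ChangeLog entries. A minimal sketch of such a check follows; is_canonical_pr is a hypothetical helper written for illustration, not one of the scripts mentioned in this thread, and the accepted component spelling (lowercase letters, '+' and '-') is an assumption.

```cpp
#include <regex>
#include <string>

// Hypothetical checker: accepts "PR component/number" only when the
// component is present and written in lowercase, as in "PR c++/63149".
inline bool is_canonical_pr(const std::string& line)
{
  static const std::regex pr_re(R"(PR [a-z+-]+/[0-9]+)");
  return std::regex_search(line, pr_re);
}
```

Under these assumptions, "PR c++/63149" is accepted, while "PR C++/63149" and a bare "PR 63149" (no component part) are rejected and would need fixing by hand.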





On Wed, 5 Jun 2019 at 13:59, Marek Polacek  wrote:
>
> On Wed, Jun 05, 2019 at 02:50:54PM +0200, Jakub Jelinek wrote:
> > On Wed, Jun 05, 2019 at 08:39:56AM -0400, Marek Polacek wrote:
> > > On Wed, Jun 05, 2019 at 10:34:05AM +0100, Nina Dinka Ranns wrote:
> > > > > PR C++/63149
> > > > > * pt.c (listify_autos): Use non cv qualified auto_node in
> > > > > std::initializer_list.
> > > > >
> > > > > testsuite/
> > > > >
> > > > > PR C++/63149
> > >
> > > "c++" instead of "C++", thought I don't think anyone would mind.
> >
> > I would, I have scripts that grab the PR strings from ChangeLog entries
> > and need to fix stuff by hand if it is incorrect like this (or if people
> > forget to use the component/ part altogether).
>
> Fair enough.  Nina, please adjust that too, then.
>
> Marek
Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c	(revision 271709)
+++ gcc/cp/pt.c	(working copy)
@@ -26836,7 +26836,7 @@
 static tree
 listify_autos (tree type, tree auto_node)
 {
-  tree init_auto = listify (auto_node);
+  tree init_auto = listify (strip_top_quals (auto_node));
   tree argvec = make_tree_vec (1);
   TREE_VEC_ELT (argvec, 0) = init_auto;
   if (processing_template_decl)
Index: gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C
===
--- gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C	(nonexistent)
+++ gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C	(working copy)
@@ -0,0 +1,8 @@
+// Test for PR63149
+// { dg-do compile { target c++11 } }
+
+#include <initializer_list>
+
+const auto r = { 1, 2, 3 };
+using X = decltype(r);
+using X = const std::initializer_list<int>;

Re: Review Hashtable extract node API

2019-06-05 Thread Jonathan Wakely


On 05/06/19 17:22 +0100, Jonathan Wakely wrote:

On 04/06/19 19:19 +0200, François Dumont wrote:

@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __node_base*
 _M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no rehash (assumes
-  // no element with its key already present). Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket __bkt if no
+  // rehash (assumes no element with its key already present).
+  template<typename _NodeAccessor>
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-   __node_type* __n, size_type __n_elt = 1);
+   _M_insert_unique_node(const key_type& __k, size_type __bkt,
+ __hash_code __code, const _NodeAccessor&,
+ size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template<typename _NodeAccessor>
iterator
-  _M_insert_multi_node(__node_type* __hint,
-  __hash_code __code, __node_type* __n);
+   _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.


Also I didn't really like the name NodeAccessor. It's not an accessor,
because it performs ownership transfer. Invoking __node_accessor()
returns a __node_type* by releasing it from the previous owner (by
setting the owner's pointer member to null).

Passing a const reference to something called NodeAccessor does not
make it clear that it performs a mutating operation like that! If the
_M_insert_unique_node and _M_insert_multi_node functions did the
__node_accessor() call *before* rehashing, and rehashing threw an
exception, then they would leak. So it's important that the
__node_accessor() call happens at the right time, and so it's important
to name it well.

In my suggested patch the naming isn't misleading, because we just
pass a raw __node_type* and have a new comment saying:

 // Takes ownership of __n if insertion succeeds, throws otherwise.

The function doesn't have a callable with non-local effects that
modifies an object outside the function. Because the caller sets the
previous owner's pointer to null there's no danger of it happening at
the wrong time; it can only happen after the function has returned and
ownership transfer has completed.
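
The ownership rule described above can be sketched with a toy container. This is only an illustration under stated assumptions: Table, Node, insert_node and reinsert are hypothetical names, not libstdc++ internals, and fail_rehash merely simulates a rehash that throws.

```cpp
#include <memory>
#include <stdexcept>
#include <vector>

struct Node { int value; };

// Toy container standing in for the hashtable (hypothetical, not libstdc++).
struct Table
{
  std::vector<std::unique_ptr<Node>> nodes;
  bool fail_rehash = false;  // simulates a rehash that throws

  // Takes ownership of n if insertion succeeds, throws otherwise.
  Node* insert_node(Node* n)
  {
    if (fail_rehash)
      throw std::runtime_error("rehash failed");  // n is still the caller's
    nodes.reserve(nodes.size() + 1);  // may throw; ownership not yet taken
    nodes.emplace_back(n);            // no reallocation now: takes ownership
    return n;
  }
};

// Caller side: give up ownership only after insert_node has returned.
inline bool reinsert(Table& t, std::unique_ptr<Node>& owner)
{
  try
    {
      t.insert_node(owner.get());
      owner.release();  // transfer completed, so clearing the owner is safe
      return true;
    }
  catch (const std::runtime_error&)
    {
      return false;     // owner still holds the node: nothing leaked
    }
}
```

Because the previous owner is cleared only after the call returns, there is no point at which a throwing rehash can leave the node without an owner.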

[PATCH] rs6000: Fix new testcase

2019-06-05 Thread Segher Boessenkool

At least with -m32 you need -maltivec if you #include <altivec.h>.
Tested on powerpc64-linux {-m32,-m64}; committing to trunk.


Segher


2019-06-05  Segher Boessenkool  

gcc/testsuite/
* g++.target/powerpc/undef-bool-3.C: Add -maltivec to dg-options.

---
 gcc/testsuite/g++.target/powerpc/undef-bool-3.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.target/powerpc/undef-bool-3.C 
b/gcc/testsuite/g++.target/powerpc/undef-bool-3.C
index 27f3da5..b4e18bd 100644
--- a/gcc/testsuite/g++.target/powerpc/undef-bool-3.C
+++ b/gcc/testsuite/g++.target/powerpc/undef-bool-3.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -std=c++11" } */
+/* { dg-options "-O2 -std=c++11 -maltivec" } */
 
 /* Test to ensure that "bool" is not #define'd in altivec.h for C++ when
we require strict ANSI.  We should compile without errors.  */
-- 
1.8.3.1

Re: Review Hashtable extract node API

2019-06-05 Thread Jonathan Wakely


On 04/06/19 19:19 +0200, François Dumont wrote:

Hi

    Here is a patch to enhance the _Hashtable extract node API and fix 
a FIXME request.


    The enhancement to the extract node Api is that extract(const 
key_type&) does not call extract(const_iterator) anymore. Previously we 
had to loop through the bucket nodes a second time to find the node 
preceding the one to extract. Even though a bucket should not contain 
many nodes (in unique key mode), the extra loop is easy to avoid.


Nice.

    To fix the FIXME I introduced a node smart pointer type managing 
the node lifetime. The node is extracted from this smart pointer only 
when there can't be any exception raised. In the context of the node 
extract api the node handle is considered as a smart pointer. So the 
node handle will remain owner of the node in case of exception when 
reinserting it, I hope it is the expected behavior.


Yes, that's right.

I was going to suggest just using the node handle type instead of
inventing a new smart pointer, but the handle type uses std::optional
so isn't available for C++11/14.



    * include/bits/hashtable_policy.h
    (struct _NodeSmartPointer<_NodeAlloc>): New.
    (_Map_base<>::operator[](const key_type&)): Use latter, adapt.
    (_Map_base<>::operator[](key_type&&)): Likewise.
    * include/bits/hashtable.h
    (_Hashtable<>::__node_sp_t): New.
    (_Hashtable<>::_M_insert_unique_node(size_type, __hash_code,
    __node_type*, size_type)): Replace by...
(_Hashtable<>::_M_insert_unique_node<_NodeAccessor>(const key_type&,
    size_type, __hash_code, const _NodeAccessor&, size_type)): ...that.
    (_Hashtable<>::_M_insert_multi_node(__node_type*, __hash_code,
    __node_type*)): Replace by...
(_Hashtable<>::_M_insert_multi_node<_NodeAccessor>(__node_type*,
    __hash_code, const _NodeAccessor&)): ...that.
    (_Hashtable<>::_M_reinsert_node): Adapt.
    (_Hashtable<>::_M_reinsert_node_multi): Adapt.
    (_Hashtable<>::_M_extract_node(size_t, __node_base*)): New.
    (_Hashtable<>::extract(const_iterator)): Use latter.
    (_Hashtable<>::extract(const _Key&)): Likewise.
    (_Hashtable<>::_M_merge_unique): Adapt.
    (_Hashtable<>::_M_emplace<_Args>(true_type, _Args&&...)): Adapt.
    (_Hashtable<>::_M_emplace<_Args>(const_iterator, false_type,
    _Args&&...)): Adapt.

Tested under Linux x86_64.

Ok to commit ?

François




diff --git a/libstdc++-v3/include/bits/hashtable.h 
b/libstdc++-v3/include/bits/hashtable.h
index e2e3f016a35..307865b96bf 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -197,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  using __hash_cached = typename __traits_type::__hash_cached;
  using __node_type = __detail::_Hash_node<_Value, __hash_cached::value>;
  using __node_alloc_type = __alloc_rebind<_Alloc, __node_type>;
+  using __node_sp_t = __detail::_NodeSmartPointer<__node_alloc_type>;

  using __hashtable_alloc = __detail::_Hashtable_alloc<__node_alloc_type>;

@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __node_base*
  _M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no rehash (assumes
-  // no element with its key already present). Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket __bkt if no
+  // rehash (assumes no element with its key already present).
+  template<typename _NodeAccessor>
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-   __node_type* __n, size_type __n_elt = 1);
+   _M_insert_unique_node(const key_type& __k, size_type __bkt,
+ __hash_code __code, const _NodeAccessor&,
+ size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template<typename _NodeAccessor>
iterator
-  _M_insert_multi_node(__node_type* __hint,
-  __hash_code __code, __node_type* __n);
+   _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.
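
The point about lambda types can be checked with a few lines of standard C++. In this sketch insert_with is a hypothetical function template that plays the role of a function taking a _NodeAccessor; it is not the libstdc++ code.

```cpp
#include <type_traits>

// Hypothetical function template taking a callable, like the proposed
// _NodeAccessor parameter.
template<typename NodeAccessor>
int insert_with(const NodeAccessor& get_node)
{
  return get_node();
}

inline int demo()
{
  auto a = [] { return 1; };
  auto b = [] { return 1; };  // token-for-token identical to a

  // Each lambda expression has its own closure type...
  static_assert(!std::is_same<decltype(a), decltype(b)>::value,
                "identical lambdas still have distinct types");

  // ...so these two calls use two separate instantiations of insert_with.
  return insert_with(a) + insert_with(b);
}
```

Passing a plain pointer instead keeps a single, non-template function regardless of how many call sites there are.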

The attached patch still just passes in a node pointer (which the
function takes ownership of, unless it throws). Because the semantics
of _M_insert_multi_node change, we need the symbol name to change as
well (so old code linking to

[C++ Patch] Improve check_special_function_return_type locations

2019-06-05 Thread Paolo Carlini


Hi,

here certainly we can do better than using input_location. In principle 
we could also pass the location of the entity (constructor, destructor, 
etc) itself or something else but I think it makes a lot of sense to 
simply include locations[ds_type_spec] in the computation, seems 
consistent with the existing case of spurious qualifiers (ICC does 
something similar AFAICS). Tested x86_64-linux.
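
The location computation can be pictured with a toy model. It is a simplified sketch, not the decl.c code: loc_t, unknown_loc, min_loc and smallest_type_loc are hypothetical names, and locations are modeled as plain column numbers with 0 standing in for UNKNOWN_LOCATION, whereas real GCC compares location_t values through the line map.

```cpp
#include <initializer_list>

using loc_t = unsigned;              // toy stand-in for location_t
constexpr loc_t unknown_loc = 0;     // toy stand-in for UNKNOWN_LOCATION

// Smaller of two locations, ignoring unknown ones.
inline loc_t min_loc(loc_t a, loc_t b)
{
  if (a == unknown_loc) return b;
  if (b == unknown_loc) return a;
  return a < b ? a : b;
}

// Earliest known location among the qualifiers and the type specifier.
inline loc_t smallest_type_loc(std::initializer_list<loc_t> qual_locs,
                               loc_t type_spec_loc)
{
  loc_t loc = unknown_loc;
  for (loc_t q : qual_locs)
    loc = min_loc(loc, q);
  return min_loc(loc, type_spec_loc);
}
```

For a declaration such as "  const int S2();", with const at column 3 and int at column 9, the model picks column 3, which is the column the new test expects for the "return type" error.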


Thanks, Paolo.

//

/cp
2019-06-05  Paolo Carlini  

* decl.c (smallest_type_location): Add.
(check_special_function_return_type): Use it.
(grokdeclarator): Likewise.

/testsuite
2019-06-05  Paolo Carlini  

* g++.dg/diagnostic/return-type-invalid-1.C: New.
* g++.old-deja/g++.brendan/crash16.C: Adjust.
* g++.old-deja/g++.law/ctors5.C: Likewise.
Index: cp/decl.c
===
--- cp/decl.c   (revision 271949)
+++ cp/decl.c   (working copy)
@@ -10111,6 +10111,15 @@ smallest_type_quals_location (int type_quals, cons
   return loc;
 }
 
+/* Returns the smallest among the latter and locations[ds_type_spec].  */
+
+static location_t
+smallest_type_location (int type_quals, const location_t* locations)
+{
+  location_t loc = smallest_type_quals_location (type_quals, locations);
+  return min_location (loc, locations[ds_type_spec]);
+}
+
 /* Check that it's OK to declare a function with the indicated TYPE
and TYPE_QUALS.  SFK indicates the kind of special function (if any)
that this function is.  OPTYPE is the type given in a conversion
@@ -10129,7 +10138,8 @@ check_special_function_return_type (special_functi
 {
 case sfk_constructor:
   if (type)
-   error ("return type specification for constructor invalid");
+   error_at (smallest_type_location (type_quals, locations),
+ "return type specification for constructor invalid");
   else if (type_quals != TYPE_UNQUALIFIED)
error_at (smallest_type_quals_location (type_quals, locations),
  "qualifiers are not allowed on constructor declaration");
@@ -10142,7 +10152,8 @@ check_special_function_return_type (special_functi
 
 case sfk_destructor:
   if (type)
-   error ("return type specification for destructor invalid");
+   error_at (smallest_type_location (type_quals, locations),
+ "return type specification for destructor invalid");
   else if (type_quals != TYPE_UNQUALIFIED)
error_at (smallest_type_quals_location (type_quals, locations),
  "qualifiers are not allowed on destructor declaration");
@@ -10157,7 +10168,8 @@ check_special_function_return_type (special_functi
 
 case sfk_conversion:
   if (type)
-   error ("return type specified for %<operator %T%>", optype);
+   error_at (smallest_type_location (type_quals, locations),
+ "return type specified for %<operator %T%>", optype);
   else if (type_quals != TYPE_UNQUALIFIED)
error_at (smallest_type_quals_location (type_quals, locations),
  "qualifiers are not allowed on declaration of "
@@ -10168,7 +10180,8 @@ check_special_function_return_type (special_functi
 
 case sfk_deduction_guide:
   if (type)
-   error ("return type specified for deduction guide");
+   error_at (smallest_type_location (type_quals, locations),
+ "return type specified for deduction guide");
   else if (type_quals != TYPE_UNQUALIFIED)
error_at (smallest_type_quals_location (type_quals, locations),
  "qualifiers are not allowed on declaration of "
@@ -10438,10 +10451,8 @@ grokdeclarator (const cp_declarator *declarator,
   if (initialized > 1)
 funcdef_flag = true;
 
-  location_t typespec_loc = smallest_type_quals_location (type_quals,
- declspecs->locations);
-  typespec_loc = min_location (typespec_loc,
-  declspecs->locations[ds_type_spec]);
+  location_t typespec_loc = smallest_type_location (type_quals,
+   declspecs->locations);
   if (typespec_loc == UNKNOWN_LOCATION)
 typespec_loc = input_location;
 
Index: testsuite/g++.dg/diagnostic/return-type-invalid-1.C
===
--- testsuite/g++.dg/diagnostic/return-type-invalid-1.C (nonexistent)
+++ testsuite/g++.dg/diagnostic/return-type-invalid-1.C (working copy)
@@ -0,0 +1,27 @@
+struct S1
+{
+  int S1();  // { dg-error "3:return type" }
+  int ~S1();  // { dg-error "3:return type" }
+  int operator int();  // { dg-error "3:return type" }
+};
+
+struct S2
+{
+  const int S2();  // { dg-error "3:return type" }
+  const int ~S2();  // { dg-error "3:return type" }
+  const int operator int();  // { dg-error "3:return type" }
+};
+
+struct S3
+{
+  volatile int S3();  // { dg-error "3:return type" }
+  volatile int ~S3();  // { dg-error "3:return type" }
+

[PATCH] AIX unwind register number fixes

2019-06-05 Thread David Edelsohn

The recent changes to rs6000 "debug" register numbers created a
conflict in libgcc aix unwind support.  aix-unwind.h provided
definitions of the DWARF register numbers for its own reference.
After the internal register number changes and the definition of those
register names in rs6000.md, the register names appear in
insn-constants.h, which is included by libgcc.  The macro definitions
in aix-unwind.h now conflict with the versions derived from rs6000.md.

This patch changes the aix-unwind.h macro names from XX_REGNO to R_XX,
following the convention of the other unwind files in libgcc to avoid
conflict.

Bootstrapped on powerpc-ibm-aix7.2.0.0.

Thanks, David

* config/rs6000/aix-unwind.h (LR_REGNO): Rename to R_LR.
(CR2_REGNO): Rename to R_CR2.
(XER_REGNO): Rename to R_XER.
(FIRST_ALTIVEC_REGNO): Rename to R_FIRST_ALTIVEC.
(VRSAVE_REGNO): Rename to R_VRSAVE.
(VSCR_REGNO): Rename to R_VSCR.

Index: aix-unwind.h
===
--- aix-unwind.h	(revision 271883)
+++ aix-unwind.h	(working copy)
@@ -24,12 +24,12 @@

 /* Useful register numbers.  */

-#define LR_REGNO 65
-#define CR2_REGNO 70
-#define XER_REGNO 76
-#define FIRST_ALTIVEC_REGNO 77
-#define VRSAVE_REGNO 109
-#define VSCR_REGNO 110
+#define R_LR 65
+#define R_CR2 70
+#define R_XER 76
+#define R_FIRST_ALTIVEC 77
+#define R_VRSAVE 109
+#define R_VSCR 110

 /* If the current unwind info (FS) does not contain explicit info
saving R2, then we have to do a minor amount of code reading to
@@ -44,7 +44,7 @@
   { \
unsigned int *insn  \
  = (unsigned int *)\
-   _Unwind_GetGR ((CTX), LR_REGNO);\
+   _Unwind_GetGR ((CTX), R_LR);\
if (*insn == 0xE8410028)\
  _Unwind_SetGRPtr ((CTX), 2, (CTX)->cfa + 40); \
   } \
@@ -56,7 +56,7 @@
   { \
unsigned int *insn  \
  = (unsigned int *)\
-   _Unwind_GetGR ((CTX), LR_REGNO);\
+   _Unwind_GetGR ((CTX), R_LR);\
if (*insn == 0x80410014)\
  _Unwind_SetGRPtr ((CTX), 2, (CTX)->cfa + 20); \
   }\
@@ -241,9 +241,9 @@ ppc_aix_fallback_frame_state (struct _Unwind_Conte
 if (i != __LIBGCC_STACK_POINTER_REGNUM__)
   REGISTER_CFA_OFFSET_FOR (fs, i, >gpr[i], new_cfa);

-  REGISTER_CFA_OFFSET_FOR (fs, CR2_REGNO, >cr, new_cfa);
-  REGISTER_CFA_OFFSET_FOR (fs, XER_REGNO, >xer, new_cfa);
-  REGISTER_CFA_OFFSET_FOR (fs, LR_REGNO, >lr, new_cfa);
+  REGISTER_CFA_OFFSET_FOR (fs, R_CR2, >cr, new_cfa);
+  REGISTER_CFA_OFFSET_FOR (fs, R_XER, >xer, new_cfa);
+  REGISTER_CFA_OFFSET_FOR (fs, R_LR, >lr, new_cfa);

   fs->retaddr_column = RETURN_COLUMN;
   REGISTER_CFA_OFFSET_FOR (fs, RETURN_COLUMN, >iar, new_cfa);
@@ -268,10 +268,10 @@ ppc_aix_fallback_frame_state (struct _Unwind_Conte

  for (i = 0; i < 32; i++)
REGISTER_CFA_OFFSET_FOR
-   (fs, i+FIRST_ALTIVEC_REGNO, >regs[i], new_cfa);
+   (fs, i+R_FIRST_ALTIVEC, >regs[i], new_cfa);

- REGISTER_CFA_OFFSET_FOR (fs, VSCR_REGNO, >vscr, new_cfa);
- REGISTER_CFA_OFFSET_FOR (fs, VRSAVE_REGNO, >vrsave, new_cfa);
+ REGISTER_CFA_OFFSET_FOR (fs, R_VSCR, >vscr, new_cfa);
+ REGISTER_CFA_OFFSET_FOR (fs, R_VRSAVE, >vrsave, new_cfa);
}
 }

Re: [0/3] Improve debug info for addressable vars

2019-06-05 Thread Richard Sandiford

Richard Biener  writes:
> On Sat, Jun 1, 2019 at 5:49 PM Richard Sandiford
>  wrote:
>>
>> Taking the address of a variable stops us doing var-tracking on it,
>> so that we just use the DECL_RTL instead.  This can easily cause wrong
>> debug info for regions of code that would have had correct debug info
>> if the variable weren't addressable.  E.g.:
>>
>> {
>>   int base;
>>   get_start ();
>>   x[i1] = base;
>>   base += 1; // No need to store this
>>   x[i2] = base; // ...so the debug info for "base" is wrong here
>> }
>>
>> or (the motivating example):
>>
>> {
>>   int base;
>>   get_start ();
>>   for (int i = 0; i < n; ++i)
>> {
>>   x[i] = base;
>>   base += y[i]; // Can apply LSM here, so the debug info for "base"
>> // in the loop is wrong
>> }
>>   consume ();
>> }
>>
>> This patch series lets us use the DECL_RTL location for some parts of a
>> variable's lifetime and debug-bind locations for other parts:
>>
>> 1) Gimple uses "VAR s=> VAR" to bind VAR to its DECL_RTL.  The binding
>>holds until overridden.
>>
>> 2) RTL does the same thing using:
>>
>>  (var_location VAR (decl_rtl_ref VAR))
>>
>>where DECL_RTL_REF is a new rtx code that captures the DECL_RTL
>>by reference rather than by value.
>>
>>We can't just use "(var_location VAR (mem X))" for this, because
>>that would bind VAR to the value that (mem X) has at that exact point.
>>VAR would therefore get reset by any possible change to (mem X),
>>whereas here we want it to track (possibly indirect) updates instead.
>>
>> 3) The gimplifier decides which variables should get the new treatment
>>and emits "VAR s=> VAR" to mark the start of VAR's lifetime.
>>Clobbers continue to mark the end of VAR's lifetime.
>>
>> 4) Stores to VAR implicitly reestablish the link between VAR and its
>>DECL_RTL.  This is simpler (and IMO more robust) than inserting an
>>explicit "VAR s=> VAR" at every write.
>>
>> 5) gsi_remove tries to insert "VAR => X" in place of a deleted "VAR = X",
>>falling back to a "VAR => NULL" reset if that fails.
>>
>> Patch 1 handles the new rtl code, patch 2 adds the gimple framework,
>> and patch 3 uses it for LSM.
>
> So I wonder how it handles
>
> void __attribute__((noinline)) foo(int *p) { *p = 42; }
> int x;
> int main()
> {
>   int base = 1;
>   foo ();
>   base = 2;
>   *(x ?  : ) = 1; // (*)
>   return 0;
> }
>
> here we DSE the base = 2 store leaving a
>
> # DEBUG base = 2
>
> stmt?  But there's an indirect store that also stores
> to base - what will the debug info say at/after (*)?  Will
> it claim that base is 2?  At least I do not see that
> the connection with base's DECL_RTL is re-established?

Yeah, true.

> There's a clobber of base before return 0 so you eventually
> have to add some dummy stmt you can print base after
> the indirect store.
>
> That said, doesn't "aliasing" create another source of wrong-debug
> with your approach that might be even worse?

Not sure about even worse, but maybe different.  In the example above
the patches fix the debug info after "base = 2" but break it after the
following statement.

But there's no real need for the compiler to store to base in (*) either.
We could end up with "if (...) x = 1;" instead.  So AFAICT there's no
guarantee that we'll get correct debug info at the return statement even
as things stand.

For memory variables, I think we're always at the mercy of dead stores
being optimised away, and the patch isn't trying to fix that.  Since
both writes to base are dead in the above, I don't think we can guarantee
correct debug info without compromising optimisation for the sake of
debuggability.  (FWIW, I have a WIP patch to add an option for that,
hope to post an RFC soon.)

I can't think of a case in which the patches introduce wrong debug
info for code that isn't dead.

Thanks,
Richard

>
> Otherwise the patches look reasonable.
>
> Richard.
>
>> Bootstrapped & regtested on aarch64-linux-gnu and x86_64-linux-gnu.
>> OK to install?
>>
>> Richard

Re: Ping: [PATCH] PR88395 Fix Nullptr when compiling with -fconcepts

2019-06-05 Thread Richard Sandiford

Thanks for the patch and sorry that there was no response.
I've added the C++ maintainers to cc:

nick  writes:
> I'm pinging this patch as it's old now and should be applied to fix the bug.
>
> Nick
>
> On 2019-04-08 7:20 p.m., Nicholas Krause wrote:
>> This fixes the caller in tsubst_requires_expr to
>> tsubst_constraint_variables to wrap their respective
>> trees in PARM_CONSTR_PARMS. This is to get the correct
>> parameter constraints from the tree before calling
>> tsubst_constraint_variables like other callers
>> in constraint.cc and to fix the bug id, 88395 on
>> the gcc bugzilla. OK for merge?
>> 
>> Signed-off-by: Nicholas Krause 
>> ---
>>  gcc/cp/constraint.cc | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
>> index 9884eb0db50..a78d0a9a49b 100644
>> --- a/gcc/cp/constraint.cc
>> +++ b/gcc/cp/constraint.cc
>> @@ -1882,7 +1882,7 @@ tsubst_requires_expr (tree t, tree args,
>>tree parms = TREE_OPERAND (t, 0);
>>if (parms)
>>  {
>> -  parms = tsubst_constraint_variables (parms, args, complain, in_decl);
>> +  parms = tsubst_constraint_variables (PARM_CONSTR_PARMS (parms), args, 
>> complain, in_decl);
>>if (parms == error_mark_node)
>>  return error_mark_node;
>>  }
>>

Re: [PATCH] Improve PTA flow-sensitivity (for the return stmt)

2019-06-05 Thread Richard Biener

On Wed, 5 Jun 2019, Martin Sebor wrote:

> On 6/5/19 6:51 AM, Richard Biener wrote:
> > 
> > The following was inspired by Martin's work on escapes of locals
> > and the discussion there.  It teaches points-to analysis that
> > the point of function return is special and thus escapes through
> > that a) do not influence other points-to solutions, b) can be
> > pruned of all locals.
> > 
> > This is one example of reasonably simple "post-processing".
> > 
> > The effects are small, I've done statistics, counting the number
> > of variables we do not mark escaped only after this patch.  This
> > number is usually zero, sometimes one and a few cases more
> > (but never more than 11) during bootstrap:
> > 
> > 0 95830
> > 1 19268
> > 2 19
> > 3 2
> > 5 2
> > 6 1
> > 8 1
> > 11 1
> > 
> > so not sure if it is worth all the effort.  It does allow us
> > to do more DSE but that requires the accesses to be indirect
> > which is not often true for locals.
> > 
> > Bootstrapped / tested on x86_64-unknown-linux-gnu.
> > 
> > Martin, does this help you at all?  Anybody thinks this is
> > worth the trouble?
> 
> IIUC, it would help with only one aspect of what I'm doing:
> distinguish alloca/VLAs from heap memory.  I don't think it
> would make it any easier to track down the actual allocations.
> In my prototype (using the oracle) I would find the variables
> whose address is being returned by iterating over local DECLs
> and matching those against the vars bitmap.  But unless there's
> a way to include the alloca statements in that traversal I don't
> see how to find those.

Btw, you could gather alloca "bits" up-front in a similar
way fold_builtin_alloca_with_align gets at them.  The
singleton ID can then be used to check points-to solutions.

But of course points-to solutions are conservative, so
a return p; might just point to all locals (in fact after
this patch it won't point to _any_ locals anymore which
probably defeats using the oracle for the purpose of finding
escaped locals).

> That said, I'd say the improved accuracy the patch gives us
> certainly makes it worth keeping.  There might be other
> solutions besides what I'm doing where distinguishing alloca
> from malloc will be important and where tracking down
> the allocations won't be an issue.

OK.

Thanks,
Richard.

> Thanks for working on this!
> 
> Martin
> 
> > 
> > Thanks,
> > Richard.
> > 
> > 2019-06-05  Richard Biener  
> > 
> > * tree-ssa-structalias.c: Include tree-cfg.h.
> > (make_heapvar): Do not make heap vars artificial.
> > (find_func_aliases_for_builtin_call): Handle stack allocation
> > functions.
> > (find_func_aliases): Delay processing of simple enough returns
> > in non-IPA mode.
> > (set_uids_in_ptset): Adjust.
> > (find_what_var_points_to): Likewise.
> > (solve_constraints): Do not dump points-to sets here.
> > (compute_points_to_sets): Post-process return statements,
> > amending the escaped solution.  Dump points-to sets afterwards.
> > (ipa_pta_execute): Dump points-to sets.
> > 
> > * gcc.dg/tree-ssa/alias-37.c: New testcase.
> > * gcc.dg/torture/20190604-1.c: Likewise.
> > * gcc.dg/tree-ssa/pta-callused.c: Adjust.
> > 
> > Index: gcc/tree-ssa-structalias.c
> > ===
> > --- gcc/tree-ssa-structalias.c  (revision 271951)
> > +++ gcc/tree-ssa-structalias.c  (working copy)
> > @@ -43,6 +43,7 @@
> >   #include "stringpool.h"
> >   #include "attribs.h"
> >   #include "tree-ssa.h"
> > +#include "tree-cfg.h"
> > /* The idea behind this analyzer is to generate set constraints from the
> >  program, then solve the resulting constraints in order to generate the
> > @@ -3854,7 +3855,6 @@ make_heapvar (const char *name, bool add
> > DECL_EXTERNAL (heapvar) = 1;
> >   vi = new_var_info (heapvar, name, add_id);
> > -  vi->is_artificial_var = true;
> > vi->is_heap_var = true;
> > vi->is_unknown_size_var = true;
> > vi->offset = 0;
> > @@ -4409,6 +4409,32 @@ find_func_aliases_for_builtin_call (stru
> >   process_constraint (new_constraint (*lhsp, ac));
> >   return true;
> > }
> > +  case BUILT_IN_STACK_SAVE:
> > +  case BUILT_IN_STACK_RESTORE:
> > +/* Nothing interesting happens.  */
> > +return true;
> > +  case BUILT_IN_ALLOCA:
> > +  case BUILT_IN_ALLOCA_WITH_ALIGN:
> > +  case BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX:
> > +   {
> > + tree ptr = gimple_call_lhs (t);
> > + if (ptr == NULL_TREE)
> > +   return true;
> > + get_constraint_for (ptr, );
> > + varinfo_t vi = make_heapvar ("HEAP", true);
> > + /* Alloca storage is never global.  To exempt it from escaped
> > +handling make it a non-heap var.  */
> > + DECL_EXTERNAL (vi->decl) = 0;
> > + vi->is_global_var = 0;
> > + vi->is_heap_var = 0;
> > + struct constraint_expr tmpc;
> > + tmpc.var = vi->id;
> > + tmpc.offset = 0;
> > +

Re: [PATCH 0/4] Store multiple values for single value profilers

2019-06-05 Thread Richard Biener

On Tue, Jun 4, 2019 at 10:44 AM Martin Liska  wrote:
>
> Hi.
>
> It's becoming more common that a training run happens in a parallel environment.
> That can lead to non-reproducible builds caused by a different order of merging
> of .gcda files. So I'm suggesting to store up to 4 values for HIST_TYPE_SINGLE_VALUE
> and HIST_TYPE_INDIR_CALL on disk. If the capacity is exceeded, the whole counter is
> marked as unstable (not reproducible).
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Thanks for working on this, I hope Honza can review and approve it.  Does this
solve the issue of profiledbootstrap results being not reproducible?  (if you
fix genchecksum to not generate different checksums)

I suppose this would also apply to a GCC 9 tree?

Thanks,
Richard.

> Thanks,
> Martin
>
> marxin (4):
>   Remove indirect call top N counter type.
>   Implement N disk counters for single value and indirect call counters.
>   Dump histograms only if present.
>   Update a bit dump format.
>
>  gcc/doc/invoke.texi   |   3 -
>  gcc/gcov-counter.def  |   3 -
>  gcc/gcov-io.h |   9 +-
>  gcc/ipa-profile.c |  13 ++-
>  gcc/params.def|   8 --
>  gcc/profile.c |   1 -
>  gcc/tree-profile.c|  23 +---
>  gcc/value-prof.c  | 224 --
>  gcc/value-prof.h  |   4 +-
>  libgcc/Makefile.in|  10 +-
>  libgcc/libgcov-driver.c   |  80 --
>  libgcc/libgcov-merge.c| 139 +--
>  libgcc/libgcov-profiler.c | 176 ++
>  libgcc/libgcov-util.c |  19 
>  libgcc/libgcov.h  |  12 +-
>  15 files changed, 179 insertions(+), 545 deletions(-)
>
> --
> 2.21.0
>
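
The scheme described in the cover letter above (keep up to 4 values per counter and mark the counter unstable once that capacity is exceeded) can be illustrated with a small sketch. This is not the libgcov code from the series; the struct layout and the names `value_counter`, `counter_add`, `counter_merge` and `counter_lookup` are assumptions made up for this illustration. It only shows why merging becomes independent of the order in which .gcda files are combined: either every distinct value fits and the summed counts are identical, or the counter is flagged unstable in both orders.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOT_COUNT 4  /* "up to 4 values", as stated in the cover letter */

/* Hypothetical counter layout; the names are illustrative, not libgcov's. */
struct value_counter {
    long values[SLOT_COUNT];
    long counts[SLOT_COUNT];
    size_t used;      /* number of occupied slots */
    bool unstable;    /* set once more than SLOT_COUNT distinct values appear */
};

/* Record COUNT occurrences of VALUE, or flag the counter as unstable
   when a new distinct value no longer fits. */
void counter_add(struct value_counter *c, long value, long count)
{
    if (c->unstable)
        return;
    for (size_t i = 0; i < c->used; i++)
        if (c->values[i] == value) {
            c->counts[i] += count;
            return;
        }
    if (c->used == SLOT_COUNT) {
        c->unstable = true;
        return;
    }
    c->values[c->used] = value;
    c->counts[c->used] = count;
    c->used++;
}

/* Merge SRC into DST.  An unstable input makes the result unstable. */
void counter_merge(struct value_counter *dst, const struct value_counter *src)
{
    if (src->unstable)
        dst->unstable = true;
    for (size_t i = 0; i < src->used; i++)
        counter_add(dst, src->values[i], src->counts[i]);
}

/* Return the count stored for VALUE, or 0 if it is not tracked. */
long counter_lookup(const struct value_counter *c, long value)
{
    for (size_t i = 0; i < c->used; i++)
        if (c->values[i] == value)
            return c->counts[i];
    return 0;
}
```

In this sketch a consumer would ignore any counter whose `unstable` flag is set, so the partially merged contents of an overflowed counter never influence the result.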

Re: [PATCH] Improve PTA flow-sensitivity (for the return stmt)

2019-06-05 Thread Martin Sebor


On 6/5/19 6:51 AM, Richard Biener wrote:


The following was inspired by Martin's work on escapes of locals
and the discussion there.  It teaches points-to analysis that
the point of function return is special and thus escapes through
that a) do not influence other points-to solutions, b) can be
pruned of all locals.

This is one example of reasonably simple "post-processing".

The effects are small, I've done statistics, counting the number
of variables we do not mark escaped only after this patch.  This
number is usually zero, sometimes one and a few cases more
(but never more than 11) during bootstrap:

0 95830
1 19268
2 19
3 2
5 2
6 1
8 1
11 1

so not sure if it is worth all the effort.  It does allow us
to do more DSE but that requires the accesses to be indirect
which is not often true for locals.

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Martin, does this help you at all?  Anybody thinks this is
worth the trouble?


IIUC, it would help with only one aspect of what I'm doing:
distinguish alloca/VLAs from heap memory.  I don't think it
would make it any easier to track down the actual allocations.
In my prototype (using the oracle) I would find the variables
whose address is being returned by iterating over local DECLs
and matching those against the vars bitmap.  But unless there's
a way to include the alloca statements in that traversal I don't
see how to find those.

That said, I'd say the improved accuracy the patch gives us
certainly makes it worth keeping.  There might be other
solutions besides what I'm doing where distinguishing alloca
from malloc will be important and where tracking down
the allocations won't be an issue.

Thanks for working on this!

Martin



Thanks,
Richard.

2019-06-05  Richard Biener  

* tree-ssa-structalias.c: Include tree-cfg.h.
(make_heapvar): Do not make heap vars artificial.
(find_func_aliases_for_builtin_call): Handle stack allocation
functions.
(find_func_aliases): Delay processing of simple enough returns
in non-IPA mode.
(set_uids_in_ptset): Adjust.
(find_what_var_points_to): Likewise.
(solve_constraints): Do not dump points-to sets here.
(compute_points_to_sets): Post-process return statements,
amending the escaped solution.  Dump points-to sets afterwards.
(ipa_pta_execute): Dump points-to sets.

* gcc.dg/tree-ssa/alias-37.c: New testcase.
* gcc.dg/torture/20190604-1.c: Likewise.
* gcc.dg/tree-ssa/pta-callused.c: Adjust.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 271951)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -43,6 +43,7 @@
  #include "stringpool.h"
  #include "attribs.h"
  #include "tree-ssa.h"
+#include "tree-cfg.h"
  
  /* The idea behind this analyzer is to generate set constraints from the

 program, then solve the resulting constraints in order to generate the
@@ -3854,7 +3855,6 @@ make_heapvar (const char *name, bool add
DECL_EXTERNAL (heapvar) = 1;
  
vi = new_var_info (heapvar, name, add_id);

-  vi->is_artificial_var = true;
vi->is_heap_var = true;
vi->is_unknown_size_var = true;
vi->offset = 0;
@@ -4409,6 +4409,32 @@ find_func_aliases_for_builtin_call (stru
  process_constraint (new_constraint (*lhsp, ac));
  return true;
}
+  case BUILT_IN_STACK_SAVE:
+  case BUILT_IN_STACK_RESTORE:
+/* Nothing interesting happens.  */
+return true;
+  case BUILT_IN_ALLOCA:
+  case BUILT_IN_ALLOCA_WITH_ALIGN:
+  case BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX:
+   {
+ tree ptr = gimple_call_lhs (t);
+ if (ptr == NULL_TREE)
+   return true;
+ get_constraint_for (ptr, );
+ varinfo_t vi = make_heapvar ("HEAP", true);
+ /* Alloca storage is never global.  To exempt it from escaped
+handling make it a non-heap var.  */
+ DECL_EXTERNAL (vi->decl) = 0;
+ vi->is_global_var = 0;
+ vi->is_heap_var = 0;
+ struct constraint_expr tmpc;
+ tmpc.var = vi->id;
+ tmpc.offset = 0;
+ tmpc.type = ADDRESSOF;
+ rhsc.safe_push (tmpc);
+ process_all_all_constraints (lhsc, rhsc);
+ return true;
+   }
case BUILT_IN_POSIX_MEMALIGN:
  {
  tree ptrptr = gimple_call_arg (t, 0);
@@ -4976,7 +5002,12 @@ find_func_aliases (struct function *fn,
greturn *return_stmt = as_a  (t);
fi = NULL;
if (!in_ipa_mode
- || !(fi = get_vi_for_tree (fn->decl)))
+ && SSA_VAR_P (gimple_return_retval (return_stmt)))
+   {
+ /* We handle simple returns by post-processing the solutions.  */
+ ;
+   }
+  if (!(fi = get_vi_for_tree (fn->decl)))
make_escape_constraint (gimple_return_retval (return_stmt));
else if (in_ipa_mode)
{
@@ -6422,9 +6453,7

Re: [0/3] Improve debug info for addressable vars

2019-06-05 Thread Richard Biener

On Sat, Jun 1, 2019 at 5:49 PM Richard Sandiford
 wrote:
>
> Taking the address of a variable stops us doing var-tracking on it,
> so that we just use the DECL_RTL instead.  This can easily cause wrong
> debug info for regions of code that would have had correct debug info
> if the variable weren't addressable.  E.g.:
>
> {
>   int base;
>   get_start ();
>   x[i1] = base;
>   base += 1; // No need to store this
>   x[i2] = base; // ...so the debug info for "base" is wrong here
> }
>
> or (the motivating example):
>
> {
>   int base;
>   get_start ();
>   for (int i = 0; i < n; ++i)
> {
>   x[i] = base;
>   base += y[i]; // Can apply LSM here, so the debug info for "base"
> // in the loop is wrong
> }
>   consume ();
> }
>
> This patch series lets us use the DECL_RTL location for some parts of a
> variable's lifetime and debug-bind locations for other parts:
>
> 1) Gimple uses "VAR s=> VAR" to bind VAR to its DECL_RTL.  The binding
>holds until overridden.
>
> 2) RTL does the same thing using:
>
>  (var_location VAR (decl_rtl_ref VAR))
>
>where DECL_RTL_REF is a new rtx code that captures the DECL_RTL
>by reference rather than by value.
>
>We can't just use "(var_location VAR (mem X))" for this, because
>that would bind VAR to the value that (mem X) has at that exact point.
>VAR would therefore get reset by any possible change to (mem X),
>whereas here we want it to track (possibly indirect) updates instead.
>
> 3) The gimplifier decides which variables should get the new treatment
>and emits "VAR s=> VAR" to mark the start of VAR's lifetime.
>Clobbers continue to mark the end of VAR's lifetime.
>
> 4) Stores to VAR implicitly reestablish the link between VAR and its
>DECL_RTL.  This is simpler (and IMO more robust) than inserting an
>explicit "VAR s=> VAR" at every write.
>
> 5) gsi_remove tries to insert "VAR => X" in place of a deleted "VAR = X",
>falling back to a "VAR => NULL" reset if that fails.
>
> Patch 1 handles the new rtl code, patch 2 adds the gimple framework,
> and patch 3 uses it for LSM.

So I wonder how it handles

void __attribute__((noinline)) foo(int *p) { *p = 42; }
int x;
int main()
{
  int base = 1;
  foo ();
  base = 2;
  *(x ?  : ) = 1; // (*)
  return 0;
}

here we DSE the base = 2 store leaving a

# DEBUG base = 2

stmt?  But there's an indirect store that also stores
to base - what will the debug info say at/after (*)?  Will
it claim that base is 2?  At least I do not see that
the connection with base's DECL_RTL is re-established?
There's a clobber of base before return 0 so you eventually
have to add some dummy stmt you can print base after
the indirect store.

That said, doesn't "aliasing" create another source of wrong-debug
with your approach that might be even worse?

Otherwise the patches look reasonable.

Richard.

> Bootstrapped & regtested on aarch64-linux-gnu and x86_64-linux-gnu.
> OK to install?
>
> Richard

Re: [PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

2019-06-05 Thread Martin Liška

On 6/5/19 3:04 PM, Richard Biener wrote:
> On Wed, Jun 5, 2019 at 2:09 PM Martin Liška  wrote:
>>
>> On 6/5/19 1:13 PM, Richard Biener wrote:
>>> On Wed, Jun 5, 2019 at 12:56 PM Martin Liška  wrote:

>>>> Hi.
>>>>
>>>> I'm suggesting one multiplication simplification pattern.
>>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>
>>> +  (if (INTEGRAL_TYPE_P (type)
>>> +   && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
>>> +   && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION 
>>> (type
>>>
>>>   && wi::eq_p (wi::bit_or (get_nonzero_bits (@1), get_nonzero_bits  (@2)), 
>>> 1))
>>>
>>> (I think literal 1 still works)?
>>
>> Yep, I can confirm that.
>>
>>> How does it behave for signed/unsigned 1-bit
>>> bitfields?  A gimple testcase may be necessary to see.
>>
>> Can we really have a mult that will have a bitfield type?
> 
> As said you probably need a GIMPLE testcase to avoid
> promoting to int.  Oh, and that doesn't work yet because
> we cannot "parse" bit-precision types for temporaries.
> 
> struct X { int a : 1; int b : 1; };
> 
> int foo (struct X *p)
> {
>   return p->a;
> }
> 
> produces
> 
> int __GIMPLE (ssa)
> foo (struct X * p)
> {
>   int D_1913;
>_1;
> ...
> 
> we have similar issues with dumping of vector types but
> there at least one can use a typedef and manual editing.
> For bit-precision types we need to invent a "C" extension
> (thus also for vectors).
> 
> Anyway...

I see. I'm sending an updated version of the patch that I've just been testing.
It addresses Richard Sandiford's note.

May I install it after testing?
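
For readers following the thread, the identity behind the proposed pattern can be checked in plain C. This is only a sketch of the arithmetic, not the match.pd pattern itself; the helper names `is_zero_or_one`, `mult_form` and `and_form` are made up for the illustration. When both operands can only be 0 or 1, the product and the bitwise AND agree, which is the situation in `is_rec.v * !is_dma.v` from the pr87954-2.c test quoted in this message.

```c
#include <stdbool.h>

/* True when x can only be 0 or 1, i.e. no bit other than bit 0 is set. */
bool is_zero_or_one(unsigned int x)
{
    return (x & ~1u) == 0;
}

/* The multiplication as written in the source. */
unsigned int mult_form(unsigned int a, unsigned int b)
{
    return a * b;
}

/* The bitwise form that is equivalent for 0/1 operands. */
unsigned int and_form(unsigned int a, unsigned int b)
{
    return a & b;
}
```

The guard matters: for operands outside {0, 1}, such as 2 and 3, the two forms differ (6 versus 2), which is why the pattern has to inspect the nonzero bits of both operands first.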

> 
>> $ cat gcc/testsuite/gcc.dg/pr87954-2.c
>> #define __GFP_DMA 1u
>> #define __GFP_RECLAIM 0x10u
>>
>> struct bt
>> {
>>   unsigned int v:1;
>> };
>>
>> unsigned int
>> imul(unsigned int flags)
>> {
>>   struct bt is_dma, is_rec;
>>
>>   is_dma.v = !!(flags & __GFP_DMA);
>>   is_rec.v = !!(flags & __GFP_RECLAIM);
>>
>>   return is_rec.v * !is_dma.v;
>> }
>>
>> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87954-2.c 
>> -fdump-tree-optimized=/dev/stdout -O2
>>
>> ;; Function imul (imul, funcdef_no=0, decl_uid=1909, cgraph_uid=1, 
>> symbol_order=0)
>>
>> imul (unsigned int flags)
>> {
>>   struct bt is_dma;
>>   _Bool _1;
>>   unsigned int _2;
>>   _Bool _3;
>>   unsigned char _4;
>>   _Bool _6;
>>   unsigned int _9;
>>_11;
>>   unsigned char _14;
>>
>>[local count: 1073741824]:
>>   _1 = (_Bool) flags_7(D);
>>   _2 = flags_7(D) & 16;
>>   _3 = _2 != 0;
>>   is_dma.v = _1;
>>   _4 = BIT_FIELD_REF ;
>>   _14 = ~_4;
>>   _6 = (_Bool) _14;
>>   _11 = _3 & _6;
>>   _9 = (unsigned int) _11;
>>   is_dma ={v} {CLOBBER};
>>   return _9;
>> }
>>
>>>
>>> Does this mean we want to turn plus into bit_ior when
>>> get_nonzero_bits() & get_nonzero_bits() is zero?
>>
>> That's quite an interesting transformation; I'll add it as a follow-up patch.
> 
> I was just curious - maybe we should do the reverse instead?
> For mult vs. bit-and I think the latter will be "faster" (well, probably not
> even that...).  But for plus vs or?

Hmm, the expected speed-up will probably be very small.

Martin

> 
> 
>>>
>>> X * [0, 1] -> X & sign-extend-from-bit-1 also works I guess, but 
>>> multiplication
>>> looks more canonical.
>> Ok here.
>>
>> Martin
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks,
>>>> Martin
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2019-06-05  Martin Liska
>>>>
>>>> PR tree-optimization/87954
>>>> * match.pd: Simplify mult where both arguments are 0 or 1.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2019-06-05  Martin Liska
>>>>
>>>> PR tree-optimization/87954
>>>> * gcc.dg/pr87954.c: New test.
>>>> ---
>>>>  gcc/match.pd   |  8 
>>>>  gcc/testsuite/gcc.dg/pr87954.c | 21 +
>>>>  2 files changed, 29 insertions(+)
>>>>  create mode 100644 gcc/testsuite/gcc.dg/pr87954.c


>>

From ef2699c2e41df4bb7667ce44248795dd511e0f7f Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 5 Jun 2019 11:58:57 +0200
Subject: [PATCH] Simplify mult where both arguments are 0 or 1 (PR
 tree-optimization/87954).

gcc/ChangeLog:

2019-06-05  Martin Liska  

	PR tree-optimization/87954
	* match.pd: Simplify mult where both arguments are 0 or 1.

gcc/testsuite/ChangeLog:

2019-06-05  Martin Liska  

	PR tree-optimization/87954
	* gcc.dg/pr87954.c: New test.
---
 gcc/match.pd   |  8 
 gcc/testsuite/gcc.dg/pr87954.c | 21 +
 2 files changed, 29 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr87954.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 02e0471dd4e..88dae4231d8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -217,6 +217,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
|| !COMPLEX_FLOAT_TYPE_P (type)))
(negate @0)))
 
+/* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 } */
+(simplify
+ (mult SSA_NAME@1


Re: [PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

2019-06-05 Thread Richard Biener

On Wed, Jun 5, 2019 at 2:09 PM Martin Liška  wrote:
>
> On 6/5/19 1:13 PM, Richard Biener wrote:
> > On Wed, Jun 5, 2019 at 12:56 PM Martin Liška  wrote:
> >>
> >> Hi.
> >>
> >> I'm suggesting one multiplication simplification pattern.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >
> > +  (if (INTEGRAL_TYPE_P (type)
> > +   && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
> > +   && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION 
> > (type
> >
> >   && wi::eq_p (wi::bit_or (get_nonzero_bits (@1), get_nonzero_bits  (@2)), 
> > 1))
> >
> > (I think literal 1 still works)?
>
> Yep, I can confirm that.
>
> > How does it behave for signed/unsigned 1-bit
> > bitfields?  A gimple testcase may be necessary to see.
>
> Can we really have a mult that will have a bitfield type?

As said you probably need a GIMPLE testcase to avoid
promoting to int.  Oh, and that doesn't work yet because
we cannot "parse" bit-precision types for temporaries.

struct X { int a : 1; int b : 1; };

int foo (struct X *p)
{
  return p->a;
}

produces

int __GIMPLE (ssa)
foo (struct X * p)
{
  int D_1913;
   _1;
...

we have similar issues with dumping of vector types but
there at least one can use a typedef and manual editing.
For bit-precision types we need to invent a "C" extension
(thus also for vectors).

Anyway...

> $ cat gcc/testsuite/gcc.dg/pr87954-2.c
> #define __GFP_DMA 1u
> #define __GFP_RECLAIM 0x10u
>
> struct bt
> {
>   unsigned int v:1;
> };
>
> unsigned int
> imul(unsigned int flags)
> {
>   struct bt is_dma, is_rec;
>
>   is_dma.v = !!(flags & __GFP_DMA);
>   is_rec.v = !!(flags & __GFP_RECLAIM);
>
>   return is_rec.v * !is_dma.v;
> }
>
> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87954-2.c 
> -fdump-tree-optimized=/dev/stdout -O2
>
> ;; Function imul (imul, funcdef_no=0, decl_uid=1909, cgraph_uid=1, 
> symbol_order=0)
>
> imul (unsigned int flags)
> {
>   struct bt is_dma;
>   _Bool _1;
>   unsigned int _2;
>   _Bool _3;
>   unsigned char _4;
>   _Bool _6;
>   unsigned int _9;
>_11;
>   unsigned char _14;
>
>[local count: 1073741824]:
>   _1 = (_Bool) flags_7(D);
>   _2 = flags_7(D) & 16;
>   _3 = _2 != 0;
>   is_dma.v = _1;
>   _4 = BIT_FIELD_REF ;
>   _14 = ~_4;
>   _6 = (_Bool) _14;
>   _11 = _3 & _6;
>   _9 = (unsigned int) _11;
>   is_dma ={v} {CLOBBER};
>   return _9;
> }
>
> >
> > Does this mean we want to turn plus into bit_ior when
> > get_nonzero_bits() & get_nonzero_bits() is zero?
>
> That's quite an interesting transformation; I'll add it as a follow-up patch.

I was just curious - maybe we should do the reverse instead?
For mult vs. bit-and I think the latter will be "faster" (well, probably not
even that...).  But for plus vs or?
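
The plus versus bit_ior question above rests on a simple identity: when two operands have no set bits in common, no carries occur, so addition and bitwise OR give the same result. A minimal C sketch of that arithmetic follows. It says nothing about which form GCC should canonicalize to, and the helper names `bits_disjoint`, `plus_form` and `ior_form` are made up for this illustration.

```c
#include <stdbool.h>

/* True when a and b share no set bit, the analogue of
   get_nonzero_bits () & get_nonzero_bits () being zero. */
bool bits_disjoint(unsigned int a, unsigned int b)
{
    return (a & b) == 0;
}

/* Addition as written in the source. */
unsigned int plus_form(unsigned int a, unsigned int b)
{
    return a + b;
}

/* Bitwise OR, equal to the sum only when the operands are disjoint. */
unsigned int ior_form(unsigned int a, unsigned int b)
{
    return a | b;
}
```

With overlapping bits the forms diverge, for example 3 + 1 is 4 while 3 | 1 is 3, so either direction of the transformation needs the disjointness check.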


> >
> > X * [0, 1] -> X & sign-extend-from-bit-1 also works I guess, but 
> > multiplication
> > looks more canonical.
> Ok here.
>
> Martin
>
> >
> > Thanks,
> > Richard.
> >
> >> Thanks,
> >> Martin
> >>
> >> gcc/ChangeLog:
> >>
> >> 2019-06-05  Martin Liska  
> >>
> >> PR tree-optimization/87954
> >> * match.pd: Simplify mult where both arguments are 0 or 1.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> 2019-06-05  Martin Liska  
> >>
> >> PR tree-optimization/87954
> >> * gcc.dg/pr87954.c: New test.
> >> ---
> >>  gcc/match.pd   |  8 
> >>  gcc/testsuite/gcc.dg/pr87954.c | 21 +
> >>  2 files changed, 29 insertions(+)
> >>  create mode 100644 gcc/testsuite/gcc.dg/pr87954.c
> >>
> >>
>

Re: PR C++/63149

2019-06-05 Thread Marek Polacek

On Wed, Jun 05, 2019 at 02:50:54PM +0200, Jakub Jelinek wrote:
> On Wed, Jun 05, 2019 at 08:39:56AM -0400, Marek Polacek wrote:
> > On Wed, Jun 05, 2019 at 10:34:05AM +0100, Nina Dinka Ranns wrote:
> > > > PR C++/63149
> > > > * pt.c (listify_autos): Use non cv qualified auto_node in
> > > > std::initializer_list.
> > > >
> > > > testsuite/
> > > >
> > > > PR C++/63149
> > 
> > "c++" instead of "C++", though I don't think anyone would mind.
> 
> I would, I have scripts that grab the PR strings from ChangeLog entries
> and need to fix stuff by hand if it is incorrect like this (or if people
> forget to use the component/ part altogether).

Fair enough.  Nina, please adjust that too, then.

Marek

Re: PR C++/63149

2019-06-05 Thread Jakub Jelinek

On Wed, Jun 05, 2019 at 08:39:56AM -0400, Marek Polacek wrote:
> On Wed, Jun 05, 2019 at 10:34:05AM +0100, Nina Dinka Ranns wrote:
> > > PR C++/63149
> > > * pt.c (listify_autos): Use non cv qualified auto_node in
> > > std::initializer_list.
> > >
> > > testsuite/
> > >
> > > PR C++/63149
> 
> "c++" instead of "C++", though I don't think anyone would mind.

I would, I have scripts that grab the PR strings from ChangeLog entries
and need to fix stuff by hand if it is incorrect like this (or if people
forget to use the component/ part altogether).

Jakub
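
To make the point about machine-readable PR strings concrete, here is a small sketch of the kind of check such a script might perform. It is not one of the scripts mentioned above; the function name `pr_line_ok` and the exact rule (the literal "PR ", a lowercase component such as c++ or tree-optimization, a slash, then digits) are assumptions chosen for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical check for a ChangeLog PR reference of the form
   "PR <component>/<number>", where the component is lowercase. */
bool pr_line_ok(const char *s)
{
    if (strncmp(s, "PR ", 3) != 0)
        return false;
    s += 3;

    /* Component: lowercase letters, digits, '+' and '-'. */
    const char *start = s;
    while (*s && (islower((unsigned char)*s) || isdigit((unsigned char)*s)
                  || *s == '+' || *s == '-'))
        s++;
    if (s == start || *s != '/')
        return false;
    s++;

    /* Bug number: one or more digits up to the end of the string. */
    start = s;
    while (isdigit((unsigned char)*s))
        s++;
    return s != start && *s == '\0';
}
```

Under this rule "PR c++/63149" is accepted, while "PR C++/63149" and a bare "PR 63149" without the component/ part are rejected.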

[PATCH] Improve PTA flow-sensitivity (for the return stmt)

2019-06-05 Thread Richard Biener



The following was inspired by Martin's work on escapes of locals
and the discussion there.  It teaches points-to analysis that
the point of function return is special and thus escapes through
that a) do not influence other points-to solutions, b) can be
pruned of all locals.

This is one example of reasonably simple "post-processing".

The effects are small, I've done statistics, counting the number
of variables we do not mark escaped only after this patch.  This
number is usually zero, sometimes one and a few cases more
(but never more than 11) during bootstrap:

0 95830
1 19268
2 19
3 2
5 2
6 1
8 1
11 1

so not sure if it is worth all the effort.  It does allow us
to do more DSE but that requires the accesses to be indirect
which is not often true for locals.

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Martin, does this help you at all?  Anybody thinks this is
worth the trouble?

Thanks,
Richard.

2019-06-05  Richard Biener  

* tree-ssa-structalias.c: Include tree-cfg.h.
(make_heapvar): Do not make heap vars artificial.
(find_func_aliases_for_builtin_call): Handle stack allocation
functions.
(find_func_aliases): Delay processing of simple enough returns
in non-IPA mode.
(set_uids_in_ptset): Adjust.
(find_what_var_points_to): Likewise.
(solve_constraints): Do not dump points-to sets here.
(compute_points_to_sets): Post-process return statements,
amending the escaped solution.  Dump points-to sets afterwards.
(ipa_pta_execute): Dump points-to sets.

* gcc.dg/tree-ssa/alias-37.c: New testcase.
* gcc.dg/torture/20190604-1.c: Likewise.
* gcc.dg/tree-ssa/pta-callused.c: Adjust.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 271951)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -43,6 +43,7 @@
 #include "stringpool.h"
 #include "attribs.h"
 #include "tree-ssa.h"
+#include "tree-cfg.h"
 
 /* The idea behind this analyzer is to generate set constraints from the
program, then solve the resulting constraints in order to generate the
@@ -3854,7 +3855,6 @@ make_heapvar (const char *name, bool add
   DECL_EXTERNAL (heapvar) = 1;
 
   vi = new_var_info (heapvar, name, add_id);
-  vi->is_artificial_var = true;
   vi->is_heap_var = true;
   vi->is_unknown_size_var = true;
   vi->offset = 0;
@@ -4409,6 +4409,32 @@ find_func_aliases_for_builtin_call (stru
  process_constraint (new_constraint (*lhsp, ac));
  return true;
}
+  case BUILT_IN_STACK_SAVE:
+  case BUILT_IN_STACK_RESTORE:
+/* Nothing interesting happens.  */
+return true;
+  case BUILT_IN_ALLOCA:
+  case BUILT_IN_ALLOCA_WITH_ALIGN:
+  case BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX:
+   {
+ tree ptr = gimple_call_lhs (t);
+ if (ptr == NULL_TREE)
+   return true;
+ get_constraint_for (ptr, &lhsc);
+ varinfo_t vi = make_heapvar ("HEAP", true);
+ /* Alloca storage is never global.  To exempt it from escaped
+handling make it a non-heap var.  */
+ DECL_EXTERNAL (vi->decl) = 0;
+ vi->is_global_var = 0;
+ vi->is_heap_var = 0;
+ struct constraint_expr tmpc;
+ tmpc.var = vi->id;
+ tmpc.offset = 0;
+ tmpc.type = ADDRESSOF;
+ rhsc.safe_push (tmpc);
+ process_all_all_constraints (lhsc, rhsc);
+ return true;
+   }
   case BUILT_IN_POSIX_MEMALIGN:
 {
  tree ptrptr = gimple_call_arg (t, 0);
@@ -4976,7 +5002,12 @@ find_func_aliases (struct function *fn,
   greturn *return_stmt = as_a <greturn *> (t);
   fi = NULL;
   if (!in_ipa_mode
- || !(fi = get_vi_for_tree (fn->decl)))
+ && SSA_VAR_P (gimple_return_retval (return_stmt)))
+   {
+ /* We handle simple returns by post-processing the solutions.  */
+ ;
+   }
+  if (!(fi = get_vi_for_tree (fn->decl)))
make_escape_constraint (gimple_return_retval (return_stmt));
   else if (in_ipa_mode)
{
@@ -6422,9 +6453,7 @@ set_uids_in_ptset (bitmap into, bitmap f
 {
   varinfo_t vi = get_varinfo (i);
 
-  /* The only artificial variables that are allowed in a may-alias
-set are heap variables.  */
-  if (vi->is_artificial_var && !vi->is_heap_var)
+  if (vi->is_artificial_var)
continue;
 
   if (everything_escaped
@@ -6544,9 +6573,6 @@ find_what_var_points_to (tree fndecl, va
}
  else if (vi->id == nonlocal_id)
pt->nonlocal = 1;
- else if (vi->is_heap_var)
-   /* We represent heapvars in the points-to set properly.  */
-   ;
  else if (vi->id == string_id)
/* Nobody cares - STRING_CSTs are read-only entities.  */
;
@@ -7254,9 +7280,6 @@ solve_constraints (void)
   dump_constraint_graph (dump_file);

Re: PR C++/63149

2019-06-05 Thread Marek Polacek

On Wed, Jun 05, 2019 at 10:34:05AM +0100, Nina Dinka Ranns wrote:
> > PR C++/63149
> > * pt.c (listify_autos): Use non cv qualified auto_node in
> > std::initializer_list.
> >
> > testsuite/
> >
> > PR C++/63149

"c++" instead of "C++", though I don't think anyone would mind.

> > * g++.dg/cpp0x/initlist-deduce.C: New test.

You're actually adding initlist-deduce2.C.

Marek

Backports to gcc-9-branch

2019-06-05 Thread Jakub Jelinek

Hi!

I've backported the following 4 patches from trunk to the 9 branch,
bootstrapped/regtested on x86_64-linux and i686-linux, committed.

Jakub
2019-06-05  Jakub Jelinek  

Backported from mainline
2019-05-24  Jakub Jelinek  

PR libgomp/90585
* plugin/plugin-hsa.c (print_kernel_dispatch, run_kernel): Use PRIu64
macro instead of "lu".
(release_kernel_dispatch): Likewise.  Cast shadow->debug to uintptr_t
before casting to void *.
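
As a minimal sketch of the portability issue behind this ChangeLog entry
(the two helper names below are hypothetical and not part of libgomp):
"%lu" only matches a uint64_t on targets where unsigned long is 64 bits
wide, whereas PRIu64 expands to the right conversion everywhere, and
going through uintptr_t avoids casting an integer of the wrong width
straight to a pointer.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, for illustration only: format a 64-bit value
   with PRIu64 rather than "%lu", as the patched calls do.  */
int
format_dispatch_value (char *buf, size_t len, uint64_t value)
{
  return snprintf (buf, len, "value: %" PRIu64, value);
}

/* Hypothetical helper: turn a 64-bit integer into a pointer for "%p"
   by first narrowing (or keeping) it to the pointer-sized integer type.  */
void *
debug_value_as_pointer (uint64_t value)
{
  return (void *) (uintptr_t) value;
}
```

On an ILP32 target the first helper still prints the full 64-bit value,
which "%lu" would not do reliably.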

--- libgomp/plugin/plugin-hsa.c (revision 271596)
+++ libgomp/plugin/plugin-hsa.c (revision 271597)
@@ -1154,8 +1164,9 @@ create_single_kernel_dispatch (struct ke
 static void
 release_kernel_dispatch (struct GOMP_hsa_kernel_dispatch *shadow)
 {
-  HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n", shadow,
-shadow->debug, (void *) shadow->debug);
+  HSA_DEBUG ("Released kernel dispatch: %p has value: %" PRIu64 " (%p)\n",
+shadow, shadow->debug,
+(void *) (uintptr_t) shadow->debug);
 
   hsa_fns.hsa_memory_free_fn (shadow->kernarg_address);
 
@@ -1276,9 +1287,9 @@ print_kernel_dispatch (struct GOMP_hsa_k
   indent_stream (stderr, indent);
   fprintf (stderr, "kernarg_address: %p\n", dispatch->kernarg_address);
   indent_stream (stderr, indent);
-  fprintf (stderr, "object: %lu\n", dispatch->object);
+  fprintf (stderr, "object: %" PRIu64 "\n", dispatch->object);
   indent_stream (stderr, indent);
-  fprintf (stderr, "signal: %lu\n", dispatch->signal);
+  fprintf (stderr, "signal: %" PRIu64 "\n", dispatch->signal);
   indent_stream (stderr, indent);
   fprintf (stderr, "private_segment_size: %u\n",
   dispatch->private_segment_size);
@@ -1286,8 +1297,8 @@ print_kernel_dispatch (struct GOMP_hsa_k
   fprintf (stderr, "group_segment_size: %u\n",
   dispatch->group_segment_size);
   indent_stream (stderr, indent);
-  fprintf (stderr, "children dispatches: %lu\n",
-  dispatch->kernel_dispatch_count);
+  fprintf (stderr, "children dispatches: %" PRIu64 "\n",
+  dispatch->kernel_dispatch_count);
   indent_stream (stderr, indent);
   fprintf (stderr, "omp_num_threads: %u\n",
   dispatch->omp_num_threads);
@@ -1594,8 +1605,8 @@ run_kernel (struct kernel_info *kernel,
hsa_signal_t child_s;
child_s.handle = shadow->children_dispatches[i]->signal;
 
-   HSA_DEBUG ("Waiting for children completion signal: %lu\n",
-  shadow->children_dispatches[i]->signal);
+   HSA_DEBUG ("Waiting for children completion signal: %" PRIu64 "\n",
+  shadow->children_dispatches[i]->signal);
hsa_fns.hsa_signal_load_acquire_fn (child_s);
   }
 
2019-06-05  Jakub Jelinek  

Backported from mainline
2019-05-27  Jakub Jelinek  

PR libgomp/90641
* work.c (gomp_init_work_share): Instead of aligning final ordered
value to multiples of long long alignment, align to that the
first part (ordered team ids) and if inline_ordered_team_ids
is not on a long long alignment boundary within the structure,
use __alignof__ (long long) - 1 pad size always.
* loop.c (GOMP_loop_start): Fix *mem computation if
inline_ordered_team_ids is not aligned on long long alignment boundary
within the structure.
* loop-ull.c (GOMP_loop_ull_start): Likewise.
* sections.c (GOMP_sections2_start): Likewise.
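
The rounding that the patch introduces as INLINE_ORDERED_TEAM_IDS_OFF
can be sketched in isolation as follows.  This is only an illustration
under stated assumptions: the struct is a hypothetical stand-in for
struct gomp_work_share, and C11 _Alignof is used where the patch uses
the GNU __alignof__ spelling.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct gomp_work_share: the inline array
   deliberately starts at an offset that is not suitably aligned.  */
struct work_share_sketch
{
  char header[3];
  char inline_ids[29];
};

/* Round OFF up to the next multiple of ALIGN.  ALIGN must be a power
   of two; this is the same arithmetic as in the patch's macro.  */
size_t
round_up_to_alignment (size_t off, size_t align)
{
  return (off + align - 1) & ~(align - 1);
}

/* Offset of the first long long aligned byte inside inline_ids.  */
size_t
aligned_inline_ids_offset (void)
{
  return round_up_to_alignment (offsetof (struct work_share_sketch,
                                          inline_ids),
                                _Alignof (long long));
}
```

The space usable for the second part is then the structure size minus
this rounded offset, not minus the raw offsetof value.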

--- libgomp/loop.c  (revision 271670)
+++ libgomp/loop.c  (revision 271671)
@@ -267,14 +267,17 @@ GOMP_loop_start (long start, long end, l
   if (mem)
{
  uintptr_t size = (uintptr_t) *mem;
+#define INLINE_ORDERED_TEAM_IDS_OFF \
+  ((offsetof (struct gomp_work_share, inline_ordered_team_ids) \
++ __alignof__ (long long) - 1) & ~(__alignof__ (long long) - 1))
  if (size > (sizeof (struct gomp_work_share)
- - offsetof (struct gomp_work_share,
- inline_ordered_team_ids)))
-   thr->ts.work_share->ordered_team_ids
- = gomp_malloc_cleared (size);
+ - INLINE_ORDERED_TEAM_IDS_OFF))
+   *mem
+ = (void *) (thr->ts.work_share->ordered_team_ids
+ = gomp_malloc_cleared (size));
  else
-   memset (thr->ts.work_share->ordered_team_ids, '\0', size);
- *mem = (void *) thr->ts.work_share->ordered_team_ids;
+   *mem = memset (((char *) thr->ts.work_share)
+  + INLINE_ORDERED_TEAM_IDS_OFF, '\0', size);
}
   gomp_work_share_init_done ();
 }
@@ -287,7 +290,18 @@ GOMP_loop_start (long start, long end, l
  first_reductions);
}
   if (mem)
-   *mem = (void *) thr->ts.work_share->ordered_team_ids;
+   {
+ if ((offsetof (struct gomp_work_share, inline_ordered_team_ids)
+  & (__alignof__ (long long) - 1)) == 0)
+

Re: [PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

2019-06-05 Thread Richard Sandiford

Martin Liška  writes:
> On 6/5/19 1:13 PM, Richard Biener wrote:
>> On Wed, Jun 5, 2019 at 12:56 PM Martin Liška  wrote:
>>>
>>> Hi.
>>>
>>> I'm suggesting one multiplication simplification pattern.
>>>
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>
>>> Ready to be installed?
>> 
>> +  (if (INTEGRAL_TYPE_P (type)
>> +   && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
>> +   && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION (type))))
>> 
>>   && wi::eq_p (wi::bit_or (get_nonzero_bits (@1), get_nonzero_bits  (@2)), 1))
>> 
>> (I think literal 1 still works)? 
>
> Yep, I can confirm that.

get_nonzero_bits (@1) == 1 should work too (without the wi::eq).

>
>> How does it behave for signed/unsigned 1-bit
>> bitfields?  A gimple testcase may be necessary to see.
>
> Can we really have a mult that will have a bitfield type?

FWIW, even if we can't, get_nonzero_bits returns 1-bit integers for
1-bit types (of either signedness), so the code should DTRT.
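
For reference, the identity the pattern relies on can be checked
exhaustively in plain C.  This is only a sketch: the nonzero-bits masks
are modelled as ordinary unsigned values rather than wide_int, and both
function names are hypothetical.

```c
#include <stdbool.h>

/* Return true if X * Y == (X & Y) for every X, Y in {0, 1}, that is,
   for operands whose only possibly-set bit is bit 0.  */
bool
mult_equals_and_for_bits (void)
{
  for (unsigned x = 0; x <= 1; x++)
    for (unsigned y = 0; y <= 1; y++)
      if (x * y != (x & y))
        return false;
  return true;
}

/* Model of the combined check suggested in the review: the union of
   both nonzero-bits masks must be exactly 1.  */
bool
operands_are_single_bits (unsigned nonzero1, unsigned nonzero2)
{
  return (nonzero1 | nonzero2) == 1;
}
```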

Thanks,
Richard

Re: [PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

2019-06-05 Thread Martin Liška

On 6/5/19 1:13 PM, Richard Biener wrote:
> On Wed, Jun 5, 2019 at 12:56 PM Martin Liška  wrote:
>>
>> Hi.
>>
>> I'm suggesting one multiplication simplification pattern.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> +  (if (INTEGRAL_TYPE_P (type)
> +   && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
> +   && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION (type))))
> 
>   && wi::eq_p (wi::bit_or (get_nonzero_bits (@1), get_nonzero_bits  (@2)), 1))
> 
> (I think literal 1 still works)? 

Yep, I can confirm that.

> How does it behave for signed/unsigned 1-bit
> bitfields?  A gimple testcase may be necessary to see.

Can we really have a mult that will have a bitfield type?

$ cat gcc/testsuite/gcc.dg/pr87954-2.c
#define __GFP_DMA 1u
#define __GFP_RECLAIM 0x10u

struct bt
{
  unsigned int v:1;
};

unsigned int
imul(unsigned int flags)
{
  struct bt is_dma, is_rec;

  is_dma.v = !!(flags & __GFP_DMA);
  is_rec.v = !!(flags & __GFP_RECLAIM);

  return is_rec.v * !is_dma.v;
}

$ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87954-2.c -fdump-tree-optimized=/dev/stdout -O2

;; Function imul (imul, funcdef_no=0, decl_uid=1909, cgraph_uid=1, symbol_order=0)

imul (unsigned int flags)
{
  struct bt is_dma;
  _Bool _1;
  unsigned int _2;
  _Bool _3;
  unsigned char _4;
  _Bool _6;
  unsigned int _9;
   _11;
  unsigned char _14;

   [local count: 1073741824]:
  _1 = (_Bool) flags_7(D);
  _2 = flags_7(D) & 16;
  _3 = _2 != 0;
  is_dma.v = _1;
  _4 = BIT_FIELD_REF ;
  _14 = ~_4;
  _6 = (_Bool) _14;
  _11 = _3 & _6;
  _9 = (unsigned int) _11;
  is_dma ={v} {CLOBBER};
  return _9;
}

> 
> Does this mean we want to turn plus into bit_ior when
> get_nonzero_bits() & get_nonzero_bits() is zero?

That's quite an interesting transformation; I'll add it as a follow-up patch.

> 
> X * [0, 1] -> X & sign-extend-from-bit-1 also works I guess, but multiplication
> looks more canonical.
Ok here.

Martin

> 
> Thanks,
> Richard.
> 
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2019-06-05  Martin Liska  
>>
>> PR tree-optimization/87954
>> * match.pd: Simplify mult where both arguments are 0 or 1.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-06-05  Martin Liska  
>>
>> PR tree-optimization/87954
>> * gcc.dg/pr87954.c: New test.
>> ---
>>  gcc/match.pd   |  8 
>>  gcc/testsuite/gcc.dg/pr87954.c | 21 +
>>  2 files changed, 29 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/pr87954.c
>>
>>

Re: [PATCH] Testsuite: Add s390 exceptions for gen-vect

2019-06-05 Thread Robin Dapp

Ping.

> gcc/testsuite/ChangeLog:
> 
> 2019-05-15  Robin Dapp  
> 
>   * gcc.dg/tree-ssa/gen-vect-26.c: Do not expect unaligned access
>   vectorization on s390.
>   * gcc.dg/tree-ssa/gen-vect-28.c: Likewise.
>   * gcc.dg/tree-ssa/gen-vect-32.c: Likewise.
>

Re: [PATCH] Integrate non-TBB serial backend support

2019-06-05 Thread Jonathan Wakely


On 04/06/19 15:47 -0700, Thomas Rodgers wrote:

* include/bits/c++config: Adjust TBB detection logic to select serial
PSTL backend if no TBB present.
* testsuite/utils/pstl/test_utils.h: Remove check for
__PSTL_USE_PAR_POLICIES


The changelog says __PSTL_USE_PAR_POLICIES with two leading
underscores, but the macro is _PSTL_USE_PAR_POLICIES with one.

OK for trunk with the changelog corrected, thanks.

Re: [PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

2019-06-05 Thread Richard Biener

On Wed, Jun 5, 2019 at 12:56 PM Martin Liška  wrote:
>
> Hi.
>
> I'm suggesting one multiplication simplification pattern.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

+  (if (INTEGRAL_TYPE_P (type)
+   && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
+   && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION (type))))

  && wi::eq_p (wi::bit_or (get_nonzero_bits (@1), get_nonzero_bits  (@2)), 1))

(I think literal 1 still works)?  How does it behave for signed/unsigned 1-bit
bitfields?  A gimple testcase may be necessary to see.

Does this mean we want to turn plus into bit_ior when
get_nonzero_bits() & get_nonzero_bits() is zero?

X * [0, 1] -> X & sign-extend-from-bit-1 also works I guess, but multiplication
looks more canonical.
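
Both observations can be sketched in plain C.  This is an illustration
only, using uint32_t in place of GCC's internal representations; the
function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if A and B have no set bit in common.  In that case adding them
   never produces a carry, so A + B equals A | B.  */
bool
disjoint_bits (uint32_t a, uint32_t b)
{
  return (a & b) == 0;
}

/* For B in {0, 1}, X * B can be written as X & -B: in unsigned
   arithmetic -B is either zero or all ones.  */
uint32_t
mult_by_bit_as_mask (uint32_t x, uint32_t b)
{
  return x & ((uint32_t) 0 - b);
}
```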

Thanks,
Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2019-06-05  Martin Liska  
>
> PR tree-optimization/87954
> * match.pd: Simplify mult where both arguments are 0 or 1.
>
> gcc/testsuite/ChangeLog:
>
> 2019-06-05  Martin Liska  
>
> PR tree-optimization/87954
> * gcc.dg/pr87954.c: New test.
> ---
>  gcc/match.pd   |  8 
>  gcc/testsuite/gcc.dg/pr87954.c | 21 +
>  2 files changed, 29 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr87954.c
>
>

Re: [PATCH][GCC][AARCH64] Add tests for pointer authentication B-key

2019-06-05 Thread Sam Tebbs

Committed as obvious as r271954.

On 05/06/2019 11:20, Sam Tebbs wrote:
> Hi all,
>
> When committing my b-key patch (r271735) I didn't svn add the new test
> files, this patch adds them and moves the exception tests to
> g++.target/aarch64.
>
> Tested on aarch64-none-linux-gnu and aarch64-none-elf.
>
> OK for trunk?
>
> gcc/testsuite
> 2019-06-05  Sam Tebbs
>
>   * gcc.target/aarch64/return_address_sign_b_1.c: New file.
>   * gcc.target/aarch64/return_address_sign_b_2.c: New file.
>   * gcc.target/aarch64/return_address_sign_b_3.c: New file.
>   * gcc.target/aarch64/return_address_sign_builtin.c: New file.
>   * g++.target/aarch64/return_address_sign_ab_exception.C: New file.
>   * g++.target/aarch64/return_address_sign_b_exception.C: New file.
>

Re: [PATCH V6] Remove empty loop with assumed finiteness (PR tree-optimization/89713)

2019-06-05 Thread Richard Biener

On Tue, Jun 4, 2019 at 5:24 PM Feng Xue OS  wrote:
>
> Some changes on documentation.

Please name the option -ffinite-loops (plural), the patch is OK with this
change if you also adjust

> +  /* If the loop has any non-EH exit, we can assume it will terminate. */
> +  FOR_EACH_VEC_ELT (exits, i, ex)
> +   if (!(ex->flags & EDGE_EH))
> + {

to look for "normal" edges only - in addition to EDGE_EH you
want to ignore EDGE_FAKE and EDGE_ABNORMAL.  EDGE_FAKE edges
are inserted by connect_infinite_loops_to_exit and EDGE_ABNORMAL
can appear with setjmp/longjmp and friends.
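
A minimal sketch of the requested filter, assuming stand-in flag values:
the SKETCH_* constants below are hypothetical and do not reproduce GCC's
real EDGE_EH, EDGE_FAKE and EDGE_ABNORMAL definitions, and the exit list
is modelled as a plain array of flag words.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's edge flags.  */
enum sketch_edge_flags
{
  SKETCH_EDGE_EH = 1 << 0,
  SKETCH_EDGE_FAKE = 1 << 1,
  SKETCH_EDGE_ABNORMAL = 1 << 2
};

/* Return true if at least one exit is a "normal" edge, i.e. carries
   none of the EH, fake or abnormal flags.  */
bool
has_normal_exit (const unsigned *exit_flags, size_t n_exits)
{
  const unsigned ignored
    = SKETCH_EDGE_EH | SKETCH_EDGE_FAKE | SKETCH_EDGE_ABNORMAL;

  for (size_t i = 0; i < n_exits; i++)
    if (!(exit_flags[i] & ignored))
      return true;
  return false;
}
```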

I think this change also warrants mentioning in gcc-10/changes.html.

Thanks,
Richard.

> Feng
>
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 37aab79..4fdc5c8 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,16 @@
> +2019-06-04  Feng Xue  
> +
> +   PR tree-optimization/89713
> +   * doc/invoke.texi (-ffinite-loop): Document new option.
> +   * common.opt (-ffinite-loop): New option.
> +   * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
> +   IFN_GOACC_LOOP calls as necessary.
> +   * tree-ssa-loop-niter.c (finite_loop): Assume loop with an exit is
> +   finite.
> +   * omp-offload.c (oacc_xform_loop): Skip lowering if return value of
> +   IFN_GOACC_LOOP call is not used.
> +   * opts.c (default_options_table): Enable -ffinite-loop at -O2+.
> +
>  2019-06-04  Alan Modra  
>
> PR target/90689
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 0e72fd0..f570815 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1437,6 +1437,10 @@ ffinite-math-only
>  Common Report Var(flag_finite_math_only) Optimization SetByCombined
>  Assume no NaNs or infinities are generated.
>
> +ffinite-loop
> +Common Report Var(flag_finite_loop) Optimization
> +Assume that loops with an exit will terminate and not loop indefinitely.
> +
>  ffixed-
>  Common Joined RejectNegative Var(common_deferred_options) Defer
>  -ffixed-<register> Mark <register> as being unavailable to the compiler.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 91c9bb8..2cb0b9a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fdevirtualize-at-ltrans  -fdse @gol
>  -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
>  -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
> +-ffinite-loop @gol
>  -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
>  -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
>  -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
> @@ -8282,6 +8283,7 @@ also turns on the following optimization flags:
>  -fdelete-null-pointer-checks @gol
>  -fdevirtualize  -fdevirtualize-speculatively @gol
>  -fexpensive-optimizations @gol
> +-ffinite-loop @gol
>  -fgcse  -fgcse-lm  @gol
>  -fhoist-adjacent-loads @gol
>  -finline-small-functions @gol
> @@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
>  enabled by default at @option{-O2} and higher if @option{-Os} is not also
>  specified.
>
> +@item -ffinite-loop
> +@opindex ffinite-loop
> +@opindex fno-finite-loop
> +Assume that a loop with an exit will eventually take the exit and not loop
> +indefinitely.  This allows the compiler to remove loops that otherwise have
> +no side-effects, not considering eventual endless looping as such.
> +
> +This option is enabled by default at @option{-O2}.
> +
>  @item -ftree-dominator-opts
>  @opindex ftree-dominator-opts
>  Perform a variety of simple scalar cleanups (constant/copy
> diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
> index 97ae47b..369122f 100644
> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c
> @@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
>tree chunk_size = NULL_TREE;
>unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
>tree lhs = gimple_call_lhs (call);
> -  tree type = TREE_TYPE (lhs);
> +  tree type = NULL_TREE;
>tree diff_type = TREE_TYPE (range);
>tree r = NULL_TREE;
>gimple_seq seq = NULL;
> @@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
>unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
>unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
>
> +  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used. */
> +  if (!lhs)
> +{
> +  gsi_replace_with_seq (&gsi, seq, true);
> +  return;
> +}
> +
> +  type = TREE_TYPE (lhs);
> +
>  #ifdef ACCEL_COMPILER
>chunk_size = gimple_call_arg (call, 4);
>if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 64f94ac..0db9dda 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -494,6 +494,7 @@ static const struct default_options default_options_table[] =
>  { OPT_LEVELS_2_PLUS,

[PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

2019-06-05 Thread Martin Liška

Hi.

I'm suggesting one multiplication simplification pattern.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-06-05  Martin Liska  

PR tree-optimization/87954
* match.pd: Simplify mult where both arguments are 0 or 1.

gcc/testsuite/ChangeLog:

2019-06-05  Martin Liska  

PR tree-optimization/87954
* gcc.dg/pr87954.c: New test.
---
 gcc/match.pd   |  8 
 gcc/testsuite/gcc.dg/pr87954.c | 21 +
 2 files changed, 29 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr87954.c


diff --git a/gcc/match.pd b/gcc/match.pd
index 02e0471dd4e..7c6bf621917 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -217,6 +217,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
|| !COMPLEX_FLOAT_TYPE_P (type)))
(negate @0)))
 
+/* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 } */
+(simplify
+ (mult SSA_NAME@1 SSA_NAME@2)
+  (if (INTEGRAL_TYPE_P (type)
+   && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
+   && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION (type))))
+   (bit_and @1 @2)))
+
 /* Transform x * { 0 or 1, 0 or 1, ... } into x & { 0 or -1, 0 or -1, ...},
unless the target has native support for the former but not the latter.  */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/pr87954.c b/gcc/testsuite/gcc.dg/pr87954.c
new file mode 100644
index 000..620657cb1f5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87954.c
@@ -0,0 +1,21 @@
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define __GFP_DMA 1u
+#define __GFP_RECLAIM 0x10u
+
+#define KMALLOC_DMA 2
+#define KMALLOC_RECLAIM 1
+
+unsigned int
+imul(unsigned int flags)
+{
+  int is_dma, type_dma, is_rec;
+
+  is_dma = !!(flags & __GFP_DMA);
+  type_dma = is_dma * KMALLOC_DMA;
+  is_rec = !!(flags & __GFP_RECLAIM);
+
+  return type_dma + (is_rec * !is_dma) * KMALLOC_RECLAIM;
+}
+
+/* { dg-final { scan-tree-dump-times { \* } 1 "optimized" } } */

Re: [C++ PATCH] decls with error_node type

2019-06-05 Thread Nathan Sidwell


On 6/3/19 1:30 PM, Jason Merrill wrote:

On 5/31/19 1:39 PM, Nathan Sidwell wrote:


The fallout is some error cascade on symbol lookups that now fail. 


Right, that's why we had the previous behavior, to avoid this extra 
noise.  And then various places check error_operand_p to avoid trying to 
do anything with such a declaration.


Perhaps lookup should return error_mark_node rather than a _DECL with 
error_mark_node type?


Good idea, but it still fails in similar ways.  I think that's because with 
the current behaviour, it can parse such an erroneous typedef name as a 
declspec seq -- and a var-decl as a var (I now see we set their type to 
error_mark_node too).  This is looking like a rat hole :(


nathan

--
Nathan Sidwell

Re: [PR90743] Fortran 'allocatable' with OpenACC data/OpenMP 'target' 'map' clauses

2019-06-05 Thread Jakub Jelinek

On Wed, Jun 05, 2019 at 12:26:32PM +0200, Thomas Schwinge wrote:
> Hi Jakub!
> 
> On Wed, 5 Jun 2019 12:00:25 +0200, Jakub Jelinek  wrote:
> > On Wed, Jun 05, 2019 at 11:25:07AM +0200, Thomas Schwinge wrote:
> > > +  !$omp target map(to: a) map(tofrom: b, c, d) map(from: e)
> > > +  !$acc parallel copyin(a) copy(b, c, d) copyout(e)
> > 
> > Is mixing OpenMP and OpenACC construct this way defined at all?
> 
> It's not.  I'm using this just to avoid duplicating the test case file,
> that is, '-fopenacc' and '-fopenmp' aren't enabled at the same time.

I think it is better to duplicate the test, it avoids confusion.

Jakub

Re: [C++ PATCH] structure tag lookup.

2019-06-05 Thread Nathan Sidwell


On 6/4/19 11:20 AM, Marek Polacek wrote:

On Tue, Jun 04, 2019 at 11:13:14AM -0400, Nathan Sidwell wrote:



[...]


+  /* Now check if we can look in namespace scope.  */
+  for (; b->kind != sk_namespace; b = b->level_chain)
+if (!(b->kind == sk_cleanup
+ || b->kind == sk_template_parms
+ || b->kind == sk_function_parms
+ || (b->kind == sk_class
+ && scope == ts_within_enclosing_non_class)))
+  return NULL_TREE;


Looks like we could break that out into a new predicate function?
Something like allowed_scope_p?


Yeah, I thought about that, but didn't think it worth the effort.
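
For illustration, the predicate Marek suggests might look roughly like
the sketch below.  It is not the C++ front end's code: the enum is a
hypothetical stand-in for the real scope kinds, and the ts_* tag-scope
argument is reduced to a single boolean.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the front end's scope kinds.  */
enum sketch_scope_kind
{
  SKETCH_SK_BLOCK,
  SKETCH_SK_CLEANUP,
  SKETCH_SK_TEMPLATE_PARMS,
  SKETCH_SK_FUNCTION_PARMS,
  SKETCH_SK_CLASS,
  SKETCH_SK_NAMESPACE
};

/* Return true if the walk towards namespace scope may step through a
   binding level of kind KIND.  WITHIN_ENCLOSING_NON_CLASS models the
   scope == ts_within_enclosing_non_class test from the loop.  */
bool
allowed_scope_p (enum sketch_scope_kind kind, bool within_enclosing_non_class)
{
  return (kind == SKETCH_SK_CLEANUP
          || kind == SKETCH_SK_TEMPLATE_PARMS
          || kind == SKETCH_SK_FUNCTION_PARMS
          || (kind == SKETCH_SK_CLASS && within_enclosing_non_class));
}
```

The loop body would then reduce to returning NULL_TREE whenever the
predicate is false for the current level.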

nathan

--
Nathan Sidwell

Re: [PR90743] Fortran 'allocatable' with OpenACC data/OpenMP 'target' 'map' clauses

2019-06-05 Thread Thomas Schwinge

Hi Jakub!

On Wed, 5 Jun 2019 12:00:25 +0200, Jakub Jelinek  wrote:
> On Wed, Jun 05, 2019 at 11:25:07AM +0200, Thomas Schwinge wrote:
> > +  !$omp target map(to: a) map(tofrom: b, c, d) map(from: e)
> > +  !$acc parallel copyin(a) copy(b, c, d) copyout(e)
> 
> Is mixing OpenMP and OpenACC construct this way defined at all?

It's not.  I'm using this just to avoid duplicating the test case file,
that is, '-fopenacc' and '-fopenmp' aren't enabled at the same time.

> I see we reject OpenMP constructs inside of OpenACC contexts

ACK.

> and
> using OpenACC constructs inside host OpenMP constructs should be generally
> fine too

ACK.  (Should, but currently isn't.)

> but mixing OpenMP offloading constructs with OpenACC constructs
> sounds wrong.

ACK.  (Unless that gets defined by the two standards.)


Grüße
 Thomas



[PATCH][GCC][AARCH64] Add tests for pointer authentication B-key

2019-06-05 Thread Sam Tebbs

Hi all,

When committing my b-key patch (r271735) I didn't svn add the new test
files.  This patch adds them and moves the exception tests to
g++.target/aarch64.

Tested on aarch64-none-linux-gnu and aarch64-none-elf.

OK for trunk?

gcc/testsuite
2019-06-05  Sam Tebbs

* gcc.target/aarch64/return_address_sign_b_1.c: New file.
* gcc.target/aarch64/return_address_sign_b_2.c: New file.
* gcc.target/aarch64/return_address_sign_b_3.c: New file.
* gcc.target/aarch64/return_address_sign_builtin.c: New file.
* g++.target/aarch64/return_address_sign_ab_exception.C: New file.
* g++.target/aarch64/return_address_sign_b_exception.C: New file.

diff --git a/gcc/testsuite/g++.target/aarch64/return_address_sign_ab_exception.C b/gcc/testsuite/g++.target/aarch64/return_address_sign_ab_exception.C
new file mode 100644
index ..520cd180beeba317a12d2404ce6e252eaeb6aa90
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/return_address_sign_ab_exception.C
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "--save-temps" } */
+
+__attribute__((target("branch-protection=pac-ret+leaf")))
+int foo_a () {
+  throw 22;
+}
+
+__attribute__((target("branch-protection=pac-ret+leaf+b-key")))
+int foo_b () {
+  throw 22;
+}
+
+int main (int argc, char** argv) {
+  try {
+foo_a ();
+  } catch (...) {
+try {
+  foo_b ();
+} catch (...) {
+  return 0;
+}
+  }
+  return 1;
+}
+
+/* { dg-final { scan-assembler-times "paciasp" 1 } } */
+/* { dg-final { scan-assembler-times "pacibsp" 1 } } */
+/* { dg-final { scan-assembler-times ".cfi_b_key_frame" 1 } } */
+
diff --git a/gcc/testsuite/g++.target/aarch64/return_address_sign_b_exception.C b/gcc/testsuite/g++.target/aarch64/return_address_sign_b_exception.C
new file mode 100644
index ..eab2869e004b245944db696e34dc4db90e6e
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/return_address_sign_b_exception.C
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-mbranch-protection=pac-ret+leaf+b-key --save-temps" } */
+
+int foo () {
+  throw 22;
+}
+
+int main (int argc, char** argv) {
+  try {
+foo();
+  } catch (...) {
+return 0;
+  }
+  return 1;
+}
+
+/* { dg-final { scan-assembler-times ".cfi_b_key_frame" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_1.c b/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_1.c
new file mode 100644
index ..32d788ddf3fb72545d7c4b9869d8e445bdaaab37
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_1.c
@@ -0,0 +1,52 @@
+/* Testing return address signing where no combined instructions used.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbranch-protection=pac-ret+leaf+b-key" } */
+/* { dg-require-effective-target lp64 } */
+
+int foo (int);
+
+/* sibcall only.  */
+int __attribute__ ((target ("arch=armv8.3-a")))
+func1 (int a, int b)
+{
+  /* pacibsp */
+  return foo (a + b);
+  /* autibsp */
+}
+
+/* non-leaf function with sibcall.  */
+int __attribute__ ((target ("arch=armv8.3-a")))
+func2 (int a, int b)
+{
+  /* pacibsp */
+  if (a < b)
+return b;
+
+  a = foo (b);
+
+  return foo (a);
+  /* autibsp */
+}
+
+/* non-leaf function, legacy arch.  */
+int __attribute__ ((target ("arch=armv8.2-a")))
+func3 (int a, int b, int c)
+{
+  /* pacibsp */
+  return a + foo (b) + c;
+  /* autibsp */
+}
+
+/* eh_return.  */
+void __attribute__ ((target ("arch=armv8.3-a")))
+func4 (long offset, void *handler, int *ptr, int imm1, int imm2)
+{
+  /* pacibsp */
+  *ptr = imm1 + foo (imm1) + imm2;
+  __builtin_eh_return (offset, handler);
+  /* autibsp */
+  return;
+}
+
+/* { dg-final { scan-assembler-times "pacibsp" 4 } } */
+/* { dg-final { scan-assembler-times "autibsp" 4 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_2.c b/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_2.c
new file mode 100644
index ..9ed64ce05911475268e9ef95e88e873e21611085
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_2.c
@@ -0,0 +1,18 @@
+/* Testing return address signing where combined instructions used.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbranch-protection=pac-ret+leaf+b-key" } */
+/* { dg-require-effective-target lp64 } */
+
+int foo (int);
+int bar (int, int);
+
+int __attribute__ ((target ("arch=armv8.3-a")))
+func1 (int a, int b, int c)
+{
+  /* pacibsp */
+  return a + foo (b) + c;
+  /* retab */
+}
+
+/* { dg-final { scan-assembler-times "pacibsp" 1 } } */
+/* { dg-final { scan-assembler-times "retab" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_3.c b/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_3.c
new file mode 100644
index ..7183cc2f2b1c1692d5eda9c96c237d2e308060fb
--- /dev/null
+++

Re: [PATCH] Find constant definition for by-ref argument using dominance information (PR ipa/90401)

2019-06-05 Thread Richard Biener

On Wed, Jun 5, 2019 at 10:59 AM Feng Xue OS  wrote:
>
> IPA-CP cannot identify a constant by-ref argument to a function if the
> definition of the argument is not in the same basic block as the call site.
> This is because IPA-CP only searches locally within the call site's basic
> block.  This patch implements an enhanced algorithm, which uses dominance
> information to guide the traversal of the virtual SSA web, to find constant
> definitions in dominating basic blocks.
>
> Feng
>
> ---
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 526ed45be89..cc076c337af 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,14 @@
> +2019-06-04  Feng Xue  
> +
> +   PR ipa/90401
> +   * ipa-prop.c (add_to_agg_contents_list): New function.
> +   (clobber_by_agg_contents_list_p): Likewise.
> +   (extract_mem_content): Likewise.
> +   (strictly_dominated_by_ssa_p): Likewise.
> +   (get_place_in_agg_contents_list): Delete.
> +   (determine_known_aggregate_parts): Renamed from
> +   determine_locally_known_aggregate_parts.
> +
>  2019-06-04  Segher Boessenkool  
>
> * config/rs6000/constraints.md (define_register_constraint "wp"):
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index d86c2f3db55..405cdf7730b 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -1458,7 +1458,7 @@ get_ssa_def_if_simple_copy (tree rhs)
>return rhs;
>  }
>
> -/* Simple linked list, describing known contents of an aggregate beforere
> +/* Simple linked list, describing known contents of an aggregate before
> call.  */
>
>  struct ipa_known_agg_contents_list
> @@ -1471,41 +1471,64 @@ struct ipa_known_agg_contents_list
>struct ipa_known_agg_contents_list *next;
>  };
>
> -/* Find the proper place in linked list of ipa_known_agg_contents_list
> -   structures where to put a new one with the given LHS_OFFSET and LHS_SIZE,
> -   unless there is a partial overlap, in which case return NULL, or such
> -   element is already there, in which case set *ALREADY_THERE to true.  */
> +/* Add a known content item into a linked list of ipa_known_agg_contents_list,
> +   in which all elements are sorted ascendingly by offset. When ALLOW_DUP is
> +   false, insert the item only if there is no duplicate one (with same offset
> +   and size) in the list. And if the item is added, return true. */
>
> -static struct ipa_known_agg_contents_list **
> -get_place_in_agg_contents_list (struct ipa_known_agg_contents_list **list,
> -   HOST_WIDE_INT lhs_offset,
> -   HOST_WIDE_INT lhs_size,
> -   bool *already_there)
> +static inline bool
> +add_to_agg_contents_list (struct ipa_known_agg_contents_list **plist,
> +  struct ipa_known_agg_contents_list *item,
> +  bool allow_dup = true)
>  {
> -  struct ipa_known_agg_contents_list **p = list;
> -  while (*p && (*p)->offset < lhs_offset)
> +  struct ipa_known_agg_contents_list *list = *plist;
> +
> +  for (; list; list = list->next)
>  {
> -  if ((*p)->offset + (*p)->size > lhs_offset)
> -   return NULL;
> -  p = &(*p)->next;
> +  if (list->offset > item->offset)
> +break;
> +
> +  if (list->offset == item->offset && list->size == item->size
> +  && !allow_dup)
> +return false;
> +
> +  plist = &list->next;
>  }
>
> -  if (*p && (*p)->offset < lhs_offset + lhs_size)
> +  item->next = list;
> +  *plist = item;
> +  return true;
> +}
> +
> +/* Check whether a given known content is clobbered by certain element in
> +   a linked list of ipa_known_agg_contents_list. A special case is that
> +   we can ignore those constant items completely same as the given item,
> +   that is they have same offset/size/value. */
> +
> +static inline bool
> +clobber_by_agg_contents_list_p (struct ipa_known_agg_contents_list *list,
> +struct ipa_known_agg_contents_list *item)
> +{
> +  for (; list; list = list->next)
>  {
> -  if ((*p)->offset == lhs_offset && (*p)->size == lhs_size)
> -   /* We already know this value is subsequently overwritten with
> -  something else.  */
> -   *already_there = true;
> -  else
> -   /* Otherwise this is a partial overlap which we cannot
> -  represent.  */
> -   return NULL;
> +  if (list->offset > item->offset)
> +return list->offset < item->offset + item->size;
> +
> +  /* For the constant item, we can skip comparison with identical items in
> + the list, because its content remains unchanged after clobbering. */
> +  if (list->offset == item->offset && list->size == item->size
> +  && list->constant && item->constant
> +  && operand_equal_p (list->constant, item->constant, 0))
> +continue;
> +
> +  if (list->offset + list->size > item->offset)
> +return true;
>  }
> -  return p;
> +  return false;
>  }
>
>

Re: [PR90743] Fortran 'allocatable' with OpenACC data/OpenMP 'target' 'map' clauses

2019-06-05 Thread Jakub Jelinek

On Wed, Jun 05, 2019 at 11:25:07AM +0200, Thomas Schwinge wrote:
>   libgomp/
>   PR fortran/90743
>   * oacc-parallel.c (GOACC_parallel_keyed): Handle NULL case.
>   * testsuite/libgomp.fortran/target-allocatable-1.f90: New file.
>   * testsuite/libgomp.oacc-fortran/allocatable-1.f90: New file.
> ---
>  libgomp/oacc-parallel.c   |  9 ++-
>  .../libgomp.fortran/target-allocatable-1.f90  |  8 +++
>  .../libgomp.oacc-fortran/allocatable-1.f90| 70 +++
>  3 files changed, 84 insertions(+), 3 deletions(-)
>  create mode 100644 libgomp/testsuite/libgomp.fortran/target-allocatable-1.f90
>  create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/allocatable-1.f90
> 
> diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
> index e56330f6226b..0c2cfa05a438 100644
> --- a/libgomp/oacc-parallel.c
> +++ b/libgomp/oacc-parallel.c
> @@ -325,9 +325,12 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (void *),
>
>devaddrs = gomp_alloca (sizeof (void *) * mapnum);
>for (i = 0; i < mapnum; i++)
> -devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
> - + tgt->list[i].key->tgt_offset
> - + tgt->list[i].offset);
> +if (tgt->list[i].key != NULL)
> +  devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
> +   + tgt->list[i].key->tgt_offset
> +   + tgt->list[i].offset);
> +else
> +  devaddrs[i] = NULL;

I don't know what OpenACC requires for allocatables, so I can't comment on
this (and it falls under OpenACC maintenance).

> +
> +  !$omp target map(to: a) map(tofrom: b, c, d) map(from: e)
> +  !$acc parallel copyin(a) copy(b, c, d) copyout(e)

Is mixing OpenMP and OpenACC constructs this way defined at all?
I see we reject OpenMP constructs inside of OpenACC contexts, and
using OpenACC constructs inside host OpenMP constructs should be generally
fine too, but mixing OpenMP offloading constructs with OpenACC constructs
sounds wrong.

Jakub

Re: PR C++/63149

2019-06-05 Thread Nina Dinka Ranns

yes, I forgot to attach the latest patch. :)

On Wed, 5 Jun 2019 at 10:24, Nina Dinka Ranns  wrote:
>
> Hi both,
> Addressing all comments in this e-mail, as some are duplicate.
>
> On Tue, 4 Jun 2019 at 20:45, Paolo Carlini  wrote:
> >
> > Hi,
> >
> > On 04/06/19 21:26, Nina Dinka Ranns wrote:
> >
> > Good point, dg-do compile is sufficient to demonstrate the issue.
> >
> > I agree.
> >
> > A couple of additional nits, sorry for mentioning only now.
> >
> >
> > C++63149_2.diff
> >
> > Index: gcc/cp/pt.c
> > ===
> > --- gcc/cp/pt.c (revision 271709)
> > +++ gcc/cp/pt.c (working copy)
> > @@ -26836,7 +26836,7 @@
> >  static tree
> >  listify_autos (tree type, tree auto_node)
> >  {
> > -  tree init_auto = listify (auto_node);
> > +  tree init_auto = listify (strip_top_quals(auto_node));
> >
> > You want a space after strip_top_quals.
>
> fixed.
>
> >
> >tree argvec = make_tree_vec (1);
> >TREE_VEC_ELT (argvec, 0) = init_auto;
> >if (processing_template_decl)
> > Index: gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C
> > ===
> > --- gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C (nonexistent)
> > +++ gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C (working copy)
> > @@ -0,0 +1,12 @@
> > +// Test for PR63149
> > +// { dg-do compile { target c++11 } }
> > +
> > +#include <initializer_list>
> > +
> > +const auto r = { 1, 2, 3 };
> > +using X = decltype(r);
> > +using X = const std::initializer_list<int>;
> > +
> > +int main()
> > +{
> > +}
> >
> > With dg-do compile you don't need a main anymore.
> >
> fixed
>
> > I seem to remember also a couple of minor formatting issues in the
> > ChangeLog entry: just harmonize the format with everything else you find in
> > the ChangeLog, in terms of the usual trivial details: upper case, line
> > lengths and line wraps, etc.
> >
>
> Below is the amended change log. If there is anything else off, I
> would need specifics as I've made all the changes I could spot myself.
> :)
>
> Thanks,
> Nina
>
> 2019-06-04  Nina Dinka Ranns  
> gcc/cp
>
> PR C++/63149
> * pt.c (listify_autos): Use non cv qualified auto_node in
> std::initializer_list.
>
> testsuite/
>
> PR C++/63149
> * g++.dg/cpp0x/initlist-deduce.C: New test.
Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c	(revision 271709)
+++ gcc/cp/pt.c	(working copy)
@@ -26836,7 +26836,7 @@
 static tree
 listify_autos (tree type, tree auto_node)
 {
-  tree init_auto = listify (auto_node);
+  tree init_auto = listify (strip_top_quals (auto_node));
   tree argvec = make_tree_vec (1);
   TREE_VEC_ELT (argvec, 0) = init_auto;
   if (processing_template_decl)
Index: gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C
===
--- gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C	(nonexistent)
+++ gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C	(working copy)
@@ -0,0 +1,8 @@
+// Test for PR63149
+// { dg-do compile { target c++11 } }
+
+#include <initializer_list>
+
+const auto r = { 1, 2, 3 };
+using X = decltype(r);
+using X = const std::initializer_list<int>;

[PR90743] Fortran 'allocatable' with OpenACC data/OpenMP 'target' 'map' clauses

2019-06-05 Thread Thomas Schwinge

Hi!

After having learned from PR90741 "Unreachable second '__builtin_malloc'
for scalar 'allocatable'", I then, in the context of PR90743 "Device-side
'malloc' for Fortran 'allocatable' scalar", had a look at what OpenMP 5.0
says about Fortran 'allocatable' in 'map' clauses, and suggest documenting
that with the following test case.

I'm also making OpenACC do the same by fixing one thing.  (I'm aware we
should really be using 'gomp_map_val', but that's a different change;
likely in context of PR90596 "'GOACC_parallel_keyed' should use
'GOMP_MAP_VARS_TARGET'".)
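
The shape of the OpenACC fix can be sketched in isolation.  This is only an
illustration: 'Target', 'Key', 'Entry' and 'device_addrs' are hypothetical,
simplified stand-ins for the libgomp mapping records, not the real types, and
the assumption that a missing key corresponds to an unallocated 'allocatable'
is inferred from the test case rather than stated by the runtime.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for libgomp's mapping records.
struct Target { std::uintptr_t tgt_start; };
struct Key    { Target *tgt; std::uintptr_t tgt_offset; };
struct Entry  { Key *key; std::uintptr_t offset; };

// Compute one device address per mapped entry.  An entry without a key
// (assumed here: nothing was mapped for it) yields a null device address
// instead of dereferencing a null pointer.
std::vector<void *> device_addrs(const std::vector<Entry> &list) {
  std::vector<void *> out;
  out.reserve(list.size());
  for (const Entry &e : list) {
    if (e.key != nullptr)
      out.push_back(reinterpret_cast<void *>(e.key->tgt->tgt_start
                                             + e.key->tgt_offset
                                             + e.offset));
    else
      out.push_back(nullptr);
  }
  return out;
}
```

The guard mirrors the 'key != NULL' check in the patch below; everything else
is scaffolding to make the sketch self-contained.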

Is the attached patch OK?  If approving this patch, please respond with
"Reviewed-by: NAME " so that your effort will be recorded in the
commit log, see .


Grüße
 Thomas


From a55b64d3dbc7e6c9c649f7f3d23e72a0a8712ee0 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 4 Jun 2019 20:25:41 +0200
Subject: [PATCH] [PR90743] Fortran 'allocatable' with OpenACC data/OpenMP
 'target' 'map' clauses

Test what OpenMP 5.0 has to say on this topic.  And make OpenACC do the same.

	libgomp/
	PR fortran/90743
	* oacc-parallel.c (GOACC_parallel_keyed): Handle NULL case.
	* testsuite/libgomp.fortran/target-allocatable-1.f90: New file.
	* testsuite/libgomp.oacc-fortran/allocatable-1.f90: New file.
---
 libgomp/oacc-parallel.c   |  9 ++-
 .../libgomp.fortran/target-allocatable-1.f90  |  8 +++
 .../libgomp.oacc-fortran/allocatable-1.f90| 70 +++
 3 files changed, 84 insertions(+), 3 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.fortran/target-allocatable-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/allocatable-1.f90

diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index e56330f6226b..0c2cfa05a438 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -325,9 +325,12 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (void *),
   
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
-devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
-			+ tgt->list[i].key->tgt_offset
-			+ tgt->list[i].offset);
+if (tgt->list[i].key != NULL)
+  devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
+			  + tgt->list[i].key->tgt_offset
+			  + tgt->list[i].offset);
+else
+  devaddrs[i] = NULL;
   if (aq == NULL)
 acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs, dims,
 tgt);
diff --git a/libgomp/testsuite/libgomp.fortran/target-allocatable-1.f90 b/libgomp/testsuite/libgomp.fortran/target-allocatable-1.f90
new file mode 100644
index ..9f0b7f0c3b53
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-allocatable-1.f90
@@ -0,0 +1,8 @@
+! Test 'allocatable' with OpenMP 'target' 'map' clauses.
+
+! { dg-do run }
+! { dg-additional-options "-cpp" }
+! { dg-additional-options "-DACC_MEM_SHARED=0" { target offload_device_nonshared_as } }
+! { dg-additional-options "-DACC_MEM_SHARED=1" { target offload_device_shared_as } }
+
+#include "../libgomp.oacc-fortran/allocatable-1.f90"
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/allocatable-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/allocatable-1.f90
new file mode 100644
index ..0941f71f8c30
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/allocatable-1.f90
@@ -0,0 +1,70 @@
+! Test 'allocatable' with OpenACC data clauses.
+
+! This is also 'include'd from '../libgomp.fortran/target-allocatable-1.f90'.
+
+! { dg-do run }
+! { dg-additional-options "-cpp" }
+
+program main
+  implicit none
+  integer, allocatable :: a, b, c, d, e
+
+  allocate (a)
+  a = 11
+
+  b = 25 ! Implicit allocation.
+
+  c = 52 ! Implicit allocation.
+
+  !No 'allocate (d)' here.
+
+  !No 'allocate (e)' here.
+
+  !$omp target map(to: a) map(tofrom: b, c, d) map(from: e)
+  !$acc parallel copyin(a) copy(b, c, d) copyout(e)
+
+  if (.not. allocated (a)) stop 1
+  if (a .ne. 11) stop 2
+  a = 33
+ 
+  if (.not. allocated (b)) stop 3
+  if (b .ne. 25) stop 4
+ 
+  if (.not. allocated (c)) stop 5
+  if (c .ne. 52) stop 6
+  c = 10
+ 
+  if (allocated (d)) stop 7
+  d = 42 ! Implicit allocation, but on device only.
+  if (.not. allocated (d)) stop 8
+  deallocate (d) ! OpenMP requires must be "unallocated upon exit from the region".
+
+  if (allocated (e)) stop 9
+  e = 24 ! Implicit allocation, but on device only.
+  if (.not. allocated (e)) stop 10
+  deallocate (e) ! OpenMP requires must be "unallocated upon exit from the region".
+
+  !$acc end parallel
+  !$omp end target
+
+  if (.not. allocated (a)) stop 20
+#if ACC_MEM_SHARED
+  if (a .ne. 33) stop 21
+#else
+  if (a .ne. 11) stop 22
+#endif
+  deallocate (a)
+
+  if (.not. allocated (b)) stop 23
+  if (b .ne. 25) stop 24
+  deallocate (b)
+
+  if (.not. allocated (c)) stop 25
+  if (c .ne. 10) stop 26
+  deallocate (c)
+
+  if (allocated (d)) stop 27
+
+  if (allocated (e)) stop 28
+
+end program main
-- 
2.17.1




Re: PR C++/63149

2019-06-05 Thread Nina Dinka Ranns

Hi both,
Addressing all comments in this e-mail, as some are duplicate.

On Tue, 4 Jun 2019 at 20:45, Paolo Carlini  wrote:
>
> Hi,
>
> On 04/06/19 21:26, Nina Dinka Ranns wrote:
>
> Good point, dg-do compile is sufficient to demonstrate the issue.
>
> I agree.
>
> A couple of additional nits, sorry for mentioning only now.
>
>
> C++63149_2.diff
>
> Index: gcc/cp/pt.c
> ===
> --- gcc/cp/pt.c (revision 271709)
> +++ gcc/cp/pt.c (working copy)
> @@ -26836,7 +26836,7 @@
>  static tree
>  listify_autos (tree type, tree auto_node)
>  {
> -  tree init_auto = listify (auto_node);
> +  tree init_auto = listify (strip_top_quals(auto_node));
>
> You want a space after strip_top_quals.

fixed.

>
>tree argvec = make_tree_vec (1);
>TREE_VEC_ELT (argvec, 0) = init_auto;
>if (processing_template_decl)
> Index: gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C
> ===
> --- gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C (nonexistent)
> +++ gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C (working copy)
> @@ -0,0 +1,12 @@
> +// Test for PR63149
> +// { dg-do compile { target c++11 } }
> +
> +#include <initializer_list>
> +
> +const auto r = { 1, 2, 3 };
> +using X = decltype(r);
> +using X = const std::initializer_list<int>;
> +
> +int main()
> +{
> +}
>
> With dg-do compile you don't need a main anymore.
>
fixed

> I seem to remember also a couple of minor formatting issues in the ChangeLog
> entry: just harmonize the format with everything else you find in the
> ChangeLog, in terms of the usual trivial details: upper case, line lengths
> and line wraps, etc.
>

Below is the amended change log. If there is anything else off, I
would need specifics as I've made all the changes I could spot myself.
:)

Thanks,
Nina

2019-06-04  Nina Dinka Ranns  
gcc/cp

PR C++/63149
* pt.c (listify_autos): Use non cv qualified auto_node in
std::initializer_list.

testsuite/

PR C++/63149
* g++.dg/cpp0x/initlist-deduce.C: New test.
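
For readers following along, the property the new test checks can also be
written as a standalone snippet.  This is a sketch, not part of the patch:
'deduced_ok' is a name introduced here for illustration, and the snippet uses
std::is_same instead of the alias redeclaration trick used in the test.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// The top-level const belongs to the variable; the element type deduced
// from the braced list should stay plain int.
const auto r = { 1, 2, 3 };

// True when decltype(r) is const std::initializer_list<int>, which is the
// type the test case expects.
constexpr bool deduced_ok =
    std::is_same<decltype(r), const std::initializer_list<int>>::value;
```

On a compiler that still has the PR63149 behaviour, 'deduced_ok' would be
expected to be false.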

Re: [PATCH] Fix var-tracking ICE with complex subregs (PR debug/90733)

2019-06-05 Thread Richard Biener

On Wed, 5 Jun 2019, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase, we have:
> (debug_insn 45 20 22 2 (var_location:DI D#4 (const_int 0 [0])) -1 (nil))
> (debug_insn 22 45 26 2 (debug_marker) "pr90733.c":16:3 -1 (nil))
> (debug_insn 26 22 27 2 (var_location:CSI D#3 (subreg:CSI (debug_expr:DI D#4) 0)) "pr90733.c":16:10 -1 (nil))
> (debug_insn 27 26 28 2 (var_location:SI D#2 (subreg:SI (debug_expr:CSI D#3) 0)) -1 (nil))
> (debug_insn 28 27 29 2 (var_location:SI D#1 (clobber (const_int 0 [0]))) -1 (nil))
> (debug_insn 29 28 30 2 (var_location:CSI y$c (concat:CSI (debug_expr:SI D#2) (debug_expr:SI D#1))) -1 (nil))
> During var-tracking, we first propagate D#4 into the D#3 definition and want
> to simplify_subreg (CSImode, const0_rtx, DImode, 0), unfortunately that
> fails due to simplify_immed_subreg having:
>   /* We have no way to represent a complex constant at the rtl level.  */
>   if (COMPLEX_MODE_P (outermode))
> return NULL_RTX;
> and later on the following vt_expand_loc_callback hunk forces the creation
> of SUBREG, as in debug insns/notes we generally can handle even invalid
> SUBREGs.  We can't handle SUBREGs where the inner mode is VOIDmode though,
> we don't really know what to do with that.  On this testcase, after creating
> such a subreg we simplify another subreg, the SImode lowpart subreg of that
> and that is where we ICE, because subreg of a subreg is invalid, we try
> harder to simplify it and don't know what to do with the VOIDmode in there.
> 
> The following patch fixes the ICE by refusing to create raw SUBREGs with
> VOIDmode inner mode.  Bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

OK.

Richard.

> Incrementally, we could add a special case for complex modes to
> vt_expand_loc_callback: if simplify_subreg fails for a complex mode, try to
> simplify the lowpart and highpart subregs for the scalar halves and, if both
> succeed, create a CONCAT of those, which is what we use in debug insns (as
> debug_insn 29 above shows).  Another option might be to use CONST_VECTORs
> even with complex modes, but I guess that would be a far bigger change.
> 
> 2019-06-05  Jakub Jelinek  
> 
>   PR debug/90733
>   * var-tracking.c (vt_expand_loc_callback): Don't create raw subregs
>   with VOIDmode inner operands.
> 
>   * gcc.dg/pr90733.c: New test.
> 
> --- gcc/var-tracking.c.jj 2019-05-03 15:22:07.0 +0200
> +++ gcc/var-tracking.c2019-06-04 16:32:35.014561614 +0200
> @@ -8491,7 +8491,7 @@ vt_expand_loc_callback (rtx x, bitmap re
>  
>/* Invalid SUBREGs are ok in debug info.  ??? We could try
>alternate expansions for the VALUE as well.  */
> -  if (!result)
> +  if (!result && GET_MODE (subreg) != VOIDmode)
>   result = gen_rtx_raw_SUBREG (GET_MODE (x), subreg, SUBREG_BYTE (x));
>  
>return result;
> --- gcc/testsuite/gcc.dg/pr90733.c.jj 2019-06-04 16:43:21.749638839 +0200
> +++ gcc/testsuite/gcc.dg/pr90733.c2019-06-04 16:42:53.181083748 +0200
> @@ -0,0 +1,22 @@
> +/* PR debug/90733 */
> +/* { dg-do compile } */
> +/* { dg-options "-g -O2 -w" } */
> +
> +struct S { unsigned a : 1; };
> +union U { struct S b; _Complex unsigned c; };
> +
> +union U
> +foo (union U d)
> +{
> +  union U e = d;
> +  return e;
> +}
> +
> +int
> +bar (void)
> +{
> +  union U x, y;
> +  x.c = x.b.a;
> +  y = foo (x);
> +  return x.c != y.c;
> +}
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

[PATCH] Fix HTML headers and titles for 'Installing GCC' pages (PR web/87933).

2019-06-05 Thread Martin Liška

Hi.

The patch fixes wrong titles/headers in 'Installing GCC' pages.

Tested with make html.
Ready for trunk?

Martin

gcc/ChangeLog:

2019-06-05  Martin Liska  

PR web/87933
* doc/install.texi: Fix HTML headers and
titles for 'Installing GCC' pages.
---
 gcc/doc/install.texi | 1 -
 1 file changed, 1 deletion(-)


diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index e5a7d83bfd2..29d047004fd 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2,7 +2,6 @@
 @c @ifnothtml
 @c %**start of header
 @setfilename gccinstall.info
-@settitle Installing GCC
 @setchapternewpage odd
 @c %**end of header
 @c @end ifnothtml

[PATCH] Find constant definition for by-ref argument using dominance information (PR ipa/90401)

2019-06-05 Thread Feng Xue OS

IPA-CP cannot identify a constant by-ref argument to a function if the
definition of the argument is not in the same basic block as the call site.
This is because IPA-CP only does a local search in the call site's basic
block.  So this patch implements an enhanced algorithm, which uses dominance
information to guide the traversal of the virtual SSA web, to find constant
definitions in dominating basic blocks.
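
A minimal sketch of the source shape in question, assuming nothing beyond the
description above: 'Params', 'apply' and 'compute' are hypothetical names, and
whether GCC actually propagates the constants depends on version, options and
inlining decisions.

```cpp
#include <cassert>

struct Params { int scale; int bias; };

// Callee that receives the aggregate by reference.
static int apply(const Params &p, int x) { return p.scale * x + p.bias; }

int compute(int x, bool clamp) {
  Params p;
  p.scale = 2;   // Constant stores happen in the entry block...
  p.bias = 1;
  if (clamp && x > 100)
    x = 100;     // ...and this branch splits the control flow,
  return apply(p, x);  // so the call sits in a later, dominated block.
}
```

The stores to 'p' dominate the call but are not in the call's own basic block,
which is the situation a purely local search misses.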

Feng

---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 526ed45be89..cc076c337af 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2019-06-04  Feng Xue  
+
+   PR ipa/90401
+   * ipa-prop.c (add_to_agg_contents_list): New function.
+   (clobber_by_agg_contents_list_p): Likewise.
+   (extract_mem_content): Likewise.
+   (strictly_dominated_by_ssa_p): Likewise.
+   (get_place_in_agg_contents_list): Delete.
+   (determine_known_aggregate_parts): Renamed from
+   determine_locally_known_aggregate_parts.
+
 2019-06-04  Segher Boessenkool  
 
* config/rs6000/constraints.md (define_register_constraint "wp"):
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index d86c2f3db55..405cdf7730b 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -1458,7 +1458,7 @@ get_ssa_def_if_simple_copy (tree rhs)
   return rhs;
 }
 
-/* Simple linked list, describing known contents of an aggregate beforere
+/* Simple linked list, describing known contents of an aggregate before
call.  */
 
 struct ipa_known_agg_contents_list
@@ -1471,41 +1471,64 @@ struct ipa_known_agg_contents_list
   struct ipa_known_agg_contents_list *next;
 };
 
-/* Find the proper place in linked list of ipa_known_agg_contents_list
-   structures where to put a new one with the given LHS_OFFSET and LHS_SIZE,
-   unless there is a partial overlap, in which case return NULL, or such
-   element is already there, in which case set *ALREADY_THERE to true.  */
+/* Add a known content item into a linked list of ipa_known_agg_contents_list,
+   in which all elements are sorted ascendingly by offset. When ALLOW_DUP is
+   false, insert the item only if there is no duplicate one (with same offset
+   and size) in the list. And if the item is added, return true. */
 
-static struct ipa_known_agg_contents_list **
-get_place_in_agg_contents_list (struct ipa_known_agg_contents_list **list,
-   HOST_WIDE_INT lhs_offset,
-   HOST_WIDE_INT lhs_size,
-   bool *already_there)
+static inline bool
+add_to_agg_contents_list (struct ipa_known_agg_contents_list **plist,
+  struct ipa_known_agg_contents_list *item,
+  bool allow_dup = true)
 {
-  struct ipa_known_agg_contents_list **p = list;
-  while (*p && (*p)->offset < lhs_offset)
+  struct ipa_known_agg_contents_list *list = *plist;
+
+  for (; list; list = list->next)
 {
-  if ((*p)->offset + (*p)->size > lhs_offset)
-   return NULL;
-  p = &(*p)->next;
+  if (list->offset > item->offset)
+break;
+
+  if (list->offset == item->offset && list->size == item->size
+  && !allow_dup)
+return false;
+
+  plist = &list->next;
 }
 
-  if (*p && (*p)->offset < lhs_offset + lhs_size)
+  item->next = list;
+  *plist = item;
+  return true;
+}
+
+/* Check whether a given known content is clobbered by certain element in
+   a linked list of ipa_known_agg_contents_list. A special case is that
+   we can ignore those constant items completely same as the given item,
+   that is they have same offset/size/value. */
+
+static inline bool
+clobber_by_agg_contents_list_p (struct ipa_known_agg_contents_list *list,
+struct ipa_known_agg_contents_list *item)
+{
+  for (; list; list = list->next)
 {
-  if ((*p)->offset == lhs_offset && (*p)->size == lhs_size)
-   /* We already know this value is subsequently overwritten with
-  something else.  */
-   *already_there = true;
-  else
-   /* Otherwise this is a partial overlap which we cannot
-  represent.  */
-   return NULL;
+  if (list->offset > item->offset)
+return list->offset < item->offset + item->size;
+
+  /* For the constant item, we can skip comparison with identical items in
+ the list, because its content remains unchanged after clobbering. */
+  if (list->offset == item->offset && list->size == item->size
+  && list->constant && item->constant
+  && operand_equal_p (list->constant, item->constant, 0))
+continue;
+
+  if (list->offset + list->size > item->offset)
+return true;
 }
-  return p;
+  return false;
 }
 
 /* Build aggregate jump function from LIST, assuming there are exactly
-   CONST_COUNT constant entries there and that th offset of the passed argument
+   CONST_COUNT constant entries there and that offset of the passed argument
is ARG_OFFSET and store it into JFUNC.  */
 
 static void
@@ -1528,6
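
The two list helpers in the hunk above can be tried out in isolation.  The
following is a sketch under stated assumptions, not the GCC code: 'Item',
'add_sorted' and 'clobbered_by' are hypothetical names, and an int stands in
for the tree constant (0 meaning "not a known constant"), with plain equality
replacing operand_equal_p.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ipa_known_agg_contents_list.
struct Item {
  std::int64_t offset;
  std::int64_t size;
  int constant;   // 0 means "not a known constant" in this sketch
  Item *next;
};

// Insert ITEM so that the list stays sorted ascending by offset.  When
// ALLOW_DUP is false, refuse an item whose offset and size match an
// existing element.  Returns true if the item was added.
bool add_sorted(Item **plist, Item *item, bool allow_dup = true) {
  Item *list = *plist;
  for (; list; list = list->next) {
    if (list->offset > item->offset)
      break;
    if (list->offset == item->offset && list->size == item->size
        && !allow_dup)
      return false;
    plist = &list->next;   // remember the link that will point to ITEM
  }
  item->next = list;
  *plist = item;
  return true;
}

// Return true if some element of LIST overlaps ITEM, ignoring elements that
// are identical constants (same offset, size and value).
bool clobbered_by(const Item *list, const Item *item) {
  for (; list; list = list->next) {
    if (list->offset > item->offset)
      return list->offset < item->offset + item->size;
    if (list->offset == item->offset && list->size == item->size
        && list->constant && item->constant
        && list->constant == item->constant)
      continue;
    if (list->offset + list->size > item->offset)
      return true;
  }
  return false;
}
```

Because the list is sorted, the clobber check can stop at the first element
that starts after ITEM's offset.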

[PATCH] Fix var-tracking ICE with complex subregs (PR debug/90733)

2019-06-05 Thread Jakub Jelinek

Hi!

On the following testcase, we have:
(debug_insn 45 20 22 2 (var_location:DI D#4 (const_int 0 [0])) -1 (nil))
(debug_insn 22 45 26 2 (debug_marker) "pr90733.c":16:3 -1 (nil))
(debug_insn 26 22 27 2 (var_location:CSI D#3 (subreg:CSI (debug_expr:DI D#4) 0)) "pr90733.c":16:10 -1 (nil))
(debug_insn 27 26 28 2 (var_location:SI D#2 (subreg:SI (debug_expr:CSI D#3) 0)) -1 (nil))
(debug_insn 28 27 29 2 (var_location:SI D#1 (clobber (const_int 0 [0]))) -1 (nil))
(debug_insn 29 28 30 2 (var_location:CSI y$c (concat:CSI (debug_expr:SI D#2) (debug_expr:SI D#1))) -1 (nil))
During var-tracking, we first propagate D#4 into the D#3 definition and want
to simplify_subreg (CSImode, const0_rtx, DImode, 0), unfortunately that
fails due to simplify_immed_subreg having:
  /* We have no way to represent a complex constant at the rtl level.  */
  if (COMPLEX_MODE_P (outermode))
return NULL_RTX;
and later on the following vt_expand_loc_callback hunk forces the creation
of SUBREG, as in debug insns/notes we generally can handle even invalid
SUBREGs.  We can't handle SUBREGs where the inner mode is VOIDmode though,
we don't really know what to do with that.  On this testcase, after creating
such a subreg we simplify another subreg, the SImode lowpart subreg of that
and that is where we ICE, because subreg of a subreg is invalid, we try
harder to simplify it and don't know what to do with the VOIDmode in there.

The following patch fixes the ICE by refusing to create raw SUBREGs with
VOIDmode inner mode.  Bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?
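
As a toy model of the guard (not the RTL code itself): 'Mode', 'Expr',
'try_simplify' and 'expand_subreg' are hypothetical names, and the
simplification step is assumed to always fail so that only the fallback path
is exercised.

```cpp
#include <cassert>
#include <memory>

// Toy machine modes; Void models VOIDmode, which constants such as
// (const_int 0) carry in RTL.
enum class Mode { Void, SI, DI, CSI };

struct Expr {
  Mode mode;
  std::shared_ptr<Expr> inner;  // set when this node wraps another one
};

// Toy stand-in for simplify_subreg (assumption: it always gives up here).
std::shared_ptr<Expr> try_simplify(Mode, const std::shared_ptr<Expr> &) {
  return nullptr;
}

// Mirrors the patched condition: fall back to a raw wrapper only when the
// inner operand has a real mode; otherwise return no result at all.
std::shared_ptr<Expr> expand_subreg(Mode outer,
                                    const std::shared_ptr<Expr> &op) {
  std::shared_ptr<Expr> result = try_simplify(outer, op);
  if (!result && op->mode != Mode::Void)
    result = std::make_shared<Expr>(Expr{outer, op});
  return result;
}
```

With this guard a mode-less constant never ends up wrapped, so a later
simplification of an outer subreg cannot run into it.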

Incrementally, we could add a special case for complex modes to
vt_expand_loc_callback: if simplify_subreg fails for a complex mode, try to
simplify the lowpart and highpart subregs for the scalar halves and, if both
succeed, create a CONCAT of those, which is what we use in debug insns (as
debug_insn 29 above shows).  Another option might be to use CONST_VECTORs
even with complex modes, but I guess that would be a far bigger change.

2019-06-05  Jakub Jelinek  

PR debug/90733
* var-tracking.c (vt_expand_loc_callback): Don't create raw subregs
with VOIDmode inner operands.

* gcc.dg/pr90733.c: New test.

--- gcc/var-tracking.c.jj   2019-05-03 15:22:07.0 +0200
+++ gcc/var-tracking.c  2019-06-04 16:32:35.014561614 +0200
@@ -8491,7 +8491,7 @@ vt_expand_loc_callback (rtx x, bitmap re
 
   /* Invalid SUBREGs are ok in debug info.  ??? We could try
 alternate expansions for the VALUE as well.  */
-  if (!result)
+  if (!result && GET_MODE (subreg) != VOIDmode)
result = gen_rtx_raw_SUBREG (GET_MODE (x), subreg, SUBREG_BYTE (x));
 
   return result;
--- gcc/testsuite/gcc.dg/pr90733.c.jj   2019-06-04 16:43:21.749638839 +0200
+++ gcc/testsuite/gcc.dg/pr90733.c  2019-06-04 16:42:53.181083748 +0200
@@ -0,0 +1,22 @@
+/* PR debug/90733 */
+/* { dg-do compile } */
+/* { dg-options "-g -O2 -w" } */
+
+struct S { unsigned a : 1; };
+union U { struct S b; _Complex unsigned c; };
+
+union U
+foo (union U d)
+{
+  union U e = d;
+  return e;
+}
+
+int
+bar (void)
+{
+  union U x, y;
+  x.c = x.b.a;
+  y = foo (x);
+  return x.c != y.c;
+}

Jakub

Re: [wwwdocs] Document existence of openacc-gcc-9-branch

2019-06-05 Thread Thomas Schwinge

Hi Julian!

On Tue, 4 Jun 2019 23:05:53 +0100, Julian Brown  wrote:
> I've pushed a new branch "openacc-gcc-9-branch" to the Git
> mirror (i.e. as a Git-only branch), for development of OpenACC and
> related functionality on top of the GCC 9 branch. It's currently based
> off the gcc-9_1_0-release tag, and contains a number of patches mainly
> merged from either the openacc-gcc-8-branch, or from further-developed
> versions of those patches that have been submitted for upstream review.
> 
> This patch updates the svn.html page to point to the new branch rather
> than the old openacc-gcc-8-branch, which is retired now.
> 
> OK to commit?

As obvious, but please also add an "openacc-gcc-8-branch" stanza next to
"openacc-gcc-7-branch" in the "Merged Development Branches" section, and
update the "gomp-4_0-branch" and "openacc-gcc-7-branch" stanzas
accordingly.

Well, actually please move "gomp-4_0-branch", "openacc-gcc-7-branch", and
"openacc-gcc-8-branch" into the "Inactive Development Branches" section,
for all "These branches are inactive and contain work that might not been
merged": they all contain some changes that have not been forward-ported
to their later instances.


Grüße
 Thomas


> --- htdocs/svn.html   30 Sep 2018 14:38:47 -  1.225
> +++ htdocs/svn.html   4 Jun 2019 17:10:37 -
> @@ -291,18 +291,18 @@ the command svn log --stop-on-copy
>Patches should be marked with the tag [no-undefined-overflow]
>in the subject line.  The branch is maintained by Richard Biener.
>  
> -   href="https://gcc.gnu.org/wiki/OpenACC;>openacc-gcc-8-branch
> +   href="https://gcc.gnu.org/wiki/OpenACC;>openacc-gcc-9-branch
>This https://gcc.gnu.org/wiki/GitMirror;>Git-only branch 
> is
>used for collaborative development
>of https://gcc.gnu.org/wiki/OpenACC;>OpenACC support and 
> related
>functionality, such
>as https://gcc.gnu.org/wiki/Offloading;>offloading support.  
> The
> -  branch is based on gcc-8-branch.  Find it
> +  branch is based on gcc-9-branch.  Find it
>at git://gcc.gnu.org/git/gcc.git,
> -   href="https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-8-branch;>https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-8-branch,
> +   href="https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-9-branch;>https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-9-branch,
>or
> -   href="https://github.com/gcc-mirror/gcc/tree/openacc-gcc-8-branch;>https://github.com/gcc-mirror/gcc/tree/openacc-gcc-8-branch.
> -  Please send email with a short-hand [og8] tag in the subject
> +   href="https://github.com/gcc-mirror/gcc/tree/openacc-gcc-9-branch;>https://github.com/gcc-mirror/gcc/tree/openacc-gcc-9-branch.
> +  Please send email with a short-hand [og9] tag in the subject
>line, and use ChangeLog.openacc files.
>  
>https://gcc.gnu.org/wiki/plugins;>plugins



[committed] Force monotonic loops if lastprivate(conditional:) is present

2019-06-05 Thread Jakub Jelinek

Hi!

The way we implement lastprivate(conditional:) uses 2 per-thread variables:
the privatized value of the variable, and another private variable holding
the highest iteration count that stored to the privatized variable.  The two
are only merged at the end of the construct.  This doesn't really work for
non-monotonic loops (admittedly, at least right now all loops are monotonic),
because while the counter private variable could hold the maximum iteration
count properly, the other private variable will hold the value from whatever
iteration has been handled last by the thread and stored to the variable.
We'd need to do the merging and clearing also at the point where the
iteration goes nonmonotonically down, which would be slow.

So, the following patch just forces the use of monotonic APIs when we have
lastprivate(conditional:).
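
The problem can be reproduced with a small single-threaded simulation.  This
is only a model of the scheme described above, not the code GCC generates:
'ThreadState', 'store' and 'merge' are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Per-thread state: the privatized value plus the highest iteration that
// stored to it.
struct ThreadState {
  long max_iter = -1;
  int value = 0;
};

// Record a conditional store performed in iteration ITER.  The value is
// simply overwritten, while the counter keeps the maximum.
void store(ThreadState &t, long iter, int v) {
  t.value = v;
  if (iter > t.max_iter)
    t.max_iter = iter;
}

// Merge at the end of the construct: the thread that saw the highest
// iteration provides the final value.
int merge(const std::vector<ThreadState> &threads, int original) {
  long best = -1;
  int result = original;
  for (const ThreadState &t : threads)
    if (t.max_iter > best) {
      best = t.max_iter;
      result = t.value;
    }
  return result;
}
```

If one thread handles iteration 5 and then iteration 3, its counter stays at 5
but its value comes from iteration 3, so the merge picks the wrong value; with
monotonic scheduling each thread's last store is also its highest iteration.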

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-06-05  Jakub Jelinek  

* omp-expand.c (struct omp_region): Add has_lastprivate_conditional
member.
(expand_parallel_call): If region->inner->has_lastprivate_conditional,
treat it like explicit monotonic schedule modifier.
(expand_omp_for): Initialize has_lastprivate_conditional.
If fd.lastprivate_conditional != 0, treat it like explicit monotonic
schedule modifier.

--- gcc/omp-expand.c.jj 2019-06-03 12:04:20.342277395 +0200
+++ gcc/omp-expand.c2019-06-04 15:45:54.450078636 +0200
@@ -100,6 +100,9 @@ struct omp_region
   /* True if this is a combined parallel+workshare region.  */
   bool is_combined_parallel;
 
+  /* Copy of fd.lastprivate_conditional != 0.  */
+  bool has_lastprivate_conditional;
+
   /* The ordered stmt if type is GIMPLE_OMP_ORDERED and it has
  a depend clause.  */
   gomp_ordered *ord_stmt;
@@ -601,8 +604,12 @@ expand_parallel_call (struct omp_region
  switch (region->inner->sched_kind)
{
case OMP_CLAUSE_SCHEDULE_RUNTIME:
- if ((region->inner->sched_modifiers
-  & OMP_CLAUSE_SCHEDULE_NONMONOTONIC) != 0)
+ /* For lastprivate(conditional:), our implementation
+requires monotonic behavior.  */
+ if (region->inner->has_lastprivate_conditional != 0)
+   start_ix2 = 3;
+ else if ((region->inner->sched_modifiers
+  & OMP_CLAUSE_SCHEDULE_NONMONOTONIC) != 0)
start_ix2 = 6;
  else if ((region->inner->sched_modifiers
& OMP_CLAUSE_SCHEDULE_MONOTONIC) == 0)
@@ -613,7 +620,8 @@ expand_parallel_call (struct omp_region
case OMP_CLAUSE_SCHEDULE_DYNAMIC:
case OMP_CLAUSE_SCHEDULE_GUIDED:
  if ((region->inner->sched_modifiers
-  & OMP_CLAUSE_SCHEDULE_MONOTONIC) == 0)
+  & OMP_CLAUSE_SCHEDULE_MONOTONIC) == 0
+ && !region->inner->has_lastprivate_conditional)
{
  start_ix2 = 3 + region->inner->sched_kind;
  break;
@@ -6228,6 +6236,7 @@ expand_omp_for (struct omp_region *regio
                        &fd, loops);
   region->sched_kind = fd.sched_kind;
   region->sched_modifiers = fd.sched_modifiers;
+  region->has_lastprivate_conditional = fd.lastprivate_conditional != 0;
 
   gcc_assert (EDGE_COUNT (region->entry->succs) == 2);
   BRANCH_EDGE (region->entry)->flags &= ~EDGE_ABNORMAL;
@@ -6280,14 +6289,16 @@ expand_omp_for (struct omp_region *regio
   switch (fd.sched_kind)
{
case OMP_CLAUSE_SCHEDULE_RUNTIME:
- if ((fd.sched_modifiers & OMP_CLAUSE_SCHEDULE_NONMONOTONIC) != 0)
+ if ((fd.sched_modifiers & OMP_CLAUSE_SCHEDULE_NONMONOTONIC) != 0
+ && fd.lastprivate_conditional == 0)
{
  gcc_assert (!fd.have_ordered);
  fn_index = 6;
  sched = 4;
}
  else if ((fd.sched_modifiers & OMP_CLAUSE_SCHEDULE_MONOTONIC) == 0
-  && !fd.have_ordered)
+  && !fd.have_ordered
+  && fd.lastprivate_conditional == 0)
fn_index = 7;
  else
{
@@ -6298,7 +6309,8 @@ expand_omp_for (struct omp_region *regio
case OMP_CLAUSE_SCHEDULE_DYNAMIC:
case OMP_CLAUSE_SCHEDULE_GUIDED:
  if ((fd.sched_modifiers & OMP_CLAUSE_SCHEDULE_MONOTONIC) == 0
- && !fd.have_ordered)
+ && !fd.have_ordered
+ && fd.lastprivate_conditional == 0)
{
  fn_index = 3 + fd.sched_kind;
  sched = (fd.sched_kind == OMP_CLAUSE_SCHEDULE_GUIDED) + 2;

Jakub
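
The effect of the hunks above on schedule selection can be summarized with a small model. This is a minimal sketch, not GCC code: SchedKind, Modifiers and may_pick_nonmonotonic are hypothetical names. The function only reports whether the patched conditions in expand_omp_for would take one of the branches that set fn_index to 6, 7 or 3 + sched_kind. Schedule kinds not touched by these hunks are assumed to return false.

```cpp
// Hypothetical model of the patched conditions in expand_omp_for.
// None of these names exist in GCC; they only mirror the logic of the diff.
enum class SchedKind { Static, Dynamic, Guided, Auto, Runtime };

struct Modifiers {
  bool monotonic;     // models OMP_CLAUSE_SCHEDULE_MONOTONIC being set
  bool nonmonotonic;  // models OMP_CLAUSE_SCHEDULE_NONMONOTONIC being set
};

// Returns true when the loop may take one of the branches that the patch
// now guards with "fd.lastprivate_conditional == 0".
bool may_pick_nonmonotonic(SchedKind kind, Modifiers mods,
                           bool have_ordered, bool lastprivate_conditional) {
  // lastprivate(conditional:) is treated like an explicit monotonic modifier.
  if (lastprivate_conditional)
    return false;
  switch (kind) {
  case SchedKind::Runtime:
    // fn_index 6: explicit nonmonotonic modifier.
    // fn_index 7: no monotonic modifier and no ordered clause.
    return mods.nonmonotonic || (!mods.monotonic && !have_ordered);
  case SchedKind::Dynamic:
  case SchedKind::Guided:
    // fn_index = 3 + sched_kind.
    return !mods.monotonic && !have_ordered;
  default:
    // Assumption: other schedule kinds are unaffected by these hunks.
    return false;
  }
}
```

For example, a schedule(dynamic) loop without modifiers passes the check on its own, but stops passing it as soon as lastprivate_conditional is true.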

[committed] Add support for lastprivate (conditional: ) with reference arguments

2019-06-05 Thread Jakub Jelinek

Hi!

The following patch fixes handling of references in lastprivate
(conditional:).
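
As an illustration of the case this covers, here is a minimal sketch of a loop whose lastprivate (conditional:) item is a reference parameter. The function name last_multiple_of_7 is made up for this example. It assumes a compiler with OpenMP 5.0 conditional support when built with -fopenmp, and it produces the same result when the pragma is ignored.

```cpp
// Illustrative only: x is a reference, which is the case the patch handles.
// After the loop, x holds the value written by the sequentially last
// iteration that actually assigned it.
int last_multiple_of_7 (int &x)
{
  #pragma omp parallel for lastprivate (conditional: x)
  for (int i = 0; i < 100; ++i)
    if (i % 7 == 0)
      x = i;
  return x;
}
```

Starting from int v = -1, a call to last_multiple_of_7 (v) should leave v equal to 98, the largest multiple of 7 below 100.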

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-06-05  Jakub Jelinek  

* omp-low.c (lower_rec_input_clauses): For lastprivate conditional
references, look up in the hash map the MEM_REF operand instead of the
MEM_REF itself.
(lower_omp_1): When looking for lastprivate conditional assignments,
handle MEM_REFs with REFERENCE_TYPE operands.

* testsuite/libgomp.c++/lastprivate-conditional-1.C: New test.
* testsuite/libgomp.c++/lastprivate-conditional-2.C: New test.

--- gcc/omp-low.c.jj 2019-06-03 15:12:45.535613313 +0200
+++ gcc/omp-low.c   2019-06-03 17:57:42.060631242 +0200
@@ -4795,8 +4795,14 @@ lower_rec_input_clauses (tree clauses, g
  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
  && OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (c))
{
- tree v
-   = *ctx->lastprivate_conditional_map->get (new_var);
+ tree v = new_var;
+ if (!DECL_P (v))
+   {
+ gcc_assert (TREE_CODE (v) == MEM_REF);
+ v = TREE_OPERAND (v, 0);
+ gcc_assert (DECL_P (v));
+   }
+ v = *ctx->lastprivate_conditional_map->get (v);
  tree t = create_tmp_var (TREE_TYPE (v));
  tree z = build_zero_cst (TREE_TYPE (v));
  tree orig_v
@@ -10835,6 +10841,11 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p
  else if (!up->lastprivate_conditional_map)
break;
  tree lhs = get_base_address (gimple_assign_lhs (stmt));
+ if (TREE_CODE (lhs) == MEM_REF
+ && DECL_P (TREE_OPERAND (lhs, 0))
+ && TREE_CODE (TREE_TYPE (TREE_OPERAND (lhs,
+0))) == REFERENCE_TYPE)
+   lhs = TREE_OPERAND (lhs, 0);
  if (DECL_P (lhs))
if (tree *v = up->lastprivate_conditional_map->get (lhs))
  {
--- libgomp/testsuite/libgomp.c++/lastprivate-conditional-1.C.jj 2019-06-03 17:09:39.808619044 +0200
+++ libgomp/testsuite/libgomp.c++/lastprivate-conditional-1.C 2019-06-03 17:28:00.612116989 +0200
@@ -0,0 +1,62 @@
+extern "C" void abort ();
+int w;
+struct S { int s, &t; S () : s (0), t (w) {}; void foo (short &); bool bar (int, int); void baz (short &); };
+
+bool
+S::bar (int i, int q)
+{
+  switch (q)
+{
+case 0: return (i % 17) == 7;
+case 1: return (i % 19) == 2;
+case 2: return (i % 23) == 5;
+default: abort ();
+}
+}
+
+void
+S::foo (short &x)
+{
+  #pragma omp for lastprivate (conditional: x, s, t)
+  for (int i = 0; i < 1025; ++i)
+{
+  if (bar (i, 0))
+   x = i;
+  if (bar (i, 1))
+   s = i + 3;
+  if (bar (i, 2))
+   t = i + 6;
+}
+}
+
+void
+S::baz (short &x)
+{
+  #pragma omp parallel for lastprivate (conditional: x, s, t) collapse (3)
+  for (int i = 0; i < 15; ++i)
+for (int j = -4; j < 9; j++)
+  for (int k = 12; k > 7; --k)
+   {
+ int l = (k - 8) + (j + 4) * 5 + i * 13 * 5;
+ if (bar (l, 0))
+   x = l;
+ if (bar (l, 1))
+   s = l + 3;
+ if (bar (l, 2))
+   t = l + 6;
+   }
+}
+
+int
+main ()
+{
+  short x;
+  S s;
+  #pragma omp parallel
+  s.foo (x);
+  if (x != 1010 || s.s != 1012 || s.t != 1023)
+abort ();
+  s.baz (x);
+  if (x != 959 || s.s != 974 || s.t != 977)
+abort ();
+}
--- libgomp/testsuite/libgomp.c++/lastprivate-conditional-2.C.jj 2019-06-03 17:46:38.207960734 +0200
+++ libgomp/testsuite/libgomp.c++/lastprivate-conditional-2.C 2019-06-03 17:46:22.542204491 +0200
@@ -0,0 +1,104 @@
+extern "C" void abort ();
+int w;
+struct S {
+  int s, &t;
+  int *p;
+  S (int *x) : s (0), t (w), p(x) {};
+  void foo (short &);
+  void bar (short &);
+  void baz (short &);
+  void qux (short &);
+};
+
+void
+S::foo (short &x)
+{
+  #pragma omp simd lastprivate (conditional: x, s, t)
+  for (int i = 0; i < 1025; ++i)
+{
+  if (p[i])
+   x = i;
+  if (p[i + 1025])
+   s = i + 3;
+  if (p[i + 2 * 1025])
+   t = i + 6;
+}
+}
+
+void
+S::bar (short &x)
+{
+  #pragma omp simd lastprivate (conditional: x, s, t) collapse (3) if (0)
+  for (int i = 0; i < 15; ++i)
+for (int j = -4; j < 9; j++)
+  for (int k = 12; k > 7; --k)
+   {
+ int l = (k - 8) + (j + 4) * 5 + i * 13 * 5;
+ if (p[l])
+   x = l;
+ if (p[l + 1025])
+   s = l + 3;
+ if (p[l + 1025 * 2])
+   t = l + 6;
+   }
+}
+
+void
+S::baz (short &x)
+{
+  #pragma omp parallel for simd lastprivate (conditional: x, s, t) if (simd: 0)
+  for (int i = 0; i < 1025; ++i)
+{
+  if

[committed] Fix up reference handling on simd constructs

2019-06-05 Thread Jakub Jelinek

Hi!

To be able to properly vectorize code that uses private, lastprivate or linear
clauses with reference arguments, we should be using "omp simd array" types
holding what those references bind to.  The following patch in addition
disables "omp simd array" vectorization in loops where those reference
arguments refer to variable length types.
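
A minimal sketch of the kind of loop this is about: a simd construct whose lastprivate item is a reference to a fixed-size type. The function name sum_and_last is invented for this example, and whether the loop is actually vectorized depends on the compiler version and options. Without -fopenmp or -fopenmp-simd the pragma is ignored and the result is the same.

```cpp
// Illustrative only: "last" is a reference to int (not a variable length
// type), so it is the kind of reference argument the description says
// should be backed by an "omp simd array".
int sum_and_last (const int *a, int n, int &last)
{
  int sum = 0;
  #pragma omp simd reduction (+:sum) lastprivate (last)
  for (int i = 0; i < n; ++i)
    {
      last = a[i];   // each iteration writes its private copy first
      sum += last;
    }
  return sum;        // for n > 0, last now equals a[n - 1]
}
```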

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-06-05  Jakub Jelinek  

* omp-low.c (lower_rec_input_clauses): Force max_vf if is_simd and
on privatization clauses OMP_CLAUSE_DECL is privatized by reference
and references a VLA.  Handle references to non-VLAs if is_simd for
all privatization clauses like reductions.
(lower_rec_input_clauses) :
If omp_is_reference, use always omp simd arrays and set
DECL_VALUE_EXPR in that case, if lower_rec_simd_input_clauses
fails, emit reference initialization.

* g++.dg/vect/simd-1.cc: New test.

--- gcc/omp-low.c.jj 2019-06-03 17:57:42.060631242 +0200
+++ gcc/omp-low.c   2019-06-04 14:23:02.860608537 +0200
@@ -3831,12 +3831,24 @@ lower_rec_input_clauses (tree clauses, g
case OMP_CLAUSE_LASTPRIVATE:
  if (is_variable_sized (OMP_CLAUSE_DECL (c)))
sctx.max_vf = 1;
+ else if (omp_is_reference (OMP_CLAUSE_DECL (c)))
+   {
+ tree rtype = TREE_TYPE (TREE_TYPE (OMP_CLAUSE_DECL (c)));
+ if (!TREE_CONSTANT (TYPE_SIZE_UNIT (rtype)))
+   sctx.max_vf = 1;
+   }
  break;
case OMP_CLAUSE_REDUCTION:
case OMP_CLAUSE_IN_REDUCTION:
  if (TREE_CODE (OMP_CLAUSE_DECL (c)) == MEM_REF
  || is_variable_sized (OMP_CLAUSE_DECL (c)))
sctx.max_vf = 1;
+ else if (omp_is_reference (OMP_CLAUSE_DECL (c)))
+   {
+ tree rtype = TREE_TYPE (TREE_TYPE (OMP_CLAUSE_DECL (c)));
+ if (!TREE_CONSTANT (TYPE_SIZE_UNIT (rtype)))
+   sctx.max_vf = 1;
+   }
  break;
case OMP_CLAUSE_IF:
  if (integer_zerop (OMP_CLAUSE_IF_EXPR (c)))
@@ -4665,8 +4677,8 @@ lower_rec_input_clauses (tree clauses, g
  /* For reduction in SIMD loop, defer adding the
 initialization of the reference, because if we decide
 to use SIMD array for it, the initilization could cause
-expansion ICE.  */
- if (c_kind == OMP_CLAUSE_REDUCTION && is_simd)
+expansion ICE.  Ditto for other privatization clauses.  */
+ if (is_simd)
x = NULL_TREE;
  else
{
@@ -4777,10 +4789,21 @@ lower_rec_input_clauses (tree clauses, g
  tree y = lang_hooks.decls.omp_clause_dtor (c, new_var);
  if ((TREE_ADDRESSABLE (new_var) || nx || y
   || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
-  || OMP_CLAUSE_CODE (c) == OMP_CLAUSE__CONDTEMP_)
+  || OMP_CLAUSE_CODE (c) == OMP_CLAUSE__CONDTEMP_
+  || omp_is_reference (var))
  && lower_rec_simd_input_clauses (new_var, ctx, &sctx,
   ivar, lvar))
{
+ if (omp_is_reference (var))
+   {
+ gcc_assert (TREE_CODE (new_var) == MEM_REF);
+ tree new_vard = TREE_OPERAND (new_var, 0);
+ gcc_assert (DECL_P (new_vard));
+ SET_DECL_VALUE_EXPR (new_vard,
+  build_fold_addr_expr (lvar));
+ DECL_HAS_VALUE_EXPR_P (new_vard) = 1;
+   }
+
  if (nx)
x = lang_hooks.decls.omp_clause_default_ctor
(c, unshare_expr (ivar), x);
@@ -4844,6 +4867,24 @@ lower_rec_input_clauses (tree clauses, g
}
  break;
}
+ if (omp_is_reference (var))
+   {
+ gcc_assert (TREE_CODE (new_var) == MEM_REF);
+ tree new_vard = TREE_OPERAND (new_var, 0);
+ gcc_assert (DECL_P (new_vard));
+ tree type = TREE_TYPE (TREE_TYPE (new_vard));
+ x = TYPE_SIZE_UNIT (type);
+ if (TREE_CONSTANT (x))
+   {
+ x = create_tmp_var_raw (type, get_name (var));
+ gimple_add_tmp_var (x);
+ TREE_ADDRESSABLE (x) = 1;
+ x = build_fold_addr_expr_loc (clause_loc, x);
+ x = fold_convert_loc (clause_loc,
+   TREE_TYPE (new_vard), x);
+

[PATCH] Enable memory operand for vfpclass[p,s][s,d] patterns.

2019-06-05 Thread Hongtao Liu

Hi Jeff and Jakub:
  While adding new intrinsics (PR target/89803), I found that the
vfpclassp[sd] and vfpclasss[sd] patterns do not accept a memory operand,
although the instructions themselves support one. This patch enables
memory operands for the vfpclassp[s,d]/vfpclasss[s,d] patterns.
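
To make the use case concrete, here is a minimal sketch of user code that classifies values loaded straight from memory. The function name nan_mask8 is made up for this example. The AVX512DQ path is compiled only when __AVX512DQ__ is defined, and a scalar fallback is provided otherwise. Whether the load is folded into the instruction's memory operand depends on the compiler version and options.

```cpp
#include <cmath>
#if defined(__AVX512DQ__)
#include <immintrin.h>
#endif

// Returns a bit mask with bit i set when p[i] is a NaN, for 8 doubles.
unsigned nan_mask8 (const double *p)
{
#if defined(__AVX512DQ__)
  // imm8 0x81 selects the QNaN (bit 0) and SNaN (bit 7) categories.
  return _mm512_fpclass_pd_mask (_mm512_loadu_pd (p), 0x81);
#else
  // Portable fallback with the same result for this particular query.
  unsigned m = 0;
  for (int i = 0; i < 8; ++i)
    if (std::isnan (p[i]))
      m |= 1u << i;
  return m;
#endif
}
```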

Bootstrapped/regtested on x86_64-linux and i686-linux (on skylake-avx512),
ok for trunk?

Changelog
gcc/
2019-06-05  Hongtao Liu  

* config/i386/sse.md (define_mode_suffix vecmemsuffix): New.
(define_insn "avx512dq_fpclass"):
Enable memory operand for it.
(define_insn "avx512dq_vmfpclass"): Ditto.

gcc/testsuite/
2019-06-05  Hongtao Liu  

* gcc.target/i386/avx512dq-vfpclasspd-1.c:
Adjust scan assembler for {x,y,z} suffix.
* gcc.target/i386/avx512dq-vfpclassps-1.c: Ditto.

--
BR,
Hongtao
Index: gcc/config/i386/sse.md
===
--- gcc/config/i386/sse.md	(revision 271946)
+++ gcc/config/i386/sse.md	(working copy)
@@ -595,6 +595,10 @@
 (define_mode_attr ssequarterinsnmode
   [(V16SF "V4SF") (V8DF "V2DF") (V16SI "TI") (V8DI "TI")])
 
+(define_mode_attr vecmemsuffix
+  [(V16SF "{z}") (V8SF "{y}") (V4SF "{x}")
+   (V8DF "{z}") (V4DF "{y}") (V2DF "{x}")])
+
 (define_mode_attr ssedoublemodelower
   [(V16QI "v16hi") (V32QI "v32hi") (V64QI "v64hi")
(V8HI "v8si")   (V16HI "v16si") (V32HI "v32si")
@@ -21317,11 +21321,11 @@
 (define_insn "avx512dq_fpclass"
   [(set (match_operand: 0 "register_operand" "=k")
   (unspec:
-[(match_operand:VF_AVX512VL 1 "register_operand" "v")
+[(match_operand:VF_AVX512VL 1 "vector_operand" "vm")
  (match_operand:QI 2 "const_0_to_255_operand" "n")]
  UNSPEC_FPCLASS))]
"TARGET_AVX512DQ"
-   "vfpclass\t{%2, %1, %0|%0, %1, %2}";
+   "vfpclass\t{%2, %1, %0|%0, %1, %2}";
   [(set_attr "type" "sse")
(set_attr "length_immediate" "1")
(set_attr "prefix" "evex")
@@ -21331,7 +21335,7 @@
   [(set (match_operand: 0 "register_operand" "=k")
 	(and:
 	  (unspec:
-	[(match_operand:VF_128 1 "register_operand" "v")
+	[(match_operand:VF_128 1 "vector_operand" "vm")
  (match_operand:QI 2 "const_0_to_255_operand" "n")]
 	UNSPEC_FPCLASS)
 	  (const_int 1)))]
Index: gcc/testsuite/gcc.target/i386/avx512dq-vfpclasspd-1.c
===
--- gcc/testsuite/gcc.target/i386/avx512dq-vfpclasspd-1.c	(revision 271946)
+++ gcc/testsuite/gcc.target/i386/avx512dq-vfpclasspd-1.c	(working copy)
@@ -1,11 +1,11 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512dq -mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vfpclasspd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vfpclasspd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vfpclasspd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vfpclasspd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vfpclasspd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vfpclasspd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfpclasspdz\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfpclasspdy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfpclasspdx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfpclasspdz\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfpclasspdy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfpclasspdx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 
 #include 
 
Index: gcc/testsuite/gcc.target/i386/avx512dq-vfpclassps-1.c
===
--- gcc/testsuite/gcc.target/i386/avx512dq-vfpclassps-1.c	(revision 271946)
+++ gcc/testsuite/gcc.target/i386/avx512dq-vfpclassps-1.c	(working copy)
@@ -1,11 +1,11 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512dq -mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vfpclassps\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vfpclassps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vfpclassps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times

[PATCH] aarch64' Android Support

2019-06-05 Thread 林作健

The following patch adds Android support for the aarch64 architecture.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 09fb9ecd2cd..a8b32e34b47 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1022,8 +1022,9 @@ aarch64*-*-freebsd*)
  tm_defines="${tm_defines}  TARGET_DEFAULT_ASYNC_UNWIND_TABLES=1"
  ;;
 aarch64*-*-linux*)
- tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h"
+ tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h"
  tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-linux.h"
+ extra_options="${extra_options} linux-android.opt"
  tmake_file="${tmake_file} aarch64/t-aarch64 aarch64/t-aarch64-linux"
  tm_defines="${tm_defines}  TARGET_DEFAULT_ASYNC_UNWIND_TABLES=1"
  case $target in
diff --git a/gcc/config/aarch64/aarch64-linux.h b/gcc/config/aarch64/aarch64-linux.h
index 5e8b34ded03..e1700bf7db6 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -62,22 +62,33 @@
   " %{mfix-cortex-a53-843419:--fix-cortex-a53-843419}"
 #endif

-#define LINK_SPEC LINUX_TARGET_LINK_SPEC \
+#define LINK_SPEC LINUX_OR_ANDROID_LD (LINUX_TARGET_LINK_SPEC, \
+LINUX_TARGET_LINK_SPEC " " ANDROID_LINK_SPEC) \
   CA53_ERR_835769_SPEC \
   CA53_ERR_843419_SPEC

+#undef STARTFILE_SPEC
+#define STARTFILE_SPEC \
+  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_STARTFILE_SPEC, ANDROID_STARTFILE_SPEC)
+
 #define GNU_USER_TARGET_MATHFILE_SPEC \
   "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}"

 #undef ENDFILE_SPEC
-#define ENDFILE_SPEC   \
-  GNU_USER_TARGET_MATHFILE_SPEC " " \
-  GNU_USER_TARGET_ENDFILE_SPEC
+#define ENDFILE_SPEC \
+  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_MATHFILE_SPEC " " \
+  GNU_USER_TARGET_ENDFILE_SPEC, ANDROID_ENDFILE_SPEC)
+
+#undef  LIB_SPEC
+#define LIB_SPEC \
+  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_LIB_SPEC, \
+ GNU_USER_TARGET_NO_PTHREADS_LIB_SPEC " " ANDROID_LIB_SPEC)

 #define TARGET_OS_CPP_BUILTINS() \
   do \
 { \
  GNU_USER_TARGET_OS_CPP_BUILTINS(); \
+ ANDROID_TARGET_OS_CPP_BUILTINS(); \
 } \
   while (0)
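
From the user's side, the visible effect of adding ANDROID_TARGET_OS_CPP_BUILTINS should be that code can test the predefined __ANDROID__ macro when targeting Android. A minimal sketch, where target_os_name is a made-up helper and the assumption is that __ANDROID__ is defined only for Android targets:

```cpp
// Illustrative only: reports which OS macro the compiler predefined.
const char *target_os_name ()
{
#if defined(__ANDROID__)
  return "android";
#elif defined(__linux__)
  return "linux";
#else
  return "other";
#endif
}
```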

Re: [PATCH] Add missing avx512dqintrin.h _mm_mask_fpclass_s[sd]_mask (PR target/89803)

2019-06-05 Thread Hongtao Liu

On Tue, Jun 4, 2019 at 5:56 PM Hongtao Liu  wrote:
>
> On Tue, Jun 4, 2019 at 5:21 PM Jakub Jelinek  wrote:
> >
> > On Tue, Jun 04, 2019 at 05:00:05PM +0800, Hongtao Liu wrote:
> > > Thanks for the reminder. Here is the updated patch:
> >
> > You've missed some notes.  Ok for trunk with:
> > 1) the following patch applied on top of your patch
> > 2) the ChangeLog entries moved to the start of the ChangeLog (normally,
> >ChangeLog entries are not added as part of the patch, but before the
> >patch in text form, because the ChangeLog files are updated many times
> >a day
> >
>
> Ok, thanks.
>
> > --- mask_fpclasss[sd]_v3.diff   2019-06-04 11:11:31.007712339 +0200
> > +++ mask_fpclasss[sd]_v3.diff   2019-06-04 11:14:19.581047040 +0200
> > @@ -2,7 +2,7 @@ Index: gcc/ChangeLog
> >  ===
> >  --- gcc/ChangeLog  (revision 271853)
> >  +++ gcc/ChangeLog  (working copy)
> > -@@ -4706,6 +4706,24 @@
> > +@@ -4706,6 +4706,23 @@
> > reprocessing.  Always call df_analyze before fixing up debug bind
> > insns.
> >
> > @@ -12,17 +12,16 @@ Index: gcc/ChangeLog
> >  +  * config/i386/avx512dqintrin.h (_mm_mask_fpclass_ss_mask,
> >  +  _mm_mask_fpclass_sd_mask): New intrinsics.
> >  +  (_mm_fpclass_ss_mask, _mm_fpclass_sd_mask): Modified, use new 
> > builtins.
> > -+  * config/i386/i386-builtin.def
> > -+  (__builtin_ia32_fpcla_mask, __builtin_ia32_fpclasssd_mask):
> > -+  New builtins.
> > ++  * config/i386/i386-builtin.def (__builtin_ia32_fpcla_mask,
> > ++  __builtin_ia32_fpclasssd_mask): New builtins.
> >  +  (__builtin_ia32_fpcla, __builtin_ia32_fpclasssd): Deleted.
> >  +  * config/i386/i386-builtin-types.def (DEF_FUNCTION_TYPE (QI, V2DF, 
> > INT),
> >  +  DEF_FUNCTION_TYPE (QI, V4SF, INT)): Deleted.
> >  +  * config/i386/i386-expand.c (case QI_FTYPE_V4SF_INT,
> >  +  case QI_FTYPE_V2SF_INT): Ditto.
> > -+  * config/i386/sse.md
> > -+  (define_insn "avx512dq_vmfpclass):
> > -+  Extended to insnstructions with mask operands.
> > ++  * config/i386/sse.md (avx512dq_vmfpclass): Rename to ...
> > ++  (avx512dq_vmfpclass): ... this.  Add
> > ++   to insn template.
> >  +
> >   2019-03-23  Segher Boessenkool  
> >
> > @@ -184,10 +183,10 @@ Index: gcc/testsuite/ChangeLog
> >  +  (__builtin_ia32_fpcla_mask, __builtin_ia32_fpclasssd_mask): 
> > Define.
> >  +  * gcc.target/i386/avx512dq-vfpcla-2.c: New.
> >  +  * gcc.target/i386/avx512dq-vfpclasssd-2.c: New.
> > -+  * gcc.target/i386/avx512dq-vfpcla-1.c (avx512f_test):
> > -+  Add test for _mm_mask_fpclass_ss_mask.
> > -+  * gcc.target/i386/avx512dq-vfpclasssd-1.c (avx512f_test):
> > -+  Add test for _mm_mask_fpclass_sd_mask.
> > ++  * gcc.target/i386/avx512dq-vfpcla-1.c (avx512f_test): Add test 
> > for
> > ++  _mm_mask_fpclass_ss_mask.
> > ++  * gcc.target/i386/avx512dq-vfpclasssd-1.c (avx512f_test): Add test 
> > for
> > ++  _mm_mask_fpclass_sd_mask.
> >  +
> >   2019-03-22  Vladimir Makarov  
> >
> >
> >
> > Jakub
>
>
>
> --
> BR,
> Hongtao

Author: liuhongt
Date: Wed Jun  5 06:04:22 2019
New Revision: 271946

URL: https://gcc.gnu.org/viewcvs?rev=271946&root=gcc&view=rev
Log:
gcc/
2019-06-05  Hongtao Liu  

PR target/89803
* config/i386/avx512dqintrin.h (_mm_mask_fpclass_ss_mask,
_mm_mask_fpclass_sd_mask): New intrinsics.
(_mm_fpclass_ss_mask, _mm_fpclass_sd_mask): Modified, use new builtins.
* config/i386/i386-builtin.def
(__builtin_ia32_fpcla_mask, __builtin_ia32_fpclasssd_mask):
New builtins.
(__builtin_ia32_fpcla, __builtin_ia32_fpclasssd): Deleted.
* config/i386/i386-builtin-types.def (DEF_FUNCTION_TYPE (QI, V2DF,
INT),
DEF_FUNCTION_TYPE (QI, V4SF, INT)): Deleted.
* config/i386/i386-expand.c (case QI_FTYPE_V4SF_INT,
case QI_FTYPE_V2SF_INT): Ditto.
* config/i386/sse.md
(define_insn "avx512dq_vmfpclass):
Extended to instructions with mask operands.

gcc/testsuite
2019-06-05  Hongtao Liu  

PR target/89803
* gcc.target/i386/avx-1.c (__builtin_ia32_fpclas,
__builtin_ia32_fpclasssd): Removed.
(__builtin_ia32_fpcla_mask, __builtin_ia32_fpclasssd_mask): Define.
* gcc.target/i386/sse-13.c (__builtin_ia32_fpclas,
__builtin_ia32_fpclasssd): Removed.
(__builtin_ia32_fpcla_mask, __builtin_ia32_fpclasssd_mask): Define.
* gcc.target/i386/sse-23.c (__builtin_ia32_fpclas,
__builtin_ia32_fpclasssd): Removed.
(__builtin_ia32_fpcla_mask, __builtin_ia32_fpclasssd_mask): Define.
* gcc.target/i386/avx512dq-vfpcla-2.c: New.
* gcc.target/i386/avx512dq-vfpclasssd-2.c: New.
* gcc.target/i386/avx512dq-vfpcla-1.c (avx512f_test):
Add test for _mm_mask_fpclass_ss_mask.
*
