[PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-05 Thread Hongtao Liu
Hi Uros and all:
  This patch is about to enable support for AVX512_VP2INTERSECT which will
be in Willow Cove. There are two instructions for AVX512_VP2INTERSECT:
VP2INTERSECTD and VP2INTERSECTQ. More details please refer to
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

  Bootstrap is ok, and no regressions for i386/x86-64 testsuite.

Changelog:

gcc/
+2019-06-06  Hongtao Liu  
+ H.J. Lu  
+ Olga Makhotina  
+
+ * common/config/i386/i386-common.c
+ (OPTION_MASK_ISA_AVX512VP2INTERSECT_SET,
+ OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET): New macros.
+ (OPTION_MASK_ISA2_AVX512F_UNSET): Add
+ OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET.
+ (ix86_handle_option): Handle -mavx512vp2intersect.
+ * config/i386/avx512vp2intersectintrin.h: New.
+ * config/i386/avx512vp2intersectvlintrin.h: New.
+ * config/i386/cpuid.h (bit_AVX512VP2INTERSECT): New.
+ * config/i386/driver-i386.c (host_detect_local_cpu): Detect
+ AVX512VP2INTERSECT.
+ * config/i386/i386-builtin-types.def: Add new types.
+ * config/i386/i386-builtin.def: Add new builtins.
+ * config/i386/i386-builtins.c: (enum processor_features): Add
+ F_AVX512VP2INTERSECT.
+ (static const _isa_names_table isa_names_table): Ditto.
+ * config/i386/i386-c.c (ix86_target_macros_internal): Define
+ __AVX512VP2INTERSECT__.
+ * config/i386/i386-expand.c (ix86_expand_builtin): Expand
+ IX86_BUILTIN_2INTERSECTD512, IX86_BUILTIN_2INTERSECTQ512,
+ IX86_BUILTIN_2INTERSECTD256, IX86_BUILTIN_2INTERSECTQ256,
+ IX86_BUILTIN_2INTERSECTD128, IX86_BUILTIN_2INTERSECTQ128.
+ * config/i386/i386-modes.def (P2QI, P2HI): New modes.
+ * config/i386/i386-options.c (ix86_target_string): Add
+ -mavx512vp2intersect.
+ (ix86_option_override_internal): Handle AVX512VP2INTERSECT.
+ * config/i386/i386.c (ix86_hard_regno_nregs): Allocate two regs for
+ P2HImode and P2QImode.
+ (ix86_hard_regno_mode_ok): Register pair only starts at even hardreg
+ number for P2QImode and P2HImode.
+ * config/i386/i386.h (TARGET_AVX512VP2INTERSECT,
+ TARGET_AVX512VP2INTERSECT_P): New.
+ (PTA_AVX512VP2INTERSECT): Ditto.
+ * config/i386/i386.opt: Add -mavx512vp2intersect.
+ * config/i386/immintrin.h: Include avx512vp2intersectintrin.h and
+ avx512vp2intersectvlintrin.h.
+ * config/i386/sse.md (define_c_enum "unspec"): Add UNSPEC_VP2INTERSECT.
+ (define_mode_iterator VI48_AVX512VP2VL): New.
+ (avx512vp2intersect_2intersect,
+ avx512vp2intersect_2intersectv16si): New define_insn patterns.
+ (*vec_extractp2hi, *vec_extractp2qi): New define_insn_and_split
+ patterns.
+ * config.gcc: Add avx512vp2intersectvlintrin.h and
+ avx512vp2intersectintrin.h to extra_headers.
+ * doc/invoke.texi: Document -mavx512vp2intersect.
+

gcc/testsuite/
+2019-06-06  Hongtao Liu  
+ Olga Makhotina  
+
+ * gcc.target/i386/avx512-check.h: Handle bit_AVX512VP2INTERSECT.
+ * gcc.target/i386/avx512vp2intersect-2intersect-1a.c: New test.
+ * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Likewise.
+ * gcc.target/i386/avx512vp2intersect-2intersectvl-1a.c: Likewise.
+ * gcc.target/i386/avx512vp2intersect-2intersectvl-1b.c: Likewise.
+ * gcc.target/i386/sse-12.c: Add -mavx512vp2intersect.
+ * gcc.target/i386/sse-13.c: Likewsie.
+ * gcc.target/i386/sse-14.c: Likewise.
+ * gcc.target/i386/sse-22.c: Likewise.
+ * gcc.target/i386/sse-23.c: Likewise.
+ * g++.dg/other/i386-2.C: Likewise.
+ * g++.dg/other/i386-3.C: Likewise.
+

-- 
BR,
Hongtao
Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 271984)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,53 @@
+2019-06-06  Hongtao Liu  
+	H.J. Lu  
+	Olga Makhotina  
+
+	* common/config/i386/i386-common.c
+	(OPTION_MASK_ISA_AVX512VP2INTERSECT_SET,
+	OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET): New macros.
+	(OPTION_MASK_ISA2_AVX512F_UNSET): Add
+	OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET.
+	(ix86_handle_option): Handle -mavx512vp2intersect.
+	* config/i386/avx512vp2intersectintrin.h: New.
+	* config/i386/avx512vp2intersectvlintrin.h: New.
+	* config/i386/cpuid.h (bit_AVX512VP2INTERSECT): New.
+	* config/i386/driver-i386.c (host_detect_local_cpu): Detect
+	AVX512VP2INTERSECT.
+	* config/i386/i386-builtin-types.def: Add new types.
+	* config/i386/i386-builtin.def: Add new builtins.
+	* config/i386/i386-builtins.c: (enum processor_features): Add
+	F_AVX512VP2INTERSECT.
+	(static const _isa_names_table isa_names_table): Ditto.
+	* config/i386/i386-c.c (ix86_target_macros_internal): Define
+	__AVX512VP2INTERSECT__.
+	* config/i386/i386-expand.c (ix86_expand_builtin): Expand
+	IX86_BUILTIN_2INTERSECTD512, IX86_BUILTIN_2INTERSECTQ512,
+	IX86_BUILTIN_2INTERSECTD256, IX86_BUILTIN_2INTERSECTQ256,
+	IX86_BUILTIN_2INTERSECTD128, IX86_BUILTIN_2INTERSECTQ128.
+	* config/i386/i386-modes.def (P2QI, P2HI): New modes.
+	* config/i386/i386-options.c (ix86_target_string): Add
+	-mavx512vp2intersect.
+	(ix86_option_override_internal): Handle AVX512VP2INTERSECT.
+	* 

[Bug c/83584] "ISO C forbids conversion of object pointer to function pointer type" -- no, not really

2019-06-05 Thread Keith.S.Thompson at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83584

--- Comment #18 from Keith Thompson  ---
Something I probably should have noticed earlier:

Why was this bug closed as a duplicate of bug 11234?

Bug 11234 complains that conversions between function pointer and void*
are accepted. This bug is exactly the opposite, complaining that they're
rejected.

If the resolution of #11234 were valid, then this bug would be invalid,
not a duplicate. (I've already stated my opinion.)

[Bug c++/90769] Template instantiation recursion when trying to do a conversion template

2019-06-05 Thread ensadc at mailnesia dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90769

ensadc at mailnesia dot com changed:

   What|Removed |Added

 CC||ensadc at mailnesia dot com

--- Comment #3 from ensadc at mailnesia dot com ---
It seems that `f == B` triggers overload resolution which treats `operator==`
declared latter as a candidate, which then causes the instantiation of the
constructor template, which requires overload resolution for `f == B`

Reduced:

enum E {B};

struct X
{
  template 
  constexpr X(int v);
};

bool operator==(X const& lhs, int) { 
return X(1);
}

[Bug c/90737] [8/9 Regression] inconsistent address of a local converted to intptr_t between callee and caller

2019-06-05 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90737

Martin Sebor  changed:

   What|Removed |Added

Summary|[8/9/10 Regression] |[8/9 Regression]
   |inconsistent address of a   |inconsistent address of a
   |local converted to intptr_t |local converted to intptr_t
   |between callee and caller   |between callee and caller

--- Comment #4 from Martin Sebor  ---
Fixed for GCC 10 via r271985.

[Bug c/90737] [8/9/10 Regression] inconsistent address of a local converted to intptr_t between callee and caller

2019-06-05 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90737

--- Comment #3 from Martin Sebor  ---
Author: msebor
Date: Thu Jun  6 02:53:01 2019
New Revision: 271985

URL: https://gcc.gnu.org/viewcvs?rev=271985=gcc=rev
Log:
PR c/90737 - [8/9/10 Regression] inconsistent address of a local converted to
intptr_t between callee and caller

gcc/c/ChangeLog:

PR c/90737
* c-typeck.c (c_finish_return): Only consider functions returning
pointers as candidates for -Wreturn-local-addr.

gcc/cp/ChangeLog:

PR c/90737
* typeck.c (maybe_warn_about_returning_address_of_local): Only
consider functions returning pointers as candidates for
-Wreturn-local-addr.

gcc/testsuite/ChangeLog:

PR c/90737
* c-c++-common/Wreturn-local-addr.c: New test.
* g++.dg/warn/Wreturn-local-addr-6.C: New test.


Added:
trunk/gcc/testsuite/c-c++-common/Wreturn-local-addr.c
trunk/gcc/testsuite/g++.dg/warn/Wreturn-local-addr-6.C
Modified:
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-typeck.c
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/typeck.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/90769] Template instantiation recursion when trying to do a conversion template

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90769

Marek Polacek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-06-06
 CC||mpolacek at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Marek Polacek  ---
Not a regression, seems it's never worked.

[Bug c++/90769] Template instantiation recursion when trying to do a conversion template

2019-06-05 Thread barry.revzin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90769

--- Comment #1 from Barry Revzin  ---
Sorry, more reduced:

#include 

enum E {A, B};

struct X
{
  template  = 0>
  constexpr X(int v);

  template  = 0>
  operator OUT() const;
};

#ifdef WORKS
bool operator!=(X const& lhs, int) { 
return static_cast(lhs) == 0;
}
#else
bool operator==(X const& lhs, int) { 
return static_cast(lhs) == 0;
}
#endif

[Bug c++/90769] New: Template instantiation recursion when trying to do a conversion template

2019-06-05 Thread barry.revzin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90769

Bug ID: 90769
   Summary: Template instantiation recursion when trying to do a
conversion template
   Product: gcc
   Version: 9.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: barry.revzin at gmail dot com
  Target Milestone: ---

Reduced slightly from https://stackoverflow.com/q/56470126/2069064:

#include 

enum E {A, B};

template
struct X
{
  template = 0>
  constexpr X(float v);

  template  = 0>
  operator OUT() const;
};

#ifdef WORKS
bool operator!=(X const& lhs, int) { 
return static_cast(lhs) == 0;
}
#else
bool operator==(X const& lhs, int) { 
return static_cast(lhs) == 0;
}
#endif

Compiling with g++ 9.1 -std=c++17, with -DWORKS this compiles fine. Without it,
it fails due to reaching the max template instantiation limit. The only
difference between the two cases is that one is declaring an operator== and the
other is declaring an operator!=.

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-05 Thread Kugan Vivekanandarajah
Hi Richard,

Thanks for the review. Attached is the latest patch.

For testcase like cond_arith_1.c, with the patch, gcc ICE in fwprop. I
am limiting fwprop in cases like this. Is there a better fix for this?
index cf2c9de..2c99285 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -1358,6 +1358,15 @@ forward_propagate_and_simplify (df_ref use,
rtx_insn *def_insn, rtx def_set)
   else
 mode = GET_MODE (*loc);

+  /* TODO. We can't get the mode for
+ (set (reg:VNx16BI 109)
+  (unspec:VNx16BI [
+(reg:SI 131)
+(reg:SI 106)
+   ] UNSPEC_WHILE_LO))
+ Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
+  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
+return false;
   new_rtx = propagate_rtx (*loc, mode, reg, src,
  optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));

Thanks,
Kugan

On Mon, 3 Jun 2019 at 19:08, Richard Sandiford
 wrote:
>
> Kugan Vivekanandarajah  writes:
> > diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
> > index b3fae5b..ad838dd 100644
> > --- a/gcc/tree-vect-loop-manip.c
> > +++ b/gcc/tree-vect-loop-manip.c
> > @@ -415,6 +415,7 @@ vect_set_loop_masks_directly (struct loop *loop, 
> > loop_vec_info loop_vinfo,
> > bool might_wrap_p)
> >  {
> >tree compare_type = LOOP_VINFO_MASK_COMPARE_TYPE (loop_vinfo);
> > +  tree iv_type = LOOP_VINFO_MASK_IV_TYPE (loop_vinfo);
> >tree mask_type = rgm->mask_type;
> >unsigned int nscalars_per_iter = rgm->max_nscalars_per_iter;
> >poly_uint64 nscalars_per_mask = TYPE_VECTOR_SUBPARTS (mask_type);
> > @@ -445,11 +446,16 @@ vect_set_loop_masks_directly (struct loop *loop, 
> > loop_vec_info loop_vinfo,
> >tree index_before_incr, index_after_incr;
> >gimple_stmt_iterator incr_gsi;
> >bool insert_after;
> > -  tree zero_index = build_int_cst (compare_type, 0);
> >standard_iv_increment_position (loop, _gsi, _after);
> > -  create_iv (zero_index, nscalars_step, NULL_TREE, loop, _gsi,
> > +
> > +  tree zero_index = build_int_cst (iv_type, 0);
> > +  tree step = build_int_cst (iv_type,
> > +  LOOP_VINFO_VECT_FACTOR (loop_vinfo));
> > +  /* Creating IV of iv_type.  */
>
> s/Creating/Create/
>
> > +  create_iv (zero_index, step, NULL_TREE, loop, _gsi,
> >insert_after, _before_incr, _after_incr);
> >
> > +  zero_index = build_int_cst (compare_type, 0);
> >tree test_index, test_limit, first_limit;
> >gimple_stmt_iterator *test_gsi;
> >if (might_wrap_p)
> > [...]
> > @@ -1066,11 +1077,17 @@ vect_verify_full_masking (loop_vec_info loop_vinfo)
> > if (this_type
> > && can_produce_all_loop_masks_p (loop_vinfo, this_type))
> >   {
> > -   /* Although we could stop as soon as we find a valid mode,
> > -  it's often better to continue until we hit Pmode, since the
> > +   /* See whether zero-based IV would ever generate all-false masks
> > +  before wrapping around.  */
> > +   bool might_wrap_p = (iv_precision > cmp_bits);
> > +   /* Stop as soon as we find a valid mode.  If we decided to use
> > +  cmp_type which is less than Pmode precision, it is often 
> > better
> > +  to use iv_type corresponding to Pmode, since the
> >operands to the WHILE are more likely to be reusable in
> > -  address calculations.  */
> > -   cmp_type = this_type;
> > +  address calculations in this case.  */
>
> We're not stopping as soon as we find a valid mode though.  Any type
> that satisfies the if condition above is valid, but we pick wider
> cmp_types and iv_types for optimisation reasons.  How about:
>
>   /* Although we could stop as soon as we find a valid mode,
>  there are at least two reasons why that's not always the
>  best choice:
>
>  - An IV that's Pmode or wider is more likely to be reusable
>in address calculations than an IV that's narrower than
>Pmode.
>
>  - Doing the comparison in IV_PRECISION or wider allows
>a natural 0-based IV, whereas using a narrower comparison
>type requires mitigations against wrap-around.
>
>  Conversely, if the IV limit is variable, doing the comparison
>  in a wider type than the original type can introduce
>  unnecessary extensions, so picking the widest valid mode
>  is not always a good choice either.
>
>  Here we prefer the first IV type that's Pmode or wider,
>  and the first comparison type that's IV_PRECISION or wider.
>  (The comparison type must be no wider than the IV type,
>  to avoid extensions in the vector loop.)
>
>  ??? We might want to try continuing beyond Pmode 

Re: [PATCH] Enable memory operand for vfpclass[p,s][s,d] patterns.

2019-06-05 Thread Hongtao Liu
On Thu, Jun 6, 2019 at 6:18 AM Jeff Law  wrote:
>
> On 6/5/19 1:39 AM, Hongtao Liu wrote:
> > Hi Jeff and Jakub:
> >   When adding new intrinsics(PR target/89803), i found vfpclassp[sd],
> > vfpclasss[sd] patterns didn't support memory operand which is
> > supported in instructions. So this patch is about to enable memory
> > operands for vfpclassp[s,d]/vfpclasss[s,d] patterns.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux (on skylake-avx512),
> > ok for trunk?
> >
> > Changelog
> > gcc/
> > 2019-06-05  Hongtao Liu  
> >
> > * config/i386/sse.md (define_mode_suffix vecmemsuffix): New.
> > (define_insn "avx512dq_fpclass"):
> > Enable memory operand for it.
> > (define_insn "avx512dq_vmfpclass"): Ditto.
> >
> > gcc/testsuite/
> > 2019-06-05  Hongtao Liu  
> >
> > * gcc.target/i386/avx512dq-vfpclasspd-1.c:
> > Adjust scan assember for {x,y,z} suffix.
> > * gcc.target/i386/avx512dq-vfpclassps-1.c: Ditto.
> OK, but make sure you fix the ChangeLog formatting as you commit.
>
> jeff

Thanks.

Committed:
https://gcc.gnu.org/viewcvs/gcc?view=revision=271984.

-- 
BR,
Hongtao


[Bug target/89803] Missing AVX512 intrinsics

2019-06-05 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89803

Hongtao.liu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Hongtao.liu  ---
Fixed for gcc10

[Bug target/88918] [meta-bug] x86 intrinsic issues

2019-06-05 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88918
Bug 88918 depends on bug 89803, which changed state.

Bug 89803 Summary: Missing AVX512 intrinsics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89803

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

Go patch committed: Use specialized fast map routines

2019-06-05 Thread Ian Lance Taylor
In the Go runtime there are specialized fast map routines for certain
kep types.  This Go frontend patch by Cherry Zhang lets the compiler
make use of thesefunctions, instead of always using the generic ones.

As we now generate multiple versions of map delete calls, to make
things easier we delay the expansion of the built-in delete function
to flatten phase.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 271976)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-2609f9b8420e2341fbbe40d7cf6af42b0fba7293
+bc7374913367fba9b10dc284af87eb539fb6c5b2
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/escape.cc
===
--- gcc/go/gofrontend/escape.cc (revision 271669)
+++ gcc/go/gofrontend/escape.cc (working copy)
@@ -1622,6 +1622,16 @@ Escape_analysis_assign::expression(Expre
 }
 break;
 
+  case Runtime::MAPASSIGN_FAST32PTR:
+  case Runtime::MAPASSIGN_FAST64PTR:
+  case Runtime::MAPASSIGN_FASTSTR:
+{
+  // Map key escapes. The last argument is the key.
+  Node* key_node = Node::make_node(call->args()->back());
+  this->assign(this->context_->sink(), key_node);
+}
+break;
+
   case Runtime::IFACEE2T2:
   case Runtime::IFACEI2T2:
 {
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 271976)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -7843,63 +7843,33 @@ Builtin_call_expression::do_lower(Gogo*,
 
 case BUILTIN_DELETE:
   {
-   // Lower to a runtime function call.
-   const Expression_list* args = this->args();
-   if (args == NULL || args->size() < 2)
- this->report_error(_("not enough arguments"));
-   else if (args->size() > 2)
- this->report_error(_("too many arguments"));
-   else if (args->front()->type()->map_type() == NULL)
- this->report_error(_("argument 1 must be a map"));
-   else
- {
-   // Since this function returns no value it must appear in
-   // a statement by itself, so we don't have to worry about
-   // order of evaluation of values around it.  Evaluate the
-   // map first to get order of evaluation right.
-   Map_type* mt = args->front()->type()->map_type();
-   Temporary_statement* map_temp =
- Statement::make_temporary(mt, args->front(), loc);
-   inserter->insert(map_temp);
-
-   Temporary_statement* key_temp =
- Statement::make_temporary(mt->key_type(), args->back(), loc);
-   inserter->insert(key_temp);
-
-   Expression* e1 = Expression::make_type_descriptor(mt, loc);
-   Expression* e2 = Expression::make_temporary_reference(map_temp,
- loc);
-   Expression* e3 = Expression::make_temporary_reference(key_temp,
- loc);
-
-   // If the call to delete is deferred, and is in a loop,
-   // then the loop will only have a single instance of the
-   // temporary variable.  Passing the address of the
-   // temporary variable here means that the deferred call
-   // will see the last value in the loop, not the current
-   // value.  So for this unusual case copy the value into
-   // the heap.
-   if (!this->is_deferred())
- e3 = Expression::make_unary(OPERATOR_AND, e3, loc);
-   else
- {
-   Expression* a = Expression::make_allocation(mt->key_type(),
-   loc);
-   Temporary_statement* atemp =
- Statement::make_temporary(NULL, a, loc);
-   inserter->insert(atemp);
-
-   a = Expression::make_temporary_reference(atemp, loc);
-   a = Expression::make_dereference(a, NIL_CHECK_NOT_NEEDED, loc);
-   Statement* s = Statement::make_assignment(a, e3, loc);
-   inserter->insert(s);
-
-   e3 = Expression::make_temporary_reference(atemp, loc);
- }
-
-   return Runtime::make_call(Runtime::MAPDELETE, this->location(),
- 3, e1, e2, e3);
- }
+const Expression_list* args = this->args();
+if (args == NULL || args->size() < 2)
+  this->report_error(_("not enough arguments"));
+else if 

Re: ARM peephole2 from 2003 never merged, still valid

2019-06-05 Thread Segher Boessenkool
On Wed, Jun 05, 2019 at 05:02:53PM -0600, Jeff Law wrote:
> On 6/2/19 6:28 AM, Segher Boessenkool wrote:
> > On Sat, Jun 01, 2019 at 11:41:30PM +, Fredrik Hederstierna wrote:
> >> +(define_peephole2
> >> +  [(set (match_operand:SI 0 "arm_general_register_operand" "")
> >> +   (match_operand:SI 1 "arm_general_register_operand" ""))
> >> +   (set (reg:CC CC_REGNUM)
> >> +   (compare:CC (match_dup 0) (const_int 0)))]
> >> +  "TARGET_ARM"
> >> +  [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) 
> >> (const_int 0)))
> >> + (set (match_dup 0) (match_dup 1))])]
> >> +  ""
> >> +)
> > 
> > Do you have a testcase for this?  I wonder if it would be better handled
> > during combine, and what that then tried; or perhaps these opportunities
> > are created later, making a peephole a more attractive solution.
> We have two independent insns with no output/true dependency between
> them.  So there's really not anything for combine to do here.

op0 := op1;
CC := op0 cmp 0;

That's a perfectly fine dependency I think?

> What is more interesting to me is whether or not this could be handled
> by compare-elim and a define_insn rather than a define_peephole2.  THe
> existing pattern and the new one both seem well suited for compare-elim.
> 
> I do think a testcase is warranted here.  Fredrik, if you could reduce
> one from CSiBE that might be appropriate.

Yeah, let's see what is really going on :-)


Segher


Re: [PATCH] RX: Add rx-*-linux target

2019-06-05 Thread Jeff Law
On 6/3/19 11:01 AM, Yoshinori Sato wrote:
> On Sun, 02 Jun 2019 22:12:37 +0900,
> Oleg Endo wrote:
>>
>> On Sun, 2019-06-02 at 20:26 +0900, Yoshinori Sato wrote:
>>> On Fri, 31 May 2019 09:16:18 +0900,
>>> Jeff Law wrote:

 On 5/29/19 12:27 PM, Jeff Law wrote:
> On 5/23/19 6:05 AM, Yoshinori Sato wrote:
>> I ported linux kernel to Renesas RX.
>>
>> rx-*-elf target output a binary different from the standard
>> ELF.
>> It has the same format as the Renesas compiler.
>>
>> But the linux kernel requires the standard ELF format.
>> I want to define a rx-*-linux target so that I can generate
>> a standard ELF binary.
>
> Presumably you're resubmitting after your assignment got recorded
> (I
> think I saw that fly by recently).
>
> I'll construct a ChangeLog and install this on the trunk.

 So this is causing libgcc to fail to build for rx-elf.  The problem
 is
 the DF=SF #define.  I think you need so split that out so that it's
 only
 used for rx-linux.

 Jeff
>>>
>>> OK. fix it.
>>> I tried build rx-elf target. it success.
>>>
>>
>> Setting DF=SF is the wrong thing to do IMHO.  RX can do DF just fine in
>> software.  If this is hardcoded like that in the roots of the
>> toolchain, it will make compiling packages that actually need real DF
>> completely impossible, won't it?  We also don't set DI = SI just
>> because the hardware is bad at SI ... 
>>
>> Just my 2 cents.
>>
>> Cheers,
>> Oleg
>>
> 
> OK.
> I was misunderstood.
> I think this is not a problem.
[ ... ]
THanks.  Give we already had your original patch in the source tree, I
extracted just the new bits and committed them.  I'm attaching the
changes for archival purposes.

Jeff

commit aafb499477a6f23d01c4c048c20c8833a801396a
Author: law 
Date:   Wed Jun 5 23:20:27 2019 +

* config.host (rx-*-linux*): Add t-fdpbit to tmake_file
Add appropriate tm_file clause as well.
* config/rx/t-rx (HOST_LIBGCC2_CFLAGS): Remove.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@271978 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index 82c124f8ece..93cdcda3284 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,9 @@
+2019-06-05  Yoshinori Sato  
+
+   * config.host (rx-*-linux*): Add t-fdpbit to tmake_file
+   Add appropriate tm_file clause as well.
+   * config/rx/t-rx (HOST_LIBGCC2_CFLAGS): Remove.
+
 2019-06-05  James Clarke  
 
* config/ia64/crtbegin.S (__dso_handle): Put in .sdata/.sbss
diff --git a/libgcc/config.host b/libgcc/config.host
index ff2f0fa1e58..d75e2b5d7aa 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1212,7 +1212,8 @@ rx-*-elf)
tm_file="$tm_file rx/rx-abi.h rx/rx-lib.h"
;;
 rx-*-linux*)
-   tmake_file="rx/t-rx"
+   tmake_file="rx/t-rx t-fdpbit"
+   tm_file="$tm_file rx/rx-lib.h"
;;
 s390-*-linux*)
tmake_file="${tmake_file} s390/t-crtstuff s390/t-linux 
s390/32/t-floattodi t-stack s390/t-stack-s390"
diff --git a/libgcc/config/rx/t-rx b/libgcc/config/rx/t-rx
index 34cdc97cc03..ace3b16f39e 100644
--- a/libgcc/config/rx/t-rx
+++ b/libgcc/config/rx/t-rx
@@ -23,7 +23,6 @@
 # the normal libgcc sources:
 
 LIB2ADD = $(srcdir)/config/rx/rx-abi-functions.c
-HOST_LIBGCC2_CFLAGS += -DDF=SF
 
 
 # We need special handling of the floating point conversion


Re: [PATCH] RX: Add rx-*-linux target

2019-06-05 Thread Jeff Law
On 6/2/19 5:26 AM, Yoshinori Sato wrote:
> On Fri, 31 May 2019 09:16:18 +0900,
> Jeff Law wrote:
>>
>> On 5/29/19 12:27 PM, Jeff Law wrote:
>>> On 5/23/19 6:05 AM, Yoshinori Sato wrote:
 I ported linux kernel to Renesas RX.

 rx-*-elf target output a binary different from the standard ELF.
 It has the same format as the Renesas compiler.

 But the linux kernel requires the standard ELF format.
 I want to define a rx-*-linux target so that I can generate
 a standard ELF binary.
>>> Presumably you're resubmitting after your assignment got recorded (I
>>> think I saw that fly by recently).
>>>
>>> I'll construct a ChangeLog and install this on the trunk.
>> So this is causing libgcc to fail to build for rx-elf.  The problem is
>> the DF=SF #define.  I think you need so split that out so that it's only
>> used for rx-linux.
>>
>> Jeff
> 
> OK. fix it.
> I tried build rx-elf target. it success.
What parts did you build?  For me it fails building libgcc:


> /home/tmp/rx/./gcc/xgcc -B/home/tmp/rx/./gcc/ -B/tmp/rx/rx-elf/bin/ 
> -B/tmp/rx/rx-elf/lib/ -isystem /tmp/rx/rx-elf/include -isystem 
> /tmp/rx/rx-elf/sys-include-g -O2 -m64bit-doubles -O2  -g -O2 -DIN_GCC  
> -DCROSS_DIRECTORY_STRUCTURE  -W -Wall -Wno-narrowing -Wwrite-strings 
> -Wcast-qual -Wstrict-prototypes -Wold-style-definition  -isystem ./include   
> -DDF=SF -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -Dinhibit_libc 
>  -DDF=SF -I. -I. -I../../.././gcc -I/home/gcc/GIT-3/gcc/libgcc 
> -I/home/gcc/GIT-3/gcc/libgcc/. -I/home/gcc/GIT-3/gcc/libgcc/../gcc 
> -I/home/gcc/GIT-3/gcc/libgcc/../include  -DHAVE_CC_TLS -DUSE_EMUTLS -o 
> _muldc3.o -MT _muldc3.o -MD -MP -MF _muldc3.dep -DL_muldc3 -c 
> /home/gcc/GIT-3/gcc/libgcc/libgcc2.c -fvisibility=hidden -DHIDE_EXPORTS
> In file included from /home/gcc/GIT-3/gcc/libgcc/libgcc2.c:56:
> /home/gcc/GIT-3/gcc/libgcc/libgcc2.h:256:16: warning: conflicting types for 
> built-in function ‘__divdc3’; expected ‘_Complex double(double,  double,  
> double,  double)’ [-Wbuiltin-declaration-mismatch]
>   256 | #define __N(a) __ ## a
>   |^~
> /home/gcc/GIT-3/gcc/libgcc/libgcc2.h:366:19: note: in expansion of macro ‘__N’
>   366 | #define __divdc3  __N(divdc3)
>   |   ^~~
> /home/gcc/GIT-3/gcc/libgcc/libgcc2.h:461:15: note: in expansion of macro 
> ‘__divdc3’
>   461 | extern DCtype __divdc3 (DFtype, DFtype, DFtype, DFtype);
>   |   ^~~~


[ ... and so on ... ]

Jeff


Re: ARM peephole2 from 2003 never merged, still valid

2019-06-05 Thread Jeff Law
On 6/2/19 6:28 AM, Segher Boessenkool wrote:
> On Sat, Jun 01, 2019 at 11:41:30PM +, Fredrik Hederstierna wrote:
>> +(define_peephole2
>> +  [(set (match_operand:SI 0 "arm_general_register_operand" "")
>> +   (match_operand:SI 1 "arm_general_register_operand" ""))
>> +   (set (reg:CC CC_REGNUM)
>> +   (compare:CC (match_dup 0) (const_int 0)))]
>> +  "TARGET_ARM"
>> +  [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 
>> 0)))
>> + (set (match_dup 0) (match_dup 1))])]
>> +  ""
>> +)
> 
> Hi Fredrik,
> 
> Do you have a testcase for this?  I wonder if it would be better handled
> during combine, and what that then tried; or perhaps these opportunities
> are created later, making a peephole a more attractive solution.
We have two independent insns with no output/true dependency between
them.  So there's really not anything for combine to do here.

What is more interesting to me is whether or not this could be handled
by compare-elim and a define_insn rather than a define_peephole2.  THe
existing pattern and the new one both seem well suited for compare-elim.

I do think a testcase is warranted here.  Fredrik, if you could reduce
one from CSiBE that might be appropriate.


jeff


[Bug tree-optimization/90662] strlen of a string in a vla plus offset not folded

2019-06-05 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90662

Martin Sebor  changed:

   What|Removed |Added

   Keywords||patch
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-06-05
   Assignee|unassigned at gcc dot gnu.org  |msebor at gcc dot 
gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Martin Sebor  ---
Patch: https://gcc.gnu.org/ml/gcc-patches/2019-06/msg00297.html

[PATCH] handle vla plus offset in strlen (PR 90662)

2019-06-05 Thread Martin Sebor

One of my new tests for the strlen/sprintf integration tripped
over an incomplete handling of VLAs by the strlen pass.  Where
it can determine the length of a substring at some offset with
other kinds of arrays, the pass fails with VLAs because they
are represented as pointers to arrays.

The attached patch adds the missing handling so that code like
the following can be fully folded even for VLAs.

  int f (int n)
  {
char a[n];
strcpy (a, "12345");
if (strlen ([2]) != 3)
  abort ();
  }

Tested on x86_64-linux.

Martin
PR tree-optimization/90662 - strlen of a string in a vla plus offset not folded

gcc/ChangeLog:

	PR tree-optimization/90662
	* tree-ssa-strlen.c (get_stridx): Handle simple VLAs and pointers
	to arrays.

gcc/testsuite/ChangeLog:

	PR tree-optimization/90662
	* gcc.dg/strlenopt-62.c: New test.
	* gcc.dg/strlenopt-63.c: New test.

diff --git a/gcc/testsuite/gcc.dg/strlenopt-62.c b/gcc/testsuite/gcc.dg/strlenopt-62.c
new file mode 100644
index 000..644bdee2c1b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-62.c
@@ -0,0 +1,89 @@
+/* PR tree-optimization/90662 - strlen of a string in a vla plus offset
+   not folded
+   { dg-do compile }
+   { dg-options "-O2 -Wall -fdump-tree-gimple -fdump-tree-optimized" } */
+
+#include "strlenopt.h"
+
+#define CONCAT(x, y) x ## y
+#define CAT(x, y) CONCAT (x, y)
+#define FAILNAME(name, counter) \
+  CAT (CAT (CAT (call_ ## name ##_on_line_, __LINE__), _), counter)
+
+#define FAIL(name, counter) do {			\
+extern void FAILNAME (name, counter) (void);	\
+FAILNAME (name, counter)();\
+  } while (0)
+
+/* Macro to emit a call to funcation named
+ call_in_true_branch_not_eliminated_on_line_NNN()
+   for each call that's expected to be eliminated.  The dg-final
+   scan-tree-dump-time directive at the bottom of the test verifies
+   that no such call appears in output.  */
+#define ELIM(expr) \
+  if (!(expr)) FAIL (in_true_branch_not_eliminated, __COUNTER__); else (void)0
+
+#define ARGS(...) __VA_ARGS__
+
+void sink (void*, ...);
+
+
+#define T(expect, init, p)\
+  do {			\
+char vla[n];	\
+char *ptr = strcpy (vla, init);			\
+ELIM (expect == strlen (p));			\
+sink (ptr);		\
+  } while (0)
+
+void test_vla_local (int n)
+{
+  T (0, "", ptr);
+  T (0, "\0", ptr);
+  T (1, "1", ptr);
+  T (2, "12", ptr);
+  T (3, "123", ptr);
+
+  T (2, "123", ptr + 1);
+  T (1, "123", [2]);
+  T (0, "123", [1] + 2);
+}
+
+
+#undef T
+#define T(expect, parr, init, p)			\
+  do {			\
+char (*parray)[] = *ppa++;\
+char *ptr = strcpy (parr, init);			\
+(void)		\
+ELIM (expect == strlen (p));			\
+  } while (0)
+
+/* Have the function take a pointer to pointers to arrays so that each
+   test case can use its own pointer to avoid interference between.  */
+
+void test_array_ptr (char (**ppa)[])
+{
+  T (0, *parray, "", *parray);
+  T (0, *parray, "", &(*parray)[0]);
+
+  T (1, *parray, "1", &(*parray)[0]);
+  T (0, *parray, "1", &(*parray)[1]);
+
+  T (2, *parray, "12", &(*parray)[0]);
+  T (1, *parray, "12", &(*parray)[1]);
+  T (0, *parray, "12", &(*parray)[2]);
+
+  T (3, *parray, "123", &(*parray)[0]);
+  T (2, *parray, "123", &(*parray)[1]);
+  T (1, *parray, "123", &(*parray)[2]);
+  T (0, *parray, "123", &(*parray)[3]);
+
+  T (3, *parray, "123", ptr);
+  T (2, *parray, "123", [1]);
+  T (1, *parray, "123", [2]);
+  T (0, *parray, "123", [3]);
+}
+
+/* { dg-final { scan-tree-dump-times "strlen" 0 "optimized" } }
+   { dg-final { scan-tree-dump-times "not_eliminated" 0 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-63.c b/gcc/testsuite/gcc.dg/strlenopt-63.c
new file mode 100644
index 000..ca3e55fd9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-63.c
@@ -0,0 +1,158 @@
+/* PR tree-optimization/90662 - strlen of a string in a vla plus offset
+   not folded
+   Verify that strlen of pointers to arrays are computed correctly
+   (whether folded or not).
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+#include "strlenopt.h"
+
+#define A(expr) \
+  ((expr)   \
+   ? (void)0\
+   : (__builtin_printf ("assertion failed on line %i: %s\n",\
+__LINE__, #expr),   \
+  __builtin_abort ()))
+
+typedef char A5[5];
+
+A5 a5[5];
+A5* p[5] = { [4], [3], [2], [1], [0] };
+
+__attribute__ ((noclone, noinline, noipa))
+void deref_deref (void)
+{
+  strcpy (**p, "12345");
+  A (strlen (**p) == 5);
+}
+
+__attribute__ ((noclone, noinline, noipa))
+void deref_idx_0 (void)
+{
+  strcpy (*p[0], "");
+  A (strlen (*p[0]) == 0);
+}
+
+__attribute__ ((noclone, noinline, noipa))
+void deref_idx_1 (void)
+{
+  strcpy (*p[1], "1");
+  A (strlen (*p[1]) == 1);
+  A (strlen (&(*p[1])[1]) == 0);
+
+  A (strlen (*p[0]) == 0);
+}
+
+__attribute__ ((noclone, noinline, noipa))

Re: [PATCH] [MIPS] Inhibit trailing .insn if pool is not followed by code

2019-06-05 Thread Maciej W. Rozycki
On Wed, 5 Jun 2019, Jeff Law wrote:

> >  Thanks for looking into it.  FWIW I think the `__pend' symbol will best 
> > be still emitted for consistency, however as STT_OBJECT and consequently 
> > with no trailing `.insn'.
> If I understand correctly we'd still want to call
> mips_set_text_contents_type in all the cases we did before, but that
> we'd pass in false for the FUNCTION_P argument?

 Yes, I think it would be the most straightforward implementation.

  Maciej


[Bug tree-optimization/90768] better range analysis for converting bit tests into less-than greater-than

2019-06-05 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-06-05
  Component|target  |tree-optimization
 Ever confirmed|0   |1

--- Comment #4 from Andrew Pinski  ---
(In reply to Shawn Landden from comment #2)
> Segher: Not sure if this is a PowerPC issue or a middle-end issue.

It is a generic issue (and shows up on all targets).  The middle-end was able
to converted the first case into:
(s - 8) <= 7

Re: [PATCH] preserve integer value of local addresses returned from functions (PR 90737)

2019-06-05 Thread Jeff Law
On 6/3/19 3:24 PM, Martin Sebor wrote:
> While testing a different -Wreturn-local-addr bug fix/enhancement
> I noticed that in functions that return integers as opposed to
> pointers such as:
> 
>   intptr_t f (int i) { return (intptr_t) }
> 
> the converted address is folded to zero.  This can be detected
> by strictly conforming programs so it's not really correct.
> Such statements also trigger the warning.
> 
> The attached patch adjusts the C and C++ front-ends to avoid
> the folding.
> 
> The patch also avoids the warning but I'm on the fence about that.
> There is some value in diagnosing it since it could be masking
> a bug.  Would anyone like to argue in favor of keeping it?
> 
> Martin
> 
> gcc-90737.diff
> 
> PR c/90737 - [8/9/10 Regression] inconsistent address of a local converted to 
> intptr_t between callee and caller
> 
> gcc/c/ChangeLog:
> 
>   PR c/90737
>   * c-typeck.c (c_finish_return): Only consider functions returning
>   pointers as candidates for -Wreturn-local-addr.
> 
> gcc/cp/ChangeLog:
> 
>   PR c/90737
>   * typeck.c (maybe_warn_about_returning_address_of_local): Only
>   consider functions returning pointers as candidates for
>   -Wreturn-local-addr.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/90737
>   * c-c++-common/Wreturn-local-addr.c: New test.
>   * g++.dg/warn/Wreturn-local-addr-6.C: New test.
OK
jeff


Re: [PATCH][MSP430][4/4] Implement 64-bit shifts in assembly code

2019-06-05 Thread Jeff Law
On 6/4/19 7:17 AM, Jozef Lawrynowicz wrote:
> This patch implements 64-bit shifts in assembly code. Previously, generic C
> library code from libgcc would be used to perform the shifts, which was much
> more costly in terms of code size.
> 
> I observed 700 PASS->FAIL regressions from the GCC testsuite alone when these
> 64-bit shifts were implemented incorrectly, hence I've assumed there is
> already adequate test coverage that shifts operate correctly, and I have not
> added new tests to verify their correct execution.
> 
> For the following program, the below code size reduction is observed:
>   long long a;
> 
>   int
>   main (void)
>   {
> a = a >> 4;
> return 0;
>   }
> 
> With shift patch 3:
>textdata bss dec hex filename
> 670  12  26 708 2c4 a.out
> With new patch:
>textdata bss dec hex filename
> 512  12  26 550 226 a.out
> 
> Ok for trunk?
> 
> 
> 0004-MSP430-Implement-64-bit-shifts-in-assembly-code.patch
> 
> From 3b34b3d005ea63b37cf6a277395a048e55d854b2 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 13 May 2019 17:55:27 +0100
> Subject: [PATCH 4/4] MSP430: Implement 64-bit shifts in assembly code
> 
> gcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.c (msp430_expand_helper): Setup arguments which
>   describe how to perform MSPABI compliant 64-bit shift.
>   * config/msp430/msp430.md (ashldi3): New define_expand.
>   (ashrdi3): New define_expand.
>   (lshrdi3): New define_expand.
> 
> libgcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/slli.S (__mspabi_s): New library function for
>   performing a logical left shift of a 64-bit value.
>   (__mspabi_srall): New library function for
>   performing a arithmetic right shift of a 64-bit value.
>   (__mspabi_srlll): New library function for
>   performing a logical right shift of a 64-bit value.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * gcc.target/msp430/mspabi_s.c: New test.
>   * gcc.target/msp430/mspabi_srall.c: New test.
>   * gcc.target/msp430/mspabi_srlll.c: New test.
Going to assume your assembly routines are correct :-)

OK
jeff


Re: [PATCH] [MIPS] Inhibit trailing .insn if pool is not followed by code

2019-06-05 Thread Jeff Law
On 6/4/19 3:07 PM, Maciej W. Rozycki wrote:
> On Mon, 3 Jun 2019, Faraz Shahbazker wrote:
> 
>> The __pool and __pend symbols are used to mark the beginning and end of
>> inline constant pools in MIPS16 code regions.  However if the pool occurs
>> at the boundary of a code region and is not followed by further code,
>> presence of the __pend symbol can confuse the dissassembler in to treating
>> subsequent non-MIPS16 code block as MIPS16.
> 
>  Thanks for looking into it.  FWIW I think the `__pend' symbol will best 
> be still emitted for consistency, however as STT_OBJECT and consequently 
> with no trailing `.insn'.
If I understand correctly we'd still want to call
mips_set_text_contents_type in all the cases we did before, but that
we'd pass in false for the FUNCTION_P argument?

jeff


Re: [PATCH] Put __dso_handle in .sdata/.sbss on ia64

2019-06-05 Thread Jeff Law
On 6/1/19 6:30 PM, James Clarke wrote:
> The symbol is exposed to C by dso_handle.h, and since it's a single
> 8-byte pointer, it is just within the threshold for being in the small
> data (or bss) section, so code accessing it will use GP-relative
> addressing. Therefore we must put it in .sdata/.sbss in case our other
> data sections grow too big and we overflow the 22-bit relocation.
> 
> libgcc/
>   * config/ia64/crtbegin.S (__dso_handle): Put in .sdata/.sbss
>   rather than .data/.bss so it can be accessed via gp-relative
>   addressing.
THanks.  Ive installed this on the trunk.

jeff


[Bug target/90768] better range analysis for converting bit tests into less-than greater-than

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

Shawn Landden  changed:

   What|Removed |Added

  Component|middle-end  |target

--- Comment #3 from Shawn Landden  ---
Yeah I think this is a target issue. These two functions should produce
identical code: https://godbolt.org/z/omu09e

[Bug middle-end/90768] better range analysis for converting lt/gt into bit tests

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

Shawn Landden  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #2 from Shawn Landden  ---
Segher: Not sure if this is a PowerPC issue or a middle-end issue.

[Bug middle-end/90768] better range analysis for converting lt/gt into bit tests

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

--- Comment #1 from Shawn Landden  ---
Whoops I got that backwards, converting the bit test to a
greater-than-or-equal-to is better.

Re: [PATCH][MSP430][3/4] Disable performance optimal library code shifts when optimizing for size

2019-06-05 Thread Jeff Law
On 6/4/19 7:14 AM, Jozef Lawrynowicz wrote:
> This patch reduces code size by disabling the performance optimized,
> "const_variant" of shift library functions when optimization for size is
> enabled.
> 
> For the following program, the below code size reduction is observed:
>   long a;
> 
>   int
>   main (void)
>   {
> a = a >> 4;
> return 0;
>   }
> 
> With shift patch 2:
>textdata bss dec hex filename
> 522  12  22 556 22c a.out
> New patch:
>textdata bss dec hex filename
> 474  12  22 508 1fc a.out
> 
> Ok for trunk?
> 
> 
> 0003-MSP430-Do-not-use-the-performance-optimized-variant-.patch
> 
> From 894b6809822ba3a3a1bab3750abe29e03f2a3ad6 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 13 May 2019 17:52:19 +0100
> Subject: [PATCH 3/4] MSP430: Do not use the performance optimized variant of a
>  shift by constant amount when optimizing for size
> 
> gcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.md (ashlhi3): Use the const_variant of shift
>   library functions only when not optimizing for size.
>   (ashlsi3): Likewise.
>   (ashrhi3): Likewise.
>   (ashrsi3): Likewise.
>   (lshrhi3): Likewise.
>   (lshrsi3): Likewise.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * gcc.target/msp430/size-optimized-shifts.c: New test.
OK
jeff


[Bug middle-end/90768] New: better range analysis for converting lt/gt into bit tests

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90768

Bug ID: 90768
   Summary: better range analysis for converting lt/gt into bit
tests
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

Converting the >= 8 to a & 8 results in one less instruction on power 9

https://godbolt.org/z/0QPN3z

#include 
#include 
int bcmp_2(char *a, char *b, size_t s) {
if (s < 16) {
if (s >= 8)
if (*(uint64_t *)a != *(uint64_t *)b)
return 1;

}
}

Re: [PATCH][MSP430][2/4] Emulate 16-bit shifts with rotate insn when src operand is originally in memory

2019-06-05 Thread Jeff Law
On 6/4/19 7:11 AM, Jozef Lawrynowicz wrote:
> This patch reduces code size by enabling the emulation of some 16-bit shift
> instructions with the native rotate instructions, when the source operand is 
> in
> memory. This is achieved by forcing the source operand into a register.
> 
> For the following program, the below code size reduction is observed:
>   int a;
> 
>   int
>   main (void)
>   {
> a = a << 4;
> return 0;
>   }
> 
> With shift patch 1:
>textdata bss dec hex filename
> 484  12  20 516 204 a.out
> With new patch:
>textdata bss dec hex filename
> 452  12  20 484 1e4 a.out
> 
> Ok for trunk?
> 
> 
> 0002-MSP430-Force-the-src-operand-of-a-HImode-shift-into-.patch
> 
> From e609f63d49227ce385316896dde6a476f5f27db7 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 13 May 2019 17:48:00 +0100
> Subject: [PATCH 2/4] MSP430: Force the src operand of a HImode shift into a
>  register if it is in memory
> 
> gcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.md (ashlhi3): Force shift src operand into a
>   register if it is in memory, so the shift can be emulated with a rotate
>   instruction.
>   (ashrhi3): Likewise.
>   (lshrhi3): Likewise.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * gcc.target/msp430/emulate-slli.c: New test.
>   * gcc.target/msp430/emulate-srai.c: New test.
>   * gcc.target/msp430/emulate-srli.c: New test.
OK
jeff
> ---


Re: [PATCH][MSP430][1/4] Put libgcc shift functions in their own sections

2019-06-05 Thread Jeff Law
On 6/4/19 7:07 AM, Jozef Lawrynowicz wrote:
> This patch reduces code size by putting each of the shift library functions 
> from
> libgcc in their own section. This means that, for example, a 16-bit logical
> left shift does not result in code to perform a 32-bit logical left shift 
> being
> included in the final executable, as the linker can now garbage collect unused
> sections.
> 
> For the following program, the below code size reduction is observed:
>   int a, b;
> 
>   int
>   main (void)
>   {
> a = a << b;
> return 0;
>   }
> 
> Current trunk:
>textdata bss dec hex filename
> 572  12  22 606 25e a.out
> With patch:
>textdata bss dec hex filename
> 466  12  22 500 1f4 a.out
> 
> Ok for trunk?
> 
> 
> 0001-MSP430-Put-the-library-functions-for-bitwise-shifts-.patch
> 
> From 8017a4b453ae1b07bbeb75f7f7613a5bc5605159 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 13 May 2019 17:42:08 +0100
> Subject: [PATCH 1/4] MSP430: Put the library functions for bitwise shifts in
>  their own sections
> 
> libgcc/ChangeLog
> 
> 2019-06-04  Jozef Lawrynowicz  
> 
>   * config/msp430/slli.S (__mspabi_slli_n): Put function in its own
>   section.
>   (__mspabi_slli): Likewise.
>   (__mspabi_slll_n): Likewise.
>   (__mspabi_slll): Likewise.
>   * config/msp430/srai.S (__mspabi_srai_n): Likewise.
>   (__mspabi_srai): Likewise.
>   (__mspabi_sral_n): Likewise.
>   (__mspabi_sral): Likewise.
>   * config/msp430/srli.S (__mspabi_srli_n): Likewise.
>   (__mspabi_srli): Likewise.
>   (__mspabi_srll_n): Likewise.
>   (__mspabi_srll): Likewise.
OK.
jeff


Re: [PATCH] Enable memory operand for vfpclass[p,s][s,d] patterns.

2019-06-05 Thread Jeff Law
On 6/5/19 1:39 AM, Hongtao Liu wrote:
> Hi Jeff and Jakub:
>   When adding new intrinsics(PR target/89803), i found vfpclassp[sd],
> vfpclasss[sd] patterns didn't support memory operand which is
> supported in instructions. So this patch is about to enable memory
> operands for vfpclassp[s,d]/vfpclasss[s,d] patterns.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux (on skylake-avx512),
> ok for trunk?
> 
> Changelog
> gcc/
> 2019-06-05  Hongtao Liu  
> 
> * config/i386/sse.md (define_mode_suffix vecmemsuffix): New.
> (define_insn "avx512dq_fpclass"):
> Enable memory operand for it.
> (define_insn "avx512dq_vmfpclass"): Ditto.
> 
> gcc/testsuite/
> 2019-06-05  Hongtao Liu  
> 
> * gcc.target/i386/avx512dq-vfpclasspd-1.c:
> Adjust scan assember for {x,y,z} suffix.
> * gcc.target/i386/avx512dq-vfpclassps-1.c: Ditto.
OK, but make sure you fix the ChangeLog formatting as you commit.

jeff


Re: [PATCH] IPA ICF: enhance dump output

2019-06-05 Thread Jeff Law
On 6/4/19 8:38 AM, Martin Liška wrote:
> Hi.
> 
> The patch is about simplification of dump output. Plus it prints
> also a file in which the dump message was emitted.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-06-04  Martin Liska  
> 
>   * ipa-icf-gimple.h (dump_message_1): Remove.
>   (dump_message): Likewise.
>   (return_false_with_message_1): Print also file.
>   (return_false_with_msg): Likewise.
>   (return_with_result): Likewise.
>   (return_with_debug): Likewise.
>   * ipa-icf.c (sem_function::equals_private): Remove call
>   to dump_message.
OK.
jeff


Re: Preventing ISO C errors when using macros for builtin types

2019-06-05 Thread Segher Boessenkool
On Wed, Jun 05, 2019 at 08:49:39PM +0100, Jozef Lawrynowicz wrote:
> On Wed, 5 Jun 2019 11:49:21 -0500
> Segher Boessenkool  wrote:
> > The documentation says
> > 
> >   '-pedantic' and other options cause warnings for many GNU C extensions.
> >   You can prevent such warnings within one expression by writing
> >   '__extension__' before the expression.  '__extension__' has no effect
> >   aside from this.
> > 
> > It's not clear to me why you cannot simply put __extension__ earlier in
> > your case?
> 
> If I am modifying tests on an individual basis, then indeed I can put
> __extension__ earlier in the declaration and it fixes the issue.

Or you could fix it like this in your header file?

> But for a fix within the compiler to prevent having to modify individual 
> tests,
> I could add __extension__ to the beginning of the macro - but the macro may
> not end up at the beginning of a declaration in the source code.
> 
> For example, say __SIZE_TYPE__ now expands to "__extension__ __int20 
> unsigned",
> then the following usage of __SIZE_TYPE__ would be ok, as __extension__ is at
> the beginning of the declaration:
> 
> > __SIZE_TYPE__ size;

Don't use macros for types?  You could use something like

  __extension__ typedef unsigned __int20 __SIZE_TYPE__;

(stddef.h already uses typedefs like that, btw, for __SIZE_TYPE__ in fact).

> I'm mainly just wondering if a change to the compiler to allow the usage of
> __extension__ within a declaration would be allowed.

Well, how would that work?  We'd need to modify the grammar to allow
__extension__ pretty much anywhere, and then change all compiler code
to not pedwarn until it has parsed a full expression (or statement, or
file, or however this will work).  Or make it not warn for things after
the __extension__ only, or make it only *guaranteed* not to warn for
things *after* the __extension__.

None of those options are very appetising, IMO.


Segher


[Bug c++/90767] [9/10 Regression] jumbled error message with this and const

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90767

--- Comment #2 from Marek Polacek  ---
Regressed with r264250.

Re: [PATCH] Fix HTML headers and titles for 'Installing GCC' pages (PR web/87933).

2019-06-05 Thread Jeff Law
On 6/5/19 3:09 AM, Martin Liška wrote:
> Hi.
> 
> The patch fixes wrong titles/header in 'Installing GCC' pages.
> 
> Tested with make html.
> Ready for trunk?
> 
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-06-05  Martin Liska  
> 
>   PR web/87933
>   * doc/install.texi: Fix HTML headers and
>   titles for 'Installing GCC' pages.
> ---
>  gcc/doc/install.texi | 1 -
>  1 file changed, 1 deletion(-)
> 
> 
OK
jeff


[Bug middle-end/90673] A problem with 'copy destination size is too small' error in copy_from_user

2019-06-05 Thread yaro330 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90673

--- Comment #6 from Yaro Slav  ---
(In reply to Kees Cook from comment #5)
> From the linked code:
> 
>   missing = copy_from_user(dbg_buff, buf, sizeof(buf));
> 
> dbg_buff is a global variable -- is writing to it thread safe?
> 
> sizeof(buf) is 8. (it's a pointer not an array), so that seems the wrong
> size?
> 
> I bet the error message for __bad_copy_to is busted and it really means
> __bad_copy_from.

The code that you used as an example is "fixed", gcc doesn't warn about it, it
warns when instead of 'sizeof(buf)' we pass 'count'.

[Bug c++/90767] [9/10 Regression] jumbled error message with this and const

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90767

Marek Polacek  changed:

   What|Removed |Added

   Keywords||diagnostic
   Target Milestone|--- |9.2
Summary|jumbled error message with  |[9/10 Regression] jumbled
   |this and const  |error message with this and
   ||const

--- Comment #1 from Marek Polacek  ---
Since g++ 8's output was much better, I'm marking this as a regression.

[Bug c++/90767] New: jumbled error message with this and const

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90767

Bug ID: 90767
   Summary: jumbled error message with this and const
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mpolacek at gcc dot gnu.org
  Target Milestone: ---

struct x {
  int n;
  void bye();

  template
  operator T() const
{
  if (n == 0)
bye();
  return n;
}
};

$ ./cc1plus -quiet f.cc
f.cc: In member function ‘x::operator T() const’:
f.cc:9:6: error: cannot convert ‘#‘addr_expr’ not supported by dump_type#’ to ‘x*’
9 |  bye();
  |  ^
f.cc:3:8: note:   initializing argument 'this' of ‘void x::bye()’
3 |   void bye();
  |^~~

whereas g++ 8 gave:
f.cc: In member function ‘x::operator T() const’:
f.cc:9:6: error: no matching function for call to ‘x::bye() const’
  bye();
  ^
f.cc:3:8: note: candidate: ‘void x::bye()’ 
   void bye();
^~~
f.cc:3:8: note:   passing ‘const x*’ as ‘this’ argument discards qualifiers

[Bug middle-end/90673] A problem with 'copy destination size is too small' error in copy_from_user

2019-06-05 Thread kees at outflux dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90673

Kees Cook  changed:

   What|Removed |Added

 CC||kees at outflux dot net

--- Comment #5 from Kees Cook  ---
From the linked code:

missing = copy_from_user(dbg_buff, buf, sizeof(buf));

dbg_buff is a global variable -- is writing to it thread safe?

sizeof(buf) is 8. (it's a pointer not an array), so that seems the wrong size?

I bet the error message for __bad_copy_to is busted and it really means
__bad_copy_from.

[Bug tree-optimization/90766] strlen(a + i) missing range for arrays of unknown bound with strings of known length and variable i

2019-06-05 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90766

Martin Sebor  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Blocks||83819

--- Comment #1 from Martin Sebor  ---
For the same reason, although GCC folds the strlen() computation in the first
two functions into the difference N - i (where N is the known length of the
string), it fails to do the same thing in the last instance (ditto when b is a
char*).

$ cat a.c && gcc -O2 -S -Wall -Wextra -Wpedantic
-fdump-tree-optimized=/dev/stdout a.c
typedef __SIZE_TYPE__ size_t;

const char s[] = "123";

size_t f (unsigned i)
{
  return __builtin_strlen ([i]);   // folded to 3 - i
}

extern char a[8];

size_t g (unsigned i)
{
  __builtin_strcpy (a, "123");
  return __builtin_strlen ([i]);   // folded to 3 - i
}

extern char b[];

size_t h (unsigned i)
{
  __builtin_strcpy (b, "123");
  return __builtin_strlen ([i]);   // not folded but could be
}

;; Function f (f, funcdef_no=0, decl_uid=1908, cgraph_uid=1, symbol_order=1)

Removing basic block 5
f (unsigned int i)
{
  sizetype _1;
  size_t iftmp.0_2;
  size_t iftmp.0_4;

   [local count: 1073741824]:
  if (i_3(D) <= 3)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  _1 = (sizetype) i_3(D);
  iftmp.0_4 = 3 - _1;

   [local count: 1073741824]:
  # iftmp.0_2 = PHI 
  return iftmp.0_2;

}



;; Function g (g, funcdef_no=1, decl_uid=1912, cgraph_uid=2, symbol_order=2)

Removing basic block 5
g (unsigned int i)
{
  sizetype _1;
  size_t iftmp.1_2;
  size_t iftmp.1_6;

   [local count: 1073741824]:
  __builtin_memcpy (, "123", 4);
  if (i_5(D) <= 3)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  _1 = (sizetype) i_5(D);
  iftmp.1_6 = 3 - _1;

   [local count: 1073741824]:
  # iftmp.1_2 = PHI 
  return iftmp.1_2;

}



;; Function h (h, funcdef_no=2, decl_uid=1916, cgraph_uid=3, symbol_order=3)

h (unsigned int i)
{
  char * _1;
  size_t _5;
  sizetype _6;

   [local count: 1073741824]:
  __builtin_memcpy (, "123", 4);
  _6 = (sizetype) i_4(D);
  _1 =  + _6;
  _5 = __builtin_strlen (_1); [tail call]
  return _5;

}


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83819
[Bug 83819] [meta-bug] missing strlen optimizations

[Bug tree-optimization/90766] New: strlen(a + i) missing range for arrays of unknown bound with strings of known length and variable i

2019-06-05 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90766

Bug ID: 90766
   Summary: strlen(a + i) missing range for arrays of unknown
bound with strings of known length and variable i
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

GCC manages to fold the first two conditionals involving strlen but not the
third, even though it has all the information to do that.  The
maybe_set_strlen_range() function in tree-ssa-strlen.c doesn't take the known
length of the string stored in b into consideration.

$ cat a.c && gcc -O2 -S -Wall -Wextra -Wpedantic
-fdump-tree-optimized=/dev/stdout a.c
const char s[] = "123";

void f (int i)
{
  if (__builtin_strlen ([i]) > 3)   // folded to false, good
__builtin_abort ();
}

extern char a[8];

void g (int i)
{
  __builtin_strcpy (a, "123");
  if (__builtin_strlen ([i]) > 3)   // folded to false, good
__builtin_abort ();
}

extern char b[];

void h (int i)
{
  __builtin_strcpy (b, "123");
  if (__builtin_strlen ([i]) > 3)   // not folded but could be
__builtin_abort ();
}

;; Function f (f, funcdef_no=0, decl_uid=1907, cgraph_uid=1, symbol_order=1)

f (int i)
{
   [local count: 1073741824]:
  return;

}



;; Function g (g, funcdef_no=1, decl_uid=1911, cgraph_uid=2, symbol_order=2)

g (int i)
{
   [local count: 1073741824]:
  __builtin_memcpy (, "123", 4); [tail call]
  return;

}



;; Function h (h, funcdef_no=2, decl_uid=1915, cgraph_uid=3, symbol_order=3)

h (int i)
{
  char * _1;
  long unsigned int _2;
  sizetype _7;

   [local count: 1073741824]:
  __builtin_memcpy (, "123", 4);
  _7 = (sizetype) i_5(D);
  _1 =  + _7;
  _2 = __builtin_strlen (_1);
  if (_2 > 3)
goto ; [0.00%]
  else
goto ; [100.00%]

   [count: 0]:
  __builtin_abort ();

   [local count: 1073741824]:
  return;

}

Re: undefined behavior in value_range::equiv_add()?

2019-06-05 Thread Jeff Law
On 6/4/19 9:04 AM, Richard Biener wrote:
> On Tue, Jun 4, 2019 at 3:40 PM Jeff Law  wrote:
>> 
>> On 6/4/19 5:23 AM, Richard Biener wrote:
>>> On Tue, Jun 4, 2019 at 12:30 AM Jeff Law  wrote:
 
 On 6/3/19 7:13 AM, Aldy Hernandez wrote:
> On 5/31/19 5:00 AM, Richard Biener wrote:
>> On Fri, May 31, 2019 at 2:27 AM Jeff Law 
>> wrote:
>>> 
>>> On 5/29/19 10:20 AM, Aldy Hernandez wrote:
 On 5/29/19 12:12 PM, Jeff Law wrote:
> On 5/29/19 9:58 AM, Aldy Hernandez wrote:
>> On 5/29/19 9:24 AM, Richard Biener wrote:
>>> On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez
>>>  wrote:
 
 As per the API, and the original documentation
 to value_range, VR_UNDEFINED and VR_VARYING
 should never have equivalences. However, 
 equiv_add is tacking on equivalences blindly,
 and there are various regressions that happen
 if I fix this oversight.
 
 void value_range::equiv_add (const_tree var, 
 const value_range *var_vr, bitmap_obstack
 *obstack) { if (!m_equiv) m_equiv =
 BITMAP_ALLOC (obstack); unsigned ver =
 SSA_NAME_VERSION (var); bitmap_set_bit
 (m_equiv, ver); if (var_vr && var_vr->m_equiv) 
 bitmap_ior_into (m_equiv, var_vr->m_equiv); }
 
 Is this a bug in the documentation / API, or is
 equiv_add incorrect and we should fix the
 fall-out elsewhere?
>>> 
>>> I think this must have been crept in during the
>>> classification. If you go back to say GCC 7 you
>>> shouldn't see value-ranges with UNDEFINED/VARYING
>>> state in the lattice that have equivalences.
>>> 
>>> It may not be easy to avoid with the new classy
>>> interface but we're certainly not tacking on them
>>> "blindly".  At least we're not supposed to.  As
>>> usual the intermediate state might be "broken"
>>> but intermediateness is not sth the new class
>>> "likes".
>> 
>> It looks like extract_range_from_stmt (by virtue
>> of vrp_visit_assignment_or_call and then
>> extract_range_from_ssa_name) returns one of these
>> intermediate ranges.  It would seem to me that an 
>> outward looking API method like
>> vr_values::extract_range_from_stmt shouldn't be
>> returning inconsistent ranges.  Or are there no 
>> guarantees for value_ranges from within all of
>> vr_values?
> ISTM that if we have an implementation constraint
> that says a VR_VARYING or VR_UNDEFINED range can't
> have equivalences, then we need to honor that at the
> minimum for anything returned by an external API. 
> Returning an inconsistent state is bad.  I'd even
> state that we should try damn hard to avoid it in
> internal APIs as well.
 
 Agreed * 2.
 
> 
>> 
>> Perhaps I should give a little background.  As part
>> of your value_range_base re-factoring last year,
>> you mentioned that you didn't split out intersect
>> like you did union because of time or oversight.
>> I have code to implement intersect (attached), for
>> which I've noticed that I must leave equivalences
>> intact, even when transitioning to VR_UNDEFINED:
>> 
>> [from the attached patch] +  /* If THIS is varying
>> we want to pick up equivalences from OTHER. +
>> Just special-case this here rather than trying to
>> fixup after the + fact.  */ +  if
>> (this->varying_p ()) +this->deep_copy (other); 
>> +  else if (this->undefined_p ()) +/* ?? Leave
>> any equivalences already present in an undefined. +
>> This is technically not allowed, but we may get an
>> in-flight +   value_range in an intermediate
>> state.  */
> Where/when does this happen?
 
 The above snippet is not currently in mainline.  It's
 in the patch I'm proposing to clean up intersect.  It's
 just that while cleaning up intersect I noticed that if
 we keep to the value_range API, we end up clobbering an
 equivalence to a VR_UNDEFINED that we depend up further
 up the call chain.
 
 The reason it doesn't happen in mainline is because
 intersect_helper bails early on an undefined, thus
 leaving the problematic equivalence intact.
 
 You can see it in mainline though, with the following
 testcase:
 
 int f(int x) { if (x != 0 && x != 1) return -2;
 
 return !x; }
 
 Break in evrp_range_analyzer::record_ranges_from_stmt()
 and see that the call to extract_range_from_stmt()

Go patch committed: Make calls and functions inlinable

2019-06-05 Thread Ian Lance Taylor
This patch to the Go frontend makes call expressions and function
reference expressions inlinable.

We now scan inlinable methods for references to global variables and
functions (I forgot to do that earlier).

We now track all packages mentioned by exports (that should have been
done earlier too).

We record assembler names in the export data, so that we can inline
calls to non-Go functions.  We modify gccgoimporter code to skip
assembler name.

This increases the number of inlinable functions in the standard
library from 215 to 439.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 271945)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-949c3b7aa603bc09e650d62e82c600b3463802f0
+2609f9b8420e2341fbbe40d7cf6af42b0fba7293
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/export.cc
===
--- gcc/go/gofrontend/export.cc (revision 271891)
+++ gcc/go/gofrontend/export.cc (working copy)
@@ -133,6 +133,11 @@ Collect_references_from_inline::expressi
   if (fe != NULL)
 {
   Named_object* no = fe->named_object();
+
+  if (no->is_function_declaration()
+ && no->func_declaration_value()->type()->is_builtin())
+   return TRAVERSE_CONTINUE;
+
   std::pair ins =
this->exports_->insert(no);
 
@@ -247,6 +252,22 @@ Export::export_globals(const std::string
  if ((*p)->is_function()
  && (*p)->func_value()->export_for_inlining())
check_inline_refs.push_back(*p);
+ else if ((*p)->is_type())
+   {
+ const Bindings* methods = (*p)->type_value()->local_methods();
+ if (methods != NULL)
+   {
+ for (Bindings::const_definitions_iterator pm =
+methods->begin_definitions();
+  pm != methods->end_definitions();
+  ++pm)
+   {
+ Function* fn = (*pm)->func_value();
+ if (fn->export_for_inlining())
+   check_inline_refs.push_back(*pm);
+   }
+   }
+   }
}
 }
 
@@ -282,6 +303,9 @@ Export::export_globals(const std::string
}
 }
 
+  // Track all imported packages mentioned in export data.
+  Unordered_set(const Package*) all_imports;
+
   // Export the symbols in sorted order.  That will reduce cases where
   // irrelevant changes to the source code affect the exported
   // interface.
@@ -291,15 +315,20 @@ Export::export_globals(const std::string
   for (Unordered_set(Named_object*)::const_iterator p = exports.begin();
p != exports.end();
++p)
-sorted_exports.push_back(*p);
+{
+  sorted_exports.push_back(*p);
+
+  const Package* pkg = (*p)->package();
+  if (pkg != NULL)
+   all_imports.insert(pkg);
+}
 
   std::sort(sorted_exports.begin(), sorted_exports.end(), Sort_bindings());
 
   // Assign indexes to all exported types and types referenced by
   // exported types, and collect all packages mentioned.
-  Unordered_set(const Package*) type_imports;
   int unexported_type_index = this->prepare_types(_exports,
- _imports);
+ _imports);
 
   // Although the export data is readable, at least this version is,
   // it is conceptually a binary format.  Start with a four byte
@@ -327,7 +356,7 @@ Export::export_globals(const std::string
 
   this->write_packages(packages);
 
-  this->write_imports(imports, type_imports);
+  this->write_imports(imports, all_imports);
 
   this->write_imported_init_fns(package_name, import_init_fn,
imported_init_fns);
@@ -693,7 +722,7 @@ import_compare(const std::pair& imports,
- const Unordered_set(const Package*)& type_imports)
+ const Unordered_set(const Package*)& all_imports)
 {
   // Sort the imports for more consistent output.
   Unordered_set(const Package*) seen;
@@ -729,8 +758,8 @@ Export::write_imports(const std::map indirect_imports;
   for (Unordered_set(const Package*)::const_iterator p =
-type_imports.begin();
-   p != type_imports.end();
+all_imports.begin();
+   p != all_imports.end();
++p)
 {
   if (seen.find(*p) == seen.end())
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 271945)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -1356,6 +1356,29 @@ Func_expression::do_get_backend(Translat
   return gogo->backend()->convert_expression(btype, bexpr, 

On-Demand range technology [6/5] - Integration

2019-06-05 Thread Andrew MacLeod
After the various discussions, I've evaluated how I think everything can 
fit together, so this is my proposal for integration with trunk.



The complete Ranger prototype consists of 5 major  components, one of 
which is missing/un-implemented as yet :-)


1 - irange -  This is the non-symbolic range implementation we've 
been using which represents ranges as groups of ordered sub-ranges.
2 - range-ops -  This is the component which extracts ranges from 
statements, and so performs the functionality of extract_range_from_*,  
except it operates on the irange API and also allows for solving of 
operands other than just the LHS of the expression.
3 - GORI-computes -  This is the component which utilizes range-ops to 
compute a range on an outgoing edge for any ssa-name in the definition 
chain of the branch

  a_3 = b_6 * 2
  d_8 = a_3 - 20
 if (d_8 < 30)
   the GORI-compute component can generate ranges for d_8, a_3 and b_6.
4 - GORI-Cache and the Ranger.  Working together, this provides the 
on-demand range functionality to resolve ranges
5 - relational/equivalency tracker - This is the sketched out but 
unimplemented bit which tracks the symbolic relationships, and remove 
the need for ranges to support symbolics. ( <,<=, >, >=, ==, != and none).


The consensus appears to be that range-ops and gori-computes are good 
candidates to replace aspects of vr-values and assert generation.


A)
Until I get to (5) (relational tracker), using (1) (irange) is a 
non-starter since it doesn't handle symbolics.


To eliminate the range issue from the equation,  Aldy is currently 
working on unifying the irange and value_range APIs.   This will allow 
the rest of the ranger code base to use the value_range implementation 
transparently.   We can talk about irange or some alternate 
implementation of ranges at some later point, but there'll be an API 
that works for all clients.


The existing value_range API gets a few tweaks/cleanups, but mostly 
there is an additional set of calls to query sub-ranges which the ranger 
and range-ops require. These routines basically translate the various 
value ranges formats into discrete sub-ranges.  Thru these rotuines, 
ANTI_RANGE will appear as 2 sub-ranges, VARYING as a [MIN, MAX] range, 
and UNDEFINED as an empty range [].  These additions should allow 
value_range to function as the range implementation for both the ranger 
and VRP.


I suspect he will have patches coming shortly that will help to unify 
the 2 range implementations, we can discuss details over those patches..


B)
A Unified range API then allows us to work on integrating the range-ops 
and GORI-computes component into the code base.   Range ops would 
replace the various extract_range_from_*_ routines in vr_values for 
statement level ranges.  GORI-computes would then replace the assert 
building code for calculating outgoing ranges on edges.  In theory EVRP 
then simply calls range_on_edge() from gori_compute instead of 
register_edge_assert() .


The range ops code is designed to perform all evaluations assuming an 
arbitrary number of sub-ranges.  Aldy spent a lot of time last year 
unifying the VRP code and the range-ops code to get the identical 
results, and they frequently share a common base. He  has gone thru 
excruciating care to ensure the calculations are identical and verifies 
it by calculating everything using both code bases, comparing them, and 
aborting if the results ever get diverge.


We will need to adjust the range-ops code to work with symbolics in 
certain place. This means PLUS, MINUS, all the relations (<,>, etc), and 
copy. Probably something else as it is encountered. This is un-sized as 
yet, but I'm hoping won't be too bad assuming we can utilize some of the 
existing code for those bits..  More details when we actually start 
doing this and find the lurking dragons.


we'll worry about bitmasks and equivalencies when we get closer to 
functioning, but I don't foresee too much problem since value_range_base 
is still being used.



C)  That will keep us busy for a while, and should result in the core 
integration.   Meanwhile, we'll try to figure out the relational code 
design. I'll go back to my original design, adjust that, then we can 
figure out how best to proceed to address the various needs.


D) Finally, the GORI-cache and on-demand ranger are blocked until the 
above work is finished.



One additional thing I would like to do eventually is tweak EVRP 
slightly to align with the ranger model.


The ranger API is basically just 5 entry points which the ranger uses to 
determine ranges.

    range_of_expr  - range of a use on a statement
    range_of_stmt  - range of the result of the statement, (calculated 
by range-ops).

    range_on_edge - range on an edge - (provided by gori_computes)
    range_on_entry - range on entry to a block (provided by gori-cache)
    range_on_exit - range after the last statement in a block

Abstracted and simplified, I 

Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime

2019-06-05 Thread Janne Blomqvist
On Wed, Jun 5, 2019 at 10:42 PM 김규래  wrote:

> > On Wed, Jun 5, 2019 at 9:25 PM 김규래  wrote:
>
> >
> > > Hi, thanks for the detailed explanation.
> > > I think I now get the picture.
> > > Judging from my current understanding, the task-parallelism currently
> works as follows:
> > > 1. Tasks are placed in a global shared queue.
> > > 2. Workers consume the tasks by bashing the queue in a while loop,
> just as self-scheduling (dynamic scheduling)/
> > >
> > > Then the improvements including work-stealing must be done by:
> > > 1. Each worker holds a dedicated task queue reducing the resource
> contention.
> > > 2. The tasks are distributed in a round-robin fashion
>
> > For nested task submission (does OpenMP support that?) you probably
> > want to submit to the local queue rather than round-robin, no?
>
>
>
> Hi Janne,
>
> I believe you're talking about spawning child tasks within an already
> running task, which I believe is supported by OpenMP.
>
> In this case, pushing to the local queue could introduce a deadlock if the
> master thread waits on the spawned tasks.
>
> A short one though since work-stealing can resolve the deadlock.
>
> A better way to handle this is to round-robin the child tasks to the
> queues excluding the queue of the thread consuming the current task.
>
> Then waiting on the tasks should never cause a deadlock.
>
> Or did I misunderstand your question?
>
> Maybe there is a specific reason for avoiding avoid round-robin that I
> missed?
>
Better cache locality.

Another option, which I guess starts to go out of scope of your gsoc, is
parallel depth first (PDF) search (Blelloch 1999) as an alternative to work
stealing. Here's a presentation about some recent work in this area,
although for Julia and not OpenMP (no idea if PDF would fit with OpenMP at
all): https://www.youtube.com/watch?v=YdiZa0Y3F3c

-- 
Janne Blomqvist


Re: [PATCH] Improve PTA flow-sensitivity (for the return stmt)

2019-06-05 Thread Jeff Law
On 6/5/19 6:51 AM, Richard Biener wrote:
> 
> The following was inspired by Marins work on escapes of locals
> and the discussion there.  It teaches points-to analysis that
> the point of function return is special and thus escapes through
> that a) do not influence other points-to solutions, b) can be
> pruned of all locals.
> 
> This is one example of reasonably simple "post-processing".
> 
> The effects are small, I've done statistics, counting the number
> of variables we do not mark escaped only after this patch.  This
> number is usually zero, sometimes one and a few cases more
> (but never more than 11) during bootstrap:
> 
> 0 95830
> 1 19268
> 2 19
> 3 2
> 5 2
> 6 1
> 8 1
> 11 1
> 
> so not sure if it is worth all the effort.  It does allow us
> to do more DSE but that requires the accesses to be indirect
> which is not often true for locals.
> 
> Bootstrapped / tested on x86_64-unknown-linux-gnu.
> 
> Martin, does this help you at all?  Anybody thinks this is
> worth the trouble?
> 
> Thanks,
> Richard.
> 
> 2019-06-05  Richard Biener  
> 
>   * tree-ssa-structalias.c: Include tree-cfg.h.
>   (make_heapvar): Do not make heap vars artificial.
>   (find_func_aliases_for_builtin_call): Handle stack allocation
>   functions.
>   (find_func_aliases): Delay processing of simple enough returns
>   in non-IPA mode.
>   (set_uids_in_ptset): Adjust.
>   (find_what_var_points_to): Likewise.
>   (solve_constraints): Do not dump points-to sets here.
>   (compute_points_to_sets): Post-process return statements,
>   amending the escaped solution.  Dump points-to sets afterwards.
>   (ipa_pta_execute): Dump points-to sets.
> 
>   * gcc.dg/tree-ssa/alias-37.c: New testcase.
>   * gcc.dg/torture/20190604-1.c: Likewise.
>   * gcc.dg/tree-ssa/pta-callused.c: Adjust.
It's a fair amount of code for a corner case, _but_ I think it's a step
in the right direction.  I suspect we ought to be using the alias oracle
solutions for more than we are today and improved precision certainly
doesn't hurt.

Jeff


[Bug target/90751] -fpatchtable-function-entry broken on hppa-linux-gnu-gcc/hppa64-linux-gnu-gcc

2019-06-05 Thread svens at stackframe dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90751

--- Comment #5 from Sven Schnelle  ---
I can confirm that the patch from Dave fixes the issue for me.

Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-06-05 Thread Eric Botcazou
> This issue exists, not just for targets that can have their
> MAX_FIXED_MODE_SIZE more-or-less easily tweaked higher, but also
> for the 'bit-container' targets where it *can't* be set higher.
> 
> Let's please DTRT and correct the code here in the middle-end,
> so we don't ICE for those targets and this line (gcc.dg/pr69973.c):
>  typedef int v4si __attribute__ ((vector_size (1 << 29)));
> (all listed targets happen to have Pmode == SImode)
> 
> So, considering that: ok to commit?

You'd need to audit the effects on other targets though.  Are we sure that we 
want to do bitsizetype calculations in a larger mode on very embedded targets?

-- 
Eric Botcazou


Re: Preventing ISO C errors when using macros for builtin types

2019-06-05 Thread Jozef Lawrynowicz
On Wed, 5 Jun 2019 11:49:21 -0500
Segher Boessenkool  wrote:

> On Wed, Jun 05, 2019 at 02:25:59PM +0100, Jozef Lawrynowicz wrote:
> > I'm assuming it would not be valid to modify the behaviour of __extension__
> > so it can be placed within a declaration, and not just at the
> > beginning. However, there is minimal documentation on this keyword (it does 
> > not
> > state that it can be used in declarations, even though it can), so I wonder
> > what the "rules" are.  
> 
> The documentation says
> 
>   '-pedantic' and other options cause warnings for many GNU C extensions.
>   You can prevent such warnings within one expression by writing
>   '__extension__' before the expression.  '__extension__' has no effect
>   aside from this.
> 
> It's not clear to me why you cannot simply put __extension__ earlier in
> your case?

If I am modifying tests on an individual basis, then indeed I can put
__extension__ earlier in the declaration and it fixes the issue.

But for a fix within the compiler to prevent having to modify individual tests,
I could add __extension__ to the beginning of the macro - but the macro may
not end up at the beginning of a declaration in the source code.

For example, say __SIZE_TYPE__ now expands to "__extension__ __int20 unsigned",
then the following usage of __SIZE_TYPE__ would be ok, as __extension__ is at
the beginning of the declaration:

> __SIZE_TYPE__ size;

But the following would emit an error, because __extension__ is not at the
beginning of the declaration:

> typedef __SIZE_TYPE__ size_t;

I'm mainly just wondering if a change to the compiler to allow the usage of
__extension__ within a declaration would be allowed.
However since there are many contexts where usage of a type such as
__SIZE_TYPE__ is valid, but where __extension__ preceding the type wouldn't be
valid, this is probably not worth exploring..

I'm aware of some different ways that the tests can be fixed up to prevent
the ISO C errors caused by -pedantic-errors.
I'm currently thinking the best way to fix these issues in the testsuite might
be to implement a directive such as "dg-prune-options". This directive can then
prune "-pedantic-errors" from the options to be passed to GCC when some
specified condition (e.g. -mlarge is in the opt list) is matched.
This would at least avoid having to re-jig tests to get the __extension__
keyword into the right places, as the dg-prune-options can just be added to the
problematic tests.

Thanks,
Jozef


Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime

2019-06-05 Thread 김규래
> On Wed, Jun 5, 2019 at 9:25 PM 김규래  wrote: 
>
> > Hi, thanks for the detailed explanation.
> > I think I now get the picture.
> > Judging from my current understanding, the task-parallelism currently works 
> > as follows:
> > 1. Tasks are placed in a global shared queue.
> > 2. Workers consume the tasks by bashing the queue in a while loop, just as 
> > self-scheduling (dynamic scheduling)/
> >
> > Then the improvements including work-stealing must be done by:
> > 1. Each worker holds a dedicated task queue reducing the resource 
> > contention.
> > 2. The tasks are distributed in a round-robin fashion

> For nested task submission (does OpenMP support that?) you probably
> want to submit to the local queue rather than round-robin, no? 
 
Hi Janne,
I believe you're talking about spawning child tasks within an already running 
task, which I believe is supported by OpenMP.
In this case, pushing to the local queue could introduce a deadlock if the 
master thread waits on the spawned tasks. 
A short one though since work-stealing can resolve the deadlock.
A better way to handle this is to round-robin the child tasks to the queues 
excluding the queue of the thread consuming the current task.
Then waiting on the tasks should never cause a deadlock.
Or did I misunderstand your question?
Maybe there is a specific reason for avoiding avoid round-robin that I missed?
 
Ray Kim


Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-06-05 Thread Hans-Peter Nilsson
> From: Richard Biener 
> Date: Wed, 29 May 2019 15:04:42 +0200

> On Tue, May 28, 2019 at 5:43 PM Hans-Peter Nilsson
>  wrote:
> >
> > TL;DR: instead of capping TYPE_PRECISION of bitsizetype at
> > MAX_FIXED_MODE_SIZE, search for the largest fitting size from
> > scalar_int_mode modes supported by the target using
> > targetm.scalar_mode_supported_p.
> >
> > -
> > In initialize_sizetypes, MAX_FIXED_MODE_SIZE is used as an upper
> > limit to the *precision* of the bit size of the size-type
> > (typically address length) of the target, which is wrong.
> >
> > The effect is that if a 32-bit target says "please don't cook up
> > pieces larger than a register size", then we don't get more
> > precision in address-related calculations than that, while the
> > bit-precision needs to be at least (precision +
> > LOG2_BITS_PER_UNIT + 1) with precision being the size of the
> > address, to diagnose overflows.  There are gcc_asserts that
> > guard this, causing ICE when broken.
> >
> > This MAX_FIXED_MODE_SIZE usage comes from r118977 (referencing
> > PR27885 and PR28176) and was introduced as if
> > MAX_FIXED_MODE_SIZE is the size of the largest supported type
> > for the target (where "supported" is in the most trivial sense
> > as in can move and add).  But it's not.
> >
> > MAX_FIXED_MODE_SIZE is arguably a bit vague, but documented as
> > "the size in bits of the largest integer machine mode that
> > should actually be used.  All integer machine modes of this size
> > or smaller can be used for structures and unions with the
> > appropriate sizes."
> 
> I read it as the machine may not have ways to do basic
> things like add two numbers in modes bigger than this
> but you can use larger modes as simple bit "containers".

Quick dismissal of the "should actually" in the documentation and
of my code-digging findings noted.

Either way, for expediency, it sounds like you accept that
MAX_FIXED_MODE_SIZE can validly be set to just the bitsize of
Pmode for a target (for example, targets where Pmode=SImode and
DImode is just a 'bit-container'), so let's skip forward to...

> >  While in general the documentation
> > sometimes differs from reality, that's mostly right, with
> > "should actually be" meaning "is preferably": it's the largest
> > size that the target indicates as beneficial of use besides that
> > directly mapped from types used in the source code; sort-of a
> > performance knob.  (I did a static reality code check looking
> > for direct and indirect uses before imposing this my own
> > interpretation and recollection.)  Typical use is when gcc finds
> > that some operations can be combined and synthesized to
> > optionally use a wider mode than seen in the source (but mostly
> > copying).  Then this macro sets an upper limit to the those
> > operations, whether to be done at all or the chunk-size.
> > Unfortunately some of the effects are unintuitive and I wouldn't
> > be surprised if this de-facto affects ABI corners.  It's not
> > something you tweak more than once for a target.
> >
> > Two tests pass with this fixed for cris-elf (MAX_FIXED_MODE_SIZE
> > 32): gcc.dg/attr-vector_size.c and gcc.dg/pr69973.c, where the
> > lack of precision (32 bits instead of 64 bits for bitsizetype)
> > caused an consistency check to ICE, from where I tracked this.
> 
> So why does cris-elf have 32 as MAX_FIXED_MODE_SIZE when it
> can appearantly do DImode arithmetic just fine?

(For performance reasons, by choice rather than necessity.  Long
time ago; this was in the initial commit.  That's all incidental.)

>  On x86_64
> we end up with TImode which is MAX_FIXED_MODE_SIZE, on
> 32bit x86 it is DImode.
> 
> So - fix cris instead?

...here:

Umm no, this is not a CRIS-specific issue, that's just one
target where the issue was spotted.  See last for more targets.
Please don't suggest sweeping this bug under the carpet, for the
CRIS port or other targets, by tweaking their
MAX_FIXED_MODE_SIZE.  Also, as I mentioned, this can have other
unwanted effects; the macro is used elsewhere.  IMHO it's usage
should be replaced by more specific target settings.

This issue exists, not just for targets that can have their
MAX_FIXED_MODE_SIZE more-or-less easily tweaked higher, but also
for the 'bit-container' targets where it *can't* be set higher.

Let's please DTRT and correct the code here in the middle-end,
so we don't ICE for those targets and this line (gcc.dg/pr69973.c):
 typedef int v4si __attribute__ ((vector_size (1 << 29)));
(all listed targets happen to have Pmode == SImode)

So, considering that: ok to commit?

> > Regarding the change, MAX_FIXED_MODE_SIZE is still mentioned but
> > just to initialize the fall-back largest-supported precision.
> > Sometimes the target supports no larger mode than that of the
> > address, like for a 64-bit target lacking support for larger
> > sizes (e.g. TImode), as in the motivating PR27885.  I guess they
> > can still get ICE for overflowing address-calculation checks,
> > but 

[Bug rtl-optimization/90765] New: preferred_stack_boundary is updated for callee

2019-06-05 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90765

Bug ID: 90765
   Summary: preferred_stack_boundary is updated for callee
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: skpgkp1 at gmail dot com
  Target Milestone: ---

locate_and_pad_parm is called when expanding function call from
initialize_argument_information and when generating function body
from assign_parm_find_entry_rtl:

  /* Remember if the outgoing parameter requires extra alignment on the
 calling function side.  */
  if (crtl->stack_alignment_needed < boundary)
crtl->stack_alignment_needed = boundary;
  if (crtl->preferred_stack_boundary < boundary)
crtl->preferred_stack_boundary = boundary;

preferred_stack_boundary should be updated only when expanding function
call, not when generating function body.  In this testcase,
crtl->preferred_stack_boundary is to set 512 when not needed:

[hjl@gnu-cfl-1 lea-3]$ cat 1.i
typedef int __v16si __attribute__ ((__vector_size__ (64)));

void
foo (__v16si x, int i0, int i1, int i2, int i3, int i4, int i5, __v16si *p)
{
  *p = x;
}
[hjl@gnu-cfl-1 lea-3]$ make 1.s
/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/ -mavx512f -O2
-Wa,-mx86-used-note=yes -S 1.i
[hjl@gnu-cfl-1 lea-3]$ cat 1.s
.file   "1.i"
.text
.p2align 4
.globl  foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
pushq   %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq%rsp, %rbp
.cfi_def_cfa_register 6
andq$-64, %rsp
movq16(%rbp), %rax
vmovdqa64   %zmm0, (%rax)
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size   foo, .-foo
.ident  "GCC: (GNU) 10.0.0 20190605 (experimental)"
.section.note.GNU-stack,"",@progbits
[hjl@gnu-cfl-1 lea-3]$

Re: Review Hashtable extract node API

2019-06-05 Thread Jonathan Wakely

On 05/06/19 17:43 +0100, Jonathan Wakely wrote:

On 05/06/19 17:22 +0100, Jonathan Wakely wrote:

On 04/06/19 19:19 +0200, François Dumont wrote:

@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__node_base*
_M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no rehash (assumes
-  // no element with its key already present). Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket __bkt if no
+  // rehash (assumes no element with its key already present).
+  template
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-   __node_type* __n, size_type __n_elt = 1);
+   _M_insert_unique_node(const key_type& __k, size_type __bkt,
+ __hash_code __code, const _NodeAccessor&,
+ size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template
iterator
-  _M_insert_multi_node(__node_type* __hint,
-  __hash_code __code, __node_type* __n);
+   _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.


Also I didn't really like the name NodeAccessor. It's not an accessor,
because it performs ownership transfer. Invoking __node_accessor()
returns a __node_type* by releasing it from the previous owner (by
setting the owner's pointer member to null).

Passing a const reference to something called NodeAccessor does not
make it clear that it performs a mutating operation like that! If the
_M_insert_unique_node and _M_insert_multi_node functions did the
__node_accessor() call *before* rehashing, and rehashing threw an
exception, then they would leak. So it's important that the
__node_acessor() call happens at the right time, and so it's important
to name it well.

In my suggested patch the naming isn't misleading, because we just
pass a raw __node_type* and have a new comment saying:

// Takes ownership of __n if insertion succeeds, throws otherwise.

The function doesn't have a callable with non-local effects that
modifies an object outside the function. Because the caller sets the
previous owner's pointer to null there's no danger of it happening at
the wrong time; it can only happen after the function has returned and
ownership transfer has completed.


As a further evolution that simplifies some uses of _Scoped_node we
could give it a constructor that allocates a node and constructs an
element, as in the attached patch.


commit cba81aa3c4b41bef140de30e9651c5134ee2f5ef
Author: Jonathan Wakely 
Date:   Wed Jun 5 20:12:01 2019 +0100

with _Scoped_node constructor that creates the node

diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 87a15c8d037..ab579a7059e 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -259,9 +259,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Simple RAII type for managing a node containing an element
   struct _Scoped_node
   {
-	_Scoped_node(__hashtable_alloc* __h, __node_type* __n)
+	// Take ownership of a node with a constructed element.
+	_Scoped_node(__node_type* __n, __hashtable_alloc* __h)
 	: _M_h(__h), _M_node(__n) { }
 
+	// Allocate a node and construct an element within it.
+	template
+	  _Scoped_node(__hashtable_alloc* __h, _Args&&... __args)
+	  : _M_h(__h),
+	_M_node(__h->_M_allocate_node(std::forward<_Args>(__args)...))
+	  { }
+
+	// Destroy element and deallocate node.
 	~_Scoped_node() { if (_M_node) _M_h->_M_deallocate_node(_M_node); };
 
 	_Scoped_node(const _Scoped_node&) = delete;
@@ -1653,9 +1662,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   -> pair
   {
 	// First build the node to get access to the hash code
-	_Scoped_node __node {
-	this, this->_M_allocate_node(std::forward<_Args>(__args)...)
-	};
+	_Scoped_node __node { this, std::forward<_Args>(__args)...  };
 	const key_type& __k = this->_M_extract()(__node._M_node->_M_v());
 	__hash_code __code = this->_M_hash_code(__k);
 	size_type __bkt = _M_bucket_index(__k, __code);
@@ -1681,9 +1688,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   -> iterator
   {
 	// First build the node to get its 

[Bug c++/90449] No way to turn off warning about inaccessible base

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90449

--- Comment #4 from Marek Polacek  ---
(In reply to Marek Polacek from comment #3)
> The testcase in the godbolt link even ICEs with current trunk.  I'll open a
> separate PR.

PR90764

[Bug c++/90764] [10 Regression] internal compiler error in build_deduction_guide, at cp/pt.c:27162

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90764

--- Comment #1 from Marek Polacek  ---
Started with r270765.

[Bug c++/90764] [10 Regression] internal compiler error in build_deduction_guide, at cp/pt.c:27162

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90764

Marek Polacek  changed:

   What|Removed |Added

   Keywords||ice-on-invalid-code
  Known to work||9.0
   Target Milestone|--- |10.0
  Known to fail||10.0

[Bug c++/90764] New: [10 Regression] internal compiler error in build_deduction_guide, at cp/pt.c:27162

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90764

Bug ID: 90764
   Summary: [10 Regression] internal compiler error in
build_deduction_guide, at cp/pt.c:27162
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mpolacek at gcc dot gnu.org
  Target Milestone: ---

namespace a {
  struct b;
  template  using c = b;
}
template  struct e : a::c { using a::c<>::c; };
template  typename f> void g() { f(); }
void h() { g(); }

$ ./cc1plus -quiet ice.cc -std=c++17
ice.cc: In instantiation of ‘void g() [with f = e]’:
ice.cc:7:17:   required from here
ice.cc:6:54: internal compiler error: tree check: expected var_decl or
field_decl or function_decl or type_decl or template_decl, have using_decl in
build_deduction_guide, at cp/pt.c:27162
6 | template  typename f> void g() { f(); }
  |  ^~~
0x17ef2dc tree_check_failed(tree_node const*, char const*, int, char const*,
...)
/home/mpolacek/src/gcc/gcc/tree.c:9894
0x871bdc tree_check5(tree_node*, char const*, int, char const*, tree_code,
tree_code, tree_code, tree_code, tree_code)
/home/mpolacek/src/gcc/gcc/tree.h:3285
0xaf6037 build_deduction_guide
/home/mpolacek/src/gcc/gcc/cp/pt.c:27162
0xaf822e do_class_deduction
/home/mpolacek/src/gcc/gcc/cp/pt.c:27394
0xaf8a6e do_auto_deduction(tree_node*, tree_node*, tree_node*, int,
auto_deduction_context, tree_node*, int)
/home/mpolacek/src/gcc/gcc/cp/pt.c:27528
0xba614f build_functional_cast(tree_node*, tree_node*, int)
/home/mpolacek/src/gcc/gcc/cp/typeck2.c:2213
0xacfc76 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool,
bool)
/home/mpolacek/src/gcc/gcc/cp/pt.c:18432
0xacd9f3 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
/home/mpolacek/src/gcc/gcc/cp/pt.c:17923
0xac6683 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
/home/mpolacek/src/gcc/gcc/cp/pt.c:17050
0xac8e7f tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
/home/mpolacek/src/gcc/gcc/cp/pt.c:17330
0xaec1fd instantiate_decl(tree_node*, bool, bool)
/home/mpolacek/src/gcc/gcc/cp/pt.c:24769
0xaecbd6 instantiate_pending_templates(int)
/home/mpolacek/src/gcc/gcc/cp/pt.c:24885
0x966d02 c_parse_final_cleanups()
/home/mpolacek/src/gcc/gcc/cp/decl2.c:4821
0xc28828 c_common_parse_file()
/home/mpolacek/src/gcc/gcc/c-family/c-opts.c:1178
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

Re: PR C++/63149

2019-06-05 Thread Nina Dinka Ranns
On Wed, 5 Jun 2019 at 19:19, Jason Merrill  wrote:

> On 6/5/19 1:29 PM, Nina Dinka Ranns wrote:
> > Ack. Amended change log is below. Changes are :
> > * changed C++ -> c++
> > * fixed the name of added test
> >
> > There are no changes in the diff, but I attached it to this e-mail for
> > reference.
>
> Applied, thanks!
>
> For future reference it's also customary to add a bit of discussion of
> the rationale for the patch.  Also, please include the word "PATCH" on
> the subject line.


Noted.
Thank you,
Nina


>
> Jason
>


[Bug target/88483] Unnecessary stack alignment

2019-06-05 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88483

H.J. Lu  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from H.J. Lu  ---
Fixed for GCC 10.

Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime

2019-06-05 Thread Janne Blomqvist
On Wed, Jun 5, 2019 at 9:25 PM 김규래  wrote:
>
> Hi, thanks for the detailed explanation.
> I think I now get the picture.
> Judging from my current understanding, the task-parallelism currently works 
> as follows:
> 1. Tasks are placed in a global shared queue.
> 2. Workers consume the tasks by bashing the queue in a while loop, just as 
> self-scheduling (dynamic scheduling)/
>
> Then the improvements including work-stealing must be done by:
> 1. Each worker holds a dedicated task queue reducing the resource contention.
> 2. The tasks are distributed in a round-robin fashion

For nested task submission (does OpenMP support that?) you probably
want to submit to the local queue rather than round-robin, no?

> 3. work-stealing will resolve the load imbalance.
>
> If the above statements are correct, I guess the task priority should be 
> given some special treatment?
>
> Ray Kim
>
> -Original Message-
> From: "Jakub Jelinek"
> To: "김규래";
> Cc: ;
> Sent: 2019-06-04 (화) 03:21:01 (GMT+09:00)
> Subject: Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
>
> On Tue, Jun 04, 2019 at 03:01:13AM +0900, 김규래 wrote:
> > Hi,
> > I've been studying the libgomp task parallelism system.
> > I have a few questions.
> > First, Tracing the events shows that only the main thread calls GOMP_task.
>
> No, any thread can call GOMP_task, in particular the thread that encountered
> the #pragma omp task construct.
>
> The GOMP_task function then decides based on the clauses of the construct
> (passed in various ways through the arguments of that function) whether it
> will be included (executed by the encountering thread), or queued for
> later execution.  In the latter case, it will be scheduled during a barrier
> (implicit or explicit), see gomp_barrier_handle_tasks called from the
> bar.[ch] code, or at other spots, e.g. during taskwait construct
> (GOMP_taskwait) or at the end of taskgroup (GOMP_taskgroup_end).
>
> > How do the other worker threads enter the libgomp runtime?
>
> If you never encounter a parallel, teams or target construct, then there is
> just one thread that does everything (well, the library is written such that
> if you by hand pthread_create, each such thread acts as a separate initial
> thread from OpenMP POV).
> Threads are created e.g. during parallel construct (GOMP_parallel), where
> for non-nested parallelism as the standard requires it reuses existing
> threads if possible or spawns new ones, see mainly team.c (gomp_team_start)
> for the function that spawns new threads or awakes the ones waiting for
> work, or gomp_thread_start in the same file for the function actually run by
> the libgomp library created threads.
>
> > I can't find the entry point of the worker threads from the event tracing 
> > and the assembly dump.
> > Second, How is the task priority set?
>
> By the user, through priority clause, passed to GOMP_task and then taken
> into account when handling tasks in the various queues.
>
> Jakub



-- 
Janne Blomqvist


[Bug c++/90449] No way to turn off warning about inaccessible base

2019-06-05 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90449

--- Comment #3 from Marek Polacek  ---
The testcase in the godbolt link even ICEs with current trunk.  I'll open a
separate PR.

Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime

2019-06-05 Thread Jakub Jelinek
On Thu, Jun 06, 2019 at 03:25:24AM +0900, 김규래 wrote:
> Hi, thanks for the detailed explanation.
> I think I now get the picture.
> Judging from my current understanding, the task-parallelism currently works 
> as follows: 
> 1. Tasks are placed in a global shared queue.

It isn't a global shared queue, but a per-team shared queue, in fact 3
different ones, guarded by the same per-team mutex team->task_lock though:
team->task_queueused on barriers
task->children_queueused for #pragma omp taskwait
taskgroup->taskgroup_queue  used at the end of #pragma omp taskgroup

> 2. Workers consume the tasks by bashing the queue in a while loop, just as 
> self-scheduling (dynamic scheduling)/
>  
> Then the improvements including work-stealing must be done by:
> 1. Each worker holds a dedicated task queue reducing the resource contention.
> 2. The tasks are distributed in a round-robin fashion
> 3. work-stealing will resolve the load imbalance.
>  
> If the above statements are correct, I guess the task priority should be 
> given some special treatment?

Yes, one thing to consider is task priority, another thing is what to do
on those #pragma omp taskwait or #pragma omp taskgroup waits where while one
can schedule most of the tasks (the OpenMP specification has restrictions on
what can be scheduled, but the restrictions are mostly relevant to untied
tasks which we don't support anyway (well, ignore untied and schedule them
like tied tasks)), but it might be more beneficial (and what is currently
implemented) to queue only the tasks that will help satisfying what we are
waiting on.  All of priority and only scheduling task or taskgroup children
instead of all tasks are just hints, the implementation may choose other
tasks.
There is another thing though, tasks with dependencies, where
gomp_task_handle_depend which needs to look up addresses in hash tables
for the dependencies also uses the team->task_lock mutex.

Jakub


Re: [wwwdocs] Document existence of openacc-gcc-9-branch

2019-06-05 Thread Julian Brown
On Wed, 5 Jun 2019 10:30:41 +0200
Thomas Schwinge  wrote:

> Hi Julian!
> 
> On Tue, 4 Jun 2019 23:05:53 +0100, Julian Brown
>  wrote:
> > I've pushed a new branch "openacc-gcc-9-branch" to the Git
> > mirror (i.e. as a Git-only branch), for development of OpenACC and
> > related functionality on top of the GCC 9 branch. It's currently
> > based off the gcc-9_1_0-release tag, and contains a number of
> > patches mainly merged from either the openacc-gcc-8-branch, or from
> > further-developed versions of those patches that have been
> > submitted for upstream review.
> > 
> > This patch updates the svn.html page to point to the new branch
> > rather than the old openacc-gcc-8-branch, which is retired now.
> > 
> > OK to commit?  
> 
> As obvious, but please also add an "openacc-gcc-8-branch" stanza next
> to "openacc-gcc-7-branch" in the "Merged Development Branches"
> section, and update the "gomp-4_0-branch" and "openacc-gcc-7-branch"
> stanzas accordingly.
> 
> Well, actually please move "gomp-4_0-branch", "openacc-gcc-7-branch",
> and "openacc-gcc-8-branch" into the "Inactive Development Branches"
> section, for all "These branches are inactive and contain work that
> might not been merged": they all contain some changes that have not
> been forward-ported to their later instances.

I've committed this version.

Thanks,

Julian
Index: htdocs/svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.225
diff -u -p -r1.225 svn.html
--- htdocs/svn.html	30 Sep 2018 14:38:47 -	1.225
+++ htdocs/svn.html	5 Jun 2019 18:39:19 -
@@ -291,19 +291,19 @@ the command svn log --stop-on-copy
   Patches should be marked with the tag [no-undefined-overflow]
   in the subject line.  The branch is maintained by Richard Biener.
 
-  https://gcc.gnu.org/wiki/OpenACC;>openacc-gcc-8-branch
+  https://gcc.gnu.org/wiki/OpenACC;>openacc-gcc-9-branch
   This https://gcc.gnu.org/wiki/GitMirror;>Git-only branch is
   used for collaborative development
   of https://gcc.gnu.org/wiki/OpenACC;>OpenACC support and related
   functionality, such
   as https://gcc.gnu.org/wiki/Offloading;>offloading support.  The
-  branch is based on gcc-8-branch.  Find it
+  branch is based on gcc-9-branch.  Find it
   at git://gcc.gnu.org/git/gcc.git,
-  https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-8-branch;>https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-8-branch,
+  https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-9-branch;>https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-9-branch,
   or
-  https://github.com/gcc-mirror/gcc/tree/openacc-gcc-8-branch;>https://github.com/gcc-mirror/gcc/tree/openacc-gcc-8-branch.
-  Please send email with a short-hand [og8] tag in the subject
-  line, and use ChangeLog.openacc files.
+  https://github.com/gcc-mirror/gcc/tree/openacc-gcc-9-branch;>https://github.com/gcc-mirror/gcc/tree/openacc-gcc-9-branch.
+  Please send patch emails with a short-hand [og9] tag in the
+  subject line, and use ChangeLog.openacc files.
 
   https://gcc.gnu.org/wiki/plugins;>plugins
   This branch adds plugin functionality to GCC.  See the 
   https://gcc.gnu.org/wiki/tuples/;>gimple-tuples-branch
   gomp-20050608-branch
   gomp-3_0-branch
-  gomp-4_0-branch
-  This branch was used to update
-  the https://gcc.gnu.org/wiki/openmp;>OpenMP support to version
-  4.0, including development
-  of https://gcc.gnu.org/wiki/Offloading;>offloading support in
-  GCC as well as support
-  for https://gcc.gnu.org/wiki/OpenACC;>OpenACC.  These features
-  got merged into trunk.  Based on gcc-6-branch then, this branch was used for
-  on-going development of OpenACC support and related functionality, which then
-  moved to openacc-gcc-7-branch, and now openacc-gcc-8-branch.
 
   java-gui-20050128-branch
   This was a temporary branch for development of java GUI libraries
@@ -820,11 +810,6 @@ inactive.
   ea...@eagercon.com.
   All changes have been merged into mainline.
 
-  openacc-gcc-7-branch
-  Based on gcc-7-branch, this branch was used for on-going development
-  of https://gcc.gnu.org/wiki/OpenACC;>OpenACC support and related
-  functionality, which now moved openacc-gcc-8-branch.
-
   pch-branch
   tree-ssa-20020619-branch
   https://gcc.gnu.org/wiki/Var_Tracking_Assignments;>var-tracking-assignments*-branch
@@ -978,6 +963,25 @@ merged.
   OpenMP support in GCC.  They were never properly maintained and
   have now been superseded by gomp-20050608-branch.
 
+  gomp-4_0-branch
+  This branch was based on gcc-6-branch, and was used to update
+  the https://gcc.gnu.org/wiki/openmp;>OpenMP support to version
+  4.0, including development
+  of https://gcc.gnu.org/wiki/Offloading;>offloading support in
+  GCC as well as support
+  for https://gcc.gnu.org/wiki/OpenACC;>OpenACC.  These features
+  got merged into trunk.  The branch was then used for on-going 

Re: [C++ Patch] Use declarator->id_loc in three additional places

2019-06-05 Thread Jason Merrill

On 6/5/19 2:09 PM, Paolo Carlini wrote:

Hi,

On 05/06/19 19:45, Jason Merrill wrote:

On 6/4/19 11:57 AM, Paolo Carlini wrote:

Hi,

On 04/06/19 16:50, Jason Merrill wrote:

On 6/4/19 10:31 AM, Paolo Carlini wrote:

+  permerror (loc, "member functions are implicitly "
+ "friends of their class");


Wouldn't it be better to use the location of "friend" in this 
diagnostic?


Yes, however doing that fully correctly seems a bit tricky


Why tricky?  Doesn't declspecs->locations[ds_friend] work?


To be honest, here I wasn't considering ds_friend at all. Indeed it 
gives us the location of 'friend', but, say, for a testcase like 
parse/friend4.C:


struct A
{
   friend void A::foo();  // { dg-error "implicitly friends" }
   friend A::~A();    // { dg-error "implicitly friends" }
};

I thought we wanted a precise caret under the 'f' of 'foo' and the '~' 
of the destructor - both clang and icc do that - I wasn't even 
considering pointing at 'friend'. If you think that would be an 
improvement wrt the current closed parenthesis - I agreee it would! - I 
can do that!


Please do.

Jason



Re: [PATCH] fix more -Wformat-diag issues

2019-06-05 Thread Martin Sebor

On 5/31/19 12:20 PM, Jeff Law wrote:

On 5/31/19 9:56 AM, Martin Sebor wrote:

On 5/30/19 5:49 PM, Jeff Law wrote:

So in several places there's a comment which indicates that debugging
dumps and the like do not follow conventions.  Presumably you've tried
to keep a narrow scope on the diagnostic push/pops.  I'm also concerned
that the comments you mention that we trigger an ICE.

So while I'll ack this patch, I would like to know more about the ICE
that's triggered in the checker and what the plans are for fixing it.


Sorry, I didn't word the comment (copied below) very clearly.
What I meant to say is that the calls to error() in these files
that don't follow the convention are ultimately followed by
an ICE triggered either by an assert (as in cfgloop.c) or a call
to internal_error (cgraph.h).  The diagnostics themselves don't
cause an ICE.

OK.  Thanks for the clarification.



In a comment on one of the i18n bugs raised for these strings
Richard suggests these error calls should probably replaced by
direct calls to the pretty printer.  That would let us avoid
suppressing the warnings and also presumably make it clear to
translators the format strings aren't meant to be translated.
It seemed like too big of a change for this patch so I simply
suppressed the warnings but I agree it's worth considering at
some point.

Agreed.



I'll adjust the comment before I check in the patch (I'm hoping
to commit it at the same time as the checker itself once it's
approved).

Your call on when to commit :-)


I just committed it in r271971 with a few minor tweaks.  As before
I expect some minor fallout in the test suite, and more fixes to
follow once the checker itself is approved and committed.

Martin


Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime

2019-06-05 Thread 김규래
Hi, thanks for the detailed explanation.
I think I now get the picture.
Judging from my current understanding, the task-parallelism currently works as 
follows: 
1. Tasks are placed in a global shared queue.
2. Workers consume the tasks by bashing the queue in a while loop, just as 
self-scheduling (dynamic scheduling)/
 
Then the improvements including work-stealing must be done by:
1. Each worker holds a dedicated task queue reducing the resource contention.
2. The tasks are distributed in a round-robin fashion
3. work-stealing will resolve the load imbalance.
 
If the above statements are correct, I guess the task priority should be given 
some special treatment?
 
Ray Kim
 
-Original Message-
From: "Jakub Jelinek"
To: "김규래";
Cc: ;
Sent: 2019-06-04 (화) 03:21:01 (GMT+09:00)
Subject: Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
 
On Tue, Jun 04, 2019 at 03:01:13AM +0900, 김규래 wrote:
> Hi,
> I've been studying the libgomp task parallelism system.
> I have a few questions.
> First, Tracing the events shows that only the main thread calls GOMP_task.

No, any thread can call GOMP_task, in particular the thread that encountered
the #pragma omp task construct.

The GOMP_task function then decides based on the clauses of the construct
(passed in various ways through the arguments of that function) whether it
will be included (executed by the encountering thread), or queued for
later execution.  In the latter case, it will be scheduled during a barrier
(implicit or explicit), see gomp_barrier_handle_tasks called from the
bar.[ch] code, or at other spots, e.g. during taskwait construct
(GOMP_taskwait) or at the end of taskgroup (GOMP_taskgroup_end).

> How do the other worker threads enter the libgomp runtime?

If you never encounter a parallel, teams or target construct, then there is
just one thread that does everything (well, the library is written such that
if you by hand pthread_create, each such thread acts as a separate initial
thread from OpenMP POV).
Threads are created e.g. during parallel construct (GOMP_parallel), where
for non-nested parallelism as the standard requires it reuses existing
threads if possible or spawns new ones, see mainly team.c (gomp_team_start)
for the function that spawns new threads or awakes the ones waiting for
work, or gomp_thread_start in the same file for the function actually run by
the libgomp library created threads.

> I can't find the entry point of the worker threads from the event tracing and 
> the assembly dump.
> Second, How is the task priority set?

By the user, through priority clause, passed to GOMP_task and then taken
into account when handling tasks in the various queues.

Jakub


[Bug c/90760] [8/9/10 Regression] ICE on attributes section and alias in set_section, at symtab.c:1573

2019-06-05 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90760

Martin Sebor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-06-05
 CC||msebor at gcc dot gnu.org
  Known to work||4.9.4
Summary|ICE in set_section, at  |[8/9/10 Regression] ICE on
   |symtab.c:1573   |attributes section and
   ||alias in set_section, at
   ||symtab.c:1573
 Ever confirmed|0   |1
  Known to fail||10.0, 5.1.0, 6.4.0, 7.3.0,
   ||8.2.0, 9.1.0

--- Comment #1 from Martin Sebor  ---
Confirmed.  Bisection points to r216361 committed into GCC 5.0.0.  GCC 4.9
rejects the code with an error.

r216361 | trippels | 2014-10-17 01:10:07 -0400 (Fri, 17 Oct 2014) | 34 lines

Fix pr61848, linux kernel miscompile

This patch cures the linux kernel boot failure when compiled using
trunk gcc.

At its heart, the problem is caused by merge_decls merging from the
old decl to the new decl, then copying back to the old decl and
discarding the new.  When Jan moved some fields to the symtab,
"copying back to the old decl" was lost for those fields.  Really,
it would be best if merge_decls was rewritten to merge everything to
the kept decl, but here I'm just doing that for fields accessed via
decl_with_vis.symtab_node.

Re: PR C++/63149

2019-06-05 Thread Jason Merrill

On 6/5/19 1:29 PM, Nina Dinka Ranns wrote:

Ack. Amended change log is below. Changes are :
* changed C++ -> c++
* fixed the name of added test

There are no changes in the diff, but I attached it to this e-mail for
reference.


Applied, thanks!

For future reference it's also customary to add a bit of discussion of 
the rationale for the patch.  Also, please include the word "PATCH" on 
the subject line.


Jason


Re: [PATCH] alpha: Use TARGET_COMPUTE_FRAME_LAYOUT

2019-06-05 Thread Uros Bizjak
> At the same time, merge several related frame computing functions.
> Recall that HWI is now always 64-bit, so merge IMASK and FMASK,
> which allows merging of several loops within prologue and epilogue.
>
> Full regression testing will take some time, but a quick browse
> suggests no change in generated code.
>
>
> r~
>
>
> * config/alpha/alpha.c (direct_return): Move down after
> struct machine_function definition; use saved frame_size;
> return bool.
> (struct machine_function): Add sa_mask, sa_size, frame_size.
> (alpha_sa_mask, alpha_sa_size, compute_frame_size): Merge into ...
> (alpha_compute_frame_layout): ... new function.
> (TARGET_COMPUTE_FRAME_LAYOUT): New.
> (alpha_initial_elimination_offset): Use saved sa_size.
> (alpha_vms_initial_elimination_offset): Likewise.
> (alpha_vms_can_eliminate): Remove alpha_sa_size call.
> (alpha_expand_prologue): Use saved frame data.  Merge integer
> and fp register save loops.
> (alpha_expand_epilogue): Likewise.
> (alpha_start_function): Use saved frame data.
> * config/alpha/alpha-protos.h (direct_return): Update.
> (alpha_sa_size): Remove.

I have updated, bootstrapped and regression tested this patch on
native alphaev68-linux-gnu and committed it to mainline SVN.

Uros.
diff --git a/gcc/config/alpha/alpha-protos.h b/gcc/config/alpha/alpha-protos.h
index 07c970760eb5..7d340927c58d 100644
--- a/gcc/config/alpha/alpha-protos.h
+++ b/gcc/config/alpha/alpha-protos.h
@@ -21,9 +21,8 @@ extern int alpha_next_sequence_number;
 
 extern void literal_section (void);
 extern int zap_mask (HOST_WIDE_INT);
-extern int direct_return (void);
+extern bool direct_return (void);
 
-extern int alpha_sa_size (void);
 extern HOST_WIDE_INT alpha_initial_elimination_offset (unsigned int,
   unsigned int);
 extern void alpha_expand_prologue (void);
diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 2e1de397e937..db17f7c06e25 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -731,19 +731,6 @@ alpha_vector_mode_supported_p (machine_mode mode)
   return mode == V8QImode || mode == V4HImode || mode == V2SImode;
 }
 
-/* Return 1 if this function can directly return via $26.  */
-
-int
-direct_return (void)
-{
-  return (TARGET_ABI_OSF
- && reload_completed
- && alpha_sa_size () == 0
- && get_frame_size () == 0
- && crtl->outgoing_args_size == 0
- && crtl->args.pretend_args_size == 0);
-}
-
 /* Return the TLS model to use for SYMBOL.  */
 
 static enum tls_model
@@ -4840,6 +4827,10 @@ struct GTY(()) alpha_links;
 
 struct GTY(()) machine_function
 {
+  unsigned HOST_WIDE_INT sa_mask;
+  HOST_WIDE_INT sa_size;
+  HOST_WIDE_INT frame_size;
+
   /* For flag_reorder_blocks_and_partition.  */
   rtx gp_save_rtx;
 
@@ -7271,83 +7262,59 @@ static int vms_save_fp_regno;
 /* Register number used to reference objects off our PV.  */
 static int vms_base_regno;
 
-/* Compute register masks for saved registers.  */
-
+/* Compute register masks for saved registers, register save area size,
+   and total frame size.  */
 static void
-alpha_sa_mask (unsigned long *imaskP, unsigned long *fmaskP)
+alpha_compute_frame_layout (void)
 {
-  unsigned long imask = 0;
-  unsigned long fmask = 0;
-  unsigned int i;
+  unsigned HOST_WIDE_INT sa_mask = 0;
+  HOST_WIDE_INT frame_size;
+  int sa_size;
 
   /* When outputting a thunk, we don't have valid register life info,
  but assemble_start_function wants to output .frame and .mask
  directives.  */
-  if (cfun->is_thunk)
+  if (!cfun->is_thunk)
 {
-  *imaskP = 0;
-  *fmaskP = 0;
-  return;
-}
-
-  if (TARGET_ABI_OPEN_VMS && alpha_procedure_type == PT_STACK)
-imask |= (1UL << HARD_FRAME_POINTER_REGNUM);
+  if (TARGET_ABI_OPEN_VMS && alpha_procedure_type == PT_STACK)
+   sa_mask |= HOST_WIDE_INT_1U << HARD_FRAME_POINTER_REGNUM;
 
-  /* One for every register we have to save.  */
-  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-if (! fixed_regs[i] && ! call_used_regs[i]
-   && df_regs_ever_live_p (i) && i != REG_RA)
-  {
-   if (i < 32)
- imask |= (1UL << i);
-   else
- fmask |= (1UL << (i - 32));
-  }
+  /* One for every register we have to save.  */
+  for (unsigned i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+   if (! fixed_regs[i] && ! call_used_regs[i]
+   && df_regs_ever_live_p (i) && i != REG_RA)
+ sa_mask |= HOST_WIDE_INT_1U << i;
 
-  /* We need to restore these for the handler.  */
-  if (crtl->calls_eh_return)
-{
-  for (i = 0; ; ++i)
+  /* We need to restore these for the handler.  */
+  if (crtl->calls_eh_return)
{
- unsigned regno = EH_RETURN_DATA_REGNO (i);
- if (regno == INVALID_REGNUM)
-   break;
- imask |= 1UL << regno;
+ for (unsigned i = 0; ; ++i)
+   {
+ unsigned regno = EH_RETURN_DATA_REGNO (i);
+ if (regno == 

[Bug c++/63149] wrong auto deduction from braced-init-list

2019-06-05 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63149

--- Comment #7 from Jason Merrill  ---
Author: jason
Date: Wed Jun  5 18:12:53 2019
New Revision: 271969

URL: https://gcc.gnu.org/viewcvs?rev=271969=gcc=rev
Log:
ChangeLog for PR c++/63149

Modified:
trunk/gcc/cp/ChangeLog

[Bug c++/63149] wrong auto deduction from braced-init-list

2019-06-05 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63149

--- Comment #6 from Jason Merrill  ---
Author: jason
Date: Wed Jun  5 18:11:20 2019
New Revision: 271968

URL: https://gcc.gnu.org/viewcvs?rev=271968=gcc=rev
Log:
PR c++/63149 - wrong auto deduction from braced-init-list

2019-06-04  Nina Dinka Ranns  

gcc/cp/
* pt.c (listify_autos): Use non cv qualified auto_node in
std::initializer_list.

testsuite/
* g++.dg/cpp0x/initlist-deduce2.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C
Modified:
trunk/gcc/cp/pt.c

Re: [C++ Patch] Use declarator->id_loc in three additional places

2019-06-05 Thread Paolo Carlini

Hi,

On 05/06/19 19:45, Jason Merrill wrote:

On 6/4/19 11:57 AM, Paolo Carlini wrote:

Hi,

On 04/06/19 16:50, Jason Merrill wrote:

On 6/4/19 10:31 AM, Paolo Carlini wrote:

+  permerror (loc, "member functions are implicitly "
+ "friends of their class");


Wouldn't it be better to use the location of "friend" in this 
diagnostic?


Yes, however doing that fully correctly seems a bit tricky


Why tricky?  Doesn't declspecs->locations[ds_friend] work?


To be honest, here I wasn't considering ds_friend at all. Indeed it 
gives us the location of 'friend', but, say, for a testcase like 
parse/friend4.C:


struct A
{
  friend void A::foo();  // { dg-error "implicitly friends" }
  friend A::~A();    // { dg-error "implicitly friends" }
};

I thought we wanted a precise caret under the 'f' of 'foo' and the '~' 
of the destructor - both clang and icc do that - I wasn't even 
considering pointing at 'friend'. If you think that would be an 
improvement wrt the current closed parenthesis - I agreee it would! - I 
can do that!


Paolo.




[Bug rtl-optimization/22568] Should use cmov in some stituations

2019-06-05 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=22568

--- Comment #15 from Andrew Pinski  ---
(In reply to Eric Gallager from comment #14)
> (In reply to Andrew Pinski from comment #12)
> > I have a patch to tree-ssa-phiopt.c to fix comment #1 though it needs
> > another patch to expr.c to produce the cmov directly from COND_EXPR.  I hope
> > to post both patches for 4.8.0.
> 
> Did you ever post them?

The expr.c patch yes.  The tree-ssa-phiopt.c patch, no.  Tree-ssa-phiopt.c
needs more code rework because of the new infrastructures so I have not gotten
around to improving the patch yet; been busy with other stuff.

Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-06-05 Thread Jason Merrill

On 6/3/19 6:33 PM, Joseph Myers wrote:

On Sun, 2 Jun 2019, Segher Boessenkool wrote:


Git has an identity (well, two) _per commit_, and there is no way you can
reconstruct people's prefered name and email address (at any point in time,
for every commit separately) correctly.  IMO it is much better to not even
try.  We already *have* enough info for anyone to trivially look up who wrote
what, and what might be that person's email address at the time.  But
pretending that is more than a guess is just wrong.


I think not doing a best-effort identification (name+email) is just as


And I think guessing is not a "best effort", but just wrong.


It's 100% accurate about the identity of the person who was the committer
(modulo the one username from the gcc2 period where it was clear who the
author of the commits by that username was, and so that went in the author
map, but not clear that was the same as the committer, who did not commit
patches for any other author).  So it's as accurate as any case where
someone committing natively in git for someone else failed to use --author
(and if the CVS/SVN commit included a ChangeLog entry, we have credit
given from there via the "changelogs" feature).

I think failing to credit (by name and email address) the person implied
by the commit metadata, in the absence of positive evidence (such as a
ChangeLog entry) for the change being authored by someone else, is just
wrong, in the same way it's wrong not to use --author when committing for
someone else in git.


It's wrong, but it's not importantly wrong.  If we're doing a 
reposurgeon conversion, this adjustment makes sense.  If we're starting 
from the git-svn mirror, it doesn't justify breaking everyone's copies 
by rewriting branches.  And the bird in the hand looks more and more 
appealing as time goes by.



Where a person used different names over time, there's no generally
applicable rule for whether they'd prefer the latest version or the
version used at the time to be used in reference to past commits, and I
think using the most current version known is most appropriate, in the
absence of a ChangeLog entry added in the commit, unless they've specified
a preference for some other rule for which commits get what name.
Likewise for email addresses.


For email addresses, I think that using @gcc.gnu.org would be the best 
approach for people that have such accounts, rather than an employer 
address from an arbitrary point in time.


Jason


Re: [C++ Patch] Improve check_special_function_return_type locations

2019-06-05 Thread Jason Merrill

On 6/5/19 11:06 AM, Paolo Carlini wrote:

Hi,

here certainly we can do better than using input_location. In principle 
we could also pass the location of the entity (constructor, destructor, 
etc) itself or something else but I think it makes a lot of sense to 
simply include locations[ds_type_spec] in the computation, seems 
consistent with the existing case of spurious qualifiers (ICC does 
something similar AFAICS). Tested x86_64-linux.


OK.

Jason



Re: [C++ Patch] Use declarator->id_loc in three additional places

2019-06-05 Thread Jason Merrill

On 6/4/19 11:57 AM, Paolo Carlini wrote:

Hi,

On 04/06/19 16:50, Jason Merrill wrote:

On 6/4/19 10:31 AM, Paolo Carlini wrote:

+  permerror (loc, "member functions are implicitly "
+ "friends of their class");


Wouldn't it be better to use the location of "friend" in this diagnostic?


Yes, however doing that fully correctly seems a bit tricky


Why tricky?  Doesn't declspecs->locations[ds_friend] work?

Jason



[Bug tree-optimization/90758] [7 Regression] spurious -Warray-bounds with -O3

2019-06-05 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90758

Martin Sebor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
  Known to work||10.0, 8.2.0, 9.1.0
   Keywords||diagnostic
   Last reconfirmed||2019-06-05
  Component|c   |tree-optimization
 CC||msebor at gcc dot gnu.org
 Ever confirmed|0   |1
Summary|out of bounds warning with  |[7 Regression] spurious
   |-O3 |-Warray-bounds with -O3
  Known to fail||4.8.1, 4.9.4, 5.5.0, 6.4.0,
   ||7.3.0

--- Comment #1 from Martin Sebor  ---
I can confirm the warning.  I didn't spend enough time on it to say
definitively that it's a false positive so I'm going to take your word for it.

The warning was introduced in 4.8 via r192538 and fixed in 8.1 via r257184. 
The second commit is for PR 81360 which was an ICE, so that it suppressed the
warning could be entirely incidental.

Re: PR C++/63149

2019-06-05 Thread Nina Dinka Ranns
Ack. Amended change log is below. Changes are :
* changed C++ -> c++
* fixed the name of added test

There are no changes in the diff, but I attached it to this e-mail for
reference.

Thanks,
Nina

2019-06-04  Nina Dinka Ranns  
gcc/cp

PR c++/63149
* pt.c (listify_autos): Use non cv qualified auto_node in
std::initializer_list.

testsuite/

PR c++/63149
* g++.dg/cpp0x/initlist-deduce2.C: New test.





On Wed, 5 Jun 2019 at 13:59, Marek Polacek  wrote:
>
> On Wed, Jun 05, 2019 at 02:50:54PM +0200, Jakub Jelinek wrote:
> > On Wed, Jun 05, 2019 at 08:39:56AM -0400, Marek Polacek wrote:
> > > On Wed, Jun 05, 2019 at 10:34:05AM +0100, Nina Dinka Ranns wrote:
> > > > > PR C++/63149
> > > > > * pt.c (listify_autos): Use non cv qualified auto_node in
> > > > > std::initializer_list.
> > > > >
> > > > > testsuite/
> > > > >
> > > > > PR C++/63149
> > >
> > > "c++" instead of "C++", thought I don't think anyone would mind.
> >
> > I would, I have scripts that grab the PR strings from ChangeLog entries
> > and need to fix stuff by hand if it is incorrect like this (or if people
> > forget to use the component/ part altogether).
>
> Fair enough.  Nina, please adjust that too, then.
>
> Marek
Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c	(revision 271709)
+++ gcc/cp/pt.c	(working copy)
@@ -26836,7 +26836,7 @@
 static tree
 listify_autos (tree type, tree auto_node)
 {
-  tree init_auto = listify (auto_node);
+  tree init_auto = listify (strip_top_quals (auto_node));
   tree argvec = make_tree_vec (1);
   TREE_VEC_ELT (argvec, 0) = init_auto;
   if (processing_template_decl)
Index: gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C
===
--- gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C	(nonexistent)
+++ gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C	(working copy)
@@ -0,0 +1,8 @@
+// Test for PR63149
+// { dg-do compile { target c++11 } }
+
+#include 
+
+const auto r = { 1, 2, 3 };
+using X = decltype(r);
+using X = const std::initializer_list;


[Bug target/90763] New: vec_xl_len should take constnan

2019-06-05 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90763

Bug ID: 90763
   Summary: vec_xl_len should take constnan
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

This should compile (that const is preventing it):

#include 
vector char vec_load_const(const unsigned char *s, int num) {
return vec_xl_len(s, num);
}

https://ppc.godbolt.org/z/KkV5JN

Re: Preventing ISO C errors when using macros for builtin types

2019-06-05 Thread Segher Boessenkool
On Wed, Jun 05, 2019 at 02:25:59PM +0100, Jozef Lawrynowicz wrote:
> I'm assuming it would not be valid to modify the behaviour of __extension__
> so it can be placed within a declaration, and not just at the
> beginning. However, there is minimal documentation on this keyword (it does 
> not
> state that it can be used in declarations, even though it can), so I wonder
> what the "rules" are.

The documentation says

  '-pedantic' and other options cause warnings for many GNU C extensions.
  You can prevent such warnings within one expression by writing
  '__extension__' before the expression.  '__extension__' has no effect
  aside from this.

It's not clear to me why you cannot simply put __extension__ earlier in
your case?


Segher


Re: Review Hashtable extract node API

2019-06-05 Thread Jonathan Wakely

On 05/06/19 17:22 +0100, Jonathan Wakely wrote:

On 04/06/19 19:19 +0200, François Dumont wrote:

@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __node_base*
 _M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no rehash (assumes
-  // no element with its key already present). Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket __bkt if no
+  // rehash (assumes no element with its key already present).
+  template
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-   __node_type* __n, size_type __n_elt = 1);
+   _M_insert_unique_node(const key_type& __k, size_type __bkt,
+ __hash_code __code, const _NodeAccessor&,
+ size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template
iterator
-  _M_insert_multi_node(__node_type* __hint,
-  __hash_code __code, __node_type* __n);
+   _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.


Also I didn't really like the name NodeAccessor. It's not an accessor,
because it performs ownership transfer. Invoking __node_accessor()
returns a __node_type* by releasing it from the previous owner (by
setting the owner's pointer member to null).

Passing a const reference to something called NodeAccessor does not
make it clear that it performs a mutating operation like that! If the
_M_insert_unique_node and _M_insert_multi_node functions did the
__node_accessor() call *before* rehashing, and rehashing threw an
exception, then they would leak. So it's important that the
__node_acessor() call happens at the right time, and so it's important
to name it well.

In my suggested patch the naming isn't misleading, because we just
pass a raw __node_type* and have a new comment saying:

 // Takes ownership of __n if insertion succeeds, throws otherwise.

The function doesn't have a callable with non-local effects that
modifies an object outside the function. Because the caller sets the
previous owner's pointer to null there's no danger of it happening at
the wrong time; it can only happen after the function has returned and
ownership transfer has completed.



[Bug d/90762] New: ICE in resolvePropertiesX, at d/dmd/expression.c:251

2019-06-05 Thread gs...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90762

Bug ID: 90762
   Summary: ICE in resolvePropertiesX, at d/dmd/expression.c:251
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: gs...@t-online.de
  Target Milestone: ---

$ cat z1.d
void f ()
{ __traits(compiles, { import m:a; a.init; }); }


$ gdc-10-20190602 -c z1.d
d21: internal compiler error: Segmentation fault
0xb7265f crash_signal
../../gcc/toplev.c:326
0x6a0087 resolvePropertiesX(Scope*, Expression*, Expression*)
../../gcc/d/dmd/expression.c:251
0x6a0aef resolveProperties(Scope*, Expression*)
../../gcc/d/dmd/expression.c:496
0x725ac9 StatementSemanticVisitor::visit(ExpStatement*)
../../gcc/d/dmd/statementsem.c:96
0x7299fa semantic(Statement*, Scope*)
../../gcc/d/dmd/statementsem.c:3498
0x7299fa StatementSemanticVisitor::visit(CompoundStatement*)
../../gcc/d/dmd/statementsem.c:139
0x7255f5 semantic(Statement*, Scope*)
../../gcc/d/dmd/statementsem.c:3498
0x6ca69f FuncDeclaration::semantic3(Scope*)
../../gcc/d/dmd/func.c:1696
0x6b64ca ExpressionSemanticVisitor::visit(FuncExp*)
../../gcc/d/dmd/expressionsem.c:1564
0x6a95a5 semantic(Expression*, Scope*)
../../gcc/d/dmd/expressionsem.c:8158
0x73725c semanticTraits(TraitsExp*, Scope*)
../../gcc/d/dmd/traits.c:1316
0x6ac2ef ExpressionSemanticVisitor::visit(TraitsExp*)
../../gcc/d/dmd/expressionsem.c:1838
0x6a95a5 semantic(Expression*, Scope*)
../../gcc/d/dmd/expressionsem.c:8158
0x725ab9 StatementSemanticVisitor::visit(ExpStatement*)
../../gcc/d/dmd/statementsem.c:95
0x7299fa semantic(Statement*, Scope*)
../../gcc/d/dmd/statementsem.c:3498
0x7299fa StatementSemanticVisitor::visit(CompoundStatement*)
../../gcc/d/dmd/statementsem.c:139
0x7255f5 semantic(Statement*, Scope*)
../../gcc/d/dmd/statementsem.c:3498
0x6ca69f FuncDeclaration::semantic3(Scope*)
../../gcc/d/dmd/func.c:1696
0x66d3ff Module::semantic3(Scope*)
../../gcc/d/dmd/dmodule.c:814
0x767785 d_parse_file()
../../gcc/d/d-lang.cc:1201

sha512.sum for gcc-7.4.0.tar.gz incorrect on at least three mirrors

2019-06-05 Thread Farid Parpia


Greetings,

It appears that the given sha512 sum for gcc-7.4.0.tar.gz may be incorrect
on at least the following mirrors:
   http://mirrors.concertpass.com/
   http://www.netgull.com/
   https://bigsearcher.com/

Regards,

Farid Parpia  IBM Corporation: 710-2-RF28, 2455 South Road,
Poughkeepsie, NY 12601, USA; Telephone: (845) 433-8420 = Tie Line 293-8420


[Bug d/90761] New: ICE in visit, at d/dmd/dcast.c:883

2019-06-05 Thread gs...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90761

Bug ID: 90761
   Summary: ICE in visit, at d/dmd/dcast.c:883
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: gs...@t-online.de
  Target Milestone: ---

$ cat z1.d
immutable b = new a;
class a { int c =  }


$ gdc-10-20190602 -c z1.d
d21: internal compiler error: Segmentation fault
0xb7265f crash_signal
../../gcc/toplev.c:326
0x63eb6c visit
../../gcc/d/dmd/dcast.c:883
0x63f705 implicitConvTo(Expression*, Type*)
../../gcc/d/dmd/dcast.c:1308
0x63f705 Expression::implicitConvTo(Type*)
../../gcc/d/dmd/expression.h:169
0x63f705 implicitMod
../../gcc/d/dmd/dcast.c:245
0x63f705 convertible
../../gcc/d/dmd/dcast.c:1215
0x63fcb5 visit
../../gcc/d/dmd/dcast.c:1232
0x641074 implicitConvTo(Expression*, Type*)
../../gcc/d/dmd/dcast.c:1308
0x6d9c2d Expression::implicitConvTo(Type*)
../../gcc/d/dmd/expression.h:169
0x6d9c2d InitializerSemanticVisitor::visit(ExpInitializer*)
../../gcc/d/dmd/initsem.c:363
0x6d96eb semantic(Initializer*, Scope*, Type*, NeedInterpret)
../../gcc/d/dmd/initsem.c:520
0x64d233 VarDeclaration::semantic2(Scope*)
../../gcc/d/dmd/declaration.c:1619
0x66c3ff Module::semantic2(Scope*)
../../gcc/d/dmd/dmodule.c:782
0x76771d d_parse_file()
../../gcc/d/d-lang.cc:1185

[Bug c/90760] New: ICE in set_section, at symtab.c:1573

2019-06-05 Thread gs...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90760

Bug ID: 90760
   Summary: ICE in set_section, at symtab.c:1573
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Affects versions down to gcc-5 :


$ cat z1.c
void f () __attribute__ ((alias ("g")));
void f () __attribute__ ((section ("h")));


$ gcc-10-20190602 -c z1.c
z1.c:2:1: internal compiler error: in set_section, at symtab.c:1573
2 | void f () __attribute__ ((section ("h")));
  | ^~~~
0x6f38aa symtab_node::set_section(char const*)
../../gcc/symtab.c:1573
0x5cea11 merge_decls
../../gcc/c/c-decl.c:2845
0x5cea11 duplicate_decls
../../gcc/c/c-decl.c:2907
0x5d0d00 pushdecl(tree_node*)
../../gcc/c/c-decl.c:3093
0x5e0fcd start_decl(c_declarator*, c_declspecs*, bool, tree_node*)
../../gcc/c/c-decl.c:5037
0x623443 c_parser_declaration_or_fndef
../../gcc/c/c-parser.c:2154
0x6295b3 c_parser_external_declaration
../../gcc/c/c-parser.c:1653
0x62a079 c_parser_translation_unit
../../gcc/c/c-parser.c:1534
0x62a079 c_parse_file()
../../gcc/c/c-parser.c:19884
0x671b60 c_common_parse_file()
../../gcc/c-family/c-opts.c:1156

[PATCH] rs6000: Fix new testcase

2019-06-05 Thread Segher Boessenkool
At least with -m32 you need -maltivec if you #include .
Tested on powerpc64-linux {-m32,-m64); committing to trunk.


Segher


2019-06-05  Segher Boessenkool  

gcc/testsuite/
* g++.target/powerpc/undef-bool-3.C: Add -maltivec to dg-options.

---
 gcc/testsuite/g++.target/powerpc/undef-bool-3.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.target/powerpc/undef-bool-3.C 
b/gcc/testsuite/g++.target/powerpc/undef-bool-3.C
index 27f3da5..b4e18bd 100644
--- a/gcc/testsuite/g++.target/powerpc/undef-bool-3.C
+++ b/gcc/testsuite/g++.target/powerpc/undef-bool-3.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -std=c++11" } */
+/* { dg-options "-O2 -std=c++11 -maltivec" } */
 
 /* Test to ensure that "bool" is not #define'd in altivec.h for C++ when
we require strict ANSI.  We should compile without errors.  */
-- 
1.8.3.1



Re: Review Hashtable extract node API

2019-06-05 Thread Jonathan Wakely

On 04/06/19 19:19 +0200, François Dumont wrote:

Hi

    Here is a patch to enhance the _Hashtable extract node API and fix 
a FIXME request.


    The enhancement to the extract node Api is that extract(const 
key_type&) do not call extract(const_iterator) anymore. Doing so we 
had to loop again through bucket nodes to find the previous node to 
the one to extract. Even if a bucket shall not contain many nodes (in 
unique key mode) it's easy to avoid it.


Nice.

    To fix the FIXME I introduced a node smart pointer type managing 
the node lifetime. The node is extracted from this smart pointer only 
when there can't be any exception raised. In the context of the node 
extract api the node handle is considered as a smart pointer. So the 
node handle will remain owner of the node in case of exception when 
reinserting it, I hope it is the expected behavior.


Yes, that's right.

I was going to suggest just using the node handle type instead of
inventing a new smart pointer, but the handle type uses std::optional
so isn't available for C++11/14.



    * include/bits/hashtable_policy.h
    (struct _NodeSmartPointer<_NodeAlloc>): New.
    (_Map_base<>::operator[](const key_type&)): Use latter, adapt.
    (_Map_base<>::operator[](key_type&&)): Likewise.
    * include/bits/hashtable.h
    (_Hashtable<>::__node_sp_t): New.
    (_Hashtable<>::_M_insert_unique_node(size_type, __hash_code,
    __node_type*, size_type)): Replace by...
(_Hashtable<>::_M_insert_unique_node<_NodeAccessor>(const key_type&,
    size_type, __hash_code, const _NodeAccessor&, size_type)): ...that.
    (_Hashtable<>::_M_insert_multi_node(__node_type*, __hash_code,
    __node_type*)): Replace by...
(_Hashtable<>::_M_insert_multi_node<_NodeAccessor>(__node_type*,
    __hash_code, const _NodeAccessor&)): ...that.
    (_Hashtable<>::_M_reinsert_node): Adapt.
    (_Hashtable<>::_M_reinsert_node_multi): Adapt.
    (_Hashtable<>::_M_extract_node(size_t, __node_base*)): New.
    (_Hashtable<>::extract(const_iterator)): Use latter.
    (_Hashtable<>::extract(const _Key&)): Likewise.
    (_Hashtable<>::_M_merge_unique): Adapt.
    (_Hashtable<>::_M_emplace<_Args>(true_type, _Args&&...)): Adapt.
    (_Hashtable<>::_M_emplace<_Args>(const_iterator, false_type,
    _Args&&...)): Adapt.

Tested under Linux x86_64.

Ok to commit ?

François




diff --git a/libstdc++-v3/include/bits/hashtable.h 
b/libstdc++-v3/include/bits/hashtable.h
index e2e3f016a35..307865b96bf 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -197,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  using __hash_cached = typename __traits_type::__hash_cached;
  using __node_type = __detail::_Hash_node<_Value, __hash_cached::value>;
  using __node_alloc_type = __alloc_rebind<_Alloc, __node_type>;
+  using __node_sp_t = __detail::_NodeSmartPointer<__node_alloc_type>;

  using __hashtable_alloc = __detail::_Hashtable_alloc<__node_alloc_type>;

@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __node_base*
  _M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no rehash (assumes
-  // no element with its key already present). Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket __bkt if no
+  // rehash (assumes no element with its key already present).
+  template
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-   __node_type* __n, size_type __n_elt = 1);
+   _M_insert_unique_node(const key_type& __k, size_type __bkt,
+ __hash_code __code, const _NodeAccessor&,
+ size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template
iterator
-  _M_insert_multi_node(__node_type* __hint,
-  __hash_code __code, __node_type* __n);
+   _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.

The attached patch still just passes in a node pointer (which the
function takes ownership of, unless it throws). Because the semantics
of _M_insert_multi_node change, we need the symbol name to change as
well (so old code linking to 

[Bug c/65403] -Wno-error= is an error

2019-06-05 Thread alexhenrie24 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65403

--- Comment #13 from Alex Henrie  ---
I filled out all the forms and was approved to contribute code to GCC, but the
patches have still not been reviewed:
https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00971.html

[Bug fortran/90744] [7/8/9/10 Regression] Bogus length for character temporaries passed to external procedures since r268992

2019-06-05 Thread trnka at scm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90744

--- Comment #3 from Tomáš Trnka  ---
I think the issue stems from this code in gfc_conv_procedure_call():

  /* Deferred length dummies pass the character length by reference
 so that the value can be returned.  */
  if (parmse.string_length && fsym && fsym->ts.deferred)
{
  if (INDIRECT_REF_P (parmse.string_length))
/* In chains of functions/procedure calls the string_length already
   is a pointer to the variable holding the length.  Therefore
   remove the deref on call.  */
parmse.string_length = TREE_OPERAND (parmse.string_length, 0);
  else
{
  tmp = parmse.string_length;
  if (!VAR_P (tmp) && TREE_CODE (tmp) != COMPONENT_REF)
tmp = gfc_evaluate_now (parmse.string_length, >pre);
  parmse.string_length = gfc_build_addr_expr (NULL_TREE, tmp);
}
}

On the first call to an external procedure, fsym is NULL here, so the
string_length is passed as is. Then, after the argument processing loop
finishes, conv_function_val() is called, which in turn calls
get_formal_from_actual_arglist() and populates fsym. On subsequent calls to the
same external procedure, fsym is no longer NULL and in the case of this
character(:), allocatable temporary it also has ts.deferred==true. The snippet
above then runs and converts the string length to a reference (even though the
procedure doesn't expect one).

I'm not sure how to fix this properly, but the following one-liner seems to
work for me:

--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -3006,6 +3006,7 @@ get_formal_from_actual_arglist (gfc_symbol *sym,
gfc_actual_arglist *actual_args
{
  s->ts = a->expr->ts;
  s->attr.flavor = FL_VARIABLE;
+ s->ts.deferred = false;
  if (a->expr->rank > 0)
{
  s->attr.dimension = 1;

[C++ Patch] Improve check_special_function_return_type locations

2019-06-05 Thread Paolo Carlini

Hi,

here certainly we can do better than using input_location. In principle 
we could also pass the location of the entity (constructor, destructor, 
etc) itself or something else but I think it makes a lot of sense to 
simply include locations[ds_type_spec] in the computation, seems 
consistent with the existing case of spurious qualifiers (ICC does 
something similar AFAICS). Tested x86_64-linux.


Thanks, Paolo.

//

/cp
2019-06-05  Paolo Carlini  

* decl.c (smallest_type_location): Add.
(check_special_function_return_type): Use it.
(grokdeclarator): Lkewise.

/testsuite
2019-06-05  Paolo Carlini  

* g++.dg/diagnostic/return-type-invalid-1.C: New.
* g++.old-deja/g++.brendan/crash16.C: Adjust.
* g++.old-deja/g++.law/ctors5.C: Likewise.
Index: cp/decl.c
===
--- cp/decl.c   (revision 271949)
+++ cp/decl.c   (working copy)
@@ -10111,6 +10111,15 @@ smallest_type_quals_location (int type_quals, cons
   return loc;
 }
 
+/* Returns the smallest among the latter and locations[ds_type_spec].  */
+
+static location_t
+smallest_type_location (int type_quals, const location_t* locations)
+{
+  location_t loc = smallest_type_quals_location (type_quals, locations);
+  return min_location (loc, locations[ds_type_spec]);
+}
+
 /* Check that it's OK to declare a function with the indicated TYPE
and TYPE_QUALS.  SFK indicates the kind of special function (if any)
that this function is.  OPTYPE is the type given in a conversion
@@ -10129,7 +10138,8 @@ check_special_function_return_type (special_functi
 {
 case sfk_constructor:
   if (type)
-   error ("return type specification for constructor invalid");
+   error_at (smallest_type_location (type_quals, locations),
+ "return type specification for constructor invalid");
   else if (type_quals != TYPE_UNQUALIFIED)
error_at (smallest_type_quals_location (type_quals, locations),
  "qualifiers are not allowed on constructor declaration");
@@ -10142,7 +10152,8 @@ check_special_function_return_type (special_functi
 
 case sfk_destructor:
   if (type)
-   error ("return type specification for destructor invalid");
+   error_at (smallest_type_location (type_quals, locations),
+ "return type specification for destructor invalid");
   else if (type_quals != TYPE_UNQUALIFIED)
error_at (smallest_type_quals_location (type_quals, locations),
  "qualifiers are not allowed on destructor declaration");
@@ -10157,7 +10168,8 @@ check_special_function_return_type (special_functi
 
 case sfk_conversion:
   if (type)
-   error ("return type specified for %", optype);
+   error_at (smallest_type_location (type_quals, locations),
+ "return type specified for %", optype);
   else if (type_quals != TYPE_UNQUALIFIED)
error_at (smallest_type_quals_location (type_quals, locations),
  "qualifiers are not allowed on declaration of "
@@ -10168,7 +10180,8 @@ check_special_function_return_type (special_functi
 
 case sfk_deduction_guide:
   if (type)
-   error ("return type specified for deduction guide");
+   error_at (smallest_type_location (type_quals, locations),
+ "return type specified for deduction guide");
   else if (type_quals != TYPE_UNQUALIFIED)
error_at (smallest_type_quals_location (type_quals, locations),
  "qualifiers are not allowed on declaration of "
@@ -10438,10 +10451,8 @@ grokdeclarator (const cp_declarator *declarator,
   if (initialized > 1)
 funcdef_flag = true;
 
-  location_t typespec_loc = smallest_type_quals_location (type_quals,
- declspecs->locations);
-  typespec_loc = min_location (typespec_loc,
-  declspecs->locations[ds_type_spec]);
+  location_t typespec_loc = smallest_type_location (type_quals,
+   declspecs->locations);
   if (typespec_loc == UNKNOWN_LOCATION)
 typespec_loc = input_location;
 
Index: testsuite/g++.dg/diagnostic/return-type-invalid-1.C
===
--- testsuite/g++.dg/diagnostic/return-type-invalid-1.C (nonexistent)
+++ testsuite/g++.dg/diagnostic/return-type-invalid-1.C (working copy)
@@ -0,0 +1,27 @@
+struct S1
+{
+  int S1();  // { dg-error "3:return type" }
+  int ~S1();  // { dg-error "3:return type" }
+  int operator int();  // { dg-error "3:return type" }
+};
+
+struct S2
+{
+  const int S2();  // { dg-error "3:return type" }
+  const int ~S2();  // { dg-error "3:return type" }
+  const int operator int();  // { dg-error "3:return type" }
+};
+
+struct S3
+{
+  volatile int S3();  // { dg-error "3:return type" }
+  volatile int ~S3();  // { dg-error "3:return type" }
+  

[PATCH] AIX unwind register number fixes

2019-06-05 Thread David Edelsohn
The recent changes to rs6000 "debug" register numbers created a
conflict in libgcc aix unwind support.  aix-unwind.h provided
definitions of the DWARF register numbers for its own reference.
After the internal register number changes and the definition of those
register names in rs6000.md, the register names appear in
insn-constants.h, which is included by libgcc.  The macro definitions
in aix-unwind.h now conflict with the versions derived from rs6000.md.

This patch changes the aix-unwind.h macro names from XX_REGNO to R_XX,
following the convention of the other unwind files in libgcc to avoid
conflict.

Bootstrapped on powerpc-ibm-aix7.2.0.0.

Thanks, David

* config/rs6000/aix-unwind.h (LR_REGNO): Rename to R_LR.
(CR2_REGNO): Rename to R_CR2.
(XER_REGNO): Rename to R_XER.
(FIRST_ALTIVEC_REGNO): Rename to R_FIRST_ALTIVEC.
(VRSAVE_REGNO): Rename to R_VRSAVE.
(VSCR_REGNO): R_VSCR.

Index: aix-unwind.h
===
--- aix-unwind.h(revision 271883)
+++ aix-unwind.h(working copy)
@@ -24,12 +24,12 @@

 /* Useful register numbers.  */

-#define LR_REGNO 65
-#define CR2_REGNO70
-#define XER_REGNO76
-#define FIRST_ALTIVEC_REGNO  77
-#define VRSAVE_REGNO109
-#define VSCR_REGNO  110
+#define R_LR 65
+#define R_CR270
+#define R_XER76
+#define R_FIRST_ALTIVEC  77
+#define R_VRSAVE109
+#define R_VSCR  110

 /* If the current unwind info (FS) does not contain explicit info
saving R2, then we have to do a minor amount of code reading to
@@ -44,7 +44,7 @@
   {
\
unsigned int *insn  \
  = (unsigned int *)\
-   _Unwind_GetGR ((CTX), LR_REGNO);\
+   _Unwind_GetGR ((CTX), R_LR);\
if (*insn == 0xE8410028)\
  _Unwind_SetGRPtr ((CTX), 2, (CTX)->cfa + 40); \
   }
\
@@ -56,7 +56,7 @@
   {
\
unsigned int *insn  \
  = (unsigned int *)\
-   _Unwind_GetGR ((CTX), LR_REGNO);\
+   _Unwind_GetGR ((CTX), R_LR);\
if (*insn == 0x80410014)\
  _Unwind_SetGRPtr ((CTX), 2, (CTX)->cfa + 20); \
   }\
@@ -241,9 +241,9 @@ ppc_aix_fallback_frame_state (struct _Unwind_Conte
 if (i != __LIBGCC_STACK_POINTER_REGNUM__)
   REGISTER_CFA_OFFSET_FOR (fs, i, >gpr[i], new_cfa);

-  REGISTER_CFA_OFFSET_FOR (fs, CR2_REGNO, >cr, new_cfa);
-  REGISTER_CFA_OFFSET_FOR (fs, XER_REGNO, >xer, new_cfa);
-  REGISTER_CFA_OFFSET_FOR (fs, LR_REGNO, >lr, new_cfa);
+  REGISTER_CFA_OFFSET_FOR (fs, R_CR2, >cr, new_cfa);
+  REGISTER_CFA_OFFSET_FOR (fs, R_XER, >xer, new_cfa);
+  REGISTER_CFA_OFFSET_FOR (fs, R_LR, >lr, new_cfa);

   fs->retaddr_column = RETURN_COLUMN;
   REGISTER_CFA_OFFSET_FOR (fs, RETURN_COLUMN, >iar, new_cfa);
@@ -268,10 +268,10 @@ ppc_aix_fallback_frame_state (struct _Unwind_Conte

  for (i = 0; i < 32; i++)
REGISTER_CFA_OFFSET_FOR
-   (fs, i+FIRST_ALTIVEC_REGNO, >regs[i], new_cfa);
+   (fs, i+R_FIRST_ALTIVEC, >regs[i], new_cfa);

- REGISTER_CFA_OFFSET_FOR (fs, VSCR_REGNO, >vscr, new_cfa);
- REGISTER_CFA_OFFSET_FOR (fs, VRSAVE_REGNO, >vrsave, new_cfa);
+ REGISTER_CFA_OFFSET_FOR (fs, R_VSCR, >vscr, new_cfa);
+ REGISTER_CFA_OFFSET_FOR (fs, R_VRSAVE, >vrsave, new_cfa);
}
 }


[Bug c/90758] New: out of bounds warning with -O3

2019-06-05 Thread pmatos at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90758

Bug ID: 90758
   Summary: out of bounds warning with -O3
   Product: gcc
   Version: 7.4.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pmatos at gcc dot gnu.org
  Target Milestone: ---

Created attachment 46455
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46455=edit
testcase

There's a bad out of bounds warning in gcc7 line which seems to be fixed in
gcc8 and later which only occurs with -O3.

With the attached file and the command line:
/home/pmatos/Projects/gcc-build/gcc/cc1 -fpreprocessed vfasl.i -quiet -dumpbase
vfasl.i -m64 -msse2 -mtune=generic -march=x86-64 -auxbase vfasl -O3
-Wpointer-arith -Wextra -Werror -Wimplicit-fallthrough=0 -Wall -version -o
/tmp/ccYGabNR.s   
GNU C11 (GCC) version 7.4.1 20190605 (x86_64-pc-linux-gnu)
compiled by GNU C version 8.2.1 20181127, GMP version 6.1.2, MPFR
version 4.0.2, MPC version 1.1.0, isl version none
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C11 (GCC) version 7.4.1 20190605 (x86_64-pc-linux-gnu)
compiled by GNU C version 8.2.1 20181127, GMP version 6.1.2, MPFR
version 4.0.2, MPC version 1.1.0, isl version none
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 4c52f0564d8672c9a189665c732e246d
vfasl.i: In function ‘l’:
vfasl.i:25:14: error: array subscript is above array bounds
[-Werror=array-bounds]
   ay = at[ax];
~~^~~~
vfasl.i: In function ‘j’:
vfasl.i:42:11: error: ‘n’ is used uninitialized in this function
[-Werror=uninitialized]
   *(a *)0 = n;
   ^~~
vfasl.i:45:3: note: ‘n’ was declared here
 a j(ac *m, a n) { return o(m, n); }
   ^
cc1: all warnings being treated as errors

[Bug target/83531] Build broken on macOS 10.13.2

2019-06-05 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83531

--- Comment #9 from Iain Sandoe  ---
(In reply to MCCCS from comment #6)
> After reading your comment, I noticed that
> there were two things I forgot to mention:
> 
> 1 - availability.h is the file where
> "API_AVAILABLE" is defined for Clang.
> 
> 2 - the part of the file the patch
> changes is 1:1 copied from
> AvailabilityInternal.h, so someone
> must've forgotten to replace after
> pasting.
> 
> But yes, there's no need to hurry
> to fix it. It's existed since
> October 2017; no one has noticed.

Well, it's still there in the Xcode 10.2 SDK, and I agree it looks like a
pasto.  I suppose you could cover both bases by augmenting the existing text
with the versions without __ .. but perhaps that's OTT.

I've done one reg-strap with it on Darwin18 - and it looks OK, so go ahead and
post your patch.
(a check on Darwin17 seems worthwhile too, so will put that into my queue)

[Bug debug/24551] [meta-bug] -feliminate-unused-debug-types issues

2019-06-05 Thread patrickdepinguin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24551

--- Comment #4 from Thomas De Schampheleire  
---
Could it not be that #14167 is now fixed after fixing #86964 ?

Re: [0/3] Improve debug info for addressable vars

2019-06-05 Thread Richard Sandiford
Richard Biener  writes:
> On Sat, Jun 1, 2019 at 5:49 PM Richard Sandiford
>  wrote:
>>
>> Taking the address of a variable stops us doing var-tracking on it,
>> so that we just use the DECL_RTL instead.  This can easily cause wrong
>> debug info for regions of code that would have had correct debug info
>> if the variable weren't addressable.  E.g.:
>>
>> {
>>   int base;
>>   get_start ();
>>   x[i1] = base;
>>   base += 1; // No need to store this
>>   x[i2] = base; // ...so the debug info for "base" is wrong here
>> }
>>
>> or (the motivating example):
>>
>> {
>>   int base;
>>   get_start ();
>>   for (int i = 0; i < n; ++i)
>> {
>>   x[i] = base;
>>   base += y[i]; // Can apply LSM here, so the debug info for "base"
>> // in the loop is wrong
>> }
>>   consume ();
>> }
>>
>> This patch series lets us use the DECL_RTL location for some parts of a
>> variable's lifetime and debug-bind locations for other parts:
>>
>> 1) Gimple uses "VAR s=> VAR" to bind VAR to its DECL_RTL.  The binding
>>holds until overridden.
>>
>> 2) RTL does the same thing using:
>>
>>  (var_location VAR (decl_rtl_ref VAR))
>>
>>where DECL_RTL_REF is a new rtx code that captures the DECL_RTL
>>by reference rather than by value.
>>
>>We can't just use "(var_location VAR (mem X))" for this, because
>>that would bind VAR to the value that (mem X) has at that exact point.
>>VAR would therefore get reset by any possible change to (mem X),
>>whereas here we want it to track (possibly indirect) updates instead.
>>
>> 3) The gimplifier decides which variables should get the new treatment
>>and emits "VAR s=> VAR" to mark the start of VAR's lifetime.
>>Clobbers continue to mark the end of VAR's lifetime.
>>
>> 4) Stores to VAR implicitly reestablish the link between VAR and its
>>DECL_RTL.  This is simpler (and IMO more robust) than inserting an
>>explicit "VAR s=> VAR" at every write.
>>
>> 5) gsi_remove tries to insert "VAR => X" in place of a deleted "VAR = X",
>>falling back to a "VAR => NULL" reset if that fails.
>>
>> Patch 1 handles the new rtl code, patch 2 adds the gimple framework,
>> and patch 3 uses it for LSM.
>
> So I wonder how it handles
>
> void __attribute__((noinline)) foo(int *p) { *p = 42; }
> int x;
> int main()
> {
>   int base = 1;
>   foo ();
>   base = 2;
>   *(x ?  : ) = 1; // (*)
>   return 0;
> }
>
> here we DSE the base = 2 store leaving a
>
> # DEBUG base = 2
>
> stmt?  But there's an indirect store that also stores
> to base - what will the debug info say at/after (*)?  Will
> it claim that base is 2?  At least I do not see that
> the connection with bases DECL_RTL is re-established?

Yeah, true.

> There's a clobber of base before return 0 so you eventually
> have to add some dummy stmt you can print base after
> the indirect store.
>
> That said, doesn't "aliasing" create another source of wrong-debug
> with your approach that might be even worse?

Not sure about even worse, but maybe different.  In the example above
the patches fix the debug info after "base = 2" but break it after the
following statement.

But there's no real need for the compiler to store to base in (*) either.
We could end up with "if (...) x = 1;" instead.  So AFAICT there's no
guarantee that we'll get correct debug info at the return statement even
as things stand.

For memory variables, I think we're always at the mercy of dead stores
being optimised away, and the patch isn't trying to fix that.  Since
both writes to base are dead in the above, I don't think we can guarantee
correct debug info without compromising optimisation for the sake of
debuggability.  (FWIW, I have a WIP patch to add an option for that,
hope to post an RFC soon.)

I can't think of a case in which the patches introduce wrong debug
info for code that isn't dead.

Thanks,
Richard

>
> Otherwise the patches look reasonable.
>
> Richard.
>
>> Bootstrapped & regtested on aarch64-linux-gnu and x86_64-linux-gnu.
>> OK to install?
>>
>> Richard


[Bug debug/24551] [meta-bug] -feliminate-unused-debug-types issues

2019-06-05 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24551

--- Comment #3 from Eric Gallager  ---
(In reply to Richard Biener from comment #2)
> But this meta-bug is about -feliminate-unused-debug-types, not -decls.

oh sorry, I misread, nvm then...

Re: Ping: [PATCH] PR88395 Fix Nullptr when compiling with -fconcepts

2019-06-05 Thread Richard Sandiford
Thanks for the patch and sorry that there was no response.
I've added the C++ maintainers to cc:

nick  writes:
> I'm pinging this patch as it's old now and should be applied to fix the bug.
>
> Nick
>
> On 2019-04-08 7:20 p.m., Nicholas Krause wrote:
>> This fixes the caller in tsubst_requires_expr to
>> tsubst_constraint_variables to wrap their respective
>> trees in PARM_CONSTR_PARMS. This is to get the correct
>> parmeter constraints from the tree before calling
>> tsubst_constraint_variables like other callers
>> in constraint.cc and to fix the bug id, 88395 on
>> the gcc bugzilla. OK for merge?
>> 
>> Signed-off-by: Nicholas Krause 
>> ---
>>  gcc/cp/constraint.cc | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
>> index 9884eb0db50..a78d0a9a49b 100644
>> --- a/gcc/cp/constraint.cc
>> +++ b/gcc/cp/constraint.cc
>> @@ -1882,7 +1882,7 @@ tsubst_requires_expr (tree t, tree args,
>>tree parms = TREE_OPERAND (t, 0);
>>if (parms)
>>  {
>> -  parms = tsubst_constraint_variables (parms, args, complain, in_decl);
>> +  parms = tsubst_constraint_variables (PARM_CONSTR_PARMS (parms), args, 
>> complain, in_decl);
>>if (parms == error_mark_node)
>>  return error_mark_node;
>>  }
>> 


[Bug debug/24551] [meta-bug] -feliminate-unused-debug-types issues

2019-06-05 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24551

--- Comment #2 from Richard Biener  ---
But this meta-bug is about -feliminate-unused-debug-types, not -decls.

  1   2   >