Re: [patch, fortran] Fix PR 66089, ICE (plus wrong code) in dependency handling

2019-03-07 Thread Bernhard Reutner-Fischer
On 6 March 2019 19:49:59 CET, Thomas Koenig  wrote:
>Hello world,
>
>the attached patch fixes a 7/8/9 regression where dependency checking
>for class arrays and a scalar value was mishandled when the
>dependency happened in an elemental function.
>
>There was an ICE for the test case which is handled by
>fixing up the class refs in gfc_walk_variable_expr.
>Once this was gone, a wrong-code issue appeared which was fixed
>by the part in gfc_scalar_elemental_arg_saved_as_reference
>(is that the longest function name in gfortran?).
>
>Regression-tested. OK for all affected branches?

Please change call abort to stop N in the test?

>   PR fortran/66089
>   * gfortran.dg/dependency_53.f90: New test.



Re: [C++ PATCH] Toplevel asm volatile followup (PR c++/89585)

2019-03-07 Thread Jason Merrill

On 3/7/19 2:13 PM, Jakub Jelinek wrote:

On Wed, Mar 06, 2019 at 07:44:25PM -0500, Jason Merrill wrote:

In addition to that, it mentions in the documentation that qualifiers are
not allowed at toplevel asm statements; apparently our documentation at
least from r220506 for GCC 5 says that at toplevel Basic Asm needs to be
used and for Basic Asm lists volatile qualifier as optional and its behavior
(that it is ignored for Basic Asm).  Makes me wonder if we don't want to
keep accepting/ignoring volatile at toplevel for both C and C++ instead of
rejecting it (and rejecting just the other qualifiers).  Thoughts on this?


That seems reasonable.  Or using warning or permerror instead of error.


This incremental patch uses warning.  Bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?


OK.

Jason


Re: [C++ PATCH] Disallow reinterpret_cast in potential_constant_expression_1 (PR c++/89599)

2019-03-07 Thread Jason Merrill

On 3/7/19 2:29 PM, Jakub Jelinek wrote:

Hi!

The last testcase in the patch diagnoses invalid constexpr in the
ptr case, but doesn't for arr.
The array is constexpr, so we do:
   value = fold_non_dependent_expr (value);
   if (DECL_DECLARED_CONSTEXPR_P (decl)
   || (DECL_IN_AGGR_P (decl)
   && DECL_INITIALIZED_IN_CLASS_P (decl)))
 {
   /* Diagnose a non-constant initializer for constexpr variable or
  non-inline in-class-initialized static data member.  */
   if (!require_constant_expression (value))
 value = error_mark_node;
   else if (processing_template_decl)
 /* In a template we might not have done the necessary
transformations to make value actually constant,
e.g. extend_ref_init_temps.  */
 value = maybe_constant_init (value, decl, true);
   else
 value = cxx_constant_init (value, decl);
 }
but require_constant_expression returned true even when there are
REINTERPRET_CAST_Ps in the CONSTRUCTOR, and then cxx_constant_init
doesn't reject it, because:
 case CONSTRUCTOR:
   if (TREE_CONSTANT (t) && reduced_constant_expression_p (t))
 {
   /* Don't re-process a constant CONSTRUCTOR, but do fold it to
  VECTOR_CST if applicable.  */
   verify_constructor_flags (t);
   if (TREE_CONSTANT (t))
 return fold (t);
 }
   r = cxx_eval_bare_aggregate (ctx, t, lval,
non_constant_p, overflow_p);
   break;
and reduced_constant_expression_p is true on it, so we never try to evaluate
it.

The following patch changes potential_constant_expression_1 to reject the
REINTERPRET_CAST_P, not really sure if that is the best way though.


That seems right to me.  The patch is OK.

Jason


Re: [C++ PATCH] Fix up joust diagnostics (PR c++/89622)

2019-03-07 Thread Jason Merrill

On 3/7/19 2:25 PM, Jakub Jelinek wrote:

Hi!

If no diagnostic is emitted by this pedwarn, whether because of
-Wno-system-headers and a location from system headers, or because of -w
etc., we still emit the follow-up messages as if the pedwarn emitted
something.

The following patch makes it conditional on pedwarn returning true (i.e.
that something has been actually printed).
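
The pattern described above can be sketched as follows. This is a minimal illustration, not GCC's diagnostic machinery: `diag_sink`, its `pedwarn` and `note` members, and `report_ambiguity` are hypothetical stand-ins, and the only assumption taken from the message is that the primary diagnostic reports whether it was actually printed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a diagnostic sink; not GCC's real API.
struct diag_sink
{
  bool suppress_warnings = false;   // models -w or a system-header location
  std::vector<std::string> printed; // everything that reached the user

  // Returns true only if the message was actually printed.
  bool pedwarn (const std::string &msg)
  {
    if (suppress_warnings)
      return false;
    printed.push_back ("pedwarn: " + msg);
    return true;
  }

  void note (const std::string &msg)
  {
    printed.push_back ("note: " + msg);
  }
};

// Emit the follow-up notes only when the primary diagnostic was printed.
// Returns the number of messages printed so far.
inline std::size_t
report_ambiguity (diag_sink &sink)
{
  if (sink.pedwarn ("these candidates are ambiguous"))
    {
      sink.note ("candidate 1");
      sink.note ("candidate 2");
    }
  return sink.printed.size ();
}
```

With a default sink the call prints the warning plus both notes; with `suppress_warnings` set, nothing is printed at all, so no orphaned notes appear.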

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR c++/89622
* call.c (joust): Call print_z_candidate only if pedwarn returned
true.


OK.

Jason



[C++ PATCH] PR c++/88123 - lambda and using-directive.

2019-03-07 Thread Jason Merrill
For named function calls in a template, the result of unqualified lookup is
saved in CALL_EXPR_FN.  But for operator expressions, no unqualified lookup
is performed until we know whether the operands have class type.  So when we
see in a lambda a use of an operator that might be overloaded, we need to do
that lookup then and save it away somewhere.  One possibility would be in
the expression, but we can't really add extra conditional operands to
standard tree codes.  I mostly implemented another approach using a new
WITH_LOOKUP_EXPR code, but teaching everywhere how to handle a new tree code
is always complicated.  Then it occurred to me that we could associate the
lookups with the function, which is both simpler and smaller.  So this patch
stores any operator bindings needed by a lambda function in an internal
attribute on the lambda call operator.
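
A small sketch of the kind of code affected, assuming a compiler that includes this fix. The names `model`, `ops`, `amount` and `add_in_generic_lambda` are made up for illustration and are not taken from the new testcase. The operator is declared in a namespace that is not associated with the operand type, so it can only be found through the using-directive that is in effect where the generic lambda is defined.

```cpp
#include <cassert>

namespace model
{
  struct amount { int value; };
}

// The operator lives in a namespace unrelated to model::amount, so
// argument-dependent lookup alone cannot find it.
namespace ops
{
  inline model::amount
  operator+ (model::amount a, model::amount b)
  {
    return model::amount{a.value + b.value};
  }
}

inline int
add_in_generic_lambda ()
{
  // The using-directive makes ops::operator+ visible to unqualified
  // lookup at the point where the lambda body is parsed.
  using namespace ops;
  auto add = [] (auto a, auto b) { return a + b; };
  return add (model::amount{1}, model::amount{2}).value;
}
```

Because `a + b` has dependent operands, the operator lookup has to be remembered from the definition context; by the time the call operator is instantiated, the using-directive is no longer in scope.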

Tested x86_64-pc-linux-gnu, applying to trunk.  Nathan, does slipping these
bindings into the sk_function_parms binding level this way make sense to you?

* name-lookup.c (op_unqualified_lookup)
(maybe_save_operator_binding, discard_operator_bindings)
(push_operator_bindings): New.
* typeck.c (build_x_binary_op, build_x_unary_op): Call
maybe_save_operator_binding.
* decl.c (start_preparsed_function): Call push_operator_bindings.
* tree.c (cp_free_lang_data): Call discard_operator_bindings.
---
 gcc/cp/name-lookup.h  |  3 +
 gcc/cp/decl.c |  2 +
 gcc/cp/name-lookup.c  | 99 +++
 gcc/cp/tree.c |  2 +
 gcc/cp/typeck.c   | 12 ++-
 .../g++.dg/cpp1y/lambda-generic-using1.C  | 29 ++
 gcc/cp/ChangeLog  |  9 ++
 7 files changed, 154 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-generic-using1.C

diff --git a/gcc/cp/name-lookup.h b/gcc/cp/name-lookup.h
index 36816df5ada..a47486d1b8a 100644
--- a/gcc/cp/name-lookup.h
+++ b/gcc/cp/name-lookup.h
@@ -330,5 +330,8 @@ extern void push_nested_namespace (tree);
 extern void pop_nested_namespace (tree);
 extern void push_to_top_level (void);
 extern void pop_from_top_level (void);
+extern void maybe_save_operator_binding (tree);
+extern void push_operator_bindings (void);
+extern void discard_operator_bindings (tree);
 
 #endif /* GCC_CP_NAME_LOOKUP_H */
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 173758feddf..0187db5ff1c 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -15553,6 +15553,8 @@ start_preparsed_function (tree decl1, tree attrs, int flags)
 
   store_parm_decls (current_function_parms);
 
+  push_operator_bindings ();
+
   if (!processing_template_decl
   && (flag_lifetime_dse > 1)
   && DECL_CONSTRUCTOR_P (decl1)
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 1ddcde26ef4..2ba888fd1c2 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7556,4 +7556,103 @@ cp_emit_debug_info_for_using (tree t, tree context)
 }
 }
 
+/* Return the result of unqualified lookup for the overloaded operator
+   designated by CODE, if we are in a template and the binding we find is
+   not.  */
+
+static tree
+op_unqualified_lookup (tree fnname)
+{
+  if (cxx_binding *binding = IDENTIFIER_BINDING (fnname))
+{
+  cp_binding_level *l = binding->scope;
+  while (l && !l->this_entity)
+   l = l->level_chain;
+  if (l && uses_template_parms (l->this_entity))
+   /* Don't preserve decls from an uninstantiated template,
+  wait until that template is instantiated.  */
+   return NULL_TREE;
+}
+  tree fns = lookup_name (fnname);
+  if (fns && fns == get_global_binding (fnname))
+/* The instantiation can find these.  */
+return NULL_TREE;
+  return fns;
+}
+
+/* E is an expression representing an operation with dependent type, so we
+   don't know yet whether it will use the built-in meaning of the operator or a
+   function.  Remember declarations of that operator in scope.  */
+
+const char *const op_bind_attrname = "operator bindings";
+
+void
+maybe_save_operator_binding (tree e)
+{
+  /* This is only useful in a generic lambda.  */
+  if (!processing_template_decl)
+return;
+  tree cfn = current_function_decl;
+  if (!cfn)
+return;
+
+  /* Let's only do this for generic lambdas for now, we could do it for all
+ function templates if we wanted to.  */
+  if (!current_lambda_expr())
+return;
+
+  tree fnname = ovl_op_identifier (false, TREE_CODE (e));
+  if (!fnname)
+return;
+
+  tree attributes = DECL_ATTRIBUTES (cfn);
+  tree attr = lookup_attribute (op_bind_attrname, attributes);
+  tree bindings = NULL_TREE;
+  tree fns = NULL_TREE;
+  if (attr)
+{
+  bindings = TREE_VALUE (attr);
+  if (tree elt = purpose_member (fnname, bindings))
+   fns = TREE_VALUE (elt);
+}
+
+  if (!fns && (fns = op_unqualified_lookup (fnname)))
+{
+  

[PATCH] PR88497 - Extend reassoc for vector bit_field_ref

2019-03-07 Thread Kewen.Lin
Hi,

As PR88497 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497), 
when we meet some code pattern like:
   
   V1[0] + V1[1] + ... + V1[k] + V2[0] + ... + V2[k] + ... Vn[k]
   // V1...Vn of VECTOR_TYPE

We can teach reassoc to transform it to:

   Vs = (V1 + V2 + ... + Vn)
   Vs[0] + Vs[1] + ... + Vs[k]

It saves addition and bit_field_ref operations and exposes more 
opportunities for downstream passes.  I notice that even on a 
target that doesn't support the vector type, where the vector type gets 
expanded in veclower, it's still better to have it, since the generated 
sequence is more friendly for widening_mul.  (If DCE ran one more time 
after forwprop, the result should be the same.)
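
The equivalence behind the transformation can be checked with a small sketch. It uses integer lanes and `std::array` instead of GCC vector types so that the two forms agree exactly; the patch itself targets floating-point sums, where the reordering is only valid under flags such as -ffast-math. The names `vec4`, `sum_lanes_per_vector` and `sum_lanes_of_vector_sum` are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using vec4 = std::array<long, 4>;

// Original shape: V1[0] + ... + V1[k] + V2[0] + ... + Vn[k].
// Every lane of every vector is extracted and added as a scalar.
inline long
sum_lanes_per_vector (const std::vector<vec4> &vs)
{
  long total = 0;
  for (const vec4 &v : vs)
    for (long lane : v)
      total += lane;
  return total;
}

// Transformed shape: Vs = V1 + V2 + ... + Vn, then Vs[0] + ... + Vs[k].
// Only the lanes of the single vector sum are extracted.
inline long
sum_lanes_of_vector_sum (const std::vector<vec4> &vs)
{
  vec4 acc{};   // zero-initialised accumulator vector
  for (const vec4 &v : vs)
    for (std::size_t i = 0; i < acc.size (); ++i)
      acc[i] += v[i];   // models one element-wise vector addition

  long total = 0;
  for (long lane : acc)
    total += lane;
  return total;
}
```

For n vectors of k+1 lanes, the first form extracts n*(k+1) lanes, while the second extracts only k+1 lanes after n-1 vector additions.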

Bootstrapped/regtested on powerpc64le-linux-gnu, ok for trunk?

Thanks in advance!


gcc/ChangeLog

2019-03-08  Kewen Lin  

PR target/88497
* tree-ssa-reassoc.c (reassociate_bb): Swap the positions of 
GIMPLE_BINARY_RHS check and gimple_visited_p check, call new 
function undistribute_bitref_for_vector.
(undistribute_bitref_for_vector): New function.
(cleanup_vinfo_map): Likewise.
(unsigned_cmp): Likewise.

gcc/testsuite/ChangeLog

2019-03-08  Kewen Lin  

* gcc.dg/tree-ssa/pr88497.c: New test.

---
 gcc/testsuite/gcc.dg/tree-ssa/pr88497.c |  18 +++
 gcc/tree-ssa-reassoc.c  | 274 +++-
 2 files changed, 287 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497.c b/gcc/testsuite/gcc.dg/tree-ssa/pr88497.c
new file mode 100644
index 000..4d9ac67
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-reassoc1" } */
+typedef double v2df __attribute__ ((vector_size (16)));
+double
+test (double accumulator, v2df arg1[], v2df arg2[])
+{
+  v2df temp;
+  temp = arg1[0] * arg2[0];
+  accumulator += temp[0] + temp[1];
+  temp = arg1[1] * arg2[1];
+  accumulator += temp[0] + temp[1];
+  temp = arg1[2] * arg2[2];
+  accumulator += temp[0] + temp[1];
+  temp = arg1[3] * arg2[3];
+  accumulator += temp[0] + temp[1];
+  return accumulator;
+}
+/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 2 "reassoc1" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index e1c4dfe..fc0e297 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1772,6 +1772,263 @@ undistribute_ops_list (enum tree_code opcode,
   return changed;
 }

+/* Hold the information of one specific VECTOR_TYPE SSA_NAME.
+- offsets: for different BIT_FIELD_REF offsets accessing same VECTOR.
+- ops_indexes: the index of vec ops* for each relevant BIT_FIELD_REF.  */
+struct v_info
+{
+  auto_vec offsets;
+  auto_vec ops_indexes;
+};
+
+typedef struct v_info *v_info_ptr;
+
+/* Comparison function for qsort on unsigned BIT_FIELD_REF offsets.  */
+static int
+unsigned_cmp (const void *p_i, const void *p_j)
+{
+  if (*(const unsigned *) p_i >= *(const unsigned *) p_j)
+return 1;
+  else
+return -1;
+}
+
+/* Cleanup hash map for VECTOR information.  */
+static void
+cleanup_vinfo_map (hash_map _map)
+{
+  for (hash_map::iterator it = info_map.begin ();
+   it != info_map.end (); ++it)
+{
+  v_info_ptr info = (*it).second;
+  delete info;
+  (*it).second = NULL;
+}
+}
+
+/* Perform un-distribution of BIT_FIELD_REF on VECTOR_TYPE.
+ V1[0] + V1[1] + ... + V1[k] + V2[0] + V2[1] + ... + V2[k] + ... Vn[k]
+   is transformed to
+ Vs = (V1 + V2 + ... + Vn)
+ Vs[0] + Vs[1] + ... + Vs[k]
+
+   The basic steps are listed below:
+
+1) Check the addition chain *OPS by looking for those summands coming from
+   VECTOR bit_field_ref on VECTOR type. Put the information into
+   v_info_map for each satisfied summand, using VECTOR SSA_NAME as key.
+
+2) For each key (VECTOR SSA_NAME), validate that all its BIT_FIELD_REFs are
+   continuous, i.e. they cover the whole VECTOR perfectly without any holes.
+   Obtain one VECTOR list which contains candidates to be transformed.
+
+3) Build the addition statements for all VECTOR candidates, generate
+   BIT_FIELD_REFs accordingly.
+
+   TODO: The implementation currently requires all candidate VECTORs to have
+   the same VECTOR type, it can be extended into different groups by VECTOR
+   types in future if any profitable cases are found.  */
+static bool
+undistribute_bitref_for_vector (enum tree_code opcode, vec 
*ops,
+struct loop *loop)
+{
+  if (ops->length () <= 1 || opcode != PLUS_EXPR)
+return false;
+
+  hash_map v_info_map;
+  operand_entry *oe1;
+  unsigned i;
+
+  /* Find those summands from VECTOR BIT_FIELD_REF in addition chain, put the
+ information into map.  */
+  FOR_EACH_VEC_ELT (*ops, i, oe1)
+{
+  enum tree_code dcode;
+  gimple *oe1def;
+
+  if (TREE_CODE (oe1->op) != SSA_NAME)
+   continue;
+  oe1def = 

[PATCH] rs6000: Fix lost ud chains in swap optimization

2019-03-07 Thread Bill Schmidt
Hi,

We recently discovered a problem in swap optimization where the du- and ud-chains
were getting corrupted after a preliminary modification phase and prior to the
main body of the pass.  The fix for this is to rebuild the chains between phases.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
I've not included a test case because the problem tends to get lost in reduction,
and may shift over time anyway.  Is this okay for trunk, and eventual backport
to 8 and 7?

Thanks!

Bill


2019-03-07  Bill Schmidt  

* config/rs6000/rs6000-p8swap.c (rs6000_analyze_swaps): Rebuild
ud- and du-chains between phases.


Index: gcc/config/rs6000/rs6000-p8swap.c
===
--- gcc/config/rs6000/rs6000-p8swap.c   (revision 269471)
+++ gcc/config/rs6000/rs6000-p8swap.c   (working copy)
@@ -2316,7 +2316,14 @@ rs6000_analyze_swaps (function *fun)
 
   /* Pre-pass to recombine lvx and stvx patterns so we don't lose info.  */
   recombine_lvx_stvx_patterns (fun);
+
+  /* Rebuild ud- and du-chains.  */
+  df_remove_problem (df_chain);
   df_process_deferred_rescans ();
+  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
+  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
+  df_analyze ();
+  df_set_flags (DF_DEFER_INSN_RESCAN);
 
   /* Allocate structure to represent webs of insns.  */
   insn_entry = XCNEWVEC (swap_web_entry, get_max_uid ());



Re: [Bug libstdc++/89608] Undetected iterator invalidations on unordered containers in debug mode

2019-03-07 Thread Jonathan Wakely

On 07/03/19 23:57 +, Jonathan Wakely wrote:

On 07/03/19 22:22 +0100, François Dumont wrote:

Hi

    I considered the implementation to decide whether to invalidate 
iterators or not. As nodes are not deallocated and only slightly impacted 
during the rehash process I considered that they shouldn't be 
invalidated apart from the local iterators. I should have just 
considered the Standard.
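
What the Standard guarantees across a rehash can be sketched as below. This is an illustration only; the function name is made up. A rehash invalidates iterators but not pointers or references to elements, so the sketch keeps a pointer, forces a rehash, and then re-acquires the iterator with `find` instead of touching the stale one.

```cpp
#include <cassert>
#include <unordered_set>

// Returns true if the element's address survived a forced rehash.
inline bool
element_address_survives_rehash ()
{
  std::unordered_set<int> s{1, 2, 3};

  auto it = s.find (2);
  const int *p = &*it;   // pointers to elements stay valid across a rehash

  const auto before = s.bucket_count ();
  s.rehash (before * 4 + 1);   // asks for more buckets, forcing a rehash
  assert (s.bucket_count () > before);

  // 'it' is now invalid per the Standard and must not be used;
  // obtain a fresh iterator instead.
  it = s.find (2);
  return it != s.end () && &*it == p && *p == 2;
}
```

Using the old iterator after the `rehash` call is the kind of mistake the Debug mode check in this PR is meant to detect.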


    Here is the complete patch, which is Max Sistemich's proposal 
extended to the other unordered containers, along with the test 
adapted for the testsuite.


PR libstdc++/89608
* include/debug/unordered_map (unordered_map<>::_M_check_rehashed):
  Invalidate all iterators in case of rehash.
  (unordered_multimap<>::_M_check_rehashed): Likewise.
* include/debug/unordered_set (unordered_set<>::_M_check_rehashed):
   Likewise.
   (unordered_multiset<>::_M_check_rehashed): Likewise.

* testsuite/23_containers/unordered_set/debug/89608_neg.cc: New.

    I have run all unordered tests so far in Debug mode. I have 4 
unrelated failures that I'll fix through another patch. Ok to commit 
once all tests run ?


Yes, this is OK for trunk, thanks!


I guess it's probably OK for the branches too, but let's wait a bit
and commit it to the branches later.




Re: [Bug libstdc++/89608] Undetected iterator invalidations on unordered containers in debug mode

2019-03-07 Thread Jonathan Wakely

On 07/03/19 22:22 +0100, François Dumont wrote:

Hi

    I considered the implementation to decide whether to invalidate 
iterators or not. As nodes are not deallocated and only slightly impacted 
during the rehash process I considered that they shouldn't be invalidated 
apart from the local iterators. I should have just considered the Standard.


    Here is the complete patch, which is Max Sistemich's proposal 
extended to the other unordered containers, along with the test adapted 
for the testsuite.


PR libstdc++/89608
* include/debug/unordered_map (unordered_map<>::_M_check_rehashed):
  Invalidate all iterators in case of rehash.
  (unordered_multimap<>::_M_check_rehashed): Likewise.
* include/debug/unordered_set (unordered_set<>::_M_check_rehashed):
   Likewise.
   (unordered_multiset<>::_M_check_rehashed): Likewise.

* testsuite/23_containers/unordered_set/debug/89608_neg.cc: New.

    I have run all unordered tests so far in Debug mode. I have 4 unrelated 
failures that I'll fix through another patch. Ok to commit once all 
tests run ?


Yes, this is OK for trunk, thanks!



Re: PR libstdc++/89477 for Debug mode

2019-03-07 Thread Jonathan Wakely

On 07/03/19 22:34 +0100, François Dumont wrote:

Hi

    The PR 89477 fixes haven't been applied to the Debug mode. Here is 
the patch to fix the different deduction.cc tests.


    PR libstdc++/89477
    * include/debug/map.h (map): Use _RequireNotAllocator to constrain
    parameters in deduction guides.
    * include/debug/multimap.h (multimap): Likewise.
    * include/debug/set.h (set): Likewise.
    * include/debug/multiset.h (multiset): Likewise.
    * include/debug/unordered_map (unordered_map): Likewise.
    (unordered_multimap): Likewise.
    * include/debug/unordered_set (unordered_set): Likewise.
    (unordered_multiset): Likewise.

    Tested under Linux x86_64, ok to commit ?


OK, thanks for catching this.
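
The idea behind such a constraint can be sketched with an ordinary overload set. This is not the libstdc++ implementation: `looks_like_allocator` and `classify` are hypothetical names, and the detection rule (a `value_type` member plus a callable `allocate(size_t)`) is an assumption chosen for the example. The point is that a template parameter meant for a comparator stops matching once an allocator is passed in that position.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical detection trait: a type "looks like" an allocator if it
// has a value_type member and an allocate(size_t) member function.
template<typename T, typename = void>
struct looks_like_allocator : std::false_type { };

template<typename T>
struct looks_like_allocator<T,
  std::void_t<typename T::value_type,
	      decltype (std::declval<T &> ().allocate (std::size_t{}))>>
  : std::true_type { };

// Viable only when the argument is NOT allocator-like.
template<typename Compare,
	 std::enable_if_t<!looks_like_allocator<Compare>::value, int> = 0>
std::string
classify (const Compare &)
{
  return "comparator";
}

// Viable only when the argument IS allocator-like.
template<typename Alloc,
	 std::enable_if_t<looks_like_allocator<Alloc>::value, int> = 0>
std::string
classify (const Alloc &)
{
  return "allocator";
}
```

Without the two constraints, both overloads would accept any argument and a call would be ambiguous; with them, exactly one is viable for each kind of argument.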




Re: [PATCH] Add baseline symbols for riscv64-linux-gnu

2019-03-07 Thread Jim Wilson
On Wed, Mar 6, 2019 at 6:22 AM Andreas Schwab  wrote:
> * config/abi/post/riscv64-linux-gnu/baseline_symbols.txt: New file.

I thought we had fixed this already.  I vaguely recall discussing it
with you.  But I see that nothing was checked in, and I see that you
posted a patch in November that I forgot to reply to.  To make sure I
didn't forget again, I did a quick build to verify that it works for
me, and checked it in.

Jim


Re: [PATCH] Fix translation issue in config/darwin.c (PR target/80190)

2019-03-07 Thread Mike Stump
On Mar 7, 2019, at 11:37 AM, Jakub Jelinek  wrote:
> 
> The following patch just makes it two complete diagnostic messages that
> translators can translate as they wish.
> 
> Ok for trunk?

Ok.


[PR fortran/60091, patch] - Misleading error messages in rank-2 pointer assignment to rank-1 target

2019-03-07 Thread Harald Anlauf
The PR rightly complains about bad error messages for invalid pointer
assignments.  I've tried to adjust the logic slightly so that we now
print error messages that should explain more clearly what is wrong.

This required adjustment of 2 testcases, one of which also had an
incorrect comment.

OK for trunk?

Thanks,
Harald

2019-03-07  Harald Anlauf  

PR fortran/60091
* expr.c (gfc_check_pointer_assign): Correct and improve error
messages for invalid pointer assignments.

2019-03-07  Harald Anlauf  

PR fortran/60091
* gfortran.dg/pointer_remapping_3.f08: Adjust error messages.
* gfortran.dg/pointer_remapping_7.f90: Adjust error message.

Index: gcc/fortran/expr.c
===
--- gcc/fortran/expr.c  (revision 269445)
+++ gcc/fortran/expr.c  (working copy)
@@ -3703,6 +3703,7 @@
   gfc_ref *ref;
   bool is_pure, is_implicit_pure, rank_remap;
   int proc_pointer;
+  bool same_rank;
 
   lhs_attr = gfc_expr_attr (lvalue);
   if (lvalue->ts.type == BT_UNKNOWN && !lhs_attr.proc_pointer)
@@ -3724,6 +3725,7 @@
   proc_pointer = lvalue->symtree->n.sym->attr.proc_pointer;
 
   rank_remap = false;
+  same_rank = lvalue->rank == rvalue->rank;
   for (ref = lvalue->ref; ref; ref = ref->next)
 {
   if (ref->type == REF_COMPONENT)
@@ -3748,36 +3750,67 @@
   lvalue->symtree->n.sym->name, >where))
return false;
 
- /* When bounds are given, all lbounds are necessary and either all
-or none of the upper bounds; no strides are allowed.  If the
-upper bounds are present, we may do rank remapping.  */
+ /* Fortran standard (e.g. F2018, 10.2.2 Pointer assignment):
+  *
+  * (C1017) If bounds-spec-list is specified, the number of
+  * bounds-specs shall equal the rank of data-pointer-object.
+  *
+  * If bounds-spec-list appears, it specifies the lower bounds.
+  *
+  * (C1018) If bounds-remapping-list is specified, the number of
+  * bounds-remappings shall equal the rank of data-pointer-object.
+  *
+  * If bounds-remapping-list appears, it specifies the upper and
+  * lower bounds of each dimension of the pointer; the pointer target
+  * shall be simply contiguous or of rank one.
+  *
+  * (C1019) If bounds-remapping-list is not specified, the ranks of
+  * data-pointer-object and data-target shall be the same.
+  *
+  * Thus when bounds are given, all lbounds are necessary and either
+  * all or none of the upper bounds; no strides are allowed.  If the
+  * upper bounds are present, we may do rank remapping.  */
  for (dim = 0; dim < ref->u.ar.dimen; ++dim)
{
- if (!ref->u.ar.start[dim]
- || ref->u.ar.dimen_type[dim] != DIMEN_RANGE)
+ if (ref->u.ar.stride[dim])
{
- gfc_error ("Lower bound has to be present at %L",
+ gfc_error ("Stride must not be present at %L",
 >where);
  return false;
}
- if (ref->u.ar.stride[dim])
+ if (!same_rank && (!ref->u.ar.start[dim] ||!ref->u.ar.end[dim]))
{
- gfc_error ("Stride must not be present at %L",
+ gfc_error ("Rank remapping requires a "
+"bounds-specification-list at %L",
 >where);
  return false;
}
+ if (!ref->u.ar.start[dim]
+ || ref->u.ar.dimen_type[dim] != DIMEN_RANGE)
+   {
+ gfc_error ("Expected bounds-remapping-list or "
+"bounds-specification-list at %L",
+>where);
+ return false;
+   }
 
  if (dim == 0)
rank_remap = (ref->u.ar.end[dim] != NULL);
  else
{
- if ((rank_remap && !ref->u.ar.end[dim])
- || (!rank_remap && ref->u.ar.end[dim]))
+ if ((rank_remap && !ref->u.ar.end[dim]))
{
- gfc_error ("Either all or none of the upper bounds"
-" must be specified at %L", >where);
+ gfc_error ("Rank remapping requires a "
+"bounds-specification-list at %L",
+>where);
  return false;
}
+ if (!rank_remap && ref->u.ar.end[dim])
+   {
+ gfc_error ("Expected bounds-remapping-list or "
+"bounds-specification-list at %L",
+>where);
+   }
}
}

C++ PATCH for c++/89612 - ICE with member friend template with noexcept

2019-03-07 Thread Marek Polacek
This was one of those PRs where the more you poke, the more ICEs turn up.
This patch fixes the ones I could find.  The original problem was that
maybe_instantiate_noexcept got a TEMPLATE_DECL created for the member
friend template in do_friend.  Its noexcept-specification was deferred,
so we went to the block with push_access_scope, but that crashes on a
TEMPLATE_DECL.  One approach could be to somehow not defer noexcept-specs
for friend templates, I guess, but I didn't want to do that.

So the approach I did take in the end was to handle TEMPLATE_DECLs in
maybe_instantiate_noexcept.

That broke in register_parameter_specializations but we don't need this
code anyway, so let's do away with it -- the current_class_{ref,ptr}
code is enough to fix the PR that register_parameter_specializations was
introduced for.

Another issue was that since here we are instantiating a deferred noexcept,
in an instantiate_class_template context, processing_template_decl was 0.
Let's pretend it's on for the tsubst purposes here, and for build_noexcept_spec
too, so that the *_dependent_expression_p functions work.

Lastly, I found an invalid testcase that was breaking because a template code
leaked to constexpr functions.  This I fixed similarly to the recent explicit
PR fix (r269131).

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-03-07  Marek Polacek  

PR c++/89612 - ICE with member friend template with noexcept.
* except.c (build_noexcept_spec): Call instantiate_non_dependent_expr
before perform_implicit_conversion_flags.  Add processing_template_decl
sentinel.
* pt.c (maybe_instantiate_noexcept): For function templates, use their
template result (function decl).  Don't set up local specializations.
Temporarily turn on processing_template_decl.  Update the template type
too.

* g++.dg/cpp0x/noexcept36.C: New test.
* g++.dg/cpp1y/noexcept1.C: New test.
* g++.dg/cpp1z/noexcept-type21.C: New test.

diff --git gcc/cp/except.c gcc/cp/except.c
index 139e871d7a7..d97b8d40542 100644
--- gcc/cp/except.c
+++ gcc/cp/except.c
@@ -1285,10 +1285,13 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain)
   if (TREE_CODE (expr) != DEFERRED_NOEXCEPT
   && !value_dependent_expression_p (expr))
 {
+  expr = instantiate_non_dependent_expr_sfinae (expr, complain);
+  /* Don't let perform_implicit_conversion_flags create more template
+codes.  */
+  processing_template_decl_sentinel s;
   expr = perform_implicit_conversion_flags (boolean_type_node, expr,
complain,
LOOKUP_NORMAL);
-  expr = instantiate_non_dependent_expr (expr);
   expr = cxx_constant_value (expr);
 }
   if (TREE_CODE (expr) == INTEGER_CST)
diff --git gcc/cp/pt.c gcc/cp/pt.c
index 906cfe0a58c..44a2c4606f8 100644
--- gcc/cp/pt.c
+++ gcc/cp/pt.c
@@ -24174,6 +24174,17 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t complain)
 
   if (DECL_CLONED_FUNCTION_P (fn))
 fn = DECL_CLONED_FUNCTION (fn);
+
+  tree orig_fn = NULL_TREE;
+  /* For a member friend template we can get a TEMPLATE_DECL.  Let's use
+ its FUNCTION_DECL for the rest of this function -- push_access_scope
+ doesn't accept TEMPLATE_DECLs.  */
+  if (DECL_FUNCTION_TEMPLATE_P (fn))
+{
+  orig_fn = fn;
+  fn = DECL_TEMPLATE_RESULT (fn);
+}
+
   fntype = TREE_TYPE (fn);
   spec = TYPE_RAISES_EXCEPTIONS (fntype);
 
@@ -24204,37 +24215,41 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t complain)
  push_deferring_access_checks (dk_no_deferred);
  input_location = DECL_SOURCE_LOCATION (fn);
 
- /* A new stack interferes with pop_access_scope.  */
- {
-   /* Set up the list of local specializations.  */
-   local_specialization_stack lss (lss_copy);
-
-   tree save_ccp = current_class_ptr;
-   tree save_ccr = current_class_ref;
-   /* If needed, set current_class_ptr for the benefit of
-  tsubst_copy/PARM_DECL.  */
-   tree tdecl = DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (fn));
-   if (DECL_NONSTATIC_MEMBER_FUNCTION_P (tdecl))
- {
-   tree this_parm = DECL_ARGUMENTS (tdecl);
-   current_class_ptr = NULL_TREE;
-   current_class_ref = cp_build_fold_indirect_ref (this_parm);
-   current_class_ptr = this_parm;
- }
+ tree save_ccp = current_class_ptr;
+ tree save_ccr = current_class_ref;
+ /* If needed, set current_class_ptr for the benefit of
+tsubst_copy/PARM_DECL.  */
+ tree tdecl = DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (fn));
+ if (DECL_NONSTATIC_MEMBER_FUNCTION_P (tdecl))
+   {
+ tree this_parm = DECL_ARGUMENTS (tdecl);
+ current_class_ptr = NULL_TREE;
+ current_class_ref = 

PR libstdc++/89477 for Debug mode

2019-03-07 Thread François Dumont

Hi

    The PR 89477 fixes haven't been applied to the Debug mode. Here is 
the patch to fix the different deduction.cc tests.


    PR libstdc++/89477
    * include/debug/map.h (map): Use _RequireNotAllocator to constrain
    parameters in deduction guides.
    * include/debug/multimap.h (multimap): Likewise.
    * include/debug/set.h (set): Likewise.
    * include/debug/multiset.h (multiset): Likewise.
    * include/debug/unordered_map (unordered_map): Likewise.
    (unordered_multimap): Likewise.
    * include/debug/unordered_set (unordered_set): Likewise.
    (unordered_multiset): Likewise.

    Tested under Linux x86_64, ok to commit ?

François

diff --git a/libstdc++-v3/include/debug/map.h b/libstdc++-v3/include/debug/map.h
index 5063325cb97..80ca1bebbd2 100644
--- a/libstdc++-v3/include/debug/map.h
+++ b/libstdc++-v3/include/debug/map.h
@@ -704,6 +704,7 @@ namespace __debug
 	   typename _Compare = less<__iter_key_t<_InputIterator>>,
 	   typename _Allocator = allocator<__iter_to_alloc_t<_InputIterator>>,
 	   typename = _RequireInputIter<_InputIterator>,
+	   typename = _RequireNotAllocator<_Compare>,
 	   typename = _RequireAllocator<_Allocator>>
 map(_InputIterator, _InputIterator,
 	_Compare = _Compare(), _Allocator = _Allocator())
@@ -712,6 +713,7 @@ namespace __debug
 
   template,
 	   typename _Allocator = allocator>,
+	   typename = _RequireNotAllocator<_Compare>,
 	   typename = _RequireAllocator<_Allocator>>
 map(initializer_list>,
 	_Compare = _Compare(), _Allocator = _Allocator())
diff --git a/libstdc++-v3/include/debug/multimap.h b/libstdc++-v3/include/debug/multimap.h
index 38659aaba26..560aa7dda95 100644
--- a/libstdc++-v3/include/debug/multimap.h
+++ b/libstdc++-v3/include/debug/multimap.h
@@ -585,6 +585,7 @@ namespace __debug
 	   typename _Compare = less<__iter_key_t<_InputIterator>>,
 	   typename _Allocator = allocator<__iter_to_alloc_t<_InputIterator>>,
 	   typename = _RequireInputIter<_InputIterator>,
+	   typename = _RequireNotAllocator<_Compare>,
 	   typename = _RequireAllocator<_Allocator>>
 multimap(_InputIterator, _InputIterator,
 	 _Compare = _Compare(), _Allocator = _Allocator())
@@ -593,6 +594,7 @@ namespace __debug
 
   template,
 	   typename _Allocator = allocator>,
+	   typename = _RequireNotAllocator<_Compare>,
 	   typename = _RequireAllocator<_Allocator>>
 multimap(initializer_list>,
 	 _Compare = _Compare(), _Allocator = _Allocator())
diff --git a/libstdc++-v3/include/debug/multiset.h b/libstdc++-v3/include/debug/multiset.h
index 19dc8ea6a1e..8fb11f871ac 100644
--- a/libstdc++-v3/include/debug/multiset.h
+++ b/libstdc++-v3/include/debug/multiset.h
@@ -555,6 +555,7 @@ namespace __debug
 	   typename _Allocator =
 	 allocator::value_type>,
 	   typename = _RequireInputIter<_InputIterator>,
+	   typename = _RequireNotAllocator<_Compare>,
 	   typename = _RequireAllocator<_Allocator>>
 multiset(_InputIterator, _InputIterator,
 	 _Compare = _Compare(), _Allocator = _Allocator())
@@ -564,6 +565,7 @@ namespace __debug
   template,
 	   typename _Allocator = allocator<_Key>,
+	   typename = _RequireNotAllocator<_Compare>,
 	   typename = _RequireAllocator<_Allocator>>
 multiset(initializer_list<_Key>,
 	 _Compare = _Compare(), _Allocator = _Allocator())
diff --git a/libstdc++-v3/include/debug/set.h b/libstdc++-v3/include/debug/set.h
index 88b84905ba2..9f16a9190b8 100644
--- a/libstdc++-v3/include/debug/set.h
+++ b/libstdc++-v3/include/debug/set.h
@@ -567,6 +567,7 @@ namespace __debug
 	   typename _Allocator =
 	 allocator::value_type>,
 	   typename = _RequireInputIter<_InputIterator>,
+	   typename = _RequireNotAllocator<_Compare>,
 	   typename = _RequireAllocator<_Allocator>>
 set(_InputIterator, _InputIterator,
 	_Compare = _Compare(), _Allocator = _Allocator())
@@ -575,6 +576,7 @@ namespace __debug
 
   template,
 	   typename _Allocator = allocator<_Key>,
+	   typename = _RequireNotAllocator<_Compare>,
 	   typename = _RequireAllocator<_Allocator>>
 set(initializer_list<_Key>,
 	_Compare = _Compare(), _Allocator = _Allocator())
diff --git a/libstdc++-v3/include/debug/unordered_map b/libstdc++-v3/include/debug/unordered_map
index 163acd7f7ec..895d5cd227c 100644
--- a/libstdc++-v3/include/debug/unordered_map
+++ b/libstdc++-v3/include/debug/unordered_map
@@ -651,6 +651,8 @@ namespace __debug
 	   typename _Pred = equal_to<__iter_key_t<_InputIterator>>,
 	   typename _Allocator = allocator<__iter_to_alloc_t<_InputIterator>>,
 	   typename = _RequireInputIter<_InputIterator>,
+	   typename = _RequireNotAllocatorOrIntegral<_Hash>,
+	   typename = _RequireNotAllocator<_Pred>,
 	   typename = _RequireAllocator<_Allocator>>
 unordered_map(_InputIterator, _InputIterator,
 		  typename unordered_map::size_type = {},
@@ -662,6 +664,8 @@ namespace __debug
   template,
 	   typename _Pred = equal_to<_Key>,
 	   typename _Allocator = allocator>,
+	   typename = _RequireNotAllocatorOrIntegral<_Hash>,
+	   typename = 

Re: [C++ PATCH] Toplevel asm volatile (PR c++/89585)

2019-03-07 Thread Jakub Jelinek
On Thu, Mar 07, 2019 at 10:11:56PM +0100, Matthias Klose wrote:
> On 07.03.19 00:39, Jakub Jelinek wrote:
> > The following patch tries to improve diagnostics of toplevel asm qualifiers
> > in C++ by actually parsing them and complaining if they appear at toplevel,
> > instead of just emitting a parse error that ( is expected, e.g. some
> > versions of Qt do use toplevel asm volatile and apparently the Qt code is
> > copied into lots of various projects.
> > 
> > In addition to that, it mentions in the documentation that qualifiers are
> > not allowed at toplevel asm statements; apparently our documentation at
> > least from r220506 for GCC 5 says that at toplevel Basic Asm needs to be
> > used and for Basic Asm lists volatile qualifier as optional and its behavior
> > (that it is ignored for Basic Asm).  Makes me wonder if we don't want to
> > keep accepting/ignoring volatile at toplevel for both C and C++ instead of
> > rejecting it (and rejecting just the other qualifiers).  Thoughts on this?
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > Attached is an untested backport of this patch to 8.4, which does allow
> > asm volatile at toplevel, so that we don't break in 8.4 what has been
> > accepted in 8.2.  Ok if it passes bootstrap/regtest there?
> 
> isn't that required for the gcc-7 branch as well?

Yes.

> r267536 backported these patches to the 7 branch as well.

If you've tested it, feel free to commit it to 7.x.

Jakub
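
For readers following the thread, here is a minimal sketch of the two positions being discussed: basic asm at toplevel (namespace scope) and asm inside a function body. It assumes a GCC-compatible compiler; the function name asm_in_function is purely illustrative and the empty asm template is only there to keep the sketch target-independent.

```cpp
// Basic asm at toplevel: a plain string, no qualifiers, no operands.
// The empty template emits nothing (illustration only).
asm ("");

// Inside a function body the volatile qualifier is accepted.
// asm_in_function is a hypothetical name, not something from the patch.
int asm_in_function ()
{
  asm volatile ("");
  return 1;
}
```

The sketch deliberately leaves out the case under discussion (asm volatile at toplevel), since how that is diagnosed depends on the branch and on the patches in this thread.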


Re: [Bug libstdc++/89608] Undetected iterator invalidations on unordered containers in debug mode

2019-03-07 Thread François Dumont

Hi

    I looked at the implementation to decide whether to invalidate iterators 
or not. As nodes are not deallocated and only slightly impacted during the 
rehash process, I considered that they shouldn't be invalidated, apart from 
the local iterators. I should have just considered the Standard.


    Here is the complete patch, which is Max Sistemich's proposal 
extended to the other unordered containers, with the test adapted 
for the testsuite.


PR libstdc++/89608
* include/debug/unordered_map (unordered_map<>::_M_check_rehashed):
  Invalidate all iterators in case of rehash.
  (unordered_multimap<>::_M_check_rehashed): Likewise.
* include/debug/unordered_set (unordered_set<>::_M_check_rehashed):
   Likewise.
   (unordered_multiset<>::_M_check_rehashed): Likewise.

* testsuite/23_containers/unordered_set/debug/89608_neg.cc: New.

    I have run all unordered tests so far in Debug mode. I have 4 unrelated 
failures that I'll fix through another patch. OK to commit once all 
tests have run?


François

diff --git a/libstdc++-v3/include/debug/unordered_map b/libstdc++-v3/include/debug/unordered_map
index 7b0ea6d768b..163acd7f7ec 100644
--- a/libstdc++-v3/include/debug/unordered_map
+++ b/libstdc++-v3/include/debug/unordered_map
@@ -611,7 +611,7 @@ namespace __debug
   _M_check_rehashed(size_type __prev_count)
   {
 	if (__prev_count != this->bucket_count())
-	  this->_M_invalidate_locals();
+	  this->_M_invalidate_all();
   }
 
   void
@@ -1210,7 +1210,7 @@ namespace __debug
   _M_check_rehashed(size_type __prev_count)
   {
 	if (__prev_count != this->bucket_count())
-	  this->_M_invalidate_locals();
+	  this->_M_invalidate_all();
   }
 
   void
diff --git a/libstdc++-v3/include/debug/unordered_set b/libstdc++-v3/include/debug/unordered_set
index 8f44bd92f2a..22e0abd3283 100644
--- a/libstdc++-v3/include/debug/unordered_set
+++ b/libstdc++-v3/include/debug/unordered_set
@@ -496,7 +496,7 @@ namespace __debug
   _M_check_rehashed(size_type __prev_count)
   {
 	if (__prev_count != this->bucket_count())
-	  this->_M_invalidate_locals();
+	  this->_M_invalidate_all();
   }
 
   void
@@ -1050,7 +1050,7 @@ namespace __debug
   _M_check_rehashed(size_type __prev_count)
   {
 	if (__prev_count != this->bucket_count())
-	  this->_M_invalidate_locals();
+	  this->_M_invalidate_all();
   }
 
   void
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/debug/89608_neg.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/debug/89608_neg.cc
new file mode 100644
index 000..871b1c3381d
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/debug/89608_neg.cc
@@ -0,0 +1,37 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+//
+// { dg-do run { target c++11 xfail *-*-* } }
+// { dg-require-debug-mode "" }
+
+// PR libstdc++/89608
+
+#include <unordered_set>
+
+int main()
+{
+  std::unordered_set<int> myset;
+  myset.reserve(2);
+  myset.insert(0);
+  myset.insert(1);
+
+  int i = 2;
+  for (auto it = myset.begin(), end = myset.end(); it != end; ++it)
+    myset.insert(i++);
+
+  return 0;
+}
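
The check patched above compares the bucket count before and after an operation and treats any change as a rehash. A small sketch of the same comparison against the ordinary (non-debug) container follows; the helper names insert_causes_rehash and count_rehashes are made up for illustration and are not part of libstdc++.

```cpp
#include <cstddef>
#include <unordered_set>

// Inserts value and reports whether the bucket count changed, mirroring the
// "__prev_count != this->bucket_count()" comparison in _M_check_rehashed.
// (Hypothetical helper, for illustration only.)
bool insert_causes_rehash (std::unordered_set<int>& s, int value)
{
  const std::size_t prev_count = s.bucket_count ();
  s.insert (value);
  return prev_count != s.bucket_count ();
}

// Counts how many of the insertions 0..n-1 into s triggered a rehash.
std::size_t count_rehashes (std::unordered_set<int>& s, int n)
{
  std::size_t rehashes = 0;
  for (int i = 0; i < n; ++i)
    if (insert_causes_rehash (s, i))
      ++rehashes;
  return rehashes;
}
```

Any insertion for which insert_causes_rehash returns true is one after which, per the Standard, previously obtained iterators must not be used; that is the situation the new 89608_neg.cc test provokes by inserting while iterating.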


New Swedish PO file for 'cpplib' (version 9.1-b20190203)

2019-03-07 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/cpplib/sv.po

(This file, 'cpplib-9.1-b20190203.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Contents of PO file 'cpplib-9.1-b20190203.sv.po'

2019-03-07 Thread Translation Project Robot


cpplib-9.1-b20190203.sv.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.



Re: [PATCH] Diagnostic fixes for config/i386/i386.c (PR target/80003)

2019-03-07 Thread Uros Bizjak
On Thu, Mar 7, 2019 at 8:45 PM Jakub Jelinek  wrote:
>
> Hi!
>
> The following patch fixes a couple of issues in i386.c diagnostics.
>
> In particular, diagnostics emitted to the user (not in dump files)
> should not start with a capital letter unless it is something that
> starts with capital letter in a middle of a sentence, and should not end
> with a period.  The ix86_handle_interrupt_attribute hunk is also something
> the translators requested, the wording has been composed in a weird way
> from portions of a keyword in the message and portions from (untranslated)
> strings, the change surrounds it with quotes and makes the whole type name
> come from untranslated string literal.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2019-03-07  Jakub Jelinek  
>
> PR target/80003
> * config/i386/i386.c (ix86_set_func_type): Make sure diagnostics
> doesn't start with a capital letter and doesn't end with a dot.
> (ix86_function_arg_boundary): Make sure diagnostics doesn't start
> with a capital letter.
> (ix86_mangle_function_version_assembler_name): Likewise.
> (ix86_generate_version_dispatcher_body): Likewise.
> (fold_builtin_cpu): Likewise.
> (get_builtin_code_for_version): Likewise.  Remove extraneous space.
> (ix86_handle_interrupt_attribute): Make the diagnostics easier for
> translators, wrap full type name in %qs.
>
> * gcc.target/i386/pr68657.c: Adjust expected diagnostics wording.
> * gcc.target/i386/interrupt-6.c: Likewise.
> * g++.target/i386/pr57362.C: Adjust capitalization in dg-prune-output.

LGTM.

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2019-02-28 21:48:51.662326676 +0100
> +++ gcc/config/i386/i386.c  2019-03-07 17:22:43.787842967 +0100
> @@ -5800,8 +5800,8 @@ ix86_set_func_type (tree fndecl)
>
>   /* Only dwarf2out.c can handle -WORD(AP) as a pointer argument.  */
>   if (write_symbols != NO_DEBUG && write_symbols != DWARF2_DEBUG)
> -   sorry ("Only DWARF debug format is supported for interrupt "
> -  "service routine.");
> +   sorry ("only DWARF debug format is supported for interrupt "
> +  "service routine");
> }
>else
> {
> @@ -9069,7 +9069,7 @@ ix86_function_arg_boundary (machine_mode
> {
>   warned = true;
>   inform (input_location,
> - "The ABI for passing parameters with %d-byte"
> + "the ABI for passing parameters with %d-byte"
>   " alignment has changed in GCC 4.6",
>   align / BITS_PER_UNIT);
> }
> @@ -32116,7 +32116,7 @@ get_builtin_code_for_version (tree decl,
>if (predicate_list && arg_str == NULL)
> {
>   error_at (DECL_SOURCE_LOCATION (decl),
> -   "No dispatcher found for the versioning attributes");
> +   "no dispatcher found for the versioning attributes");
>   return 0;
> }
>
> @@ -32166,7 +32166,7 @@ get_builtin_code_for_version (tree decl,
>if (predicate_list && i == NUM_FEATURES)
> {
>   error_at (DECL_SOURCE_LOCATION (decl),
> -   "No dispatcher found for %s", token);
> +   "no dispatcher found for %s", token);
>   return 0;
> }
>token = strtok (NULL, ",");
> @@ -32176,7 +32176,7 @@ get_builtin_code_for_version (tree decl,
>if (predicate_list && predicate_chain == NULL_TREE)
>  {
>error_at (DECL_SOURCE_LOCATION (decl),
> -   "No dispatcher found for the versioning attributes : %s",
> +   "no dispatcher found for the versioning attributes: %s",
> attrs_str);
>return 0;
>  }
> @@ -32338,12 +32338,12 @@ ix86_mangle_function_version_assembler_n
>&& lookup_attribute ("gnu_inline",
>DECL_ATTRIBUTES (decl)))
>  error_at (DECL_SOURCE_LOCATION (decl),
> - "Function versions cannot be marked as gnu_inline,"
> + "function versions cannot be marked as gnu_inline,"
>   " bodies have to be generated");
>
>if (DECL_VIRTUAL_P (decl)
>|| DECL_VINDEX (decl))
> -sorry ("Virtual function multiversioning not supported");
> +sorry ("virtual function multiversioning not supported");
>
>version_attr = lookup_attribute ("target", DECL_ATTRIBUTES (decl));
>
> @@ -32619,7 +32619,7 @@ ix86_generate_version_dispatcher_body (v
>  virtual methods in base classes but are not explicitly marked as
>  virtual.  */
>if (DECL_VINDEX (versn->decl))
> -   sorry ("Virtual function multiversioning not supported");
> +   sorry ("virtual function multiversioning not supported");
>
>fn_ver_vec.safe_push (versn->decl);
>  }
> @@ -32898,7 +32898,7 @@ fold_builtin_cpu (tree fndecl, tree *arg
>  STRING_CST.   */
>  

Re: [C++ PATCH] Toplevel asm volatile (PR c++/89585)

2019-03-07 Thread Matthias Klose
On 07.03.19 00:39, Jakub Jelinek wrote:
> Hi!
> 
> The following patch tries to improve diagnostics of toplevel asm qualifiers
> in C++ by actually parsing them and complaining if they appear at toplevel,
> instead of just emitting a parse error that ( is expected, e.g. some
> versions of Qt do use toplevel asm volatile and apparently the Qt code is
> copied into lots of various projects.
> 
> In addition to that, it mentions in the documentation that qualifiers are
> not allowed at toplevel asm statements; apparently our documentation at
> least from r220506 for GCC 5 says that at toplevel Basic Asm needs to be
> used and for Basic Asm lists volatile qualifier as optional and its behavior
> (that it is ignored for Basic Asm).  Makes me wonder if we don't want to
> keep accepting/ignoring volatile at toplevel for both C and C++ instead of
> rejecting it (and rejecting just the other qualifiers).  Thoughts on this?
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Attached is an untested backport of this patch to 8.4, which does allow
> asm volatile at toplevel, so that we don't break in 8.4 what has been
> accepted in 8.2.  Ok if it passes bootstrap/regtest there?

isn't that required for the gcc-7 branch as well? r267536 backported these
patches to the 7 branch as well.

Matthias


Re: [PATCH] Fix some cases of "whatever " on one line and " something" on the next one in diagnostics (PR other/80058)

2019-03-07 Thread Steve Kargl
On Thu, Mar 07, 2019 at 08:40:37PM +0100, Jakub Jelinek wrote:
> 
> 2019-03-07  Jakub Jelinek  
> 
>   PR other/80058
>   * lra-constraints.c (process_alt_operands): Avoid one space before
>   " at the end of line and another after " on another line in a string
>   literal.
>   * attribs.c (handle_dll_attribute): Likewise.
>   * config/avr/avr-devices.c (avr_texinfo): Likewise.
> cp/
>   * parser.c (cp_parser_template_declaration_after_parameters): Avoid
>   one space before " at the end of line and another after " on another
>   line in a string literal.
> fortran/
>   * arith.c (gfc_complex2complex): Avoid two spaces in the middle of
>   diagnostics.
>   * resolve.c (resolve_allocate_expr): Likewise.
> 

Fortran changes are OK.  I suspect that this falls under the
obviously correct category.

-- 
Steve


[PATCH] Fix a config/s390/s390.c diagnostics bug (PR target/79846)

2019-03-07 Thread Jakub Jelinek
Hi!

As mentioned in the PR, using HOST_WIDE_INT_PRINT_* in the middle of
a translatable message is highly undesirable; we end up with:
#: config/s390/s390.c:737
#, gcc-internal-format
msgid "constant argument %d for builtin %qF is out of range (0.."
msgstr ""

#: config/s390/s390.c:754
#, gcc-internal-format
msgid "constant argument %d for builtin %qF is out of range ("
msgstr ""
in gcc.pot that way and nothing is translated.

The following patch should fix that by using proper %wu/%wd.
Tested by building a cross-compiler to s390x-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR target/79846
* config/s390/s390.c (s390_const_operand_ok): Use %wu instead of
HOST_WIDE_INT_PRINT_UNSIGNED and %wd instead of
HOST_WIDE_INT_PRINT_DEC.  Formatting fixes.

--- gcc/config/s390/s390.c.jj   2019-02-18 20:48:32.873728534 +0100
+++ gcc/config/s390/s390.c  2019-03-07 18:13:44.757949114 +0100
@@ -734,10 +734,9 @@ s390_const_operand_ok (tree arg, int arg
   if (!tree_fits_uhwi_p (arg)
  || tree_to_uhwi (arg) > (HOST_WIDE_INT_1U << bitwidth) - 1)
{
- error("constant argument %d for builtin %qF is out of range (0.."
-   HOST_WIDE_INT_PRINT_UNSIGNED ")",
-   argnum, decl,
-   (HOST_WIDE_INT_1U << bitwidth) - 1);
+ error ("constant argument %d for builtin %qF is out of range "
+"(0..%wu)", argnum, decl,
+(HOST_WIDE_INT_1U << bitwidth) - 1);
  return false;
}
 }
@@ -751,12 +750,10 @@ s390_const_operand_ok (tree arg, int arg
  || tree_to_shwi (arg) < -(HOST_WIDE_INT_1 << (bitwidth - 1))
  || tree_to_shwi (arg) > ((HOST_WIDE_INT_1 << (bitwidth - 1)) - 1))
{
- error("constant argument %d for builtin %qF is out of range ("
-   HOST_WIDE_INT_PRINT_DEC ".."
-   HOST_WIDE_INT_PRINT_DEC ")",
-   argnum, decl,
-   -(HOST_WIDE_INT_1 << (bitwidth - 1)),
-   (HOST_WIDE_INT_1 << (bitwidth - 1)) - 1);
+ error ("constant argument %d for builtin %qF is out of range "
+"(%wd..%wd)", argnum, decl,
+-(HOST_WIDE_INT_1 << (bitwidth - 1)),
+(HOST_WIDE_INT_1 << (bitwidth - 1)) - 1);
  return false;
}
 }

Jakub
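
As an illustration of the point about keeping the whole message in one literal, here is a sketch outside of GCC. It uses std::snprintf with %llu as a stand-in for GCC's own %wu directive (which only GCC's diagnostic machinery understands); the function name out_of_range_message and the shortened wording are assumptions made for the example.

```cpp
#include <cstdio>
#include <string>

// Formats an "out of range" message for an unsigned field of bitwidth bits.
// Assumes 0 < bitwidth < 64.  The format is a single string literal with the
// conversion inside it, so a message extractor sees the complete sentence
// instead of a fragment that stops at a macro.
std::string out_of_range_message (int argnum, unsigned bitwidth)
{
  const unsigned long long max = (1ULL << bitwidth) - 1;
  char buf[128];
  std::snprintf (buf, sizeof buf,
                 "constant argument %d is out of range (0..%llu)",
                 argnum, max);
  return buf;
}
```

For example, out_of_range_message (2, 4) yields "constant argument 2 is out of range (0..15)".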


[committed] Improve diagnostics for OpenMP doacross loops (PR translation/79999)

2019-03-07 Thread Jakub Jelinek
Hi!

The following patch clarifies the diagnostics for OpenMP doacross
and adds a testcase to cover warnings that weren't covered in the testsuite
previously.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR translation/79999
* gimplify.c (gimplify_omp_ordered): Reword diagnostics to talk about
depend clause with source (or sink) modifier.
* omp-expand.c (expand_omp_ordered_sink): Likewise.

* c-c++-common/gomp/doacross-1.c: Adjust expected diagnostics.
* c-c++-common/gomp/doacross-3.c: New test.

--- gcc/gimplify.c.jj   2019-03-06 19:45:40.822744176 +0100
+++ gcc/gimplify.c  2019-03-07 17:40:39.958301852 +0100
@@ -12145,8 +12145,8 @@ gimplify_omp_ordered (tree expr, gimple_
if (!fail && i != gimplify_omp_ctxp->loop_iter_var.length () / 2)
  {
error_at (OMP_CLAUSE_LOCATION (c),
- "number of variables in %<depend(sink)%> "
- "clause does not match number of "
+ "number of variables in %<depend%> clause with "
+ "%<sink%> modifier does not match number of "
  "iteration variables");
failures++;
  }
@@ -12158,8 +12158,8 @@ gimplify_omp_ordered (tree expr, gimple_
if (source_c)
  {
error_at (OMP_CLAUSE_LOCATION (c),
- "more than one %<depend(source)%> clause on an "
- "%<ordered%> construct");
+ "more than one %<depend%> clause with %<source%> "
+ "modifier on an %<ordered%> construct");
failures++;
  }
else
@@ -12169,8 +12169,9 @@ gimplify_omp_ordered (tree expr, gimple_
   if (source_c && sink_c)
 {
   error_at (OMP_CLAUSE_LOCATION (source_c),
-   "%<depend(source)%> clause specified together with "
-   "%<depend(sink:)%> clauses on the same construct");
+   "%<depend%> clause with %<source%> modifier specified "
+   "together with %<depend%> clauses with %<sink%> modifier "
+   "on the same construct");
   failures++;
 }
 
--- gcc/omp-expand.c.jj 2019-01-01 12:37:16.673982892 +0100
+++ gcc/omp-expand.c    2019-03-07 17:42:04.906917520 +0100
@@ -2147,8 +2147,8 @@ expand_omp_ordered_sink (gimple_stmt_ite
  forward = tree_int_cst_sgn (step) != -1;
}
  if (forward ^ OMP_CLAUSE_DEPEND_SINK_NEGATIVE (deps))
-   warning_at (loc, 0, "%<depend(sink)%> clause waiting for "
-   "lexically later iteration");
+   warning_at (loc, 0, "%<depend%> clause with %<sink%> modifier "
+   "waiting for lexically later iteration");
  break;
}
   deps = TREE_CHAIN (deps);
@@ -2284,8 +2284,9 @@ expand_omp_ordered_sink (gimple_stmt_ite
   build_int_cst (itype, 0));
  if (integer_zerop (t) && !warned_step)
{
- warning_at (loc, 0, "%<depend(sink)%> refers to iteration never "
- "in the iteration space");
+ warning_at (loc, 0, "%<depend%> clause with %<sink%> modifier "
+ "refers to iteration never in the iteration "
+ "space");
  warned_step = true;
}
  cond = fold_build2_loc (loc, BIT_AND_EXPR, boolean_type_node,
--- gcc/testsuite/c-c++-common/gomp/doacross-1.c.jj 2015-11-06 22:20:16.181539042 +0100
+++ gcc/testsuite/c-c++-common/gomp/doacross-1.c    2019-03-07 17:49:42.077467424 +0100
@@ -38,11 +38,11 @@ foo (void)
   for (i = 0; i < 64; i++)
 {
   #pragma omp ordered depend (sink: i - 1) depend (sink: i - 2)
-  #pragma omp ordered depend (source) depend (source) /* { dg-error "more than one .depend.source.. clause on an" } */
+  #pragma omp ordered depend (source) depend (source) /* { dg-error "more than one .depend. clause with .source. modifier on an .ordered. construct" } */
 }
   #pragma omp for ordered (1)
   for (i = 0; i < 64; i++)
 {
-  #pragma omp ordered depend (sink: i - 1) depend (source) depend (sink: i - 2) /* { dg-error "clause specified together with" } */
+  #pragma omp ordered depend (sink: i - 1) depend (source) depend (sink: i - 2) /* { dg-error ".depend. clause with .source. modifier specified together with .depend. clauses with .sink. modifier on the same construct" } */
 }
 }
--- gcc/testsuite/c-c++-common/gomp/doacross-3.c.jj 2019-03-07 17:53:06.503136095 +0100
+++ gcc/testsuite/c-c++-common/gomp/doacross-3.c    2019-03-07 18:01:54.992521307 +0100
@@ -0,0 +1,54 @@
+/* { dg-do compile } */
+/* { dg-options "-fopenmp" } */
+
+void
+foo (void)
+{
+  int i, j;
+  #pragma omp for ordered (1)
+  for (i = 0; i < 64; i++)
+{
+  #pragma omp ordered depend (sink: i + 1) /* { dg-warning "'depend' clause with 'sink' modifier waiting for lexically later iteration" } */
+  #pragma omp ordered depend 
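
For context on what the reworded diagnostics refer to, here is a minimal sketch of a well-formed doacross loop: each iteration waits on the previous one through a depend clause with sink modifier and then signals completion through a depend clause with source modifier. The function prefix_sums is an invented example, not taken from the testsuite; without -fopenmp the pragmas are ignored and the loop simply runs serially with the same result.

```cpp
#include <vector>

// Computes running sums with a doacross loop (hypothetical example).
// Iteration i waits until iteration i - 1 has executed its
// "ordered depend(source)" before reading out[i - 1].
std::vector<long> prefix_sums (const std::vector<long>& in)
{
  std::vector<long> out (in);
  const int n = static_cast<int> (out.size ());
  #pragma omp parallel for ordered(1)
  for (int i = 1; i < n; i++)
    {
      #pragma omp ordered depend(sink: i - 1)
      out[i] += out[i - 1];
      #pragma omp ordered depend(source)
    }
  return out;
}
```

Writing depend(sink: i + 1) instead would make each iteration wait for a lexically later one, which is the case the first warning in the new doacross-3.c test covers.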

[PATCH] Diagnostic fixes for config/i386/i386.c (PR target/80003)

2019-03-07 Thread Jakub Jelinek
Hi!

The following patch fixes a couple of issues in i386.c diagnostics.

In particular, diagnostics emitted to the user (not in dump files)
should not start with a capital letter unless the first word is one that
is capitalized even in the middle of a sentence, and should not end
with a period.  The ix86_handle_interrupt_attribute hunk is also something
the translators requested: the wording had been composed in a weird way
from portions of a keyword in the message and portions from (untranslated)
strings; the change surrounds it with quotes and makes the whole type name
come from an untranslated string literal.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR target/80003
* config/i386/i386.c (ix86_set_func_type): Make sure diagnostics
doesn't start with a capital letter and doesn't end with a dot.
(ix86_function_arg_boundary): Make sure diagnostics doesn't start
with a capital letter.
(ix86_mangle_function_version_assembler_name): Likewise.
(ix86_generate_version_dispatcher_body): Likewise.
(fold_builtin_cpu): Likewise.
(get_builtin_code_for_version): Likewise.  Remove extraneous space.
(ix86_handle_interrupt_attribute): Make the diagnostics easier for
translators, wrap full type name in %qs.

* gcc.target/i386/pr68657.c: Adjust expected diagnostics wording.
* gcc.target/i386/interrupt-6.c: Likewise.
* g++.target/i386/pr57362.C: Adjust capitalization in dg-prune-output.

--- gcc/config/i386/i386.c.jj   2019-02-28 21:48:51.662326676 +0100
+++ gcc/config/i386/i386.c  2019-03-07 17:22:43.787842967 +0100
@@ -5800,8 +5800,8 @@ ix86_set_func_type (tree fndecl)
 
  /* Only dwarf2out.c can handle -WORD(AP) as a pointer argument.  */
  if (write_symbols != NO_DEBUG && write_symbols != DWARF2_DEBUG)
-   sorry ("Only DWARF debug format is supported for interrupt "
-  "service routine.");
+   sorry ("only DWARF debug format is supported for interrupt "
+  "service routine");
}
   else
{
@@ -9069,7 +9069,7 @@ ix86_function_arg_boundary (machine_mode
{
  warned = true;
  inform (input_location,
- "The ABI for passing parameters with %d-byte"
+ "the ABI for passing parameters with %d-byte"
  " alignment has changed in GCC 4.6",
  align / BITS_PER_UNIT);
}
@@ -32116,7 +32116,7 @@ get_builtin_code_for_version (tree decl,
   if (predicate_list && arg_str == NULL)
{
  error_at (DECL_SOURCE_LOCATION (decl),
-   "No dispatcher found for the versioning attributes");
+   "no dispatcher found for the versioning attributes");
  return 0;
}
 
@@ -32166,7 +32166,7 @@ get_builtin_code_for_version (tree decl,
   if (predicate_list && i == NUM_FEATURES)
{
  error_at (DECL_SOURCE_LOCATION (decl),
-   "No dispatcher found for %s", token);
+   "no dispatcher found for %s", token);
  return 0;
}
   token = strtok (NULL, ",");
@@ -32176,7 +32176,7 @@ get_builtin_code_for_version (tree decl,
   if (predicate_list && predicate_chain == NULL_TREE)
 {
   error_at (DECL_SOURCE_LOCATION (decl),
-   "No dispatcher found for the versioning attributes : %s",
+   "no dispatcher found for the versioning attributes: %s",
attrs_str);
   return 0;
 }
@@ -32338,12 +32338,12 @@ ix86_mangle_function_version_assembler_n
   && lookup_attribute ("gnu_inline",
   DECL_ATTRIBUTES (decl)))
 error_at (DECL_SOURCE_LOCATION (decl),
- "Function versions cannot be marked as gnu_inline,"
+ "function versions cannot be marked as gnu_inline,"
  " bodies have to be generated");
 
   if (DECL_VIRTUAL_P (decl)
   || DECL_VINDEX (decl))
-sorry ("Virtual function multiversioning not supported");
+sorry ("virtual function multiversioning not supported");
 
   version_attr = lookup_attribute ("target", DECL_ATTRIBUTES (decl));
 
@@ -32619,7 +32619,7 @@ ix86_generate_version_dispatcher_body (v
 virtual methods in base classes but are not explicitly marked as
 virtual.  */
   if (DECL_VINDEX (versn->decl))
-   sorry ("Virtual function multiversioning not supported");
+   sorry ("virtual function multiversioning not supported");
 
   fn_ver_vec.safe_push (versn->decl);
 }
@@ -32898,7 +32898,7 @@ fold_builtin_cpu (tree fndecl, tree *arg
 STRING_CST.   */
   if (!EXPR_P (param_string_cst))
{
- error ("Parameter to builtin must be a string constant or literal");
+ error ("parameter to builtin must be a string constant or literal");
  return integer_zero_node;
}
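
The two conventions applied by this patch (no leading capital, no trailing period) can be approximated by a small check. This is only a heuristic sketch with an invented name, follows_diagnostic_style; it is not a check that exists in GCC. It treats "uppercase letter followed by lowercase letter" as an ordinary capitalized word, so that messages starting with an acronym such as DWARF or ABI are not flagged.

```cpp
#include <cctype>
#include <string>

// Heuristic: returns false if msg ends with a period or starts with an
// ordinary capitalized word (uppercase letter followed by a lowercase one).
// Hypothetical helper for illustration; not part of GCC.
bool follows_diagnostic_style (const std::string& msg)
{
  if (msg.empty ())
    return true;
  if (msg.back () == '.')
    return false;
  if (msg.size () >= 2
      && std::isupper (static_cast<unsigned char> (msg[0]))
      && std::islower (static_cast<unsigned char> (msg[1])))
    return false;
  return true;
}
```

With this heuristic the old "No dispatcher found for %s" is rejected and the new "no dispatcher found for %s" is accepted.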
   

[PATCH] Fix some cases of "whatever " on one line and " something" on the next one in diagnostics (PR other/80058)

2019-03-07 Thread Jakub Jelinek
Hi!

The following patch fixes a couple of cases where two or more spaces
are introduced in the middle of a diagnostic message because we've split
a source line and left a space both at the end of one line and at the
start of the next one.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2019-03-07  Jakub Jelinek  

PR other/80058
* lra-constraints.c (process_alt_operands): Avoid one space before
" at the end of line and another after " on another line in a string
literal.
* attribs.c (handle_dll_attribute): Likewise.
* config/avr/avr-devices.c (avr_texinfo): Likewise.
cp/
* parser.c (cp_parser_template_declaration_after_parameters): Avoid
one space before " at the end of line and another after " on another
line in a string literal.
fortran/
* arith.c (gfc_complex2complex): Avoid two spaces in the middle of
diagnostics.
* resolve.c (resolve_allocate_expr): Likewise.

--- gcc/lra-constraints.c.jj    2019-02-20 22:14:42.288643681 +0100
+++ gcc/lra-constraints.c   2019-03-07 16:58:59.121039574 +0100
@@ -2681,7 +2681,7 @@ process_alt_operands (int only_alternati
  if (lra_dump_file != NULL)
fprintf (lra_dump_file,
 "alt=%d: reload pseudo for op %d "
-" cannot hold the mode value -- refuse\n",
+"cannot hold the mode value -- refuse\n",
 nalt, nop);
  goto fail;
}
--- gcc/attribs.c.jj    2019-03-05 14:38:14.447414660 +0100
+++ gcc/attribs.c   2019-03-07 16:57:36.600383417 +0100
@@ -1664,7 +1664,7 @@ handle_dll_attribute (tree * pnode, tree
  && DECL_DECLARED_INLINE_P (node))
{
  warning (OPT_Wattributes, "inline function %q+D declared as "
- " dllimport: attribute ignored", node);
+ "dllimport: attribute ignored", node);
  *no_add_attrs = true;
}
   /* Like MS, treat definition of dllimported variables and
--- gcc/config/avr/avr-devices.c.jj 2019-01-01 12:37:28.987780853 +0100
+++ gcc/config/avr/avr-devices.c    2019-03-07 17:07:31.992688170 +0100
@@ -76,7 +76,7 @@ avr_texinfo[] =
 "the @code{MOVW} instruction." },
   { ARCH_AVR3,
 "``Classic'' devices with 16@tie{}KiB up to 64@tie{}KiB of "
-" program memory." },
+"program memory." },
   { ARCH_AVR31,
 "``Classic'' devices with 128@tie{}KiB of program memory." },
   { ARCH_AVR35,
--- gcc/cp/parser.c.jj  2019-03-07 10:06:49.0 +0100
+++ gcc/cp/parser.c 2019-03-07 17:02:12.514890165 +0100
@@ -27865,7 +27865,7 @@ cp_parser_template_declaration_after_par
  if (cxx_dialect > cxx17)
error ("literal operator template %qD has invalid parameter list;"
   "  Expected non-type template parameter pack  "
-  "  or single non-type parameter of class type",
+  "or single non-type parameter of class type",
   decl);
  else
error ("literal operator template %qD has invalid parameter list."
--- gcc/fortran/arith.c.jj  2019-02-25 10:12:55.454061762 +0100
+++ gcc/fortran/arith.c 2019-03-07 17:06:09.267034995 +0100
@@ -2472,7 +2472,7 @@ gfc_complex2complex (gfc_expr *src, int
   int w = warn_conversion ? OPT_Wconversion : OPT_Wconversion_extra;
 
   gfc_warning_now (w, "Change of value in conversion from "
-  " %qs to %qs at %L",
+  "%qs to %qs at %L",
   gfc_typename (&src->ts), gfc_typename (&result->ts),
   &src->where);
   did_warn = true;
--- gcc/fortran/resolve.c.jj    2019-03-04 10:22:33.985168769 +0100
+++ gcc/fortran/resolve.c   2019-03-07 17:06:38.781554480 +0100
@@ -7798,7 +7798,7 @@ resolve_allocate_expr (gfc_expr *e, gfc_
if (mpz_cmp_si (ar->start[i]->value.integer, 1) < 0)
  {
gfc_error ("Upper cobound is less than lower cobound "
-  " of 1 at %L", &ar->start[i]->where);
+  "of 1 at %L", &ar->start[i]->where);
goto failure;
  }
  }

Jakub
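
The underlying mechanism is ordinary string literal concatenation, which a short sketch makes visible. The message text is shortened from the attribs.c hunk (the %q+D directive is dropped), and the names before_fix, after_fix and has_double_space are invented for the example.

```cpp
#include <string>

// Adjacent string literals are concatenated by the compiler, so a space at
// the end of one piece plus a space at the start of the next gives two.
const char before_fix[] = "inline function declared as "
                          " dllimport: attribute ignored";
const char after_fix[]  = "inline function declared as "
                          "dllimport: attribute ignored";

// Returns true if s contains two consecutive spaces.
bool has_double_space (const std::string& s)
{
  return s.find ("  ") != std::string::npos;
}
```

has_double_space (before_fix) is true and has_double_space (after_fix) is false, which is exactly the difference each hunk of the patch makes.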


[PATCH] Fix translation issue in config/darwin.c (PR target/80190)

2019-03-07 Thread Jakub Jelinek
Hi!

In this PR, the translators complained that this diagnostic is composed of
two parts, one that can be translated and one that can't; while ASCII
and NUL probably don't need translation, "character", "embedded" and "non" do.

The following patch just makes it two complete diagnostic messages that
translators can translate as they wish.

Tested with cross to x86_64-darwin, cc1 still builds.

Ok for trunk?

2019-03-07  Jakub Jelinek  

PR target/80190
* config/darwin.c: Include intl.h.
(darwin_build_constant_cfstring): Improve i18n of diagnostics by not
composing the message out of two separate parts.

--- gcc/config/darwin.c.jj  2019-01-01 12:37:22.233891667 +0100
+++ gcc/config/darwin.c 2019-03-07 16:46:56.983799698 +0100
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.
 #include "langhooks.h"
 #include "toplev.h"
 #include "lto-section-names.h"
+#include "intl.h"
 
 /* Darwin supports a feature called fix-and-continue, which is used
for rapid turn around debugging.  When code is compiled with the
@@ -3565,8 +3566,9 @@ darwin_build_constant_cfstring (tree str
  for (l = 0; l < length; l++)
if (!s[l] || !isascii (s[l]))
  {
-   warning (darwin_warn_nonportable_cfstrings, "%s in CFString literal",
-s[l] ? "non-ASCII character" : "embedded NUL");
+   warning (darwin_warn_nonportable_cfstrings,
+s[l] ? G_("non-ASCII character in CFString literal")
+ : G_("embedded NUL in CFString literal"));
break;
  }
}

Jakub
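
A standalone sketch of the pattern used in the hunk above: both alternatives of the conditional are complete sentences, each marked with G_() so that it can be extracted for translation. The G_ definition below is a local stand-in that just expands to its argument, and cfstring_warning is an invented helper that returns the message instead of calling warning; the test c > 127 is used in place of isascii to stay within standard C++.

```cpp
#include <cstddef>

// Local stand-in for the G_ marker: expands to its argument unchanged.
#define G_(msgid) msgid

// Returns the complete message for the first non-portable character in
// s[0..length), or nullptr if there is none.  Hypothetical helper.
const char *cfstring_warning (const char *s, std::size_t length)
{
  for (std::size_t l = 0; l < length; l++)
    {
      const unsigned char c = static_cast<unsigned char> (s[l]);
      if (c == 0 || c > 127)
        return c ? G_("non-ASCII character in CFString literal")
                 : G_("embedded NUL in CFString literal");
    }
  return nullptr;
}
```

Because neither message is assembled from fragments at run time, a translator is free to reorder the words of each sentence independently.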


[PATCH] Fix up diagnostics in gimple-ssa-warn-alloca.c

2019-03-07 Thread Jakub Jelinek
Hi!

When looking at the diagnostics PRs, I've noticed that several diagnostic calls
in gimple-ssa-warn-alloca.c use G_(...) needlessly; it is only needed if the
argument is not a string literal.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
verified the messages are unmodified in gcc.pot, ok for trunk?

2019-03-07  Jakub Jelinek  

* gimple-ssa-warn-alloca.c (pass_walloca::execute): Don't wrap
warning_at or inform messages in G_() if there is no ?:.

--- gcc/gimple-ssa-warn-alloca.c.jj 2019-01-01 12:37:18.193957952 +0100
+++ gcc/gimple-ssa-warn-alloca.c    2019-03-07 16:43:30.308166042 +0100
@@ -528,7 +528,7 @@ pass_walloca::execute (function *fun)
}
  else if (warn_alloca)
{
- warning_at (loc, OPT_Walloca, G_("use of %<alloca%>"));
+ warning_at (loc, OPT_Walloca, "use of %<alloca%>");
  continue;
}
  else if (warn_alloca_limit < 0)
@@ -571,8 +571,8 @@ pass_walloca::execute (function *fun)
&& t.limit != 0)
  {
print_decu (t.limit, buff);
-   inform (loc, G_("limit is %wu bytes, but argument "
-   "may be as large as %s"),
+   inform (loc, "limit is %wu bytes, but argument "
+"may be as large as %s",
is_vla ? warn_vla_limit : adjusted_alloca_limit,
buff);
  }
@@ -588,7 +588,7 @@ pass_walloca::execute (function *fun)
&& t.limit != 0)
  {
print_decu (t.limit, buff);
-   inform (loc, G_("limit is %wu bytes, but argument is %s"),
+   inform (loc, "limit is %wu bytes, but argument is %s",
  is_vla ? warn_vla_limit : adjusted_alloca_limit,
  buff);
  }
@@ -606,7 +606,7 @@ pass_walloca::execute (function *fun)
  break;
case ALLOCA_IN_LOOP:
  gcc_assert (!is_vla);
- warning_at (loc, wcode, G_("use of % within a loop"));
+ warning_at (loc, wcode, "use of % within a loop");
  break;
case ALLOCA_CAST_FROM_SIGNED:
  gcc_assert (invalid_casted_type != NULL_TREE);

Jakub


[PATCH] Two diagnostics fixes for ipa-devirt.c (PR ipa/80000, PR target/85665)

2019-03-07 Thread Jakub Jelinek
Hi!

The following patch fixes two diagnostic issues in ipa-devirt.c: one
is trailing whitespace in one warning, the other is missing articles in
another warning.  Both were reported by the translation team.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR ipa/80000
* ipa-devirt.c (compare_virtual_tables): Remove two trailing spaces
from diagnostics.  Formatting fixes.

PR target/85665
* ipa-devirt.c (odr_types_equivalent_p): Fix grammar in
warn_odr diagnostics.

--- gcc/ipa-devirt.c.jj 2019-01-10 11:43:14.380377810 +0100
+++ gcc/ipa-devirt.c 2019-03-07 16:21:14.641937211 +0100
@@ -842,17 +842,16 @@ compare_virtual_tables (varpool_node *pr
{
  class_type->odr_violated = true;
  auto_diagnostic_group d;
- if (warning_at (DECL_SOURCE_LOCATION
-   (TYPE_NAME (DECL_CONTEXT (vtable->decl))),
- OPT_Wodr,
+ tree ctx = TYPE_NAME (DECL_CONTEXT (vtable->decl));
+ if (warning_at (DECL_SOURCE_LOCATION (ctx), OPT_Wodr,
  "virtual table of type %qD violates "
- "one definition rule  ",
+ "one definition rule",
  DECL_CONTEXT (vtable->decl)))
{
- inform (DECL_SOURCE_LOCATION
-   (TYPE_NAME (DECL_CONTEXT (prevailing->decl))),
- "the conflicting type defined in another translation "
- "unit has virtual table of different size");
+ ctx = TYPE_NAME (DECL_CONTEXT (prevailing->decl));
+ inform (DECL_SOURCE_LOCATION (ctx),
+ "the conflicting type defined in another translation"
+ " unit has virtual table of different size");
}
}
  return;
@@ -1607,7 +1606,8 @@ odr_types_equivalent_p (tree t1, tree t2
if (DECL_BIT_FIELD (f1) != DECL_BIT_FIELD (f2))
  {
warn_odr (t1, t2, f1, f2, warn, warned,
- G_("one field is bitfield while other is not"));
+ G_("one field is a bitfield while the other "
+"is not"));
return false;
  }
else

Jakub


[C++ PATCH] Disallow reinterpret_cast in potential_constant_expression_1 (PR c++/89599)

2019-03-07 Thread Jakub Jelinek
Hi!

The last testcase in the patch diagnoses invalid constexpr in the
ptr case, but doesn't for arr.
The array is constexpr, so we do:
  value = fold_non_dependent_expr (value);
  if (DECL_DECLARED_CONSTEXPR_P (decl)
  || (DECL_IN_AGGR_P (decl)
  && DECL_INITIALIZED_IN_CLASS_P (decl)))
{
  /* Diagnose a non-constant initializer for constexpr variable or
 non-inline in-class-initialized static data member.  */
  if (!require_constant_expression (value))
value = error_mark_node;
  else if (processing_template_decl)
/* In a template we might not have done the necessary
   transformations to make value actually constant,
   e.g. extend_ref_init_temps.  */
value = maybe_constant_init (value, decl, true);
  else
value = cxx_constant_init (value, decl);
}
but require_constant_expression returned true even when there are
REINTERPRET_CAST_Ps in the CONSTRUCTOR, and then cxx_constant_init
doesn't reject it, because:
case CONSTRUCTOR:
  if (TREE_CONSTANT (t) && reduced_constant_expression_p (t))
{
  /* Don't re-process a constant CONSTRUCTOR, but do fold it to
 VECTOR_CST if applicable.  */
  verify_constructor_flags (t);
  if (TREE_CONSTANT (t))
return fold (t);
}
  r = cxx_eval_bare_aggregate (ctx, t, lval,
   non_constant_p, overflow_p);
  break;
and reduced_constant_expression_p is true on it, so we never try to evaluate
it.

The following patch changes potential_constant_expression_1 to reject the
REINTERPRET_CAST_P, not really sure if that is the best way though.

In any case, bootstrapped/regtested on x86_64-linux and i686-linux.

2019-03-07  Jakub Jelinek  

PR c++/89599
* constexpr.c (potential_constant_expression_1): Reject
REINTERPRET_CAST_P NOP_EXPRs.

* g++.dg/ubsan/vptr-4.C: Adjust expected diagnostics.
* g++.dg/parse/array-size2.C: Likewise.
* g++.dg/cpp0x/constexpr-89599.C: New test.

--- gcc/cp/constexpr.c.jj   2019-03-07 14:03:00.040329107 +0100
+++ gcc/cp/constexpr.c  2019-03-07 15:36:30.831721422 +0100
@@ -5997,6 +5997,13 @@ potential_constant_expression_1 (tree t,
   return true;
 
 case NOP_EXPR:
+  if (REINTERPRET_CAST_P (t))
+   {
+ if (flags & tf_error)
+   error_at (loc, "a reinterpret_cast is not a constant expression");
+ return false;
+   }
+  /* FALLTHRU */
 case CONVERT_EXPR:
 case VIEW_CONVERT_EXPR:
   /* -- a reinterpret_cast.  FIXME not implemented, and this rule
--- gcc/testsuite/g++.dg/ubsan/vptr-4.C.jj  2019-02-21 22:20:24.886099340 +0100
+++ gcc/testsuite/g++.dg/ubsan/vptr-4.C 2019-03-07 15:59:34.853143085 +0100
@@ -19,7 +19,7 @@ struct T : S {
 };
 
 constexpr T t;
-constexpr const T *p = t.foo ();   // { dg-message "expansion of" }
+constexpr const T *p = t.foo ();   // { dg-error "called in a constant expression" }
 
 template 
 struct V {
@@ -39,17 +39,16 @@ struct W : V {
 };
 
 constexpr W w;
-constexpr const W *s = w.foo ();  // { dg-error "is not a constant expression" }
-// { dg-message "expansion of" "" { target *-*-* } .-1 }
+constexpr const W *s = w.foo ();  // { dg-error "called in a constant expression" }
 
 template 
 int foo (void)
 {
   static constexpr T t;
-  static constexpr const T *p = t.foo ();  // { dg-message "expansion of" }
+  static constexpr const T *p = t.foo ();  // { dg-error "called in a constant expression" }
   static constexpr W w;
-  static constexpr const W *s = w.foo ();   // { dg-error "is not a constant expression" }
-  return t.b + w.b;  // { dg-message "expansion of" "" { target *-*-* } .-1 }
+  static constexpr const W *s = w.foo ();   // { dg-error "called in a constant expression" }
+  return t.b + w.b;
 }
 
 int x = foo  ();
--- gcc/testsuite/g++.dg/parse/array-size2.C.jj 2018-11-29 23:11:40.225616846 +0100
+++ gcc/testsuite/g++.dg/parse/array-size2.C 2019-03-07 15:51:12.998330254 +0100
@@ -15,6 +15,8 @@ void
 foo (void)
 {
   char g[(char *) &((struct S *) 0)->b - (char *) 0]; // { dg-error "40:size of array .g. is not an integral constant-expression" }
+ // { dg-error "narrowing conversion" "" { target c++11 } .-1 }
+ // { dg-message "expression has a constant value but is not a C.. constant-expression" "" { target c++11 } .-2 }
   char h[(__SIZE_TYPE__) &((struct S *) 8)->b];  // { dg-error "10:size of array .h. is not an integral constant-expression" }
   bar (g, h);
 }
--- gcc/testsuite/g++.dg/cpp0x/constexpr-89599.C.jj 2019-03-07 16:04:43.526107447 +0100
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-89599.C 2019-03-07 16:04:14.519580648 +0100
@@ -0,0 +1,6 @@
+// PR 

[C++ PATCH] Fix up joust diagnostics (PR c++/89622)

2019-03-07 Thread Jakub Jelinek
Hi!

If no diagnostic is emitted by this pedwarn, whether because of
-Wno-system-headers and a location in system headers, or because of -w
etc., we still emit the follow-up messages as if the pedwarn had emitted
something.

The following patch makes it conditional on pedwarn returning true (i.e.
that something has been actually printed).
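
The pattern can be sketched outside of GCC as follows.  This is only a
mock under stated assumptions: DiagLog, emit_warning, emit_note and
report_ambiguity are made-up names, and emit_warning merely imitates a
diagnostic routine that returns true when it printed something.

```cpp
// Counts what a user would actually see.
struct DiagLog
{
  int warnings = 0;
  int notes = 0;
};

// Mock of a diagnostic call: returns true only if the message was printed.
constexpr bool
emit_warning (DiagLog &log, bool suppressed)
{
  if (suppressed)
    return false;
  ++log.warnings;
  return true;
}

// Mock of a follow-up message that explains the preceding warning.
constexpr void
emit_note (DiagLog &log)
{
  ++log.notes;
}

// Follow-up notes are emitted only when the warning itself was printed,
// so a suppressed warning leaves no orphaned "candidate" lines behind.
constexpr DiagLog
report_ambiguity (bool suppressed)
{
  DiagLog log{};
  if (emit_warning (log, suppressed))
    {
      emit_note (log);  // stands in for "candidate 1:"
      emit_note (log);  // stands in for "candidate 2:"
    }
  return log;
}
```

Calling the notes unconditionally would give two notes and no warning in
the suppressed case, which is the behaviour the PR complains about.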

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR c++/89622
* call.c (joust): Call print_z_candidate only if pedwarn returned
true.

* g++.dg/warn/pr89622.C: New test.

--- gcc/cp/call.c.jj 2019-03-06 19:45:40.369751609 +0100
+++ gcc/cp/call.c   2019-03-07 14:51:20.062959019 +0100
@@ -10954,12 +10954,14 @@ tweak:
  if (warn)
{
  auto_diagnostic_group d;
- pedwarn (input_location, 0,
- "ISO C++ says that these are ambiguous, even "
- "though the worst conversion for the first is better than "
- "the worst conversion for the second:");
- print_z_candidate (input_location, _("candidate 1:"), w);
- print_z_candidate (input_location, _("candidate 2:"), l);
+ if (pedwarn (input_location, 0,
+  "ISO C++ says that these are ambiguous, even "
+  "though the worst conversion for the first is "
+  "better than the worst conversion for the second:"))
+   {
+ print_z_candidate (input_location, _("candidate 1:"), w);
+ print_z_candidate (input_location, _("candidate 2:"), l);
+   }
}
  else
add_warning (w, l);
--- gcc/testsuite/g++.dg/warn/pr89622.C.jj  2019-03-07 14:51:57.691344552 +0100
+++ gcc/testsuite/g++.dg/warn/pr89622.C 2019-03-07 14:48:24.167831364 +0100
@@ -0,0 +1,27 @@
+// PR c++/89622
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wno-system-headers -w" }
+// { dg-bogus "says that these are ambiguous" "" { target *-*-* } 0 }
+// { dg-bogus "candidate 1" "" { target *-*-* } 0 }
+// { dg-bogus "candidate 2" "" { target *-*-* } 0 }
+
+# 3 "pr89622.h" 3
+template
+struct X
+{
+  X() { }
+  template X(int, U&&) { }
+  template X(char, const X&) { }
+};
+
+template
+X wrap_X(X x)
+{
+  return X('a', x);
+}
+
+int main()
+{
+  X x;
+  wrap_X(x);
+}

Jakub


[C++ PATCH] Add -fconstexpr-loop-nest-limit= option (PR c++/87481)

2019-03-07 Thread Jakub Jelinek
Hi!

We have the -fconstexpr-loop-limit= option to put an upper bound on constexpr
evaluation of a single loop.  Even with that limit in place, if we have
several nested loops during constexpr evaluation, even when each could have
just a few hundred iterations, the whole loop nest could take years to
evaluate.  And the loops can even be split across several constexpr function
calls.

The following patch adds another, slightly larger, limit on the total number
of iterations in the whole loop nest.
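
To see why a per-loop limit alone does not bound the work: the iteration
counts of nested loops multiply.  A small sketch (nest_iterations is a
made-up name, and the bound of 20 is chosen only so that any compiler
evaluates it quickly):

```cpp
// Three nested loops, each far below any per-loop limit, still execute
// per_loop * per_loop * per_loop innermost iterations in total.
constexpr long
nest_iterations (int per_loop)
{
  long total = 0;
  for (int i = 0; i < per_loop; ++i)
    for (int j = 0; j < per_loop; ++j)
      for (int k = 0; k < per_loop; ++k)
        ++total;
  return total;
}

// Forces the evaluation to happen at compile time.
static_assert (nest_iterations (20) == 8000, "20 * 20 * 20 iterations");
```

With the default per-loop limit of 262144, two nested loops that each stay
just under it already amount to roughly 2^36 innermost iterations, and no
single loop ever trips the existing check.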

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR c++/87481
* doc/invoke.texi (-fconstexpr-loop-nest-limit=): Document.
(-fconstexpr-loop-limit=): Slightly adjust wording.

* c.opt (-fconstexpr-loop-nest-limit=): New option.
(-fconstexpr-loop-limit=): Slightly adjust description.

* constexpr.c (cxx_eval_loop_expr): Count also number of iterations
of all nested loops and punt if it is above constexpr_loop_nest_limit.

* g++.dg/cpp1y/constexpr-87481.C: New test.

--- gcc/doc/invoke.texi.jj  2019-03-02 09:03:25.822755663 +0100
+++ gcc/doc/invoke.texi 2019-03-07 14:25:34.881197149 +0100
@@ -210,7 +210,7 @@ in the following sections.
 @gccoptlist{-fabi-version=@var{n}  -fno-access-control @gol
 -faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new @gol
 -fconstexpr-depth=@var{n}  -fconstexpr-loop-limit=@var{n} @gol
--fno-elide-constructors @gol
+-fconstexpr-loop-nest-limit=@var{n} -fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
 -fno-gnu-keywords @gol
 -fno-implicit-templates @gol
@@ -2522,10 +2522,19 @@ is 512.
 
 @item -fconstexpr-loop-limit=@var{n}
 @opindex fconstexpr-loop-limit
-Set the maximum number of iterations for a loop in C++14 constexpr functions
-to @var{n}.  A limit is needed to detect infinite loops during
+Set the maximum number of iterations for a single loop in C++14 constexpr
+functions to @var{n}.  A limit is needed to detect infinite loops during
 constant expression evaluation.  The default is 262144 (1<<18).
 
+@item -fconstexpr-loop-nest-limit=@var{n}
+@opindex fconstexpr-loop-nest-limit
+Set the maximum number of iterations for all loops in a loop nest in C++14
+constexpr functions to @var{n}.  Even when number of iterations of a single
+loop is limited with the above limit, if there are several nested loops and
+each of them has many iterations but still smaller than the above limit,
+the resulting evaluation of the loop nest might take too long.
+The default is 1048576 (1<<20).
+
 @item -fdeduce-init-list
 @opindex fdeduce-init-list
 Enable deduction of a template type parameter as
--- gcc/c-family/c.opt.jj   2019-02-25 23:56:58.149055806 +0100
+++ gcc/c-family/c.opt  2019-03-07 14:25:07.782639790 +0100
@@ -1414,7 +1414,11 @@ C++ ObjC++ Joined RejectNegative UIntege
 
 fconstexpr-loop-limit=
 C++ ObjC++ Joined RejectNegative UInteger Var(constexpr_loop_limit) Init(262144)
--fconstexpr-loop-limit=   Specify maximum constexpr loop iteration count.
+-fconstexpr-loop-limit=   Specify maximum constexpr loop iteration count of a single loop.
+
+fconstexpr-loop-nest-limit=
+C++ ObjC++ Joined RejectNegative UInteger Var(constexpr_loop_nest_limit) Init(1048576)
+-fconstexpr-loop-nest-limit=   Specify maximum constexpr loop iteration count for all nested loops.
 
 fdebug-cpp
 C ObjC C++ ObjC++
--- gcc/cp/constexpr.c.jj   2019-03-06 19:45:40.360751756 +0100
+++ gcc/cp/constexpr.c  2019-03-07 14:03:00.040329107 +0100
@@ -4190,8 +4190,20 @@ cxx_eval_loop_expr (const constexpr_ctx
 default:
   gcc_unreachable ();
 }
+
+  /* True on entry to cxx_eval_loop_expr if called indirectly from another
+ cxx_eval_loop_expr.  */
+  static bool constexpr_loop_nested;
+
+  /* Number of evaluated loop iterations in the whole current loop nest.  */
+  static int constexpr_loop_nest_count;
+
+  bool save_constexpr_loop_nested = constexpr_loop_nested;
+  constexpr_loop_nested = true;
+
   hash_set save_exprs;
   new_ctx.save_exprs = _exprs;
+
   do
 {
   if (count != -1)
@@ -4248,6 +4260,24 @@ cxx_eval_loop_expr (const constexpr_ctx
  *non_constant_p = true;
  break;
}
+
+  /* In nested loops, don't count the first iteration, as it has been
+counted in the parent loop already.  */
+  if (count > (save_constexpr_loop_nested ? 1 : 0))
+   {
+ if (++constexpr_loop_nest_count >= constexpr_loop_nest_limit)
+   {
+ if (!ctx->quiet)
+   error_at (cp_expr_loc_or_loc (t, input_location),
+ "% loop iteration count in all nested "
+ "loops exceeds limit of %d (use "
+ "-fconstexpr-loop-nest-limit= to increase the limit)",
+ constexpr_loop_nest_limit);
+ constexpr_loop_nest_count = 0;
+ *non_constant_p = true;
+ break;
+   }
+   

[PATCH] Only set TREE_NO_WARNING after warning if warning returned true (PR tree-optimization/89550)

2019-03-07 Thread Jakub Jelinek
Hi!

While looking at why we didn't warn before Richi's commit, I discovered
that a different pass used to call c_strlen too, but at that point with a
location from system headers, so no warning was printed.  TREE_NO_WARNING
was set anyway, and later on, when c_strlen was called again with a
location not from a system header, we wouldn't warn because TREE_NO_WARNING
was already set.

Apparently there are quite a few spots that have a similar bug (and quite a
few spots that do it right).
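
The difference between the two patterns can be shown with a small mock.
Everything here is hypothetical (Node, mock_warning_at, check_old,
check_fixed and second_call_warns are illustration-only names); the mock
warning simply returns false when the location is in a system header.

```cpp
// Stand-in for a tree node carrying a "don't warn again" flag.
struct Node
{
  bool no_warning = false;
};

// Mock diagnostic: returns true only if something was actually printed.
constexpr bool
mock_warning_at (bool in_system_header)
{
  return !in_system_header;
}

// Old pattern: the flag is set even when nothing was printed.
constexpr bool
check_old (Node &n, bool in_system_header)
{
  if (n.no_warning)
    return false;
  bool warned = mock_warning_at (in_system_header);
  n.no_warning = true;
  return warned;
}

// Fixed pattern: the flag is set only after a warning was printed.
constexpr bool
check_fixed (Node &n, bool in_system_header)
{
  if (n.no_warning)
    return false;
  bool warned = mock_warning_at (in_system_header);
  if (warned)
    n.no_warning = true;
  return warned;
}

// First check uses a system-header location, second a user location;
// returns whether the second check manages to warn.
constexpr bool
second_call_warns (bool fixed)
{
  Node n{};
  if (fixed)
    check_fixed (n, true);
  else
    check_old (n, true);
  return fixed ? check_fixed (n, false) : check_old (n, false);
}
```

With the old pattern the second check stays silent; with the fixed one
the user still gets the warning at the location that matters.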

The following patch fixes what I found.  Bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR tree-optimization/89550
* builtins.c (c_strlen): Only set TREE_NO_WARNING if warning_at
returned true.  Formatting fixes.
(expand_builtin_strnlen): Formatting fixes.
* tree-vrp.c (vrp_prop::check_mem_ref): Only set TREE_NO_WARNING
if warning_at returned true.
* tree-cfg.c (pass_warn_function_return::execute): Likewise.
c-family/
* c-common.c (c_common_truthvalue_conversion): Only set
TREE_NO_WARNING if warning_at returned true.
* c-warn.c (overflow_warning, warn_logical_operator): Likewise.
c/
* c-decl.c (finish_function): Only set TREE_NO_WARNING if warning_at
returned true.
(c_write_global_declarations_1): Only set TREE_NO_WARNING if pedwarn
or warning returned true.
cp/
* semantics.c (maybe_convert_cond): Only set TREE_NO_WARNING if
warning_at returned true.
* decl2.c (c_parse_final_cleanups): Likewise.
* typeck.c (convert_for_assignment): Likewise.
* decl.c (finish_function): Likewise.

--- gcc/builtins.c.jj   2019-03-05 17:21:42.536993265 +0100
+++ gcc/builtins.c  2019-03-07 12:01:23.114479128 +0100
@@ -760,15 +760,13 @@ c_strlen (tree src, int only_value, c_st
  runtime.  */
   if (eltoff < 0 || eltoff >= maxelts)
 {
- /* Suppress multiple warnings for propagated constant strings.  */
+  /* Suppress multiple warnings for propagated constant strings.  */
   if (only_value != 2
- && !TREE_NO_WARNING (src))
-{
- warning_at (loc, OPT_Warray_bounds,
- "offset %qwi outside bounds of constant string",
- eltoff);
-  TREE_NO_WARNING (src) = 1;
-}
+ && !TREE_NO_WARNING (src)
+ && warning_at (loc, OPT_Warray_bounds,
+"offset %qwi outside bounds of constant string",
+eltoff))
+   TREE_NO_WARNING (src) = 1;
   return NULL_TREE;
 }
 
@@ -3099,7 +3097,7 @@ expand_builtin_strnlen (tree exp, rtx ta
 "%K%qD specified bound %E "
 "exceeds maximum object size %E",
 exp, func, bound, maxobjsize))
- TREE_NO_WARNING (exp) = true;
+   TREE_NO_WARNING (exp) = true;
 
   bool exact = true;
   if (!len || TREE_CODE (len) != INTEGER_CST)
@@ -3158,7 +3156,7 @@ expand_builtin_strnlen (tree exp, rtx ta
 "%K%qD specified bound [%wu, %wu] "
 "exceeds maximum object size %E",
 exp, func, min.to_uhwi (), max.to_uhwi (), maxobjsize))
-  TREE_NO_WARNING (exp) = true;
+TREE_NO_WARNING (exp) = true;
 
   bool exact = true;
   if (!len || TREE_CODE (len) != INTEGER_CST)
--- gcc/tree-vrp.c.jj   2019-02-01 09:43:40.264757408 +0100
+++ gcc/tree-vrp.c  2019-03-07 12:05:05.959845856 +0100
@@ -4749,7 +4749,8 @@ vrp_prop::check_mem_ref (location_t loca
   if (warned && DECL_P (arg))
inform (DECL_SOURCE_LOCATION (arg), "while referencing %qD", arg);
 
-  TREE_NO_WARNING (ref) = 1;
+  if (warned)
+   TREE_NO_WARNING (ref) = 1;
   return;
 }
 
@@ -4762,11 +4763,10 @@ vrp_prop::check_mem_ref (location_t loca
 {
   HOST_WIDE_INT tmpidx = extrema[i].to_shwi () / eltsize.to_shwi ();
 
-  warning_at (location, OPT_Warray_bounds,
- "intermediate array offset %wi is outside array bounds "
- "of %qT",
- tmpidx,  reftype);
-  TREE_NO_WARNING (ref) = 1;
+  if (warning_at (location, OPT_Warray_bounds,
+ "intermediate array offset %wi is outside array bounds "
+ "of %qT", tmpidx, reftype))
+   TREE_NO_WARNING (ref) = 1;
 }
 }
 
--- gcc/tree-cfg.c.jj   2019-02-22 23:02:47.768118220 +0100
+++ gcc/tree-cfg.c  2019-03-07 12:02:52.662019138 +0100
@@ -9328,9 +9328,9 @@ pass_warn_function_return::execute (func
  location = gimple_location (last);
  if (LOCATION_LOCUS (location) == UNKNOWN_LOCATION)
location = fun->function_end_locus;
- warning_at (location, OPT_Wreturn_type,
- "control reaches end of non-void function");
- TREE_NO_WARNING (fun->decl) = 1;
+ if (warning_at 

[C++ PATCH] Toplevel asm volatile followup (PR c++/89585)

2019-03-07 Thread Jakub Jelinek
On Wed, Mar 06, 2019 at 07:44:25PM -0500, Jason Merrill wrote:
> > In addition to that, it mentions in the documentation that qualifiers are
> > not allowed at toplevel asm statements; apparently our documentation at
> > least from r220506 for GCC 5 says that at toplevel Basic Asm needs to be
> > used and for Basic Asm lists volatile qualifier as optional and its behavior
> > (that it is ignored for Basic Asm).  Makes me wonder if we don't want to
> > keep accepting/ignoring volatile at toplevel for both C and C++ instead of
> > rejecting it (and rejecting just the other qualifiers).  Thoughts on this?
> 
> That seems reasonable.  Or using warning or permerror instead of error.

This incremental patch uses warning.  Bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?

2019-03-07  Jakub Jelinek  

PR c++/89585
* parser.c (cp_parser_asm_definition): Just warn instead of error
on volatile qualifier outside of function body.

* g++.dg/asm-qual-3.C: Adjust expected diagnostics for toplevel
asm volatile.

--- gcc/cp/parser.c.jj  2019-03-07 09:17:41.406631683 +0100
+++ gcc/cp/parser.c 2019-03-07 10:06:49.297602880 +0100
@@ -19782,9 +19782,12 @@ cp_parser_asm_definition (cp_parser* par
inform (volatile_loc, "first seen here");
  }
else
- volatile_loc = loc;
-   if (!first_loc)
- first_loc = loc;
+ {
+   if (!parser->in_function_body)
+ warning_at (loc, 0, "asm qualifier %qT ignored outside of "
+ "function body", token->u.value);
+   volatile_loc = loc;
+ }
cp_lexer_consume_token (parser->lexer);
continue;
 
@@ -19830,10 +19833,10 @@ cp_parser_asm_definition (cp_parser* par
   bool inline_p = (inline_loc != UNKNOWN_LOCATION);
   bool goto_p = (goto_loc != UNKNOWN_LOCATION);
 
-  if (!parser->in_function_body && (volatile_p || inline_p || goto_p))
+  if (!parser->in_function_body && (inline_p || goto_p))
 {
   error_at (first_loc, "asm qualifier outside of function body");
-  volatile_p = inline_p = goto_p = false;
+  inline_p = goto_p = false;
 }
 
   /* Look for the opening `('.  */
--- gcc/testsuite/g++.dg/asm-qual-3.C.jj 2019-03-07 09:17:41.417631503 +0100
+++ gcc/testsuite/g++.dg/asm-qual-3.C   2019-03-07 10:11:18.160228806 +0100
@@ -3,7 +3,7 @@
 // { dg-options "-std=gnu++98" }
 
 asm const ("");// { dg-error {'const' is not an asm qualifier} }
-asm volatile (""); // { dg-error {asm qualifier outside of function body} }
+asm volatile (""); // { dg-warning {asm qualifier 'volatile' ignored outside of function body} }
 asm restrict (""); // { dg-error {expected '\(' before 'restrict'} }
 asm inline ("");   // { dg-error {asm qualifier outside of function body} }
 asm goto (""); // { dg-error {asm qualifier outside of function body} }


Jakub


[PATCH, PR d/89016] Committed fix for ICE at d/dmd/expression.c:3873

2019-03-07 Thread Iain Buclaw
Hi,

This patch merges the D front-end implementation with dmd upstream
d517c0e6a, fixing the ICE reported in PR d/89016.

Bootstrapped and regression tested on x86_64-linux-gnu.

Committed to trunk as r269465.

-- 
Iain
diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index 97aa40d1ace..3f416dbfb7b 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-ed71446aaa2bd0e548c3bf2154a638826dfe3db0
+d517c0e6a10b548f44d82b71b3c079663cb94f8e
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/attrib.c b/gcc/d/dmd/attrib.c
index e4ad5739e73..a6686381485 100644
--- a/gcc/d/dmd/attrib.c
+++ b/gcc/d/dmd/attrib.c
@@ -30,6 +30,7 @@
 
 bool definitelyValueParameter(Expression *e);
 Expression *semantic(Expression *e, Scope *sc);
+StringExp *semanticString(Scope *sc, Expression *exp, const char *s);
 
 /* AttribDeclaration /
 
@@ -977,41 +978,29 @@ void PragmaDeclaration::semantic(Scope *sc)
 error("string expected for library name");
 else
 {
-Expression *e = (*args)[0];
-
-sc = sc->startCTFE();
-e = ::semantic(e, sc);
-e = resolveProperties(sc, e);
-sc = sc->endCTFE();
-
-e = e->ctfeInterpret();
-(*args)[0] = e;
-if (e->op == TOKerror)
-goto Lnodecl;
-StringExp *se = e->toStringExp();
+StringExp *se = semanticString(sc, (*args)[0], "library name");
 if (!se)
-error("string expected for library name, not '%s'", e->toChars());
-else
+goto Lnodecl;
+(*args)[0] = se;
+
+char *name = (char *)mem.xmalloc(se->len + 1);
+memcpy(name, se->string, se->len);
+name[se->len] = 0;
+if (global.params.verbose)
+message("library   %s", name);
+if (global.params.moduleDeps && !global.params.moduleDepsFile)
 {
-char *name = (char *)mem.xmalloc(se->len + 1);
-memcpy(name, se->string, se->len);
-name[se->len] = 0;
-if (global.params.verbose)
-message("library   %s", name);
-if (global.params.moduleDeps && !global.params.moduleDepsFile)
-{
-OutBuffer *ob = global.params.moduleDeps;
-Module *imod = sc->instantiatingModule();
-ob->writestring("depsLib ");
-ob->writestring(imod->toPrettyChars());
-ob->writestring(" (");
-escapePath(ob, imod->srcfile->toChars());
-ob->writestring(") : ");
-ob->writestring((char *) name);
-ob->writenl();
-}
-mem.xfree(name);
+OutBuffer *ob = global.params.moduleDeps;
+Module *imod = sc->instantiatingModule();
+ob->writestring("depsLib ");
+ob->writestring(imod->toPrettyChars());
+ob->writestring(" (");
+escapePath(ob, imod->srcfile->toChars());
+ob->writestring(") : ");
+ob->writestring((char *) name);
+ob->writenl();
 }
+mem.xfree(name);
 }
 goto Lnodecl;
 }
@@ -1053,19 +1042,11 @@ void PragmaDeclaration::semantic(Scope *sc)
 goto Ldecl;
 }
 
-Expression *e = (*args)[0];
-e = ::semantic(e, sc);
-e = e->ctfeInterpret();
-(*args)[0] = e;
-if (e->op == TOKerror)
-goto Ldecl;
-
-StringExp *se = e->toStringExp();
+StringExp *se = semanticString(sc, (*args)[0], "mangled name");
 if (!se)
-{
-error("string expected for mangled name, not '%s'", e->toChars());
 goto Ldecl;
-}
+(*args)[0] = se; // Will be used for later
+
 if (!se->len)
 {
 error("zero-length string not allowed for mangled name");
@@ -1418,35 +1399,22 @@ void CompileDeclaration::setScope(Scope *sc)
 void CompileDeclaration::compileIt(Scope *sc)
 {
 //printf("CompileDeclaration::compileIt(loc = %d) %s\n", loc.linnum, exp->toChars());
-sc = sc->startCTFE();
-exp = ::semantic(exp, sc);
-exp = resolveProperties(sc, exp);
-sc = sc->endCTFE();
+StringExp *se = semanticString(sc, exp, "argument to mixin");
+if (!se)
+return;
+se = se->toUTF8(sc);
+
+unsigned errors = global.errors;
+Parser p(loc, sc->_module, (utf8_t *)se->string, se->len, 0);
+p.nextToken();
 
-if (exp->op != TOKerror)
+decl = p.parseDeclDefs(0);
+if (p.token.value != TOKeof)
+exp->error("incomplete mixin declaration (%s)", se->toChars());
+if (p.errors)
 {
-

[C++ PATCH] PR c++/88820 - ICE with CTAD and member template used in DMI.

2019-03-07 Thread Jason Merrill
Here the problem was that in order to form a FUNCTION_DECL for foo in
the uninstantiated template, we were trying to deduce template args for S
from the template parm itself, and failing.

Tested x86_64-pc-linux-gnu, applying to trunk.

* pt.c (do_class_deduction): Handle parm used as its own arg.
---
 gcc/cp/pt.c| 3 +++
 gcc/testsuite/g++.dg/cpp1z/class-deduction64.C | 9 +
 gcc/cp/ChangeLog   | 5 +
 3 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction64.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 8a5a0b38b2d..906cfe0a58c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -27184,6 +27184,9 @@ do_class_deduction (tree ptype, tree tmpl, tree init, int flags,
error ("non-class template %qT used without template arguments", tmpl);
   return error_mark_node;
 }
+  if (init && TREE_TYPE (init) == ptype)
+/* Using the template parm as its own argument.  */
+return ptype;
 
   tree type = TREE_TYPE (tmpl);
 
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction64.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction64.C
new file mode 100644
index 000..3a06e6fb522
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction64.C
@@ -0,0 +1,9 @@
+// PR c++/88820
+// { dg-do compile { target c++17 } }
+
+template  struct S;
+
+template  struct W {
+  template  static int foo();
+  bool b = foo();
+};
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index c2162a4a3d4..94e278dc944 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,8 @@
+2019-03-07  Jason Merrill  
+
+   PR c++/88820 - ICE with CTAD and member template used in DMI.
+   * pt.c (do_class_deduction): Handle parm used as its own arg.
+
 2019-03-07  Jakub Jelinek  
 
PR c++/89585

base-commit: f62ec1bb16cb7645d9bf6c981178d5e7b5e06ba2
-- 
2.20.1



Re: [PR 87525] Zero local estimated benefit for cloning extern inline function

2019-03-07 Thread Jan Hubicka

On 2019-02-27 13:00, Martin Jambor wrote:

Hi,

in the discussion in PR 87525 Honza noted that IPA-CP should not
estimate any local time benefits from cloning an extern inline function,
and that any benefits this might bring about have to come from other
specializations such cloning might enable.

While the patch is only a heuristic change and so does not really fix
the issue (which itself is part of a more general set of problems with
-D_FORTIFY_SOURCE and LTO), it should make it much less likely and is
a sensible change on its own.

Bootstrapped and tested on x86_64-linux, OK for trunk and the
gcc-8-branch?

Thanks,

Martin




2019-02-25  Martin Jambor  

PR lto/87525
* ipa-cp.c (perform_estimation_of_a_value): Account zero time benefit
for extern inline functions.

testsuite/
* gcc.dg/ipa/ipcp-5.c: New test.


OK,
please wait a week before backporting to gcc 8. I believe gcc 7 suffers
from the same issue, so backporting there makes sense too.

Honza


Re: [PR 88235] Relax cgraph_node::clone_of_p to also look through former clones

2019-03-07 Thread Jan Hubicka

On 2019-03-06 13:22, Martin Jambor wrote:

Hi,

PR 88235 is a cgraph verification failure which is a false positive.
The problem is that after thunk expansion which is done as a part of
thunk inlining the verifier is no longer able to detect that a call
graph edge callee points to a clone of the destination of the now
expanded thunk in the decl of the corresponding gimple statement.

Fixed in a way agreed on with Honza in bugzilla, we simply add a way to
detect former (expanded) thunks, I understand this is already done
by devirtualization in a similar way too, and use that information in
the verifier.

Passed bootstrap and testing on x86_64-linux, OK for trunk?  OK for
gcc-7-branch and gcc-8-branch too if a backport is straightforward (I
have not tried yet) and it passes testing there too?

Thanks,

Martin


2019-03-05  Martin Jambor  

PR ipa/88235
* cgraph.h (cgraph_node): New inline method former_thunk_p.
* cgraph.c (cgraph_node::dump): Dump a note if node is a former thunk.
(clone_of_p): Treat expanded thunks like thunks, be optimistic if they
have multiple callees.  At the end check if declarations match as
opposed to cgraph_nodes.

testsuite/
* g++.dg/ipa/pr88235.C: New test.


OK,
thanks!
Honza



Re: [PATCH] Significantly speed up verifiers for a cgraph_node with many clones.

2019-03-07 Thread Jan Hubicka
> Hi.
> 
> The patch makes a significant verifier speed up in a project that
> has a dtor for which we create ~70.000 clones.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-02-22  Martin Liska  
> 
>   * cgraph.c (cgraph_node::verify_node): Verify with a neighbour
>   which is equivalent to searching for this in clones chain.
>   * symtab.c (symtab_node::verify_base): Similarly compare ASM
>   names with a neighbour and special case first node in a chain.

OK, thanks!
Honza


[PATCH] Fix PR89578

2019-03-07 Thread Richard Biener


The following fixes PR89578, an optimization regression with the
recent restrict fixes.  The idea is to keep track of the original
function context by attaching that to our loop structure.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2019-03-07  Richard Biener  

PR middle-end/89578
* cfgloop.h (struct loop): Add owned_clique field.
* cfgloopmanip.c (copy_loop_info): Copy it.
* tree-cfg.c (gimple_duplicate_bb): Do not remap owned_clique
cliques.
* tree-inline.c (copy_loops): Remap owned_clique.
* lto-streamer-in.c (input_cfg): Stream owned_clique.
* lto-streamer-out.c (output_cfg): Likewise.

Index: gcc/cfgloop.h
===
--- gcc/cfgloop.h   (revision 269458)
+++ gcc/cfgloop.h   (working copy)
@@ -227,6 +227,10 @@ struct GTY ((chain_next ("%h.next"))) lo
  Other values means unroll with the given unrolling factor.  */
   unsigned short unroll;
 
+  /* If this loop was inlined the main clique of the callee which does
+ not need remapping when copying the loop body.  */
+  unsigned short owned_clique;
+
   /* For SIMD loops, this is a unique identifier of the loop, referenced
  by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
  builtins.  */
Index: gcc/cfgloopmanip.c
===
--- gcc/cfgloopmanip.c  (revision 269458)
+++ gcc/cfgloopmanip.c  (working copy)
@@ -1024,6 +1024,7 @@ copy_loop_info (struct loop *loop, struc
   target->force_vectorize = loop->force_vectorize;
   target->in_oacc_kernels_region = loop->in_oacc_kernels_region;
   target->unroll = loop->unroll;
+  target->owned_clique = loop->owned_clique;
 }
 
 /* Copies copy of LOOP as subloop of TARGET loop, placing newly
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 269458)
+++ gcc/tree-cfg.c  (working copy)
@@ -6244,7 +6244,8 @@ gimple_duplicate_bb (basic_block bb, cop
  op = TREE_OPERAND (op, 0);
if ((TREE_CODE (op) == MEM_REF
 || TREE_CODE (op) == TARGET_MEM_REF)
-   && MR_DEPENDENCE_CLIQUE (op) > 1)
+   && MR_DEPENDENCE_CLIQUE (op) > 1
+   && MR_DEPENDENCE_CLIQUE (op) != bb->loop_father->owned_clique)
  {
if (!id->dependence_map)
  id->dependence_map = new hash_maphas_unroll = true;
  if (dest_loop->force_vectorize)
cfun->has_force_vectorize_loops = true;
+ if (id->src_cfun->last_clique != 0)
+   dest_loop->owned_clique
+ = remap_dependence_clique (id,
+src_loop->owned_clique
+? src_loop->owned_clique : 1);
 
  /* Finally place it into the loop array and the loop tree.  */
  place_new_loop (cfun, dest_loop);
Index: gcc/lto-streamer-in.c
===
--- gcc/lto-streamer-in.c   (revision 269458)
+++ gcc/lto-streamer-in.c   (working copy)
@@ -826,6 +826,7 @@ input_cfg (struct lto_input_block *ib, s
   /* Read OMP SIMD related info.  */
   loop->safelen = streamer_read_hwi (ib);
   loop->unroll = streamer_read_hwi (ib);
+  loop->owned_clique = streamer_read_hwi (ib);
   loop->dont_vectorize = streamer_read_hwi (ib);
   loop->force_vectorize = streamer_read_hwi (ib);
   loop->simduid = stream_read_tree (ib, data_in);
Index: gcc/lto-streamer-out.c
===
--- gcc/lto-streamer-out.c  (revision 269458)
+++ gcc/lto-streamer-out.c  (working copy)
@@ -1938,6 +1938,7 @@ output_cfg (struct output_block *ob, str
   /* Write OMP SIMD related info.  */
   streamer_write_hwi (ob, loop->safelen);
   streamer_write_hwi (ob, loop->unroll);
+  streamer_write_hwi (ob, loop->owned_clique);
   streamer_write_hwi (ob, loop->dont_vectorize);
   streamer_write_hwi (ob, loop->force_vectorize);
   stream_write_tree (ob, loop->simduid, true);
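
The clique check added to gimple_duplicate_bb can be modelled roughly as follows. This is only a sketch under stated assumptions: loop_model, copy_state, remap_clique and clique_in_copy are hypothetical names, and cliques are plain integers here rather than fields of MEM_REF trees.

```cpp
#include <cassert>
#include <map>

// Hypothetical, simplified model; the real code works on MEM_REF trees.
struct loop_model
{
  unsigned short owned_clique = 0;   // 0 means "no owned clique"
};

struct copy_state
{
  std::map<unsigned short, unsigned short> dependence_map;
  unsigned short last_clique = 1;
};

// Hand out a fresh clique the first time CLIQUE is seen, then reuse it.
inline unsigned short remap_clique(copy_state &st, unsigned short clique)
{
  auto it = st.dependence_map.find(clique);
  if (it != st.dependence_map.end())
    return it->second;
  unsigned short fresh = ++st.last_clique;
  st.dependence_map[clique] = fresh;
  return fresh;
}

// Cliques 0 and 1 are never remapped; neither is the clique owned by the
// loop that contains the block being duplicated.
inline unsigned short clique_in_copy(copy_state &st, const loop_model &loop,
                                     unsigned short clique)
{
  if (clique > 1 && clique != loop.owned_clique)
    return remap_clique(st, clique);
  return clique;
}
```

In this model a reference carrying the owned clique keeps its number in the copy, while any other clique above 1 gets a fresh one.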


Re: [wwwdocs] Document recent libstdc++ changes

2019-03-07 Thread Jonathan Wakely

Some more additions.

Committed to CVS.

Index: htdocs/gcc-9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.51
diff -u -r1.51 changes.html
--- htdocs/gcc-9/changes.html	6 Mar 2019 10:22:39 -	1.51
+++ htdocs/gcc-9/changes.html	7 Mar 2019 14:39:51 -
@@ -229,8 +229,10 @@
   contains member of maps and sets.
   String prefix and suffix checking (starts_with,
 ends_with).
-  Functions std::midpoint and lerp for
+  Functions std::midpoint and std::lerp for
 interpolation.
+  std::bind_front.
+  std::assume_aligned.
   Uses-allocator construction utilities.
   std::pmr::polymorphic_allocatorstd::byte.
   Library support for char8_t type.


Re: [PATCH] P0356R5 Simplified partial function application

2019-03-07 Thread Jonathan Wakely

On 07/03/19 14:15 +, Jonathan Wakely wrote:

* include/std/functional [C++20] (_Bind_front, _Bind_front_t): Define
helpers for bind_front.
(bind_front, __cpp_lib_bind_front): Define.
* testsuite/20_util/function_objects/bind_front/1.cc: New test.


I'm considering something like the attached patch (but with better
names for __tag1 and __tag2 obviously). With this change bind_front
would unwrap nested binders, so that bind_front(bind_front(f, 1), 2)
would create a _Bind_front<F, int, int> instead of
_Bind_front<_Bind_front<F, int>, int>, where F is the decayed type of f.

That would make the call go straight to the target object, instead of
through an extra layer of wrapper (which should improve compile times,
and also unoptimized runtime performance).
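
The flattening idea can be sketched outside libstdc++ as follows. This is a minimal C++17 illustration, not the patch itself: binder, my_bind_front and digits are hypothetical names, and the sketch omits constexpr, noexcept, the tag types and the ref-qualified call operators.

```cpp
#include <cassert>
#include <functional>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical, simplified stand-in for _Bind_front.
template<typename F, typename... Bound>
struct binder
{
  F fn;
  std::tuple<Bound...> bound;

  template<typename... CallArgs>
  decltype(auto) operator()(CallArgs&&... call_args) const
  {
    return std::apply(
      [&](const Bound&... b) -> decltype(auto) {
        return std::invoke(fn, b..., std::forward<CallArgs>(call_args)...);
      },
      bound);
  }
};

// General case: store the callable and the leading arguments.
template<typename F, typename... Args>
auto my_bind_front(F&& f, Args&&... args)
{
  return binder<std::decay_t<F>, std::decay_t<Args>...>{
    std::forward<F>(f),
    std::tuple<std::decay_t<Args>...>(std::forward<Args>(args)...)};
}

// Nested case: taking the inner binder by value makes this overload the
// more specialized one, so it is chosen for lvalue and rvalue binders.
template<typename F, typename... Bound, typename... Args>
auto my_bind_front(binder<F, Bound...> inner, Args&&... args)
{
  return binder<F, Bound..., std::decay_t<Args>...>{
    std::move(inner.fn),
    std::tuple_cat(
      std::move(inner.bound),
      std::tuple<std::decay_t<Args>...>(std::forward<Args>(args)...))};
}

inline int digits(int a, int b, int c) { return a * 100 + b * 10 + c; }

inline int demo_nested_call()
{
  auto g = my_bind_front(my_bind_front(digits, 1), 2);
  return g(3);   // calls digits(1, 2, 3) with no intermediate wrapper
}
```

With the nested overload in place, the outer binder stores the original function pointer and both bound arguments directly.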



diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional
index 8cf2c670648..da61b00bd15 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -839,6 +839,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cplusplus > 201703L
 #define __cpp_lib_bind_front 201902L
 
+  struct __tag1;
+  struct __tag2;
+
   template<typename _Fd, typename... _BoundArgs>
 struct _Bind_front
 {
@@ -849,13 +852,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // instead of the copy/move constructor.
   template<typename _Fn, typename... _Args>
 	explicit constexpr
-	_Bind_front(int, _Fn&& __fn, _Args&&... __args)
+	_Bind_front(__tag1*, _Fn&& __fn, _Args&&... __args)
 	noexcept(__and_<is_nothrow_constructible<_Fd, _Fn>,
 			is_nothrow_constructible<_BoundArgs, _Args>...>::value)
 	: _M_fd(std::forward<_Fn>(__fn)),
 	  _M_bound_args(std::forward<_Args>(__args)...)
 	{ static_assert(sizeof...(_Args) == sizeof...(_BoundArgs)); }
 
+  template
+	explicit constexpr
+	_Bind_front(__tag2*, const _Bind_front<_Fd, _BArgs...>& __fn,
+		_Args&&... __args)
+	: _M_fd(__fn._M_fd),
+	  _M_bound_args(std::tuple_cat(__fn._M_bound_args,
+		std::make_tuple(std::forward<_Args>(__args)...)))
+	{ }
+
+  template
+	explicit constexpr
+	_Bind_front(__tag2*, _Bind_front<_Fd, _BArgs...>&& __fn,
+		_Args&&... __args)
+	: _M_fd(std::move(__fn._M_fd)),
+	  _M_bound_args(std::tuple_cat(std::move(__fn._M_bound_args),
+		std::make_tuple(std::forward<_Args>(__args)...)))
+	{ }
+
   _Bind_front(const _Bind_front&) = default;
   _Bind_front(_Bind_front&&) = default;
   _Bind_front& operator=(const _Bind_front&) = default;
@@ -919,19 +940,42 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   _Fd _M_fd;
   std::tuple<_BoundArgs...> _M_bound_args;
+
+  template
+	friend class _Bind_front;
 };
 
   template<typename _Fn, typename... _Args>
-using _Bind_front_t
-  = _Bind_front<decay_t<_Fn>, unwrap_ref_decay_t<_Args>...>;
+struct _Bind_front_helper
+{
+  using type = _Bind_front<_Fn, _Args...>;
+  using __tag = __tag1*;
+};
+
+  template<typename _Fn, typename... _BoundArgs, typename... _Args>
+struct _Bind_front_helper<_Bind_front<_Fn, _BoundArgs...>, _Args...>
+{
+  using type = _Bind_front<_Fn, _BoundArgs..., _Args...>;
+  using __tag = __tag2*;
+};
+
+  template<typename _Fn, typename... _Args>
+using _Bind_front_t = typename
+  _Bind_front_helper<decay_t<_Fn>, unwrap_ref_decay_t<_Args>...>::type;
+
+  template<typename _Fn, typename... _Args>
+using _Bind_front_tag = typename
+  _Bind_front_helper<decay_t<_Fn>, unwrap_ref_decay_t<_Args>...>::__tag;
 
   template<typename _Fn, typename... _Args>
 _Bind_front_t<_Fn, _Args...>
 bind_front(_Fn&& __fn, _Args&&... __args)
-noexcept(is_nothrow_constructible_v<int, _Bind_front_t<_Fn, _Args...>,
+noexcept(is_nothrow_constructible_v<_Bind_front_tag<_Fn, _Args...>,
+	_Bind_front_t<_Fn, _Args...>,
 	_Fn, _Args...>)
 {
-  return _Bind_front_t<_Fn, _Args...>(0, std::forward<_Fn>(__fn),
+  return _Bind_front_t<_Fn, _Args...>(_Bind_front_tag<_Fn, _Args...>(),
+	  std::forward<_Fn>(__fn),
 	  std::forward<_Args>(__args)...);
 }
 #endif


Re: [PATCH] P0356R5 Simplified partial function application

2019-03-07 Thread Jonathan Wakely

On 07/03/19 14:15 +, Jonathan Wakely wrote:

* include/std/functional [C++20] (_Bind_front, _Bind_front_t): Define
helpers for bind_front.
(bind_front, __cpp_lib_bind_front): Define.
* testsuite/20_util/function_objects/bind_front/1.cc: New test.


The new test had a typo, which wasn't noticed because the test was
only being compiled, not executed. Fixed like so, committed to trunk.

commit 2d3fddd4358a7ab0f92aa3295c7ac04c8dc6390f
Author: Jonathan Wakely 
Date:   Thu Mar 7 14:34:21 2019 +

Fix new test to run as well as compile

* testsuite/20_util/function_objects/bind_front/1.cc: Change from
compile test to run. Fix typo.

diff --git a/libstdc++-v3/testsuite/20_util/function_objects/bind_front/1.cc b/libstdc++-v3/testsuite/20_util/function_objects/bind_front/1.cc
index eea31e9e8a5..8ebc2bab41a 100644
--- a/libstdc++-v3/testsuite/20_util/function_objects/bind_front/1.cc
+++ b/libstdc++-v3/testsuite/20_util/function_objects/bind_front/1.cc
@@ -16,7 +16,7 @@
 // <http://www.gnu.org/licenses/>.
 
 // { dg-options "-std=gnu++2a" }
-// { dg-do compile { target c++2a } }
+// { dg-do run { target c++2a } }
 
 #include <functional>
 #include <testsuite_hooks.h>
@@ -87,7 +87,7 @@ test02()
   // constness and value category should be forwarded to the target object:
   q = g();
   VERIFY( ! q.as_const && q.as_lvalue );
-  std::move(g)();
+  q = std::move(g)();
   VERIFY( ! q.as_const && ! q.as_lvalue );
   q = cg();
   VERIFY( q.as_const && q.as_lvalue );
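
To see why the missing assignment mattered, here is a small stand-alone sketch of the same kind of check. The names quals, probe and call_as_rvalue are hypothetical and only mirror the shape of the test; they are not taken from 1.cc.

```cpp
#include <cassert>
#include <utility>

// Records how the call operator was invoked.
struct quals
{
  bool as_const;
  bool as_lvalue;
};

// One overload per combination of constness and value category.
struct probe
{
  quals operator()() &        { return { false, true  }; }
  quals operator()() const &  { return { true,  true  }; }
  quals operator()() &&       { return { false, false }; }
  quals operator()() const && { return { true,  false }; }
};

inline quals call_as_rvalue()
{
  probe g;
  quals q = g();        // lvalue call: as_lvalue is true
  q = std::move(g)();   // without "q =" the next check would read stale data
  return q;
}
```

A compile-only test accepts both forms of the call, which is why the stale check went unnoticed until the test was actually run.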


[PATCH] Update C++20 status table in libstdc++ manual

2019-03-07 Thread Jonathan Wakely

* doc/xml/manual/status_cxx2020.xml: Update C++20 status.
* doc/html/*: Regenerate.

Committed to trunk.

commit bdac401ac5bb8046cfb19951820af229991151f2
Author: Jonathan Wakely 
Date:   Thu Mar 7 14:19:43 2019 +

Update C++20 status table in libstdc++ manual

* doc/xml/manual/status_cxx2020.xml: Update C++20 status.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
index e2c598190da..d40185c5db6 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
@@ -125,14 +125,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
 Make std::memory_order a scoped enumeration 

   
 http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0439r0.html;>
P0439R0

   
-   
+   9.1 
   
 
 
@@ -170,14 +169,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
 de-pessimize legacy algorithms with std::move 

   
 http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0616r0.pdf;>
P0616R0

   
-   
+   9.1 
   
 
 
@@ -624,15 +622,14 @@ Feature-testing recommendations for C++.
 
 
 
-  
 Simplified partial function application 
   
 http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0356r5.html;>
P0356R5

   
-   
-  
+   9.1 
+   __cpp_lib_bind_front = 201811L 
 
 
 
@@ -647,15 +644,14 @@ Feature-testing recommendations for C++.
 
 
 
-  
 char8_t: A type for UTF-8 characters and strings 

   
 http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0482r6.html;>
P0482R6

   
-   
-  
+   9.1 
+   __cpp_lib_char8_t = 201811L 
 
 
 
@@ -671,14 +667,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
 Utility functions to implement uses-allocator construction 

   
 http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0591r4.pdf;>
P0591R4

   
-   
+   9.1 
   
 
 
@@ -786,26 +781,24 @@ Feature-testing recommendations for C++.
 
 
 
-  
 Constexpr in std::pointer_traits 
   
 http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1006r1.pdf;>
P1006R1

   
-   
+   9.1 
   
 
 
 
-  
 std::assume_aligned 
   
 http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1007r3.pdf;>
P1007R3

   
-   
+   9.1 
   
 
 


re: add tsv110 pipeline scheduling

2019-03-07 Thread wuyuan (E)
Hi James,
The modified patch was posted ten days ago. If you have time, I would 
appreciate your comments soon. Thank you very much!

 Best Regards,

wuyuan

-----Original Message-----
From: wuyuan (E) 
Sent: 4 March 2019 21:46
To: 'James Greenhalgh' 
Cc: 'Kyrill Tkachov' ; 'gcc-patches@gcc.gnu.org' 
; Zhangyichao (AB) ; 
Zhanghaijian (A) ; 'n...@arm.com' ; 
wufeng (O) ; Yangfei (Felix) 
Subject: re: add tsv110 pipeline scheduling
Hi James,
Have you seen the patch submitted last week? If the problems with the patch 
have been fixed, I hope it can go into trunk soon. I look forward to your reply. 
Thank you.


Best Regards,

wuyuan 
-----Original Message-----
From: wuyuan (E)
Sent: 23 February 2019 21:28
To: 'James Greenhalgh' 
Cc: Kyrill Tkachov ; gcc-patches@gcc.gnu.org; 
Zhangyichao (AB) ; Zhanghaijian (A) 
; n...@arm.com; wufeng (O) ; 
Yangfei (Felix) 
Subject: Re: add tsv110 pipeline scheduling

Hi James,
Sorry for not responding to your email in time; the Chinese New Year holiday 
and urgent work got in the way. The three questions you raised in your last 
email are due to my misunderstanding of the pipeline.
On the first question: these instructions occupy both a tsv110_ls* and a 
tsv110_fsu* pipeline at the same time.
The reservation is rewritten as follows:
(define_insn_reservation
  "tsv110_neon_ld4_lane" 9
  (and (eq_attr "tune" "tsv110")
   (eq_attr "type" "neon_load4_all_lanes,neon_load4_all_lanes_q,\
   neon_load4_one_lane,neon_load4_one_lane_q"))
  "(tsv110_ls1 + tsv110_fsu1)|(tsv110_ls1 + tsv110_fsu2)|(tsv110_ls2 + 
tsv110_fsu1)|(tsv110_ls2 + tsv110_fsu2)")

On the second question: these instructions use either the tsv110_fsu1 pipeline 
or the tsv110_fsu2 pipeline.
The reservation is rewritten as follows:
(define_insn_reservation  "tsv110_neon_abd_aba" 4
  (and (eq_attr "tune" "tsv110")
   (eq_attr "type" "neon_abd,neon_arith_acc"))
  "tsv110_fsu1|tsv110_fsu2")

On the third question: these instructions use either the tsv110_fsu1 pipeline 
or the tsv110_fsu2 pipeline.
The reservation is rewritten as follows:
(define_insn_reservation  "tsv110_neon_abd_aba_q" 4
  (and (eq_attr "tune" "tsv110")
   (eq_attr "type" "neon_arith_acc_q"))
  "tsv110_fsu1|tsv110_fsu2")

In addition to the above changes, I asked hardware engineers and colleagues to 
review my patch and corrected some errors. The detailed patch is as 
follows:

  * config/aarch64/aarch64-cores.def (tsv110): Change scheduling model.
  * config/aarch64/aarch64.md: Include "tsv110.md".
  * config/aarch64/tsv110.md: New file.

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index ed56e5e..82d91d6
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -105,7 +105,7 @@ AARCH64_CORE("neoverse-n1",  neoversen1, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_  AARCH64_CORE("neoverse-e1",  neoversee1, cortexa53, 
8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa53, 0x41, 0xd4a, -1)
 
 /* HiSilicon ('H') cores. */
-AARCH64_CORE("tsv110",  tsv110, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | 
AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,  
 0x48, 0xd01, -1)
+AARCH64_CORE("tsv110",  tsv110, tsv110, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | 
AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,  
 0x48, 0xd01, -1)
 
 /* ARMv8.4-A Architecture Processors.  */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md 
index b7cd9fc..861f059 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -361,6 +361,7 @@
 (include "thunderx.md")
 (include "../arm/xgene1.md")
 (include "thunderx2t99.md")
+(include "tsv110.md")
 
 ;; ---
 ;; Jumps and other miscellaneous insns
diff --git a/gcc/config/aarch64/tsv110.md b/gcc/config/aarch64/tsv110.md new 
file mode 100644 index 000..9d12839
--- /dev/null
+++ b/gcc/config/aarch64/tsv110.md
@@ -0,0 +1,708 @@
+;; tsv110 pipeline description
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; 

[PATCH] P0356R5 Simplified partial function application

2019-03-07 Thread Jonathan Wakely

* include/std/functional [C++20] (_Bind_front, _Bind_front_t): Define
helpers for bind_front.
(bind_front, __cpp_lib_bind_front): Define.
* testsuite/20_util/function_objects/bind_front/1.cc: New test.

Tested powerpc64le-linux, committed to trunk.


commit 1944c1ed745ed945860d9e28ff48e3d7436e6ba3
Author: Jonathan Wakely 
Date:   Thu Mar 7 13:06:52 2019 +

P0356R5 Simplified partial function application

* include/std/functional [C++20] (_Bind_front, _Bind_front_t): 
Define
helpers for bind_front.
(bind_front, __cpp_lib_bind_front): Define.
* testsuite/20_util/function_objects/bind_front/1.cc: New test.

diff --git a/libstdc++-v3/include/std/functional 
b/libstdc++-v3/include/std/functional
index 911a041cba5..8cf2c670648 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -836,6 +836,106 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  std::forward<_BoundArgs>(__args)...);
 }
 
+#if __cplusplus > 201703L
+#define __cpp_lib_bind_front 201902L
+
+  template<typename _Fd, typename... _BoundArgs>
+struct _Bind_front
+{
+  static_assert(is_move_constructible_v<_Fd>);
+  static_assert((is_move_constructible_v<_BoundArgs> && ...));
+
+  // First parameter is to ensure this constructor is never used
+  // instead of the copy/move constructor.
+  template<typename _Fn, typename... _Args>
+   explicit constexpr
+   _Bind_front(int, _Fn&& __fn, _Args&&... __args)
+   noexcept(__and_<is_nothrow_constructible<_Fd, _Fn>,
+   is_nothrow_constructible<_BoundArgs, _Args>...>::value)
+   : _M_fd(std::forward<_Fn>(__fn)),
+ _M_bound_args(std::forward<_Args>(__args)...)
+   { static_assert(sizeof...(_Args) == sizeof...(_BoundArgs)); }
+
+  _Bind_front(const _Bind_front&) = default;
+  _Bind_front(_Bind_front&&) = default;
+  _Bind_front& operator=(const _Bind_front&) = default;
+  _Bind_front& operator=(_Bind_front&&) = default;
+  ~_Bind_front() = default;
+
+  template<typename... _CallArgs>
+   constexpr
+   invoke_result_t<_Fd&, _BoundArgs&..., _CallArgs...>
+   operator()(_CallArgs&&... __call_args) &
+   noexcept(is_nothrow_invocable_v<_Fd&, _BoundArgs&..., _CallArgs...>)
+   {
+ return _S_call(*this, _BoundIndices(),
+ std::forward<_CallArgs>(__call_args)...);
+   }
+
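+  template<typename... _CallArgs>
+   constexpr
+   invoke_result_t<const _Fd&, const _BoundArgs&..., _CallArgs...>
+   operator()(_CallArgs&&... __call_args) const &
+   noexcept(is_nothrow_invocable_v<const _Fd&, const _BoundArgs&..., _CallArgs...>)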
+   {
+ return _S_call(*this, _BoundIndices(),
+ std::forward<_CallArgs>(__call_args)...);
+   }
+
+  template<typename... _CallArgs>
+   constexpr
+   invoke_result_t<_Fd, _BoundArgs..., _CallArgs...>
+   operator()(_CallArgs&&... __call_args) &&
+   noexcept(is_nothrow_invocable_v<_Fd, _BoundArgs..., _CallArgs...>)
+   {
+ return _S_call(std::move(*this), _BoundIndices(),
+ std::forward<_CallArgs>(__call_args)...);
+   }
+
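+  template<typename... _CallArgs>
+   constexpr
+   invoke_result_t<const _Fd, const _BoundArgs..., _CallArgs...>
+   operator()(_CallArgs&&... __call_args) const &&
+   noexcept(is_nothrow_invocable_v<const _Fd, const _BoundArgs..., _CallArgs...>)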
+   {
+ return _S_call(std::move(*this), _BoundIndices(),
+ std::forward<_CallArgs>(__call_args)...);
+   }
+
+private:
+  using _BoundIndices = index_sequence_for<_BoundArgs...>;
+
+  template<typename _Tp, size_t... _Ind, typename... _CallArgs>
+   static constexpr
+   decltype(auto)
+   _S_call(_Tp&& __g, index_sequence<_Ind...>, _CallArgs&&... __call_args)
+   {
+ return std::invoke(std::forward<_Tp>(__g)._M_fd,
+ std::get<_Ind>(std::forward<_Tp>(__g)._M_bound_args)...,
+ std::forward<_CallArgs>(__call_args)...);
+   }
+
+  _Fd _M_fd;
+  std::tuple<_BoundArgs...> _M_bound_args;
+};
+
+  template<typename _Fn, typename... _Args>
+using _Bind_front_t
+  = _Bind_front<decay_t<_Fn>, unwrap_ref_decay_t<_Args>...>;
+
+  template<typename _Fn, typename... _Args>
+_Bind_front_t<_Fn, _Args...>
+bind_front(_Fn&& __fn, _Args&&... __args)
+noexcept(is_nothrow_constructible_v<int, _Bind_front_t<_Fn, _Args...>,
+   _Fn, _Args...>)
+{
+  return _Bind_front_t<_Fn, _Args...>(0, std::forward<_Fn>(__fn),
+ std::forward<_Args>(__args)...);
+}
+#endif
+
 #if __cplusplus >= 201402L
   /// Generalized negator.
   template
diff --git a/libstdc++-v3/testsuite/20_util/function_objects/bind_front/1.cc 
b/libstdc++-v3/testsuite/20_util/function_objects/bind_front/1.cc
new file mode 100644
index 000..eea31e9e8a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/function_objects/bind_front/1.cc
@@ -0,0 +1,176 @@
+// Copyright (C) 2014-2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This 

Re: RFA: PATCH to gimple-fold.c for c++/80916, bogus "static but not defined" warning

2019-03-07 Thread Jan Hubicka
> On Thu, Feb 28, 2019 at 12:18 PM Jason Merrill  wrote:
> > On Thu, Feb 28, 2019 at 11:58 AM Jan Hubicka  wrote:
> > > sorry for the late reply - I did not identify it as a patch to the symbol table.
> > > Indeed, can_refer_decl_in_current_unit_p is a good place to test
> > > this.  Is there a reason to restrict this to functions with no body?
> >
> > If the function has a definition, then of course we can refer to it in
> > its own unit.  Am I missing something?
> 
> Ah, yes, I was.  You mean, why do we care about DECL_INITIAL if
> DECL_EXTERNAL is set?  I think I added that check out of caution.
> 
> This would be a more straightforward change:

> commit 6af927c40585a4ff75a83b7cdabe8f9074a8d391
> Author: Jason Merrill 
> Date:   Fri Jan 25 09:09:17 2019 -0500
> 
> PR c++/80916 - spurious "static but not defined" warning.
> 
> Nothing can refer to an internal decl with no definition, so we shouldn't
> treat such a decl as a possible devirtualization target.
> 
> * gimple-fold.c (can_refer_decl_in_current_unit_p): Return false
> for an internal function with no definition.

Yes, this looks fine to me. Thanks a lot!
I hope frontends are not setting this combination of flags in scenarios
this transformation would be valid, but I can't think of a reason they
would be.

Patch is OK.
Honza
> 
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index 7ef5004f5f9..62d2e0abc26 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -121,9 +121,12 @@ can_refer_decl_in_current_unit_p (tree decl, tree 
> from_decl)
>|| !VAR_OR_FUNCTION_DECL_P (decl))
>  return true;
>  
> -  /* Static objects can be referred only if they was not optimized out yet.  
> */
> -  if (!TREE_PUBLIC (decl) && !DECL_EXTERNAL (decl))
> +  /* Static objects can be referred only if they are defined and not 
> optimized
> + out yet.  */
> +  if (!TREE_PUBLIC (decl))
>  {
> +  if (DECL_EXTERNAL (decl))
> + return false;
>/* Before we start optimizing unreachable code we can be sure all
>static objects are defined.  */
>if (symtab->function_flags_ready)
> diff --git a/gcc/testsuite/g++.dg/warn/unused-fn1.C 
> b/gcc/testsuite/g++.dg/warn/unused-fn1.C
> new file mode 100644
> index 000..aabc01b3f44
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/warn/unused-fn1.C
> @@ -0,0 +1,16 @@
> +// PR c++/80916
> +// { dg-options "-Os -Wunused" }
> +
> +struct j {
> +  virtual void dispatch(void *) {}
> +};
> +template <typename>
> +struct i : j {
> +  void dispatch(void *) {} // warning: 'void i< <template-parameter-1-1> 
> >::dispatch(void*) [with <template-parameter-1-1> = {anonymous}::l]' declared 
> 'static' but never defined [-Wunused-function]
> +};
> +namespace {
> +  struct l : i<l> {};
> +}
> +void f(j *k) {
> +  k->dispatch(0);
> +}



[PATCH][OBVIOUS] Revert function removal made in r264561.

2019-03-07 Thread Martin Liška
Hi.

The removal causes build failures of *-vms targets.

I'm going to define the function again.
Martin

gcc/ChangeLog:

2019-03-07  Martin Liska  

* dwarf2out.c (add_AT_vms_delta): Revert function removal.
---
 gcc/dwarf2out.c | 18 ++
 1 file changed, 18 insertions(+)


diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 1b17f2bc1d5..e074ee3fcd1 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3907,6 +3907,8 @@ static void prune_unused_types (void);
 static int maybe_emit_file (struct dwarf_file_data *fd);
 static inline const char *AT_vms_delta1 (dw_attr_node *);
 static inline const char *AT_vms_delta2 (dw_attr_node *);
+static inline void add_AT_vms_delta (dw_die_ref, enum dwarf_attribute,
+ const char *, const char *);
 static void append_entry_to_tmpl_value_parm_die_table (dw_die_ref, tree);
 static void gen_remaining_tmpl_value_param_die_attribute (void);
 static bool generic_type_p (tree);
@@ -5142,6 +5144,22 @@ AT_file (dw_attr_node *a)
   return a->dw_attr_val.v.val_file;
 }
 
+/* Add a vms delta attribute value to a DIE.  */
+
+static inline void
+add_AT_vms_delta (dw_die_ref die, enum dwarf_attribute attr_kind,
+		  const char *lbl1, const char *lbl2)
+{
+  dw_attr_node attr;
+
+  attr.dw_attr = attr_kind;
+  attr.dw_attr_val.val_class = dw_val_class_vms_delta;
+  attr.dw_attr_val.val_entry = NULL;
+  attr.dw_attr_val.v.val_vms_delta.lbl1 = xstrdup (lbl1);
+  attr.dw_attr_val.v.val_vms_delta.lbl2 = xstrdup (lbl2);
+  add_dwarf_attr (die, &attr);
+}
+
 /* Add a symbolic view identifier attribute value to a DIE.  */
 
 static inline void



Re: A bug in vrp_meet?

2019-03-07 Thread Richard Biener
On Wed, Mar 6, 2019 at 11:05 AM Richard Biener
 wrote:
>
> On Tue, Mar 5, 2019 at 10:36 PM Jeff Law  wrote:
> >
> > On 3/5/19 7:44 AM, Richard Biener wrote:
> >
> > > So fixing it properly with also re-optimize_stmt those stmts so we'd CSE
> > > the MAX_EXPR introduced by folding makes it somewhat ugly.
> > >
> > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > >
> > > Any ideas how to make it less so?  I can split out making optimize_stmt
> > > take a gsi * btw, in case that's a more obvious change and it makes the
> > > patch a little smaller.
> > >
> > > Richard.
> > >
> > > 2019-03-05  Richard Biener  
> > >
> > > PR tree-optimization/89595
> > > * tree-ssa-dom.c (dom_opt_dom_walker::optimize_stmt): Take
> > > stmt iterator as reference, take boolean output parameter to
> > > indicate whether the stmt was removed and thus the iterator
> > > already advanced.
> > > (dom_opt_dom_walker::before_dom_children): Re-iterate over
> > > stmts created by folding.
> > >
> > > * gcc.dg/torture/pr89595.c: New testcase.
> > >
> >
> > Well, all the real logic changes are in the before_dom_children method.
> > The bits in optimize_stmt are trivial enough to effectively ignore.
> >
> > I don't see a better way to discover and process statements that are
> > created in the bowels of fold_stmt.
>
> I'm not entirely happy so I created the following alternative which
> is a bit larger and slower due to the pre-pass clearing the visited flag
> but is IMHO easier to follow.  I guess there's plenty of TLC opportunity
> here but then I also hope to retire the VN parts of DOM in favor
> of the non-iterating RPO-VN code...
>
> So - I'd lean to this variant even though it has the extra loop over stmts,
> would you agree?

I have now applied this variant.

Richard.

> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2019-03-06  Richard Biener  
>
> PR tree-optimization/89595
> * tree-ssa-dom.c (dom_opt_dom_walker::optimize_stmt): Take
> stmt iterator as reference, take boolean output parameter to
> indicate whether the stmt was removed and thus the iterator
> already advanced.
> (dom_opt_dom_walker::before_dom_children): Re-iterate over
> stmts created by folding.
>
> * gcc.dg/torture/pr89595.c: New testcase.


Re: [PATCH] Fix PR89618

2019-03-07 Thread Jakub Jelinek
On Thu, Mar 07, 2019 at 01:03:43PM +0100, Richard Biener wrote:
> 
> This fixes a missed vectorization because loop_version (and in the end
> copy_loop_info) didn't copy IVDEP info (safelen) during if-conversion
> versioning.
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> 
> Even though this isn't a regression I'd like to fix this for GCC 9;
> it may appear as a regression compared to the time when we didn't do
> versioning in if-conversion for vectorization (but the testcase relies on
> AVX512 support, which is newer).

LGTM.

> 2019-03-07  Richard Biener  
> 
>   PR middle-end/89618
>   * cfgloopmanip.c (copy_loop_info): Copy forgotten fields.
>   * tree-inline.c (copy_loops): Simplify.
> 
>   * gcc.target/i386/pr89618.c: New testcase.

Jakub


[PATCH] Fix PR89618

2019-03-07 Thread Richard Biener


This fixes a missed vectorization because loop_version (and in the end
copy_loop_info) didn't copy IVDEP info (safelen) during if-conversion
versioning.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Even though this isn't a regression I'd like to fix this for GCC 9;
it may appear as a regression compared to the time when we didn't do
versioning in if-conversion for vectorization (but the testcase relies on
AVX512 support, which is newer).

Richard.

2019-03-07  Richard Biener  

PR middle-end/89618
* cfgloopmanip.c (copy_loop_info): Copy forgotten fields.
* tree-inline.c (copy_loops): Simplify.

* gcc.target/i386/pr89618.c: New testcase.

Index: gcc/cfgloopmanip.c
===
--- gcc/cfgloopmanip.c  (revision 269415)
+++ gcc/cfgloopmanip.c  (working copy)
@@ -1015,10 +1015,15 @@ copy_loop_info (struct loop *loop, struc
   target->any_estimate = loop->any_estimate;
   target->nb_iterations_estimate = loop->nb_iterations_estimate;
   target->estimate_state = loop->estimate_state;
+  target->safelen = loop->safelen;
   target->constraints = loop->constraints;
+  target->can_be_parallel = loop->can_be_parallel;
   target->warned_aggressive_loop_optimizations
 |= loop->warned_aggressive_loop_optimizations;
+  target->dont_vectorize = loop->dont_vectorize;
+  target->force_vectorize = loop->force_vectorize;
   target->in_oacc_kernels_region = loop->in_oacc_kernels_region;
+  target->unroll = loop->unroll;
 }
 
 /* Copies copy of LOOP as subloop of TARGET loop, placing newly
Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c   (revision 269415)
+++ gcc/tree-inline.c   (working copy)
@@ -2666,23 +2666,15 @@ copy_loops (copy_body_data *id,
 
  /* Copy loop meta-data.  */
  copy_loop_info (src_loop, dest_loop);
+ if (dest_loop->unroll)
+   cfun->has_unroll = true;
+ if (dest_loop->force_vectorize)
+   cfun->has_force_vectorize_loops = true;
 
  /* Finally place it into the loop array and the loop tree.  */
  place_new_loop (cfun, dest_loop);
  flow_loop_tree_node_add (dest_parent, dest_loop);
 
- dest_loop->safelen = src_loop->safelen;
- if (src_loop->unroll)
-   {
- dest_loop->unroll = src_loop->unroll;
- cfun->has_unroll = true;
-   }
- dest_loop->dont_vectorize = src_loop->dont_vectorize;
- if (src_loop->force_vectorize)
-   {
- dest_loop->force_vectorize = true;
- cfun->has_force_vectorize_loops = true;
-   }
  if (src_loop->simduid)
{
  dest_loop->simduid = remap_decl (src_loop->simduid, id);
Index: gcc/testsuite/gcc.target/i386/pr89618.c
===
--- gcc/testsuite/gcc.target/i386/pr89618.c (nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr89618.c (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx512f -fdump-tree-vect-details" } */
+
+void foo (int n, int *off, double *a)
+{
+  const int m = 32;
+
+  for (int j = 0; j < n/m; ++j)
+{
+  int const start = j*m;
+  int const end = (j+1)*m;
+
+#pragma GCC ivdep
+  for (int i = start; i < end; ++i)
+   {
+ a[off[i]] = a[i] < 0 ? a[i] : 0;
+   }
+}
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
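
The shape of the fix, with one copy routine owning the per-loop hints and the inliner keeping only the function-level summary flags, can be sketched as below. This is a trimmed-down model under stated assumptions: loop_model, function_model and copy_loop_for_inlining are hypothetical names, and only the fields touched by the patch are represented.

```cpp
#include <cassert>

// Hypothetical, trimmed-down models of GCC's struct loop and struct function.
struct loop_model
{
  int safelen = 0;
  unsigned short unroll = 0;
  bool can_be_parallel = false;
  bool dont_vectorize = false;
  bool force_vectorize = false;
};

struct function_model
{
  bool has_unroll = false;
  bool has_force_vectorize_loops = false;
};

// Single place that knows which per-loop hints exist, so every caller
// (loop versioning as well as inlining) propagates all of them.
inline void copy_loop_info(const loop_model &src, loop_model &dst)
{
  dst.safelen = src.safelen;
  dst.can_be_parallel = src.can_be_parallel;
  dst.dont_vectorize = src.dont_vectorize;
  dst.force_vectorize = src.force_vectorize;
  dst.unroll = src.unroll;
}

// Inliner-side caller: only the function-level summary flags remain here.
inline void copy_loop_for_inlining(const loop_model &src, loop_model &dst,
                                   function_model &fn)
{
  copy_loop_info(src, dst);
  if (dst.unroll)
    fn.has_unroll = true;
  if (dst.force_vectorize)
    fn.has_force_vectorize_loops = true;
}
```

Any caller that goes through copy_loop_info now carries safelen along, which is the hint the ivdep pragma in the testcase relies on.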


Re: [PATCH] improve performance of std::allocator::deallocate

2019-03-07 Thread Jonathan Wakely

On 06/03/19 22:27 +, Pádraig Brady wrote:



On 03/06/2019 01:44 AM, Jonathan Wakely wrote:

On 06/03/19 09:20 +, Pádraig Brady wrote:

On 03/06/2019 12:50 AM, Jonathan Wakely wrote:

On 06/03/19 02:43 +, Pádraig Brady wrote:



On 02/26/2019 04:23 PM, Padraig Brady wrote:



Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.

How serious is the bug? What are the symptoms?


I've updated the commit summary to say it's a crash.
Arguably that's better than mem corruption.


It looks like 5.1.0 is less than a year old, so older versions are
presumably still in wide use.

We could potentially work around it by making
new_allocator::allocate(0) call ::operator new(1) when
__cpp_sized_deallocation is defined, and deallocate(p, 0) call
::operator delete(p, 1). Obviously I'd prefer not to do that,
because
the default operator new already has an equivalent check, and only
programs linking to jemalloc require the workaround.
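
For illustration, the workaround described above could look roughly like this. It is a sketch only: padded_zero_allocator and request_size are hypothetical names, this is not libstdc++'s new_allocator, and overflow of n * sizeof(T) is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical allocator illustrating the workaround.
template<typename T>
struct padded_zero_allocator
{
  using value_type = T;

  // Bytes actually requested from operator new: never 0 when sized
  // deallocation is in use, so allocate and deallocate agree on the size.
  static std::size_t request_size(std::size_t n)
  {
    std::size_t bytes = n * sizeof(T);
#if defined(__cpp_sized_deallocation)
    if (bytes == 0)
      bytes = 1;
#endif
    return bytes;
  }

  T* allocate(std::size_t n)
  { return static_cast<T*>(::operator new(request_size(n))); }

  void deallocate(T* p, std::size_t n)
  {
#if defined(__cpp_sized_deallocation)
    ::operator delete(p, request_size(n));   // sized form, size never 0
#else
    (void) n;
    ::operator delete(p);                    // unsized fallback
#endif
  }
};
```

The cost is an extra branch on every allocation and deallocation, which is the overhead the message above would rather avoid.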


Right the jemalloc fix was released May 2018.
It would be great to avoid the extra workarounds.
Given this would be released about a year after the jemalloc fix was
released,
and that this would only manifest upon program rebuild,
I would be inclined to not add any workarounds.
For reference tcmalloc and glibc malloc were seen to work fine with
this.


Actually the jemalloc issue will only be fixed with the release of 5.2
(a few weeks away).
I've updated the commit message in the attached accordingly.


Hmm, I'm a bit nervous about making a change that will cause crashes
unless using an unreleased version (I know it will be released by the
time GCC 9.1 is released, but some people might upgrade GCC without
upgrading jemalloc).

Yes it's not ideal. It does make it a lot less risky that one has to
rebuild programs to get the new functionality, so existing programs
will be unaffected. Also -fsized-deallocation is only enabled by
default on gcc with -std >= c++14.


The default is -std=gnu++14 so it's enabled unless you explicitly
choose an older dialect or add -fno-sized-deallocation.

Good point :)



On the other hand, zero sized allocations should be rare in practice.

Yes, they were rare in testing here.

So programs _rebuilt_ against the following would need to update
to jemalloc 5.2:

 zero sized allocs, jemalloc<5.2, c++>=14, GCC>=9.1

Hopefully that's a small enough set.


Which versions of jemalloc replace operator delete(void*, size_t) ?

Was that something new in 5.0 or did older versions already provide a
replacement for the sized operator delete?

If it was introduced in 5.0 then there won't be a problem for 3.x and
4.x because they'll use the default definition from libstdc++ which
just calls ::operator delete(p).
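
To make the forwarding behaviour mentioned above concrete, here is a small model of it. The functions in namespace model are hypothetical stand-ins, not the real global operators, and the counter exists only so the forwarding can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace model {  // hypothetical names, not the real global operators

int unsized_calls = 0;

// Stand-in for ::operator delete(void*).
void unsized_delete(void* p) noexcept {
  ++unsized_calls;
  std::free(p);
}

// Stand-in for a default ::operator delete(void*, std::size_t):
// the size argument is ignored and the call is simply forwarded.
void sized_delete(void* p, std::size_t /*size*/) noexcept {
  unsized_delete(p);
}

}  // namespace model
```

With a forwarding definition like this the size value never reaches the allocator, which is why an allocator that does not replace the sized form cannot be affected by a mismatched or zero size.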

Again, a very good point. The replacements were only added in 5.0:
https://github.com/jemalloc/jemalloc/commit/2319152d


OK, that makes me feel better about it. It's presumably much easier to
upgrade to 5.2 from 5.0 or 5.1 than it would be from 4.x.



How complicated is the fix to prevent the crashes? Would it be
feasible for distros to backport that fix? I see that RHEL8 has
jemalloc 5.0.1 for example, but if the fix could be backported to that
release then it's less of a problem.

The patch set is simple enough:
https://github.com/jemalloc/jemalloc/pull/1341/commits


Thanks. That does seem reasonable for distros and other packagers to
backport, if they want to support 5.0 or 5.1 for their users.

I'm leaning towards accepting the patch for gcc-9 (and if not, we
should do it early in the gcc-10 cycle).



Re: [PR 85762, 87008, 85459] Relax MEM_REF check in contains_vce_or_bfcref_p

2019-03-07 Thread Martin Jambor
Hi,

sorry for a somewhat long turnaround...

On Tue, Mar 05 2019, Richard Biener wrote:
> On Tue, 5 Mar 2019, Richard Biener wrote:
>
>> On Tue, 5 Mar 2019, Martin Jambor wrote:
>> > @@ -1165,14 +1165,9 @@ contains_vce_or_bfcref_p (const_tree ref)
>> >ref = TREE_OPERAND (ref, 0);
>> >  }
>> >  
>> > -  if (TREE_CODE (ref) != MEM_REF
>> > -  || TREE_CODE (TREE_OPERAND (ref, 0)) != ADDR_EXPR)
>> > -return false;
>> > -
>> > -  tree mem = TREE_OPERAND (TREE_OPERAND (ref, 0), 0);
>> > -  if (TYPE_MAIN_VARIANT (TREE_TYPE (ref))
>> > -  != TYPE_MAIN_VARIANT (TREE_TYPE (mem)))
>> > -return true;
>> > +  if (TREE_CODE (ref) == MEM_REF
>> > +  && TYPE_REF_CAN_ALIAS_ALL (TREE_TYPE (TREE_OPERAND (ref, 1
>> > +  return true;
>> 
>> This doesn't make much sense to me - why not simply go back to the
>> old implementation which would mean to just
>> 
>>return false;
>> 
>> here?
>
> Ah - because the testcase from r255510 would break...

Yes.

>
> The above is still a "bit" very tied to how we fold memcpy and friends,
> thus unreliable.

Well, I thought about it a bit too but eventually decided the
unreliability is on the false positive side.

>
> Isn't the issue for the memcpy thing that we fail to record an
> access for the char[] access (remember it's now char[] MEM_REF,
> no longer struct X * MEM_REF with ref-all!).  So maybe it now
> _does_ work with just return false...
>

No, it does not work, I had tried.  But yesterday I had another look at
the testcase and realized that the reason it does not is that total
scalarization kicks back in and we ignore data in padding in totally
scalarized aggregates (otherwise we would have never removed the
original accesses to the aggregate which is the point of total
scalarization).

So, if we can cope with the headache of yet another weird flag in SRA
access structure, the following patch also fixes the issue because it
does consider padding important if it is accessed with a type changing
MEM_REF - even when the aggregate is totally scalarized.
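
As a rough illustration of why padding can matter here, consider the sketch below. It is not the pr83141.c testcase; struct S and the helper name are made up for the example, and it assumes a target where S has padding after the short member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct S { short s; long l; };  // typically has padding after 's'

// Copies sizeof(S) bytes from 'in' through two S objects using only
// memcpy and reports whether every byte, padding included, arrives
// unchanged at the end.
bool roundtrip_keeps_all_bytes(const unsigned char* in) {
  assert(in != nullptr);
  S a, b;
  unsigned char out[sizeof(S)];
  std::memcpy(&a, in, sizeof(S));   // type-changing access to 'a'
  std::memcpy(&b, &a, sizeof(S));   // whole-object copy, padding included
  std::memcpy(out, &b, sizeof(S));
  return std::memcmp(in, out, sizeof(S)) == 0;
}
```

If the aggregate copy were replaced by copies of only the scalar members s and l, the bytes in between would be lost, which is the kind of situation the new flag is meant to guard against.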

If we wanted to be even less conservative, the test in
contains_vce_or_bfcref_p could be extended along the lines of the code
in comment 6 of PR 85762 but for all three PRs this fixes, the produced
code is the same (or seems to be the same), so perhaps this is good
enough(TM).

So far I have only tested this on x86_64-linux (and found out I need to
un-XFAIL gcc.dg/guality/pr54970.c which is not so surprising because it
was XFAILed by r255510).  Should I test a bit more and commit this to
trunk (and then to gcc-8-branch)?

Thanks,

Martin



2019-03-06  Martin Jambor  

PR tree-optimization/85762
PR tree-optimization/87008
PR tree-optimization/85459
* tree-sra.c (struct access): New flag grp_type_chaning_ref.
(dump_access): Dump it.
(contains_vce_or_bfcref_p): New parameter, set the bool it points
to if there is a type changing MEM_REF.  Adjust all callers.
(build_accesses_from_assign): Set grp_type_chaning_ref if
appropriate.
(sort_and_splice_var_accesses): Merge grp_type_chaning_ref values.

testsuite/
* g++.dg/tree-ssa/pr87008.C: New test.
* gcc.dg/tree-ssa/pr83141.c: Remove dump test.
---
 gcc/testsuite/g++.dg/tree-ssa/pr87008.C | 17 
 gcc/testsuite/gcc.dg/tree-ssa/pr83141.c |  3 +-
 gcc/tree-sra.c  | 58 ++---
 3 files changed, 60 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr87008.C

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr87008.C 
b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C
new file mode 100644
index 000..eef521f9ad5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+extern void dontcallthis();
+
+struct A { long a, b; };
+struct B : A {};
+template<class T>void cp(T&a,T const&b){a=b;}
+long f(B x){
+  B y; cp(y,x);
+  B z; cp(z,x);
+  if (y.a - z.a)
+dontcallthis ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "dontcallthis" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c
index 73ea45c613c..d1ad3340dbd 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O -fdump-tree-esra-details" } */
+/* { dg-options "-O" } */
 
 volatile short vs;
 volatile long vl;
@@ -34,4 +34,3 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-not "Will attempt to totally scalarize" "esra" 
} } */
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index eeef31ba496..003a54e99f2 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -224,6 +224,10 @@ struct access
  entirely? */
   unsigned grp_total_scalarization : 1;
 
+  /* Set when this portion of the aggregate is accessed either through a
+ non-register type VIEW_CONVERT_EXPR 

Re: [PATCH] x86: Disable jump tables when retpolines are used (PR target/86952).

2019-03-07 Thread Martin Liška
On 3/7/19 11:27 AM, Uros Bizjak wrote:
> On Thu, Mar 7, 2019 at 10:50 AM Martin Liška  wrote:
>>
>> On 3/7/19 9:54 AM, Uros Bizjak wrote:
>>> On Thu, Mar 7, 2019 at 9:45 AM Martin Liška  wrote:

>>>> Hi.
>>>>
>>>> Thanks to Intel guys, we've done some re-measurement in PR86952
>>>> about usage of jump tables when retpolines are used.
>>>> Numbers prove that disabling of JT should be the best for now.
>>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>> Thanks,
>>>> Martin
>>>
>>> Please add a comment above your change.
>>
>> Sure, should be improved.
> 
> Eh, we didn't understand each other... Please add a comment here:

Ah, this place :)

> 
> +  if (ix86_indirect_branch != indirect_branch_keep
> +  && !opts_set->x_flag_jump_tables)
> +opts->x_flag_jump_tables = 0;
> 
> so that in the future it is still documented why this part of the code is needed.

Sure, updated.

Martin

> 
> Uros.
> 
>> Martin
>>
>>>
>>> Uros.
>>>

>>>> gcc/ChangeLog:
>>>>
>>>> 2019-03-06  Martin Liska  
>>>>
>>>> PR target/86952
>>>> * config/i386/i386.c (ix86_option_override_internal): Disable
>>>> jump tables when retpolines are used.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2019-03-06  Martin Liska  
>>>>
>>>> PR target/86952
>>>> * gcc.target/i386/pr86952.c: New test.
>>>> * gcc.target/i386/indirect-thunk-7.c: Use jump tables to match
>>>> scanned pattern.
>>>> * gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
>>>> ---
>>>>  gcc/config/i386/i386.c|  4 
>>>>  .../gcc.target/i386/indirect-thunk-7.c|  2 +-
>>>>  .../gcc.target/i386/indirect-thunk-inline-7.c |  2 +-
>>>>  gcc/testsuite/gcc.target/i386/pr86952.c   | 23 +++
>>>>  4 files changed, 29 insertions(+), 2 deletions(-)
>>>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr86952.c


>>

>From f78914272ba7a9fd19db65876a4d4cab576ddf0f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 6 Mar 2019 13:05:50 +0100
Subject: [PATCH] x86: Disable jump tables when retpolines are used (PR
 target/86952).

Jump tables are implemented on x86_64 with:
	jmp	*.L4(,%rdi,8)

where .L4 contains the list of labels to jump to. When using retpolines,
the instruction is replaced with:
	movq	.L4(,%rdi,8), %rax
	jmp	*%rax

which bypasses/confuses the indirect branch predictor and is slow. In that
case, a decision tree based on if conditions is faster.

gcc/ChangeLog:

2019-03-06  Martin Liska  

	PR target/86952
	* config/i386/i386.c (ix86_option_override_internal): Disable
	jump tables when retpolines are used.

gcc/testsuite/ChangeLog:

2019-03-06  Martin Liska  

	PR target/86952
	* gcc.target/i386/pr86952.c: New test.
	* gcc.target/i386/indirect-thunk-7.c: Use jump tables to match
	scanned pattern.
	* gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
---
 gcc/config/i386/i386.c|  6 +
 .../gcc.target/i386/indirect-thunk-7.c|  2 +-
 .../gcc.target/i386/indirect-thunk-inline-7.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr86952.c   | 23 +++
 4 files changed, 31 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr86952.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c8f9957163b..71e5cfd2897 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4894,6 +4894,12 @@ ix86_option_override_internal (bool main_args_p,
 			   opts->x_param_values,
 			   opts_set->x_param_values);
 
+  /* PR86952: jump table usage with retpolines is slow.
+ The PR provides some numbers about the slowness.  */
+  if (ix86_indirect_branch != indirect_branch_keep
+  && !opts_set->x_flag_jump_tables)
+opts->x_flag_jump_tables = 0;
+
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
index 3c72036dbaf..53868f46558 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic -fjump-tables" } */
 
 void func0 (void);
 void func1 (void);
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
index ea009245a58..e6f064959a1 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk-inline -fno-pic" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk-inline -fno-pic 

Re: [PATCH] x86: Disable jump tables when retpolines are used (PR target/86952).

2019-03-07 Thread Uros Bizjak
On Thu, Mar 7, 2019 at 10:50 AM Martin Liška  wrote:
>
> On 3/7/19 9:54 AM, Uros Bizjak wrote:
> > On Thu, Mar 7, 2019 at 9:45 AM Martin Liška  wrote:
> >>
> >> Hi.
> >>
> >> Thanks to Intel guys, we've done some re-measurement in PR86952
> >> about usage of jump tables when retpolines are used.
> >> Numbers prove that disabling of JT should be the best for now.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >> Thanks,
> >> Martin
> >
> > Please add a comment above your change.
>
> Sure, should be improved.

Eh, we didn't understand each other... Please add a comment here:

+  if (ix86_indirect_branch != indirect_branch_keep
+  && !opts_set->x_flag_jump_tables)
+opts->x_flag_jump_tables = 0;

so that in the future it is still documented why this part of the code is needed.
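
For readers skimming the thread, the intent of the quoted condition can be modelled as below. This is a simplified sketch: the struct and function names are hypothetical stand-ins, and the real code operates on GCC's option structures inside ix86_option_override_internal.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the GCC option state.
enum indirect_branch { indirect_branch_keep, indirect_branch_thunk };

struct options {
  bool flag_jump_tables = true;   // effective value of -fjump-tables
};

struct options_set {
  bool flag_jump_tables = false;  // true if the user gave the flag explicitly
};

// Disable jump tables when retpolines are in use, unless the user made
// an explicit choice on the command line.
void override_jump_tables(indirect_branch ib, options& opts,
                          const options_set& opts_set) {
  if (ib != indirect_branch_keep && !opts_set.flag_jump_tables)
    opts.flag_jump_tables = false;
}
```

The explicit-choice check is what lets the adjusted indirect-thunk tests pass -fjump-tables and still see the jump table pattern they scan for.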

Uros.

> Martin
>
> >
> > Uros.
> >
> >>
> >> gcc/ChangeLog:
> >>
> >> 2019-03-06  Martin Liska  
> >>
> >> PR target/86952
> >> * config/i386/i386.c (ix86_option_override_internal): Disable
> >> jump tables when retpolines are used.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> 2019-03-06  Martin Liska  
> >>
> >> PR target/86952
> >> * gcc.target/i386/pr86952.c: New test.
> >> * gcc.target/i386/indirect-thunk-7.c: Use jump tables to match
> >> scanned pattern.
> >> * gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
> >> ---
> >>  gcc/config/i386/i386.c|  4 
> >>  .../gcc.target/i386/indirect-thunk-7.c|  2 +-
> >>  .../gcc.target/i386/indirect-thunk-inline-7.c |  2 +-
> >>  gcc/testsuite/gcc.target/i386/pr86952.c   | 23 +++
> >>  4 files changed, 29 insertions(+), 2 deletions(-)
> >>  create mode 100644 gcc/testsuite/gcc.target/i386/pr86952.c
> >>
> >>
>


Re: [PATCH] Add missing avx512fintrin.h intrinsics (PR target/89602)

2019-03-07 Thread H.J. Lu
Looks good to me.

Thanks.

On Thu, Mar 7, 2019, 4:15 PM Uros Bizjak  wrote:

> On Thu, Mar 7, 2019 at 9:09 AM Jakub Jelinek  wrote:
> >
> > On Thu, Mar 07, 2019 at 08:11:53AM +0100, Uros Bizjak wrote:
> > > > +(define_insn "*avx512f_load_mask"
> > > > +  [(set (match_operand: 0 "register_operand" "=v")
> > > > +   (vec_merge:
> > > > + (vec_merge:
> > > > +   (vec_duplicate:
> > > > + (match_operand:MODEF 1 "memory_operand" "m"))
> > > > +   (match_operand: 2 "nonimm_or_0_operand" "0C")
> > > > +   (match_operand:QI 3 "nonmemory_operand" "Yk"))
> > >
> > > Is there a reason to have nonmemory_operand predicate here instead of
> > > register_operand?
> >
> > Thanks for catching that, that was from my earlier attempt to have
> > Yk,n constraints and deal with that during output.  For store it was
> > possible, for others there were some cases it couldn't handle but further
> > testing revealed that the combiner already handles most of the constant
> > mask cases right.
> >
> > Here is the updated patch; I've changed this in two spots.  It even improves the
> > constant 1 case (the only one that is still not optimized as much as it
> > should):
> >  f4:
> > -   movzbl  .LC0(%rip), %eax
> > +   movl    $1, %eax
> > kmovw   %eax, %k1
> > vmovsd  (%rsi), %xmm0{%k1}{z}
> > ret
> > Tested so far with make check-gcc RUNTESTFLAGS=i386.exp=avx512f-vmovs*.c
> > and compiling/eyeballing differences on the short testcase I've posted
> > in the description with also the u, -> 1, and u, -> 0, changes, apart
> > from the above f4 no differences.
> >
> > Ok for trunk if it passes another full bootstrap/regtest?
>
> LGTM with another fixup below.
>
> HJ should approve addition of intrinsic in header files.
>
> Thanks,
> Uros.
>
> >
> > 2019-03-07  Jakub Jelinek  
> >
> > PR target/89602
> > * config/i386/sse.md (avx512f_mov_mask,
> > *avx512f_load_mask, avx512f_store_mask): New define_insns.
> > (avx512f_load_mask): New define_expand.
> > * config/i386/i386-builtin.def (__builtin_ia32_loadsd_mask,
> > __builtin_ia32_loadss_mask, __builtin_ia32_storesd_mask,
> > __builtin_ia32_storess_mask, __builtin_ia32_movesd_mask,
> > __builtin_ia32_movess_mask): New builtins.
> > * config/i386/avx512fintrin.h (_mm_mask_load_ss, _mm_maskz_load_ss,
> > _mm_mask_load_sd, _mm_maskz_load_sd, _mm_mask_move_ss,
> > _mm_maskz_move_ss, _mm_mask_move_sd, _mm_maskz_move_sd,
> > _mm_mask_store_ss, _mm_mask_store_sd): New intrinsics.
> >
> > * gcc.target/i386/avx512f-vmovss-1.c: New test.
> > * gcc.target/i386/avx512f-vmovss-2.c: New test.
> > * gcc.target/i386/avx512f-vmovss-3.c: New test.
> > * gcc.target/i386/avx512f-vmovsd-1.c: New test.
> > * gcc.target/i386/avx512f-vmovsd-2.c: New test.
> > * gcc.target/i386/avx512f-vmovsd-3.c: New test.
> >
> > --- gcc/config/i386/sse.md.jj   2019-02-20 23:40:17.119140235 +0100
> > +++ gcc/config/i386/sse.md  2019-03-06 19:15:12.379749161 +0100
> > @@ -1151,6 +1151,67 @@ (define_insn "_load_mask"
> > (set_attr "memory" "none,load")
> > (set_attr "mode" "")])
> >
> > +(define_insn "avx512f_mov_mask"
> > +  [(set (match_operand:VF_128 0 "register_operand" "=v")
> > +   (vec_merge:VF_128
> > + (vec_merge:VF_128
> > +   (match_operand:VF_128 2 "register_operand" "v")
> > +   (match_operand:VF_128 3 "nonimm_or_0_operand" "0C")
> > +   (match_operand:QI 4 "register_operand" "Yk"))
> > + (match_operand:VF_128 1 "register_operand" "v")
> > + (const_int 1)))]
> > +  "TARGET_AVX512F"
> > +  "vmov\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}"
> > +  [(set_attr "type" "ssemov")
> > +   (set_attr "prefix" "evex")
> > +   (set_attr "mode" "")])
> > +
> > +(define_expand "avx512f_load_mask"
> > +  [(set (match_operand: 0 "register_operand")
> > +   (vec_merge:
> > + (vec_merge:
> > +   (vec_duplicate:
> > + (match_operand:MODEF 1 "memory_operand"))
> > +   (match_operand: 2 "nonimm_or_0_operand")
> > +   (match_operand:QI 3 "nonmemory_operand"))
>
> register operand here, the expander should match corresponding insn
> pattern.
>
> > + (match_dup 4)
> > + (const_int 1)))]
> > +  "TARGET_AVX512F"
> > +  "operands[4] = CONST0_RTX (mode);")
> > +
> > +(define_insn "*avx512f_load_mask"
> > +  [(set (match_operand: 0 "register_operand" "=v")
> > +   (vec_merge:
> > + (vec_merge:
> > +   (vec_duplicate:
> > + (match_operand:MODEF 1 "memory_operand" "m"))
> > +   (match_operand: 2 "nonimm_or_0_operand" "0C")
> > +   (match_operand:QI 3 "register_operand" "Yk"))
> > + (match_operand: 4 "const0_operand" "C")
> > + (const_int 1)))]
> > +  "TARGET_AVX512F"
> > +  "vmov\t{%1, %0%{%3%}%N2|%0%{3%}%N2, %1}"
> > +  [(set_attr 

Re: [PATCH] x86: Disable jump tables when retpolines are used (PR target/86952).

2019-03-07 Thread Martin Liška
On 3/7/19 9:54 AM, Uros Bizjak wrote:
> On Thu, Mar 7, 2019 at 9:45 AM Martin Liška  wrote:
>>
>> Hi.
>>
>> Thanks to Intel guys, we've done some re-measurement in PR86952
>> about usage of jump tables when retpolines are used.
>> Numbers prove that disabling of JT should be the best for now.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
> 
> Please add a comment above your change.

Sure, should be improved.

Martin

> 
> Uros.
> 
>>
>> gcc/ChangeLog:
>>
>> 2019-03-06  Martin Liska  
>>
>> PR target/86952
>> * config/i386/i386.c (ix86_option_override_internal): Disable
>> jump tables when retpolines are used.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-03-06  Martin Liska  
>>
>> PR target/86952
>> * gcc.target/i386/pr86952.c: New test.
>> * gcc.target/i386/indirect-thunk-7.c: Use jump tables to match
>> scanned pattern.
>> * gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
>> ---
>>  gcc/config/i386/i386.c|  4 
>>  .../gcc.target/i386/indirect-thunk-7.c|  2 +-
>>  .../gcc.target/i386/indirect-thunk-inline-7.c |  2 +-
>>  gcc/testsuite/gcc.target/i386/pr86952.c   | 23 +++
>>  4 files changed, 29 insertions(+), 2 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr86952.c
>>
>>

>From 54a0f3ed784c05bef0bdddcc6ae4e8677307d989 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 6 Mar 2019 13:05:50 +0100
Subject: [PATCH] x86: Disable jump tables when retpolines are used (PR
 target/86952).

Jump tables are implemented on x86_64 with:
	jmp	*.L4(,%rdi,8)

where .L4 contains the list of labels to jump to. When using retpolines,
the instruction is replaced with:
	movq	.L4(,%rdi,8), %rax
	jmp	*%rax

which bypasses/confuses the indirect branch predictor and is slow. In that
case, a decision tree based on if conditions is faster.

gcc/ChangeLog:

2019-03-06  Martin Liska  

	PR target/86952
	* config/i386/i386.c (ix86_option_override_internal): Disable
	jump tables when retpolines are used.

gcc/testsuite/ChangeLog:

2019-03-06  Martin Liska  

	PR target/86952
	* gcc.target/i386/pr86952.c: New test.
	* gcc.target/i386/indirect-thunk-7.c: Use jump tables to match
	scanned pattern.
	* gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
---
 gcc/config/i386/i386.c|  4 
 .../gcc.target/i386/indirect-thunk-7.c|  2 +-
 .../gcc.target/i386/indirect-thunk-inline-7.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr86952.c   | 23 +++
 4 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr86952.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c8f9957163b..37fe41260dd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4894,6 +4894,10 @@ ix86_option_override_internal (bool main_args_p,
 			   opts->x_param_values,
 			   opts_set->x_param_values);
 
+  if (ix86_indirect_branch != indirect_branch_keep
+  && !opts_set->x_flag_jump_tables)
+opts->x_flag_jump_tables = 0;
+
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
index 3c72036dbaf..53868f46558 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic -fjump-tables" } */
 
 void func0 (void);
 void func1 (void);
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
index ea009245a58..e6f064959a1 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk-inline -fno-pic" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk-inline -fno-pic -fjump-tables" } */
 
 void func0 (void);
 void func1 (void);
diff --git a/gcc/testsuite/gcc.target/i386/pr86952.c b/gcc/testsuite/gcc.target/i386/pr86952.c
new file mode 100644
index 000..3ff3e354878
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr86952.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fdump-tree-switchlower1" } */
+
+int global;
+
+int 
+foo (int x)
+{
+  switch (x & 7)
+{
+  case 0: ; return 1722;
+  case 1: global += 1; return 1060;
+  case 2: ; return 1990;
+  case 3: ; return 1242;
+  case 4: ; return 1466;
+  case 5: ; return 894;
+  case 6: ; return 570;
+  

Re: [PATCH] x86: Disable jump tables when retpolines are used (PR target/86952).

2019-03-07 Thread Uros Bizjak
On Thu, Mar 7, 2019 at 9:45 AM Martin Liška  wrote:
>
> Hi.
>
> Thanks to Intel guys, we've done some re-measurement in PR86952
> about usage of jump tables when retpolines are used.
> Numbers prove that disabling of JT should be the best for now.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin

Please add a comment above your change.

Uros.

>
> gcc/ChangeLog:
>
> 2019-03-06  Martin Liska  
>
> PR target/86952
> * config/i386/i386.c (ix86_option_override_internal): Disable
> jump tables when retpolines are used.
>
> gcc/testsuite/ChangeLog:
>
> 2019-03-06  Martin Liska  
>
> PR target/86952
> * gcc.target/i386/pr86952.c: New test.
> * gcc.target/i386/indirect-thunk-7.c: Use jump tables to match
> scanned pattern.
> * gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
> ---
>  gcc/config/i386/i386.c|  4 
>  .../gcc.target/i386/indirect-thunk-7.c|  2 +-
>  .../gcc.target/i386/indirect-thunk-inline-7.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr86952.c   | 23 +++
>  4 files changed, 29 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr86952.c
>
>


[PATCH] x86: Disable jump tables when retpolines are used (PR target/86952).

2019-03-07 Thread Martin Liška
Hi.

Thanks to the Intel folks, we've re-measured the use of jump tables
with retpolines in PR86952.
The numbers show that disabling jump tables is the best option for now.
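
To show the two lowering strategies side by side at the source level, here is a hedged sketch that reuses the case values from the new pr86952.c test. The function names and global_counter are made up for the example; what the compiler actually emits depends on the flags and on the switch lowering heuristics.

```cpp
#include <cassert>

int global_counter = 0;  // hypothetical side-effect target

// Dense switch: the kind of code that is a candidate for a jump table.
int via_switch(int x) {
  switch (x & 7) {
    case 0: return 1722;
    case 1: global_counter += 1; return 1060;
    case 2: return 1990;
    case 3: return 1242;
    case 4: return 1466;
    case 5: return 894;
    case 6: return 570;
    case 7: return 572;
    default: return 0;
  }
}

// The same mapping written as a balanced tree of comparisons, so the
// dispatch needs only direct conditional branches.
int via_tree(int x) {
  int v = x & 7;
  if (v < 4) {
    if (v < 2) {
      if (v == 0) return 1722;
      global_counter += 1;
      return 1060;
    }
    return v == 2 ? 1990 : 1242;
  }
  if (v < 6) return v == 4 ? 1466 : 894;
  return v == 6 ? 570 : 572;
}
```

Both functions compute the same result; the second form roughly corresponds to the decision tree that is used when jump tables are disabled.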

The patch bootstraps on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-03-06  Martin Liska  

PR target/86952
* config/i386/i386.c (ix86_option_override_internal): Disable
jump tables when retpolines are used.

gcc/testsuite/ChangeLog:

2019-03-06  Martin Liska  

PR target/86952
* gcc.target/i386/pr86952.c: New test.
* gcc.target/i386/indirect-thunk-7.c: Use jump tables to match
scanned pattern.
* gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
---
 gcc/config/i386/i386.c|  4 
 .../gcc.target/i386/indirect-thunk-7.c|  2 +-
 .../gcc.target/i386/indirect-thunk-inline-7.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr86952.c   | 23 +++
 4 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr86952.c


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c8f9957163b..37fe41260dd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4894,6 +4894,10 @@ ix86_option_override_internal (bool main_args_p,
 			   opts->x_param_values,
 			   opts_set->x_param_values);
 
+  if (ix86_indirect_branch != indirect_branch_keep
+  && !opts_set->x_flag_jump_tables)
+opts->x_flag_jump_tables = 0;
+
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
index 3c72036dbaf..53868f46558 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic -fjump-tables" } */
 
 void func0 (void);
 void func1 (void);
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
index ea009245a58..e6f064959a1 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk-inline -fno-pic" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk-inline -fno-pic -fjump-tables" } */
 
 void func0 (void);
 void func1 (void);
diff --git a/gcc/testsuite/gcc.target/i386/pr86952.c b/gcc/testsuite/gcc.target/i386/pr86952.c
new file mode 100644
index 000..3ff3e354878
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr86952.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=thunk -fdump-tree-switchlower1" } */
+
+int global;
+
+int 
+foo (int x)
+{
+  switch (x & 7)
+{
+  case 0: ; return 1722;
+  case 1: global += 1; return 1060;
+  case 2: ; return 1990;
+  case 3: ; return 1242;
+  case 4: ; return 1466;
+  case 5: ; return 894;
+  case 6: ; return 570;
+  case 7: ; return 572;
+  default: return 0;
+}
+}
+
+/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: 1 2 3 4 5 6 7" "switchlower1" } } */



Re: [PATCH] Add missing avx512fintrin.h intrinsics (PR target/89602)

2019-03-07 Thread Uros Bizjak
On Thu, Mar 7, 2019 at 9:09 AM Jakub Jelinek  wrote:
>
> On Thu, Mar 07, 2019 at 08:11:53AM +0100, Uros Bizjak wrote:
> > > +(define_insn "*avx512f_load_mask"
> > > +  [(set (match_operand: 0 "register_operand" "=v")
> > > +   (vec_merge:
> > > + (vec_merge:
> > > +   (vec_duplicate:
> > > + (match_operand:MODEF 1 "memory_operand" "m"))
> > > +   (match_operand: 2 "nonimm_or_0_operand" "0C")
> > > +   (match_operand:QI 3 "nonmemory_operand" "Yk"))
> >
> > Is there a reason to have nonmemory_operand predicate here instead of
> > register_operand?
>
> Thanks for catching that, that was from my earlier attempt to have
> Yk,n constraints and deal with that during output.  For store it was
> possible, for others there were some cases it couldn't handle but further
> testing revealed that the combiner already handles most of the constant
> mask cases right.
>
> Here is updated patch, I've changed this in two spots.  It even improves the
> constant 1 case (the only one that is still not optimized as much as it
> should):
>  f4:
> -   movzbl  .LC0(%rip), %eax
> +   movl    $1, %eax
> kmovw   %eax, %k1
> vmovsd  (%rsi), %xmm0{%k1}{z}
> ret
> Tested so far with make check-gcc RUNTESTFLAGS=i386.exp=avx512f-vmovs*.c
> and compiling/eyeballing differences on the short testcase I've posted
> in the description with also the u, -> 1, and u, -> 0, changes, apart
> from the above f4 no differences.
>
> Ok for trunk if it passes another full bootstrap/regtest?

LGTM with another fixup below.

HJ should approve the addition of intrinsics to the header files.

Thanks,
Uros.

>
> 2019-03-07  Jakub Jelinek  
>
> PR target/89602
> * config/i386/sse.md (avx512f_mov_mask,
> *avx512f_load_mask, avx512f_store_mask): New define_insns.
> (avx512f_load_mask): New define_expand.
> * config/i386/i386-builtin.def (__builtin_ia32_loadsd_mask,
> __builtin_ia32_loadss_mask, __builtin_ia32_storesd_mask,
> __builtin_ia32_storess_mask, __builtin_ia32_movesd_mask,
> __builtin_ia32_movess_mask): New builtins.
> * config/i386/avx512fintrin.h (_mm_mask_load_ss, _mm_maskz_load_ss,
> _mm_mask_load_sd, _mm_maskz_load_sd, _mm_mask_move_ss,
> _mm_maskz_move_ss, _mm_mask_move_sd, _mm_maskz_move_sd,
> _mm_mask_store_ss, _mm_mask_store_sd): New intrinsics.
>
> * gcc.target/i386/avx512f-vmovss-1.c: New test.
> * gcc.target/i386/avx512f-vmovss-2.c: New test.
> * gcc.target/i386/avx512f-vmovss-3.c: New test.
> * gcc.target/i386/avx512f-vmovsd-1.c: New test.
> * gcc.target/i386/avx512f-vmovsd-2.c: New test.
> * gcc.target/i386/avx512f-vmovsd-3.c: New test.
>
> --- gcc/config/i386/sse.md.jj   2019-02-20 23:40:17.119140235 +0100
> +++ gcc/config/i386/sse.md  2019-03-06 19:15:12.379749161 +0100
> @@ -1151,6 +1151,67 @@ (define_insn "_load_mask"
> (set_attr "memory" "none,load")
> (set_attr "mode" "")])
>
> +(define_insn "avx512f_mov_mask"
> +  [(set (match_operand:VF_128 0 "register_operand" "=v")
> +   (vec_merge:VF_128
> + (vec_merge:VF_128
> +   (match_operand:VF_128 2 "register_operand" "v")
> +   (match_operand:VF_128 3 "nonimm_or_0_operand" "0C")
> +   (match_operand:QI 4 "register_operand" "Yk"))
> + (match_operand:VF_128 1 "register_operand" "v")
> + (const_int 1)))]
> +  "TARGET_AVX512F"
> +  "vmov\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "")])
> +
> +(define_expand "avx512f_load_mask"
> +  [(set (match_operand: 0 "register_operand")
> +   (vec_merge:
> + (vec_merge:
> +   (vec_duplicate:
> + (match_operand:MODEF 1 "memory_operand"))
> +   (match_operand: 2 "nonimm_or_0_operand")
> +   (match_operand:QI 3 "nonmemory_operand"))

register_operand here; the expander should match the corresponding insn pattern.

> + (match_dup 4)
> + (const_int 1)))]
> +  "TARGET_AVX512F"
> +  "operands[4] = CONST0_RTX (mode);")
> +
> +(define_insn "*avx512f_load_mask"
> +  [(set (match_operand: 0 "register_operand" "=v")
> +   (vec_merge:
> + (vec_merge:
> +   (vec_duplicate:
> + (match_operand:MODEF 1 "memory_operand" "m"))
> +   (match_operand: 2 "nonimm_or_0_operand" "0C")
> +   (match_operand:QI 3 "register_operand" "Yk"))
> + (match_operand: 4 "const0_operand" "C")
> + (const_int 1)))]
> +  "TARGET_AVX512F"
> +  "vmov\t{%1, %0%{%3%}%N2|%0%{3%}%N2, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "evex")
> +   (set_attr "memory" "load")
> +   (set_attr "mode" "")])
> +
> +(define_insn "avx512f_store_mask"
> +  [(set (match_operand:MODEF 0 "memory_operand" "=m")
> +   (if_then_else:MODEF
> + (and:QI (match_operand:QI 2 "register_operand" "Yk")
> + 

Re: [PATCH] Add missing avx512fintrin.h intrinsics (PR target/89602)

2019-03-07 Thread Jakub Jelinek
On Thu, Mar 07, 2019 at 08:11:53AM +0100, Uros Bizjak wrote:
> > +(define_insn "*avx512f_load_mask"
> > +  [(set (match_operand: 0 "register_operand" "=v")
> > +   (vec_merge:
> > + (vec_merge:
> > +   (vec_duplicate:
> > + (match_operand:MODEF 1 "memory_operand" "m"))
> > +   (match_operand: 2 "nonimm_or_0_operand" "0C")
> > +   (match_operand:QI 3 "nonmemory_operand" "Yk"))
> 
> Is there a reason to have nonmemory_operand predicate here instead of
> register_operand?

Thanks for catching that, that was from my earlier attempt to have
Yk,n constraints and deal with that during output.  For store it was
possible, for others there were some cases it couldn't handle but further
testing revealed that the combiner already handles most of the constant
mask cases right.

Here is the updated patch; I've changed this in two spots.  It even improves
the constant 1 case (the only one that is still not optimized as much as it
should be):
 f4:
-       movzbl  .LC0(%rip), %eax
+       movl    $1, %eax
        kmovw   %eax, %k1
        vmovsd  (%rsi), %xmm0{%k1}{z}
        ret
Tested so far with make check-gcc RUNTESTFLAGS=i386.exp=avx512f-vmovs*.c
and by compiling and eyeballing the differences on the short testcase I
posted in the description, also with the u, -> 1, and u, -> 0, changes;
apart from the f4 difference above, there are no differences.
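
For reference, the behaviour the f4 sequence above is expected to have can be
sketched in plain C.  The source of f4 is not included in this mail, so the
sketch assumes it is a zero-masked scalar load with a constant mask of 1.
The type v2df_model and the model_* helpers are hypothetical names used only
for illustration; they are not part of the patch or of avx512fintrin.h.

```c
#include <stdint.h>

/* Two-lane stand-in for __m128d; lane[0] is the low element.  */
typedef struct { double lane[2]; } v2df_model;

/* Model of the zero-masked scalar load, i.e. the
   vmovsd (mem), %xmm{%k}{z} form: only bit 0 of the mask is
   consulted and the upper lane is always zeroed.  */
static v2df_model
model_maskz_load_sd (uint8_t k, const double *p)
{
  v2df_model r = { { 0.0, 0.0 } };
  if (k & 1)
    r.lane[0] = *p;	/* Memory is read only when the mask bit is set.  */
  return r;
}

/* Model of the merge-masked form: lane 0 keeps SRC's low element
   when bit 0 of the mask is clear; the upper lane is still zeroed.  */
static v2df_model
model_mask_load_sd (v2df_model src, uint8_t k, const double *p)
{
  v2df_model r = { { (k & 1) ? *p : src.lane[0], 0.0 } };
  return r;
}
```

With a constant mask of 1 the model always takes the load path, which is why
materializing the mask with a plain movl $1 is enough in that case.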

Ok for trunk if it passes another full bootstrap/regtest?

2019-03-07  Jakub Jelinek  

PR target/89602
* config/i386/sse.md (avx512f_mov<ssescalarmodelower>_mask,
*avx512f_load<mode>_mask, avx512f_store<mode>_mask): New define_insns.
(avx512f_load<mode>_mask): New define_expand.
* config/i386/i386-builtin.def (__builtin_ia32_loadsd_mask,
__builtin_ia32_loadss_mask, __builtin_ia32_storesd_mask,
__builtin_ia32_storess_mask, __builtin_ia32_movesd_mask,
__builtin_ia32_movess_mask): New builtins.
* config/i386/avx512fintrin.h (_mm_mask_load_ss, _mm_maskz_load_ss,
_mm_mask_load_sd, _mm_maskz_load_sd, _mm_mask_move_ss,
_mm_maskz_move_ss, _mm_mask_move_sd, _mm_maskz_move_sd,
_mm_mask_store_ss, _mm_mask_store_sd): New intrinsics.

* gcc.target/i386/avx512f-vmovss-1.c: New test.
* gcc.target/i386/avx512f-vmovss-2.c: New test.
* gcc.target/i386/avx512f-vmovss-3.c: New test.
* gcc.target/i386/avx512f-vmovsd-1.c: New test.
* gcc.target/i386/avx512f-vmovsd-2.c: New test.
* gcc.target/i386/avx512f-vmovsd-3.c: New test.
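
As a reading aid for the move and store patterns in the patch, here is a
minimal plain-C sketch of the semantics the RTL is meant to express: the outer
vec_merge with (const_int 1) takes only lane 0 from the masked result, and the
store's if_then_else tests only bit 0 of the mask.  The type v2df_model and
the model_* helpers are hypothetical names for illustration; the type is
repeated so the block stands alone.

```c
#include <stdint.h>

/* Two-lane stand-in for __m128d; lane[0] is the low element.  */
typedef struct { double lane[2]; } v2df_model;

/* Model of the masked scalar move: lane 0 is B's low element when
   bit 0 of K is set, otherwise SRC's low element; the upper lane
   always comes from A.  */
static v2df_model
model_mask_move_sd (v2df_model src, uint8_t k, v2df_model a, v2df_model b)
{
  v2df_model r;
  r.lane[0] = (k & 1) ? b.lane[0] : src.lane[0];
  r.lane[1] = a.lane[1];
  return r;
}

/* Model of the masked scalar store: memory is written only when
   bit 0 of K is set, and is left untouched otherwise.  */
static void
model_mask_store_sd (double *p, uint8_t k, v2df_model a)
{
  if (k & 1)
    *p = a.lane[0];
}
```

The zero-masked move variant would correspond to passing an all-zero SRC in
this model.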

--- gcc/config/i386/sse.md.jj   2019-02-20 23:40:17.119140235 +0100
+++ gcc/config/i386/sse.md  2019-03-06 19:15:12.379749161 +0100
@@ -1151,6 +1151,67 @@ (define_insn "<avx512>_load<mode>_mask"
(set_attr "memory" "none,load")
(set_attr "mode" "<sseinsnmode>")])
 
+(define_insn "avx512f_mov<ssescalarmodelower>_mask"
+  [(set (match_operand:VF_128 0 "register_operand" "=v")
+   (vec_merge:VF_128
+ (vec_merge:VF_128
+   (match_operand:VF_128 2 "register_operand" "v")
+   (match_operand:VF_128 3 "nonimm_or_0_operand" "0C")
+   (match_operand:QI 4 "register_operand" "Yk"))
+ (match_operand:VF_128 1 "register_operand" "v")
+ (const_int 1)))]
+  "TARGET_AVX512F"
+  "vmov<ssescalarmodesuffix>\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "<ssescalarmode>")])
+
+(define_expand "avx512f_load<mode>_mask"
+  [(set (match_operand:<ssevecmode> 0 "register_operand")
+   (vec_merge:<ssevecmode>
+ (vec_merge:<ssevecmode>
+   (vec_duplicate:<ssevecmode>
+ (match_operand:MODEF 1 "memory_operand"))
+   (match_operand:<ssevecmode> 2 "nonimm_or_0_operand")
+   (match_operand:QI 3 "nonmemory_operand"))
+ (match_dup 4)
+ (const_int 1)))]
+  "TARGET_AVX512F"
+  "operands[4] = CONST0_RTX (<ssevecmode>mode);")
+
+(define_insn "*avx512f_load<mode>_mask"
+  [(set (match_operand:<ssevecmode> 0 "register_operand" "=v")
+   (vec_merge:<ssevecmode>
+ (vec_merge:<ssevecmode>
+   (vec_duplicate:<ssevecmode>
+ (match_operand:MODEF 1 "memory_operand" "m"))
+   (match_operand:<ssevecmode> 2 "nonimm_or_0_operand" "0C")
+   (match_operand:QI 3 "register_operand" "Yk"))
+ (match_operand:<ssevecmode> 4 "const0_operand" "C")
+ (const_int 1)))]
+  "TARGET_AVX512F"
+  "vmov<ssescalarmodesuffix>\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "memory" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "avx512f_store<mode>_mask"
+  [(set (match_operand:MODEF 0 "memory_operand" "=m")
+   (if_then_else:MODEF
+ (and:QI (match_operand:QI 2 "register_operand" "Yk")
+(const_int 1))
+ (vec_select:MODEF
+   (match_operand:<ssevecmode> 1 "register_operand" "v")
+   (parallel [(const_int 0)]))
+ (match_dup 0)))]
+  "TARGET_AVX512F"
+  "vmov<ssescalarmodesuffix>\t{%1, %0%{%2%}|%0%{%2%}, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "memory" "store")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "<avx512>_blendm<mode>"
   [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")