date:20201112

Re: [2/3][vect] Add widening add, subtract vect patterns

2020-11-12 Thread Richard Biener

On Thu, 12 Nov 2020, Joel Hutton wrote:

> Hi all,
> 
> This patch adds widening add and widening subtract patterns to 
> tree-vect-patterns.

I am missing documentation in md.texi for the new patterns.  In
particular I wonder why you need signed and unsigned variants
for the add/subtract patterns.

We're walking away from adding tree codes for new vectorizer
pieces and instead want to use direct internal functions for them.
Can you rework the patch to use this approach?

Thanks,
Richard.
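
For readers following the thread, a minimal scalar sketch of what a widening add computes may help. It is not taken from the patch: the helper name `widen_add` and the 8-to-16-bit types are assumptions chosen for illustration. It also shows that the same bit pattern widens differently as signed or unsigned; whether that justifies separate patterns is the question raised above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a widening add (not GCC code): every narrow
// element is extended to the wide type first, then added, so the sum cannot
// wrap around in the narrow type.
template <typename Wide, typename Narrow, std::size_t N>
std::array<Wide, N>
widen_add (const std::array<Narrow, N> &a, const std::array<Narrow, N> &b)
{
  std::array<Wide, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = static_cast<Wide> (static_cast<Wide> (a[i])
                                + static_cast<Wide> (b[i]));
  return out;
}
```

Calling `widen_add<std::int16_t>` on `int8_t` inputs sign-extends each element, while `widen_add<std::uint16_t>` on `uint8_t` inputs zero-extends, so the byte 0xFF contributes -1 in the first case and 255 in the second.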

> All 3 patches together bootstrapped and regression tested on aarch64.
> 
> gcc/ChangeLog:
> 
> 2020-11-12  Joel Hutton
> 
>         * expr.c (expand_expr_real_2): add widen_add,widen_subtract cases
>         * optabs-tree.c (optab_for_tree_code): optabs for widening 
> adds,subtracts
>         * optabs.def (OPTAB_D): define vectorized widen add, subtracts
>         * tree-cfg.c (verify_gimple_assign_binary): Add case for widening 
> adds, subtracts
>         * tree-inline.c (estimate_operator_cost): Add case for widening adds, 
> subtracts
>         * tree-vect-generic.c (expand_vector_operations_1): Add case for 
> widening adds, subtracts
>         * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog 
> pattern
>         (vect_recog_widen_sub_pattern): New recog pattern
>         (vect_recog_average_pattern): Update widened add code
>         (vect_recog_average_pattern): Update widened add code
>         * tree-vect-stmts.c (vectorizable_conversion): Add case for widened 
> add, subtract
>         (supportable_widening_operation): Add case for widened add, subtract
>         * tree.def (WIDEN_ADD_EXPR): New tree code
>         (WIDEN_SUB_EXPR): New tree code
>         (VEC_WIDEN_ADD_HI_EXPR): New tree code
>         (VEC_WIDEN_ADD_LO_EXPR): New tree code
>         (VEC_WIDEN_SUB_HI_EXPR): New tree code
>         (VEC_WIDEN_SUB_LO_EXPR): New tree code
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-12  Joel Hutton
> 
>         * gcc.target/aarch64/vect-widen-add.c: New test.
>         * gcc.target/aarch64/vect-widen-sub.c: New test.
> 
> 
> Ok for trunk?
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

Re: Improve handling of memory operands in ipa-icf 2/4

2020-11-12 Thread Richard Biener

On Thu, 12 Nov 2020, Jan Hubicka wrote:

> Hi,
> this is the updated patch.  It fixes the comparison of bitfields, where I now
> check that their bitsizes and bitoffsets match (and OEP_ADDRESS_OF is not
> used for bitfield references).
> I also noticed a problem with the dependence clique in ao_refs_may_alias that
> I copied here.  Instead of base, rbase should be used.
> 
> Finally I ran statistics on when access paths mismatch and noticed
> that I do not really need to check that component_refs and array_refs
> are semantically equivalent since this is implied by earlier tests.
> This is described in an inline comment and simplifies the code.
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Thanks,
Richard.
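
The quoted hunk further down has `compare_ao_refs` return a bitmask of mismatch kinds, with zero meaning the references match, and reports the first set bit in a fixed priority order. A self-contained sketch of that reporting scheme follows. The flag values and the `describe_mismatch` helper are assumptions made for illustration; only the enumerator names come from the diff, and their real definitions are not shown in this thread.

```cpp
#include <cassert>
#include <string>

// Illustrative flag values only; the names mirror those used in the diff.
enum mismatch_flags : unsigned
{
  SEMANTICS = 1u << 0,
  BASE_ALIAS_SET = 1u << 1,
  REF_ALIAS_SET = 1u << 2,
  ACCESS_PATH = 1u << 3
};

// Hypothetical helper: map a mismatch bitmask to the first applicable
// message, checking the bits in the same order as the quoted code.
std::string
describe_mismatch (unsigned flags)
{
  if (!flags)
    return "match";
  if (flags & SEMANTICS)
    return "semantic difference";
  if (flags & BASE_ALIAS_SET)
    return "base alias set difference";
  if (flags & REF_ALIAS_SET)
    return "ref alias set difference";
  if (flags & ACCESS_PATH)
    return "access path difference";
  return "other difference";
}
```

When several bits are set, only the highest-priority one is reported, which keeps the dump message short at the cost of hiding the other differences.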

> Honza
> 
> 
>   * ipa-icf-gimple.c: Include tree-ssa-alias-compare.h.
>   (func_checker::func_checker): Initialize m_tbaa.
>   (func_checker::hash_operand): Use hash_ao_ref for memory accesses.
>   (func_checker::compare_operand): Use compare_ao_refs for memory
>   accesses.
>   (func_checker::compare_gimple_assign): Do not check LHS types
>   of memory stores.
>   * ipa-icf-gimple.h (func_checker): Derive from ao_compare;
>   add m_tbaa.
>   * ipa-icf.c: Include tree-ssa-alias-compare.h.
>   (sem_function::equals_private): Update call of
>   func_checker::func_checker.
>   * ipa-utils.h (lto_streaming_expected_p): New inline
>   predicate.
>   * tree-ssa-alias-compare.h: New file.
>   * tree-ssa-alias.c: Include tree-ssa-alias-compare.h
>   and builtins.h
>   (view_converted_memref_p): New function.
>   (types_equal_for_same_type_for_tbaa_p): New function.
>   (ao_compare::compare_ao_refs): New member function.
>   (ao_compare::hash_ao_ref): New function
> 
>   * c-c++-common/Wstringop-overflow-2.c: Disable ICF.
>   * g++.dg/warn/Warray-bounds-8.C: Disable ICF.
> 
> index f75951f7c49..26337dd7384 100644
> --- a/gcc/ipa-icf-gimple.c
> +++ b/gcc/ipa-icf-gimple.c
> @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "attribs.h"
>  #include "gimple-walk.h"
>  
> +#include "tree-ssa-alias-compare.h"
>  #include "ipa-icf-gimple.h"
>  
>  namespace ipa_icf_gimple {
> @@ -52,13 +53,13 @@ namespace ipa_icf_gimple {
> of declarations that can be skipped.  */
>  
>  func_checker::func_checker (tree source_func_decl, tree target_func_decl,
> - bool ignore_labels,
> + bool ignore_labels, bool tbaa,
>   hash_set *ignored_source_nodes,
>   hash_set *ignored_target_nodes)
>: m_source_func_decl (source_func_decl), m_target_func_decl 
> (target_func_decl),
>  m_ignored_source_nodes (ignored_source_nodes),
>  m_ignored_target_nodes (ignored_target_nodes),
> -m_ignore_labels (ignore_labels)
> +m_ignore_labels (ignore_labels), m_tbaa (tbaa)
>  {
>function *source_func = DECL_STRUCT_FUNCTION (source_func_decl);
>function *target_func = DECL_STRUCT_FUNCTION (target_func_decl);
> @@ -252,9 +253,16 @@ func_checker::hash_operand (const_tree arg, 
> inchash::hash &hstate,
>  
>  void
>  func_checker::hash_operand (const_tree arg, inchash::hash &hstate,
> - unsigned int flags, operand_access_type)
> + unsigned int flags, operand_access_type access)
>  {
> -  return hash_operand (arg, hstate, flags);
> +  if (access == OP_MEMORY)
> +{
> +  ao_ref ref;
> +  ao_ref_init (&ref, const_cast <tree> (arg));
> +  return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa, hstate);
> +}
> +  else
> +return hash_operand (arg, hstate, flags);
>  }
>  
>  bool
> @@ -314,18 +322,40 @@ func_checker::compare_operand (tree t1, tree t2, 
> operand_access_type access)
>  return true;
>else if (!t1 || !t2)
>  return false;
> -  if (operand_equal_p (t1, t2, OEP_MATCH_SIDE_EFFECTS))
> -return true;
> -  switch (access)
> +  if (access == OP_MEMORY)
>  {
> -case OP_MEMORY:
> -  return return_false_with_msg
> -  ("operand_equal_p failed (access == memory)");
> -case OP_NORMAL:
> +  ao_ref ref1, ref2;
> +  ao_ref_init (&ref1, const_cast <tree> (t1));
> +  ao_ref_init (&ref2, const_cast <tree> (t2));
> +  int flags = compare_ao_refs (&ref1, &ref2,
> +lto_streaming_expected_p (), m_tbaa);
> +
> +  if (!flags)
> + return true;
> +  if (flags & SEMANTICS)
> + return return_false_with_msg
> + ("compare_ao_refs failed (semantic difference)");
> +  if (flags & BASE_ALIAS_SET)
> + return return_false_with_msg
> + ("compare_ao_refs failed (base alias set difference)");
> +  if (flags & REF_ALIAS_SET)
> + return return_false_with_msg
> +  ("compare_ao_refs failed (ref alias set difference)");
> +  if (flags & ACCESS_PATH)
> + return return_false_with_msg
> +  ("compare_ao_refs failed (access path difference)");
> +  if (flags &

Re: Compare field offsets in fold_const when checking addresses

2020-11-12 Thread Richard Biener

On Thu, 12 Nov 2020, Jan Hubicka wrote:

> > On Thu, 12 Nov 2020, Jan Hubicka wrote:
> > 
> > > Hi,
> > > this is the updated patch I am re-testing and plan to commit if it succeeds.
> > > 
> > >   * fold-const.c (operand_compare::operand_equal_p): Compare
> > >   offsets of fields in component_refs when comparing addresses.
> > >   (operand_compare::hash_operand): Likewise.
> > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> > > index c47557daeba..273ee25ceda 100644
> > > --- a/gcc/fold-const.c
> > > +++ b/gcc/fold-const.c
> > > @@ -3312,11 +3312,36 @@ operand_compare::operand_equal_p (const_tree 
> > > arg0, const_tree arg1,
> > >   case COMPONENT_REF:
> > > /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
> > >may be NULL when we're called to compare MEM_EXPRs.  */
> > > -   if (!OP_SAME_WITH_NULL (0)
> > > -   || !OP_SAME (1))
> > > +   if (!OP_SAME_WITH_NULL (0))
> > >   return false;
> > > -   flags &= ~OEP_ADDRESS_OF;
> > > -   return OP_SAME_WITH_NULL (2);
> > > +   /* Most of time we only need to compare FIELD_DECLs for equality.
> > > +  However when determining address look into actual offsets.
> > > +  These may match for unions and unshared record types.  */
> > 
> > looks like you can simplify by doing
> > 
> >   flags &= ~OEP_ADDRESS_OF;
> > 
> > here.  Neither the FIELD_DECL compare nor the offsets need it
> 
> Yep
> > 
> > You elided
> > 
> >   flags &= ~OEP_ADDRESS_OF;
> > - return OP_SAME_WITH_NULL (2);
> > 
> > that was here when OP_SAME (1), please reinstate.
> Sorry for that, that was not very careful.
> Here is the updated patch, which I re-tested on x86_64-linux.

OK.

Richard.
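
The comment in the patch notes that distinct FIELD_DECLs can still name the same address, for example in unions and in unshared record types. A small standalone illustration of that point at the source level follows; the types `value`, `rec_a` and `rec_b` and the helper `same_member_address` are invented for this sketch and do not come from GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two different members of one union start at the same offset.
union value
{
  std::int32_t i;
  float f;
};

// Two unrelated records with the same layout: the fields have different
// names and declarations but identical offsets.
struct rec_a { std::int32_t tag; std::int32_t payload; };
struct rec_b { std::int32_t kind; std::int32_t data; };

static_assert (offsetof (value, i) == offsetof (value, f),
               "union members share offset 0");
static_assert (offsetof (rec_a, payload) == offsetof (rec_b, data),
               "layout-identical records share field offsets");

// The addresses of the two union members compare equal even though the
// member names differ.
bool
same_member_address (value &v)
{
  return static_cast<void *> (&v.i) == static_cast<void *> (&v.f);
}
```

Comparing field declarations alone would treat `&v.i` and `&v.f` as different, whereas comparing offsets recognizes them as the same address.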

> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index c47557daeba..ddf18f27cb7 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -3312,10 +3312,32 @@ operand_compare::operand_equal_p (const_tree arg0, 
> const_tree arg1,
>   case COMPONENT_REF:
> /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
>may be NULL when we're called to compare MEM_EXPRs.  */
> -   if (!OP_SAME_WITH_NULL (0)
> -   || !OP_SAME (1))
> +   if (!OP_SAME_WITH_NULL (0))
>   return false;
> +   /* Most of time we only need to compare FIELD_DECLs for equality.
> +  However when determining address look into actual offsets.
> +  These may match for unions and unshared record types.  */
> flags &= ~OEP_ADDRESS_OF;
> +   if (!OP_SAME (1))
> + {
> +   if (flags & OEP_ADDRESS_OF)
> + {
> +   if (TREE_OPERAND (arg0, 2)
> +   || TREE_OPERAND (arg1, 2))
> + return OP_SAME_WITH_NULL (2);
> +   tree field0 = TREE_OPERAND (arg0, 1);
> +   tree field1 = TREE_OPERAND (arg1, 1);
> +
> +   if (!operand_equal_p (DECL_FIELD_OFFSET (field0),
> + DECL_FIELD_OFFSET (field1), flags)
> +   || !operand_equal_p (DECL_FIELD_BIT_OFFSET (field0),
> +DECL_FIELD_BIT_OFFSET (field1),
> +flags))
> + return false;
> + }
> +   else
> + return false;
> + }
> return OP_SAME_WITH_NULL (2);
>  
>   case BIT_FIELD_REF:
> @@ -3787,9 +3809,26 @@ operand_compare::hash_operand (const_tree t, 
> inchash::hash ,
> sflags = flags;
> break;
>  
> + case COMPONENT_REF:
> +   if (sflags & OEP_ADDRESS_OF)
> + {
> +   hash_operand (TREE_OPERAND (t, 0), hstate, flags);
> +   if (TREE_OPERAND (t, 2))
> + hash_operand (TREE_OPERAND (t, 2), hstate,
> +   flags & ~OEP_ADDRESS_OF);
> +   else
> + {
> +   tree field = TREE_OPERAND (t, 1);
> +   hash_operand (DECL_FIELD_OFFSET (field),
> + hstate, flags & ~OEP_ADDRESS_OF);
> +   hash_operand (DECL_FIELD_BIT_OFFSET (field),
> + hstate, flags & ~OEP_ADDRESS_OF);
> + }
> +   return;
> + }
> +   break;
>   case ARRAY_REF:
>   case ARRAY_RANGE_REF:
> - case COMPONENT_REF:
>   case BIT_FIELD_REF:
> sflags &= ~OEP_ADDRESS_OF;
> break;
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

[Bug middle-end/64711] Unconsistency with -fnon-call-exceptions when used along inline and ipa optimizations and memmov

2020-11-12 Thread rguenther at suse dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64711

--- Comment #8 from rguenther at suse dot de  ---
On Thu, 12 Nov 2020, ebotcazou at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64711
> 
> --- Comment #7 from Eric Botcazou  ---
> > The issue with clearing nothrow is that those pesky builtins have
> > that "sticky" while the per-stmt flag (gimple_call_nothrow ())
> > just amends it.  Guess we might want to fix that (in gimple_call_flags)
> > and then clear the flag always for -fnon-call-exceptions?
> > 
> > I suppose all/most noexcept specifications in libstdc++ are similarly
> > questionable.
> 
> Let's not use too big a hammer though, -fnon-call-exceptions works fine for
> languages (Ada, Go) that enable it by default and I'm quite wary of C++ folks
> who try it once in a while, want to pessimize it because it doesn't work on
> their questionable testcase, and then forget about it.
> 
> Why not just extend what's done in build_common_builtin_nodes for
> __builtin_alloca to the family of __builtin_mem* functions?

Ah, didn't remember this place.  Yes, I guess we could fix that place
but that wouldn't cover the C/C++ frontends since those have
the builtins already (wrongly) declared via the builtins.def machinery
which does mark them NOTHROW (the __builtin_alloca handling also
doesn't work for them).

That means that similar to ATTR_MATHFN_FPROUNDING we'd need
variants of ATTR_NOTHROW_NONNULL_LEAF and some others that make
the NOTHROW part conditional on flag_non_call_exceptions.
Guess that's doable, double checking LTO behavior on merging
of builtins from different CUs with possibly different settings
of -fnon-call-exceptions needs to be done though.

Re: [PATCH 3/3] RISC-V: Support version controling for ISA standard extensions

2020-11-12 Thread Kito Cheng via Gcc-patches

Oh, I was doing a dry run but accidentally cc'd gcc-patches. The patch set
is right; the same patch set was just sent twice.



On Fri, Nov 13, 2020 at 3:29 PM Kito Cheng  wrote:
>
>  - New option -misa-spec support: -misa-spec=[2.2|20190608|20191213] and
>corresponding configuration option --with-isa-spec.
>
>  - Current default ISA spec set to 2.2, but we intend to bump this to
>20191213 or later in next release.
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.c (riscv_ext_version): New.
> (riscv_ext_version_table): Ditto.
> (get_default_version): Ditto.
> (riscv_subset_t::implied_p): New field.
> (riscv_subset_t::riscv_subset_t): Init implied_p.
> (riscv_subset_list::add): New.
> (riscv_subset_list::handle_implied_ext): Pass riscv_subset_t
> instead of separated argument.
> (riscv_subset_list::to_string): Handle zifencei and zicsr, and
> omit version if version is unknown.
> (riscv_subset_list::parsing_subset_version): New argument `ext`,
> remove default_major_version and default_minor_version, get
> default version info via get_default_version.
> (riscv_subset_list::parse_std_ext): Update argument for
> parsing_subset_version calls.
> Handle 2.2 ISA spec, always enable zicsr and zifencei, they are
> included in baseline ISA in that time.
> (riscv_subset_list::parse_multiletter_ext): Update argument for
> `parsing_subset_version` and `add` calls.
> (riscv_subset_list::parse): Adjust argument for
> riscv_subset_list::handle_implied_ext call.
> * config.gcc (riscv*-*-*): Handle --with-isa-spec=.
> * config.in (HAVE_AS_MISA_SPEC): New.
> (HAVE_AS_MARCH_ZIFENCEI): Ditto.
> * config/riscv/riscv-opts.h (riscv_isa_spec_class): New.
> (riscv_isa_spec): Ditto.
> * config/riscv/riscv.h (HAVE_AS_MISA_SPEC): New.
> (ASM_SPEC): Pass -misa-spec if gas supported.
> * config/riscv/riscv.opt (riscv_isa_spec_class) New.
> * configure.ac (HAVE_AS_MARCH_ZIFENCEI): New test.
> (HAVE_AS_MISA_SPEC): Ditto.
> * configure: Regen.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/arch-9.c: New.
> * gcc.target/riscv/arch-10.c: Ditto.
> * gcc.target/riscv/arch-11.c: Ditto.
> * gcc.target/riscv/attribute-6.c: Remove, we don't support G
> with version anymore.
> * gcc.target/riscv/attribute-8.c: Reorder arch string to fit canonical
> ordering.
> * gcc.target/riscv/attribute-9.c: We don't emit version for
> unknown extensions now.
> * gcc.target/riscv/attribute-11.c: Add -misa-spec=2.2 flags.
> * gcc.target/riscv/attribute-12.c: Ditto.
> * gcc.target/riscv/attribute-13.c: Ditto.
> * gcc.target/riscv/attribute-14.c: Ditto.
> * gcc.target/riscv/attribute-15.c: New.
> * gcc.target/riscv/attribute-16.c: Ditto.
> * gcc.target/riscv/attribute-17.c: Ditto.
> ---
>  gcc/common/config/riscv/riscv-common.c| 288 +-
>  gcc/config.gcc|  17 +-
>  gcc/config.in |  12 +
>  gcc/config/riscv/riscv-opts.h |  10 +
>  gcc/config/riscv/riscv.h  |   9 +-
>  gcc/config/riscv/riscv.opt|  17 ++
>  gcc/configure |  62 
>  gcc/configure.ac  |  10 +
>  gcc/testsuite/gcc.target/riscv/arch-10.c  |   6 +
>  gcc/testsuite/gcc.target/riscv/arch-11.c  |   5 +
>  gcc/testsuite/gcc.target/riscv/arch-9.c   |   6 +
>  gcc/testsuite/gcc.target/riscv/attribute-11.c |   2 +-
>  gcc/testsuite/gcc.target/riscv/attribute-12.c |   2 +-
>  gcc/testsuite/gcc.target/riscv/attribute-13.c |   2 +-
>  gcc/testsuite/gcc.target/riscv/attribute-14.c |   4 +-
>  gcc/testsuite/gcc.target/riscv/attribute-15.c |   6 +
>  gcc/testsuite/gcc.target/riscv/attribute-16.c |   6 +
>  gcc/testsuite/gcc.target/riscv/attribute-17.c |   6 +
>  gcc/testsuite/gcc.target/riscv/attribute-6.c  |   6 -
>  gcc/testsuite/gcc.target/riscv/attribute-8.c  |   4 +-
>  gcc/testsuite/gcc.target/riscv/attribute-9.c  |   2 +-
>  21 files changed, 394 insertions(+), 88 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-10.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-11.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-9.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-15.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-16.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-17.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/attribute-6.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.c 
> b/gcc/common/config/riscv/riscv-common.c
> index ca88ca1dacd..ea2d516bb36 100644
> ---

[PATCH 2/3] RISC-V: Support zicsr and zifencei extension for -march.

2020-11-12 Thread Kito Cheng

 - CSR-related instructions and fence instructions have to be split from the
   baseline ISA; zicsr and zifencei are the corresponding sub-extensions.
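
The `riscv_implied_info` change in the diff below adds `zicsr` as an extension implied by both `f` and `d`. A rough sketch of how such a table can be expanded until nothing new is added follows. It assumes a hypothetical `expand_implied` helper and `std::set` storage rather than the `riscv_subset_list` machinery GCC actually uses.

```cpp
#include <cassert>
#include <set>
#include <string>

struct implied_info
{
  const char *ext;      // extension that is present
  const char *implied;  // extension it pulls in
};

// Same pairs as the table in the patch, minus the NULL terminator.
static const implied_info implied_table[] = {
  {"d", "f"},
  {"f", "zicsr"},
  {"d", "zicsr"},
};

// Hypothetical helper: keep adding implied extensions until a full pass
// over the table inserts nothing new.
std::set<std::string>
expand_implied (std::set<std::string> exts)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (const implied_info &info : implied_table)
        if (exts.count (info.ext) && exts.insert (info.implied).second)
          changed = true;
    }
  return exts;
}
```

Starting from `i` and `d`, the loop adds `f` and `zicsr`; a set without `f` or `d` is returned unchanged.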

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (riscv_implied_info):
d and f implied zicsr.
(riscv_ext_flag_table): Handle zicsr and zifencei.
* config/riscv/riscv-opts.h (MASK_ZICSR): New.
(MASK_ZIFENCEI): Ditto.
(TARGET_ZICSR): Ditto.
(TARGET_ZIFENCEI): Ditto.
* config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
Check fence is available by TARGET_ZIFENCEI.
* config/riscv/riscv.opt (riscv_zi_subext): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-8.c: New.
* gcc.target/riscv/attribute-14.c: Ditto.
---
 gcc/common/config/riscv/riscv-common.c| 6 ++
 gcc/config/riscv/riscv-opts.h | 6 ++
 gcc/config/riscv/riscv.c  | 3 +++
 gcc/config/riscv/riscv.md | 7 ---
 gcc/config/riscv/riscv.opt| 3 +++
 gcc/testsuite/gcc.target/riscv/arch-8.c   | 5 +
 gcc/testsuite/gcc.target/riscv/attribute-14.c | 6 ++
 7 files changed, 33 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-14.c

diff --git a/gcc/common/config/riscv/riscv-common.c 
b/gcc/common/config/riscv/riscv-common.c
index f5f7be3cfff..ca88ca1dacd 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -57,6 +57,8 @@ struct riscv_implied_info_t
 static const riscv_implied_info_t riscv_implied_info[] =
 {
   {"d", "f"},
+  {"f", "zicsr"},
+  {"d", "zicsr"},
   {NULL, NULL}
 };
 
@@ -812,6 +814,10 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] 
=
   {"f", &gcc_options::x_target_flags, MASK_HARD_FLOAT},
   {"d", &gcc_options::x_target_flags, MASK_DOUBLE_FLOAT},
   {"c", &gcc_options::x_target_flags, MASK_RVC},
+
+  {"zicsr",    &gcc_options::x_riscv_zi_subext, MASK_ZICSR},
+  {"zifencei", &gcc_options::x_riscv_zi_subext, MASK_ZIFENCEI},
+
   {NULL, NULL, 0}
 };
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 2a3f9d9eef5..de8ac0e038d 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -57,4 +57,10 @@ enum stack_protector_guard {
   SSP_GLOBAL   /* global canary */
 };
 
+#define MASK_ZICSR    (1 << 0)
+#define MASK_ZIFENCEI (1 << 1)
+
+#define TARGET_ZICSR    ((riscv_zi_subext & MASK_ZICSR) != 0)
+#define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 738556539f6..2aaa8e96451 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3337,6 +3337,9 @@ riscv_memmodel_needs_amo_acquire (enum memmodel model)
 static bool
 riscv_memmodel_needs_release_fence (enum memmodel model)
 {
+  if (!TARGET_ZIFENCEI)
+return false;
+
   switch (model)
 {
   case MEMMODEL_ACQ_REL:
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index f15bad3b29e..756b35fb8c0 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1543,19 +1543,20 @@
 LCT_NORMAL, VOIDmode, operands[0], Pmode,
 operands[1], Pmode, const0_rtx, Pmode);
 #else
-  emit_insn (gen_fence_i ());
+  if (TARGET_ZIFENCEI)
+emit_insn (gen_fence_i ());
 #endif
   DONE;
 })
 
 (define_insn "fence"
   [(unspec_volatile [(const_int 0)] UNSPECV_FENCE)]
-  ""
+  "TARGET_ZIFENCEI"
   "%|fence%-")
 
 (define_insn "fence_i"
   [(unspec_volatile [(const_int 0)] UNSPECV_FENCE_I)]
-  ""
+  "TARGET_ZIFENCEI"
   "fence.i")
 
 ;;
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 808b4a04405..ca2fc7c8021 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -183,3 +183,6 @@ Use the given offset for addressing the stack-protector 
guard.
 
 TargetVariable
 long riscv_stack_protector_guard_offset = 0
+
+TargetVariable
+int riscv_zi_subext
diff --git a/gcc/testsuite/gcc.target/riscv/arch-8.c 
b/gcc/testsuite/gcc.target/riscv/arch-8.c
new file mode 100644
index 000..d7760fc576f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-8.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=rv32id_zicsr_zifence -mabi=ilp32" } */
+int foo()
+{
+}
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-14.c 
b/gcc/testsuite/gcc.target/riscv/attribute-14.c
new file mode 100644
index 000..48456277152
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-14.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mriscv-attribute -march=rv32if -mabi=ilp32" } */
+int foo()
+{
+}
+/* { dg-final { scan-assembler ".attribute arch, \"rv32i2p0_f2p0_zicsr2p0\"" } 
} */
-- 
2.29.2

[PATCH 1/3] RISC-V: Handle implied extension in canonical ordering.

2020-11-12 Thread Kito Cheng

 - The ISA spec specifies the order between multi-letter extensions, and
   implied extensions also need to be stored in canonical order, so the
   easiest way is to keep the list in order during insertion.
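
The ordering rule can be sketched independently of the linked-list insertion in the patch. The version below is a simplification under stated assumptions: the helper names `sort_key`, `canonical_less` and `canonical_order` are invented, the single-letter order string `"iemafdqlcbjtpvn"` is an assumption standing in for whatever `riscv_supported_std_ext ()` returns, and unknown multi-letter prefixes are sorted last instead of asserting.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Rank of a single-letter extension; unknown letters sort after known ones.
static int
single_letter_rank (char ext)
{
  static const std::string order = "iemafdqlcbjtpvn"; // assumed order
  std::string::size_type pos = order.find (ext);
  if (pos == std::string::npos)
    return static_cast<int> (order.size ()) + (ext - 'a');
  return static_cast<int> (pos);
}

// Sort key: (multi-letter?, class s < h < z < x, category for z, name).
static std::tuple<int, int, int, std::string>
sort_key (const std::string &ext)
{
  if (ext.size () == 1)
    return std::make_tuple (0, 0, single_letter_rank (ext[0]), ext);

  static const std::string classes = "shzx";
  std::string::size_type cls = classes.find (ext[0]);
  int high = cls == std::string::npos ? 4 : static_cast<int> (cls);
  int low = ext[0] == 'z' ? single_letter_rank (ext[1]) : 0;
  return std::make_tuple (1, high, low, ext);
}

bool
canonical_less (const std::string &a, const std::string &b)
{
  return sort_key (a) < sort_key (b);
}

std::vector<std::string>
canonical_order (std::vector<std::string> exts)
{
  std::sort (exts.begin (), exts.end (), canonical_less);
  return exts;
}
```

Using a tuple as the key keeps the comparison a strict weak ordering by construction. The patch instead returns -1/0/1 from `subset_cmp` and walks the list to find the insertion point.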

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (single_letter_subset_rank): New.
(multi_letter_subset_rank): Ditto.
(subset_cmp): Ditto.
(riscv_subset_list::add): Insert subext in canonical ordering.
(riscv_subset_list::parse_std_ext): Move handle_implied_ext to ...
(riscv_subset_list::parse): ... here.
---
 gcc/common/config/riscv/riscv-common.c | 177 -
 1 file changed, 172 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.c 
b/gcc/common/config/riscv/riscv-common.c
index 9a576eb689b..f5f7be3cfff 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -145,6 +145,129 @@ riscv_subset_list::~riscv_subset_list ()
 }
 }
 
+/* Get the rank for single-letter subsets, lower value meaning higher
+   priority.  */
+
+static int
+single_letter_subset_rank (char ext)
+{
+  int rank;
+
+  switch (ext)
+{
+case 'i':
+  return 0;
+case 'e':
+  return 1;
+default:
+  break;
+}
+
+  const char *all_ext = riscv_supported_std_ext ();
+  const char *ext_pos = strchr (all_ext, ext);
+  if (ext_pos == NULL)
+/* If got an unknown extension letter, then give it an alphabetical
+   order, but after all known standard extension.  */
+rank = strlen (all_ext) + ext - 'a';
+  else
+rank = (int)(ext_pos - all_ext) + 2 /* e and i has higher rank.  */;
+
+  return rank;
+}
+
+/* Get the rank for multi-letter subsets, lower value meaning higher
+   priority.  */
+
+static int
+multi_letter_subset_rank (const std::string &subset)
+{
+  gcc_assert (subset.length () >= 2);
+  int high_order = -1;
+  int low_order = 0;
+  /* The order between multi-char extensions: s -> h -> z -> x.  */
+  char multiletter_class = subset[0];
+  switch (multiletter_class)
+{
+case 's':
+  high_order = 0;
+  break;
+case 'h':
+  high_order = 1;
+  break;
+case 'z':
+  gcc_assert (subset.length () > 2);
+  high_order = 2;
+  break;
+case 'x':
+  high_order = 3;
+  break;
+default:
+  gcc_unreachable ();
+  return -1;
+}
+
+  if (multiletter_class == 'z')
+/* Order for z extension on spec: If multiple "Z" extensions are named,
+   they should be ordered first by category, then alphabetically within a
+   category - for example, "Zicsr_Zifencei_Zam". */
+low_order = single_letter_subset_rank (subset[1]);
+  else
+low_order = 0;
+
+  return (high_order << 8) + low_order;
+}
+
+/* subset compare
+
+  Returns an integral value indicating the relationship between the subsets:
+  Return value  indicates
+  -1            B has higher order than A.
+  0             A and B are same subset.
+  1             A has higher order than B.
+
+*/
+
+static int
+subset_cmp (const std::string &a, const std::string &b)
+{
+  if (a == b)
+return 0;
+
+  size_t a_len = a.length ();
+  size_t b_len = b.length ();
+
+  /* Single-letter extension always get higher order than
+ multi-letter extension.  */
+  if (a_len == 1 && b_len != 1)
+return 1;
+
+  if (a_len != 1 && b_len == 1)
+return -1;
+
+  if (a_len == 1 && b_len == 1)
+{
+  int rank_a = single_letter_subset_rank (a[0]);
+  int rank_b = single_letter_subset_rank (b[0]);
+
+  if (rank_a < rank_b)
+   return 1;
+  else
+   return -1;
+}
+  else
+{
+  int rank_a = multi_letter_subset_rank(a);
+  int rank_b = multi_letter_subset_rank(b);
+
+  /* Using alphabetical/lexicographical order if they have same rank.  */
+  if (rank_a == rank_b)
+   /* The return value of strcmp has opposite meaning.  */
+   return -strcmp (a.c_str (), b.c_str ());
+  else
+   return (rank_a < rank_b) ? 1 : -1;
+}
+}
+
 /* Add new subset to list.  */
 
 void
@@ -152,6 +275,7 @@ riscv_subset_list::add (const char *subset, int 
major_version,
int minor_version, bool explicit_version_p)
 {
   riscv_subset_t *s = new riscv_subset_t ();
+  riscv_subset_t *itr;
 
   if (m_head == NULL)
 m_head = s;
@@ -162,9 +286,45 @@ riscv_subset_list::add (const char *subset, int 
major_version,
   s->explicit_version_p = explicit_version_p;
   s->next = NULL;
 
-  if (m_tail != NULL)
-m_tail->next = s;
+  if (m_tail == NULL)
+{
+  m_tail = s;
+  return;
+}
+
+  /* e, i or g should be first subext, never come here.  */
+  gcc_assert (subset[0] != 'e'
+ && subset[0] != 'i'
+ && subset[0] != 'g');
+
+  if (m_tail == m_head)
+{
+  gcc_assert (m_head->next == NULL);
+  m_head->next = s;
+  m_tail = s;
+  return;
+}
+
+  gcc_assert (m_head->next != NULL);
+
+  /* Subset list must in canonical order, but implied subset

[PATCH 3/3] RISC-V: Support version controling for ISA standard extensions

2020-11-12 Thread Kito Cheng

 - New option -misa-spec support: -misa-spec=[2.2|20190608|20191213] and
   corresponding configuration option --with-isa-spec.

 - Current default ISA spec set to 2.2, but we intend to bump this to
   20191213 or later in next release.

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (riscv_ext_version): New.
(riscv_ext_version_table): Ditto.
(get_default_version): Ditto.
(riscv_subset_t::implied_p): New field.
(riscv_subset_t::riscv_subset_t): Init implied_p.
(riscv_subset_list::add): New.
(riscv_subset_list::handle_implied_ext): Pass riscv_subset_t
instead of separated argument.
(riscv_subset_list::to_string): Handle zifencei and zicsr, and
omit version if version is unknown.
(riscv_subset_list::parsing_subset_version): New argument `ext`,
remove default_major_version and default_minor_version, get
default version info via get_default_version.
(riscv_subset_list::parse_std_ext): Update argument for
parsing_subset_version calls.
Handle 2.2 ISA spec, always enable zicsr and zifencei, they are
included in baseline ISA in that time.
(riscv_subset_list::parse_multiletter_ext): Update argument for
`parsing_subset_version` and `add` calls.
(riscv_subset_list::parse): Adjust argument for
riscv_subset_list::handle_implied_ext call.
* config.gcc (riscv*-*-*): Handle --with-isa-spec=.
* config.in (HAVE_AS_MISA_SPEC): New.
(HAVE_AS_MARCH_ZIFENCEI): Ditto.
* config/riscv/riscv-opts.h (riscv_isa_spec_class): New.
(riscv_isa_spec): Ditto.
* config/riscv/riscv.h (HAVE_AS_MISA_SPEC): New.
(ASM_SPEC): Pass -misa-spec if gas supported.
* config/riscv/riscv.opt (riscv_isa_spec_class) New.
* configure.ac (HAVE_AS_MARCH_ZIFENCEI): New test.
(HAVE_AS_MISA_SPEC): Ditto.
* configure: Regen.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-9.c: New.
* gcc.target/riscv/arch-10.c: Ditto.
* gcc.target/riscv/arch-11.c: Ditto.
* gcc.target/riscv/attribute-6.c: Remove, we don't support G
with version anymore.
* gcc.target/riscv/attribute-8.c: Reorder arch string to fit canonical
ordering.
* gcc.target/riscv/attribute-9.c: We don't emit version for
unknown extensions now.
* gcc.target/riscv/attribute-11.c: Add -misa-spec=2.2 flags.
* gcc.target/riscv/attribute-12.c: Ditto.
* gcc.target/riscv/attribute-13.c: Ditto.
* gcc.target/riscv/attribute-14.c: Ditto.
* gcc.target/riscv/attribute-15.c: New.
* gcc.target/riscv/attribute-16.c: Ditto.
* gcc.target/riscv/attribute-17.c: Ditto.
---
 gcc/common/config/riscv/riscv-common.c| 288 +-
 gcc/config.gcc|  17 +-
 gcc/config.in |  12 +
 gcc/config/riscv/riscv-opts.h |  10 +
 gcc/config/riscv/riscv.h  |   9 +-
 gcc/config/riscv/riscv.opt|  17 ++
 gcc/configure |  62 
 gcc/configure.ac  |  10 +
 gcc/testsuite/gcc.target/riscv/arch-10.c  |   6 +
 gcc/testsuite/gcc.target/riscv/arch-11.c  |   5 +
 gcc/testsuite/gcc.target/riscv/arch-9.c   |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-11.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-12.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-13.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-14.c |   4 +-
 gcc/testsuite/gcc.target/riscv/attribute-15.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-16.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-17.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-6.c  |   6 -
 gcc/testsuite/gcc.target/riscv/attribute-8.c  |   4 +-
 gcc/testsuite/gcc.target/riscv/attribute-9.c  |   2 +-
 21 files changed, 394 insertions(+), 88 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-15.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-17.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/attribute-6.c

diff --git a/gcc/common/config/riscv/riscv-common.c 
b/gcc/common/config/riscv/riscv-common.c
index ca88ca1dacd..ea2d516bb36 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -44,6 +44,7 @@ struct riscv_subset_t
   struct riscv_subset_t *next;
 
   bool explicit_version_p;
+  bool implied_p;
 };
 
 /* Type for implied ISA info.  */
@@ -62,6 +63,58 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {NULL, NULL}
 };
 
+/* This structure holds version information

[PATCH 0/3] RISC-V: Support version controling for ISA standard extensions

2020-11-12 Thread Kito Cheng

The current GCC implementation follows RISC-V ISA spec 2.2. This patch set
implements v20190608 and v20191213, and also adds the option
-misa-spec=[2.2|20190608|20191213] to change the default ISA spec version.

There is one major incompatible

That option affects the default version of each sub-extension; for example, the
I-extension is 2.0 for 2.2 and 2.1 for v20190608 and v20191213.
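
As a hedged illustration of that lookup, the sketch below maps an (extension, spec) pair to a default version. Only the three `i` rows come from the description above; the enum, struct and function names are invented here and are not the layout of the patch's `riscv_ext_version_table`, which is not shown in this message.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Hypothetical names for the three spec versions accepted by -misa-spec.
enum isa_spec { SPEC_2_2, SPEC_20190608, SPEC_20191213 };

struct ext_version
{
  const char *name;
  isa_spec spec;
  int major;
  int minor;
};

// Only the I-extension rows are grounded in the cover letter.
static const ext_version version_table[] = {
  {"i", SPEC_2_2, 2, 0},
  {"i", SPEC_20190608, 2, 1},
  {"i", SPEC_20191213, 2, 1},
};

// Returns true and fills *version when a default is known for EXT under
// SPEC; returns false for extensions the table does not list.
bool
default_version (const char *ext, isa_spec spec,
                 std::pair<int, int> *version)
{
  for (const ext_version &row : version_table)
    if (row.spec == spec && std::strcmp (row.name, ext) == 0)
      {
        *version = std::make_pair (row.major, row.minor);
        return true;
      }
  return false;
}
```

Returning false for an unlisted extension leaves the caller free to omit the version, which is the behavior the cover letter describes for unrecognized extensions.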

We also update the -march parser to fit the latest standard: it uses the canonical
ordering for multi-letter extensions, drops version support for the G extension, and
omits the version for unrecognized extensions.

We also add a special rule for the G extension: imafd can't appear again if the G
extension is present, but zicsr and zifencei can.
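
That rule can be sketched as a small check over an already-split extension list. The function name `g_rule_ok` and the list-of-strings input are assumptions made for this example; the patch itself enforces the rule inside the -march parser.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true when the list is acceptable under the G rule: if "g" is
// present, none of i, m, a, f, d may be listed as well.  Multi-letter
// extensions such as zicsr and zifencei are always allowed.
bool
g_rule_ok (const std::vector<std::string> &exts)
{
  bool has_g = false;
  for (const std::string &e : exts)
    if (e == "g")
      has_g = true;
  if (!has_g)
    return true;

  const std::string covered_by_g = "imafd";
  for (const std::string &e : exts)
    if (e.size () == 1 && covered_by_g.find (e[0]) != std::string::npos)
      return false;
  return true;
}
```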

The default ISA spec stays at 2.2 for now; we will change that in the next GCC release.

[PATCH 3/3] RISC-V: Support version controling for ISA standard extensions

2020-11-12 Thread Kito Cheng

 - New option -misa-spec support: -misa-spec=[2.2|20190608|20191213] and
   corresponding configuration option --with-isa-spec.

 - The current default ISA spec is set to 2.2, but we intend to bump this to
   20191213 or later in the next release.

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (riscv_ext_version): New.
(riscv_ext_version_table): Ditto.
(get_default_version): Ditto.
(riscv_subset_t::implied_p): New field.
(riscv_subset_t::riscv_subset_t): Init implied_p.
(riscv_subset_list::add): New.
(riscv_subset_list::handle_implied_ext): Pass riscv_subset_t
instead of separated argument.
(riscv_subset_list::to_string): Handle zifencei and zicsr, and
omit version if version is unknown.
(riscv_subset_list::parsing_subset_version): New argument `ext`,
remove default_major_version and default_minor_version, get
default version info via get_default_version.
(riscv_subset_list::parse_std_ext): Update argument for
parsing_subset_version calls.
Handle 2.2 ISA spec: always enable zicsr and zifencei, since they were
included in the baseline ISA at that time.
(riscv_subset_list::parse_multiletter_ext): Update argument for
`parsing_subset_version` and `add` calls.
(riscv_subset_list::parse): Adjust argument for
riscv_subset_list::handle_implied_ext call.
* config.gcc (riscv*-*-*): Handle --with-isa-spec=.
* config.in (HAVE_AS_MISA_SPEC): New.
(HAVE_AS_MARCH_ZIFENCEI): Ditto.
* config/riscv/riscv-opts.h (riscv_isa_spec_class): New.
(riscv_isa_spec): Ditto.
* config/riscv/riscv.h (HAVE_AS_MISA_SPEC): New.
(ASM_SPEC): Pass -misa-spec if gas supported.
* config/riscv/riscv.opt (riscv_isa_spec_class): New.
* configure.ac (HAVE_AS_MARCH_ZIFENCEI): New test.
(HAVE_AS_MISA_SPEC): Ditto.
* configure: Regen.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-9.c: New.
* gcc.target/riscv/arch-10.c: Ditto.
* gcc.target/riscv/arch-11.c: Ditto.
* gcc.target/riscv/attribute-6.c: Remove, we don't support G
with version anymore.
* gcc.target/riscv/attribute-8.c: Reorder arch string to fit canonical
ordering.
* gcc.target/riscv/attribute-9.c: We don't emit version for
unknown extensions now.
* gcc.target/riscv/attribute-11.c: Add -misa-spec=2.2 flags.
* gcc.target/riscv/attribute-12.c: Ditto.
* gcc.target/riscv/attribute-13.c: Ditto.
* gcc.target/riscv/attribute-14.c: Ditto.
* gcc.target/riscv/attribute-15.c: New.
* gcc.target/riscv/attribute-16.c: Ditto.
* gcc.target/riscv/attribute-17.c: Ditto.
---
 gcc/common/config/riscv/riscv-common.c| 288 +-
 gcc/config.gcc|  17 +-
 gcc/config.in |  12 +
 gcc/config/riscv/riscv-opts.h |  10 +
 gcc/config/riscv/riscv.h  |   9 +-
 gcc/config/riscv/riscv.opt|  17 ++
 gcc/configure |  62 
 gcc/configure.ac  |  10 +
 gcc/testsuite/gcc.target/riscv/arch-10.c  |   6 +
 gcc/testsuite/gcc.target/riscv/arch-11.c  |   5 +
 gcc/testsuite/gcc.target/riscv/arch-9.c   |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-11.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-12.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-13.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-14.c |   4 +-
 gcc/testsuite/gcc.target/riscv/attribute-15.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-16.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-17.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-6.c  |   6 -
 gcc/testsuite/gcc.target/riscv/attribute-8.c  |   4 +-
 gcc/testsuite/gcc.target/riscv/attribute-9.c  |   2 +-
 21 files changed, 394 insertions(+), 88 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-15.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-17.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/attribute-6.c

diff --git a/gcc/common/config/riscv/riscv-common.c b/gcc/common/config/riscv/riscv-common.c
index ca88ca1dacd..ea2d516bb36 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -44,6 +44,7 @@ struct riscv_subset_t
   struct riscv_subset_t *next;
 
   bool explicit_version_p;
+  bool implied_p;
 };
 
 /* Type for implied ISA info.  */
@@ -62,6 +63,58 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {NULL, NULL}
 };
 
+/* This structure holds version information

[PATCH 1/3] RISC-V: Handle implied extension in canonical ordering.

2020-11-12 Thread Kito Cheng

 - The ISA spec specifies the order between multi-letter extensions. Implied
   extensions also need to be stored in canonical order, so the easiest way
   is to keep the list in order during insertion.
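
The idea of keeping the list ordered at insertion time can be sketched as
follows. This is a simplified illustration, not the patch's implementation:
struct node, ext_cmp and insert_sorted are hypothetical names, the
single-letter order string "imafdc" is an assumed subset of the standard
extensions, and multi-letter names within one class are compared purely
alphabetically here, whereas the patch ranks "z" extensions by category
first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node
{
  const char *name;
  struct node *next;
};

/* Assumed canonical order for a few single-letter extensions.  */
static const char std_order[] = "imafdc";

static int
single_rank (char c)
{
  const char *p = strchr (std_order, c);
  /* Unknown letters go after all known ones, in alphabetical order.  */
  return p != NULL ? (int) (p - std_order)
                   : (int) strlen (std_order) + (c - 'a');
}

/* Single-letter extensions first, then the classes s -> h -> z -> x.  */
static int
class_rank (const char *name)
{
  if (strlen (name) == 1)
    return 0;
  switch (name[0])
    {
    case 's': return 1;
    case 'h': return 2;
    case 'z': return 3;
    case 'x': return 4;
    default:  return 5;
    }
}

/* Negative if A sorts before B, zero if equal, positive otherwise.  */
static int
ext_cmp (const char *a, const char *b)
{
  int ca = class_rank (a), cb = class_rank (b);
  if (ca != cb)
    return ca < cb ? -1 : 1;
  if (ca == 0)
    return single_rank (a[0]) - single_rank (b[0]);
  return strcmp (a, b);
}

/* Insert NAME so that the list stays sorted; return the new head.  */
static struct node *
insert_sorted (struct node *head, const char *name)
{
  struct node *n = malloc (sizeof *n);
  struct node **link = &head;

  assert (n != NULL);
  n->name = name;
  /* Skip every entry that must stay in front of the new one.  */
  while (*link != NULL && ext_cmp ((*link)->name, name) <= 0)
    link = &(*link)->next;
  n->next = *link;
  *link = n;
  return head;
}
```

Inserting "i", "zifencei", "f", "zicsr" and "d" in that order leaves the list
as i, f, d, zicsr, zifencei, so an implied extension added late still ends up
in its canonical position.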

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (single_letter_subset_rank): New.
(multi_letter_subset_rank): Ditto.
(subset_cmp): Ditto.
(riscv_subset_list::add): Insert subext in canonical ordering.
(riscv_subset_list::parse_std_ext): Move handle_implied_ext to ...
(riscv_subset_list::parse): ... here.
---
 gcc/common/config/riscv/riscv-common.c | 177 -
 1 file changed, 172 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.c b/gcc/common/config/riscv/riscv-common.c
index 9a576eb689b..f5f7be3cfff 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -145,6 +145,129 @@ riscv_subset_list::~riscv_subset_list ()
 }
 }
 
+/* Get the rank for single-letter subsets, lower value meaning higher
+   priority.  */
+
+static int
+single_letter_subset_rank (char ext)
+{
+  int rank;
+
+  switch (ext)
+{
+case 'i':
+  return 0;
+case 'e':
+  return 1;
+default:
+  break;
+}
+
+  const char *all_ext = riscv_supported_std_ext ();
+  const char *ext_pos = strchr (all_ext, ext);
+  if (ext_pos == NULL)
+/* If got an unknown extension letter, then give it an alphabetical
+   order, but after all known standard extension.  */
+rank = strlen (all_ext) + ext - 'a';
+  else
+rank = (int)(ext_pos - all_ext) + 2 /* e and i has higher rank.  */;
+
+  return rank;
+}
+
+/* Get the rank for multi-letter subsets, lower value meaning higher
+   priority.  */
+
+static int
+multi_letter_subset_rank (const std::string &subset)
+{
+  gcc_assert (subset.length () >= 2);
+  int high_order = -1;
+  int low_order = 0;
+  /* The order between multi-char extensions: s -> h -> z -> x.  */
+  char multiletter_class = subset[0];
+  switch (multiletter_class)
+{
+case 's':
+  high_order = 0;
+  break;
+case 'h':
+  high_order = 1;
+  break;
+case 'z':
+  gcc_assert (subset.length () > 2);
+  high_order = 2;
+  break;
+case 'x':
+  high_order = 3;
+  break;
+default:
+  gcc_unreachable ();
+  return -1;
+}
+
+  if (multiletter_class == 'z')
+/* Order for z extension on spec: If multiple "Z" extensions are named, they
+   should be ordered first by category, then alphabetically within a
+   category - for example, "Zicsr_Zifencei_Zam". */
+low_order = single_letter_subset_rank (subset[1]);
+  else
+low_order = 0;
+
+  return (high_order << 8) + low_order;
+}
+
+/* subset compare
+
+  Returns an integral value indicating the relationship between the subsets:
+  Return value  indicates
+  -1            B has higher order than A.
+  0             A and B are same subset.
+  1             A has higher order than B.
+
+*/
+
+static int
+subset_cmp (const std::string &a, const std::string &b)
+{
+  if (a == b)
+return 0;
+
+  size_t a_len = a.length ();
+  size_t b_len = b.length ();
+
+  /* Single-letter extension always get higher order than
+ multi-letter extension.  */
+  if (a_len == 1 && b_len != 1)
+return 1;
+
+  if (a_len != 1 && b_len == 1)
+return -1;
+
+  if (a_len == 1 && b_len == 1)
+{
+  int rank_a = single_letter_subset_rank (a[0]);
+  int rank_b = single_letter_subset_rank (b[0]);
+
+  if (rank_a < rank_b)
+   return 1;
+  else
+   return -1;
+}
+  else
+{
+  int rank_a = multi_letter_subset_rank(a);
+  int rank_b = multi_letter_subset_rank(b);
+
+  /* Using alphabetical/lexicographical order if they have same rank.  */
+  if (rank_a == rank_b)
+   /* The return value of strcmp has opposite meaning.  */
+   return -strcmp (a.c_str (), b.c_str ());
+  else
+   return (rank_a < rank_b) ? 1 : -1;
+}
+}
+
 /* Add new subset to list.  */
 
 void
@@ -152,6 +275,7 @@ riscv_subset_list::add (const char *subset, int major_version,
int minor_version, bool explicit_version_p)
 {
   riscv_subset_t *s = new riscv_subset_t ();
+  riscv_subset_t *itr;
 
   if (m_head == NULL)
 m_head = s;
@@ -162,9 +286,45 @@ riscv_subset_list::add (const char *subset, int major_version,
   s->explicit_version_p = explicit_version_p;
   s->next = NULL;
 
-  if (m_tail != NULL)
-m_tail->next = s;
+  if (m_tail == NULL)
+{
+  m_tail = s;
+  return;
+}
+
+  /* e, i or g should be first subext, never come here.  */
+  gcc_assert (subset[0] != 'e'
+ && subset[0] != 'i'
+ && subset[0] != 'g');
+
+  if (m_tail == m_head)
+{
+  gcc_assert (m_head->next == NULL);
+  m_head->next = s;
+  m_tail = s;
+  return;
+}
+
+  gcc_assert (m_head->next != NULL);
+
+  /* Subset list must in canonical order, but implied subset

[PATCH 2/3] RISC-V: Support zicsr and zifencei extension for -march.

2020-11-12 Thread Kito Cheng

 - CSR-related instructions and fence instructions have been split from the
   baseline ISA; zicsr and zifencei are the corresponding sub-extensions.
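
To make the interaction between implied extensions and the new mask bits
concrete, here is a small sketch. The implied pairs (d -> f, f -> zicsr,
d -> zicsr) and the two mask values follow this patch; everything else
(struct ext_set, expand_implied, zi_subext_mask, the fixed-size array) is a
hypothetical simplification and not how riscv-common.c stores subsets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MASK_ZICSR    (1 << 0)
#define MASK_ZIFENCEI (1 << 1)

struct implied
{
  const char *ext;
  const char *implied_ext;
};

static const struct implied implied_info[] =
{
  {"d", "f"},
  {"f", "zicsr"},
  {"d", "zicsr"},
  {NULL, NULL}
};

#define MAX_EXTS 16

/* Hypothetical flat set of extension names for this sketch.  */
struct ext_set
{
  const char *names[MAX_EXTS];
  int count;
};

static int
has_ext (const struct ext_set *s, const char *name)
{
  int i;
  for (i = 0; i < s->count; i++)
    if (strcmp (s->names[i], name) == 0)
      return 1;
  return 0;
}

static void
add_ext (struct ext_set *s, const char *name)
{
  if (has_ext (s, name))
    return;
  assert (s->count < MAX_EXTS);
  s->names[s->count++] = name;
}

/* Add every implied extension.  s->count can grow inside the loop, so
   names added along the way get their own implications checked too.  */
static void
expand_implied (struct ext_set *s)
{
  int i;
  for (i = 0; i < s->count; i++)
    {
      const struct implied *p;
      for (p = implied_info; p->ext != NULL; p++)
        if (strcmp (s->names[i], p->ext) == 0)
          add_ext (s, p->implied_ext);
    }
}

/* Compute the bits that would be tested by TARGET_ZICSR and
   TARGET_ZIFENCEI.  */
static int
zi_subext_mask (const struct ext_set *s)
{
  int mask = 0;
  if (has_ext (s, "zicsr"))
    mask |= MASK_ZICSR;
  if (has_ext (s, "zifencei"))
    mask |= MASK_ZIFENCEI;
  return mask;
}
```

Starting from {i, d}, expansion adds f and zicsr, so only the MASK_ZICSR bit
is set; zifencei has to be requested explicitly in this sketch.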

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (riscv_implied_info):
d and f implied zicsr.
(riscv_ext_flag_table): Handle zicsr and zifencei.
* config/riscv/riscv-opts.h (MASK_ZICSR): New.
(MASK_ZIFENCEI): Ditto.
(TARGET_ZICSR): Ditto.
(TARGET_ZIFENCEI): Ditto.
* config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
Check fence is available by TARGET_ZIFENCEI.
* config/riscv/riscv.opt (riscv_zi_subext): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-8.c: New.
* gcc.target/riscv/attribute-14.c: Ditto.
---
 gcc/common/config/riscv/riscv-common.c| 6 ++
 gcc/config/riscv/riscv-opts.h | 6 ++
 gcc/config/riscv/riscv.c  | 3 +++
 gcc/config/riscv/riscv.md | 7 ---
 gcc/config/riscv/riscv.opt| 3 +++
 gcc/testsuite/gcc.target/riscv/arch-8.c   | 5 +
 gcc/testsuite/gcc.target/riscv/attribute-14.c | 6 ++
 7 files changed, 33 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-14.c

diff --git a/gcc/common/config/riscv/riscv-common.c b/gcc/common/config/riscv/riscv-common.c
index f5f7be3cfff..ca88ca1dacd 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -57,6 +57,8 @@ struct riscv_implied_info_t
 static const riscv_implied_info_t riscv_implied_info[] =
 {
   {"d", "f"},
+  {"f", "zicsr"},
+  {"d", "zicsr"},
   {NULL, NULL}
 };
 
@@ -812,6 +814,10 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"f", &gcc_options::x_target_flags, MASK_HARD_FLOAT},
   {"d", &gcc_options::x_target_flags, MASK_DOUBLE_FLOAT},
   {"c", &gcc_options::x_target_flags, MASK_RVC},
+
+  {"zicsr",    &gcc_options::x_riscv_zi_subext, MASK_ZICSR},
+  {"zifencei", &gcc_options::x_riscv_zi_subext, MASK_ZIFENCEI},
+
   {NULL, NULL, 0}
 };
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 2a3f9d9eef5..de8ac0e038d 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -57,4 +57,10 @@ enum stack_protector_guard {
   SSP_GLOBAL   /* global canary */
 };
 
+#define MASK_ZICSR    (1 << 0)
+#define MASK_ZIFENCEI (1 << 1)
+
+#define TARGET_ZICSR    ((riscv_zi_subext & MASK_ZICSR) != 0)
+#define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 738556539f6..2aaa8e96451 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3337,6 +3337,9 @@ riscv_memmodel_needs_amo_acquire (enum memmodel model)
 static bool
 riscv_memmodel_needs_release_fence (enum memmodel model)
 {
+  if (!TARGET_ZIFENCEI)
+return false;
+
   switch (model)
 {
   case MEMMODEL_ACQ_REL:
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index f15bad3b29e..756b35fb8c0 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1543,19 +1543,20 @@
 LCT_NORMAL, VOIDmode, operands[0], Pmode,
 operands[1], Pmode, const0_rtx, Pmode);
 #else
-  emit_insn (gen_fence_i ());
+  if (TARGET_ZIFENCEI)
+emit_insn (gen_fence_i ());
 #endif
   DONE;
 })
 
 (define_insn "fence"
   [(unspec_volatile [(const_int 0)] UNSPECV_FENCE)]
-  ""
+  "TARGET_ZIFENCEI"
   "%|fence%-")
 
 (define_insn "fence_i"
   [(unspec_volatile [(const_int 0)] UNSPECV_FENCE_I)]
-  ""
+  "TARGET_ZIFENCEI"
   "fence.i")
 
 ;;
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 808b4a04405..ca2fc7c8021 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -183,3 +183,6 @@ Use the given offset for addressing the stack-protector guard.
 
 TargetVariable
 long riscv_stack_protector_guard_offset = 0
+
+TargetVariable
+int riscv_zi_subext
diff --git a/gcc/testsuite/gcc.target/riscv/arch-8.c b/gcc/testsuite/gcc.target/riscv/arch-8.c
new file mode 100644
index 00000000000..d7760fc576f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-8.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=rv32id_zicsr_zifence -mabi=ilp32" } */
+int foo()
+{
+}
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-14.c b/gcc/testsuite/gcc.target/riscv/attribute-14.c
new file mode 100644
index 00000000000..48456277152
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-14.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mriscv-attribute -march=rv32if -mabi=ilp32" } */
+int foo()
+{
+}
+/* { dg-final { scan-assembler ".attribute arch, \"rv32i2p0_f2p0_zicsr2p0\"" } } */
-- 
2.29.2

Re: [PATCH 2/2] loops: Invoke lim after successful loop interchange

2020-11-12 Thread Richard Biener

On Thu, 12 Nov 2020, Martin Jambor wrote:

> Hi,
> 
> On Wed, Nov 11 2020, Richard Biener wrote:
> > On Mon, 9 Nov 2020, Martin Jambor wrote:
> >
> >> this patch modifies the loop invariant pass so that it can operate
> >> only on a single requested loop and its sub-loops and ignore the rest
> >> of the function, much like it currently ignores basic blocks that are
> >> not in any real loop.  It then invokes it from within the loop
> >> interchange pass when it successfully swaps two loops.  This avoids
> >> the non-LTO -Ofast run-time regressions of 410.bwaves and 503.bwaves_r
> >> (which are 19% and 15% faster than current master on an AMD zen2
> >> machine) while not introducing a full LIM pass into the pass pipeline.
> >> 
> >> I have not modified the LIM data structures, this means that it still
> >> contains vectors indexed by loop->num even though only a single loop
> >> nest is actually processed.  I also did not replace the uses of
> >> pre_and_rev_post_order_compute_fn with a function that would count a
> >> postorder only for a given loop.  I can of course do so if the
> >> approach is otherwise deemed viable.
> >> 
> >> The patch adds one additional global variable requested_loop to the
> >> pass and then at various places behaves differently when it is set.  I
> >> was considering storing the fake root loop into it for normal
> >> operation, but since this loop often requires special handling anyway,
> >> I came to the conclusion that the code would actually end up less
> >> straightforward.
> >> 
> >> I have bootstrapped and tested the patch on x86_64-linux and a very
> >> similar one on aarch64-linux.  I have also tested it by modifying the
> >> tree_ssa_lim function to run loop_invariant_motion_from_loop on each
> >> real outermost loop in a function and this variant also passed
> >> bootstrap and all tests, including dump scans, of all languages.
> >> 
> >> I have built the entire SPEC 2006 FPrate monitoring the activity of
> >> the LIM pass without and with the patch (on top of commit b642fca1c31
> >> with which 526.blender_r and 538.imagick_r seemed to be failing) and
> >> it only examined 0.2% more loops, 0.02% more BBs and even fewer
> >> percent of statements because it is invoked only in a rather special
> >> circumstance.  But the patch allows for more such need-based uses at
> >> hopefully reasonable cost.
> >> 
> >> Since I do not have much experience with loop optimizers, I expect
> >> that there will be requests to adjust the patch during the review.
> >> Still, it fixes a performance regression against GCC 9 and so I hope
> >> to address the concerns in time to get it into GCC 11.
> >> 
> 
> [...]
> 
> >
> > That said, in the way it's currently structured I think it's
> > "better" to export tree_ssa_lim () and call it from interchange
> > if any loop was interchanged (thus run a full pass but conditional
> > on interchange done).  You can make it cheaper by adding a flag
> > to tree_ssa_lim whether to do store-motion (I guess this might
> > be an interesting user-visible flag as well and a possibility
> > to make select lim passes cheaper via a pass flag) and not do
> > store-motion from the interchange call.  I think that's how we should
> > fix the regression, refactoring LIM properly requires more work
> > that doesn't seem to fit the stage1 deadline.
> >
> 
> So just like this?  Bootstrapped and tested on x86_64-linux and I have
> verified it fixes the bwaves reduction.

OK.

Thanks,
Richard.


> Thanks,
> 
> Martin
> 
> 
> 
> gcc/ChangeLog:
> 
> 2020-11-12  Martin Jambor  
> 
>   PR tree-optimization/94406
>   * tree-ssa-loop-im.c (tree_ssa_lim): Renamed to
>   loop_invariant_motion_in_fun, added a parameter to control store
>   motion.
>   (pass_lim::execute): Adjust call to tree_ssa_lim, now
>   loop_invariant_motion_in_fun.
>   * tree-ssa-loop-manip.h (loop_invariant_motion_in_fun): Declare.
>   * gimple-loop-interchange.cc (pass_linterchange::execute): Call
>   loop_invariant_motion_in_fun if any interchange has been done.
> ---
>  gcc/gimple-loop-interchange.cc |  9 +++--
>  gcc/tree-ssa-loop-im.c | 12 +++-
>  gcc/tree-ssa-loop-manip.h  |  2 +-
>  3 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
> index 1656004ecf0..a36dbb49b1f 100644
> --- a/gcc/gimple-loop-interchange.cc
> +++ b/gcc/gimple-loop-interchange.cc
> @@ -2085,8 +2085,13 @@ pass_linterchange::execute (function *fun)
>  }
>  
>if (changed_p)
> -scev_reset ();
> -  return changed_p ? (TODO_update_ssa_only_virtuals) : 0;
> +{
> +  unsigned todo = TODO_update_ssa_only_virtuals;
> +  todo |= loop_invariant_motion_in_fun (cfun, false);
> +  scev_reset ();
> +  return todo;
> +}
> +  return 0;
>  }
>  
>  } // anon namespace
> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
> index 6bb07e133cd..3c7412737f0 100644
> ---

[Bug middle-end/64101] GCC considers that the erf math function does not set errno

2020-11-12 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64101

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|NEW                         |RESOLVED

--- Comment #5 from Richard Biener  ---
Recent glibc documents

   These functions do not set errno.

(and in fact they do not).  GCC does not elide the erf() call at -O0, which is
how you can verify whether your erf implementation sets errno; at -O1 GCC
elides the erf call because the result is unused.

According to Joseph, glibc's behavior is correct, and so then is GCC's.

[PATCH v2] PR target/97682 - Fix to reuse t1 register between call address and epilogue.

2020-11-12 Thread Monk Chiang

  - When expanding the call pattern, the t1 register is chosen as the jump
    register. The epilogue also uses t1 to adjust the stack pointer. If both
    are generated in the same function, the call pattern and the epilogue
    each initialize t1. The call pattern emits 'la t1,symbol' and 'jalr t1'
    instructions, and the epilogue emits 'li t1,4096' and 'add sp,sp,t1'
    instructions. But the li and add instructions are placed between the la
    and jalr instructions. Some optimizations then remove the la
    instruction: because t1 is defined twice, the first definition looks
    like a duplicate.

  - To resolve this issue, the prologue and epilogue use t0 as their
    temporary register, and the call pattern uses t1 as its temporary
    register.

  gcc/ChangeLog:

PR target/97682
* config/riscv/riscv.h (RISCV_PROLOGUE_TEMP_REGNUM): Change register to t0.
(RISCV_CALL_ADDRESS_TEMP_REGNUM): New macro, define t1 register.
(RISCV_CALL_ADDRESS_TEMP): Use it for call instructions.
* config/riscv/riscv.c (riscv_legitimize_call_address): Use
RISCV_CALL_ADDRESS_TEMP.
(riscv_compute_frame_info): Change temporary register to t0 from t1.
(riscv_trampoline_init): Adjust comment.

  gcc/testsuite/ChangeLog

PR target/97682
* g++.target/riscv/pr97682.C: New test.
* gcc.target/riscv/interrupt-3.c: Check register for t0.
* gcc.target/riscv/interrupt-4.c: Likewise.
---
 gcc/config/riscv/riscv.c |  23 +--
 gcc/config/riscv/riscv.h |   6 +-
 gcc/testsuite/g++.target/riscv/pr97682.C | 160 +++
 gcc/testsuite/gcc.target/riscv/interrupt-3.c |   4 +-
 gcc/testsuite/gcc.target/riscv/interrupt-4.c |   4 +-
 5 files changed, 181 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/pr97682.C

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 989a9f15250..35029e7b435 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3110,7 +3110,7 @@ riscv_legitimize_call_address (rtx addr)
 {
   if (!call_insn_operand (addr, VOIDmode))
 {
-  rtx reg = RISCV_PROLOGUE_TEMP (Pmode);
+  rtx reg = RISCV_CALL_ADDRESS_TEMP (Pmode);
   riscv_emit_move (reg, addr);
   return reg;
 }
@@ -3707,18 +3707,18 @@ riscv_compute_frame_info (void)
 {
   struct riscv_frame_info *frame;
   HOST_WIDE_INT offset;
-  bool interrupt_save_t1 = false;
+  bool interrupt_save_prologue_temp = false;
   unsigned int regno, i, num_x_saved = 0, num_f_saved = 0;
 
   frame = &cfun->machine->frame;
 
   /* In an interrupt function, if we have a large frame, then we need to
- save/restore t1.  We check for this before clearing the frame struct.  */
+ save/restore t0.  We check for this before clearing the frame struct.  */
   if (cfun->machine->interrupt_handler_p)
 {
   HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
   if (! SMALL_OPERAND (frame->total_size - step1))
-   interrupt_save_t1 = true;
+   interrupt_save_prologue_temp = true;
 }
 
   memset (frame, 0, sizeof (*frame));
@@ -3728,7 +3728,8 @@ riscv_compute_frame_info (void)
   /* Find out which GPRs we need to save.  */
   for (regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
if (riscv_save_reg_p (regno)
-   || (interrupt_save_t1 && (regno == T1_REGNUM)))
+   || (interrupt_save_prologue_temp
+   && (regno == RISCV_PROLOGUE_TEMP_REGNUM)))
  frame->mask |= 1 << (regno - GP_REG_FIRST), num_x_saved++;
 
   /* If this function calls eh_return, we must also save and restore the
@@ -4902,9 +4903,9 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 
   rtx target_function = force_reg (Pmode, XEXP (DECL_RTL (fndecl), 0));
   /* lui t2, hi(chain)
-lui t1, hi(func)
+lui t0, hi(func)
 addi t2, t2, lo(chain)
-jr  r1, lo(func)
+jr  t0, lo(func)
   */
   unsigned HOST_WIDE_INT lui_hi_chain_code, lui_hi_func_code;
   unsigned HOST_WIDE_INT lo_chain_code, lo_func_code;
@@ -4929,7 +4930,7 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   mem = adjust_address (m_tramp, SImode, 0);
   riscv_emit_move (mem, lui_hi_chain);
 
-  /* Gen lui t1, hi(func).  */
+  /* Gen lui t0, hi(func).  */
   rtx hi_func = riscv_force_binary (SImode, PLUS, target_function,
fixup_value);
   hi_func = riscv_force_binary (SImode, AND, hi_func,
@@ -4956,7 +4957,7 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   mem = adjust_address (m_tramp, SImode, 2 * GET_MODE_SIZE (SImode));
   riscv_emit_move (mem, addi_lo_chain);
 
-  /* Gen jr r1, lo(func).  */
+  /* Gen jr t0, lo(func).  */
   rtx lo_func = riscv_force_binary (SImode, AND, target_function,

Re: [PATCH] Support the new ("v0") mangling scheme in rust-demangle.

2020-11-12 Thread Nikhil Benesch via Gcc-patches


On 11/6/20 12:09 PM, Jeff Law wrote:

> So I think the best path forward is to let you and Eduard-Mihai make the
> technical decisions about what bits are ready for the trunk.  When y'all
> think something is ready, let's go ahead and get it installed and
> iterate on things that aren't quite ready yet.
>
> For bits y'all think are ready, ISTM that Eduard-Mihai should commit the
> changes.


I've attached an updated version of the patch that contains some
additional unit tests that eddyb noticed I lost. From my perspective,
this is now ready for commit.

Neither eddyb nor I have write access, so someone else will need to
commit. (But please wait for eddyb to sign off too.)


> It's better to get it in sooner, but there is some degree of freedom
> depending on the impact of the changes.  Changes in the rust demangler
> aren't likely to trigger codegen or ABI breakages in the compiler itself
> -- so with that in mind I think we should give this code a higher degree
> of freedom to land after the stage1 close deadline.


Got it. Thanks. That's very helpful context.

Nikhil
diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
index b87365c85fe..08c615f6d8b 100644
--- a/libiberty/rust-demangle.c
+++ b/libiberty/rust-demangle.c
@@ -1,6 +1,7 @@
 /* Demangler for the Rust programming language
Copyright (C) 2016-2020 Free Software Foundation, Inc.
Written by David Tolnay (dtol...@gmail.com).
+   Rewritten by Eduard-Mihai Burtescu (ed...@lyken.rs) for v0 support.
 
 This file is part of the libiberty library.
 Libiberty is free software; you can redistribute it and/or
@@ -64,11 +65,16 @@ struct rust_demangler
   /* Non-zero if any error occurred. */
   int errored;
 
+  /* Non-zero if nothing should be printed. */
+  int skipping_printing;
+
   /* Non-zero if printing should be verbose (e.g. include hashes). */
   int verbose;
 
   /* Rust mangling version, with legacy mangling being -1. */
   int version;
+
+  uint64_t bound_lifetime_depth;
 };
 
 /* Parsing functions. */
@@ -81,6 +87,18 @@ peek (const struct rust_demangler *rdm)
   return 0;
 }
 
+static int
+eat (struct rust_demangler *rdm, char c)
+{
+  if (peek (rdm) == c)
+{
+  rdm->next++;
+  return 1;
+}
+  else
+return 0;
+}
+
 static char
 next (struct rust_demangler *rdm)
 {
@@ -92,11 +110,87 @@ next (struct rust_demangler *rdm)
   return c;
 }
 
+static uint64_t
+parse_integer_62 (struct rust_demangler *rdm)
+{
+  char c;
+  uint64_t x;
+
+  if (eat (rdm, '_'))
+return 0;
+
+  x = 0;
+  while (!eat (rdm, '_'))
+{
+  c = next (rdm);
+  x *= 62;
+  if (ISDIGIT (c))
+x += c - '0';
+  else if (ISLOWER (c))
+x += 10 + (c - 'a');
+  else if (ISUPPER (c))
+x += 10 + 26 + (c - 'A');
+  else
+{
+  rdm->errored = 1;
+  return 0;
+}
+}
+  return x + 1;
+}
+
+static uint64_t
+parse_opt_integer_62 (struct rust_demangler *rdm, char tag)
+{
+  if (!eat (rdm, tag))
+return 0;
+  return 1 + parse_integer_62 (rdm);
+}
+
+static uint64_t
+parse_disambiguator (struct rust_demangler *rdm)
+{
+  return parse_opt_integer_62 (rdm, 's');
+}
+
+static size_t
+parse_hex_nibbles (struct rust_demangler *rdm, uint64_t *value)
+{
+  char c;
+  size_t hex_len;
+
+  hex_len = 0;
+  *value = 0;
+
+  while (!eat (rdm, '_'))
+{
+  *value <<= 4;
+
+  c = next (rdm);
+  if (ISDIGIT (c))
+*value |= c - '0';
+  else if (c >= 'a' && c <= 'f')
+*value |= 10 + (c - 'a');
+  else
+{
+  rdm->errored = 1;
+  return 0;
+}
+  hex_len++;
+}
+
+  return hex_len;
+}
+
 struct rust_mangled_ident
 {
   /* ASCII part of the identifier. */
   const char *ascii;
   size_t ascii_len;
+
+  /* Punycode insertion codes for Unicode codepoints, if any. */
+  const char *punycode;
+  size_t punycode_len;
 };
 
 static struct rust_mangled_ident
@@ -104,10 +198,16 @@ parse_ident (struct rust_demangler *rdm)
 {
   char c;
   size_t start, len;
+  int is_punycode = 0;
   struct rust_mangled_ident ident;
 
   ident.ascii = NULL;
   ident.ascii_len = 0;
+  ident.punycode = NULL;
+  ident.punycode_len = 0;
+
+  if (rdm->version != -1)
+is_punycode = eat (rdm, 'u');
 
   c = next (rdm);
   if (!ISDIGIT (c))
@@ -121,6 +221,10 @@ parse_ident (struct rust_demangler *rdm)
 while (ISDIGIT (peek (rdm)))
   len = len * 10 + (next (rdm) - '0');
 
+  /* Skip past the optional `_` separator (v0). */
+  if (rdm->version != -1)
+eat (rdm, '_');
+
   start = rdm->next;
   rdm->next += len;
   /* Check for overflows. */
@@ -133,6 +237,27 @@ parse_ident (struct rust_demangler *rdm)
   ident.ascii = rdm->sym + start;
   ident.ascii_len = len;
 
+  if (is_punycode)
+{
+  ident.punycode_len = 0;
+  while (ident.ascii_len > 0)
+{
+  ident.ascii_len--;
+
+  /* The last '_' is a separator between ascii & punycode. */
+  if (ident.ascii[ident.ascii_len] == '_')
+break;
+
+

Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.

2020-11-12 Thread Hongyu Wang via Gcc-patches

Hi

Thanks for reminding me about this patch. I didn't remove any existing
intrinsics; the patch only removes redundant builtin functions that end
users are unlikely to use.

I'm also OK with keeping the current implementation, in case someone is
using the builtins directly.

On Fri, Nov 13, 2020 at 1:43 PM, Jeff Law wrote:
>
>
> On 12/23/19 10:31 PM, Hongyu Wang wrote:
>
> Hi:
>   For avx512f scalar instructions, current builtin function like
> __builtin_ia32_*{sd,ss}_round can be replaced by
> __builtin_ia32_*{sd,ss}_mask_round with mask parameter set to -1. This
> patch did the replacement and remove the corresponding redundant
> builtins.
>
>   Bootstrap is ok, make-check ok for i386 target.
>   Ok for trunk?
>
> Changelog
>
> gcc/
> * config/i386/avx512fintrin.h
> (_mm_add_round_sd, _mm_add_round_ss): Use
>  __builtin_ia32_adds?_mask_round builtins instead of
> __builtin_ia32_adds?_round.
> (_mm_sub_round_sd, _mm_sub_round_ss,
> _mm_mul_round_sd, _mm_mul_round_ss,
> _mm_div_round_sd, _mm_div_round_ss,
> _mm_getexp_sd, _mm_getexp_ss,
> _mm_getexp_round_sd, _mm_getexp_round_ss,
> _mm_getmant_sd, _mm_getmant_ss,
> _mm_getmant_round_sd, _mm_getmant_round_ss,
> _mm_max_round_sd, _mm_max_round_ss,
> _mm_min_round_sd, _mm_min_round_ss,
> _mm_fmadd_round_sd, _mm_fmadd_round_ss,
> _mm_fmsub_round_sd, _mm_fmsub_round_ss,
> _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> * config/i386/i386-builtin.def
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> * config/i386/i386-expand.c
> (ix86_expand_round_builtin): Remove corresponding case.
>
> gcc/testsuite/
> * lib/target-supports.exp
> (check_effective_target_avx512f): Use
> __builtin_ia32_getmantsd_mask_round builtins instead of
> __builtin_ia32_getmantsd_round.
> *gcc.target/i386/avx-1.c
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> *gcc.target/i386/sse-13.c: Ditto.
> *gcc.target/i386/sse-23.c: Ditto.
>
> So I like the idea of simplifying the implementation of some of the 
> intrinsics when we can, but ISTM that removing existing intrinsics would be a 
> mistake since end-users could be using them in their code.   I'd think we'd 
> want to keep the existing APIs, even if we change the implementation under 
> the hood.
>
>
> Thoughts?
>
>
> jeff
>
>
> Hongyu Wang
>
>
> 0001-Remove-redundant-round-builtins-for-avx512f-scalar-i.patch
>
> From 9cc4928aad5770c53ff580f5c996092cdaf2f9ba Mon Sep 17 00:00:00 2001
> From: hongyuw1 
> Date: Wed, 18 Dec 2019 14:52:54 +
> Subject: [PATCH] Remove redundant round builtins for avx512f scalar
>  instructions
>
> Changelog
>
> gcc/
> * config/i386/avx512fintrin.h
> (_mm_add_round_sd, _mm_add_round_ss): Use
> __builtin_ia32_adds?_mask_round builtins instead of
> __builtin_ia32_adds?_round.
> (_mm_sub_round_sd, _mm_sub_round_ss,
> _mm_mul_round_sd, _mm_mul_round_ss,
> _mm_div_round_sd, _mm_div_round_ss,
> _mm_getexp_sd, _mm_getexp_ss,
> _mm_getexp_round_sd, _mm_getexp_round_ss,
> _mm_getmant_sd, _mm_getmant_ss,
> _mm_getmant_round_sd, _mm_getmant_round_ss,
> _mm_max_round_sd, _mm_max_round_ss,
> _mm_min_round_sd, _mm_min_round_ss,
> _mm_fmadd_round_sd, _mm_fmadd_round_ss,
> _mm_fmsub_round_sd, _mm_fmsub_round_ss,
> _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> * config/i386/i386-builtin.def
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round,

Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.

2020-11-12 Thread Jeff Law via Gcc-patches



On 12/23/19 10:31 PM, Hongyu Wang wrote:
> Hi:
>   For avx512f scalar instructions, current builtin function like
> __builtin_ia32_*{sd,ss}_round can be replaced by
> __builtin_ia32_*{sd,ss}_mask_round with mask parameter set to -1. This
> patch did the replacement and remove the corresponding redundant
> builtins.
>
>   Bootstrap is ok, make-check ok for i386 target.
>   Ok for trunk?
>
> Changelog
>
> gcc/
> * config/i386/avx512fintrin.h
> (_mm_add_round_sd, _mm_add_round_ss): Use
>  __builtin_ia32_adds?_mask_round builtins instead of
> __builtin_ia32_adds?_round.
> (_mm_sub_round_sd, _mm_sub_round_ss,
> _mm_mul_round_sd, _mm_mul_round_ss,
> _mm_div_round_sd, _mm_div_round_ss,
> _mm_getexp_sd, _mm_getexp_ss,
> _mm_getexp_round_sd, _mm_getexp_round_ss,
> _mm_getmant_sd, _mm_getmant_ss,
> _mm_getmant_round_sd, _mm_getmant_round_ss,
> _mm_max_round_sd, _mm_max_round_ss,
> _mm_min_round_sd, _mm_min_round_ss,
> _mm_fmadd_round_sd, _mm_fmadd_round_ss,
> _mm_fmsub_round_sd, _mm_fmsub_round_ss,
> _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> * config/i386/i386-builtin.def
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> * config/i386/i386-expand.c
> (ix86_expand_round_builtin): Remove corresponding case.
>
> gcc/testsuite/
> * lib/target-supports.exp
> (check_effective_target_avx512f): Use
> __builtin_ia32_getmantsd_mask_round builtins instead of
> __builtin_ia32_getmantsd_round.
> *gcc.target/i386/avx-1.c
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> *gcc.target/i386/sse-13.c: Ditto.
> *gcc.target/i386/sse-23.c: Ditto.

So I like the idea of simplifying the implementation of some of the
intrinsics when we can, but ISTM that removing existing intrinsics would
be a mistake since end-users could be using them in their code.   I'd
think we'd want to keep the existing APIs, even if we change the
implementation under the hood.
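
To make the equivalence behind the proposal concrete, here is a small
portable model in plain C.  It is only a sketch: the vec2d type and the
model_* functions are hypothetical stand-ins, not the real builtins, and the
rounding-mode argument is left out.  It shows why an unmasked scalar add can
be expressed through the masked form with the mask set to -1 while the
user-visible function keeps its signature.

```c
#include <stdint.h>

/* Toy model of a 128-bit vector holding two doubles (hypothetical type). */
typedef struct { double lane[2]; } vec2d;

/* Model of a masked scalar add: lane 0 is a+b when mask bit 0 is set,
   otherwise it is taken from SRC; lane 1 is always copied from A.  */
vec2d
model_mask_add_sd (vec2d src, uint8_t mask, vec2d a, vec2d b)
{
  vec2d r;
  r.lane[0] = (mask & 1) ? a.lane[0] + b.lane[0] : src.lane[0];
  r.lane[1] = a.lane[1];
  return r;
}

/* Model of the unmasked scalar add, written directly.  */
vec2d
model_add_sd (vec2d a, vec2d b)
{
  vec2d r = a;
  r.lane[0] = a.lane[0] + b.lane[0];
  return r;
}

/* The same operation routed through the masked form with an all-ones
   mask, so SRC is never selected.  */
vec2d
model_add_sd_via_mask (vec2d a, vec2d b)
{
  vec2d unused_src = { { 0.0, 0.0 } };
  return model_mask_add_sd (unused_src, (uint8_t) -1, a, b);
}
```

In this model, model_add_sd and model_add_sd_via_mask return identical
results for any inputs, which is the property that would let the intrinsic
keep its API while only the underlying builtin changes.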


Thoughts?


jeff


> Hongyu Wang
>
> 0001-Remove-redundant-round-builtins-for-avx512f-scalar-i.patch
>
> From 9cc4928aad5770c53ff580f5c996092cdaf2f9ba Mon Sep 17 00:00:00 2001
> From: hongyuw1 
> Date: Wed, 18 Dec 2019 14:52:54 +
> Subject: [PATCH] Remove redundant round builtins for avx512f scalar
>  instructions
>
> Changelog
>
> gcc/
>   * config/i386/avx512fintrin.h
>   (_mm_add_round_sd, _mm_add_round_ss): Use
>__builtin_ia32_adds?_mask_round builtins instead of
>   __builtin_ia32_adds?_round.
>   (_mm_sub_round_sd, _mm_sub_round_ss,
>   _mm_mul_round_sd, _mm_mul_round_ss,
>   _mm_div_round_sd, _mm_div_round_ss,
>   _mm_getexp_sd, _mm_getexp_ss,
>   _mm_getexp_round_sd, _mm_getexp_round_ss,
>   _mm_getmant_sd, _mm_getmant_ss,
>   _mm_getmant_round_sd, _mm_getmant_round_ss,
>   _mm_max_round_sd, _mm_max_round_ss,
>   _mm_min_round_sd, _mm_min_round_ss,
>   _mm_fmadd_round_sd, _mm_fmadd_round_ss,
>   _mm_fmsub_round_sd, _mm_fmsub_round_ss,
>   _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
>   _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
>   * config/i386/i386-builtin.def
>   (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
>   __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
>   __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
>   __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
>   __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
>   __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
>   __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
>   __builtin_ia32_minsd_round,

[Bug fortran/87142] Aliasing issue with overloaded assignment and allocatable components

2020-11-12 Thread mscfd at gmx dot net via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87142

--- Comment #4 from martin  ---
With yesterday's master branch, I still see an invalid read with valgrind and an
"AddressSanitizer: heap-use-after-free"-error with -fsanitize=address. So looks
like this has not been fixed by the patch for PR 92178.

[r11-4958 Regression] FAIL: 30_threads/future/members/poll.cc execution test on Linux/x86_64

2020-11-12 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

93fc47746815ea9dac413322fcade2931f757e7f is the first bad commit
commit 93fc47746815ea9dac413322fcade2931f757e7f
Author: Jonathan Wakely 
Date:   Thu Nov 12 21:25:14 2020 +

libstdc++: Optimise std::future::wait_for and fix futex polling

caused

FAIL: 30_threads/future/members/poll.cc execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-4958/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=30_threads/future/members/poll.cc 
--target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=30_threads/future/members/poll.cc 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=30_threads/future/members/poll.cc 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

Re: Do we need to do a loop invariant motion after loop interchange ?

2020-11-12 Thread Jeff Law via Gcc



On 11/23/19 11:26 PM, Bin.Cheng wrote:
> On Fri, Nov 22, 2019 at 3:23 PM Bin.Cheng  wrote:
>> On Fri, Nov 22, 2019 at 3:19 PM Richard Biener
>>  wrote:
>>> On November 22, 2019 6:51:38 AM GMT+01:00, Li Jia He 
>>>  wrote:

 On 2019/11/21 8:10 PM, Richard Biener wrote:
> On Thu, Nov 21, 2019 at 10:22 AM Li Jia He 
 wrote:
>> Hi,
>>
>> I found for the follow code:
>>
>> #define N 256
>> int a[N][N][N], b[N][N][N];
>> int d[N][N], c[N][N];
>> void __attribute__((noinline))
>> double_reduc (int n)
>> {
>> for (int k = 0; k < n; k++)
>> {
>>   for (int l = 0; l < n; l++)
>>{
>>  c[k][l] = 0;
>>   for (int m = 0; m < n; m++)
>> c[k][l] += a[k][m][l] * d[k][m] + b[k][m][l] * d[k][m];
>>}
>> }
>> }
>>
>> I dumped the file after loop interchange and got the following
 information:
>>  [local count: 118111600]:
>> # m_46 = PHI <0(7), m_45(11)>
>> # ivtmp_44 = PHI <_42(7), ivtmp_43(11)>
>> _39 = _49 + 1;
>>
>>  [local count: 955630224]:
>> # l_48 = PHI <0(3), l_47(12)>
>> # ivtmp_41 = PHI <_39(3), ivtmp_40(12)>
>> c_I_I_lsm.5_18 = c[k_28][l_48];
>> c_I_I_lsm.5_53 = m_46 != 0 ? c_I_I_lsm.5_18 : 0;
>> _2 = a[k_28][m_46][l_48];
>> _3 = d[k_28][m_46];
>> _4 = _2 * _3;
>> _5 = b[k_28][m_46][l_48];
>> _6 = _3 * _5;
>> _7 = _4 + _6;
>> _8 = _7 + c_I_I_lsm.5_53;
>> c[k_28][l_48] = _8;
>> l_47 = l_48 + 1;
>> ivtmp_40 = ivtmp_41 - 1;
>> if (ivtmp_40 != 0)
>>   goto ; [89.00%]
>> else
>>   goto ; [11.00%]
>>
>> we can see '_3 = d[k_28][m_46];'  is a loop invariant.
>> Do we need to add a loop invariant motion pass after the loop
 interchange?
> There is one at the end of the loop pipeline.
 Hi,

 The one at the end of the loop pipeline may miss some optimization
 opportunities.  If we vectorize the above code (a.c.158t.vect), we
 can get information similar to the following:

 bb 3:
  # m_46 = PHI <0(7), m_45(11)>  // loop m, outer loop
   if (_59 <= 2)
 goto bb 20;
   else
 goto bb 15;

 bb 15:
   _89 = d[k_28][m_46];
   vect_cst__90 = {_89, _89, _89, _89};

 bb 4:
# l_48 = PHI  // loop l, inner loop
   vect__6.23_100 = vect_cst__99 * vect__5.22_98;
if (ivtmp_110 < bnd.8_1)
 goto bb 12;
   else
 goto bb 17;

 bb 20:
 bb 18:
_27 = d[k_28][m_46];
 if (ivtmp_12 != 0)
 goto bb 19;
   else
 goto bb 21;

 Vectorization will do some conversions in this case.  We can see
 ‘ _89 = d[k_28][m_46];’ and ‘_27 = d[k_28][m_46];’ are loop invariant
 relative to loop l.  We can move ‘d[k_28][m_46]’ to the front of
 ‘if (_59 <= 2)’ to get rid of loading data from memory in both
 branches.

 The one at the end of the loop pipeline can't handle this situation.
 If we move d[k_28][m_46] from loop l to loop m before doing
 vectorization, we can get rid of this situation.
>>> But we can't run every pass after every other. With multiple passes having 
>>> ordering issues is inevitable.
>>>
>>> Now - interchange could trigger a region based invariant motion just for 
>>> the nest it interchanged. But that doesn't exist right now.
>> With data reference/dependence information in the pass, I think it
>> could be quite straightforward.  Didn't realize that we need it
>> before.
> FYI, attachment is a simple fix in loop interchange for the reported
> issue. It's untested, neither for GCC10.
>
> Thanks,
> bin
>> Thanks,
>> bin
>>> Richard.
>>>
>>> linterchange-invariant-dataref-motion.patch
>>>
So it looks like Martin and Richi are working on this right now.  I'm
going to drop this from my queue.
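
For readers following the thread, the transformation being asked for can be
sketched by hand in C.  This is an illustration, not what the interchange
pass emits: N is reduced to 16, the function names and fill_inputs are made
up for the example, and c[k][l] is zeroed in a separate loop instead of the
conditional select seen in the dump.  The second function uses the
interchanged k, m, l order and loads d[k][m] once per iteration of loop m,
since it is invariant with respect to l.

```c
#define N 16   /* smaller than the 256 used in the report */

static int a[N][N][N], b[N][N][N], d[N][N];

/* Deterministic small inputs so the sums cannot overflow.  */
void
fill_inputs (void)
{
  for (int k = 0; k < N; k++)
    for (int m = 0; m < N; m++)
      {
        d[k][m] = (k + m) % 3 + 1;
        for (int l = 0; l < N; l++)
          {
            a[k][m][l] = (k + 2 * m + 3 * l) % 7;
            b[k][m][l] = (2 * k + m + l) % 5;
          }
      }
}

/* Loop order from the report: k, l, m.  */
void
reduc_original (int n, int c[N][N])
{
  for (int k = 0; k < n; k++)
    for (int l = 0; l < n; l++)
      {
        c[k][l] = 0;
        for (int m = 0; m < n; m++)
          c[k][l] += a[k][m][l] * d[k][m] + b[k][m][l] * d[k][m];
      }
}

/* Hand-interchanged order k, m, l with the invariant load hoisted.  */
void
reduc_interchanged_hoisted (int n, int c[N][N])
{
  for (int k = 0; k < n; k++)
    {
      for (int l = 0; l < n; l++)
        c[k][l] = 0;
      for (int m = 0; m < n; m++)
        {
          int dkm = d[k][m];   /* invariant with respect to loop l */
          for (int l = 0; l < n; l++)
            c[k][l] += a[k][m][l] * dkm + b[k][m][l] * dkm;
        }
    }
}
```

Both functions compute the same c array; the second one simply makes the
hoisting explicit so that a later vectorizer would see a single load of
d[k][m] ahead of the inner loop.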


jeff

Re: [PATCH] [libiberty] Fix write buffer overflow in cplus_demangle

2020-11-12 Thread Jeff Law via Gcc-patches

On 11/29/19 12:15 PM, Tim Rühsen wrote:
> * cplus-dem.c (ada_demangle): Correctly calculate the demangled
>   size by using two passes.

So I'm not sure why, but I can't get this patch to apply.  What's even
more interesting is ada_demangle doesn't seem to have changed since 2010
and even if I checkout a Nov 2019 trunk, I still can't apply the patch.

I can see what you're doing with your patch (it's primarily introducing
a loop where you count on the first pass and allocate on the second and
re-indent all the necessary code), I'd prefer not to muck it up trying
to apply by hand.
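
As a rough illustration of the count-then-allocate shape described above,
here is a self-contained sketch.  It is not the code from the patch and not
ada_demangle: expand_amp is a hypothetical helper that expands every '&' to
"&amp;", chosen only because the output can be longer than the input.

```c
#include <stdlib.h>

/* Hypothetical helper: return a malloc'ed copy of SRC with every '&'
   expanded to "&amp;".  Pass 0 only counts the output length; pass 1
   writes into a buffer of exactly that size.  Returns NULL if the
   allocation fails.  */
char *
expand_amp (const char *src)
{
  static const char rep[] = "&amp;";
  char *out = NULL;

  for (int pass = 0; pass < 2; pass++)
    {
      size_t pos = 0;

      for (const char *p = src; *p; p++)
        {
          if (*p == '&')
            {
              for (const char *r = rep; *r; r++)
                {
                  if (out)
                    out[pos] = *r;
                  pos++;
                }
            }
          else
            {
              if (out)
                out[pos] = *p;
              pos++;
            }
        }

      if (pass == 0)
        {
          /* First pass done: POS is the exact output length.  */
          out = malloc (pos + 1);
          if (out == NULL)
            return NULL;
        }
      else
        out[pos] = '\0';
    }

  return out;
}
```

Because both passes run the same loop body, the size computed in the first
pass cannot drift from what the second pass writes, which is the property
that rules out the buffer overflow.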

Any chance you could update the patch so that it applies to the trunk?
The review is done, so it should be able to go straight in.  If you have
commit privs (I don't recall if you do or not), you can go ahead and
commit it yourself.

Sorry for the insane delays here.

jeff

[Bug lto/46769] LTO failed to build gold

2020-11-12 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46769

H.J. Lu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |10.2

--- Comment #8 from H.J. Lu  ---
GCC 10.2 can build gold with LTO.

Re: [PATCH v2] c: Silently ignore pragma region [PR85487]

2020-11-12 Thread Jeff Law via Gcc-patches

On 9/2/20 6:59 PM, Austin Morton via Gcc-patches wrote:
> #pragma region is a feature introduced by Microsoft in order to allow
> manual grouping and folding of code within Visual Studio.  It is
> entirely ignored by the compiler.  Clang has supported this feature
> since 2012 when in MSVC compatibility mode, and enabled it across the
> board in 2018.
>
> As it stands, you cannot use #pragma region within GCC without
> disabling unknown pragma warnings, which is not advisable.
>
> I propose GCC adopt "#pragma region" and "#pragma endregion" in order
> to alleviate these issues.  Because the pragma has no purpose at
> compile time, the implementation is trivial.
>
>
> Microsoft Documentation on the feature:
> https://docs.microsoft.com/en-us/cpp/preprocessor/region-endregion
>
> LLVM change which enabled pragma region across the board:
> https://reviews.llvm.org/D42248
> ---
>  gcc/ChangeLog|  5 +
>  gcc/c-family/ChangeLog   |  5 +
>  gcc/c-family/c-pragma.c  | 10 ++
>  gcc/doc/cpp.texi |  6 ++
>  gcc/testsuite/ChangeLog  |  5 +
>  gcc/testsuite/gcc.dg/pragma-region.c | 21 +
>  6 files changed, 52 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pragma-region.c

I'm not sure that this is really the way we want to handle this stuff. 
I understand the problem you're trying to solve, but embedding a list of
pragmas to ignore into the compiler itself just seems like the wrong
approach -- it bakes that set of pragmas to ignore into the compiler.

ISTM that we'd be better off either having a command line option to list
the set of pragmas to ignore, or they should be pulled from a file
specified on the command line.   That would seem to be a lot more
friendly to downstream users since each project could set the list of
pragmas to ignore on their own and have that set updated dynamically
over time without having to patch and update GCC.
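
For context, the pragma pair under discussion looks like this in source.  The
snippet is only an illustration (the region label and the two helper
functions are made up); the pragmas carry no meaning for code generation and
exist purely so that editors can fold the enclosed block.

```c
/* Editors that understand the pragma can fold everything between the
   two markers; a compiler either ignores them or, per the thread,
   warns about an unknown pragma.  */
#pragma region arithmetic helpers
int
add (int x, int y)
{
  return x + y;
}

int
sub (int x, int y)
{
  return x - y;
}
#pragma endregion
```

Whether the warning is silenced by built-in knowledge of the pragma or by a
user-supplied list, the generated code for add and sub is the same.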

Any chance you would be willing to work on that?

Jeff

Re: CSE deletes valid REG_EQUAL?

2020-11-12 Thread Jeff Law via Gcc



On 11/12/20 7:02 PM, Xionghu Luo via Gcc wrote:
> Hi all,
>
> In PR51505(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51505), Paolo Bonzini 
> added the code to delete REG_EQUAL notes in df_remove_dead_eq_notes:
>
> gcc/df-problems.c:
> df_remove_dead_eq_notes (rtx_insn *insn, bitmap live)
> {
> ...
>   case REG_EQUAL:
>   case REG_EQUIV:
> {
>   /* Remove the notes that refer to dead registers.  As we have at most
>  one REG_EQUAL/EQUIV note, all of EQ_USES will refer to this note
>  so we need to purge the complete EQ_USES vector when removing
>  the note using df_notes_rescan.  */
>   df_ref use;
>   bool deleted = false;
>
>   FOR_EACH_INSN_EQ_USE (use, insn)
> if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER
> && DF_REF_LOC (use)
> && (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
> && !bitmap_bit_p (live, DF_REF_REGNO (use))
> && loc_mentioned_in_p (DF_REF_LOC (use), XEXP (link, 0)))
>   {
> deleted = true;
> break;
>   }
>   if (deleted)
> {
>   rtx next;
>   if (REG_DEAD_DEBUGGING)
> df_print_note ("deleting: ", insn, link);
>   next = XEXP (link, 1);
>   free_EXPR_LIST_node (link);
>   *pprev = link = next;
>   df_notes_rescan (insn);
> }
> ...
> }
>
>
> while I have a test case as below:
>
>
> typedef long myint_t;
> __attribute__ ((noinline)) myint_t
> hash_loop (myint_t nblocks, myint_t hash)
> {
> int i;
> for (i = 0; i < nblocks; i++)
>   hash = ((hash + 13) | hash) + 0x66546b64;
> return hash;
> }
>
> before cse1:
>
>22: L22:
>16: NOTE_INSN_BASIC_BLOCK 4
>17: r125:DI=r120:DI+0xd
>18: r118:DI=r125:DI|r120:DI
>19: r126:DI=r118:DI+0x6654
>20: r120:DI=r126:DI+0x6b64
>   REG_EQUAL r118:DI+0x66546b64
>21: r119:DI=r119:DI-0x1
>23: r127:CC=cmp(r119:DI,0)
>24: pc={(r127:CC!=0)?L22:pc}
>   REG_BR_PROB 955630228
>
> The dump in cse1:
>
>16: NOTE_INSN_BASIC_BLOCK 4
>17: r125:DI=r120:DI+0xd
>18: r118:DI=r125:DI|r120:DI
>   REG_DEAD r125:DI
>   REG_DEAD r120:DI
>19: r126:DI=r118:DI+0x6654
>   REG_DEAD r118:DI
>20: r120:DI=r126:DI+0x6b64
>   REG_DEAD r126:DI
>21: r119:DI=r119:DI-0x1
>23: r127:CC=cmp(r119:DI,0)
>24: pc={(r127:CC!=0)?L22:pc}
>   REG_DEAD r127:CC
>   REG_BR_PROB 955630228
>   ; pc falls through to BB 6
>
>
> The output shows "REG_EQUAL r118:DI+0x66546b64" is deleted by
> df_remove_dead_eq_notes, but r120:DI is not REG_DEAD here, so is it correct
> here to check insn use and find that r118:DI is dead then do the delete?

It doesn't matter where the death occurs, any REG_DEAD note will cause
the REG_EQUAL note to be removed.  So given the death note for r118,
then any REG_EQUAL note that references r118 will be removed.  This is
overly pessimistic as the note may still be valid/useful at some
points.  See

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92291
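
The rule described above can be modelled in a few lines of standalone C.
This is a simplified sketch, not GCC code: struct eq_note,
MODEL_FIRST_PSEUDO and note_must_be_removed are hypothetical names, and the
real df code works on df_ref records and a bitmap rather than plain arrays.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical boundary between hard and pseudo registers; the real
   FIRST_PSEUDO_REGISTER value is target-specific.  */
#define MODEL_FIRST_PSEUDO 64

/* Simplified stand-in for a REG_EQUAL/REG_EQUIV note: just the list of
   register numbers mentioned in the note's expression.  */
struct eq_note
{
  const unsigned *regs;
  size_t nregs;
};

/* Return true if the note mentions any pseudo register that is not in
   the live set, in which case the whole note is dropped.  LIVE is
   indexed by register number; registers beyond NLIVE count as dead.  */
bool
note_must_be_removed (const struct eq_note *note,
                      const bool *live, size_t nlive)
{
  for (size_t i = 0; i < note->nregs; i++)
    {
      unsigned r = note->regs[i];
      if (r >= MODEL_FIRST_PSEUDO && (r >= nlive || !live[r]))
        return true;
    }
  return false;
}
```

In the reported dump, the note on insn 20 mentions r118, which died at
insn 19, so a check of this shape drops the note even though the
destination r120 is still live.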


Jeff


ps.  Note that a REG_EQUAL note is valid at a particular point in the IL
-- it is not a function-wide equivalence.  So you have to be careful
using such values as they can be invalidated by other statements. 
Contrast to a REG_EQUIV note where the equivalence is global and you
don't have to worry about invalidation.

[committed] MAINTAINERS: add myself for write after approval

2020-11-12 Thread HAO CHEN GUI via Gcc-patches


2020-11-13  Haochen Gui  

    * MAINTAINERS (Write After Approval): add myself
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index a0216185de9..be42e1441ca 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -409,6 +409,7 @@ Matthew Gretton-Dann 
 Yury Gribov 
 Jon Grimm 
 Laurent Guerby 
+Haochen Gui 
 Jiufu Guo 
 Xuepeng Guo 
 Wei Guozhi 
--
2.18.4

[Bug c++/96121] Uninitialized variable copying in member initialized list not diagnosed

2020-11-12 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96121

Marek Polacek  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
                 CC|                             |mpolacek at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org|mpolacek at gcc dot gnu.org
             Status|NEW                          |ASSIGNED

[Bug c++/19808] miss a warning about uninitialized member usage in member initializer list in constructor

2020-11-12 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19808

Marek Polacek  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|mpolacek at gcc dot gnu.org
             Status|NEW                          |ASSIGNED

[Bug middle-end/92492] AVX512: Missed vectorization opportunity

2020-11-12 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92492

--- Comment #7 from Hongtao.liu  ---
I notice TARGET_VECTORIZE_RELATED_MODE has been added and can be used to
handle the conversion; I'm working on this.

Re: [PATCH][RFC] Make mingw-w64 printf/scanf attribute alias to ms_printf/ms_scanf only for C89

2020-11-12 Thread Liu Hao via Gcc-patches

On 2020/11/13 2:46, Joseph Myers wrote:
> I'd expect these patches to include updates to the gcc.dg/format/ms_*.c 
> tests to reflect the changed semantics (or new tests there if some of the 
> changes don't result in any failures in the existing tests).
> 

Does the attached patch suffice?

I know very little about DejaGnu. I only tried compiling the function in that
test and verified that lines without `dg-warning` didn't result in any warnings
with my bootstrapped GCC last night, both on i686 and x86_64.



-- 
Best regards,
LH_Mouse
From 3f58912fb369fd1f645d880a3d967e6523b87507 Mon Sep 17 00:00:00 2001
From: Liu Hao 
Date: Thu, 12 Nov 2020 22:20:29 +0800
Subject: [PATCH] gcc: Add `ll` and `L` length modifiers for `ms_printf`

Previous code abused `FMT_LEN_L` for the `I` modifier. As `L` is a valid
modifier for `f`, `e`, `g`, etc. and `I` has the same semantics as the
C99 `z` modifier, `FMT_LEN_z` is now used.

First, in the Microsoft ABI, type `long double` has the same layout as
type `double`, so `%Lg` behaves identically to `%g`. Users should pass
in `double`s instead of `long double`s, as GCC uses the 10-byte format.

Second, with a CRT that is recent enough (MSVCRT since Vista, MSVCR80,
UCRT, or mingw-w64 8.0), `printf`-family functions can handle the `ll`
length modifier correctly. This ability is assumed to be available
universally. A lot of libraries (such as libgomp) that use the
`format(printf, ...)` attribute used to suffer from warnings about
unknown format specifiers.
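
As an illustration of the kind of call this affects, here is a minimal
portable sketch.  format_sizes is a hypothetical helper, and the example
assumes a C99-conforming printf family; it uses the `ll` modifier and the
C99 `z` modifier that the message above compares `I` to.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format a long long and a size_t into BUF using
   the `ll` and `z` length modifiers.  Returns what snprintf returns.  */
int
format_sizes (char *buf, size_t buflen, long long big, size_t count)
{
  return snprintf (buf, buflen, "%lld %zu", big, count);
}
```

A function declared with `format(printf, ...)` that forwards such a format
string is where the unknown-specifier warnings mentioned above would have
shown up.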

Reference: 
https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2008/tcxf1dw6(v=vs.90)
Reference: 
https://docs.microsoft.com/en-us/cpp/porting/visual-cpp-what-s-new-2003-through-2015#new-crt-features
Signed-off-by: Liu Hao 

gcc/:
* config/i386/msformat-c.c: Add more length modifiers
---
 gcc/config/i386/msformat-c.c  | 45 ++-
 gcc/testsuite/gcc.dg/format/ms_c99-printf-3.c | 22 -
 2 files changed, 44 insertions(+), 23 deletions(-)

diff --git a/gcc/config/i386/msformat-c.c b/gcc/config/i386/msformat-c.c
index 4ceec633a6e..1902b3c73d0 100644
--- a/gcc/config/i386/msformat-c.c
+++ b/gcc/config/i386/msformat-c.c
@@ -32,10 +32,11 @@ along with GCC; see the file COPYING3.  If not see
 static format_length_info ms_printf_length_specs[] =
 {
   { "h", FMT_LEN_h, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
-  { "l", FMT_LEN_l, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
+  { "l", FMT_LEN_l, STD_C89, "ll", FMT_LEN_ll, STD_C89, 0 },
+  { "L", FMT_LEN_L, STD_C89, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I32", FMT_LEN_l, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I64", FMT_LEN_ll, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
-  { "I", FMT_LEN_L, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
+  { "I", FMT_LEN_z, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { NULL, FMT_LEN_none, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 }
 };
 
@@ -90,33 +91,33 @@ static const format_flag_pair ms_strftime_flag_pairs[] =
 static const format_char_info ms_print_char_table[] =
 {
   /* C89 conversion specifiers.  */
-  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  T99_SST, 
 BADLEN, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "-wp0 +'",  "i",  NULL 
},
-  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, 
BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0#", "i",  NULL },
-  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, 
BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0'","i",  NULL },
-  { "fgG", 0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN }, "-wp0 +#'", "",   NULL },
-  { "eE",  0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN }, "-wp0 +#",  "",   NULL },
-  { "c",   0, STD_C89, { T89_I,   BADLEN,  T89_S,  T94_WI,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w","",   NULL 
},
-  { "s",   1, STD_C89, { T89_C,   BADLEN,  T89_S,  T94_W,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",   "cR", NULL 
},
-  { "p",   1, STD_C89, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w","c",  NULL 
},
-  { "n",   1, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  BADLEN,  
BADLEN, BADLEN,  T99_IM,  BADLEN,  BADLEN,  BADLEN }, "",  "W",  NULL },
+  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  BADLEN, 
T99_SST, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0 +'",  "i",  NULL },
+  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, BADLEN, 
T99_ST,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0#","i",  NULL },
+  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, BADLEN, 
T99_ST,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN },

Ping^2: [PATCH 0/4] rs6000: Enable variable vec_insert with IFN VEC_SET

2020-11-12 Thread Xionghu Luo via Gcc-patches


Ping^2, thanks.

On 2020/11/5 09:34, Xionghu Luo via Gcc-patches wrote:

Ping.

On 2020/10/10 16:08, Xionghu Luo wrote:

Originated from
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html
with patch split and some refinement per review comments.

Patch of IFN VEC_SET for ARRAY_REF(VIEW_CONVERT_EXPR) is committed,
this patch set enables expanding IFN VEC_SET for Power9 and Power8
with specific instruction sequences.
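
The source pattern in question is an element store with a run-time index.
The sketch below uses the GNU C generic vector extension instead of the
rs6000 vec_insert intrinsic so that it is not tied to one target; the v4si
typedef and insert_at are names chosen for this example only, and it needs a
compiler that supports the extension (GCC or Clang).

```c
/* Four 32-bit ints in one 16-byte vector (GNU C vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Store X into element N of V, where N is only known at run time.
   The index is masked to stay within the four elements.  */
v4si
insert_at (v4si v, int x, unsigned n)
{
  v[n & 3] = x;
  return v;
}
```

Without dedicated expansion, a variable-index store like this would
typically go through memory; the series aims to expand it into register-only
sequences on Power8 and Power9.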

Xionghu Luo (4):
   rs6000: Change rs6000_expand_vector_set param
   rs6000: Support variable insert and Expand vec_insert in expander [PR79251]
   rs6000: Enable vec_insert for P8 with rs6000_expand_vector_set_var_p8
   rs6000: Update testcases' instruction count

  gcc/config/rs6000/rs6000-c.c  |  44 +++--
  gcc/config/rs6000/rs6000-call.c   |   2 +-
  gcc/config/rs6000/rs6000-protos.h |   3 +-
  gcc/config/rs6000/rs6000.c    | 181 +-
  gcc/config/rs6000/vector.md   |   4 +-
  .../powerpc/fold-vec-insert-char-p8.c |   8 +-
  .../powerpc/fold-vec-insert-char-p9.c |  12 +-
  .../powerpc/fold-vec-insert-double.c  |  11 +-
  .../powerpc/fold-vec-insert-float-p8.c    |   6 +-
  .../powerpc/fold-vec-insert-float-p9.c    |  10 +-
  .../powerpc/fold-vec-insert-int-p8.c  |   6 +-
  .../powerpc/fold-vec-insert-int-p9.c  |  11 +-
  .../powerpc/fold-vec-insert-longlong.c    |  10 +-
  .../powerpc/fold-vec-insert-short-p8.c    |   6 +-
  .../powerpc/fold-vec-insert-short-p9.c    |   8 +-
  .../gcc.target/powerpc/pr79251-run.c  |  28 +++
  gcc/testsuite/gcc.target/powerpc/pr79251.h    |  19 ++
  gcc/testsuite/gcc.target/powerpc/pr79251.p8.c |  17 ++
  gcc/testsuite/gcc.target/powerpc/pr79251.p9.c |  18 ++
  .../gcc.target/powerpc/vsx-builtin-7.c    |   4 +-
  20 files changed, 337 insertions(+), 71 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251-run.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251.h
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251.p8.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251.p9.c





--
Thanks,
Xionghu

CSE deletes valid REG_EQUAL?

2020-11-12 Thread Xionghu Luo via Gcc

Hi all,

In PR51505(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51505), Paolo Bonzini 
added the code to delete REG_EQUAL notes in df_remove_dead_eq_notes:

gcc/df-problems.c:
df_remove_dead_eq_notes (rtx_insn *insn, bitmap live)
{
...
case REG_EQUAL:
case REG_EQUIV:
  {
/* Remove the notes that refer to dead registers.  As we have at most
   one REG_EQUAL/EQUIV note, all of EQ_USES will refer to this note
   so we need to purge the complete EQ_USES vector when removing
   the note using df_notes_rescan.  */
df_ref use;
bool deleted = false;

FOR_EACH_INSN_EQ_USE (use, insn)
  if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER
  && DF_REF_LOC (use)
  && (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
  && !bitmap_bit_p (live, DF_REF_REGNO (use))
  && loc_mentioned_in_p (DF_REF_LOC (use), XEXP (link, 0)))
{
  deleted = true;
  break;
}
if (deleted)
  {
rtx next;
if (REG_DEAD_DEBUGGING)
  df_print_note ("deleting: ", insn, link);
next = XEXP (link, 1);
free_EXPR_LIST_node (link);
*pprev = link = next;
df_notes_rescan (insn);
  }
...
}


while I have a test case as below:


typedef long myint_t;
__attribute__ ((noinline)) myint_t
hash_loop (myint_t nblocks, myint_t hash)
{
int i;
for (i = 0; i < nblocks; i++)
  hash = ((hash + 13) | hash) + 0x66546b64;
return hash;
}

before cse1:

   22: L22:
   16: NOTE_INSN_BASIC_BLOCK 4
   17: r125:DI=r120:DI+0xd
   18: r118:DI=r125:DI|r120:DI
   19: r126:DI=r118:DI+0x6654
   20: r120:DI=r126:DI+0x6b64
  REG_EQUAL r118:DI+0x66546b64
   21: r119:DI=r119:DI-0x1
   23: r127:CC=cmp(r119:DI,0)
   24: pc={(r127:CC!=0)?L22:pc}
  REG_BR_PROB 955630228

The dump in cse1:

   16: NOTE_INSN_BASIC_BLOCK 4
   17: r125:DI=r120:DI+0xd
   18: r118:DI=r125:DI|r120:DI
  REG_DEAD r125:DI
  REG_DEAD r120:DI
   19: r126:DI=r118:DI+0x6654
  REG_DEAD r118:DI
   20: r120:DI=r126:DI+0x6b64
  REG_DEAD r126:DI
   21: r119:DI=r119:DI-0x1
   23: r127:CC=cmp(r119:DI,0)
   24: pc={(r127:CC!=0)?L22:pc}
  REG_DEAD r127:CC
  REG_BR_PROB 955630228
  ; pc falls through to BB 6


The output shows "REG_EQUAL r118:DI+0x66546b64" is deleted by
df_remove_dead_eq_notes, but r120:DI is not REG_DEAD here, so is it correct
here to check insn use and find that r118:DI is dead then do the delete?


Thanks,
Xionghu

Re: [committed] wwwdocs: Editorial changes around x86-64 ISA extensions

2020-11-12 Thread Hongtao Liu via Gcc-patches

On Fri, Nov 13, 2020 at 3:32 AM Gerald Pfeifer  wrote:
>
> Per our discussion on the list (plus a grammar improvement in a
> section above).
>
> One question: why are the ISA extension lists not alphabetically
> sorted?  Wouldn't that be beneficial for users?  Easier to find
> something and also easier to compare?
>

Hmm, I just sorted them by the time they are enabled.

When I changed the wwwdocs, I was referring to the previous
gcc-8/changes.html, and didn't find that it was alphabetical.

> Gerald
>
> ---
>  htdocs/gcc-11/changes.html | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
> index fc4c74f4..106db8e9 100644
> --- a/htdocs/gcc-11/changes.html
> +++ b/htdocs/gcc-11/changes.html
> @@ -265,7 +265,8 @@ a work-in-progress.
>
>New ISA extension support for Intel AMX-TILE, AMX-INT8, AMX-BF16 was
>added to GCC. AMX-TILE, AMX-INT8, AMX-BF16 intrinsics are available
> -  via the -mamx-tile, -mamx-int8, -mamx-bf16 compiler 
> switch.
> +  via the -mamx-tile, -mamx-int8, -mamx-bf16 compiler
> +  switches.
>
>New ISA extension support for Intel AVX-VNNI was added to GCC.
>AVX-VNNI intrinsics are available via the -mavxvnni
> @@ -273,14 +274,14 @@ a work-in-progress.
>
>GCC now supports the Intel CPU named Sapphire Rapids through
>  -march=sapphirerapids.
> -The switch enables the MOVDIRI MOVDIR64B AVX512VP2INTERSECT ENQCMD CLDEMOTE
> -SERIALIZE PTWRITE WAITPKG TSXLDTRK AMT-TILE AMX-INT8 AMX-BF16 AVX-VNNI
> -ISA extensions.
> +The switch enables the MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, ENQCMD,
> +CLDEMOTE, SERIALIZE, PTWRITE, WAITPKG, TSXLDTRK, AMT-TILE, AMX-INT8,
> +AMX-BF16, and AVX-VNNI ISA extensions.
>
>GCC now supports the Intel CPU named Alderlake through
>  -march=alderlake.
> -The switch enables the CLDEMOTE PTWRITE WAITPKG SERIALIZE KEYLOCKER AVX-VNNI
> -HRESET ISA extensions.
> +The switch enables the CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, KEYLOCKER,
> +AVX-VNNI, and HRESET ISA extensions.
>
>  
>
> --
> 2.29.2



-- 
BR,
Hongtao

Re: [PATCH,wwwdocs] gcc-11/changes: Mention Intel AVX-VNNI

2020-11-12 Thread Hongtao Liu via Gcc-patches

Got it.

On Fri, Nov 13, 2020 at 3:26 AM Gerald Pfeifer  wrote:
>
> On Wed, 11 Nov 2020, Hongtao Liu via Gcc-patches wrote:
> > +  New ISA extension support for Intel AVX-VNNI was added to GCC.
>
> More for the future (i.e., no need to change that now): I suggest
> to skip "to GCC" in cases like this, since this is our context to
> begin with.
>
> Gerald



-- 
BR,
Hongtao

Re: [PATCH] RISC-V: Enable ifunc if it was supported in the binutils for linux toolchain.

2020-11-12 Thread Nelson Chu

On Fri, Nov 13, 2020 at 5:50 AM Jim Wilson  wrote:
>I committed and pushed it.

Thanks for your help!!

> I see some extra ifunc related testsuite failures, but that is because we
> don't have the glibc ifunc patches upstream yet.  It will be important to get
> those done next.

Yeah, hope we can catch up on this before the next release.

Thanks
Nelson

[Bug target/97534] [10/11 Regression] ICE in decompose, at rtl.h:2280 (arm-linux-gnueabihf)

2020-11-12 Thread rearnsha at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97534

--- Comment #5 from Richard Earnshaw  ---
No, I don't think it's related to that, in fact, I think this is just a latent
bug that's been in the code for a long time.

At one point we have a 32-bit signed integer containing INT_MIN, which is
internally represented as a 64-bit constant 0xffffffff80000000; we try to
negate that (so that we can use addition) and end up with 0x0000000080000000,
but that's not a canonical value for internal use on an SImode value (they need
to be sign-extended) and eventually this causes the compiler to trip over its
own feet.
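
The canonical-form requirement can be illustrated with a small C++ sketch. This is not GCC code: `sext32` and `is_canonical_si` are hypothetical helper names, and the only assumption is the one stated in the comment above, namely that a 32-bit constant held in a 64-bit container must equal the sign extension of its low 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low 32 bits of x to 64 bits (result is a raw bit pattern).
constexpr std::uint64_t sext32(std::uint64_t x) {
  return ((x & 0xffffffffULL) ^ 0x80000000ULL) - 0x80000000ULL;
}

// Hypothetical check: a 32-bit constant stored in 64 bits is canonical only
// if its upper half is a copy of bit 31.
constexpr bool is_canonical_si(std::uint64_t x) { return sext32(x) == x; }

// INT_MIN as a sign-extended 64-bit pattern, and its plain 64-bit negation.
constexpr std::uint64_t int_min_si = sext32(0x80000000ULL);
constexpr std::uint64_t negated = 0 - int_min_si;

static_assert(is_canonical_si(int_min_si), "INT_MIN is canonical");
static_assert(!is_canonical_si(negated), "plain negation is not canonical");
static_assert(sext32(negated) == int_min_si, "re-extending restores INT_MIN");
```

In this model the negated value only becomes valid again after it is re-extended, which is the step the comment suggests is missing.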

I'm testing a patch.

[PATCH] Change range_handler, was Re: Fix gimple_expr_code?

2020-11-12 Thread Andrew MacLeod via Gcc-patches


On 11/12/20 4:12 PM, Andrew MacLeod via Gcc-patches wrote:

On 11/12/20 3:53 PM, Richard Biener wrote: ... 



But it means that gimple_expr_code() isn't returning the correct result
for GIMPLE_SINGLE_RHS.

It depends. An SSA name isn't an expression code either. As said, the
generic gimple_expr_code should be used with extreme care.


What is an expression code?  It seems like it's just a tree_code
representing what is on the RHS.  I'm not sure I understand why one
needs to be careful with it.  It only applies to COND, ASSIGN and
CALL, and it's currently right for everything except GIMPLE_SINGLE_RHS?


If we don't fix gimple_expr_code, then I'm basically going to be
reimplementing it myself... which seems kind of pointless.


Andrew


However, that said, it seems like reworking the accessor is probably
better anyway.  Point taken on expr_type: for a GIMPLE_COND I wasn't
actually getting the type I really wanted, as it turned out.


Anyway, fixed thusly.

Bootstrapped on x86_64-pc-linux-gnu, no regressions.  pushed.

Andrew

commit ee24da1b983a89b05303f2ac8828dd8cbe28d3b4
Author: Andrew MacLeod 
Date:   Thu Nov 12 19:25:59 2020 -0500

Change range_handler, was  Re: Fix gimple_expr_code?

Adjust the range_handler to not use gimple_expr_code/type.

* gimple-range.h (gimple_range_handler): Use gimple_assign and
gimple_cond routines to get type and code.
* range-op.cc (range_op_handler): Check for integral types.

diff --git a/gcc/gimple-range.h b/gcc/gimple-range.h
index 0aa6d4672ee..88d2ada324b 100644
--- a/gcc/gimple-range.h
+++ b/gcc/gimple-range.h
@@ -97,8 +97,12 @@ extern bool gimple_range_calc_op2 (irange , const gimple *s,
 static inline range_operator *
 gimple_range_handler (const gimple *s)
 {
-  if ((gimple_code (s) == GIMPLE_ASSIGN) || (gimple_code (s) == GIMPLE_COND))
-    return range_op_handler (gimple_expr_code (s), gimple_expr_type (s));
+  if (gimple_code (s) == GIMPLE_ASSIGN)
+    return range_op_handler (gimple_assign_rhs_code (s),
+                             TREE_TYPE (gimple_assign_lhs (s)));
+  if (gimple_code (s) == GIMPLE_COND)
+    return range_op_handler (gimple_cond_code (s),
+                             TREE_TYPE (gimple_cond_lhs (s)));
   return NULL;
 }
 
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index aff9383d936..86d1af7fe54 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -3341,10 +3341,12 @@ pointer_table::pointer_table ()
 range_operator *
 range_op_handler (enum tree_code code, tree type)
 {
-  // First check if there is apointer specialization.
+  // First check if there is a pointer specialization.
   if (POINTER_TYPE_P (type))
     return pointer_tree_table[code];
-  return integral_tree_table[code];
+  if (INTEGRAL_TYPE_P (type))
+    return integral_tree_table[code];
+  return NULL;
 }
 
 // Cast the range in R to TYPE.
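
As a rough illustration of the lookup shape after this change, here is a self-contained C++ sketch. All names in it (`TypeKind`, `OpCode`, `RangeOperator`, `range_handler_for`) are hypothetical stand-ins, not GCC's types; the point is only that the handler is picked per type class and that unsupported type classes now yield a null handler instead of falling into the integral table.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's type classes and tree codes.
enum class TypeKind { Pointer, Integral, Float };
enum OpCode { OP_PLUS, OP_MINUS, OP_COUNT };

struct RangeOperator { const char *name; };

static RangeOperator int_plus{"int_plus"};
static RangeOperator int_minus{"int_minus"};
static RangeOperator ptr_plus{"ptr_plus"};

// One table per supported type class; a null entry means "no handler".
static RangeOperator *integral_table[OP_COUNT] = {&int_plus, &int_minus};
static RangeOperator *pointer_table[OP_COUNT] = {&ptr_plus, nullptr};

// Pointer types are checked first, then integral types; anything else
// (floating point in this toy model) gets no handler at all.
RangeOperator *range_handler_for(OpCode code, TypeKind kind) {
  if (kind == TypeKind::Pointer)
    return pointer_table[code];
  if (kind == TypeKind::Integral)
    return integral_table[code];
  return nullptr;
}
```

Callers in such a scheme have to treat a null result as "no range information available" rather than dereference it.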

[Bug libstdc++/96322] 22_locale/numpunct/members/char/3.cc is outdated: expects grouping=0, actual=3

2020-11-12 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96322

--- Comment #2 from Jonathan Wakely  ---
We really need to create our own custom locales for testing, so that we don't
depend on externally defined data that keep changing.
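
One possible shape for such a test-only locale, sketched in standard C++: a facet derived from `std::numpunct<char>` with hard-coded data. The names `fixed_numpunct` and `format_with_fixed_locale` are made up for this illustration and are not part of the libstdc++ testsuite.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical test-only facet with fixed, self-contained punctuation data,
// so the expected output does not depend on the system's locale files.
struct fixed_numpunct : std::numpunct<char> {
  char do_thousands_sep() const override { return ','; }
  std::string do_grouping() const override { return "\3"; }
};

std::string format_with_fixed_locale(long value) {
  std::ostringstream out;
  // The locale object takes ownership of the facet.
  out.imbue(std::locale(std::locale::classic(), new fixed_numpunct));
  out << value;
  return out.str();
}
```

A test written against this facet can assert an exact grouping without caring what glibc ships for any named locale.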

[Bug other/97417] RISC-V Unnecessary andi instruction when loading volatile bool

2020-11-12 Thread wilson at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97417

--- Comment #35 from Jim Wilson  ---
By combine issue, are you referring to the regression I mentioned in comment 3
and filed as bug 97747?  We can handle that as a separate issue.  It should be
uncommon.  I expect to get much more benefit from this patch than the downside
due to that combine issue.

As for the shorten-memrefs problem, I didn't notice that one in my testing.  It
does need to be fixed.  Taking a look now, it looks pretty simple to fix.  The
code currently looks for MEM, it needs to handle (SIGN_EXTEND (MEM)) and
(ZERO_EXTEND (MEM)).  See the get_si_mem_base_reg function.  You need to skip
over the sign_extend or zero_extend when looking for the mem at the first call.
Then at the second call you need to be careful to put the sign_extend or
zero_extend back when creating the new RTL.  Maybe just another arg to
get_si_mem_base so it can record the parent rtx code of the mem.  Or maybe do
this outside get_si_mem_base to skip over a sign/zero extend at the first call,
and then do the same at the second call but record what rtx we skipped over so
that we can put it back.  We can either handle this here or as another patch. 
But since you have some time while waiting for paperwork maybe you can try
writing a fix.
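
The "skip the extend, remember what was skipped, put it back" idea can be sketched with a toy expression type. This is only a model under stated assumptions: `Code`, `Expr` and `strip_extend` are hypothetical and much simpler than GCC's rtx and the real shorten-memrefs pass.

```cpp
#include <cassert>

// Toy stand-in for RTL codes; not GCC's rtx_code.
enum class Code { Mem, SignExtend, ZeroExtend, Reg };

struct Expr {
  Code code;
  Expr *operand;  // operand of an extend; unused for the other codes
};

// Look through one SIGN_EXTEND/ZERO_EXTEND wrapper.  If one was skipped,
// record its code in *outer so the caller can re-wrap the rewritten MEM.
Expr *strip_extend(Expr *x, Code *outer, bool *skipped) {
  *skipped = false;
  if (x->code == Code::SignExtend || x->code == Code::ZeroExtend) {
    *outer = x->code;
    *skipped = true;
    return x->operand;
  }
  return x;
}
```

Recording the skipped code at the call site corresponds to the second option mentioned above, where the wrapper is handled outside the base-register helper.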

[committed] libstdc++: Optimise std::future::wait_for and fix futex polling

2020-11-12 Thread Jonathan Wakely via Gcc-patches

To poll a std::future to see if it's ready you have to call one of the
timed waiting functions. The most obvious way is wait_for(0s) but this
was previously very inefficient because it would turn the relative
timeout to an absolute one by calling system_clock::now(). When the
relative timeout is zero (or less) we're obviously going to get a time
that has already passed, but the overhead of obtaining the current time
can be dozens of microseconds. The alternative is to call wait_until
with an absolute timeout that is in the past. If you know the clock's
epoch is in the past you can use a default constructed time_point.
Alternatively, using some_clock::time_point::min() gives the earliest
time point supported by the clock, which should be safe to assume is in
the past. However, using a futex wait with an absolute timeout before
the UNIX epoch fails and sets errno=EINVAL. The new code using futex
waits with absolute timeouts was not checking for this case, which could
result in hangs (or killing the process if the library is built with
assertions enabled).

This patch checks for times before the epoch before attempting to wait
on a futex with an absolute timeout, which fixes the hangs or crashes.
It also makes it very fast to poll using an absolute timeout before the
epoch (because we skip the futex syscall).
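
A minimal sketch of that guard, assuming `system_clock` counts from the UNIX epoch (guaranteed since C++20, true in practice before that). The function name `is_before_epoch` is hypothetical; the seconds/remainder split only mirrors how a timeout would be handed to the kernel as a `timespec`.

```cpp
#include <cassert>
#include <chrono>

// Returns true when the absolute timeout lies before the clock's epoch.
// Such a timeout has necessarily passed already, so a caller can report
// "timed out" straight away and skip the futex syscall, which would
// reject a negative timespec with EINVAL.
bool is_before_epoch(std::chrono::system_clock::time_point tp) {
  using namespace std::chrono;
  const auto since_epoch = tp.time_since_epoch();
  const auto s = duration_cast<seconds>(since_epoch);  // would be tv_sec
  const auto rest = since_epoch - s;                   // would be tv_nsec
  return s.count() < 0 || rest.count() < 0;
}
```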

It also makes future::wait_for avoid waiting at all when the relative
timeout is zero or less, to avoid the unnecessary overhead of getting
the current time. This makes polling with wait_for(0s) take only a few
cycles instead of dozens of microseconds.
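
For reference, the two polling idioms discussed here look like this in user code. The helper names `is_ready_rel` and `is_ready_abs` are illustrative only; the calls themselves are the standard `std::future` interface.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Poll with a zero relative timeout.
template <typename T>
bool is_ready_rel(const std::future<T> &f) {
  return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Poll with an absolute timeout that is already in the past; a
// default-constructed time_point is the clock's epoch.
template <typename T>
bool is_ready_abs(const std::future<T> &f) {
  return f.wait_until(std::chrono::system_clock::time_point{})
         == std::future_status::ready;
}
```

Both helpers should return immediately either way; the patch is about how much work the library does before it can say "not ready yet".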

libstdc++-v3/ChangeLog:

* include/std/future (future::wait_for): Do not wait for
durations less than or equal to zero.
* src/c++11/futex.cc (_M_futex_wait_until)
(_M_futex_wait_until_steady): Do not wait for timeouts before
the epoch.
* testsuite/30_threads/future/members/poll.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

I think the shortcut in future::wait_for is worth backporting. The
changes in src/c++11/futex.cc are not needed because the code using
absolute timeouts with futex waits is not present on any release
branch.

commit 93fc47746815ea9dac413322fcade2931f757e7f
Author: Jonathan Wakely 
Date:   Thu Nov 12 21:25:14 2020

libstdc++: Optimise std::future::wait_for and fix futex polling

diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index 5d948018c75c..f7617cac8e93 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -345,10 +345,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  // to synchronize with the thread that made it ready.
  if (_M_status._M_load(memory_order_acquire) == _Status::__ready)
return future_status::ready;
+
  if (_M_is_deferred_future())
return future_status::deferred;
- if

[Bug libstdc++/95048] [9/10/11 Regression] wstring-constructor of std::filesystem::path throws for non-ASCII characters

2020-11-12 Thread gcc-bugzilla at m dot chronial.de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95048

--- Comment #9 from Christian Fersch  ---
But is it possible to query the value of -fwide-exec-charset? I had a quick look
and couldn't find anything.

Re: [PATCH] C-Family, Objective-C : Implement Objective-C nullability Part 1 [PR90707].

2020-11-12 Thread Joseph Myers

On Thu, 12 Nov 2020, Iain Sandoe wrote:

> OK for the C-family changes?

OK.

> +When @var{nullability kind} is @var{"unspecified"} or @var{0}, nothing is

I think you mean @code or @samp for the second and third @var on this 
line, they look like literal code not metasyntactic variables.  Likewise 
below.

-- 
Joseph S. Myers
jos...@codesourcery.com

[Bug libstdc++/96322] 22_locale/numpunct/members/char/3.cc is outdated: expects grouping=0, actual=3

2020-11-12 Thread slyfox at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96322

--- Comment #1 from Sergei Trofimovich  ---
Maybe pick another similar locale? Candidates are:

glibc $ git grep '0;0' localedata/locales/ | cat
localedata/locales/aa_DJ:grouping   0;0
localedata/locales/bs_BA:grouping  0;0
localedata/locales/el_CY:grouping  0;0
localedata/locales/el_GR:grouping  0;0
localedata/locales/eo:grouping  0;0
localedata/locales/es_CU:grouping 0;0
localedata/locales/gl_ES:grouping 0;0
localedata/locales/mg_MG:grouping  0;0
localedata/locales/pap_AW:grouping  0;0
localedata/locales/pap_CW:grouping  0;0
localedata/locales/pt_PT:grouping  0;0
localedata/locales/sl_SI:grouping  0;0
localedata/locales/sr_RS:grouping  0;0
localedata/locales/ti_ER:grouping  0;0
localedata/locales/wo_SN:grouping  0;0

Re: PowerPC: Use float128 instead of ieee128 in tests.

2020-11-12 Thread Segher Boessenkool

On Thu, Nov 12, 2020 at 04:44:09PM -0500, Michael Meissner wrote:
> On Thu, Nov 12, 2020 at 01:26:32PM -0600, Segher Boessenkool wrote:
> > On Thu, Oct 22, 2020 at 06:12:31PM -0400, Michael Meissner wrote:
> > > Two of the tests used the __ieee128 keyword instead of __float128.  This
> > > patch changes those cases to use the official keyword.
> > 
> > What is "official" about that?
> > 
> > Why make this change at all?  __ieee128 should work as well!  Did you
> > see failures without this patch?  Those need fixing, then.
> 
> We document '__float128'.  We don't document '__ieee128'.  As I said, using
> '__ieee128' internally was due to some issues in the GCC 7 time frame,
> particularly before we had the glibc changes.

Well, it is a much clearer type as well: __ibm128 is also 128 bits, and
is also a floating point type.  But if __float128 now *always* means
__ieee128, then fine :-)

(But the manual needs fixing in four places, then.)

Is __float128 a standard type?  ("Standard", in whatever context -- not
just a rs6000 GCC thing, and what else uses it then, and/or will other
things use it in the future?)

Also, we then should change things so __ieee128 becomes really only a
legacy alias for __float128.

Thanks,

Segher

Re: [PATCH v5 2/8] libstdc++ futex: Use FUTEX_CLOCK_REALTIME for wait

2020-11-12 Thread Jonathan Wakely via Gcc-patches


On 29/05/20 07:17 +0100, Mike Crowe via Libstdc++ wrote:

The futex system call supports waiting for an absolute time if
FUTEX_WAIT_BITSET is used rather than FUTEX_WAIT.  Doing so provides two
benefits:

1. The call to gettimeofday is not required in order to calculate a
  relative timeout.

2. If someone changes the system clock during the wait then the futex
  timeout will correctly expire earlier or later.  Currently that only
  happens if the clock is changed prior to the call to gettimeofday.

According to futex(2), support for FUTEX_CLOCK_REALTIME was added in the
v2.6.28 Linux kernel and FUTEX_WAIT_BITSET was added in v2.6.25.  To ensure
that the code still works correctly with earlier kernel versions, an ENOSYS
error from futex[1] results in the futex_clock_realtime_unavailable flag
being set.  This flag is used to avoid the unnecessary unsupported futex
call in the future and to fall back to the previous gettimeofday and
relative time implementation.

glibc applied an equivalent switch in pthread_cond_timedwait to use
FUTEX_CLOCK_REALTIME and FUTEX_WAIT_BITSET rather than FUTEX_WAIT for
glibc-2.10 back in 2009.  See
glibc:cbd8aeb836c8061c23a5e00419e0fb25a34abee7

The futex_clock_realtime_unavailable flag is accessed using
std::memory_order_relaxed to stop it becoming a bottleneck.  If the first
two calls to _M_futex_wait_until happen simultaneously then the
only consequence is that both will try to use FUTEX_CLOCK_REALTIME, both
risk discovering that it doesn't work and, if so, both set the flag.

[1] This is how glibc's nptl-init.c determines whether these flags are
   supported.
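
The "try the new call, remember if it is unsupported" pattern described above can be sketched independently of futexes. In this illustration `try_preferred` and `fallback` are hypothetical callables standing in for the FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME call and the legacy gettimeofday + FUTEX_WAIT path; the real code in the patch differs in detail.

```cpp
#include <atomic>
#include <cassert>

namespace {
// Set once the preferred operation is known to be unsupported.
std::atomic<bool> preferred_unavailable{false};
}

enum class Outcome { ok, unsupported };

template <typename Preferred, typename Fallback>
bool wait_with_fallback(Preferred try_preferred, Fallback fallback) {
  if (!preferred_unavailable.load(std::memory_order_relaxed)) {
    if (try_preferred() == Outcome::ok)
      return true;
    // Racing threads may both reach this point; each stores the same
    // value, so relaxed ordering is enough and no lock is needed.
    preferred_unavailable.store(true, std::memory_order_relaxed);
  }
  return fallback();
}
```

The flag is purely an optimisation: reading a stale `false` costs one extra failed attempt, never a wrong result.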

* libstdc++-v3/src/c++11/futex.cc: Add new constants for required
futex flags.  Add futex_clock_realtime_unavailable flag to store
result of trying to use FUTEX_CLOCK_REALTIME.
(__atomic_futex_unsigned_base::_M_futex_wait_until):
Try to use FUTEX_WAIT_BITSET with FUTEX_CLOCK_REALTIME and only
fall back to using gettimeofday and FUTEX_WAIT if that's not
supported.


Mike,

I've been doing some performance comparisons and this patch seems to
make quite a big difference to code that polls a future by calling
fut.wait_until(t) using any t < now() as the timeout. For example,
fut.wait_until(chrono::system_clock::time_point{}) to wait until the
UNIX epoch.

With GCC 10 (or with the if (!futex_clock_realtime_unavailable.load(...))
commented out) I see that polling takes < 100ns. With the change, it
takes 3000ns or more.

Now this is still far better than polling using fut.wait_for(0s) which
takes around 5ns due to the clock_gettime call, but I'm about to
fix that.

I'm not sure how important it is for wait_until(past) to be fast, but
the difference from 100ns to 3000ns seems significant. Do you see the
same kind of numbers? Is this just a property of the futex wait with
an absolute time?

N.B. using wait_until(system_clock::time_point::min()) or any other
time before the epoch doesn't work. The futex syscall returns EINVAL
which we don't check for. I'm about to fix that too.



libstdc++-v3/src/c++11/futex.cc | 37 ++-
1 file changed, 37 insertions(+)

diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index c9de11a..25b3e05 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -35,8 +35,16 @@

// Constants for the wait/wake futex syscall operations
const unsigned futex_wait_op = 0;
+const unsigned futex_wait_bitset_op = 9;
+const unsigned futex_clock_realtime_flag = 256;
+const unsigned futex_bitset_match_any = ~0;
const unsigned futex_wake_op = 1;

+namespace
+{
+  std::atomic<bool> futex_clock_realtime_unavailable;
+}
+
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -58,6 +66,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  }
else
  {
+   if (!futex_clock_realtime_unavailable.load(std::memory_order_relaxed))
+ {
+   struct timespec rt;
+   rt.tv_sec = __s.count();
+   rt.tv_nsec = __ns.count();
+   if (syscall (SYS_futex, __addr,
+futex_wait_bitset_op | futex_clock_realtime_flag,
+__val, &rt, nullptr, futex_bitset_match_any) == -1)
+ {
+   __glibcxx_assert(errno == EINTR || errno == EAGAIN
+   || errno == ETIMEDOUT || errno == ENOSYS);
+   if (errno == ETIMEDOUT)
+ return false;
+   if (errno == ENOSYS)
+ {
+   futex_clock_realtime_unavailable.store(true,
+   std::memory_order_relaxed);
+   // Fall through to legacy implementation if the system
+   // call is unavailable.
+ }
+   else
+ return true;
+ }
+   else
+ return true;
+ }
+
+   // We only get to here

Re: [PATCH] openmp: Retire nest-var ICV

2020-11-12 Thread Kwok Cheung Yeung


On 10/11/2020 6:01 pm, Jakub Jelinek wrote:

One thing is that max-active-levels-var in 5.0 is per-device,
but in 5.1 per-data environment.  The question is if we should implement
the problematic 5.0 way or the 5.1 one.  E.g.:
#include <omp.h>
#include <stdio.h>

int
main ()
{
  #pragma omp parallel
  {
    omp_set_nested (1);
    #pragma omp parallel num_threads(2)
    printf ("Hello, world!\n");
  }
}
which used to be valid in 4.5 (where nest-var used to be per-data
environment) is in 5.0 racy (and in 5.1 will not be racy again).
Though, as these are deprecated APIs, perhaps we can just do the 5.0 way for
now.


Since max-active-levels-var is still current in 5.1, I guess we might as well do 
it properly :-). I have now placed max-active-levels-var into gomp_task_icv. The 
definition of omp_get_nested in 5.1 refers to the active-level-var ICV which is 
currently not implemented, so the comparison is against omp_get_active_level() 
instead.
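
To make the mapping concrete, here is a small model of nesting expressed through max-active-levels-var. It is a sketch under assumptions, not the libgomp implementation: `task_icv`, `set_nested`, `get_nested` and the value of `supported_active_levels` are all hypothetical, and the active level is passed in explicitly instead of being read from the runtime.

```cpp
#include <cassert>

// Hypothetical model of a per-data-environment ICV block.
struct task_icv {
  unsigned max_active_levels_var = 1;
};

// Illustrative upper bound on nesting depth; the real limit may differ.
constexpr unsigned supported_active_levels = 255;

// Enabling nesting raises the limit; disabling it caps the limit at 1.
void set_nested(task_icv &icv, bool enable) {
  if (enable)
    icv.max_active_levels_var = supported_active_levels;
  else if (icv.max_active_levels_var > 1)
    icv.max_active_levels_var = 1;
}

// Nesting counts as enabled when more than one level is allowed and the
// current active level has not yet reached the limit.
bool get_nested(const task_icv &icv, unsigned active_level) {
  return icv.max_active_levels_var > 1
         && icv.max_active_levels_var > active_level;
}
```

Because the value lives in `task_icv`, each data environment carries its own copy, which is the per-data-environment behaviour described above.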



--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -489,8 +489,11 @@ represent their language-specific counterparts.
  
  Nested parallel regions may be initialized at startup by the

  @env{OMP_NESTED} environment variable or at runtime using
-@code{omp_set_nested}.  If undefined, nested parallel regions are
-disabled by default.
+@code{omp_set_nested}.  Setting the maximum number of nested
+regions to above one using the @env{OMP_MAX_ACTIVE_LEVELS}
+environment variable or @code{omp_set_max_active_levels} will
+also enable nesting.  If undefined, nested parallel regions
+are disabled by default.


This doesn't really describe what env.c does.  If undefined, then
if OMP_NESTED is defined, it will be followed, and if neither is
defined, the code sets the default based on
"OMP_NUM_THREADS or OMP_PROC_BIND is set to a
comma-separated list of more than one value"
as the spec says and only is disabled otherwise.
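
Read literally, the precedence described in this comment could be modelled as follows. This is one interpretation for illustration only: `env_settings`, `initial_max_active_levels` and `supported_levels` are hypothetical names, and the real env.c parsing is more involved.

```cpp
#include <cassert>

// Illustrative value used when nesting ends up enabled.
constexpr unsigned supported_levels = 255;

// Hypothetical summary of the relevant environment variables.
struct env_settings {
  bool has_max_active_levels = false;  // OMP_MAX_ACTIVE_LEVELS was set
  unsigned max_active_levels = 0;
  bool has_nested = false;             // OMP_NESTED was set
  bool nested = false;
  bool threads_or_bind_is_list = false;  // OMP_NUM_THREADS or OMP_PROC_BIND
                                         // lists more than one value
};

unsigned initial_max_active_levels(const env_settings &e) {
  if (e.has_max_active_levels)
    return e.max_active_levels;                    // explicit limit wins
  if (e.has_nested)
    return e.nested ? supported_levels : 1;        // then OMP_NESTED
  return e.threads_or_bind_is_list ? supported_levels : 1;  // spec default
}
```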



Similarly.



Again.


I have changed these to more accurately describe what is happening. The 
descriptions are starting to get rather verbose though...



--- a/libgomp/testsuite/libgomp.c/target-5.c
+++ b/libgomp/testsuite/libgomp.c/target-5.c


Why does this testcase need updates?
It doesn't seem to use omp_[sg]et_max_active_levels and so I don't see
why it couldn't use omp_[sg]et_nested.



The problem is with max-active-levels-var (which nesting is now in terms of) 
being per device rather than per data environment. The test expects the nested 
setting to go back to its previous value after leaving a DE that sets it to 
something else.


Anyway, with max-active-levels-var now being per data environment, that is all 
moot now, and the test can remain unchanged.


Is this version okay for trunk? Bootstrapped on x86_64 and libgomp tested with 
no regressions with nvptx offloading.


Thanks

Kwok
commit bcaa3dbf1f130e3a2c7e6033a10be3f61221a951
Author: Kwok Cheung Yeung 
Date:   Thu Nov 12 13:42:28 2020 -0800

openmp: Retire nest-var ICV for OpenMP 5.1

This removes the nest-var ICV, expressing nesting in terms of the
max-active-levels-var ICV instead.  The max-active-levels-var ICV
is now per data environment rather than per device.

2020-11-12  Kwok Cheung Yeung  

libgomp/
* env.c (gomp_global_icv): Remove nest_var field.  Add
max_active_levels_var field.
(gomp_max_active_levels_var): Remove.
(parse_boolean): Return true on success.
(handle_omp_display_env): Express OMP_NESTED in terms of
max_active_levels_var.
(initialize_env): Set max_active_levels_var from
OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
OMP_PROC_BIND.
* icv.c (omp_set_nested): Express in terms of
max_active_levels_var.
(omp_get_nested): Likewise.
(omp_set_max_active_levels): Use max_active_levels_var field instead
of gomp_max_active_levels_var.
(omp_get_max_active_levels): Likewise.
* libgomp.h (struct gomp_task_icv): Remove nest_var field.  Add
max_active_levels_var field.
(gomp_max_active_levels_var): Delete.
* libgomp.texi (omp_get_nested): Update documentation.
(omp_set_nested): Likewise.
(OMP_MAX_ACTIVE_LEVELS): Likewise.
(OMP_NESTED): Likewise.
(OMP_NUM_THREADS): Likewise.
(OMP_PROC_BIND): Likewise.
* parallel.c (gomp_resolve_num_threads): Replace reference
to nest_var with max_active_levels_var.  Use max_active_levels_var
field instead of gomp_max_active_levels_var.

diff --git a/libgomp/env.c b/libgomp/env.c
index ab22525..b8ed1bd 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -68,12 +68,11 @@ struct gomp_task_icv gomp_global_icv = {
   .run_sched_chunk_size = 1,
   .default_device_var = 0,
   .dyn_var = false,
-  .nest_var = false,
+  .max_active_levels_var = 1,
   .bind_var = omp_proc_bind_false,
   .target_data = NULL
 };
 
-unsigned long gomp_max_active_levels_var = gomp_supported_active_levels;
 bool

[Bug jit/87291] Add support for inline asm to libgccjit

2020-11-12 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87291

David Malcolm  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #33 from David Malcolm  ---
Patch pushed to master for gcc 11: g:421d0d0f54294a7bf2872b3b2ac521ce0fa9869e
(this bug didn't get notified for some reason).

Changes since v4:
* added location to gcc_jit_context_add_top_level_asm
* more logging
* filter testsuite to x86 targets
* use LIBGCCJIT_ABI_15 and #ifdef LIBGCCJIT_HAVE_ASM_STATEMENTS
* fix asm_operand::make_debug_string
* fixes to write_reproducer
* update expected debug-string results to reflect escaping fixes from
g:fec573408310139e1ffc42741fbe46b4f2947592
* fixed missing comments

Marking this one as resolved.

[PATCH] Use SHF_GNU_RETAIN to preserve symbol definitions

2020-11-12 Thread H.J. Lu via Gcc-patches

In assembly code, the section flag 'R' sets the SHF_GNU_RETAIN flag to
indicate that the section must be preserved by the linker.

Add SECTION_RETAIN to indicate a section should be retained by the linker
and set SECTION_RETAIN on section for the preserved symbol if assembler
supports SHF_GNU_RETAIN.  All retained symbols are placed in separate
sections with

.section .data.rel.local.preserved_symbol,"awR"
preserved_symbol:
...
.section .data.rel.local,"aw"
not_preserved_symbol:
...

to avoid

.section .data.rel.local,"awR"
preserved_symbol:
...
not_preserved_symbol:
...

which places not_preserved_symbol definition in the SHF_GNU_RETAIN
section.
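
At the source level, the kind of definition this affects would look roughly like the following. The sketch assumes, based on the attr-used tests listed in the ChangeLog, that "preserved symbol" means a definition carrying the GCC `used` attribute; the variable and function names simply echo the assembly listing above.

```cpp
#include <cassert>

// With GCC or Clang, the `used` attribute asks the compiler to emit this
// definition even if nothing in the translation unit references it.
__attribute__((used)) static int preserved_symbol = 1;

// An ordinary definition with no such request.
static int not_preserved_symbol = 2;

int sum_symbols() { return preserved_symbol + not_preserved_symbol; }
```

The attribute alone only concerns the compiler; the point of the patch is that the retained section flag extends the request to the linker's section garbage collection.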

gcc/

2020-11-XX  H.J. Lu  

* configure.ac (HAVE_GAS_SHF_GNU_RETAIN): New.  Define 1 if
the assembler supports marking sections with SHF_GNU_RETAIN flag.
* output.h (SECTION_RETAIN): New.  Defined as 0x400.
(SECTION_MACH_DEP): Changed from 0x400 to 0x800.
(default_unique_section): Add a bool argument.
* varasm.c (get_section): Set SECTION_RETAIN for the preserved
symbol with HAVE_GAS_SHF_GNU_RETAIN.
(resolve_unique_section): Use named section for the preserved
symbol if assembler supports SHF_GNU_RETAIN.
(get_variable_section): Handle the preserved common symbol with
HAVE_GAS_SHF_GNU_RETAIN.
(default_elf_asm_named_section): Require the full declaration and
use the 'R' flag for SECTION_RETAIN.
* config.in: Regenerated.
* configure: Likewise.

gcc/testsuite/

2020-11-XX  H.J. Lu  
Jozef Lawrynowicz  

* c-c++-common/attr-used.c: Check the 'R' flag.
* c-c++-common/attr-used-2.c: Likewise.
* c-c++-common/attr-used-3.c: New test.
* c-c++-common/attr-used-4.c: Likewise.
* gcc.c-torture/compile/attr-used-retain-1.c: Likewise.
* gcc.c-torture/compile/attr-used-retain-2.c: Likewise.
* lib/target-supports.exp
(check_effective_target_R_flag_in_section): New proc.
---
 gcc/config.in |  7 +++
 gcc/configure | 51 +++
 gcc/configure.ac  | 20 
 gcc/output.h  |  6 ++-
 gcc/testsuite/c-c++-common/attr-used-2.c  |  1 +
 gcc/testsuite/c-c++-common/attr-used-3.c  |  7 +++
 gcc/testsuite/c-c++-common/attr-used-4.c  |  7 +++
 gcc/testsuite/c-c++-common/attr-used.c|  1 +
 .../compile/attr-used-retain-1.c  | 32 
 .../compile/attr-used-retain-2.c  | 15 ++
 gcc/testsuite/lib/target-supports.exp | 40 +++
 gcc/varasm.c  | 17 +--
 12 files changed, 200 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-3.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-4.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-2.c

diff --git a/gcc/config.in b/gcc/config.in
index b7c3107bfe3..23ae2f9bc1b 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1352,6 +1352,13 @@
 #endif
 
 
+/* Define 0/1 if your assembler supports marking sections with SHF_GNU_RETAIN
+   flag. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GAS_SHF_GNU_RETAIN
+#endif
+
+
 /* Define 0/1 if your assembler supports marking sections with SHF_MERGE flag.
*/
 #ifndef USED_FOR_TARGET
diff --git a/gcc/configure b/gcc/configure
index dbda4415a17..a925a6e5efb 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -24272,6 +24272,57 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF
 
 
+# Test if the assembler supports the section flag 'R' for specifying
+# section with SHF_GNU_RETAIN.
+case "${target}" in
+  # Solaris may use GNU assembler with Solairs ld.  Even if GNU
+  # assembler supports the section flag 'R', it doesn't mean that
+  # Solairs ld supports it.
+  *-*-solaris2*)
+gcc_cv_as_shf_gnu_retain=no
+;;
+  *)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for section 'R' flag" >&5
+$as_echo_n "checking assembler for section 'R' flag... " >&6; }
+if ${gcc_cv_as_shf_gnu_retain+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_shf_gnu_retain=no
+if test $in_tree_gas = yes; then
+if test $in_tree_gas_is_elf = yes \
+  && test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 36 \) \* 1000 + 0`
+  then gcc_cv_as_shf_gnu_retain=yes
+fi
+  elif test x$gcc_cv_as != x; then
+$as_echo '.section .foo,"awR",%progbits
+.byte 0' > conftest.s
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags --fatal-warnings -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+then
+   gcc_cv_as_shf_gnu_retain=yes
+

[committed] jit: add support for inline asm [PR87291]

2020-11-12 Thread David Malcolm via Gcc-patches

This patch adds various entrypoints to libgccjit for directly embedding
asm statements into a compile, analogous to inline asm in the C frontend:
  gcc_jit_block_add_extended_asm
  gcc_jit_block_end_with_extended_asm_goto
  gcc_jit_extended_asm_as_object
  gcc_jit_extended_asm_set_volatile_flag
  gcc_jit_extended_asm_set_inline_flag
  gcc_jit_extended_asm_add_output_operand
  gcc_jit_extended_asm_add_input_operand
  gcc_jit_extended_asm_add_clobber
  gcc_jit_context_add_top_level_asm

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as 421d0d0f54294a7bf2872b3b2ac521ce0fa9869e.

gcc/jit/ChangeLog:
PR jit/87291
* docs/cp/topics/asm.rst: New file.
* docs/cp/topics/index.rst (Topic Reference): Add it.
* docs/topics/asm.rst: New file.
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_15): New.
* docs/topics/functions.rst (Statements): Add link to extended
asm.
* docs/topics/index.rst (Topic Reference): Add asm.rst.
* docs/topics/objects.rst: Add gcc_jit_extended_asm to ASCII art.
* jit-common.h (gcc::jit::recording::extended_asm): New forward
decl.
(gcc::jit::recording::top_level_asm): Likewise.
* jit-playback.c: Include "stmt.h".
(build_string): New.
(gcc::jit::playback::context::new_string_literal): Disambiguate
build_string call.
(gcc::jit::playback::context::add_top_level_asm): New.
(build_operand_chain): New.
(build_clobbers): New.
(build_goto_operands): New.
(gcc::jit::playback::block::add_extended_asm): New.
* jit-playback.h (gcc::jit::playback::context::add_top_level_asm):
New decl.
(struct gcc::jit::playback::asm_operand): New struct.
(gcc::jit::playback::block::add_extended_asm): New decl.
* jit-recording.c (gcc::jit::recording::context::dump_to_file):
Dump top-level asms.
(gcc::jit::recording::context::add_top_level_asm): New.
(gcc::jit::recording::block::add_extended_asm): New.
(gcc::jit::recording::block::end_with_extended_asm_goto): New.
(gcc::jit::recording::asm_operand::asm_operand): New.
(gcc::jit::recording::asm_operand::print): New.
(gcc::jit::recording::asm_operand::make_debug_string): New.
(gcc::jit::recording::output_asm_operand::write_reproducer): New.
(gcc::jit::recording::output_asm_operand::print): New.
(gcc::jit::recording::input_asm_operand::write_reproducer): New.
(gcc::jit::recording::input_asm_operand::print): New.
(gcc::jit::recording::extended_asm::add_output_operand): New.
(gcc::jit::recording::extended_asm::add_input_operand): New.
(gcc::jit::recording::extended_asm::add_clobber): New.
(gcc::jit::recording::extended_asm::replay_into): New.
(gcc::jit::recording::extended_asm::make_debug_string): New.
(gcc::jit::recording::extended_asm::write_flags): New.
(gcc::jit::recording::extended_asm::write_clobbers): New.
(gcc::jit::recording::extended_asm_simple::write_reproducer): New.
(gcc::jit::recording::extended_asm::maybe_populate_playback_blocks):
New.
(gcc::jit::recording::extended_asm_goto::extended_asm_goto): New.
(gcc::jit::recording::extended_asm_goto::replay_into): New.
(gcc::jit::recording::extended_asm_goto::write_reproducer): New.
(gcc::jit::recording::extended_asm_goto::get_successor_blocks):
New.
(gcc::jit::recording::extended_asm_goto::maybe_print_gotos): New.
(gcc::jit::recording::extended_asm_goto::maybe_populate_playback_blocks):
New.
(gcc::jit::recording::top_level_asm::top_level_asm): New.
(gcc::jit::recording::top_level_asm::replay_into): New.
(gcc::jit::recording::top_level_asm::make_debug_string): New.
(gcc::jit::recording::top_level_asm::write_to_dump): New.
(gcc::jit::recording::top_level_asm::write_reproducer): New.
* jit-recording.h
(gcc::jit::recording::context::add_top_level_asm): New decl.
(gcc::jit::recording::context::m_top_level_asms): New field.
(gcc::jit::recording::block::add_extended_asm): New decl.
(gcc::jit::recording::block::end_with_extended_asm_goto): New
decl.
(gcc::jit::recording::asm_operand): New class.
(gcc::jit::recording::output_asm_operand): New class.
(gcc::jit::recording::input_asm_operand): New class.
(gcc::jit::recording::extended_asm): New class.
(gcc::jit::recording::extended_asm_simple): New class.
(gcc::jit::recording::extended_asm_goto): New class.
(gcc::jit::recording::top_level_asm): New class.
* libgccjit++.h (gccjit::extended_asm): New forward decl.
(gccjit::context::add_top_level_asm): New.
(gccjit::block::add_extended_asm): New.
(gccjit::block::end_with_extended_asm_goto): New.

gcc-8-20201112 is now available

2020-11-12 Thread GCC Administrator via Gcc

Snapshot gcc-8-20201112 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/8-20201112/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-8 
revision 6f53dfa9acec588c3c7fb19ab10a286c190045fe

You'll find:

 gcc-8-20201112.tar.xz    Complete GCC

  SHA256=56c1908be7eae6da42a37141217b57fd5a587437c29b74ae7b73f23a98d4a6f0
  SHA1=c888b6ef596cef6c4abd872fe10e85bc826cc12a

Diffs from 8-20201105 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

[committed] jit: fix string escaping

2020-11-12 Thread David Malcolm via Gcc-patches

This patch fixes a bug in recording::string::make_debug_string in which
'\t' and '\n' were "escaped" by simply prepending a '\', thus emitting
'\' then '\n', rather than '\' then 'n'.  It also removes a hack that
determined if a string is to be escaped by checking for a leading '"',
by instead adding a flag.
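
For readers who want the escaping rule in isolation, here is a minimal
standalone sketch in plain C.  It is not the libgccjit code: the function
name escape_for_debug and the malloc-based buffer are assumptions made for
illustration only.  It shows the intended behaviour, i.e. a tab becomes
'\' then 't' and a newline becomes '\' then 'n', while '\' and '"' are
prefixed with a '\'.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libgccjit): return a newly
   malloc'ed, double-quoted and escaped copy of TEXT, or NULL on
   allocation failure.  The caller frees the result.  */
char *
escape_for_debug (const char *text)
{
  size_t len = strlen (text);
  /* Worst case: every char becomes two, plus two quotes and the NUL.  */
  char *out = malloc (2 * len + 3);
  size_t pos = 0;

  if (!out)
    return NULL;

  out[pos++] = '"'; /* opening quote */
  for (size_t i = 0; i < len; i++)
    {
      char ch = text[i];
      switch (ch)
        {
        case '\t':
          /* Emit backslash then 't', not backslash then a real tab.  */
          out[pos++] = '\\';
          out[pos++] = 't';
          break;
        case '\n':
          out[pos++] = '\\';
          out[pos++] = 'n';
          break;
        case '\\':
        case '"':
          out[pos++] = '\\';
          out[pos++] = ch;
          break;
        default:
          out[pos++] = ch;
          break;
        }
    }
  out[pos++] = '"'; /* closing quote */
  out[pos] = '\0';
  return out;
}
```

A separate "already escaped" flag, as in the patch, avoids having to guess
from the first character whether a string has been through this step.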

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as fec573408310139e1ffc42741fbe46b4f2947592.

gcc/jit/ChangeLog:
* jit-recording.c (recording::context::new_string): Add "escaped"
param and use it when creating the new recording::string instance.
(recording::string::string): Add "escaped" param and use it to
initialize m_escaped.
(recording::string::make_debug_string): Replace check that first
char is double-quote with use of m_escaped.  Fix escaping of
'\t' and '\n'.  Set "escaped" on the result.
* jit-recording.h (recording::context::new_string): Add "escaped"
param.
(recording::string::string): Add "escaped" param.
(recording::string::m_escaped): New field.

gcc/testsuite/ChangeLog:
* jit.dg/test-debug-strings.c (create_code): Add tests of
string literal escaping.
---
 gcc/jit/jit-recording.c   | 39 ---
 gcc/jit/jit-recording.h   |  9 --
 gcc/testsuite/jit.dg/test-debug-strings.c | 20 
 3 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
index 3cbeba0f371..3a84c1fc5c0 100644
--- a/gcc/jit/jit-recording.c
+++ b/gcc/jit/jit-recording.c
@@ -724,12 +724,12 @@ recording::context::disassociate_from_playback ()
This creates a fresh copy of the given 0-terminated buffer.  */
 
 recording::string *
-recording::context::new_string (const char *text)
+recording::context::new_string (const char *text, bool escaped)
 {
   if (!text)
 return NULL;
 
-  recording::string *result = new string (this, text);
+  recording::string *result = new string (this, text, escaped);
   record (result);
   return result;
 }
@@ -1954,8 +1954,9 @@ recording::memento::write_to_dump (dump )
 /* Constructor for gcc::jit::recording::string::string, allocating a
copy of the given text using new char[].  */
 
-recording::string::string (context *ctxt, const char *text)
-  : memento (ctxt)
+recording::string::string (context *ctxt, const char *text, bool escaped)
+: memento (ctxt),
+  m_escaped (escaped)
 {
   m_len = strlen (text);
   m_buffer = new char[m_len + 1];
@@ -2005,9 +2006,9 @@ recording::string::from_printf (context *ctxt, const char 
*fmt, ...)
 recording::string *
 recording::string::make_debug_string ()
 {
-  /* Hack to avoid infinite recursion into strings when logging all
- mementos: don't re-escape strings:  */
-  if (m_buffer[0] == '"')
+  /* Avoid infinite recursion into strings when logging all mementos:
+ don't re-escape strings:  */
+  if (m_escaped)
 return this;
 
   /* Wrap in quotes and do escaping etc */
@@ -2024,15 +2025,31 @@ recording::string::make_debug_string ()
   for (size_t i = 0; i < m_len ; i++)
 {
   char ch = m_buffer[i];
-  if (ch == '\t' || ch == '\n' || ch == '\\' || ch == '"')
-   APPEND('\\');
-  APPEND(ch);
+  switch (ch)
+   {
+   default:
+ APPEND(ch);
+ break;
+   case '\t':
+ APPEND('\\');
+ APPEND('t');
+ break;
+   case '\n':
+ APPEND('\\');
+ APPEND('n');
+ break;
+   case '\\':
+   case '"':
+ APPEND('\\');
+ APPEND(ch);
+ break;
+   }
 }
   APPEND('"'); /* closing quote */
 #undef APPEND
   tmp[len] = '\0'; /* nil termintator */
 
-  string *result = m_ctxt->new_string (tmp);
+  string *result = m_ctxt->new_string (tmp, true);
 
   delete[] tmp;
   return result;
diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index 30e37aff387..9a43a7bf33a 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -74,7 +74,7 @@ public:
   void disassociate_from_playback ();
 
   string *
-  new_string (const char *text);
+  new_string (const char *text, bool escaped = false);
 
   location *
   new_location (const char *filename,
@@ -414,7 +414,7 @@ private:
 class string : public memento
 {
 public:
-  string (context *ctxt, const char *text);
+  string (context *ctxt, const char *text, bool escaped);
   ~string ();
 
   const char *c_str () { return m_buffer; }
@@ -431,6 +431,11 @@ private:
 private:
   size_t m_len;
   char *m_buffer;
+
+  /* Flag to track if this string is the result of string::make_debug_string,
+ to avoid infinite recursion when logging all mementos: don't re-escape
+ such strings.  */
+  bool m_escaped;
 };
 
 class location : public memento
diff --git a/gcc/testsuite/jit.dg/test-debug-strings.c 
b/gcc/testsuite/jit.dg/test-debug-strings.c
index e515a176257..03ef3370d94 100644
---

[committed] libgccjit.h: fix typo in comment

2020-11-12 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as 8948a5715b00fe36d20c03b6c4c4397b74cc6282.

gcc/jit/ChangeLog:
* libgccjit.h: Fix typo in comment.
---
 gcc/jit/libgccjit.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
index 7134841bb07..7fbaa9f3162 100644
--- a/gcc/jit/libgccjit.h
+++ b/gcc/jit/libgccjit.h
@@ -1504,7 +1504,7 @@ gcc_jit_context_new_rvalue_from_vector (gcc_jit_context 
*ctxt,
 
 #define LIBGCCJIT_HAVE_gcc_jit_version
 
-/* Functions to retrive libgccjit version.
+/* Functions to retrieve libgccjit version.
Analogous to __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__ in C code.
 
These API entrypoints were added in LIBGCCJIT_ABI_13; you can test for their
-- 
2.26.2

Re: [PATCH] c++: Don't form a templated TARGET_EXPR in finish_compound_literal

2020-11-12 Thread Jason Merrill via Gcc-patches


On 11/12/20 1:27 PM, Patrick Palka wrote:

The atom_cache in normalize_atom relies on the assumption that two
equivalent (templated) trees (in the sense of cp_tree_equal) must use
the same template parameters (according to find_template_parameters).

This assumption unfortunately doesn't always hold for TARGET_EXPRs,
because cp_tree_equal ignores an artificial target of a TARGET_EXPR, but
find_template_parameters walks this target (and its DECL_CONTEXT).

Hence two TARGET_EXPRs built by force_target_expr with the same
initializer but under different settings of current_function_decl may
compare equal according to cp_tree_equal, but find_template_parameters
returns a different set of template parameters for them.  This breaks
the below testcase because during normalization we build two such
TARGET_EXPRs (one under current_function_decl=f and another under =g),
and then use the same ATOMIC_CONSTR for the two corresponding atoms,
leading to a crash during satisfaction of g's associated constraints.

This patch works around this assumption violation by removing the source
of these templated TARGET_EXPRs.  The relevant call to get_target_expr was
added in r9-6043, but it seems it's no longer necessary (according to
https://gcc.gnu.org/pipermail/gcc-patches/2019-February/517323.html, the
call was added in order to avoid regressing on initlist109.C at the time).

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.  I wonder what else asserting !processing_template_decl in 
build_target_expr would find...



gcc/cp/ChangeLog:

* semantics.c (finish_compound_literal): Don't wrap the original
compound literal in a TARGET_EXPR when inside a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-decltype3.C: New test.
---
  gcc/cp/semantics.c  |  7 +--
  gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C | 15 +++
  2 files changed, 16 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 33d715edaec..172286922e7 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -3006,12 +3006,7 @@ finish_compound_literal (tree type, tree 
compound_literal,
  
/* If we're in a template, return the original compound literal.  */

if (orig_cl)
-{
-  if (!VECTOR_TYPE_P (type))
-   return get_target_expr_sfinae (orig_cl, complain);
-  else
-   return orig_cl;
-}
+return orig_cl;
  
if (TREE_CODE (compound_literal) == CONSTRUCTOR)

  {
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
new file mode 100644
index 000..837855ce8ac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
@@ -0,0 +1,15 @@
+// { dg-do compile { target c++20 } }
+
+template  concept C = requires(T t) { t; };
+
+template  using A = decltype((T{}, int{}));
+
+template  concept D = C>;
+
+template  void f() requires D;
+template  void g() requires D;
+
+void h() {
+  f();
+  g();
+}

Re: [PATCH] PR target/97682 - Fix to reuse t1 register between call address and epilogue.

2020-11-12 Thread Jim Wilson

On Mon, Nov 9, 2020 at 11:15 PM Monk Chiang  wrote:

> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 172c7ca7c98..3bd1993c4c9 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -342,9 +342,13 @@ extern const char *riscv_default_mtune (int argc,
> const char **argv);
> The epilogue temporary mustn't conflict with the return registers,
> the frame pointer, the EH stack adjustment, or the EH data registers.
> */
>
> -#define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST + 1)
> +#define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST)
>  #define RISCV_PROLOGUE_TEMP(MODE) gen_rtx_REG (MODE,
> RISCV_PROLOGUE_TEMP_REGNUM)
>
> +#define RISCV_CALL_ADDRESS_TEMP_REGNUM (GP_TEMP_FIRST + 1)
> +#define RISCV_CALL_ADDRESS_TEMP(MODE) \
> +  gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
>

This looks generally OK, however there is a minor problem that we have code
in riscv_compute_frame_info to save t1 in an interrupt handler register
with a large stack frame, as we know the prologue code will clobber t1 in
this case.  However, with this patch, the prologue now clobbers t0
instead.  So riscv_compute_frame_info needs to be fixed.  I'd suggest
changing the T1_REGNUM to RISCV_PROLOGUE_TEMP_REGNUM to prevent this from
happening again; that is probably my fault.  And the interrupt_save_t1
variable should be renamed, maybe to interrupt_save_prologue_temp.
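
To illustrate why deriving the saved register from the macro is safer, here
is a toy model in C.  It is only a sketch: the three helper functions are
hypothetical and do not exist in riscv.c; the register numbers assume the
usual RISC-V numbering (t0 = x5, t1 = x6), and the two macros mirror the
definitions in the quoted hunk.

```c
/* Toy model, not GCC source.  t0 is x5 and t1 is x6.  */
#define GP_TEMP_FIRST 5
#define T1_REGNUM 6

/* As in the quoted patch: the prologue temporary moves to t0 and the
   call address temporary takes t1.  */
#define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST)
#define RISCV_CALL_ADDRESS_TEMP_REGNUM (GP_TEMP_FIRST + 1)

/* Hypothetical: the register the prologue clobbers when the frame is
   too large for an immediate stack adjustment.  */
int
prologue_clobbered_reg (void)
{
  return RISCV_PROLOGUE_TEMP_REGNUM;
}

/* Hypothetical: what an interrupt handler saves if the register number
   is hard-coded as T1_REGNUM.  */
int
interrupt_saved_reg_hardcoded (void)
{
  return T1_REGNUM;
}

/* Hypothetical: what it saves if the number comes from the macro.  */
int
interrupt_saved_reg_from_macro (void)
{
  return RISCV_PROLOGUE_TEMP_REGNUM;
}
```

With the hard-coded number the saved register (t1) and the clobbered
register (t0) disagree once the macro changes; with the macro they cannot
drift apart.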

You can see the problem with gcc/testsuite/gcc.target/riscv/interrupt-3.c
if you compile with -O0 and we get
foo:
addi sp,sp,-32
sw t1,28(sp)
sw s0,24(sp)
addi s0,sp,32
li t0,-4096
addi t0,t0,16
add sp,sp,t0
so we are saving t1 and then clobbering t0 with your patch.

Otherwise this looks good.

Jim

[Bug rtl-optimization/97777] ICE: in df_refs_verify, at df-scan.c:3991 with -O -ffinite-math-only -fzero-call-used-regs=all

2020-11-12 Thread qinzhao at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9

--- Comment #3 from qinzhao at gcc dot gnu.org ---
This does not look like a bug in the new -fzero-call-used-regs implementation.

It's more likely an existing bug in data flow analysis.

I made the following change in gcc/function.c to make the new pass do
nothing except a df_analyze:

qinzhao@gcc10:~/Work/write_gcc/gcc$ git diff function.c
diff --git a/gcc/function.c b/gcc/function.c
index 004fa389207..658b08ae215 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -6658,13 +6658,14 @@ pass_zero_call_used_regs::execute (function *fun)
   /* Iterate over the function's return instructions and insert any
  register zeroing required by the -fzero-call-used-regs command-line
  option or the "zero_call_used_regs" function attribute.  */
+#if 0
   FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
 {
   rtx_insn *insn = BB_END (e->src);
   if (JUMP_P (insn) && ANY_RETURN_P (JUMP_LABEL (insn)))
gen_call_used_regs_seq (insn, zero_regs_type);
 }
-
+#endif
   return 0;
 }

With this gcc, exactly the same ICE is still there.

Re: [PATCH] RISC-V: Enable ifunc if it was supported in the binutils for linux toolchain.

2020-11-12 Thread Jim Wilson

On Tue, Nov 10, 2020 at 7:33 PM Nelson Chu  wrote:

> gcc/
> * configure: Regenerated.
> * configure.ac: If ifunc was supported in the binutils for
> linux toolchain, then set enable_gnu_indirect_function to yes.
>

Looks good.  I committed and pushed it.

I see some extra ifunc related testsuite failures, but that is because we
don't have the glibc ifunc patches upstream yet.  It will be important to
get those done next.

Jim

Re: PowerPC: Use float128 instead of ieee128 in tests.

2020-11-12 Thread Michael Meissner via Gcc-patches

On Thu, Nov 12, 2020 at 01:26:32PM -0600, Segher Boessenkool wrote:
> Hi,
> 
> On Thu, Oct 22, 2020 at 06:12:31PM -0400, Michael Meissner wrote:
> > Two of the tests used the __ieee128 keyword instead of __float128.  This
> > patch changes those cases to use the official keyword.
> 
> What is "official" about that?
> 
> Why make this change at all?  __ieee128 should work as well!  Did you
> see failures without this patch?  Those need fixing, then.

We document '__float128'.  We don't document '__ieee128'.  As I said, using
'__ieee128' internally was due to some issues in the GCC 7 time frame,
particularly before we had the glibc changes.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: PowerPC: Add __float128 conversions to/from Decimal

2020-11-12 Thread Michael Meissner via Gcc-patches

On Thu, Oct 29, 2020 at 10:05:38PM +, Joseph Myers wrote:
> On Thu, 29 Oct 2020, Segher Boessenkool wrote:
> 
> > > Doing these conversions accurately is nontrivial.  Converting via strings 
> > > is the simple approach (i.e. the one that moves the complexity somewhere 
> > > else).  There are more complicated but more efficient approaches that can 
> > > achieve correct conversions with smaller bounds on resource usage (and 
> > > there are various papers published in this area), but those involve a lot 
> > > more code (and precomputed data, with a speed/space trade-off in how much 
> > > you precompute; the BID code in libgcc has several MB of precomputed data 
> > > for that purpose).
> > 
> > Does the printf code in libgcc handle things correctly for IEEE QP float
> > as long double, do you know?
> 
> As far as I know, the code in libgcc for conversions *from* decimal *to* 
> binary (so the direction that uses strtof128 as opposed to the one using 
> strfrom128, in the binary128 case) works correctly, if the underlying libc 
> has accurate string/numeric conversion operations.
> 
> Binary to decimal is another matter, even for cases such as float to 
> _Decimal64.  I've just filed bug 97635 for that.
> 
> Also note that if you want to use printf as opposed to strfromf128 for 
> IEEE binary128 you'll need to use __printfieee128 (the version that 
> expects long double to be IEEE binary128) which was introduced in glibc 
> 2.32, so that doesn't help with the glibc version dependencies.

My latest patch now switches to using glibc 2.32 and __sprintfieee128.
If we don't have glibc 2.32, it just calls abort, so we don't get linker
errors.  I hope to submit it tonight or tomorrow night.

> When I investigated and reported several bugs in the conversion operations 
> in libdfp, I noted (e.g. https://github.com/libdfp/libdfp/issues/29 ) that 
> the libgcc versions were working correctly for those tests (and filed and 
> subsequently fixed one glibc strtod bug, missing inexact exceptions, that 
> I'd noticed while looking at such issues in libdfp).  But the specific 
> case I tested for badly rounded conversions was the case of conversions 
> from decimal to binary, not the case of conversions from binary to 
> decimal, which, as noted above, turn out to be buggy in libgcc.
> 
> Lots of bugs have been fixed in the glibc conversion code over the years 
> (more on the strtod side than in the code shared by printf and strfrom 
> functions).  That code uses multiple-precision operations from GMP, which 
> avoids some complications but introduces others (it also needs to e.g. 
> deal with locale issues, which are irrelevant for libgcc conversions).

Using the sprintf method, I see an error in

c-c++-common/dfp/convert-bfp-11.c

that I didn't see with the method used in the patches with strtof128 and
strfromf128 directly.  I need to track down exactly what the error is.

All of the other dfp conversion tests work fine.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[PATCH] C-Family, Objective-C : Implement Objective-C nullability Part 1 [PR90707].

2020-11-12 Thread Iain Sandoe


Hi,

The PR notes that our inability to parse these keywords in GNU Objective-C
is one of the contributing factors to being unable to use some important
system headers (at least, on Darwin platforms).

tested on x86_64-darwin and x86_64-linux-gnu,
OK for the C-family changes?
thanks
Iain

— commit log

Part 1 of the implementation covers property nullability attributes
and includes the changes to common code. Follow-on changes will be needed
to cover Objective-C method definitions, but those are expected to be
local to the Objective-C front end.

The basis of the implementation is to translate the Objective-C-specific
keywords into an attribute (objc_nullability) which has the required
states to carry the attribute markup.

We introduce the keywords, and these are parsed and validated in the same
manner as other property attributes.  The resulting value is attached to
the property as an objc_nullability attribute.

gcc/c-family/ChangeLog:

PR objc/90707
* c-common.c (c_common_reswords): null_unspecified, nullable,
nonnull, null_resettable: New keywords.
* c-common.h (enum rid): RID_NULL_UNSPECIFIED, RID_NULLABLE,
RID_NONNULL, RID_NULL_RESETTABLE: New.
(OBJC_IS_PATTR_KEYWORD): Include nullability keywords in the
ranges accepted for property attributes.
* c-attribs.c (handle_objc_nullability_attribute): New.
* c-objc.h (enum objc_property_attribute_group): Add
OBJC_PROPATTR_GROUP_NULLABLE.
(enum objc_property_attribute_kind): Add
OBJC_PROPERTY_ATTR_NULL_UNSPECIFIED, OBJC_PROPERTY_ATTR_NULLABLE,
OBJC_PROPERTY_ATTR_NONNULL, OBJC_PROPERTY_ATTR_NULL_RESETTABLE.

gcc/objc/ChangeLog:

PR objc/90707
* objc-act.c (objc_prop_attr_kind_for_rid): Handle nullability.
(objc_add_property_declaration): Handle nullability attributes.
Check that these are applicable to the property type.
* objc-act.h (enum objc_property_nullability): New.

gcc/testsuite/ChangeLog:

PR objc/90707
* obj-c++.dg/property/at-property-4.mm: Add basic nullability
tests.
* objc.dg/property/at-property-4.m: Likewise.
* obj-c++.dg/attributes/nullability-00.mm: New test.
* obj-c++.dg/property/nullability-00.mm: New test.
* objc.dg/attributes/nullability-00.m: New test.
* objc.dg/property/nullability-00.m: New test.

gcc/ChangeLog:

PR objc/90707
* doc/extend.texi: Document the objc_nullability attribute.
---
 gcc/c-family/c-attribs.c  | 49 ++
 gcc/c-family/c-common.c   |  6 +++
 gcc/c-family/c-common.h   |  7 ++-
 gcc/c-family/c-objc.h |  5 ++
 gcc/doc/extend.texi   | 27 ++
 gcc/objc/objc-act.c   | 51 ++-
 gcc/objc/objc-act.h   | 10 
 .../obj-c++.dg/attributes/nullability-00.mm   | 20 
 .../obj-c++.dg/property/at-property-4.mm  | 20 +++-
 .../obj-c++.dg/property/nullability-00.mm | 21 
 .../objc.dg/attributes/nullability-00.m   | 20 
 .../objc.dg/property/at-property-4.m  | 18 +++
 .../objc.dg/property/nullability-00.m | 21 
 13 files changed, 272 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/obj-c++.dg/attributes/nullability-00.mm
 create mode 100644 gcc/testsuite/obj-c++.dg/property/nullability-00.mm
 create mode 100644 gcc/testsuite/objc.dg/attributes/nullability-00.m
 create mode 100644 gcc/testsuite/objc.dg/property/nullability-00.m

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 6718fff6efb..9c62508651c 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -161,6 +161,7 @@ static tree handle_patchable_function_entry_attribute  
(tree *, tree, tree,

 static tree handle_copy_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nsobject_attribute (tree *, tree, tree, int, bool *);
 static tree handle_objc_root_class_attribute (tree *, tree, tree, int, bool *);
+static tree handle_objc_nullability_attribute (tree *, tree, tree, int,  
bool *);


 /* Helper to define attribute exclusions.  */
 #define ATTR_EXCL(name, function, type, variable)  \
@@ -520,6 +521,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_nsobject_attribute, NULL },
   { "objc_root_class", 0, 0, true, false, false, false,
  handle_objc_root_class_attribute, NULL },
+  { "objc_nullability",1, 1, true, false, false, false,
+ handle_objc_nullability_attribute, NULL },
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
 };

@@ -5251,6 +5254,52 @@ handle_objc_root_class_attribute (tree */*node*/,  
tree name, tree /*args*/,

   return NULL_TREE;
 }

+/* Handle an

[Bug fortran/85796] ICE: Floating point exception

2020-11-12 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85796

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 CC||anlauf at gcc dot gnu.org

--- Comment #6 from anlauf at gcc dot gnu.org ---
Jerry, are you still following this one?

c: C2x __has_c_attribute

2020-11-12 Thread Joseph Myers

C2x adds the __has_c_attribute preprocessor operator, similar to C++
__has_cpp_attribute.

GCC implements __has_cpp_attribute as exactly equivalent to
__has_attribute.  (The documentation says they differ regarding the
values returned for standard attributes, but that's actually only a
matter of the particular nonzero value returned not being specified in
the documentation for __has_attribute; the implementation makes no
distinction between the two.)

I don't think having them exactly equivalent is actually correct,
either for __has_cpp_attribute or for __has_c_attribute.
Specifically, I think it is only correct for __has_cpp_attribute or
__has_c_attribute to return nonzero if the given attribute is
supported, with the particular pp-tokens passed to __has_cpp_attribute
or __has_c_attribute, with [[]] syntax, not if it's only accepted in
__attribute__ or with gnu:: added in [[]].  For example, they should
return nonzero for gnu::packed, but zero for plain packed, because
[[gnu::packed]] is accepted but [[packed]] is ignored as not a
standard attribute.

This patch implements that for __has_c_attribute, leaving any changes
to __has_cpp_attribute for the C++ maintainers.  A new
BT_HAS_STD_ATTRIBUTE is added for __has_c_attribute (which I think,
based on the above, would actually be correct to use for
__has_cpp_attribute as well).  The code in c_common_has_attribute that
deals with scopes has its C++ conditional removed; instead, whether
the language is C or C++ is used only to determine the numeric values
returned for standard attributes (and which standard attributes are
handled there at all).  A new argument is passed to
c_common_has_attribute to distinguish BT_HAS_STD_ATTRIBUTE from
BT_HAS_ATTRIBUTE, and that argument is used to stop attributes with no
scope specified from being accepted with __has_c_attribute unless they
are one of the known standard attributes and so handled specially.

Although the standard specifies constants ending with 'L' as the values
for the standard attributes, there is no correctness issue with the
lack of code in GCC to add that 'L' to the expansion:
__has_c_attribute and __has_cpp_attribute are expanded in #if after
other macro expansion has occurred, with no semantics being specified
if they occur outside #if, so there is no way for a conforming program
to inspect the exact text of the expansion of those macros, only to
use the resulting pp-number in a #if expression, where long and int
have the same set of values.
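
A small sketch of that last point, using an arbitrary constant rather than
any value from the standard: inside #if, a pp-number with and without the
'L' suffix compares equal, so a program cannot tell which form the
expansion used.  The macro and function names are made up for the example.

```c
/* In #if expressions all signed integer types behave like intmax_t,
   so the suffix does not change the value being compared.  */
#if 202003L == 202003
# define SUFFIX_OBSERVABLE_IN_IF 0
#else
# define SUFFIX_OBSERVABLE_IN_IF 1
#endif

/* Returns 0 when the 'L' suffix makes no observable difference.  */
int
suffix_observable_in_if (void)
{
  return SUFFIX_OBSERVABLE_IN_IF;
}
```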

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/
2020-11-12  Joseph Myers  

* doc/cpp.texi (__has_attribute): Document when scopes are allowed
for C.
(__has_c_attribute): New.

gcc/c-family/
2020-11-12  Joseph Myers  

* c-lex.c (c_common_has_attribute): Take argument std_syntax.
Allow scope for C.  Handle standard attributes for C.  Do not
accept unscoped attributes if std_syntax and not handled as
standard attributes.
* c-common.h (c_common_has_attribute): Update prototype.

gcc/testsuite/
2020-11-12  Joseph Myers  

* gcc.dg/c2x-has-c-attribute-1.c, gcc.dg/c2x-has-c-attribute-2.c,
gcc.dg/c2x-has-c-attribute-3.c, gcc.dg/c2x-has-c-attribute-4.c:
New tests.

libcpp/
2020-11-12  Joseph Myers  

* include/cpplib.h (struct cpp_callbacks): Add bool argument to
has_attribute.
(enum cpp_builtin_type): Add BT_HAS_STD_ATTRIBUTE.
* init.c (builtin_array): Add __has_c_attribute.
(cpp_init_special_builtins): Handle BT_HAS_STD_ATTRIBUTE.
* macro.c (_cpp_builtin_macro_text): Handle BT_HAS_STD_ATTRIBUTE.
Update call to has_attribute for BT_HAS_ATTRIBUTE.
* traditional.c (fun_like_macro): Handle BT_HAS_STD_ATTRIBUTE.

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 94f4868915a..f47097442eb 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1042,7 +1042,7 @@ extern bool c_cpp_diagnostic (cpp_reader *, enum 
cpp_diagnostic_level,
  enum cpp_warning_reason, rich_location *,
  const char *, va_list *)
  ATTRIBUTE_GCC_DIAG(5,0);
-extern int c_common_has_attribute (cpp_reader *);
+extern int c_common_has_attribute (cpp_reader *, bool);
 extern int c_common_has_builtin (cpp_reader *);
 
 extern bool parse_optimize_options (tree, bool);
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index e81e16ddc26..6cd3df7c96f 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -300,7 +300,7 @@ get_token_no_padding (cpp_reader *pfile)
 
 /* Callback for has_attribute.  */
 int
-c_common_has_attribute (cpp_reader *pfile)
+c_common_has_attribute (cpp_reader *pfile, bool std_syntax)
 {
   int result = 0;
   tree attr_name = NULL_TREE;
@@ -319,35 +319,37 @@ c_common_has_attribute (cpp_reader *pfile)
   attr_name = get_identifier ((const char *)

Re: Fix gimple_expr_code?

2020-11-12 Thread Andrew MacLeod via Gcc-patches


On 11/12/20 3:53 PM, Richard Biener wrote:

On November 12, 2020 9:43:52 PM GMT+01:00, Andrew MacLeod via Gcc-patches 
 wrote:

So I spent some time tracking down a ranger issue, and in the end, it
boiled down to the range-op handler not being picked up properly.

The handler is picked up by:

   if ((gimple_code (s) == GIMPLE_ASSIGN) || (gimple_code (s) ==
GIMPLE_COND))
     return range_op_handler (gimple_expr_code (s), gimple_expr_type
(s));

IMHO this should use more specific functions. Gimple_expr_code should go away 
similar to gimple_expr_type.


gimple_expr_type is quite pervasive, and each consumer is going to have
to roll their own version of it.  Why do we want to get rid of it?


If we are trying to save a few bytes by storing the information in
different places, then we're going to need some sort of accessor
function like that.



where it is indexing the table with the gimple_expr_code..
the stmt being processed was for a pointer assignment,
   _5 = _33
and it was coming back with a gimple_expr_code of  VAR_DECL instead of
an SSA_NAME... which confused me greatly.


gimple_expr_code (const gimple *stmt)
{
   enum gimple_code code = gimple_code (stmt);
   if (code == GIMPLE_ASSIGN || code == GIMPLE_COND)
     return (enum tree_code) stmt->subcode;

A little more digging shows this:

static inline enum tree_code
gimple_assign_rhs_code (const gassign *gs)
{
   enum tree_code code = (enum tree_code) gs->subcode;
   /* While we initially set subcode to the TREE_CODE of the rhs for
  GIMPLE_SINGLE_RHS assigns we do not update that subcode to stay
  in sync when we rewrite stmts into SSA form or do SSA
propagations.  */
   if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS)
     code = TREE_CODE (gs->op[1]);

   return code;
}

Fascinating comment.

... 


But it means that gimple_expr_code() isn't returning the correct result
for GIMPLE_SINGLE_RHS

It depends. A SSA name isn't an expression code either. As said, the generic 
gimple_expr_code should be used with extreme care.


What is an expression code?  It seems like it's just a tree_code
representing what is on the RHS?  I'm not sure I understand why one
needs to be careful with it.  It only applies to COND, ASSIGN and CALL,
and it's currently right for everything except GIMPLE_SINGLE_RHS?


If we don't fix gimple_expr_code, then I'm basically going to be
reimplementing it myself, which seems kind of pointless.
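
To make the discrepancy concrete, here is a toy model in C.  None of these
types or functions are GCC's; every toy_ name is invented for the sketch.
It assumes only what the quoted comment says: the cached subcode is not
refreshed when a single-operand RHS is rewritten, so one accessor reads the
stale cache and the other re-reads the operand.

```c
/* Toy stand-ins for tree codes and RHS classes (not GCC's enums).  */
enum toy_code { TOY_VAR_DECL, TOY_SSA_NAME, TOY_PLUS_EXPR };
enum toy_rhs_class { TOY_SINGLE_RHS, TOY_BINARY_RHS };

struct toy_operand
{
  enum toy_code code;
};

struct toy_assign
{
  enum toy_code subcode;     /* Cached when the statement is built.  */
  struct toy_operand *rhs1;  /* May be rewritten later.  */
};

static enum toy_rhs_class
toy_rhs_class_of (enum toy_code code)
{
  return code == TOY_PLUS_EXPR ? TOY_BINARY_RHS : TOY_SINGLE_RHS;
}

/* Models the generic accessor: trusts the cached subcode.  */
enum toy_code
toy_expr_code (const struct toy_assign *s)
{
  return s->subcode;
}

/* Models the assign-specific accessor: for a single RHS, ignore the
   cache and look at the operand itself.  */
enum toy_code
toy_assign_rhs_code (const struct toy_assign *s)
{
  enum toy_code code = s->subcode;
  if (toy_rhs_class_of (code) == TOY_SINGLE_RHS)
    code = s->rhs1->code;
  return code;
}
```

Building a statement whose operand is TOY_VAR_DECL and then changing the
operand to TOY_SSA_NAME leaves toy_expr_code answering TOY_VAR_DECL while
toy_assign_rhs_code answers TOY_SSA_NAME, which is the same shape as the
_5 = _33 case above.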


Andrew

Re: Fix gimple_expr_code?

2020-11-12 Thread Richard Biener via Gcc-patches

On November 12, 2020 9:43:52 PM GMT+01:00, Andrew MacLeod via Gcc-patches 
 wrote:
>So I spent some time tracking down a ranger issue, and in the end, it 
>boiled down to the range-op handler not being picked up properly.
>
>The handler is picked up by:
>
>   if ((gimple_code (s) == GIMPLE_ASSIGN) || (gimple_code (s) == 
>GIMPLE_COND))
>    return range_op_handler (gimple_expr_code (s), gimple_expr_type
>(s));

IMHO this should use more specific functions. Gimple_expr_code should go away 
similar to gimple_expr_type. 

>where it is indexing the table with the gimple_expr_code..
>the stmt being processed was for a pointer assignment,
>   _5 = _33
>and it was coming back with a gimple_expr_code of  VAR_DECL instead of 
>an SSA_NAME... which confused me greatly.
>
>
>gimple_expr_code (const gimple *stmt)
>{
>   enum gimple_code code = gimple_code (stmt);
>   if (code == GIMPLE_ASSIGN || code == GIMPLE_COND)
>     return (enum tree_code) stmt->subcode;
>
>A little more digging shows this:
>
>static inline enum tree_code
>gimple_assign_rhs_code (const gassign *gs)
>{
>   enum tree_code code = (enum tree_code) gs->subcode;
>   /* While we initially set subcode to the TREE_CODE of the rhs for
>  GIMPLE_SINGLE_RHS assigns we do not update that subcode to stay
>  in sync when we rewrite stmts into SSA form or do SSA 
>propagations.  */
>   if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS)
>     code = TREE_CODE (gs->op[1]);
>
>   return code;
>}
>
>Fascinating comment.

...  

>But it means that gimple_expr_code() isn't returning the correct result
>
>for GIMPLE_SINGLE_RHS

It depends. A SSA name isn't an expression code either. As said, the generic 
gimple_expr_code should be used with extreme care. 

>Wouldn't it make sense that gimple_expr_code be changed to return 
>gimple_assign_rhs_code() for GIMPLE_ASSIGN?
>
>I tested the attached patch, and it bootstraps and passes regression
>tests.
>
>There aren't a lot of places where it's used, but I saw a suspicious bit
>
>in ipa-icf-gimple.c that looks like it is working around this?
>
>
>bool
>func_checker::compare_gimple_assign (gimple *s1, gimple *s2)
>{
>   tree arg1, arg2;
>   tree_code code1, code2;
>   unsigned i;
>
>   code1 = gimple_expr_code (s1);
>   code2 = gimple_expr_code (s2);
>
>   if (code1 != code2)
>     return false;
>
>   code1 = gimple_assign_rhs_code (s1);
>   code2 = gimple_assign_rhs_code (s2);
>
>   if (code1 != code2)
>     return false;
>
>
>and  there were one or two other places where SSA_NAME occurred in the 
>cases of a switch after calling gimple_expr_code().
>
>This seems like it should be the right thing?
>Andrew

[Bug rtl-optimization/97777] ICE: in df_refs_verify, at df-scan.c:3991 with -O -ffinite-math-only -fzero-call-used-regs=all

2020-11-12 Thread qinzhao at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9

qinzhao at gcc dot gnu.org changed:

   What|Removed |Added

 CC||qinzhao at gcc dot gnu.org

--- Comment #2 from qinzhao at gcc dot gnu.org ---
When GCC is configured with --enable-checking=df, I can reproduce the failure.
I will check what's wrong with the data flow information.

Fix gimple_expr_code?

2020-11-12 Thread Andrew MacLeod via Gcc-patches

So I spent some time tracking down a ranger issue, and in the end, it 
boiled down to the range-op handler not being picked up properly.


The handler is picked up by:

  if ((gimple_code (s) == GIMPLE_ASSIGN) || (gimple_code (s) == 
GIMPLE_COND))

    return range_op_handler (gimple_expr_code (s), gimple_expr_type (s));

where it is indexing the table with the gimple_expr_code..
the stmt being processed was for a pointer assignment,
  _5 = _33
and it was coming back with a gimple_expr_code of  VAR_DECL instead of 
an SSA_NAME... which confused me greatly.



gimple_expr_code (const gimple *stmt)
{
  enum gimple_code code = gimple_code (stmt);
  if (code == GIMPLE_ASSIGN || code == GIMPLE_COND)
    return (enum tree_code) stmt->subcode;

A little more digging shows this:

static inline enum tree_code
gimple_assign_rhs_code (const gassign *gs)
{
  enum tree_code code = (enum tree_code) gs->subcode;
  /* While we initially set subcode to the TREE_CODE of the rhs for
 GIMPLE_SINGLE_RHS assigns we do not update that subcode to stay
 in sync when we rewrite stmts into SSA form or do SSA 
propagations.  */

  if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS)
    code = TREE_CODE (gs->op[1]);

  return code;
}

Fascinating comment.

But it means that gimple_expr_code() isn't returning the correct result 
for GIMPLE_SINGLE_RHS


Wouldn't it make sense that gimple_expr_code be changed to return 
gimple_assign_rhs_code() for GIMPLE_ASSIGN?
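
The mismatch described above can be reproduced in miniature. The following is only a sketch with hypothetical stand-in types (`struct assign_stmt`, `struct operand`, and the `single_rhs` flag are invented for illustration and are not GCC's data structures); it shows how a code cached at build time goes stale once the operand is rewritten, while re-reading the operand gives the current answer.

```c
/* Hypothetical stand-ins for a few tree codes; not GCC's definitions.  */
enum tree_code { VAR_DECL, SSA_NAME, PLUS_EXPR };

struct operand { enum tree_code code; };

struct assign_stmt
{
  enum tree_code subcode;  /* Cached once when the statement is built.  */
  int single_rhs;          /* Nonzero for a plain copy such as _5 = _33.  */
  struct operand *rhs1;
};

/* Models the old gimple_expr_code: trusts the cached subcode.  */
static enum tree_code
expr_code_cached (const struct assign_stmt *s)
{
  return s->subcode;
}

/* Models gimple_assign_rhs_code: for a single-RHS copy, look at the
   operand itself instead of the cached subcode.  */
static enum tree_code
expr_code_fresh (const struct assign_stmt *s)
{
  return s->single_rhs ? s->rhs1->code : s->subcode;
}

/* Build a copy from a VAR_DECL, then "rewrite into SSA" by changing the
   operand without updating the cached subcode.  */
static struct assign_stmt
make_rewritten_copy (struct operand *op)
{
  op->code = VAR_DECL;
  struct assign_stmt s = { op->code, 1, op };
  op->code = SSA_NAME;
  return s;
}
```

After `make_rewritten_copy`, `expr_code_cached` still reports `VAR_DECL` while `expr_code_fresh` reports `SSA_NAME`, which is the discrepancy seen for `_5 = _33`.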


I tested the attached patch, and it bootstraps and passes regression tests.

There aren't a lot of places where it's used, but I saw a suspicious bit 
in ipa-icf-gimple.c that looks like it is working around this?



   bool
   func_checker::compare_gimple_assign (gimple *s1, gimple *s2)
   {
  tree arg1, arg2;
  tree_code code1, code2;
  unsigned i;

  code1 = gimple_expr_code (s1);
  code2 = gimple_expr_code (s2);

  if (code1 != code2)
    return false;

  code1 = gimple_assign_rhs_code (s1);
  code2 = gimple_assign_rhs_code (s2);

  if (code1 != code2)
    return false;


and  there were one or two other places where SSA_NAME occurred in the 
cases of a switch after calling gimple_expr_code().


This seems like it should be the right thing?
Andrew
	* gimple.h (gimple_expr_code): Return gimple_assign_rhs_code
	for GIMPLE_ASSIGN.

diff --git a/gcc/gimple.h b/gcc/gimple.h
index 62b5a8a6124..8ef2f83d412 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -2229,26 +2229,6 @@ gimple_set_modified (gimple *s, bool modifiedp)
 }
 
 
-/* Return the tree code for the expression computed by STMT.  This is
-   only valid for GIMPLE_COND, GIMPLE_CALL and GIMPLE_ASSIGN.  For
-   GIMPLE_CALL, return CALL_EXPR as the expression code for
-   consistency.  This is useful when the caller needs to deal with the
-   three kinds of computation that GIMPLE supports.  */
-
-static inline enum tree_code
-gimple_expr_code (const gimple *stmt)
-{
-  enum gimple_code code = gimple_code (stmt);
-  if (code == GIMPLE_ASSIGN || code == GIMPLE_COND)
-    return (enum tree_code) stmt->subcode;
-  else
-    {
-      gcc_gimple_checking_assert (code == GIMPLE_CALL);
-      return CALL_EXPR;
-    }
-}
-
-
 /* Return true if statement STMT contains volatile operands.  */
 
 static inline bool
@@ -2889,6 +2869,29 @@ gimple_assign_cast_p (const gimple *s)
   return false;
 }
 
+
+/* Return the tree code for the expression computed by STMT.  This is
+   only valid for GIMPLE_COND, GIMPLE_CALL and GIMPLE_ASSIGN.  For
+   GIMPLE_CALL, return CALL_EXPR as the expression code for
+   consistency.  This is useful when the caller needs to deal with the
+   three kinds of computation that GIMPLE supports.  */
+
+static inline enum tree_code
+gimple_expr_code (const gimple *stmt)
+{
+  enum gimple_code code = gimple_code (stmt);
+  if (code == GIMPLE_ASSIGN)
+    return gimple_assign_rhs_code (stmt);
+  else if (code == GIMPLE_COND)
+    return (enum tree_code) stmt->subcode;
+  else
+    {
+      gcc_gimple_checking_assert (code == GIMPLE_CALL);
+      return CALL_EXPR;
+    }
+}
+
+
 /* Return true if S is a clobber statement.  */
 
 static inline bool

[committed] openmp: Implement allocate clause in omp lowering

2020-11-12 Thread Jakub Jelinek via Gcc-patches

Hi!

For now, task/taskloop constructs aren't handled and C/C++ array reductions
and reductions with task or inscan modifiers need further work.
Instead of calling omp_alloc/omp_free (where the former doesn't have an
alignment argument and omp_aligned_alloc is a 5.1-only feature), this calls
GOMP_alloc/GOMP_free, so that the library can fail if it would otherwise fall
back to returning NULL (the exception is zero-length allocations).
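
As a rough illustration of the "fail instead of returning NULL" behaviour described above, here is a minimal sketch. `checked_aligned_alloc` is a hypothetical helper written for this example; it is not libgomp's GOMP_alloc, takes no allocator argument, and assumes the alignment is a power of two.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not libgomp code: allocate SIZE bytes aligned to
   ALIGNMENT (assumed to be a power of two) and abort rather than hand
   back NULL, except for zero-length requests.  */
static void *
checked_aligned_alloc (size_t alignment, size_t size)
{
  if (alignment < sizeof (void *))
    alignment = sizeof (void *);

  /* C11 aligned_alloc expects the size to be a multiple of the
     alignment, so round it up.  */
  size_t rounded = (size + alignment - 1) / alignment * alignment;

  void *p = aligned_alloc (alignment, rounded);
  if (p == NULL && size != 0)
    {
      fputs ("checked_aligned_alloc: out of memory\n", stderr);
      abort ();
    }
  return p;
}
```

A caller can then use the result without a NULL check whenever the requested size is nonzero, and release it with `free`.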

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2020-11-12  Jakub Jelinek  

gcc/
* builtin-types.def (BT_FN_PTR_SIZE_SIZE_PTRMODE): New function type.
* omp-builtins.def (BUILT_IN_GOACC_DECLARE): Move earlier.
(BUILT_IN_GOMP_ALLOC, BUILT_IN_GOMP_FREE): New builtins.
* gimplify.c (gimplify_scan_omp_clauses): Force allocator into a
decl if it is not NULL, INTEGER_CST or decl.
(gimplify_adjust_omp_clauses): Clear GOVD_EXPLICIT on explicit clauses
which are being removed.  Remove allocate clauses for variables not seen
if they are private, firstprivate or linear too.  Call
omp_notice_variable on the allocator otherwise.
(gimplify_omp_for): Handle iterator vars mentioned in allocate clauses
similarly to non-is_gimple_reg iterators.
* omp-low.c (struct omp_context): Add allocate_map field.
(delete_omp_context): Delete it.
(scan_sharing_clauses): Fill it from allocate clauses.  Remove it
if mentioned also in shared clause.
(lower_private_allocate): New function.
(lower_rec_input_clauses): Handle allocate clause for privatized
variables, except for task/taskloop, C/C++ array reductions for now
and task/inscan variables.
(lower_send_shared_vars): Don't consider variables in allocate_map
as shared.
* omp-expand.c (expand_omp_for_generic, expand_omp_for_static_nochunk,
expand_omp_for_static_chunk): Use expand_omp_build_assign instead of
gimple_build_assign + gsi_insert_after.
* builtins.c (builtin_fnspec): Handle BUILTIN_GOMP_ALLOC and
BUILTIN_GOMP_FREE.
* tree-ssa-ccp.c (evaluate_stmt): Handle BUILTIN_GOMP_ALLOC.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Handle
BUILTIN_GOMP_ALLOC.
(mark_all_reaching_defs_necessary_1): Handle BUILTIN_GOMP_ALLOC
and BUILTIN_GOMP_FREE.
(propagate_necessity): Likewise.
gcc/fortran/
* f95-lang.c (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST):
Define.
(gfc_init_builtin_functions): Add alloc_size and warn_unused_result
attributes to __builtin_GOMP_alloc.
* types.def (BT_PTRMODE): New primitive type.
(BT_FN_VOID_PTR_PTRMODE, BT_FN_PTR_SIZE_SIZE_PTRMODE): New function
types.
libgomp/
* libgomp.map (GOMP_alloc, GOMP_free): Export at GOMP_5.0.1.
* omp.h.in (omp_alloc): Add malloc and alloc_size attributes.
* libgomp_g.h (GOMP_alloc, GOMP_free): Declare.
* allocator.c (omp_aligned_alloc): New for now static function,
add alignment argument and handle it.
(omp_alloc): Reimplement using omp_aligned_alloc.
(GOMP_alloc, GOMP_free): New functions.
(omp_free): Add ialias.
* testsuite/libgomp.c-c++-common/allocate-1.c: New test.
* testsuite/libgomp.c++/allocate-1.C: New test.

--- gcc/builtin-types.def.jj	2020-11-12 11:57:58.465562360 +0100
+++ gcc/builtin-types.def   2020-11-12 12:42:06.093029492 +0100
@@ -637,6 +637,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_SIZE_SIZ
 DEF_FUNCTION_TYPE_3 (BT_FN_UINT_UINT_PTR_PTR, BT_UINT, BT_UINT, BT_PTR, BT_PTR)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_CONST_SIZE_BOOL,
 BT_PTR, BT_PTR, BT_CONST_SIZE, BT_BOOL)
+DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_PTRMODE,
+BT_PTR, BT_SIZE, BT_SIZE, BT_PTRMODE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
--- gcc/omp-builtins.def.jj 2020-11-12 11:57:58.470562304 +0100
+++ gcc/omp-builtins.def	2020-11-12 12:42:06.105029360 +0100
@@ -47,6 +47,8 @@ DEF_GOACC_BUILTIN (BUILT_IN_GOACC_UPDATE
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_WAIT, "GOACC_wait",
   BT_FN_VOID_INT_INT_VAR,
   ATTR_NOTHROW_LIST)
+DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DECLARE, "GOACC_declare",
+  BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST)
 
 DEF_GOACC_BUILTIN_COMPILER (BUILT_IN_ACC_ON_DEVICE, "acc_on_device",
BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
@@ -444,5 +446,8 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TASK_RED
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_WORKSHARE_TASK_REDUCTION_UNREGISTER,
  "GOMP_workshare_task_reduction_unregister",
  BT_FN_VOID_BOOL, ATTR_NOTHROW_LEAF_LIST)
-DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DECLARE, "GOACC_declare",
-  BT_FN_VOID_INT_SIZE_PTR_PTR_PTR,

Re: [PATCH 2/2] loops: Invoke lim after successful loop interchange

2020-11-12 Thread Martin Jambor

Hi,

On Wed, Nov 11 2020, Richard Biener wrote:
> On Mon, 9 Nov 2020, Martin Jambor wrote:
>
>> this patch modifies the loop invariant pass so that it can operate
>> only on a single requested loop and its sub-loops and ignore the rest
>> of the function, much like it currently ignores basic blocks that are
>> not in any real loop.  It then invokes it from within the loop
>> interchange pass when it successfully swaps two loops.  This avoids
>> the non-LTO -Ofast run-time regressions of 410.bwaves and 503.bwaves_r
>> (which are 19% and 15% faster than current master on an AMD zen2
>> machine) while not introducing a full LIM pass into the pass pipeline.
>> 
>> I have not modified the LIM data structures, this means that it still
>> contains vectors indexed by loop->num even though only a single loop
>> nest is actually processed.  I also did not replace the uses of
>> pre_and_rev_post_order_compute_fn with a function that would count a
>> postorder only for a given loop.  I can of course do so if the
>> approach is otherwise deemed viable.
>> 
>> The patch adds one additional global variable requested_loop to the
>> pass and then at various places behaves differently when it is set.  I
>> was considering storing the fake root loop into it for normal
>> operation, but since this loop often requires special handling anyway,
>> I came to the conclusion that the code would actually end up less
>> straightforward.
>> 
>> I have bootstrapped and tested the patch on x86_64-linux and a very
>> similar one on aarch64-linux.  I have also tested it by modifying the
>> tree_ssa_lim function to run loop_invariant_motion_from_loop on each
>> real outermost loop in a function and this variant also passed
>> bootstrap and all tests, including dump scans, of all languages.
>> 
>> I have built the entire SPEC 2006 FPrate monitoring the activity of
>> the LIM pass without and with the patch (on top of commit b642fca1c31
>> with which 526.blender_r and 538.imagick_r seemed to be failing) and
>> it only examined 0.2% more loops, 0.02% more BBs and even fewer
>> percent of statements because it is invoked only in a rather special
>> circumstance.  But the patch allows for more such need-based uses at
>> hopefully reasonable cost.
>> 
>> Since I do not have much experience with loop optimizers, I expect
>> that there will be requests to adjust the patch during the review.
>> Still, it fixes a performance regression against GCC 9 and so I hope
>> to address the concerns in time to get it into GCC 11.
>> 

[...]

>
> That said, in the way it's currently structured I think it's
> "better" to export tree_ssa_lim () and call it from interchange
> if any loop was interchanged (thus run a full pass but conditional
> on interchange done).  You can make it cheaper by adding a flag
> to tree_ssa_lim whether to do store-motion (I guess this might
> be an interesting user-visible flag as well and a possibility
> to make select lim passes cheaper via a pass flag) and not do
> store-motion from the interchange call.  I think that's how we should
> fix the regression, refactoring LIM properly requires more work
> that doesn't seem to fit the stage1 deadline.
>

So just like this?  Bootstrapped and tested on x86_64-linux and I have
verified it fixes the bwaves reduction.

Thanks,

Martin



gcc/ChangeLog:

2020-11-12  Martin Jambor  

PR tree-optimization/94406
* tree-ssa-loop-im.c (tree_ssa_lim): Renamed to
loop_invariant_motion_in_fun, added a parameter to control store
motion.
(pass_lim::execute): Adjust call to tree_ssa_lim, now
loop_invariant_motion_in_fun.
* tree-ssa-loop-manip.h (loop_invariant_motion_in_fun): Declare.
* gimple-loop-interchange.cc (pass_linterchange::execute): Call
loop_invariant_motion_in_fun if any interchange has been done.
---
 gcc/gimple-loop-interchange.cc |  9 +++--
 gcc/tree-ssa-loop-im.c | 12 +++-
 gcc/tree-ssa-loop-manip.h  |  2 +-
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index 1656004ecf0..a36dbb49b1f 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -2085,8 +2085,13 @@ pass_linterchange::execute (function *fun)
 }
 
   if (changed_p)
-scev_reset ();
-  return changed_p ? (TODO_update_ssa_only_virtuals) : 0;
+{
+  unsigned todo = TODO_update_ssa_only_virtuals;
+  todo |= loop_invariant_motion_in_fun (cfun, false);
+  scev_reset ();
+  return todo;
+}
+  return 0;
 }
 
 } // anon namespace
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 6bb07e133cd..3c7412737f0 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3089,10 +3089,11 @@ tree_ssa_lim_finalize (void)
 }
 
 /* Moves invariants from loops.  Only "expensive" invariants are moved out --
-   i.e. those that are likely to be win regardless of the register pressure.  */
+

[Bug fortran/82314] internal compiler error: in gfc_conv_expr_descriptor, at fortran/trans-array.c:6972

2020-11-12 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82314

--- Comment #7 from anlauf at gcc dot gnu.org ---
The ICE in comment#0 vanishes when one replaces

  integer,parameter::iarray(merge(2,3,.true.)) = 1

with

  integer,parameter::iarray(merge(2,3,.true.)) = [ 1, 1 ]

[Bug target/97534] [10/11 Regression] ICE in decompose, at rtl.h:2280 (arm-linux-gnueabihf)

2020-11-12 Thread jrtc27 at jrtc27 dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97534

James Clarke  changed:

   What|Removed |Added

 CC||jrtc27 at jrtc27 dot com,
   ||rearnsha at arm dot com

--- Comment #4 from James Clarke  ---
[Adding Richard to CC]

Richard, I see you committed a big series of changes in Oct 2019 to
gcc/config/arm that affected subtraction; is it possible one of those broke
this test case?

[PATCH] Implementation of asm goto outputs

2020-11-12 Thread Vladimir Makarov via Gcc-patches


  The following patch implements asm goto with outputs.  Kernel
developers have expressed a wish for this feature several times.  Asm
goto with outputs was implemented in LLVM recently.  This new feature
was presented at the 2020 Linux Plumbers Conference
(https://linuxplumbersconf.org/event/7/contributions/801/attachments/659/1212/asm_goto_w__Outputs.pdf)
and the 2020 LLVM conference
(https://www.youtube.com/watch?v=vcPD490s-hE).

  The patch permits outputs in asm gotos only when LRA is used.
It is problematic to implement this in the old reload pass.  To be
honest, it was hard to implement in LRA too, until global live info
update was added to LRA a few years ago.

  Unlike the LLVM implementation of asm goto outputs, you can use
outputs on any path from the asm goto (not only on the fallthrough path as
in LLVM).

  The patch removes critical edges on which asm output reloads could
potentially occur (this means you can have several asm gotos using the
same labels and the same outputs).  This is done in IRA as it is
difficult to create new BBs in LRA.  Most of the work (placement
of output reloads in the destination BBs of the asm goto basic block) is
done in LRA.  When this happens, LRA updates global live info to reflect
that new pseudos live on the BB borders and the old ones do not live there
anymore.

  I also tried an approach that splits the live ranges of pseudos involved
in asm goto outputs to guarantee they get hard registers in IRA.  But
this approach did not work, as it is difficult to keep this assignment
through all of LRA.  It would also probably result in worse code, as move
insn coalescing is not guaranteed.

  Asm goto with outputs will not work for targets which have not been
converted to LRA (probably some outdated targets, as the old reload
pass is not supported anymore).  An error will be generated when the
old reload pass meets an asm goto with an output.  A precaution is taken
not to crash the compiler after this error.

  The patch is pretty small as all the necessary infrastructure was
already implemented in practically the whole compiler pipeline.  It did
not require adding new RTL insns, in contrast to what Google engineers did
to LLVM MIR.

  The patch could also be useful for implementing jump insns with
output reloads in the future (e.g. branch and count insns).

  I think asm gotos with outputs should be considered an experimental
feature as there is no real usage of it yet.  Early adoption of
this feature could help with debugging and hardening the
implementation.

  The patch was successfully bootstrapped and tested on x86-64, ppc64, 
and aarch64.


Are non-RA changes ok in the patch?

2020-11-12  Vladimir Makarov 

    * c/c-parser.c (c_parser_asm_statement): Parse outputs for asm
    goto too.
    * c/c-typeck.c (build_asm_expr): Remove an assert checking output
    absence for asm goto.
    * cfgexpand.c (expand_asm_stmt): Output asm goto with outputs too.
    Place insns after asm goto on edges.
    * cp/parser.c (cp_parser_asm_definition): Parse outputs for asm
    goto too.
    * doc/extend.texi: Reflect the changes in asm goto documentation.
    * gcc/gimple.c (gimple_build_asm_1): Remove an assert checking output
    absence for asm goto.
    * gimple.h (gimple_asm_label_op, gimple_asm_set_label_op): Take
    possible asm goto outputs into account.
    * ira.c (ira): Remove critical edges for potential asm goto output
    reloads.
    (ira_nullify_asm_goto): New function.
    * ira.h (ira_nullify_asm_goto): New prototype.
    * lra-assigns.c (lra_split_hard_reg_for): Use ira_nullify_asm_goto.
    Check that splitting is done inside a basic block.
    * lra-constraints.c (curr_insn_transform): Permit output reloads
    for any jump insn.
    * lra-spills.c (lra_final_code_change): Remove USEs added in ira
    for asm gotos.

    * lra.c (lra_process_new_insns): Place output reload insns after
    jumps in the beginning of destination BBs.
    * reload.c (find_reloads): Report error for asm gotos with
    outputs.  Modify them to keep CFG consistency to avoid crashes.
    * tree-into-ssa.c (rewrite_stmt): Don't put debug stmt after asm
    goto.


2020-11-12  Vladimir Makarov  

    * c-c++-common/asmgoto-2.c: Permit output in asm goto.
    * gcc.c-torture/compile/asmgoto-[2345].c: New tests.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index ecc3d2119fa..db719fad58c 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7144,10 +7144,7 @@ c_parser_asm_statement (c_parser *parser)
 	switch (section)
 	  {
 	  case 0:
-	/* For asm goto, we don't allow output operands, but reserve
-	   the slot for a future extension that does allow them.  */
-	if (!is_goto)
-	  outputs = c_parser_asm_operands (parser);
+	outputs = c_parser_asm_operands (parser);
 	break;
 	  case 1:
 	inputs = c_parser_asm_operands (parser);
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index

Re: [PATCH 1/3] C-family, Objective-C [1/3] : Implement Wobjc-root-class [PR77404].

2020-11-12 Thread Joseph Myers

On Thu, 12 Nov 2020, Iain Sandoe wrote:

> OK for the c-family parts?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

[1/3][aarch64] Add aarch64 support for vec_widen_add, vec_widen_sub patterns

2020-11-12 Thread Joel Hutton via Gcc-patches

Hi all,

This patch adds backend patterns for vec_widen_add, vec_widen_sub on aarch64.
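
For orientation, the lo/hi split used by these patterns can be modelled in scalar C. This is only a sketch: the function names and the 8-lane vector are assumptions, and which lanes count as "lo" is taken here to be lanes 0..3, which depends on the target's lane numbering.

```c
#include <stdint.h>

#define LANES 8  /* Assumed: one 128-bit vector of eight 16-bit lanes.  */

/* Model of a signed widening add on the low half: lanes 0..3 are
   sign-extended to 32 bits before the addition, so it cannot overflow.  */
static void
widen_add_lo (int32_t out[LANES / 2],
	      const int16_t a[LANES], const int16_t b[LANES])
{
  for (int i = 0; i < LANES / 2; i++)
    out[i] = (int32_t) a[i] + (int32_t) b[i];
}

/* Same operation on the high half: lanes 4..7.  */
static void
widen_add_hi (int32_t out[LANES / 2],
	      const int16_t a[LANES], const int16_t b[LANES])
{
  for (int i = 0; i < LANES / 2; i++)
    out[i] = (int32_t) a[i + LANES / 2] + (int32_t) b[i + LANES / 2];
}
```

Two such half-width operations together cover one full input vector, which is why the expanders come in _lo and _hi pairs.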

All 3 patches together bootstrapped and regression tested on aarch64.

Ok for stage 1?

gcc/ChangeLog:

2020-11-12  Joel Hutton  

        * config/aarch64/aarch64-simd.md: New patterns 
vec_widen_saddl_lo/hi_
From 3e47bc562b83417a048e780bcde52fb2c9617df3 Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Mon, 9 Nov 2020 15:35:57 +
Subject: [PATCH 1/3] [aarch64] Add vec_widen patterns to aarch64

Add widening add and subtract patterns to the aarch64
backend.
---
 gcc/config/aarch64/aarch64-simd.md | 94 ++
 1 file changed, 94 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 2cf6fe9154a2ee1b21ad9e8e2a6109805022be7f..b4f56a2295926f027bd53e7456eec729af0cd6df 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3382,6 +3382,100 @@
   [(set_attr "type" "neon__long")]
 )
 
+(define_expand "vec_widen_saddl_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+  emit_insn (gen_aarch64_saddl_lo_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_ssubl_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+  emit_insn (gen_aarch64_ssubl_lo_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+(define_expand "vec_widen_saddl_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+  emit_insn (gen_aarch64_saddl_hi_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_ssubl_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+  emit_insn (gen_aarch64_ssubl_hi_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+(define_expand "vec_widen_uaddl_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+  emit_insn (gen_aarch64_uaddl_lo_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_usubl_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+  emit_insn (gen_aarch64_usubl_lo_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_uaddl_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+  emit_insn (gen_aarch64_uaddl_hi_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_usubl_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+  emit_insn (gen_aarch64_usubl_hi_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
 
 (define_expand "aarch64_saddl2"
   [(match_operand: 0 "register_operand")
-- 
2.17.1

[3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-12 Thread Joel Hutton via Gcc-patches

Hi all,

This patch adds support in the aarch64 backend for the vec_widen_shift 
vect-pattern and makes a minor mid-end fix to support it.

All 3 patches together bootstrapped and regression tested on aarch64.

Ok for stage 1?

gcc/ChangeLog:

2020-11-12  Joel Hutton  

        * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo patterns
        * tree-vect-stmts.c 
        (vectorizable_conversion): Fix for widen_lshift case

gcc/testsuite/ChangeLog:

2020-11-12  Joel Hutton  

        * gcc.target/aarch64/vect-widen-lshift.c: New test.
From 97af35b2d2a505dcefd8474cbd4bc3441b83ab02 Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Thu, 12 Nov 2020 11:48:25 +
Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern

Add aarch64 vec_widen_lshift_lo/hi patterns and fix bug it triggers in
mid-end.
---
 gcc/config/aarch64/aarch64-simd.md| 66 +++
 .../gcc.target/aarch64/vect-widen-lshift.c| 60 +
 gcc/tree-vect-stmts.c |  9 ++-
 3 files changed, 133 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b4f56a2295926f027bd53e7456eec729af0cd6df..2bb39c530a1a861cb9bd3df0c2943f62bd6153d7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4711,8 +4711,74 @@
   [(set_attr "type" "neon_sat_shift_reg")]
 )
 
+(define_expand "vec_widen_shiftl_lo_"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(match_operand:VQW 1 "register_operand" "w")
+			 (match_operand:SI 2
+			   "aarch64_simd_shift_imm_bitsize_" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+emit_insn (gen_aarch64_shll_internal (operands[0], operands[1],
+		 p, operands[2]));
+DONE;
+  }
+)
+
+(define_expand "vec_widen_shiftl_hi_"
+   [(set (match_operand: 0 "register_operand")
+	(unspec: [(match_operand:VQW 1 "register_operand" "w")
+			 (match_operand:SI 2
+			   "immediate_operand" "i")]
+			  VSHLL))]
+   "TARGET_SIMD"
+   {
+rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+emit_insn (gen_aarch64_shll2_internal (operands[0], operands[1],
+		  p, operands[2]));
+DONE;
+   }
+)
+
 ;; vshll_n
 
+(define_insn "aarch64_shll_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(vec_select:
+			(match_operand:VQW 1 "register_operand" "w")
+			(match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
+			 (match_operand:SI 3
+			   "aarch64_simd_shift_imm_bitsize_" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+  return "shll\\t%0., %1., %3";
+else
+  return "shll\\t%0., %1., %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_shll2_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(vec_select:
+			(match_operand:VQW 1 "register_operand" "w")
+			(match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
+			 (match_operand:SI 3
+			   "aarch64_simd_shift_imm_bitsize_" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+  return "shll2\\t%0., %1., %3";
+else
+  return "shll2\\t%0., %1., %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
 (define_insn "aarch64_shll_n"
   [(set (match_operand: 0 "register_operand" "=w")
 	(unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
new file mode 100644
index ..23ed93d1dcbc3ca559efa6708b4ed5855fb6a050
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include 
+#include 
+
+#define ARR_SIZE 1024
+
+/* Should produce an shll,shll2 pair*/
+void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+{
+foo[i]   = a[i]   << 16;
+foo[i+1] = a[i+1] << 16;
+foo[i+2] = a[i+2] << 16;
+foo[i+3] = a[i+3] << 16;
+}
+}
+
+__attribute__((optimize (0)))
+void sshll_nonopt (int32_t *foo, int16_t *a, int16_t *b)
+{
+for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+{
+foo[i]   = a[i]   << 16;
+foo[i+1] = a[i+1] << 16;
+foo[i+2] = a[i+2] << 16;
+foo[i+3] = a[i+3] << 16;
+}
+}
+
+
+void __attribute__((optimize (0)))
+init(uint16_t *a, uint16_t *b)
+{
+for( int i = 0; i < ARR_SIZE;i++)
+{
+  a[i] = i;
+  b[i] = 2*i;
+}
+}
+
+int __attribute__((optimize (0)))
+main()
+{
+uint32_t foo_arr[ARR_SIZE];
+uint32_t bar_arr[ARR_SIZE];
+uint16_t a[ARR_SIZE];
+uint16_t b[ARR_SIZE];
+
+init(a, b);
+sshll_opt(foo_arr, a, b);
+

[2/3][vect] Add widening add, subtract vect patterns

2020-11-12 Thread Joel Hutton via Gcc-patches

Hi all,

This patch adds widening add and widening subtract patterns to 
tree-vect-patterns.
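
As an illustration of the scalar code such patterns are presumably meant to match, the sketch below shows a signed and an unsigned widening add loop. The function names are made up for this example and are not taken from the patch's tests. The two loops differ only in how the narrow inputs are extended, which is why the same bit pattern yields different sums.

```c
#include <stddef.h>
#include <stdint.h>

/* Signed variant: inputs are sign-extended before the add.  */
static void
widen_add_signed (int16_t *dst, const int8_t *a, const int8_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int16_t) a[i] + (int16_t) b[i];
}

/* Unsigned variant: inputs are zero-extended before the add.  */
static void
widen_add_unsigned (uint16_t *dst, const uint8_t *a, const uint8_t *b,
		    size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint16_t) a[i] + (uint16_t) b[i];
}
```

Feeding the byte 0xFF plus 1 through both gives 0 in the signed loop (since 0xFF is -1) and 256 in the unsigned loop.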

All 3 patches together bootstrapped and regression tested on aarch64.

gcc/ChangeLog:

2020-11-12  Joel Hutton  

        * expr.c (expand_expr_real_2): add widen_add,widen_subtract cases
        * optabs-tree.c (optab_for_tree_code): optabs for widening 
adds,subtracts
        * optabs.def (OPTAB_D): define vectorized widen add, subtracts
        * tree-cfg.c (verify_gimple_assign_binary): Add case for widening adds, 
subtracts
        * tree-inline.c (estimate_operator_cost): Add case for widening adds, 
subtracts
        * tree-vect-generic.c (expand_vector_operations_1): Add case for 
widening adds, subtracts
        * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog pattern
        (vect_recog_widen_sub_pattern): New recog pattern
        (vect_recog_average_pattern): Update widened add code
        (vect_recog_average_pattern): Update widened add code
        * tree-vect-stmts.c (vectorizable_conversion): Add case for widened 
add, subtract
        (supportable_widening_operation): Add case for widened add, subtract
        * tree.def (WIDEN_ADD_EXPR): New tree code
        (WIDEN_SUB_EXPR): New tree code
        (VEC_WIDEN_ADD_HI_EXPR): New tree code
        (VEC_WIDEN_ADD_LO_EXPR): New tree code
        (VEC_WIDEN_SUB_HI_EXPR): New tree code
        (VEC_WIDEN_SUB_LO_EXPR): New tree code

gcc/testsuite/ChangeLog:

2020-11-12  Joel Hutton  

        * gcc.target/aarch64/vect-widen-add.c: New test.
        * gcc.target/aarch64/vect-widen-sub.c: New test.


Ok for trunk?
From e0c10ca554729b9e6d58dbd3f18ba72b2c3ee8bc Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Mon, 9 Nov 2020 15:44:18 +
Subject: [PATCH 2/3] [vect] Add widening add, subtract patterns

Add widening add, subtract patterns to tree-vect-patterns.
Add aarch64 tests for patterns.

fix sad
---
 gcc/expr.c|  6 ++
 gcc/optabs-tree.c | 17 
 gcc/optabs.def|  8 ++
 .../gcc.target/aarch64/vect-widen-add.c   | 90 +++
 .../gcc.target/aarch64/vect-widen-sub.c   | 90 +++
 gcc/tree-cfg.c|  8 ++
 gcc/tree-inline.c |  6 ++
 gcc/tree-vect-generic.c   |  4 +
 gcc/tree-vect-patterns.c  | 32 +--
 gcc/tree-vect-stmts.c | 15 +++-
 gcc/tree.def  |  6 ++
 11 files changed, 276 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c

diff --git a/gcc/expr.c b/gcc/expr.c
index ae16f07775870792729e3805436d7f2debafb6ca..ffc8aed5296174066849d9e0d73b1c352c20fd9e 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9034,6 +9034,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	  target, unsignedp);
   return target;
 
+case WIDEN_ADD_EXPR:
+case WIDEN_SUB_EXPR:
 case WIDEN_MULT_EXPR:
   /* If first operand is constant, swap them.
 	 Thus the following special case checks need only
@@ -9754,6 +9756,10 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	return temp;
   }
 
+case VEC_WIDEN_ADD_HI_EXPR:
+case VEC_WIDEN_ADD_LO_EXPR:
+case VEC_WIDEN_SUB_HI_EXPR:
+case VEC_WIDEN_SUB_LO_EXPR:
 case VEC_WIDEN_MULT_HI_EXPR:
 case VEC_WIDEN_MULT_LO_EXPR:
 case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 4dfda756932de1693667c39c6fabed043b20b63b..009dccfa3bd298bca7b3b45401a4cc2acc90ff21 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -170,6 +170,23 @@ optab_for_tree_code (enum tree_code code, const_tree type,
   return (TYPE_UNSIGNED (type)
 	  ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
+case VEC_WIDEN_ADD_LO_EXPR:
+  return (TYPE_UNSIGNED (type)
+	  ? vec_widen_uaddl_lo_optab  : vec_widen_saddl_lo_optab);
+
+case VEC_WIDEN_ADD_HI_EXPR:
+  return (TYPE_UNSIGNED (type)
+	  ? vec_widen_uaddl_hi_optab  : vec_widen_saddl_hi_optab);
+
+case VEC_WIDEN_SUB_LO_EXPR:
+  return (TYPE_UNSIGNED (type)
+	  ? vec_widen_usubl_lo_optab  : vec_widen_ssubl_lo_optab);
+
+case VEC_WIDEN_SUB_HI_EXPR:
+  return (TYPE_UNSIGNED (type)
+	  ? vec_widen_usubl_hi_optab  : vec_widen_ssubl_hi_optab);
+
+
 case VEC_UNPACK_HI_EXPR:
   return (TYPE_UNSIGNED (type)
 	  ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 78409aa14537d259bf90277751aac00d452a0d3f..a97cdb360781ca9c743e2991422c600626c75aa5 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -383,6 +383,14 @@ OPTAB_D (vec_widen_smult_even_optab, "vec_widen_smult_even_$a")
 OPTAB_D (vec_widen_smult_hi_optab, "vec_widen_smult_hi_$a")
 OPTAB_D (vec_widen_smult_lo_optab,

[Bug libstdc++/97798] FTB msp430-elf error: the value of '__gnu_cxx::__numeric_traits_integer<int20>::max' is not usable in a constant expression

2020-11-12 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97798

--- Comment #17 from Jonathan Wakely  ---
Nice, with binutils HEAD my gcc-10 build continues. Thanks!

RE: gcc-wwwdocs branch master updated. 88e29096c36837553fc841bd1fa5df6caa776b44

2020-11-12 Thread Gerald Pfeifer

On Fri, 6 Nov 2020, Liu, Hongtao wrote:
> I realize you're talking about the patch for gcc-wwwdocs.
> No, I didn't send out a patch, sorry for that, will do it in further commit.

Thanks - saw that. Jeff just beat me to it. :-)

Gerald

[committed] wwwdocs: Editorial changes around x86-64 ISA extensions

2020-11-12 Thread Gerald Pfeifer

Per our discussion on the list (plus a grammar improvement in a
section above).

One question: why are the ISA extension lists not alphabetically
sorted?  Wouldn't that be beneficial for users?  Easier to find
something and also easier to compare?

Gerald

---
 htdocs/gcc-11/changes.html | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index fc4c74f4..106db8e9 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -265,7 +265,8 @@ a work-in-progress.
   
   New ISA extension support for Intel AMX-TILE, AMX-INT8, AMX-BF16 was
   added to GCC. AMX-TILE, AMX-INT8, AMX-BF16 intrinsics are available
-  via the -mamx-tile, -mamx-int8, -mamx-bf16 compiler switch.
+  via the -mamx-tile, -mamx-int8, -mamx-bf16 compiler
+  switches.
   
   New ISA extension support for Intel AVX-VNNI was added to GCC.
   AVX-VNNI intrinsics are available via the -mavxvnni
@@ -273,14 +274,14 @@ a work-in-progress.
   
   GCC now supports the Intel CPU named Sapphire Rapids through
 -march=sapphirerapids.
-The switch enables the MOVDIRI MOVDIR64B AVX512VP2INTERSECT ENQCMD CLDEMOTE
-SERIALIZE PTWRITE WAITPKG TSXLDTRK AMT-TILE AMX-INT8 AMX-BF16 AVX-VNNI
-ISA extensions.
+The switch enables the MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, ENQCMD,
+CLDEMOTE, SERIALIZE, PTWRITE, WAITPKG, TSXLDTRK, AMT-TILE, AMX-INT8,
+AMX-BF16, and AVX-VNNI ISA extensions.
   
   GCC now supports the Intel CPU named Alderlake through
 -march=alderlake.
-The switch enables the CLDEMOTE PTWRITE WAITPKG SERIALIZE KEYLOCKER AVX-VNNI
-HRESET ISA extensions.
+The switch enables the CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, KEYLOCKER,
+AVX-VNNI, and HRESET ISA extensions.
   
 
 
-- 
2.29.2

[Bug libstdc++/97798] FTB msp430-elf error: the value of '__gnu_cxx::__numeric_traits_integer<int20>::max' is not usable in a constant expression

2020-11-12 Thread jozefl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97798

--- Comment #16 from jozefl at gcc dot gnu.org ---
(In reply to Jonathan Wakely from comment #15)
> Hmm, I get the same error for a out-of-tree binutils built from today's git
> sources:
> 
> GNU assembler (GNU Binutils) 2.35.50.20201112
> Copyright (C) 2020 Free Software Foundation, Inc.
> This program is free software; you may redistribute it under the terms of
> the GNU General Public License version 3 or later.
> This program has absolutely no warranty.
> This assembler was configured for a target of `msp430-elf'.

Sigh, that's a different bug. I removed a GAS option that doesn't do anything,
but is automatically passed by MSP430 GCC 10 ASM_SPEC when certain other
options aren't present. And my GCC 10 tester explicitly uses Binutils 2.34
rather than master, so it went undetected.

Pushed a Binutils fix. gcc-10 branch now builds with latest Binutils.

Thanks for reporting this.

Re: PowerPC: Use float128 instead of ieee128 in tests.

2020-11-12 Thread Segher Boessenkool

Hi,

On Thu, Oct 22, 2020 at 06:12:31PM -0400, Michael Meissner wrote:
> Two of the tests used the __ieee128 keyword instead of __float128.  This
> patch changes those cases to use the official keyword.

What is "official" about that?

Why make this change at all?  __ieee128 should work as well!  Did you
see failures without this patch?  Those need fixing, then.

Segher

Re: [PATCH,wwwdocs] gcc-11/changes: Mention Intel AVX-VNNI

2020-11-12 Thread Gerald Pfeifer

On Wed, 11 Nov 2020, Hongtao Liu via Gcc-patches wrote:
> +  New ISA extension support for Intel AVX-VNNI was added to GCC.

More for the future (i.e., no need to change that now): I suggest
to skip "to GCC" in cases like this, since this is our context to
begin with. 

Gerald

Re: [Patch] Fortran: improve location data for OpenACC/OpenMP directives [PR97782]

2020-11-12 Thread Thomas Schwinge

Hi!

On 2020-11-12T12:45:24+0100, Tobias Burnus  wrote:
> For code like
>   !$acc kernels
>  ... a lot of loops and other code
>   !$acc end kernels
>
> gfortran generates
>#pragma ..._kernels
>  {
>... lot of code
>  }
>
> As the PR shows, the location associated with the #pragma
> is not the 'acc kernels' line but the one near the 'acc end kernel'
> line.
>
> The reason is that [...]

> This patch [...]

> In principle, it should also have an effect on warnings (if there are
> any)

..., and there are -- one, at least (and somewhat bogus, but still).  ;-)
I've thus pushed "Adjust 'libgomp.oacc-fortran/attach-descriptor-1.f90'
for improved location information" to master branch in commit
9106c51e57c06e88a0dddf994fb5432b4bbe68c0, see attached.  (Not (yet)
relevant for releases/gcc-10 branch; the commit introducing that testcase
isn't there yet -- that's to be discussed in a different thread.)


Grüße
 Thomas


From 9106c51e57c06e88a0dddf994fb5432b4bbe68c0 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 12 Nov 2020 20:07:25 +0100
Subject: [PATCH] Adjust 'libgomp.oacc-fortran/attach-descriptor-1.f90' for
 improved location information

Fix-up for commit b71ff8c15f5a7d6b1cc1524b4d27843f0d88dbda "Fortran: improve
location data for OpenACC/OpenMP directives [PR97782]".

	libgomp/
	PR fortran/97782
	* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Adjust.
---
 libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90
index 960b9f94507..2701192e37d 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90
@@ -42,9 +42,8 @@ subroutine test(variant)
  stop 1
   end if
 
-  ! FIXME: This warning is emitted on the wrong line number.
-  ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } 52 }
   !$acc serial present(myvar%arr2)
+  ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } .-1 }
   do i=1,10
 myvar%arr1(i) = i + variant
 myvar%arr2(i) = i - variant
-- 
2.17.1

[Bug fortran/97782] [Fortran] Confused location information for OpenACC compute constructs

2020-11-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97782

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Thomas Schwinge :

https://gcc.gnu.org/g:9106c51e57c06e88a0dddf994fb5432b4bbe68c0

commit r11-4951-g9106c51e57c06e88a0dddf994fb5432b4bbe68c0
Author: Thomas Schwinge 
Date:   Thu Nov 12 20:07:25 2020 +0100

Adjust 'libgomp.oacc-fortran/attach-descriptor-1.f90' for improved location
information

Fix-up for commit b71ff8c15f5a7d6b1cc1524b4d27843f0d88dbda "Fortran: improve
location data for OpenACC/OpenMP directives [PR97782]".

libgomp/
PR fortran/97782
* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Adjust.

[Bug c++/63287] __STDCPP_THREADS__ is not defined

2020-11-12 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63287

--- Comment #4 from Jonathan Wakely  ---
Library patch:

diff --git a/libgcc/gthr.h b/libgcc/gthr.h
index f31cf083cbe5..e6462679b362 100644
--- a/libgcc/gthr.h
+++ b/libgcc/gthr.h
@@ -147,6 +147,13 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #endif
 #include "gthr-default.h"

+#if defined __GTHREADS_CXX0X && ! defined __STDCPP_THREADS__
+// The C++ standard says that __STDCPP_THREADS__ should be defined to 1,
+// but G++ does not currently do that (PR c++/63287).
+// Define it here if gthr-default.h defined __GTHREADS_CXX0X.
+# define __STDCPP_THREADS__ 1
+#endif
+
 #ifndef HIDE_EXPORTS
 #pragma GCC visibility pop
 #endif
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 292d89da8ba7..74e3e4932579 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -1316,6 +1316,7 @@ uppercase = [ABCDEFGHIJKLMNOPQRSTUVWXYZ_]

 ${host_builddir}/gthr.h: ${toplevel_srcdir}/libgcc/gthr.h stamp-${host_alias}
sed -e '/^#pragma/b' \
+   -e '/__STDCPP_THREADS__/b' \
-e '/^#/s/\(${uppercase}${uppercase}*\)/_GLIBCXX_\1/g' \
-e 's/_GLIBCXX_SUPPORTS_WEAK/__GXX_WEAK__/g' \
-e 's/_GLIBCXX___MINGW32_GLIBCXX___/__MINGW32__/g' \

Re: Installing a generated header file

2020-11-12 Thread Bill Schmidt via Gcc




On 11/12/20 10:15 AM, Bill Schmidt via Gcc wrote:
> On 11/12/20 10:06 AM, Marc Glisse wrote:
>> Does the i386 mm_malloc.h file match your scenario?
>
> Ah, that looks promising indeed, and perhaps very simple!  Marc,
> thanks for the pointer!

And indeed, with this example it was a two-line change to do what I
needed.  Thanks again. :)


Bill

Re: [PATCH] PR libstdc++/71579 assert that type traits are not misused with an incomplete type

2020-11-12 Thread Antony Polukhin via Gcc-patches

Final bits for libstdc++/71579

The std::common_type assertions attempt to give a proper 'required from
here' hint for user code, do not change the implementation much, and
check all the template parameters for completeness. In some cases a type
could be checked for completeness more than once. This seems to be
unavoidable because std::common_type can be specialized by the user, so
we have to call std::common_type recursively, potentially repeating the
check for the first type.

The std::common_reference assertions make sure that we detect incomplete
types even if the user specialized std::basic_common_reference.
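
As a rough illustration of the idea (not the libstdc++ implementation, which uses the internal __is_complete_or_unbounded helper shown in the diff below), a completeness check can be sketched with a sizeof-based detection trait. The names is_complete, make_void and checked_common_type are hypothetical and exist only for this example; unlike the real helper, this sketch does not accept arrays of unknown bound, void or function types.

```cpp
#include <type_traits>

// C++11-friendly stand-in for std::void_t.
template <typename...>
struct make_void { using type = void; };

// Primary template: assume the type is incomplete.
template <typename T, typename = void>
struct is_complete : std::false_type {};

// Chosen only when sizeof(T) is well-formed, i.e. T is complete here.
template <typename T>
struct is_complete<T, typename make_void<decltype(sizeof(T))>::type>
    : std::true_type {};

struct Incomplete;          // declared but never defined
struct Complete { int x; };

// A wrapper that rejects incomplete arguments with a readable message
// before delegating to std::common_type.
template <typename A, typename B>
struct checked_common_type : std::common_type<A, B>
{
  static_assert(is_complete<A>::value,
                "first argument type must be a complete type");
  static_assert(is_complete<B>::value,
                "second argument type must be a complete type");
};

static_assert(is_complete<Complete>::value, "Complete is complete");
static_assert(!is_complete<Incomplete>::value, "Incomplete is not");
```

One caveat of any trait like this: its answer for a given type is fixed at the first point of instantiation, so querying a type before and after it is completed within one program is an ODR hazard.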

Changelog:

2020-11-12  Antony Polukhin  
PR libstdc++/71579
* include/std/type_traits (is_convertible, is_nothrow_convertible)
(common_type, common_reference): Add static_asserts
to make sure that the arguments of the type traits are not misused
with incomplete types.
* testsuite/20_util/common_reference/incomplete_basic_common_neg.cc:
New test.
* testsuite/20_util/common_reference/incomplete_neg.cc: New test.
* testsuite/20_util/common_type/incomplete_neg.cc: New test.
* testsuite/20_util/common_type/requirements/sfinae_friendly_1.cc: Remove
SFINAE tests on incomplete types.
* testsuite/20_util/is_convertible/incomplete_neg.cc: New test.
* testsuite/20_util/is_nothrow_convertible/incomplete_neg.cc: New test.



--
Best regards,
Antony Polukhin
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 34e068b..00fa7f5 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1406,12 +1406,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_convertible
 : public __is_convertible_helper<_From, _To>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_From>{}),
+   "first template argument must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_To>{}),
+   "second template argument must be a complete class or an unbounded array");
+};
 
   // helper trait for unique_ptr, shared_ptr, and span
   template
 using __is_array_convertible
-  = is_convertible<_FromElementType(*)[], _ToElementType(*)[]>;
+  = typename __is_convertible_helper<
+   _FromElementType(*)[], _ToElementType(*)[]>::type;
 
   template, is_function<_To>,
@@ -1454,7 +1460,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_nothrow_convertible
 : public __is_nt_convertible_helper<_From, _To>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_From>{}),
+   "first template argument must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_To>{}),
+   "second template argument must be a complete class or an unbounded array");
+};
 
   /// is_nothrow_convertible_v
   template
@@ -2239,7 +2250,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct common_type<_Tp1, _Tp2>
 : public __common_type_impl<_Tp1, _Tp2>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp1>{}),
+   "each argument type must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp2>{}),
+   "each argument type must be a complete class or an unbounded array");
+};
 
   template
 struct __common_type_pack
@@ -2253,7 +2269,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct common_type<_Tp1, _Tp2, _Rp...>
 : public __common_type_fold,
__common_type_pack<_Rp...>>
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp1>{}),
+   "first argument type must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp2>{}),
+   "second argument type must be a complete class or an unbounded array");
+#ifdef __cpp_fold_expressions
+  static_assert((std::__is_complete_or_unbounded(
+   __type_identity<_Rp>{}) && ...),
+   "each argument type must be a complete class or an unbounded array");
+#endif
+};
 
   // Let C denote the same type, if any, as common_type_t.
   // If there is such a type C, type shall denote the same type, if any,
@@ -3315,9 +3341,10 @@ template 
 
   // If A and B are both rvalue reference types, ...
   template
-struct __common_ref_impl<_Xp&&, _Yp&&,
-  _Require>,
-  is_convertible<_Yp&&, __common_ref_C<_Xp, _Yp
+struct __common_ref_impl<_Xp&&, _Yp&&, _Require<
+  typename __is_convertible_helper<_Xp&&, __common_ref_C<_Xp, _Yp>>::type,
+  typename __is_convertible_helper<_Yp&&, __common_ref_C<_Xp, _Yp>>::type
+>>
 { using type = __common_ref_C<_Xp, _Yp>; };
 
   // let D be COMMON-REF(const X&, Y&)
@@ -3326,8

[Bug lto/97787] [10/11 regression] 64bit mips lto: .symtab local symbol at index x (>= sh_info of y)

2020-11-12 Thread bunk at stusta dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97787

--- Comment #5 from Adrian Bunk  ---
(In reply to Richard Biener from comment #4)
> You can also try to 'reduce' the testcase.  Since you are linking a shared
> object you can try to strip as many linker inputs as possible and then
> reduce the source files.

Bisecting does not work when both halves are working, but I now found some
clues:

$ g++ -flto -shared CMakeFiles/cpp.dir/src/dolfin.cpp.o
CMakeFiles/cpp.dir/src/parameter.cpp.o CMakeFiles/cpp.dir/src/adaptivity.cpp.o
CMakeFiles/cpp.dir/src/ale.cpp.o CMakeFiles/cpp.dir/src/common.cpp.o
CMakeFiles/cpp.dir/src/fem.cpp.o CMakeFiles/cpp.dir/src/function.cpp.o
CMakeFiles/cpp.dir/src/generation.cpp.o CMakeFiles/cpp.dir/src/geometry.cpp.o
CMakeFiles/cpp.dir/src/graph.cpp.o CMakeFiles/cpp.dir/src/log.cpp.o
CMakeFiles/cpp.dir/src/math.cpp.o CMakeFiles/cpp.dir/src/mesh.cpp.o
CMakeFiles/cpp.dir/src/multistage.cpp.o CMakeFiles/cpp.dir/src/ts.cpp.o
CMakeFiles/cpp.dir/src/io.cpp.o CMakeFiles/cpp.dir/src/la.cpp.o
CMakeFiles/cpp.dir/src/nls.cpp.o CMakeFiles/cpp.dir/src/refinement.cpp.o
CMakeFiles/cpp.dir/src/MPICommWrapper.cpp.o
/usr/bin/ld: /tmp/ccofV1SZ.ltrans32.ltrans.o: .symtab local symbol at index 214
(>= sh_info of 34)
/usr/bin/ld: /tmp/ccofV1SZ.ltrans32.ltrans.o: error adding symbols: bad value
collect2: error: ld returned 1 exit status
$  g++ -flto -shared dolfin.ii parameter.ii adaptivity.ii ale.ii common.ii
fem.ii function.ii generation.ii geometry.ii graph.ii log.ii math.ii mesh.ii
multistage.ii ts.ii io.ii la.ii nls.ii refinement.ii MPICommWrapper.ii
/tmp/ccraiNyo.ltrans9.ltrans.o: in function
`std::__exception_ptr::exception_ptr::operator=(std::__exception_ptr::exception_ptr&&)':
:(.text+0x290): relocation truncated to fit: R_MIPS_CALL16 against
`std::__exception_ptr::exception_ptr::~exception_ptr()@@CXXABI_1.3.3'
/tmp/ccraiNyo.ltrans9.ltrans.o: in function `std::__cxx11::to_string(int)':
:(.text+0x414): relocation truncated to fit: R_MIPS_CALL16 against
`std::__cxx11::basic_string, std::allocator
>::basic_string(unsigned long, char, std::allocator
const&)@@GLIBCXX_3.4.21'
:(.text+0x42c): relocation truncated to fit: R_MIPS_CALL16 against
`std::allocator::~allocator()@@GLIBCXX_3.4'
:(.text+0x498): relocation truncated to fit: R_MIPS_CALL16 against
`std::allocator::~allocator()@@GLIBCXX_3.4'
:(.text+0x4b0): relocation truncated to fit: R_MIPS_CALL16 against
`_Unwind_Resume@@GCC_3.0'
:(.text+0x4e0): relocation truncated to fit: R_MIPS_CALL16 against
`_Unwind_Resume@@GCC_3.0'
/tmp/ccraiNyo.ltrans9.ltrans.o: in function `std::__cxx11::to_string(unsigned
long)':
:(.text+0x584): relocation truncated to fit: R_MIPS_CALL16 against
`std::__cxx11::basic_string, std::allocator
>::basic_string(unsigned long, char, std::allocator
const&)@@GLIBCXX_3.4.21'
:(.text+0x598): relocation truncated to fit: R_MIPS_CALL16 against
`std::allocator::~allocator()@@GLIBCXX_3.4'
:(.text+0x5c8): relocation truncated to fit: R_MIPS_CALL16 against
`std::__cxx11::basic_string, std::allocator
>::size() const@@GLIBCXX_3.4.21'
:(.text+0x610): relocation truncated to fit: R_MIPS_CALL16 against
`std::allocator::~allocator()@@GLIBCXX_3.4'
:(.text+0x628): additional relocation overflows omitted from the
output
collect2: error: ld returned 1 exit status
$ g++ -flto -shared dolfin.ii parameter.ii adaptivity.ii ale.ii common.ii
fem.ii function.ii generation.ii geometry.ii graph.ii log.ii math.ii mesh.ii
multistage.ii ts.ii io.ii la.ii nls.ii refinement.ii MPICommWrapper.ii -mxgot
$ g++ -flto -shared CMakeFiles/cpp.dir/src/dolfin.cpp.o
CMakeFiles/cpp.dir/src/parameter.cpp.o CMakeFiles/cpp.dir/src/adaptivity.cpp.o
CMakeFiles/cpp.dir/src/ale.cpp.o CMakeFiles/cpp.dir/src/common.cpp.o
CMakeFiles/cpp.dir/src/fem.cpp.o CMakeFiles/cpp.dir/src/function.cpp.o
CMakeFiles/cpp.dir/src/generation.cpp.o CMakeFiles/cpp.dir/src/geometry.cpp.o
CMakeFiles/cpp.dir/src/graph.cpp.o CMakeFiles/cpp.dir/src/log.cpp.o
CMakeFiles/cpp.dir/src/math.cpp.o CMakeFiles/cpp.dir/src/mesh.cpp.o
CMakeFiles/cpp.dir/src/multistage.cpp.o CMakeFiles/cpp.dir/src/ts.cpp.o
CMakeFiles/cpp.dir/src/io.cpp.o CMakeFiles/cpp.dir/src/la.cpp.o
CMakeFiles/cpp.dir/src/nls.cpp.o CMakeFiles/cpp.dir/src/refinement.cpp.o
CMakeFiles/cpp.dir/src/MPICommWrapper.cpp.o -flto-partition=none
$ 

Adding -mxgot to compiler and linker flags of a normal LTO build does not work,
but -flto-partition=none during linking is a workaround.

[Bug c++/97814] Copy constructor deletion not recognized in initialization list with -std=c++17

2020-11-12 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97814

--- Comment #3 from Jonathan Wakely  ---
N.B. that's not a copy constructor, it's a move constructor.

[Bug c++/97814] Copy constructor deletion not recognized in initialization list with -std=c++17

2020-11-12 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97814

--- Comment #2 from Jonathan Wakely  ---
There is no copy in C++17, it is elided, so lock(S(1)) is equivalent to lock(1)
in C++17, and that constructor exists.

GCC is correct.
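
A minimal sketch of the behaviour described above, assuming a member named lock of a type S whose copy and move constructors are deleted (the original testcase is not shown in this comment, so the surrounding Guard type and the function name are hypothetical). Under -std=c++17 the prvalue S(1) initializes the member directly, so no copy or move constructor is needed; under -std=c++14 the same code would be rejected.

```cpp
struct S
{
  explicit S(int v) : value(v) {}
  S(const S&) = delete;
  S(S&&) = delete;   // deleted move constructor
  int value;
};

struct Guard
{
  // In C++17, lock(S(1)) is equivalent to lock(1): the prvalue S(1)
  // is materialized directly into the member, with no move involved.
  Guard() : lock(S(1)) {}
  S lock;
};

// Helper used only to observe the result.
int guarded_value()
{
  Guard g;
  return g.lock.value;
}
```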

Re: [PATCH][RFC] Make mingw-w64 printf/scanf attribute alias to ms_printf/ms_scanf only for C89

2020-11-12 Thread Joseph Myers

I'd expect these patches to include updates to the gcc.dg/format/ms_*.c 
tests to reflect the changed semantics (or new tests there if some of the 
changes don't result in any failures in the existing tests).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] c++: Don't form a templated TARGET_EXPR in finish_compound_literal

2020-11-12 Thread Marek Polacek via Gcc-patches

On Thu, Nov 12, 2020 at 01:27:23PM -0500, Patrick Palka wrote:
> The atom_cache in normalize_atom relies on the assumption that two
> equivalent (templated) trees (in the sense of cp_tree_equal) must use
> the same template parameters (according to find_template_parameters).
> 
> This assumption unfortunately doesn't always hold for TARGET_EXPRs,
> because cp_tree_equal ignores an artificial target of a TARGET_EXPR, but
> find_template_parameters walks this target (and its DECL_CONTEXT).
> 
> Hence two TARGET_EXPRs built by force_target_expr with the same
> initializer but under different settings of current_function_decl may
> compare equal according to cp_tree_equal, but find_template_parameters
> returns a different set of template parameters for them.  This breaks
> the below testcase because during normalization we build two such
> TARGET_EXPRs (one under current_function_decl=f and another under =g),
> and then use the same ATOMIC_CONSTR for the two corresponding atoms,
> leading to a crash during satisfaction of g's associated constraints.
> 
> This patch works around this assumption violation by removing the source
> of these templated TARGET_EXPRs.  The relevant call to get_target_expr was
> added in r9-6043, but it seems it's no longer necessary (according to
> https://gcc.gnu.org/pipermail/gcc-patches/2019-February/517323.html, the
> call was added in order to avoid regressing on initlist109.C at the time).
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?

Looks OK to me, thanks!

> gcc/cp/ChangeLog:
> 
>   * semantics.c (finish_compound_literal): Don't wrap the original
>   compound literal in a TARGET_EXPR when inside a template.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp2a/concepts-decltype3.C: New test.
> ---
>  gcc/cp/semantics.c  |  7 +--
>  gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C | 15 +++
>  2 files changed, 16 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
> 
> diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
> index 33d715edaec..172286922e7 100644
> --- a/gcc/cp/semantics.c
> +++ b/gcc/cp/semantics.c
> @@ -3006,12 +3006,7 @@ finish_compound_literal (tree type, tree compound_literal,
>  
>/* If we're in a template, return the original compound literal.  */
>if (orig_cl)
> -{
> -  if (!VECTOR_TYPE_P (type))
> - return get_target_expr_sfinae (orig_cl, complain);
> -  else
> - return orig_cl;
> -}
> +return orig_cl;
>  
>if (TREE_CODE (compound_literal) == CONSTRUCTOR)
>  {
> diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
> new file mode 100644
> index 000..837855ce8ac
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
> @@ -0,0 +1,15 @@
> +// { dg-do compile { target c++20 } }
> +
> +template  concept C = requires(T t) { t; };
> +
> +template  using A = decltype((T{}, int{}));
> +
> +template  concept D = C>;
> +
> +template  void f() requires D;
> +template  void g() requires D;
> +
> +void h() {
> +  f();
> +  g();
> +}
> -- 
> 2.29.2.260.ge31aba42fb
> 

Marek

Re: SLS Mitigation patches backported for GCC9

2020-11-12 Thread Sebastian Pop via Gcc-patches

Hi,

could the SLS Mitigation patches be back-ported to the gcc-8 branch?

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=dc586a74922 aarch64:
Introduce SLS mitigation for RET and BR instructions
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=20da13e395b aarch64:
New Straight Line Speculation (SLS) mitigation flags

Thanks,
Sebastian

On Tue, Aug 4, 2020 at 3:34 AM Kyrylo Tkachov  wrote:
>
> Hi Matthew,
>
> > -Original Message-
> > From: Matthew Malcomson 
> > Sent: 24 July 2020 17:03
> > To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
> > Cc: Richard Earnshaw ; Ross Burton
> > ; Richard Sandiford 
> > Subject: Re: SLS Mitigation patches backported for GCC9
> >
> > On 24/07/2020 12:01, Kyrylo Tkachov wrote:
> > > Hi Matthew,
> > >
> > >> -Original Message-
> > >> From: Matthew Malcomson 
> > >> Sent: 21 July 2020 16:16
> > >> To: gcc-patches@gcc.gnu.org
> > >> Cc: Richard Earnshaw ; Kyrylo Tkachov
> > >> ; Ross Burton 
> > >> Subject: SLS Mitigation patches backported for GCC9
> > >>
> > >> Hello,
> > >>
> > >> Eventually we will want to backport the SLS patches to older branches.
> > >>
> > >> When the GCC10 release is unfrozen we will work on getting the same
> > >> patches already posted backported to that branch.  The patches already
> > >> posted on the mailing list apply cleanly to the current releases/gcc-10
> > >> branch.
> > >>
> > >> I've heard interest in having the GCC 9 patches, so I'm posting the
> > >> modified versions upstream sooner than otherwise.
> > >
> > > I'd say let's go ahead with the GCC 10 patches (assuming testing works
> > > out well on there).
> > > For the GCC 9 patches it would be useful if you included a bit of text
> > > of how they differ from the GCC 10/11 patches.
> > > This would speed up the technical review.
> > > Thanks,
> > > Kyrill
> > >
> > >>
> > >> Cheers,
> > >> Matthew
> > >>
> > >> Entire patch series attached to cover letter.
> >
> > Below were the only two "interesting" hunks that failed to apply after
> > `patch -p1`.
> >
> > The differences causing these were:
> > - in GCC-9 the `retab` instruction wasn't in the "do_return" pattern.
> > - `simple_return` had "aarch64_use_simple_return_insn_p ()" as a
> > condition.
> >
> >
>
> Thanks, the backports to GCC 10 and GCC 9 are okay, let's go ahead with them.
> Kyrill
>
> >
> >
> > --- gcc/config/aarch64/aarch64.md
> > +++ gcc/config/aarch64/aarch64.md
> > @@ -863,18 +882,23 @@
> > [(return)]
> > ""
> > {
> > +const char *ret = NULL;
> >   if (aarch64_return_address_signing_enabled ()
> >  && TARGET_ARMV8_3
> >  && !crtl->calls_eh_return)
> > {
> >  if (aarch64_ra_sign_key == AARCH64_KEY_B)
> > - return "retab";
> > + ret = "retab";
> >  else
> > - return "retaa";
> > + ret = "retaa";
> > }
> > -return "ret";
> > +else
> > +  ret = "ret";
> > +output_asm_insn (ret, operands);
> > +return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> > }
> > -  [(set_attr "type" "branch")]
> > +  [(set_attr "type" "branch")
> > +   (set_attr "sls_length" "retbr")]
> >   )
> >
> >   (define_expand "return"
> > @@ -886,8 +910,12 @@
> >   (define_insn "simple_return"
> > [(simple_return)]
> > ""
> > -  "ret"
> > -  [(set_attr "type" "branch")]
> > +  {
> > +output_asm_insn ("ret", operands);
> > +return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> > +  }
> > +  [(set_attr "type" "branch")
> > +   (set_attr "sls_length" "retbr")]
> >   )
> >
> >   (define_insn "*cb1"

Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.

2020-11-12 Thread Uros Bizjak via Gcc-patches

On Thu, Nov 12, 2020 at 7:26 PM Uros Bizjak  wrote:
>
> On Thu, Nov 12, 2020 at 6:51 PM Uros Bizjak  wrote:
>
> > > > > > Yes, removed 'code' and value_mode by checking VECTOR_MODE_P and use
> > > > > > GET_MODE_INNER for value_mode.  ".md expanders" shall support for
> > > > > > integer constants index mode, but I guess they shouldn't be expanded
> > > > > > by IFN as this function is for variable index insert only?  Anyway,
> > > > > > the v3 patch used VOIDmode check...
> > > >
> > > > I'm not sure what best to do here, as said accepting "any" (integer)
> > > > mode as input is desirable (SImode, DImode but eventually also smaller
> > > > modes).  How that can be best achieved I don't know.
> >
> > I was expecting something similar to how extvM/extzvM operands are
> > handled here. We have:
> >
> > Operands 0 and 1 both have mode M.  Operands 2 and 3 have a
> > target-specific mode.
> >
> > Please note operands 2 and 3 having a "target-specific" mode, handled
> > in optabs-query.c as:
> >
> >   machine_mode struct_mode = data->operand[struct_op].mode;
> >   if (struct_mode == VOIDmode)
> >     struct_mode = word_mode;
> >   if (mode != struct_mode)
> >     return false;
> >
> > > Why's not specifying any mode in the pattern no good?  Just make sure you
> > > appropriately extend/subreg it?  We can make sure it will be an integer
> > > mode in the expander itself.
> >
> > IIRC, having known mode, expanders can use create_convert_operand_to,
> > and the middle-end will do the above by itself. Also note that at
> > least two targets specify SImode, so register operands are currently
> > ineffective there.
>
> On a related note, the pattern is currently expanded as (see
> store_bit_field_1 in expmed.c):
>
>   create_fixed_operand (&ops[0], op0);
>   create_input_operand (&ops[1], value, innermode);
>   create_integer_operand (&ops[2], pos);
>
> I don't think calling create_integer_operand on register operand is
> correct. The function comment says:
>
> /* Make OP describe an input operand that has value INTVAL and that has
>    no inherent mode.  This function should only be used for operands that
>    are always expand-time constants.  The backend may request that INTVAL
>    be copied into a different kind of rtx, but it must specify the mode
>    of that rtx if so.  */

Ah, sorry - variable vec_set takes a different path, please disregard
my last message.

Uros.

Re: [RFC][PR target PR90000] (rs6000) Compile time hog w/impossible asm constraint lra loop

2020-11-12 Thread Segher Boessenkool

On Thu, Nov 12, 2020 at 09:15:11AM -0700, Jeff Law wrote:
> > void foo (void)
> > {
> >   register float __attribute__ ((mode(SD))) r31 __asm__ ("r31");
> >   register float __attribute__ ((mode(SD))) fr1 __asm__ ("fr1");
> >
> >   __asm__ ("#" : "=d" (fr1));
> >   r31 = fr1;
> >   __asm__ ("#" : : "r" (r31));
> > }
> 
> Looking at this again after many months away, I wonder if the real problem
> is the reloads we have to generate for copies to/from the fr1 local
> variable, which is bound to hard reg fr1 rather than the asm statements
> themselves.  It's not clear to me from the BZ and I don't have a PPC
> cross handy to look directly.

We should never do a reload of a (local) register variable.
Unfortunately we cannot currently tell during reload that something is
one!

See also PR97708, and many more, going many years back.


Segher

[PATCH] c++: Don't form a templated TARGET_EXPR in finish_compound_literal

2020-11-12 Thread Patrick Palka via Gcc-patches

The atom_cache in normalize_atom relies on the assumption that two
equivalent (templated) trees (in the sense of cp_tree_equal) must use
the same template parameters (according to find_template_parameters).

This assumption unfortunately doesn't always hold for TARGET_EXPRs,
because cp_tree_equal ignores an artificial target of a TARGET_EXPR, but
find_template_parameters walks this target (and its DECL_CONTEXT).

Hence two TARGET_EXPRs built by force_target_expr with the same
initializer but under different settings of current_function_decl may
compare equal according to cp_tree_equal, but find_template_parameters
returns a different set of template parameters for them.  This breaks
the below testcase because during normalization we build two such
TARGET_EXPRs (one under current_function_decl=f and another under =g),
and then use the same ATOMIC_CONSTR for the two corresponding atoms,
leading to a crash during satisfaction of g's associated constraints.
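
To make the failure mode above concrete, here is a minimal toy sketch of a
cache whose key ignores something that the cached value depends on.  None of
these names exist in GCC: ToyTargetExpr, toy_equal, toy_params and
ToyAtomCache are hypothetical stand-ins for a TARGET_EXPR, cp_tree_equal,
find_template_parameters and the atom_cache respectively, and the real data
structures are considerably more involved.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-in for a TARGET_EXPR: an initializer plus an artificial target
// whose context records which function was current when it was built.
struct ToyTargetExpr {
  std::string initializer;     // looked at by toy_equal
  std::string target_context;  // ignored by toy_equal
};

// Analogue of cp_tree_equal: the artificial target is ignored.
bool toy_equal(const ToyTargetExpr &a, const ToyTargetExpr &b) {
  return a.initializer == b.initializer;
}

// Analogue of find_template_parameters: the target's context is walked too,
// so the result depends on a field that toy_equal never inspects.
std::set<std::string> toy_params(const ToyTargetExpr &e) {
  return {"T", "params-of-" + e.target_context};
}

// Cache keyed only on what toy_equal compares.  The first expression seen
// for a given key decides the cached parameter set for all "equal" ones.
struct ToyAtomCache {
  std::map<std::string, std::set<std::string>> entries;

  const std::set<std::string> &lookup(const ToyTargetExpr &e) {
    auto it = entries.find(e.initializer);
    if (it == entries.end())
      it = entries.emplace(e.initializer, toy_params(e)).first;
    return it->second;
  }
};
```

In this toy model, looking up an expression built under "f" and then an equal
one built under "g" hands back f's parameter set for g, which is the kind of
mismatch described above.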

This patch works around this assumption violation by removing the source
of these templated TARGET_EXPRs.  The relevant call to get_target_expr was
added in r9-6043, but it seems it's no longer necessary (according to
https://gcc.gnu.org/pipermail/gcc-patches/2019-February/517323.html, the
call was added in order to avoid regressing on initlist109.C at the time).

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* semantics.c (finish_compound_literal): Don't wrap the original
compound literal in a TARGET_EXPR when inside a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-decltype3.C: New test.
---
 gcc/cp/semantics.c  |  7 +--
 gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C | 15 +++
 2 files changed, 16 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 33d715edaec..172286922e7 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -3006,12 +3006,7 @@ finish_compound_literal (tree type, tree compound_literal,
 
   /* If we're in a template, return the original compound literal.  */
   if (orig_cl)
-{
-  if (!VECTOR_TYPE_P (type))
-   return get_target_expr_sfinae (orig_cl, complain);
-  else
-   return orig_cl;
-}
+return orig_cl;
 
   if (TREE_CODE (compound_literal) == CONSTRUCTOR)
 {
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
new file mode 100644
index 000..837855ce8ac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
@@ -0,0 +1,15 @@
+// { dg-do compile { target c++20 } }
+
+template  concept C = requires(T t) { t; };
+
+template  using A = decltype((T{}, int{}));
+
+template  concept D = C>;
+
+template  void f() requires D;
+template  void g() requires D;
+
+void h() {
+  f();
+  g();
+}
-- 
2.29.2.260.ge31aba42fb

Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.

2020-11-12 Thread Uros Bizjak via Gcc-patches

On Thu, Nov 12, 2020 at 6:51 PM Uros Bizjak  wrote:

> > > > Yes, removed 'code' and value_mode by checking VECTOR_MODE_P and use
> > > > GET_MODE_INNER for value_mode.  ".md expanders" shall support for
> > > > integer constants index mode, but I guess they shouldn't be expanded by
> > > > IFN as this function is for variable index insert only?  Anyway, the v3
> > > > patch used VOIDmode check...
> >
> > I'm not sure what best to do here, as said accepting "any" (integer) mode as
> > input is desirable (SImode, DImode but eventually also smaller modes).  How
> > that can be best achieved I don't know.
>
> I was expecting something similar to how extvM/extzvM operands are
> handled here. We have:
>
> Operands 0 and 1 both have mode M.  Operands 2 and 3 have a
> target-specific mode.
>
> Please note operands 2 and 3 having a "target-specific" mode, handled
> in optabs-query.c as:
>
>   machine_mode struct_mode = data->operand[struct_op].mode;
>   if (struct_mode == VOIDmode)
>     struct_mode = word_mode;
>   if (mode != struct_mode)
>     return false;
>
> > Why's not specifying any mode in the pattern no good?  Just make sure you
> > appropriately extend/subreg it?  We can make sure it will be an integer
> > mode in the expander itself.
>
> IIRC, having known mode, expanders can use create_convert_operand_to,
> and the middle-end will do the above by itself. Also note that at
> least two targets specify SImode, so register operands are currently
> ineffective there.

On a related note, the pattern is currently expanded as (see
store_bit_field_1 in expmed.c):

  create_fixed_operand (&ops[0], op0);
  create_input_operand (&ops[1], value, innermode);
  create_integer_operand (&ops[2], pos);

I don't think calling create_integer_operand on register operand is
correct. The function comment says:

/* Make OP describe an input operand that has value INTVAL and that has
   no inherent mode.  This function should only be used for operands that
   are always expand-time constants.  The backend may request that INTVAL
   be copied into a different kind of rtx, but it must specify the mode
   of that rtx if so.  */

Uros.
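
For readers following the thread, the "target-specific mode" check quoted
from optabs-query.c can be modelled with a small standalone sketch.  This is
not GCC code: ToyMode, toy_word_mode and toy_struct_mode_ok are hypothetical
names, and the assumption that the word mode is DI (a 64-bit target) is made
purely for illustration.

```cpp
#include <cassert>

// Toy subset of machine modes; the real enum is generated by GCC.
enum class ToyMode { Void, SI, DI };

// Assumption for this sketch only: a 64-bit target whose word mode is DI.
constexpr ToyMode toy_word_mode = ToyMode::DI;

// Mirrors the quoted logic: a pattern operand declared without a mode
// (Void here) is treated as having the word mode, and the query only
// succeeds when the requested mode equals the resulting mode.
bool toy_struct_mode_ok(ToyMode pattern_operand_mode, ToyMode requested) {
  ToyMode struct_mode = pattern_operand_mode;
  if (struct_mode == ToyMode::Void)
    struct_mode = toy_word_mode;
  return requested == struct_mode;
}
```

Under these assumptions a mode-less operand accepts only DI, while an operand
that names SI accepts only SI.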

[Bug c++/63287] __STDCPP_THREADS__ is not defined

2020-11-12 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63287

--- Comment #3 from Jonathan Wakely  ---
I tried to implement this by adding a macro definition to c_cpp_builtins in
gcc/c-family/c-cppbuiltin.c but failed. I think we want to inspect the
'thread_model' global variable and see if it is "single", but that might only
be available to the driver, not cc1plus.

Maybe the driver needs to check THREAD_MODEL_SPEC and thread_model and then
pass -D__STDCPP_THREADS__=1 to cc1plus when strcmp(thread_model, "single").

For now, I think we can define it in libstdc++'s  based on
the value of the _GLIBCXX_HAS_GTHREADS macro.
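
The driver-side idea in the comment above could look roughly like the sketch
below.  It is only an illustration: toy_extra_cc1plus_args is a hypothetical
helper, not a function in the GCC driver, and how the real driver would learn
the thread model and forward options is not shown here.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: given the configured thread model, return the extra
// options that would be passed on to cc1plus.  strcmp returns non-zero when
// the strings differ, so the macro is defined for every model except
// "single".
std::vector<std::string> toy_extra_cc1plus_args(const char *thread_model) {
  std::vector<std::string> args;
  if (std::strcmp(thread_model, "single") != 0)
    args.push_back("-D__STDCPP_THREADS__=1");
  return args;
}
```

With this helper, a "posix" thread model yields the single define option and
"single" yields no extra options.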

Re: [gcc r9-8794] aarch64: Clear canary value after stack_protect_test [PR96191]

2020-11-12 Thread Sebastian Pop via Gcc-patches

Hi,

On Fri, Aug 7, 2020 at 6:18 AM Richard Sandiford  wrote:
>
> https://gcc.gnu.org/g:5380912a17ea09a8996720fb62b1a70c16c8f9f2
>
> commit r9-8794-g5380912a17ea09a8996720fb62b1a70c16c8f9f2
> Author: Richard Sandiford 
> Date:   Fri Aug 7 12:17:37 2020 +0100

could you please also apply this change to the gcc-8 branch?

Thanks,
Sebastian

>
> aarch64: Clear canary value after stack_protect_test [PR96191]
>
> The stack_protect_test patterns were leaving the canary value in the
> temporary register, meaning that it was often still in registers on
> return from the function.  An attacker might therefore have been
> able to use it to defeat stack-smash protection for a later function.
>
> gcc/
> PR target/96191
> * config/aarch64/aarch64.md (stack_protect_test_): Set the
> CC register directly, instead of a GPR.  Replace the original GPR
> destination with an extra scratch register.  Zero out operand 3
> after use.
> (stack_protect_test): Update accordingly.
>
> gcc/testsuite/
> PR target/96191
> * gcc.target/aarch64/stack-protector-1.c: New test.
> * gcc.target/aarch64/stack-protector-2.c: Likewise.
>
> (cherry picked from commit fe1a26429038d7cd17abc53f96a6f3e2639b605f)
>
> Diff:
> ---
>  gcc/config/aarch64/aarch64.md  | 34 -
>  .../gcc.target/aarch64/stack-protector-1.c | 89 
> ++
>  .../gcc.target/aarch64/stack-protector-2.c |  6 ++
>  3 files changed, 110 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index ed8cf8ecea1..9598bac387f 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -6985,10 +6985,8 @@
> (match_operand 2)]
>""
>  {
> -  rtx result;
>machine_mode mode = GET_MODE (operands[0]);
>
> -  result = gen_reg_rtx(mode);
>if (aarch64_stack_protector_guard != SSP_GLOBAL)
>{
>  /* Generate access through the system register. The
> @@ -7013,29 +7011,27 @@
>  operands[1] = gen_rtx_MEM (mode, tmp_reg);
>}
>emit_insn ((mode == DImode
> - ? gen_stack_protect_test_di
> - : gen_stack_protect_test_si) (result,
> -   operands[0],
> -   operands[1]));
> -
> -  if (mode == DImode)
> -emit_jump_insn (gen_cbranchdi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
> -   result, const0_rtx, operands[2]));
> -  else
> -emit_jump_insn (gen_cbranchsi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
> -   result, const0_rtx, operands[2]));
> +? gen_stack_protect_test_di
> +: gen_stack_protect_test_si) (operands[0], operands[1]));
> +
> +  rtx cc_reg = gen_rtx_REG (CCmode, CC_REGNUM);
> +  emit_jump_insn (gen_condjump (gen_rtx_EQ (VOIDmode, cc_reg, const0_rtx),
> +   cc_reg, operands[2]));
>DONE;
>  })
>
> +;; DO NOT SPLIT THIS PATTERN.  It is important for security reasons that the
> +;; canary value does not live beyond the end of this sequence.
>  (define_insn "stack_protect_test_"
> -  [(set (match_operand:PTR 0 "register_operand" "=r")
> -   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")
> -(match_operand:PTR 2 "memory_operand" "m")]
> -UNSPEC_SP_TEST))
> +  [(set (reg:CC CC_REGNUM)
> +   (unspec:CC [(match_operand:PTR 0 "memory_operand" "m")
> +   (match_operand:PTR 1 "memory_operand" "m")]
> +  UNSPEC_SP_TEST))
> +   (clobber (match_scratch:PTR 2 "="))
> (clobber (match_scratch:PTR 3 "="))]
>""
> -  "ldr\t%3, %1\;ldr\t%0, %2\;eor\t%0, %3, %0"
> -  [(set_attr "length" "12")
> +  "ldr\t%2, %0\;ldr\t%3, %1\;subs\t%2, %2, %3\;mov\t%3, 0"
> +  [(set_attr "length" "16")
> (set_attr "type" "multiple")])
>
>  ;; Write Floating-point Control Register.
> diff --git a/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c b/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
> new file mode 100644
> index 000..73e83bc413f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
> @@ -0,0 +1,89 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target fstack_protector } */
> +/* { dg-options "-fstack-protector-all -O2" } */
> +
> +extern volatile long *stack_chk_guard_ptr;
> +
> +volatile long *
> +get_ptr (void)
> +{
> +  return stack_chk_guard_ptr;
> +}
> +
> +void __attribute__ ((noipa))
> +f (void)
> +{
> +  volatile int x;
> +  x = 1;
> +  x += 1;
> +}
> +
> +#define CHECK(REG) "\tcmp\tx0, " #REG "\n\tbeq\t1f\n"
> +
> +asm (
> +"  .pushsection .data\n"
> +"  .align  3\n"
> +"  .globl  stack_chk_guard_ptr\n"
> +"stack_chk_guard_ptr:\n"
> +#if __ILP32__
> +"  .word   __stack_chk_guard\n"
> +#else
> +"

[Bug rtl-optimization/97777] ICE: in df_refs_verify, at df-scan.c:3991 with -O -ffinite-math-only -fzero-call-used-regs=all

2020-11-12 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9

Uroš Bizjak  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |qing.zhao at oracle dot com
   Last reconfirmed|                            |2020-11-12
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |11.0
     Ever confirmed|0                           |1

--- Comment #1 from Uroš Bizjak  ---
Confirmed (compiler must be configured with checking enabled).

CC author.
