Re: [PATCH] c, tree: Fix ICE from get_parm_array_spec [PR97860]

2020-11-18 Thread Richard Biener
On Wed, 18 Nov 2020, Jakub Jelinek wrote:

> Hi!
> 
> The C and C++ FEs handle zero sized arrays differently, C uses
> NULL TYPE_MAX_VALUE on non-NULL TYPE_DOMAIN on complete ARRAY_TYPEs
> with bitsize_zero_node TYPE_SIZE, while C++ FE likes to set
> TYPE_MAX_VALUE to the largest value (and min to the lowest).
> 
> Martin has used array_type_nelts in get_parm_array_spec where the
> function on the C form of [0] arrays returns error_mark_node and the code
> crashes soon afterwards.  The following patch teaches array_type_nelts about
> this (e.g. dwarf2out already handles that as [0]).  While it will change
> what is_empty_type returns for certain types (e.g. struct S { int a[0]; };),
> as those types occupy zero bits in C, it shouldn't make an ABI difference.
> 
> So, the tree.c change makes the c-decl.c code handle the [0] arrays
> like any other constant extents, and the c-decl.c change just makes sure
> that if we'd run into error_mark_node e.g. from the VLA expressions, we
> don't crash on those.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Looks OK to me.  I wonder if we want to change the C FE to follow the
C++ FE here.  Note I also chickened out making the domain signed
(the C++ variant is to be interpreted as [0, -1] for zero-size arrays).

In principle [0,] (NULL TYPE_MAX_VALUE) is a flex-array domain but
I guess a COMPLETE_TYPE_P clearly says it isn't.  Still it would be
nice to fix this inconsistency.  (layout_type could even ICE on
complete array types with "incomplete" domain)
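
A minimal sketch of the two zero-size array representations discussed above, using a toy struct rather than GCC's tree API. The struct, its fields and toy_array_type_nelts are hypothetical stand-ins invented for illustration; only the decision logic mirrors the quoted tree.c hunk.

```c
#include <stdbool.h>

/* Toy stand-in for an array type and its index domain; not GCC's tree API.  */
struct toy_array_type
{
  bool complete;   /* models COMPLETE_TYPE_P */
  long size_bits;  /* models TYPE_SIZE */
  long min;        /* models TYPE_MIN_VALUE of the domain */
  bool has_max;    /* false models a NULL TYPE_MAX_VALUE */
  long max;        /* models TYPE_MAX_VALUE when has_max is true */
};

/* Stores "number of elements minus one" in *OUT and returns true, or
   returns false where the real function would return error_mark_node.  */
bool
toy_array_type_nelts (const struct toy_array_type *t, long *out)
{
  if (!t->has_max)
    {
      /* C front end form of T[0]: complete, zero size, min 0, no max.  */
      if (t->complete && t->size_bits == 0 && t->min == 0)
        {
          *out = -1;
          return true;
        }
      /* Genuinely unknown length, e.g. a flexible array member.  */
      return false;
    }

  /* Known bounds, including the C++ front end form [0, -1].  */
  *out = t->max - t->min;
  return true;
}
```

With this model both the C form (no max, zero size) and the C++ form (max of -1) yield -1, while an incomplete array with no max still reports failure.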

Thanks,
Richard.

> 2020-11-18  Jakub Jelinek  
> 
>   PR c/97860
>   * tree.c (array_type_nelts): For complete arrays with zero min
>   and NULL max and zero size return -1.
> 
>   * c-decl.c (get_parm_array_spec): Bail out if nelts is
>   error_operand_p.
> 
>   * gcc.dg/pr97860.c: New test.
> 
> --- gcc/tree.c.jj 2020-11-18 09:40:09.798660999 +0100
> +++ gcc/tree.c 2020-11-18 20:02:41.655398514 +0100
> @@ -3483,7 +3483,17 @@ array_type_nelts (const_tree type)
>  
>    /* TYPE_MAX_VALUE may not be set if the array has unknown length.  */
>    if (!max)
> -    return error_mark_node;
> +    {
> +      /* zero sized arrays are represented from C FE as complete types with
> +         NULL TYPE_MAX_VALUE and zero TYPE_SIZE, while C++ FE represents
> +         them as min 0, max -1.  */
> +      if (COMPLETE_TYPE_P (type)
> +          && integer_zerop (TYPE_SIZE (type))
> +          && integer_zerop (min))
> +        return build_int_cst (TREE_TYPE (min), -1);
> +
> +      return error_mark_node;
> +    }
>  
>    return (integer_zerop (min)
>            ? max
> --- gcc/c/c-decl.c.jj 2020-11-11 01:46:03.245697697 +0100
> +++ gcc/c/c-decl.c 2020-11-18 20:03:53.053602265 +0100
> @@ -5775,6 +5775,8 @@ get_parm_array_spec (const struct c_parm
>             type = TREE_TYPE (type))
>          {
>            tree nelts = array_type_nelts (type);
> +          if (error_operand_p (nelts))
> +            return attrs;
>            if (TREE_CODE (nelts) != INTEGER_CST)
>              {
>                /* Each variable VLA bound is represented by the dollar
> --- gcc/testsuite/gcc.dg/pr97860.c.jj 2020-11-18 15:15:08.858931877 +0100
> +++ gcc/testsuite/gcc.dg/pr97860.c 2020-11-18 15:14:50.751135430 +0100
> @@ -0,0 +1,11 @@
> +/* PR c/97860 */
> +/* { dg-do compile } */
> +/* { dg-options "" } */
> +
> +void
> +foo (int n)
> +{
> +  typedef int T[0];
> +  typedef T V[n];
> +  void bar (V);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


[PATCH] rs6000: Fix p8_mtvsrd_df's insn type

2020-11-18 Thread Kewen.Lin via Gcc-patches
Hi,

The insn type of p8_mtvsrd_df appears not to have been updated
to mtvsr.  I assume all uses of mtvsrd should have the same
insn type.

This patch changes its insn type from mfvsr to mtvsr.

Is it ok for trunk?

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/rs6000.md (p8_mtvsrd_df): Fix insn type.
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5e5ad9f7c3d..7de31cab80b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8761,7 +8761,7 @@
   UNSPEC_P8V_MTVSRD))]
   "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "mtvsrd %x0,%1"
-  [(set_attr "type" "mfvsr")])
+  [(set_attr "type" "mtvsr")])
 
 (define_insn "p8_xxpermdi_<mode>"
   [(set (match_operand:FMOVE128_GPR 0 "register_operand" "=wa")


Re: [PATCH] Remove lambdas from _Rb_tree

2020-11-18 Thread François Dumont via Gcc-patches

On 18/11/20 12:50 am, Jonathan Wakely wrote:

On 17/11/20 21:51 +0100, François Dumont via Libstdc++ wrote:
This is a change that has been done to _Hashtable and that I forgot 
to propose for _Rb_tree.


The _GLIBCXX_XREF macro can be easily removed of course.

    libstdc++: _Rb_tree code cleanup, remove lambdas.

    Use an additional template parameter on the clone method to propagate
    whether the values must be copied or moved, rather than lambdas.

    libstdc++-v3/ChangeLog:

            * include/bits/move.h (_GLIBCXX_XREF): New.
            * include/bits/stl_tree.h: Adapt to use latter.
            (_Rb_tree<>::_S_fwd_value_for): New.
            (_Rb_tree<>::_M_clone_node): Add _Tree template parameter.
            Use _S_fwd_value_for.
            (_Rb_tree<>::_M_cbegin): New.
            (_Rb_tree<>::_M_begin): Use latter.
            (_Rb_tree<>::_M_copy): Add _Tree template parameter.
            (_Rb_tree<>::_M_move_data): Use rvalue reference for _Rb_tree
            parameter.
            (_Rb_tree<>::_M_move_assign): Likewise.

Tested under Linux x86_64.

Ok to commit ?


GCC is in stage 3 now, so this should have been posted last week
really.


Ok, no problem, it can wait.

Still, following your advice, here is what I came up with; it is much
simpler indeed.

I have only run a few tests for the moment, but so far so good.

Thanks




diff --git a/libstdc++-v3/include/bits/move.h 
b/libstdc++-v3/include/bits/move.h

index 5a4dbdc823c..e0d68ca9108 100644
--- a/libstdc++-v3/include/bits/move.h
+++ b/libstdc++-v3/include/bits/move.h
@@ -158,9 +158,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

  /// @} group utilities

+#define _GLIBCXX_XREF(_Tp) _Tp&&


I think this does improve the code that uses this. But the correct
name for this is forwarding reference, so I think FWDREF would be
better than XREF. XREF doesn't tell me anything about what it's for.


#define _GLIBCXX_MOVE(__val) std::move(__val)
#define _GLIBCXX_FORWARD(_Tp, __val) std::forward<_Tp>(__val)
#else
+#define _GLIBCXX_XREF(_Tp) const _Tp&
#define _GLIBCXX_MOVE(__val) (__val)
#define _GLIBCXX_FORWARD(_Tp, __val) (__val)
#endif
diff --git a/libstdc++-v3/include/bits/stl_tree.h 
b/libstdc++-v3/include/bits/stl_tree.h

index ec141ea01c7..128c7e2c892 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -478,11 +478,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

template
  _Link_type
-#if __cplusplus < 201103L
-  operator()(const _Arg& __arg)
-#else
-  operator()(_Arg&& __arg)
-#endif
+  operator()(_GLIBCXX_XREF(_Arg) __arg)
  {
    _Link_type __node = static_cast<_Link_type>(_M_extract());
    if (__node)
@@ -544,11 +540,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

template
  _Link_type
-#if __cplusplus < 201103L
-  operator()(const _Arg& __arg) const
-#else
-  operator()(_Arg&& __arg) const
-#endif
+  operator()(_GLIBCXX_XREF(_Arg) __arg) const
  { return _M_t._M_create_node(_GLIBCXX_FORWARD(_Arg, __arg)); }

  private:
@@ -655,11 +647,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_put_node(__p);
  }

-  template
+#if __cplusplus >= 201103L
+  template
+    static constexpr
+    typename conditional::value,
+ const value_type&, value_type&&>::type
+    _S_fwd_value_for(value_type& __val) noexcept
+    { return std::move(__val); }
+#else
+  template
+    static const value_type&
+    _S_fwd_value_for(value_type& __val)
+    { return __val; }
+#endif
+
+  template
_Link_type
-    _M_clone_node(_Const_Link_type __x, _NodeGen& __node_gen)
+    _M_clone_node(_GLIBCXX_XREF(_Tree),


Since the _Tree type is only used to decide whether to copy or move,
could it just be a bool instead?

  template
 _Link_type
_M_clone_node(_Link_type __x, _NodeGen& __node_gen)

Then it would be called as _M_clone_node<_Move>(__x, __node_gen)
instead of _M_clone_node(_GLIBCXX_FORWARD(_Tree, __t), __x, __node_gen).
That seems easier to read.


+  _Link_type __x, _NodeGen& __node_gen)
{
-  _Link_type __tmp = __node_gen(*__x->_M_valptr());
+  _Link_type __tmp
+    = __node_gen(_S_fwd_value_for<_Tree>(*__x->_M_valptr()));


Is _S_fwd_value_for necessary? This would work:

#if __cplusplus >= 201103L
  using _Vp = typename 
conditional::value,

   const value_type&,
value_type&&>::type;
#else
  typedef const value_type& _Vp;
#endif
  _Link_type __tmp
    = __node_gen(_GLIBCXX_FORWARD(_Vp, *__x->_M_valptr()));

Or with the suggestion above, the typedef would be:

  using _Vp = typename conditional<_Move, value_type&&,
   const value_type&>::type;



  __tmp->_M_color = __x->_M_color;
  __tmp->_M_left = 0;
  __tmp->_M_right = 0;
@@ -748,9 +756,13 @@ 

[C PATCH] Drop qualifiers during lvalue conversion

2020-11-18 Thread Uecker, Martin


Here is another version of the patch. The
only difference is the additional check
using 'tree_ssa_useless_type_conversion'.


Best,
Martin




C: Drop qualifiers during lvalue conversion. PR97702

2020-11-XX  Martin Uecker  

gcc/
* gimplify.c (gimplify_modify_expr_rhs): Optimize
NOP_EXPRs that contain compound literals.

gcc/c/
* c-typeck.c (convert_lvalue_to_rvalue): Drop qualifiers.
 
gcc/testsuite/
* gcc.dg/cond-constqual-1.c: Adapt test.
* gcc.dg/lvalue-11.c: New test.
* gcc.dg/pr60195.c: Add warning.





diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 413109c916c..286f3d9cd6c 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -2080,6 +2080,9 @@ convert_lvalue_to_rvalue (location_t loc, struct c_expr exp,
 exp = default_function_array_conversion (loc, exp);
   if (!VOID_TYPE_P (TREE_TYPE (exp.value)))
 exp.value = require_complete_type (loc, exp.value);
+  if (convert_p && !error_operand_p (exp.value)
+  && (TREE_CODE (TREE_TYPE (exp.value)) != ARRAY_TYPE))
+exp.value = convert (build_qualified_type (TREE_TYPE (exp.value), TYPE_UNQUALIFIED), exp.value);
   if (really_atomic_lvalue (exp.value))
 {
   vec *params;
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 2566ec7f0af..fd0b5202b45 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -5518,6 +5518,19 @@ gimplify_modify_expr_rhs (tree *expr_p, tree *from_p, tree *to_p,
    return GS_OK;
      }
 
+   case NOP_EXPR:
+     /* Pull out compound literal expressions from a NOP_EXPR.
+    Those are created in the C FE to drop qualifiers during
+    lvalue conversion.  */
+     if ((TREE_CODE (TREE_OPERAND (*from_p, 0)) == COMPOUND_LITERAL_EXPR)
+     && tree_ssa_useless_type_conversion (*from_p))
+   {
+     *from_p = TREE_OPERAND (*from_p, 0);
+     ret = GS_OK;
+     changed = true;
+   }
+     break;
+
    case COMPOUND_LITERAL_EXPR:
      {
    tree complit = TREE_OPERAND (*expr_p, 1);
diff --git a/gcc/testsuite/gcc.dg/cond-constqual-1.c b/gcc/testsuite/gcc.dg/cond-constqual-1.c
index 3354c7214a4..b5a09cb0038 100644
--- a/gcc/testsuite/gcc.dg/cond-constqual-1.c
+++ b/gcc/testsuite/gcc.dg/cond-constqual-1.c
@@ -11,5 +11,5 @@ test (void)
   __typeof__ (1 ? foo (0) : 0) texpr;
   __typeof__ (1 ? i : 0) texpr2;
   texpr = 0;  /* { dg-bogus "read-only variable" "conditional expression with call to const function" } */
-  texpr2 = 0; /* { dg-error "read-only variable" "conditional expression with const variable" } */
+  texpr2 = 0; /* { dg-bogus "read-only variable" "conditional expression with const variable" } */
 }
diff --git a/gcc/testsuite/gcc.dg/lvalue-11.c b/gcc/testsuite/gcc.dg/lvalue-11.c
new file mode 100644
index 000..45a97d86890
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lvalue-11.c
@@ -0,0 +1,46 @@
+/* test that lvalue conversion drops qualifiers, Bug 97702 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+
+void f(void)
+{
+ const int j;
+ typeof((0,j)) i10; i10 = j;;
+ typeof(+j) i11; i11 = j;;
+ typeof(-j) i12; i12 = j;;
+ typeof(1?j:0) i13; i13 = j;;
+ typeof((int)j) i14; i14 = j;;
+ typeof((const int)j) i15; i15 = j;;
+}
+
+void g(void)
+{
+ volatile int j;
+ typeof((0,j)) i21; i21 = j;;
+ typeof(+j) i22; i22 = j;;
+ typeof(-j) i23; i23 = j;;
+ typeof(1?j:0) i24; i24 = j;;
+ typeof((int)j) i25; i25 = j;;
+ typeof((volatile int)j) i26; i26 = j;;
+}
+
+void h(void)
+{
+ _Atomic int j;
+ typeof((0,j)) i32; i32 = j;;
+ typeof(+j) i33; i33 = j;;
+ typeof(-j) i34; i34 = j;;
+ typeof(1?j:0) i35; i35 = j;;
+ typeof((int)j) i36; i36 = j;;
+ typeof((_Atomic int)j) i37; i37 = j;;
+}
+
+void e(void)
+{
+ int* restrict j;
+ typeof((0,j)) i43; i43 = j;;
+ typeof(1?j:0) i44; i44 = j;;
+ typeof((int*)j) i45; i45 = j;;
+ typeof((int* restrict)j) i46; i46 = j;;
+}
diff --git a/gcc/testsuite/gcc.dg/pr60195.c b/gcc/testsuite/gcc.dg/pr60195.c
index 0a50a30be25..8eccf7f63ad 100644
--- a/gcc/testsuite/gcc.dg/pr60195.c
+++ b/gcc/testsuite/gcc.dg/pr60195.c
@@ -15,7 +15,7 @@ atomic_int
 fn2 (void)
 {
   atomic_int y = 0;
-  y;
+  y;   /* { dg-warning "statement with no effect" } */
   return y;
 }
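
The qualifier-dropping rule that the new lvalue-11.c test exercises can also be observed without typeof. The sketch below is an illustration only and is not part of the patch: it assumes a C11 compiler that applies lvalue conversion to the controlling expression of _Generic, and the macro and function names are made up for the example.

```c
/* Evaluates to 1 when the type of E after lvalue conversion is plain int.  */
#define TOY_IS_PLAIN_INT(e) _Generic ((e), int: 1, default: 0)

/* A const int object yields a value of type int.  */
int
toy_const_int_converts_to_int (void)
{
  const int j = 0;
  return TOY_IS_PLAIN_INT (j);
}

/* A volatile int object also yields a value of type int.  */
int
toy_volatile_int_converts_to_int (void)
{
  volatile int j = 0;
  return TOY_IS_PLAIN_INT (j);
}

/* Only top-level qualifiers are dropped: the pointed-to type keeps its
   const, so the "const int *" association is the one selected.  */
int
toy_pointer_target_keeps_const (void)
{
  const int *p = 0;
  return _Generic (p, int *: 1, const int *: 2, default: 0);
}
```

The first two functions return 1 and the third returns 2 under the stated assumption.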
 

Re: [PATCH v3 1/2] generate EH info for volatile asm statements (PR93981)

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/15/20 6:04 AM, J.W. Jagersma via Gcc-patches wrote:
> On 2020-11-13 09:41, Richard Biener wrote:
>> On Thu, Mar 12, 2020 at 1:41 AM J.W. Jagersma via Gcc-patches
>>  wrote:
>>> diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
>>> index 2a409dcaffe..58b16aa763a 100644
>>> --- a/gcc/tree-eh.c
>>> +++ b/gcc/tree-eh.c
>>> @@ -2077,6 +2077,8 @@ lower_eh_constructs_2 (struct leh_state *state, gimple_stmt_iterator *gsi)
>>> DECL_GIMPLE_REG_P (tmp) = 1;
>>>   gsi_insert_after (gsi, s, GSI_SAME_STMT);
>>> }
>>> +
>>> +record_throwing_stmt:
>>>/* Look for things that can throw exceptions, and record them.  */
>>>if (state->cur_region && stmt_could_throw_p (cfun, stmt))
>>> {
>>> @@ -2085,6 +2087,36 @@ lower_eh_constructs_2 (struct leh_state *state, gimple_stmt_iterator *gsi)
>>> }
>>>break;
>>>
>>> +case GIMPLE_ASM:
>>> +  {
>>> +   /* As above with GIMPLE_ASSIGN.  Change each register output operand
>>> +  to a temporary and insert a new stmt to assign this to the original
>>> +  operand.  */
>>> +   gasm *asm_stmt = as_a <gasm *> (stmt);
>>> +   if (stmt_could_throw_p (cfun, stmt)
>>> +   && gimple_asm_noutputs (asm_stmt) > 0
>>> +   && gimple_stmt_may_fallthru (stmt))
>>> + {
>>> +   for (unsigned i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
>>> + {
>>> +   tree op = gimple_asm_output_op (asm_stmt, i);
>>> +   tree opval = TREE_VALUE (op);
>>> +   if (tree_could_throw_p (opval)
>>> +   || !is_gimple_reg_type (TREE_TYPE (opval))
>>> +   || !is_gimple_reg (get_base_address (opval)))
>>> + continue;
>>> +
>>> +   tree tmp = create_tmp_reg (TREE_TYPE (opval));
>>> +   gimple *s = gimple_build_assign (opval, tmp);
>>> +   gimple_set_location (s, gimple_location (stmt));
>>> +   gimple_set_block (s, gimple_block (stmt));
>>> +   TREE_VALUE (op) = tmp;
>>> +   gsi_insert_after (gsi, s, GSI_SAME_STMT);
>>> + }
>>> + }
>>> +  }
>>> +  goto record_throwing_stmt;
>> Can you avoid the ugly goto by simply duplicating the common code please?
>>
>> Otherwise OK.
>>
>> As you say volatile asms are already considered throwing in some pieces of
>> code so this is a step towards fulfilling that promise.
>>
>> Thanks,
>> Richard.
>>
> Hi Richard,
>
> Thanks for your feedback.  I'll have to check again if I made any other
> changes since this.  If not, and if there are no further objections, I will
> resubmit this patch soon with this goto removed.
Sounds good.  I'll keep an eye out for it.  I think we'll want to look
at the doc text one more time too to make sure it matches the semantics
we can actually guarantee.
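
For readers following the review comment about the goto, here is a toy sketch of the shape Richard asks for: each switch case carries its own copy of the shared tail instead of jumping to a common label. The enum, struct and function are hypothetical and only model the control flow; they are not the tree-eh.c code.

```c
/* Hypothetical statement kinds; not GCC's gimple codes.  */
enum toy_stmt_kind { TOY_ASSIGN, TOY_ASM, TOY_OTHER };

struct toy_eh_state
{
  int in_region;  /* nonzero models a non-NULL state->cur_region */
  int recorded;   /* number of statements recorded as throwing */
};

/* Each case repeats the "record throwing statement" tail, so no label
   or goto is shared between the cases.  */
void
toy_lower_stmt (struct toy_eh_state *state, enum toy_stmt_kind kind,
                int could_throw)
{
  switch (kind)
    {
    case TOY_ASSIGN:
      /* Assignment-specific rewriting would go here.  */
      if (state->in_region && could_throw)
        state->recorded++;
      break;

    case TOY_ASM:
      /* Asm output-operand rewriting would go here.  */
      if (state->in_region && could_throw)
        state->recorded++;
      break;

    default:
      break;
    }
}
```

The duplication costs two lines per case but keeps every case readable on its own.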

Jeff



Re: [PATCH v3 1/2] generate EH info for volatile asm statements (PR93981)

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/15/20 6:00 AM, J.W. Jagersma wrote:
> On 2020-11-12 16:51, Jeff Law wrote:
>> On 3/11/20 6:38 PM, J.W. Jagersma via Gcc-patches wrote:
>>> The following patch extends the generation of exception handling
>>> information, so that it is possible to catch exceptions thrown from
>>> volatile asm statements, when -fnon-call-exceptions is enabled.  Parts
>>> of the gcc code already suggested this should be possible, but it was
>>> never fully implemented.
>>>
>>> Two new test cases are added.  The target-dependent test should pass on
>>> platforms where throwing from a signal handler is allowed.  The only
>>> platform I am aware of where that is the case is *-linux-gnu, so it is
>>> set to XFAIL on all others.
>>>
>>> gcc/
>>> 2020-03-11  Jan W. Jagersma  
>>>
>>> PR inline-asm/93981
>>> * tree-cfg.c (make_edges_bb): Make EH edges for GIMPLE_ASM.
>>> * tree-eh.c (lower_eh_constructs_2): Add case for GIMPLE_ASM.
>>> Assign register output operands to temporaries.
>>> * doc/extend.texi: Document that volatile asms can now throw.
>>>
>>> gcc/testsuite/
>>> 2020-03-11  Jan W. Jagersma  
>>>
>>> PR inline-asm/93981
>>> * g++.target/i386/pr93981.C: New test.
>>> * g++.dg/eh/pr93981.C: New test.
>> Is this the final version of the patch?  Do we have agreement on the
>> semantics for output operands, particularly memory operands?  The last
>> few messages in the March thread lead me to believe that's still not
>> settled.
>>
>>
>> Jeff
> Hi Jeff,
>
> From what I remember, no consensus was reached.  The discussion didn't seem
> to be going anywhere, and I had found a suitable workaround, so the issue
> dropped off my radar.  However this workaround now turned out to be somewhat
> fragile so I do hope to see this implemented in gcc someday.
>
> I'll have to check but I do think this is the "best" version I have of this
> patch.  In my most recent branch here I changed all memory operands to inout,
> but I recall there being some problem with this.  Anyway, as I think Richard
> said (most of this goes over my head), I think this is best left up to the
> user.  If one expects their asm to throw they would also be smart enough to
> use the '+' modifier so that previous assignments are not optimized out.
I wouldn't expect that changing the memory operands to inout would
consistently work.
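
As a small illustration of the '+' modifier mentioned in the quoted text, the sketch below marks a register operand as read-write so the earlier assignment stays live across the asm. It assumes a compiler with GNU-style extended asm (GCC or Clang) and uses an empty asm template, so it says nothing about what happens when a real asm throws; the function name is made up.

```c
/* Requires GNU-style extended asm.  */
int
toy_read_write_operand (void)
{
  int result = 42;

  /* "+r" tells the compiler the asm both reads and writes RESULT, so the
     assignment above cannot be discarded as dead.  With "=r" the compiler
     would be free to drop it, because the asm would be assumed to
     overwrite RESULT unconditionally.  */
  __asm__ volatile ("" : "+r" (result));

  return result;
}
```

Because the template is empty, the register still holds 42 after the asm and the function returns that value.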

jeff



Re: [PATCH v3 1/2] generate EH info for volatile asm statements (PR93981)

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/13/20 1:45 AM, Richard Biener wrote:
> On Thu, Nov 12, 2020 at 4:53 PM Jeff Law via Gcc-patches
>  wrote:
>>
>> On 3/11/20 6:38 PM, J.W. Jagersma via Gcc-patches wrote:
>>> The following patch extends the generation of exception handling
>>> information, so that it is possible to catch exceptions thrown from
>>> volatile asm statements, when -fnon-call-exceptions is enabled.  Parts
>>> of the gcc code already suggested this should be possible, but it was
>>> never fully implemented.
>>>
>>> Two new test cases are added.  The target-dependent test should pass on
>>> platforms where throwing from a signal handler is allowed.  The only
>>> platform I am aware of where that is the case is *-linux-gnu, so it is
>>> set to XFAIL on all others.
>>>
>>> gcc/
>>> 2020-03-11  Jan W. Jagersma  
>>>
>>>   PR inline-asm/93981
>>>   * tree-cfg.c (make_edges_bb): Make EH edges for GIMPLE_ASM.
>>>   * tree-eh.c (lower_eh_constructs_2): Add case for GIMPLE_ASM.
>>>   Assign register output operands to temporaries.
>>>   * doc/extend.texi: Document that volatile asms can now throw.
>>>
>>> gcc/testsuite/
>>> 2020-03-11  Jan W. Jagersma  
>>>
>>>   PR inline-asm/93981
>>>   * g++.target/i386/pr93981.C: New test.
>>>   * g++.dg/eh/pr93981.C: New test.
>> Is this the final version of the patch?  Do we have agreement on the
>> semantics for output operands, particularly memory operands?  The last
>> few messages in the March thread lead me to believe that's still not
>> settled.
> I think it's up to the asm itself to put the correct contents.  For the
> cases where GCC needs to emit copies from outputs (that is,
> if it ever reloads them) the only sensible thing is that those are
> not emitted on the EH edge but only on the fallthru one.
On the non-EH edge everything should work as expected.

We can't know where in the ASM the throw occurred and we don't
model what happens inside the ASM.  I think that combination inherently
means we have no way to reliably know the state of any output operand on
the EH edge.

>
> On GIMPLE this cannot be represented but it means that
> SSA uses of asm defs may not appear on the EH edge
> (I do have some checking patch for this somewhere which
> I think catches one or two existing problems).
Right.  I do wonder if we should have a clobber of the output operands
that are gimple registers on the EH edge though.  I think that most
closely matches the actual semantics.  A checker could see if there's
any SSA_NAME uses that are reached by one of those clobbers.  I think
all you and I aren't in agreement on is whether to issue the clobber on
the EH edge or not and the implications for how we'd check for this case.



>   On RTL if
> the outputs are registers we cannot do any such checking
> of course (no SSA form) and whether the old or the "new"
> value lives is an implementation detail of the asm itself.
Agreed.  I don't see a way to check this in RTL.   I'd like to hope that
if we catch it in gimple that it wouldn't ever be an issue in RTL.   But
I'm sure there'd be some path where it could sneak through ;(

jeff



PING^5 [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2020-11-18 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping^5 for:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html

BR,
Kewen

on 2020/11/2 5:13 PM, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> Gentle ping^4 this:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html
> 
> BR,
> Kewen
> 
> on 2020/10/13 3:06 PM, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> Gentle ping this:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html
>>
>> BR,
>> Kewen
>>
>> on 2020/9/15 3:44 PM, Kewen.Lin via Gcc-patches wrote:
>>> Hi,
>>>
>>> Gentle ping this:
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html
>>>
>>> BR,
>>> Kewen
>>>
>>> on 2020/8/31 1:49 PM, Kewen.Lin via Gcc-patches wrote:
>>>> Hi,
>>>>
>>>> I'd like to gently ping this since IVOPTs part is already to land.
>>>>
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html
>>>>
>>>> BR,
>>>> Kewen
>>>>
>>>> on 2020/5/28 8:19 PM, Kewen.Lin via Gcc-patches wrote:
>>>>>
>>>>> gcc/ChangeLog
>>>>>
>>>>> 2020-MM-DD  Kewen Lin  
>>>>>
>>>>>   * cfgloop.h (struct loop): New field estimated_unroll.
>>>>>   * tree-ssa-loop-manip.c (decide_unroll_const_iter): New function.
>>>>>   (decide_unroll_runtime_iter): Likewise.
>>>>>   (decide_unroll_stupid): Likewise.
>>>>>   (estimate_unroll_factor): Likewise.
>>>>>   * tree-ssa-loop-manip.h (estimate_unroll_factor): New declaration.
>>>>>   * tree-ssa-loop.c (tree_average_num_loop_insns): New function.
>>>>>   * tree-ssa-loop.h (tree_average_num_loop_insns): New declaration.
>>>>>


Minor H8 shift code generation change in preparation for cc0 removal

2020-11-18 Thread Jeff Law via Gcc-patches
So I didn't stay up late to work from Pago Pago this year and beat the
stage1 close, but I do want to flush out the removal of cc0 from the H8
port this cycle.  Given these patches only affect the H8 and the H8
would be killed this cycle without the conversion, I think this is
suitable even though we're past stage1 close.

This patch addresses an initial codegen issue that would have resulted
in regressions after removal of cc0.  The compare/test eliminate pass is
unable to handle multiple clobbers.  So patterns that clobber a scratch
and also clobber a condition code are never used to eliminate a
compare/test.

The H8 can shift 1 or 2 bits at a time depending on the precise model. 
Not surprisingly we have multiple strategies to implement shifts, some
of which clobber scratch registers -- but we have a clobber on every
shift insn and as a result they can not participate in compare/test
removal once cc0 is removed from the port.
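
To make the "1 or 2 bits at a time" remark concrete, here is a toy cost model that counts how many shift instructions an inline sequence would need. It is an assumption-laden sketch: the greedy decomposition and the function name are invented for illustration, and the real H8 backend chooses among several strategies (some using scratch registers) that this does not model.

```c
/* Number of shift instructions needed to shift by COUNT bits inline when
   the widest available shift instruction moves MAX_STEP bits (1 or 2).
   Greedy decomposition only; not the strategy tables of the real port.  */
unsigned
toy_inline_shift_insns (unsigned count, unsigned max_step)
{
  unsigned insns = 0;

  /* Treat a nonsensical step of 0 as 1 so the loop always terminates.  */
  if (max_step < 1)
    max_step = 1;

  while (count > 0)
    {
      /* Use the widest step that still fits, else fall back to 1 bit.  */
      unsigned step = count >= max_step ? max_step : 1;
      count -= step;
      insns++;
    }
  return insns;
}
```

Under this model a 5-bit shift takes 5 instructions with 1-bit shifts and 3 with 2-bit shifts, and neither sequence needs a scratch register.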

This patch removes the clobber in the initial code generation in cases
where it's obviously not needed allowing those shifts to participate in
compare/test removal in a future patch.  It has the advantage that it
also generates slightly better code.  By installing this now the removal
of cc0 is a smaller patch, but more importantly, it allows for a more
direct comparison of the generated code before/after cc0 removal.

I've had my tester test before/after this patch with no regressions on
the major H8 multilibs.  I've also spot checked the generated code and
as expected it's ever-so-slightly better after this patch.

I'll be installing this on the trunk momentarily.  More patches will
follow, though probably not in rapid succession as my time to push this
stuff is very limited.

Jeff
commit 700337494e1b0d5ff608e1a3c77852381e264653
Author: Jeff Law 
Date:   Wed Nov 18 21:01:06 2020 -0700

Minor H8 shift code generation change in preparation for cc0 removal

gcc/

* config/h8300/constraints.md (R constraint): Add argument to call
to h8300_shift_needs_scratch_p.
(S and T constraints): Similarly.
* config/h8300/h8300-protos.h: Update h8300_shift_needs_scratch_p
prototype.
* config/h8300/h8300.c (expand_a_shift): Emit a different pattern
if the shift does not require a scratch register.
(h8300_shift_needs_scratch_p): Refine to be more accurate.
* config/h8300/shiftrotate.md (shiftqi_noscratch): New pattern.
(shifthi_noscratch, shiftsi_noscratch): Similarly.

diff --git a/gcc/config/h8300/constraints.md b/gcc/config/h8300/constraints.md
index d24518225f8..1d80152ce41 100644
--- a/gcc/config/h8300/constraints.md
+++ b/gcc/config/h8300/constraints.md
@@ -152,7 +152,7 @@
 (define_constraint "R"
   "@internal"
   (and (match_code "const_int")
-   (match_test "!h8300_shift_needs_scratch_p (ival, QImode)")))
+   (match_test "!h8300_shift_needs_scratch_p (ival, QImode, CLOBBER)")))
 
 (define_constraint "C"
   "@internal"
@@ -161,12 +161,12 @@
 (define_constraint "S"
   "@internal"
   (and (match_code "const_int")
- 

Re: [PATCH] PowerPC Fix ibm128 defaults for pr70117.c test.

2020-11-18 Thread Paul A. Clarke via Gcc-patches
On Wed, Nov 18, 2020 at 04:29:09PM -0600, Segher Boessenkool wrote:
> On Wed, Nov 18, 2020 at 10:53:49PM +0100, Jakub Jelinek wrote:
> > On Wed, Nov 18, 2020 at 03:43:20PM -0600, Segher Boessenkool wrote:
> > > On Sun, Nov 15, 2020 at 12:17:47PM -0500, Michael Meissner wrote:
> > > > --- a/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > > > +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > > > @@ -9,9 +9,11 @@
> > > > 128-bit floating point, because the type is not enabled on those
> > > > systems.  */
> > > >  #define LDOUBLE __ibm128
> > > > +#define IBM128_MAX ((__ibm128) 1.79769313486231580793728971405301199e+308L)
> > > 
> > > This is the IEEE QP float number 43fef780 which
> > > I very much doubt is the maximum finite double-double?  See the 0 in the
> > 
> > Numbers without the 0 in the middle-end aren't valid, see
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95450#c6
> > for more details.  Without the 0 in the middle the double double number
> > rounded to double would require increasing the higher double, and as it is
> > the largest representable finite number, that is not possible.
> 
> Ah, in that way.  Tricky.
> 
> Mike, please add a comment, what number it represents?  Okay for trunk
> with that, thanks.
> 
> (Should those not be defined in some header though?)

Would it be better to represent the number in hex, like with printf's '%a'
formatting (e.g. "0x1.921fb54442d18p+0"...this is NOT the same value)?

(I always get nervous when I see a long float hardcoded in decimal.)
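
A small sketch of why hex notation is attractive here: a value printed with '%a' parses back to exactly the same bits, and a hex literal shows the significand directly. The example uses plain double rather than __ibm128, assumes double is IEEE binary64, and the function names are made up.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats VALUE with "%a" and parses the text back with strtod.  */
double
toy_roundtrip_hex (double value)
{
  char buf[64];
  snprintf (buf, sizeof buf, "%a", value);
  return strtod (buf, NULL);
}

/* Largest finite double written as a hex literal: 53 significand bits
   all set, exponent 1023.  Assumes IEEE binary64.  */
double
toy_largest_double (void)
{
  return 0x1.fffffffffffffp+1023;
}
```

Under the binary64 assumption toy_largest_double() compares equal to DBL_MAX, and the round trip is exact for any finite input.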

PC


Re: [PATCH] rs6000, vector integer multiply/divide/modulo instructions

2020-11-18 Thread David Edelsohn via Gcc-patches
On Wed, Nov 4, 2020 at 11:44 AM Carl Love  wrote:
>
> David:
>
> I have reworked the patch moving the new vector instruction patterns to
> vsx.md.  Also, cleaned up the vector division instructions.  The
> div3 pattern definitions are the only ones that should be
> defined.
>
> I have retested the patch on:
>
>powerpc64le-unknown-linux-gnu (Power 9 LE)
>
> with no regressions. Additionally the new test case was compiled and
> executed by hand on Mambo to verify the test case passes.
>
> Please let me know if this patch is acceptable for mainline.  Thanks.
>
> Carl Love
>
> --
>
> 2020-11-02  Carl Love  
>
> gcc/
> * config/rs6000/altivec.h (vec_mulh, vec_div, vec_dive, vec_mod): New
> defines.
> * config/rs6000/altivec.md (VIlong): Move define to file vsx.md.
> * config/rs6000/rs6000-builtin.def (VDIVES_V4SI, VDIVES_V2DI,
> VDIVEU_V4SI, VDIVEU_V2DI, VDIVS_V4SI, VDIVS_V2DI, VDIVU_V4SI,
> VDIVU_V2DI, VMODS_V2DI, VMODS_V4SI, VMODU_V2DI, VMODU_V4SI,
> VMULHS_V2DI, VMULHS_V4SI, VMULHU_V2DI, VMULHU_V4SI, VMULLD_V2DI):
> Add builtin define.
> (VMUL, VMULH, VDIVE, VMOD):  Add new BU_P10_OVERLOAD_2 definitions.
> * config/rs6000/rs6000-call.c (VSX_BUILTIN_VEC_DIV,
> P10_BUILTIN_VEC_VDIVE, P10_BUILTIN_VEC_VMOD, P10_BUILTIN_VEC_VMULH):
> New overloaded definitions.
> (builtin_function_type) [P10V_BUILTIN_VDIVEU_V4SI,
> P10V_BUILTIN_VDIVEU_V2DI, P10V_BUILTIN_VDIVU_V4SI,
> P10V_BUILTIN_VDIVU_V2DI, P10V_BUILTIN_VMODU_V2DI,
> P10V_BUILTIN_VMODU_V4SI, P10V_BUILTIN_VMULHU_V2DI,
> P10V_BUILTIN_VMULHU_V4SI, P10V_BUILTIN_VMULLD_V2DI]: Add case
> statement for builtins.
> * config/rs6000/vsx.md (VIlong_char): Add define_mod_attribute.
> (UNSPEC_VDIVES, UNSPEC_VDIVEU,
> UNSPEC_VMULHS, UNSPEC_VMULHU, UNSPEC_VMULLD): Add enum for UNSPECs.
> (vsx_mul_v2di, vsx_udiv_v2di): Add if TARGET_POWER10 statement.
> (vdives_, vdiveu_, vdiv3, uuvdiv3,
> vmods_, vmodu_, vmulhs_, vmulhu_, mulv2di3):
> Add define_insn, mode is VIlong.
> * doc/extend.texi (vec_mulh, vec_mul, vec_div, vec_dive, vec_mod): Add
> builtin descriptions.
>
> gcc/testsuite/
> * gcc.target/powerpc/builtins-1-p10-runnable.c: New test file.

Hi, Carl

Thanks for making the changes.  This looks okay to me now.  I don't
know if Segher has any additional requests.

Thanks, David


Re: [PATCH] pru: Add builtins for HALT and LMBD

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/13/20 1:07 PM, Dimitar Dimitrov wrote:
> Add builtins for HALT and LMBD, per Texas Instruments document
> SPRUHV7C.  Use the new LMBD pattern to define an expand for clz.
>
> Binutils [1] and sim [2] support for LMBD instruction are merged now.
>
> [1] https://sourceware.org/pipermail/binutils/2020-October/113901.html
> [2] https://sourceware.org/pipermail/gdb-patches/2020-November/173141.html
>
> gcc/ChangeLog:
>
>   * config/pru/alu-zext.md: Add lmbd patterns for zero_extend
>   variants.
>   * config/pru/pru.c (enum pru_builtin): Add HALT and LMBD.
>   (pru_init_builtins): Ditto.
>   (pru_builtin_decl): Ditto.
>   (pru_expand_builtin): Ditto.
>   * config/pru/pru.h (CLZ_DEFINED_VALUE_AT_ZERO): Define PRU
>   value for CLZ with zero value parameter.
>   * config/pru/pru.md: Add halt, lmbd and clz patterns.
>   * doc/extend.texi: Document PRU builtins.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/pru/halt.c: New test.
>   * gcc.target/pru/lmbd.c: New test.
OK.  Please commit if you haven't already.

Thanks,
jeff



Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/17/20 9:58 AM, Jakub Jelinek wrote:
> On Tue, Nov 17, 2020 at 09:54:46AM -0700, Jeff Law wrote:
>>> So, e.g. if we had __builtin_warning (dunno where Martin S. is with that),
>>> we could e.g. queue a __builtin_warning and add __builtin_unreachable (or
>>> other possibilities), or e.g. during VRP just canonicalize proven always
>>> out of bound shifts to shifts by an out of bound constant and let some later
>>> pass warn and/or add __builtin_warning.
>> So the idea is to start funneling this through the path isolation code
>> and handle the various strategies there.
> If the path isolation code would use the ranger for this, it wouldn't need
> to be in VRP but could be anywhere, sure.
Good point.  I'll have to get used to having ranges available anywhere :-)

Jeff
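
For readers following along: a shift whose count is at least the width of
the (promoted) left operand is undefined behavior in C, which is why the
thread weighs warnings, path isolation and __builtin_unreachable.  Below is
a minimal sketch of a guarded shift.  The name checked_shl is made up for
illustration and is not a GCC function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: shift VALUE left by COUNT only
   when COUNT is smaller than the width of the type.  Returns false and
   leaves *RESULT untouched for the out-of-bounds counts that C leaves
   undefined.  */
static bool
checked_shl (uint32_t value, unsigned int count, uint32_t *result)
{
  const unsigned int width = sizeof (uint32_t) * CHAR_BIT;
  if (count >= width)
    return false;
  *result = (uint32_t) (value << count);
  return true;
}
```

A caller tests the return value before using the result, so the
out-of-bounds case is handled explicitly instead of being left to the
optimizers.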



Re: [PATCH 1/2] correct BB frequencies after loop changed

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/18/20 12:28 AM, Richard Biener wrote:
> On Tue, 17 Nov 2020, Jeff Law wrote:
>
>> Minor questions for Jan and Richi embedded below...
>>
>> On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
>>> When investigating the issue from 
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
>>> I find that the BB COUNTs of a loop are not accurate in some cases.
>>> For example:
>>>
>>> In below figure:
>>>
>>>
>>>        COUNT:268435456  pre-header
>>>              |
>>>              |  .------------------.
>>>              |  |                  |
>>>              V  v                  |
>>>        COUNT:805306369             |
>>>              / \                   |
>>>          33%/   \                  |
>>>            /     \                 |
>>>           v       v                |
>>> COUNT:268435456  COUNT:536870911   |
>>> exit-edge         |   latch        |
>>>                   .________________.
>>>
>>> Those COUNTs have below equations:
>>> COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
>>> COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33%
>>> COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of latch:536870911
>>>
>>>
>>> While after pcom:
>>>
>>>        COUNT:268435456  pre-header
>>>              |
>>>              |  .------------------.
>>>              |  |                  |
>>>              V  v                  |
>>>        COUNT:268435456             |
>>>              / \                   |
>>>          50%/   \                  |
>>>            /     \                 |
>>>           v       v                |
>>> COUNT:134217728  COUNT:134217728   |
>>> exit-edge         |   latch        |
>>>                   .________________.
>>>
>>> COUNT of header != COUNT of pre-header + COUNT of latch
>>> COUNT of exit-edge != COUNT of pre-header
>>>
>>> In some cases, the probability of the exit edge is easy to estimate, and
>>> then the COUNTs of the other BBs in the loop can be recalculated.
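
The relations above can be checked with a small calculation.  This is only
an illustration, assuming the exit probability is given as a fraction
NUM/DEN.  The function names are made up and this is not the
recompute_loop_frequencies function from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC code.  If a loop is entered PREHEADER
   times and every entry leaves through the single exit edge, then
   COUNT of exit-edge = COUNT of pre-header, and
   COUNT of header = COUNT of exit-edge / exit probability.
   The exit probability is passed as the fraction NUM / DEN.  */
static uint64_t
header_count_from_exit (uint64_t preheader, uint64_t num, uint64_t den)
{
  assert (num != 0 && num <= den);
  return preheader * den / num;
}

/* COUNT of latch = COUNT of header - COUNT of pre-header.  */
static uint64_t
latch_count (uint64_t header, uint64_t preheader)
{
  return header - preheader;
}
```

With a pre-header count of 268435456 and an exit probability of 1/2, this
gives a header count of 536870912 and a latch count of 268435456, rather
than the 268435456 and 134217728 shown in the figure after pcom.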
>>>
>>> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
>>>
>>> Jiufu
>>>
>>> gcc/ChangeLog:
>>> 2020-10-09  Jiufu Guo   
>>>
>>> * cfgloopmanip.h (recompute_loop_frequencies): New function.
>>> * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
>>> * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
>>> recompute_loop_frequencies.
>>>
>>> ---
>>>  gcc/cfgloopmanip.c| 53 +++
>>>  gcc/cfgloopmanip.h|  2 +-
>>>  gcc/tree-ssa-loop-manip.c | 28 +++--
>>>  3 files changed, 57 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
>>> index 73134a20e33..b0ca82a67fd 100644
>>> --- a/gcc/cfgloopmanip.c
>>> +++ b/gcc/cfgloopmanip.c
>>> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "gimplify-me.h"
>>>  #include "tree-ssa-loop-manip.h"
>>>  #include "dumpfile.h"
>>> +#include "cfgrtl.h"
>>>  
>>>  static void copy_loops_to (class loop **, int,
>>>class loop *);
>>> @@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
>>>  
>>>return nloop;
>>>  }
>>> +
>>> +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
>>> +   is NEW_PROB.  */
>>> +
>>> +bool
>>> +recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
>>> +{
>>> +  edge exit = single_exit (loop);
>>> +  if (!exit)
>>> +return false;
>>> +
>>> +  edge e;
>>> +  edge_iterator ei;
>>> +  edge non_exit;
>>> +  basic_block * bbs;
>>> +  profile_count exit_count = loop_preheader_edge (loop)->count ();
>>> +  profile_probability exit_p = exit_count.probability_in 
>>> (loop->header->count);
>>> +  profile_count base_count = loop->header->count;
>>> +  profile_count after_num = base_count.apply_probability (exit_p);
>>> +  profile_count after_den = base_count.apply_probability (new_prob);
>>> +
>>> +  /* Update BB counts in loop body.
>>> + COUNT of exit-edge = COUNT of pre-header
>>> + COUNT of exit-edge = COUNT of header * exit_edge_probability
>>> + The new COUNT = old COUNT * old_exit_p / new_prob.  */
>>> +  bbs = get_loop_body (loop);
>>> +  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
>>> +after_den);
>>> +  free (bbs);
>>> +
>>> +  /* Update probability and count of the BB besides exit edge (maybe 
>>> latch).  */
>>> +  FOR_EACH_EDGE (e, ei, exit->src->succs)
>>> +if (e != exit)
>>> +  break;
>>> +  non_exit = e;
>> Are we sure that exit->src has just two successors (will that case be
>> canonicalized before we get here?).  If it has > 2 successors, then I'm
>> pretty sure the frequencies get mucked up.  Richi could probably answer
>> whether or not the block with the loop exit 

Re: [PATCH] tighten up attribute access validation (PR 97879)

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/18/20 3:41 PM, Martin Sebor via Gcc-patches wrote:
> The access attribute handler doesn't check to make sure the mode
> argument is an identifier and readily accepts string arguments
> which are assumed to be the condensed internal representation
> the user attribute is translated to.  This can cause all sorts
> of unintended behavior when the user supplies a bogus string,
> either by accident or in an effort to break things.
>
> The attached patch tightens up the attribute handler to reject
> strings and any other modes that aren't the expected identifiers.
> It distinguishes the internal strings by introducing a new flag,
> ATTR_FLAG_INTERNAL, and calling decl_attributes() with it.
>
> Martin
>
> gcc-97879.diff
>
> PR middle-end/97879 - ICE on invalid mode in attribute access
>
> gcc/c-family/ChangeLog:
>
>   PR middle-end/97879
>   * c-attribs.c (handle_access_attribute): Handle ATTR_FLAG_INTERNAL.
>   Error out on invalid modes.
>
> gcc/ChangeLog:
>
>   PR middle-end/97879
>   * tree-core.h (enum attribute_flags): Add ATTR_FLAG_INTERNAL.
>
> gcc/testsuite/ChangeLog:
>
>   PR middle-end/97879
>   * gcc.dg/attr-access-3.c: New test.
OK
jeff
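
For reference, the documented form of the attribute takes an identifier as
the mode, not a string.  Below is a small sketch of valid usage.  The
function name sum_ints is made up, and the __has_attribute guard is an
assumption added so the snippet still compiles on compilers that lack the
attribute.

```c
#include <assert.h>
#include <stddef.h>

#if defined (__has_attribute)
# if __has_attribute (access)
#  define ATTR_ACCESS(...) __attribute__ ((access (__VA_ARGS__)))
# endif
#endif
#ifndef ATTR_ACCESS
# define ATTR_ACCESS(...)
#endif

/* The mode is the identifier read_only; argument 1 is the pointer and
   argument 2 is the number of elements it points to.  */
ATTR_ACCESS (read_only, 1, 2)
static int
sum_ints (const int *p, size_t n)
{
  int total = 0;
  for (size_t i = 0; i < n; i++)
    total += p[i];
  return total;
}
```

Writing the mode as a string, as in access ("read_only", 1, 2), is the
form the patch rejects.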



[PATCH] tighten up attribute access validation (PR 97879)

2020-11-18 Thread Martin Sebor via Gcc-patches

The access attribute handler doesn't check to make sure the mode
argument is an identifier and readily accepts string arguments
which are assumed to be the condensed internal representation
the user attribute is translated to.  This can cause all sorts
of unintended behavior when the user supplies a bogus string,
either by accident or in an effort to break things.

The attached patch tightens up the attribute handler to reject
strings and any other modes that aren't the expected identifiers.
It distinguishes the internal strings by introducing a new flag,
ATTR_FLAG_INTERNAL, and calling decl_attributes() with it.
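
The idea can be modelled outside the compiler.  The sketch below is a
simplified stand-in, not the c-attribs.c code: the two enum values mirror
the patch, but handle_mode, its parameters and the sample strings are made
up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the two flag values visible in the patch.  */
enum attribute_flags_model
{
  MODEL_ATTR_FLAG_CXX11 = 32,
  MODEL_ATTR_FLAG_INTERNAL = 64
};

/* Hypothetical handler.  IS_STRING says whether the mode argument was
   written as a string literal.  Strings are accepted only on the
   internal, recursive call; user code must spell one of the four
   identifiers.  */
static bool
handle_mode (const char *mode, bool is_string, int flags)
{
  if (is_string)
    return (flags & MODEL_ATTR_FLAG_INTERNAL) != 0;

  static const char *const valid[]
    = { "read_only", "read_write", "write_only", "none" };
  for (size_t i = 0; i < sizeof valid / sizeof valid[0]; i++)
    if (strcmp (mode, valid[i]) == 0)
      return true;
  return false;
}
```

Because the flag is only ever set by the handler's own recursive call, a
string supplied in source code never takes the internal path.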

Martin
PR middle-end/97879 - ICE on invalid mode in attribute access

gcc/c-family/ChangeLog:

	PR middle-end/97879
	* c-attribs.c (handle_access_attribute): Handle ATTR_FLAG_INTERNAL.
	Error out on invalid modes.

gcc/ChangeLog:

	PR middle-end/97879
	* tree-core.h (enum attribute_flags): Add ATTR_FLAG_INTERNAL.

gcc/testsuite/ChangeLog:

	PR middle-end/97879
	* gcc.dg/attr-access-3.c: New test.


diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index b979fbcc0c6..813b90a465c 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -4289,8 +4289,8 @@ append_access_attr (tree node[3], tree attrs, const char *attrstr,
the attribute and its arguments into a string.  */
 
 static tree
-handle_access_attribute (tree node[3], tree name, tree args,
-			 int ARG_UNUSED (flags), bool *no_add_attrs)
+handle_access_attribute (tree node[3], tree name, tree args, int flags,
+			 bool *no_add_attrs)
 {
   tree attrs = TYPE_ATTRIBUTES (*node);
   tree type = *node;
@@ -4336,15 +4336,19 @@ handle_access_attribute (tree node[3], tree name, tree args,
 
 	  /* Recursively call self to "replace" the documented/external
 	 form of the attribute with the condensend internal form.  */
-	  decl_attributes (node, axsat, flags);
+	  decl_attributes (node, axsat, flags | ATTR_FLAG_INTERNAL);
 	  return NULL_TREE;
 	}
 
-  /* This is a recursive call to handle the condensed internal form
-	 of the attribute (see below).  Since all validation has been
-	 done simply return here, accepting the attribute as is.  */
-  *no_add_attrs = false;
-  return NULL_TREE;
+  if (flags & ATTR_FLAG_INTERNAL)
+	{
+	  /* This is a recursive call to handle the condensed internal
+	 form of the attribute (see below).  Since all validation
+	 has been done simply return here, accepting the attribute
+	 as is.  */
+	  *no_add_attrs = false;
+	  return NULL_TREE;
+	}
 }
 
   /* Set to true when the access mode has the form of a function call
@@ -4363,6 +4367,13 @@ handle_access_attribute (tree node[3], tree name, tree args,
   access_mode = DECL_NAME (access_mode);
   funcall = true;
 }
+  else if (TREE_CODE (access_mode) != IDENTIFIER_NODE)
+{
+  error ("attribute %qE mode %qE is not an identifier; expected one of "
+	 "%qs, %qs, %qs, or %qs", name, access_mode,
+	 "read_only", "read_write", "write_only", "none");
+  return NULL_TREE;
+}
 
   const char* const access_str = IDENTIFIER_POINTER (access_mode);
   const char *ps = access_str;
@@ -4573,7 +4584,7 @@ handle_access_attribute (tree node[3], tree name, tree args,
 
   /* Recursively call self to "replace" the documented/external form
  of the attribute with the condensed internal form.  */
-  decl_attributes (node, new_attrs, flags);
+  decl_attributes (node, new_attrs, flags | ATTR_FLAG_INTERNAL);
   return NULL_TREE;
 }
 
diff --git a/gcc/testsuite/gcc.dg/attr-access-3.c b/gcc/testsuite/gcc.dg/attr-access-3.c
new file mode 100644
index 000..8c793bf4b82
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/attr-access-3.c
@@ -0,0 +1,18 @@
+/* PR middle-end/97879 - ICE on invalid mode in attribute access
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+#define A(...) __attribute__ ((access (__VA_ARGS__)))
+
+A (" ", 1) void f1 (int *);   // { dg-error "attribute 'access' mode '\" \"' is not an identifier; expected one of 'read_only', 'read_write', 'write_only', or 'none'" }
+   void f1 (int *);
+
+
+A ("none", 1) void f2 (char *);   // { dg-error "not an identifier" }
+  void f2 (char *);
+
+A (1) void f3 (); // { dg-error "not an identifier" }
+
+A (1, 2) void f4 ();  // { dg-error "not an identifier" }
+A (2., 3.) void f5 ();// { dg-error "not an identifier" }
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index c9280a8d3b1..313a6af2253 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -859,7 +859,10 @@ enum attribute_flags {
  are not in fact compatible with the function type.  */
   ATTR_FLAG_BUILT_IN = 16,
   /* A given attribute has been parsed as a C++-11 attribute.  */
-  ATTR_FLAG_CXX11 = 32
+  ATTR_FLAG_CXX11 = 32,
+  /* The attribute handler is being invoked with an internal argument
+ that may not otherwise be valid when specified in source code.  */
+  ATTR_FLAG_INTERNAL = 64
 };
 
 /* 

Re: [PATCH] PowerPC Fix ibm128 defaults for pr70117.c test.

2020-11-18 Thread Segher Boessenkool
On Wed, Nov 18, 2020 at 10:53:49PM +0100, Jakub Jelinek wrote:
> On Wed, Nov 18, 2020 at 03:43:20PM -0600, Segher Boessenkool wrote:
> > Hi!
> > 
> > On Sun, Nov 15, 2020 at 12:17:47PM -0500, Michael Meissner wrote:
> > > --- a/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > > +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > > @@ -9,9 +9,11 @@
> > > 128-bit floating point, because the type is not enabled on those
> > > systems.  */
> > >  #define LDOUBLE __ibm128
> > > +#define IBM128_MAX ((__ibm128) 1.79769313486231580793728971405301199e+308L)
> > 
> > This is the IEEE QP float number 43fef780 which
> > I very much doubt is the maximum finite double-double?  See the 0 in the
> 
> Numbers without the 0 in the middle aren't valid, see
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95450#c6
> for more details.  Without the 0 in the middle the double double number
> rounded to double would require increasing the higher double, and as it is
> the largest representable finite number, that is not possible.

Ah, in that way.  Tricky.

Mike, please add a comment, what number it represents?  Okay for trunk
with that, thanks.

(Should those not be defined in some header though?)


Segher


Re: Support to check vliw overlapping register constraint created by regrename, please help to review, thanks

2020-11-18 Thread Jeff Law via Gcc-patches



On 6/20/20 5:18 AM, Zhongyunde wrote:
> On some targets, two insns that modify the same register cannot be issued
> together.  (Insn 73 starts with insn:TI, so it is issued together with the
> insns that follow it, such as insn 71, until a new insn starting with
> insn:TI appears.)
> Regrename knows that mode V2VF in insn 73 needs two consecutive
> registers, i.e. v2 and v3; here is a dump snippet from before regrename.
>
> (insn:TI 73 76 71 4 (set (reg/v:V2VF 37 v2 [orig:180 _62 ] [180])
> (unspec:V2VF [
> (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
> (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
> ] UNSPEC_HFSQMAG_32X32)) "../test_modify.c":57 710 {hfsqmag_v2vf}
>  (expr_list:REG_DEAD (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
> (expr_list:REG_UNUSED (reg:VHF 38 v3)
> (expr_list:REG_STAGE (const_int 2 [0x2])
> (expr_list:REG_CYCLE (const_int 2 [0x2])
> (expr_list:REG_UNITS (const_int 256 [0x100])
> (nil)))
>
> (insn 71 73 243 4 (set (reg:VHF 43 v8 [orig:265 MEM[(const vfloat32x16 
> *)Src_base_134] ] [265])
> (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const 
> vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 
> {movvhf_internal}
>  (expr_list:REG_STAGE (const_int 1 [0x1])
> (expr_list:REG_CYCLE (const_int 2 [0x2])
> (nil
>
> Then, in regrename, insn 71 is transformed into the following code, which
> uses register v3, so there is a conflict between insn 73 and insn 71, as
> both of them set the v3 register.
>
> Register v2 (2): 73 [SVEC_REGS]
> Register v8 (1): 71 [VEC_ALL_REGS]
>
> 
>
> (insn 71 73 243 4 (set (reg:VHF 38 v3 [orig:265 MEM[(const vfloat32x16 
> *)Src_base_134] ] [265])
> (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const 
> vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 
> {movvhf_internal}
>  (expr_list:REG_STAGE (const_int 1 [0x1])
> (expr_list:REG_CYCLE (const_int 2 [0x2])
>
> 2644.diff
>
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> index c38173a77..8eea34dd8 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -284,6 +284,48 @@ create_new_chain (unsigned this_regno, unsigned 
> this_nregs, rtx *loc,
>return head;
>  }
>  
> +/* For a def-use chain HEAD, find which registers with REG_UNUSED
> +   conflict its VLIW constraint and set the corresponding bits in *PSET.  */
> +
> +static void
> +merge_vliw_overlapping_regs (HARD_REG_SET *pset, struct du_head *head)
> +{
> +  rtx_insn *insn;
> +  rtx_insn *cur_insn = head->first->insn;
> +  basic_block bb = BLOCK_FOR_INSN (cur_insn);
> +
> +  /* Only SMS related basic block may generate VLIW before regrename pass.  
> */
> +  if ((bb->flags & BB_DISABLE_SCHEDULE) == 0 || (bb->flags & BB_MODIFIED) != 
> 0)
> +return;
Presumably this is to keep the cost down.  But ISTM that it might be
possible for a target to get TImode set via a peep2.  Similarly if a
plugin hooks into the pass manager to insert its own passes.  And it may
be the case that we could add other passes that might set TImode on the
insn in the future.  So I don't think this is necessarily a good test.

Closely related, I think we insert the TImode insn markers independent
of the target in the selective scheduler.  So your patch could pessimize
other targets that use the selective scheduler.



So, I think the target files need to indicate they need to constrain
regrename's behavior.  This probably means a target hook that your
target will need to set.   You could then test that target hook rather
than looking at the bb flags.
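
One possible shape for such a hook, sketched as a standalone model.  All
names here are hypothetical; this is not an existing GCC target hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical table of target hooks, reduced to the one of interest.  */
struct target_hooks_model
{
  /* Return true if regrename must avoid registers that conflict with
     other insns issued in the same VLIW bundle.  */
  bool (*regrename_check_bundle_conflicts) (void);
};

/* Default for targets that do not care: no extra constraint, so they
   are not pessimized by the check.  */
static bool
default_no_bundle_conflicts (void)
{
  return false;
}

/* What a VLIW target would install instead.  */
static bool
vliw_bundle_conflicts (void)
{
  return true;
}

/* The pass asks the hook instead of inspecting basic-block flags.  */
static bool
pass_should_check_conflicts (const struct target_hooks_model *hooks)
{
  return hooks->regrename_check_bundle_conflicts ();
}
```

With a default that returns false, other targets, including those that use
the selective scheduler, keep their current regrename behavior.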

 

> +
> +  for (insn = prev_active_insn (cur_insn); insn && BLOCK_FOR_INSN (insn) == 
> bb;
> +   insn = prev_active_insn (insn))
> +{
> +  rtx note;
> +
> +  for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
> +{
> +  if (REG_NOTE_KIND (note) == REG_UNUSED)
So does this only happen when there's a REG_UNUSED note?  That's useful
to know.


> +  {
> +int regno = REGNO (XEXP (note, 0));
> +bool old = TEST_HARD_REG_BIT(*pset, regno);
> +SET_HARD_REG_BIT (*pset, nregs);
So can't XEXP (note, 0) have a mode which would cover multiple
registers?  And if so, don't we need to iterate over all the covered
registers?
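
A sketch of the iteration being asked about, using a plain bitmask in place
of HARD_REG_SET.  The helper name and the explicit NREGS parameter are
assumptions for illustration; in GCC the register count would come from the
mode of the REG.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for HARD_REG_SET: one bit per hard register, up to 64.  */
typedef uint64_t reg_set_model;

/* Mark every hard register covered by a value that starts at REGNO and
   occupies NREGS consecutive registers.  */
static void
mark_covered_regs (reg_set_model *set, unsigned int regno,
                   unsigned int nregs)
{
  assert (regno + nregs <= 64);
  for (unsigned int r = regno; r < regno + nregs; r++)
    *set |= (reg_set_model) 1 << r;
}
```

For the V2VF value in the dump, which starts at hard register 37 (v2) and
needs two registers, this marks both 37 and 38 (v3).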


Jeff



Re: [PATCH] middle-end: Fix PR middle-end/85811: Introduce tree_expr_maybe_nan_p et al.

2020-11-18 Thread Jeff Law via Gcc-patches



On 8/15/20 5:10 AM, Roger Sayle wrote:
> The motivation for this patch is PR middle-end/85811, a wrong-code
> regression entitled "Invalid optimization with fmax, fabs and nan".
> The optimization involves assuming max(x,y) is non-negative if (say)
> y is non-negative, i.e. max(x,2.0).  Unfortunately, this is an invalid
> assumption in the presence of NaNs.  Hence max(x,+qNaN), with IEEE fmax
> semantics will always return x even though the qNaN is non-negative.
> Worse, max(x,2.0) may return a negative value if x is -sNaN.
>
> I'll quote Joseph Myers (many thanks) who describes things clearly as:
>> (a) When both arguments are NaNs, the return value should be a qNaN,
>> but sometimes it is an sNaN if at least one argument is an sNaN.
>> (b) Under TS 18661-1 semantics, if either argument is an sNaN then the
>> result should be a qNaN (whereas if one argument is a qNaN and the
>> other is not a NaN, the result should be the non-NaN argument).
>> Various implementations treat sNaNs like qNaNs here.
> Under this logic, the tree_expr_nonnegative_p for IEEE fmax should be:
>
> CASE_CFN_FMAX:
> CASE_CFN_FMAX_FN:
>   /* Usually RECURSE (arg0) || RECURSE (arg1) but NaNs complicate
>  things.  In the presence of sNaNs, we're only guaranteed to be
>  non-negative if both operands are non-negative.  In the presence
>  of qNaNs, we're non-negative if either operand is non-negative
>  and can't be a qNaN, or if both operands are non-negative.  */
>   if (tree_expr_maybe_signaling_nan_p (arg0) ||
>   tree_expr_maybe_signaling_nan_p (arg1))
> return RECURSE (arg0) && RECURSE (arg1);
>   return RECURSE (arg0) ? (!tree_expr_maybe_nan_p (arg0)
>   || RECURSE (arg1))
> : (RECURSE (arg1)
>   && !tree_expr_maybe_nan_p (arg1));
>
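
The decision logic quoted above can be modelled as a small standalone
function.  This is a sketch only: struct operand_facts and
fmax_nonnegative_p are made-up names that stand in for the RECURSE and
tree_expr_maybe_*_p queries, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* What is assumed to be known about one operand at compile time.  */
struct operand_facts
{
  bool nonnegative;   /* known to be non-negative */
  bool maybe_nan;     /* may be a quiet or signaling NaN */
  bool maybe_snan;    /* may be a signaling NaN */
};

/* Model of the quoted logic: is fmax (a, b) known to be non-negative?  */
static bool
fmax_nonnegative_p (struct operand_facts a, struct operand_facts b)
{
  /* With a possible sNaN, both operands must be non-negative.  */
  if (a.maybe_snan || b.maybe_snan)
    return a.nonnegative && b.nonnegative;
  /* Otherwise one non-negative operand is enough, provided it cannot be
     a NaN that fmax would discard in favor of the other operand.  */
  return a.nonnegative
         ? (!a.maybe_nan || b.nonnegative)
         : (b.nonnegative && !b.maybe_nan);
}
```

For example, fmax (x, 2.0) with x of unknown sign and no NaNs is accepted,
while fmax (x, y) with y a possibly-NaN non-negative value is not.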
> Which indeed resolves the wrong code in the PR.  The infrastructure that
> makes this possible are the two new functions tree_expr_maybe_nan_p and
> tree_expr_maybe_signaling_nan_p which test whether a value may potentially
> be a NaN or a signaling NaN respectively.  In fact, this patch adds seven
> new predicates to the middle-end:
>
> bool tree_expr_finite_p (const_tree);
> bool tree_expr_infinite_p (const_tree);
> bool tree_expr_maybe_infinite_p (const_tree);
> bool tree_expr_signaling_nan_p (const_tree);
> bool tree_expr_maybe_signaling_nan_p (const_tree);
> bool tree_expr_nan_p (const_tree);
> bool tree_expr_maybe_nan_p (const_tree);
>
> These functions correspond to the "must" and "may" operators in modal logic,
> and allow us to triage expressions in the middle-end; definitely a NaN,
> definitely not a NaN, and unknown at compile-time, etc.  A prime example of
> the utility of these functions is that an IEEE floating point value promoted
> from an integer type can't be a NaN or infinite.  Hence (double)i+0.0 where
> i is an integer can be simplified to (double)i even with -fsignaling-nans.
> Currently in GCC optimizations are enabled/disabled based on whether the
> expression's type supports NaNs or sNaNs; with these new predicates they
> can be controlled by whether the actual operands may or may not be NaNs.
>
> Having added these extremely useful helper functions to the middle-end,
> I couldn't help but use them in a few places in fold-const.c, builtins.c
> and match.pd.  In the near term, these can/should be used in places
> where the tree optimizers test for HONOR_NANS, HONOR_INFINITIES or
> HONOR_SNANS, or explicitly test whether a REAL_CST is a NaN or Inf.
> In the longer term (I'm not volunteering) these predicates could perhaps
> be hooked into the middle-end's SSA chaining and/or VRP machinery,
> allowing finiteness to be propagated around the CFG, much like we
> currently propagate value ranges.
>
> This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
> and "make -k check".
> Ok for mainline?
>
>
> 2020-08-15  Roger Sayle  
>
> gcc/ChangeLog
>   PR middle-end/85811
>   * fold-const.c (tree_expr_finite_p): New function to test whether
>   a tree expression must be finite, i.e. not a FP NaN or infinity.
>   (tree_expr_infinite_p):  New function to test whether a tree
>   expression must be infinite, i.e. a FP infinity.
>   (tree_expr_maybe_infinite_p): New function to test whether a tree
>   expression may be infinite, i.e. a FP infinity.
>   (tree_expr_signaling_nan_p): New function to test whether a tree
>   expression must evaluate to a signaling NaN (sNaN).
>   (tree_expr_maybe_signaling_nan_p): New function to test whether a
>   tree expression may be a signaling NaN (sNaN).
>   (tree_expr_nan_p): New function to test whether a tree expression
>   must evaluate to a (quiet or signaling) NaN.
>   (tree_expr_maybe_nan_p): New function to test whether a tree
>   expression may be a (quiet or signaling) NaN.
>
>   (tree_binary_nonnegative_warnv_p) 

Re: [PATCH] PowerPC Fix ibm128 defaults for pr70117.c test.

2020-11-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 18, 2020 at 03:43:20PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Sun, Nov 15, 2020 at 12:17:47PM -0500, Michael Meissner wrote:
> > --- a/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > @@ -9,9 +9,11 @@
> > 128-bit floating point, because the type is not enabled on those
> > systems.  */
> >  #define LDOUBLE __ibm128
> > +#define IBM128_MAX ((__ibm128) 1.79769313486231580793728971405301199e+308L)
> 
> This is the IEEE QP float number 43fef780 which
> I very much doubt is the maximum finite double-double?  See the 0 in the

Numbers without the 0 in the middle aren't valid, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95450#c6
for more details.  Without the 0 in the middle the double double number
rounded to double would require increasing the higher double, and as it is
the largest representable finite number, that is not possible.

Jakub
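
The rounding argument can be illustrated with plain doubles.  The sketch
below assumes IEEE binary64 arithmetic in round-to-nearest mode with no
excess precision, and it does not use __ibm128 itself; the names are made
up for illustration.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Return true if HI + LO, rounded to double, is still HI, which is the
   property the explanation above relies on for a valid double-double
   pair.  The volatile forces the sum to be rounded to double.  */
static bool
low_part_fits (double hi, double lo)
{
  volatile double sum = hi + lo;
  return sum == hi;
}

/* Half an ulp of DBL_MAX, and the largest double below it.  */
static const double half_ulp_of_max = 0x1p+970;
static const double below_half_ulp = 0x1.fffffffffffffp+969;
```

Under those assumptions, DBL_MAX paired with below_half_ulp still rounds to
DBL_MAX, while pairing it with half_ulp_of_max would round past the largest
finite double.  The bit just below the mantissa of the high double
therefore has to stay clear, which appears to be the 0 in the middle of the
constant.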



Re: [PATCH v5] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/18/20 12:31 AM, Richard Biener wrote:
> On Tue, 17 Nov 2020, Jeff Law wrote:
>
>>
>> On 11/4/20 8:10 AM, Raoni Fassina Firmino via Gcc-patches wrote:
>>> On Wed, Nov 04, 2020 at 10:35:03AM +0100, Richard Biener wrote:
> +/* Expand call EXP to the fegetround builtin (from C99 fenv.h), 
> returning the
> +   result and setting it in TARGET.  Otherwise return NULL_RTX on 
> failure.  */
> +static rtx
> +expand_builtin_fegetround (tree exp, rtx target, machine_mode 
> target_mode)
> +{
> +  if (!validate_arglist (exp, VOID_TYPE))
> +return NULL_RTX;
> +
> +  insn_code icode = direct_optab_handler (fegetround_optab, SImode);
> +  if (icode == CODE_FOR_nothing)
> +return NULL_RTX;
> +
> +  if (target == 0
> +  || GET_MODE (target) != target_mode
> +  || !(*insn_data[icode].operand[0].predicate) (target, target_mode))
> +target = gen_reg_rtx (target_mode);
> +
> +  rtx pat = GEN_FCN (icode) (target);
> +  if (!pat)
> +return NULL_RTX;
> +  emit_insn (pat);
 I think you need to verify whether the expansion ended up in 'target'
 and otherwise emit a move since usually 'target' is just a hint.
>>> I thought the "if (target == 0 ..." took care of that. The expands do
>>> emit a move, if that helps.
>> It looks like if we have a passed in target and it either has the wrong
>> mode or it does not match the predicate, then we generate a new target
>> and use that instead.  I don't see where we'd copy from that new target
>> to the original desired target.  For some expanders the caller would
>> handle that, but I don't see how that's possible for this one without
>> the caller digging into the generated RTL to determine that
>> expand_builtin_fegetround put the result somewhere other than TARGET and
>> thus a copy is needed.
>>
>> That may be what Richi is worried about.
> I know we've added missing
>
>   if (!rtx_equal_p (target, ops[0].value))
> emit_move_insn (target, ops[0].value);
>
> to several expanders (using expand_insn rather than manual
> GEN_FCN (icode) calls).
Yes.  But I think we end up doing that mostly for expanders that return
the object where the value was stored in some reasonably convenient
location (either as a return value or in an ops array).  I don't think
that's the case here. 

Jeff



Re: [PATCH] PowerPC Fix ibm128 defaults for pr70117.c test.

2020-11-18 Thread Segher Boessenkool
Hi!

On Sun, Nov 15, 2020 at 12:17:47PM -0500, Michael Meissner wrote:
> --- a/gcc/testsuite/gcc.target/powerpc/pr70117.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> @@ -9,9 +9,11 @@
> 128-bit floating point, because the type is not enabled on those
> systems.  */
>  #define LDOUBLE __ibm128
> +#define IBM128_MAX ((__ibm128) 1.79769313486231580793728971405301199e+308L)

This is the IEEE QP float number 43fef780 which
I very much doubt is the maximum finite double-double?  See the 0 in the
middle of the mantissa...  43feff00 is bigger,
and representable as double-double just as well?  Or even
43feff80 should be.


Segher


Re: Ping: [PATCH] Ensure colorization doesn't corrupt multibyte sequences in diagnostics

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/14/20 1:33 PM, Lewis Hyatt wrote:
> On Fri, Nov 13, 2020 at 5:27 PM Jeff Law  wrote:
>>
>> On 1/14/20 5:05 PM, Lewis Hyatt wrote:
>>> Hello-
>>>
>>> I thought I might ping this short patch please, just in case it may
>>> make sense to include in GCC 10 along with the other UTF-8-related
>>> fixes to diagnostics. Thanks!
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00915.html
>> This is fine for the trunk.  Note that the changes to handle
>> tabs/control bytes will require this patch to be updated.  It may be as
>> simple as moving the c = dw.next_byte() statement up.
>>
>>
>> Go ahead and do the necessary update and retest & repost the patch for
>> archival purposes.  If you have commit privs, go ahead and commit the
>> updated patch, else indicate in the patch repost that someone needs to
>> apply it for you.
>>
>>
>> Thanks for your patience,
>>
>> Jeff
>>
>>
 #1. diagnostic_show_locus() should be sure it will not corrupt output in
 this way, regardless of what ranges it is given to work with.
>> Yes.
>>
>>
 #2. libcpp should probably generate a range that includes the whole UTF-8
 character. Actually in other ways the range seems not ideal, for example
 if an invalid character appears in the middle of the identifier, the
 diagnostic still points to the first byte of the identifier.
>> Probably.  We haven't traditionally worried a lot about multibyte
>> sequences, so I'm not surprised we're not handling them particularly well.
>>
>>
 The attached patch fixes #1. It's essentially a one-line change, plus a
 new selftest. Would you please have a look at it sometime? bootstrap
 and testsuite were done on linux x86-64.

 Other questions that I have:

 - I am not quite clear when a selftest is preferred vs a dejagnu test. In
   this case I stuck with the selftest because color diagnostics don't seem
   to work well with dg-error etc, and it didn't seem worth creating a new
   plugin-based test like g++.dg/plugin just for this. (I also considered
   using the existing g++.dg plugin, but it seems this test should run for
   gcc as well.)
>> It varies and there's cases that are fine in either and I suspect there
>> are many tests in the dejagnu suite that would be better as selftests --
>> selftests are a fairly new concept.
>>
>>
>> The guidance I would give is the more a particular test is tied to the
>> internals of the code, the more likely a selftest is the right
>> approach.  The more the test needs an end-to-end run through passes of
>> the compiler, the more it belongs in the dejagnu suite.
>>
>>
>>
 - I wasn't sure if I should create a PR for an issue such as this, if
   there is already a patch readily available. And if I did create a PR,
   not sure if it's preferred to post the patch to gcc-patches, or as an
   attachment to the PR.
>> We still prefer patches to go to gcc-patches -- I personally don't troll
>> BZ looking for attached patches.
>>
>>
 - Does it seem worth me looking into #2? I think the patch to address #1 is
   appropriate in any case, because it handles generically all potential
   cases where this may arise, but still perhaps the ranges coming out of
   libcpp could be improved?
>> I don't think it can hurt to look into the difficulty in addressing #2.
>>
>>
>> jeff
>>
> Thanks very much for the detailed comments, that's all very useful to
> me. This particular patch was subsumed by r11-2092, which added the
> support for tab expansion, since this whole function was redone and
> now handles multibyte correctly. Sorry I probably should have updated
> the thread for this old patch in addition to mentioning in the new
> one, to save you some time. I will try to take a look sometime at the
> ranges that libcpp outputs too. Thanks again!
No problem.  I'm slogging my way through lots of old stuff.  I can often
determine if something has been subsumed, but wasn't able to in this case.
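
As background for point #1 in the quoted discussion, here is a minimal
sketch of keeping an insertion point off the middle of a UTF-8 sequence.
The helper names are hypothetical and this is not the
diagnostic_show_locus code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* In UTF-8, continuation bytes have the form 10xxxxxx.  */
static bool
utf8_continuation_byte_p (unsigned char c)
{
  return (c & 0xC0) == 0x80;
}

/* Hypothetical helper: move OFFSET back to the first byte of the
   character it falls inside, so that something inserted at the
   returned offset (a color escape, say) cannot split a multibyte
   sequence.  */
static size_t
utf8_snap_to_char_start (const char *s, size_t offset)
{
  while (offset > 0 && utf8_continuation_byte_p ((unsigned char) s[offset]))
    offset--;
  return offset;
}
```

An offset that already sits on the first byte of a character is returned
unchanged.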

Thanks,

Jeff



Re: [PATCH] lto: Fix typo in comment of gcc/lto/lto-symtab.c

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/14/20 3:18 AM, Jerry Clcanny via Gcc-patches wrote:
> Hi, thanks for reviewing this patch. This patch just fixes a typo in a
> comment in gcc/lto/lto-symtab.c. The original comment is "because after
> removing one of duplicate decls the hash is not correcly updated to the
> ohter dupliate."; I changed "ohter" to "other". So I did not run any tests
> or provide reports.
I also fixed dupliate->duplicate and pushed the patch to the trunk.

Thanks,
jeff



Re: [PATCH] [libiberty] Fix write buffer overflow in cplus_demangle

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/14/20 6:08 AM, Tim Rühsen wrote:
> Hey,
>
> On 13.11.20 05:45, Jeff Law wrote:
>>
>> On 11/29/19 12:15 PM, Tim Rühsen wrote:
>>> * cplus-dem.c (ada_demangle): Correctly calculate the demangled
>>>    size by using two passes.
>>
>> So I'm not sure why, but I can't get this patch to apply.  What's even
>> more interesting is ada_demangle doesn't seem to have changed since 2010
>> and even if I checkout a Nov 2019 trunk, I still can't apply the patch.
>>
>>
>> I can see what you're doing with your patch (it's primarily introducing
>> a loop where you count on the first pass and allocate on the second and
>> re-indent all the necessary code), I'd prefer not to muck it up trying
>> to apply by hand.
>>
>>
>> Any chance you could update the patch so that it applies to the trunk?
>> The review is done, so it should be able to go straight in.  If you have
>> commit privs (I don't recall if you do or not), you can go ahead and
>> commit it yourself.
>
> hm sorry, I am a bit out of the loop currently. It would be awesome if
> someone with more project knowledge could apply the patch.
>
> From what I can see here, the patch was made on top of binutils-gdb
> commit 3d18c3354209bd42361cb26ec611455cdf8b401b. Hope this helps.
Normally a GIT id would be sufficient...  *But*:
[law@localhost binutils-gdb]$ git checkout 3d18c3354209bd42361cb26ec611455cdf8b401b
fatal: reference is not a tree: 3d18c3354209bd42361cb26ec611455cdf8b401b

Maybe you could send me your cplus-dem.c with and without your patch
installed.  I can probably sort it out from there.
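
For anyone reading along without the patch at hand, the count-then-allocate
structure described in the quoted review can be sketched roughly as below.
This is only an illustration of the two-pass idea, not the actual
cplus-dem.c change: expand_dots and its rewrite of '.' to "::" are made up
for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transform: every '.' in IN becomes "::" in the result.
   Pass 0 only measures the output; pass 1 allocates exactly that much
   and fills it in, so the buffer can never be overrun.  */
static char *
expand_dots (const char *in)
{
  assert (in != NULL);

  char *out = NULL;
  size_t len = 0;

  for (int pass = 0; pass < 2; pass++)
    {
      if (pass == 1)
        {
          /* LEN still holds the size measured by pass 0.  */
          out = malloc (len + 1);
          if (out == NULL)
            return NULL;
        }

      len = 0;
      for (const char *p = in; *p != '\0'; p++)
        {
          if (*p == '.')
            {
              if (out != NULL)
                {
                  out[len] = ':';
                  out[len + 1] = ':';
                }
              len += 2;
            }
          else
            {
              if (out != NULL)
                out[len] = *p;
              len += 1;
            }
        }
    }

  out[len] = '\0';
  return out;
}
```

The same loop body runs in both passes, which is why the real patch ends up
re-indenting so much code: the writes are simply guarded by whether the
buffer exists yet.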



>
>> Sorry for the insane delays here.
>
> That is how life goes ;-)
> A delay is better than never.
Yea, but it shouldn't take this long to get to relatively simple
patches.  It's just been a terrible year.

jeff



[PATCH] c, tree: Fix ICE from get_parm_array_spec [PR97860]

2020-11-18 Thread Jakub Jelinek via Gcc-patches
Hi!

The C and C++ FEs handle zero sized arrays differently, C uses
NULL TYPE_MAX_VALUE on non-NULL TYPE_DOMAIN on complete ARRAY_TYPEs
with bitsize_zero_node TYPE_SIZE, while C++ FE likes to set
TYPE_MAX_VALUE to the largest value (and min to the lowest).

Martin has used array_type_nelts in get_parm_array_spec where the
function on the C form of [0] arrays returns error_mark_node and the code
crashes soon afterwards.  The following patch teaches array_type_nelts about
this (e.g. dwarf2out already handles that as [0]).  While it will change
what is_empty_type returns for certain types (e.g. struct S { int a[0]; };),
as those types occupy zero bits in C, it shouldn't make an ABI difference.
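
To make the C side concrete, here is a small sketch (GNU C, since
zero-length arrays are an extension) showing that such a struct really has
size zero.  The helper names are made up for illustration, and the -1 in
the second helper is just the "number of elements minus one" convention
that array_type_nelts uses, applied to zero elements.

```c
#include <stddef.h>

/* GNU C extension: a zero-length array member.  */
struct S { int a[0]; };

/* Size of the whole struct; with GCC this is zero in C.  */
static size_t
zero_struct_size (void)
{
  return sizeof (struct S);
}

/* Element count of the member, derived from its size, minus one.
   sizeof does not evaluate its operand, so the null pointer is never
   dereferenced.  Zero elements gives -1.  */
static long
zero_array_nelts_minus_one (void)
{
  return (long) (sizeof (((struct S *) 0)->a) / sizeof (int)) - 1;
}
```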

So, the tree.c change makes the c-decl.c code handle the [0] arrays
like any other constant extents, and the c-decl.c change just makes sure
that if we'd run into error_mark_node e.g. from the VLA expressions, we
don't crash on those.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-11-18  Jakub Jelinek  

PR c/97860
* tree.c (array_type_nelts): For complete arrays with zero min
and NULL max and zero size return -1.

* c-decl.c (get_parm_array_spec): Bail out if nelts is
error_operand_p.

* gcc.dg/pr97860.c: New test.

--- gcc/tree.c.jj   2020-11-18 09:40:09.798660999 +0100
+++ gcc/tree.c  2020-11-18 20:02:41.655398514 +0100
@@ -3483,7 +3483,17 @@ array_type_nelts (const_tree type)
 
   /* TYPE_MAX_VALUE may not be set if the array has unknown length.  */
   if (!max)
-return error_mark_node;
+{
+  /* zero sized arrays are represented from C FE as complete types with
+NULL TYPE_MAX_VALUE and zero TYPE_SIZE, while C++ FE represents
+them as min 0, max -1.  */
+  if (COMPLETE_TYPE_P (type)
+ && integer_zerop (TYPE_SIZE (type))
+ && integer_zerop (min))
+   return build_int_cst (TREE_TYPE (min), -1);
+
+  return error_mark_node;
+}
 
   return (integer_zerop (min)
  ? max
--- gcc/c/c-decl.c.jj   2020-11-11 01:46:03.245697697 +0100
+++ gcc/c/c-decl.c  2020-11-18 20:03:53.053602265 +0100
@@ -5775,6 +5775,8 @@ get_parm_array_spec (const struct c_parm
   type = TREE_TYPE (type))
{
  tree nelts = array_type_nelts (type);
+ if (error_operand_p (nelts))
+   return attrs;
  if (TREE_CODE (nelts) != INTEGER_CST)
{
  /* Each variable VLA bound is represented by the dollar
--- gcc/testsuite/gcc.dg/pr97860.c.jj   2020-11-18 15:15:08.858931877 +0100
+++ gcc/testsuite/gcc.dg/pr97860.c  2020-11-18 15:14:50.751135430 +0100
@@ -0,0 +1,11 @@
+/* PR c/97860 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+void
+foo (int n)
+{
+  typedef int T[0];
+  typedef T V[n];
+  void bar (V);
+}

Jakub



[committed] vrp: Fix operator_trunc_mod::op1_range [PR97888]

2020-11-18 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, in (x % y) >= 0 && y >= 0, we can't deduce
x's range to be x >= 0, as e.g. -7 % 7 is 0.  But we can deduce it
from (x % y) > 0.  The patch also fixes up the comments.
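
A quick way to convince oneself of this, outside the compiler, is to brute
force a small range.  The sketch below is not part of the patch;
count_negative_passes is a made-up helper that counts negative x values for
which the remainder test still passes.

```c
#include <assert.h>

/* Count the x in [LO, HI] that are negative even though the remainder
   test passes.  STRICT selects "x % y > 0"; otherwise "x % y >= 0".  */
static int
count_negative_passes (int lo, int hi, int y, int strict)
{
  assert (y > 0);

  int count = 0;
  for (int x = lo; x <= hi; x++)
    {
      /* Truncating division: the remainder is zero or has the sign of x.  */
      int rem = x % y;
      int passes = strict ? rem > 0 : rem >= 0;
      if (passes && x < 0)
        count++;
    }
  return count;
}
```

With lo = -50, hi = 50 and y = 7, the non-strict test lets through the seven
negative multiples of 7 (-7 down to -49), while the strict test lets through
no negative x at all.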

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved in the PR
by Andrew, committed to trunk.

2020-11-18  Jakub Jelinek  

PR tree-optimization/91029
PR tree-optimization/97888
* range-op.cc (operator_trunc_mod::op1_range): Only set op1
range to >= 0 if lhs is > 0, rather than >= 0.  Fix up comments.

* gcc.dg/pr91029.c: Add comment with PR number.
(f2): Use > 0 rather than >= 0.
* gcc.c-torture/execute/pr97888-1.c: New test.
* gcc.c-torture/execute/pr97888-2.c: New test.

--- gcc/range-op.cc.jj  2020-11-18 09:40:09.732661752 +0100
+++ gcc/range-op.cc 2020-11-18 11:19:25.322812925 +0100
@@ -2692,13 +2692,13 @@ operator_trunc_mod::op1_range (irange 
   if (TYPE_SIGN (type) == SIGNED && wi::ge_p (op2.lower_bound (), 0, SIGNED))
 {
   unsigned prec = TYPE_PRECISION (type);
-  // if a & b >=0 , then a >= 0.
-  if (wi::ge_p (lhs.lower_bound (), 0, SIGNED))
+  // if a % b > 0 , then a >= 0.
+  if (wi::gt_p (lhs.lower_bound (), 0, SIGNED))
{
  r = value_range (type, wi::zero (prec), wi::max_value (prec, SIGNED));
  return true;
}
-  // if a & b < 0 , then a <= 0.
+  // if a % b < 0 , then a <= 0.
   if (wi::lt_p (lhs.upper_bound (), 0, SIGNED))
{
  r = value_range (type, wi::min_value (prec, SIGNED), wi::zero (prec));
--- gcc/testsuite/gcc.dg/pr91029.c.jj   2020-11-18 11:21:25.412447195 +0100
+++ gcc/testsuite/gcc.dg/pr91029.c  2020-11-18 11:22:08.391958411 +0100
@@ -1,3 +1,4 @@
+/* PR tree-optimization/91029 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-evrp" } */
 
@@ -16,7 +17,7 @@ void f1 (int i)
 
 void f2 (int i)
 {
-  if ((i % 7) >= 0)
+  if ((i % 7) > 0)
 {
   xx = (i < 0);
   if (xx)
--- gcc/testsuite/gcc.c-torture/execute/pr97888-1.c.jj  2020-11-18 11:29:22.017027013 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr97888-1.c 2020-11-18 11:28:47.131423750 +0100
@@ -0,0 +1,24 @@
+/* PR tree-optimization/97888 */
+
+int a = 1, c = 4, d, e;
+
+int
+main ()
+{
+  int f = -173;
+  int b;
+  for (b = 0; b < 10; b++)
+{
+  int g = f % (~0 && a), h = 0, i = 0;
+  if (g)
+   __builtin_unreachable ();
+  if (c)
+   h = f;
+  if (h > -173)
+   e = d / i;
+  f = h;
+}
+  if (f != -173)
+__builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.c-torture/execute/pr97888-2.c.jj  2020-11-18 11:29:24.956993572 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr97888-2.c 2020-11-18 11:29:12.543134760 +0100
@@ -0,0 +1,19 @@
+/* PR tree-optimization/97888 */
+
+__attribute__((noipa)) void
+foo (int i)
+{
+  if ((i % 7) >= 0)
+{
+  if (i >= 0)
+__builtin_abort ();
+}
+}
+
+int
+main ()
+{
+  foo (-7);
+  foo (-21);
+  return 0;
+}


Jakub



Re: [PATCH] RFC: add "deallocated_by" attribute for use by analyzer

2020-11-18 Thread Martin Sebor via Gcc-patches

On 11/18/20 1:41 PM, David Malcolm wrote:
> On Mon, 2020-11-16 at 17:49 -0700, Martin Sebor wrote:
>> On 11/13/20 2:44 PM, Jeff Law via Gcc-patches wrote:
>>> On 10/5/20 5:12 PM, David Malcolm via Gcc-patches wrote:
>>>> This work-in-progress patch generalizes the malloc/free
>>>> problem-checking in -fanalyzer so that it can work on arbitrary
>>>> acquire/release API pairs.
>>>>
>>>> It adds a new __attribute__((deallocated_by(FOO))) that could be used
>>>> like this in a library header:
>>>>
>>>>   struct foo;
>>>>
>>>>   extern void foo_release (struct foo *);
>>>>
>>>>   extern struct foo *foo_acquire (void)
>>>>     __attribute__ ((deallocated_by(foo_release)));
>>>>
>>>> In theory, the analyzer then "knows" these functions are an
>>>> acquire/release pair, and can emit diagnostics for leaks,
>>>> double-frees, use-after-frees, mismatching deallocations, etc.
>>>>
>>>> My hope was that this would provide a minimal level of markup that
>>>> would support library-checking without requiring lots of further
>>>> markup.
>>>> I attempted to use this to detect a memory leak within a Linux
>>>> driver (CVE-2019-19078), by adding the attribute to mark these fns:
>>>>   extern struct urb *usb_alloc_urb(int iso_packets, gfp_t mem_flags);
>>>>   extern void usb_free_urb(struct urb *urb);
>>>> where there is a leak of a "urb" on an error-handling path.
>>>> Unfortunately I ran into the problem that there are various other fns
>>>> that take "struct urb *" and the analyzer conservatively assumes that
>>>> a urb passed to them might or might not be freed and thus stops
>>>> tracking state for them.
>>>>
>>>> So I don't know how much use this feature would be as-is.
>>>> (without either requiring lots of additional attributes for marking
>>>> fndecl args as being merely borrowed, or simply assuming that they
>>>> are borrowed in the absence of a function body to analyze)
>>>>
>>>> Thoughts?
>>>> Dave
>>>>
>>>> gcc/analyzer/ChangeLog:
>>>>   * region-model-impl-calls.cc
>>>>   (region_model::impl_deallocation_call): New.
>>>>   * region-model.cc: Include "attribs.h".
>>>>   (region_model::on_call_post): Handle fndecls referenced by
>>>>   __attribute__((deallocated_by(FOO))).
>>>>   * region-model.h (region_model::impl_deallocation_call): New decl.
>>>>   * sm-malloc.cc: Include "stringpool.h" and "attribs.h".
>>>>   (enum wording): Add WORDING_DEALLOCATED.
>>>>   (malloc_state_machine::custom_api_map_t): New typedef.
>>>>   (malloc_state_machine::m_custom_apis): New field.
>>>>   (start_p): New.
>>>>   (use_after_free::describe_state_change): Handle
>>>>   WORDING_DEALLOCATED.
>>>>   (use_after_free::describe_final_event): Likewise.
>>>>   (malloc_leak::describe_state_change): Only emit "allocated here" on
>>>>   a start->nonnull transition, rather than on other transitions to
>>>>   nonnull.
>>>>   (malloc_state_machine::~malloc_state_machine): New.
>>>>   (malloc_state_machine::on_stmt): Handle
>>>>   "__attribute__((deallocated_by(FOO)))", and the special attribute
>>>>   set on FOO.
>>>>   (malloc_state_machine::get_or_create_api): New.
>>>>   (malloc_state_machine::on_allocator_call): Add "returns_nonnull"
>>>>   param and use it to affect which state to transition to.
>>>>
>>>> gcc/c-family/ChangeLog:
>>>>   * c-attribs.c (c_common_attribute_table): Add entry for
>>>>   "deallocated_by".
>>>>   (matching_deallocator_type_p): New.
>>>>   (maybe_add_deallocator_attribute): New.
>>>>   (handle_deallocated_by_attribute): New.
>>>>
>>>> gcc/ChangeLog:
>>>>   * doc/extend.texi (Common Function Attributes): Add
>>>>   "deallocated_by".
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>   * gcc.dg/analyzer/attr-deallocated_by-1.c: New test.
>>>>   * gcc.dg/analyzer/attr-deallocated_by-1a.c: New test.
>>>>   * gcc.dg/analyzer/attr-deallocated_by-2.c: New test.
>>>>   * gcc.dg/analyzer/attr-deallocated_by-3.c: New test.
>>>>   * gcc.dg/analyzer/attr-deallocated_by-4.c: New test.
>>>>   * gcc.dg/analyzer/attr-deallocated_by-CVE-2019-19078-usb-leak.c:
>>>>   New test.
>>>>   * gcc.dg/analyzer/attr-deallocated_by-misuses.c: New test.
>>>
>>> I'd probably go with something more like acquire/release since I think
>>> the same concepts apply to things like file descriptors acquired by
>>> open and released by close.  I think the basic concept makes sense and
>>> would be useful, so I'd lean towards moving forward even if it hasn't
>>> been particularly useful for the analyzer yet.  One could even ponder
>>> propagation of the attribute similar to what we do with const/pure so
>>> that we could see through wrappers without the user having to do more
>>> markup.
>>>
>>> What I wonder here is whether or not Martin's work could take
>>> advantage of the attribute.   I don't see that as strictly necessary
>>> for the patch to move forward, just a question we should try to answer.
>>
>> It could, but it would need at least one change.  The patch I posted
>> on Friday introduces a similar attribute:
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559053.html
>>
>> The main differences between the two are that deallocated_by works
>> on integers in addition to pointers, but doesn't support positional
>> arguments for the deallocator.  The work I submitted has no

Re: [PATCH, rs6000] Add Power10 scheduling description

2020-11-18 Thread Pat Haugen via Gcc-patches
On 11/17/20 10:31 PM, will schmidt wrote:
> On Fri, 2020-11-13 at 16:04 -0600, Pat Haugen via Gcc-patches wrote:
>> +(define_automaton "power10dsp,power10issue,power10div")
>> +
>> +; Decode/dispatch slots
>> +(define_cpu_unit "du0_power10,du1_power10,du2_power10,du3_power10,
>> +  du4_power10,du5_power10,du6_power10,du7_power10" "power10dsp")
>> +
>> +; Four execution units
>> +(define_cpu_unit "exu0_power10,exu1_power10,exu2_power10,exu3_power10"
>> + "power10issue")
>> +; Two load units and two store units
>> +(define_cpu_unit "lu0_power10,lu1_power10" "power10issue")
>> +(define_cpu_unit "stu0_power10,stu1_power10" "power10issue")
>> +; Create false units for use by non-pipelined div/sqrt
>> +(define_cpu_unit "fx_div0_power10,fx_div1_power10" "power10div")
>> +(define_cpu_unit "fp_div0_power10,fp_div1_power10,fp_div2_power10,
>> +  fp_div3_power10" "power10div")
> 
> The spacing catches my eye, I'd want to add spaces around those commas,
> etc.   But.. this appears to be consistent with behavior as seen in the
> existing power9.md ; power9.md ; etc.
> So it's either this way per necessity, or this way per history.
> Either way, no change requested here given that precedent.  (If this and
> the older stuff also needs to be cosmetically tweaked, that can be
> handled later on..)

Yeah, just historical.


>> +; Load Unit
>> +(define_insn_reservation "power10-load" 4
>> +  (and (eq_attr "type" "load")
>> +   (eq_attr "update" "no")
>> +   (eq_attr "size" "!128")
>> +   (eq_attr "prefixed" "no")
>> +   (eq_attr "cpu" "power10"))
>> +  "DU_any_power10,LU_power10")
>> +
>> +(define_insn_reservation "power10-prefixed-load" 4
>> +  (and (eq_attr "type" "load")
>> +   (eq_attr "update" "no")
>> +   (eq_attr "size" "!128")
>> +   (eq_attr "prefixed" "!no")
> 
> I'm sure there is reason, but remind me..  "!no" versus "yes" ?

From my prior patch, the prefixed attribute can now have values no/yes/always,
so '!no' means 'yes' || 'always'.


>> +
>> +; DFP
>> +; Use the minimum 12 cycle latency for all insns, even though some are more
> 
> ".. for all DFP insns"
> Since you specify this is a minimum, can prob drop "even though some
> are more".

ok



[committed] analyzer: only use CWE-690 for unchecked return value [PR97893]

2020-11-18 Thread David Malcolm via Gcc-patches
CWE-690 is only for dereferencing an unchecked return value; for
other kinds of NULL dereference, use the parent classification, CWE-476.
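
As a rough illustration of the distinction (modelled on test_3 and test_4 in
malloc-1.c, with made-up function names), the first function below has the
CWE-690 shape and the second the CWE-476 shape.  The third is included only
as the well-behaved counterpart.

```c
#include <stdlib.h>

/* CWE-690 shape: the result of malloc is dereferenced without a check,
   so the pointer is only possibly NULL at the store.  */
int *
unchecked_result (void)
{
  int *ptr = (int *) malloc (sizeof (int));
  *ptr = 42;
  return ptr;
}

/* CWE-476 shape: the return value was checked, and on the else path the
   pointer is known to be NULL when it is dereferenced.  */
int *
known_null (void)
{
  int *ptr = (int *) malloc (sizeof (int));
  if (ptr)
    *ptr = 42;
  else
    *ptr = 43;
  return ptr;
}

/* Neither: the result is checked and never dereferenced when NULL.  */
int *
checked_result (void)
{
  int *ptr = (int *) malloc (sizeof (int));
  if (ptr)
    *ptr = 42;
  return ptr;
}
```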

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-5148-gf3f312b535f57b5773953746f6ad0d890ce09b88.

gcc/analyzer/ChangeLog:
PR analyzer/97893
* sm-malloc.cc (null_deref::emit): Use CWE-476 rather than
CWE-690, as this isn't due to an unchecked return value.
(null_arg::emit): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/97893
* gcc.dg/analyzer/malloc-1.c: Add CWE-690 and CWE-476 codes to
expected output.
---
 gcc/analyzer/sm-malloc.cc|  8 +++
 gcc/testsuite/gcc.dg/analyzer/malloc-1.c | 30 
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
index fd12a358176..4c387381137 100644
--- a/gcc/analyzer/sm-malloc.cc
+++ b/gcc/analyzer/sm-malloc.cc
@@ -675,9 +675,9 @@ public:
 
   bool emit (rich_location *rich_loc) FINAL OVERRIDE
   {
-/* CWE-690: Unchecked Return Value to NULL Pointer Dereference.  */
+/* CWE-476: NULL Pointer Dereference.  */
 diagnostic_metadata m;
-m.add_cwe (690);
+m.add_cwe (476);
 return warning_meta (rich_loc, m,
 OPT_Wanalyzer_null_dereference,
 "dereference of NULL %qE", m_arg);
@@ -723,10 +723,10 @@ public:
 
   bool emit (rich_location *rich_loc) FINAL OVERRIDE
   {
-/* CWE-690: Unchecked Return Value to NULL Pointer Dereference.  */
+/* CWE-476: NULL Pointer Dereference.  */
 auto_diagnostic_group d;
 diagnostic_metadata m;
-m.add_cwe (690);
+m.add_cwe (476);
 
 bool warned;
 if (zerop (m_arg))
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-1.c b/gcc/testsuite/gcc.dg/analyzer/malloc-1.c
index 44eca9fc28c..576ab9dee52 100644
--- a/gcc/testsuite/gcc.dg/analyzer/malloc-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-1.c
@@ -29,14 +29,14 @@ void test_2a (void *ptr)
 int *test_3 (void)
 {
   int *ptr = (int *)malloc (sizeof (int));
-  *ptr = 42; /* { dg-warning "dereference of possibly-NULL 'ptr'" } */
+  *ptr = 42; /* { dg-warning "dereference of possibly-NULL 'ptr' \\\[CWE-690\\\]" } */
   return ptr;
 }
 
 int *test_3a (void)
 {
   int *ptr = (int *)__builtin_malloc (sizeof (int));
-  *ptr = 42; /* { dg-warning "dereference of possibly-NULL 'ptr'" } */
+  *ptr = 42; /* { dg-warning "dereference of possibly-NULL 'ptr' \\\[CWE-690\\\]" } */
   return ptr;
 }
 
@@ -46,7 +46,7 @@ int *test_4 (void)
   if (ptr)
 *ptr = 42;
   else
-*ptr = 43; /* { dg-warning "dereference of NULL 'ptr'" } */
+*ptr = 43; /* { dg-warning "dereference of NULL 'ptr' \\\[CWE-476\\\]" } */
   return ptr;
 }
 
@@ -259,14 +259,14 @@ void test_22 (void)
 int *test_23 (int n)
 {
   int *ptr = (int *)calloc (n, sizeof (int));
-  ptr[0] = 42; /* { dg-warning "dereference of possibly-NULL 'ptr'" } */
+  ptr[0] = 42; /* { dg-warning "dereference of possibly-NULL 'ptr' \\\[CWE-690\\\]" } */
   return ptr;
 }
 
 int *test_23a (int n)
 {
   int *ptr = (int *)__builtin_calloc (n, sizeof (int));
-  ptr[0] = 42; /* { dg-warning "dereference of possibly-NULL 'ptr'" } */
+  ptr[0] = 42; /* { dg-warning "dereference of possibly-NULL 'ptr' \\\[CWE-690\\\]" } */
   return ptr;
 }
 
@@ -301,7 +301,7 @@ struct coord {
 struct coord *test_27 (void)
 {
   struct coord *p = (struct coord *) malloc (sizeof (struct coord)); /* { dg-message "this call could return NULL" } */
-  p->x = 0.f;  /* { dg-warning "dereference of possibly-NULL 'p'" } */
+  p->x = 0.f;  /* { dg-warning "dereference of possibly-NULL 'p' \\\[CWE-690\\\]" } */
 
   /* Only the first such usage should be reported: */
   p->y = 0.f;
@@ -312,7 +312,7 @@ struct coord *test_27 (void)
 struct coord *test_28 (void)
 {
   struct coord *p = NULL;
-  p->x = 0.f; /* { dg-warning "dereference of NULL 'p'" } */
+  p->x = 0.f; /* { dg-warning "dereference of NULL 'p' \\\[CWE-476\\\]" } */
 
   /* Only the first such usage should be reported: */
   p->y = 0.f;
@@ -415,7 +415,7 @@ void test_36 (void)
 void *test_37a (void)
 {
   void *ptr = malloc(4096); /* { dg-message "this call could return NULL" } */
-  __builtin_memset(ptr, 0, 4096); /* { dg-warning "use of possibly-NULL 'ptr' where non-null expected" } */
+  __builtin_memset(ptr, 0, 4096); /* { dg-warning "use of possibly-NULL 'ptr' where non-null expected \\\[CWE-690\\\]" } */
   return ptr;
 }
 
@@ -426,7 +426,7 @@ int test_37b (void)
   if (p) {
 __builtin_memset(p, 0, 4096); /* Not a bug: checked */
   } else {
-__builtin_memset(q, 0, 4096); /* { dg-warning "use of possibly-NULL 'q' where non-null expected" } */
+__builtin_memset(q, 0, 4096); /* { dg-warning "use of possibly-NULL 'q' where non-null expected \\\[CWE-690\\\]" } */
   }
   free(p);
   free(q);
@@ -451,7 +451,7 @@ int *
 test_39 (int i)
 {
   int *p = (int*)malloc(sizeof(int*)); /* { 

Re: [committed] libstdc++: Use custom timespec in system calls [PR 93421]

2020-11-18 Thread Mike Crowe via Gcc-patches
On Wednesday 18 November 2020 at 20:22:53 +, Jonathan Wakely wrote:
> On 18/11/20 00:01 +, Jonathan Wakely wrote:
> > On 14/11/20 14:23 +, Jonathan Wakely wrote:
> > > On Sat, 14 Nov 2020, 13:30 Mike Crowe wrote:
> > > > > @@ -195,7 +205,7 @@ namespace
> > > > >   if (__s.count() < 0) [[unlikely]]
> > > > > return false;
> > > > > 
> > > > > - struct timespec rt;
> > > > > + syscall_timespec rt;
> > > > >   if (__s.count() > __int_traits::__max) [[unlikely]]
> > > > > rt.tv_sec = __int_traits::__max;
> > > > 
> > > > Do these now need to be __int_traits::__max in case time_t is 
> > > > 64-bit
> > > > yet syscall_timespec is using 32-bit long?
> > > > 
> > > 
> > > Ah yes. Maybe decltype(rt.tv_sec).
> > 
> > I'll fix that in the next patch.
> 
> And here's that next patch. I'm testing this and will commit if all
> goes well.
> 
> 

> commit 11dfb2a0cca90b277f6bfff9306339f4424bbbdb
> Author: Jonathan Wakely 
> Date:   Wed Nov 18 15:05:25 2020
> 
> libstdc++: Fix overflow checks to use the correct "time_t" [PR 93456]
> 
> I recently added overflow checks to src/c++11/futex.cc for PR 93456, but
> then changed the type of the timespec for PR 93421. This meant the
> overflow checks were no longer using the right range, because the
> variable being written to might be smaller than time_t.
> 
> This introduces new typedef that corresponds to the tv_sec member of the
> struct being passed to the syscall, and uses that typedef in the range
> checks.
> 
> libstdc++-v3/ChangeLog:
> 
> PR libstdc++/93421
> PR libstdc++/93456
> * src/c++11/futex.cc (syscall_time_t): New typedef for
> the type of the syscall_timespec::tv_sec member.
> (relative_timespec, _M_futex_wait_until)
> (_M_futex_wait_until_steady): Use syscall_time_t in overflow
> checks, not time_t.
> 
> diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
> index 33e2097e19cf..290201ae2540 100644
> --- a/libstdc++-v3/src/c++11/futex.cc
> +++ b/libstdc++-v3/src/c++11/futex.cc
> @@ -64,8 +64,10 @@ namespace
>// The SYS_futex syscall still uses the old definition of timespec
>// where tv_sec is 32 bits, so define a type that matches that.
>struct syscall_timespec { long tv_sec; long tv_nsec; };
> +  using syscall_time_t = long;
>  #else
>using syscall_timespec = ::timespec;
> +  using syscall_time_t = time_t;
>  #endif
>  
>// Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
> @@ -86,9 +88,9 @@ namespace
>  const auto rel_s = abs_s.count() - now_s;
>  
>  // Convert the absolute timeout to a relative timeout, without overflow.
> -if (rel_s > __int_traits<time_t>::__max) [[unlikely]]
> +if (rel_s > __int_traits<syscall_time_t>::__max) [[unlikely]]
>{
> - rt.tv_sec = __int_traits<time_t>::__max;
> + rt.tv_sec = __int_traits<syscall_time_t>::__max;
>   rt.tv_nsec = 9;
>}
>  else
> @@ -130,8 +132,8 @@ namespace
> return false;
>  
>   syscall_timespec rt;
> - if (__s.count() > __int_traits<time_t>::__max) [[unlikely]]
> -   rt.tv_sec = __int_traits<time_t>::__max;
> + if (__s.count() > __int_traits<syscall_time_t>::__max) [[unlikely]]
> +   rt.tv_sec = __int_traits<syscall_time_t>::__max;
>   else
> rt.tv_sec = __s.count();
>   rt.tv_nsec = __ns.count();
> @@ -206,8 +208,8 @@ namespace
> return false;
>  
>   syscall_timespec rt;
> - if (__s.count() > __int_traits<time_t>::__max) [[unlikely]]
> -   rt.tv_sec = __int_traits<time_t>::__max;
> + if (__s.count() > __int_traits<syscall_time_t>::__max) [[unlikely]]
> +   rt.tv_sec = __int_traits<syscall_time_t>::__max;
>   else
> rt.tv_sec = __s.count();
>   rt.tv_nsec = __ns.count();

LGTM.

Mike.


[PATCH] libstdc++: Fix detection of intrinsics for __GNUC__ < 11

2020-11-18 Thread Jonathan Wakely via Gcc-patches
The previous code would never use __is_identifier and __has_builtin for a
compiler that defines __GNUC__ >= 7. A hypothetical compiler that
supports __is_identifier and __has_builtin might define __GNUC__ to 8,
and therefore the macros for BUILTIN_IS_CONSTANT_EVALUATED
and BUILTIN_IS_SAME would not get defined, and we would not check for
those intrinsics using __is_identifier or __has_builtin either.

This change means that after the conditions using __GNUC__ we then check
__is_identifier or __has_builtin for any features not yet defined.

This probably doesn't achieve anything in practice, because Clang
defines __GNUC__ to 4, and Intel defines it to match whatever GCC
defines, so it will claim to be GCC 11 if using the libstdc++ headers
from GCC 11. But it still seems more correct to do it this way.
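
The shape of the change, reduced to a single feature, is roughly as in the
sketch below.  DEMO_HAVE_BUILTIN_TRAP, demo_have_builtin_trap and the choice
of __builtin_trap with a version threshold of 4 are stand-ins picked for
illustration; they are not what c++config uses.

```c
/* Step 1: version-based detection, as done for GCC itself.  */
#if defined(__GNUC__) && __GNUC__ >= 4
# define DEMO_HAVE_BUILTIN_TRAP 1
#endif

/* Step 2: feature-based detection, no longer an #elif of step 1.
   It only fills in whatever step 1 left undefined.  The nested #if keeps
   __has_builtin(...) away from preprocessors that do not provide it.  */
#if defined(__has_builtin)
# if ! defined DEMO_HAVE_BUILTIN_TRAP \
  && __has_builtin(__builtin_trap)
#  define DEMO_HAVE_BUILTIN_TRAP 1
# endif
#endif

/* Report the outcome of the detection at run time.  */
static int
demo_have_builtin_trap (void)
{
#ifdef DEMO_HAVE_BUILTIN_TRAP
  return 1;
#else
  return 0;
#endif
}
```

A compiler that reports an older __GNUC__ but does provide __has_builtin now
gets a second chance in step 2 instead of being skipped by the #elif.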

libstdc++-v3/ChangeLog:

* include/bits/c++config: Do checks using __is_identifier and
__has_builtin independent of checks for __GNUC__.

Does this seem worth doing, despite the caveat above?


commit 358069de33f305d753c64c7e8e4819bf4638ed14
Author: Jonathan Wakely 
Date:   Wed Nov 18 16:15:15 2020

libstdc++: Fix detection of intrinsics for __GNUC__ < 11

The previous code would never use __is_identifier and __has_builtin for a
compiler that defines __GNUC__ >= 7. A hypothetical compiler that
supports __is_identifier and __has_builtin might define __GNUC__ to 8,
and therefore the macros for BUILTIN_IS_CONSTANT_EVALUATED
and BUILTIN_IS_SAME would not get defined, and we would not check for
those intrinsics using __is_identifier or __has_builtin either.

This change means that after the conditions using __GNUC__ we then check
__is_identifier or __has_builtin for any features not yet defined.

This probably doesn't achieve anything in practice, because Clang
defines __GNUC__ to 4, and Intel defines it to match whatever GCC
defines, so it will claim to be GCC 11 if using the libstdc++ headers
from GCC 11. But it still seems more correct to do it this way.

libstdc++-v3/ChangeLog:

* include/bits/c++config: Do checks using __is_identifier and
__has_builtin independent of checks for __GNUC__.

diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index 2e6c880ad95a..abf6320082c0 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -664,21 +664,28 @@ namespace std
 # if __GNUC__ >= 11
 #  define _GLIBCXX_HAVE_BUILTIN_IS_SAME 1
 # endif
-#elif defined(__is_identifier) && defined(__has_builtin)
+#endif
+
+#if defined(__is_identifier) && defined(__has_builtin)
 // For non-GNU compilers:
-# if ! __is_identifier(__has_unique_object_representations)
+# if ! defined _GLIBCXX_HAVE_BUILTIN_HAS_UNIQ_OBJ_REP \
+  && ! __is_identifier(__has_unique_object_representations)
 #  define _GLIBCXX_HAVE_BUILTIN_HAS_UNIQ_OBJ_REP 1
 # endif
-# if ! __is_identifier(__is_aggregate)
+# if ! defined _GLIBCXX_HAVE_BUILTIN_IS_AGGREGATE \
+  && ! __is_identifier(__is_aggregate)
 #  define _GLIBCXX_HAVE_BUILTIN_IS_AGGREGATE 1
 # endif
-# if __has_builtin(__builtin_launder)
+# if ! defined _GLIBCXX_HAVE_BUILTIN_LAUNDER \
+  && __has_builtin(__builtin_launder)
 #  define _GLIBCXX_HAVE_BUILTIN_LAUNDER 1
 # endif
-# if __has_builtin(__builtin_is_constant_evaluated)
+# if ! defined _GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED \
+  && __has_builtin(__builtin_is_constant_evaluated)
 #  define _GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED 1
 # endif
-# if ! __is_identifier(__is_same)
+# if ! defined _GLIBCXX_HAVE_BUILTIN_IS_SAME \
+  && ! __is_identifier(__is_same)
 #  define _GLIBCXX_HAVE_BUILTIN_IS_SAME 1
 # endif
 #endif // GCC


[pushed] Objective-C++ : Avoid ICE on invalid with empty attributes.

2020-11-18 Thread Iain Sandoe

Hi

Empty prefix attributes like:

__attribute__ (())
@interface MyClass
@end

cause an ICE at present.
Check for that case and skip them.

tested on x86_64-darwin,
pushed to master,
thanks
Iain

gcc/cp/ChangeLog:

* parser.c (cp_parser_objc_valid_prefix_attributes): Check
for empty attributes.
---
 gcc/cp/parser.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index b7ef259b048..cf4e4aa1b75 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -33992,8 +33992,8 @@ cp_parser_objc_valid_prefix_attributes (cp_parser* parser, tree *attrib)

 {
   cp_lexer_save_tokens (parser->lexer);
   tree addon = cp_parser_attributes_opt (parser);
-  gcc_checking_assert (addon);
-  if (OBJC_IS_AT_KEYWORD (cp_lexer_peek_token (parser->lexer)->keyword))
+  if (addon
+  && OBJC_IS_AT_KEYWORD (cp_lexer_peek_token (parser->lexer)->keyword))
 {
   cp_lexer_commit_tokens (parser->lexer);
   if (*attrib)
--
2.24.1




Re: [PATCH] PR 83938 Reduce memory consumption in stable_sort/inplace_merge

2020-11-18 Thread François Dumont via Gcc-patches

Gentle reminder now that we are in stage 3 ?


On 24/06/20 7:39 pm, Jonathan Wakely wrote:
> On 11/06/20 08:32 +0200, François Dumont via Libstdc++ wrote:
>> While we are patching algos, we still have this old one.
>>
>> From the original patch I only kept the memory optimization part, as
>> the new performance test was not showing good results for the other
>> part, which changed the pivot value. I also kept the small change in
>> get_temporary_buffer even though I don't have a strong feeling about
>> it; it just makes sure that we'll try to allocate 1 element as a
>> last-chance allocation.
>>
>> Note that there is still room for improvement. If we run out of
>> memory on the heap we then use a recursive implementation which
>> relies on stack memory. I would be surprised if a system that is short
>> of heap memory had no problem allocating about the same amount on the
>> stack, so we will surely end up in a stack overflow. I still have this
>> on my todo list, even though I already made several attempts with no
>> satisfying result in terms of performance.
>>
>> Tested under Linux x86_64.
>>
>> Commit message:
>>
>>     libstdc++: Limit memory allocation in stable_sort/inplace_merge (PR 83938)
>>
>>     Reduce memory consumption in stable_sort/inplace_merge to what is used.
>>
>>     Co-authored-by: François Dumont
>>
>>     libstdc++-v3/ChangeLog:
>>
>>     2020-06-11  John Chang
>>                 François Dumont
>>
>>     PR libstdc++/83938
>>     * include/bits/stl_tempbuf.h (get_temporary_buffer): Change __len
>>     computation in the loop.
>>     * include/bits/stl_algo.h:
>>     (__inplace_merge): Take temporary buffer length from smallest range.
>>     (__stable_sort): Limit temporary buffer length.
>>     * testsuite/25_algorithms/inplace_merge/1.cc (test03): Test different
>>     pivot positions.
>>     * testsuite/performance/25_algorithms/stable_sort.cc: Test stable_sort
>>     under different heap memory conditions.
>>     * testsuite/performance/25_algorithms/inplace_merge.cc: New.
>>
>> Ok to commit ?
>
> I'm very nervous about changes to sort algos that aren't absolutely
> necessary for correctness. It needs careful review and lots of
> testing. Please be patient.






Re: [PATCH] RFC: add "deallocated_by" attribute for use by analyzer

2020-11-18 Thread David Malcolm via Gcc-patches
On Mon, 2020-11-16 at 17:49 -0700, Martin Sebor wrote:
> On 11/13/20 2:44 PM, Jeff Law via Gcc-patches wrote:
> > On 10/5/20 5:12 PM, David Malcolm via Gcc-patches wrote:
> > > This work-in-progress patch generalizes the malloc/free problem-
> > > checking
> > > in -fanalyzer so that it can work on arbitrary acquire/release
> > > API pairs.
> > > 
> > > It adds a new __attribute__((deallocated_by(FOO))) that could be
> > > used
> > > like this in a library header:
> > > 
> > >struct foo;
> > > 
> > >extern void foo_release (struct foo *);
> > > 
> > >extern struct foo *foo_acquire (void)
> > >  __attribute__ ((deallocated_by(foo_release)));
> > > 
> > > In theory, the analyzer then "knows" these functions are an
> > > acquire/release pair, and can emit diagnostics for leaks, double-
> > > frees,
> > > use-after-frees, mismatching deallocations, etc.
> > > 
> > > My hope was that this would provide a minimal level of markup
> > > that would
> > > support library-checking without requiring lots of further
> > > markup.
> > > I attempted to use this to detect a memory leak within a Linux
> > > driver (CVE-2019-19078), by adding the attribute to mark these
> > > fns:
> > >extern struct urb *usb_alloc_urb(int iso_packets, gfp_t
> > > mem_flags);
> > >extern void usb_free_urb(struct urb *urb);
> > > where there is a leak of a "urb" on an error-handling path.
> > > Unfortunately I ran into the problem that there are various other
> > > fns
> > > that take "struct urb *" and the analyzer conservatively assumes
> > > that a
> > > urb passed to them might or might not be freed and thus stops
> > > tracking
> > > state for them.
> > > 
> > > So I don't know how much use this feature would be as-is.
> > > (without either requiring lots of additional attributes for
> > > marking
> > > fndecl args as being merely borrowed, or simply assuming that
> > > they
> > > are borrowed in the absence of a function body to analyze)
> > > 
> > > Thoughts?
> > > Dave
> > > 
> > > gcc/analyzer/ChangeLog:
> > >   * region-model-impl-calls.cc
> > >   (region_model::impl_deallocation_call): New.
> > >   * region-model.cc: Include "attribs.h".
> > >   (region_model::on_call_post): Handle fndecls referenced by
> > >   __attribute__((deallocated_by(FOO))).
> > >   * region-model.h (region_model::impl_deallocation_call): New
> > > decl.
> > >   * sm-malloc.cc: Include "stringpool.h" and "attribs.h".
> > >   (enum wording): Add WORDING_DEALLOCATED.
> > >   (malloc_state_machine::custom_api_map_t): New typedef.
> > >   (malloc_state_machine::m_custom_apis): New field.
> > >   (start_p): New.
> > >   (use_after_free::describe_state_change): Handle
> > >   WORDING_DEALLOCATED.
> > >   (use_after_free::describe_final_event): Likewise.
> > >   (malloc_leak::describe_state_change): Only emit "allocated
> > > here" on
> > >   a start->nonnull transition, rather than on other transitions
> > > to
> > >   nonnull.
> > >   (malloc_state_machine::~malloc_state_machine): New.
> > >   (malloc_state_machine::on_stmt): Handle
> > >   "__attribute__((deallocated_by(FOO)))", and the special
> > > attribute
> > >   set on FOO.
> > >   (malloc_state_machine::get_or_create_api): New.
> > >   (malloc_state_machine::on_allocator_call): Add
> > > "returns_nonnull"
> > >   param and use it to affect which state to transition to.
> > > 
> > > gcc/c-family/ChangeLog:
> > >   * c-attribs.c (c_common_attribute_table): Add entry for
> > >   "deallocated_by".
> > >   (matching_deallocator_type_p): New.
> > >   (maybe_add_deallocator_attribute): New.
> > >   (handle_deallocated_by_attribute): New.
> > > 
> > > gcc/ChangeLog:
> > >   * doc/extend.texi (Common Function Attributes): Add
> > >   "deallocated_by".
> > > 
> > > gcc/testsuite/ChangeLog:
> > >   * gcc.dg/analyzer/attr-deallocated_by-1.c: New test.
> > >   * gcc.dg/analyzer/attr-deallocated_by-1a.c: New test.
> > >   * gcc.dg/analyzer/attr-deallocated_by-2.c: New test.
> > >   * gcc.dg/analyzer/attr-deallocated_by-3.c: New test.
> > >   * gcc.dg/analyzer/attr-deallocated_by-4.c: New test.
> > >   * gcc.dg/analyzer/attr-deallocated_by-CVE-2019-19078-usb-
> > > leak.c:
> > >   New test.
> > >   * gcc.dg/analyzer/attr-deallocated_by-misuses.c: New test.
> > 
> > I'd probably go with something more like acquire/release since I
> > think
> > the same concepts apply to things like file descriptors acquired by
> > open
> > and released by close.  I think the basic concept makes sense and
> > would
> > be useful, so I'd lean towards moving forward even if it hasn't
> > been
> > particularly useful for the analyzer yet.  One could even ponder
> > propagation of the attribute similar to what we do with const/pure
> > so
> > that we could see through wrappers without the user having to do
> > more
> > markup.
> > 
> > 
> > What I wonder here is whether or not Martin's work could take
> > advantage
> > of the attribute.   I don't see that as strictly necessary for the
> > patch
> > to move forward, 

Re: [committed] libstdc++: Use custom timespec in system calls [PR 93421]

2020-11-18 Thread Jonathan Wakely via Gcc-patches

On 18/11/20 00:01 +0000, Jonathan Wakely wrote:

On 14/11/20 14:23 +0000, Jonathan Wakely wrote:

On Sat, 14 Nov 2020, 13:30 Mike Crowe wrote:

@@ -195,7 +205,7 @@ namespace
  if (__s.count() < 0) [[unlikely]]
return false;

- struct timespec rt;
+ syscall_timespec rt;
  if (__s.count() > __int_traits<time_t>::__max) [[unlikely]]
rt.tv_sec = __int_traits<time_t>::__max;


Do these now need to be __int_traits<long>::__max in case time_t is 64-bit
yet syscall_timespec is using 32-bit long?



Ah yes. Maybe decltype(rt.tv_sec).


I'll fix that in the next patch.


And here's that next patch. I'm testing this and will commit if all
goes well.


commit 11dfb2a0cca90b277f6bfff9306339f4424bbbdb
Author: Jonathan Wakely 
Date:   Wed Nov 18 15:05:25 2020

libstdc++: Fix overflow checks to use the correct "time_t" [PR 93456]

I recently added overflow checks to src/c++11/futex.cc for PR 93456, but
then changed the type of the timespec for PR 93421. This meant the
overflow checks were no longer using the right range, because the
variable being written to might be smaller than time_t.

This introduces a new typedef that corresponds to the tv_sec member of the
struct being passed to the syscall, and uses that typedef in the range
checks.

libstdc++-v3/ChangeLog:

PR libstdc++/93421
PR libstdc++/93456
* src/c++11/futex.cc (syscall_time_t): New typedef for
the type of the syscall_timespec::tv_sec member.
(relative_timespec, _M_futex_wait_until)
(_M_futex_wait_until_steady): Use syscall_time_t in overflow
checks, not time_t.

diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index 33e2097e19cf..290201ae2540 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -64,8 +64,10 @@ namespace
   // The SYS_futex syscall still uses the old definition of timespec
   // where tv_sec is 32 bits, so define a type that matches that.
   struct syscall_timespec { long tv_sec; long tv_nsec; };
+  using syscall_time_t = long;
 #else
   using syscall_timespec = ::timespec;
+  using syscall_time_t = time_t;
 #endif
 
   // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
@@ -86,9 +88,9 @@ namespace
 const auto rel_s = abs_s.count() - now_s;
 
 // Convert the absolute timeout to a relative timeout, without overflow.
-if (rel_s > __int_traits<time_t>::__max) [[unlikely]]
+if (rel_s > __int_traits<syscall_time_t>::__max) [[unlikely]]
   {
-	rt.tv_sec = __int_traits<time_t>::__max;
+	rt.tv_sec = __int_traits<syscall_time_t>::__max;
 	rt.tv_nsec = 999999999;
   }
 else
@@ -130,8 +132,8 @@ namespace
 	  return false;
 
 	syscall_timespec rt;
-	if (__s.count() > __int_traits<time_t>::__max) [[unlikely]]
-	  rt.tv_sec = __int_traits<time_t>::__max;
+	if (__s.count() > __int_traits<syscall_time_t>::__max) [[unlikely]]
+	  rt.tv_sec = __int_traits<syscall_time_t>::__max;
 	else
 	  rt.tv_sec = __s.count();
 	rt.tv_nsec = __ns.count();
@@ -206,8 +208,8 @@ namespace
 	  return false;
 
 	syscall_timespec rt;
-	if (__s.count() > __int_traits<time_t>::__max) [[unlikely]]
-	  rt.tv_sec = __int_traits<time_t>::__max;
+	if (__s.count() > __int_traits<syscall_time_t>::__max) [[unlikely]]
+	  rt.tv_sec = __int_traits<syscall_time_t>::__max;
 	else
 	  rt.tv_sec = __s.count();
 	rt.tv_nsec = __ns.count();
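
For readers skimming the thread, here is a minimal sketch of the idea behind
the fix: the overflow check has to use the range of the tv_sec member that is
actually written, not the range of time_t.  The names demo_timespec32 and
to_timespec_saturating are hypothetical and are not part of futex.cc; the
sketch also assumes the caller has already rejected negative timeouts, as the
real code does.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a syscall timespec whose tv_sec member is
// only 32 bits wide, even on a system where time_t is 64 bits.
struct demo_timespec32
{
  std::int32_t tv_sec;
  long tv_nsec;
};

// Fill a timespec from a (seconds, nanoseconds) pair.  tv_sec saturates
// at the maximum of the member's own type, so a large timeout cannot
// wrap around when the member is narrower than time_t.
template<typename Timespec>
Timespec
to_timespec_saturating (std::chrono::seconds s, std::chrono::nanoseconds ns)
{
  using sec_type = decltype (Timespec::tv_sec);
  Timespec ts;
  if (s.count () > std::numeric_limits<sec_type>::max ())
    ts.tv_sec = std::numeric_limits<sec_type>::max ();
  else
    ts.tv_sec = static_cast<sec_type> (s.count ());
  ts.tv_nsec = static_cast<long> (ns.count ());
  return ts;
}
```

Deriving the limit from decltype of the member mirrors the
"decltype(rt.tv_sec)" suggestion made earlier in the thread; the committed
patch uses a separate syscall_time_t typedef instead.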


Re: [PATCH] Include math.h in nextafter-2.c test.

2020-11-18 Thread Segher Boessenkool
On Sun, Nov 15, 2020 at 12:12:34PM -0500, Michael Meissner wrote:
> --- a/gcc/testsuite/gcc.dg/nextafter-2.c
> +++ b/gcc/testsuite/gcc.dg/nextafter-2.c
> @@ -6,6 +6,18 @@
>  
>  #include 
>  
> +/* In order to run on systems like the PowerPC that have 3 different long
> +   double types, include math.h so it can choose what is the appropriate
> +   nextafterl function to use.
> +
> +   If we didn't use -fno-builtin for this test, the PowerPC compiler would have
> +   changed the names of the built-in functions that use long double.  The
> +   nextafter-1.c function runs with this mapping.
> +
> +   Since this test uses -fno-builtin, include math.h, so that math.h can make
> +   the appropriate choice to use.  */
> +#include <math.h>

So if you use -fno-builtin (or just for some functions), and you don't
include <math.h>, things just break?  Nasty.

Of course such things aren't proper C (you *have to* include <math.h>
if you use functions from there), but how often will code like this
happen in practice :-/


The patch is okay for trunk.  Thanks!


Segher


Re: [PATCH] [tree-optimization] Optimize two patterns with three xors.

2020-11-18 Thread Jeff Law via Gcc-patches



On 11/17/20 12:57 AM, Richard Biener via Gcc-patches wrote:
> On Tue, Nov 17, 2020 at 3:19 AM Eugene Rozenfeld
>  wrote:
>> Thank you for the review Richard!
>>
>> I re-worked the patch based on your suggestions (attached).
>> I made the change to reuse the first bit_xor in both patterns and I added :s 
>> to the last xor in the first pattern.
>> For the second pattern I didn't add :s because I think the simplification is 
>> beneficial even if the second or third bit_xor has more than one use since 
>> we are simplifying them to just a single operand (@2). If that is incorrect, 
>> please explain why.
> Ah, true, that's correct.
>
> The patch is OK.
I've installed this on the trunk.

Eugene, if you're going to contribute regularly you should probably go
ahead and get commit privs so that you can commit ACK'd patches
yourself.   There should be a link to a form from this page:

https://gcc.gnu.org/gitwrite.html


Jeff



Re: [PATCH] PowerPC: Restrict long double test to use IBM long double.

2020-11-18 Thread Segher Boessenkool
Hi!

On Sun, Nov 15, 2020 at 12:23:50PM -0500, Michael Meissner wrote:
> --- a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
> +++ b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
> @@ -1,4 +1,5 @@
>  /* { dg-skip-if "" { ! "powerpc*-*-linux*" } } */
> +/* { dg-require-effective-target ppc_long_double_ibm } */

You can remove the dg-skip-if then (there is nothing in this test that
requires Linux).  But you want a
/* { dg-require-effective-target dfp } */
(or dfprt).

So what happens if you use IEEE QP float, instead?  You didn't explain.
(Explain in the source code, with a comment where you require it!)


Segher


Re: [PATCH] openmp: Retire nest-var ICV

2020-11-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 18, 2020 at 07:05:19PM +0000, Kwok Cheung Yeung wrote:
> From a75481979c86aa1da5b5da641fc776bc71d156f7 Mon Sep 17 00:00:00 2001
> From: Kwok Cheung Yeung 
> Date: Wed, 18 Nov 2020 10:02:00 -0800
> Subject: [PATCH] openmp: Retire nest-var ICV for OpenMP 5.1
> 
> This removes the nest-var ICV, expressing nesting in terms of the
> max-active-levels-var ICV instead.  The max-active-levels-var ICV
> is now per data environment rather than per device.
> 
> 2020-11-18  Kwok Cheung Yeung  
> 
>   libgomp/
>   * env.c (gomp_global_icv): Remove nest_var field.  Add
>   max_active_levels_var field.
>   (gomp_max_active_levels_var): Remove.
>   (parse_boolean): Return true on success.
>   (handle_omp_display_env): Express OMP_NESTED in terms of
>   max_active_levels_var.  Change format specifier for
>   max_active_levels_var.
>   (initialize_env): Set max_active_levels_var from
>   OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
>   OMP_PROC_BIND.
>   * icv.c (omp_set_nested): Express in terms of
>   max_active_levels_var.
>   (omp_get_nested): Likewise.
>   (omp_set_max_active_levels): Use max_active_levels_var field instead
>   of gomp_max_active_levels_var.
>   (omp_get_max_active_levels): Likewise.
>   * libgomp.h (struct gomp_task_icv): Remove nest_var field.  Add
>   max_active_levels_var field.
>   (gomp_supported_active_levels): Set to UCHAR_MAX.
>   (gomp_max_active_levels_var): Delete.
>   * libgomp.texi (omp_get_nested): Update documentation.
>   (omp_set_nested): Likewise.
>   (OMP_MAX_ACTIVE_LEVELS): Likewise.
>   (OMP_NESTED): Likewise.
>   (OMP_NUM_THREADS): Likewise.
>   (OMP_PROC_BIND): Likewise.
>   * parallel.c (gomp_resolve_num_threads): Replace reference
>   to nest_var with max_active_levels_var.  Use max_active_levels_var
>   field instead of gomp_max_active_levels_var.

LGTM, thanks.

Jakub



Re: [PATCH] options, lto: Optimize streaming of optimization nodes

2020-11-18 Thread Joseph Myers
On Wed, 18 Nov 2020, Jakub Jelinek via Gcc-patches wrote:

> Hi!
> 
> Reposting with self-contained description per Joseph's request:
> 
> Honza mentioned that especially for the new param machinery, most of
> streamed values are probably going to be the default values.  Perhaps
> somehow we could stream them more effectively.
> 
> This patch implements it and brings further savings, the size
> goes down from 574 bytes to 273 bytes, i.e. less than half.
> Not trying to handle enums because the code doesn't know if (enum ...) 10
> is even valid, similarly non-parameters because those really generally
> don't have large initializers, and params without Init (those are 0
> initialized and thus don't need to be handled).
> 
> Bootstrapped/regtested again on x86_64-linux and i686-linux, ok for trunk?
> 
> 2020-11-18  Jakub Jelinek  
> 
>   * optc-save-gen.awk: Initialize var_opt_init.  In
>   cl_optimization_stream_out for params with default values larger than
>   10, xor the default value with the actual parameter value.  In
>   cl_optimization_stream_in repeat the above xor.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] openmp: Retire nest-var ICV

2020-11-18 Thread Kwok Cheung Yeung

On 18/11/2020 11:41 am, Jakub Jelinek wrote:

On Thu, Nov 12, 2020 at 10:44:35PM +0000, Kwok Cheung Yeung wrote:

+  /* OMP_NESTED is deprecated in OpenMP 5.0.  */
+  if (parse_boolean ("OMP_NESTED", &nested))
+   gomp_global_icv.max_active_levels_var =
+   nested ? gomp_supported_active_levels : 1;


Formatting - = should be on the next line, indented 2 columns further from
gomp_global_icv.



Fixed.


  int
  omp_get_nested (void)
  {
struct gomp_task_icv *icv = gomp_icv (false);
-  return icv->nest_var;
+  return icv->max_active_levels_var > 1
+  && icv->max_active_levels_var > omp_get_active_level ();


Formatting, should be:
   return (icv->max_active_levels_var > 1
  && icv->max_active_levels_var > omp_get_active_level ());



Fixed.


@@ -118,19 +122,21 @@ omp_get_thread_limit (void)
  void
  omp_set_max_active_levels (int max_levels)
  {
+  struct gomp_task_icv *icv = gomp_icv (false);


Should be gomp_icv (true), because it modifies the ICVs rather than
just querying them.  And perhaps move it inside of the if (max_levels >= 0)
if.


Done.


So, let's change gomp_supported_active_levels to say 255 and use
   bool dyn_var;
   unsigned char max_active_levels_var;
   char bind_var;



Done (though I used UCHAR_MAX instead of 255). The change in type requires 
changing a format specifier from %lu to %u in handle_omp_display_env, and the 
use of a temporary when parsing OMP_MAX_ACTIVE_LEVELS in initialize_env.
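
To summarise the review points above in one place, here is a small
standalone sketch of how nesting is expressed through max-active-levels-var.
It is not the libgomp code: all names are hypothetical, the parser takes the
text directly instead of calling getenv, and it is simpler than libgomp's
parse_boolean (it ignores all whitespace, not just leading and trailing
whitespace).

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for gomp_supported_active_levels (UCHAR_MAX).
constexpr unsigned char demo_supported_active_levels = 255;

// Return true only if TEXT is present and spells "true" or "false"
// (case-insensitive, whitespace ignored); store the result in *VALUE.
bool
parse_boolean_text (const char *text, bool *value)
{
  if (text == nullptr)
    return false;
  std::string s;
  for (const char *p = text; *p; ++p)
    if (!std::isspace (static_cast<unsigned char> (*p)))
      s += static_cast<char> (std::tolower (static_cast<unsigned char> (*p)));
  if (s == "true")
    {
      *value = true;
      return true;
    }
  if (s == "false")
    {
      *value = false;
      return true;
    }
  return false;
}

// OMP_NESTED only overrides the current setting when it was present
// and parsed successfully, which is why the parser returns a bool.
unsigned char
max_levels_from_nested_env (const char *text, unsigned char current)
{
  bool nested;
  if (parse_boolean_text (text, &nested))
    return nested ? demo_supported_active_levels : 1;
  return current;
}

// Same condition as the reworked omp_get_nested: nesting counts as
// enabled when another active level is still allowed.
bool
nested_enabled (unsigned char max_active_levels, int active_level)
{
  return (max_active_levels > 1
	  && max_active_levels > active_level);
}
```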


If there are no objections, I will commit this to master and OG10 shortly. 
Bootstrapped on x86_64 with no offloading, and tested libgomp with Nvidia 
offloading with no regressions.


Thanks

Kwok
From a75481979c86aa1da5b5da641fc776bc71d156f7 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Wed, 18 Nov 2020 10:02:00 -0800
Subject: [PATCH] openmp: Retire nest-var ICV for OpenMP 5.1

This removes the nest-var ICV, expressing nesting in terms of the
max-active-levels-var ICV instead.  The max-active-levels-var ICV
is now per data environment rather than per device.

2020-11-18  Kwok Cheung Yeung  

libgomp/
* env.c (gomp_global_icv): Remove nest_var field.  Add
max_active_levels_var field.
(gomp_max_active_levels_var): Remove.
(parse_boolean): Return true on success.
(handle_omp_display_env): Express OMP_NESTED in terms of
max_active_levels_var.  Change format specifier for
max_active_levels_var.
(initialize_env): Set max_active_levels_var from
OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
OMP_PROC_BIND.
* icv.c (omp_set_nested): Express in terms of
max_active_levels_var.
(omp_get_nested): Likewise.
(omp_set_max_active_levels): Use max_active_levels_var field instead
of gomp_max_active_levels_var.
(omp_get_max_active_levels): Likewise.
* libgomp.h (struct gomp_task_icv): Remove nest_var field.  Add
max_active_levels_var field.
(gomp_supported_active_levels): Set to UCHAR_MAX.
(gomp_max_active_levels_var): Delete.
* libgomp.texi (omp_get_nested): Update documentation.
(omp_set_nested): Likewise.
(OMP_MAX_ACTIVE_LEVELS): Likewise.
(OMP_NESTED): Likewise.
(OMP_NUM_THREADS): Likewise.
(OMP_PROC_BIND): Likewise.
* parallel.c (gomp_resolve_num_threads): Replace reference
to nest_var with max_active_levels_var.  Use max_active_levels_var
field instead of gomp_max_active_levels_var.
---
 libgomp/env.c| 44 ++
 libgomp/icv.c| 17 ++-
 libgomp/libgomp.h|  5 ++---
 libgomp/libgomp.texi | 60 ++--
 libgomp/parallel.c   |  4 ++--
 5 files changed, 90 insertions(+), 40 deletions(-)

diff --git a/libgomp/env.c b/libgomp/env.c
index ab22525..5a49ae6 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -68,12 +68,11 @@ struct gomp_task_icv gomp_global_icv = {
   .run_sched_chunk_size = 1,
   .default_device_var = 0,
   .dyn_var = false,
-  .nest_var = false,
+  .max_active_levels_var = 1,
   .bind_var = omp_proc_bind_false,
   .target_data = NULL
 };
 
-unsigned long gomp_max_active_levels_var = gomp_supported_active_levels;
 bool gomp_cancel_var = false;
 enum gomp_target_offload_t gomp_target_offload_var
   = GOMP_TARGET_OFFLOAD_DEFAULT;
@@ -959,16 +958,17 @@ parse_spincount (const char *name, unsigned long long *pvalue)
 }
 
 /* Parse a boolean value for environment variable NAME and store the
-   result in VALUE.  */
+   result in VALUE.  Return true if one was present and it was
+   successfully parsed.  */
 
-static void
+static bool
 parse_boolean (const char *name, bool *value)
 {
   const char *env;
 
   env = getenv (name);
   if (env == NULL)
-return;
+return false;
 
   while (isspace ((unsigned char) *env))
 ++env;
@@ -987,7 +987,11 @@ parse_boolean (const char *name, bool *value)
   while (isspace 

Re: [PATCH] configury: --enable-link-serialization support

2020-11-18 Thread Joseph Myers
On Wed, 18 Nov 2020, Jakub Jelinek via Gcc-patches wrote:

> Bootstrapped/regtested again last night on x86_64-linux and i686-linux, ok
> for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] plugins: Allow plugins to handle global_options changes

2020-11-18 Thread Joseph Myers
On Wed, 18 Nov 2020, Jakub Jelinek via Gcc-patches wrote:

> On Wed, Nov 18, 2020 at 10:39:46AM +0100, Richard Biener wrote:
> > We already have --{enable,disable}-plugin, so could remove it when
> > those are not enabled.
> 
> Here is a variant that does that:
> 
> 2020-11-18  Jakub Jelinek  
> 
>   * opts.h (struct cl_var): New type.
>   (cl_vars): Declare.
>   * optc-gen.awk: Generate cl_vars array.

This version is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 1/5] testsuite: Fix vect/vect-sdiv-pow2-1.c

2020-11-18 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Tue, Nov 17, 2020 at 2:02 PM Richard Sandiford
>  wrote:
>>
>> Richard Biener via Gcc-patches  writes:
>> > On Tue, Nov 17, 2020 at 12:24 PM Richard Sandiford via Gcc-patches
>> >  wrote:
>> >>
>> >> We're now able to vectorise the set-up loop:
>> >>
>> >>   int p = power2 (fns[i].po2);
>> >>   for (int j = 0; j < N; j++)
>> >> a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
>> >>
>> >> Rather than adjust the expected output for that, it seemed better
>> >> to disable optimisation for the testing code.
>> >>
>> >> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
>> >> and x86_64-linux-gnu.  OK to install?
>> >
>> > In other places we just add an asm ("" : : : "memory") to the loop body,
>> > can you do it like that?
>>
>> I wondered about that, but I don't think it's reliable long-term.
>> We could (perhaps rightly) decide that it's a win to vectorise the
>> rhs of a[j] even if the asm prevents us from doing a vector store.
>
> But this is about dump-scanning and I'd rather avoid optimize attributes
> since that removes coverage gained by people running the testsuite
> with random set of options.

Yeah, but I'd argue that getting optimisation coverage of the validity
checking isn't really a good thing, since it just increases the chances
that the validity code will be misoptimised in the same way as the code
that it's testing.  Can see it cuts both ways though.

> We do have #pragma no_vector support in the middle-end just not
> yet in the C FE parser (see where it builds ANNOTATE_EXPRs,
> add support for the annot_expr_no_vector_kind).  If you want a
> future-proof reliable way to disable vectorizing a loop, that is.

OK, I went for your original suggestion of using an asm.

Thanks,
Richard


gcc/testsuite/
* gcc.dg/vect/vect-sdiv-pow2-1.c (main): Add an asm to the
set-up loop.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
index be70bc6c47e..484efb1e8c8 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
@@ -62,7 +62,10 @@ main (void)
 {
   int p = power2 (fns[i].po2);
   for (int j = 0; j < N; j++)
-a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
+   {
+ a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
+ asm volatile ("" ::: "memory");
+   }
 
   fns[i].div (b, a, N);
   fns[i].mod (c, a, N);
-- 
2.17.1
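
For reference, a standalone sketch of the technique the patch settles on: an
empty asm with a "memory" clobber in the body of the set-up loop.  The names
demo_n and fill_inputs are hypothetical, and the asm syntax is a GNU
extension.  As discussed above, this discourages the vectoriser today but is
not a guaranteed long-term barrier.

```cpp
#include <cassert>

// Hypothetical array length, standing in for N in the test.
constexpr int demo_n = 8;

// Fill A with the same expression the test's set-up loop uses.
void
fill_inputs (int *a, int p)
{
  for (int j = 0; j < demo_n; j++)
    {
      a[j] = ((p << 4) * j) / (demo_n - 1) - (p << 5);
      // Empty asm that claims to clobber memory.  The compiler has to
      // assume memory changes on every iteration, which currently
      // keeps the loop from being vectorised.
      asm volatile ("" ::: "memory");
    }
}
```

The computed values are unaffected by the asm; only the optimiser's view of
the loop changes.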



Re: [32/32] fixinclude

2020-11-18 Thread Nathan Sidwell

This is what I've pushed.

This fixes an ODR violation in the AIX headers that is detected by C++
modules.  While unnamed structs with typedef names for linkage
purposes are accepted, this case is an anonymous struct without such a
typedef name -- the typedef is attached to the pointer-to-struct type.
Fixed by naming the struct.

fixincludes/
* inclhack.def (aix_physadr_t): New.
* fixincl.x: Regenerated.

nathan

--
Nathan Sidwell
diff --git c/fixincludes/fixincl.x w/fixincludes/fixincl.x
index 758d5620641..21439652bce 100644
--- c/fixincludes/fixincl.x
+++ w/fixincludes/fixincl.x
@@ -2,11 +2,11 @@
  *
  * DO NOT EDIT THIS FILE   (fixincl.x)
  *
- * It has been AutoGen-ed  October  3, 2020 at 11:40:52 PM by AutoGen 5.18
+ * It has been AutoGen-ed  October 21, 2020 at 10:43:22 AM by AutoGen 5.18.16
  * From the definitionsinclhack.def
  * and the template file   fixincl
  */
-/* DO NOT SVN-MERGE THIS FILE, EITHER Sat Oct  3 23:40:52 UTC 2020
+/* DO NOT SVN-MERGE THIS FILE, EITHER Wed Oct 21 10:43:22 EDT 2020
  *
  * You must regenerate it.  Use the ./genfixes script.
  *
@@ -15,7 +15,7 @@
  * certain ANSI-incompatible system header files which are fixed to work
  * correctly with ANSI C and placed in a directory that GNU C will search.
  *
- * This file contains 259 fixup descriptions.
+ * This file contains 260 fixup descriptions.
  *
  * See README for more information.
  *
@@ -1247,6 +1247,43 @@ static const char* apzAix_Rwlock_Initializer_1Patch[] = {
 {{ \\\n",
 (char*)NULL };
 
+/* * * * * * * * * * * * * * * * * * * * * * * * * *
+ *
+ *  Description of Aix_Physadr_T fix
+ */
+tSCC zAix_Physadr_TName[] =
+ "aix_physadr_t";
+
+/*
+ *  File name selection pattern
+ */
+tSCC zAix_Physadr_TList[] =
+  "sys/types.h\0";
+/*
+ *  Machine/OS name selection pattern
+ */
+tSCC* apzAix_Physadr_TMachs[] = {
+"*-*-aix*",
+(const char*)NULL };
+
+/*
+ *  content selection pattern - do fix if pattern found
+ */
+tSCC zAix_Physadr_TSelect0[] =
+   "typedef[ \t]*struct[ \t]*([{][^}]*[}][ \t]*\\*[ \t]*physadr_t;)";
+
+#define AIX_PHYSADR_T_TEST_CT  1
+static tTestDesc aAix_Physadr_TTests[] = {
+  { TT_EGREP,zAix_Physadr_TSelect0, (regex_t*)NULL }, };
+
+/*
+ *  Fix Command Arguments for Aix_Physadr_T
+ */
+static const char* apzAix_Physadr_TPatch[] = {
+"format",
+"typedef struct __physadr_s %1",
+(char*)NULL };
+
 /* * * * * * * * * * * * * * * * * * * * * * * * * *
  *
  *  Description of Aix_Pthread fix
@@ -10521,9 +10558,9 @@ static const char* apzX11_SprintfPatch[] = {
  *
  *  List of all fixes
  */
-#define REGEX_COUNT  297
+#define REGEX_COUNT  298
 #define MACH_LIST_SIZE_LIMIT 187
-#define FIX_COUNT 259
+#define FIX_COUNT 260
 
 /*
  *  Enumerate the fixes
@@ -10555,6 +10592,7 @@ typedef enum {
 AIX_MUTEX_INITIALIZER_1_FIXIDX,
 AIX_COND_INITIALIZER_1_FIXIDX,
 AIX_RWLOCK_INITIALIZER_1_FIXIDX,
+AIX_PHYSADR_T_FIXIDX,
 AIX_PTHREAD_FIXIDX,
 AIX_STDINT_1_FIXIDX,
 AIX_STDINT_2_FIXIDX,
@@ -10921,6 +10959,11 @@ tFixDesc fixDescList[ FIX_COUNT ] = {
  AIX_RWLOCK_INITIALIZER_1_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE,
  aAix_Rwlock_Initializer_1Tests,   apzAix_Rwlock_Initializer_1Patch, 0 },
 
+  {  zAix_Physadr_TName,zAix_Physadr_TList,
+ apzAix_Physadr_TMachs,
+ AIX_PHYSADR_T_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE,
+ aAix_Physadr_TTests,   apzAix_Physadr_TPatch, 0 },
+
   {  zAix_PthreadName,zAix_PthreadList,
  apzAix_PthreadMachs,
  AIX_PTHREAD_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE,
diff --git c/fixincludes/inclhack.def w/fixincludes/inclhack.def
index 47eb236586c..80c9adfb07c 100644
--- c/fixincludes/inclhack.def
+++ w/fixincludes/inclhack.def
@@ -720,6 +720,20 @@ fix = {
 		"{ \n";
 };
 
+
+/* On AIX 'typedef struct {} * physadr_t;' needs to give the struct a
+   name for linkage purposes.  Fortunately it is on exactly one
+   line.  */
+fix = {
+hackname  = aix_physadr_t;
+mach  = "*-*-aix*";
+files = sys/types.h;
+select= "typedef[ \t]*struct[ \t]*([{][^}]*[}][ \t]*\\*[ \t]*physadr_t;)";
+c_fix = format;
+c_fix_arg = "typedef struct __physadr_s %1";
+test_text = "typedef struct __physadr_s {";
+};
+
 /*
  *  pthread.h on AIX 4.3.3 tries to define a macro without whitspace
  *  which violates a requirement of ISO C.
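
As an illustration of what the new fix does to a matching line, the sketch
below applies the same select pattern with std::regex.  This is only an
approximation under stated assumptions: fixincludes itself uses POSIX regexes
and its own "format" substitution with %1, whereas std::regex uses the
ECMAScript grammar and $1.  The function name and the struct body in the
usage example are hypothetical; the real AIX header contents are not shown in
this thread.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Give the anonymous struct behind physadr_t a name, in the way the
// aix_physadr_t fix describes.  Lines that do not match the select
// pattern are returned unchanged.
std::string
name_physadr_struct (const std::string &line)
{
  static const std::regex select
    (R"(typedef[ \t]*struct[ \t]*([{][^}]*[}][ \t]*\*[ \t]*physadr_t;))");
  return std::regex_replace (line, select, "typedef struct __physadr_s $1");
}
```

For example, "typedef struct { int r[1]; } *physadr_t;" would become
"typedef struct __physadr_s { int r[1]; } *physadr_t;", which gives the
struct a name for linkage purposes.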


preprocessor: C++ module-directives

2020-11-18 Thread Nathan Sidwell


C++20 modules introduces a new kind of preprocessor directive -- a
module directive.  These are directives but without the leading '#'.
We have to detect them by sniffing the start of a logical line.  When
detected we replace the initial identifiers with unspellable tokens
and pass them through to the language parser the same way deferred
pragmas are.  There's a PRAGMA_EOL at the logical end of line too.

One additional complication is that we have to do header-name lexing
after the initial tokens, and that requires changes in the macro-aware
piece of the preprocessor.  The above sniffer sets a counter in the
lexer state, and that triggers at the appropriate point.  We then do
the same header-name lexing that occurs on a #include directive or
has_include pseudo-macro.  Except that the header name ends up in the
token stream.

A couple of token emitters need to deal with the new token possibility.
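
To make the "sniffing" step concrete, here is a deliberately simplified
sketch that only looks at the leading identifiers of a logical line.  It is
not the libcpp implementation: the function names are hypothetical, and the
real cpp_maybe_module_directive also peeks at what follows, so that for
instance a line such as "import = 3;" is not treated as a directive.  This
sketch would wrongly accept that line.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Skip blanks, then read one identifier starting at POS.  Returns the
// identifier (possibly empty) and leaves POS just past it.
static std::string
next_identifier (const std::string &line, std::size_t &pos)
{
  while (pos < line.size () && (line[pos] == ' ' || line[pos] == '\t'))
    ++pos;
  std::size_t start = pos;
  while (pos < line.size ()
	 && (std::isalnum (static_cast<unsigned char> (line[pos]))
	     || line[pos] == '_'))
    ++pos;
  return line.substr (start, pos - start);
}

// True if the line starts with "module" or "import", optionally
// preceded by "export".  No '#' is involved, unlike other directives.
bool
looks_like_module_directive (const std::string &line)
{
  std::size_t pos = 0;
  std::string word = next_identifier (line, pos);
  if (word == "export")
    word = next_identifier (line, pos);
  return word == "module" || word == "import";
}
```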

gcc/c-family/
* c-lex.c (c_lex_with_flags): CPP_HEADER_NAMEs can now be seen.
libcpp/
* include/cpplib.h (struct cpp_options): Add module_directives
option.
(NODE_MODULE): New node flag.
(struct cpp_hashnode): Make rid-code a bitfield, increase bits in
flags and swap with type field.
* init.c (post_options): Create module-directive identifier nodes.
* internal.h (struct lexer_state): Add directive_file_token &
n_modules fields.  Add module node enumerator.
* lex.c (cpp_maybe_module_directive): New.
(_cpp_lex_token): Call it.
(cpp_output_token): Add '"' around CPP_HEADER_NAME token.
(do_peek_ident, do_peek_module): New.
(cpp_directives_only): Detect module-directive lines.
* macro.c (cpp_get_token_1): Deal with directive_file_token
triggering.

pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/c-family/c-lex.c w/gcc/c-family/c-lex.c
index 8dd1420d10d..c8d33d0c9d1 100644
--- i/gcc/c-family/c-lex.c
+++ w/gcc/c-family/c-lex.c
@@ -667,8 +667,11 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
   *value = build_int_cst (integer_type_node, tok->val.pragma);
   break;
 
-  /* These tokens should not be visible outside cpplib.  */
 case CPP_HEADER_NAME:
+  *value = build_string (tok->val.str.len, (const char *)tok->val.str.text);
+  break;
+
+  /* This token should not be visible outside cpplib.  */
 case CPP_MACRO_ARG:
   gcc_unreachable ();
 
diff --git i/libcpp/include/cpplib.h w/libcpp/include/cpplib.h
index 389af32bc5c..630f2e055d1 100644
--- i/libcpp/include/cpplib.h
+++ w/libcpp/include/cpplib.h
@@ -487,6 +487,9 @@ struct cpp_options
   /* Nonzero for the '::' token.  */
   unsigned char scope;
 
+  /* Nonzero means tokenize C++20 module directives.  */
+  unsigned char module_directives;
+
   /* Holds the name of the target (execution) character set.  */
   const char *narrow_charset;
 
@@ -842,6 +845,7 @@ struct GTY(()) cpp_macro {
 #define NODE_USED	(1 << 5)	/* Dumped with -dU.  */
 #define NODE_CONDITIONAL (1 << 6)	/* Conditional macro */
 #define NODE_WARN_OPERATOR (1 << 7)	/* Warn about C++ named operator.  */
+#define NODE_MODULE (1 << 8)		/* C++-20 module-related name.  */
 
 /* Different flavors of hash node.  */
 enum node_type
@@ -900,11 +904,11 @@ struct GTY(()) cpp_hashnode {
   unsigned int directive_index : 7;	/* If is_directive,
 	   then index into directive table.
 	   Otherwise, a NODE_OPERATOR.  */
-  unsigned char rid_code;		/* Rid code - for front ends.  */
+  unsigned int rid_code : 8;		/* Rid code - for front ends.  */
+  unsigned int flags : 9;		/* CPP flags.  */
   ENUM_BITFIELD(node_type) type : 2;	/* CPP node type.  */
-  unsigned int flags : 8;		/* CPP flags.  */
 
-  /* 6 bits spare (plus another 32 on 64-bit hosts).  */
+  /* 5 bits spare (plus another 32 on 64-bit hosts).  */
 
   union _cpp_hashnode_value GTY ((desc ("%1.type"))) value;
 };
diff --git i/libcpp/init.c w/libcpp/init.c
index 76882bc5f1c..fc826583d3a 100644
--- i/libcpp/init.c
+++ w/libcpp/init.c
@@ -843,4 +843,27 @@ post_options (cpp_reader *pfile)
   CPP_OPTION (pfile, trigraphs) = 0;
   CPP_OPTION (pfile, warn_trigraphs) = 0;
 }
+
+  if (CPP_OPTION (pfile, module_directives))
+{
+  /* These unspellable tokens have a trailing space.  */
+  const char *const inits[spec_nodes::M_HWM]
+	= {"export ", "module ", "import ", "__import"};
+
+  for (int ix = 0; ix != spec_nodes::M_HWM; ix++)
+	{
+	  cpp_hashnode *node = cpp_lookup (pfile, UC (inits[ix]),
+	   strlen (inits[ix]));
+
+	  /* Token we pass to the compiler.  */
+	  pfile->spec_nodes.n_modules[ix][1] = node;
+
+	  if (ix != spec_nodes::M__IMPORT)
+	/* Token we recognize when lexing, drop the trailing ' '.  */
+	node = cpp_lookup (pfile, NODE_NAME (node), NODE_LEN (node) - 1);
+
+	  node->flags |= NODE_MODULE;
+	  pfile->spec_nodes.n_modules[ix][0] = node;
+	}
+}
 }
diff --git i/libcpp/internal.h 

Re: [PATCH] c++: Allow template lambdas without lambda-declarator [PR97839]

2020-11-18 Thread Marek Polacek via Gcc-patches
On Tue, Nov 17, 2020 at 01:05:20PM -0500, Marek Polacek via Gcc-patches wrote:
> Our implementation of template lambdas incorrectly requires the optional
> lambda-declarator.  This was probably required by an early draft of
> generic lambdas, but now the production is [expr.prim.lambda.general]:
> 
>  lambda-expression:
> lambda-introducer lambda-declarator [opt] compound-statement
> lambda-introducer < template-parameter-list > requires-clause [opt]
> lambda-declarator [opt] compound-statement
> 
> Therefore, we should accept the following test.
> 
> Incidentally, I noticed we give a terrible diagnostic when the user uses
> 'mutable', but forgets to type '()' before it, which sounds like a common
> mistake.  So it seems to me we should handle that specifically, rather
> than to emit this:

This might be necessary to handle  anyway.

> lambda-generic8.C: In lambda function:
> lambda-generic8.C:8:18: error: expected '{' before 'mutable'
>     8 |   []<typename T> mutable {}.operator()<int>();
>       |                  ^~~~~~~
> lambda-generic8.C: In function 'int main()':
> lambda-generic8.C:8:17: error: expected ';' before 'mutable'
>     8 |   []<typename T> mutable {}.operator()<int>();
>       |                 ^~~~~~~~
>       |                 ;
> lambda-generic8.C:8:28: error: expected primary-expression before '.' token
>     8 |   []<typename T> mutable {}.operator()<int>();
>       |                            ^
> lambda-generic8.C:8:40: error: expected primary-expression before 'int'
>     8 |   []<typename T> mutable {}.operator()<int>();
>       |                                        ^~~
> 
> Is it okay to fix this in stage3?
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> gcc/cp/ChangeLog:
> 
>   PR c++/97839
>   * parser.c (cp_parser_lambda_declarator_opt): Don't require ().
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c++/97839
>   * g++.dg/cpp2a/lambda-generic8.C: New test.
> ---
>  gcc/cp/parser.c  | 14 ++
>  gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C |  9 +
>  2 files changed, 15 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
> 
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 42f705266bb..9f09c778c29 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -10604,6 +10604,8 @@ cp_parser_trait_expr (cp_parser* parser, enum rid keyword)
>  
> lambda-expression:
>   lambda-introducer lambda-declarator [opt] compound-statement
> + lambda-introducer < template-parameter-list > requires-clause [opt]
> +   lambda-declarator [opt] compound-statement
>  
> Returns a representation of the expression.  */
>  
> @@ -11061,13 +11063,11 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
>  /* Parse the (optional) middle of a lambda expression.
>  
> lambda-declarator:
> - < template-parameter-list [opt] >
> -   requires-clause [opt]
> - ( parameter-declaration-clause [opt] )
> -   attribute-specifier [opt]
> + ( parameter-declaration-clause )
> decl-specifier-seq [opt]
> -   exception-specification [opt]
> -   lambda-return-type-clause [opt]
> +   noexcept-specifier [opt]
> +   attribute-specifier-seq [opt]
> +   trailing-return-type [opt]
> requires-clause [opt]
>  
> LAMBDA_EXPR is the current representation of the lambda expression.  */
> @@ -11217,8 +11217,6 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, 
> tree lambda_expr)
>   trailing-return-type in case of decltype.  */
>pop_bindings_and_leave_scope ();
>  }
> -  else if (template_param_list != NULL_TREE) // generate diagnostic
> -cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN);
>  
>/* Create the function call operator.
>  
> diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C 
> b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
> new file mode 100644
> index 000..f3c3809b36d
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
> @@ -0,0 +1,9 @@
> +// PR c++/97839
> +// { dg-do compile { target c++20 } }
> +// Test that a lambda with <template-param-list> doesn't require
> +// a lambda-declarator.
> +
> +int main()
> +{
> +  []<typename T>{}.operator()<int>();
> +}
> 
> base-commit: 8661f4faa875f361cd22a197774c1fa04cd0580b
> -- 
> 2.28.0
> 

Marek



Re: [PATCH 4/X] libsanitizer: options: Add hwasan flags and argument parsing

2020-11-18 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 
> 5320e6c1e1e3c8d1482c20590049f763e11f8ff0..84050058be8eaa306b07655737e49ea8b6eb21a9
>  100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13709,6 +13709,53 @@ is greater or equal to this number, use callbacks 
> instead of inline checks.
>  E.g. to disable inline code use
>  @option{--param asan-instrumentation-with-call-threshold=0}.
>  
> +@item hwasan-instrument-stack
> +Enable hwasan instrumentation of statically sized stack-allocated variables.
> +This kind of instrumentation is enabled by default when using
> +@option{-fsanitize=hwaddress} and disabled by default when using
> +@option{-fsanitize=kernel-hwaddress}.
> +To disable stack instrumentation use
> +@option{--param hwasan-instrument-stack=0}, and to enable it use
> +@option{--param hwasan-instrument-stack=1}.
> +
> +@item hwasan-random-frame-tag
> +When using stack instrumentation, decide tags for stack variables using a
> +deterministic sequence beginning at a random tag for each frame.  Usually 
> tags
> +are chosen using the same sequence beginning from 1.
> +This is enabled by default for @option{-fsanitize=hwaddress} and unavailable
> +for @option{-fsanitize=kernel-hwaddress}.
> +To disable it use @option{--param hwasan-random-frame-tag=0}.

I think it would be worth clarifying this.  I wasn't sure whether
“Usually tags are chosen…” was describing the “deterministic random
sequence” or whether it was describing the behaviour of
hwasan-random-frame-tag=0.  If it's describing the behaviour of
hwasan-random-frame-tag=0, then I'm not sure “usually” applies,
given that hwasan-random-frame-tag=1 is the default.
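
To make the second reading concrete, here is a rough sketch of what the
quoted wording seems to describe.  It is only an illustration of the
documentation text, not GCC's or libhwasan's actual tag assignment:
frame_tags_sketch is a hypothetical helper, tags are assumed to be 8 bits
wide, and reserved tag values are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

/* Hypothetical helper: hand out tags for N_VARS stack variables of one
   frame, following a fixed sequence that starts at FIRST_TAG.  With
   hwasan-random-frame-tag=1 FIRST_TAG would be chosen randomly per frame;
   with hwasan-random-frame-tag=0 the documentation suggests it is always 1.
   The 8-bit tag width and the wrap-around are assumptions of this sketch.  */
std::vector<std::uint8_t>
frame_tags_sketch (std::uint8_t first_tag, unsigned n_vars)
{
  assert (n_vars <= 256);
  std::vector<std::uint8_t> tags;
  tags.reserve (n_vars);
  for (unsigned i = 0; i < n_vars; ++i)
    /* The cast makes the sequence wrap modulo 256.  */
    tags.push_back (static_cast<std::uint8_t> (first_tag + i));
  return tags;
}
```

Under that reading the sequence itself is the same in both modes and only
the starting point differs, which is why “usually” reads oddly next to a
default of 1.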

> […]
> +@item -fsanitize=kernel-hwaddress
> +@opindex fsanitize=kernel-hwaddress
> +Enable Hardware-assisted AddressSanitizer for compilation of the Linux 
> kernel.
> +Similar to @option{-fsanitize=kernel-address} but using an alternate
> +instrumentation method, and similar to @option{-fsanitize=hwaddress} but with
> +instrumentation differences necessary for compiling the Linux kernel.
> +These differences are to avoid hwasan library initialisation calls and to

initialization

> +account for the stack pointer having a different value in its top byte.
> +Note: This option has different defaults to the 
> @option{-fsanitize=hwaddress}.

texinfo has:

  @quotation Note
  @end quotation

for this, but we don't seem to use it.  Still, I think it would be better
to break the paragraph before Note: and use:

  @emph{Note:}

which seems to be the preferred style in the GCC manual.

> +Instrumenting the stack and alloca calls are not on by default but is still

@code{alloca}
s/is still/are still/

> +possible by specifying it on the command line with

maybe s/it on the command line with/the command-line options/?
Just a suggestion: would be happy with alternatives.

> +@option{--param hwasan-instrument-stack=1} and
> +@option{--param hwasan-instrument-allocas=1}. Using a random frame tag is not

and maybe add a “respectively” at the end of this sentence.

> +implemented for kernel instrumentation.
> +
>  @item -fsanitize=pointer-compare
>  @opindex fsanitize=pointer-compare
>  Instrument comparison operation (<, <=, >, >=) with pointer operands.
> […]
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 
> ac9972d9c386247af3482e07a94c76da3e1abb4d..f3662062c421e1c58c3243109891900eb2dc84bc
>  100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -823,6 +823,51 @@ control_options_for_live_patching (struct gcc_options 
> *opts,
>  /* --help option argument if set.  */
>  vec<const char *> help_option_arguments;
>  
> +/* Return the string name describing the argument provided on the command 
> line
> +which has set this particular flag.  */
> +const char *
> +find_argument (struct gcc_options *opts, unsigned int flags)
> +{

I think either (a) the name and comment need to mention the
sanitiser more explicitly or (b) this should be a lambda function
in report_conflicting_sanitizer_options:

  auto find_argument = [&](unsigned int flags)
{
  …
};

(in which case the implementation and comments above are fine as-is).
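
For what it's worth, option (b) could look roughly like the self-contained
sketch below.  The table, the flag values and the wrapper function are all
stand-ins invented for illustration (the real sanitizer_opts table and
SANITIZE_* values live in GCC and differ); only the two checks in the loop
are taken from the patch.

```cpp
#include <cassert>
#include <string>

/* Stand-in for GCC's sanitizer_opts table.  The struct, the flag values
   and the entries are assumptions made up for this sketch.  */
struct sanitizer_opt_sketch
{
  const char *name;
  unsigned int flag;
};

enum : unsigned int
{
  SK_ADDRESS = 1u << 0,
  SK_USER_ADDRESS = 1u << 1,
  SK_KERNEL_ADDRESS = 1u << 2,
  SK_HWADDRESS = 1u << 3
};

static const sanitizer_opt_sketch sketch_opts[] = {
  { "address", SK_ADDRESS | SK_USER_ADDRESS },
  { "kernel-address", SK_ADDRESS | SK_KERNEL_ADDRESS },
  { "hwaddress", SK_HWADDRESS },
  { nullptr, 0 }
};

/* ENABLED plays the role of opts->x_flag_sanitize.  Return the name of the
   command-line option responsible for FLAGS, or an empty string if no
   enabled option could have set them.  */
std::string
conflicting_option_sketch (unsigned int enabled, unsigned int flags)
{
  assert (flags != 0);

  /* The helper is local to its only user, so its generic name is fine.  */
  auto find_argument = [&] (unsigned int wanted) -> const char *
    {
      for (int i = 0; sketch_opts[i].name != nullptr; ++i)
        {
          /* Skip options that were not given on the command line.  */
          if ((sketch_opts[i].flag & enabled) != sketch_opts[i].flag)
            continue;
          /* Skip options that could not have set the wanted bits.  */
          if ((sketch_opts[i].flag & wanted) != wanted)
            continue;
          return sketch_opts[i].name;
        }
      return nullptr;
    };

  const char *name = find_argument (flags);
  return name ? std::string (name) : std::string ();
}
```

With kernel-address and hwaddress both enabled, asking for SK_ADDRESS
skips "address" (not fully enabled) and lands on "kernel-address", which
is the case the comment in the patch describes.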

> +  for (int i = 0; sanitizer_opts[i].name != NULL; ++i)
> +{
> +  /* Need to find the sanitizer_opts element which:
> +  a) Could have set the flags requested.
> +  b) Has been set on the command line.
> +
> +  Can have (a) without (b) if the flag requested is e.g.
> +  SANITIZE_ADDRESS, since both -fsanitize=address and
> +  -fsanitize=kernel-address set this flag.
> +
> +  Can have (b) without (a) by requesting more than one sanitizer on the
> +  command line.  */
> +  if ((sanitizer_opts[i].flag & opts->x_flag_sanitize)
> +   != sanitizer_opts[i].flag)
> + continue;
> +  if ((sanitizer_opts[i].flag & flags) != flags)
> + continue;
> +  return sanitizer_opts[i].name;
> +}
> +  return NULL;
> +}
> +
> +
> +/* Report any conflicting sanitizer options 

Re: [AArch64] Add --with-tune configure flag

2020-11-18 Thread Pop, Sebastian via Gcc-patches
Hi,

On 11/18/20, 10:17 AM, "Wilco Dijkstra"  wrote:
>I presume you're trying to unify the --with- options across most targets?

Yes, my intention was to provide the same configure options on arm64
as on x86, such that projects that already use those options can change
cpu name to "neoverse-n1" and that will build a compiler with the right
tuning for Graviton2.

Allowing arm64 users to specify all the flags available on x86 is important.

>That would be very useful! However there are significant differences between
>targets in how they interpret options like --with-arch=native (or -march). So
>those differences also need to be looked at and fixed to avoid unexpected
>results.
>
>As for the first patch, I think support for --with-tune requires more changes.
>Without proper processing of a --with-tune, you get an incorrect architecture
>version (if say the CPU you tune for is newer than the --with-cpu/arch
>or default).
>
>I posted patches to add --with-tune and fix various issues a while back:
>https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553865.html
>https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553866.html

Thanks for pointing me to your patches, I was not aware of these changes.
I see that your patches enable more use cases and fix several bugs.
These changes would definitely be good to have in trunk and branches.

My patch was the minimal change to enable --with-tune=neoverse-n1.

>As for your second patch, --with-cpu-64 could be a simple alias indeed,
>but what is the exact definition/expected behaviour of a --with-cpu-32
>on a target that only supports 64-bit code? The AArch64 target cannot
>generate AArch32 code, so we shouldn't silently accept it.

IMO allowing users to specify all the flags available on x86 is important.

Thanks,
Sebastian



Re: [PATCH 3/X] libsanitizer: Add option to bootstrap using HWASAN

2020-11-18 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> This is an analogous option to --bootstrap-asan to configure.  It allows
> bootstrapping GCC using HWASAN.
>
> For the same reasons as for ASAN we have to avoid using the HWASAN
> sanitizer when compiling libiberty and the lto-plugin.
>
> Also add a function to query whether -fsanitize=hwaddress has been
> passed.
>
> ChangeLog:
>
>   * configure: Regenerate.
>   * configure.ac: Add --bootstrap-hwasan option.
>
> config/ChangeLog:
>
>   * bootstrap-hwasan.mk: New file.
>
> gcc/ChangeLog:
>
>   * doc/install.texi: Document new option.
>
> libiberty/ChangeLog:
>
>   * configure: Regenerate.
>   * configure.ac: Avoid using sanitizer.
>
> lto-plugin/ChangeLog:
>
>   * Makefile.am: Avoid using sanitizer.
>   * Makefile.in: Regenerate.

OK, thanks.

> ### Attachment also inlined for ease of reply
> ###
>
>
> diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
> new file mode 100644
> index 
> ..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25
> --- /dev/null
> +++ b/config/bootstrap-hwasan.mk
> @@ -0,0 +1,8 @@
> +# This option enables -fsanitize=hwaddress for stage2 and stage3.
> +
> +STAGE2_CFLAGS += -fsanitize=hwaddress
> +STAGE3_CFLAGS += -fsanitize=hwaddress
> +POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
> +   -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
> +   -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
> +   -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs
> diff --git a/configure b/configure
> index 
> a2ea1a329b69de06906315e54a49c694c9704522..b41a258c80ee9f289de534185eb364bcb5ca6ae5
>  100755
> --- a/configure
> +++ b/configure
> @@ -9305,7 +9305,7 @@ fi
>  # or bootstrap-ubsan, bootstrap it.
>  if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; 
> then
>case "$BUILD_CONFIG" in
> -*bootstrap-asan* | *bootstrap-ubsan* )
> +*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
>bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
>bootstrap_fixincludes=yes
>;;
> diff --git a/configure.ac b/configure.ac
> index 
> 44fa75f3a329ef68f6800c8e09d49a9373f731cf..944f30cfea84e9266b4322df7902b867882d4d8c
>  100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2814,7 +2814,7 @@ fi
>  # or bootstrap-ubsan, bootstrap it.
>  if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; 
> then
>case "$BUILD_CONFIG" in
> -*bootstrap-asan* | *bootstrap-ubsan* )
> +*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
>bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
>bootstrap_fixincludes=yes
>;;
> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index 
> 60ee0a9dba17bf8f00d7c5320468ba847f08f8a0..e0f75b3d55582b5d1b8718f94e39c7d50656a926
>  100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -2794,6 +2794,11 @@ the build tree.
>  Compiles GCC itself using Address Sanitization in order to catch invalid 
> memory
>  accesses within the GCC code.
>  
> +@item @samp{bootstrap-hwasan}
> +Compiles GCC itself using HWAddress Sanitization in order to catch invalid
> +memory accesses within the GCC code.  This option is only available on 
> AArch64
> +systems that are running Linux kernel version 5.4 or later.
> +
>  @end table
>  
>  @section Building a cross compiler
> diff --git a/libiberty/configure b/libiberty/configure
> index 
> ff93c9ee9a6fa9c6bb1938bcdfffb2d0ae8c9698..b6af9baf21204a323cad0e7b40a426c72988ba3b
>  100755
> --- a/libiberty/configure
> +++ b/libiberty/configure
> @@ -5264,6 +5264,7 @@ fi
>  NOASANFLAG=
>  case " ${CFLAGS} " in
>*\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
> +  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
>  esac
>  
>  
> diff --git a/libiberty/configure.ac b/libiberty/configure.ac
> index 
> 4e2599c14a89bafcb8c7e523b9ce5b3d60b8c0f6..ad952963971a31968b5d109661b9cab0aa4b95fc
>  100644
> --- a/libiberty/configure.ac
> +++ b/libiberty/configure.ac
> @@ -240,6 +240,7 @@ AC_SUBST(PICFLAG)
>  NOASANFLAG=
>  case " ${CFLAGS} " in
>*\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
> +  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
>  esac
>  AC_SUBST(NOASANFLAG)
>  
> diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
> index 
> 204b25f45ef2f22bb246641a2aa9f9d09719737b..8b20e1d1d87e2dda9f37763492ddf39a8022c48c
>  100644
> --- a/lto-plugin/Makefile.am
> +++ b/lto-plugin/Makefile.am
> @@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS)
>  AM_CFLAGS = @ac_lto_plugin_warn_cflags@ $(CET_HOST_FLAGS)
>  AM_LDFLAGS = @ac_lto_plugin_ldflags@
>  AM_LIBTOOLFLAGS = --tag=disable-static
> -override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS))
> -override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS))
> 

Re: [PATCH 2/X] libsanitizer: Only build libhwasan when targeting AArch64

2020-11-18 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> Though the library has limited support for x86, we don't have any
> support for generating code targeting x86 so there is no point building
> for that target.
>
> Ensure we build for AArch64 but not for AArch64 ilp32.
>
> libsanitizer/ChangeLog:
>
>   * Makefile.am: Condition Build hwasan directory.
>   * Makefile.in: Regenerate.
>   * configure: Regenerate.
>   * configure.ac: Set HWASAN_SUPPORTED based on target
>   architecture.
>   * configure.tgt: Likewise.

OK, thanks.

Richard

> ### Attachment also inlined for ease of reply
> ###
>
>
> diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
> index 
> 2a7e8e1debe838719db0f0fad218b2543cc3111b..065a65e78d49f7689a01ecb64db1f07ca83aa987
>  100644
> --- a/libsanitizer/Makefile.am
> +++ b/libsanitizer/Makefile.am
> @@ -14,7 +14,7 @@ endif
>  if LIBBACKTRACE_SUPPORTED
>  SUBDIRS += libbacktrace
>  endif
> -SUBDIRS += lsan asan ubsan hwasan
> +SUBDIRS += lsan asan ubsan
>  nodist_saninclude_HEADERS += \
>include/sanitizer/lsan_interface.h \
>include/sanitizer/asan_interface.h \
> @@ -23,6 +23,9 @@ nodist_saninclude_HEADERS += \
>  if TSAN_SUPPORTED
>  SUBDIRS += tsan
>  endif
> +if HWASAN_SUPPORTED
> +SUBDIRS += hwasan
> +endif
>  endif
>  
>  ## May be used by toolexeclibdir.
> diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
> index 
> 2c57d49cbffdb486645aeb5f2c0f85d6e0fad124..3873ea4d7050f04a3f7bbd0dd3f2a71e9b65d287
>  100644
> --- a/libsanitizer/Makefile.in
> +++ b/libsanitizer/Makefile.in
> @@ -97,6 +97,7 @@ target_triplet = @target@
>  @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
> interception
>  @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
> libbacktrace
>  @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
> +@HWASAN_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_5 = hwasan
>  subdir = .
>  ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
>  am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
> @@ -208,7 +209,7 @@ ETAGS = etags
>  CTAGS = ctags
>  CSCOPE = cscope
>  DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
> - ubsan hwasan tsan
> + ubsan tsan hwasan
>  ACLOCAL = @ACLOCAL@
>  ALLOC_FILE = @ALLOC_FILE@
>  AMTAR = @AMTAR@
> @@ -364,7 +365,7 @@ sanincludedir = 
> $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
>  nodist_saninclude_HEADERS = $(am__append_1)
>  @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
>  @SANITIZER_SUPPORTED_TRUE@   $(am__append_3) lsan asan ubsan \
> -@SANITIZER_SUPPORTED_TRUE@   hwasan $(am__append_4)
> +@SANITIZER_SUPPORTED_TRUE@   $(am__append_4) $(am__append_5)
>  gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
>  
>  # Work around what appears to be a GNU make bug handling MAKEFLAGS
> diff --git a/libsanitizer/configure b/libsanitizer/configure
> index 
> 27e72c089cb891dcce09494fa9e39eebe55d2598..720d4e17044170e4b91c42fede685761d98c1965
>  100755
> --- a/libsanitizer/configure
> +++ b/libsanitizer/configure
> @@ -659,6 +659,8 @@ link_libubsan
>  link_libtsan
>  link_libhwasan
>  link_libasan
> +HWASAN_SUPPORTED_FALSE
> +HWASAN_SUPPORTED_TRUE
>  LSAN_SUPPORTED_FALSE
>  LSAN_SUPPORTED_TRUE
>  TSAN_SUPPORTED_FALSE
> @@ -12362,7 +12364,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 12365 "configure"
> +#line 12367 "configure"
>  #include "confdefs.h"
>  
>  #if HAVE_DLFCN_H
> @@ -12468,7 +12470,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 12471 "configure"
> +#line 12473 "configure"
>  #include "confdefs.h"
>  
>  #if HAVE_DLFCN_H
> @@ -15819,6 +15821,7 @@ fi
>  # Get target configury.
>  unset TSAN_SUPPORTED
>  unset LSAN_SUPPORTED
> +unset HWASAN_SUPPORTED
>  . ${srcdir}/configure.tgt
>   if test "x$TSAN_SUPPORTED" = "xyes"; then
>TSAN_SUPPORTED_TRUE=
> @@ -15836,6 +15839,14 @@ else
>LSAN_SUPPORTED_FALSE=
>  fi
>  
> + if test "x$HWASAN_SUPPORTED" = "xyes"; then
> +  HWASAN_SUPPORTED_TRUE=
> +  HWASAN_SUPPORTED_FALSE='#'
> +else
> +  HWASAN_SUPPORTED_TRUE='#'
> +  HWASAN_SUPPORTED_FALSE=
> +fi
> +
>  
>  # Check for functions needed.
>  for ac_func in clock_getres clock_gettime clock_settime lstat readlink
> @@ -16818,7 +16829,7 @@ ac_config_files="$ac_config_files Makefile 
> libsanitizer.spec libbacktrace/backtr
>  ac_config_headers="$ac_config_headers config.h"
>  
>  
> -ac_config_files="$ac_config_files interception/Makefile 
> sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
> hwasan/Makefile ubsan/Makefile"
> +ac_config_files="$ac_config_files interception/Makefile 
> sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
> ubsan/Makefile"
>  
>  
>  if test "x$TSAN_SUPPORTED" = "xyes"; then
> 

Re: [PATCH 1/X] libsanitizer: Tie the hwasan library into our build system

2020-11-18 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> This patch tries to tie libhwasan into the GCC build system in the same way
> that the other sanitizer runtime libraries are handled.
>
> libsanitizer/ChangeLog:
>
>   * Makefile.am:  Build libhwasan.
>   * Makefile.in:  Build libhwasan.
>   * asan/Makefile.in:  Build libhwasan.
>   * configure:  Build libhwasan.
>   * configure.ac:  Build libhwasan.
>   * hwasan/Makefile.am: New file.
>   * hwasan/Makefile.in: New file.
>   * hwasan/libtool-version: New file.
>   * interception/Makefile.in: Build libhwasan.
>   * libbacktrace/Makefile.in: Build libhwasan.
>   * libsanitizer.spec.in: Build libhwasan.
>   * lsan/Makefile.in: Build libhwasan.
>   * sanitizer_common/Makefile.in: Build libhwasan.
>   * tsan/Makefile.in: Build libhwasan.
>   * ubsan/Makefile.in: Build libhwasan.

OK, thanks.

Richard

> ### Attachment also inlined for ease of reply
> ###
>
>
> diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
> index 
> 65ed1e712378ef453f820f86c4d3221f9dee5f2c..2a7e8e1debe838719db0f0fad218b2543cc3111b
>  100644
> --- a/libsanitizer/Makefile.am
> +++ b/libsanitizer/Makefile.am
> @@ -14,11 +14,12 @@ endif
>  if LIBBACKTRACE_SUPPORTED
>  SUBDIRS += libbacktrace
>  endif
> -SUBDIRS += lsan asan ubsan
> +SUBDIRS += lsan asan ubsan hwasan
>  nodist_saninclude_HEADERS += \
>include/sanitizer/lsan_interface.h \
>include/sanitizer/asan_interface.h \
> -  include/sanitizer/tsan_interface.h
> +  include/sanitizer/tsan_interface.h \
> +  include/sanitizer/hwasan_interface.h
>  if TSAN_SUPPORTED
>  SUBDIRS += tsan
>  endif
> diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
> index 
> 02c7f70ac6578a3e93a490ce8bd2c54fc0693c50..2c57d49cbffdb486645aeb5f2c0f85d6e0fad124
>  100644
> --- a/libsanitizer/Makefile.in
> +++ b/libsanitizer/Makefile.in
> @@ -92,7 +92,8 @@ target_triplet = @target@
>  @SANITIZER_SUPPORTED_TRUE@am__append_1 = 
> include/sanitizer/common_interface_defs.h \
>  @SANITIZER_SUPPORTED_TRUE@   include/sanitizer/lsan_interface.h \
>  @SANITIZER_SUPPORTED_TRUE@   include/sanitizer/asan_interface.h \
> -@SANITIZER_SUPPORTED_TRUE@   include/sanitizer/tsan_interface.h
> +@SANITIZER_SUPPORTED_TRUE@   include/sanitizer/tsan_interface.h \
> +@SANITIZER_SUPPORTED_TRUE@   include/sanitizer/hwasan_interface.h
>  @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
> interception
>  @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
> libbacktrace
>  @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
> @@ -207,7 +208,7 @@ ETAGS = etags
>  CTAGS = ctags
>  CSCOPE = cscope
>  DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
> - ubsan tsan
> + ubsan hwasan tsan
>  ACLOCAL = @ACLOCAL@
>  ALLOC_FILE = @ALLOC_FILE@
>  AMTAR = @AMTAR@
> @@ -329,6 +330,7 @@ install_sh = @install_sh@
>  libdir = @libdir@
>  libexecdir = @libexecdir@
>  link_libasan = @link_libasan@
> +link_libhwasan = @link_libhwasan@
>  link_liblsan = @link_liblsan@
>  link_libtsan = @link_libtsan@
>  link_libubsan = @link_libubsan@
> @@ -362,7 +364,7 @@ sanincludedir = 
> $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
>  nodist_saninclude_HEADERS = $(am__append_1)
>  @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
>  @SANITIZER_SUPPORTED_TRUE@   $(am__append_3) lsan asan ubsan \
> -@SANITIZER_SUPPORTED_TRUE@   $(am__append_4)
> +@SANITIZER_SUPPORTED_TRUE@   hwasan $(am__append_4)
>  gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
>  
>  # Work around what appears to be a GNU make bug handling MAKEFLAGS
> diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
> index 
> 29622bf466a37f819c9fade30e31195adda51190..25c7fd7b7597d6e243005a1bb7de5b6243d2cfcf
>  100644
> --- a/libsanitizer/asan/Makefile.in
> +++ b/libsanitizer/asan/Makefile.in
> @@ -383,6 +383,7 @@ install_sh = @install_sh@
>  libdir = @libdir@
>  libexecdir = @libexecdir@
>  link_libasan = @link_libasan@
> +link_libhwasan = @link_libhwasan@
>  link_liblsan = @link_liblsan@
>  link_libtsan = @link_libtsan@
>  link_libubsan = @link_libubsan@
> diff --git a/libsanitizer/configure b/libsanitizer/configure
> index 
> 04eca04fbe5e59bae1ba00597de0cf1b7cf1b5fa..27e72c089cb891dcce09494fa9e39eebe55d2598
>  100755
> --- a/libsanitizer/configure
> +++ b/libsanitizer/configure
> @@ -657,6 +657,7 @@ USING_MAC_INTERPOSE_TRUE
>  link_liblsan
>  link_libubsan
>  link_libtsan
> +link_libhwasan
>  link_libasan
>  LSAN_SUPPORTED_FALSE
>  LSAN_SUPPORTED_TRUE
> @@ -12361,7 +12362,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 12364 "configure"
> +#line 12365 "configure"
>  #include "confdefs.h"
>  
>  #if HAVE_DLFCN_H
> @@ -12467,7 +12468,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>

New Chinese (traditional) PO file for 'gcc' (version 10.2.0)

2020-11-18 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Chinese (traditional) team of translators.  The file is available at:

https://translationproject.org/latest/gcc/zh_TW.po

(This file, 'gcc-10.2.0.zh_TW.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




preprocessor: Add support for header unit translation

2020-11-18 Thread Nathan Sidwell
This adds preprocessor support for header units.  Every #include
directive needs to go through the include-translation hook to determine
if it is to be treated as 'import $file;'.  This is that code.  When
include translation occurs, the hook creates a buffer containing the
translation.  We then continue to lex that buffer in place of the
#include we would have lexed.
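
As a rough standalone model of that control flow (not the libcpp code
itself): file_sketch, translate_hook, stack_result and the once_only flag
are names invented for this sketch, the hook signals "no translation" with
an empty string rather than a null pointer, and reading the file is
replaced by a placeholder string.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

enum class stack_result { skipped, translated, ordinary };

/* Simplified stand-in for struct _cpp_file.  */
struct file_sketch
{
  explicit file_sketch (std::string p)
    : path (std::move (p)), once_only (false), header_unit (0) {}

  std::string path;
  bool once_only;
  /* > 0: known header unit, < 0: known not, == 0: unknown.  */
  signed int header_unit : 2;
};

/* Returns the text to lex instead of the #include, or "" for none.  */
using translate_hook = std::function<std::string (const std::string &)>;

/* Decide what gets lexed for an #include of FILE and store it in TO_LEX.  */
stack_result
stack_file_sketch (file_sketch &file, const translate_hook &hook,
                   std::string &to_lex)
{
  assert (!file.path.empty ());

  /* A file already marked once-only is not stacked again.  */
  if (file.once_only)
    return stack_result::skipped;

  /* Only ask the hook while the header-unit status is still unknown.  */
  std::string translation;
  if (file.header_unit == 0 && hook)
    translation = hook (file.path);

  if (!translation.empty ())
    {
      /* Lex the translation in place of the #include.  */
      to_lex = translation;
      file.header_unit = 1;
      file.once_only = true;
      return stack_result::translated;
    }

  /* Not a header unit, and now we know it: fall back to the file.  */
  file.header_unit = -1;
  to_lex = "<contents of " + file.path + ">";
  return stack_result::ordinary;
}
```

The tri-state field is what lets a later #include of the same file skip
the hook once the answer is known.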


libcpp/
* files.c (struct _cpp_file): Add header_unit field.
(_cpp_stack_file): Add header unit support.
(cpp_find_header_unit): New.
* include/cpplib (cpp_find_header_unit): Declare.

pushing to trunk

--
Nathan Sidwell
diff --git i/libcpp/files.c w/libcpp/files.c
index d73177aa1ee..b5d9f30297e 100644
--- i/libcpp/files.c
+++ w/libcpp/files.c
@@ -111,6 +111,9 @@ struct _cpp_file
 
   /* If this file is implicitly preincluded.  */
   bool implicit_preinclude : 1;
+
+  /* > 0: Known C++ Module header unit, <0: known not.  ==0, unknown  */
+  int header_unit : 2;
 };
 
 /* A singly-linked list for all searches for a given file name, with
@@ -891,9 +894,9 @@ has_unique_contents (cpp_reader *pfile, _cpp_file *file, bool import,
 }
 
 /* Place the file referenced by FILE into a new buffer on the buffer
-   stack if possible.  IMPORT is true if this stacking attempt is
-   because of a #import directive.  Returns true if a buffer is
-   stacked.  Use LOC for any diagnostics.  */
+   stack if possible.  Returns true if a buffer is stacked.  Use LOC
+   for any diagnostics.  */
+
 bool
 _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, include_type type,
 		 location_t loc)
@@ -901,39 +904,73 @@ _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, include_type type,
   if (is_known_idempotent_file (pfile, file, type == IT_IMPORT))
 return false;
 
-  if (!read_file (pfile, file, loc))
-return false;
+  int sysp = 0;
+  char *buf = nullptr;
 
-  if (!has_unique_contents (pfile, file, type == IT_IMPORT, loc))
-return false;
+  /* Check C++ module include translation.  */
+  if (!file->header_unit && type < IT_HEADER_HWM
+  /* Do not include translate include-next.  */
+  && type != IT_INCLUDE_NEXT
+  && pfile->cb.translate_include)
+buf = (pfile->cb.translate_include
+	   (pfile, pfile->line_table, loc, file->path));
 
-  int sysp = 0;
-  if (pfile->buffer && file->dir)
-sysp = MAX (pfile->buffer->sysp, file->dir->sysp);
-
-  /* Add the file to the dependencies on its first inclusion.  */
-  if (CPP_OPTION (pfile, deps.style) > (sysp != 0)
-  && !file->stack_count
-  && file->path[0]
-  && !(file->main_file && CPP_OPTION (pfile, deps.ignore_main_file)))
-deps_add_dep (pfile->deps, file->path);
-
-  /* Clear buffer_valid since _cpp_clean_line messes it up.  */
-  file->buffer_valid = false;
-  file->stack_count++;
-
-  /* Stack the buffer.  */
-  cpp_buffer *buffer
-= cpp_push_buffer (pfile, file->buffer, file->st.st_size,
-		   CPP_OPTION (pfile, preprocessed)
-		   && !CPP_OPTION (pfile, directives_only));
-  buffer->file = file;
-  buffer->sysp = sysp;
-  buffer->to_free = file->buffer_start;
-
-  /* Initialize controlling macro state.  */
-  pfile->mi_valid = true;
-  pfile->mi_cmacro = 0;
+  if (buf)
+{
+  /* We don't increment the line number at the end of a buffer,
+	 because we don't usually need that location (we're popping an
+	 include file).  However in this case we do want to do the
+	 increment.  So push a writable buffer of two newlines to achieve
+	 that.  */
+  static uchar newlines[] = "\n\n";
+  cpp_push_buffer (pfile, newlines, 2, true);
+
+  cpp_buffer *buffer
+	= cpp_push_buffer (pfile, reinterpret_cast (buf),
+			   strlen (buf), true);
+  buffer->to_free = buffer->buf;
+
+  file->header_unit = +1;
+  _cpp_mark_file_once_only (pfile, file);
+}
+  else
+{
+  /* Not a header unit, and we know it.  */
+  file->header_unit = -1;
+
+  if (!read_file (pfile, file, loc))
+	return false;
+
+  if (!has_unique_contents (pfile, file, type == IT_IMPORT, loc))
+	return false;
+
+  if (pfile->buffer && file->dir)
+	sysp = MAX (pfile->buffer->sysp, file->dir->sysp);
+
+  /* Add the file to the dependencies on its first inclusion.  */
+  if (CPP_OPTION (pfile, deps.style) > (sysp != 0)
+	  && !file->stack_count
+	  && file->path[0]
+	  && !(file->main_file && CPP_OPTION (pfile, deps.ignore_main_file)))
+	deps_add_dep (pfile->deps, file->path);
+
+  /* Clear buffer_valid since _cpp_clean_line messes it up.  */
+  file->buffer_valid = false;
+  file->stack_count++;
+
+  /* Stack the buffer.  */
+  cpp_buffer *buffer
+	= cpp_push_buffer (pfile, file->buffer, file->st.st_size,
+			   CPP_OPTION (pfile, preprocessed)
+			   && !CPP_OPTION (pfile, directives_only));
+  buffer->file = file;
+  buffer->sysp = sysp;
+  buffer->to_free = file->buffer_start;
+
+  /* Initialize controlling macro state.  */
+  pfile->mi_valid = true;
+  

Re: [AArch64] Add --with-tune configure flag

2020-11-18 Thread Wilco Dijkstra via Gcc-patches
Hi Sebastian,

I presume you're trying to unify the --with- options across most targets?
That would be very useful! However there are significant differences between
targets in how they interpret options like --with-arch=native (or -march). So
those differences also need to be looked at and fixed to avoid unexpected 
results.

As for the first patch, I think support for --with-tune requires more changes.
Without proper processing of a --with-tune, you get an incorrect architecture
version (if say the CPU you tune for is newer than the --with-cpu/arch
or default).

I posted patches to add --with-tune and fix various issues a while back:

https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553865.html
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553866.html

So I think these should go in first. They are simple and useful enough to 
backport
(since they fix bugs) but that decision is up to the AArch64 maintainers.

As for your second patch, --with-cpu-64 could be a simple alias indeed,
but what is the exact definition/expected behaviour of a --with-cpu-32
on a target that only supports 64-bit code? The AArch64 target cannot
generate AArch32 code, so we shouldn't silently accept it.

Cheers,
Wilco



Re: [PATCH] PowerPC: Restrict long double test to use IBM long double.

2020-11-18 Thread will schmidt via Gcc-patches
On Wed, 2020-11-18 at 01:03 -0500, Michael Meissner wrote:
> On Tue, Nov 17, 2020 at 11:33:29PM -0600, will schmidt wrote:
> > On Sun, 2020-11-15 at 12:23 -0500, Michael Meissner via Gcc-patches 
> > wrote:
> > > PowerPC: Restrict long double test to use IBM long double.
> > > 
> > > I posted this patch previously as a set of 3 testsuite
> > > patches.  I have
> > > separated them into separate patches.  This patch marks the
> > > convert-bfp-11.c
> > > patch as needing IBM extended double.  If you look at the code,
> > > it is
> > > specifically designed around testing the limits of the IBM 128-
> > > bit extended
> > > double representation.  I added a new target-supports that says
> > > the test
> > > requires IBM extended long double, and changed the test to
> > > require this
> > > effective test.  Can I check this into the master branch?
> > 
> > 
> > It's harder to review that without all the history handy here.
> > 
> > This will stand alone better if you lead with what you are adding
> > and
> > keep it clean.  i.e.
> 
> The patch I was referring to was posted on October 22nd:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556865.html
> 
> > Subject: PowerPC: Add ppc_long_double_ibm effective-target check
> > 
> > "Add a ppc_long_double_ibm dg-require-effective-target check to
> > ensure
> > tests that require LONG_DOUBLE_IBM128 . "
> > > An additional statement to clarify its relationship with
> > I128
> > wouldn't  hurt if that is the case.  i.e. 
> > "This is a counterpart to LONG_DOUBLE_IEEE 128 " 
> 
> At the moment, we don't need a target supports for long double IEEE
> 128-bit or
> long double 64-bit.  I can add them if needed.

I would probably add one for each of the three so you have the complete
picture of what is going on.

> 
> > Hmm, I have those backwards in my head apparently.  Can the return
> > 1 if
> > not-defined logic be flattened out so we see the direct
> > relationship?
> 
> I'm not sure what you are asking.  These are preprocessor macros that
> are only
> defined in certain cases.  And remember this is main returning a
> value, so
> returning 0 is true and 1 is false.
> 
> In particular:
> 

This:

> If your long double is 128-bits and uses the IEEE 128-bit
> representation, the
> following macros are defined:
> 
>   __LONG_DOUBLE_128__
>   __LONG_DOUBLE_IEEE128__
> 
> If your long double is 128-bit and uses the IBM 128-bit
> representation (current
> default), the following macros are defined:
> 
>   __LONG_DOUBLE_128__
>   __LONG_DOUBLE_IBM128__
> 
> If your long double is 64 bits, neither of those two macros are
> defined.
> 

.. clearly defines what is going on, and would be good to add as a
comment in/around where the checks are defined.
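As a rough illustration of the macro relationship described above, here is a minimal C sketch. The function name long_double_format is hypothetical and is not part of the patch or the testsuite; only the three macro names come from the quoted explanation. On targets other than PowerPC the macros may simply be absent, in which case the last branch is taken.

```c
/* Hypothetical helper: report which long double format the compiler
   advertises through the predefined macros quoted above.  */
static const char *
long_double_format (void)
{
#if defined (__LONG_DOUBLE_IEEE128__)
  /* 128-bit long double using the IEEE 128-bit representation.  */
  return "IEEE 128-bit";
#elif defined (__LONG_DOUBLE_IBM128__)
  /* 128-bit long double using the IBM extended double representation.  */
  return "IBM 128-bit";
#elif defined (__LONG_DOUBLE_128__)
  /* 128-bit long double, but neither format macro is defined.  */
  return "128-bit, format macro not defined";
#else
  /* Neither macro defined, e.g. 64-bit long double on PowerPC.  */
  return "not advertised as 128-bit";
#endif
}
```

A dg-require-effective-target check would test the same macros directly; the sketch only makes the three cases visible in one place.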

That's my perspective of course,...  :-)
thanks
-Will




Re: [PATCH] Include math.h in nextafter-2.c test.

2020-11-18 Thread will schmidt via Gcc-patches
On Wed, 2020-11-18 at 00:55 -0500, Michael Meissner wrote:
> On Tue, Nov 17, 2020 at 11:33:23PM -0600, will schmidt wrote:
> > On Sun, 2020-11-15 at 12:12 -0500, Michael Meissner via Gcc-patches 
> > wrote:
> > > Include math.h in nextafter-2.c test.
> > > 
> > > I previously posted this with two other patches.  I've separated
> > > this into its
> > > own patch.  What happens is because the nextafter-2.c test uses
> > > -fno-builtin,
> > > and it does not include math.h, the wrong nextafterl and
> > > > nexttowardl gets
> > > called when long double is not IBM 128-bit (i.e. either 64-bit,
> > > or IEEE
> > > 128-bit).
> > 
> > > That's a sandbox issue, or something upstream?
> 
> I'm not sure what you are asking.  If you install the three critical
> IEEE
> 128-bit long double patches, and then configure a build with long
> double
> defaulting to IEEE 128-bit, the nextafter-2 test will fail.

That answers my question.. this fixes an issue with patches that are
not upstream yet.  (your sandbox). 

> 
> The reason is the nextafterl function in GLIBC assumes long double is
> IBM
> 128-bit extended double.  The __builtin_nextafterl function calls
> that
> function.
> 
> If you compile it normally (with long double using IEEE 128-bit), the
> compiler
> will automatically map nextafterl to __nextafterieee128.
> 
> Similarly if you include math.h, and use the -fno-builtin option, the
> math.h
> library will still map nextafterl into __nextafterieee128, and the
> compiler
> will call it.
> 
> However, if you do not include math.h and use the -fno-builtin
> option, the
> compiler will call nextafterl, and get the wrong results, because the
> wrong
> function was called.
> 
> What I meant in terms of the 3 patches being separated, the last time
> I posted
> a patch for this problem, I grouped together 3 test suite failures
> into one
> patch.  This time, I separated the cases into 3 separate patches
> (this one, the
> fix for pr70117, and the fix for the decimal conversion test).
> 
> > > 
> > > Rather than add the include only for the PowerPC, I thought it
> > > was better to
> > > always include it.  There might be some port in the future that
> > > has the same
> > > issue with multiple long double types without using multilibs.
> > > 
> > > Can I check this into the master branch.
> > > 
> > > 2020-11-15  Michael Meissner  
> > > 
> > >   * gcc.dg/nextafter-2.c: Include math.h.
> > > ---
> > >  gcc/testsuite/gcc.dg/nextafter-2.c | 12 
> > >  1 file changed, 12 insertions(+)
> > > 
> > > diff --git a/gcc/testsuite/gcc.dg/nextafter-2.c
> > > b/gcc/testsuite/gcc.dg/nextafter-2.c
> > > index e51ae94be0c..8149a709fa5 100644
> > > --- a/gcc/testsuite/gcc.dg/nextafter-2.c
> > > +++ b/gcc/testsuite/gcc.dg/nextafter-2.c
> > > @@ -6,6 +6,18 @@
> > > 
> > >  #include 
> > > 
> > > +/* In order to run on systems like the PowerPC that have 3
> > > different long
> > > +   double types, include math.h so it can choose what is the
> > > appropriate
> > > +   nextafterl function to use.
> > > +
> > > +   If we didn't use -fno-builtin for this test, the PowerPC
> > > compiler would have
> > > +   changed the names of the built-in functions that use long
> > > double.  The
> > > +   nextafter-1.c function runs with this mapping.
> > > +
> > > +   Since this test uses -fno-builtin, include math.h, so that
> > > math.h can make
> > > +   the appropriate choice to use.  */
> > 
> > 
> > 
> > Can this be simplified to stl
> > 
> > /* Include math.h so that systems like PowerPC that have different
> > long
> > double types can choose the appropriate nextafterl function to
> > use.  */
> > 
> > 
> > > > +#include <math.h>
> > > +
> > >  #if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
> > >  # if !__GLIBC_PREREQ (2, 24)
> > >  /* Workaround buggy nextafterl in glibc 2.23 and earlier,
> > > -- 
> > > 2.22.0
> > > 
> > > 
> 
> Sure, the comment is just trying to explain why math.h needs to be
> included.

Ok.   Your first paragraph in the comment clarifies that.  I'm
uncertain the rest of the comment helps, but I'll defer.
Thanks. 

> 



Re: [PATCH] AArch64: Add cost table for Cortex-A76

2020-11-18 Thread Richard Earnshaw via Gcc-patches
On 18/11/2020 14:55, Wilco Dijkstra via Gcc-patches wrote:
> Add an initial cost table for Cortex-A76 - this is copied from
> cortexa57_extra_costs but updates it based on the Optimization Guide.
> Use the new cost table on all Neoverse tunings and ensure the tunings
> are consistent for all.  As a result more compact code is generated
> with more combined shift+alu operations. Eg. -mcpu=cortex-a76 will now
> merge the shifts in:
> 
> int f(int x, int y) { return (x & y << 3) * (x | y << 3); }
> 
> and  w2, w0, w1, lsl 3
> orr  w0, w0, w1, lsl 3
> mul  w0, w2, w0
> ret
> 
> SPEC2017 codesize improves by 0.02% and SPECINT2017 shows 0.24% gain.
> 
> Bootstrap OK, regress passes, OK for commit?
> 
> ChangeLog:
> 2020-11-18  Wilco Dijkstra  
> 
> * config/aarch64/aarch64.c (neoversen1_tunings): Use new
> cortexa76_extra_costs.
> (neoversev1_tunings): Likewise.
> (neoversen2_tunings): Likewise.
> * config/arm/aarch-cost-tables.h (cortexa76_extra_costs):
> add new costs.
> 
> ---
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 6bf2f9aa344f9150dec72db660d951e50521285c..65ff49d2b4125013466f90a54ff698ae810580f0
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1312,7 +1312,7 @@ static const struct tune_params thunderx3t110_tunings =
>  
>  static const struct tune_params neoversen1_tunings =
>  {
> -  _extra_costs,
> +  _extra_costs,
>_addrcost_table,
>_regmove_cost,
>_vector_cost,
> @@ -1338,7 +1338,7 @@ static const struct tune_params neoversen1_tunings =
>  
>  static const struct tune_params neoversev1_tunings =
>  {
> -  _extra_costs,
> +  _extra_costs,
>_addrcost_table,
>_regmove_cost,
>_vector_cost,
> @@ -1364,7 +1364,7 @@ static const struct tune_params neoversev1_tunings =
>  
>  static const struct tune_params neoversen2_tunings =
>  {
> -  _extra_costs,
> +  _extra_costs,
>_addrcost_table,
>_regmove_cost,
>_vector_cost,
> diff --git a/gcc/config/arm/aarch-cost-tables.h 
> b/gcc/config/arm/aarch-cost-tables.h
> index 
> cf8186599018cc5e51cf44e4f2080a502d895e1d..1b9d53d07b54bddf1767121236b06d2b4581631c
>  100644
> --- a/gcc/config/arm/aarch-cost-tables.h
> +++ b/gcc/config/arm/aarch-cost-tables.h
> @@ -331,6 +331,109 @@ const struct cpu_cost_table cortexa57_extra_costs =
>}
>  };
>  
> +const struct cpu_cost_table cortexa76_extra_costs =
> +{
> +  /* ALU */
> +  {
> +0, /* arith.  */
> +0, /* logical.  */
> +0, /* shift.  */
> +0,  /* shift_reg.  */
> +COSTS_N_INSNS (1), /* arith_shift.  */
> +COSTS_N_INSNS (1), /* arith_shift_reg.  */
> +0,  /* log_shift.  */
> +COSTS_N_INSNS (1), /* log_shift_reg.  */
> +0, /* extend.  */
> +COSTS_N_INSNS (1), /* extend_arith.  */
> +COSTS_N_INSNS (1), /* bfi.  */
> +0, /* bfx.  */
> +0, /* clz.  */
> +0,  /* rev.  */
> +0, /* non_exec.  */
> +true   /* non_exec_costs_exec.  */
> +  },
> +  {
> +/* MULT SImode */
> +{
> +  COSTS_N_INSNS (1),   /* simple.  */
> +  COSTS_N_INSNS (2),   /* flag_setting.  */
> +  COSTS_N_INSNS (1),   /* extend.  */
> +  COSTS_N_INSNS (1),   /* add.  */
> +  COSTS_N_INSNS (1),   /* extend_add.  */
> +  COSTS_N_INSNS (6) /* idiv.  */
> +},
> +/* MULT DImode */
> +{
> +  COSTS_N_INSNS (3),   /* simple.  */
> +  0,   /* flag_setting (N/A).  */
> +  COSTS_N_INSNS (1),   /* extend.  */
> +  COSTS_N_INSNS (3),   /* add.  */
> +  COSTS_N_INSNS (1),   /* extend_add.  */
> +  COSTS_N_INSNS (10)   /* idiv.  */
> +}
> +  },
> +  /* LD/ST */
> +  {
> +COSTS_N_INSNS (3), /* load.  */
> +COSTS_N_INSNS (3), /* load_sign_extend.  */
> +COSTS_N_INSNS (3), /* ldrd.  */
> +COSTS_N_INSNS (2), /* ldm_1st.  */
> +1, /* ldm_regs_per_insn_1st.  */
> +2, /* ldm_regs_per_insn_subsequent.  */
> +COSTS_N_INSNS (4), /* loadf.  */
> +COSTS_N_INSNS (4), /* loadd.  */
> +COSTS_N_INSNS (5), /* load_unaligned.  */
> +0, /* store.  */
> +0, /* strd.  */
> +0, /* stm_1st.  */
> +1, /* stm_regs_per_insn_1st.  */
> +2, /* stm_regs_per_insn_subsequent.  */
> +0, /* storef.  */
> +0, /* stored.  */
> +COSTS_N_INSNS (1), /* store_unaligned.  */
> +COSTS_N_INSNS (1), /* loadv.  */
> +COSTS_N_INSNS (1)  /* storev.  */
> +  },
> +  {
> +/* FP SFmode */
> +{
> +  

[COMMITTED] Patch fixing PR97870

2020-11-18 Thread Vladimir Makarov via Gcc-patches

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97870

The patch was successfully bootstrapped and tested on x86-64.


[PR97870] LRA: don't remove asm goto, just nullify it.

gcc/

2020-11-18  Vladimir Makarov  

PR target/97870
* lra-constraints.c (curr_insn_transform): Do not delete asm goto
with wrong constraints.  Nullify it saving CFG.

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index f034c7749e9..80ca1e06e31 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4104,9 +4104,18 @@ curr_insn_transform (bool check_only_p)
   error_for_asm (curr_insn,
 		 "inconsistent operand constraints in an %<asm%>");
   lra_asm_error_p = true;
-  /* Avoid further trouble with this insn.  Don't generate use
-	 pattern here as we could use the insn SP offset.  */
-  lra_set_insn_deleted (curr_insn);
+  if (! JUMP_P (curr_insn))
+	{
+	  /* Avoid further trouble with this insn.  Don't generate use
+	 pattern here as we could use the insn SP offset.  */
+	  lra_set_insn_deleted (curr_insn);
+	}
+  else
+	{
+	  lra_invalidate_insn_data (curr_insn);
+	  ira_nullify_asm_goto (curr_insn);
+	  lra_update_insn_regno_info (curr_insn);
+	}
   return true;
 }
 


Re: [PATCH] AArch64: Improve inline memcpy expansion

2020-11-18 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra via Gcc-patches  writes:
>>> +  || (size <= (max_copy_size / 2)
>>> +  && (aarch64_tune_params.extra_tuning_flags
>>> +  & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)))
>>> +    copy_bits = GET_MODE_BITSIZE (TImode);
>>
>> (Looks like the mailer has eaten some tabs here.)
>
> The email contains the correct tabs at the time I send it.

Yeah, sorry for the noise, looks like Exchange ate them at my end.
The version on gmane was ok.

>> As discussed in Sudi's setmem patch, I think we should make the
>> AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS conditional on optimising
>> for speed.  For size, using LDP Q and STP Q is a win regardless
>> of what the CPU wants.
>
> I've changed the logic slightly based on benchmarking. It's actually better
> to fall back to calling memcpy for larger sizes in this case rather than emit
> an inlined Q-register memcpy.

OK.

>>>    int n_bits = GET_MODE_BITSIZE (next_mode).to_constant ();
>>
>> As I mentioned in the reply I've just sent to Sudi's patch,
>> I think it might help to add:
>>
>>  gcc_assert (n_bits <= mode_bits);
>>
>> to show why this is guaranteed not to overrun the original copy.
>> (I agree that the code does guarantee no overrun.)
>
> I've added the same assert so the code remains similar.
>
> Here is the updated version:
>
>
> Improve the inline memcpy expansion.  Use integer load/store for copies <= 24
> bytes instead of SIMD.  Set the maximum copy to expand to 256 by default,
> except that -Os or no Neon expands up to 128 bytes.  When using LDP/STP of
> Q-registers, also use Q-register accesses for the unaligned tail, saving 2
> instructions (eg. all sizes up to 48 bytes emit exactly 4 instructions).
> Cleanup code and comments.
>
> The codesize gain vs the GCC10 expansion is 0.05% on SPECINT2017.
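The size thresholds described in this patch summary can be modelled in a few lines of plain C. This is only a sketch of the decision logic as it appears in the quoted diff, not the GCC code itself: the function name model_copy_bits and its boolean parameters are hypothetical stand-ins for optimize_function_for_size_p, TARGET_SIMD and the AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS tuning flag.

```c
#include <stdbool.h>

/* Hypothetical model of the expansion decision.  Returns the chunk width
   in bits (128 or 256) when the copy would be expanded inline, or 0 when
   it would be left as a call to memcpy.  */
static int
model_copy_bits (unsigned long size, bool optimize_size,
		 bool have_simd, bool slow_q_ldp_stp)
{
  /* Inline up to 256 bytes for speed, 128 bytes for size.  */
  unsigned long max_copy_size = optimize_size ? 128 : 256;
  int copy_bits = 256;

  /* Small copies, no SIMD, or slow Q-register LDP/STP: use 128-bit
     chunks and halve the inline limit.  */
  if (size <= 24 || !have_simd || slow_q_ldp_stp)
    {
      copy_bits = 128;
      max_copy_size /= 2;
    }

  if (size > max_copy_size)
    return 0;

  return copy_bits;
}
```

Under this model a 200-byte copy is expanded with 256-bit chunks when optimizing for speed with SIMD available, and left to memcpy under -Os.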
>
> Passes bootstrap and regress. OK for commit?

OK, thanks.

Richard

>
> ChangeLog:
> 2020-11-16  Wilco Dijkstra  
>
> * config/aarch64/aarch64.c (aarch64_expand_cpymem): Cleanup code and
> comments, tweak expansion decisions and improve tail expansion.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 41e2a699108146e0fa7464743607bd34e91ea9eb..4b2d5fa7d452dc53ff42308dd2781096ff8c95d2
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -21255,35 +21255,39 @@ aarch64_copy_one_block_and_progress_pointers (rtx 
> *src, rtx *dst,
>  bool
>  aarch64_expand_cpymem (rtx *operands)
>  {
> -  /* These need to be signed as we need to perform arithmetic on n as
> - signed operations.  */
> -  int n, mode_bits;
> +  int mode_bits;
>rtx dst = operands[0];
>rtx src = operands[1];
>rtx base;
> -  machine_mode cur_mode = BLKmode, next_mode;
> -  bool speed_p = !optimize_function_for_size_p (cfun);
> -
> -  /* When optimizing for size, give a better estimate of the length of a
> - memcpy call, but use the default otherwise.  Moves larger than 8 bytes
> - will always require an even number of instructions to do now.  And each
> - operation requires both a load+store, so divide the max number by 2.  */
> -  unsigned int max_num_moves = (speed_p ? 16 : AARCH64_CALL_RATIO) / 2;
> +  machine_mode cur_mode = BLKmode;
>  
> -  /* We can't do anything smart if the amount to copy is not constant.  */
> +  /* Only expand fixed-size copies.  */
>if (!CONST_INT_P (operands[2]))
>  return false;
>  
> -  unsigned HOST_WIDE_INT tmp = INTVAL (operands[2]);
> +  unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
>  
> -  /* Try to keep the number of instructions low.  For all cases we will do at
> - most two moves for the residual amount, since we'll always overlap the
> - remainder.  */
> -  if (((tmp / 16) + (tmp % 16 ? 2 : 0)) > max_num_moves)
> -return false;
> +  /* Inline up to 256 bytes when optimizing for speed.  */
> +  unsigned HOST_WIDE_INT max_copy_size = 256;
>  
> -  /* At this point tmp is known to have to fit inside an int.  */
> -  n = tmp;
> +  if (optimize_function_for_size_p (cfun))
> +max_copy_size = 128;
> +
> +  int copy_bits = 256;
> +
> +  /* Default to 256-bit LDP/STP on large copies, however small copies, no SIMD
> + support or slow 256-bit LDP/STP fall back to 128-bit chunks.  */
> +  if (size <= 24
> +  || !TARGET_SIMD
> +  || (aarch64_tune_params.extra_tuning_flags
> +   & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS))
> +{
> +  copy_bits = 128;
> +  max_copy_size = max_copy_size / 2;
> +}
> +
> +  if (size > max_copy_size)
> +return false;
>  
>base = copy_to_mode_reg (Pmode, XEXP (dst, 0));
>dst = adjust_automodify_address (dst, VOIDmode, base, 0);
> @@ -21291,15 +21295,8 @@ aarch64_expand_cpymem (rtx *operands)
>base = copy_to_mode_reg (Pmode, XEXP (src, 0));
>src = adjust_automodify_address (src, VOIDmode, base, 0);
>  
> -  /* Convert n to bits to make the rest of the code simpler.  */
> -  n = n * BITS_PER_UNIT;
> -
> -  /* Maximum amount 

[PATCH] AArch64: Add cost table for Cortex-A76

2020-11-18 Thread Wilco Dijkstra via Gcc-patches
Add an initial cost table for Cortex-A76 - this is copied from
cortexa57_extra_costs but updates it based on the Optimization Guide.
Use the new cost table on all Neoverse tunings and ensure the tunings
are consistent for all.  As a result more compact code is generated
with more combined shift+alu operations. Eg. -mcpu=cortex-a76 will now
merge the shifts in:

int f(int x, int y) { return (x & y << 3) * (x | y << 3); }

and  w2, w0, w1, lsl 3
orr  w0, w0, w1, lsl 3
mul  w0, w2, w0
ret

SPEC2017 codesize improves by 0.02% and SPECINT2017 shows 0.24% gain.
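
To make the grouping in the example explicit, here is a small C sketch. The function f is the one from the description above; f_explicit is a hypothetical helper added only for illustration. Because << binds tighter than & and |, each operand of the multiply is a logical operation on y shifted left by 3, which is what allows the shift to be folded into the and/orr instructions.

```c
/* The example from the patch description.  */
int f (int x, int y) { return (x & y << 3) * (x | y << 3); }

/* Hypothetical helper: the same computation with the shift spelled out
   as a separate step, i.e. the form a compiler produces when it does not
   merge the shift into the logical instructions.  */
int f_explicit (int x, int y)
{
  int shifted = y << 3;
  int lhs = x & shifted;
  int rhs = x | shifted;
  return lhs * rhs;
}
```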

Bootstrap OK, regress passes, OK for commit?

ChangeLog:
2020-11-18  Wilco Dijkstra  

* config/aarch64/aarch64.c (neoversen1_tunings): Use new
cortexa76_extra_costs.
(neoversev1_tunings): Likewise.
(neoversen2_tunings): Likewise.
* config/arm/aarch-cost-tables.h (cortexa76_extra_costs):
add new costs.

---
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
6bf2f9aa344f9150dec72db660d951e50521285c..65ff49d2b4125013466f90a54ff698ae810580f0
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1312,7 +1312,7 @@ static const struct tune_params thunderx3t110_tunings =
 
 static const struct tune_params neoversen1_tunings =
 {
-  _extra_costs,
+  _extra_costs,
   _addrcost_table,
   _regmove_cost,
   _vector_cost,
@@ -1338,7 +1338,7 @@ static const struct tune_params neoversen1_tunings =
 
 static const struct tune_params neoversev1_tunings =
 {
-  _extra_costs,
+  _extra_costs,
   _addrcost_table,
   _regmove_cost,
   _vector_cost,
@@ -1364,7 +1364,7 @@ static const struct tune_params neoversev1_tunings =
 
 static const struct tune_params neoversen2_tunings =
 {
-  _extra_costs,
+  _extra_costs,
   _addrcost_table,
   _regmove_cost,
   _vector_cost,
diff --git a/gcc/config/arm/aarch-cost-tables.h 
b/gcc/config/arm/aarch-cost-tables.h
index 
cf8186599018cc5e51cf44e4f2080a502d895e1d..1b9d53d07b54bddf1767121236b06d2b4581631c
 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -331,6 +331,109 @@ const struct cpu_cost_table cortexa57_extra_costs =
   }
 };
 
+const struct cpu_cost_table cortexa76_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+0,  /* shift_reg.  */
+COSTS_N_INSNS (1), /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+0,/* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+COSTS_N_INSNS (1), /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0,  /* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (1),   /* simple.  */
+  COSTS_N_INSNS (2),   /* flag_setting.  */
+  COSTS_N_INSNS (1),   /* extend.  */
+  COSTS_N_INSNS (1),   /* add.  */
+  COSTS_N_INSNS (1),   /* extend_add.  */
+  COSTS_N_INSNS (6)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (1),   /* extend.  */
+  COSTS_N_INSNS (3),   /* add.  */
+  COSTS_N_INSNS (1),   /* extend_add.  */
+  COSTS_N_INSNS (10)   /* idiv.  */
+}
+  },
+  /* LD/ST */
+  {
+COSTS_N_INSNS (3), /* load.  */
+COSTS_N_INSNS (3), /* load_sign_extend.  */
+COSTS_N_INSNS (3), /* ldrd.  */
+COSTS_N_INSNS (2), /* ldm_1st.  */
+1, /* ldm_regs_per_insn_1st.  */
+2, /* ldm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (4), /* loadf.  */
+COSTS_N_INSNS (4), /* loadd.  */
+COSTS_N_INSNS (5), /* load_unaligned.  */
+0, /* store.  */
+0, /* strd.  */
+0, /* stm_1st.  */
+1, /* stm_regs_per_insn_1st.  */
+2, /* stm_regs_per_insn_subsequent.  */
+0, /* storef.  */
+0, /* stored.  */
+COSTS_N_INSNS (1), /* store_unaligned.  */
+COSTS_N_INSNS (1), /* loadv.  */
+COSTS_N_INSNS (1)  /* storev.  */
+  },
+  {
+/* FP SFmode */
+{
+  COSTS_N_INSNS (10),  /* div.  */
+  COSTS_N_INSNS (2),   /* mult.  */
+  COSTS_N_INSNS (3),   /* mult_addsub.  */
+  COSTS_N_INSNS (3),   /* fma.  */
+  COSTS_N_INSNS (1),   /* addsub.  */
+  0,   /* fpconst.  */
+  0,   /* neg.  */
+  0,   

Re: Improve handling of memory operands in ipa-icf 3/4

2020-11-18 Thread Jan Hubicka
> On 11/13/20 6:50 PM, Jan Hubicka wrote:
> > Bootstrapped/regtested x86_64-linux. I plan to commit it on Monday if there
> > are no complaints.
> 
> Hello Honza.
> 
> Thank you very much for the patch set.
> It's a nice improvement and it will eventually fix the WPA slowness caused by 
> IPA ICF.
> 
> I made some measurements for master before the first patch and with this
> patch (3/4) on the godot game engine:
> 
> BEFORE:
> 
> Equal symbols: 15690
> Totally needed symbols: 17913, fraction of loaded symbols: 39.05%
> 
> 2156989   false returned: '' in equals_private at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf.c:879
> 1099887   false returned: 'operand_equal_p failed' in compare_operand at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:307
> 1048605   false returned: 'types are not compatible' in compatible_types_p at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:210
> 1047679   false returned: 'GIMPLE assignment operands are different' in 
> compare_gimple_assign at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:632
> 1047517   false returned: 'GIMPLE NOP LHS type mismatch' in 
> compare_gimple_assign at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:628
>   57659   false returned: 'call function types are not compatible' in 
> compare_gimple_call at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:573
>   52088   false returned: 'PHI node comparison returns false' in 
> equals_private at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:914
>   52088   false returned: '' in compare_phi_node at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf.c:1552
>   13565   false returned: 'decl_or_type flags are different' in equals_wpa at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf.c:567
>9919   false returned: 'result types are different' in equals_wpa at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf.c:616
> 
> Time variable                   usr           sys          wall           GGC
>  ipa icf              :   4.31 (  7%)   0.06 (  2%)   4.38 (  7%)  6008k (  0%)
>  TOTAL                :  57.57          3.49         61.11          4830M
> AFTER:
> 
> Equal symbols: 17019
> Totally needed symbols: 19875, fraction of loaded symbols: 70.88%
> 
>  377327   false returned: '' in equals_private at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf.c:886
>  213086   false returned: 'operand_equal_p failed' in compare_operand at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:356
>  212179   false returned: 'compare_ao_refs failed (access path difference)' 
> in compare_operand at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:345
>  159947   false returned: '' in compare_gimple_call at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:607
>  147098   false returned: 'GIMPLE assignment operands are different' in 
> compare_gimple_assign at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:699
>   66123   false returned: 'GIMPLE call operands are different' in 
> compare_gimple_call at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:656
>   52088   false returned: 'PHI node comparison returns false' in 
> equals_private at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:921
>   52088   false returned: '' in compare_phi_node at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf.c:1580
>   12643   false returned: 'decl_or_type flags are different' in equals_wpa at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf.c:572
>6318   false returned: 'different tree types' in compatible_types_p at 
> /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:206
> 
> Time variable                   usr           sys          wall           GGC
>  ipa icf              :   3.40 (  6%)   0.09 (  3%)   3.49 (  6%)    27M (  1%)
>  TOTAL                :  56.60          2.94         59.58          4478M
> 
> and I'm also sending usage-wrapper graphs.

Thanks for checking!  I also uploaded some data to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92535

Note that you want to also note gimple in timevar since that is also
mostly ICF related.

It seems that ICF performance is highly sensitive to the application: it now
behaves very well on cc1plus, seems to do quite well on godot and still
does very badly on Firefox (we still have a regression there compared to gcc
9, which itself did relatively badly).

I noticed one stupid bug in operand_equal_p on component_refs (I am just
testing a fix) and there are quite a few important things that we compare
but do not hash. Those should be easy to fix. I plan to iterate through
this on Firefox.

It would be great to get chromium data.  Did you succeed in building it
recently?  I now got last year's Firefox building and working and I am
looking into updating it to the current Firefox tree, which will probably keep
me occupied for some time.

Honza
> 
> Martin




preprocessor: Update mkdeps for modules

2020-11-18 Thread Nathan Sidwell


This is slightly different to the original patch I posted.  This adds
separate module target and dependency functions (rather than a single
bi-modal function).

libcpp/
* include/cpplib.h (struct cpp_options): Add modules to
dep-options.
* include/mkdeps.h (deps_add_module_target): Declare.
(deps_add_module_dep): Declare.
* mkdeps.c (class mkdeps): Add modules, module_name, cmi_name,
is_header_unit fields.  Adjust cdtors.
(deps_add_module_target, deps_add_module_dep): New.
(make_write): Write module dependencies, if enabled.

pushing to trunk
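
For readers who want a picture of the extra rules this adds to the dependency file, here is a simplified C sketch. It is not the mkdeps code: the function format_module_rules is hypothetical, it ignores the column wrapping and Makefile quoting that make_write_name performs, and the exact spacing of the real output may differ. Only the ".c++m" suffix, the "module-name : cmi-name" rule and the .PHONY rule are taken from the patch.

```c
#include <stdio.h>

/* Hypothetical helper: format the rule "MODULE.c++m: CMI" followed by a
   .PHONY rule for the module target into BUF.  Returns the number of
   characters written, or -1 if BUF is too small.  */
static int
format_module_rules (char *buf, size_t size,
		     const char *module_name, const char *cmi_name)
{
  int n = snprintf (buf, size, "%s.c++m: %s\n.PHONY: %s.c++m\n",
		    module_name, cmi_name, module_name);
  return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

With a module named foo and an assumed CMI path of gcm.cache/foo.gcm, this produces a rule making foo.c++m depend on the CMI file, plus a .PHONY line naming foo.c++m.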

--
Nathan Sidwell
diff --git i/libcpp/include/cpplib.h w/libcpp/include/cpplib.h
index d2324266d39..75d4d0a9f2f 100644
--- i/libcpp/include/cpplib.h
+++ w/libcpp/include/cpplib.h
@@ -528,6 +528,9 @@ struct cpp_options
one.  */
 bool phony_targets;
 
+/* Generate dependency info for modules.  */
+bool modules;
+
 /* If true, no dependency is generated on the main file.  */
 bool ignore_main_file;
 
diff --git i/libcpp/include/mkdeps.h w/libcpp/include/mkdeps.h
index 593b718aaeb..9f10327eec3 100644
--- i/libcpp/include/mkdeps.h
+++ w/libcpp/include/mkdeps.h
@@ -51,6 +51,13 @@ extern void deps_add_target (class mkdeps *, const char *, int);
string as the default target is interpreted as stdin.  */
 extern void deps_add_default_target (class mkdeps *, const char *);
 
+/* Adds a module target.  The module name and cmi name are copied.  */
+extern void deps_add_module_target (struct mkdeps *, const char *module,
+const char *cmi, bool is_header);
+
+/* Adds a module dependency.  The module name is copied.  */
+extern void deps_add_module_dep (struct mkdeps *, const char *module);
+
 /* Add a dependency (appears on the right side of the colon) to the
deps list.  Dependencies will be printed in the order that they
were entered with this function.  By convention, the first
diff --git i/libcpp/mkdeps.c w/libcpp/mkdeps.c
index a989ed355fa..4a8e101b912 100644
--- i/libcpp/mkdeps.c
+++ w/libcpp/mkdeps.c
@@ -81,7 +81,7 @@ public:
   };
 
   mkdeps ()
-: quote_lwm (0)
+: module_name (NULL), cmi_name (NULL), is_header_unit (false), quote_lwm (0)
   {
   }
   ~mkdeps ()
@@ -94,14 +94,22 @@ public:
   free (const_cast <char *> (deps[i]));
 for (i = vpath.size (); i--;)
   XDELETEVEC (vpath[i].str);
+for (i = modules.size (); i--;)
+  XDELETEVEC (modules[i]);
+XDELETEVEC (module_name);
+free (const_cast <char *> (cmi_name));
   }
 
 public:
   vec targets;
   vec deps;
   vec vpath;
+  vec modules;
 
 public:
+  const char *module_name;
+  const char *cmi_name;
+  bool is_header_unit;
   unsigned short quote_lwm;
 };
 
@@ -313,6 +321,28 @@ deps_add_vpath (class mkdeps *d, const char *vpath)
 }
 }
 
+/* Add a new module target (there can only be one).  M is the module
+   name.   */
+
+void
+deps_add_module_target (struct mkdeps *d, const char *m,
+			const char *cmi, bool is_header_unit)
+{
+  gcc_assert (!d->module_name);
+  
+  d->module_name = xstrdup (m);
+  d->is_header_unit = is_header_unit;
+  d->cmi_name = xstrdup (cmi);
+}
+
+/* Add a new module dependency.  M is the module name.  */
+
+void
+deps_add_module_dep (struct mkdeps *d, const char *m)
+{
+  d->modules.push (xstrdup (m));
+}
+
 /* Write NAME, with a leading space to FP, a Makefile.  Advance COL as
appropriate, wrap at COLMAX, returning new column number.  Iff
QUOTE apply quoting.  Append TRAIL.  */
@@ -369,6 +399,8 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
   if (d->deps.size ())
 {
   column = make_write_vec (d->targets, fp, 0, colmax, d->quote_lwm);
+  if (CPP_OPTION (pfile, deps.modules) && d->cmi_name)
+	column = make_write_name (d->cmi_name, fp, column, colmax);
   fputs (":", fp);
   column++;
   make_write_vec (d->deps, fp, column, colmax);
@@ -377,6 +409,59 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
 	for (unsigned i = 1; i < d->deps.size (); i++)
 	  fprintf (fp, "%s:\n", munge (d->deps[i]));
 }
+
+  if (!CPP_OPTION (pfile, deps.modules))
+return;
+
+  if (d->modules.size ())
+{
+  column = make_write_vec (d->targets, fp, 0, colmax, d->quote_lwm);
+  if (d->cmi_name)
+	column = make_write_name (d->cmi_name, fp, column, colmax);
+  fputs (":", fp);
+  column++;
+  column = make_write_vec (d->modules, fp, column, colmax, 0, ".c++m");
+  fputs ("\n", fp);
+}
+
+  if (d->module_name)
+{
+  if (d->cmi_name)
+	{
+	  /* module-name : cmi-name */
+	  column = make_write_name (d->module_name, fp, 0, colmax,
+true, ".c++m");
+	  fputs (":", fp);
+	  column++;
+	  column = make_write_name (d->cmi_name, fp, column, colmax);
+	  fputs ("\n", fp);
+
+	  column = fprintf (fp, ".PHONY:");
+	  column = make_write_name (d->module_name, fp, column, colmax,
+true, ".c++m");
+	  fputs ("\n", fp);
+	}
+
+  if (d->cmi_name && 

Re: [Patch] varasm.c: Always output flags in merged .section for LLVM assembler compatibility [PR97827]

2020-11-18 Thread Richard Biener via Gcc-patches
On Wed, Nov 18, 2020 at 1:04 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Wed, Nov 18, 2020 at 12:51:02PM +0100, Tobias Burnus wrote:
> > As noted by Matthias when bootstrapping with AMD GCN support [PR97827]:
> > Assembler source code generated by GCC might no longer assemble with
> > LLVM's 'mc' since LLVM 11.
> >
> > The reason is that GCC generates on purpose first the section with
> > the flags, e.g. (via mergeable_constant_section)
> > >    .section .rodata.cst8,"aM",@progbits,8
> > > and then for subsequent uses, it does not repeat the flags:
> > >    .section .rodata.cst8
> >
> > GNU assembler warns (and with as >=2.35 gives an error) if the flags
> > do not match, but not if the attributes/flags are left in the other
> > same-named sections (as above) – just if they are specified and different.
> >
> > LLVM since February (in git) and released with LLVM 11 (12 Oct 2020)
> > does a similar check – but without the no-error-if-no-flag exception:
> > >   strtod.s:4472:2: error: changed section flags for .rodata.cst8, expected: 0x12
> > >   strtod.s:4472:2: error: changed section entsize for .rodata.cst8, expected: 8
> >
> >
> > The solution done by the attached patch is to emit the full flags also
> > for SECTION_MERGE.
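
The two behaviours under discussion can be sketched in C as follows. This is a toy model, not the varasm.c implementation: format_merge_section and its parameters are hypothetical, and the flag string is fixed to "aM" purely to match the .rodata.cst8 example above.

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical helper: format a .section directive for a mergeable
   constant section.  Flags and entity size are emitted on first use,
   and on later uses only if ALWAYS_EMIT_FLAGS is set.  Returns the
   number of characters written, or -1 if BUF is too small.  */
static int
format_merge_section (char *buf, size_t size, const char *name,
		      unsigned entsize, bool first_use,
		      bool always_emit_flags)
{
  int n;

  if (first_use || always_emit_flags)
    n = snprintf (buf, size, "\t.section\t%s,\"aM\",@progbits,%u\n",
		  name, entsize);
  else
    n = snprintf (buf, size, "\t.section\t%s\n", name);

  return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

With always_emit_flags false, a later use yields the bare directive that the newer LLVM assembler rejects; with it true, every use repeats the flags.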
> >
> > Side note: For AMD GCN, we rely on LLVM as "GNU as" does not handle
> > this target, yet; still, also in general, it makes sense to be
> > compatible with llvm-mc.
> >
> > OK?
>
> I think we shouldn't do this except when targeting the (buggy) llvm
> assembler.
> Specifying section flags just on first .section directive and not others
> is correct, there is no point repeating that and GNU as (but I think many
> other assemblers) has been supporting it that way forever.
> The only time one needs to specify the section flags again is for comdat
> sections because then the section name is not unique, one needs section name
> and comdat pair...

The branch might need similar updates.

Richard.

>
> Jakub
>


Re: [PATCH] recognize implied ranges for modulo.

2020-11-18 Thread Andrew MacLeod via Gcc-patches

On 11/18/20 3:35 AM, Aldy Hernandez wrote:
> On 11/17/20 11:01 PM, Andrew MacLeod wrote:
>> PR 91029 observes when
>>
>>   a % b > 0 && b >= 0,
>>
>> then a has an implied range of a >= 0.  likewise
>
> Shouldn't that be && b > 0?  b == 0 is undefined.

If you were folding, sure, but I think it's OK for equation solving.
What's important is that b is not negative.
I can easily imagine having a positive LHS and an unknown unsigned
value for b; we could still conclude that 'a' is positive even though
0 is in the "possible ranges" for 'b'.


Andrew
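
The implied range discussed above can be checked by brute force. The following is only a minimal sketch, not part of the patch: the helper name implied_range_holds and the search bound are assumptions for illustration, and it relies on C99's truncating %, where the sign of a nonzero result follows the dividend.

```c
#include <stdbool.h>

/* Exhaustively check, for all a in [-limit, limit] and all b in [1, limit],
   that a % b > 0 implies a >= 0.  b == 0 is skipped because evaluating
   a % 0 is undefined in C, which is the case the thread argues about.  */
bool
implied_range_holds (int limit)
{
  for (int a = -limit; a <= limit; a++)
    for (int b = 1; b <= limit; b++)
      if (a % b > 0 && a < 0)
        return false;   /* Counterexample: positive remainder, negative a.  */
  return true;
}
```

Calling implied_range_holds (50) should find no counterexample. This says nothing about how the range solver itself treats a range for b that contains 0.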



Re: [PATCH] libstdc++: Fix ranges::join_view::_Iterator::operator-> [LWG 3500]

2020-11-18 Thread Jonathan Wakely via Gcc-patches

On 18/11/20 09:10 -0500, Patrick Palka via Libstdc++ wrote:
> This applies the proposed resolution of LWG 3500, which corrects the
> return type and constraints of this member function to use the right
> iterator type.  Additionally, a nearby local variable is uglified.
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk and the 10
> branch?

OK, thanks.



[PATCH] libstdc++: Fix ranges::join_view::_Iterator::operator-> [LWG 3500]

2020-11-18 Thread Patrick Palka via Gcc-patches
This applies the proposed resolution of LWG 3500, which corrects the
return type and constraints of this member function to use the right
iterator type.  Additionally, a nearby local variable is uglified.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk and the 10
branch?

libstdc++-v3/ChangeLog:

* include/std/ranges (join_view::_Iterator::_M_satisfy): Uglify
local variable inner.
(join_view::_Iterator::operator->): Use _Inner_iter instead of
_Outer_iter in the function signature as per LWG 3500.
* testsuite/std/ranges/adaptors/join.cc (test08): Test it.
---
 libstdc++-v3/include/std/ranges                    | 14 ++++++++------
 libstdc++-v3/testsuite/std/ranges/adaptors/join.cc | 12 ++++++++++++
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 14d2a11f7fb..d38b1998de9 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -2128,9 +2128,9 @@ namespace views
 
for (; _M_outer != ranges::end(_M_parent->_M_base); ++_M_outer)
  {
-   auto& inner = __update_inner(*_M_outer);
-   _M_inner = ranges::begin(inner);
-   if (_M_inner != ranges::end(inner))
+   auto& __inner = __update_inner(*_M_outer);
+   _M_inner = ranges::begin(__inner);
+   if (_M_inner != ranges::end(__inner))
  return;
  }
 
@@ -2211,10 +2211,12 @@ namespace views
  operator*() const
  { return *_M_inner; }
 
- constexpr _Outer_iter
+ // _GLIBCXX_RESOLVE_LIB_DEFECTS
+ // 3500. join_view::iterator::operator->() is bogus
+ constexpr _Inner_iter
  operator->() const
-   requires __detail::__has_arrow<_Outer_iter>
- && copyable<_Outer_iter>
+   requires __detail::__has_arrow<_Inner_iter>
+ && copyable<_Inner_iter>
  { return _M_inner; }
 
  constexpr _Iterator&
diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc b/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc
index e21e7054b35..8bbea9a6b25 100644
--- a/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc
+++ b/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc
@@ -138,6 +138,17 @@ test07()
   static_assert( std::same_as, int> );
 }
 
+void
+test08()
+{
+  // LWG 3500. join_view::iterator::operator->() is bogus
+  struct X { int a; };
+  ranges::single_view<ranges::single_view<X>> s{std::in_place, std::in_place, 5};
+  auto v = s | views::join;
+  auto i = v.begin();
+  VERIFY( i->a == 5 );
+}
+
 int
 main()
 {
@@ -148,4 +159,5 @@ main()
   test05();
   test06();
   test07();
+  test08();
 }
-- 
2.29.2.260.ge31aba42fb



PING^5 [PATCH] Use the section flag 'o' for __patchable_function_entries

2020-11-18 Thread H.J. Lu via Gcc-patches
On Sat, Nov 7, 2020 at 7:47 AM H.J. Lu  wrote:
>
> On Sat, Oct 31, 2020 at 5:01 AM H.J. Lu  wrote:
> >
> > On Fri, Oct 23, 2020 at 5:41 AM H.J. Lu  wrote:
> > >
> > > On Fri, Oct 2, 2020 at 6:00 AM H.J. Lu  wrote:
> > > >
> > > > On Thu, Feb 6, 2020 at 6:57 PM H.J. Lu  wrote:
> > > > >
> > > > > This commit in GNU binutils 2.35:
> > > > >
> > > > > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=b7d072167715829eed0622616f6ae0182900de3e
> > > > >
> > > > > added the section flag 'o' to .section directive:
> > > > >
> > > > > .section __patchable_function_entries,"awo",@progbits,foo
> > > > >
> > > > > > which specifies the symbol name which the section references.  Assembler
> > > > > > creates a unique __patchable_function_entries section with the section,
> > > > > > where foo is defined, as its linked-to section.  Linker keeps a section
> > > > > > if its linked-to section is kept during garbage collection.
> > > > > >
> > > > > > This patch checks assembler support for the section flag 'o' and uses
> > > > > > it to implement __patchable_function_entries section.  Since Solaris may
> > > > > > use GNU assembler with Solaris ld, even if GNU assembler supports the
> > > > > > section flag 'o', it doesn't mean that Solaris ld supports it.  This
> > > > > > feature is disabled for Solaris targets.
> > > > >
> > > > > gcc/
> > > > >
> > > > > PR middle-end/93195
> > > > > PR middle-end/93197
> > > > > * configure.ac (HAVE_GAS_SECTION_LINK_ORDER): New.  Define if
> > > > > the assembler supports the section flag 'o' for specifying
> > > > > section with link-order.
> > > > > * dwarf2out.c (output_comdat_type_unit): Pass 0 as flags2
> > > > > to targetm.asm_out.named_section.
> > > > > * config/sol2.c (solaris_elf_asm_comdat_section): Likewise.
> > > > > * output.h (SECTION2_LINK_ORDER): New.
> > > > > (switch_to_section): Add an unsigned int argument.
> > > > > (default_no_named_section): Likewise.
> > > > > (default_elf_asm_named_section): Likewise.
> > > > > * target.def (asm_out.named_section): Likewise.
> > > > > * targhooks.c (default_print_patchable_function_entry): Pass
> > > > > current_function_decl to get_section and SECTION2_LINK_ORDER
> > > > > to switch_to_section.
> > > > > * varasm.c (default_no_named_section): Add an unsigned int
> > > > > argument.
> > > > > (default_elf_asm_named_section): Add an unsigned int argument,
> > > > > flags2.  Use 'o' flag for SECTION2_LINK_ORDER if assembler
> > > > > supports it.
> > > > > (switch_to_section): Add an unsigned int argument and pass it
> > > > > to targetm.asm_out.named_section.
> > > > > (handle_vtv_comdat_section): Pass 0 to
> > > > > targetm.asm_out.named_section.
> > > > > * config.in: Regenerated.
> > > > > * configure: Likewise.
> > > > > * doc/tm.texi: Likewise.
> > > > >
> > > > > gcc/testsuite/
> > > > >
> > > > > PR middle-end/93195
> > > > > * g++.dg/pr93195a.C: New test.
> > > > > * g++.dg/pr93195b.C: Likewise.
> > > > > * lib/target-supports.exp
> > > > > (check_effective_target_o_flag_in_section): New proc.
> > > >
> > > > PING
> > > >
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2020-February/539963.html
> > >
> > > PING.
> > >
> >
> > PING.
> >
>
> PING.

Here is a simpler patch.  OK for master?

This commit in GNU binutils 2.35:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=b7d072167715829eed0622616f6ae0182900de3e

added the section flag 'o' to .section directive:

.section __patchable_function_entries,"awo",@progbits,foo

which specifies the symbol name which the section references.  Assembler
creates a unique __patchable_function_entries section with the section,
where foo is defined, as its linked-to section.  Linker keeps a section
if its linked-to section is kept during garbage collection.

This patch checks assembler support for the section flag 'o' and uses
it to implement __patchable_function_entries section.  Since Solaris may
use GNU assembler with Solaris ld, even if GNU assembler supports the
section flag 'o', it doesn't mean that Solaris ld supports it.  This
feature is disabled for Solaris targets.

gcc/

PR middle-end/93195
PR middle-end/93197
* configure.ac (HAVE_GAS_SECTION_LINK_ORDER): New.  Define 1 if
the assembler supports the section flag 'o' for specifying
section with link-order.
* output.h (SECTION_LINK_ORDER): New.  Defined to 0x400.
(SECTION_MACH_DEP): Changed from 0x400 to 0x800.
* targhooks.c (default_print_patchable_function_entry): Pass
SECTION_LINK_ORDER to switch_to_section if the section flag 'o'
works.  Pass current_function_decl to switch_to_section.
* varasm.c (default_elf_asm_named_section): Use 'o' flag for
SECTION_LINK_ORDER if assembler supports 

Re: PR97849: aarch64: ICE (segfault) during GIMPLE pass: ifcvt

2020-11-18 Thread Richard Biener
On Wed, 18 Nov 2020, Prathamesh Kulkarni wrote:

> Hi,
> For the following test-case (slightly reduced from PR)
> int a, b, c;
> 
> int g() {
>   char i = 0;
>   for (c = 0; c <= 8; c++)
> --i;
> 
>   while (b) {
> _Bool f = i <= 0;
> a = (a == 0) ? 0 : f / a;
>   }
> }
> 
> The compiler segfaults with -O1 -march=armv8.2-a+sve in ifcvt_local_dce.
> 
> IIUC, the issue here is that tree-if-conv.c:predicate_rhs_code
> processes the following statement:
> iftmp.2_7 = a_lsm.10_11 != 0 ? iftmp.2_13 : 0;
> and records <iftmp.2_7, iftmp.2_13> mapping.
> 
> However RPO VN eliminates iftmp.2_13:
> Removing dead stmt iftmp.2_13 = .COND_DIV (_29, _4, a_lsm.10_11, 0);
> 
> and we end up replacing iftmp.2_7 with a dead ssa_name in ifcvt_local_dce:
> FOR_EACH_VEC_ELT (redundant_ssa_names, i, name_pair)
> replace_uses_by (name_pair->first, name_pair->second);
>   redundant_ssa_names.release ();
> 
> resulting in incorrect IR, and segfault down the line.
> 
> To avoid clashing of RPO VN with redundant_ssa_names, the patch simply moves
> ifcvt_local_dce before do_rpo_vn, which avoids the segfault.
> Does that look OK?
> (Although I guess doing DCE after VN is better in principle)

Yes, I'd say just moving

  FOR_EACH_VEC_ELT (redundant_ssa_names, i, name_pair)
replace_uses_by (name_pair->first, name_pair->second);
  redundant_ssa_names.release ();

before rpo_vn makes more sense, no?

OK with that change.
Thanks,
Richard.

> Thanks,
> Prathamesh
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [Patch] testsuite/libgomp.c/usleep.h: Use sleep-loop also for GCN

2020-11-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 18, 2020 at 01:58:58PM +0100, Tobias Burnus wrote:
> At least here newlib is built without HAVE_POSIX / 'posix' subdirectory
> and, hence, among others without 'usleep'.
> 
> For nvptx, the same issue exists – and there 'omp declare variant' is used
> to call a burn-cycles loop as replacement.
> Affects target-32.c and thread-limit-2.c.
> 
> OK?

Then please rename nvptx_usleep to fallback_usleep.

Ok with that change.

Jakub



[Patch] testsuite/libgomp.c/usleep.h: Use sleep-loop also for GCN

2020-11-18 Thread Tobias Burnus

At least here newlib is built without HAVE_POSIX / 'posix' subdirectory
and, hence, among others without 'usleep'.

For nvptx, the same issue exists – and there 'omp declare variant' is used
to call a burn-cycles loop as replacement.
Affects target-32.c and thread-limit-2.c.

OK?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
testsuite/libgomp.c/usleep.h: Use sleep-loop also for GCN

As typically configured, newlib's libc.a does not build 'posix' and,
hence, usleep is not available. Thus, use the same fallback as for nvptx.

libgomp/
	* testsuite/libgomp.c/usleep.h (nvptx_usleep): Also use for device={arch(gcn)}.

diff --git a/libgomp/testsuite/libgomp.c/usleep.h b/libgomp/testsuite/libgomp.c/usleep.h
index c01aaa0a88f..1535ad06201 100644
--- a/libgomp/testsuite/libgomp.c/usleep.h
+++ b/libgomp/testsuite/libgomp.c/usleep.h
@@ -14,6 +14,7 @@ nvptx_usleep (useconds_t d)
 }
 
 #pragma omp declare variant (nvptx_usleep) match(construct={target},device={arch(nvptx)})
+#pragma omp declare variant (nvptx_usleep) match(construct={target},device={arch(gcn)})
 #pragma omp declare variant (usleep) match(user={condition(1)})
 int
 tgt_usleep (useconds_t d)
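
For readers without the test header at hand, the kind of burn-cycles replacement that the 'omp declare variant' lines select on offload targets could look roughly like the sketch below. This is an assumption-laden illustration, not the contents of usleep.h: the name fallback_usleep is taken from the rename requested in the review, while the plain unsigned parameter type and the iteration factor are made up.

```c
/* Hypothetical fallback that burns cycles instead of sleeping.  It does
   not try to match usleep's timing; it only produces some delay.  */
int
fallback_usleep (unsigned int usec)
{
  /* volatile keeps the compiler from deleting the otherwise empty loop.  */
  volatile unsigned long long spins = 0;
  /* Arbitrary scale factor (an assumption); widened to avoid overflow.  */
  unsigned long long limit = (unsigned long long) usec * 2000ULL;

  while (spins < limit)
    spins++;
  return 0;
}
```

Like usleep, the sketch returns 0 on success, so a caller such as tgt_usleep can forward its result unchanged.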


Re: [PATCH v5] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-11-18 Thread Segher Boessenkool
On Wed, Nov 18, 2020 at 08:31:28AM +0100, Richard Biener wrote:
> On Tue, 17 Nov 2020, Jeff Law wrote:
> > On 11/4/20 8:10 AM, Raoni Fassina Firmino via Gcc-patches wrote:
> > > On Wed, Nov 04, 2020 at 10:35:03AM +0100, Richard Biener wrote:
> > > >>> +/* Expand call EXP to the fegetround builtin (from C99 fenv.h), returning the
> > > >>> +   result and setting it in TARGET.  Otherwise return NULL_RTX on failure.  */
> > > >>> +static rtx
> > > >>> +expand_builtin_fegetround (tree exp, rtx target, machine_mode target_mode)
> > > >>> +{
> > > >>> +  if (!validate_arglist (exp, VOID_TYPE))
> > > >>> +    return NULL_RTX;
> > > >>> +
> > > >>> +  insn_code icode = direct_optab_handler (fegetround_optab, SImode);
> > > >>> +  if (icode == CODE_FOR_nothing)
> > > >>> +    return NULL_RTX;
> > > >>> +
> > > >>> +  if (target == 0
> > > >>> +      || GET_MODE (target) != target_mode
> > > >>> +      || !(*insn_data[icode].operand[0].predicate) (target, target_mode))
> > > >>> +    target = gen_reg_rtx (target_mode);
> > > >>> +
> > > >>> +  rtx pat = GEN_FCN (icode) (target);
> > > >>> +  if (!pat)
> > > >>> +    return NULL_RTX;
> > > >>> +  emit_insn (pat);
> > >> I think you need to verify whether the expansion ended up in 'target'
> > >> and otherwise emit a move since usually 'target' is just a hint.
> > > I thought the "if (target == 0 ..." took care of that. The expands do
> > > emit a move, if that helps.
> > It looks like if we have a passed in target and it either has the wrong
> > mode or it does not match the predicate, then we generate a new target
> > and use that instead.  I don't see where we'd copy from that new target
> > to the original desired target.  For some expanders the caller would
> > handle that, but I don't see how that's possible for this one without
> > the caller digging into the generated RTL to determine that
> > expand_builtin_fegetround put the result somewhere other than TARGET and
> > thus a copy is needed.
> > 
> > That may be what Richi is worried about.
> 
> I know we've added missing
> 
>   if (!rtx_equal_p (target, ops[0].value))
> emit_move_insn (target, ops[0].value);
> 
> to several expanders (using expand_insn rather than manual
> GEN_FCN (icode) calls).

We can handle the constants issue similarly to what we do for
__builtin_fpclassify, too.


Segher


Re: [PATCH v2] Add if-chain to switch conversion pass.

2020-11-18 Thread Martin Liška

On 11/16/20 1:21 PM, Richard Biener wrote:
> but the most trivial thing would be to feed the pass the
> balanced-tree generated by switch expansion where I
> would expect us to be able to detect it as the original switch again.

Well, if we want to support such matching, then please defer it to a phase 2.
I don't see it as a common pattern that people write such code in the wild.

Right now, we have some local analysis and one can eventually build a more advanced
algorithm on top of it. Can we please make progress for GCC 11 with the current
approach that will cover quite some interesting if-chains?

Thanks,
Martin


Re: Improve handling of memory operands in ipa-icf 4/4

2020-11-18 Thread Martin Liška

On 11/16/20 12:20 AM, Jan Hubicka wrote:
> This is controlled by -fipa-icf-alias-sets
>
> The patch drops too early, so we may end up processing function twice.  Also if
> merging is not performed we lose code quality for no win (this is rare case).
> My original plan was to remember the mismatched parameter and apply them only
> after merging decisions are finished, but I was not sure how to do that in
> ipa-icf.  In particular we need to ensure transitivity: if
> function foo is merged to bar, we also need to be sure that we dropped
> base alias sets in functions that are called by bar even if they themselves
> are not merged. Martin, is there easy way to implement this on top of current ICF?

Well, you will need to create a set of merged functions and then traverse all
callers of these (via cgraph_node callers). It should not be so difficult, or?
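
The traversal suggested here is a plain worklist walk over call-graph edges. The sketch below only models the idea on a toy adjacency matrix; struct toy_callgraph, MAX_FUNCS and propagate_from_merged are hypothetical names and do not use GCC's cgraph_node API. Whether the edges are read as caller-to-callee or the reverse is left to whoever fills in the matrix, since the thread mentions both directions.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 8

/* Hypothetical toy call graph: edge[i][j] is true when function i is
   linked to function j in the direction we want to walk.  */
struct toy_callgraph
{
  size_t n_funcs;
  bool edge[MAX_FUNCS][MAX_FUNCS];
};

/* AFFECTED holds the set of merged functions on entry.  On return it also
   holds every function reachable from that set.  Returns the number of
   affected functions.  */
size_t
propagate_from_merged (const struct toy_callgraph *g, bool affected[MAX_FUNCS])
{
  size_t worklist[MAX_FUNCS];   /* Each function is pushed at most once.  */
  size_t top = 0, count = 0;

  /* Seed the worklist with the merged functions.  */
  for (size_t i = 0; i < g->n_funcs; i++)
    if (affected[i])
      {
        worklist[top++] = i;
        count++;
      }

  /* Follow edges until no new function is discovered.  */
  while (top > 0)
    {
      size_t f = worklist[--top];
      for (size_t j = 0; j < g->n_funcs; j++)
        if (g->edge[f][j] && !affected[j])
          {
            affected[j] = true;
            worklist[top++] = j;
            count++;
          }
    }
  return count;
}
```

With edges 0->1 and 1->2 and function 0 marked as merged, the walk marks functions 0, 1 and 2 and leaves an unrelated function 3 untouched.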



> Patch improves ICF code size savings on cc1plus to 1.3% compared to 0.6% before patch
> and 3% with -fno-strict-aliasing. 6802 functions are merged and 3975 base alias sets
> are dropped, 3642 originating from hash-table.h.
>
> Time and memory stats are:
>   ipa lto gimple in  :   1.01 (  3%)   0.47 ( 12%)   1.56 (  4%)   228M ( 14%)
>   tree operand scan  :   0.40 (  1%)   0.16 (  4%)   0.49 (  1%)    39M (  2%)
>   ipa icf            :   1.96 (  6%)   0.20 (  5%)   2.14 (  6%)    12M (  1%)
>   TOTAL              :  31.99          3.99         36.07          1632M
>
> compared to --disable-ipa-icf:
>   TOTAL              :  26.52          2.44         29.21          1354M
>
> To recover more of the -fno-strict-aliasing savings we could have
> a 4-state -fipa-icf-alias-sets:
>   -fipa-icf-alias-sets=0 do not drop any TBAA
>   -fipa-icf-alias-sets=1 drop base alias sets
>   -fipa-icf-alias-sets=2 drop pointer ref alias sets
>   -fipa-icf-alias-sets=3 also drop ref alias sets to 0 for completely incompatible types.
>
> On GCC the most common reason is now a difference in pointer types, so =2
> should get the 3% of code size.
>
> Bootstrapped/regtested x86_64-linux.
> OK for mainline?  I would be also fine with making this wait for next
> stage1 if that looks too intrusive (with part 3 of series at least icf
> is not consuming a lot of memory for nothing), but there ought to be
> again very nice savings on libreoffice which I think had double-digit
> saving for icf (13% if I recall correctly).  I am busy tomorrow but will
> start looking into firefox, clang and libreoffice builds again on
> Wednesday.


I would recommend deferring that to the next stage1.

Thanks for working on that Honza,
Martin


Re: [Ada] Build support units for 128-bit integer types on 64-bit platforms

2020-11-18 Thread Eric Botcazou
> that broke the build of an ada cross compiler targeting
> powerpc64le-linux-gnu. target_cpu is powerpc64le which is not matched by
> the Makefile logic.
> 
> Ok for the trunk?
> 
>   PR ada/97859
>   * Makefile.rtl (powerpc% linux%): Also match powerpc64le cpu.

Yes, thanks.

-- 
Eric Botcazou




Re: Improve handling of memory operands in ipa-icf 3/4

2020-11-18 Thread Martin Liška

On 11/13/20 6:50 PM, Jan Hubicka wrote:
> Bootstrapped/regtested x86_64-linux. I plan to commit it on Monday if there are
> no complaints.

Hello Honza.

Thank you very much for the patch set.
It's a nice improvement and it will eventually fix the WPA slowness caused by IPA ICF.

I made some measurements for master before the first patch and with this patch (3/4)
on the godot game engine:

BEFORE:

Equal symbols: 15690
Totally needed symbols: 17913, fraction of loaded symbols: 39.05%

2156989   false returned: '' in equals_private at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:879
1099887   false returned: 'operand_equal_p failed' in compare_operand at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:307
1048605   false returned: 'types are not compatible' in compatible_types_p at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:210
1047679   false returned: 'GIMPLE assignment operands are different' in compare_gimple_assign at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:632
1047517   false returned: 'GIMPLE NOP LHS type mismatch' in compare_gimple_assign at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:628
  57659   false returned: 'call function types are not compatible' in compare_gimple_call at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:573
  52088   false returned: 'PHI node comparison returns false' in equals_private at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:914
  52088   false returned: '' in compare_phi_node at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:1552
  13565   false returned: 'decl_or_type flags are different' in equals_wpa at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:567
   9919   false returned: 'result types are different' in equals_wpa at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:616

Time variable                  usr           sys          wall           GGC
 ipa icf            :   4.31 (  7%)   0.06 (  2%)   4.38 (  7%)  6008k (  0%)
 TOTAL              :  57.57          3.49         61.11          4830M
AFTER:

Equal symbols: 17019
Totally needed symbols: 19875, fraction of loaded symbols: 70.88%

 377327   false returned: '' in equals_private at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:886
 213086   false returned: 'operand_equal_p failed' in compare_operand at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:356
 212179   false returned: 'compare_ao_refs failed (access path difference)' in compare_operand at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:345
 159947   false returned: '' in compare_gimple_call at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:607
 147098   false returned: 'GIMPLE assignment operands are different' in compare_gimple_assign at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:699
  66123   false returned: 'GIMPLE call operands are different' in compare_gimple_call at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:656
  52088   false returned: 'PHI node comparison returns false' in equals_private at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:921
  52088   false returned: '' in compare_phi_node at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:1580
  12643   false returned: 'decl_or_type flags are different' in equals_wpa at /home/marxin/Programming/gcc2/gcc/ipa-icf.c:572
   6318   false returned: 'different tree types' in compatible_types_p at /home/marxin/Programming/gcc2/gcc/ipa-icf-gimple.c:206

Time variable                  usr           sys          wall           GGC
 ipa icf            :   3.40 (  6%)   0.09 (  3%)   3.49 (  6%)    27M (  1%)
 TOTAL              :  56.60          2.94         59.58          4478M

and I'm also sending usage-wrapper graphs.

Martin


godot-files.tar.zst
Description: application/zstd


Re: [Patch] varasm.c: Always output flags in merged .section for LLVM assembler compatibility [PR97827]

2020-11-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 18, 2020 at 12:51:02PM +0100, Tobias Burnus wrote:
> As noted by Matthias when bootstrapping with AMD GCN support [PR97827]:
> Assembler source code generated by GCC might no longer assemble with
> LLVM's 'mc' since LLVM 11.
> 
> The reason is that GCC on purpose generates the section with
> the flags first, e.g. (via mergeable_constant_section)
>    .section .rodata.cst8,"aM",@progbits,8
> and then for subsequent uses, it does not repeat the flags:
>    .section .rodata.cst8
> 
> GNU assembler warns (and with as >=2.35 gives an error) if the flags
> do not match, but not if the attributes/flags are left out in the other
> same-named sections (as above) – just if they are specified and different.
> 
> LLVM since February (in git) and released with LLVM 11 (12 Oct 2020)
> does a similar check – but without the no-error-if-no-flag exception:
>   strtod.s:4472:2: error: changed section flags for .rodata.cst8, expected: 0x12
>   strtod.s:4472:2: error: changed section entsize for .rodata.cst8, expected: 8
> 
> 
> The solution done by the attached patch is to emit the full flags also
> for SECTION_MERGE.
> 
> Side note: For AMD GCN, we rely on LLVM as "GNU as" does not handle
> this target, yet; still, also in general, it makes sense to be
> compatible with llvm-mc.
> 
> OK?

I think we shouldn't do this except when targeting the (buggy) llvm
assembler.
Specifying section flags just on the first .section directive and not on others
is correct, there is no point repeating them, and GNU as (and I think many
other assemblers) has supported it that way forever.
The only time one needs to specify the section flags again is for comdat
sections because then the section name is not unique, one needs the section name
and comdat pair...

Jakub



[Patch] varasm.c: Always output flags in merged .section for LLVM assembler compatibility [PR97827]

2020-11-18 Thread Tobias Burnus

As noted by Matthias when bootstrapping with AMD GCN support [PR97827]:
Assembler source code generated by GCC might no longer assemble with
LLVM's 'mc' since LLVM 11.

The reason is that GCC on purpose generates the section with
the flags first, e.g. (via mergeable_constant_section)
   .section .rodata.cst8,"aM",@progbits,8
and then for subsequent uses, it does not repeat the flags:
   .section .rodata.cst8

GNU assembler warns (and with as >=2.35 gives an error) if the flags
do not match, but not if the attributes/flags are left out in the other
same-named sections (as above) – just if they are specified and different.

LLVM since February (in git) and released with LLVM 11 (12 Oct 2020)
does a similar check – but without the no-error-if-no-flag exception:
  strtod.s:4472:2: error: changed section flags for .rodata.cst8, expected: 0x12
  strtod.s:4472:2: error: changed section entsize for .rodata.cst8, expected: 8


The solution done by the attached patch is to emit the full flags also
for SECTION_MERGE.

Side note: For AMD GCN, we rely on LLVM as "GNU as" does not handle
this target, yet; still, also in general, it makes sense to be
compatible with llvm-mc.

OK?

Tobias

References:
- GCC bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827
  with some analysis and quotes from LLVM MC and GNU AS
- Filed https://bugs.llvm.org/show_bug.cgi?id=48201 about this
  issue but no one commented on that issue so far
- Added to LLVM with commit https://reviews.llvm.org/D73999
- Some alignment with GNU as was attempted (see all three links
  above), but the rule in gas/config/obj-elf.c of doing the check
  only "if (attr != 0)" was missed when implementing the
    if (Section->getType() != Type)
  check in LLVM MC.

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
varasm.c: Always output flags in merged .section for LLVM assembler compatibility [PR97827]

For compatibility with LLVM 11's 'mc' assembler, the flags have to be
repeated every time. See also LLVM Bug 48201 for this issue and
https://reviews.llvm.org/D73999 for the patch causing the issue.

gcc/
	PR target/97827
	* varasm.c (default_elf_asm_named_section): Always output all
	flags if SECTION_MERGE, even if already declared before.

diff --git a/gcc/varasm.c b/gcc/varasm.c
index ada99940f65..51a507393a8 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -6738,9 +6738,11 @@ default_elf_asm_named_section (const char *name, unsigned int flags,
   /* If we have already declared this section, we can use an
  abbreviated form to switch back to it -- unless this section is
  part of a COMDAT groups, in which case GAS requires the full
- declaration every time.  */
+ declaration every time.  LLVM's MC linker requires that the
+ flags are identical, thus avoid the abbreviated form with MERGE.  */
   if (!(HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
-  && (flags & SECTION_DECLARED))
+  && (flags & SECTION_DECLARED)
+  && !(flags & SECTION_MERGE))
 {
   fprintf (asm_out_file, "\t.section\t%s\n", name);
   return;
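
The condition changed by this hunk can be read as a small predicate: the short '.section name' form is used only when the section was already declared, is not a COMDAT (linkonce) section on a COMDAT-capable target, and, with this patch, is not a mergeable section. The following sketch restates that logic in isolation. The TOY_* flag values and the function name are assumptions for illustration; GCC's real SECTION_* bits live in output.h and have different values.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration only.  */
enum
{
  TOY_SECTION_DECLARED = 1 << 0,
  TOY_SECTION_LINKONCE = 1 << 1,
  TOY_SECTION_MERGE    = 1 << 2
};

/* Return true when the abbreviated ".section NAME" form may be emitted,
   mirroring the patched condition in default_elf_asm_named_section.  */
bool
toy_use_abbreviated_form (unsigned int flags, bool have_comdat_group)
{
  if (have_comdat_group && (flags & TOY_SECTION_LINKONCE))
    return false;   /* COMDAT sections need the full declaration.  */
  if (!(flags & TOY_SECTION_DECLARED))
    return false;   /* First use: flags must be spelled out.  */
  if (flags & TOY_SECTION_MERGE)
    return false;   /* Proposed change: always repeat flags for MERGE.  */
  return true;
}
```

Under this proposal a redeclared .rodata.cst8 would always get its full "aM",@progbits,8 form, which is the behaviour the review objects to enabling unconditionally.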


Re: [PATCH] openmp: Retire nest-var ICV

2020-11-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 12, 2020 at 10:44:35PM +, Kwok Cheung Yeung wrote:
> +  /* OMP_NESTED is deprecated in OpenMP 5.0.  */
> +  if (parse_boolean ("OMP_NESTED", &nested))
> + gomp_global_icv.max_active_levels_var =
> + nested ? gomp_supported_active_levels : 1;

Formatting - = should be on the next line, indented 2 columns further from
gomp_global_icv.

>  int
>  omp_get_nested (void)
>  {
>struct gomp_task_icv *icv = gomp_icv (false);
> -  return icv->nest_var;
> +  return icv->max_active_levels_var > 1
> +  && icv->max_active_levels_var > omp_get_active_level ();

Formatting, should be:
  return (icv->max_active_levels_var > 1
  && icv->max_active_levels_var > omp_get_active_level ());

> @@ -118,19 +122,21 @@ omp_get_thread_limit (void)
>  void
>  omp_set_max_active_levels (int max_levels)
>  {
> +  struct gomp_task_icv *icv = gomp_icv (false);

Should be gomp_icv (true), because it modifies the ICVs rather than
just querying them.  And perhaps move it inside of the if (max_levels >= 0)
if.

>if (max_levels >= 0)
>  {
>if (max_levels <= gomp_supported_active_levels)
> - gomp_max_active_levels_var = max_levels;
> + icv->max_active_levels_var = max_levels;
>else
> - gomp_max_active_levels_var = gomp_supported_active_levels;
> + icv->max_active_levels_var = gomp_supported_active_levels;
>  }
>  }

> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h
> @@ -428,7 +428,7 @@ struct gomp_task_icv
>int default_device_var;
>unsigned int thread_limit_var;
>bool dyn_var;
> -  bool nest_var;
> +  unsigned long max_active_levels_var;
>char bind_var;
>/* Internal ICV.  */
>struct target_mem_desc *target_data;

This is in __thread vars, so we can't waste space in useless paddings or
overlong fields.
On 64-bit arches, thread_limit_var ends on 64-bit boundary and target_data
starts at that boundary, so with bool; unsigned long; char in between
that means 24 bytes in between them including 14 bytes of padding.

There is no point to make max_active_levels_var unsigned long,
gomp_supported_active_levels is INT_MAX, so one can't have anything higher
than that anyway.  So, either
  bool dyn_var;
  char bind_var;
  int max_active_levels_var;
which will be 8 bytes together on both 64-bit and 32-bit arches, or
perhaps we could go even further and lower the max active levels to USHRT_MAX
and use unsigned short, 65535 nesting levels is many thousands more than
what a reasonable program should try, after all, only active levels (aka
where there are 2 or more threads) count, so already 32 active levels
when each level has 2 threads means 4 billion threads, and 64 active levels
means 18 quintillion threads, so even 64 active levels are impossible
in 64-bit address spaces even if each thread occupied just 1 byte rather
than several pages.

So, let's change gomp_supported_active_levels to say 255 and use
  bool dyn_var;
  unsigned char max_active_levels_var;
  char bind_var;
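
The padding argument above can be reproduced with two cut-down structs. This is only a sketch under assumptions: icv_wide and icv_compact are hypothetical stand-ins that copy just the neighbouring fields named in the review, not the real struct gomp_task_icv, and the clamp helper with its 255 limit follows the suggestion in this mail rather than any committed code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Layout as posted: unsigned long between two one-byte fields.  */
struct icv_wide
{
  int default_device_var;
  unsigned int thread_limit_var;
  bool dyn_var;
  unsigned long max_active_levels_var;
  char bind_var;
  void *target_data;
};

/* Layout as suggested: three one-byte fields packed together.  */
struct icv_compact
{
  int default_device_var;
  unsigned int thread_limit_var;
  bool dyn_var;
  unsigned char max_active_levels_var;
  char bind_var;
  void *target_data;
};

/* Assumed upper limit, per the "say 255" suggestion.  */
#define TOY_SUPPORTED_ACTIVE_LEVELS 255

/* Clamp a requested nesting depth the way the quoted
   omp_set_max_active_levels does: negative requests leave CURRENT
   unchanged, oversized requests saturate at the supported limit.  */
unsigned char
toy_clamp_active_levels (int requested, unsigned char current)
{
  if (requested < 0)
    return current;
  if (requested <= TOY_SUPPORTED_ACTIVE_LEVELS)
    return (unsigned char) requested;
  return TOY_SUPPORTED_ACTIVE_LEVELS;
}
```

On a typical LP64 target the wide layout places target_data 24 bytes after dyn_var, while the compact one needs only 8, so sizeof (struct icv_compact) comes out smaller than sizeof (struct icv_wide).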

Jakub



Re: [Ada] Build support units for 128-bit integer types on 64-bit platforms

2020-11-18 Thread Matthias Klose
On 10/22/20 2:12 PM, Pierre-Marie de Rodat wrote:
> This enables the build of the support units for 128-bit integer types
> in the full runtime of 64-bit platforms.
> 
> Tested on x86_64-pc-linux-gnu, committed on trunk
> 
> gcc/ada/
> 
>   * Makefile.rtl (64-bit platforms): Add GNATRTL_128BIT_PAIRS to
>   the LIBGNAT_TARGET_PAIRS list and also GNATRTL_128BIT_OBJS to
>   the EXTRA_GNATRTL_NONTASKING_OBJS list.
> 

that broke the build of an ada cross compiler targeting powerpc64le-linux-gnu.
target_cpu is powerpc64le which is not matched by the Makefile logic.

Ok for the trunk?

PR ada/97859
* Makefile.rtl (powerpc% linux%): Also match powerpc64le cpu.

--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2305,7 +2305,7 @@ ifeq ($(strip $(filter-out powerpc% linux%,$(target_cpu) $(target_os))),)
   $(ATOMICS_BUILTINS_TARGET_PAIRS) \
   system.ads

PR97849: aarch64: ICE (segfault) during GIMPLE pass: ifcvt

2020-11-18 Thread Prathamesh Kulkarni via Gcc-patches
Hi,
For the following test-case (slightly reduced from PR)
int a, b, c;

int g() {
  char i = 0;
  for (c = 0; c <= 8; c++)
--i;

  while (b) {
_Bool f = i <= 0;
a = (a == 0) ? 0 : f / a;
  }
}

The compiler segfaults with -O1 -march=armv8.2-a+sve in ifcvt_local_dce.

IIUC, the issue here is that tree-if-conv.c:predicate_rhs_code
processes the following statement:
iftmp.2_7 = a_lsm.10_11 != 0 ? iftmp.2_13 : 0;
and records <iftmp.2_7, iftmp.2_13> mapping.

However RPO VN eliminates iftmp.2_13:
Removing dead stmt iftmp.2_13 = .COND_DIV (_29, _4, a_lsm.10_11, 0);

and we end up replacing iftmp.2_7 with a dead ssa_name in ifcvt_local_dce:
FOR_EACH_VEC_ELT (redundant_ssa_names, i, name_pair)
replace_uses_by (name_pair->first, name_pair->second);
  redundant_ssa_names.release ();

resulting in incorrect IR, and segfault down the line.

To avoid clashing of RPO VN with redundant_ssa_names, the patch simply moves
ifcvt_local_dce before do_rpo_vn, which avoids the segfault.
Does that look OK?
(Although I guess doing DCE after VN is better in principle)

Thanks,
Prathamesh


pr97849-1.diff
Description: Binary data


Re: [Patch] Fortran: cleanup OpenMP's OMP_LIST_* handling

2020-11-18 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 16, 2020 at 03:08:36PM +0100, Tobias Burnus wrote:
> as discussed the other day (I think via IRC or in a patch review),
>   omp_list_clauses
> did grow quite a bit: it has 26 entries and I am about to add two
> more.
> 
> This variable is used as:
>  typedef struct gfc_omp_clauses
>  {
>...
>gfc_omp_namelist *lists[OMP_LIST_NUM];
> 
> with sizeof(gfc_omp_namelist) = 5*sizeof(void*) + sizeof(locus);
> 
> This patch replaces it with a linked list.
> 
> 
> I am not really sure the new version is that readable and whether
> we move into the right direction or not.

> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -1240,7 +1240,7 @@ enum gfc_omp_linear_op
>  /* For use in OpenMP clauses in case we need extra information
> (aligned clause alignment, linear clause step, etc.).  */
>  
> -typedef struct gfc_omp_namelist
> +typedef struct gfc_omp_namelist_item
>  {
>struct gfc_symbol *sym;
>struct gfc_expr *expr;
> @@ -1254,17 +1254,16 @@ typedef struct gfc_omp_namelist
>bool lastprivate_conditional;
>  } u;
>struct gfc_omp_namelist_udr *udr;
> -  struct gfc_omp_namelist *next;
> +  struct gfc_omp_namelist_item *next;
>locus where;
>  }
> -gfc_omp_namelist;
> +gfc_omp_namelist_item;
> +#define gfc_get_omp_namelist_item() XCNEW (gfc_omp_namelist_item)
>  
> -#define gfc_get_omp_namelist() XCNEW (gfc_omp_namelist)
> -
> -enum
> +enum omp_list_clauses
>  {
> -  OMP_LIST_FIRST,
> -  OMP_LIST_PRIVATE = OMP_LIST_FIRST,
> +  OMP_LIST_UNSET,
> +  OMP_LIST_PRIVATE,
>OMP_LIST_FIRSTPRIVATE,
>OMP_LIST_LASTPRIVATE,
>OMP_LIST_COPYPRIVATE,
> @@ -1289,8 +1288,7 @@ enum
>OMP_LIST_IS_DEVICE_PTR,
>OMP_LIST_USE_DEVICE_PTR,
>OMP_LIST_USE_DEVICE_ADDR,
> -  OMP_LIST_NONTEMPORAL,
> -  OMP_LIST_NUM
> +  OMP_LIST_NONTEMPORAL
>  };
>  
>  /* Because a symbol can belong to multiple namelists, they must be
> @@ -1386,12 +1384,22 @@ enum gfc_omp_memorder
>OMP_MEMORDER_RELAXED
>  };
>  
> +typedef struct gfc_omp_namelist
> +{
> +  enum omp_list_clauses clause;
> +  gfc_omp_namelist_item *item;
> +  struct gfc_omp_namelist *next;
> +}
> +gfc_omp_namelist;

I meant something slightly different, in particular not
this gfc_omp_namelist vs. gfc_omp_namelist_item (i.e. linked list of linked
lists), but just a linked list, i.e. just add
  ENUM_BITFIELD (enum omp_list_clauses) clause : 16;
  int flags : 16;
into gfc_omp_namelist where the flags could hold various flags for certain
clauses, e.g. for OMP_LIST_LASTPRIVATE it could hold the conditional
modifier, for OMP_LIST_REDUCTION the inscan and task modifiers, etc.
Or do you see any advantages in having the same clauses for different
symbols adjacent in the FE?
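
To make the single-list idea concrete, here is a minimal sketch in plain C.
It is not the gfortran data structure: the struct name, the cut-down enum,
the flag macros and the helper functions are all hypothetical, and the flags
field is unsigned here (rather than int) purely to keep the bit tests simple.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down clause kinds; the real enum has many more.  */
enum omp_list_clauses
{
  OMP_LIST_PRIVATE,
  OMP_LIST_LASTPRIVATE,
  OMP_LIST_REDUCTION
};

/* Hypothetical flag bits; their meaning depends on the clause kind.  */
#define LASTPRIVATE_CONDITIONAL 0x1u
#define REDUCTION_INSCAN        0x1u
#define REDUCTION_TASK          0x2u

/* One node per (symbol, clause) pair, all chained on a single list.  */
struct namelist
{
  const char *sym;           /* stands in for struct gfc_symbol *  */
  unsigned int clause : 16;  /* an enum omp_list_clauses value  */
  unsigned int flags : 16;   /* clause-specific modifiers  */
  struct namelist *next;
};

/* Count the entries of kind CLAUSE on LIST.  */
size_t
count_clause (const struct namelist *list, enum omp_list_clauses clause)
{
  size_t n = 0;
  for (; list; list = list->next)
    if (list->clause == (unsigned int) clause)
      n++;
  return n;
}

/* Return nonzero if SYM is listed as lastprivate with the conditional
   modifier set.  */
int
is_conditional_lastprivate (const struct namelist *list, const char *sym)
{
  assert (sym != NULL);
  for (; list; list = list->next)
    if (list->clause == OMP_LIST_LASTPRIVATE
        && (list->flags & LASTPRIVATE_CONDITIONAL) != 0
        && strcmp (list->sym, sym) == 0)
      return 1;
  return 0;
}
```

With this layout a walk over one clause kind has to skip unrelated nodes,
which is the cost to weigh against not needing a list of lists.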

Jakub



[committed][GCC-10 backport] d: Fix undefined template references with circular module imports

2020-11-18 Thread Iain Buclaw via Gcc-patches
Hi,

This patch backports r11-4424 to GCC-10, as it fixes a critical
link-time error that occurred whilst bootstrapping the D implementation
of the DMD front-end.

Regression tested on x86_64-linux-gnu and committed to branch.

Regards
Iain

---
gcc/d/ChangeLog:

* dmd/dtemplate.c (TemplateInstance::semantic): Propagate the root
module where the instantiated template should belong from the instance
to all member scopes.

gcc/testsuite/ChangeLog:

* gdc.test/compilable/imports/test21299/func.d: New test.
* gdc.test/compilable/imports/test21299/mtype.d: New test.
* gdc.test/compilable/imports/test21299/rootstringtable.d: New test.
* gdc.test/compilable/test21299a.d: New test.
* gdc.test/compilable/test21299b.d: New test.
* gdc.test/compilable/test21299c.d: New test.
* gdc.test/compilable/test21299d.d: New test.

(cherry picked from commit e419ede8915eeb879de3d9c026cd4213aaceb86a)
---
 gcc/d/dmd/dtemplate.c | 66 -
 .../compilable/imports/test21299/func.d   |  8 ++
 .../compilable/imports/test21299/mtype.d  |  8 ++
 .../imports/test21299/rootstringtable.d   | 96 +++
 .../gdc.test/compilable/test21299a.d  |  4 +
 .../gdc.test/compilable/test21299b.d  |  4 +
 .../gdc.test/compilable/test21299c.d  |  5 +
 .../gdc.test/compilable/test21299d.d  | 27 ++
 8 files changed, 215 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gdc.test/compilable/imports/test21299/func.d
 create mode 100644 gcc/testsuite/gdc.test/compilable/imports/test21299/mtype.d
 create mode 100644 gcc/testsuite/gdc.test/compilable/imports/test21299/rootstringtable.d
 create mode 100644 gcc/testsuite/gdc.test/compilable/test21299a.d
 create mode 100644 gcc/testsuite/gdc.test/compilable/test21299b.d
 create mode 100644 gcc/testsuite/gdc.test/compilable/test21299c.d
 create mode 100644 gcc/testsuite/gdc.test/compilable/test21299d.d

diff --git a/gcc/d/dmd/dtemplate.c b/gcc/d/dmd/dtemplate.c
index 9d211eb0340..cba2beea0dd 100644
--- a/gcc/d/dmd/dtemplate.c
+++ b/gcc/d/dmd/dtemplate.c
@@ -33,6 +33,7 @@
 #include "hdrgen.h"
 #include "id.h"
 #include "attrib.h"
+#include "cond.h"
 #include "tokens.h"
 
 #define IDX_NOTFOUND (0x12345678)   // index is not found
@@ -6088,17 +6089,18 @@ Lerror:
 if (minst && minst->isRoot() && !(inst->minst && inst->minst->isRoot()))
 {
 /* Swap the position of 'inst' and 'this' in the instantiation graph.
- * Then, the primary instance `inst` will be changed to a root instance.
+ * Then, the primary instance `inst` will be changed to a root instance,
+ * along with all members of `inst` having their scopes updated.
  *
  * Before:
- *  non-root -> A!() -> B!()[inst] -> C!()
+ *  non-root -> A!() -> B!()[inst] -> C!() { members[non-root] }
  *  |
  *  root -> D!() -> B!()[this]
  *
  * After:
  *  non-root -> A!() -> B!()[this]
  *  |
- *  root -> D!() -> B!()[inst] -> C!()
+ *  root -> D!() -> B!()[inst] -> C!() { members[root] }
  */
 Module *mi = minst;
 TemplateInstance *ti = tinst;
@@ -6107,6 +6109,64 @@ Lerror:
 inst->minst = mi;
 inst->tinst = ti;
 
+/* https://issues.dlang.org/show_bug.cgi?id=21299
+   `minst` has been updated on the primary instance `inst` so it is
+   now coming from a root module, however all Dsymbol `inst.members`
+   of the instance still have their `_scope.minst` pointing at the
+   original non-root module. We must now propagate `minst` to all
+   members so that forward referenced dependencies that get
+   instantiated will also be appended to the root module, otherwise
+   there will be undefined references at link-time.  */
+class InstMemberWalker : public Visitor
+{
+public:
+TemplateInstance *inst;
+
+InstMemberWalker(TemplateInstance *inst)
+: inst(inst) { }
+
+void visit(Dsymbol *d)
+{
+if (d->_scope)
+d->_scope->minst = inst->minst;
+}
+
+void visit(ScopeDsymbol *sds)
+{
+if (!sds->members)
+return;
+for (size_t i = 0; i < sds->members->dim; i++)
+{
+Dsymbol *s = (*sds->members)[i];
+s->accept(this);
+}
+visit((Dsymbol *)sds);
+}
+
+void 

[committed][GCC-10 backport] d: Explicitly determine which built-in copysign function to call.

2020-11-18 Thread Iain Buclaw via Gcc-patches
Hi,

This patch backports r11-4980 to GCC-10, as it fixes an ICE that could
occur on some targets.

Regression tested on x86_64-linux-gnu and committed to branch.

Regards
Iain

---
gcc/d/ChangeLog:

* intrinsics.cc (expand_intrinsic_copysign): Explicitly determine
which built-in copysign function to call.

(cherry picked from commit d975d6dce98a3e26ddd304d50dad2786b3acecc4)
---
 gcc/d/intrinsics.cc | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/d/intrinsics.cc b/gcc/d/intrinsics.cc
index c32819885bb..51cbd7b92fd 100644
--- a/gcc/d/intrinsics.cc
+++ b/gcc/d/intrinsics.cc
@@ -430,11 +430,14 @@ expand_intrinsic_copysign (tree callexp)
 from = fold_convert (type, from);
 
   /* Which variant of __builtin_copysign* should we call?  */
-  tree builtin = mathfn_built_in (type, BUILT_IN_COPYSIGN);
-  gcc_assert (builtin != NULL_TREE);
+  built_in_function code = (type == float_type_node) ? BUILT_IN_COPYSIGNF
+: (type == double_type_node) ? BUILT_IN_COPYSIGN
+: (type == long_double_type_node) ? BUILT_IN_COPYSIGNL
+: END_BUILTINS;
 
-  return call_builtin_fn (callexp, DECL_FUNCTION_CODE (builtin), 2,
- to, from);
+  gcc_assert (code != END_BUILTINS);
+
+  return call_builtin_fn (callexp, code, 2, to, from);
 }
 
 /* Expand a front-end intrinsic call to pow().  This takes two arguments, the
-- 
2.27.0
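
The selection that the patch above makes explicit can be pictured as a plain
mapping from the floating-point type to a built-in code, with an assertion
for anything unmapped.  The sketch below only models that shape; the enums
and function names are hypothetical stand-ins, not the GCC tree nodes or
built_in_function codes.

```c
#include <assert.h>

/* Hypothetical stand-ins for the front end's floating-point type nodes.  */
enum fp_kind { FP_FLOAT, FP_DOUBLE, FP_LONG_DOUBLE, FP_OTHER };

/* Hypothetical stand-ins for the built-in function codes.  */
enum builtin_code { COPYSIGNF, COPYSIGN, COPYSIGNL, NO_BUILTIN };

/* Map a type kind to the copysign variant to call, or NO_BUILTIN when
   there is no matching variant.  */
enum builtin_code
copysign_code_for (enum fp_kind kind)
{
  switch (kind)
    {
    case FP_FLOAT:
      return COPYSIGNF;
    case FP_DOUBLE:
      return COPYSIGN;
    case FP_LONG_DOUBLE:
      return COPYSIGNL;
    default:
      return NO_BUILTIN;
    }
}

/* Like copysign_code_for, but treat an unmapped type as a bug in the
   caller, mirroring the assertion in the patch.  */
enum builtin_code
require_copysign_code (enum fp_kind kind)
{
  enum builtin_code code = copysign_code_for (kind);
  assert (code != NO_BUILTIN);
  return code;
}
```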



[PATCH] tree-optimization/97886 - deal with strange LC PHI nodes

2020-11-18 Thread Richard Biener
This makes vectorization properly assign vector types to PHI
nodes that copy from externals on loop exit edges.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-11-18  Richard Biener  

PR tree-optimization/97886
* tree-vect-loop.c (vectorizable_lc_phi): Properly assign
vector types to invariants for SLP.
---
 gcc/tree-vect-loop.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index ecaaf0116d3..856bbfebf7c 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -7593,6 +7593,17 @@ vectorizable_lc_phi (loop_vec_info loop_vinfo,
 
   if (!vec_stmt) /* transformation not required.  */
 {
+  /* Deal with copies from externs or constants that disguise as
+loop-closed PHI nodes (PR97886).  */
+  if (slp_node
+ && !vect_maybe_update_slp_op_vectype (SLP_TREE_CHILDREN (slp_node)[0],
+   SLP_TREE_VECTYPE (slp_node)))
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"incompatible vector types for invariants\n");
+ return false;
+   }
   STMT_VINFO_TYPE (stmt_info) = lc_phi_info_type;
   return true;
 }
-- 
2.26.2


Re: [PATCH] plugins: Allow plugins to handle global_options changes

2020-11-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 18, 2020 at 10:39:46AM +0100, Richard Biener wrote:
> We already have --{enable,disable}-plugin, so could remove it when
> those are not enabled.

Here is a variant that does that:

2020-11-18  Jakub Jelinek  

* opts.h (struct cl_var): New type.
(cl_vars): Declare.
* optc-gen.awk: Generate cl_vars array.

--- gcc/opts.h.jj   2020-04-30 11:49:28.462900760 +0200
+++ gcc/opts.h  2020-08-24 10:21:08.563288412 +0200
@@ -124,6 +124,14 @@ struct cl_option
   int range_max;
 };
 
+struct cl_var
+{
+  /* Name of the variable.  */
+  const char *var_name;
+  /* Offset of field for this var in struct gcc_options.  */
+  unsigned short var_offset;
+};
+
 /* Records that the state of an option consists of SIZE bytes starting
at DATA.  DATA might point to CH in some cases.  */
 struct cl_option_state {
@@ -134,6 +142,9 @@ struct cl_option_state {
 
 extern const struct cl_option cl_options[];
 extern const unsigned int cl_options_count;
+#ifdef ENABLE_PLUGIN
+extern const struct cl_var cl_vars[];
+#endif
 extern const char *const lang_names[];
 extern const unsigned int cl_lang_count;
 
--- gcc/optc-gen.awk.jj 2020-01-12 11:54:36.691409214 +0100
+++ gcc/optc-gen.awk2020-08-24 10:19:49.410410288 +0200
@@ -592,5 +592,29 @@ for (i = 0; i < n_opts; i++) {
 }
 print "}   "
 
+split("", var_seen, ":")
+print "\n#if !defined(GENERATOR_FILE) && defined(ENABLE_PLUGIN)"
+print "DEBUG_VARIABLE const struct cl_var cl_vars[] =\n{"
+
+for (i = 0; i < n_opts; i++) {
+   name = var_name(flags[i]);
+   if (name == "")
+   continue;
+   var_seen[name] = 1;
+}
+
+for (i = 0; i < n_extra_vars; i++) {
+   var = extra_vars[i]
+   sub(" *=.*", "", var)
+   name = var
+   sub("^.*[ *]", "", name)
+   sub("\\[.*\\]$", "", name)
+   if (name in var_seen)
+   continue;
+   print "  { " quote name quote ", offsetof (struct gcc_options, x_" name ") },"
+   var_seen[name] = 1
 }
 
+print "  { NULL, (unsigned short) -1 }\n};\n#endif"
+
+}
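
As an illustration of how a plugin might consume such a table, here is a
minimal sketch of a by-name lookup over a NULL-terminated array of
struct cl_var entries.  Only the struct layout and the terminator come from
the patch; toy_options, toy_cl_vars and lookup_int_var are hypothetical
names, and the toy struct stands in for the real struct gcc_options.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct gcc_options; only the x_ prefix convention is
   taken from GCC, the contents are made up.  */
struct toy_options
{
  int x_optimize;
  int x_flag_sanitize;
};

/* Same layout as the struct added by the patch.  */
struct cl_var
{
  const char *var_name;        /* Name of the variable.  */
  unsigned short var_offset;   /* Offset of its field in the options.  */
};

/* Hypothetical table in the shape optc-gen.awk would emit.  */
const struct cl_var toy_cl_vars[] =
{
  { "optimize", offsetof (struct toy_options, x_optimize) },
  { "flag_sanitize", offsetof (struct toy_options, x_flag_sanitize) },
  { NULL, (unsigned short) -1 }
};

/* Return a pointer to the int variable NAME inside OPTS by walking VARS
   up to its NULL terminator, or NULL if NAME is not listed.  */
int *
lookup_int_var (struct toy_options *opts, const struct cl_var *vars,
                const char *name)
{
  assert (opts != NULL && vars != NULL && name != NULL);
  for (; vars->var_name != NULL; vars++)
    if (strcmp (vars->var_name, name) == 0)
      return (int *) ((char *) opts + vars->var_offset);
  return NULL;
}
```

A plugin doing the lookup once at startup and caching the pointer would no
longer depend on the field offsets it was compiled against.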

Jakub



Re: [PATCH] plugins: Allow plugins to handle global_options changes

2020-11-18 Thread Richard Biener via Gcc-patches
On Wed, Nov 18, 2020 at 10:14 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> Reposting with self-contained description per Joseph's request:
>
> Any time somebody adds or removes an option in some *.opt file (which e.g.
> on the 10 branch after branching off 11 happened 7 times already), many
> offsets in global_options variable change and so plugins that ever access
> GCC options or other global_options values are ABI dependent on it.  It is
> true we don't guarantee ABI stability for plugins, but we change the most
> often used data structures on the release branches only very rarely and so
> the options changes are the most problematic for ABI stability of plugins.
>
> Annobin uses a way to remap accesses to some of the global_options.x_* by
> looking them up in the cl_options array where we have
> offsetof (struct gcc_options, x_flag_lto)
> etc. remembered, but sadly doesn't do it for all options (e.g. some flag_*
> etc. option accesses may be hidden in various macros like POINTER_SIZE),
> and more importantly some struct gcc_options offsets are not covered at all.
> E.g. there is no offsetof (struct gcc_options, x_optimize),
> offsetof (struct gcc_options, x_flag_sanitize) etc.  Those are usually:
> Variable
> int optimize
> in the *.opt files.
>
> The following patch allows the plugins to deal with reshuffling of even
> the global_options fields that aren't tracked in cl_options by adding
> another array that describes those, which adds an 816 bytes long array
> and 1039 bytes in string literals, so 1855 .rodata bytes in total ATM.
>
> If needed, this could be guarded by some configure option if those 1855
> .rodata (0.02% of .rodata size) bytes is something that people don't want
> to sacrifice for this.
>
> Bootstrapped/regtested again last night on x86_64-linux and i686-linux, ok
> for trunk?
>
> Or if not, would it be ok if this was guarded by some
> --enable-plugin-option-tracking
> or other configure option?

We already have --{enable,disable}-plugin, so could remove it when
those are not enabled.  OTOH the GCC binaries are already gigantic in size.

Richard.

> 2020-11-18  Jakub Jelinek  
>
> * opts.h (struct cl_var): New type.
> (cl_vars): Declare.
> * optc-gen.awk: Generate cl_vars array.
>
> --- gcc/opts.h.jj   2020-04-30 11:49:28.462900760 +0200
> +++ gcc/opts.h  2020-08-24 10:21:08.563288412 +0200
> @@ -124,6 +124,14 @@ struct cl_option
>int range_max;
>  };
>
> +struct cl_var
> +{
> +  /* Name of the variable.  */
> +  const char *var_name;
> +  /* Offset of field for this var in struct gcc_options.  */
> +  unsigned short var_offset;
> +};
> +
>  /* Records that the state of an option consists of SIZE bytes starting
> at DATA.  DATA might point to CH in some cases.  */
>  struct cl_option_state {
> @@ -134,6 +142,7 @@ struct cl_option_state {
>
>  extern const struct cl_option cl_options[];
>  extern const unsigned int cl_options_count;
> +extern const struct cl_var cl_vars[];
>  extern const char *const lang_names[];
>  extern const unsigned int cl_lang_count;
>
> --- gcc/optc-gen.awk.jj 2020-01-12 11:54:36.691409214 +0100
> +++ gcc/optc-gen.awk2020-08-24 10:19:49.410410288 +0200
> @@ -592,5 +592,29 @@ for (i = 0; i < n_opts; i++) {
>  }
>  print "}   "
>
> +split("", var_seen, ":")
> +print "\n#ifndef GENERATOR_FILE"
> +print "DEBUG_VARIABLE const struct cl_var cl_vars[] =\n{"
> +
> +for (i = 0; i < n_opts; i++) {
> +   name = var_name(flags[i]);
> +   if (name == "")
> +   continue;
> +   var_seen[name] = 1;
> +}
> +
> +for (i = 0; i < n_extra_vars; i++) {
> +   var = extra_vars[i]
> +   sub(" *=.*", "", var)
> +   name = var
> +   sub("^.*[ *]", "", name)
> +   sub("\\[.*\\]$", "", name)
> +   if (name in var_seen)
> +   continue;
> +   print "  { " quote name quote ", offsetof (struct gcc_options, x_" name ") },"
> +   var_seen[name] = 1
>  }
>
> +print "  { NULL, (unsigned short) -1 }\n};\n#endif"
> +
> +}
>
> Jakub
>


C++ patch ping^2

2020-11-18 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping the updated bit_cast patch:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557781.html

Thanks

Jakub



[PATCH] options, lto: Optimize streaming of optimization nodes

2020-11-18 Thread Jakub Jelinek via Gcc-patches
Hi!

Reposting with self-contained description per Joseph's request:

Honza mentioned that especially for the new param machinery, most of
streamed values are probably going to be the default values.  Perhaps
somehow we could stream them more effectively.

This patch implements it and brings further savings, the size
goes down from 574 bytes to 273 bytes, i.e. less than half.
Not trying to handle enums because the code doesn't know if (enum ...) 10
is even valid, similarly non-parameters because those really generally
don't have large initializers, and params without Init (those are 0
initialized and thus don't need to be handled).
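
To illustrate why xoring with the default helps, here is a minimal sketch.
The threshold of 10 is taken from the patch; the 7-bits-per-byte length
function is only an assumed model of a variable-length encoding, not the
actual bitpack routines, and both function names are hypothetical.

```c
#include <assert.h>

/* Xor VALUE with its default INIT when INIT is larger than 10, as the
   generated stream-out code does.  Applying the same function again on
   the stream-in side restores the original value.  */
unsigned int
xor_with_default (unsigned int value, unsigned int init)
{
  return init > 10 ? value ^ init : value;
}

/* Bytes needed for V in an assumed 7-bits-per-byte variable-length
   encoding; smaller numbers need fewer bytes.  */
unsigned int
varlen_bytes (unsigned int v)
{
  unsigned int n = 1;
  while (v >= 0x80)
    {
      v >>= 7;
      n++;
    }
  return n;
}
```

A parameter left at a default of 1000 is streamed as 0 and takes one byte
in this model instead of two, and values near the default also tend to
shrink.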

Bootstrapped/regtested again on x86_64-linux and i686-linux, ok for trunk?

2020-11-18  Jakub Jelinek  

* optc-save-gen.awk: Initialize var_opt_init.  In
cl_optimization_stream_out for params with default values larger than
10, xor the default value with the actual parameter value.  In
cl_optimization_stream_in repeat the above xor.

--- gcc/optc-save-gen.awk.jj2020-09-14 10:51:54.493740942 +0200
+++ gcc/optc-save-gen.awk   2020-09-14 11:39:39.441602594 +0200
@@ -1186,6 +1186,7 @@ for (i = 0; i < n_opts; i++) {
var_opt_val_type[n_opt_val] = otype;
var_opt_val[n_opt_val] = "x_" name;
var_opt_hash[n_opt_val] = flag_set_p("Optimization", flags[i]);
+   var_opt_init[n_opt_val] = opt_args("Init", flags[i]);
n_opt_val++;
}
 }
@@ -1257,10 +1258,21 @@ for (i = 0; i < n_opt_val; i++) {
otype = var_opt_val_type[i];
if (otype ~ "^const char \\**$")
print "  bp_pack_string (ob, bp, ptr->" name", true);";
-   else if (otype ~ "^unsigned")
-   print "  bp_pack_var_len_unsigned (bp, ptr->" name");";
-   else
-   print "  bp_pack_var_len_int (bp, ptr->" name");";
+   else {
+   if (otype ~ "^unsigned") {
+   sgn = "unsigned";
+   } else {
+   sgn = "int";
+   }
+   if (name ~ "^x_param" && !(otype ~ "^enum ") && var_opt_init[i]) {
+   print "  if (" var_opt_init[i] " > (" var_opt_val_type[i] ") 10)";
+   print "bp_pack_var_len_" sgn " (bp, ptr->" name" ^ " var_opt_init[i] ");";
+   print "  else";
+   print "bp_pack_var_len_" sgn " (bp, ptr->" name");";
+   } else {
+   print "  bp_pack_var_len_" sgn " (bp, ptr->" name");";
+   }
+   }
 }
 print "  for (size_t i = 0; i < sizeof (ptr->explicit_mask) / sizeof (ptr->explicit_mask[0]); i++)";
 print "bp_pack_value (bp, ptr->explicit_mask[i], 64);";
@@ -1281,10 +1293,18 @@ for (i = 0; i < n_opt_val; i++) {
print "  if (ptr->" name")";
print "ptr->" name" = xstrdup (ptr->" name");";
}
-   else if (otype ~ "^unsigned")
-   print "  ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_var_len_unsigned (bp);";
-   else
-   print "  ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_var_len_int (bp);";
+   else {
+   if (otype ~ "^unsigned") {
+   sgn = "unsigned";
+   } else {
+   sgn = "int";
+   }
+   print "  ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_var_len_" sgn " (bp);";
+   if (name ~ "^x_param" && !(otype ~ "^enum ") && var_opt_init[i]) {
+   print "  if (" var_opt_init[i] " > (" var_opt_val_type[i] ") 10)";
+   print "ptr->" name" ^= " var_opt_init[i] ";";
+   }
+   }
 }
 print "  for (size_t i = 0; i < sizeof (ptr->explicit_mask) / sizeof (ptr->explicit_mask[0]); i++)";
 print "ptr->explicit_mask[i] = bp_unpack_value (bp, 64);";


Jakub



[PATCH] configury: --enable-link-serialization support

2020-11-18 Thread Jakub Jelinek via Gcc-patches
Hi!

Reposting with self-contained description per Joseph's request:

When performing LTO bootstraps, especially when using tmpfs for /tmp,
one can bring a machine to a halt when using higher levels of parallelism
and a large number of FEs, because there are too many concurrent LTO
link commands running at the same time and each one of them puts most of the
middle-end/backend objects into /tmp.

We have the --enable-link-mutex configure option, but it has the big problem
that it decreases the number of available jobs by the number of link commands
waiting for the lock, so e.g. when doing a make -j32 build with 11 different
big programs linked with $(LLINKER) we end up with just 22 effective jobs, and
with e.g. make -j8 with those 11 different big programs we most likely
serialize everything during linking onto a single job.
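
The arithmetic behind those two numbers can be written down as a small
sketch.  It assumes the worst case in which all large links become ready at
the same time, so that every link command that got a job slot except the
lock holder just sits waiting; the function name is hypothetical.

```c
#include <assert.h>

/* Job slots still doing useful work under --enable-link-mutex when LINKS
   large link commands are ready at once in a make -jJOBS build.  Each
   started link occupies a slot, but only the lock holder makes progress.  */
int
effective_jobs_with_link_mutex (int jobs, int links)
{
  assert (jobs > 0 && links >= 0);
  int started = links < jobs ? links : jobs;    /* links holding a slot */
  int waiting = started > 0 ? started - 1 : 0;  /* all but the lock holder */
  return jobs - waiting;
}
```

For make -j32 with 11 links this gives 32 - 10 = 22, and for make -j8 it
gives 8 - 7 = 1, matching the figures above.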

The following patch implements a new configure option,
--enable-link-serialization, which serializes the links differently.  As it
doesn't use the mutex, just reimplementing the old option would be strange;
we can deprecate and later remove the old option.  The new option doesn't use
any shell mutexes, but uses make dependencies.

The option is implemented inside of gcc/ configure and Makefiles,
which means that even inside of gcc/ make all (as well as e.g. make lto-dump)
will serialize and build all previous large binaries when configured this
way.
One can always make -j32 cc1 DO_LINK_SERIALIZATION=
to avoid that.
Furthermore, I've implemented the idea I wrote about, so that
--enable-link-serialization
is the same as
--enable-link-serialization=1
and means the large link commands are serialized; one can use (the default)
--disable-link-serialization
which will cause all links to be parallelizable, but one can also use
--enable-link-serialization=3
etc., which says that at most 3 of the large link commands can run
concurrently.
And finally I've implemented (only if the serialization is enabled) simple
progress bars for the linking.
With --enable-link-serialization and e.g. the 5 large links I have in my
current tree (cc1, cc1plus, f951, lto1 and lto-dump), before the linking it
prints
Linking |==--      | 20%
and after it
Linking |====      | 40%
(each == characters stand for already finished links, each --
characters stand for the link being started).
With --enable-link-serialization=3 it will change the way the start is
printed, one will get:
Linking |--        | 0%
at the start of cc1 link,
Linking |>>--      | 0%
at the start of the second large link and
Linking |>>>>--    | 0%
at the start of the third large link, where the >> characters stand for
already pending links.  The printing at the end of link command is
the same as with the full serialization, i.e. for the above 3:
Linking |==        | 20%
Linking |====      | 40%
Linking |======    | 60%
but one could actually get them in any order depending on which of those 3
finishes first - to get it 100% accurate I'd need to add some directory with
files representing finished links or similar, doesn't seem worth it.
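
For readers who want to see the bar format spelled out, here is a minimal
sketch of a formatter following the description above: == for each finished
link, >> for each pending one, -- for the link being started, and the
percentage computed from the finished links only.  It assumes two columns
per link, and the function name and signature are hypothetical; the real
output is produced from the Makefiles, not from C.

```c
#include <assert.h>
#include <stdio.h>

/* Format a progress line for TOTAL large links into BUF (of SIZE bytes)
   and return BUF.  FINISHED, PENDING and STARTING count the links in each
   state; the remaining links are shown as blanks.  */
char *
format_link_progress (char *buf, size_t size, int total,
                      int finished, int pending, int starting)
{
  char bar[128];
  int pos = 0;

  assert (buf != NULL && total > 0 && 2 * total < (int) sizeof bar);
  assert (finished >= 0 && pending >= 0 && starting >= 0);
  assert (finished + pending + starting <= total);

  for (int i = 0; i < total; i++)
    {
      char c = ' ';
      if (i < finished)
        c = '=';
      else if (i < finished + pending)
        c = '>';
      else if (i < finished + pending + starting)
        c = '-';
      bar[pos++] = c;   /* two columns per link */
      bar[pos++] = c;
    }
  bar[pos] = '\0';

  snprintf (buf, size, "Linking |%s| %d%%", bar, finished * 100 / total);
  return buf;
}
```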

Bootstrapped/regtested again last night on x86_64-linux and i686-linux, ok
for trunk?

2020-11-18  Jakub Jelinek  

gcc/
* configure.ac: Add $lang.prev rules, INDEX.$lang and SERIAL_LIST and
SERIAL_COUNT variables to Make-hooks.
(--enable-link-serialization): New configure option.
* Makefile.in (DO_LINK_SERIALIZATION, LINK_PROGRESS): New variables.
* doc/install.texi (--enable-link-serialization): Document.
* configure: Regenerated.
gcc/c/
* Make-lang.in (c.serial): New goal.
(.PHONY): Add c.serial c.prev.
(cc1$(exeext)): Call LINK_PROGRESS.
gcc/cp/
* Make-lang.in (c++.serial): New goal.
(.PHONY): Add c++.serial c++.prev.
(cc1plus$(exeext)): Depend on c++.prev.  Call LINK_PROGRESS.
gcc/fortran/
* Make-lang.in (fortran.serial): New goal.
(.PHONY): Add fortran.serial fortran.prev.
(f951$(exeext)): Depend on fortran.prev.  Call LINK_PROGRESS.
gcc/lto/
* Make-lang.in (lto, lto1.serial, lto2.serial): New goals.
(.PHONY): Add lto lto1.serial lto1.prev lto2.serial lto2.prev.
(lto.all.cross, lto.start.encap): Remove dependencies.
($(LTO_EXE)): Depend on lto1.prev.  Call LINK_PROGRESS.
($(LTO_DUMP_EXE)): Depend on lto2.prev.  Call LINK_PROGRESS.
gcc/objc/
* Make-lang.in (objc.serial): New goal.
(.PHONY): Add objc.serial objc.prev.
(cc1obj$(exeext)): Depend on objc.prev.  Call LINK_PROGRESS.
gcc/objcp/
* Make-lang.in (obj-c++.serial): New goal.
(.PHONY): Add obj-c++.serial obj-c++.prev.
(cc1objplus$(exeext)): Depend on obj-c++.prev.  Call LINK_PROGRESS.
gcc/ada/
* gcc-interface/Make-lang.in (ada.serial): New goal.
(.PHONY): Add ada.serial ada.prev.
(gnat1$(exeext)): Depend on ada.prev.  Call LINK_PROGRESS.
gcc/brig/
* Make-lang.in (brig.serial): New goal.
   

[PATCH] d: Fix LHS of array concatenation evaluated before the RHS.

2020-11-18 Thread Iain Buclaw via Gcc-patches
Hi,

In an array append expression:

array ~= fun(array);

The array in the left hand side of the expression was extended before
evaluating the result of the right hand side, which resulted in the
new, still uninitialized array element being used before it was set.

This fixes that so that the result of the right hand side is always
saved in a reusable temporary before assigning to the destination.
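
The intended order of evaluation can be modelled in plain C as follows.
This is only a sketch of the idea, not what the D front-end generates:
int_array, sum and append_sum are hypothetical names, with append_sum
playing the role of `array ~= fun(array)`.

```c
#include <assert.h>
#include <stdlib.h>

/* A toy growable array of unsigned integers.  */
struct int_array
{
  unsigned int *data;
  size_t length;
};

/* Sum all elements currently in A.  */
unsigned int
sum (const struct int_array *a)
{
  unsigned int result = 0;
  for (size_t i = 0; i < a->length; i++)
    result += a->data[i];
  return result;
}

/* Append sum (A) to A.  The right hand side is evaluated into a temporary
   before the array is extended, so it never sees the new, not yet
   initialized element.  Returns 0 on success, -1 if out of memory.  */
int
append_sum (struct int_array *a)
{
  assert (a != NULL);

  unsigned int tmp = sum (a);   /* evaluate the RHS first */

  unsigned int *grown = realloc (a->data,
                                 (a->length + 1) * sizeof *a->data);
  if (grown == NULL)
    return -1;

  grown[a->length] = tmp;       /* then extend and store */
  a->data = grown;
  a->length++;
  return 0;
}
```

Swapping the first two steps would let sum read the freshly added slot,
which is the failure mode described above.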

Bootstrapped and regression tested on x86_64-linux-gnu, committed to
mainline, and backported to the release/gcc-10 branch, as it is a
regression from gcc-9.

Regards
Iain.

---
gcc/d/ChangeLog:

PR d/97843
* d-codegen.cc (build_assign): Evaluate TARGET_EXPR before use in
the right hand side of an assignment.
* expr.cc (ExprVisitor::visit (CatAssignExp *)): Force a TARGET_EXPR
on the element to append if it is a CALL_EXPR.

gcc/testsuite/ChangeLog:

PR d/97843
* gdc.dg/torture/pr97843.d: New test.
---
 gcc/d/d-codegen.cc |  5 +++-
 gcc/d/expr.cc  |  3 +++
 gcc/testsuite/gdc.dg/torture/pr97843.d | 37 ++
 3 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gdc.dg/torture/pr97843.d

diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
index 1f2d65c4ae2..4c16f6a822b 100644
--- a/gcc/d/d-codegen.cc
+++ b/gcc/d/d-codegen.cc
@@ -1343,7 +1343,10 @@ build_assign (tree_code code, tree lhs, tree rhs)
 since that would cause the LHS to be constructed twice.
 So we force the TARGET_EXPR to be expanded without a target.  */
   if (code != INIT_EXPR)
-   rhs = compound_expr (rhs, TARGET_EXPR_SLOT (rhs));
+   {
+ init = compound_expr (init, rhs);
+ rhs = TARGET_EXPR_SLOT (rhs);
+   }
   else
{
  d_mark_addressable (lhs);
diff --git a/gcc/d/expr.cc b/gcc/d/expr.cc
index 79f212c3a08..ef2bf5f2e36 100644
--- a/gcc/d/expr.cc
+++ b/gcc/d/expr.cc
@@ -884,6 +884,9 @@ public:
tree t2 = build_expr (e->e2);
tree expr = stabilize_expr ();
 
+   if (TREE_CODE (t2) == CALL_EXPR)
+ t2 = force_target_expr (t2);
+
result = modify_expr (build_deref (ptrexp), t2);
 
this->result_ = compound_expr (expr, result);
diff --git a/gcc/testsuite/gdc.dg/torture/pr97843.d b/gcc/testsuite/gdc.dg/torture/pr97843.d
new file mode 100644
index 000..9a775f2b650
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/torture/pr97843.d
@@ -0,0 +1,37 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90601
+// { dg-additional-options "-fmain -funittest" }
+// { dg-do run }
+// { dg-skip-if "needs gcc/config.d" { ! d_runtime } }
+
+struct Sdtor
+{
+int value;
+~this() { }
+}
+
+Sdtor sum(Sdtor[] sdtors)
+{
+int result;
+foreach (s; sdtors)
+result += s.value;
+return Sdtor(result);
+}
+
+uint sum(uint[] ints)
+{
+uint result;
+foreach(i; ints)
+result += i;
+return result;
+}
+
+unittest
+{
+Sdtor[] sdtors = [Sdtor(0), Sdtor(1)];
+sdtors ~= sum(sdtors);
+assert(sdtors == [Sdtor(0), Sdtor(1), Sdtor(1)]);
+
+uint[] ints = [0, 1];
+ints ~= ints.sum;
+assert(ints == [0, 1, 1]);
+}
-- 
2.27.0



[PATCH] d: Fix a couple of ICEs found in the dmd front-end (PR97842)

2020-11-18 Thread Iain Buclaw via Gcc-patches
Hi,

This patch merges the D front-end implementation with upstream dmd
b6a779e49, fixing two segmentation faults. One when encountering an
incomplete static if, and another when resolving typeof() expressions
whilst gagging is on.

Bootstrapped and regression tested on x86_64-linux-gnu, committed to
mainline, and backported to the release/gcc-10 branch.

Regards
Iain.

---
gcc/d/ChangeLog:

PR d/97842
* dmd/MERGE: Merge upstream dmd b6a779e49
---
 gcc/d/dmd/MERGE   |  2 +-
 gcc/d/dmd/cond.c  |  4 ++
 gcc/d/dmd/mtype.c |  6 +++
 .../gdc.test/fail_compilation/fail18970.d | 37 +++
 .../fail_compilation/imports/test21164a.d |  9 +
 .../fail_compilation/imports/test21164b.d |  4 ++
 .../fail_compilation/imports/test21164c.d | 10 +
 .../fail_compilation/imports/test21164d.d |  9 +
 .../gdc.test/fail_compilation/test21164.d | 13 +++
 9 files changed, 93 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/fail18970.d
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/imports/test21164a.d
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/imports/test21164b.d
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/imports/test21164c.d
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/imports/test21164d.d
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/test21164.d

diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index e2a0bab2e4a..b00cb8262a7 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-95044d8e45a4320f07d9c75b4eb30e55688a8195
+b6a779e49a3bba8be6272e6730e14cbb6293ef77
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/cond.c b/gcc/d/dmd/cond.c
index beda133ffdb..9f76e83238e 100644
--- a/gcc/d/dmd/cond.c
+++ b/gcc/d/dmd/cond.c
@@ -705,6 +705,10 @@ int StaticIfCondition::include(Scope *sc)
 sc = sc->push(sc->scopesym);
 
 bool errors = false;
+
+if (!exp)
+goto Lerror;
+
 bool result = evalStaticCondition(sc, exp, exp, errors);
 sc->pop();
 
diff --git a/gcc/d/dmd/mtype.c b/gcc/d/dmd/mtype.c
index bc66be028c1..6f0195af305 100644
--- a/gcc/d/dmd/mtype.c
+++ b/gcc/d/dmd/mtype.c
@@ -7418,6 +7418,12 @@ void TypeTypeof::resolve(Loc loc, Scope *sc, Expression **pe, Type **pt, Dsymbol
 
 //printf("TypeTypeof::resolve(sc = %p, idents = '%s')\n", sc, toChars());
 //static int nest; if (++nest == 50) *(char*)0=0;
+if (sc == NULL)
+{
+*pt = Type::terror;
+error(loc, "Invalid scope.");
+return;
+}
 if (inuse)
 {
 inuse = 2;
diff --git a/gcc/testsuite/gdc.test/fail_compilation/fail18970.d b/gcc/testsuite/gdc.test/fail_compilation/fail18970.d
new file mode 100644
index 000..846a5782d7d
--- /dev/null
+++ b/gcc/testsuite/gdc.test/fail_compilation/fail18970.d
@@ -0,0 +1,37 @@
+/*
+TEST_OUTPUT:
+---
+fail_compilation/fail18970.d(22): Error: no property `y` for type `fail18970.S`
+fail_compilation/fail18970.d(29): Error: no property `yyy` for type `fail18970.S2`
+---
+*/
+
+// https://issues.dlang.org/show_bug.cgi?id=18970
+
+struct S
+{
+auto opDispatch(string name)(int)
+{
+alias T = typeof(x);
+static assert(!is(T.U));
+return 0;
+}
+}
+void f()
+{
+S().y(1);
+}
+
+struct S2
+{
+this(int)
+{
+this.yyy;
+}
+
+auto opDispatch(string name)()
+{
+alias T = typeof(x);
+static if(is(T.U)) {}
+}
+}
diff --git a/gcc/testsuite/gdc.test/fail_compilation/imports/test21164a.d b/gcc/testsuite/gdc.test/fail_compilation/imports/test21164a.d
new file mode 100644
index 000..e5fcd43595e
--- /dev/null
+++ b/gcc/testsuite/gdc.test/fail_compilation/imports/test21164a.d
@@ -0,0 +1,9 @@
+struct D(E)
+{
+void G(){
+import imports.test21164d;
+I;
+}
+
+}
+
diff --git a/gcc/testsuite/gdc.test/fail_compilation/imports/test21164b.d b/gcc/testsuite/gdc.test/fail_compilation/imports/test21164b.d
new file mode 100644
index 000..ece5476654e
--- /dev/null
+++ b/gcc/testsuite/gdc.test/fail_compilation/imports/test21164b.d
@@ -0,0 +1,4 @@
+import imports.test21164c;
+enum N = O();
+alias Q = R!(N, S);
+
diff --git a/gcc/testsuite/gdc.test/fail_compilation/imports/test21164c.d b/gcc/testsuite/gdc.test/fail_compilation/imports/test21164c.d
new file mode 100644
index 000..21a252f5036
--- /dev/null
+++ b/gcc/testsuite/gdc.test/fail_compilation/imports/test21164c.d
@@ -0,0 +1,10 @@
+enum S = 1;
+
+struct O
+{
+}
+
+struct R(O U, int W)
+{
+}
+
diff --git a/gcc/testsuite/gdc.test/fail_compilation/imports/test21164d.d b/gcc/testsuite/gdc.test/fail_compilation/imports/test21164d.d
new file mode 100644
index 000..08f83ea91f7
--- /dev/null
+++ 

[committed] libphobos: Merge upstream phobos 7948e0967.

2020-11-18 Thread Iain Buclaw via Gcc-patches
Hi,

This patch merges the libphobos library with upstream phobos 7948e0967,
removing all deprecated functions from std.string module.

Regression tested and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* src/MERGE: Merge upstream phobos 7948e0967.
---
 libphobos/src/MERGE|   2 +-
 libphobos/src/std/string.d | 267 -
 2 files changed, 1 insertion(+), 268 deletions(-)

diff --git a/libphobos/src/MERGE b/libphobos/src/MERGE
index 1562f747b74..de86ff5b65b 100644
--- a/libphobos/src/MERGE
+++ b/libphobos/src/MERGE
@@ -1,4 +1,4 @@
-021ae0df76727a32809a29887095ab7093489ea3
+7948e096735adbc09da789fc28feadce24b0
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/phobos repository.
diff --git a/libphobos/src/std/string.d b/libphobos/src/std/string.d
index 5b61cde4ac1..1128a090304 100644
--- a/libphobos/src/std/string.d
+++ b/libphobos/src/std/string.d
@@ -5174,273 +5174,6 @@ body
 assert(buffer.data == "h5 rd");
 }
 
-//@@@DEPRECATED_2.086@@@
-deprecated("This function is obsolete. It is available in https://github.com/dlang/undeaD if necessary.")
-bool inPattern(S)(dchar c, in S pattern) @safe pure @nogc
-if (isSomeString!S)
-{
-bool result = false;
-int range = 0;
-dchar lastc;
-
-foreach (size_t i, dchar p; pattern)
-{
-if (p == '^' && i == 0)
-{
-result = true;
-if (i + 1 == pattern.length)
-return (c == p);// or should this be an error?
-}
-else if (range)
-{
-range = 0;
-if (lastc <= c && c <= p || c == p)
-return !result;
-}
-else if (p == '-' && i > result && i + 1 < pattern.length)
-{
-range = 1;
-continue;
-}
-else if (c == p)
-return !result;
-lastc = p;
-}
-return result;
-}
-
-
-deprecated
-@safe pure @nogc unittest
-{
-import std.conv : to;
-import std.exception : assertCTFEable;
-
-assertCTFEable!(
-{
-assert(inPattern('x', "x") == 1);
-assert(inPattern('x', "y") == 0);
-assert(inPattern('x', string.init) == 0);
-assert(inPattern('x', "^y") == 1);
-assert(inPattern('x', "yxxy") == 1);
-assert(inPattern('x', "^yxxy") == 0);
-assert(inPattern('x', "^abcd") == 1);
-assert(inPattern('^', "^^") == 0);
-assert(inPattern('^', "^") == 1);
-assert(inPattern('^', "a^") == 1);
-assert(inPattern('x', "a-z") == 1);
-assert(inPattern('x', "A-Z") == 0);
-assert(inPattern('x', "^a-z") == 0);
-assert(inPattern('x', "^A-Z") == 1);
-assert(inPattern('-', "a-") == 1);
-assert(inPattern('-', "^A-") == 0);
-assert(inPattern('a', "z-a") == 1);
-assert(inPattern('z', "z-a") == 1);
-assert(inPattern('x', "z-a") == 0);
-});
-}
-
-//@@@DEPRECATED_2.086@@@
-deprecated("This function is obsolete. It is available in https://github.com/dlang/undeaD if necessary.")
-bool inPattern(S)(dchar c, S[] patterns) @safe pure @nogc
-if (isSomeString!S)
-{
-foreach (string pattern; patterns)
-{
-if (!inPattern(c, pattern))
-{
-return false;
-}
-}
-return true;
-}
-
-//@@@DEPRECATED_2.086@@@
-deprecated("This function is obsolete. It is available in https://github.com/dlang/undeaD if necessary.")
-size_t countchars(S, S1)(S s, in S1 pattern) @safe pure @nogc
-if (isSomeString!S && isSomeString!S1)
-{
-size_t count;
-foreach (dchar c; s)
-{
-count += inPattern(c, pattern);
-}
-return count;
-}
-
-deprecated
-@safe pure @nogc unittest
-{
-import std.conv : to;
-import std.exception : assertCTFEable;
-
-assertCTFEable!(
-{
-assert(countchars("abc", "a-c") == 3);
-assert(countchars("hello world", "or") == 3);
-});
-}
-
-//@@@DEPRECATED_2.086@@@
-deprecated("This function is obsolete. It is available in https://github.com/dlang/undeaD if necessary.")
-S removechars(S)(S s, in S pattern) @safe pure
-if (isSomeString!S)
-{
-import std.utf : encode;
-
-Unqual!(typeof(s[0]))[] r;
-bool changed = false;
-
-foreach (size_t i, dchar c; s)
-{
-if (inPattern(c, pattern))
-{
-if (!changed)
-{
-changed = true;
-r = s[0 .. i].dup;
-}
-continue;
-}
-if (changed)
-{
-encode(r, c);
-}
-}
-if (changed)
-return r;
-else
-return s;
-}
-
-deprecated
-@safe pure unittest
-{
-import std.conv : to;
-import std.exception : assertCTFEable;
-
-assertCTFEable!(
-{
-assert(removechars("abc", "a-c").length == 0);
-assert(removechars("hello world", "or") == "hell wld");
-assert(removechars("hello world", "d") == "hello worl");
-assert(removechars("hah", "h") == "a");
-});
-}
-
-deprecated

[PATCH] plugins: Allow plugins to handle global_options changes

2020-11-18 Thread Jakub Jelinek via Gcc-patches
Hi!

Reposting with self-contained description per Joseph's request:

Any time somebody adds or removes an option in some *.opt file (which e.g.
on the 10 branch after branching off 11 happened 7 times already), many
offsets in global_options variable change and so plugins that ever access
GCC options or other global_options values are ABI dependent on it.  It is
true we don't guarantee ABI stability for plugins, but we change the most
often used data structures on the release branches only very rarely and so
the options changes are the most problematic for ABI stability of plugins.

Annobin uses a way to remap accesses to some of the global_options.x_* by
looking them up in the cl_options array where we have
offsetof (struct gcc_options, x_flag_lto)
etc. remembered, but sadly doesn't do it for all options (e.g. some flag_*
etc. option accesses may be hidden in various macros like POINTER_SIZE),
and more importantly some struct gcc_options offsets are not covered at all.
E.g. there is no offsetof (struct gcc_options, x_optimize),
offsetof (struct gcc_options, x_flag_sanitize) etc.  Those are usually:
Variable
int optimize
in the *.opt files.

The following patch allows the plugins to deal with reshuffling of even
the global_options fields that aren't tracked in cl_options by adding
another array that describes those, which adds an 816 bytes long array
and 1039 bytes in string literals, so 1855 .rodata bytes in total ATM.
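
To illustrate how a plugin might consume such a table, here is a minimal
sketch.  It only mirrors the cl_var layout from the patch; mock_gcc_options,
mock_cl_vars and lookup_int_var are hypothetical names made up for this
example, and the real table would be generated by optc-gen.awk against the
real struct gcc_options.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for GCC's struct gcc_options; the real structure is
// generated from the *.opt files and is far larger.
struct mock_gcc_options
{
  int x_flag_lto;
  int x_optimize;
  unsigned int x_flag_sanitize;
};

// Same shape as the cl_var type proposed in the patch.
struct cl_var
{
  const char *var_name;       // Name of the variable.
  unsigned short var_offset;  // Offset of its field in the options struct.
};

// Hand-written table for the sketch, terminated by a null name like the
// generated array in the patch.
static const cl_var mock_cl_vars[] = {
  { "optimize",
    static_cast<unsigned short> (offsetof (mock_gcc_options, x_optimize)) },
  { "flag_sanitize",
    static_cast<unsigned short> (offsetof (mock_gcc_options, x_flag_sanitize)) },
  { nullptr, static_cast<unsigned short> (-1) }
};

// Return a pointer to the int-sized field called NAME inside OPTS, or
// nullptr if the table does not describe such a variable.
static int *
lookup_int_var (mock_gcc_options *opts, const char *name)
{
  for (const cl_var *v = mock_cl_vars; v->var_name; ++v)
    if (std::strcmp (v->var_name, name) == 0)
      return reinterpret_cast<int *> (reinterpret_cast<char *> (opts)
                                      + v->var_offset);
  return nullptr;
}
```

The point of the indirection is that the plugin resolves the offset by name
at load time instead of baking in the offset it was compiled against.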

If needed, this could be guarded by some configure option if those 1855
.rodata bytes (0.02% of .rodata size) are something that people don't want
to sacrifice for this.

Bootstrapped/regtested again last night on x86_64-linux and i686-linux, ok
for trunk?

Or if not, would it be ok if this was guarded by some
--enable-plugin-option-tracking
or other configure option?

2020-11-18  Jakub Jelinek  

* opts.h (struct cl_var): New type.
(cl_vars): Declare.
* optc-gen.awk: Generate cl_vars array.

--- gcc/opts.h.jj   2020-04-30 11:49:28.462900760 +0200
+++ gcc/opts.h  2020-08-24 10:21:08.563288412 +0200
@@ -124,6 +124,14 @@ struct cl_option
   int range_max;
 };
 
+struct cl_var
+{
+  /* Name of the variable.  */
+  const char *var_name;
+  /* Offset of field for this var in struct gcc_options.  */
+  unsigned short var_offset;
+};
+
 /* Records that the state of an option consists of SIZE bytes starting
at DATA.  DATA might point to CH in some cases.  */
 struct cl_option_state {
@@ -134,6 +142,7 @@ struct cl_option_state {
 
 extern const struct cl_option cl_options[];
 extern const unsigned int cl_options_count;
+extern const struct cl_var cl_vars[];
 extern const char *const lang_names[];
 extern const unsigned int cl_lang_count;
 
--- gcc/optc-gen.awk.jj 2020-01-12 11:54:36.691409214 +0100
+++ gcc/optc-gen.awk  2020-08-24 10:19:49.410410288 +0200
@@ -592,5 +592,29 @@ for (i = 0; i < n_opts; i++) {
 }
 print "}   "
 
+split("", var_seen, ":")
+print "\n#ifndef GENERATOR_FILE"
+print "DEBUG_VARIABLE const struct cl_var cl_vars[] =\n{"
+
+for (i = 0; i < n_opts; i++) {
+   name = var_name(flags[i]);
+   if (name == "")
+   continue;
+   var_seen[name] = 1;
+}
+
+for (i = 0; i < n_extra_vars; i++) {
+   var = extra_vars[i]
+   sub(" *=.*", "", var)
+   name = var
+   sub("^.*[ *]", "", name)
+   sub("\\[.*\\]$", "", name)
+   if (name in var_seen)
+   continue;
+   print "  { " quote name quote ", offsetof (struct gcc_options, x_" name ") },"
+   var_seen[name] = 1
 }
 
+print "  { NULL, (unsigned short) -1 }\n};\n#endif"
+
+}

Jakub



[PATCH] c++: __builtin_clear_padding builtin C++ tail padding fix [PR88101]

2020-11-18 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 16, 2020 at 10:13:52PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Sun, Nov 15, 2020 at 11:57:55PM -1200, Jakub Jelinek via Gcc-patches wrote:
> > Tested on x86_64-linux, i686-linux and powerpc64-linux, ok for trunk?
> 
> Here is an incremental patch that resolves the remaining FIXMEs, in
> particular implements VLAs (except for variable length structures)
> and for larger fixed sized arrays or members with larger array types
> uses runtime loops for the clearing (unless inside of a union).
> Furthermore, I've added diagnostics about last argument being const whatever *
> (similarly to e.g. __builtin_*_overflow) and also about _Atomic whatever *.

And another incremental patch: I've noticed ICEs on the following testcase.
For C++ we need to take DECL_SIZE_UNIT (field) as the size of the fields rather
than their TYPE_SIZE_UNIT (TREE_TYPE (field)), because in C++ tail padding is
often reused for other fields.
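
As a rough illustration of tail padding reuse (not the testcase from the
patch), the sketch below uses two made-up types, Base and Derived.  On ABIs
such as the Itanium C++ ABI a member of the derived class may be placed
inside the padding at the end of a non-POD base, so the size of the base's
type overstates the bytes that the base actually owns inside the derived
object.  Whether reuse happens is ABI dependent, so the code only measures
it.

```cpp
#include <cassert>
#include <cstddef>

// The user-provided constructor makes Base non-POD for layout purposes, so
// an ABI is allowed to put later members into its tail padding.
struct Base
{
  Base () : i (0), c (0) {}
  int i;
  char c;
};

struct Derived : Base
{
  Derived () : d (0) {}
  char d;
};

// Byte offset of Derived::d from the start of a Derived object.
static std::size_t
offset_of_d ()
{
  Derived obj;
  return static_cast<std::size_t> (reinterpret_cast<char *> (&obj.d)
                                   - reinterpret_cast<char *> (&obj));
}

// True when d lives inside the bytes that sizeof (Base) attributes to Base,
// i.e. when the tail padding of Base has been reused.
static bool
tail_padding_reused ()
{
  return offset_of_d () < sizeof (Base);
}
```

When tail_padding_reused () is true, clearing sizeof (Base) bytes for the
base subobject would also touch d, which is why a per-field size is needed.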

Bootstrapped/regtested (on top of the earlier 2 patches) on x86_64-linux and
i686-linux, ok for trunk?

2020-11-18  Jakub Jelinek  

PR libstdc++/88101
* gimple-fold.c (clear_padding_type): Add sz argument,
don't set it to int_size_in_bytes.  Adjust recursive calls.
In RECORD_TYPEs, use DECL_SIZE_UNIT for fldsz.
(clear_padding_union): Use DECL_SIZE_UNIT for fldsz and
pass it to clear_padding_type.
(clear_padding_emit_loop): Adjust clear_padding_type caller.
(gimple_fold_builtin_clear_padding): Likewise.

* g++.dg/torture/builtin-clear-padding-1.C: New test.

--- gcc/gimple-fold.c.jj  2020-11-16 18:47:42.997770758 +0100
+++ gcc/gimple-fold.c   2020-11-17 10:37:50.014877514 +0100
@@ -4211,7 +4211,7 @@ clear_padding_add_padding (clear_padding
 }
 }
 
-static void clear_padding_type (clear_padding_struct *, tree);
+static void clear_padding_type (clear_padding_struct *, tree, HOST_WIDE_INT);
 
 /* Clear padding bits of union type TYPE.  */
 
@@ -4253,12 +4253,12 @@ clear_padding_union (clear_padding_struc
   for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
 if (TREE_CODE (field) == FIELD_DECL)
   {
-   HOST_WIDE_INT fldsz = int_size_in_bytes (TREE_TYPE (field));
+   HOST_WIDE_INT fldsz = tree_to_shwi (DECL_SIZE_UNIT (field));
gcc_assert (union_buf->size == 0);
union_buf->off = start_off;
union_buf->size = start_size;
memset (union_buf->buf, ~0, start_size);
-   clear_padding_type (union_buf, TREE_TYPE (field));
+   clear_padding_type (union_buf, TREE_TYPE (field), fldsz);
clear_padding_add_padding (union_buf, sz - fldsz);
clear_padding_flush (union_buf, true);
   }
@@ -4339,7 +4339,7 @@ clear_padding_emit_loop (clear_padding_s
   g = gimple_build_label (l1);
   gimple_set_location (g, buf->loc);
   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
-  clear_padding_type (buf, type);
+  clear_padding_type (buf, type, buf->sz);
   clear_padding_flush (buf, true);
   g = gimple_build_assign (buf->base, POINTER_PLUS_EXPR, buf->base,
   size_int (buf->sz));
@@ -4360,9 +4360,8 @@ clear_padding_emit_loop (clear_padding_s
gimple_fold_builtin_clear_padding.  */
 
 static void
-clear_padding_type (clear_padding_struct *buf, tree type)
+clear_padding_type (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz)
 {
-  HOST_WIDE_INT sz = int_size_in_bytes (type);
   switch (TREE_CODE (type))
 {
 case RECORD_TYPE:
@@ -4441,11 +4440,11 @@ clear_padding_type (clear_padding_struct
else
  {
HOST_WIDE_INT pos = int_byte_position (field);
-   HOST_WIDE_INT fldsz = int_size_in_bytes (TREE_TYPE (field));
+   HOST_WIDE_INT fldsz = tree_to_shwi (DECL_SIZE_UNIT (field));
gcc_assert (pos >= 0 && fldsz >= 0 && pos >= cur_pos);
clear_padding_add_padding (buf, pos - cur_pos);
cur_pos = pos;
-   clear_padding_type (buf, TREE_TYPE (field));
+   clear_padding_type (buf, TREE_TYPE (field), fldsz);
cur_pos += fldsz;
  }
  }
@@ -4453,9 +4452,9 @@ clear_padding_type (clear_padding_struct
   clear_padding_add_padding (buf, sz - cur_pos);
   break;
 case ARRAY_TYPE:
-  HOST_WIDE_INT nelts;
-  nelts = int_size_in_bytes (TREE_TYPE (type));
-  nelts = sz / nelts;
+  HOST_WIDE_INT nelts, fldsz;
+  fldsz = int_size_in_bytes (TREE_TYPE (type));
+  nelts = sz / fldsz;
   if (nelts > 1
  && sz > 8 * UNITS_PER_WORD
  && buf->union_ptr == NULL
@@ -4479,7 +4478,7 @@ clear_padding_type (clear_padding_struct
   size_int (sz));
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
- buf->sz = sz / nelts;
+ buf->sz = fldsz;
  buf->align = TYPE_ALIGN (elttype);
  buf->off = 0;
  

openmp: Fix ICE on non-rectangular loop with known 0 iterations [PR97862]

2020-11-18 Thread Jakub Jelinek via Gcc-patches
Hi!

The loops in the testcase are non-rectangular and have 0 iterations
(the outer loop iterates, but the inner one never does).  In this case we
just have the overall number of iterations computed (0), and don't have
factor and other values computed.  We never need to map logical iterations
to the individual iterations in that case, and we were crashing during
expansion of that code.
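
For reference, a small sketch of why both loop nests in the testcase have
zero logical iterations.  count_iterations is a hypothetical helper that
simply runs a nest of the form "i in [0, n), j in [0, i + adj)" serially and
counts; it says nothing about how omp-expand.c computes the count.

```cpp
#include <cassert>

// Count the logical iterations of the non-rectangular nest
//   for (i = 0; i < n; ++i)
//     for (j = 0; j < i + adj; ++j)
// by executing it serially.
static long
count_iterations (int n, int adj)
{
  long total = 0;
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < i + adj; ++j)
      ++total;
  return total;
}
```

With n = 1, adj = 0 the only outer iteration has i = 0, and with n = 20,
adj = -19 the inner bound i - 19 never exceeds 0, so the inner loop body is
never reached in either case.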

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2020-11-18  Jakub Jelinek  

PR middle-end/97862
* omp-expand.c (expand_omp_for_init_vars): Don't use the sqrt path
if number of iterations is constant 0.

* c-c++-common/gomp/pr97862.c: New test.

--- gcc/omp-expand.c.jj 2020-11-14 10:40:11.231409596 +0100
+++ gcc/omp-expand.c  2020-11-17 12:56:52.183888420 +0100
@@ -2514,7 +2514,8 @@ expand_omp_for_init_vars (struct omp_for
  && (TREE_CODE (fd->loop.n2) == INTEGER_CST
  || fd->first_inner_iterations)
  && (optab_handler (sqrt_optab, TYPE_MODE (double_type_node))
- != CODE_FOR_nothing))
+ != CODE_FOR_nothing)
+ && !integer_zerop (fd->loop.n2))
{
  tree outer_n1 = fd->adjn1 ? fd->adjn1 : fd->loops[i - 1].n1;
  tree itype = TREE_TYPE (fd->loops[i].v);
--- gcc/testsuite/c-c++-common/gomp/pr97862.c.jj  2020-11-17 13:00:31.019380920 +0100
+++ gcc/testsuite/c-c++-common/gomp/pr97862.c  2020-11-17 13:04:05.602922138 +0100
@@ -0,0 +1,15 @@
+/* PR middle-end/97862 */
+
+void
+foo (void)
+{
+  int i, j;
+#pragma omp for collapse(2)
+  for (i = 0; i < 1; ++i)
+for (j = 0; j < i; ++j)
+  ;
+#pragma omp for collapse(2)
+  for (i = 0; i < 20; i++)
+for (j = 0; j < i - 19; j += 1)
+  ;
+}

Jakub



Re: [PATCH] recognize implied ranges for modulo.

2020-11-18 Thread Aldy Hernandez via Gcc-patches




On 11/17/20 11:01 PM, Andrew MacLeod wrote:

PR 91029 observes when

  a % b > 0 && b >= 0,

then a has an implied range of  a >=0.  likewise


Shouldn't that be && b > 0?  b == 0 is undefined.



a % b < 0 implies a range of a <= 0.

This patch is a good example of how range-ops can be leveraged to solve 
problems. It simply implements operator_trunc_mod::op1_range() to solve 
for 'a' when the LHS and 'b' are known to be within the specified 
ranges.  I also added a test case to show folding of conditions 
based on that.
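
The implication itself is easy to sanity-check by brute force.  The sketch
below is only an illustration and is unrelated to the range-ops code;
implied_ranges_hold is a made-up helper, and it restricts b to b > 0 so that
the undefined b == 0 case is never evaluated.

```cpp
#include <cassert>

// Check, for all a in [-limit, limit] and b in [1, limit], that
//   a % b > 0  implies  a >= 0   and   a % b < 0  implies  a <= 0.
// The % operator truncates toward zero, so the remainder takes the sign of a.
static bool
implied_ranges_hold (int limit)
{
  for (int a = -limit; a <= limit; ++a)
    for (int b = 1; b <= limit; ++b)
      {
        int rem = a % b;
        if (rem > 0 && a < 0)
          return false;
        if (rem < 0 && a > 0)
          return false;
      }
  return true;
}
```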


Bootstrapped on x86_64-pc-linux-gnu, no regressions.  pushed.

Andrew




diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index d0adc95527a..f37796cac70 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2634,6 +2634,9 @@ public:
                        const wide_int &lh_ub,
                        const wide_int &rh_lb,
                        const wide_int &rh_ub) const;
+  virtual bool op1_range (irange &r, tree type,
+                          const irange &lhs,
+                          const irange &op2) const;
 } op_trunc_mod;
 
 void

@@ -2680,6 +2683,31 @@ operator_trunc_mod::wi_fold (irange &r, tree type,
   value_range_with_overflow (r, type, new_lb, new_ub);
 }
 
+bool

+operator_trunc_mod::op1_range (irange &r, tree type,
+                               const irange &lhs,
+                               const irange &op2) const
+{
+  // PR 91029.  Check for signed truncation with op2 >= 0.
+  if (TYPE_SIGN (type) == SIGNED && wi::ge_p (op2.lower_bound (), 0, SIGNED))
+{
+  unsigned prec = TYPE_PRECISION (type);
+  // if a & b >=0 , then a >= 0.


Shouldn't comment be %, not & ??.


+  if (wi::ge_p (lhs.lower_bound (), 0, SIGNED))
+   {
+ r = value_range (type, wi::zero (prec), wi::max_value (prec, SIGNED));
+ return true;
+   }
+  // if a & b < 0 , then a <= 0.


Similarly here.


+  if (wi::lt_p (lhs.upper_bound (), 0, SIGNED))
+   {
+ r = value_range (type, wi::min_value (prec, SIGNED), wi::zero (prec));
+ return true;
+   }
+}
+  return false;
+}
+
 


Thanks for doing this BTW.
Aldy