Re: [PATCH] C-family : Add attribute 'unavailable'.

2020-11-09 Thread Richard Biener
On Mon, 9 Nov 2020, Iain Sandoe wrote:

> Hi,
> 
> I've been carrying this patch around on my Darwin branches for a very long
> time...
> 
> tested across the Darwin patch and on x86_64-linux-gnu,
> OK for master?
> thanks
> Iain
> 
> = commit message
> 
> If an interface is marked 'deprecated' then, presumably, at some point it
> will be withdrawn and no longer available. The 'unavailable' attribute
> makes it possible to mark up interfaces to indicate this status. It is used
> quite extensively in some codebases where a single set of headers can be used
> to permit code generation for multiple system versions.
> 
> From a configuration perspective, it also allows a compile test to determine
> that an interface is missing - rather than requiring a link test.
> 
> The implementation follows the pattern of attribute deprecated, but produces
> an error (where deprecation produces a warning).
> 
> This attribute has been implemented in clang for some years.
> 
> gcc/c-family/ChangeLog:
> 
> * c-attribs.c (handle_unavailable_attribute): New.
> 
> gcc/c/ChangeLog:
> 
> * c-decl.c (enum deprecated_states): Add unavailable state.
> (merge_decls): Copy unavailability.
> (quals_from_declspecs): Handle unavailable case.
> (start_decl): Amend the logic handling suppression of nested
> deprecation states to include unavailability.
> (smallest_type_quals_location): Amend comment.
> (grokdeclarator): Handle the unavailable deprecation state.
> (declspecs_add_type): Set TREE_UNAVAILABLE from the decl specs.
> * c-tree.h (struct c_declspecs): Add unavailable_p.
> * c-typeck.c (build_component_ref): Handle unavailability.
> (build_external_ref): Likewise.
> 
> gcc/cp/ChangeLog:
> 
> * call.c (build_over_call): Handle unavailable state in addition to
> deprecation.
> * class.c (type_build_ctor_call): Likewise.
> (type_build_dtor_call): Likewise.
> * cp-tree.h: Rename cp_warn_deprecated_use to
> cp_handle_deprecated_or_unavailable.
> * decl.c (duplicate_decls): Merge unavailability.
> (grokdeclarator): Handle unavailability in addition to deprecation.
> (type_is_unavailable): New.
> (grokparms): Handle unavailability in addition to deprecation.
> * decl.h (enum deprecated_states): Add
> UNAVAILABLE_DEPRECATED_SUPPRESS.
> * decl2.c (cplus_decl_attributes): Propagate unavailability to
> templates.
> (cp_warn_deprecated_use): Rename to ...
> (cp_handle_deprecated_or_unavailable): ... this and amend to handle
> the unavailable case. It remains a warning in the case of deprecation
> but becomes an error in the case of unavailability.
> (cp_warn_deprecated_use_scopes): Handle unavailability.
> (mark_used): Likewise.
> * parser.c (cp_parser_template_name): Likewise.
> (cp_parser_template_argument): Likewise.
> (cp_parser_parameter_declaration_list): Likewise.
> * typeck.c (build_class_member_access_expr): Likewise.
> (finish_class_member_access_expr): Likewise.
> * typeck2.c (build_functional_cast_1): Likewise.
> 
> gcc/ChangeLog:
> 
> * lto-streamer-out.c (hash_tree): Stream TREE_UNAVAILABLE.
> * print-tree.c (print_node): Handle unavailable attribute.
> * tree-core.h (struct tree_base): Add a bit to carry unavailability.
> * tree.c (error_unavailable_use): New.
> * tree.h (TREE_UNAVAILABLE): New.
> (error_unavailable_use): New.

Why'd you need to stream this in LTO, more specifically only handle
it in hashing?  It should be all frontend operation.

You're targeting only DECLs; can you instead use a bit from decl_common,
where decl-specific bits exist?

Thanks,
Richard.

> gcc/objc/ChangeLog:
> 
> * objc-act.c (objc_add_property_declaration): Register unavailable
> attribute.
> (maybe_make_artificial_property_decl): Set available.
> (objc_maybe_build_component_ref): Generalise to the method prototype
> to count availability.
> (objc_build_class_component_ref): Likewise.
> (build_private_template): Likewise.
> (objc_decl_method_attributes): Handle unavailable attribute.
> (lookup_method_in_hash_lists): Amend comments.
> (objc_finish_message_expr): Handle unavailability in addition to
> deprecation.
> (start_class): Likewise.
> (finish_class): Likewise.
> (lookup_protocol): Likewise.
> (objc_declare_protocol): Likewise.
> (start_protocol): Register unavailable attribute.
> (really_start_method): Likewise.
> (objc_gimplify_property_ref): Emit error on encountering an
> unavailable entity (and a warning for a deprecated one).
> 
> gcc/testsuite/ChangeLog:
> 
> * g++.dg/ext/attr-unavailable-1.C: New test.
> * g++.dg/ext/attr-unavailable-2.C: New test.
> * g++.dg/ext/attr-unavailable-3.C: New test.
> * g++.dg/ext/attr-unavailable-4.C: New test.
> * g++.dg/ext/attr-unavailable-5.C: New test.
> * g++.dg/ext/attr-unavailable-6.C: New test.
> * g++.dg/ext/attr-unavailable-7.C: New test.
> * g++.dg/ext/attr-unavailable-8.C: New test.
> * g++.dg/ext/attr-unavailable-9.C: New test.
> * gcc.dg/attr-unavailable-1.c: New test.
> * gcc.dg/attr-unavailable-2.c: New test.
> * gcc.dg/attr-unavailable-3.c: New test.
> * gcc.dg/attr-unavailable-4.c: 
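
For readers unfamiliar with the attribute under discussion, here is a minimal sketch of how a header might mark an interface as unavailable. It assumes a compiler that provides `__has_attribute` and falls back to an empty macro otherwise; the names `SKETCH_UNAVAILABLE`, `old_api`, `new_api` and `use_replacement` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>

// Sketch only: SKETCH_UNAVAILABLE, old_api and new_api are illustrative names.
#if defined(__has_attribute)
#  if __has_attribute(unavailable)
#    define SKETCH_UNAVAILABLE __attribute__((unavailable))
#  endif
#endif
#ifndef SKETCH_UNAVAILABLE
#  define SKETCH_UNAVAILABLE /* attribute not supported: expands to nothing */
#endif

// Declaring an unavailable interface is fine; only a *use* is diagnosed.
int old_api (int x) SKETCH_UNAVAILABLE;

// The replacement interface that callers are expected to migrate to.
int new_api (int x) { return x + 1; }

// Callers use the replacement; nothing here refers to old_api.
int use_replacement ()
{
  int r = new_api (41);
  assert (r == 42);
  return r;
}

// Uncommenting the next line is expected to give a hard error (not a
// warning) on a compiler that implements the attribute:
// int broken = old_api (1);
```

Because the diagnostic is raised at compile time, a configure script could in principle detect a withdrawn interface with a compile test alone, which is the point made in the commit message.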

Re: *PING* Re: [Patch] Fortran: Fix function decl's location [PR95847]

2020-11-09 Thread Thomas Koenig via Gcc-patches

Hi Tobias,


*PING*


OK.

Thanks for the patch!

Best regards

Thomas


Re: [PATCH 1/2] cfgloop: Extend loop iteration macros to loop only over sub-loops

2020-11-09 Thread Richard Biener
On Mon, 9 Nov 2020, Martin Jambor wrote:

> Hi,
> 
> This patch adds loop iteration macros FOR_EACH_ENCLOSED_LOOP and
> FOR_EACH_ENCLOSED_LOOP_FN which can loop only over inner loops of a
> given loop.
> 
> The patch is required for a follow-up patch which enables loop
> invariant motion to only work on a selected loop.  I have bootstrapped
> and tested the two patches on x86_64-linux and aarch64-linux.  OK for
> trunk once the follow-up patch is accepted too?

OK for trunk.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2020-10-29  Martin Jambor  
> 
>   * cfgloop.h (loop_iterator::loop_iterator): Add parameter to the
>   constructor, make it iterate over sub-loops if non-NULL.
>   (FOR_EACH_LOOP): Pass extra NULL to loop_iterator::loop_iterator.
>   (FOR_EACH_LOOP_FN): Likewise.
>   (FOR_EACH_ENCLOSED_LOOP): New macro.
>   (FOR_EACH_ENCLOSED_LOOP_FN): Likewise.
> ---
>  gcc/cfgloop.h | 44 
>  1 file changed, 32 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
> index d14689dc31f..e8ffa5b2964 100644
> --- a/gcc/cfgloop.h
> +++ b/gcc/cfgloop.h
> @@ -663,7 +663,7 @@ enum li_flags
>  class loop_iterator
>  {
>  public:
> -  loop_iterator (function *fn, loop_p *loop, unsigned flags);
> +  loop_iterator (function *fn, loop_p top, loop_p *loop, unsigned flags);
>  
>inline loop_p next ();
>  
> @@ -693,8 +693,15 @@ loop_iterator::next ()
>return NULL;
>  }
>  
> +/* Constructor to set up iteration over loops.  FN is the function in which the
> +   loop tree resides.  If TOP is NULL iterate over all loops in the function,
> +   otherwise iterate only over sub-loops of TOP (including TOP).  LOOP points
> +   to the iteration pointer in the iteration.  FLAGS modify the iteration as
> +   described in enum li_flags.  */
> +
>  inline
> -loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
> +loop_iterator::loop_iterator (function *fn, loop_p top, loop_p *loop,
> +   unsigned flags)
>  {
>class loop *aloop;
>unsigned i;
> @@ -716,13 +723,16 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
>for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, ); i++)
>   if (aloop != NULL
>   && aloop->inner == NULL
> - && aloop->num >= mn)
> + && aloop->num >= mn
> + && (!top || flow_loop_nested_p (top, aloop)))
> this->to_visit.quick_push (aloop->num);
>  }
>else if (flags & LI_FROM_INNERMOST)
>  {
> +  if (!top)
> + top = loops_for_fn (fn)->tree_root;
>/* Push the loops to LI->TO_VISIT in postorder.  */
> -  for (aloop = loops_for_fn (fn)->tree_root;
> +  for (aloop = top;
>  aloop->inner != NULL;
>  aloop = aloop->inner)
>   continue;
> @@ -732,15 +742,15 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
> if (aloop->num >= mn)
>   this->to_visit.quick_push (aloop->num);
>  
> -   if (aloop->next)
> +   if (aloop == top)
> + break;
> +   else if (aloop->next)
>   {
> for (aloop = aloop->next;
>  aloop->inner != NULL;
>  aloop = aloop->inner)
>   continue;
>   }
> -   else if (!loop_outer (aloop))
> - break;
> else
>   aloop = loop_outer (aloop);
>   }
> @@ -748,7 +758,7 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
>else
>  {
>/* Push the loops to LI->TO_VISIT in preorder.  */
> -  aloop = loops_for_fn (fn)->tree_root;
> +  aloop = top ? top : loops_for_fn (fn)->tree_root;
>while (1)
>   {
> if (aloop->num >= mn)
> @@ -758,9 +768,9 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
>   aloop = aloop->inner;
> else
>   {
> -   while (aloop != NULL && aloop->next == NULL)
> +   while (aloop != top && aloop->next == NULL)
>   aloop = loop_outer (aloop);
> -   if (aloop == NULL)
> +   if (aloop == top)
>   break;
> aloop = aloop->next;
>   }
> @@ -771,12 +781,22 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
>  }
>  
>  #define FOR_EACH_LOOP(LOOP, FLAGS) \
> -  for (loop_iterator li(cfun, &(LOOP), FLAGS); \
> +  for (loop_iterator li(cfun, NULL, &(LOOP), FLAGS); \
> (LOOP); \
> (LOOP) = li.next ())
>  
>  #define FOR_EACH_LOOP_FN(FN, LOOP, FLAGS) \
> -  for (loop_iterator li(FN, &(LOOP), FLAGS); \
> +  for (loop_iterator li(FN, NULL, &(LOOP), FLAGS);   \
> +   (LOOP); \
> +   (LOOP) = li.next ())
> +
> +#define FOR_EACH_ENCLOSED_LOOP(TOP, LOOP, FLAGS) \
> +  for (loop_iterator li(cfun, TOP, &(LOOP), FLAGS);  \
> +   (LOOP); \
> +   (LOOP) = li.next ())
> +
> 
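
The preorder walk that the patch restricts to a sub-tree can be sketched outside of GCC as follows. This is a simplified model and not the cfgloop.h code: the `Loop` struct, `add_child` and `preorder_from` are hypothetical stand-ins, and the `mn` filter and the `li_flags` handling of the real iterator are left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of a loop tree node: first child, next sibling,
// and a link back to the enclosing loop.
struct Loop {
  explicit Loop (int n) : num (n) {}
  int num;
  Loop *outer = nullptr;
  Loop *inner = nullptr;
  Loop *next = nullptr;
};

// Link CHILD in as the first inner loop of PARENT.
void add_child (Loop &parent, Loop &child)
{
  child.outer = &parent;
  child.next = parent.inner;
  parent.inner = &child;
}

// Visit loops in preorder.  With TOP == nullptr the whole tree under ROOT
// is walked; otherwise only TOP and the loops nested inside it.
std::vector<int> preorder_from (Loop *root, Loop *top)
{
  assert (root != nullptr);
  std::vector<int> order;
  Loop *l = top ? top : root;
  while (true)
    {
      order.push_back (l->num);
      if (l->inner)
        l = l->inner;
      else
        {
          // Climb until a sibling exists or we are back at TOP.  When TOP
          // is nullptr this ends by walking off the root (outer == nullptr).
          while (l != top && l->next == nullptr)
            l = l->outer;
          if (l == top)
            break;
          l = l->next;
        }
    }
  return order;
}
```

Comparing against `top` instead of `NULL` is what keeps the walk from escaping into siblings of the selected loop, and passing a null `top` reproduces the old whole-function behaviour.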

[PATCH] PR target/97682 - Fix to reuse t1 register between call address and epilogue.

2020-11-09 Thread Monk Chiang
  - When expanding the call pattern, the t1 register is chosen as the jump
register.  The epilogue also uses t1 to adjust the stack pointer, so if both
are generated in the same function, t1 is initialized twice.  The call
pattern emits the 'la t1,symbol' and 'jalr t1' instructions, and the
epilogue emits the 'li t1,4096' and 'addi sp,sp,t1' instructions.  However,
the li and addi instructions can be placed between the la and jalr
instructions.  Some optimizations then remove the la instruction: because
t1 is defined twice, the first definition looks redundant.

  - To resolve this issue, the prologue and epilogue use t0 as their
temporary register, and the call pattern uses t1 as its temporary register.

  gcc/ChangeLog:

PR target/97682
* config/riscv/riscv.h (RISCV_PROLOGUE_TEMP_REGNUM): Change register to t0.
(RISCV_CALL_ADDRESS_TEMP_REGNUM): New macro, define t1 register.
(RISCV_CALL_ADDRESS_TEMP): Use it for call instructions.
* config/riscv/riscv.c (riscv_legitimize_call_address): Use
RISCV_CALL_ADDRESS_TEMP.
(riscv_trampoline_init): Adjust comment.

  gcc/testsuite/ChangeLog

PR target/97682
* g++.target/riscv/pr97682.C: New test.
---
 gcc/config/riscv/riscv.c |  14 +-
 gcc/config/riscv/riscv.h |   6 +-
 gcc/testsuite/g++.target/riscv/pr97682.C | 160 +++
 3 files changed, 172 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/pr97682.C

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 989a9f15250..ac4b04540e6 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3110,7 +3110,7 @@ riscv_legitimize_call_address (rtx addr)
 {
   if (!call_insn_operand (addr, VOIDmode))
 {
-  rtx reg = RISCV_PROLOGUE_TEMP (Pmode);
+  rtx reg = RISCV_CALL_ADDRESS_TEMP (Pmode);
   riscv_emit_move (reg, addr);
   return reg;
 }
@@ -4902,9 +4902,9 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 
   rtx target_function = force_reg (Pmode, XEXP (DECL_RTL (fndecl), 0));
   /* lui t2, hi(chain)
-lui t1, hi(func)
+lui t0, hi(func)
 addi t2, t2, lo(chain)
-jr  r1, lo(func)
+jr  t0, lo(func)
   */
   unsigned HOST_WIDE_INT lui_hi_chain_code, lui_hi_func_code;
   unsigned HOST_WIDE_INT lo_chain_code, lo_func_code;
@@ -4929,7 +4929,7 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   mem = adjust_address (m_tramp, SImode, 0);
   riscv_emit_move (mem, lui_hi_chain);
 
-  /* Gen lui t1, hi(func).  */
+  /* Gen lui t0, hi(func).  */
   rtx hi_func = riscv_force_binary (SImode, PLUS, target_function,
fixup_value);
   hi_func = riscv_force_binary (SImode, AND, hi_func,
@@ -4956,7 +4956,7 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   mem = adjust_address (m_tramp, SImode, 2 * GET_MODE_SIZE (SImode));
   riscv_emit_move (mem, addi_lo_chain);
 
-  /* Gen jr r1, lo(func).  */
+  /* Gen jr t0, lo(func).  */
   rtx lo_func = riscv_force_binary (SImode, AND, target_function,
imm12_mask);
   lo_func = riscv_force_binary (SImode, ASHIFT, lo_func, GEN_INT (20));
@@ -4975,9 +4975,9 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   target_function_offset = static_chain_offset + GET_MODE_SIZE (ptr_mode);
 
   /* auipc   t2, 0
-l[wd]   t1, target_function_offset(t2)
+l[wd]   t0, target_function_offset(t2)
 l[wd]   t2, static_chain_offset(t2)
-jr  t1
+jr  t0
   */
   trampoline[0] = OPCODE_AUIPC | (STATIC_CHAIN_REGNUM << SHIFT_RD);
   trampoline[1] = (Pmode == DImode ? OPCODE_LD : OPCODE_LW)
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 172c7ca7c98..3bd1993c4c9 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -342,9 +342,13 @@ extern const char *riscv_default_mtune (int argc, const char **argv);
The epilogue temporary mustn't conflict with the return registers,
the frame pointer, the EH stack adjustment, or the EH data registers. */
 
-#define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST + 1)
+#define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST)
 #define RISCV_PROLOGUE_TEMP(MODE) gen_rtx_REG (MODE, RISCV_PROLOGUE_TEMP_REGNUM)
 
+#define RISCV_CALL_ADDRESS_TEMP_REGNUM (GP_TEMP_FIRST + 1)
+#define RISCV_CALL_ADDRESS_TEMP(MODE) \
+  gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
+
 #define MCOUNT_NAME "_mcount"
 
 #define NO_PROFILE_COUNTERS 1
diff --git a/gcc/testsuite/g++.target/riscv/pr97682.C b/gcc/testsuite/g++.target/riscv/pr97682.C
new file mode 100644
index 000..03c7a447de5
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/pr97682.C
@@ -0,0 

testsuite: Adjust pr96789.c to exclude vect_load_lanes

2020-11-09 Thread Kewen.Lin via Gcc-patches
Hi,

As Lyon pointed out, the newly introduced test case
gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf.
With load_lanes support, the loop vectorizer is able to
vectorize the two loops that operate on the array tmp.  As a
result, dse3 gets unexpected input and does nothing.

This patch teaches the test case to respect vect_load_lanes,
and guards the check so that it only runs under vect_int.

Is it ok for trunk?

BR,
Kewen
-
gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr96789.c: Adjusted by excluding vect_load_lanes.


diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
index d6139a014d8..1b89f8b7a6a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
@@ -55,4 +55,7 @@ bar (int16_t res[16], uint8_t *val1, uint8_t *val2)
 }
 }

-/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" } } */
+/* Exclude targets which support load_lanes since loop vectorizer
+   can vectorize those two loops that operate tmp array so that
+   subsequent dse3 will not eliminate tmp stores.  */
+/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" { target { vect_int && { ! vect_load_lanes } } } } } */


Re: [PATCH] generalized range_query class for multiple contexts

2020-11-09 Thread Andrew MacLeod via Gcc-patches

On 11/5/20 2:29 PM, Martin Sebor wrote:

On 10/1/20 11:25 AM, Martin Sebor wrote:




I have applied the patch and run some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the following test case.  There should be no warning
for either sprintf call.  The one in h() is a false positive and
the reason for at least some of the regressions.  Somehow,
the conversions between int and char are causing Ranger to lose
the range.

$ cat t.c && gcc -O2 -S -Wall t.c
char a[2];

extern int x;

signed char f (int min, int max)
{
  signed char i = x;
  return i < min || max < i ? min : i;
}

void ff (signed char i)
{
  __builtin_sprintf (a, "%i", f (0, 9));   // okay
}

signed char g (signed char min, signed char max)
{
  signed char i = x;
  return i < min || max < i ? min : i;
}

void gg (void)
{
  __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
}




The latest changes resolve the issues with the gg() case.
You will now get a range of [0,9] for the temporary:
=== BB 4 
     :
    # iftmp.3_10 = PHI <0(2), i_6(3)>
    _2 = (int) iftmp.3_10;
    __builtin_sprintf (, "%i", _2);
    return;

_2 : int [0, 9]
iftmp.3_10 : signed char [0, 9]




The code issued for the 2 routines is very different. The first routine 
produces:


 === BB 2 
     :
    x.0_4 = x;
    i_6 = (signed char) x.0_4;
    _7 = (int) i_6;
    if (_7 < 0)
  goto ; [50.00%]
    else
  goto ; [50.00%]

_7 : int [-128, 127]
2->4  (T) x.0_4 :   int [-INF, -257][-128, -1][128, +INF]
2->4  (T) i_6 : signed char [-INF, -1]
2->4  (T) _7 :  int [-128, -1]
2->3  (F) x.0_4 :   int [-INF, -129][0, 127][256, +INF]
2->3  (F) i_6 : signed char [0, +INF]
2->3  (F) _7 :  int [0, 127]

=== BB 3 
i_6 signed char [0, +INF]
_7  int [0, 127]
     :
    if (_7 > 9)
  goto ; [50.00%]
    else
  goto ; [50.00%]

3->4  (T) _7 :  int [10, 127]
3->5  (F) _7 :  int [0, 9]

=== BB 4 
     :

=== BB 5 
     :
    # iftmp.1_9 = PHI 
    _2 = (int) iftmp.1_9;
    __builtin_sprintf (, "%i", _2);
    return;

_2 : int [0, 127]
iftmp.1_9 : signed char [0, +INF]


Note that we figure out that _7 is [0,9] from 3->5,  except the PHI node 
in block 5 refers to i_6...


i_6 is not referred to in BB3, so this generation of ranger doesn't see
it as an exportable value, and thus doesn't do a calculation for it like
it does in BB2, where it is defined.  So it only picks up the first of
the 2 conditions for i_6 that is live on entry to BB3 ([0, +INF]).


Two things are in the pipeline which will resolve this:
1 - Equivalences.  _7 is a cast-copy of i_6.  Since the range of _7
falls within the range of a signed char, the forthcoming
equivalency/relational work will have i_6 match the same value as _7.
2 - The next-generation GORI engine will allow for recalculation of
dependency chains from outside the block when appropriate, so this will
cause i_6 to be an exportable value from BB 3 as well, which will
then also get the [0,9] range in circumstances in which it isn't simply a
copy.
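
As a sanity check on the [0, 9] result that these two improvements are expected to recover for the first routine, the clamping logic can be enumerated over every signed char value. This is only a brute-force sketch, not anything Ranger does; `clamp_like_f` and `result_always_in_range` are hypothetical names, and the load from `x` is replaced by a parameter.

```cpp
#include <cassert>
#include <climits>

// Same expression as f() in the test case, with the value loaded from x
// passed in as I instead of being read from a global.
signed char clamp_like_f (signed char i, int min, int max)
{
  return static_cast<signed char> (i < min || max < i ? min : i);
}

// True if, for every possible signed char input, the value that would
// reach sprintf lies in [MIN, MAX].
bool result_always_in_range (int min, int max)
{
  assert (SCHAR_MIN <= min && min <= max && max <= SCHAR_MAX);
  for (int v = SCHAR_MIN; v <= SCHAR_MAX; ++v)
    {
      int r = clamp_like_f (static_cast<signed char> (v), min, max);
      if (r < min || max < r)
        return false;
    }
  return true;
}
```

With bounds 0 and 9 every result is a single digit, which is why "%i" into the two-byte buffer cannot overflow and no warning is warranted.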



Neither of which helps you right now with the perfect precision  :-P

Andrew




Fix logical_combine OR operation. Again.

2020-11-09 Thread Andrew MacLeod via Gcc-patches

Doh.

The original fix for PR97567 was incorrect.  It papered over a problem
by reducing the opportunities it could find. Given


  if (c_2 || c_3)

If the FALSE edge is taken, this is !(c_2 || c_3), which is equivalent
to !c_2 && !c_3, so performing the intersection as logical_combine was
originally doing was correct.
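
The De Morgan argument can be illustrated with a toy interval type. This is a sketch under simplifying assumptions: `Interval` is a single closed range rather than GCC's multi-range irange, and `false_edge_of_or` and `matches_brute_force` are hypothetical helpers, with c_2 and c_3 chosen arbitrarily as (x < 10) and (x > 20).

```cpp
#include <algorithm>
#include <cassert>

// Closed integer interval [lo, hi].  Illustrative only.
struct Interval {
  int lo, hi;
  bool contains (int v) const { return lo <= v && v <= hi; }
};

Interval intersect (Interval a, Interval b)
{
  return { std::max (a.lo, b.lo), std::min (a.hi, b.hi) };
}

// Range of x on the FALSE edge of "if (c_2 || c_3)", given the range of x
// for which each operand is individually false.  Both operands must be
// false at once, hence the intersection.
Interval false_edge_of_or (Interval c2_false, Interval c3_false)
{
  return intersect (c2_false, c3_false);
}

// Compare against direct evaluation for x in [0, 255], with
// c_2 = (x < 10), false on [10, 255], and c_3 = (x > 20), false on [0, 20].
bool matches_brute_force ()
{
  Interval r = false_edge_of_or ({10, 255}, {0, 20});
  for (int x = 0; x <= 255; ++x)
    {
      bool c_2 = x < 10, c_3 = x > 20;
      if (!(c_2 || c_3) != r.contains (x))
        return false;
    }
  return true;
}
```

A union of the two false ranges would cover all of [0, 255] here and so lose the information, which is consistent with the lost opportunities described above.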


I found this by examining cases we were missing because this was no
longer being combined properly, which sent me back to this PR to figure
out what the real reason for the failure was.


The GORI design specification calculates outgoing ranges using 
dependency chains, but as soon as the chain goes outside the current 
block, we are supposed to revert to using the on-entry range since
control flow could dictate further range changes.


I had noticed that on certain occasions we'd peek into other blocks, and
each time I saw it, it was beneficial and seemed like a harmless
recalculation, so I let it go.


In this particular case, during the on-entry cache propagation we were 
peeking into another block to pick up a value used in the logical OR 
operation. Unfortunately, the on-entry cache hadn't finished propagating
and the incomplete range was picked up from that other block instead of 
the current one, and we ended up calculating a [0,0] range on an 
outgoing edge that should have been VARYING.  And bad things happened.


The fix is to patch the logical evaluation code to do the same thing as
the non-logical code, follow the spec, and simply use the range-on-entry
value for the block if the def chain leads out of the current block.
The next release will integrate the full re-evaluation of def chains from
outside the basic block, properly planned :-)


I've added an additional test case to confirm that the logical OR FALSE
case is being optimized properly. The original fix fails this test; the
current fix passes it.


Bootstrapped on x86_64-pc-linux-gnu, no regressions.  Pushed.

Andrew




commit 7d26a337bfa1135d95caa3c213e82f2a97f18a01
Author: Andrew MacLeod 
Date:   Mon Nov 9 19:38:22 2020 -0500

Fix logical_combine OR operation. Again.

The original fix was incorrect and results in loss of opportunities.
Revert the original fix. When processing logical chains, do not
follow chains outside of the current basic block.  Use the import
value instead.

gcc/
PR tree-optimization/97567
* gimple-range-gori.cc: (gori_compute::logical_combine): False
OR operations should intersect the 2 results.
(gori_compute::compute_logical_operands_in_chain): If def chains
are outside the current basic block, don't follow them.
gcc/testsuite/
* gcc.dg/pr97567-2.c: New.

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 54385baa629..af3609e414e 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -730,10 +730,10 @@ gori_compute::logical_combine (irange , enum tree_code code,
 if (lhs.zero_p ())
 	  {
 	// An OR operation will only take the FALSE path if both
-	// operands are false, so either [20, 255] or [0, 5] is the
-	// union: [0,5][20,255].
+	// operands are false simlulateously, which means they should
+	// be intersected.  !(x || y) == !x && !y
 	r = op1.false_range;
-	r.union_ (op2.false_range);
+	r.intersect (op2.false_range);
 	  }
 	else
 	  {
@@ -804,9 +804,12 @@ gori_compute::compute_logical_operands_in_chain (tf_range ,
 		 tree name,
 		 tree op, bool op_in_chain)
 {
-  if (!op_in_chain)
+  gimple *src_stmt = gimple_range_ssa_p (op) ? SSA_NAME_DEF_STMT (op) : NULL;
+  basic_block bb = gimple_bb (stmt);
+  if (!op_in_chain || (src_stmt != NULL && bb != gimple_bb (src_stmt)))
 {
-  // If op is not in chain, use its known value.
+  // If op is not in the def chain, or defined in this block,
+  // use its known value on entry to the block.
   expr_range_in_bb (range.true_range, name, gimple_bb (stmt));
   range.false_range = range.true_range;
   return;
@@ -814,14 +817,12 @@ gori_compute::compute_logical_operands_in_chain (tf_range ,
   if (optimize_logical_operands (range, stmt, lhs, name, op))
 return;
 
-  // Calulate ranges for true and false on both sides, since the false
+  // Calculate ranges for true and false on both sides, since the false
   // path is not always a simple inversion of the true side.
-  if (!compute_operand_range (range.true_range, SSA_NAME_DEF_STMT (op),
-			  m_bool_one, name))
-expr_range_in_bb (range.true_range, name, gimple_bb (stmt));
-  if (!compute_operand_range (range.false_range, SSA_NAME_DEF_STMT (op),
-			  m_bool_zero, name))
-expr_range_in_bb (range.false_range, name, gimple_bb (stmt));
+  if (!compute_operand_range (range.true_range, src_stmt, m_bool_one, name))
+expr_range_in_bb (range.true_range, name, bb);
+  if (!compute_operand_range (range.false_range, src_stmt, m_bool_zero, 

[PATCH] c++: Improve static_assert diagnostic [PR97518]

2020-11-09 Thread Marek Polacek via Gcc-patches
Currently, when a static_assert fails, we only say "static assertion failed".
It would be more useful if we could also print the expression that
evaluated to false; this is especially useful when the condition uses
template parameters.  Consider the motivating example, in which we have
this line:

  static_assert(is_same::value);

If this fails, the user has to play dirty games to get the compiler to
print the template arguments.  With this patch, we say:

  static assertion failed due to requirement 'is_same::value'

which I think is much better.  However, always printing the condition that
evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
always parsed to true/false, so we would say "static assertion failed due
to requirement 'false'" which doesn't help.  So I wound up only printing
the condition when it was instantiation-dependent, that is, we called
finish_static_assert from tsubst_expr.

Moreover, this patch also improves the diagnostic when the condition
consists of a logical AND.  Say you have something like this:

  static_assert(fn1() && fn2() && fn3() && fn4() && fn5());

where fn4() evaluates to false and the other ones to true.  Highlighting
the whole thing is not that helpful because it won't say which clause
evaluated to false.  With the find_failing_clause tweak in this patch
we emit:

  error: static assertion failed
6 | static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
  |  ~~~^~

so you know right away what's going on.  Unfortunately, when you combine
both things, that is, have an instantiation-dependent expr and && in
a static_assert, we can't yet quite point to the clause that failed.  It
is because when we tsubstitute something like is_same::value, we
generate a VAR_DECL that doesn't have any location.  It would be awesome
if we could wrap it with a location wrapper, but I didn't see anything
obvious.
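
The idea of locating the failing clause of a conjunction can be sketched on a miniature expression tree. This is not the code from the patch: `Expr`, `leaf`, `conj` and this version of `find_failing_clause` are hypothetical stand-ins that work on plain structs instead of GCC trees, and leaf values are given directly rather than constant-evaluated.

```cpp
#include <cassert>

// Hypothetical miniature expression tree: a node is either a leaf clause
// (lhs == rhs == nullptr) or a logical AND of two sub-expressions.
struct Expr {
  const char *text;   // spelling of a leaf clause; nullptr for an AND node
  bool value;         // constant value of a leaf; unused for an AND node
  const Expr *lhs;
  const Expr *rhs;
};

Expr leaf (const char *text, bool value)
{
  return { text, value, nullptr, nullptr };
}

// The operands are referenced, not copied, so they must outlive the node.
Expr conj (const Expr &l, const Expr &r)
{
  return { nullptr, false, &l, &r };
}

// Return the first leaf, in left-to-right order, that evaluates to false,
// or nullptr if every clause is true.
const Expr *find_failing_clause (const Expr &e)
{
  assert ((e.lhs == nullptr) == (e.rhs == nullptr));
  if (!e.lhs)
    return e.value ? nullptr : &e;
  if (const Expr *failed = find_failing_clause (*e.lhs))
    return failed;
  return find_failing_clause (*e.rhs);
}
```

For the five-clause example above, with only fn4() false, the search returns the fn4() leaf, and a diagnostic could then use that node's location instead of the location of the whole condition.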

A further tweak could be to print the failing clause of the condition if
possible, whether or not it was instantiation-dependent.

In passing, I've cleaned up some things:
* use iloc_sentinel when appropriate,
* it's nicer to call contextual_conv_bool instead of the rather verbose
  perform_implicit_conversion_flags,
* no need to check for INTEGER_CST before calling integer_zerop.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97518
* cp-tree.h (finish_static_assert): Adjust declaration.
* parser.c (cp_parser_static_assert): Pass false to
finish_static_assert.
* pt.c (tsubst_expr): Pass true to finish_static_assert.
* semantics.c (find_failing_clause_r): New function.
(find_failing_clause): New function.
(finish_static_assert): Add a bool parameter.  Use
iloc_sentinel.  Call contextual_conv_bool instead of
perform_implicit_conversion_flags.  Don't check for INTEGER_CST before
calling integer_zerop.  Call find_failing_clause and maybe use its
location.  Print the original condition if SHOW_EXPR_P.

gcc/testsuite/ChangeLog:

PR c++/97518
* g++.dg/diagnostic/pr87386.C: Adjust expected output.
* g++.dg/diagnostic/static_assert1.C: New test.
* g++.dg/diagnostic/static_assert2.C: New test.

libcc1/ChangeLog:

PR c++/97518
* libcp1plugin.cc (plugin_add_static_assert): Pass false to
finish_static_assert.

libstdc++-v3/ChangeLog:

* testsuite/20_util/scoped_allocator/69293_neg.cc: Adjust dg-error.
* testsuite/20_util/uses_allocator/69293_neg.cc: Likewise.
* testsuite/20_util/uses_allocator/cons_neg.cc: Likewise.
* testsuite/26_numerics/random/pr60037-neg.cc: Likewise.
---
 gcc/cp/cp-tree.h  |  2 +-
 gcc/cp/parser.c   |  3 +-
 gcc/cp/pt.c   |  6 +-
 gcc/cp/semantics.c| 79 ---
 gcc/testsuite/g++.dg/diagnostic/pr87386.C |  2 +-
 .../g++.dg/diagnostic/static_assert1.C| 20 +
 .../g++.dg/diagnostic/static_assert2.C| 68 
 libcc1/libcp1plugin.cc|  2 +-
 .../20_util/scoped_allocator/69293_neg.cc |  2 +-
 .../20_util/uses_allocator/69293_neg.cc   |  2 +-
 .../20_util/uses_allocator/cons_neg.cc|  2 +-
 .../26_numerics/random/pr60037-neg.cc |  4 +-
 12 files changed, 168 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/static_assert1.C
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/static_assert2.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index b98d47a702f..230a1525c63 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7234,7 +7234,7 @@ extern bool cxx_omp_create_clause_info(tree, tree, bool, bool,
 bool, bool);
 extern tree baselink_for_fns(tree);
 extern void 

[r11-4852 Regression] FAIL: g++.dg/ubsan/pr61272.C (test for excess errors) on Linux/x86_64

2020-11-09 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

05b03452db6a520091aed254d3c399caed714b15 is the first bad commit
commit 05b03452db6a520091aed254d3c399caed714b15
Author: Jason Merrill 
Date:   Fri Nov 6 20:41:54 2020 -0500

c++: Improve error location for class using-decl.

caused

FAIL: g++.dg/ubsan/pr61272.C(test for errors, line 15)
FAIL: g++.dg/ubsan/pr61272.C   (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-4852/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ubsan.exp=g++.dg/ubsan/pr61272.C --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ubsan.exp=g++.dg/ubsan/pr61272.C --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ubsan.exp=g++.dg/ubsan/pr61272.C --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ubsan.exp=g++.dg/ubsan/pr61272.C --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


Re: [C PATCH RFC] Drop qualifiers during lvalue conversion

2020-11-09 Thread Joseph Myers
On Sat, 7 Nov 2020, Uecker, Martin wrote:

> In 'gcc.dg/cond-constqual-1.c' we test for the opposite
> behavior for conditional operators. I do not know why.
> We could just invert the test.

That's probably a relic of the old idea that rvalues might actually have 
qualified type in some cases; it seems reasonable to invert the test.

>   t = (const T) { { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } };
>   test ();
> }
> 
> Not sure what to do about it, maybe 'convert' is not
> the right function to use.

I think 'convert' is fine, but new code is probably needed in whatever 
implements the optimization for assignment from compound literals so that 
it works when there is a conversion that only changes qualifiers.

> diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
> index 96840377d90..aeacd30badd 100644
> --- a/gcc/c/c-typeck.c
> +++ b/gcc/c/c-typeck.c
> @@ -2080,6 +2080,8 @@ convert_lvalue_to_rvalue (location_t loc, struct c_expr exp,
>  exp = default_function_array_conversion (loc, exp);
>    if (!VOID_TYPE_P (TREE_TYPE (exp.value)))
>  exp.value = require_complete_type (loc, exp.value);
> +  if (convert_p && !error_operand_p (exp.value))
> +exp.value = convert (build_qualified_type (TREE_TYPE (exp.value), TYPE_UNQUALIFIED), exp.value);

I think it might be safest to avoid doing any conversion in the case where 
the value is still of array type at this point (C90 non-lvalue arrays).

-- 
Joseph S. Myers
jos...@codesourcery.com


[Patch] Fortran: OpenMP 5.0 (in_,task_)reduction clause extensions

2020-11-09 Thread Tobias Burnus

This patch updates the OpenMP handling to support OpenMP 5.0's
reduction changes:
- add task_reduction (for taskgroup)
- add in_reduction (for task, taskloop, target)
- add 'default', 'inscan' and 'task' to 'reduction'
  - only default for teams, taskloop
  - all three for parallel, simd, do, section

When copying + converting testcases from C to Fortran,
I saw that 'schedule(monotonic' can now be mixed with static/runtime/auto,
which is also included in the patch.

OK?

Tobias

PS: I am sure I missed something; the only question is what ...
A likely place where something might have gone wrong is gfc_split_omp_clauses.

Fortran: OpenMP 5.0 (in_,task_)reduction clause extensions

gcc/fortran/ChangeLog:

	PR fortran/95847
	* dump-parse-tree.c (show_omp_clauses): Handle new reduction enums.
	* gfortran.h (OMP_LIST_REDUCTION_INSCAN, OMP_LIST_REDUCTION_TASK,
	OMP_LIST_IN_REDUCTION, OMP_LIST_TASK_REDUCTION): Add enums.
	* openmp.c (enum omp_mask1): Add OMP_CLAUSE_REDUCTION_DEFAULT,
	OMP_CLAUSE_REDUCTION_MODIFIER, OMP_CLAUSE_IN_REDUCTION,
	and OMP_CLAUSE_TASK_REDUCTION.
	(gfc_match_omp_clause_reduction): Extend reduction handling;
	moved from ...
	(gfc_match_omp_clauses): ... here. Add calls to it.
	(OMP_PARALLEL_CLAUSES, OMP_OMP_DO_CLAUSES, OMP_SECTIONS_CLAUSES,
	OMP_SIMD_CLAUSES): Use OMP_CLAUSE_REDUCTION_MODIFIERS.
	(OMP_TASK_CLAUSES, OMP_TARGET_CLAUSES): Add OMP_CLAUSE_IN_REDUCTION.
	(OMP_TASKLOOP_CLAUSES): Likewise; add OMP_CLAUSE_REDUCTION_DEFAULT.
	(OMP_TEAMS_CLAUSES): Add OMP_CLAUSE_REDUCTION_DEFAULT.
	(gfc_match_omp_taskgroup): Add task_reduction matching.
	(resolve_omp_clauses): Update for new reduction clause changes;
	remove removed nonmonotonic-schedule restrictions.
	(gfc_resolve_omp_parallel_blocks): Add new enums to switch.
	* trans-openmp.c (gfc_omp_clause_default_ctor,
	gfc_trans_omp_reduction_list, gfc_trans_omp_clauses,
	gfc_split_omp_clauses): Handle updated reduction clause.

gcc/ChangeLog:

	PR fortran/95847
	* gimplify.c (gimplify_scan_omp_clauses, gimplify_omp_loop): Use 'do'
	instead of 'for' in error messages for Fortran.
	* omp-low.c (check_omp_nesting_restrictions): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/95847
	* gfortran.dg/gomp/schedule-modifiers-2.f90: Remove some dg-error.
	* gfortran.dg/gomp/reduction4.f90: New test.
	* gfortran.dg/gomp/reduction5.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-1.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-2.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-3.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-4.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-5.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-6.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-7.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-8.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-9.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-10.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-11.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-12.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-13.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-14.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-15.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-16.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-17.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-18.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-19.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-20.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-21.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-22.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-23.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-24.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-25.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-26.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-27.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-28.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-29.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-30.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-31.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-32.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-33.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-34.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-35.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-36.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-37.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-38.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-39.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-40.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-41.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-42.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-43.f90: 

Re: Detect EAF flags in ipa-modref

2020-11-09 Thread Jan Hubicka
Hi,
this is an updated patch for autodetection of EAF flags.  The goal is still
to avoid fancy stuff and get the basic logic in place (so no dataflow, no IPA
propagation, no attempts to handle trickier cases).  There is one new failure:

./gcc/testsuite/gcc/gcc.sum:FAIL: gcc.dg/sso/t2.c   -Wno-scalar-storage-order -O1 -fno-inline  output pattern test
./gcc/testsuite/gcc/gcc.sum:FAIL: gcc.dg/sso/t2.c   -Wno-scalar-storage-order -O2  output pattern test
./gcc/testsuite/gcc/gcc.sum:FAIL: gcc.dg/sso/t2.c   -Wno-scalar-storage-order -Os  output pattern test

I believe this is a bug exposed by detecting the dump function to be EAF_DIRECT
and NOESCAPE (which it is); packing/unpacking the bitfields then leads to a
one-bit difference.  I still have no idea why.

The patch seems to be quite effective on cc1plus, turning the first set of
statistics below into the second:

Alias oracle query stats:
  refs_may_alias_p: 65808750 disambiguations, 75664890 queries
  ref_maybe_used_by_call_p: 153485 disambiguations, 66711204 queries
  call_may_clobber_ref_p: 22816 disambiguations, 28889 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 36846 queries
  nonoverlapping_refs_since_match_p: 27271 disambiguations, 58917 must overlaps, 86958 queries
  aliasing_component_refs_p: 65808 disambiguations, 2067256 queries
  TBAA oracle: 25929211 disambiguations 60395141 queries
   12391384 are in alias set 0
   10783783 queries asked about the same object
   126 queries asked about the same alias set
   0 access volatile
   9598698 are dependent in the DAG
   1691939 are aritificially in conflict with void *

Modref stats:
  modref use: 14284 disambiguations, 53336 queries
  modref clobber: 1660281 disambiguations, 2130440 queries
  4311165 tbaa queries (2.023603 per modref query)
  685304 base compares (0.321673 per modref query)

PTA query stats:
  pt_solution_includes: 959190 disambiguations, 13169678 queries
  pt_solutions_intersect: 1050969 disambiguations, 13246686 queries

Alias oracle query stats:
  refs_may_alias_p: 66914578 disambiguations, 76692648 queries
  ref_maybe_used_by_call_p: 244077 disambiguations, 67732086 queries
  call_may_clobber_ref_p: 111475 disambiguations, 116613 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 37091 queries
  nonoverlapping_refs_since_match_p: 27267 disambiguations, 59019 must overlaps, 87056 queries
  aliasing_component_refs_p: 65870 disambiguations, 2063394 queries
  TBAA oracle: 26024415 disambiguations 60579490 queries
   12450910 are in alias set 0
   10806673 queries asked about the same object
   125 queries asked about the same alias set
   0 access volatile
   9605837 are dependent in the DAG
   1691530 are aritificially in conflict with void *

Modref stats:
  modref use: 14272 disambiguations, 277680 queries
  modref clobber: 1669753 disambiguations, 7849135 queries
  4330162 tbaa queries (0.551674 per modref query)
  699241 base compares (0.089085 per modref query)

PTA query stats:
  pt_solution_includes: 1833920 disambiguations, 13846032 queries
  pt_solutions_intersect: 1093785 disambiguations, 13309954 queries

So there are almost twice as many pt_solution_includes disambiguations.
I will re-run the stats overnight to be sure that this is not caused by an
independent change (but both builds were from almost the same checkout).
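
The "almost twice" figure can be checked from the two pt_solution_includes
lines quoted above.  The following is only an illustrative C++ sketch; the
constant and function names are made up for this note and are not part of
GCC or of the patch.

```cpp
#include <cassert>

// Counts copied from the two "pt_solution_includes" lines quoted above.
constexpr double kBeforeDisambiguations = 959190;
constexpr double kBeforeQueries = 13169678;
constexpr double kAfterDisambiguations = 1833920;
constexpr double kAfterQueries = 13846032;

// Fraction of queries that ended in a disambiguation.
constexpr double hit_rate (double disambiguations, double queries)
{
  return disambiguations / queries;
}

// Growth in the absolute number of disambiguations.
constexpr double disambiguation_ratio ()
{
  return kAfterDisambiguations / kBeforeDisambiguations;
}
```

The ratio comes out at roughly 1.91, and the share of pt_solution_includes
queries that are disambiguated rises from about 7.3% to about 13.2%.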

Bootstrapped/regtested x86_64-linux, OK?
(I will analyze the t2.c failure further.)

Honza

gcc/ChangeLog:

2020-11-10  Jan Hubicka  

* gimple.c: Include ipa-modref-tree.h and ipa-modref.h.
(gimple_call_arg_flags): Use modref to determine flags.
* ipa-modref.c: Include gimple-ssa.h, tree-phinodes.h,
tree-ssa-operands.h, stringpool.h and tree-ssanames.h.
(analyze_ssa_name_flags): Declare.
(modref_summary::useful_p): Summary is also useful if arg flags are
known.
(dump_eaf_flags): New function.
(modref_summary::dump): Use it.
(get_modref_function_summary): Be ready for current_function_decl
being NULL.
(memory_access_to): New function.
(deref_flags): New function.
(call_lhs_flags): New function.
(analyze_parms): New function.
(analyze_function): Use it.
* ipa-modref.h (struct modref_summary): Add arg_flags.

gcc/testsuite/ChangeLog:

2020-11-10  Jan Hubicka  

* gcc.dg/torture/pta-ptrarith-1.c: Escape parameters.

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 1afed88e1f1..da90716aa23 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -46,6 +46,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "langhooks.h"
 #include "attr-fnspec.h"
+#include "ipa-modref-tree.h"
+#include "ipa-modref.h"
 
 
 /* All the tuples have their operand vector (if present) at the very bottom
@@ -1532,24 +1534,45 @@ int
 gimple_call_arg_flags (const gcall *stmt, unsigned arg)
 {
   attr_fnspec fnspec = gimple_call_fnspec (stmt);
-
-  if 

Re: [ranger] Fix wrong code for boolean negation in condition at -O

2020-11-09 Thread Andrew MacLeod via Gcc-patches

On 11/9/20 4:38 PM, Eric Botcazou wrote:

> Hi,
>
> this is a regression present on mainline in the form of wrong code generated
> for the attached Ada testcase at -O -ftree-vrp.  It's again an issue with the
> 8-bit booleans in Ada and, as a matter of fact, it's exactly the same issue as
> the one I fixed elsewhere about a year and a half ago at:
>    https://gcc.gnu.org/pipermail/gcc-patches/2019-February/517843.html
>
> The problem is the bitwise/logical dichotomy for operators and the transition
> from the former to the latter for boolean types: if they are 1-bit, that's
> straightforward but, if they are larger, then you need to be careful because
> you cannot, on the one hand, turn a bitwise AND into a logical AND and, on the
> other hand, *not* turn e.g. a bitwise NOT into a logical NOT if they occur in
> the same computation, as the first change will drop the masking that may need
> to be applied after the bitwise NOT if it is not also changed.
>
> Given that the ranger turns bitwise AND/OR into logical AND/OR for booleans,
> the patch does the same for bitwise NOT, exactly as in the above fix.
>
> Bootstrapped/regtested on x86-64/Linux, OK for the mainline?
>
>
> 2020-11-09  Eric Botcazou
>
> * range-op.cc (operator_logical_not::fold_range): Tidy up.
> (operator_logical_not::op1_range): Call above method.
> (operator_bitwise_not::fold_range): If the type is compatible
> with boolean, call op_logical_not.fold_range.
> (operator_bitwise_not::op1_range): If the type is compatible
> with boolean, call op_logical_not.op1_range.



Doh. Sorry about that. The project has a long, painful history with
multi-bit booleans :-P. Hopefully this is the last of them :-)


Thanks.

OK.

Andrew



Re: [PATCH, rs6000] Update instruction attributes for Power10

2020-11-09 Thread Segher Boessenkool
On Wed, Nov 04, 2020 at 02:42:39PM -0600, Pat Haugen wrote:
> This patch updates the type/prefixed/dot/size attributes for various new 
> instructions (and a couple existing that were incorrect) in preparation for 
> the Power10 scheduling patch that will be following.

> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -819,7 +819,7 @@ (define_insn "vsdb_"
> VSHIFT_DBL_LR))]
>"TARGET_POWER10"
>"vsdbi %0,%1,%2,%3"
> -  [(set_attr "type" "vecsimple")])
> +  [(set_attr "type" "vecperm")])

Is that such a good type for this?  I know the vsl etc. insns use it as
well, but :-)

These insns use the PM pipe on p9, which is documented as "Permute /
128b FX".  So maybe something like that can be a better name?

(Just food for thought.)

> @@ -827,7 +827,8 @@ (define_insn "xxspltiw_v4si"
>UNSPEC_XXSPLTIW))]
>   "TARGET_POWER10"
>   "xxspltiw %x0,%1"
> - [(set_attr "type" "vecsimple")])
> + [(set_attr "type" "vecperm")
> +  (set_attr "prefixed" "always")])

Like I said in the other thread, you could have a "maybe_prefixed"
attribute (which you use on pretty much all existing ones), and only if
that is set the "prefixed" attribute is set to "yes" or "no"
automatically.  And things like this that always want it set can just do
so.  The code that decides whether to prefix the mnemonic with a "p" can
then just look if "maybe_prefixed" is "yes".

I don't know what is nicer, such a scheme or what you have here.  Will
ran into a pitfall already, so dunno.

> @@ -1080,7 +1088,8 @@ (define_insn "vstril_p_code_"
>  UNSPEC_VSTRIR))]
>"TARGET_POWER10"
>"vstril. %0,%1"
> -  [(set_attr "type" "vecsimple")])
> +  [(set_attr "type" "vecperm")
> +   (set_attr "dot" "yes")])

"dot" is documented as
;; Is this instruction record form ("dot", signed compare to 0, writing CR0)?
which this insn doesn't do (it doesn't do a comparison, and it writes to
CR6).  It is fine to have this attribute if that is useful, but then
change the docs?  (FP insns write CR1 btw; that is all three fields that
record form insns can write).

>  ;; Fused multiply subtract 
>  (define_insn "*altivec_vnmsubfp"
> @@ -2779,7 +2788,7 @@ (define_insn "altivec_lvsl_reg"
>   UNSPEC_LVSL_REG))]
>"TARGET_ALTIVEC"
>"lvsl %0,0,%1"
> -  [(set_attr "type" "vecload")])
> +  [(set_attr "type" "vecperm")])

That is not correct on older processors (7400, 970, etc.), and it even
matters there (you can have only one insn per pipe in each cycle).  Not
that that will matter much for GCC 11, but perhaps we can make the model
better for new CPUs without making it worse for older ones :-)

> @@ -4465,7 +4475,7 @@ (define_insn "vcfuged"
>UNSPEC_VCFUGED))]
> "TARGET_POWER10"
> "vcfuged %0,%1,%2"
> -   [(set_attr "type" "vecsimple")])
> +   [(set_attr "type" "crypto")])
>  
>  (define_insn "vclzdm"
>[(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> @@ -4474,7 +4484,7 @@ (define_insn "vclzdm"
>UNSPEC_VCLZDM))]
> "TARGET_POWER10"
> "vclzdm %0,%1,%2"
> -   [(set_attr "type" "vecsimple")])
> +   [(set_attr "type" "crypto")])

Yeah, that is a pretty strange type for what these insns do.

We already have the name "veccomplex", but a name like that would be
fitting perhaps.

> @@ -345,4 +357,5 @@ (define_insn "dfp_dscri_"
>UNSPEC_DSCRI))]
>"TARGET_DFP"
>"dscri %0,%1,%2"
> -  [(set_attr "type" "dfp")])
> +  [(set_attr "type" "dfp")
> +   (set_attr "size" "")])

All the DFP size changes are fine.

> diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
> index a3fd28bdd0a..43d6b618929 100644
> --- a/gcc/config/rs6000/mma.md
> +++ b/gcc/config/rs6000/mma.md
> @@ -302,6 +302,7 @@ (define_insn_and_split "*movpoi"
>DONE;
>  }
>[(set_attr "type" "vecload,vecstore,veclogical")
> +   (set_attr "size" "256,256,*")
> (set_attr "length" "*,*,8")])

And this, too.

> @@ -589,4 +599,5 @@ (define_insn "mma_"
>"TARGET_MMA"
>" %A0,%x2,%x3,%4,%5,%6"
>[(set_attr "type" "mma")
> +   (set_attr "prefixed" "always")
> (set_attr "length" "8")])

And all the prefixed ones, too.

> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 4d528a39a37..55c47140672 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -25752,7 +25752,7 @@ static bool next_insn_prefixed_p;
>  void
>  rs6000_final_prescan_insn (rtx_insn *insn, rtx [], int)
>  {
> -  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
> +  next_insn_prefixed_p = (get_attr_prefixed (insn) == PREFIXED_YES);
>return;
>  }

Add a comment here?  "always" is not surprising in many places, but here
it is.

> -;; Whether an insn is a prefixed insn, and an initial 'p' should be printed
> -;; before the instruction.  A prefixed instruction has a prefix instruction
> -;; word that extends the immediate value of the instructions from 12-16 bits 
> to
> -;; 34 bits.  The macro 

[RFH] Use get_deref_alias_type in ipa-modref

2020-11-09 Thread Jan Hubicka
Hi,
this patch implements the cleanup we discussed some time ago on IRC.
Instead of recording reference types in ipa-modref, I record pointer
types the same way as is done by RTL attributes, and I moved the
corresponding logic to tree-ssa-alias.c.
The problem is that it breaks some testcases:

./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr50444.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr50444.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr50444.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr50444.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr52419.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr52419.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr52419.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr52419.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-2.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-2.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-3.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-3.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-4.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-4.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-4.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr57748-4.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr58041.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr58041.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr58041.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/torture/pr58041.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)

I looked into the first one; it is caused by the "may_alias" attribute on the
vector type. This makes build_pointer_type mark the pointer type as able to
alias everything, so get_deref_alias_type returns 0 while ao_ref_alias_set
returns 4.

I can live with this small loss of precision for ipa-modref, but I am
using the same logic in ao_ref_compare for ipa-icf, where we cannot be
conservative.  I wonder how to fix this. One option is to keep the
ipa-modref way of recording the actual types that determine the alias set
rather than pointers to them, but I recall you considered it confusing to
have two sets of machinery for this.  So perhaps we want to add a way to
build a pointer type that ignores the may_alias attribute?

I see that RTL needs this to handle alias set 0 because it also needs
a matching type size.  I do not need that in the TBAA machinery.

Honza

diff --git a/gcc/alias.c 

[ranger] Fix wrong code for boolean negation in condition at -O

2020-11-09 Thread Eric Botcazou
Hi,

this is a regression present on mainline in the form of wrong code generated 
for the attached Ada testcase at -O -ftree-vrp.  It's again an issue with the 
8-bit booleans in Ada and, as a matter of fact, it's exactly the same issue as
the one I fixed elsewhere about a year and a half ago at:
  https://gcc.gnu.org/pipermail/gcc-patches/2019-February/517843.html

The problem is the bitwise/logical dichotomy for operators and the transition 
from the former to the latter for boolean types: if they are 1-bit, that's 
straightforward but, if they are larger, then you need to be careful because 
you cannot, on the one hand, turn a bitwise AND into a logical AND and, on the 
other hand, *not* turn e.g. a bitwise NOT into a logical NOT if they occur in 
the same computation, as the first change will drop the masking that may need 
to be applied after the bitwise NOT if it is not also changed.

Given that the ranger turns bitwise AND/OR into logical AND/OR for booleans, 
the patch does the same for bitwise NOT, exactly as in the above fix.
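
To make the hazard described above concrete, here is a minimal C++ sketch.
It is not GCC code: it models an 8-bit boolean as a std::uint8_t that may
only hold 0 or 1, and the type and function names are hypothetical, chosen
only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an 8-bit boolean: the only valid values are 0 and 1.
using bool8 = std::uint8_t;

// Bitwise NOT flips all eight bits, so 1 becomes 0xFE rather than 0.
inline bool8 bitwise_not (bool8 b) { return static_cast<bool8> (~b); }

// Logical NOT stays within {0, 1}.
inline bool8 logical_not (bool8 b) { return b ? 0 : 1; }

// Bitwise AND keeps only the bits set in both operands, so ANDing with a
// valid boolean also masks away the junk bits left by a bitwise NOT.
inline bool8 bitwise_and (bool8 a, bool8 b) { return static_cast<bool8> (a & b); }

// Logical AND only asks whether both operands are nonzero; no masking.
inline bool8 logical_and (bool8 a, bool8 b) { return (a && b) ? 1 : 0; }
```

With Blue = 1 and Red = 1, "Blue and not Red" is 0 when both operators are
bitwise or both are logical.  Mixing a logical AND with a bitwise NOT yields
1, because 0xFE is nonzero and nothing masks it any more.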

Bootstrapped/regtested on x86-64/Linux, OK for the mainline?


2020-11-09  Eric Botcazou  

* range-op.cc (operator_logical_not::fold_range): Tidy up.
(operator_logical_not::op1_range): Call above method.
(operator_bitwise_not::fold_range): If the type is compatible
with boolean, call op_logical_not.fold_range.
(operator_bitwise_not::op1_range): If the type is compatible
with boolean, call op_logical_not.op1_range.


2020-11-09  Eric Botcazou  

* gnat.dg/opt88.adb: New test.

-- 
Eric Botcazou
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index f38f02e8d27..238bf327d9f 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2706,27 +2706,21 @@ operator_logical_not::fold_range (irange &r, tree type,
   if (empty_range_varying (r, type, lh, rh))
 return true;
 
-  if (lh.varying_p () || lh.undefined_p ())
-r = lh;
-  else
-{
-  r = lh;
-  r.invert ();
-}
-  gcc_checking_assert (lh.type() == type);
+  r = lh;
+  if (!lh.varying_p () && !lh.undefined_p ())
+r.invert ();
+
   return true;
 }
 
 bool
 operator_logical_not::op1_range (irange &r,
- tree type ATTRIBUTE_UNUSED,
+ tree type,
  const irange &lhs,
- const irange &op2 ATTRIBUTE_UNUSED) const
+ const irange &op2) const
 {
-  r = lhs;
-  if (!lhs.varying_p () && !lhs.undefined_p ())
-r.invert ();
-  return true;
+  // Logical NOT is involutary...do it again.
+  return fold_range (r, type, lhs, op2);
 }
 
 
@@ -2749,6 +2743,9 @@ operator_bitwise_not::fold_range (irange &r, tree type,
   if (empty_range_varying (r, type, lh, rh))
 return true;
 
+  if (types_compatible_p (type, boolean_type_node))
+return op_logical_not.fold_range (r, type, lh, rh);
+
   // ~X is simply -1 - X.
   int_range<1> minusone (type, wi::minus_one (TYPE_PRECISION (type)),
 			 wi::minus_one (TYPE_PRECISION (type)));
@@ -2761,6 +2758,9 @@ operator_bitwise_not::op1_range (irange &r, tree type,
  const irange &lhs,
  const irange &op2) const
 {
+  if (types_compatible_p (type, boolean_type_node))
+return op_logical_not.op1_range (r, type, lhs, op2);
+
   // ~X is -1 - X and since bitwise NOT is involutary...do it again.
   return fold_range (r, type, lhs, op2);
 }
-- { dg-do run }
-- { dg-options "-O -ftree-vrp -fno-inline" }

procedure Opt88 is

  Val : Integer := 1;

  procedure Dummy (B : out Boolean) is
  begin
B := True;
  end;

  function Test return Boolean is
  begin
return False;
  end;

  procedure Do_It (OK : out Boolean) is

Blue : Boolean := False;
Red  : Boolean := False;

  begin
OK := True;
Blue := True;
Dummy (Red);

if Red then
  Red := False;

  if Test then
Dummy (Red);
  end if;
end if;

if Blue and not Red then
  Val := 0;
end if;

if Red then
  OK := False;
end if;
  end;

  OK : Boolean;

begin
  Do_It (OK);
  if not OK then
raise Program_Error;
  end if;
end;


Re: [PATCH] c, c++: Fix up -Wunused-value on COMPLEX_EXPRs [PR97748]

2020-11-09 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 09, 2020 at 02:35:58PM -0500, Jason Merrill wrote:
> How about calling warn_if_unused_value instead of the new
> warn_if_unused_value_p?

That seems to work if I just replace the warning_at call with
warn_if_unused_value call (at least no regression in check-c++-all and
libstdc++ testsuite).
Initially I tried just calling warn_if_unused_value without the NOP_EXPR
stripping and code/tclass checks, but that regressed a few tests, e.g.
g++.dg/warn/Wunused-14.C or c-c++-common/Wunused-var-9.c or 3 lines
in the new test, because STATEMENT_LISTs or CLEANUP_POINT_EXPRs would
make it through and result in bogus warnings.

2020-11-09  Jakub Jelinek  

PR c/97748
gcc/c-family/
* c-common.h (warn_if_unused_value): Add quiet argument defaulted
to false.
* c-warn.c (warn_if_unused_value): Likewise.  Pass it down
recursively and just return true instead of warning if it is true.
Handle COMPLEX_EXPR.
gcc/cp/
* cvt.c (convert_to_void): Check (complain & tf_warning) in the outer
if rather than twice in the inner one.  Use warn_if_unused_value.
Formatting fix.
gcc/testsuite/
* c-c++-common/Wunused-value-1.c: New test.

--- gcc/c-family/c-common.h.jj  2020-11-03 11:15:07.170681001 +0100
+++ gcc/c-family/c-common.h 2020-11-07 09:37:48.597233063 +0100
@@ -1362,7 +1362,7 @@ extern void warn_tautological_cmp (const
   tree, tree);
 extern void warn_logical_not_parentheses (location_t, enum tree_code, tree,
  tree);
-extern bool warn_if_unused_value (const_tree, location_t);
+extern bool warn_if_unused_value (const_tree, location_t, bool = false);
 extern bool strict_aliasing_warning (location_t, tree, tree);
 extern void sizeof_pointer_memaccess_warning (location_t *, tree,
  vec *, tree *,
--- gcc/c-family/c-warn.c.jj    2020-10-26 10:53:56.533885147 +0100
+++ gcc/c-family/c-warn.c   2020-11-07 09:40:51.011170825 +0100
@@ -585,7 +585,7 @@ warn_logical_not_parentheses (location_t
(potential) location of the expression.  */
 
 bool
-warn_if_unused_value (const_tree exp, location_t locus)
+warn_if_unused_value (const_tree exp, location_t locus, bool quiet)
 {
  restart:
   if (TREE_USED (exp) || TREE_NO_WARNING (exp))
@@ -633,7 +633,7 @@ warn_if_unused_value (const_tree exp, lo
   goto restart;
 
 case COMPOUND_EXPR:
-  if (warn_if_unused_value (TREE_OPERAND (exp, 0), locus))
+  if (warn_if_unused_value (TREE_OPERAND (exp, 0), locus, quiet))
return true;
   /* Let people do `(foo (), 0)' without a warning.  */
   if (TREE_CONSTANT (TREE_OPERAND (exp, 1)))
@@ -648,6 +648,13 @@ warn_if_unused_value (const_tree exp, lo
return false;
   goto warn;
 
+case COMPLEX_EXPR:
+  /* Warn only if both operands are unused.  */
+  if (warn_if_unused_value (TREE_OPERAND (exp, 0), locus, true)
+ && warn_if_unused_value (TREE_OPERAND (exp, 1), locus, true))
+   goto warn;
+  return false;
+
 case INDIRECT_REF:
   /* Don't warn about automatic dereferencing of references, since
 the user cannot control it.  */
@@ -671,6 +678,8 @@ warn_if_unused_value (const_tree exp, lo
return false;
 
 warn:
+  if (quiet)
+   return true;
   return warning_at (locus, OPT_Wunused_value, "value computed is not used");
 }
 }
--- gcc/cp/cvt.c.jj 2020-07-28 15:39:09.0 +0200
+++ gcc/cp/cvt.c    2020-11-09 21:28:47.180254399 +0100
@@ -1568,12 +1568,13 @@ convert_to_void (tree expr, impl_conv_vo
  && warn_unused_value
  && !TREE_NO_WARNING (expr)
  && !processing_template_decl
- && !cp_unevaluated_operand)
+ && !cp_unevaluated_operand
+ && (complain & tf_warning))
{
  /* The middle end does not warn about expressions that have
 been explicitly cast to void, so we must do so here.  */
- if (!TREE_SIDE_EFFECTS (expr)) {
-if (complain & tf_warning)
+ if (!TREE_SIDE_EFFECTS (expr))
+   {
  switch (implicit)
{
  case ICV_SECOND_OF_COND:
@@ -1605,14 +1606,10 @@ convert_to_void (tree expr, impl_conv_vo
  default:
gcc_unreachable ();
}
-  }
+   }
  else
{
- tree e;
- enum tree_code code;
- enum tree_code_class tclass;
-
- e = expr;
+ tree e = expr;
  /* We might like to warn about (say) "(int) f()", as the
 cast has no effect, but the compiler itself will
 generate implicit conversions under some
@@ -1626,21 +1623,20 @@ convert_to_void (tree expr, impl_conv_vo
  while (TREE_CODE (e) == NOP_EXPR)
e = TREE_OPERAND (e, 0);
 
- code = 

Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch

2020-11-09 Thread Alexander Monakov via Gcc-patches
On Mon, 26 Oct 2020, Jakub Jelinek wrote:

> On Mon, Oct 26, 2020 at 07:14:48AM -0700, Julian Brown wrote:
> > This patch adds caching for the stack block allocated for offloaded
> > OpenMP kernel launches on NVPTX. This is a performance optimisation --
> > we observed an average 11% or so performance improvement with this patch
> > across a set of accelerated GPU benchmarks on one machine (results vary
> > according to individual benchmark and with hardware used).

In this patch you're folding two changes together: reuse of allocated stacks
and removing one host-device synchronization.  Why is that?  Can you report
performance change separately for each change (and split out the patches)?

> > A given kernel launch will reuse the stack block from the previous launch
> > if it is large enough, else it is freed and reallocated. A slight caveat
> > is that memory will not be freed until the device is closed, so e.g. if
> > code is using highly variable launch geometries and large amounts of
> > GPU RAM, you might run out of resources slightly quicker with this patch.
> > 
> > Another way this patch gains performance is by omitting the
> > synchronisation at the end of an OpenMP offload kernel launch -- it's
> > safe for the GPU and CPU to continue executing in parallel at that point,
> > because e.g. copies-back from the device will be synchronised properly
> > with kernel completion anyway.

I don't think this explanation is sufficient. My understanding is that OpenMP
forbids the host to proceed asynchronously after the target construct unless
it is a 'target nowait' construct. This may be observable if there's a printf
in the target region for example (or if it accesses memory via host pointers).

So this really needs to be a separate patch with more explanation why this is
okay (if it is okay).

> > In turn, the last part necessitates a change to the way "(perhaps abort
> > was called)" errors are detected and reported.

As already mentioned, using callbacks is problematic. Plus, I'm sure the way
you lock out other threads is a performance loss when multiple threads have
target regions: even though they will not run concurrently on the GPU, you
still want to allow host threads to submit GPU jobs while the GPU is occupied.

I would suggest having a small pool (up to 3 entries perhaps) of stacks. Then
you can arrange reuse without totally serializing host threads on target
regions.
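
A rough C++ sketch of what such a pool could look like follows.  This is an
illustration under stated assumptions, not the libgomp plugin code: the
StackPool type and its members are hypothetical, and std::malloc/std::free
stand in for whatever device allocation calls the plugin would actually use.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Hypothetical sketch: std::malloc/std::free stand in for device allocation.
struct StackPool
{
  static constexpr std::size_t kEntries = 3;   // "up to 3 entries perhaps"

  struct Entry
  {
    void *block = nullptr;
    std::size_t size = 0;
    bool busy = false;
  };

  // Return a block of at least SIZE bytes, reusing an idle cached block when
  // it is large enough.  Returns nullptr when every slot is in use (or the
  // allocation fails); the caller then has to wait or allocate privately.
  void *acquire (std::size_t size)
  {
    std::lock_guard<std::mutex> guard (lock_);
    Entry *victim = nullptr;
    for (Entry &e : entries_)
      {
        if (e.busy)
          continue;
        if (e.block && e.size >= size)
          {
            e.busy = true;
            return e.block;
          }
        if (!victim)
          victim = &e;   // first idle slot that needs (re)allocation
      }
    if (!victim)
      return nullptr;
    std::free (victim->block);
    victim->block = std::malloc (size);
    victim->size = victim->block ? size : 0;
    victim->busy = victim->block != nullptr;
    return victim->block;
  }

  // Mark BLOCK as idle again so that a later launch can reuse it.
  void release (void *block)
  {
    std::lock_guard<std::mutex> guard (lock_);
    for (Entry &e : entries_)
      if (e.busy && e.block == block)
        {
          e.busy = false;
          return;
        }
  }

  ~StackPool ()
  {
    for (Entry &e : entries_)
      std::free (e.block);
  }

private:
  std::mutex lock_;
  std::array<Entry, kEntries> entries_;
};
```

The lock is held only while a slot is picked or returned, not for the
duration of a kernel launch, so several host threads can each hold a stack
and submit work at the same time.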

Alexander


Add support for copy specifier to fnspec

2020-11-09 Thread Jan Hubicka
Hi,
this patch adds 'c' and 'C' fnspec specifiers for a parameter that is copied
to a different parameter.  The main motivation is to get rid of the wrong
EAF_NOESCAPE flag on memcpy argument #2.  However, I also added the
arg_copied_to_arg_p predicate, which can eventually be used by
tree-ssa-structalias instead of special-casing all builtins.

I noticed that we can no longer describe STRNCAT precisely.  I am not
sure how important it is.  We can either special-case it in the three
places (in tree-ssa-alias and in ipa-modref) or use 1-9 in place of 'c'
and 'C', so that the second character would still be available for the size
specifier; strncat would then become

"1cW 13"
instead of
"1cW C1"

Not sure how important this is.
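
As an illustration of how the "C1" form is read, here is a small standalone
sketch. It is not the GCC class: arg_idx and arg_copied_to_arg are
hypothetical free functions, and the layout (two characters for the return
value, then two characters per argument) is an assumption taken from the
toplevel comment of attr-fnspec.h.

```cpp
#include <cassert>
#include <string>

// Assumed layout: characters 0-1 describe the return value, then each
// argument gets two characters (kind, then size or destination).
static unsigned arg_idx(unsigned i) { return 2 + 2 * i; }

// If argument I of SPEC is a 'c'/'C' argument, store the zero-based index
// of the destination argument in *DEST and return true.
bool arg_copied_to_arg(const std::string &spec, unsigned i, unsigned *dest) {
  unsigned idx = arg_idx(i);
  assert(idx + 1 < spec.size());
  if (spec[idx] != 'c' && spec[idx] != 'C')
    return false;
  *dest = spec[idx + 1] - '1';  // "C1" names parameter 1, i.e. index 0
  return true;
}
```

With the string "1cW C1", argument 1 is reported as copied to argument 0,
while argument 0 ('W') is not a copy source.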
Bootstrapped/regtested x86_64-linux, OK?

Honza

2020-11-09  Jan Hubicka  

* attr-fnspec.h: Add 'c' and 'C' specifiers to the toplevel comment.
(attr_fnspec::arg_direct_p): Add 'C'.
(attr_fnspec::arg_not_written_p): Handle 'c' and 'C'.
(attr_fnspec::arg_max_access_size_given_by_arg_p): Handle 'c' and 'C'.
(attr_fnspec::arg_access_size_given_by_type_p): Add comment about 'c'
and 'C'.
(attr_fnspec::arg_copied_to_arg_p): New.
* builtins.c (attr_fnspec::builtin_fnspec): Update fnspec of string
functions that copy an argument.
* tree-ssa-alias.c (attr_fnspec::verify): Add 'c' and 'C'; be more
strict on arg specifiers.

diff --git a/gcc/attr-fnspec.h b/gcc/attr-fnspec.h
index 28135328437..97405dbdd78 100644
--- a/gcc/attr-fnspec.h
+++ b/gcc/attr-fnspec.h
@@ -41,6 +41,13 @@
written and does not escape
  'w' or 'W' specifies that the memory pointed to by the parameter does not
escape
+ 'c' or 'C' specifies that the memory pointed to by the parameter is
+   copied to memory pointed to by different parameter
+   (as in memcpy).  The index of the destination parameter is
+   specified by following character i.e. "C1" means that memory is
+   copied to memory pointed to by parameter 1.
+   Size of block copied is determined by size specifier of the
+   destination parameter.
  '.'   specifies that nothing is known.
The uppercase letter in addition specifies that the memory pointed to
by the parameter is not dereferenced.  For 'r' only read applies
@@ -51,8 +58,11 @@
  ' 'nothing is known
  't'   the size of value written/read corresponds to the size of
of the pointed-to type of the argument type
- '1'...'9'  the size of value written/read is given by the specified
-   argument
+ '1'...'9'  preceded by 'o', 'O', 'w' or 'W'
+   specifies the size of value written/read is given by the
+   specified argument
+ '1'...'9'  preceded by 'c' or 'C'
+   specifies the argument data is copied to
  */
 
 #ifndef ATTR_FNSPEC_H
@@ -122,7 +132,8 @@ public:
   {
 unsigned int idx = arg_idx (i);
 gcc_checking_assert (arg_specified_p (i));
-return str[idx] == 'R' || str[idx] == 'O' || str[idx] == 'W';
+return str[idx] == 'R' || str[idx] == 'O'
+  || str[idx] == 'W' || str[idx] == 'C';
   }
 
   /* True if argument is used.  */
@@ -161,6 +172,7 @@ public:
 unsigned int idx = arg_idx (i);
 gcc_checking_assert (arg_specified_p (i));
 return str[idx] != 'r' && str[idx] != 'R'
+  && str[idx] != 'c' && str[idx] != 'C'
   && str[idx] != 'x' && str[idx] != 'X';
   }
 
@@ -171,6 +183,8 @@ public:
   {
 unsigned int idx = arg_idx (i);
 gcc_checking_assert (arg_specified_p (i));
+if (str[idx] == 'c' || str[idx] == 'C')
+  return arg_max_access_size_given_by_arg_p (str[idx + 1] - '1', arg);
 if (str[idx + 1] >= '1' && str[idx + 1] <= '9')
   {
*arg = str[idx + 1] - '1';
@@ -187,9 +201,26 @@ public:
   {
 unsigned int idx = arg_idx (i);
 gcc_checking_assert (arg_specified_p (i));
+/* We could handle 'c' 'C' but then we would need to have way to check
+   that both points to sizes are same.  */
 return str[idx + 1] == 't';
   }
 
+  /* Return true if memory pointed to by argument is copied to a memory
+ pointed to by a different argument (as in memcpy).
+ In this case set ARG.  */
+  bool
+  arg_copied_to_arg_p (unsigned int i, unsigned int *arg)
+  {
+unsigned int idx = arg_idx (i);
+gcc_checking_assert (arg_specified_p (i));
+if (str[idx] != 'c' && str[idx] != 'C')
+  return false;
+*arg = str[idx + 1] - '1';
+return true;
+  }
+
+
   /* True if the argument does not escape.  */
   bool
   arg_noescape_p (unsigned int i)
@@ -230,7 +261,7 @@ public:
 return str[1] != 'c' && str[1] != 'C';
   }
 
-  /* Return true if all memory written by the function 
+  /* Return true if all memory written by the function
  is specified by fnspec.  */
   bool
   global_memory_written_p ()
diff --git a/gcc/builtins.c 

[committed] MAINTAINERS: add myself for write after approval

2020-11-09 Thread Pat Bernardi
2020-11-09  Pat Bernardi  

* MAINTAINERS (Write After Approval): Add myself.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 24b77f5e663..a0216185de9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -323,6 +323,7 @@ Jon Beniston

 Andrew Bennett 
 Andrew Benson  
 Daniel Berlin  
+Pat Bernardi   
 Jan Beulich
 David Billinghurst 

 Tomas Bily 
-- 
2.27.0



Re: [PATCH, rs6000] Update instruction attributes for Power10

2020-11-09 Thread Segher Boessenkool
On Fri, Nov 06, 2020 at 10:46:43AM -0600, Pat Haugen wrote:
> On 11/5/20 4:32 PM, will schmidt wrote:
> > On Wed, 2020-11-04 at 14:42 -0600, Pat Haugen via Gcc-patches wrote:
> >>* config/rs6000/rs6000.c (rs6000_final_prescan_insn): Only add 'p' for
> >>PREFIXED_YES.
> > 
> > The code change reads as roughly 
> > - next_insn_prefixed_p != PREFIXED_NO
> > 
> > + next_insn_prefixed_p == PREFIXED_YES"
> > 
> > So just an inversion of the logic? I don't obviously see the 'p' impact
> > there.
> > 
> It's no longer an inversion of the logic since I added a PREFIXED_ALWAYS 
> value. 'next_insn_prefixed' is used by rs6000_final_prescan_insn() to 
> determine whether an insn mnemonic needs a 'p' prefix. We want it set for 
> PREFIXED_YES, but not for PREFIXED_NO or PREFIXED_ALWAYS.

Another way to do this is to do what maybe_var_shift does: things you
mark maybe_prefixed=yes get their prefixed attribute set to no or yes
depending on their operands, and for things where we always want to
set the prefixed attribute we can do just that. I don't know which reads
better / is clearer, and/or is more work to write (the schemes are
semantically identical).
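
To spell out why the quoted change is not a plain inversion once a third
value exists, here is a tiny sketch. The enum and function names are
hypothetical stand-ins for the states named in the thread, not the rs6000
sources.

```cpp
#include <cassert>

// Hypothetical stand-in for the three states discussed in the thread.
enum prefixed_state { PREFIXED_NO, PREFIXED_YES, PREFIXED_ALWAYS };

// Old test: anything that is not PREFIXED_NO gets the 'p'.
bool needs_p_old(prefixed_state s) { return s != PREFIXED_NO; }

// New test: only PREFIXED_YES gets the 'p'.
bool needs_p_new(prefixed_state s) { return s == PREFIXED_YES; }

// The two tests agree on NO and YES and differ only on ALWAYS.
void check_states() {
  assert(needs_p_old(PREFIXED_NO) == needs_p_new(PREFIXED_NO));
  assert(needs_p_old(PREFIXED_YES) == needs_p_new(PREFIXED_YES));
  assert(needs_p_old(PREFIXED_ALWAYS) != needs_p_new(PREFIXED_ALWAYS));
}
```

With only two states the two predicates would be equivalent; PREFIXED_ALWAYS
is the one input on which they disagree.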



Segher


Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-09 Thread H.J. Lu via Gcc-patches
On Mon, Nov 9, 2020 at 11:56 AM Jozef Lawrynowicz
 wrote:
>
> On Mon, Nov 09, 2020 at 10:36:07AM -0800, H.J. Lu via Gcc-patches wrote:
> > On Mon, Nov 9, 2020 at 9:41 AM Jozef Lawrynowicz
> >  wrote:
> > >
> > > On Fri, Nov 06, 2020 at 04:39:33PM -0800, H.J. Lu via Gcc-patches wrote:
> > > > On Fri, Nov 6, 2020 at 4:17 PM Jeff Law  wrote:
> > > > >
> > > > >
> > > > > On 11/6/20 5:13 PM, H.J. Lu wrote:
> > > > > > On Fri, Nov 6, 2020 at 4:01 PM Jeff Law  wrote:
> > > > > >>
> > > > > >> On 11/6/20 4:45 PM, H.J. Lu wrote:
> > > > > >>> On Fri, Nov 6, 2020 at 3:37 PM Jeff Law  wrote:
> > > > >  On 11/6/20 4:29 PM, H.J. Lu wrote:
> > > > > > On Fri, Nov 6, 2020 at 3:22 PM Jeff Law  wrote:
> > > > > >> On 11/5/20 7:34 AM, H.J. Lu via Gcc-patches wrote:
> > > > > >>> On Thu, Nov 5, 2020 at 3:37 AM Jozef Lawrynowicz
> > > > > >>>  wrote:
> > > > >  On Thu, Nov 05, 2020 at 06:21:21AM -0500, Hans-Peter Nilsson 
> > > > >  wrote:
> > > > > > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > > > > >> .retain is ill-defined.   For example,
> > > > > >>
> > > > > >> [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> > > > > >> static int xyzzy __attribute__((__used__));
> > > > > >> [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> > > > > >> [hjl@gnu-cfl-2 gcc]$ cat x.s
> > > > > >> .file "x.c"
> > > > > >> .text
> > > > > >> .retain xyzzy  < What does it do?
> > > > > >> .local xyzzy
> > > > > >> .comm xyzzy,4,4
> > > > > >> .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> > > > > >> .section .note.GNU-stack,"",@progbits
> > > > > >> [hjl@gnu-cfl-2 gcc]$
> > > > > > To answer that question: it's up to the assembler, but for 
> > > > > > ELF
> > > > > > and SHF_GNU_RETAIN, it seems obvious it'd tell the 
> > > > > > assembler to
> > > > > > set SHF_GNU_RETAIN for the section where the symbol ends up.
> > > > > > We both know this isn't rocket science with binutils.
> > > > >  Indeed, and my patch handles it trivially:
> > > > >  https://sourceware.org/pipermail/binutils/2020-November/113993.html
> > > > > 
> > > > >    +void
> > > > >    +obj_elf_retain (int arg ATTRIBUTE_UNUSED)
> > > > >     snip 
> > > > >    +  sym = get_sym_from_input_line_and_check ();
> > > > >    +  symbol_get_obj (sym)->retain = 1;
> > > > > 
> > > > >    @@ -2624,6 +2704,9 @@ elf_frob_symbol (symbolS *symp, int 
> > > > >  *puntp)
> > > > >  }
> > > > > }
> > > > > 
> > > > >    +  if (symbol_get_obj (symp)->retain)
> > > > >    +elf_section_flags (S_GET_SEGMENT (symp)) |= 
> > > > >  SHF_GNU_RETAIN;
> > > > >    +
> > > > >   /* Double check weak symbols.  */
> > > > >   if (S_IS_WEAK (symp))
> > > > > {
> > > > > 
> > > > >  We could check that the symbol named in the .retain 
> > > > >  directive has
> > > > >  already been defined, however this isn't compatible with GCC
> > > > >  mark_decl_preserved handling, since mark_decl_preserved is 
> > > > >  called
> > > > >  emitted before the local symbols are defined in the assembly 
> > > > >  output
> > > > >  file.
> > > > > 
> > > > >  GAS should at least validate that the symbol named in the 
> > > > >  .retain
> > > > >  directive does end up as a symbol though.
> > > > > 
> > > > > >>> Don't add .retain.
> > > > > >> Why?  I don't see why you find it so objectionable.
> > > > > >>
> > > > > > An ELF symbol directive should operate on symbol table:
> > > > > >
> > > > > > http://www.sco.com/developers/gabi/latest/ch4.symtab.html
> > > > > >
> > > > > > not the section flags where the symbol is defined.
> > > > >  I agree in general, but I think this is one of those cases where 
> > > > >  it's
> > > > >  not so clear.  And what you're talking about is an 
> > > > >  implementation detail.
> > > > > >>> There is no need for such a hack.  The proper thing to do in ELF 
> > > > > >>> is
> > > > > >>> to place such a symbol in a section with SHF_GNU_RETAIN flag.   
> > > > > >>> This
> > > > > >>> also avoids the question what to do with SHN_COMMON.
> > > > > >> I'm not sure that's a good idea either.  Moving symbols into a 
> > > > > >> section
> > > > > >> other than they'd normally live doesn't seem all that wise.
> > > > > > In ELF, a symbol must be defined in a section.  If we want to keep 
> > > > > > a symbol,
> > > > > > we should place it in an SHF_GNU_RETAIN section.
> > > > >
> > > > > Again, that's an implementation detail and it's not clear to me that 
> > > > > one
> > > > > approach is inherently better than the other.
> 

[pushed] c++: Improve error location for class using-decl.

2020-11-09 Thread Jason Merrill via Gcc-patches
We should use the location of the using-declaration, not the location of the
class.

Tested x86_64-pc-linux-gnu, applying to trunk.
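
For readers unfamiliar with the sentinel idiom, the following is a rough
sketch of the pattern, assuming the sentinel saves a global "current
location", overrides it, and restores it on scope exit. The names
location_guard and current_location are hypothetical; this is not the GCC
definition of iloc_sentinel.

```cpp
#include <cassert>

using location_t = unsigned;              // stand-in for GCC's location_t
static location_t current_location = 0;   // stand-in for the global location

// RAII guard: override the current location for the enclosing scope and
// restore the previous value when the scope is left.
class location_guard {
public:
  explicit location_guard(location_t loc) : saved_(current_location) {
    current_location = loc;
  }
  ~location_guard() { current_location = saved_; }
  location_guard(const location_guard &) = delete;
  location_guard &operator=(const location_guard &) = delete;

private:
  location_t saved_;
};

// Return the location a diagnostic would see while a using-declaration is
// processed inside a class whose own location is CLASS_LOC.
location_t location_seen_for_using_decl(location_t class_loc,
                                        location_t using_loc) {
  current_location = class_loc;
  location_t seen;
  {
    location_guard guard(using_loc);
    seen = current_location;  // diagnostics here point at the using-decl
  }
  assert(current_location == class_loc);  // restored after the scope
  return seen;
}
```

This matches the testsuite adjustments in the patch, where the dg-error
moves from the class head to the using-declaration line.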

gcc/cp/ChangeLog:

* class.c (handle_using_decl): Add an iloc_sentinel.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/using26.C: Adjust location.
* g++.old-deja/g++.other/using1.C: Adjust location.
---
 gcc/cp/class.c| 4 +++-
 gcc/testsuite/g++.dg/lookup/using26.C | 4 ++--
 gcc/testsuite/g++.old-deja/g++.other/using1.C | 4 ++--
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index c03737294eb..7c34d9466fc 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -1,4 +1,4 @@
-/* Functions related to building classes and their related objects.
+/* Functions related to building -*- C++ -*- classes and their related objects.
Copyright (C) 1987-2020 Free Software Foundation, Inc.
Contributed by Michael Tiemann (tiem...@cygnus.com)
 
@@ -1322,6 +1322,8 @@ handle_using_decl (tree using_decl, tree t)
   return;
 }
 
+  iloc_sentinel ils (DECL_SOURCE_LOCATION (using_decl));
+
   /* Make type T see field decl FDECL with access ACCESS.  */
   if (flist)
 for (ovl_iterator iter (flist); iter; ++iter)
diff --git a/gcc/testsuite/g++.dg/lookup/using26.C 
b/gcc/testsuite/g++.dg/lookup/using26.C
index 857c1348181..dd4e13039d7 100644
--- a/gcc/testsuite/g++.dg/lookup/using26.C
+++ b/gcc/testsuite/g++.dg/lookup/using26.C
@@ -17,9 +17,9 @@ struct C
 int next;
 };
 
-struct D : A, B, C // { dg-error "context" }
+struct D : A, B, C
 {
-using B::next;
+using B::next; // { dg-error "context" }
 void f()
 {
next = 12;
diff --git a/gcc/testsuite/g++.old-deja/g++.other/using1.C 
b/gcc/testsuite/g++.old-deja/g++.other/using1.C
index 6cebc292a41..89100918a1e 100644
--- a/gcc/testsuite/g++.old-deja/g++.other/using1.C
+++ b/gcc/testsuite/g++.old-deja/g++.other/using1.C
@@ -10,9 +10,9 @@ protected:
   friend class D2;
 };
 
-class D : public B { // { dg-error "" } within this context
+class D : public B {
 public:
-  using B::a;
+  using B::a; // { dg-error "" } within this context
   using B::b;
 };
 

base-commit: 38b17c27ce5a8e0cc5baa14697d4b5542b91b9d1
-- 
2.18.4



[pushed] c++: Call tsubst_pack_expansion from tsubst.

2020-11-09 Thread Jason Merrill via Gcc-patches
This was unnecessary (and incomplete) code duplication.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* pt.c (tsubst): Replace *_ARGUMENT_PACK code with
a call to tsubst_argument_pack.
---
 gcc/cp/pt.c | 15 +--
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2a885a90857..88644b9556b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -16060,20 +16060,7 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
 
 case TYPE_ARGUMENT_PACK:
 case NONTYPE_ARGUMENT_PACK:
-  {
-tree r;
-
-   if (code == NONTYPE_ARGUMENT_PACK)
- r = make_node (code);
-   else
- r = cxx_make_type (code);
-
-   tree pack_args = ARGUMENT_PACK_ARGS (t);
-   pack_args = tsubst_template_args (pack_args, args, complain, in_decl);
-   SET_ARGUMENT_PACK_ARGS (r, pack_args);
-
-   return r;
-  }
+  return tsubst_argument_pack (t, args, complain, in_decl);
 
 case VOID_CST:
 case INTEGER_CST:

base-commit: 38b17c27ce5a8e0cc5baa14697d4b5542b91b9d1
-- 
2.18.4



[PATCH] c++: Call tsubst_pack_expansion from tsubst.

2020-11-09 Thread Jason Merrill via Gcc-patches
This was unnecessary (and incomplete) code duplication.

gcc/cp/ChangeLog:

* pt.c (tsubst): Replace *_ARGUMENT_PACK code with
a call to tsubst_argument_pack.
---
 gcc/cp/pt.c | 15 +--
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2a885a90857..88644b9556b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -16060,20 +16060,7 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
 
 case TYPE_ARGUMENT_PACK:
 case NONTYPE_ARGUMENT_PACK:
-  {
-tree r;
-
-   if (code == NONTYPE_ARGUMENT_PACK)
- r = make_node (code);
-   else
- r = cxx_make_type (code);
-
-   tree pack_args = ARGUMENT_PACK_ARGS (t);
-   pack_args = tsubst_template_args (pack_args, args, complain, in_decl);
-   SET_ARGUMENT_PACK_ARGS (r, pack_args);
-
-   return r;
-  }
+  return tsubst_argument_pack (t, args, complain, in_decl);
 
 case VOID_CST:
 case INTEGER_CST:

base-commit: 38b17c27ce5a8e0cc5baa14697d4b5542b91b9d1
-- 
2.18.4



[PATCH 2/2] loops: Invoke lim after successful loop interchange

2020-11-09 Thread Martin Jambor
Hi,

this patch modifies the loop invariant pass so that it can operate
only on a single requested loop and its sub-loops and ignore the rest
of the function, much like it currently ignores basic blocks that are
not in any real loop.  It then invokes it from within the loop
interchange pass when it successfully swaps two loops.  This avoids
the non-LTO -Ofast run-time regressions of 410.bwaves and 503.bwaves_r
(which are 19% and 15% faster than current master on an AMD zen2
machine) while not introducing a full LIM pass into the pass pipeline.

I have not modified the LIM data structures; this means that they still
contain vectors indexed by loop->num even though only a single loop
nest is actually processed.  I also did not replace the uses of
pre_and_rev_post_order_compute_fn with a function that would compute a
postorder only for a given loop.  I can of course do so if the
approach is otherwise deemed viable.

The patch adds one additional global variable requested_loop to the
pass and then at various places behaves differently when it is set.  I
was considering storing the fake root loop into it for normal
operation, but since this loop often requires special handling anyway,
I came to the conclusion that the code would actually end up less
straightforward.

I have bootstrapped and tested the patch on x86_64-linux and a very
similar one on aarch64-linux.  I have also tested it by modifying the
tree_ssa_lim function to run loop_invariant_motion_from_loop on each
real outermost loop in a function and this variant also passed
bootstrap and all tests, including dump scans, of all languages.

I have built the entire SPEC 2006 FPrate monitoring the activity of
the LIM pass without and with the patch (on top of commit b642fca1c31
with which 526.blender_r and 538.imagick_r seemed to be failing) and
it only examined 0.2% more loops, 0.02% more BBs and even fewer
percent of statements because it is invoked only in a rather special
circumstance.  But the patch allows for more such need-based uses at
hopefully reasonable cost.

Since I do not have much experience with loop optimizers, I expect
that there will be requests to adjust the patch during the review.
Still, it fixes a performance regression against GCC 9 and so I hope
to address the concerns in time to get it into GCC 11.

Thanks,

Martin


gcc/ChangeLog:

2020-11-08  Martin Jambor  

* gimple-loop-interchange.cc (pass_linterchange::execute): Call
loop_invariant_motion_from_loop on affected loop nests.
* tree-ssa-loop-im.c (requested_loop): New variable.
(get_topmost_lim_loop): New function.
(outermost_invariant_loop): Use it, cap discovered topmost loop at
requested_loop.
(determine_max_movement): Use get_topmost_lim_loop.
(set_level): Assert that the selected loop is not outside of
requested_loop.
(compute_invariantness): Do not process loops outside of
requested_loop, if non-NULL.
(move_computations_worker): Likewise.
(mark_ref_stored): Stop iteration at requested_loop, if non-NULL.
(mark_ref_loaded): Likewise.
(analyze_memory_references): If non-NULL, only process basic
blocks and loops in requested_loop.  Compute contains_call bitmap.
(do_store_motion): Only process requested_loop if non-NULL.
(fill_always_executed_in): Likewise.  Also accept contains_call as
a parameter rather than computing it.
(tree_ssa_lim_initialize): New parameter which is stored into
requested_loop.  Additional dumping.  Only initialize
bb_loop_postorder for loops within requested_loop, if non-NULL.
(tree_ssa_lim_finalize): Clear requested_loop, additional dumping.
(loop_invariant_motion_from_loop): New function.
(tree_ssa_lim): Move all functionality to
loop_invariant_motion_from_loop, call it.
* tree-ssa-loop-manip.h (loop_invariant_motion_from_loop): Declare.

---
 gcc/gimple-loop-interchange.cc |  30 +-
 gcc/tree-ssa-loop-im.c | 176 -
 gcc/tree-ssa-loop-manip.h  |   2 +
 3 files changed, 156 insertions(+), 52 deletions(-)

diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index 1656004ecf0..8c376228779 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -2068,6 +2068,7 @@ pass_linterchange::execute (function *fun)
 return 0;
 
   bool changed_p = false;
+  auto_vec loops_to_lim;
   class loop *loop;
   FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
 {
@@ -2077,7 +2078,11 @@ pass_linterchange::execute (function *fun)
   if (prepare_perfect_loop_nest (loop, &loop_nest, &datarefs, &ddrs))
{
  tree_loop_interchange loop_interchange (loop_nest);
- changed_p |= loop_interchange.interchange (datarefs, ddrs);
+ if (loop_interchange.interchange (datarefs, ddrs))
+   {
+ changed_p = true;
+ loops_to_lim.safe_push 

[PATCH 1/2] cfgloop: Extend loop iteration macros to loop only over sub-loops

2020-11-09 Thread Martin Jambor
Hi,

This patch adds loop iteration macros FOR_EACH_ENCLOSED_LOOP and
FOR_EACH_ENCLOSED_LOOP_FN which can loop only over inner loops of a
given loop.

The patch is required for a follow-up patch which enables loop
invariant motion to only work on a selected loop.  I have bootstrapped
and tested the two patches on x86_64-linux and aarch64-linux.  OK for
trunk once the follow-up patch is accepted too?
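
As a rough illustration of the restricted walk, here is a self-contained
sketch of a preorder traversal that never leaves the sub-tree rooted at a
given loop. The loop_node type and the add_child and preorder_within helpers
are hypothetical; they only mimic the inner/next/outer links of the loop
tree and are not the cfgloop.h API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical loop tree node with the same link shape as the loop tree:
// first child (inner), next sibling (next) and parent (outer).
struct loop_node {
  explicit loop_node(int n) : num(n) {}
  int num;
  loop_node *inner = nullptr;
  loop_node *next = nullptr;
  loop_node *outer = nullptr;
};

// Append CHILD as the last sub-loop of PARENT.
void add_child(loop_node &parent, loop_node &child) {
  child.outer = &parent;
  if (!parent.inner) {
    parent.inner = &child;
    return;
  }
  loop_node *last = parent.inner;
  while (last->next)
    last = last->next;
  last->next = &child;
}

// Collect the numbers of TOP and all loops nested in it, in preorder.
// Siblings of TOP are never visited because climbing stops at TOP.
std::vector<int> preorder_within(loop_node *top) {
  assert(top != nullptr);
  std::vector<int> order;
  loop_node *l = top;
  while (true) {
    order.push_back(l->num);
    if (l->inner) {
      l = l->inner;
      continue;
    }
    while (l != top && l->next == nullptr)
      l = l->outer;
    if (l == top)
      break;
    l = l->next;
  }
  return order;
}
```

Passing the root of the tree visits every loop, which corresponds to the
behaviour the existing macros keep by passing NULL for the new parameter.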

Thanks,

Martin


gcc/ChangeLog:

2020-10-29  Martin Jambor  

* cfgloop.h (loop_iterator::loop_iterator): Add parameter to the
constructor, make it iterate over sub-loops if non-NULL.
(FOR_EACH_LOOP): Pass extra NULL to loop_iterator::loop_iterator.
(FOR_EACH_LOOP_FN): Likewise.
(FOR_EACH_ENCLOSED_LOOP): New macro.
(FOR_EACH_ENCLOSED_LOOP_FN): Likewise.
---
 gcc/cfgloop.h | 44 
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index d14689dc31f..e8ffa5b2964 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -663,7 +663,7 @@ enum li_flags
 class loop_iterator
 {
 public:
-  loop_iterator (function *fn, loop_p *loop, unsigned flags);
+  loop_iterator (function *fn, loop_p top, loop_p *loop, unsigned flags);
 
   inline loop_p next ();
 
@@ -693,8 +693,15 @@ loop_iterator::next ()
   return NULL;
 }
 
+/* Constructor to set up iteration over loops.  FN is the function in which the
+   loop tree resides.  If TOP is NULL iterate over all loops in the function,
+   otherwise iterate only over sub-loops of TOP (including TOP).  LOOP points
+   to the iteration pointer in the iteration.  FLAGS modify the iteration as
+   described in enum li_flags.  */
+
 inline
-loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
+loop_iterator::loop_iterator (function *fn, loop_p top, loop_p *loop,
+ unsigned flags)
 {
   class loop *aloop;
   unsigned i;
@@ -716,13 +723,16 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, 
unsigned flags)
   for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
if (aloop != NULL
&& aloop->inner == NULL
-   && aloop->num >= mn)
+   && aloop->num >= mn
+   && (!top || flow_loop_nested_p (top, aloop)))
  this->to_visit.quick_push (aloop->num);
 }
   else if (flags & LI_FROM_INNERMOST)
 {
+  if (!top)
+   top = loops_for_fn (fn)->tree_root;
   /* Push the loops to LI->TO_VISIT in postorder.  */
-  for (aloop = loops_for_fn (fn)->tree_root;
+  for (aloop = top;
   aloop->inner != NULL;
   aloop = aloop->inner)
continue;
@@ -732,15 +742,15 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, 
unsigned flags)
  if (aloop->num >= mn)
this->to_visit.quick_push (aloop->num);
 
- if (aloop->next)
+ if (aloop == top)
+   break;
+ else if (aloop->next)
{
  for (aloop = aloop->next;
   aloop->inner != NULL;
   aloop = aloop->inner)
continue;
}
- else if (!loop_outer (aloop))
-   break;
  else
aloop = loop_outer (aloop);
}
@@ -748,7 +758,7 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, 
unsigned flags)
   else
 {
   /* Push the loops to LI->TO_VISIT in preorder.  */
-  aloop = loops_for_fn (fn)->tree_root;
+  aloop = top ? top : loops_for_fn (fn)->tree_root;
   while (1)
{
  if (aloop->num >= mn)
@@ -758,9 +768,9 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, 
unsigned flags)
aloop = aloop->inner;
  else
{
- while (aloop != NULL && aloop->next == NULL)
+ while (aloop != top && aloop->next == NULL)
aloop = loop_outer (aloop);
- if (aloop == NULL)
+ if (aloop == top)
break;
  aloop = aloop->next;
}
@@ -771,12 +781,22 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, 
unsigned flags)
 }
 
 #define FOR_EACH_LOOP(LOOP, FLAGS) \
-  for (loop_iterator li(cfun, &(LOOP), FLAGS); \
+  for (loop_iterator li(cfun, NULL, &(LOOP), FLAGS);   \
(LOOP); \
(LOOP) = li.next ())
 
 #define FOR_EACH_LOOP_FN(FN, LOOP, FLAGS) \
-  for (loop_iterator li(FN, &(LOOP), FLAGS); \
+  for (loop_iterator li(FN, NULL, &(LOOP), FLAGS); \
+   (LOOP); \
+   (LOOP) = li.next ())
+
+#define FOR_EACH_ENCLOSED_LOOP(TOP, LOOP, FLAGS)   \
+  for (loop_iterator li(cfun, TOP, &(LOOP), FLAGS);\
+   (LOOP); \
+   (LOOP) = li.next ())
+
+#define FOR_EACH_ENCLOSED_LOOP_FN(FN, TOP, LOOP, FLAGS) \
+  for (loop_iterator li(FN, TOP, &(LOOP), FLAGS);  \
(LOOP); \
(LOOP) = li.next ())
 
-- 
2.29.2



[PATCH 2/2] IBM Z: Test long doubles in vector registers

2020-11-09 Thread Ilya Leoshkevich via Gcc-patches
gcc/testsuite/ChangeLog:

2020-11-05  Ilya Leoshkevich  

* gcc.target/s390/vector/long-double-callee-abi-scan.c: New test.
* gcc.target/s390/vector/long-double-caller-abi-run.c: New test.
* gcc.target/s390/vector/long-double-caller-abi-scan.c: New test.
* gcc.target/s390/vector/long-double-copysign.c: New test.
* gcc.target/s390/vector/long-double-fprx2-constant.c: New test.
* gcc.target/s390/vector/long-double-from-double.c: New test.
* gcc.target/s390/vector/long-double-from-float.c: New test.
* gcc.target/s390/vector/long-double-from-i16.c: New test.
* gcc.target/s390/vector/long-double-from-i32.c: New test.
* gcc.target/s390/vector/long-double-from-i64.c: New test.
* gcc.target/s390/vector/long-double-from-i8.c: New test.
* gcc.target/s390/vector/long-double-from-u16.c: New test.
* gcc.target/s390/vector/long-double-from-u32.c: New test.
* gcc.target/s390/vector/long-double-from-u64.c: New test.
* gcc.target/s390/vector/long-double-from-u8.c: New test.
* gcc.target/s390/vector/long-double-to-double.c: New test.
* gcc.target/s390/vector/long-double-to-float.c: New test.
* gcc.target/s390/vector/long-double-to-i16.c: New test.
* gcc.target/s390/vector/long-double-to-i32.c: New test.
* gcc.target/s390/vector/long-double-to-i64.c: New test.
* gcc.target/s390/vector/long-double-to-i8.c: New test.
* gcc.target/s390/vector/long-double-to-u16.c: New test.
* gcc.target/s390/vector/long-double-to-u32.c: New test.
* gcc.target/s390/vector/long-double-to-u64.c: New test.
* gcc.target/s390/vector/long-double-to-u8.c: New test.
* gcc.target/s390/vector/long-double-vec-duplicate.c: New test.
* gcc.target/s390/vector/long-double-wf.h: New test.
* gcc.target/s390/vector/long-double-wfaxb.c: New test.
* gcc.target/s390/vector/long-double-wfcxb-0001.c: New test.
* gcc.target/s390/vector/long-double-wfcxb-0111.c: New test.
* gcc.target/s390/vector/long-double-wfcxb-1011.c: New test.
* gcc.target/s390/vector/long-double-wfcxb-1101.c: New test.
* gcc.target/s390/vector/long-double-wfdxb.c: New test.
* gcc.target/s390/vector/long-double-wfixb.c: New test.
* gcc.target/s390/vector/long-double-wfkxb-0111.c: New test.
* gcc.target/s390/vector/long-double-wfkxb-1011.c: New test.
* gcc.target/s390/vector/long-double-wfkxb-1101.c: New test.
* gcc.target/s390/vector/long-double-wflcxb.c: New test.
* gcc.target/s390/vector/long-double-wflpxb.c: New test.
* gcc.target/s390/vector/long-double-wfmaxb-2.c: New test.
* gcc.target/s390/vector/long-double-wfmaxb-3.c: New test.
* gcc.target/s390/vector/long-double-wfmaxb-disabled.c: New test.
* gcc.target/s390/vector/long-double-wfmaxb.c: New test.
* gcc.target/s390/vector/long-double-wfmsxb-disabled.c: New test.
* gcc.target/s390/vector/long-double-wfmsxb.c: New test.
* gcc.target/s390/vector/long-double-wfmxb.c: New test.
* gcc.target/s390/vector/long-double-wfnmaxb-disabled.c: New test.
* gcc.target/s390/vector/long-double-wfnmaxb.c: New test.
* gcc.target/s390/vector/long-double-wfnmsxb-disabled.c: New test.
* gcc.target/s390/vector/long-double-wfnmsxb.c: New test.
* gcc.target/s390/vector/long-double-wfsqxb.c: New test.
* gcc.target/s390/vector/long-double-wfsxb-1.c: New test.
* gcc.target/s390/vector/long-double-wfsxb.c: New test.
* gcc.target/s390/vector/long-double-wftcixb-1.c: New test.
* gcc.target/s390/vector/long-double-wftcixb.c: New test.
---
 .../s390/vector/long-double-callee-abi-scan.c | 20 +++
 .../s390/vector/long-double-caller-abi-run.c  |  4 ++
 .../s390/vector/long-double-caller-abi-scan.c | 13 
 .../s390/vector/long-double-copysign.c| 21 +++
 .../s390/vector/long-double-fprx2-constant.c  | 11 
 .../s390/vector/long-double-from-double.c | 18 ++
 .../s390/vector/long-double-from-float.c  | 19 ++
 .../s390/vector/long-double-from-i16.c| 19 ++
 .../s390/vector/long-double-from-i32.c| 19 ++
 .../s390/vector/long-double-from-i64.c| 19 ++
 .../s390/vector/long-double-from-i8.c | 19 ++
 .../s390/vector/long-double-from-u16.c| 19 ++
 .../s390/vector/long-double-from-u32.c| 19 ++
 .../s390/vector/long-double-from-u64.c| 19 ++
 .../s390/vector/long-double-from-u8.c | 19 ++
 .../s390/vector/long-double-to-double.c   | 18 ++
 .../s390/vector/long-double-to-float.c| 19 ++
 .../s390/vector/long-double-to-i16.c  | 19 ++
 .../s390/vector/long-double-to-i32.c  | 19 ++
 .../s390/vector/long-double-to-i64.c  | 21 +++
 .../s390/vector/long-double-to-i8.c   | 

Re: [PATCH 1/4] c++: Fix ICE with variadic concepts and aliases [PR93907]

2020-11-09 Thread Jason Merrill via Gcc-patches

On 11/7/20 10:59 AM, Patrick Palka wrote:

On Fri, 6 Nov 2020, Jason Merrill wrote:


On 11/5/20 8:40 PM, Patrick Palka wrote:

This patch (naively) extends the PR93907 fix to also apply to variadic
concepts invoked with a type argument pack.  Without this, we ICE on
the below testcase (a variadic version of concepts-using2.C) in the same
manner as we used to on concepts-using2.C before r10-7133.

Patch series bootstrapped and regtested on x86_64-pc-linux-gnu,
and also tested against cmcstl2 and range-v3.

gcc/cp/ChangeLog:

PR c++/93907
* constraint.cc (tsubst_parameter_mapping): Also canonicalize
the type arguments of a TYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

PR c++/93907
* g++.dg/cpp2a/concepts-using3.C: New test, based off of
concepts-using2.C.
---
   gcc/cp/constraint.cc | 10 
   gcc/testsuite/g++.dg/cpp2a/concepts-using3.C | 52 
   2 files changed, 62 insertions(+)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-using3.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index b6f6f0d02a5..c871a8ab86a 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2252,6 +2252,16 @@ tsubst_parameter_mapping (tree map, tree args,
subst_info info)


Hmm, the


   else if (ARGUMENT_PACK_P (arg))
 new_arg = tsubst_argument_pack (arg, args, complain, in_decl);


just above this seems redundant, since tsubst_template_arg handles packs just
fine.  In fact, I wonder why tsubst_argument_pack is used specifically
anywhere?  It seems to get some edge cases better than the code in tsubst, but
the solution to that would seem to be replacing the code in tsubst with a call
to tsubst_argument_pack; then we can remove all the other calls to the
function.


They seem interchangeable here wrt handling TYPE_ARGUMENT_PACKs, but not
NONTYPE_ARGUMENT_PACKs.  It looks like tsubst_template_arg ends up just
issuing an error from tsubst_expr if we try using it to substitute into
a NONTYPE_ARGUMENT_PACK.




  new_arg = tsubst_template_arg (arg, args, complain, in_decl);
  if (TYPE_P (new_arg))
new_arg = canonicalize_type_argument (new_arg, complain);
+ if (TREE_CODE (new_arg) == TYPE_ARGUMENT_PACK)
+   {
+ tree pack_args = ARGUMENT_PACK_ARGS (new_arg);
+ for (int i = 0; i < TREE_VEC_LENGTH (pack_args); i++)
+   {
+ tree& pack_arg = TREE_VEC_ELT (pack_args, i);
+ if (TYPE_P (pack_arg))
+   pack_arg = canonicalize_type_argument (pack_arg,
complain);


Do we need the TYPE_P here, since we already know we're in a
TYPE_ARGUMENT_PACK?  That is, can an element of a TYPE_ARGUMENT_PACK be an
invalid argument to canonicalize_type_argument?


With -fconcepts-ts, the elements of a TYPE_ARGUMENT_PACK here can be
TEMPLATE_DECLs, as in e.g. line 28 of concepts/template-parm3.C.



OTOH, I wonder if we need to canonicalize non-type arguments here as well?


Hmm, I'm not sure.  Not doing so should at worst result in a
satisfaction cache miss in release builds, and in checking builds should
get caught by the hash table sanitizer.  I haven't been able to come up
with a testcase that demonstrates it's necessary.



I wonder if tsubst_template_arg should canonicalize rather than leave that up
to the caller?  I suppose that could do a bit more work when the result is
going to end up in convert_template_argument and get canonicalized again; I
don't know if that would be significant.


That seems like it works, based on some limited testing.  But there are
only two users of canonicalize_template_argument outside of
convert_template_argument itself, and the one use in unify is still
needed even with this change (or else we get many ICEs coming from
verify_unstripped_args if we try to remove it).  So the benefit of such
a change seems marginal at the moment.


Then the patch is OK as is.


}
 if (new_arg == error_mark_node)
return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C
b/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C
new file mode 100644
index 000..2c8ad40d104
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C
@@ -0,0 +1,52 @@
+// PR c++/93907
+// { dg-options -std=gnu++20 }
+
+// This testcase is a variadic version of concepts-using2.C; the only
+// difference is that 'cd' and 'ce' are now variadic concepts.
+
+template  struct c {
+  static constexpr int d = a;
+  typedef c e;
+};
+template  struct f;
+template  using g = typename f::e;
+struct b;
+template  struct f { using e = b; };
+template  struct m { typedef g aj; };
+template  struct n { typedef typename m::aj e; };
+template  using an = typename n::e;
+template  constexpr bool ao = c::d;
+template  constexpr bool i = c<1>::d;
+template  concept bb = i;
+#ifdef __SIZEOF_INT128__
+using cc = __int128;
+#else
+using cc = long long;
+#endif
+template  concept 

Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-09 Thread Jozef Lawrynowicz
On Mon, Nov 09, 2020 at 10:36:07AM -0800, H.J. Lu via Gcc-patches wrote:
> On Mon, Nov 9, 2020 at 9:41 AM Jozef Lawrynowicz
>  wrote:
> >
> > On Fri, Nov 06, 2020 at 04:39:33PM -0800, H.J. Lu via Gcc-patches wrote:
> > > On Fri, Nov 6, 2020 at 4:17 PM Jeff Law  wrote:
> > > >
> > > >
> > > > On 11/6/20 5:13 PM, H.J. Lu wrote:
> > > > > On Fri, Nov 6, 2020 at 4:01 PM Jeff Law  wrote:
> > > > >>
> > > > >> On 11/6/20 4:45 PM, H.J. Lu wrote:
> > > > >>> On Fri, Nov 6, 2020 at 3:37 PM Jeff Law  wrote:
> > > >  On 11/6/20 4:29 PM, H.J. Lu wrote:
> > > > > On Fri, Nov 6, 2020 at 3:22 PM Jeff Law  wrote:
> > > > >> On 11/5/20 7:34 AM, H.J. Lu via Gcc-patches wrote:
> > > > >>> On Thu, Nov 5, 2020 at 3:37 AM Jozef Lawrynowicz
> > > > >>>  wrote:
> > > >  On Thu, Nov 05, 2020 at 06:21:21AM -0500, Hans-Peter Nilsson 
> > > >  wrote:
> > > > > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > > > >> .retain is ill-defined.   For example,
> > > > >>
> > > > >> [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> > > > >> static int xyzzy __attribute__((__used__));
> > > > >> [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> > > > >> [hjl@gnu-cfl-2 gcc]$ cat x.s
> > > > >> .file "x.c"
> > > > >> .text
> > > > >> .retain xyzzy  < What does it do?
> > > > >> .local xyzzy
> > > > >> .comm xyzzy,4,4
> > > > >> .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> > > > >> .section .note.GNU-stack,"",@progbits
> > > > >> [hjl@gnu-cfl-2 gcc]$
> > > > > To answer that question: it's up to the assembler, but for ELF
> > > > > and SHF_GNU_RETAIN, it seems obvious it'd tell the assembler 
> > > > > to
> > > > > set SHF_GNU_RETAIN for the section where the symbol ends up.
> > > > > We both know this isn't rocket science with binutils.
> > > >  Indeed, and my patch handles it trivially:
> > > >  https://sourceware.org/pipermail/binutils/2020-November/113993.html
> > > > 
> > > >    +void
> > > >    +obj_elf_retain (int arg ATTRIBUTE_UNUSED)
> > > >     snip 
> > > >    +  sym = get_sym_from_input_line_and_check ();
> > > >    +  symbol_get_obj (sym)->retain = 1;
> > > > 
> > > >    @@ -2624,6 +2704,9 @@ elf_frob_symbol (symbolS *symp, int 
> > > >  *puntp)
> > > >  }
> > > > }
> > > > 
> > > >    +  if (symbol_get_obj (symp)->retain)
> > > >    +elf_section_flags (S_GET_SEGMENT (symp)) |= 
> > > >  SHF_GNU_RETAIN;
> > > >    +
> > > >   /* Double check weak symbols.  */
> > > >   if (S_IS_WEAK (symp))
> > > > {
> > > > 
> > > >  We could check that the symbol named in the .retain directive 
> > > >  has
> > > >  already been defined, however this isn't compatible with GCC
> > > > >  mark_decl_preserved handling, since mark_decl_preserved is called
> > > > >  before the local symbols are defined in the assembly output file.
> > > > 
> > > >  GAS should at least validate that the symbol named in the 
> > > >  .retain
> > > >  directive does end up as a symbol though.
> > > > 
> > > > >>> Don't add .retain.
> > > > >> Why?  I don't see why you find it so objectionable.
> > > > >>
> > > > > An ELF symbol directive should operate on symbol table:
> > > > >
> > > > > http://www.sco.com/developers/gabi/latest/ch4.symtab.html
> > > > >
> > > > > not the section flags where the symbol is defined.
> > > >  I agree in general, but I think this is one of those cases where 
> > > >  it's
> > > >  not so clear.  And what you're talking about is an implementation 
> > > >  detail.
> > > > >>> There is no need for such a hack.  The proper thing to do in ELF is
> > > > >>> to place such a symbol in a section with SHF_GNU_RETAIN flag.   This
> > > > >>> also avoids the question what to do with SHN_COMMON.
> > > > >> I'm not sure that's a good idea either.  Moving symbols into a 
> > > > >> section
> > > > >> other than they'd normally live doesn't seem all that wise.
> > > > > In ELF, a symbol must be defined in a section.  If we want to keep a 
> > > > > symbol,
> > > > > we should place it in an SHF_GNU_RETAIN section.
> > > >
> > > > Again, that's an implementation detail and it's not clear to me that one
> > > > approach is inherently better than the other.
> > > >
> > > >
> > > > >
> > > > >> Let's face it, there's not a great solution here.  If we mark its
> > > > >> existing section, then everything in that section gets kept.  If we 
> > > > >> put
> > > > > FWIW, this is what the .retain directive does and is one reason why
> > > > > I object to it.
> > > >
> > > > We could make 

[PATCH 1/2] IBM Z: Store long doubles in vector registers when possible

2020-11-09 Thread Ilya Leoshkevich via Gcc-patches
On z14+, there are instructions for working with 128-bit floats (long
doubles) in vector registers.  It's beneficial to use them instead of
instructions that operate on floating point register pairs, because it
allows storing 4 times more data in registers at a time, relieving
register pressure.  The raw performance of the new instructions is
almost the same as that of the old ones.

Implement by storing TFmode values in vector registers on z14+.  Since
not all operations are available with the new instructions, keep the
old ones available using the new FPRX2 mode, and convert between it and
TFmode when necessary (these are called "forwarder" expanders below).
Change the existing TFmode expanders to call either new- or old-style
ones depending on whether we are on z14+ or older machines
("dispatcher" expanders).

gcc/ChangeLog:

2020-11-03  Ilya Leoshkevich  

* config/s390/s390-modes.def (FPRX2): New mode.
* config/s390/s390-protos.h (s390_fma_allowed_p): New function.
* config/s390/s390.c (s390_fma_allowed_p): Likewise.
(s390_build_signbit_mask): Support 128-bit masks.
(print_operand): Support printing the second word of a TFmode
operand as vector register.
(constant_modes): Add FPRX2mode.
(s390_class_max_nregs): Return 1 for TFmode on z14+.
(s390_is_fpr128): New function.
(s390_is_vr128): Likewise.
(s390_can_change_mode_class): Use s390_is_fpr128 and
s390_is_vr128 in order to determine whether mode refers to a FPR
pair or to a VR.
(s390_emit_compare): Force TFmode operands into registers on
z14+.
* config/s390/s390.h (HAVE_TF): New macro.
(EXPAND_MOVTF): New macro.
(EXPAND_TF): Likewise.
* config/s390/s390.md (PFPO_OP_TYPE_FPRX2): PFPO_OP_TYPE_TF
alias.
(ALL): Add FPRX2.
(FP_ALL): Add FPRX2 for z14+, restrict TFmode to z13-.
(FP): Likewise.
(FP_ANYTF): New mode iterator.
(BFP): Add FPRX2 for z14+, restrict TFmode to z13-.
(TD_TF): Likewise.
(xde): Add FPRX2.
(nBFP): Likewise.
(nDFP): Likewise.
(DSF): Likewise.
(DFDI): Likewise.
(SFSI): Likewise.
(DF): Likewise.
(SF): Likewise.
(fT0): Likewise.
(bt): Likewise.
(_d): Likewise.
(HALF_TMODE): Likewise.
(tf_fpr): New mode_attr.
(type): New mode_attr.
(*cmp_ccz_0): Use type instead of mode with fsimp.
(*cmp_ccs_0_fastmath): Likewise.
(*cmptf_ccs): New pattern for wfcxb.
(*cmptf_ccsfps): New pattern for wfkxb.
(mov): Rename to mov.
(signbit2): Rename to signbit2.
(isinf2): Rename to isinf2.
(*TDC_insn_): Use type instead of mode with fsimp.
(fixuns_trunc2): Rename to
fixuns_trunc2.
(fix_trunctf2): Rename to fix_trunctf2_fpr.
(floatdi2): Rename to floatdi2, use type
instead of mode with itof.
(floatsi2): Rename to floatsi2, use type
instead of mode with itof.
(*floatuns2): Use type instead of mode for
itof.
(floatuns2): Rename to
floatuns2.
(trunctf2): Rename to trunctf2_fpr, use type instead
of mode with fsimp.
(extend2): Rename to
extend2.
(2): Rename to
2, use type instead of
mode with fsimp.
(rint2): Rename to rint2, use
type instead of mode with fsimp.
(2): Use type instead of mode for
fsimp.
(rint2): Likewise.
(trunc2): Rename to
trunc2.
(trunc2): Rename to
trunc2.
(extend2): Rename to
extend2.
(extend2): Rename to
extend2.
(add3): Rename to add3, use type instead of
mode with fsimp.
(*add3_cc): Use type instead of mode with fsimp.
(*add3_cconly): Likewise.
(sub3): Rename to sub3, use type instead of
mode with fsimp.
(*sub3_cc): Use type instead of mode with fsimp.
(*sub3_cconly): Likewise.
(mul3): Rename to mul3, use type instead of
mode with fsimp.
(fma4): Restrict using s390_fma_allowed_p.
(fms4): Restrict using s390_fma_allowed_p.
(div3): Rename to div3, use type instead of
mode with fdiv.
(neg2): Rename to neg2.
(*neg2_cc): Use type instead of mode with fsimp.
(*neg2_cconly): Likewise.
(*neg2_nocc): Likewise.
(*neg2): Likewise.
(abs2): Rename to abs2, use type instead of
mode with fdiv.
(*abs2_cc): Use type instead of mode with fsimp.
(*abs2_cconly): Likewise.
(*abs2_nocc): Likewise.
(*abs2): Likewise.
(*negabs2_cc): Likewise.
(*negabs2_cconly): Likewise.
(*negabs2_nocc): Likewise.
(*negabs2): Likewise.
(sqrt2): Rename to sqrt2, use type instead
of mode with fsqrt.
(cbranch4): 

[PATCH 0/2] IBM Z: Store long doubles in vector registers when possible

2020-11-09 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested on s390x-redhat-linux with --with-arch=z15.
Ok for master?

This patch series implements storing long doubles in vector registers
on z14+.  Patch 1 is the actual implementation, patch 2 adds tests.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557968.html
v1 -> v2:
* Committed cleanups.
* Do not use general_operand for *cmptf_ccs.
* Fix expander condition mismatches.
* Move tests from zvector to vector, do not use -mzvector.
* Merge scan and run tests where possible.

Ilya Leoshkevich (2):
  IBM Z: Store long doubles in vector registers when possible
  IBM Z: Test long doubles in vector registers

 gcc/config/s390/s390-modes.def|   5 +-
 gcc/config/s390/s390-protos.h |   1 +
 gcc/config/s390/s390.c|  57 ++-
 gcc/config/s390/s390.h|  35 ++
 gcc/config/s390/s390.md   | 209 ++
 gcc/config/s390/s390.opt  |  11 +
 gcc/config/s390/vector.md | 382 --
 gcc/config/s390/vx-builtins.md|  38 +-
 .../s390/vector/long-double-callee-abi-scan.c |  20 +
 .../s390/vector/long-double-caller-abi-run.c  |   4 +
 .../s390/vector/long-double-caller-abi-scan.c |  13 +
 .../s390/vector/long-double-copysign.c|  21 +
 .../s390/vector/long-double-fprx2-constant.c  |  11 +
 .../s390/vector/long-double-from-double.c |  18 +
 .../s390/vector/long-double-from-float.c  |  19 +
 .../s390/vector/long-double-from-i16.c|  19 +
 .../s390/vector/long-double-from-i32.c|  19 +
 .../s390/vector/long-double-from-i64.c|  19 +
 .../s390/vector/long-double-from-i8.c |  19 +
 .../s390/vector/long-double-from-u16.c|  19 +
 .../s390/vector/long-double-from-u32.c|  19 +
 .../s390/vector/long-double-from-u64.c|  19 +
 .../s390/vector/long-double-from-u8.c |  19 +
 .../s390/vector/long-double-to-double.c   |  18 +
 .../s390/vector/long-double-to-float.c|  19 +
 .../s390/vector/long-double-to-i16.c  |  19 +
 .../s390/vector/long-double-to-i32.c  |  19 +
 .../s390/vector/long-double-to-i64.c  |  21 +
 .../s390/vector/long-double-to-i8.c   |  19 +
 .../s390/vector/long-double-to-u16.c  |  20 +
 .../s390/vector/long-double-to-u32.c  |  20 +
 .../s390/vector/long-double-to-u64.c  |  20 +
 .../s390/vector/long-double-to-u8.c   |  20 +
 .../s390/vector/long-double-vec-duplicate.c   |  13 +
 .../gcc.target/s390/vector/long-double-wf.h   |  60 +++
 .../s390/vector/long-double-wfaxb.c   |  17 +
 .../s390/vector/long-double-wfcxb-0001.c  |   9 +
 .../s390/vector/long-double-wfcxb-0111.c  |   9 +
 .../s390/vector/long-double-wfcxb-1011.c  |   9 +
 .../s390/vector/long-double-wfcxb-1101.c  |   9 +
 .../s390/vector/long-double-wfdxb.c   |  17 +
 .../s390/vector/long-double-wfixb.c   |   7 +
 .../s390/vector/long-double-wfkxb-0111.c  |   9 +
 .../s390/vector/long-double-wfkxb-1011.c  |   9 +
 .../s390/vector/long-double-wfkxb-1101.c  |   9 +
 .../s390/vector/long-double-wflcxb.c  |   7 +
 .../s390/vector/long-double-wflpxb.c  |   7 +
 .../s390/vector/long-double-wfmaxb-2.c|  24 ++
 .../s390/vector/long-double-wfmaxb-3.c|  14 +
 .../s390/vector/long-double-wfmaxb-disabled.c |   8 +
 .../s390/vector/long-double-wfmaxb.c  |   7 +
 .../s390/vector/long-double-wfmsxb-disabled.c |   8 +
 .../s390/vector/long-double-wfmsxb.c  |   7 +
 .../s390/vector/long-double-wfmxb.c   |   7 +
 .../vector/long-double-wfnmaxb-disabled.c |   9 +
 .../s390/vector/long-double-wfnmaxb.c |   7 +
 .../vector/long-double-wfnmsxb-disabled.c |   9 +
 .../s390/vector/long-double-wfnmsxb.c |   7 +
 .../s390/vector/long-double-wfsqxb.c  |   7 +
 .../s390/vector/long-double-wfsxb-1.c |  21 +
 .../s390/vector/long-double-wfsxb.c   |   7 +
 .../s390/vector/long-double-wftcixb-1.c   |  15 +
 .../s390/vector/long-double-wftcixb.c |   7 +
 63 files changed, 1412 insertions(+), 134 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-callee-abi-scan.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-caller-abi-run.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-caller-abi-scan.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/long-double-copysign.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-fprx2-constant.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-from-double.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-from-float.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/long-double-from-i16.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/long-double-from-i32.c
 create mode 100644 

Re: [PATCH v2] c++: DR 1914 - Allow duplicate standard attributes.

2020-11-09 Thread Jason Merrill via Gcc-patches

On 11/9/20 12:05 PM, Marek Polacek wrote:

On Fri, Nov 06, 2020 at 03:01:56PM -0500, Jason Merrill via Gcc-patches wrote:

On 11/6/20 2:34 PM, Marek Polacek wrote:

On Fri, Nov 06, 2020 at 02:23:10PM -0500, Jason Merrill via Gcc-patches wrote:

On 11/6/20 2:06 PM, Marek Polacek wrote:

Following Joseph's change for C to allow duplicate C2x standard attributes
,
this patch does a similar thing for C++.  This is DR 1914, to be resolved by
, which is not part of the standard yet, but has wide
support so looks like a shoo-in.  Some duplications still produce warnings;
I didn't change that because a warning might be desirable.


What's the rationale for warning about some and not others?


I don't have any.  Joseph's patch removed the error for a duplicated
'fallthrough' attribute, but the warning remained so I left it as-is
too.

So either we just downgrade the error to a warning, or remove the
remaining warnings too.  I think I slightly prefer the former; with perhaps
a small tweak not to warn when the duplicated attribute comes from a macro
expansion.


Sounds good.


Here's a patch that does that.  cp_parser_check_std_attribute now handles
more standard attributes.  There's still a discrepancy that we warn about
[[fallthrough]] [[fallthrough]] but not about [[noreturn]] [[noreturn]].
Fixing this would involve calling cp_parser_check_std_attribute in another
spot (in a loop), presumably.  But I suspect that's less important for now.
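
For illustration only (this is not taken from the patch), here is a minimal
sketch of the two shapes of duplication being discussed.  The names fail and
fail_throws are hypothetical.  The function carries the attribute twice in
separate attribute-specifiers, which is the [[noreturn]] [[noreturn]] case
described above as not being diagnosed; the single-list form that DR 1914
concerns is left in a comment, on the assumption that compilers predating the
change may reject it.

```cpp
#include <cassert>
#include <stdexcept>

// Duplicate in two separate attribute-specifiers.  Each specifier has its
// own attribute-list, so "at most once in each attribute-list" is not
// violated here.
[[noreturn]] [[noreturn]] inline void
fail (const char *msg)
{
  assert (msg != nullptr);
  throw std::runtime_error (msg);
}

// The single-list form addressed by DR 1914 would look like this:
//   [[noreturn, noreturn]] void fail2 (const char *);
// It is kept as a comment because older compilers may give a hard error.

// Hypothetical helper: returns true if fail () leaves by throwing.
inline bool
fail_throws ()
{
  try
    {
      fail ("boom");
    }
  catch (const std::runtime_error &)
    {
      return true;
    }
  return false;
}
```

Whether a given compiler version warns about either form is best checked
directly; the sketch only shows where the duplicates sit syntactically.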

Thanks,

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Following Joseph's change for C to allow duplicate C2x standard attributes
,
this patch does a similar thing for C++.  This is DR 1914, to be resolved by
, which is not part of the standard yet, but has wide
support so looks like a shoo-in.  The duplications now produce warnings
instead, but only if the attribute wasn't specified via a macro.

gcc/c-family/ChangeLog:

DR 1914
* c-common.c (attribute_fallthrough_p): Tweak the warning
message.

gcc/cp/ChangeLog:

DR 1914
* parser.c (cp_parser_check_std_attribute): Return bool.  Add a
location_t parameter.  Return true if the attribute wasn't duplicated.
Give a warning instead of an error.  Check more attributes.
(cp_parser_std_attribute_list): Don't add duplicated attributes to
the list.  Pass location to cp_parser_check_std_attribute.

gcc/testsuite/ChangeLog:

DR 1914
* c-c++-common/attr-fallthrough-2.c: Adjust dg-warning.
* g++.dg/cpp0x/fallthrough2.C: Likewise.
* g++.dg/cpp0x/gen-attrs-60.C: Turn dg-error into dg-warning.
* g++.dg/cpp1y/attr-deprecated-2.C: Likewise.
* g++.dg/cpp2a/attr-likely2.C: Adjust dg-warning.
* g++.dg/cpp2a/nodiscard-once.C: Turn dg-error into dg-warning.
* g++.dg/cpp0x/gen-attrs-72.C: New test.
---
  gcc/c-family/c-common.c   |  5 +-
  gcc/cp/parser.c   | 51 ++-
  .../c-c++-common/attr-fallthrough-2.c |  2 +-
  gcc/testsuite/g++.dg/cpp0x/fallthrough2.C |  2 +-
  gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C |  2 +-
  gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C | 45 
  .../g++.dg/cpp1y/attr-deprecated-2.C  |  2 +-
  gcc/testsuite/g++.dg/cpp2a/attr-likely2.C |  2 +-
  gcc/testsuite/g++.dg/cpp2a/nodiscard-once.C   |  2 +-
  9 files changed, 81 insertions(+), 32 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index d4d3228b8f6..29508bca97b 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5752,9 +5752,10 @@ attribute_fallthrough_p (tree attr)
tree t = lookup_attribute ("fallthrough", attr);
if (t == NULL_TREE)
  return false;
-  /* This attribute shall appear at most once in each attribute-list.  */
+  /* It is no longer true that "this attribute shall appear at most once in
+ each attribute-list", but we still give a warning.  */
if (lookup_attribute ("fallthrough", TREE_CHAIN (t)))
-warning (OPT_Wattributes, "% attribute specified multiple "
+warning (OPT_Wattributes, "attribute % specified multiple "
 "times");
/* No attribute-argument-clause shall be present.  */
else if (TREE_VALUE (t) != NULL_TREE)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 323d7424a83..6fcee3efe6f 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -27269,30 +27269,30 @@ cp_parser_std_attribute (cp_parser *parser, tree 
attr_ns)
return attribute;
  }
  
-/* Check that the attribute ATTRIBUTE appears at most once in the

-   attribute-list ATTRIBUTES.  This is enforced for noreturn (7.6.3),
-   nodiscard, and deprecated (7.6.5).  Note that
-   carries_dependency (7.6.4) isn't implemented yet in GCC.  */
+/* 

Re: PowerPC: Update long double IEEE 128-bit tests.

2020-11-09 Thread Segher Boessenkool
On Fri, Nov 06, 2020 at 11:45:21PM -0500, Michael Meissner wrote:
> On Mon, Nov 02, 2020 at 07:00:15PM -0600, Segher Boessenkool wrote:
> > > +  /* This test is written to test IBM extended double, which is a pair of
> > > + doubles.  If long double can hold a larger value than a double can, 
> > > such
> > > + as when long double is IEEE 128-bit, just exit immediately.  */
> > 
> > A double-double can hold bigger values than a double can, as well
> > (if X is the biggest double, then X+Y is a valid double-double whenever
> > you take Y small enough).
> > 
> > > +  if (LDBL_MAX_10_EXP > DBL_MAX_10_EXP)
> > > +return 0;
> 
> Yes a double-double can hold more mantissa bits than a double, but the 
> exponent
> size is the same (which is what I'm testing).

But that is not what the comment says.  My remark was about the comment.
It is confusing as is.
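
To make the distinction concrete, here is a small sketch (not part of the
patch; both helper names are hypothetical) that separates the two properties
being conflated.  The guard in the quoted test compares exponent ranges,
while "can hold a larger value" could also be read as a statement about
mantissa bits.

```cpp
#include <cfloat>

// True when long double has a wider decimal exponent range than double.
// This is the property the quoted guard actually tests.
constexpr bool
ldbl_has_wider_exponent_range ()
{
  return LDBL_MAX_10_EXP > DBL_MAX_10_EXP;
}

// True when long double carries more mantissa bits than double.  Per the
// discussion above, this is expected to hold for IBM double-double even
// though its exponent range matches that of double.
constexpr bool
ldbl_has_more_mantissa_bits ()
{
  return LDBL_MANT_DIG > DBL_MANT_DIG;
}
```

Under that reading, a comment along the lines of "if long double has a larger
exponent range than double" would match the condition being tested.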

> > > +#if defined(_ARCH_PPC) && defined(__LONG_DOUBLE_IEEE128__)
> > > +/* On PowerPC systems, long double uses either the IBM long double 
> > > format, or
> > > +   IEEE 128-bit format.  The compiler switches the long double built-in
> > > +   function names and glibc switches the names when math.h is included.
> > > +   Because this test is run with -fno-builtin, include math.h so that the
> > > +   appropriate nextafter functions are called.  */
> > > +#include 
> > > +#endif
> > > +
> > >  #include "nextafter-1.c"
> > 
> > Please explain *what* mappings are made?  And why is it okay to do this
> > in the testsuite, when all "normal" code (that does not do this) will
> > just fail?
> 
> I can put in a better comment.  However, this test fails because it explicitly
> does not include math.h and it uses -fno-builtin.  So the compiler can't
> effectively map the nextafter math function.

So either the compiler is wrong, or the test is?  Or I do not grasp what
you mean to say at all :-(


Segher


Re: [PATCH] c++: Fix -Wvexing-parse ICE with omitted int [PR97762]

2020-11-09 Thread Jason Merrill via Gcc-patches

On 11/9/20 11:47 AM, Marek Polacek wrote:

For declarations like

   long f();

decl_specifiers->type will be NULL, but I neglected to handle this case,
therefore we ICE.  So handle this case by pretending we've seen 'int',
which is good enough for -Wvexing-parse's purposes.
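
As a hedged illustration of why such a declaration is "vexing" (this sketch
is not from the patch, and the function name is hypothetical): at block scope
'long a ();' declares a function returning long, with 'int' implied, rather
than a value-initialized variable.

```cpp
#include <cassert>
#include <type_traits>

// Returns true if 'long a ();' declares a function while 'long b {};'
// defines a variable.
inline bool
empty_parens_declare_function ()
{
  long a ();  // function declaration: 'a' takes no arguments, returns long
  long b {};  // variable definition, value-initialized to zero

  static_assert (std::is_function<decltype (a)>::value,
		 "'long a ();' is a function declaration");
  assert (b == 0);
  return std::is_function<decltype (a)>::value
	 && !std::is_function<decltype (b)>::value;
}
```

A compiler with -Wvexing-parse enabled may warn on the declaration of 'a',
which is the behavior the new test exercises.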

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


gcc/cp/ChangeLog:

PR c++/97762
* parser.c (warn_about_ambiguous_parse): Handle the case when
there is no type in the decl-specifiers.

gcc/testsuite/ChangeLog:

PR c++/97762
* g++.dg/warn/Wvexing-parse8.C: New test.
---
  gcc/cp/parser.c| 23 --
  gcc/testsuite/g++.dg/warn/Wvexing-parse8.C | 11 +++
  2 files changed, 28 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wvexing-parse8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index bbf157eb47f..b14b4c90c92 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -20652,13 +20652,24 @@ warn_about_ambiguous_parse (const 
cp_decl_specifier_seq *decl_specifiers,
if (declarator->parenthesized != UNKNOWN_LOCATION)
  return;
  
-  tree type = decl_specifiers->type;

-  if (TREE_CODE (type) == TYPE_DECL)
-   type = TREE_TYPE (type);
+  tree type;
+  if (decl_specifiers->type)
+{
+  type = decl_specifiers->type;
+  if (TREE_CODE (type) == TYPE_DECL)
+   type = TREE_TYPE (type);
  
-  /* If the return type is void there is no ambiguity.  */

-  if (same_type_p (type, void_type_node))
-return;
+  /* If the return type is void there is no ambiguity.  */
+  if (same_type_p (type, void_type_node))
+   return;
+}
+  else
+{
+  /* Code like long f(); will have null ->type.  If we have any
+type-specifiers, pretend we've seen int.  */
+  gcc_checking_assert (decl_specifiers->any_type_specifiers_p);
+  type = integer_type_node;
+}
  
auto_diagnostic_group d;

location_t loc = declarator->u.function.parens_loc;
diff --git a/gcc/testsuite/g++.dg/warn/Wvexing-parse8.C 
b/gcc/testsuite/g++.dg/warn/Wvexing-parse8.C
new file mode 100644
index 000..2d26d22fc4b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wvexing-parse8.C
@@ -0,0 +1,11 @@
+// PR c++/97762
+// { dg-do compile }
+
+void
+g ()
+{
+  long a(); // { dg-warning "empty parentheses" }
+  signed b(); // { dg-warning "empty parentheses" }
+  unsigned c(); // { dg-warning "empty parentheses" }
+  short d(); // { dg-warning "empty parentheses" }
+}

base-commit: 4e85ad79a137535393d8dc169359e1730cab3533





Re: [PATCH] c, c++: Fix up -Wunused-value on COMPLEX_EXPRs [PR97748]

2020-11-09 Thread Jason Merrill via Gcc-patches

On 11/9/20 4:51 AM, Jakub Jelinek wrote:

Hi!

The -Wunused-value warning in both C and C++ FEs (implemented
significantly differently between the two) sees the COMPLEX_EXPRs created
e.g. for complex pre/post increment and many other expressions as useless
and warns about it.

For the C warning implementation, on e.g.
COMPLEX_EXPR < ++REALPART_EXPR , IMAGPART_EXPR >;
would warn even on the IMAGPART_EXPR  there alone etc., so what works
is check if we'd warn about both operands of COMPLEX_EXPR and if yes,
warn on the whole COMPLEX_EXPR, otherwise don't warn.

The C++ warning implementation is significantly different, and for that one
the "warn only if both operands would be warned about" rule doesn't really
work; we then miss warnings e.g. about
COMPLEX_EXPR > + 1.0e+0, IMAGPART_EXPR >> 
>
so the patch instead warns if it would warn on any of the operands.
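
The two policies described above can be sketched with a toy model.  This is
not GCC's tree representation or the patch's code; Expr, Policy and the
helper functions are hypothetical names used only to contrast "warn if both
operands are unused" with "warn if any operand is unused".

```cpp
#include <cassert>

// Toy expression node; this is NOT GCC's tree representation.
struct Expr
{
  bool has_side_effects;  // e.g. a pre-increment or a call
  const Expr *real;       // both non-null for a complex-pair node
  const Expr *imag;
};

enum class Policy { BothOperands, AnyOperand };

// Returns true if the value of E would be reported as computed but unused.
inline bool
value_unused_p (const Expr &e, Policy policy)
{
  if (e.real && e.imag)
    {
      bool r = value_unused_p (*e.real, policy);
      bool i = value_unused_p (*e.imag, policy);
      return policy == Policy::BothOperands ? (r && i) : (r || i);
    }
  // A leaf must not have just one half of a pair set.
  assert (!e.real && !e.imag);
  return !e.has_side_effects;
}

// Models a pair whose real part is incremented and whose imaginary part is
// a plain read, roughly the shape of a complex pre-increment.
inline bool
increment_pair_unused_p (Policy policy)
{
  Expr inc = { true, nullptr, nullptr };
  Expr read = { false, nullptr, nullptr };
  Expr pair = { false, &inc, &read };
  return value_unused_p (pair, policy);
}

// Models a pair in which neither half has side effects.
inline bool
plain_pair_unused_p (Policy policy)
{
  Expr re = { false, nullptr, nullptr };
  Expr im = { false, nullptr, nullptr };
  Expr pair = { false, &re, &im };
  return value_unused_p (pair, policy);
}
```

In this model the increment pair is reported only under the "any operand"
policy, while a pair with no side effects is reported under both.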


I don't understand the rationale for this difference between C and C++. 
Are you saying that the C++ front end is confused by the SAVE_EXPR? 
warn_if_unused_value properly looks through it.


It's also weird that the C++ front end calls warn_if_unused_value (which 
is in a shared file) in gimplify_expr_stmt AND has its own version of 
the warning in convert_to_void.


How about calling warn_if_unused_value instead of the new 
warn_if_unused_value_p?



On the testcase which after the initial new tests contains pretty much
everything from gcc.dg/Wunused-value-1.c both approaches seem to work
nicely.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-11-09  Jakub Jelinek  

PR c/97748
gcc/c-family/
* c-common.h (warn_if_unused_value): Add quiet argument defaulted
to false.
* c-warn.c (warn_if_unused_value): Likewise.  Pass it down
recursively and just return true instead of warning if it is true.
Handle COMPLEX_EXPR.
gcc/cp/
* cvt.c (warn_if_unused_value_p): New function.
(convert_to_void): Use it.
gcc/testsuite/
* c-c++-common/Wunused-value-1.c: New test.

--- gcc/c-family/c-common.h.jj  2020-11-03 11:15:07.170681001 +0100
+++ gcc/c-family/c-common.h 2020-11-07 09:37:48.597233063 +0100
@@ -1362,7 +1362,7 @@ extern void warn_tautological_cmp (const
   tree, tree);
  extern void warn_logical_not_parentheses (location_t, enum tree_code, tree,
  tree);
-extern bool warn_if_unused_value (const_tree, location_t);
+extern bool warn_if_unused_value (const_tree, location_t, bool = false);
  extern bool strict_aliasing_warning (location_t, tree, tree);
  extern void sizeof_pointer_memaccess_warning (location_t *, tree,
  vec *, tree *,
--- gcc/c-family/c-warn.c.jj2020-10-26 10:53:56.533885147 +0100
+++ gcc/c-family/c-warn.c   2020-11-07 09:40:51.011170825 +0100
@@ -585,7 +585,7 @@ warn_logical_not_parentheses (location_t
 (potential) location of the expression.  */
  
  bool

-warn_if_unused_value (const_tree exp, location_t locus)
+warn_if_unused_value (const_tree exp, location_t locus, bool quiet)
  {
   restart:
if (TREE_USED (exp) || TREE_NO_WARNING (exp))
@@ -633,7 +633,7 @@ warn_if_unused_value (const_tree exp, lo
goto restart;
  
  case COMPOUND_EXPR:

-  if (warn_if_unused_value (TREE_OPERAND (exp, 0), locus))
+  if (warn_if_unused_value (TREE_OPERAND (exp, 0), locus, quiet))
return true;
/* Let people do `(foo (), 0)' without a warning.  */
if (TREE_CONSTANT (TREE_OPERAND (exp, 1)))
@@ -648,6 +648,13 @@ warn_if_unused_value (const_tree exp, lo
return false;
goto warn;
  
+case COMPLEX_EXPR:

+  /* Warn only if both operands are unused.  */
+  if (warn_if_unused_value (TREE_OPERAND (exp, 0), locus, true)
+ && warn_if_unused_value (TREE_OPERAND (exp, 1), locus, true))
+   goto warn;
+  return false;
+
  case INDIRECT_REF:
/* Don't warn about automatic dereferencing of references, since
 the user cannot control it.  */
@@ -671,6 +678,8 @@ warn_if_unused_value (const_tree exp, lo
return false;
  
  warn:

+  if (quiet)
+   return true;
return warning_at (locus, OPT_Wunused_value, "value computed is not 
used");
  }
  }
--- gcc/cp/cvt.c.jj 2020-07-28 15:39:09.0 +0200
+++ gcc/cp/cvt.c2020-11-08 21:02:08.306584085 +0100
@@ -,6 +,46 @@ maybe_warn_nodiscard (tree expr, impl_co
  }
  }
  
+/* Return true if -Wunused-value warning should be emitted for EXPR.  */

+
+static bool
+warn_if_unused_value_p (tree expr)
+{
+  /* We might like to warn about (say) "(int) f()", as the
+ cast has no effect, but the compiler itself will
+ generate implicit conversions under some
+ circumstances.  (For example a block copy will be
+ turned into a call to "__builtin_memcpy", with a
+ conversion of the return value to an appropriate
+ type.)  So, to avoid false 

C++ patch ping

2020-11-09 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping the updated bit_cast patch:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557781.html

Thanks
Jakub



Re: Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont

2020-11-09 Thread Jason Merrill via Gcc-patches
This patch was also applied to the GCC 9 and 10 branches and breaks those
builds, because PTA_CLDEMOTE is not defined.

On Mon, Nov 9, 2020 at 4:03 AM Uros Bizjak via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Mon, Nov 9, 2020 at 9:50 AM Cui, Lili  wrote:
> >
> > Hi Uros,
> >
> > This patch is to correct some instruction sets for
> march=Tremont/Broadwell/Silvermont/knl
> >
> > Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
> >
> > OK for master?
> >
> > [PATCH] Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for
> >  march=tremont
> >
> > 1. Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont
> > 2. Move PREFETCHW from march=broadwell to march=silvermont.
> > 3. Add PREFETCHWT1 to march=knl
> >
> > gcc/ChangeLog:
> >
> > PR target/97685
> > * config/i386/i386.h:
> > (PTA_BROADWELL): Delete PTA_PRFCHW.
> > (PTA_SILVERMONT): Add PTA_PRFCHW.
> > (PTA_KNL): Add PTA_PREFETCHWT1.
> > (PTA_TREMONT): Add PTA_MOVDIRI, PTA_MOVDIR64B, PTA_CLDEMOTE and
> PTA_WAITPKG.
> > * doc/invoke.texi: Delete PREFETCHW for broadwell, skylake, knl,
> knm,
> > skylake-avx512, cannonlake, icelake-client, icelake-server,
> cascadelake,
> > cooperlake, tigerlake and sapphirerapids.
> > Add PREFETCHW for silvermont, goldmont, goldmont-plus and
> tremont.
> > Add XSAVEC and XSAVES for goldmont, goldmont-plus and tremont.
> > Add MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for tremont.
> > Add KEYLOCKER and HREST for alderlake.
> > Add AMX-BF16, AMX-TILE, AMX-INT8 and UINTR for sapphirerapids.
> > Add KEYLOCKER for tigerlake.
>
> OK.
>
> Thanks,
> Uros.
>
> > ---
> >  gcc/config/i386/i386.h | 10 +++
> >  gcc/doc/invoke.texi| 59 +-
> >  2 files changed, 35 insertions(+), 34 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index d0c157a9970..5e01fe6b841 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -2515,8 +2515,7 @@ const wide_int_bitmask PTA_IVYBRIDGE = PTA_SANDYBRIDGE | PTA_FSGSBASE
> >    | PTA_RDRND | PTA_F16C;
> >  const wide_int_bitmask PTA_HASWELL = PTA_IVYBRIDGE | PTA_AVX2 | PTA_BMI
> >    | PTA_BMI2 | PTA_LZCNT | PTA_FMA | PTA_MOVBE | PTA_HLE;
> > -const wide_int_bitmask PTA_BROADWELL = PTA_HASWELL | PTA_ADX | PTA_PRFCHW
> > -  | PTA_RDSEED;
> > +const wide_int_bitmask PTA_BROADWELL = PTA_HASWELL | PTA_ADX | PTA_RDSEED;
> >  const wide_int_bitmask PTA_SKYLAKE = PTA_BROADWELL | PTA_AES | PTA_CLFLUSHOPT
> >    | PTA_XSAVEC | PTA_XSAVES | PTA_SGX;
> >  const wide_int_bitmask PTA_SKYLAKE_AVX512 = PTA_SKYLAKE | PTA_AVX512F
> > @@ -2541,16 +2540,17 @@ const wide_int_bitmask PTA_SAPPHIRERAPIDS = PTA_COOPERLAKE | PTA_MOVDIRI
> >  const wide_int_bitmask PTA_ALDERLAKE = PTA_SKYLAKE | PTA_CLDEMOTE | PTA_PTWRITE
> >    | PTA_WAITPKG | PTA_SERIALIZE | PTA_HRESET | PTA_KL | PTA_WIDEKL;
> >  const wide_int_bitmask PTA_KNL = PTA_BROADWELL | PTA_AVX512PF | PTA_AVX512ER
> > -  | PTA_AVX512F | PTA_AVX512CD;
> > +  | PTA_AVX512F | PTA_AVX512CD | PTA_PREFETCHWT1;
> >  const wide_int_bitmask PTA_BONNELL = PTA_CORE2 | PTA_MOVBE;
> > -const wide_int_bitmask PTA_SILVERMONT = PTA_WESTMERE | PTA_MOVBE | PTA_RDRND;
> > +const wide_int_bitmask PTA_SILVERMONT = PTA_WESTMERE | PTA_MOVBE | PTA_RDRND
> > +  | PTA_PRFCHW;
> >  const wide_int_bitmask PTA_GOLDMONT = PTA_SILVERMONT | PTA_AES | PTA_SHA | PTA_XSAVE
> >    | PTA_RDSEED | PTA_XSAVEC | PTA_XSAVES | PTA_CLFLUSHOPT | PTA_XSAVEOPT
> >    | PTA_FSGSBASE;
> >  const wide_int_bitmask PTA_GOLDMONT_PLUS = PTA_GOLDMONT | PTA_RDPID
> >    | PTA_SGX | PTA_PTWRITE;
> >  const wide_int_bitmask PTA_TREMONT = PTA_GOLDMONT_PLUS | PTA_CLWB
> > -  | PTA_GFNI;
> > +  | PTA_GFNI | PTA_MOVDIRI | PTA_MOVDIR64B | PTA_CLDEMOTE | PTA_WAITPKG;
> >  const wide_int_bitmask PTA_KNM = PTA_KNL | PTA_AVX5124VNNIW
> >    | PTA_AVX5124FMAPS | PTA_AVX512VPOPCNTDQ;
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index d2a188d7c75..d01beb248e1 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -29528,14 +29528,14 @@ BMI, BMI2 and F16C instruction set support.
> >
> >  @item broadwell
> >  Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
> > -SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
> > -BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set support.
> > +SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2,
> > +F16C, RDSEED and ADCX instruction set support.
> >
> >  @item skylake
> >  Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
> >  SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
> > -BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and
> > -XSAVES instruction set support.
> > +BMI, BMI2, F16C, RDSEED, ADCX, CLFLUSHOPT, XSAVEC and XSAVES
> 

[PATCH] x86: Add -mneeded for GNU_PROPERTY_X86_ISA_1_V[234] marker

2020-11-09 Thread H.J. Lu via Gcc-patches
GCC 11 supports -march=x86-64-v[234] to enable x86 micro-architecture ISA
levels:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250

Binutils has been updated to support GNU_PROPERTY_X86_ISA_1_V[234] marker:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/13

with

commit b0ab06937385e0ae25cebf1991787d64f439bf12
Author: H.J. Lu 
Date:   Fri Oct 30 06:49:57 2020 -0700

x86: Support GNU_PROPERTY_X86_ISA_1_BASELINE marker

and

commit 32930e4edbc06bc6f10c435dbcc63131715df678
Author: H.J. Lu 
Date:   Fri Oct 9 05:05:57 2020 -0700

x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker

in x86 ELF binaries.

Add -mneeded to emit GNU_PROPERTY_X86_ISA_1_NEEDED property to indicate
the micro-architecture ISA level required to execute the binary.
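
As a rough illustration of what the property carries (a sketch only, not
the code in this patch): the NEEDED value is a 32-bit mask with one bit per
ISA level. The helper name isa_1_needed_for_level is hypothetical, the bit
values are the ones the psABI proposal is understood to assign and should
be checked against the spec, and setting the bits cumulatively is an
assumption about how a producer might fill the mask.

```c
#include <stdio.h>

/* Bit assignments assumed from the x86-64 psABI proposal; verify
   against the specification before relying on them.  */
#define SKETCH_ISA_1_BASELINE (1U << 0)
#define SKETCH_ISA_1_V2       (1U << 1)
#define SKETCH_ISA_1_V3       (1U << 2)
#define SKETCH_ISA_1_V4       (1U << 3)

/* Hypothetical helper: map an x86-64 level (1 = baseline .. 4 = v4)
   to a NEEDED bitmask.  Each level also sets the bits of the levels
   below it (an assumption, see the lead-in).  */
static unsigned int
isa_1_needed_for_level (int level)
{
  unsigned int isa_1 = 0;

  if (level >= 1)
    isa_1 |= SKETCH_ISA_1_BASELINE;
  if (level >= 2)
    isa_1 |= SKETCH_ISA_1_V2;
  if (level >= 3)
    isa_1 |= SKETCH_ISA_1_V3;
  if (level >= 4)
    isa_1 |= SKETCH_ISA_1_V4;

  return isa_1;
}

/* Print the mask for every level; handy when comparing with the
   property values a tool such as readelf reports.  */
static void
print_isa_1_table (void)
{
  for (int level = 1; level <= 4; level++)
    printf ("x86-64 level %d -> 0x%x\n", level,
            isa_1_needed_for_level (level));
}
```

Under these assumptions, isa_1_needed_for_level (3) yields 0x7.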

gcc/

* config.gcc: Replace cet.o with gnu-property.o.  Replace
i386/t-cet with i386/t-gnu-property.
* config/i386/cet.c: Renamed to ...
* config/i386/gnu-property.c: This.
(emit_gnu_property): New function.
(file_end_indicate_exec_stack_and_cet): Renamed to ...
(file_end_indicate_exec_stack_and_gnu_property): This.  Call
emit_gnu_property to generate GNU_PROPERTY_X86_FEATURE_1_AND and
GNU_PROPERTY_X86_ISA_1_NEEDED properties.
* config/i386/i386.opt (mneeded): New.
* config/i386/linux-common.h (file_end_indicate_exec_stack_and_cet):
Renamed to ...
(file_end_indicate_exec_stack_and_gnu_property): This.
(TARGET_ASM_FILE_END): Updated.
* config/i386/t-cet: Renamed to ...
* config/i386/t-gnu-property: This.
(cet.o): Renamed to ...
(gnu-property.o): This.
* doc/invoke.texi: Document -mneeded.

gcc/testsuite/

* gcc.target/i386/x86-needed-1.c: New test.
* gcc.target/i386/x86-needed-2.c: Likewise.
* gcc.target/i386/x86-needed-3.c: Likewise.
---
 gcc/config.gcc   |   4 +-
 gcc/config/i386/cet.c|  76 
 gcc/config/i386/gnu-property.c   | 124 +++
 gcc/config/i386/i386.opt |   4 +
 gcc/config/i386/linux-common.h   |   4 +-
 gcc/config/i386/{t-cet => t-gnu-property}|   2 +-
 gcc/doc/invoke.texi  |   8 +-
 gcc/testsuite/gcc.target/i386/x86-needed-1.c |  13 ++
 gcc/testsuite/gcc.target/i386/x86-needed-2.c |  11 ++
 gcc/testsuite/gcc.target/i386/x86-needed-3.c |  11 ++
 10 files changed, 175 insertions(+), 82 deletions(-)
 delete mode 100644 gcc/config/i386/cet.c
 create mode 100644 gcc/config/i386/gnu-property.c
 rename gcc/config/i386/{t-cet => t-gnu-property} (93%)
 create mode 100644 gcc/testsuite/gcc.target/i386/x86-needed-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/x86-needed-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/x86-needed-3.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index dc6d68bd4eb..9bbc4274f86 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -5226,8 +5226,8 @@ case ${target} in
i[34567]86-*-darwin* | x86_64-*-darwin*)
;;
i[34567]86-*-linux* | x86_64-*-linux*)
-   extra_objs="${extra_objs} cet.o"
-   tmake_file="$tmake_file i386/t-linux i386/t-cet"
+   extra_objs="${extra_objs} gnu-property.o"
+   tmake_file="$tmake_file i386/t-linux i386/t-gnu-property"
;;
i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu)
tmake_file="$tmake_file i386/t-kfreebsd"
diff --git a/gcc/config/i386/cet.c b/gcc/config/i386/cet.c
deleted file mode 100644
index 5450ac307d5..000
--- a/gcc/config/i386/cet.c
+++ /dev/null
@@ -1,76 +0,0 @@
-/* Functions for CET/x86.
-   Copyright (C) 2017-2020 Free Software Foundation, Inc.
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3, or (at your option)
-any later version.
-
-GCC is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-.  */
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "output.h"
-#include "linux-common.h"
-
-void
-file_end_indicate_exec_stack_and_cet (void)
-{
-  file_end_indicate_exec_stack ();
-
-  if (flag_cf_protection == CF_NONE)
-return;
-
-  unsigned int feature_1 = 0;
-
-  if (flag_cf_protection & CF_BRANCH)
-/* GNU_PROPERTY_X86_FEATURE_1_IBT.  */
-feature_1 |= 0x1;
-
-  if (flag_cf_protection & CF_RETURN)
-/* GNU_PROPERTY_X86_FEATURE_1_SHSTK.  */
-feature_1 |= 0x2;
-
-  if (feature_1)
-{
-  int 

[PATCH][AArch64] Skip arm targets in vq*shr*n_high_n intrinsic tests

2020-11-09 Thread David Candler via Gcc-patches
Hi,

These tests should be skipped for arm targets, as the intrinsics
are only supported on aarch64.

Tested on aarch64 and aarch32.

gcc/testsuite/ChangeLog

2020-11-09  David Candler  

* gcc.target/aarch64/advsimd-intrinsics/vqrshrn_high_n.c: Added skip
directive.
* gcc.target/aarch64/advsimd-intrinsics/vqrshrun_high_n.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vqshrn_high_n.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vqshrun_high_n.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqrshrn_high_n.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqrshrn_high_n.c
index d9add2908d1..6ebe0743cc4 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqrshrn_high_n.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqrshrn_high_n.c
@@ -1,3 +1,6 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
 #include 
 #include "arm-neon-ref.h"
 #include "compute-ref-data.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqrshrun_high_n.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqrshrun_high_n.c
index 1a3788cd14a..49d319d0181 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqrshrun_high_n.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqrshrun_high_n.c
@@ -1,3 +1,6 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
 #include 
 #include "arm-neon-ref.h"
 #include "compute-ref-data.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqshrn_high_n.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqshrn_high_n.c
index 72aecc15ba2..8d06f113dc8 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqshrn_high_n.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqshrn_high_n.c
@@ -1,3 +1,6 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
 #include 
 #include "arm-neon-ref.h"
 #include "compute-ref-data.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqshrun_high_n.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqshrun_high_n.c
index 4885c029d1a..e8235fe9693 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqshrun_high_n.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqshrun_high_n.c
@@ -1,3 +1,6 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { arm*-*-* } } */
+
 #include 
 #include "arm-neon-ref.h"
 #include "compute-ref-data.h"


Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-09 Thread H.J. Lu via Gcc-patches
On Mon, Nov 9, 2020 at 9:41 AM Jozef Lawrynowicz
 wrote:
>
> On Fri, Nov 06, 2020 at 04:39:33PM -0800, H.J. Lu via Gcc-patches wrote:
> > On Fri, Nov 6, 2020 at 4:17 PM Jeff Law  wrote:
> > >
> > >
> > > On 11/6/20 5:13 PM, H.J. Lu wrote:
> > > > On Fri, Nov 6, 2020 at 4:01 PM Jeff Law  wrote:
> > > >>
> > > >> On 11/6/20 4:45 PM, H.J. Lu wrote:
> > > >>> On Fri, Nov 6, 2020 at 3:37 PM Jeff Law  wrote:
> > >  On 11/6/20 4:29 PM, H.J. Lu wrote:
> > > > On Fri, Nov 6, 2020 at 3:22 PM Jeff Law  wrote:
> > > >> On 11/5/20 7:34 AM, H.J. Lu via Gcc-patches wrote:
> > > >>> On Thu, Nov 5, 2020 at 3:37 AM Jozef Lawrynowicz
> > > >>>  wrote:
> > >  On Thu, Nov 05, 2020 at 06:21:21AM -0500, Hans-Peter Nilsson 
> > >  wrote:
> > > > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > > >> .retain is ill-defined.   For example,
> > > >>
> > > >> [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> > > >> static int xyzzy __attribute__((__used__));
> > > >> [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> > > >> [hjl@gnu-cfl-2 gcc]$ cat x.s
> > > >> .file "x.c"
> > > >> .text
> > > >> .retain xyzzy  < What does it do?
> > > >> .local xyzzy
> > > >> .comm xyzzy,4,4
> > > >> .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> > > >> .section .note.GNU-stack,"",@progbits
> > > >> [hjl@gnu-cfl-2 gcc]$
> > > > To answer that question: it's up to the assembler, but for ELF
> > > > and SHF_GNU_RETAIN, it seems obvious it'd tell the assembler to
> > > > set SHF_GNU_RETAIN for the section where the symbol ends up.
> > > > We both know this isn't rocket science with binutils.
> > >  Indeed, and my patch handles it trivially:
> > >  https://sourceware.org/pipermail/binutils/2020-November/113993.html
> > > 
> > >    +void
> > >    +obj_elf_retain (int arg ATTRIBUTE_UNUSED)
> > >     snip 
> > >    +  sym = get_sym_from_input_line_and_check ();
> > >    +  symbol_get_obj (sym)->retain = 1;
> > > 
> > >    @@ -2624,6 +2704,9 @@ elf_frob_symbol (symbolS *symp, int 
> > >  *puntp)
> > >  }
> > > }
> > > 
> > >    +  if (symbol_get_obj (symp)->retain)
> > >    +elf_section_flags (S_GET_SEGMENT (symp)) |= 
> > >  SHF_GNU_RETAIN;
> > >    +
> > >   /* Double check weak symbols.  */
> > >   if (S_IS_WEAK (symp))
> > > {
> > > 
> > >  We could check that the symbol named in the .retain directive has
> > >  already been defined, however this isn't compatible with GCC
> > >  mark_decl_preserved handling, since mark_decl_preserved is called
> > >  emitted before the local symbols are defined in the assembly 
> > >  output
> > >  file.
> > > 
> > >  GAS should at least validate that the symbol named in the .retain
> > >  directive does end up as a symbol though.
> > > 
> > > >>> Don't add .retain.
> > > >> Why?  I don't see why you find it so objectionable.
> > > >>
> > > > An ELF symbol directive should operate on symbol table:
> > > >
> > > > http://www.sco.com/developers/gabi/latest/ch4.symtab.html
> > > >
> > > > not the section flags where the symbol is defined.
> > >  I agree in general, but I think this is one of those cases where it's
> > >  not so clear.  And what you're talking about is an implementation 
> > >  detail.
> > > >>> There is no need for such a hack.  The proper thing to do in ELF is
> > > >>> to place such a symbol in a section with SHF_GNU_RETAIN flag.   This
> > > >>> also avoids the question what to do with SHN_COMMON.
> > > >> I'm not sure that's a good idea either.  Moving symbols into a section
> > > >> other than they'd normally live doesn't seem all that wise.
> > > > In ELF, a symbol must be defined in a section.  If we want to keep a 
> > > > symbol,
> > > > we should place it in an SHF_GNU_RETAIN section.
> > >
> > > Again, that's an implementation detail and it's not clear to me that one
> > > approach is inherently better than the other.
> > >
> > >
> > > >
> > > >> Let's face it, there's not a great solution here.  If we mark its
> > > >> existing section, then everything in that section gets kept.  If we put
> > > > FWIW, this is what .retain direct does and is one reason why I object
> > > > it.
> > >
> > > > We could make .retain work with either approach.  I don't see .retain
> > > as a problem at all.
> > >
> > >
> > >
> > > >
> > > >> the object into a different section than it would normally live, then
> > > >> that opens a whole new can of worms.
> > > > We should place it in a section which it normally lives in and mark the
> > > > section with SHF_GNU_RETAIN.
> > >
> > > And why not do that 

[PATCH] [PR target/97727] aarch64: [testcase] fix bf16_vstN_lane_2.c for big endian targets

2020-11-09 Thread Andrea Corallo via Gcc-patches
Hi all,

This simple patch fixes PR target/97727.

Okay for trunk and gcc-10?

Thanks!

  Andrea

2020-11-09  Andrea Corallo  

PR target/97727
* gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c: Relax
regexps.

From 38abb583632b8b4b38304e0aabf270a42b80dcf7 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Mon, 9 Nov 2020 16:59:14 +0100
Subject: [PATCH] PR target/97727 aarch64: [testcase] fix bf16_vstN_lane_2.c
 for big endian targets

2020-11-09  Andrea Corallo  

PR target/97727
* gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c: Relax
regexps.
---
 .../aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c  | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c
index f70c34dbd83..822968df44f 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c
@@ -11,15 +11,13 @@ test_vst2_lane_bf16 (bfloat16_t *ptr, bfloat16x4x2_t b)
   vst2_lane_bf16 (ptr, b, 2);
 }
 
-/* { dg-final { scan-assembler-times "st2\\t{v2.h - v3.h}\\\[2\\\], \\\[x0\\\]" 1 } } */
-
 void
 test_vst2q_lane_bf16 (bfloat16_t *ptr, bfloat16x8x2_t b)
 {
   vst2q_lane_bf16 (ptr, b, 2);
 }
 
-/* { dg-final { scan-assembler-times "st2\\t{v0.h - v1.h}\\\[2\\\], \\\[x0\\\]" 1 } } */
+/* { dg-final { scan-assembler-times "st2\\t{v\[0-9\]+.h - v\[0-9\]+.h}\\\[2\\\], \\\[x\[0-9\]+\\\]" 2 } } */
 
 void
 test_vst3_lane_bf16 (bfloat16_t *ptr, bfloat16x4x3_t b)
@@ -33,7 +31,7 @@ test_vst3q_lane_bf16 (bfloat16_t *ptr, bfloat16x8x3_t b)
   vst3q_lane_bf16 (ptr, b, 2);
 }
 
-/* { dg-final { scan-assembler-times "st3\\t{v4.h - v6.h}\\\[2\\\], \\\[x0\\\]" 2 } } */
+/* { dg-final { scan-assembler-times "st3\\t{v\[0-9\]+.h - v\[0-9\]+.h}\\\[2\\\], \\\[x\[0-9\]+\\\]" 2 } } */
 
 void
 test_vst4_lane_bf16 (bfloat16_t *ptr, bfloat16x4x4_t b)
@@ -41,12 +39,10 @@ test_vst4_lane_bf16 (bfloat16_t *ptr, bfloat16x4x4_t b)
   vst4_lane_bf16 (ptr, b, 2);
 }
 
-/* { dg-final { scan-assembler-times "st4\\t{v4.h - v7.h}\\\[2\\\], \\\[x0\\\]" 1 } } */
-
 void
 test_vst4q_lane_bf16 (bfloat16_t *ptr, bfloat16x8x4_t b)
 {
   vst4q_lane_bf16 (ptr, b, 2);
 }
 
-/* { dg-final { scan-assembler-times "st4\\t{v0.h - v3.h}\\\[2\\\], \\\[x0\\\]" 1 } } */
+/* { dg-final { scan-assembler-times "st4\\t{v\[0-9\]+.h - v\[0-9\]+.h}\\\[2\\\], \\\[x\[0-9\]+\\\]" 2 } } */
-- 
2.20.1



[PATCH] rs6000.c DECL_IS_BUILTIN bootstrap fix

2020-11-09 Thread David Edelsohn via Gcc-patches
rs6000: Fix bootstrap after r11-4793.

The patch omitted a change for rs6000.c, fixed thus.

gcc/ChangeLog:

* config/rs6000/rs6000.c
(rs6000_mangle_decl_assembler_name): Change DECL_IS_BUILTIN
-> DECL_IS_UNDECLARED_BUILTIN.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 63f1c06c01b..d7dcd93f088 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -27084,7 +27084,8 @@ static tree
 rs6000_mangle_decl_assembler_name (tree decl, tree id)
 {
   if (!TARGET_IEEEQUAD_DEFAULT && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
-  && TREE_CODE (decl) == FUNCTION_DECL && DECL_IS_BUILTIN (decl) )
+  && TREE_CODE (decl) == FUNCTION_DECL
+  && DECL_IS_UNDECLARED_BUILTIN (decl))
 {
   size_t len = IDENTIFIER_LENGTH (id);
   const char *name = IDENTIFIER_POINTER (id);


Re: [PATCH] openmp: Retire nest-var ICV

2020-11-09 Thread Kwok Cheung Yeung

On 06/11/2020 8:33 pm, Tobias Burnus wrote:
> Hello Kwok, hi Jakub,
>
> On 06.11.20 21:13, Kwok Cheung Yeung wrote:
> > In addition to deprecating the omp_(get|set)_nested() functions and
> > OMP_NESTED environment variable, OpenMP 5.0 also removes the nest-var ICV
> > altogether, defining it in terms of the max-active-levels-var ICV instead.
> > [...]
>
> Shouldn't libgomp/libgomp.texi be also updated?
>
> Tobias


I have added some documentation regarding the relationship between the nesting
setting and the current maximum number of active levels. The documentation does
not detail ICVs though, so we probably don't need to explicitly state that one
is defined in terms of another?
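
To make that relationship concrete, here is a minimal standalone sketch
(not the libgomp code). All sketch_* names and the SKETCH_SUPPORTED_ACTIVE_LEVELS
value are hypothetical. The default of 1 and the "> 1" test mirror the patch;
the behaviour of the setter is an assumption based on a reading of OpenMP 5.0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the implementation's limit on active
   levels; the real value in libgomp may differ.  */
#define SKETCH_SUPPORTED_ACTIVE_LEVELS 255UL

/* Mirrors the patch: nesting is off by default, i.e. one active level.  */
static unsigned long sketch_max_active_levels = 1;

/* Nesting is "enabled" exactly when more than one level may be active.  */
static bool
sketch_get_nested (void)
{
  return sketch_max_active_levels > 1;
}

/* Assumed semantics: enabling raises the limit to the supported
   maximum; disabling clamps it back to one level and leaves an
   already-disabled setting untouched.  */
static void
sketch_set_nested (bool enable)
{
  if (enable)
    sketch_max_active_levels = SKETCH_SUPPORTED_ACTIVE_LEVELS;
  else if (sketch_max_active_levels > 1)
    sketch_max_active_levels = 1;
}
```

With this model there is no separate boolean to keep in sync: the answer to
"is nesting on?" is always derived from the level count.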


Is this version okay for trunk?

Thanks

Kwok
commit b4feb16f3c84b8f82163a4cbba6a31d55fbb8e5b
Author: Kwok Cheung Yeung 
Date:   Mon Nov 9 09:34:39 2020 -0800

openmp: Retire nest-var ICV for OpenMP 5.0

This removes the nest-var ICV, expressing nesting in terms of the
max-active-levels-var ICV instead.

2020-11-09  Kwok Cheung Yeung  

libgomp/
* env.c (gomp_global_icv): Remove nest_var field.
(gomp_max_active_levels_var): Initialize to 1.
(parse_boolean): Return true on success.
(handle_omp_display_env): Express OMP_NESTED in terms of
gomp_max_active_levels_var.
(initialize_env): Set gomp_max_active_levels_var from
OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
OMP_PROC_BIND.
* icv.c (omp_set_nested): Express in terms of
gomp_max_active_levels_var.
(omp_get_nested): Likewise.
* libgomp.h (struct gomp_task_icv): Remove nest_var field.
* libgomp.texi (omp_get_nested): Update documentation.
(omp_set_nested): Likewise.
(OMP_MAX_ACTIVE_LEVELS): Likewise.
(OMP_NESTED): Likewise.
(OMP_NUM_THREADS): Likewise.
(OMP_PROC_BIND): Likewise.
* parallel.c (gomp_resolve_num_threads): Replace reference
to nest_var with gomp_max_active_levels_var.
* testsuite/libgomp.c/target-5.c: Remove additional options.
(main): Remove references to omp_get_nested and omp_set_nested.
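
For readers skimming the ChangeLog, the parse_boolean change can be pictured
with the standalone sketch below. It is an illustration under assumptions,
not the libgomp source: sketch_parse_boolean is a hypothetical name, it takes
the string directly instead of calling getenv, and the case-insensitive
"true"/"false" matching is assumed because that part of the function is not
shown in the diff. The point is the return value, which lets a caller tell
"unset or invalid" apart from "explicitly set".

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Parse ENV as a boolean and store it in *VALUE.  Return true only
   if ENV was present and consisted of "true" or "false" (any case),
   optionally surrounded by whitespace.  As in the diff, *VALUE may
   already have been written when trailing garbage makes the
   function return false.  */
static bool
sketch_parse_boolean (const char *env, bool *value)
{
  if (env == NULL)
    return false;

  while (isspace ((unsigned char) *env))
    ++env;

  if (strncasecmp (env, "true", 4) == 0)
    {
      *value = true;
      env += 4;
    }
  else if (strncasecmp (env, "false", 5) == 0)
    {
      *value = false;
      env += 5;
    }
  else
    return false;

  while (isspace ((unsigned char) *env))
    ++env;

  /* Anything left over means the value was malformed.  */
  return *env == '\0';
}
```

A caller can then write something like "if (sketch_parse_boolean (s, &nested))"
and only adjust the maximum number of active levels when the variable was
really given.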

diff --git a/libgomp/env.c b/libgomp/env.c
index ab22525..75d0fe2 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -68,12 +68,11 @@ struct gomp_task_icv gomp_global_icv = {
   .run_sched_chunk_size = 1,
   .default_device_var = 0,
   .dyn_var = false,
-  .nest_var = false,
   .bind_var = omp_proc_bind_false,
   .target_data = NULL
 };
 
-unsigned long gomp_max_active_levels_var = gomp_supported_active_levels;
+unsigned long gomp_max_active_levels_var = 1;
 bool gomp_cancel_var = false;
 enum gomp_target_offload_t gomp_target_offload_var
   = GOMP_TARGET_OFFLOAD_DEFAULT;
@@ -959,16 +958,17 @@ parse_spincount (const char *name, unsigned long long *pvalue)
 }
 
 /* Parse a boolean value for environment variable NAME and store the
-   result in VALUE.  */
+   result in VALUE.  Return true if one was present and it was
+   successfully parsed.  */
 
-static void
+static bool
 parse_boolean (const char *name, bool *value)
 {
   const char *env;
 
   env = getenv (name);
   if (env == NULL)
-return;
+return false;
 
   while (isspace ((unsigned char) *env))
 ++env;
@@ -987,7 +987,11 @@ parse_boolean (const char *name, bool *value)
   while (isspace ((unsigned char) *env))
 ++env;
   if (*env != '\0')
-gomp_error ("Invalid value for environment variable %s", name);
+{
+  gomp_error ("Invalid value for environment variable %s", name);
+  return false;
+}
+  return true;
 }
 
 /* Parse the OMP_WAIT_POLICY environment variable and return the value.  */
@@ -1252,7 +1256,7 @@ handle_omp_display_env (unsigned long stacksize, int wait_policy)
   fprintf (stderr, "  OMP_DYNAMIC = '%s'\n",
   gomp_global_icv.dyn_var ? "TRUE" : "FALSE");
   fprintf (stderr, "  OMP_NESTED = '%s'\n",
-  gomp_global_icv.nest_var ? "TRUE" : "FALSE");
+  gomp_max_active_levels_var > 1 ? "TRUE" : "FALSE");
 
   fprintf (stderr, "  OMP_NUM_THREADS = '%lu", gomp_global_icv.nthreads_var);
   for (i = 1; i < gomp_nthreads_var_list_len; i++)
@@ -1417,16 +1421,11 @@ initialize_env (void)
 
   parse_schedule ();
   parse_boolean ("OMP_DYNAMIC", &gomp_global_icv.dyn_var);
-  parse_boolean ("OMP_NESTED", &gomp_global_icv.nest_var);
   parse_boolean ("OMP_CANCELLATION", &gomp_cancel_var);
   parse_boolean ("OMP_DISPLAY_AFFINITY", &gomp_display_affinity_var);
   parse_int ("OMP_DEFAULT_DEVICE", &gomp_global_icv.default_device_var, true);
   parse_target_offload ("OMP_TARGET_OFFLOAD", &gomp_target_offload_var);
   parse_int ("OMP_MAX_TASK_PRIORITY", &gomp_max_task_priority_var, true);
-  parse_unsigned_long ("OMP_MAX_ACTIVE_LEVELS", &gomp_max_active_levels_var,
-  true);
-  if (gomp_max_active_levels_var > gomp_supported_active_levels)
-gomp_max_active_levels_var = gomp_supported_active_levels;
   gomp_def_allocator = parse_allocator ();
   if (parse_unsigned_long ("OMP_THREAD_LIMIT", 

Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-09 Thread Jozef Lawrynowicz
On Fri, Nov 06, 2020 at 04:39:33PM -0800, H.J. Lu via Gcc-patches wrote:
> On Fri, Nov 6, 2020 at 4:17 PM Jeff Law  wrote:
> >
> >
> > On 11/6/20 5:13 PM, H.J. Lu wrote:
> > > On Fri, Nov 6, 2020 at 4:01 PM Jeff Law  wrote:
> > >>
> > >> On 11/6/20 4:45 PM, H.J. Lu wrote:
> > >>> On Fri, Nov 6, 2020 at 3:37 PM Jeff Law  wrote:
> >  On 11/6/20 4:29 PM, H.J. Lu wrote:
> > > On Fri, Nov 6, 2020 at 3:22 PM Jeff Law  wrote:
> > >> On 11/5/20 7:34 AM, H.J. Lu via Gcc-patches wrote:
> > >>> On Thu, Nov 5, 2020 at 3:37 AM Jozef Lawrynowicz
> > >>>  wrote:
> >  On Thu, Nov 05, 2020 at 06:21:21AM -0500, Hans-Peter Nilsson wrote:
> > > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > >> .retain is ill-defined.   For example,
> > >>
> > >> [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> > >> static int xyzzy __attribute__((__used__));
> > >> [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> > >> [hjl@gnu-cfl-2 gcc]$ cat x.s
> > >> .file "x.c"
> > >> .text
> > >> .retain xyzzy  < What does it do?
> > >> .local xyzzy
> > >> .comm xyzzy,4,4
> > >> .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> > >> .section .note.GNU-stack,"",@progbits
> > >> [hjl@gnu-cfl-2 gcc]$
> > > To answer that question: it's up to the assembler, but for ELF
> > > and SHF_GNU_RETAIN, it seems obvious it'd tell the assembler to
> > > set SHF_GNU_RETAIN for the section where the symbol ends up.
> > > We both know this isn't rocket science with binutils.
> >  Indeed, and my patch handles it trivially:
> >  https://sourceware.org/pipermail/binutils/2020-November/113993.html
> > 
> >    +void
> >    +obj_elf_retain (int arg ATTRIBUTE_UNUSED)
> >     snip 
> >    +  sym = get_sym_from_input_line_and_check ();
> >    +  symbol_get_obj (sym)->retain = 1;
> > 
> >    @@ -2624,6 +2704,9 @@ elf_frob_symbol (symbolS *symp, int *puntp)
> >  }
> > }
> > 
> >    +  if (symbol_get_obj (symp)->retain)
> >    +elf_section_flags (S_GET_SEGMENT (symp)) |= SHF_GNU_RETAIN;
> >    +
> >   /* Double check weak symbols.  */
> >   if (S_IS_WEAK (symp))
> > {
> > 
> >  We could check that the symbol named in the .retain directive has
> >  already been defined, however this isn't compatible with GCC
> >  mark_decl_preserved handling, since mark_decl_preserved is called
> >  emitted before the local symbols are defined in the assembly output
> >  file.
> > 
> >  GAS should at least validate that the symbol named in the .retain
> >  directive does end up as a symbol though.
> > 
> > >>> Don't add .retain.
> > >> Why?  I don't see why you find it so objectionable.
> > >>
> > > An ELF symbol directive should operate on symbol table:
> > >
> > > http://www.sco.com/developers/gabi/latest/ch4.symtab.html
> > >
> > > not the section flags where the symbol is defined.
> >  I agree in general, but I think this is one of those cases where it's
> >  not so clear.  And what you're talking about is an implementation 
> >  detail.
> > >>> There is no need for such a hack.  The proper thing to do in ELF is
> > >>> to place such a symbol in a section with SHF_GNU_RETAIN flag.   This
> > >>> also avoids the question what to do with SHN_COMMON.
> > >> I'm not sure that's a good idea either.  Moving symbols into a section
> > >> other than they'd normally live doesn't seem all that wise.
> > > In ELF, a symbol must be defined in a section.  If we want to keep a 
> > > symbol,
> > > we should place it in an SHF_GNU_RETAIN section.
> >
> > Again, that's an implementation detail and it's not clear to me that one
> > approach is inherently better than the other.
> >
> >
> > >
> > >> Let's face it, there's not a great solution here.  If we mark its
> > >> existing section, then everything in that section gets kept.  If we put
> > > FWIW, this is what .retain direct does and is one reason why I object
> > > it.
> >
> > We could make .retain work with either approach.  I don't see .retain
> > as a problem at all.
> >
> >
> >
> > >
> > >> the object into a different section than it would normally live, then
> > >> that opens a whole new can of worms.
> > > We should place it in a section which it normally lives in and mark the
> > > section with SHF_GNU_RETAIN.
> >
> > And why not do that with .retain?  We define its semantics as precisely
> 
> But the .retain directive implementation being discussed here is different.
> One problem with the .retain directive is we can have
> 
> .section .data
> foo:
> ...
> bar:
> 
> .retain bar
> ...
> xxx:
> ...
> 
> What should assembler do with ".retain bar"?
> 
> > what you've 

Re: [PATCH v2] c++: DR 1914 - Allow duplicate standard attributes.

2020-11-09 Thread Marek Polacek via Gcc-patches
On Fri, Nov 06, 2020 at 03:01:56PM -0500, Jason Merrill via Gcc-patches wrote:
> On 11/6/20 2:34 PM, Marek Polacek wrote:
> > On Fri, Nov 06, 2020 at 02:23:10PM -0500, Jason Merrill via Gcc-patches 
> > wrote:
> > > On 11/6/20 2:06 PM, Marek Polacek wrote:
> > > > Following Joseph's change for C to allow duplicate C2x standard 
> > > > attributes
> > > > ,
> > > > this patch does a similar thing for C++.  This is DR 1914, to be 
> > > > resolved by
> > > > , which is not part of the standard yet, but has a wide
> > > > support so look like a shoo-in.  Some duplications still produce 
> > > > warnings;
> > > > I didn't change that because a warning might be desirable.
> > > 
> > > What's the rationale for warning about some and not others?
> > 
> > I don't have any.  Joseph's patch removed the error for a duplicated
> > 'fallthrough' attribute, but the warning remained so I left it as-is
> > too.
> > 
> > So either we just downgrade the error to a warning, or remove the
> > remaining warnings too.  I think I slightly prefer the former; with perhaps
> > a small tweak not to warn when the duplicated attribute comes from a macro
> > expansion.
> 
> Sounds good.

Here's a patch that does that.  cp_parser_check_std_attribute now handles
more standard attributes.  There's still a discrepancy that we warn about
[[fallthrough]] [[fallthrough]] but not about [[noreturn]] [[noreturn]].
Fixing this would involve calling cp_parser_check_std_attribute in another
spot (in a loop), presumably.  But I suspect that's less important for now.

Thanks,

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Following Joseph's change for C to allow duplicate C2x standard attributes
,
this patch does a similar thing for C++.  This is DR 1914, to be resolved by
, which is not part of the standard yet, but has wide
support so looks like a shoo-in.  The duplications now produce warnings
instead, but only if the attribute wasn't specified via a macro.

gcc/c-family/ChangeLog:

DR 1914
* c-common.c (attribute_fallthrough_p): Tweak the warning
message.

gcc/cp/ChangeLog:

DR 1914
* parser.c (cp_parser_check_std_attribute): Return bool.  Add a
location_t parameter.  Return true if the attribute wasn't duplicated.
Give a warning instead of an error.  Check more attributes.
(cp_parser_std_attribute_list): Don't add duplicated attributes to
the list.  Pass location to cp_parser_check_std_attribute.

gcc/testsuite/ChangeLog:

DR 1914
* c-c++-common/attr-fallthrough-2.c: Adjust dg-warning.
* g++.dg/cpp0x/fallthrough2.C: Likewise.
* g++.dg/cpp0x/gen-attrs-60.C: Turn dg-error into dg-warning.
* g++.dg/cpp1y/attr-deprecated-2.C: Likewise.
* g++.dg/cpp2a/attr-likely2.C: Adjust dg-warning.
* g++.dg/cpp2a/nodiscard-once.C: Turn dg-error into dg-warning.
* g++.dg/cpp0x/gen-attrs-72.C: New test.
---
 gcc/c-family/c-common.c   |  5 +-
 gcc/cp/parser.c   | 51 ++-
 .../c-c++-common/attr-fallthrough-2.c |  2 +-
 gcc/testsuite/g++.dg/cpp0x/fallthrough2.C |  2 +-
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C |  2 +-
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C | 45 
 .../g++.dg/cpp1y/attr-deprecated-2.C  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/attr-likely2.C |  2 +-
 gcc/testsuite/g++.dg/cpp2a/nodiscard-once.C   |  2 +-
 9 files changed, 81 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index d4d3228b8f6..29508bca97b 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5752,9 +5752,10 @@ attribute_fallthrough_p (tree attr)
   tree t = lookup_attribute ("fallthrough", attr);
   if (t == NULL_TREE)
 return false;
-  /* This attribute shall appear at most once in each attribute-list.  */
+  /* It is no longer true that "this attribute shall appear at most once in
+ each attribute-list", but we still give a warning.  */
   if (lookup_attribute ("fallthrough", TREE_CHAIN (t)))
-warning (OPT_Wattributes, "%<fallthrough%> attribute specified multiple "
+warning (OPT_Wattributes, "attribute %<fallthrough%> specified multiple "
 "times");
   /* No attribute-argument-clause shall be present.  */
   else if (TREE_VALUE (t) != NULL_TREE)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 323d7424a83..6fcee3efe6f 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -27269,30 +27269,30 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
   return attribute;
 }
 
-/* Check that the attribute ATTRIBUTE appears at most once in the
-   attribute-list ATTRIBUTES.  This is enforced for noreturn (7.6.3),
-   nodiscard, and deprecated (7.6.5).  Note 

Re: [Patch] opts: Change `is incompatible with` messages to have standard parametrised form

2020-11-09 Thread Jeff Law via Gcc-patches


On 11/9/20 8:22 AM, Matthew Malcomson via Gcc-patches wrote:
> Hello,
>
> In a recent review for one of the hwasan patches Richard S. noticed there are
> quite a few errors of the form "% is incompatible with
> ".
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556137.html
>
> In order to avoid this creating extra work for translators we would like to
> change these error messages to use the form "%qs is incompatible with %qs" and
> pass the flag as format arguments.
>
> This patch implements that change.
> There is only one change in the output the compiler produces from this patch,
> an error message of "-fsanitize=address and -fsanitize=kernel-address are
> incompatible with -fsanitize=thread" has been changed to "-fsanitize=thread is
> incompatible with -fsanitize=address|kernel-address".
> This matches the similar error messages for live patching which use the
> messages "-f is incompatible with
> -flive-patching=inline-only-static|inline-clone".
>
> Bootstrapped and regtested on AArch64 without any problems.
> Ok for trunk?
>
> gcc/ChangeLog:
>
>   * opts.c (control_options_for_live_patching): Reform 'is incompatible
>   with' error messages to use a standard message with differing format
>   arguments.
>   (finish_options): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>   * c-c++-common/ubsan/sanitize-recover-7.c: Update testcase.

Given how mechanical this is, I only spot-checked the changes.  If you
find more, consider similar changes pre-approved.


OK for the trunk.

jeff




[PATCH] c++: Fix -Wvexing-parse ICE with omitted int [PR97762]

2020-11-09 Thread Marek Polacek via Gcc-patches
For declarations like

  long f();

decl_specifiers->type will be NULL, but I neglected to handle this case,
therefore we ICE.  So handle this case by pretending we've seen 'int',
which is good enough for -Wvexing-parse's purposes.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97762
* parser.c (warn_about_ambiguous_parse): Handle the case when
there is no type in the decl-specifiers.

gcc/testsuite/ChangeLog:

PR c++/97762
* g++.dg/warn/Wvexing-parse8.C: New test.
---
 gcc/cp/parser.c| 23 --
 gcc/testsuite/g++.dg/warn/Wvexing-parse8.C | 11 +++
 2 files changed, 28 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wvexing-parse8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index bbf157eb47f..b14b4c90c92 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -20652,13 +20652,24 @@ warn_about_ambiguous_parse (const 
cp_decl_specifier_seq *decl_specifiers,
   if (declarator->parenthesized != UNKNOWN_LOCATION)
 return;
 
-  tree type = decl_specifiers->type;
-  if (TREE_CODE (type) == TYPE_DECL)
-   type = TREE_TYPE (type);
+  tree type;
+  if (decl_specifiers->type)
+{
+  type = decl_specifiers->type;
+  if (TREE_CODE (type) == TYPE_DECL)
+   type = TREE_TYPE (type);
 
-  /* If the return type is void there is no ambiguity.  */
-  if (same_type_p (type, void_type_node))
-return;
+  /* If the return type is void there is no ambiguity.  */
+  if (same_type_p (type, void_type_node))
+   return;
+}
+  else
+{
+  /* Code like long f(); will have null ->type.  If we have any
+type-specifiers, pretend we've seen int.  */
+  gcc_checking_assert (decl_specifiers->any_type_specifiers_p);
+  type = integer_type_node;
+}
 
   auto_diagnostic_group d;
   location_t loc = declarator->u.function.parens_loc;
diff --git a/gcc/testsuite/g++.dg/warn/Wvexing-parse8.C 
b/gcc/testsuite/g++.dg/warn/Wvexing-parse8.C
new file mode 100644
index 000..2d26d22fc4b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wvexing-parse8.C
@@ -0,0 +1,11 @@
+// PR c++/97762
+// { dg-do compile }
+
+void
+g ()
+{
+  long a(); // { dg-warning "empty parentheses" }
+  signed b(); // { dg-warning "empty parentheses" }
+  unsigned c(); // { dg-warning "empty parentheses" }
+  short d(); // { dg-warning "empty parentheses" }
+}

base-commit: 4e85ad79a137535393d8dc169359e1730cab3533
-- 
2.28.0



Re: [PATCH] c-family: Avoid unnecessary work when -Wpragmas is being ignored

2020-11-09 Thread Jeff Law via Gcc-patches


On 11/9/20 8:38 AM, Patrick Palka via Gcc-patches wrote:
> This speeds up handle_pragma_diagnostic by avoiding computing a spelling
> suggestion for an unrecognized option inside a #pragma directive when
> -Wpragmas warnings are being suppressed.
>
> In the range-v3 library, which contains many instances of
>
>   #pragma GCC diagnostic push
>   #pragma GCC diagnostic ignored "-Wpragmas"
>   #pragma GCC diagnostic ignored "-Wfoo"
>   ...
>   #pragma GCC diagnostic pop
>
> compile time is reduced by 33% in some of its tests.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
>
> gcc/c-family/ChangeLog:
>
>   * c-pragma.c (handle_pragma_diagnostic): Split the
>   unknown-option -Wpragmas diagnostic into a warning and a
>   subsequent note containing a spelling suggestion.  Avoid
>   computing the suggestion if -Wpragmas warnings are being
>   suppressed.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.dg/pragma-diag-6.c: Adjust expected diagnostics
>   appropriately.

OK

jeff




Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-11-09 Thread Martin Sebor via Gcc-patches

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html

Jeff, I don't expect to have the cycles to reimplement this patch
using the Ranger APIs before stage 1 closes.  I'm open to giving
it a try in stage 3 if it's still in scope for GCC 11.  Otherwise,
is this patch okay to commit?

On 9/21/20 9:13 AM, Martin Sebor wrote:

On 9/20/20 12:39 AM, Aldy Hernandez wrote:



On 9/19/20 11:22 PM, Martin Sebor wrote:

On 9/18/20 12:29 AM, Aldy Hernandez wrote:



On 9/17/20 10:18 PM, Martin Sebor wrote:

On 9/17/20 12:39 PM, Andrew MacLeod wrote:

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in 
calls to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose 
overflow/wraparound.

* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, 
when my son was asking me about attack vectors, this is one I said I'd 
look at if I were a bad guy.



I'm a bit surprised you can't just query the range of the 
argument and get the overflow status out of that range, but I 
don't see that in the APIs.  How painful would it be to make 
that part of the API? The conceptual model would be to just ask 
for the range of the argument to malloc which would include the 
range and a status bit indicating the computation might have 
overflowed.


Do we know if it did/would have wrapped?  Sure, since we have to 
do the math.  So you are correct in that the information is 
there.  But is it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF, which means you get an overflow for unsigned when you 
subtract one.  From what I have seen of unsigned math, we would 
be flagging very many operations as overflows, so you would still 
have the difficulty of figuring out whether it's a "real" overflow 
or a fake one because of the way we do unsigned math.


You and me both :)



At the very start, I did have an overflow flag in the range 
class... but it was turning out to be fairly useless so it was 
removed.



I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


So again, we get back to adding 0x0f when we are trying to 
subtract one...  Now, with infinite precision you are going to see

  [2,10] - 1  we end up with [2,10]+0xFF, which will now 
give you [0x10001, 0x10009], so it's going to look like 
it overflowed?
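
The arithmetic above can be made concrete with a small C++ sketch.  It
assumes a 16-bit unsigned type, so that the all-ones value is 0xFFFF and
the wide sums come out as the [0x10001, 0x10009] bounds mentioned above.
The helper names add_wide and add_wrapped are hypothetical and are not
part of GCC or the ranger API.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for "infinite precision": do the 16-bit addition in 32 bits so
// the carry out of the top bit is kept instead of being discarded.
constexpr std::uint32_t add_wide (std::uint16_t a, std::uint16_t b)
{
  return static_cast<std::uint32_t> (a) + static_cast<std::uint32_t> (b);
}

// The same addition truncated back to the 16-bit type.
constexpr std::uint16_t add_wrapped (std::uint16_t a, std::uint16_t b)
{
  return static_cast<std::uint16_t> (add_wide (a, b));
}

// "x - 1" expressed as "x + 0xFFFF" (assumed 16-bit all-ones constant).
constexpr std::uint16_t minus_one = 0xFFFF;

static_assert (add_wide (2, minus_one) == 0x10001, "wide lower bound");
static_assert (add_wide (10, minus_one) == 0x10009, "wide upper bound");
static_assert (add_wrapped (2, minus_one) == 1, "wrapped lower bound");
static_assert (add_wrapped (10, minus_one) == 9, "wrapped upper bound");
```

In the wide view every result carries out of bit 15, so a naive overflow
flag would fire even though the wrapped result [1, 9] is exactly what the
subtraction was meant to produce.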





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
      n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 
0x3fffc] is too large to represent in ‘long unsigned 
int’ [-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.
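
For comparison, the same wraparound can be caught at run time in user
code.  The following is only a sketch of that idea, not what the patch
does (the patch works on value ranges at compile time).  It relies on the
GCC/Clang built-in __builtin_mul_overflow; checked_alloc_size and
alloc_array are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns true and stores n * elem_size in *bytes when the product fits in
// size_t; returns false when the multiplication would wrap around.
inline bool checked_alloc_size (std::size_t n, std::size_t elem_size,
                                std::size_t *bytes)
{
  return !__builtin_mul_overflow (n, elem_size, bytes);
}

// Allocation helper that refuses requests whose byte count would wrap.
inline void *alloc_array (std::size_t n, std::size_t elem_size)
{
  std::size_t bytes;
  if (!checked_alloc_size (n, elem_size, &bytes))
    return nullptr;
  return std::malloc (bytes);
}
```

A call such as alloc_array (SIZE_MAX / 2, 4) returns a null pointer
instead of allocating a buffer far smaller than the caller expects.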


Everything is exposed in range-ops.  Well, mostly.
if we have _1 = _6 * 4

If one wanted to do that in infinite precision, you query the range 
for _6, and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

You could take their current types, and cast those ranges to 
whatever the next higher precision is,

range_cast  (op1r, highertype)

Re: [PATCH] Prefer bit-test over the jump table.

2020-11-09 Thread Jeff Law via Gcc-patches


On 11/9/20 7:24 AM, Martin Liška wrote:
> Hello.
>
> As mentioned in the PR, we used to prefer BT over JT in switch expansion.
> I restore the behavior to that.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
>     PR tree-optimization/97736
>     * tree-switch-conversion.c
> (switch_decision_tree::analyze_switch_statement):
>     Prefer bit tests.
>
> gcc/testsuite/ChangeLog:
>
>     PR tree-optimization/97736
>     * gcc.dg/tree-ssa/switch-1.c: Prefer bit tests.
>     * g++.dg/tree-ssa/pr97736.C: New test.

[ ... ]


> ---
> diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
> index 426462e856b..a7c5df31743 100644
> --- a/gcc/tree-switch-conversion.c
> +++ b/gcc/tree-switch-conversion.c
> @@ -1743,8 +1743,8 @@ switch_decision_tree::analyze_switch_statement ()
>  
>    reset_out_edges_aux (m_switch);
>  
> -  /* Find jump table clusters.  */
> -  vec output = jump_table_cluster::find_jump_tables
> (clusters);
> +  /* Find bit-test clusters.  */
> +  vec output = bit_test_cluster::find_bit_tests (clusters);
>  
>    /* Find bit test clusters.  */

                ^^^


Doesn't the "bit test clusters" need to change too?  OK with that nit fixed.



jeff



Ping^2 Re: float.h: C2x NaN and Inf macros

2020-11-09 Thread Joseph Myers
Ping^2.  This patch is still pending review (the DFP sNaN followup has been
approved).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [24/32] module mapper

2020-11-09 Thread Nathan Sidwell

On 11/9/20 1:42 AM, Boris Kolpackov wrote:

I've noticed the following issues with the module mapper in the
-fdirectives-only mode:


1. When partially preprocessing the module interface unit, the mapper
receives the MODULE-EXPORT request that's unnecessary (BMI is not
written):

g++ ... -x c++ -E -fdirectives-only -o hello.gcm.ii hello.mxx

Similarly, in this mode, the mapper receives MODULE-IMPORT for
(non-header) module imports. Again, this is not necessary and
replying with a non-existent BMI works.


2. When doing full preprocessing of a partially preprocessed unit,
the mapper again receives MODULE-EXPORT and MODULE-IMPORT for
non-header modules:

g++-m ... -E -x c++ -fpreprocessed -fdirectives-only hello.gcm.ii

These are also unnecessary.


These are needed as they also serve to inform the mapper of a dependency 
edge.


nathan

--
Nathan Sidwell


[PATCH] c-family: Avoid unnecessary work when -Wpragmas is being ignored

2020-11-09 Thread Patrick Palka via Gcc-patches
This speeds up handle_pragma_diagnostic by avoiding computing a spelling
suggestion for an unrecognized option inside a #pragma directive when
-Wpragmas warnings are being suppressed.

In the range-v3 library, which contains many instances of

  #pragma GCC diagnostic push
  #pragma GCC diagnostic ignored "-Wpragmas"
  #pragma GCC diagnostic ignored "-Wfoo"
  ...
  #pragma GCC diagnostic pop

compile time is reduced by 33% in some of its tests.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/c-family/ChangeLog:

* c-pragma.c (handle_pragma_diagnostic): Split the
unknown-option -Wpragmas diagnostic into a warning and a
subsequent note containing a spelling suggestion.  Avoid
computing the suggestion if -Wpragmas warnings are being
suppressed.

gcc/testsuite/ChangeLog:

* gcc.dg/pragma-diag-6.c: Adjust expected diagnostics
appropriately.
---
 gcc/c-family/c-pragma.c  | 19 +--
 gcc/testsuite/gcc.dg/pragma-diag-6.c |  9 ++---
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index dc52ee8b003..d68985ca277 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -809,16 +809,15 @@ handle_pragma_diagnostic(cpp_reader *ARG_UNUSED(dummy))
   unsigned int option_index = find_opt (option_string + 1, lang_mask);
   if (option_index == OPT_SPECIAL_unknown)
 {
-  option_proposer op;
-  const char *hint = op.suggest_option (option_string + 1);
-  if (hint)
-   warning_at (loc, OPT_Wpragmas,
-   "unknown option after %<#pragma GCC diagnostic%> kind;"
-   " did you mean %<-%s%>?", hint);
-  else
-   warning_at (loc, OPT_Wpragmas,
-   "unknown option after %<#pragma GCC diagnostic%> kind");
-
+  auto_diagnostic_group d;
+  if (warning_at (loc, OPT_Wpragmas,
+ "unknown option after %<#pragma GCC diagnostic%> kind"))
+   {
+ option_proposer op;
+ const char *hint = op.suggest_option (option_string + 1);
+ if (hint)
+   inform (loc, "did you mean %<-%s%>?", hint);
+   }
   return;
 }
   else if (!(cl_options[option_index].flags & CL_WARNING))
diff --git a/gcc/testsuite/gcc.dg/pragma-diag-6.c 
b/gcc/testsuite/gcc.dg/pragma-diag-6.c
index 0dca1dc1ef4..f2df88d245b 100644
--- a/gcc/testsuite/gcc.dg/pragma-diag-6.c
+++ b/gcc/testsuite/gcc.dg/pragma-diag-6.c
@@ -2,7 +2,10 @@
 #pragma GCC diagnostic error "-Wnoexcept" /* { dg-warning "is valid for 
C../ObjC.. but not for C" } */
 #pragma GCC diagnostic error "-fstrict-aliasing" /* { dg-warning "not an 
option that controls warnings" } */
 #pragma GCC diagnostic error "-Werror" /* { dg-warning "not an option that 
controls warnings" } */
-#pragma GCC diagnostic error "-Wvla2" /* { dg-warning "unknown option after 
'#pragma GCC diagnostic' kind; did you mean '-Wvla'" } */
-#pragma GCC diagnostic error "-Walla" /* { dg-warning "unknown option after 
'#pragma GCC diagnostic' kind; did you mean '-Wall'" } */
-#pragma GCC diagnostic warning "-Walla" /* { dg-warning "unknown option after 
'#pragma GCC diagnostic' kind; did you mean '-Wall'" } */
+#pragma GCC diagnostic error "-Wvla2" /* { dg-warning "unknown option after 
'#pragma GCC diagnostic' kind" } */
+/* { dg-message "did you mean '-Wvla'" "" { target *-*-* } .-1 } */
+#pragma GCC diagnostic error "-Walla" /* { dg-warning "unknown option after 
'#pragma GCC diagnostic' kind" } */
+/* { dg-message "did you mean '-Wall'" "" { target *-*-* } .-1 } */
+#pragma GCC diagnostic warning "-Walla" /* { dg-warning "unknown option after 
'#pragma GCC diagnostic' kind" } */
+/* { dg-message "did you mean '-Wall'" "" { target *-*-* } .-1 } */
 int i;
-- 
2.29.2.154.g7f7ebe054a
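
The pattern used by the patch (emit the warning first, and only compute
the expensive hint if the warning was really emitted) can be sketched in
isolation.  Everything below is a hypothetical stand-in written for
illustration; diag_state, warn_unknown_option, suggest_option and
diagnose_unknown_option are not GCC APIs.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the diagnostic machinery.
struct diag_state
{
  bool pragmas_enabled;      // whether the warning group is active
  int suggestions_computed;  // counts calls to the expensive helper
  std::string output;        // collected diagnostic text
};

// Emits the warning only if the group is enabled and reports whether it
// was emitted.
inline bool warn_unknown_option (diag_state &st)
{
  if (!st.pragmas_enabled)
    return false;
  st.output += "warning: unknown option\n";
  return true;
}

// Stand-in for an expensive spelling-suggestion search.
inline std::string suggest_option (diag_state &st, const std::string &bad)
{
  ++st.suggestions_computed;
  return bad == "-Walla" ? "-Wall" : "";
}

// Compute the hint only after the warning was actually emitted.
inline void diagnose_unknown_option (diag_state &st, const std::string &bad)
{
  if (warn_unknown_option (st))
    {
      const std::string hint = suggest_option (st, bad);
      if (!hint.empty ())
        st.output += "note: did you mean '" + hint + "'?\n";
    }
}
```

With the warning group disabled, diagnose_unknown_option never calls
suggest_option, which is where the saving comes from.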



[Patch] opts: Change `is incompatible with` messages to have standard parametrised form

2020-11-09 Thread Matthew Malcomson via Gcc-patches
Hello,

In a recent review for one of the hwasan patches Richard S. noticed there are
quite a few errors of the form "% is incompatible with
".
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556137.html

In order to avoid this creating extra work for translators we would like to
change these error messages to use the form "%qs is incompatible with %qs" and
pass the flag as format arguments.

This patch implements that change.
There is only one change in the output the compiler produces from this patch,
an error message of "-fsanitize=address and -fsanitize=kernel-address are
incompatible with -fsanitize=thread" has been changed to "-fsanitize=thread is
incompatible with -fsanitize=address|kernel-address".
This matches the similar error messages for live patching which use the
messages "-f is incompatible with
-flive-patching=inline-only-static|inline-clone".

Bootstrapped and regtested on AArch64 without any problems.
Ok for trunk?

gcc/ChangeLog:

* opts.c (control_options_for_live_patching): Reform 'is incompatible
with' error messages to use a standard message with differing format
arguments.
(finish_options): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/ubsan/sanitize-recover-7.c: Update testcase.
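
As a rough illustration of why this helps translators, here is a sketch
with a single message template shared by every option pair.  It uses
plain snprintf, which has no equivalent of GCC's %qs directive, so the
quotes are written out by hand; incompatible_message is a hypothetical
helper, not something in opts.c.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// One translatable template for all option pairs.  The option names are
// passed as arguments and never appear in the template itself.
inline std::string incompatible_message (const char *opt, const char *other)
{
  char buf[256];
  std::snprintf (buf, sizeof buf, "'%s' is incompatible with '%s'",
                 opt, other);
  return buf;
}
```

Each new option pair then reuses the same string, so only one message
needs translating.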



### Attachment also inlined for ease of reply###


diff --git a/gcc/opts.c b/gcc/opts.c
index 
96291e89a49dd1cf25a0cacc5a62413d120fa24d..ac9972d9c386247af3482e07a94c76da3e1abb4d
 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -688,30 +688,26 @@ control_options_for_live_patching (struct gcc_options 
*opts,
 {
 case LIVE_PATCHING_INLINE_ONLY_STATIC:
   if (opts_set->x_flag_ipa_cp_clone && opts->x_flag_ipa_cp_clone)
-   error_at (loc,
- "%<-fipa-cp-clone%> is incompatible with "
- "%<-flive-patching=inline-only-static%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-cp-clone", "-flive-patching=inline-only-static");
   else
opts->x_flag_ipa_cp_clone = 0;
 
   if (opts_set->x_flag_ipa_sra && opts->x_flag_ipa_sra)
-   error_at (loc,
- "%<-fipa-sra%> is incompatible with "
- "%<-flive-patching=inline-only-static%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-sra", "-flive-patching=inline-only-static");
   else
opts->x_flag_ipa_sra = 0;
 
   if (opts_set->x_flag_partial_inlining && opts->x_flag_partial_inlining)
-   error_at (loc,
- "%<-fpartial-inlining%> is incompatible with "
- "%<-flive-patching=inline-only-static%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fpartial-inlining", "-flive-patching=inline-only-static");
   else
opts->x_flag_partial_inlining = 0;
 
   if (opts_set->x_flag_ipa_cp && opts->x_flag_ipa_cp)
-   error_at (loc,
- "%<-fipa-cp%> is incompatible with "
- "%<-flive-patching=inline-only-static%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-cp", "-flive-patching=inline-only-static");
   else
opts->x_flag_ipa_cp = 0;
 
@@ -719,9 +715,9 @@ control_options_for_live_patching (struct gcc_options *opts,
 case LIVE_PATCHING_INLINE_CLONE:
   /* live patching should disable whole-program optimization.  */
   if (opts_set->x_flag_whole_program && opts->x_flag_whole_program)
-   error_at (loc,
- "%<-fwhole-program%> is incompatible with "
- "%<-flive-patching=inline-only-static|inline-clone%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fwhole-program",
+ "-flive-patching=inline-only-static|inline-clone");
   else
opts->x_flag_whole_program = 0;
 
@@ -730,65 +726,65 @@ control_options_for_live_patching (struct gcc_options 
*opts,
 && !flag_partial_inlining.  */
 
   if (opts_set->x_flag_ipa_pta && opts->x_flag_ipa_pta)
-   error_at (loc,
- "%<-fipa-pta%> is incompatible with "
- "%<-flive-patching=inline-only-static|inline-clone%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-pta",
+ "-flive-patching=inline-only-static|inline-clone");
   else
opts->x_flag_ipa_pta = 0;
 
   if (opts_set->x_flag_ipa_reference && opts->x_flag_ipa_reference)
-   error_at (loc,
- "%<-fipa-reference%> is incompatible with "
- "%<-flive-patching=inline-only-static|inline-clone%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-reference",
+ "-flive-patching=inline-only-static|inline-clone");
   else
opts->x_flag_ipa_reference = 0;
 
   if (opts_set->x_flag_ipa_ra && opts->x_flag_ipa_ra)
-   error_at (loc,
- "%<-fipa-ra%> is incompatible with "
-

Re: [PATCH] rs6000: Fix default alignment ABI break caused by MMA base support

2020-11-09 Thread Paul A. Clarke via Gcc-patches
On Fri, Nov 06, 2020 at 04:18:00PM -0600, Peter Bergner via Gcc-patches wrote:
> As part of the MMA base support, we incremented BIGGEST_ALIGNMENT in
> order to align the __vector_pair and __vector_quad types to 256 and 512
> bits respectively.  This had the unintended effect of changing the
> default alignment used by __attribute__ ((__aligned__)) which causes
> an ABI break because of some dodgy code in GLIBC's struct pthread
> (GLIBC is going to fix that too).  The fix in GCC is to revert the
> BIGGEST_ALIGNMENT change and to force the alignment on the type itself
> rather than the mode used by the type.
[snip]
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index bbd8060e143..5a47aa14722 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -776,8 +776,10 @@ extern unsigned rs6000_pointer_size;
>  /* Allocation boundary (in *bits*) for the code of a function.  */
>  #define FUNCTION_BOUNDARY 32
>  
> -/* No data type wants to be aligned rounder than this.  */
> -#define BIGGEST_ALIGNMENT (TARGET_MMA ? 512 : 128)
> +/* No data type is required to be aligned rounder than this.  Warning, if
> +   BIGGEST_ALIGNMENT is changed, then this may be an ABI break.  An example
> +   of where this can break an ABI is in GLIBC's struct _Unwind_Exception.  */

Instead of calling out something that is expected to be fixed, should you
instead call out that it will change the alignment of anything using
"__attribute__ ((__aligned__))"?

> +#define BIGGEST_ALIGNMENT 128

[snip]
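
To illustrate the point about the attribute, here is a small sketch
contrasting the argument-less form, whose result depends on the target's
default, with an explicit alignment that is fixed in the source.  The
struct names are hypothetical, and the value printed for the first struct
is target-specific, so no particular number is assumed.

```cpp
#include <cassert>

// With no argument, the attribute asks for the target's default maximum
// alignment rather than a fixed number of bytes.
struct default_aligned
{
  char c;
} __attribute__ ((__aligned__));

// With an explicit argument the alignment is pinned in the source and does
// not depend on the target's default.
struct explicitly_aligned
{
  char c;
} __attribute__ ((__aligned__ (16)));

static_assert (alignof (explicitly_aligned) == 16,
               "explicit alignment is independent of the target default");
```

Code that needs a stable layout is therefore safer with the explicit
form.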

PC


[committed] libstdc++: Improve comment on _Power_of_2 helper function

2020-11-09 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/bits/uniform_int_dist.h (__detail::_Power_of_2):
Document that true result for zero is intentional.

Tested x86_64-linux. Committed to trunk.

commit b2b85163731e8647542f2f7561bd4c69ae5f5f2a
Author: Jonathan Wakely 
Date:   Mon Nov 9 14:32:45 2020

libstdc++: Improve comment on _Power_of_2 helper function

libstdc++-v3/ChangeLog:

* include/bits/uniform_int_dist.h (__detail::_Power_of_2):
Document that true result for zero is intentional.

diff --git a/libstdc++-v3/include/bits/uniform_int_dist.h 
b/libstdc++-v3/include/bits/uniform_int_dist.h
index 8f02b85c9bb0..4169f705c2af 100644
--- a/libstdc++-v3/include/bits/uniform_int_dist.h
+++ b/libstdc++-v3/include/bits/uniform_int_dist.h
@@ -56,7 +56,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   namespace __detail
   {
-/* Determine whether number is a power of 2.  */
+// Determine whether number is a power of two.
+// This is true for zero, which is OK because we want _Power_of_2(n+1)
+// to be true if n==numeric_limits<_Tp>::max() and so n+1 wraps around.
 template
   constexpr bool
   _Power_of_2(_Tp __x)
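
The behaviour described in the new comment can be checked with a
stand-alone sketch.  The function body below is an assumption (the usual
bit trick for this test), and power_of_2 is a hypothetical name rather
than the libstdc++ helper itself.

```cpp
#include <cassert>
#include <limits>

// Sketch of a power-of-two test for unsigned values.  (x - 1) & x clears
// the lowest set bit, so the result is zero exactly when at most one bit
// was set, which includes x == 0.
template<typename T>
constexpr bool power_of_2 (T x)
{
  return ((x - 1) & x) == 0;
}

static_assert (power_of_2 (1u) && power_of_2 (64u), "single-bit values");
static_assert (!power_of_2 (6u), "more than one bit set");
static_assert (power_of_2 (0u), "zero is reported as a power of two");
// n + 1 wraps to zero when n is the maximum value, and is still accepted.
static_assert (power_of_2 (std::numeric_limits<unsigned>::max () + 1u),
               "wrapped n + 1");
```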


[committed] libstdc++: Remove redundant check for zero in std::__popcount

2020-11-09 Thread Jonathan Wakely via Gcc-patches
The popcount built-ins work fine for zero, so there's no need to check
for it.

libstdc++-v3/ChangeLog:

* include/std/bit (__popcount): Remove redundant check for zero.

Tested x86_64-linux. Committed to trunk.

commit ff4bfb1553cf525d7299bbf7451ac32cfd97ae1b
Author: Jonathan Wakely 
Date:   Mon Nov 9 14:31:13 2020

libstdc++: Remove redundant check for zero in std::__popcount

The popcount built-ins work fine for zero, so there's no need to check
for it.

libstdc++-v3/ChangeLog:

* include/std/bit (__popcount): Remove redundant check for zero.

diff --git a/libstdc++-v3/include/std/bit b/libstdc++-v3/include/std/bit
index f4344820d527..16f7eba46d7b 100644
--- a/libstdc++-v3/include/std/bit
+++ b/libstdc++-v3/include/std/bit
@@ -184,9 +184,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using __gnu_cxx::__int_traits;
   constexpr auto _Nd = __int_traits<_Tp>::__digits;
 
-  if (__x == 0)
-return 0;
-
   constexpr auto _Nd_ull = __int_traits::__digits;
   constexpr auto _Nd_ul = __int_traits::__digits;
   constexpr auto _Nd_u = __int_traits::__digits;
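
A quick stand-alone check of the claim in the commit message, using the
GCC/Clang built-in directly.  popcount_ull is a hypothetical wrapper
name used only for this sketch.

```cpp
#include <cassert>

// __builtin_popcountll is well-defined for a zero argument, unlike
// __builtin_clz and __builtin_ctz, whose results are undefined for zero.
constexpr int popcount_ull (unsigned long long x)
{
  return __builtin_popcountll (x);
}

static_assert (popcount_ull (0) == 0, "no special case needed for zero");
static_assert (popcount_ull (0xF0) == 4, "four bits set");
```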


[PATCH] tree-optimization/97761 - fix SLP live calculation

2020-11-09 Thread Richard Biener
This removes a premature end of the DFS walk.
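
As a rough illustration of how such an early exit can lose work, here is
a miniature graph walk.  It is only a sketch under simplifying
assumptions (a plain tree, integer statement ids); the node type and the
mark function are hypothetical and are not the vectorizer's data
structures.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Each node owns some statement ids and has child nodes.  Statement ids
// may be shared between nodes.
struct node
{
  std::vector<int> stmts;
  std::vector<const node *> children;
};

// Marks every statement reachable from N.  When EARLY_EXIT is true the walk
// stops at a node whose own statements were all seen before, without
// looking at that node's children.
inline void mark (const node *n, std::set<int> &seen, bool early_exit)
{
  bool all_visited = true;
  for (int s : n->stmts)
    if (seen.insert (s).second)
      all_visited = false;
  if (early_exit && all_visited)
    return;
  for (const node *c : n->children)
    mark (c, seen, early_exit);
}
```

If two nodes share a statement but have different children, the early
exit skips the second node's children and their statements are never
marked.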

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-11-09  Richard Biener  

PR tree-optimization/97761
* tree-vect-slp.c (vect_bb_slp_mark_live_stmts): Remove
premature end of DFS walk.

* gfortran.dg/vect/pr97761.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/vect/pr97761.f90 | 32 ++
 gcc/tree-vect-slp.c|  4 ---
 2 files changed, 32 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/vect/pr97761.f90

diff --git a/gcc/testsuite/gfortran.dg/vect/pr97761.f90 
b/gcc/testsuite/gfortran.dg/vect/pr97761.f90
new file mode 100644
index 000..250e2bf016e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/vect/pr97761.f90
@@ -0,0 +1,32 @@
+! { dg-do compile }
+! { dg-additional-options "-O1" }
+
+subroutine ni (ps)
+type vector
+   real  x, y
+end type 
+type quad_inductor
+   type (vector) v1, v2
+end type 
+type (quad_inductor), dimension(inout) :: ps
+integer :: dl, nk = 1.0
+fo = 1.0
+if (f == 1) then
+   nk = 0.0
+   fo = 0.0
+end if
+ot = nk * 0.5
+gb = -fo * 0.5
+wu = fo * 0.5
+up = nk * 0.1
+xe = up * 0.1
+do lx = 0, 7
+   ps%v2%y = -wu
+   ps(dl)%v1%x = xe + 1.0
+   ps(dl)%v1%y = wu - tn
+end do
+do lx = 0, 7
+   ps(dl)%v1%x = 0.1 - ot
+   ps(dl)%v1%y = 0.1 - wu
+end do
+end
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 88e637e30dc..e4c2aa480e5 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3549,12 +3549,10 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, 
slp_tree node,
   unsigned i;
   stmt_vec_info stmt_info;
   stmt_vec_info last_stmt = vect_find_last_scalar_stmt_in_slp (node);
-  bool all_visited = true;
   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
 {
   if (svisited.contains (stmt_info))
continue;
-  all_visited = false;
   stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
   if (STMT_VINFO_IN_PATTERN_P (orig_stmt_info)
  && STMT_VINFO_RELATED_STMT (orig_stmt_info) != stmt_info)
@@ -3628,8 +3626,6 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, 
slp_tree node,
   if (mark_visited)
svisited.add (stmt_info);
 }
-  if (all_visited)
-return;
 
   slp_tree child;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-- 
2.26.2


Re: [PATCH] Clean up irange self tests.

2020-11-09 Thread Aldy Hernandez via Gcc-patches
Yes, it was leftover from the original irange work years ago.

Aldy

On Mon, Nov 9, 2020 at 3:42 PM Andrew MacLeod  wrote:
>
> On 11/9/20 9:38 AM, Aldy Hernandez wrote:
> > Currently we have all the irange and range-op tests in range-op.cc.
> > This patch splits them up into the appropriate file (irange
> > tests in value-range.cc and range-op tests in range-op.cc).  The patch
> > also splits up the tests themselves by functionality.  It's not perfect,
> > but significantly better than the mess we had.
> >
> > Andrew, does this split look good to you?
>
> OK.
>   I always thought it was a little weird that the range tests were in
> the range-ops file anyway :-)
> >
> > If so, I'll push once bootstrap passes.
> > Aldy
> >
> > gcc/ChangeLog:
> >
> >   * function-tests.c (test_ranges): Call range_op_tests.
> >   * range-op.cc (build_range3): Move to value-range.cc.
> >   (range3_tests): Same.
> >   (int_range_max_tests): Same.
> >   (multi_precision_range_tests): Same.
> >   (range_tests): Same.
> >   (operator_tests): Split up...
> >   (range_op_tests): Split up...
> >   (range_op_cast_tests): ...here.
> >   (range_op_lshift_tests): ...here.
> >   (range_op_rshift_tests): ...here.
> >   (range_op_bitwise_and_tests): ...here.
> >   * selftest.h (range_op_tests): New.
> >   * value-range.cc (build_range3): New.
> >   (range_tests_irange3): New.
> >   (range_tests_int_range_max): New.
> >   (range_tests_legacy): New.
> >   (range_tests_misc): New.
> >   (range_tests): New.
> > ---
> >
>



[PATCH] Cleanup irange::set.

2020-11-09 Thread Aldy Hernandez via Gcc-patches
[This is part of a larger patch that actually changes behavior, but I
thought I'd commit the non-invasive cleanups first, which will simplify
the upcoming work.]

irange::set was doing more work than it should for legacy ranges.
I cleaned up various unnecessary calls to swap_out_of_order_endpoints,
as well as some duplicate code that could be done with normalize_min_max.

I also removed an obsolete comment wrt sticky infinite overflows.
Not only did the -INF/+INF(OVF) code get removed in 2017,
but normalize_min_max() uses wide ints, which ignore overflows
altogether.

Pushed.

gcc/ChangeLog:

* value-range.cc (irange::swap_out_of_order_endpoints): Rewrite
into static function.
(irange::set): Cleanup redundant manipulations.
* value-range.h (irange::normalize_min_max): Modify object
in-place instead of modifying arguments.
---
 gcc/value-range.cc | 70 --
 gcc/value-range.h  | 28 +--
 2 files changed, 37 insertions(+), 61 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 5827e812216..2124e229e0c 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -131,13 +131,14 @@ irange::copy_to_legacy (const irange )
 set (src.tree_lower_bound (), src.tree_upper_bound ());
 }
 
-// Swap min/max if they are out of order.  Return TRUE if further
-// processing of the range is necessary, FALSE otherwise.
+// Swap MIN/MAX if they are out of order and adjust KIND appropriately.
 
-bool
-irange::swap_out_of_order_endpoints (tree , tree ,
- value_range_kind )
+static void
+swap_out_of_order_endpoints (tree , tree , value_range_kind )
 {
+  gcc_checking_assert (kind != VR_UNDEFINED);
+  if (kind == VR_VARYING)
+return;
   /* Wrong order for min and max, to swap them and the VR type we need
  to adjust them.  */
   if (tree_int_cst_lt (max, min))
@@ -149,8 +150,8 @@ irange::swap_out_of_order_endpoints (tree , tree ,
 for VR_ANTI_RANGE empty range, so drop to varying as well.  */
   if (TYPE_PRECISION (TREE_TYPE (min)) == 1)
{
- set_varying (TREE_TYPE (min));
- return false;
+ kind = VR_VARYING;
+ return;
}
 
   one = build_int_cst (TREE_TYPE (min), 1);
@@ -163,12 +164,11 @@ irange::swap_out_of_order_endpoints (tree , tree ,
 to varying in this case.  */
   if (tree_int_cst_lt (max, min))
{
- set_varying (TREE_TYPE (min));
- return false;
+ kind = VR_VARYING;
+ return;
}
   kind = kind == VR_RANGE ? VR_ANTI_RANGE : VR_RANGE;
 }
-  return true;
 }
 
 void
@@ -253,13 +253,6 @@ irange::set (tree min, tree max, value_range_kind kind)
   && (POLY_INT_CST_P (min) || POLY_INT_CST_P (max)))
 kind = VR_VARYING;
 
-  if (kind == VR_VARYING)
-{
-  set_varying (TREE_TYPE (min));
-  return;
-}
-
-  tree type = TREE_TYPE (min);
   // Nothing to canonicalize for symbolic ranges.
   if (TREE_CODE (min) != INTEGER_CST
   || TREE_CODE (max) != INTEGER_CST)
@@ -270,8 +263,13 @@ irange::set (tree min, tree max, value_range_kind kind)
   m_num_ranges = 1;
   return;
 }
-  if (!swap_out_of_order_endpoints (min, max, kind))
-goto cleanup_set;
+
+  swap_out_of_order_endpoints (min, max, kind);
+  if (kind == VR_VARYING)
+{
+  set_varying (TREE_TYPE (min));
+  return;
+}
 
   // Anti-ranges that can be represented as ranges should be so.
   if (kind == VR_ANTI_RANGE)
@@ -280,6 +278,7 @@ irange::set (tree min, tree max, value_range_kind kind)
  values < -INF and values > INF as -INF/INF as well.  */
   bool is_min = vrp_val_is_min (min);
   bool is_max = vrp_val_is_max (max);
+  tree type = TREE_TYPE (min);
 
   if (is_min && is_max)
{
@@ -314,38 +313,17 @@ irange::set (tree min, tree max, value_range_kind kind)
  kind = VR_RANGE;
 }
 }
-  else if (!swap_out_of_order_endpoints (min, max, kind))
-goto cleanup_set;
-
-  /* Do not drop [-INF(OVF), +INF(OVF)] to varying.  (OVF) has to be sticky
- to make sure VRP iteration terminates, otherwise we can get into
- oscillations.  */
-  if (!normalize_min_max (type, min, max, kind))
-{
-  m_kind = kind;
-  m_base[0] = min;
-  m_base[1] = max;
-  m_num_ranges = 1;
-  if (flag_checking)
-   verify_range ();
-}
 
- cleanup_set:
-  // Avoid using TYPE_{MIN,MAX}_VALUE because -fstrict-enums can
-  // restrict those to a subset of what actually fits in the type.
-  // Instead use the extremes of the type precision
-  unsigned prec = TYPE_PRECISION (type);
-  signop sign = TYPE_SIGN (type);
-  if (wi::eq_p (wi::to_wide (min), wi::min_value (prec, sign))
-  && wi::eq_p (wi::to_wide (max), wi::max_value (prec, sign)))
-m_kind = VR_VARYING;
-  else if (undefined_p ())
-m_kind = VR_UNDEFINED;
+  m_kind = kind;
+  m_base[0] = min;
+  m_base[1] 

Re: [PATCH] Clean up irange self tests.

2020-11-09 Thread Andrew MacLeod via Gcc-patches

On 11/9/20 9:38 AM, Aldy Hernandez wrote:

Currently we have all the irange and range-op tests in range-op.cc.
This patch splits them up into the appropriate file (irange
tests in value-range.cc and range-op tests in range-op.cc).  The patch
also splits up the tests themselves by functionality.  It's not perfect,
but significantly better than the mess we had.

Andrew, does this split look good to you?


OK.
 I always thought it was a little weird that the range tests were in 
the range-ops file anyway :-)


If so, I'll push once bootstrap passes.
Aldy

gcc/ChangeLog:

* function-tests.c (test_ranges): Call range_op_tests.
* range-op.cc (build_range3): Move to value-range.cc.
(range3_tests): Same.
(int_range_max_tests): Same.
(multi_precision_range_tests): Same.
(range_tests): Same.
(operator_tests): Split up...
(range_op_tests): Split up...
(range_op_cast_tests): ...here.
(range_op_lshift_tests): ...here.
(range_op_rshift_tests): ...here.
(range_op_bitwise_and_tests): ...here.
* selftest.h (range_op_tests): New.
* value-range.cc (build_range3): New.
(range_tests_irange3): New.
(range_tests_int_range_max): New.
(range_tests_legacy): New.
(range_tests_misc): New.
(range_tests): New.
---
  




Re: [PATCH] aarch64: Do not alter force_reg returned register expanding fcmla

2020-11-09 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

>> -Original Message-
>> From: Andrea Corallo 
>> Sent: 09 November 2020 10:04
>> To: gcc-patches@gcc.gnu.org
>> Cc: Kyrylo Tkachov ; Richard Earnshaw
>> ; nd 
>> Subject: [PATCH] aarch64: Do not alter force_reg returned register
>> expanding fcmla
>> 
>> Hi all,
>> 
>> this patch is to fix a force_reg returned rtx potentially modified in
>> `aarch64_general_expand_builtin`.
>> 
>> Bootstrapped and reg-tested on aarch64-none-linux-gnu.
>> 
>> Okay for trunk?
>
> Ok.
> Thanks,
> Kyrill

Installed into master as fa59c8dcd2f.

Thanks!

  Andrea


[PATCH] Clean up irange self tests.

2020-11-09 Thread Aldy Hernandez via Gcc-patches
Currently we have all the irange and range-op tests in range-op.cc.
This patch splits them up into the appropriate file (irange
tests in value-range.cc and range-op tests in range-op.cc).  The patch
also splits up the tests themselves by functionality.  It's not perfect,
but significantly better than the mess we had.

Andrew, does this split look good to you?

If so, I'll push once bootstrap passes.
Aldy

gcc/ChangeLog:

* function-tests.c (test_ranges): Call range_op_tests.
* range-op.cc (build_range3): Move to value-range.cc.
(range3_tests): Same.
(int_range_max_tests): Same.
(multi_precision_range_tests): Same.
(range_tests): Same.
(operator_tests): Split up...
(range_op_tests): Split up...
(range_op_cast_tests): ...here.
(range_op_lshift_tests): ...here.
(range_op_rshift_tests): ...here.
(range_op_bitwise_and_tests): ...here.
* selftest.h (range_op_tests): New.
* value-range.cc (build_range3): New.
(range_tests_irange3): New.
(range_tests_int_range_max): New.
(range_tests_legacy): New.
(range_tests_misc): New.
(range_tests): New.
---
 gcc/function-tests.c |   1 +
 gcc/range-op.cc  | 643 ++-
 gcc/selftest.h   |   1 +
 gcc/value-range.cc   | 380 +
 4 files changed, 532 insertions(+), 493 deletions(-)

diff --git a/gcc/function-tests.c b/gcc/function-tests.c
index 65364588734..92f1acf780e 100644
--- a/gcc/function-tests.c
+++ b/gcc/function-tests.c
@@ -580,6 +580,7 @@ test_ranges ()
   function *fun = DECL_STRUCT_FUNCTION (fndecl);
   push_cfun (fun);
   range_tests ();
+  range_op_tests ();
   pop_cfun ();
 }
 
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index f38f02e8d27..bbb2a61ae35 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -3361,7 +3361,6 @@ range_cast (irange , tree type)
 
 #if CHECKING_P
 #include "selftest.h"
-#include "stor-layout.h"
 
 namespace selftest
 {
@@ -3369,413 +3368,21 @@ namespace selftest
 #define UINT(N) build_int_cstu (unsigned_type_node, (N))
 #define INT16(N) build_int_cst (short_integer_type_node, (N))
 #define UINT16(N) build_int_cstu (short_unsigned_type_node, (N))
-#define INT64(N) build_int_cstu (long_long_integer_type_node, (N))
-#define UINT64(N) build_int_cstu (long_long_unsigned_type_node, (N))
-#define UINT128(N) build_int_cstu (u128_type, (N))
-#define UCHAR(N) build_int_cstu (unsigned_char_type_node, (N))
 #define SCHAR(N) build_int_cst (signed_char_type_node, (N))
-
-static int_range<3>
-build_range3 (int a, int b, int c, int d, int e, int f)
-{
-  int_range<3> i1 (INT (a), INT (b));
-  int_range<3> i2 (INT (c), INT (d));
-  int_range<3> i3 (INT (e), INT (f));
-  i1.union_ (i2);
-  i1.union_ (i3);
-  return i1;
-}
-
-static void
-range3_tests ()
-{
-  typedef int_range<3> int_range3;
-  int_range3 r0, r1, r2;
-  int_range3 i1, i2, i3;
-
-  // ([10,20] U [5,8]) U [1,3] ==> [1,3][5,8][10,20].
-  r0 = int_range3 (INT (10), INT (20));
-  r1 = int_range3 (INT (5), INT (8));
-  r0.union_ (r1);
-  r1 = int_range3 (INT (1), INT (3));
-  r0.union_ (r1);
-  ASSERT_TRUE (r0 == build_range3 (1, 3, 5, 8, 10, 20));
-
-  // [1,3][5,8][10,20] U [-5,0] => [-5,3][5,8][10,20].
-  r1 = int_range3 (INT (-5), INT (0));
-  r0.union_ (r1);
-  ASSERT_TRUE (r0 == build_range3 (-5, 3, 5, 8, 10, 20));
-
-  // [10,20][30,40] U [50,60] ==> [10,20][30,40][50,60].
-  r1 = int_range3 (INT (50), INT (60));
-  r0 = int_range3 (INT (10), INT (20));
-  r0.union_ (int_range3 (INT (30), INT (40)));
-  r0.union_ (r1);
-  ASSERT_TRUE (r0 == build_range3 (10, 20, 30, 40, 50, 60));
-  // [10,20][30,40][50,60] U [70, 80] ==> [10,20][30,40][50,60][70,80].
-  r1 = int_range3 (INT (70), INT (80));
-  r0.union_ (r1);
-
-  r2 = build_range3 (10, 20, 30, 40, 50, 60);
-  r2.union_ (int_range3 (INT (70), INT (80)));
-  ASSERT_TRUE (r0 == r2);
-
-  // [10,20][30,40][50,60] U [6,35] => [6,40][50,60].
-  r0 = build_range3 (10, 20, 30, 40, 50, 60);
-  r1 = int_range3 (INT (6), INT (35));
-  r0.union_ (r1);
-  r1 = int_range3 (INT (6), INT (40));
-  r1.union_ (int_range3 (INT (50), INT (60)));
-  ASSERT_TRUE (r0 == r1);
-
-  // [10,20][30,40][50,60] U [6,60] => [6,60].
-  r0 = build_range3 (10, 20, 30, 40, 50, 60);
-  r1 = int_range3 (INT (6), INT (60));
-  r0.union_ (r1);
-  ASSERT_TRUE (r0 == int_range3 (INT (6), INT (60)));
-
-  // [10,20][30,40][50,60] U [6,70] => [6,70].
-  r0 = build_range3 (10, 20, 30, 40, 50, 60);
-  r1 = int_range3 (INT (6), INT (70));
-  r0.union_ (r1);
-  ASSERT_TRUE (r0 == int_range3 (INT (6), INT (70)));
-
-  // [10,20][30,40][50,60] U [35,70] => [10,20][30,70].
-  r0 = build_range3 (10, 20, 30, 40, 50, 60);
-  r1 = int_range3 (INT (35), INT (70));
-  r0.union_ (r1);
-  r1 = int_range3 (INT (10), INT (20));
-  r1.union_ (int_range3 (INT (30), INT (70)));
-  ASSERT_TRUE (r0 == r1);
-
-  // [10,20][30,40][50,60] U [15,35] => [10,40][50,60].
-  r0 = 

Re: [committed 1/2] libstdc++: Fix multiple definitions of std::exception_ptr functions [PR 97729]

2020-11-09 Thread Jonathan Wakely via Gcc-patches

On 05/11/20 18:03 +, Jonathan Wakely wrote:

This fixes some multiple definition errors caused by the changes for
PR libstdc++/90295. The previous solution for inlining the members of
std::exception_ptr but still exporting them from the library was to
suppress the 'inline' keyword on those functions when compiling
libsupc++/eh_ptr.cc, so they get defined in that file. That produces ODR
violations though, because there are now both inline and non-inline
definitions in the library, due to the use of std::exception_ptr in
other files such as src/c++11/future.cc.

The new solution is to define all the relevant members as 'inline'
unconditionally, but use __attribute__((used)) to cause definitions to
be emitted in libsupc++/eh_ptr.cc as before. This doesn't quite work
however, because PR c++/67453 means the attribute is ignored on
constructors and destructors. As a workaround, the old solution
(conditionally inline) is still used for those members, but they are
given the always_inline attribute so that they aren't emitted in
src/c++11/future.o as inline definitions.
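
For readers unfamiliar with the technique, here is a minimal sketch of
"inline everywhere, but force emission with the 'used' attribute". The
class name `handle` and the macro `EH_PTR_USED` are hypothetical stand-ins,
not the libstdc++ names; in the real library the macro would expand to the
attribute only in the one file that must export the symbols. This assumes
GCC or Clang attribute syntax.

```cpp
// Hypothetical stand-in for a class like std::exception_ptr.
class handle
{
  void* m_obj;

public:
  handle() noexcept;
  handle(const handle&) noexcept;
  ~handle();
  bool empty() const noexcept;
};

// Assumption: in a real library this would be defined to the attribute in
// exactly one translation unit and to nothing everywhere else.
#define EH_PTR_USED __attribute__((__used__))

// Every definition is 'inline', so all translation units agree (no ODR
// clash). 'used' asks the compiler to emit an out-of-line copy here even
// though nothing in this file needs one.
EH_PTR_USED inline
handle::handle() noexcept
: m_obj(nullptr)
{ }

EH_PTR_USED inline
handle::handle(const handle& other) noexcept
: m_obj(other.m_obj)
{ }

EH_PTR_USED inline
handle::~handle()
{ }

EH_PTR_USED inline bool
handle::empty() const noexcept
{ return m_obj == nullptr; }
```

Whether the attribute is honoured on constructors and destructors depends
on the compiler version, which is the point of the PR c++/67453 reference
above.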


That workaround can be removed now.

Tested powerpc64le-linux. Committed to trunk.


commit 0af3930a497e022597a08fa1bcef5e453bfa636f
Author: Jonathan Wakely 
Date:   Mon Nov 9 10:16:07 2020

libstdc++: Use 'inline' consistently in std::exception_ptr [PR 97729]

With PR c++/67453 fixed we can rely on the 'used' attribute to emit
inline constructors and destructors in libsupc++/eh_ptr.cc. This means
we don't need to suppress the 'inline' keyword on them in that file, and
don't need to force 'always_inline' on them in other files.

libstdc++-v3/ChangeLog:

PR libstdc++/97729
* libsupc++/exception_ptr.h (exception_ptr::exception_ptr())
(exception_ptr::exception_ptr(const exception_ptr&))
(exception_ptr::~exception_ptr()): Remove 'always_inline'
attributes. Use 'inline' unconditionally.

diff --git a/libstdc++-v3/libsupc++/exception_ptr.h b/libstdc++-v3/libsupc++/exception_ptr.h
index 001343ac0498..6ae4d4ca944d 100644
--- a/libstdc++-v3/libsupc++/exception_ptr.h
+++ b/libstdc++-v3/libsupc++/exception_ptr.h
@@ -174,19 +174,13 @@ namespace std
 };
 
 _GLIBCXX_EH_PTR_USED
-#ifndef  _GLIBCXX_EH_PTR_COMPAT
-__attribute__((__always_inline__)) // XXX see PR 97729
 inline
-#endif
 exception_ptr::exception_ptr() _GLIBCXX_NOEXCEPT
 : _M_exception_object(0)
 { }
 
 _GLIBCXX_EH_PTR_USED
-#ifndef  _GLIBCXX_EH_PTR_COMPAT
-__attribute__((__always_inline__))
 inline
-#endif
 exception_ptr::exception_ptr(const exception_ptr& __other) _GLIBCXX_NOEXCEPT
 : _M_exception_object(__other._M_exception_object)
 {
@@ -195,10 +189,7 @@ namespace std
 }
 
 _GLIBCXX_EH_PTR_USED
-#ifndef  _GLIBCXX_EH_PTR_COMPAT
-__attribute__((__always_inline__))
 inline
-#endif
 exception_ptr::~exception_ptr() _GLIBCXX_USE_NOEXCEPT
 {
   if (_M_exception_object)


Re: [committed] libstdc++: Make std::function work better with -fno-rtti

2020-11-09 Thread Jonathan Wakely via Gcc-patches

On 29/10/20 14:49 +, Jonathan Wakely wrote:

This change allows std::function::target() to work even without RTTI,
using the same approach as std::any. Because we know what the manager
function would be for a given type, we can check if the stored pointer
has the expected address. If it does, we don't need to use RTTI. If it
isn't equal, we still need to do the RTTI check (when RTTI is enabled)
to handle the case where the same function has different addresses in
different shared objects.

This also changes the implementation of the manager function to return a
null pointer result when asked for the type_info of the target object.
This not only avoids a warning with -Wswitch -Wsystem-headers, but also
prevents std::function::target_type() from dereferencing an
uninitialized pointer when the linker keeps an instantiation of the
manager function that was compiled without RTTI.

Finally, this fixes a bug in the non-const overload of function::target
where calling it with a function type F was ill-formed, due to
attempting to use const_cast<F*>(ptr). The standard only allows
const_cast<T*> when T is an object type.  The solution is to use
*const_cast<F**>(&ptr) instead, because F* is an object type even if F
isn't. I've also used _GLIBCXX17_CONSTEXPR in function::target so that
it doesn't bother instantiating anything for types that can never be a
valid target.
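
The manager-address check described above can be sketched roughly as
follows. This is a simplified illustration, not the libstdc++
implementation: the class `erased` and its member `manage` are hypothetical
names, and the RTTI fallback for the shared-object case is deliberately
left out.

```cpp
#include <utility>

// Hypothetical miniature of a type-erased holder.
class erased
{
  using manager_fn = void (*)(void*);

  // One instantiation per stored type T; its address identifies T.
  template<typename T>
  static void manage(void* p)
  { delete static_cast<T*>(p); }

  void* m_ptr = nullptr;
  manager_fn m_manager = nullptr;

public:
  erased() = default;

  template<typename T>
  explicit erased(T value)
  : m_ptr(new T(std::move(value))), m_manager(&manage<T>)
  { }

  erased(const erased&) = delete;
  erased& operator=(const erased&) = delete;

  ~erased()
  {
    if (m_manager)
      m_manager(m_ptr);
  }

  // Returns a pointer to the stored object if it has type T, else null.
  // No typeid is needed: if the stored manager is manage<T>, the stored
  // object must be a T.
  template<typename T>
  T* target() noexcept
  {
    if (m_manager == &manage<T>)
      return static_cast<T*>(m_ptr);
    return nullptr;
  }
};
```

A mismatch of addresses does not prove a mismatch of types when the same
instantiation exists in several shared objects, which is why the patch
still falls back to the RTTI comparison when RTTI is enabled.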

libstdc++-v3/ChangeLog:

* include/bits/std_function.h (_Function_handler):
Define explicit specialization used for invalid target types.
(_Base_manager::_M_manager) [!__cpp_rtti]: Return null.
(function::target_type()): Check for null pointer.
(function::target()): Define unconditionally. Fix bug with
const_cast of function pointer type.
(function::target() const): Define unconditionally, but
only use RTTI if enabled.
* testsuite/20_util/function/target_no_rtti.cc: New test.


This fixes a problem with that previous patch.

Tested x86_64-linux. Committed to trunk.


commit 99bf3a817b9d31905dd12448e853ad2685635250
Author: Jonathan Wakely 
Date:   Mon Nov 9 10:09:51 2020

libstdc++: Include <typeinfo> even for -fno-rtti [PR 97758]

The std::function code now uses std::type_info* even when RTTI is
disabled, so it should include <typeinfo> unconditionally. Without this,
Clang can't compile <functional> with -fno-rtti (it works with GCC
because std::type_info gets declared automatically by the compiler).

libstdc++-v3/ChangeLog:

PR libstdc++/97758
* include/bits/std_function.h [!__cpp_rtti]: Include <typeinfo>.

diff --git a/libstdc++-v3/include/bits/std_function.h b/libstdc++-v3/include/bits/std_function.h
index 054d9cbbf02b..1788b882a8aa 100644
--- a/libstdc++-v3/include/bits/std_function.h
+++ b/libstdc++-v3/include/bits/std_function.h
@@ -36,9 +36,7 @@
 # include 
 #else
 
-#if __cpp_rtti
-# include <typeinfo>
-#endif
+#include <typeinfo>
 #include 
 #include 
 #include 


Re: [PATCH] analyzer: remove dead code

2020-11-09 Thread Martin Liška

PING^1

On 10/23/20 5:26 PM, Martin Liška wrote:

Hey.

I've noticed that when building GCC with Clang.
David, what do you think about it?

Thanks,
Martin

gcc/analyzer/ChangeLog:

 * constraint-manager.cc (constraint_manager::merge): Remove
 unused code.
 * constraint-manager.h: Likewise.
 * program-state.cc (sm_state_map::sm_state_map): Likewise.
 (program_state::program_state): Likewise.
 (test_sm_state_map): Likewise.
 * program-state.h: Likewise.
 * region-model-reachability.cc (reachable_regions::reachable_regions): 
Likewise.
 * region-model-reachability.h: Likewise.
 * region-model.cc (region_model::handle_unrecognized_call): Likewise.
 (region_model::get_reachable_svalues): Likewise.
 (region_model::can_merge_with_p): Likewise.
---
  gcc/analyzer/constraint-manager.cc    | 11 ---
  gcc/analyzer/constraint-manager.h |  3 +--
  gcc/analyzer/program-state.cc | 22 +++---
  gcc/analyzer/program-state.h  |  3 +--
  gcc/analyzer/region-model-reachability.cc |  5 ++---
  gcc/analyzer/region-model-reachability.h  |  3 +--
  gcc/analyzer/region-model.cc  |  7 +++
  7 files changed, 23 insertions(+), 31 deletions(-)

diff --git a/gcc/analyzer/constraint-manager.cc 
b/gcc/analyzer/constraint-manager.cc
index 603b22811c1..f9fffe45c66 100644
--- a/gcc/analyzer/constraint-manager.cc
+++ b/gcc/analyzer/constraint-manager.cc
@@ -1808,9 +1808,8 @@ class merger_fact_visitor : public fact_visitor
  {
  public:
    merger_fact_visitor (const constraint_manager *cm_b,
-   constraint_manager *out,
-   const model_merger )
-  : m_cm_b (cm_b), m_out (out), m_merger (merger)
+   constraint_manager *out)
+  : m_cm_b (cm_b), m_out (out)
    {}

    void on_fact (const svalue *lhs, enum tree_code code, const svalue *rhs)
@@ -1844,7 +1843,6 @@ public:
  private:
    const constraint_manager *m_cm_b;
    constraint_manager *m_out;
-  const model_merger _merger;
  };

  /* Use MERGER to merge CM_A and CM_B into *OUT.
@@ -1856,14 +1854,13 @@ private:
  void
  constraint_manager::merge (const constraint_manager _a,
     const constraint_manager _b,
-   constraint_manager *out,
-   const model_merger )
+   constraint_manager *out)
  {
    /* Merge the equivalence classes and constraints.
   The easiest way to do this seems to be to enumerate all of the facts
   in cm_a, see which are also true in cm_b,
   and add those to *OUT.  */
-  merger_fact_visitor v (_b, out, merger);
+  merger_fact_visitor v (_b, out);
    cm_a.for_each_fact ();
  }

diff --git a/gcc/analyzer/constraint-manager.h 
b/gcc/analyzer/constraint-manager.h
index 98960ffad84..1142b1f06e6 100644
--- a/gcc/analyzer/constraint-manager.h
+++ b/gcc/analyzer/constraint-manager.h
@@ -274,8 +274,7 @@ public:

    static void merge (const constraint_manager _a,
   const constraint_manager _b,
- constraint_manager *out,
- const model_merger );
+ constraint_manager *out);

    void for_each_fact (fact_visitor *) const;

diff --git a/gcc/analyzer/program-state.cc b/gcc/analyzer/program-state.cc
index 5bb8907e340..77c2de435d6 100644
--- a/gcc/analyzer/program-state.cc
+++ b/gcc/analyzer/program-state.cc
@@ -135,8 +135,8 @@ extrinsic_state::get_model_manager () const

  /* sm_state_map's ctor.  */

-sm_state_map::sm_state_map (const state_machine , int sm_idx)
-: m_sm (sm), m_sm_idx (sm_idx), m_map (), m_global_state (sm.get_start_state 
())
+sm_state_map::sm_state_map (const state_machine )
+: m_sm (sm), m_map (), m_global_state (sm.get_start_state ())
  {
  }

@@ -577,7 +577,7 @@ program_state::program_state (const extrinsic_state 
_state)
    const int num_states = ext_state.get_num_checkers ();
    for (int i = 0; i < num_states; i++)
  {
-  sm_state_map *sm = new sm_state_map (ext_state.get_sm (i), i);
+  sm_state_map *sm = new sm_state_map (ext_state.get_sm (i));
    m_checker_states.quick_push (sm);
  }
  }
@@ -1154,7 +1154,7 @@ test_sm_state_map ()
  const svalue *y_sval = model.get_rvalue (y, NULL);
  const svalue *z_sval = model.get_rvalue (z, NULL);

-    sm_state_map map (*sm, 0);
+    sm_state_map map (*sm);
  ASSERT_TRUE (map.is_empty_p ());
  ASSERT_EQ (map.get_state (x_sval, ext_state), start);

@@ -1183,7 +1183,7 @@ test_sm_state_map ()
  const svalue *y_sval = model.get_rvalue (y, NULL);
  const svalue *z_sval = model.get_rvalue (z, NULL);

-    sm_state_map map (*sm, 0);
+    sm_state_map map (*sm);
  ASSERT_TRUE (map.is_empty_p ());
  ASSERT_EQ (map.get_state (x_sval, ext_state), start);
  ASSERT_EQ (map.get_state (y_sval, ext_state), start);
@@ -1206,9 +1206,9 @@ test_sm_state_map ()
  const svalue *y_sval = model.get_rvalue (y, NULL);
  const svalue *z_sval = model.get_rvalue (z, NULL);

-    sm_state_map map0 (*sm, 0);
-    

[PATCH] Prefer bit-test over the jump table.

2020-11-09 Thread Martin Liška

Hello.

As mentioned in the PR, we used to prefer bit tests (BT) over jump tables (JT)
in switch expansion. This patch restores that behavior.
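
To make the BT/JT distinction concrete, here is a hand-written sketch of
what a bit-test lowering of a small switch amounts to. It is only an
illustration of the idea, not the code GCC generates; the function names
are made up for this example and ASCII is assumed (so 'a'..'u' is 97..117).

```cpp
// Reference version: a plain switch over five case labels.
bool is_vowel_switch(char c)
{
  switch (c)
    {
    case 'a': case 'e': case 'i': case 'o': case 'u':
      return true;
    }
  return false;
}

// Bit-test version: one range check plus one mask test, with no table in
// memory and no indirect jump.
bool is_vowel_bit_test(char c)
{
  constexpr unsigned low = 'a';
  constexpr unsigned range = 'u' - 'a';
  constexpr unsigned mask = (1u << ('a' - 'a'))
			    | (1u << ('e' - 'a'))
			    | (1u << ('i' - 'a'))
			    | (1u << ('o' - 'a'))
			    | (1u << ('u' - 'a'));

  // Unsigned subtraction wraps for values below 'a', so a single
  // comparison rejects everything outside ['a', 'u'].
  const unsigned idx = static_cast<unsigned char>(c) - low;
  return idx <= range && ((mask >> idx) & 1u) != 0;
}
```

A jump table for the same switch would need an entry for every value in
the range, which is why a bit test tends to be attractive when the cases
fit in one machine word and share few distinct targets.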

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR tree-optimization/97736
* tree-switch-conversion.c 
(switch_decision_tree::analyze_switch_statement):
Prefer bit tests.

gcc/testsuite/ChangeLog:

PR tree-optimization/97736
* gcc.dg/tree-ssa/switch-1.c: Prefer bit tests.
* g++.dg/tree-ssa/pr97736.C: New test.
---
 gcc/testsuite/g++.dg/tree-ssa/pr97736.C  | 12 
 gcc/testsuite/gcc.dg/tree-ssa/switch-1.c |  6 +++---
 gcc/tree-switch-conversion.c |  8 
 3 files changed, 19 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr97736.C

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr97736.C 
b/gcc/testsuite/g++.dg/tree-ssa/pr97736.C
new file mode 100644
index 000..bda77e7e165
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr97736.C
@@ -0,0 +1,12 @@
+/* PR tree-optimization/97736 */
+/* { dg-do compile { target { { x86_64-*-* aarch64-*-* ia64-*-* powerpc64-*-* } 
&& lp64 } } } */
+/* { dg-options "-O2 -fdump-tree-switchlower1" } */
+
+bool is_vowel(char c) {
+switch (c)
+  case'a':case'e':case'i':case'o':case'u':
+  return true;
+  return false;
+}
+
+/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: BT:97-117" 
"switchlower1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/switch-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/switch-1.c
index 149687ca2bb..6f70c9de0c1 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/switch-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/switch-1.c
@@ -54,7 +54,7 @@ int foo3 (int x)
   }
 }
 
-/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: JT:0-62" "switchlower1" } } */

+/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: BT:0-62" 
"switchlower1" } } */
 
 int foo4 (int x)

 {
@@ -77,7 +77,7 @@ int foo4 (int x)
   }
 }
 
-/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: -100 JT:10-62 600-700" "switchlower1" } } */

+/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: -100 BT:10-62 600-700" 
"switchlower1" } } */
 
 int foo5 (int x)

 {
@@ -107,4 +107,4 @@ int foo5 (int x)
   }
 }
 
-/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: JT:10-62 600-700 JT:1000-1021 11" "switchlower1" } } */

+/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: BT:10-62 600-700 JT:1000-1021 
11" "switchlower1" } } */
diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 426462e856b..a7c5df31743 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1743,8 +1743,8 @@ switch_decision_tree::analyze_switch_statement ()
 
   reset_out_edges_aux (m_switch);
 
-  /* Find jump table clusters.  */

-  vec output = jump_table_cluster::find_jump_tables (clusters);
+  /* Find bit-test clusters.  */
+  vec output = bit_test_cluster::find_bit_tests (clusters);
 
   /* Find bit test clusters.  */

   vec output2;
@@ -1759,7 +1759,7 @@ switch_decision_tree::analyze_switch_statement ()
{
  if (!tmp.is_empty ())
{
- vec n = bit_test_cluster::find_bit_tests (tmp);
+ vec n = jump_table_cluster::find_jump_tables (tmp);
  output2.safe_splice (n);
  n.release ();
  tmp.truncate (0);
@@ -1773,7 +1773,7 @@ switch_decision_tree::analyze_switch_statement ()
   /* We still can have a temporary vector to test.  */
   if (!tmp.is_empty ())
 {
-  vec n = bit_test_cluster::find_bit_tests (tmp);
+  vec n = jump_table_cluster::find_jump_tables (tmp);
   output2.safe_splice (n);
   n.release ();
 }
--
2.29.2



RE: [PATCH] aarch64: Do not alter force_reg returned register expanding fcmla

2020-11-09 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: 09 November 2020 10:04
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; nd 
> Subject: [PATCH] aarch64: Do not alter force_reg returned register
> expanding fcmla
> 
> Hi all,
> 
> this patch is to fix a force_reg returned rtx potentially modified in
> `aarch64_general_expand_builtin`.
> 
> Bootstrapped and reg-tested on aarch64-none-linux-gnu.
> 
> Okay for trunk?

Ok.
Thanks,
Kyrill

> 
> Thanks
> 
>   Andrea
> 
> 2020-11-06  Andrea Corallo  
> 
>   * config/aarch64/aarch64-builtins.c
>   (aarch64_expand_fcmla_builtin): Do not alter force_reg returned
>   register.



Re: [PATCH] x86: Adjust keylocker testcases for fail on darwin

2020-11-09 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 9, 2020 at 2:50 PM Hongyu Wang  wrote:

LGTM.

Thanks,
Uros.

> >
> > Please rewrite scan strings back to using double-quotation marks.
> >
>
> Yes, updated patch.
>
> Uros Bizjak wrote on Mon, Nov 9, 2020 at 7:41 PM:
>
> >
> > On Mon, Nov 9, 2020 at 11:50 AM Hongyu Wang  wrote:
> > >
> > > Hi
> > >
> > > According to the discussion in
> > > https://gcc.gnu.org/pipermail/gcc/2020-November/234096.html,
> > > the keylocker-* testcases are too strict for the darwin target. This
> > > patch adjusts the regexes and adds a missing test for the aesenc256kl
> > > instruction.
> > >
> > > Tested by Iain Sandoe; all tests pass on the darwin target.
> > >
> > > Ok for trunk?
> > >
> > > gcc/testsuite/ChangeLog
> > >
> > > * gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
> > > * gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
> > > * gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
> > > * gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
> > > * gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
> > > * gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
> > > * gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
> > > * gcc.target/i386/keylocker-encodekey128.c: Likewise.
> > > * gcc.target/i386/keylocker-encodekey256.c: Likewise.
> > > * gcc.target/i386/keylocker-aesenc256kl.c: New test.
> >
> > Please rewrite scan strings back to using double-quotation marks.
> >
> > Uros.
> >
> > >
> > > --
> > > Regards,
> > >
> > > Hongyu, Wang


[PATCH] tree-optimization/97746 - fix order of mask precision computes

2020-11-09 Thread Richard Biener
This fixes the order of walking PHIs and stmts for BB mask
precision compute.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-11-09  Richard Biener  

PR tree-optimization/97746
* tree-vect-patterns.c (vect_determine_precisions): First walk PHIs.

* gcc.dg/vect/bb-slp-pr97746.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr97746.c | 20 
 gcc/tree-vect-patterns.c   |  8 
 2 files changed, 24 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr97746.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr97746.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr97746.c
new file mode 100644
index 000..c5a615d1253
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr97746.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+int a, b;
+short c;
+
+extern void f (short*);
+
+void d()
+{
+  short e[2] = {0, 0};
+  while (a)
+{
+  f(e);
+  int g = 0 || a, h = 8 && c;
+  short i = c;
+  c = h & g;
+  if (b)
+   b = g || i;
+}
+}
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index eefa7cf6799..f68a87e05ed 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -5182,15 +5182,15 @@ vect_determine_precisions (vec_info *vinfo)
   for (unsigned i = 0; i < bb_vinfo->bbs.length (); ++i)
{
  basic_block bb = bb_vinfo->bbs[i];
- for (auto gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
+ for (auto gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next 
())
{
- stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (gsi));
+ stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi.phi ());
  if (stmt_info && STMT_VINFO_VECTORIZABLE (stmt_info))
vect_determine_mask_precision (vinfo, stmt_info);
}
- for (auto gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next 
())
+ for (auto gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
{
- stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi.phi ());
+ stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (gsi));
  if (stmt_info && STMT_VINFO_VECTORIZABLE (stmt_info))
vect_determine_mask_precision (vinfo, stmt_info);
}
-- 
2.26.2


[PATCH] tree-optimization/97753 - fix SLP induction vect

2020-11-09 Thread Richard Biener
This fixes updating of the step vectors when filling up to group_size.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-11-09  Richard Biener  

PR tree-optimization/97753
* tree-vect-loop.c (vectorizable_induction): Fill vec_steps
when CSEing inside the group.

* gcc.dg/vect/pr97753.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr97753.c | 10 ++
 gcc/tree-vect-loop.c|  7 +--
 2 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr97753.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr97753.c 
b/gcc/testsuite/gcc.dg/vect/pr97753.c
new file mode 100644
index 000..e49a8487631
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr97753.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+long a[6];
+void d(int c)
+{
+  for (; c; c++)
+for (int b = 0; b < 8; b++)
+  ((char *)[c])[b] = c;
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0ba37540d5d..977633a3ce3 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -8068,8 +8068,11 @@ vectorizable_induction (loop_vec_info loop_vinfo,
  nivs = least_common_multiple (group_size,
const_nunits) / const_nunits;
  for (; ivn < nivs; ++ivn)
-   SLP_TREE_VEC_STMTS (slp_node)
- .quick_push (SLP_TREE_VEC_STMTS (slp_node)[0]);
+   {
+ SLP_TREE_VEC_STMTS (slp_node)
+   .quick_push (SLP_TREE_VEC_STMTS (slp_node)[0]);
+ vec_steps.safe_push (vec_steps[0]);
+   }
}
 
   /* Re-use IVs when we can.  We are generating further vector
-- 
2.26.2


Re: [04/32] cpp lexer

2020-11-09 Thread Nathan Sidwell

Jeff,
here is an updated patch with changelog.  I've added checking_asserts 
and comments for the state changes you were concerned about.


While the __builtin_expects do make a change to generated code, you are 
probably right that they are not significant and I have removed them -- 
cpplib tends to sprinkle them liberally and I guess I got infected.


I trust this addresses your concerns.

nathan

--
Nathan Sidwell
	libcpp/
	* include/cpplib.h (struct cpp_options): Add module_directives
	option.
	(NODE_MODULE): New node flag.
	(cpp_hashnode): Make rid_code bitfield, increase flag bits.
	* internal.h (struct lexer_state): Add directive_file_token field.
	(struct spec_nodes): Add module enum, n_modules field.
	* init.c (post_options): Initialize modules fields of spec_nodes.
	* lex.c (cpp_maybe_module_directive): New.
	(_cpp_lex_token): Call it.
	(cpp_output_token): Add CPP_HEADER_NAME functionality.
	(do_peek_ident, do_peek_module): New.
	(cpp_directive_only_process): Add module control line detection.
	* macro.c (cpp_get_token_1): Add directive_file_token for module
	control lines.
	gcc/c-family/
	* c-lex.c (c_lex_with_flags): Accept CPP_HEADER_NAME.

diff --git c/libcpp/include/cpplib.h w/libcpp/include/cpplib.h
index c4d7cc520d1..eb82599dd22 100644
--- c/libcpp/include/cpplib.h
+++ w/libcpp/include/cpplib.h
@@ -487,6 +494,9 @@ struct cpp_options
   /* Nonzero for the '::' token.  */
   unsigned char scope;
 
+  /* Nonzero means tokenize C++20 module directives.  */
+  unsigned char module_directives;
+
   /* Holds the name of the target (execution) character set.  */
   const char *narrow_charset;
 
@@ -831,6 +857,7 @@ struct GTY(()) cpp_macro {
 #define NODE_USED	(1 << 5)	/* Dumped with -dU.  */
 #define NODE_CONDITIONAL (1 << 6)	/* Conditional macro */
 #define NODE_WARN_OPERATOR (1 << 7)	/* Warn about C++ named operator.  */
+#define NODE_MODULE (1 << 8)		/* C++-20 module-related name.  */
 
 /* Different flavors of hash node.  */
 enum node_type
@@ -888,11 +915,11 @@ struct GTY(()) cpp_hashnode {
   unsigned int directive_index : 7;	/* If is_directive,
 	   then index into directive table.
 	   Otherwise, a NODE_OPERATOR.  */
-  unsigned char rid_code;		/* Rid code - for front ends.  */
+  unsigned int rid_code : 8;		/* Rid code - for front ends.  */
+  unsigned int flags : 9;		/* CPP flags.  */
   ENUM_BITFIELD(node_type) type : 2;	/* CPP node type.  */
-  unsigned int flags : 8;		/* CPP flags.  */
 
-  /* 6 bits spare (plus another 32 on 64-bit hosts).  */
+  /* 5 bits spare (plus another 32 on 64-bit hosts).  */
 
   union _cpp_hashnode_value GTY ((desc ("%1.type"))) value;
 };
diff --git c/libcpp/internal.h w/libcpp/internal.h
index d7780e49d27..60a0c194a7d 100644
--- c/libcpp/internal.h
+++ w/libcpp/internal.h
@@ -280,6 +280,9 @@ struct lexer_state
   /* Nonzero when tokenizing a deferred pragma.  */
   unsigned char in_deferred_pragma;
 
+  /* Count to token that is a header-name.  */
+  unsigned char directive_file_token;
+
   /* Nonzero if the deferred pragma being handled allows macro expansion.  */
   unsigned char pragma_allow_expansion;
 };
@@ -292,6 +295,12 @@ struct spec_nodes
   cpp_hashnode *n_false;		/* C++ keyword false */
   cpp_hashnode *n__VA_ARGS__;		/* C99 vararg macros */
   cpp_hashnode *n__VA_OPT__;		/* C++ vararg macros */
+
+  enum {M_EXPORT, M_MODULE, M_IMPORT, M__IMPORT, M_HWM};
+  
+  /* C++20 modules, only set when module_directives is in effect.
+ incoming variants [0], outgoing ones [1] */
+  cpp_hashnode *n_modules[M_HWM][2];
 };
 
 typedef struct _cpp_line_note _cpp_line_note;
diff --git c/libcpp/init.c w/libcpp/init.c
index dcf1d4be587..2266fff17ff 100644
--- c/libcpp/init.c
+++ w/libcpp/init.c
@@ -841,4 +856,27 @@ post_options (cpp_reader *pfile)
   CPP_OPTION (pfile, trigraphs) = 0;
   CPP_OPTION (pfile, warn_trigraphs) = 0;
 }
+
+  if (CPP_OPTION (pfile, module_directives))
+{
+  /* These unspellable tokens have a leading space.  */
+  const char *const inits[spec_nodes::M_HWM]
+	= {"export ", "module ", "import ", "__import"};
+
+  for (int ix = 0; ix != spec_nodes::M_HWM; ix++)
+	{
+	  cpp_hashnode *node = cpp_lookup (pfile, UC (inits[ix]),
+	   strlen (inits[ix]));
+
+	  /* Token we pass to the compiler.  */
+	  pfile->spec_nodes.n_modules[ix][1] = node;
+
+	  if (ix != spec_nodes::M__IMPORT)
+	/* Token we recognize when lexing, drop the trailing ' '.  */
+	node = cpp_lookup (pfile, NODE_NAME (node), NODE_LEN (node) - 1);
+
+	  node->flags |= NODE_MODULE;
+	  pfile->spec_nodes.n_modules[ix][0] = node;
+	}
+}
 }
diff --git c/libcpp/lex.c w/libcpp/lex.c
index f58a8828124..b9c10b399dc 100644
--- c/libcpp/lex.c
+++ w/libcpp/lex.c
@@ -2615,6 +2622,151 @@ _cpp_temp_token (cpp_reader *pfile)
   return result;
 }
 
+/* We're at the beginning of a logical line (so not in
+  directives-mode) and RESULT is a CPP_NAME with NODE_MODULE set.  See
+  if we should enter deferred_pragma mode to tokenize the rest of 

Re: [PATCH] x86: Adjust keylocker testcases for fail on darwin

2020-11-09 Thread Hongyu Wang via Gcc-patches
>
> Please rewrite scan strings back to using double-quotation marks.
>

Yes, updated patch.

On Mon, Nov 9, 2020 at 7:41 PM, Uros Bizjak wrote:

>
> On Mon, Nov 9, 2020 at 11:50 AM Hongyu Wang  wrote:
> >
> > Hi
> >
> > According to the discussion in
> > https://gcc.gnu.org/pipermail/gcc/2020-November/234096.html,
> > the keylocker-* testcases are too strict for the darwin target. This
> > patch adjusts the regexes and adds a missing test for the aesenc256kl
> > instruction.
> >
> > Tested by Iain Sandoe; all tests pass on the darwin target.
> >
> > Ok for trunk?
> >
> > gcc/testsuite/ChangeLog
> >
> > * gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
> > * gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
> > * gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
> > * gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
> > * gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
> > * gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
> > * gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
> > * gcc.target/i386/keylocker-encodekey128.c: Likewise.
> > * gcc.target/i386/keylocker-encodekey256.c: Likewise.
> > * gcc.target/i386/keylocker-aesenc256kl.c: New test.
>
> Please rewrite scan strings back to using double-quotation marks.
>
> Uros.
>
> >
> > --
> > Regards,
> >
> > Hongyu, Wang
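
For readers who want to see the effect of the relaxed pattern, here is a minimal sketch. It uses std::regex rather than the Tcl regexes DejaGnu actually runs, and the two assembler lines are assumptions made up for illustration (one names the symbol directly, one goes through a register); they are not taken from real compiler output.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Old-style pattern: the symbol name "k2" must appear on the movdqa line.
inline bool matches_strict(const std::string &line)
{
  static const std::regex re("movdqa[ \t]+[^\n]*k2[^\n\r]*%xmm0");
  return std::regex_search(line, re);
}

// Relaxed pattern: any source operand is fine as long as the
// destination is %xmm0.
inline bool matches_relaxed(const std::string &line)
{
  static const std::regex re("movdqa[ \t]+[^\n\r]*, %xmm0");
  return std::regex_search(line, re);
}

// Usage: both lines below are hypothetical examples.
inline void keylocker_regex_demo()
{
  const std::string direct = "\tmovdqa\tk2(%rip), %xmm0";
  const std::string indirect = "\tmovdqa\t(%rax), %xmm0";

  assert(matches_strict(direct));
  assert(!matches_strict(indirect));  // symbol name not on this line
  assert(matches_relaxed(direct));
  assert(matches_relaxed(indirect));
}
```

The relaxed form no longer checks which symbol is loaded, only that the instruction appears with the expected destination register.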
From 826a48e5d08b2ad6865ef92c0965f095cad3d654 Mon Sep 17 00:00:00 2001
From: hongyuw1 
Date: Fri, 6 Nov 2020 15:08:10 +0800
Subject: [PATCH] Adjust Keylocker regex pattern for darwin, and add missing
 aesenc256kl test.

gcc/testsuite/ChangeLog

	* gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
	* gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
	* gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
	* gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
	* gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
	* gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
	* gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
	* gcc.target/i386/keylocker-encodekey128.c: Likewise.
	* gcc.target/i386/keylocker-encodekey256.c: Likewise.
	* gcc.target/i386/keylocker-aesenc256kl.c: New test.
---
 .../gcc.target/i386/keylocker-aesdec128kl.c   |  6 ++--
 .../gcc.target/i386/keylocker-aesdec256kl.c   |  6 ++--
 .../i386/keylocker-aesdecwide128kl.c  | 34 +--
 .../i386/keylocker-aesdecwide256kl.c  | 34 +--
 .../gcc.target/i386/keylocker-aesenc128kl.c   |  6 ++--
 .../gcc.target/i386/keylocker-aesenc256kl.c   | 17 ++
 .../i386/keylocker-aesencwide128kl.c  | 34 +--
 .../i386/keylocker-aesencwide256kl.c  | 34 +--
 .../gcc.target/i386/keylocker-encodekey128.c  | 14 
 .../gcc.target/i386/keylocker-encodekey256.c  | 18 +-
 10 files changed, 110 insertions(+), 93 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/keylocker-aesenc256kl.c

diff --git a/gcc/testsuite/gcc.target/i386/keylocker-aesdec128kl.c b/gcc/testsuite/gcc.target/i386/keylocker-aesdec128kl.c
index 3cdda8ed7b0..d134612beea 100644
--- a/gcc/testsuite/gcc.target/i386/keylocker-aesdec128kl.c
+++ b/gcc/testsuite/gcc.target/i386/keylocker-aesdec128kl.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-mkl -O2" } */
-/* { dg-final { scan-assembler "movdqa\[ \\t\]+\[^\n\]*k2\[^\n\r]*%xmm0" } } */
-/* { dg-final { scan-assembler "aesdec128kl\[ \\t\]+\[^\n\]*h1\[^\n\r]*%xmm0" } } */
+/* { dg-final { scan-assembler "movdqa\[ \\t\]+\[^\\n\\r\]*, %xmm0" } } */
+/* { dg-final { scan-assembler "aesdec128kl\[ \\t\]+\[^\\n\\r\]*, %xmm0" } } */
 /* { dg-final { scan-assembler "sete" } } */
-/* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\n\]*%xmm0\[^\n\r]*k1" } } */
+/* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\\r\]*%xmm0,\[^\\n\\r\]*" } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/i386/keylocker-aesdec256kl.c b/gcc/testsuite/gcc.target/i386/keylocker-aesdec256kl.c
index 70b2c6357fa..34736d2d61a 100644
--- a/gcc/testsuite/gcc.target/i386/keylocker-aesdec256kl.c
+++ b/gcc/testsuite/gcc.target/i386/keylocker-aesdec256kl.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-mkl -O2" } */
-/* { dg-final { scan-assembler "movdqa\[ \\t\]+\[^\n\]*k2\[^\n\r]*%xmm0" } } */
-/* { dg-final { scan-assembler "aesdec256kl\[ \\t\]+\[^\n\]*h1\[^\n\r]*%xmm0" } } */
+/* { dg-final { scan-assembler "movdqa\[ \\t\]+\[^\\n\\r\]*, %xmm0" } } */
+/* { dg-final { scan-assembler "aesdec256kl\[ \\t\]+\[^\\n\\r\]*, %xmm0" } } */
 /* { dg-final { scan-assembler "sete" } } */
-/* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\n\]*%xmm0\[^\n\r]*k1" } } */
+/* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\\r\]*%xmm0,\[^\\n\\r\]*" } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/i386/keylocker-aesdecwide128kl.c b/gcc/testsuite/gcc.target/i386/keylocker-aesdecwide128kl.c
index f2806891bff..d23cf4b6517 100644
--- 

Re: [PATCH][Arm] Auto-vectorization for MVE: vsub

2020-11-09 Thread Christophe Lyon via Gcc-patches
Hi,


On Fri, 23 Oct 2020 at 10:02, Dennis Zhang via Gcc-patches
 wrote:
>
> Hi Kyrylo,
>
> > 
> > From: Kyrylo Tkachov 
> > Sent: Thursday, October 22, 2020 9:40 AM
> > To: Dennis Zhang; gcc-patches@gcc.gnu.org
> > Cc: nd; Richard Earnshaw; Ramana Radhakrishnan
> > Subject: RE: [PATCH][Arm] Auto-vectorization for MVE: vsub
> >
> > Hi Dennis,
> >
> > > -Original Message-
> > > From: Dennis Zhang 
> > > Sent: 06 October 2020 17:47
> > > To: gcc-patches@gcc.gnu.org
> > > Cc: Kyrylo Tkachov ; nd ;
> > > Richard Earnshaw ; Ramana Radhakrishnan
> > > 
> > > Subject: Re: [PATCH][Arm] Auto-vectorization for MVE: vsub
> > >
> > > Hi all,
> > >
> > > On 8/17/20 6:41 PM, Dennis Zhang wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This patch enables MVE vsub instructions for auto-vectorization.
> > > > It adds RTL templates for MVE vsub instructions using 'minus' instead of
> > > > unspec expression to make the instructions recognizable for 
> > > > vectorization.
> > > > MVE target is added in sub3 optab. The sub3 optab is
> > > > modified to use a mode iterator that selects available modes for various
> > > > targets correspondingly.
> > > > MVE vector modes are enabled in arm_preferred_simd_mode in arm.c to
> > > > support vectorization.
> > > >
> > > > This patch also fixes 'vreinterpretq_*.c' MVE intrinsic tests. The tests
> > > > generate wrong instruction numbers because of unexpected icf
> > > optimization.
> > > > This bug is exposed by the MVE vector modes enabled in this patch,
> > > > therefore it is corrected in this patch to avoid test failures.
> > > >
> > > > MVE instructions are documented here:
> > > > https://developer.arm.com/architectures/instruction-sets/simd-
> > > isas/helium/helium-intrinsics
> > > >
> > > > The patch is regtested for arm-none-eabi and bootstrapped for
> > > > arm-none-linux-gnueabihf.
> > > >
> > > > Is it OK for trunk please?
> > > >
> > > > Thanks
> > > > Dennis
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2020-08-10  Dennis Zhang  
> > > >
> > > > * config/arm/arm.c (arm_preferred_simd_mode): Enable MVE vector
> > > modes.
> > > > * config/arm/arm.h (TARGET_NEON_IWMMXT): New macro.
> > > > (TARGET_NEON_IWMMXT_MVE, TARGET_NEON_IWMMXT_MVE_FP):
> > > Likewise.
> > > > (TARGET_NEON_MVE_HFP): Likewise.
> > > > * config/arm/iterators.md (VSEL): New mode iterator to select modes
> > > > for corresponding targets.
> > > > * config/arm/mve.md (mve_vsubq): New entry for vsub instruction
> > > > using expression 'minus'.
> > > > (mve_vsubq_f): Use minus instead of VSUBQ_F unspec.
> > > > * config/arm/neon.md (sub3): Removed here. Integrated in the
> > > > sub3 in vec-common.md
> > > > * config/arm/vec-common.md (sub3): Enable MVE target. Use
> > > VSEL
> > > > to select available modes. Exclude TARGET_NEON_FP16INST from
> > > > > TARGET_NEON statement. Integrate TARGET_NEON_FP16INST which is
> > > > originally in neon.md.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > 2020-08-10  Dennis Zhang  
> > > >
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_f16.c: Use additional
> > > > option -fno-ipa-icf and change the instruction count from 8 to 16.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_f32.c: Likewise.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_s16.c: Likewise.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_s32.c: Likewise.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_s64.c: Likewise.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_s8.c: Likewise.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_u16.c: Likewise.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_u32.c: Likewise.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_u64.c: Likewise.
> > > > * gcc.target/arm/mve/intrinsics/vreinterpretq_u8.c: Likewise.
> > > > * gcc.target/arm/mve/mve.exp: Include tests in subdir 'vect'.
> > > > * gcc.target/arm/mve/vect/vect_sub_0.c: New test.
> > > > * gcc.target/arm/mve/vect/vect_sub_1.c: New test.
> > > >
> > >
> > > This patch is updated based on Richard Sandiford's patch adding new
> > > vector mode macros:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553425.html
> > > The old version of this patch is at
> > > https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552104.html
> > > And a less related part in the old version is separated into another
> > > patch: https://gcc.gnu.org/pipermail/gcc-patches/2020-
> > > September/554100.html
> > >
> > > This patch enables MVE vsub instructions for auto-vectorization.
> > > It adds insns for MVE vsub instructions using 'minus' instead of unspec
> > > expression to make the instructions recognizable for auto-vectorization.
> > > The sub3 in mve.md is modified to use new mode macros which
> > > make
> > > the expander available when certain modes are supported. Then various
> > > targets can share this expander for vectorization. The redundant
> > > sub3 insns in neon.md are then removed.
> > >
> > > 

Detect EAF flags in ipa-modref

2020-11-09 Thread Jan Hubicka
> > 
> > Yep, I am not arguing for eliminating the special case of memcpy (because we
> > have the additional info that it only copies pointers from *src to
> > *dest).
> > 
> > However I find current definition of EAF_NOESCAPE bit hard to handle in
> > modref, since naturally it is quite reliable to track all uses of ssa
> > name that correspond to parameters, but it is harder to track where
> > values read from pointed-to memory can eventually go.
> 
> Yeah - also the fnspec of memcpy _is_ wrong at the moment ...

Yep.
> > 
> > For 6) I determine flags of LHS and merge them in
> 
> guess you could track a SSA lattice "based on parameter N" which
> would need to be a mask (thus only track up to 32 parameters?)

Yes, I can do that if we relax NOESCAPE this way.
> 
> > For 7) I clear NOESCAPE if rhs is the name itself
> > and UNUSED + NOESCAPE if rhs is a dereference of the name.
> > 
> > For 8) I do nothing.  Here the names are non-pointers that I track
> > because of earlier dereference.
> > 
> > 
> > 
> > So I think 7) can be relaxed.  The main problem, however, is that we often
> > see 1) and then 3) or 7) on the LHS, which makes us punt very often.
> > 
> > The fact that the pointer itself does not escape while the pointed-to
> > memory can still seems very useful, since one does not need to add *ptr to
> > points-to sets. But I will try relaxing 7).
> > 
> > If we allow values escaping to other parameters and itself, I think I
> > can relax 3) if base of the store is default def of PARM_DECL.
> 
> I think the important part is to get things correct.  Maybe it's worth

Indeed :)
> to add write/read flags where the argument _does_ escape in case the
> function itself is otherwise pure/const.  For PTA that doesn't make
> a difference (and fnspec was all about PTA ...) but for alias-analysis
> it does.

I detect them independently (UNUSED/NOCLOBBER flags which is not perfect
since we do not have OUTPUT flag like in fnspecs), but currently they
are unused since we do not track "pure/const except for known
exceptions".  This is not hard to add.
> 
> > > 
> > > I wonder if we should teach the GIMPLE FE to parse 'fn spec'
> > > so we can write unit tests for the attribute ... or maybe simply
> > > add this to the __GIMPLE spec string.
> > 
> > That would be nice. We should also describe carefully that NOESCAPE and
> > NOCLOBBER also refer to indirect references.  Current description
> > "Nonzero if the argument does not escape."
> > reads to me that it is about ptr itself, not about *ptr and also it does
> > not speak of the escaping to return value etc.
> 
> Well, if 'ptr' escapes then obviously all memory reachable from 'ptr'
> escapes - escaping is always transitive.

Yes, but if values pointed to by ptr escape, ptr itself does not need
to escape.  This is easy to detect (and is common case of THIS pointer)
but we have no way to express it via EAF flags.
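
To make the distinction concrete, here is a small standalone sketch. All names are hypothetical and it says nothing about which flags modref would actually compute for these functions; it only shows the two situations being contrasted.

```cpp
#include <cassert>

// Hypothetical sinks, used only for illustration.
static int *g_saved_value;   // receives a value read through the parameter
static int **g_saved_ptr;    // receives the parameter itself

// The value read from *pp is stored globally, so memory reachable
// through pp escapes, but pp itself is neither stored nor returned.
inline void pointee_escapes(int **pp) { g_saved_value = *pp; }

// Here the parameter itself is stored, so pp escapes directly.
inline void pointer_escapes(int **pp) { g_saved_ptr = pp; }

// The "this" case: a getter hands out a value read via this,
// while the this pointer itself goes nowhere.
struct Holder
{
  int *data;
  int *get() const { return data; }
};

inline void eaf_escape_demo()
{
  int x = 1;
  int *p = &x;

  pointee_escapes(&p);
  assert(g_saved_value == &x);   // the pointed-to value got out

  pointer_escapes(&p);
  assert(g_saved_ptr == &p);     // the pointer itself got out

  Holder h{&x};
  assert(h.get() == &x);
}
```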
> 
> And as escaping is in the context of the caller sth escaping to the
> caller itself (via return) can hardly be considered escaping (again
> this was designed for PTA ...).
> 
> I guess it makes sense to be able to separate escaping from the rest.
I think current definition (escaping via return is not an escape) is
also OK for modref propagation.  We may have EAF_NORETURN that says that
value never escapes to return value that would be also easy to detect.

This is kind of minimal patch for the EAF flags discovery.  It works
only in local ipa-modref and gives up on cyclic SSA graphs.  Adding
propagation is easy and proper IPA mode needs collecting call sites+arg
pairs that affect the answer.

It passes the testsuite except for the sso/t2.c testcase, where it affects a bit of the first
dumped structure R1.
We correctly determine that it is noescape/noclobber from the dump function and
it seems that this triggers kind of strange SRA.  I will look into it.

I am running full bootstrap/regtest.

On tramp3d the effect is not great, but it does something.

PTA query stats:
  pt_solution_includes: 397269 disambiguations, 606922 queries
  pt_solutions_intersect: 138302 disambiguations, 416884 queries

to

PTA query stats:
  pt_solution_includes: 401540 disambiguations, 609093 queries
  pt_solutions_intersect: 138616 disambiguations, 417174 queries
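
As a quick way to read these numbers, the sketch below turns each line into a disambiguation rate. The helper names are made up for illustration. Plugging in the figures above gives roughly 65.5% before and 65.9% after for pt_solution_includes (4271 more disambiguations), and about 33.2% in both cases for pt_solutions_intersect (314 more).

```cpp
#include <cassert>
#include <cstdio>

// Fraction of PTA queries that ended in a disambiguation.
inline double disambiguation_rate(long disambiguations, long queries)
{
  assert(queries > 0);
  return static_cast<double>(disambiguations) / static_cast<double>(queries);
}

// Print the before/after rates using the tramp3d figures quoted above.
inline void print_pta_rates()
{
  std::printf("includes:  %.2f%% -> %.2f%%\n",
              100.0 * disambiguation_rate(397269, 606922),
              100.0 * disambiguation_rate(401540, 609093));
  std::printf("intersect: %.2f%% -> %.2f%%\n",
              100.0 * disambiguation_rate(138302, 416884),
              100.0 * disambiguation_rate(138616, 417174));
}
```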

2020-11-09  Jan Hubicka  

* builtins.c (builtin_fnspec): Fix fnspecs of memcpy and friends.
* gimple.c: Include ipa-modref-tree.h and ipa-modref.h.
(gimple_call_arg_flags): Use modref to determine flags.
* ipa-modref.c: Include gimple-ssa.h, tree-phinodes.h,
tree-ssa-operands.h, stringpool.h and tree-ssanames.h.
(modref_summary::useful_p): Summary is also useful if EAF flags are
known.
(dump_eaf_flags): New.
(modref_summary::dump): Dump EAF flags.
(get_modref_function_summary): Be ready for
current_function_decl == NULL.
(memory_access_to): New function.
(deref_flags): New function.
(analyze_ssa_name_flags): New function.
(analyze_parms): New function.
   

c++: ADL refactor

2020-11-09 Thread Nathan Sidwell

Jason, this might be relevant to using enum, not sure.

This refactors the ADL lookup.  It just so happens the refactoring
makes dropping modules in simpler :) We break apart the namespace and
class fn processing, and move scope iteration to an outer function.
It'll also become possible to find the same enum in multiple places, so
we need to handle that idempotently.
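
For context, here is a small user-level sketch of the two lookup paths being separated: namespace-scope functions found through an enum's namespace, and hidden friends found through an associated class (including the class of a class-scope enum). The names are made up and this is ordinary C++, not the front-end code.

```cpp
#include <cassert>

namespace lib {
  enum Color { Red, Green };

  // Namespace-scope function: found by ADL because lib is the
  // innermost enclosing namespace of the enum argument's type.
  inline int describe(Color c) { return c == Red ? 1 : 2; }

  struct Widget
  {
    enum Kind { Small, Large };   // class-scope enum

    // Hidden friends: only reachable through ADL with Widget as an
    // associated class, which a Widget::Kind argument provides.
    friend int describe(Kind k) { return k == Small ? 10 : 20; }
    friend int describe(const Widget &) { return 100; }
  };
}

// Unqualified calls from outside lib; ADL does all the work.
inline int call_with_namespace_enum() { return describe(lib::Red); }
inline int call_with_class_enum() { return describe(lib::Widget::Large); }
inline int call_with_class() { return describe(lib::Widget{}); }

inline void adl_demo()
{
  assert(call_with_namespace_enum() == 1);
  assert(call_with_class_enum() == 20);
  assert(call_with_class() == 100);
}
```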

gcc/cp/
* cp-tree.h (LOOKUP_FOUND_P): Add ENUMERAL_TYPE.
* name-lookup.c (class name_lookup): Add comments.
(name_lookup::adl_namespace_only): Replace with ...
(name_lookup::adl_class_fns): ... this and ...
(name_lookup::adl_namespace_fns): ... this.
(name_lookup::adl_namespace): Deal with inline nests here.
(name_lookup::adl_class): Complete the type here.
(name_lookup::adl_type): Call broken-out enum ..
(name_lookup::adl_enum): New.  No need to call the namespace adl
if it is class-scope.
(name_lookup::search_adl): Iterate over collected scopes here.

pushing to trunk

nathan
--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 052291c40fe..081373076b9 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -487,7 +487,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
   DECL_TINFO_P (in VAR_DECL)
   FUNCTION_REF_QUALIFIED (in FUNCTION_TYPE, METHOD_TYPE)
   OVL_LOOKUP_P (in OVERLOAD)
-  LOOKUP_FOUND_P (in RECORD_TYPE, UNION_TYPE, NAMESPACE_DECL)
+  LOOKUP_FOUND_P (in RECORD_TYPE, UNION_TYPE, ENUMERAL_TYPE, NAMESPACE_DECL)
5: IDENTIFIER_VIRTUAL_P (in IDENTIFIER_NODE)
   FUNCTION_RVALUE_QUALIFIED (in FUNCTION_TYPE, METHOD_TYPE)
   CALL_EXPR_REVERSE_ARGS (in CALL_EXPR, AGGR_INIT_EXPR)
@@ -745,9 +745,10 @@ typedef struct ptrmem_cst * ptrmem_cst_t;
 && flag_hosted)
 
 /* Lookup walker marking.  */
-#define LOOKUP_SEEN_P(NODE) TREE_VISITED(NODE)
+#define LOOKUP_SEEN_P(NODE) TREE_VISITED (NODE)
 #define LOOKUP_FOUND_P(NODE) \
-  TREE_LANG_FLAG_4 (TREE_CHECK3(NODE,RECORD_TYPE,UNION_TYPE,NAMESPACE_DECL))
+  TREE_LANG_FLAG_4 (TREE_CHECK4 (NODE,RECORD_TYPE,UNION_TYPE,ENUMERAL_TYPE,\
+ NAMESPACE_DECL))
 
 /* These two accessors should only be used by OVL manipulators.
Other users should use iterators and convenience functions.  */
diff --git i/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index 16efd161301..410ec595c82 100644
--- i/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -171,8 +171,13 @@ public:
 
 public:
   tree name;	/* The identifier being looked for.  */
+
+  /* Usually we just add things to the VALUE binding, but we record
+ (hidden) IMPLICIT_TYPEDEFs on the type binding, which is used for
+ using-decl resolution.  */
   tree value;	/* A (possibly ambiguous) set of things found.  */
   tree type;	/* A type that has been found.  */
+
   LOOK_want want;  /* What kind of entity we want.  */
 
   bool deduping; /* Full deduping is needed because using declarations
@@ -263,14 +268,17 @@ private:
 private:
   void add_fns (tree);
 
+ private:
   void adl_expr (tree);
   void adl_type (tree);
   void adl_template_arg (tree);
   void adl_class (tree);
+  void adl_enum (tree);
   void adl_bases (tree);
   void adl_class_only (tree);
   void adl_namespace (tree);
-  void adl_namespace_only (tree);
+  void adl_class_fns (tree);
+  void adl_namespace_fns (tree);
 
 public:
   /* Search namespace + inlines + maybe usings as qualified lookup.  */
@@ -433,8 +441,8 @@ name_lookup::add_overload (tree fns)
   if (probe && TREE_CODE (probe) == OVERLOAD
 	  && OVL_DEDUP_P (probe))
 	{
-	  /* We're about to add something found by a using
-	 declaration, so need to engage deduping mode.  */
+	  /* We're about to add something found by multiple paths, so
+	 need to engage deduping mode.  */
 	  lookup_mark (value, true);
 	  deduping = true;
 	}
@@ -777,20 +785,56 @@ name_lookup::add_fns (tree fns)
   add_overload (fns);
 }
 
-/* Add functions of a namespace to the lookup structure.  */
+/* Add the overloaded fns of SCOPE.  */
 
 void
-name_lookup::adl_namespace_only (tree scope)
+name_lookup::adl_namespace_fns (tree scope)
 {
-  mark_seen (scope);
+  if (tree *binding = find_namespace_slot (scope, name))
+{
+  tree val = *binding;
+  add_fns (ovl_skip_hidden (MAYBE_STAT_DECL (val)));
+}
+}
 
-  /* Look down into inline namespaces.  */
-  if (vec *inlinees = DECL_NAMESPACE_INLINEES (scope))
-for (unsigned ix = inlinees->length (); ix--;)
-  adl_namespace_only ((*inlinees)[ix]);
+/* Add the hidden friends of SCOPE.  */
+
+void
+name_lookup::adl_class_fns (tree type)
+{
+  /* Add friends.  */
+  for (tree list = DECL_FRIENDLIST (TYPE_MAIN_DECL (type));
+   list; list = TREE_CHAIN (list))
+if (name == FRIEND_NAME (list))
+  {
+	tree context = NULL_TREE; /* Lazily computed.  */
+	for (tree friends = FRIEND_DECLS (list); friends;
+	 friends = TREE_CHAIN (friends))
+	  {
+	tree fn = TREE_VALUE (friends);
 
-  if (tree fns = 

Re: [PATCH] 2/2 Remove debug/array

2020-11-09 Thread Jonathan Wakely via Gcc-patches

On 08/11/20 15:27 +0100, François Dumont via Libstdc++ wrote:
Following a recent fix on std::array this test started to fail in 
_GLIBCXX_DEBUG mode.


FAIL: 23_containers/array/comparison_operators/96851.cc (test for 
excess errors)


Rather than fixing it and now that __glibcxx_assert is constexpr 
compatible I would like to propose to simply remove 
__gnu_debug::array.


The only code we are losing with this change are the 
_Array_check_nonempty/_Array_check_subscript types. I am not sure 
about the purpose of this code as I saw no impact on tests. Maybe it 
was to avoid an assertion in constexpr where the value of the expression 
is not used, but there is a test doing that and it does produce an 
assertion.


Note that I am also moving std::array in versioned namespace. It is 
just for consistency so no problem to remove it.


I also manually edited include/Makefile.in cause I do not have the 
proper autoreconf version. Can you regenerate it on your side once 
patch is in ?


    libstdc++: Remove 

    Add _GLIBCXX_ASSERTIONS assert in normal std::array and remove 
__gnu_debug::array

    implementation.

    libstdc++-v3/ChangeLog:

            * include/debug/array: Remove.
            * include/Makefile.am: Remove .
            * include/Makefile.in: Regenerate.
            * include/experimental/functional: Adapt.
            * include/std/array: Move to _GLIBCXX_INLINE_VERSION 
namespace.
            * include/std/functional: Adapt.
            * include/std/span: Adapt.
            * testsuite/23_containers/array/debug/back1_neg.cc:
            Remove dg-require-debug-mode. Add -D_GLIBCXX_ASSERTIONS 
option.
            * testsuite/23_containers/array/debug/back2_neg.cc: 
Likewise.
            * testsuite/23_containers/array/debug/front1_neg.cc: 
Likewise.
            * testsuite/23_containers/array/debug/front2_neg.cc: 
Likewise.
            * 
testsuite/23_containers/array/debug/square_brackets_operator1_neg.cc:

            Likewise.
            * 
testsuite/23_containers/array/debug/square_brackets_operator2_neg.cc:

            Likewise.
            * testsuite/23_containers/array/element_access/60497.cc
            * 
testsuite/23_containers/array/tuple_interface/get_debug_neg.cc:

            Remove.
            * 
testsuite/23_containers/array/tuple_interface/get_neg.cc
            * 
testsuite/23_containers/array/tuple_interface/tuple_element_debug_neg.cc
            * 
testsuite/23_containers/array/tuple_interface/tuple_element_neg.cc


Tested under Linux x86_64 normal and debug modes.

Ok to commit ?


Yes, this is a nice simplification, thanks.




[PATCH] Use a per-edge PRE PHI translation cache

2020-11-09 Thread Richard Biener
This changes the phi translation cache to be per edge which
pushes it off the profiling radar.  For larger testcases the
combined hashtable causes a lot of cache misses, and making it
per edge allows the entry to shrink further.
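
The idea can be modelled outside GCC roughly as follows. This is only a sketch using std::unordered_map with made-up type names, not the hash_table code in tree-ssa-pre.c: the combined table needs the predecessor in every key, while the split tables are keyed by expression ID alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Combined cache: one big table, keyed by (predecessor, expression id).
struct CombinedCache
{
  std::unordered_map<std::uint64_t, unsigned> table;

  static std::uint64_t key(unsigned pred, unsigned expr)
  { return (static_cast<std::uint64_t>(pred) << 32) | expr; }

  void add(unsigned pred, unsigned expr, unsigned value)
  { table[key(pred, expr)] = value; }

  const unsigned *lookup(unsigned pred, unsigned expr) const
  {
    auto it = table.find(key(pred, expr));
    return it == table.end() ? nullptr : &it->second;
  }
};

// Split cache: one small table per predecessor, keyed by expression id
// only, so the predecessor no longer has to be stored or hashed.
struct SplitCache
{
  std::vector<std::unordered_map<unsigned, unsigned>> tables;

  explicit SplitCache(std::size_t n_preds) : tables(n_preds) {}

  void add(unsigned pred, unsigned expr, unsigned value)
  {
    assert(pred < tables.size());
    tables[pred][expr] = value;
  }

  const unsigned *lookup(unsigned pred, unsigned expr) const
  {
    assert(pred < tables.size());
    auto it = tables[pred].find(expr);
    return it == tables[pred].end() ? nullptr : &it->second;
  }
};
```

Both variants answer the same queries; the split form just moves the predecessor out of the entry and into the choice of table.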

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-11-09  Richard Biener  

PR tree-optimization/97765
* tree-ssa-pre.c (bb_bitmap_sets::phi_translate_table): Add.
(PHI_TRANS_TABLE): New macro.
(phi_translate_table): Remove.
(expr_pred_trans_d::pred): Remove.
(expr_pred_trans_d::hash): Simplify.
(expr_pred_trans_d::equal): Likewise.
(phi_trans_add): Adjust.
(phi_translate): Likewise.  Remove hash-table expansion
detection and optimization.
(phi_translate_set): Allocate PHI_TRANS_TABLE here.
(init_pre): Adjust.
(fini_pre): Free PHI_TRANS_TABLE.
---
 gcc/tree-ssa-pre.c | 166 ++---
 1 file changed, 81 insertions(+), 85 deletions(-)

diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 3496891f8b5..79bb9e2d712 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -448,63 +448,6 @@ static vec value_expressions;
value, one of kind CONSTANT.  */
 static vec constant_value_expressions;
 
-/* Sets that we need to keep track of.  */
-typedef struct bb_bitmap_sets
-{
-  /* The EXP_GEN set, which represents expressions/values generated in
- a basic block.  */
-  bitmap_set_t exp_gen;
-
-  /* The PHI_GEN set, which represents PHI results generated in a
- basic block.  */
-  bitmap_set_t phi_gen;
-
-  /* The TMP_GEN set, which represents results/temporaries generated
- in a basic block. IE the LHS of an expression.  */
-  bitmap_set_t tmp_gen;
-
-  /* The AVAIL_OUT set, which represents which values are available in
- a given basic block.  */
-  bitmap_set_t avail_out;
-
-  /* The ANTIC_IN set, which represents which values are anticipatable
- in a given basic block.  */
-  bitmap_set_t antic_in;
-
-  /* The PA_IN set, which represents which values are
- partially anticipatable in a given basic block.  */
-  bitmap_set_t pa_in;
-
-  /* The NEW_SETS set, which is used during insertion to augment the
- AVAIL_OUT set of blocks with the new insertions performed during
- the current iteration.  */
-  bitmap_set_t new_sets;
-
-  /* A cache for value_dies_in_block_x.  */
-  bitmap expr_dies;
-
-  /* The live virtual operand on successor edges.  */
-  tree vop_on_exit;
-
-  /* True if we have visited this block during ANTIC calculation.  */
-  unsigned int visited : 1;
-
-  /* True when the block contains a call that might not return.  */
-  unsigned int contains_may_not_return_call : 1;
-} *bb_value_sets_t;
-
-#define EXP_GEN(BB)((bb_value_sets_t) ((BB)->aux))->exp_gen
-#define PHI_GEN(BB)((bb_value_sets_t) ((BB)->aux))->phi_gen
-#define TMP_GEN(BB)((bb_value_sets_t) ((BB)->aux))->tmp_gen
-#define AVAIL_OUT(BB)  ((bb_value_sets_t) ((BB)->aux))->avail_out
-#define ANTIC_IN(BB)   ((bb_value_sets_t) ((BB)->aux))->antic_in
-#define PA_IN(BB)  ((bb_value_sets_t) ((BB)->aux))->pa_in
-#define NEW_SETS(BB)   ((bb_value_sets_t) ((BB)->aux))->new_sets
-#define EXPR_DIES(BB)  ((bb_value_sets_t) ((BB)->aux))->expr_dies
-#define BB_VISITED(BB) ((bb_value_sets_t) ((BB)->aux))->visited
-#define BB_MAY_NOTRETURN(BB) ((bb_value_sets_t) 
((BB)->aux))->contains_may_not_return_call
-#define BB_LIVE_VOP_ON_EXIT(BB) ((bb_value_sets_t) ((BB)->aux))->vop_on_exit
-
 
 /* This structure is used to keep track of statistics on what
optimization PRE was able to perform.  */
@@ -553,9 +496,6 @@ typedef struct expr_pred_trans_d : public typed_noop_remove 

   /* The expression ID.  */
   unsigned e;
 
-  /* The predecessor block index along which we translated the expression.  */
-  int pred;
-
   /* The value expression ID that resulted from the translation.  */
   unsigned v;
 
@@ -597,27 +537,77 @@ expr_pred_trans_d::mark_deleted (expr_pred_trans_d )
 inline hashval_t
 expr_pred_trans_d::hash (const expr_pred_trans_d )
 {
-  return iterative_hash_hashval_t (e.e, e.pred);
+  return e.e;
 }
 
 inline int
 expr_pred_trans_d::equal (const expr_pred_trans_d ,
  const expr_pred_trans_d )
 {
-  int b1 = ve1.pred;
-  int b2 = ve2.pred;
-
-  /* If they are not translations for the same basic block, they can't
- be equal.  */
-  if (b1 != b2)
-return false;
-
   return ve1.e == ve2.e;
 }
 
-/* The phi_translate_table caches phi translations for a given
-   expression and predecessor.  */
-static hash_table *phi_translate_table;
+/* Sets that we need to keep track of.  */
+typedef struct bb_bitmap_sets
+{
+  /* The EXP_GEN set, which represents expressions/values generated in
+ a basic block.  */
+  bitmap_set_t exp_gen;
+
+  /* The PHI_GEN set, which represents PHI results generated in a
+ basic block.  */
+  bitmap_set_t phi_gen;
+
+  /* The TMP_GEN set, which represents 

[PATCH] CSE VN_INFO calls in PRE and VN

2020-11-09 Thread Richard Biener
The following CSEs VN_INFO calls which nowadays are hashtable queries.
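
In miniature, the transformation looks like the sketch below. The Info and InfoTable types are hypothetical stand-ins, not the vn_ssa_aux machinery; a counter makes the saved lookups visible.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical per-name record, loosely modelled on the fields touched
// in the patch.
struct Info
{
  unsigned value_id = 0;
  int valnum = 0;
  bool needs_insertion = false;
};

// Hypothetical table whose accessor is a hash lookup, with a counter.
struct InfoTable
{
  std::unordered_map<int, Info> map;
  unsigned lookups = 0;

  Info *get(int name)
  {
    ++lookups;
    return &map[name];
  }
};

// Before: every field access repeats the hash lookup.
inline void set_repeated(InfoTable &t, int name, unsigned id)
{
  t.get(name)->value_id = id;
  t.get(name)->valnum = name;
  t.get(name)->needs_insertion = true;
}

// After: look up once and reuse the pointer.
inline void set_cached(InfoTable &t, int name, unsigned id)
{
  Info *info = t.get(name);
  info->value_id = id;
  info->valnum = name;
  info->needs_insertion = true;
}

inline void vn_info_cse_demo()
{
  InfoTable a, b;
  set_repeated(a, 1, 5);
  set_cached(b, 1, 5);
  assert(a.lookups == 3);
  assert(b.lookups == 1);
  assert(a.map[1].value_id == b.map[1].value_id);
}
```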

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

2020-11-09  Richard Biener  

* tree-ssa-pre.c (get_representative_for): CSE VN_INFO calls.
(create_expression_by_pieces): Likewise.
(insert_into_preds_of_block): Likewise.
(do_pre_regular_insertion): Likewise.
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_insert):
Likewise.
(eliminate_dom_walker::eliminate_stmt): Likewise.
---
 gcc/tree-ssa-pre.c   | 43 ---
 gcc/tree-ssa-sccvn.c | 16 ++--
 2 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 79bb9e2d712..fec3b2f80f1 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -1343,10 +1343,11 @@ get_representative_for (const pre_expr e, basic_block b 
= NULL)
  ???  We should be able to re-use this when we insert the statement
  to compute it.  */
   name = make_temp_ssa_name (get_expr_type (e), gimple_build_nop (), "pretmp");
-  VN_INFO (name)->value_id = value_id;
-  VN_INFO (name)->valnum = valnum ? valnum : name;
+  vn_ssa_aux_t vn_info = VN_INFO (name);
+  vn_info->value_id = value_id;
+  vn_info->valnum = valnum ? valnum : name;
   /* ???  For now mark this SSA name for release by VN.  */
-  VN_INFO (name)->needs_insertion = true;
+  vn_info->needs_insertion = true;
   add_to_value (value_id, get_or_alloc_expr_for_name (name));
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
@@ -2990,10 +2991,11 @@ create_expression_by_pieces (basic_block block, 
pre_expr expr,
 
  if (forcedname != folded)
{
- VN_INFO (forcedname)->valnum = forcedname;
- VN_INFO (forcedname)->value_id = get_next_value_id ();
+ vn_ssa_aux_t vn_info = VN_INFO (forcedname);
+ vn_info->valnum = forcedname;
+ vn_info->value_id = get_next_value_id ();
  nameexpr = get_or_alloc_expr_for_name (forcedname);
- add_to_value (VN_INFO (forcedname)->value_id, nameexpr);
+ add_to_value (vn_info->value_id, nameexpr);
  bitmap_value_replace_in_set (NEW_SETS (block), nameexpr);
  bitmap_value_replace_in_set (AVAIL_OUT (block), nameexpr);
}
@@ -3016,11 +3018,12 @@ create_expression_by_pieces (basic_block block, 
pre_expr expr,
  the expression may have been represented.  There is no harm in replacing
  here.  */
   value_id = get_expr_value_id (expr);
-  VN_INFO (name)->value_id = value_id;
-  VN_INFO (name)->valnum = vn_valnum_from_value_id (value_id);
-  if (VN_INFO (name)->valnum == NULL_TREE)
-VN_INFO (name)->valnum = name;
-  gcc_assert (VN_INFO (name)->valnum != NULL_TREE);
+  vn_ssa_aux_t vn_info = VN_INFO (name);
+  vn_info->value_id = value_id;
+  vn_info->valnum = vn_valnum_from_value_id (value_id);
+  if (vn_info->valnum == NULL_TREE)
+vn_info->valnum = name;
+  gcc_assert (vn_info->valnum != NULL_TREE);
   nameexpr = get_or_alloc_expr_for_name (name);
   add_to_value (value_id, nameexpr);
   if (NEW_SETS (block))
@@ -3122,10 +3125,11 @@ insert_into_preds_of_block (basic_block block, unsigned 
int exprnum,
   temp = make_temp_ssa_name (type, NULL, "prephitmp");
   phi = create_phi_node (temp, block);
 
-  VN_INFO (temp)->value_id = val;
-  VN_INFO (temp)->valnum = vn_valnum_from_value_id (val);
-  if (VN_INFO (temp)->valnum == NULL_TREE)
-VN_INFO (temp)->valnum = temp;
+  vn_ssa_aux_t vn_info = VN_INFO (temp);
+  vn_info->value_id = val;
+  vn_info->valnum = vn_valnum_from_value_id (val);
+  if (vn_info->valnum == NULL_TREE)
+vn_info->valnum = temp;
   bitmap_set_bit (inserted_exprs, SSA_NAME_VERSION (temp));
   FOR_EACH_EDGE (pred, ei, block->preds)
 {
@@ -3367,10 +3371,11 @@ do_pre_regular_insertion (basic_block block, 
basic_block dom)
  gimple_stmt_iterator gsi = gsi_after_labels (block);
  gsi_insert_before (&gsi, assign, GSI_NEW_STMT);
 
- VN_INFO (temp)->value_id = val;
- VN_INFO (temp)->valnum = vn_valnum_from_value_id (val);
- if (VN_INFO (temp)->valnum == NULL_TREE)
-   VN_INFO (temp)->valnum = temp;
+ vn_ssa_aux_t vn_info = VN_INFO (temp);
+ vn_info->value_id = val;
+ vn_info->valnum = vn_valnum_from_value_id (val);
+ if (vn_info->valnum == NULL_TREE)
+   vn_info->valnum = temp;
  bitmap_set_bit (inserted_exprs, SSA_NAME_VERSION (temp));
  pre_expr newe = get_or_alloc_expr_for_name (temp);
  add_to_value (val, newe);
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 8c9880e40cd..24bbd8d283f 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -5843,8 +5843,9 @@ eliminate_dom_walker::eliminate_insert (basic_block bb,
   else
 {
   gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
-  VN_INFO 

Re: [PATCH] 1/2 Make _GLIBCXX_DEBUG checks constexpr compatible

2020-11-09 Thread Jonathan Wakely via Gcc-patches

On 08/11/20 15:06 +0100, François Dumont via Libstdc++ wrote:
Now that __glibcxx_assert is constexpr compatible we can do the same 
for the _GLIBCXX_DEBUG equivalent.


I had also tried to do the same on my own, so this patch contains the 
string_view tests I had written when doing so.


I plan to activate some _GLIBCXX_DEBUG checks when _GLIBCXX_ASSERTIONS 
is defined, but only the constant-time checks. Is it OK to run checks 
like __check_partitioned_lower in constexpr?


Hmm, I don't *think* it's possible to detect the additional calls to
the comparison function during constant evaluation. So I think the
only concern is the extra work the compiler has to do, i.e. the extra
time it takes to compile.

The constant-time checks should be OK though.



    libstdc++: Make _GLIBCXX_DEBUG checks constexpr compatible

[snip]

Ok to commit ?


Yes, thanks.



Re: [pushed] Ada : Fix bootstrap after r11-4793.

2020-11-09 Thread Arnaud Charlet
> Iain, thank you for catching and fixing this.  As you know (but
> others don't), ada is harder for me as I can't build that on my
> usual machine.
> 
> Eric, Iain does bootstraps of the modules branch on darwin including
> Ada, and I have done so for linux (a few months back).  I will make
> sure to check that more regularly during the modules merge.

Thanks Nathan and Iain, much appreciated!


Re: [pushed] Ada : Fix bootstrap after r11-4793.

2020-11-09 Thread Nathan Sidwell

On 11/7/20 4:10 AM, Iain Sandoe wrote:

Hi

The patch omitted a change for Ada, fixed thus.



Iain, thank you for catching and fixing this.  As you know (but others 
don't), ada is harder for me as I can't build that on my usual machine.


Eric, Iain does bootstraps of the modules branch on darwin including Ada, 
and I have done so for linux (a few months back).  I will make sure to 
check that more regularly during the modules merge.


nathan

--
Nathan Sidwell


Re: Fix hashing of multiply and add

2020-11-09 Thread Richard Biener via Gcc-patches
On Mon, Nov 9, 2020 at 11:39 AM Jan Hubicka  wrote:
>
> Hi,
> I have noticed that hash_operand computes a hash value that is unused
> since two is a local variable.  This patch fixes it.
>
> Bootstrapped/regtested x86_64-linux, OK?

OK.  You can add DOT_PROD_EXPR to the mix (or handle
commutative_ternary_tree_code generally).  All of those tree codes
will only appear "late".

> * fold-const.c (operand_compare::hash_operand): Fix hashing of operand
> 3 of multiply and add.
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index c47557daeba..8844069127c 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -3806,7 +3806,7 @@ operand_compare::hash_operand (const_tree t, 
> inchash::hash &hstate,
> hash_operand (TREE_OPERAND (t, 0), one, flags);
> hash_operand (TREE_OPERAND (t, 1), two, flags);
> hstate.add_commutative (one, two);
> -   hash_operand (TREE_OPERAND (t, 2), two, flags);
> +   hash_operand (TREE_OPERAND (t, 2), hstate, flags);
> return;
>   }
>
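
To make the effect of the one-line fix easier to see, here is a small standalone sketch. It does not use GCC's inchash::hash; `toy_hash`, `hash_leaf`, `hash_fma_buggy` and `hash_fma_fixed` are hypothetical names invented for illustration, and the mixing function is an arbitrary choice. It only mirrors the shape of the hunk above: operands 0 and 1 are combined commutatively, and operand 2 must be hashed into the main state rather than into the local `two`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy stand-in for a hash state; NOT GCC's inchash::hash.
struct toy_hash
{
  std::uint64_t val = 0;

  // Mix one value into the state (multiplication by an odd constant).
  void add (std::uint64_t v) { val = (val ^ v) * 0x100000001b3ULL; }

  // Order-independent combination: always mix the smaller value first.
  void add_commutative (const toy_hash &a, const toy_hash &b)
  {
    add (std::min (a.val, b.val));
    add (std::max (a.val, b.val));
  }
};

// Hypothetical helper: hash a single operand into the given state.
static void hash_leaf (std::uint64_t operand, toy_hash &h)
{
  h.add (operand);
}

// Shape of the code before the fix: operand 2 is hashed into the
// local `two`, whose value is never read again.
std::uint64_t hash_fma_buggy (std::uint64_t a, std::uint64_t b, std::uint64_t c)
{
  toy_hash hstate, one, two;
  hash_leaf (a, one);
  hash_leaf (b, two);
  hstate.add_commutative (one, two);
  hash_leaf (c, two);   // lost: `two` is a local
  return hstate.val;
}

// Shape of the code after the fix: operand 2 goes into hstate.
std::uint64_t hash_fma_fixed (std::uint64_t a, std::uint64_t b, std::uint64_t c)
{
  toy_hash hstate, one, two;
  hash_leaf (a, one);
  hash_leaf (b, two);
  hstate.add_commutative (one, two);
  hash_leaf (c, hstate);
  return hstate.val;
}
```

In this sketch the buggy variant gives the same hash for any third operand, while the fixed variant distinguishes them and stays insensitive to swapping the first two operands.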


Re: [PATCH v2] Add if-chain to switch conversion pass.

2020-11-09 Thread Martin Liška

On 11/6/20 1:31 PM, Richard Biener wrote:

On Fri, Oct 16, 2020 at 4:04 PM Martin Liška  wrote:


Hello.

There's another version of the patch that should be based on what
I discussed with Richi and Jakub:

- the first patch introduces a new option -fbit-tests that is analogous to 
-fjump-tables
and will control the new if-to-switch conversion pass

- the second patch adds the pass
- I share code with tree-ssa-reassoc.c (range_entry and init_range_entry)
- a local discovery phase is run first
- later, these local BBs are chained into a candidate list for the 
conversion

I'm also sending transformed chains for 'make all-host' (620 transformations).
Patch can bootstrap on x86_64-linux-gnu and survives regression tests.


-static bool
+bool
  no_side_effect_bb (basic_block bb)
  {

exporting this with this name is dangerous I think because the function
seems to allow side-effects in the last stmt - not sure exactly what
it tries to allow - there's no comment to that :/


All right, will fix that.



+  free (rpo);
+  free_dominance_info (CDI_DOMINATORS);
+
+  if (!all_candidates.is_empty ())
+mark_virtual_operands_for_renaming (fun);

please avoid freeing dominance info when there was no change done
(move it to the !all_candidates.is_empty () block).

+  basic_block bb;
+  FOR_EACH_BB_FN (bb, fun)
+find_conditions (bb, &conditions_in_bbs);
+

if we didn't find any conditions (or found just one?) we can elide the
rest of the function, no?


Sure.



+ if_chain *chain = new if_chain ();
+ chain->m_entries.safe_push (info);
+ /* Try to find a chain starting in this BB.  */
+ while (true)
+   {
+ if (!single_pred_p (gimple_bb (info->m_cond)))
+   break;
+ edge e = single_pred_edge (gimple_bb (info->m_cond));
+ condition_info *info2 = conditions_in_bbs.get (e->src);
+ if (!info2 || info->m_ranges[0].exp != info2->m_ranges[0].exp)
+   break;
+
+ chain->m_entries.safe_push (info2);
+ bitmap_set_bit (seen_bbs, e->src->index);
+ info = info2;
+   }

so while we now record conditions per BB the above doesn't really
allow matching a binary tree.


Yes. The pass currently only supports conditions of the following form:
1) index in {min, max}
2) index out of {min, max}

which means one edge in form 1). I don't see how handling a situation
where both edges contain such a chain would be useful. Can you please
come up with an interesting test-case?


What I was thinking of is to record
if_chain * per BB as well and look at successors, thus (pseudo-code)

if (block ends in cond)
  if (if_chain on true edge && if_chain on false edge)
    try merge
  else if (if_chain on true edge && this-cond tests same var)
    try merge
  else if (if_chain on false edge && ...)
    try merge
  record if_chain for block

where merging would eventually detach the if_chains from the successors.
For now we'd just handle the true (and maybe false) edge combos to handle
linear chains.  Walking reverse RPO (I'm not 100% sure reverse RPO is what
we want here, but guess it will work fine for now) will gather chains
accordingly.
When merging from a successor to a BB fails we push the successor chain
to the candidate list.
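
A toy rendering of that idea, limited to linear chains along the false edge, might look like the sketch below. None of this is the pass's actual code: `toy_block` and `collect_chains` are hypothetical, blocks are assumed to be numbered so that a false-edge successor always has a higher index (walking indices downwards stands in for the reverse RPO walk), and "tests same var" is reduced to a string comparison.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of a basic block ending in a condition.
struct toy_block
{
  std::string tested_var;  // empty if the block does not end in a condition
  int false_succ;          // index of the false-edge successor, or -1
};

// Gather chains of blocks that test the same variable along false edges.
// A successor chain that cannot be merged into its predecessor is pushed
// to the candidate list; chains still attached at the end are pushed too.
std::vector<std::vector<int>>
collect_chains (const std::vector<toy_block> &blocks)
{
  std::vector<std::vector<int>> chain_of (blocks.size ());
  std::vector<std::vector<int>> candidates;

  for (int b = static_cast<int> (blocks.size ()) - 1; b >= 0; --b)
    {
      const toy_block &blk = blocks[b];
      if (blk.tested_var.empty ())
        continue;

      std::vector<int> chain{b};
      int s = blk.false_succ;
      if (s >= 0 && !chain_of[s].empty ())
        {
          if (blocks[s].tested_var == blk.tested_var)
            // Merge: the successor chain continues this block's chain.
            chain.insert (chain.end (), chain_of[s].begin (),
                          chain_of[s].end ());
          else
            // Merging failed: the successor chain becomes a candidate.
            candidates.push_back (chain_of[s]);
          // Either way the chain is detached from the successor.
          chain_of[s].clear ();
        }
      chain_of[b] = chain;
    }

  for (const auto &c : chain_of)
    if (!c.empty ())
      candidates.push_back (c);
  return candidates;
}
```

For four blocks testing x, x, y, y in sequence, this yields two candidates: the y chain (pushed when merging into the second x block fails) and then the x chain.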

+/* Algorithm of the pass runs in the following steps:
+   a) We walk basic blocks in DOMINATOR order so that we first reach
+  a first condition of a future switch.
+   b) We follow false edges of a if-else-chain and we record chain
+  of GIMPLE conditions.  These blocks are only used for comparison
+  of a common SSA_NAME and we do not allow any side effect.
+   c) We remove all basic blocks (except first) of such chain and
+  GIMPLE switch replaces the condition in the first basic block.
+   d) We move all GIMPLE statements in the removed blocks into the
+  first one.  */

the overall comment is now a bit out-of-date?

Please remove the PHI mapping as I outlined in earlier review.

The 0001-Add-fbit-tests-option.patch is OK for trunk.


Installed to master.

Martin



Thanks,
Richard.



Thoughts?
Thanks,
Martin




Re: [Patch] x86: Enable GCC support for Intel AVX-VNNI extension

2020-11-09 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 9, 2020 at 11:31 AM Hongtao Liu  wrote:
>
> >
> > +  /* Support unified builtin.  */
> > +  || (mask2 == OPTION_MASK_ISA2_AVXVNNI)
> >
> > I don't think we gain anything with unified builtins. Better, just
> > introduce separate builtins, e.g for
> >
>
> Unified builtins are used for unified intrinsics; intrinsics users may prefer
> the same interface and let the compiler decide the encoding version. Separate
> builtins may make some definitions ambiguous when the target attribute
> is used, see avx-vnni-2.c.
> We also provide a separate intrinsics interface for compatibility with
> different compilers (llvm/msvc/icc).

Hm, the new intrinsics file introduces:

+#ifdef __AVXVNNI__
+#define _mm256_dpbusd_avx_epi32(A, B, C) \
+  _mm256_dpbusd_epi32((A), (B), (C))
...
+#endif /* __AVXVNNI__ */
+
+#define _mm256_dpbusd_epi32(A, B, C)\
+  ((__m256i) __builtin_ia32_vpdpbusd_v8si ((__v8si) (A),\
+   (__v8si) (B),\
+   (__v8si) (C)))
+

And there are two versions of intrinsics:

_mm256_dpbusd_avx_epi32
_mm256_dpbusd_epi32

So, is _mm256_dpbusd_epi32 active for either

OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL

or

OPTION_MASK_ISA2_AVXVNNI ?

Is _mm256_dpbusd_avx_epi32 the "compatibility intrinsics"?

In case the above is correct, please expand the comment

+  /* Support unified builtin.  */
+  || (mask2 == OPTION_MASK_ISA2_AVXVNNI)

with the above information, explaining what kind of unified builtin this is.

Please also note that #defines won't be tested in e.g. sse-13.c, where:

--q--
  Defining away "extern" and "__inline" results in all of them being
  compiled as proper functions.  */

#define extern
#define __inline
--/q--

so these defines should be reimplemented as extern inline functions.

Uros.


Re: [PATCH][AArch64] Use intrinsics for upper saturating shift right

2020-11-09 Thread Christophe Lyon via Gcc-patches
Hi,


On Thu, 5 Nov 2020 at 17:12, David Candler via Gcc-patches
 wrote:
>
> Hi Richard,
>
> Thanks for the feedback.
>
> Richard Sandiford  writes:
> > > diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> > > b/gcc/config/aarch64/aarch64-builtins.c
> > > index 4f33dd936c7..f93f4e29c89 100644
> > > --- a/gcc/config/aarch64/aarch64-builtins.c
> > > +++ b/gcc/config/aarch64/aarch64-builtins.c
> > > @@ -254,6 +254,10 @@ 
> > > aarch64_types_binop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > >  #define TYPES_GETREG (aarch64_types_binop_imm_qualifiers)
> > >  #define TYPES_SHIFTIMM (aarch64_types_binop_imm_qualifiers)
> > >  static enum aarch64_type_qualifiers
> > > +aarch64_types_ternop_s_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > > +  = { qualifier_none, qualifier_none, qualifier_none, 
> > > qualifier_immediate};
> > > +#define TYPES_SHIFT2IMM (aarch64_types_ternop_s_imm_qualifiers)
> > > +static enum aarch64_type_qualifiers
> > >  aarch64_types_shift_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > >= { qualifier_unsigned, qualifier_none, qualifier_immediate };
> > >  #define TYPES_SHIFTIMM_USS (aarch64_types_shift_to_unsigned_qualifiers)
> > > @@ -265,14 +269,16 @@ static enum aarch64_type_qualifiers
> > >  aarch64_types_unsigned_shift_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > >= { qualifier_unsigned, qualifier_unsigned, qualifier_immediate };
> > >  #define TYPES_USHIFTIMM (aarch64_types_unsigned_shift_qualifiers)
> > > +#define TYPES_USHIFT2IMM (aarch64_types_ternopu_imm_qualifiers)
> > > +static enum aarch64_type_qualifiers
> > > +aarch64_types_shift2_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > > +  = { qualifier_unsigned, qualifier_unsigned, qualifier_none, 
> > > qualifier_immediate };
> > > +#define TYPES_SHIFT2IMM_UUSS 
> > > (aarch64_types_shift2_to_unsigned_qualifiers)
> > >
> > >  static enum aarch64_type_qualifiers
> > >  aarch64_types_ternop_s_imm_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > >= { qualifier_none, qualifier_none, qualifier_poly, 
> > > qualifier_immediate};
> > >  #define TYPES_SETREGP (aarch64_types_ternop_s_imm_p_qualifiers)
> > > -static enum aarch64_type_qualifiers
> > > -aarch64_types_ternop_s_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > > -  = { qualifier_none, qualifier_none, qualifier_none, 
> > > qualifier_immediate};
> > >  #define TYPES_SETREG (aarch64_types_ternop_s_imm_qualifiers)
> > >  #define TYPES_SHIFTINSERT (aarch64_types_ternop_s_imm_qualifiers)
> > >  #define TYPES_SHIFTACC (aarch64_types_ternop_s_imm_qualifiers)
> >
> > Very minor, but I think it would be better to keep
> > aarch64_types_ternop_s_imm_qualifiers where it is and define
> > TYPES_SHIFT2IMM here rather than above.  For better or worse,
> > the current style seems to be to keep the defines next to the
> > associated arrays, rather than group them based on the TYPES_* name.
> >
> > > diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
> > > b/gcc/config/aarch64/aarch64-simd-builtins.def
> > > index d1b21102b2f..0b82b9c072b 100644
> > > --- a/gcc/config/aarch64/aarch64-simd-builtins.def
> > > +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
> > > @@ -285,6 +285,13 @@
> > >BUILTIN_VSQN_HSDI (USHIFTIMM, uqshrn_n, 0, ALL)
> > >BUILTIN_VSQN_HSDI (SHIFTIMM, sqrshrn_n, 0, ALL)
> > >BUILTIN_VSQN_HSDI (USHIFTIMM, uqrshrn_n, 0, ALL)
> > > +  /* Implemented by aarch64_qshrn2_n.  */
> > > +  BUILTIN_VQN (SHIFT2IMM_UUSS, sqshrun2_n, 0, ALL)
> > > +  BUILTIN_VQN (SHIFT2IMM_UUSS, sqrshrun2_n, 0, ALL)
> > > +  BUILTIN_VQN (SHIFT2IMM, sqshrn2_n, 0, ALL)
> > > +  BUILTIN_VQN (USHIFT2IMM, uqshrn2_n, 0, ALL)
> > > +  BUILTIN_VQN (SHIFT2IMM, sqrshrn2_n, 0, ALL)
> > > +  BUILTIN_VQN (USHIFT2IMM, uqrshrn2_n, 0, ALL)
> >
> > Using ALL is a holdover from the time (until a few weeks ago) when we
> > didn't record function attributes.  New intrinsics should therefore
> > have something more specific than ALL.
> >
> > We discussed offline whether the Q flag side effect of the intrinsics
> > should be observable or not, and the conclusion was that it shouldn't.
> > I think we can therefore treat these functions as pure functions,
> > meaning that they should have flags NONE rather than ALL.
> >
> > For that reason, I think we should also remove the Set_Neon_Cumulative_Sat
> > and CHECK_CUMULATIVE_SAT parts of the test (sorry).
> >
> > Other than that, the patch looks good to go.
> >
> > Thanks,
> > Richard
>
> I've updated the patch with TYPES_SHIFT2IMM moved, the builtins changed
> to NONE, and the Q flag portion of the tests removed.
>

It looks like you forgot that these tests are shared with the arm target, and
since these intrinsics are not supported on that target you should make sure
they are skipped (there are several examples in advsimd-intrinsics/).

Christophe

> Thanks,
> David


Re: [PATCH] x86: Adjust keylocker testcases for fail on darwin

2020-11-09 Thread Andreas Schwab
On Nov 09 2020, Hongyu Wang via Gcc-patches wrote:

> diff --git a/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c 
> b/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
> index 8dd1bc634ac..c2bc7ea344d 100644
> --- a/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
> +++ b/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
> @@ -1,12 +1,12 @@
>  /* { dg-do compile } */
>  /* { dg-options "-mkl -O2" } */
> -/* { dg-final { scan-assembler "movdqa\[ 
> \\t\]+\[^\n\]*k1(\\(%rip\\))?\[^\n\r]*%xmm0" } } */
> -/* { dg-final { scan-assembler "movl\[ 
> \\t\]+\[^\n\]*ctrl(\\(%rip\\))?\[^\n\r]*%eax" } } */
> -/* { dg-final { scan-assembler "encodekey128\[ 
> \\t\]+\[^\n\]*%eax\[^\n\r]*%eax" } } */
> -/* { dg-final { scan-assembler "(?:movdqu|movups)\[ 
> \\t\]+\[^\n\]*%xmm0\[^\n\r]*h2(\\(%rip\\))?" } } */
> -/* { dg-final { scan-assembler "(?:movdqu|movups)\[ 
> \\t\]+\[^\n\]*%xmm1\[^\n\r]*h2\\+16(\\(%rip\\))?" } } */
> -/* { dg-final { scan-assembler "(?:movdqu|movups)\[ 
> \\t\]+\[^\n\]*%xmm2\[^\n\r]*h2\\+32(\\(%rip\\))?" } } */
> -/* { dg-final { scan-assembler "(?:movdqa|movaps)\[ 
> \\t\]+\[^\n\]*%xmm\[4-6\]\[^\n\r]*k2(\\(%rip\\))?" } } */
> +/* { dg-final { scan-assembler {movdqa[ \t]+[^\n\r]*, %xmm0} } } */
> +/* { dg-final { scan-assembler {movl[ \t]+[^\n\r]*, %eax} } } */
> +/* { dg-final { scan-assembler {encodekey128[ \t]+[^\n]*%eax[^\n\r]*%eax} } 
> } */
> +/* { dg-final { scan-assembler {(?:movdqu|movups)[ \t]+[^\n]*%xmm0,[^\n\r]*} 
> } } */
> +/* { dg-final { scan-assembler {(?:movdqu|movups)[ 
> \t]+[^\n]*%xmm1,[^\n\r]*16[^\n\r]*} } } */
> +/* { dg-final { scan-assembler {(?:movdqu|movups)[ 
> \t]+[^\n]*%xmm2,[^\n\r]*32[^\n\r]*} } } */
> +/* { dg-final { scan-assembler {(?:movdqa|movaps)[ 
> \t]+[^\n]*%xmm[4-6\],[^\n\r]*} } } */

The last line missed one \].

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] x86: Adjust keylocker testcases for fail on darwin

2020-11-09 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 9, 2020 at 12:47 PM Iain Sandoe  wrote:
>
> Uros Bizjak via Gcc-patches  wrote:
>
> > On Mon, Nov 9, 2020 at 11:50 AM Hongyu Wang  wrote:
> >> Hi
> >>
> >> According to the discussion in
> >> https://gcc.gnu.org/pipermail/gcc/2020-November/234096.html,
> >> the testcases for keylocker-* are too strict for the darwin target. This
> >> patch adjusts the regexes and adds a missing test for the aesenc256kl
> >> instruction.
> >>
> >> Tested by Iain Sandoe; all tests pass on the darwin target.
> >>
> >> Ok for trunk?
> >>
> >> gcc/testsuite/ChangeLog
> >>
> >>* gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
> >>* gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
> >>* gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
> >>* gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
> >>* gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
> >>* gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
> >>* gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
> >>* gcc.target/i386/keylocker-encodekey128.c: Likewise.
> >>* gcc.target/i386/keylocker-encodekey256.c: Likewise.
> >>* gcc.target/i386/keylocker-aesenc256kl.c: New test.
> >
> > Please rewrite scan strings back to using double-quotation marks.
>
> out of curiosity, why?

Because this is the convention, and (mostly) all testcases adhere to
this convention.

There should be a compelling reason why this convention should be changed.

> ([IMO] the {} form is generally much more readable, and less prone to
>   uncaught omissions of \ as happened here)

Uros.


Re: [PATCH] x86: Adjust keylocker testcases for fail on darwin

2020-11-09 Thread Iain Sandoe via Gcc-patches

Uros Bizjak via Gcc-patches  wrote:


On Mon, Nov 9, 2020 at 11:50 AM Hongyu Wang  wrote:

Hi

According to the discussion in
https://gcc.gnu.org/pipermail/gcc/2020-November/234096.html,
the testcases for keylocker-* are too strict for the darwin target. This
patch adjusts the regexes and adds a missing test for the aesenc256kl
instruction.

Tested by Iain Sandoe; all tests pass on the darwin target.

Ok for trunk?

gcc/testsuite/ChangeLog

   * gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
   * gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
   * gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
   * gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
   * gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
   * gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
   * gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
   * gcc.target/i386/keylocker-encodekey128.c: Likewise.
   * gcc.target/i386/keylocker-encodekey256.c: Likewise.
   * gcc.target/i386/keylocker-aesenc256kl.c: New test.


Please rewrite scan strings back to using double-quotation marks.


out of curiosity, why?

([IMO] the {} form is generally much more readable, and less prone to
 uncaught omissions of \ as happened here)

Iain



Uros.


--
Regards,

Hongyu, Wang





Re: [PATCH V2] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-09 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

>> -Original Message-
>> From: Andrea Corallo 
>> Sent: 09 November 2020 10:05
>> To: Christophe Lyon 
>> Cc: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org;
>> Richard Earnshaw ; nd 
>> Subject: Re: [PATCH V2] arm: [testcase] Better narrow some bfloat16
>> testcase
>> 
>> Christophe Lyon  writes:
>> [...]
>> > Yes, it works for me, thanks.
>> 
>> Super, happy to push it when I get the okay.
>
> It's okay.
> Thanks,
> Kyrill

Installed into master as 2d4fa1f79c7.

Thanks

  Andrea


Re: [PATCH] x86: Adjust keylocker testcases for fail on darwin

2020-11-09 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 9, 2020 at 11:50 AM Hongyu Wang  wrote:
>
> Hi
>
> According to the discussion in
> https://gcc.gnu.org/pipermail/gcc/2020-November/234096.html,
> the testcases for keylocker-* are too strict for the darwin target. This
> patch adjusts the regexes and adds a missing test for the aesenc256kl
> instruction.
>
> Tested by Iain Sandoe; all tests pass on the darwin target.
>
> Ok for trunk?
>
> gcc/testsuite/ChangeLog
>
> * gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
> * gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
> * gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
> * gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
> * gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
> * gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
> * gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
> * gcc.target/i386/keylocker-encodekey128.c: Likewise.
> * gcc.target/i386/keylocker-encodekey256.c: Likewise.
> * gcc.target/i386/keylocker-aesenc256kl.c: New test.

Please rewrite scan strings back to using double-quotation marks.

Uros.

>
> --
> Regards,
>
> Hongyu, Wang


[Patch] Fortran: Fix OpenACC in specification-part checks [PR90111]

2020-11-09 Thread Tobias Burnus

OpenACC (like OpenMP) permits some directives in the 'specification part'.

In Fortran (here: Fortran 2018), the latter is:
  R504  specification-part is  [use-stmt]...
   [import-stmt]...
   [implicit-part]
   [declaration-construct]...
which is an ordered list (first use, then import etc.).

Hence, gfortran's state_order lists the latter separately:

  enum state_order {ORDER_START,  ORDER_USE, ORDER_IMPORT,
ORDER_IMPLICIT_NONE, ORDER_IMPLICIT, ORDER_SPEC,
ORDER_EXEC};

Currently, 'acc declare' and 'acc routine' are placed into
'#define case_decl' which implies ORDER_SPEC.

This patch adds them to the related OpenMP constructs ('case_omp_decl'),
which does not touch the current state and just checks that the
ST_... appear before ORDER_EXEC.
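
As a rough illustration of the ordering check described above, here is a simplified C++ sketch. Only the `state_order` enumerators come from the text; the `toy_stmt` kinds and `toy_verify_order` are hypothetical stand-ins and cover far fewer statement kinds than gfortran's verify_st_order. It shows why classifying a directive as an ordinary declaration advances the state to ORDER_SPEC (so a later USE is rejected), while classifying it like the OpenMP declarations leaves the state untouched.

```cpp
#include <cassert>

enum state_order { ORDER_START, ORDER_USE, ORDER_IMPORT,
                   ORDER_IMPLICIT_NONE, ORDER_IMPLICIT, ORDER_SPEC,
                   ORDER_EXEC };

// Hypothetical, heavily reduced set of statement kinds.
enum toy_stmt { TOY_USE, TOY_IMPLICIT_NONE, TOY_DECL,
                TOY_DIRECTIVE_DECL, TOY_EXEC };

// Returns true if `st` is acceptable in the current state, updating the
// state where the statement kind advances it.
bool toy_verify_order (state_order &state, toy_stmt st)
{
  switch (st)
    {
    case TOY_USE:
      if (state > ORDER_USE)
        return false;
      state = ORDER_USE;
      return true;

    case TOY_IMPLICIT_NONE:
      if (state > ORDER_IMPLICIT_NONE)
        return false;
      state = ORDER_IMPLICIT_NONE;
      return true;

    case TOY_DECL:
      // Ordinary declarations: move the state forward to ORDER_SPEC.
      if (state >= ORDER_EXEC)
        return false;
      if (state < ORDER_SPEC)
        state = ORDER_SPEC;
      return true;

    case TOY_DIRECTIVE_DECL:
      // Directive-style declarations: allowed anywhere before the
      // executable part, and the state is deliberately not adjusted.
      return state < ORDER_EXEC;

    case TOY_EXEC:
      state = ORDER_EXEC;
      return true;
    }
  return false;
}
```

With the directive treated as `TOY_DIRECTIVE_DECL`, the sequence directive, USE, IMPLICIT NONE is accepted; treated as `TOY_DECL`, the USE that follows it is rejected.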

If there are no comments, I intend to commit it later today.

Tobias

PS: Thanks to Thomas for pointing me to the PR.

Fortran: Fix OpenACC in specification-part checks [PR90111]

OpenACC's routine and declare directives can appear anywhere in the
specification part, i.e. before/after use-stmts, import-stmt, implicit-part,
or declaration-constructs.

gcc/fortran/ChangeLog:

	PR fortran/90111
	* parse.c (case_decl): Move ST_OACC_ROUTINE and ST_OACC_DECLARE to ...
	(case_omp_decl): ... here:
	(verify_st_order): Update comment.

gcc/testsuite/ChangeLog:

	PR fortran/90111
	* gfortran.dg/goacc/specification-part.f90: New test.

 gcc/fortran/parse.c|  11 +--
 .../gfortran.dg/goacc/specification-part.f90   | 100 +
 2 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index e57669c51e5..ec7abc240d6 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -1628,24 +1628,25 @@ next_statement (void)
   case ST_OACC_DATA: case ST_OACC_HOST_DATA: case ST_OACC_LOOP: \
   case ST_OACC_KERNELS_LOOP: case ST_OACC_SERIAL_LOOP: case ST_OACC_SERIAL: \
   case ST_OACC_ATOMIC
 
 /* Declaration statements */
 
 #define case_decl case ST_ATTR_DECL: case ST_COMMON: case ST_DATA_DECL: \
   case ST_EQUIVALENCE: case ST_NAMELIST: case ST_STATEMENT_FUNCTION: \
-  case ST_TYPE: case ST_INTERFACE: case ST_PROCEDURE: case ST_OACC_ROUTINE: \
-  case ST_OACC_DECLARE
+  case ST_TYPE: case ST_INTERFACE: case ST_PROCEDURE
 
-/* OpenMP declaration statements.  */
+/* OpenMP and OpenACC declaration statements, which may appear anywhere in
+   the specification part.  */
 
 #define case_omp_decl case ST_OMP_THREADPRIVATE: case ST_OMP_DECLARE_SIMD: \
   case ST_OMP_DECLARE_TARGET: case ST_OMP_DECLARE_REDUCTION: \
-  case ST_OMP_REQUIRES
+  case ST_OMP_REQUIRES: case ST_OACC_ROUTINE: case ST_OACC_DECLARE
+
 
 /* Block end statements.  Errors associated with interchanging these
are detected in gfc_match_end().  */
 
 #define case_end case ST_END_BLOCK_DATA: case ST_END_FUNCTION: \
 		 case ST_END_PROGRAM: case ST_END_SUBROUTINE: \
 		 case ST_END_BLOCK: case ST_END_ASSOCIATE
 
@@ -2808,17 +2809,17 @@ verify_st_order (st_state *p, gfc_statement st, bool silent)
 case_decl:
   if (p->state >= ORDER_EXEC)
 	goto order;
   if (p->state < ORDER_SPEC)
 	p->state = ORDER_SPEC;
   break;
 
 case_omp_decl:
-  /* The OpenMP directives have to be somewhere in the specification
+  /* The OpenMP/OpenACC directives have to be somewhere in the specification
 	 part, but there are no further requirements on their ordering.
 	 Thus don't adjust p->state, just ignore them.  */
   if (p->state >= ORDER_EXEC)
 	goto order;
   break;
 
 case_executable:
 case_exec_markers:
diff --git a/gcc/testsuite/gfortran.dg/goacc/specification-part.f90 b/gcc/testsuite/gfortran.dg/goacc/specification-part.f90
new file mode 100644
index 000..14af6aecc7d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/specification-part.f90
@@ -0,0 +1,100 @@
+! { dg-do compile }
+!
+! PR fortran/90111
+!
+! Check that OpenACC directives in everywhere in specification part,
+! i.e. it may appear before/after the use, import, implicit, and declaration
+!
+
+module m
+end module m
+
+subroutine foo0(kk)
+  use m
+  implicit none
+  integer :: jj, kk
+  !$acc routine
+end
+
+subroutine foo1()
+  use m
+  implicit none
+  !$acc routine
+  integer :: jj
+end
+
+subroutine foo2()
+  use m
+  !$acc routine
+  implicit none
+end
+
+subroutine foo3()
+  !$acc routine
+  use m
+  implicit none
+end
+
+module m2
+  interface
+subroutine foo0(kk)
+  use m
+  import
+  implicit none
+  integer :: kk
+  !$acc routine
+end
+subroutine foo1()
+  use m
+  import
+  implicit none
+  !$acc routine
+end
+subroutine foo2()
+  use m
+  import
+  

RE: [PATCH V2] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-09 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: 09 November 2020 10:05
> To: Christophe Lyon 
> Cc: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org;
> Richard Earnshaw ; nd 
> Subject: Re: [PATCH V2] arm: [testcase] Better narrow some bfloat16
> testcase
> 
> Christophe Lyon  writes:
> [...]
> > Yes, it works for me, thanks.
> 
> Super, happy to push it when I get the okay.

It's okay.
Thanks,
Kyrill

> 
>   Andrea



[PATCH] x86: Adjust keylocker testcases for fail on darwin

2020-11-09 Thread Hongyu Wang via Gcc-patches
Hi

According to the discussion in
https://gcc.gnu.org/pipermail/gcc/2020-November/234096.html,
the testcases for keylocker-* are too strict for the darwin target. This
patch adjusts the regexes and adds a missing test for the aesenc256kl
instruction.

Tested by Iain Sandoe; all tests pass on the darwin target.

Ok for trunk?

gcc/testsuite/ChangeLog

* gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
* gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
* gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
* gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
* gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
* gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
* gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
* gcc.target/i386/keylocker-encodekey128.c: Likewise.
* gcc.target/i386/keylocker-encodekey256.c: Likewise.
* gcc.target/i386/keylocker-aesenc256kl.c: New test.

-- 
Regards,

Hongyu, Wang
From 9009ce97099b3a80fdf61a1927c1fff9c7f5b9bf Mon Sep 17 00:00:00 2001
From: hongyuw1 
Date: Fri, 6 Nov 2020 15:08:10 +0800
Subject: [PATCH] Adjust Keylocker regex pattern for darwin, and add missing
 aesenc256kl test.

gcc/testsuite/ChangeLog

	* gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
	* gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
	* gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
	* gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
	* gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
	* gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
	* gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
	* gcc.target/i386/keylocker-encodekey128.c: Likewise.
	* gcc.target/i386/keylocker-encodekey256.c: Likewise.
	* gcc.target/i386/keylocker-aesenc256kl.c: New test.
---
 .../gcc.target/i386/keylocker-aesdec128kl.c   |  8 ++---
 .../gcc.target/i386/keylocker-aesdec256kl.c   |  8 ++---
 .../i386/keylocker-aesdecwide128kl.c  | 36 +--
 .../i386/keylocker-aesdecwide256kl.c  | 36 +--
 .../gcc.target/i386/keylocker-aesenc128kl.c   |  8 ++---
 .../gcc.target/i386/keylocker-aesenc256kl.c   | 17 +
 .../i386/keylocker-aesencwide128kl.c  | 36 +--
 .../i386/keylocker-aesencwide256kl.c  | 36 +--
 .../gcc.target/i386/keylocker-encodekey128.c  | 14 
 .../gcc.target/i386/keylocker-encodekey256.c  | 18 +-
 10 files changed, 117 insertions(+), 100 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/keylocker-aesenc256kl.c

diff --git a/gcc/testsuite/gcc.target/i386/keylocker-aesdec128kl.c b/gcc/testsuite/gcc.target/i386/keylocker-aesdec128kl.c
index 3cdda8ed7b0..9c3c8a88b0e 100644
--- a/gcc/testsuite/gcc.target/i386/keylocker-aesdec128kl.c
+++ b/gcc/testsuite/gcc.target/i386/keylocker-aesdec128kl.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-mkl -O2" } */
-/* { dg-final { scan-assembler "movdqa\[ \\t\]+\[^\n\]*k2\[^\n\r]*%xmm0" } } */
-/* { dg-final { scan-assembler "aesdec128kl\[ \\t\]+\[^\n\]*h1\[^\n\r]*%xmm0" } } */
-/* { dg-final { scan-assembler "sete" } } */
-/* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\n\]*%xmm0\[^\n\r]*k1" } } */
+/* { dg-final { scan-assembler {movdqa[ \t]+[^\n\r]*, %xmm0} } } */
+/* { dg-final { scan-assembler {aesdec128kl[ \t]+[^\n\r]*, %xmm0} } } */
+/* { dg-final { scan-assembler {sete} } } */
+/* { dg-final { scan-assembler {(?:movdqu|movups)[ \t]+[^\n\r]*%xmm0,[^\n\r]*} } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/i386/keylocker-aesdec256kl.c b/gcc/testsuite/gcc.target/i386/keylocker-aesdec256kl.c
index 70b2c6357fa..6012b69e9bf 100644
--- a/gcc/testsuite/gcc.target/i386/keylocker-aesdec256kl.c
+++ b/gcc/testsuite/gcc.target/i386/keylocker-aesdec256kl.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-mkl -O2" } */
-/* { dg-final { scan-assembler "movdqa\[ \\t\]+\[^\n\]*k2\[^\n\r]*%xmm0" } } */
-/* { dg-final { scan-assembler "aesdec256kl\[ \\t\]+\[^\n\]*h1\[^\n\r]*%xmm0" } } */
-/* { dg-final { scan-assembler "sete" } } */
-/* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\n\]*%xmm0\[^\n\r]*k1" } } */
+/* { dg-final { scan-assembler {movdqa[ \t]+[^\n\r]*, %xmm0} } } */
+/* { dg-final { scan-assembler {aesdec256kl[ \t]+[^\n\r]*, %xmm0} } } */
+/* { dg-final { scan-assembler {sete} } } */
+/* { dg-final { scan-assembler {(?:movdqu|movups)[ \t]+[^\n\r]*%xmm0,[^\n\r]*} } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/i386/keylocker-aesdecwide128kl.c b/gcc/testsuite/gcc.target/i386/keylocker-aesdecwide128kl.c
index f2806891bff..61c294ee052 100644
--- a/gcc/testsuite/gcc.target/i386/keylocker-aesdecwide128kl.c
+++ b/gcc/testsuite/gcc.target/i386/keylocker-aesdecwide128kl.c
@@ -1,23 +1,23 @@
 /* { dg-do compile } */
 /* { dg-options "-mwidekl -O2" } */
-/* { dg-final { scan-assembler "movdqu\[ \\t\]+\[^\n\]*idata(\\(%rip\\))?\[^\n\r]*%xmm0" } } */
-/* { dg-final { scan-assembler "movdqu\[ 

Fix hashing of multiply and add

2020-11-09 Thread Jan Hubicka
Hi,
I have noticed that hash_operand computes a hash value that is unused
since two is a local variable.  This patch fixes it.

Bootstrapped/regtested x86_64-linux, OK?

* fold-const.c (operand_compare::hash_operand): Fix hashing of operand
3 of multiply and add.
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c47557daeba..8844069127c 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -3806,7 +3806,7 @@ operand_compare::hash_operand (const_tree t, inchash::hash &hstate,
hash_operand (TREE_OPERAND (t, 0), one, flags);
hash_operand (TREE_OPERAND (t, 1), two, flags);
hstate.add_commutative (one, two);
-   hash_operand (TREE_OPERAND (t, 2), two, flags);
+   hash_operand (TREE_OPERAND (t, 2), hstate, flags);
return;
  }
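
The effect of the one-line fix can be sketched with a toy hash. Everything
below (toy_hash, toy_add, toy_add_commutative, hash_fma_*) is a hypothetical
stand-in written for illustration; it is not GCC's inchash implementation. The
buggy variant mixes operand 2 into the local state two, which is discarded, so
operand 2 never influences the result.

```c
#include <stdint.h>

/* Toy hash state standing in for inchash::hash (assumption, not GCC code).  */
struct toy_hash { uint32_t val; };

#define TOY_SEED 2166136261u

/* FNV-1a style mixing step.  */
static void
toy_add (struct toy_hash *h, uint32_t v)
{
  h->val = (h->val ^ v) * 16777619u;
}

/* Order-independent combination of two sub-hashes.  */
static void
toy_add_commutative (struct toy_hash *h, struct toy_hash a, struct toy_hash b)
{
  if (a.val > b.val)
    {
      struct toy_hash tmp = a;
      a = b;
      b = tmp;
    }
  toy_add (h, a.val);
  toy_add (h, b.val);
}

/* Before the fix: operand 2 goes into the local `two` and is lost.  */
static uint32_t
hash_fma_buggy (uint32_t op0, uint32_t op1, uint32_t op2)
{
  struct toy_hash hstate = { TOY_SEED }, one = { TOY_SEED }, two = { TOY_SEED };
  toy_add (&one, op0);
  toy_add (&two, op1);
  toy_add_commutative (&hstate, one, two);
  toy_add (&two, op2);		/* never reaches hstate */
  return hstate.val;
}

/* After the fix: operand 2 is mixed into the caller-visible state.  */
static uint32_t
hash_fma_fixed (uint32_t op0, uint32_t op1, uint32_t op2)
{
  struct toy_hash hstate = { TOY_SEED }, one = { TOY_SEED }, two = { TOY_SEED };
  toy_add (&one, op0);
  toy_add (&two, op1);
  toy_add_commutative (&hstate, one, two);
  toy_add (&hstate, op2);
  return hstate.val;
}
```

In this model hash_fma_buggy (1, 2, 3) and hash_fma_buggy (1, 2, 4) collide,
while hash_fma_fixed distinguishes them and still treats the first two
operands as commutative.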
 


Re: [PATCH] Optimize macro: make it more predictable

2020-11-09 Thread Martin Liška

On 11/6/20 6:34 PM, Jeff Law wrote:

So you XNEWVEC and store the result into "merge_decoded_options".  But
you free "decoded_options".  Was that intentional?


Hello.

Good point here.



This seems to bring a bit more predictability, but I suspect there's
more to do here.


Yes, both should be freed. One can see the following leak on master:

==12237== 176 bytes in 1 blocks are definitely lost in loss record 669 of 786
==12237==at 0x483BD7B: realloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==12237==by 0x1AB10CD: xrealloc (xmalloc.c:179)
==12237==by 0x1A1AE59: prune_options (opts-common.c:1139)
==12237==by 0x1A1AE59: decode_cmdline_options_to_array(unsigned int, char 
const**, unsigned int, cl_decoded_option**, unsigned int*) (opts-common.c:1027)
==12237==by 0xDCD456: decode_cmdline_options_to_array_default_mask(unsigned 
int, char const**, cl_decoded_option**, unsigned int*) (opts-global.c:273)
==12237==by 0x921377: parse_optimize_options(tree_node*, bool) 
(c-common.c:5709)
==12237==by 0x9768DB: handle_optimize_attribute(tree_node**, tree_node*, 
tree_node*, int, bool*) (c-attribs.c:4962)
==12237==by 0x84596A: decl_attributes(tree_node**, tree_node*, int, 
tree_node*) (attribs.c:723)
==12237==by 0x856F88: c_decl_attributes(tree_node**, tree_node*, int) 
(c-decl.c:5043)
==12237==by 0x8661E5: start_function(c_declspecs*, c_declarator*, 
tree_node*) (c-decl.c:9408)
==12237==by 0x8D644A: c_parser_declaration_or_fndef(c_parser*, bool, bool, bool, 
bool, bool, tree_node**, vec, bool, tree_node*, 
oacc_routine_data*, bool*) (c-parser.c:2444)
==12237==by 0x8DF343: c_parser_external_declaration(c_parser*) 
(c-parser.c:1777)
==12237==by 0x8DFE41: c_parser_translation_unit (c-parser.c:1650)
==12237==by 0x8DFE41: c_parse_file() (c-parser.c:21876)
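
The ownership issue can be shown with a small sketch. The names here
(counted_alloc, counted_free, cleanup_demo, decoded, merged) are hypothetical
and only mimic the situation of having both a decoded array and a merged array
on the heap; this is not the code from the patch.

```c
#include <stdlib.h>
#include <string.h>

static int live_blocks;		/* blocks allocated and not yet freed */

static void *
counted_alloc (size_t n)
{
  void *p = malloc (n);
  if (p == NULL)
    abort ();
  live_blocks++;
  return p;
}

static void
counted_free (void *p)
{
  if (p != NULL)
    {
      live_blocks--;
      free (p);
    }
}

/* Allocates a decoded array and a merged array, then releases them one
   at a time, reporting how many blocks are still live after each step.  */
static void
cleanup_demo (int *after_first_free, int *after_second_free)
{
  live_blocks = 0;

  int *decoded = counted_alloc (2 * sizeof *decoded);
  decoded[0] = 1;
  decoded[1] = 2;

  int *merged = counted_alloc (4 * sizeof *merged);
  memcpy (merged, decoded, 2 * sizeof *merged);
  merged[2] = 3;
  merged[3] = 4;

  counted_free (decoded);
  *after_first_free = live_blocks;	/* merged is still live here */
  counted_free (merged);
  *after_second_free = live_blocks;
}
```

Stopping after the first counted_free corresponds to the situation Jeff
pointed out: one block stays live. Only freeing both arrays brings the count
back to zero.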

Martin


Re: [PATCH] Optimize macro: make it more predictable

2020-11-09 Thread Martin Liška

On 11/3/20 2:34 PM, Jakub Jelinek wrote:

On Tue, Nov 03, 2020 at 02:27:52PM +0100, Richard Biener wrote:

On Fri, Oct 23, 2020 at 1:47 PM Martin Liška  wrote:

This is a follow-up to the discussion that happened in the thread about the
no_stack_protector attribute:
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html

The current optimize attribute works in the following way:
- 1) we take current global_options as base
- 2) maybe_default_options is called for the currently selected optimization
   level, which means all rules in default_options_table are executed
- 3) attribute values are applied (via decode_options)

So step 2) is problematic: -O2 -fno-omit-frame-pointer combined with
__attribute__((optimize("-fno-stack-protector")))
basically ends up as -O2 -fno-stack-protector, because -fomit-frame-pointer is
the default:
  /* -O1 and -Og optimizations.  */
  { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
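
The three steps can be modelled with a toy option set. This is only a sketch:
struct toy_opts and all functions below are hypothetical, the real state lives
in global_options and global_options_set, and -fstack-protector is added to
the command line purely so that the attribute has something to turn off.

```c
#include <stdbool.h>

/* Hypothetical miniature of global_options / global_options_set.  */
struct toy_opts
{
  int level;			/* -O level */
  bool omit_frame_pointer;	/* -fomit-frame-pointer */
  bool stack_protector;		/* -fstack-protector */
  bool omit_fp_explicit;	/* user wrote -f[no-]omit-frame-pointer */
};

/* Command line: -O2 -fno-omit-frame-pointer -fstack-protector.  */
static struct toy_opts
command_line (void)
{
  struct toy_opts o = { 2, false, true, true };
  return o;
}

/* Step 2 as described: re-apply the level defaults unconditionally.  */
static void
apply_level_defaults_always (struct toy_opts *o)
{
  if (o->level >= 1)
    o->omit_frame_pointer = true;
}

/* Alternative: only fill in defaults the user did not set explicitly.  */
static void
apply_level_defaults_if_unset (struct toy_opts *o)
{
  if (o->level >= 1 && !o->omit_fp_explicit)
    o->omit_frame_pointer = true;
}

/* optimize("-fno-stack-protector") under the scheme described above.  */
static struct toy_opts
attribute_old (void)
{
  struct toy_opts o = command_line ();	/* step 1 */
  apply_level_defaults_always (&o);	/* step 2 */
  o.stack_protector = false;		/* step 3 */
  return o;
}

/* Same attribute when it behaves like appending to the command line.  */
static struct toy_opts
attribute_append (void)
{
  struct toy_opts o = command_line ();
  apply_level_defaults_if_unset (&o);
  o.stack_protector = false;
  return o;
}
```

In this model attribute_old () ends up with omit_frame_pointer set again even
though the user disabled it, while attribute_append () keeps the explicit
-fno-omit-frame-pointer.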

My patch handles that, and the optimize attribute now really behaves the same as
appending the attribute value to the command line. So far so good. We should
also reflect that in the documentation entry, which is quite vague right now:

"""
The optimize attribute is used to specify that a function is to be compiled 
with different optimization options than specified on the command line.
"""

and we may want to handle -Ox in the attribute in a special way. I guess many 
macro/pragma users expect that

-O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
with -ftree-vectorize -O1 ?


Hmm.  I guess the only two reasonable options are to append to the active set
and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
end up with -O1.


I'd say we always want to append, but only take into account explicit
options.


Yes, I also prefer to always append and basically drop the "reset" 
functionality.


So basically get the effect of
take the command line, append to that options from the optimize/target
pragmas in effect and append to that options from optimize/target
attributes and only from that figure out the implicit options.


A few notes here:
- target and optimize attributes are separate, so parsing happens independently;
  however, they use global_options and global_options_set as a starting point
- you can have a series of wrapped optimize/pragma macros, and again the
  information is shared in global_options/global_options_set
- target and optimize options interact, but in a controlled way with
  SET_OPTION_IF_UNSET

That said, I hope the biggest offender right now is the handling of -Olevel.

@Jakub: Do you see a situation with my patch where it breaks?

Thanks,
Martin



Jakub





Re: [Patch] x86: Enable GCC support for Intel AVX-VNNI extension

2020-11-09 Thread Hongtao Liu via Gcc-patches
>
> +  /* Support unified builtin.  */
> +  || (mask2 == OPTION_MASK_ISA2_AVXVNNI)
>
> I don't think we gain anything with unified builtins. Better, just
> introduce separate builtins, e.g for
>

Unified builtins are used for unified intrinsics: intrinsics users may prefer
the same interface and let the compiler decide the encoding version. Separate
builtins may cause ambiguous definitions when the target attribute is used; see
avx-vnni-2.c.
We also provide a separate intrinsics interface for compatibility with
different compilers (llvm/msvc/icc).
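
The "either ISA" requirement that a unified builtin implies can be sketched as
follows. The TOY_* bit values and the function name are hypothetical; the real
OPTION_MASK_ISA_* and OPTION_MASK_ISA2_* constants have different values, and
this is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.  */
#define TOY_ISA_AVX512VNNI (UINT64_C (1) << 0)
#define TOY_ISA_AVX512VL   (UINT64_C (1) << 1)
#define TOY_ISA2_AVXVNNI   (UINT64_C (1) << 0)

/* True if the enabled ISA set satisfies a unified VNNI builtin: either
   AVX512VNNI together with AVX512VL, or AVXVNNI on its own.  */
static bool
unified_vnni_available (uint64_t isa, uint64_t isa2)
{
  const uint64_t evex = TOY_ISA_AVX512VNNI | TOY_ISA_AVX512VL;
  return (isa & evex) == evex || (isa2 & TOY_ISA2_AVXVNNI) != 0;
}
```

AVX512VNNI without AVX512VL is not sufficient in this model, which is why the
check has to compare against the full mask instead of testing single bits.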

> -BDESC (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL, 0,
> CODE_FOR_vpdpbusd_v8si, "__builtin_ia32_vpdpbusd_v8si",
> IX86_BUILTIN_VPDPBUSDV8SI, UNKNOWN, (int) V8SI_FTYPE_V8SI_V8SI_V8SI)
> +BDESC (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL,
> OPTION_MASK_ISA2_AVXVNNI, CODE_FOR_vpdpbusd_v8si,
> "__builtin_ia32_vpdpbusd_v8si", IX86_BUILTIN_VPDPBUSDV8SI, UNKNOWN,
> (int) V8SI_FTYPE_V8SI_V8SI_V8SI)
>
> add __builtin_ia32_vpdbusd_avx_v8si with the same CODE_FOR.
>
> This will remove the need for:
>
> +  if ((((bisa & (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL))
> +== (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL))
> +   || (bisa2 & OPTION_MASK_ISA2_AVXVNNI) != 0)
> +  && (((isa & (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL))
> +   == (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL))
> +  || (isa2 & OPTION_MASK_ISA2_AVXVNNI) != 0))
> +{
> +  isa |= OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL;
> +  isa2 |= OPTION_MASK_ISA2_AVXVNNI;
> +}
>
> which is already complex with AVX512VL processing.
>
> +#ifdef __AVXVNNI__
> +#define _mm256_dpbusd_avx_epi32(A, B, C) \
> +  _mm256_dpbusd_epi32((A), (B), (C))
> +#define _mm_dpbusd_avx_epi32(A, B, C) \
> +  _mm_dpbusd_epi32((A), (B), (C))
> +#define _mm256_dpbusds_avx_epi32(A, B, C) \
> +  _mm256_dpbusds_epi32((A), (B), (C))
> +#define _mm_dpbusds_avx_epi32(A, B, C) \
> +  _mm_dpbusds_epi32((A), (B), (C))
> +#define _mm256_dpwssd_avx_epi32(A, B, C) \
> +  _mm256_dpwssd_epi32((A), (B), (C))
> +#define _mm_dpwssd_avx_epi32(A, B, C) \
> +  _mm_dpwssd_epi32((A), (B), (C))
> +#define _mm256_dpwssds_avx_epi32(A, B, C) \
> +  _mm256_dpwssds_epi32((A), (B), (C))
> +#define _mm_dpwssds_avx_epi32(A, B, C) \
> +  _mm_dpwssds_epi32((A), (B), (C))
> +#endif /* __AVXVNNI__ */
> +
>
> The above won't be needed with separate builtins.
>
> Please repost the patch, I think that the following part(s) of the
> patch were already committed via another patch:
>
> @@ -399,8 +403,8 @@ ix86_handle_option (struct gcc_options *opts,
>  {
>opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_SSE_UNSET;
>opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_SSE_UNSET;
> -  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX512F_UNSET;
> -  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX512F_UNSET;
> +  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX2_UNSET;
> +  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX2_UNSET;
>  }
>return true;
>

Yes.

> No review for the sse.md and for testcases.
>
> Uros.

Updated the patch based on the latest trunk.

--
BR,
Hongtao
From 881868b8c9f5925c63a953454f45f5e0a3c8ea4f Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Tue, 13 Oct 2020 16:16:16 +0800
Subject: [PATCH] Support Intel AVX VNNI

2020-10-13  Hongtao Liu  
	Hongyu Wang  

gcc/
	* common/config/i386/cpuinfo.h (get_available_features):
	Detect AVXVNNI.
	* common/config/i386/i386-common.c
	(OPTION_MASK_ISA2_AVXVNNI_SET,
	OPTION_MASK_ISA2_AVXVNNI_UNSET, OPTION_MASK_ISA2_AVX2_UNSET):
	New.
	(ix86_handle_option): Handle -mavxvnni, unset avxvnni when
	avx2 is disabled.
	* common/config/i386/i386-cpuinfo.h (enum processor_features):
	Add FEATURE_AVXVNNI.
	* common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY
	for avxvnni.
	* config.gcc: Add avxvnniintrin.h.
	* config/i386/avx512vnniintrin.h: Remove 128/256 bit non-mask
	intrinsics.
	* config/i386/avxvnniintrin.h: New header file.
	* config/i386/cpuid.h (bit_AVXVNNI): New.
	* config/i386/i386-builtins.c (def_builtin): Handle AVXVNNI mask
	for unified builtin.
	* config/i386/i386-builtin.def (BDESC): Adjust AVX512VNNI
	builtins for AVXVNNI.
	* config/i386/i386-c.c (ix86_target_macros_internal): Define
	__AVXVNNI__.
	* config/i386/i386-expand.c (ix86_expand_builtin): Handle bisa
	for AVXVNNI to support unified intrinsic name, since there is no
	dependency between AVX512VNNI and AVXVNNI.
	* config/i386/i386-options.c (isa2_opts): Add -mavxvnni.
	(ix86_valid_target_attribute_inner_p): Handle avxvnni.
	(ix86_option_override_internal): Ditto.
	* config/i386/i386.h (TARGET_AVXVNNI, TARGET_AVXVNNI_P,
	TARGET_AVXVNNI_P, PTA_AVXVNNI): New.
	(PTA_SAPPHIRERAPIDS): Add AVX_VNNI.
	(PTA_ALDERLAKE): Likewise.
	* config/i386/i386.md ("isa"): Add avxvnni, avx512vnnivl.
	("enabled"): Adjust for avxvnni and avx512vnnivl.
	* config/i386/i386.opt: Add option -mavxvnni.
	* 
