Re: [PATCH v3] libgcc: Use `-fasynchronous-unwind-tables' for LIB2_DIVMOD_FUNCS

2020-08-25 Thread Kito Cheng via Gcc-patches
Hi Maciej:

Thanks for your explanation, I am OK with this change for the RISC-V port now,
but I think I don't have permission to approve this patch since it's
more than RISC-V port specific,
maybe you need approval from Richard Biener.


On Wed, Aug 26, 2020 at 12:33 AM Maciej W. Rozycki via Gcc-patches
 wrote:
>
> Hi Kito,
>
> > I just found the mail thread about div mod with -fnon-call-exceptions,
> > I think keeping the default LIB2_DIVMOD_EXCEPTION_FLAGS unchanged
> > should be the best way to go.
> >
> > Non-call exceptions and libcalls
> > https://gcc.gnu.org/legacy-ml/gcc/2001-06/msg01108.html
> >
> > Non-call exceptions and libcalls Part 2
> > https://gcc.gnu.org/legacy-ml/gcc/2001-07/msg00402.html
>
>  Thank you for your input.  I believe I had a look at these commits before
> I posted my original proposal.  Please note however that they both predate
> the addition of `-fasynchronous-unwind-tables', so clearly the option
> could not have been considered at the time the changes were accepted into
> GCC.
>
>  Please note that, as observed by Andreas and Richard here:
>  in no case we
> want to have full exception handling here, so we clearly need no
> `-fexceptions'; this libcall code won't itself ever call `throw'.
>
>  Now it might be a bit unclear from documentation as to whether we want
> `-fnon-call-exceptions' or `-fasynchronous-unwind-tables', as it says that
> the former option makes GCC:
>
> "Generate code that allows trapping instructions to throw
>  exceptions.  Note that this requires platform-specific runtime
>  support that does not exist everywhere.  Moreover, it only allows
>  _trapping_ instructions to throw exceptions, i.e. memory references
>  or floating-point instructions.  It does not allow exceptions to be
>  thrown from arbitrary signal handlers such as 'SIGALRM'."
>
> Note the observation that arbitrary signal handlers (invoked at more inner
> a frame level, and necessarily built with `-fexceptions') are still not
> allowed to throw exceptions.  For that, as far as I understand it, you
> actually need `-fasynchronous-unwind-tables', which makes GCC:
>
> "Generate unwind table in DWARF format, if supported by target
>  machine.  The table is exact at each instruction boundary, so it
>  can be used for stack unwinding from asynchronous events (such as
>  debugger or garbage collector)."
>
> and therefore allows arbitrary signal handlers to throw exceptions,
> effectively making the option a superset of `-fexceptions'.  As libcall
> code can generally be implicitly invoked everywhere, we want people not to
> be restrained by it and let a exception thrown by e.g. a user-supplied
> SIGALRM handler propagate through the relevant libcall's stack frame,
> rather than just those exceptions the libcall itself might indirectly
> cause.
>
>  Maybe I am missing something here, especially as `-fexceptions' mentions
> code generation, while `-fasynchronous-unwind-tables' only refers to
> unwind table generation, but then what would be the option to allow
> exceptions to be thrown from arbitrary signal handlers rather than those
> for memory references or floating-point instructions (where by a special
> provision integer division falls as well)?
>
>  My understanding has been it is `-fasynchronous-unwind-tables', but I'll
> be gladly straightened out otherwise.  If I am indeed right, then perhaps
> the documentation could be clarified and expanded a bit.
>
>  Barring evidence to the contrary I maintain the change I have proposed is
> correct, and not only removes the RISC-V `ld.so' build issue, but it fixes
> the handling of asynchronous events arriving in the middle of the relevant
> libcalls for all platforms as well.
>
>  Please let me know if you have any further questions, comments or
> concerns.
>
>   Maciej


[committed] analyzer: fix leak false positive/widening on pointer iteration [PR94858]

2020-08-25 Thread David Malcolm via Gcc-patches
PR analyzer/94858 reports a false diagnostic from
-Wanalyzer-malloc-leak, where the allocated pointer is pointed to by a
field of a struct, and a loop writes to a buffer, writing through an
iterating pointer value.

There were several underlying problems, relating to clobbering of the
struct holding the malloc-ed pointer; in each case the analyzer was
conservatively assuming that a write could affect this region,
clobbering it to "unknown", and this was detected as a leak.

The initial write within the loop dereferences the initial value of
a field, and the analyzer was assuming that that pointer could
point to the result of the malloc call.  The patch extends
store::eval_alias_1 so that it "knows" that the initial value of a
pointer at the beginning of a path can't point to a region that was
allocated on the heap after the beginning of the path.

On fixing that, the next issue is that within the loop the iterated
pointer value becomes "unknown", and hence *ptr becomes a write to a
symbolic region, and thus might clobber the struct (which it can't).
This patch adds enough logic to svalue::can_merge_p to merge the
iterating pointer value so that at the 2nd iteration analyzing
the loop it becomes a widening_svalue from the initial svalue, so
that this becomes a fixed point of the analysis, and is not an
unknown_svalue.  The patch further extends store::eval_alias_1 so that
it "knows" that this widening_svalue can only point to the same base
region as the initial value did; in particular, symbolic writes through
this pointer can only clobber that base region, not the struct holding
the malloc-ed pointer.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-2855-g2fc201382d3498778934f1262b57acd20f76f300.

gcc/analyzer/ChangeLog:
PR analyzer/94858
* region-model-manager.cc
(region_model_manager::get_or_create_widening_svalue): Assert that
neither of the inputs are themselves widenings.
* store.cc (store::eval_alias_1): The initial value of a pointer
can't point to a region that was allocated on the heap after the
beginning of the path.  A widened pointer value can't alias anything
that the initial pointer value can't alias.
* svalue.cc (svalue::can_merge_p): Merge BINOP (X, OP, CST) with X
to a widening svalue.  Merge
BINOP(WIDENING(BASE, BINOP(BASE, X)), X) and BINOP(BASE, X) to
to the LHS of the first BINOP.

gcc/testsuite/ChangeLog:
PR analyzer/94858
* gcc.dg/analyzer/loop-start-up-to-end-by-1.c: Remove xfail.
* gcc.dg/analyzer/pr94858-1.c: New test.
* gcc.dg/analyzer/pr94858-2.c: New test.
* gcc.dg/analyzer/torture/loop-inc-ptr-2.c: Update expected number
of enodes.
* gcc.dg/analyzer/torture/loop-inc-ptr-3.c: Likewise.
---
 gcc/analyzer/region-model-manager.cc  |  2 +
 gcc/analyzer/store.cc | 29 +
 gcc/analyzer/svalue.cc| 23 ++
 .../analyzer/loop-start-up-to-end-by-1.c  |  2 -
 gcc/testsuite/gcc.dg/analyzer/pr94858-1.c | 42 +++
 gcc/testsuite/gcc.dg/analyzer/pr94858-2.c | 25 +++
 .../gcc.dg/analyzer/torture/loop-inc-ptr-2.c  |  2 +-
 .../gcc.dg/analyzer/torture/loop-inc-ptr-3.c  |  2 +-
 8 files changed, 123 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr94858-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr94858-2.c

diff --git a/gcc/analyzer/region-model-manager.cc 
b/gcc/analyzer/region-model-manager.cc
index ce949322db7..9bfb0812998 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -651,6 +651,8 @@ region_model_manager::get_or_create_widening_svalue (tree 
type,
 const svalue *base_sval,
 const svalue *iter_sval)
 {
+  gcc_assert (base_sval->get_kind () != SK_WIDENING);
+  gcc_assert (iter_sval->get_kind () != SK_WIDENING);
   widening_svalue::key_t key (type, point, base_sval, iter_sval);
   if (widening_svalue **slot = m_widening_values_map.get (key))
 return *slot;
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index 8439366f0b5..14f7c00bde6 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -1606,6 +1606,35 @@ store::eval_alias_1 (const region *base_reg_a,
  /* The initial value of a pointer can't point to a local.  */
  return tristate::TS_FALSE;
}
+  if (sval_a->get_kind () == SK_INITIAL
+ && base_reg_b->get_kind () == RK_HEAP_ALLOCATED)
+   {
+ /* The initial value of a pointer can't point to a
+region that was allocated on the heap after the beginning of the
+path.  */
+ return tristate::TS_FALSE;
+   }
+  if (const widening_svalue *widening_sval_a
+ = 

[committed] analyzer: fix ICE on initializers for unsized array fields [PR96777]

2020-08-25 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-2854-gd88c8df70342fcd6817e23f243ff38d0fe42fc6b.

gcc/analyzer/ChangeLog:
PR analyzer/96777
* region-model.h (class compound_svalue): Document that all keys
must be concrete.
(compound_svalue::compound_svalue): Move definition to svalue.cc.
* store.cc (binding_map::apply_ctor_to_region): Handle
initializers for trailing arrays with incomplete size.
* svalue.cc (compound_svalue::compound_svalue): Move definition
here from region-model.h.  Add assertion that all keys are
concrete.

gcc/testsuite/ChangeLog:
PR analyzer/96777
* gcc.dg/analyzer/pr96777.c: New test.
---
 gcc/analyzer/region-model.h |  9 -
 gcc/analyzer/store.cc   | 25 +
 gcc/analyzer/svalue.cc  | 14 ++
 gcc/testsuite/gcc.dg/analyzer/pr96777.c | 12 
 4 files changed, 55 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr96777.c

diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 79d739e3a7b..f325a8b8381 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -1109,6 +1109,9 @@ namespace ana {
mapping, but are required to use an svalue, such as when handling
compound assignments and compound return values.
 
+   All keys within the underlying binding_map are required to be concrete,
+   not symbolic.
+
Instances of this class shouldn't be bound as-is into the store;
instead they should be unpacked.  Similarly, they should not be
nested.  */
@@ -1150,11 +1153,7 @@ public:
 const binding_map *m_map_ptr;
   };
 
-  compound_svalue (tree type, const binding_map )
-  : svalue (calc_complexity (map), type),
-m_map (map)
-  {
-  }
+  compound_svalue (tree type, const binding_map );
 
   enum svalue_kind get_kind () const FINAL OVERRIDE { return SK_COMPOUND; }
   const compound_svalue *dyn_cast_compound_svalue () const { return this; }
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index 298088f6ef9..8439366f0b5 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -429,6 +429,31 @@ binding_map::apply_ctor_to_region (const region 
*parent_reg, tree ctor,
  const binding_key *k
= binding_key::make (mgr->get_store_manager (), child_reg,
 BK_direct);
+ /* Handle the case where we have an unknown size for child_reg
+(e.g. due to it being a trailing field with incomplete array
+type.  */
+ if (!k->concrete_p ())
+   {
+ /* Assume that sval has a well-defined size for this case.  */
+ tree sval_type = sval->get_type ();
+ gcc_assert (sval_type);
+ HOST_WIDE_INT sval_byte_size = int_size_in_bytes (sval_type);
+ gcc_assert (sval_byte_size != -1);
+ bit_size_t sval_bit_size = sval_byte_size * BITS_PER_UNIT;
+ /* Get offset of child relative to base region.  */
+ region_offset child_base_offset = child_reg->get_offset ();
+ gcc_assert (!child_base_offset.symbolic_p ());
+ /* Convert to an offset relative to the parent region.  */
+ region_offset parent_base_offset = parent_reg->get_offset ();
+ gcc_assert (!parent_base_offset.symbolic_p ());
+ bit_offset_t child_parent_offset
+   = (child_base_offset.get_bit_offset ()
+  - parent_base_offset.get_bit_offset ());
+ /* Create a concrete key for the child within the parent.  */
+ k = mgr->get_store_manager ()->get_concrete_binding
+   (child_parent_offset, sval_bit_size, BK_direct);
+   }
+ gcc_assert (k->concrete_p ());
  put (k, sval);
}
 }
diff --git a/gcc/analyzer/svalue.cc b/gcc/analyzer/svalue.cc
index eabb13affcb..99b5507a43c 100644
--- a/gcc/analyzer/svalue.cc
+++ b/gcc/analyzer/svalue.cc
@@ -910,6 +910,20 @@ unmergeable_svalue::implicitly_live_p (const svalue_set 
_svalues,
 
 /* class compound_svalue : public svalue.  */
 
+compound_svalue::compound_svalue (tree type, const binding_map )
+: svalue (calc_complexity (map), type), m_map (map)
+{
+  /* All keys within the underlying binding_map are required to be concrete,
+ not symbolic.  */
+#if CHECKING_P
+  for (iterator_t iter = begin (); iter != end (); ++iter)
+{
+  const binding_key *key = (*iter).first;
+  gcc_assert (key->concrete_p ());
+}
+#endif
+}
+
 /* Implementation of svalue::dump_to_pp vfunc for compound_svalue.  */
 
 void
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr96777.c 
b/gcc/testsuite/gcc.dg/analyzer/pr96777.c
new file mode 100644
index 000..2bb2a4e0be3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr96777.c
@@ -0,0 +1,12 @@
+struct ge {
+  char au;
+  char 

RE: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

2020-08-25 Thread xiezhiheng
> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Tuesday, August 25, 2020 7:08 PM
> To: xiezhiheng 
> Cc: Richard Biener ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> emitted at -O3
> 
> xiezhiheng  writes:
> >> -Original Message-
> >> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> >> Sent: Friday, August 21, 2020 5:02 PM
> >> To: xiezhiheng 
> >> Cc: Richard Biener ;
> gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> >> emitted at -O3
> >
> > Cut...
> >
> >> Looks like the saturating intrinsics might need a bit more thought.
> >> Would you mind submitting the patch with just the other parts?
> >> Those were uncontroversial and it would be a shame to hold them
> >> up over this.
> >
> > Okay, I reorganized the existing patch and finished the first half of the
> intrinsics
> > except saturating intrinsics and load intrinsics.
> >
> > Bootstrapped and tested on aarch64 Linux platform.
> 
> I know this'll be frustrating, sorry, but could you post the
> 2020-08-17 patch without the saturation changes?  It's going to be
> easier to track and review if each patch deals with similar intrinsics.
> The non-saturating part of the 2020-08-17 patch was good because it was
> dealing purely with arithmetic operations.  Loads should really be a
> separate change.
> 
> BTW, for something like this, it's OK to test and submit several patches
> at once, so separating the patches doesn't need to mean longer test cycles.
> It's just that for review purposes, it's easier if one patch does one thing.
> 

That's true.  And I finished the patch to add FLAG for add/sub arithmetic
intrinsics except saturating intrinsics.  Later I will try to separate the rest
into several subsets to fix.

Bootstrapped and tested on aarch64 Linux platform.


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7a71b4367d4..a93712ae0a5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2020-08-26  Zhiheng Xie  
+
+   * config/aarch64/aarch64-simd-builtins.def: Add proper FLAG
+   for add/sub arithmetic intrinsics.
+

> > For load intrinsics, I have one problem when I set FLAG_READ_MEMORY
> for them,
> > some test cases like
> > gcc.target/aarch64/advsimd-intrinsics/vld2_lane_p8_indices_1.c
> >   #include 
> >
> >   /* { dg-do compile } */
> >   /* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } } */
> >
> >   poly8x8x2_t
> >   f_vld2_lane_p8 (poly8_t * p, poly8x8x2_t v)
> >   {
> > poly8x8x2_t res;
> > /* { dg-error "lane 8 out of range 0 - 7" "" { target *-*-* } 0 } */
> > res = vld2_lane_p8 (p, v, 8);
> > /* { dg-error "lane -1 out of range 0 - 7" "" { target *-*-* } 0 } */
> > res = vld2_lane_p8 (p, v, -1);
> > return res;
> >   }
> > would fail in regression.  Because the first statement
> >   res = vld2_lane_p8 (p, v, 8);
> > would be eliminated as dead code in gimple phase but the error message is
> > generated in expand pass.  So I am going to replace the second statement
> >   res = vld2_lane_p8 (p, v, -1);
> > with
> >   res = vld2_lane_p8 (p, res, -1);
> > or do you have any other suggestions?
> 
> The test is valid as-is, so it would be better not to change it.
> 
> I guess this means that we should leave the _lane loads and stores until
> we implement the range checks in a different way.  This is somewhat
> related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969 ,
> although your example shows that the “dummy const function” approach
> might not work.
> 
> So to start with, could you just patch the non-lane loads?

Okay.

> 
> > And for test case gcc.target/aarch64/arg-type-diagnostics-1.c, I return the
> result
> > to prevent the statement
> >   result = vrsra_n_s32 (arg1, arg2, a);
> > from being eliminated by treated as dead code.
> 
> Hmm.  Here too I think the test is valid as-is.  I think we need
> to ensure that the range check still happens even if the call is
> dead code (similar to PR95969 above).

I agree.  That would be more reasonable.

> 
> So I guess here too, it might be better to leave the _n forms to
> a separate patch.
> 
> That doesn't mean we shouldn't fix the _lane and _n cases (or the
> previous saturating cases).  It's just that each time we find a group
> of functions that's awkward for some reason, it'd be better to deal
> with those functions separately.
> 
> Thanks,
> Richard


pr94442-v2.patch
Description: pr94442-v2.patch


Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-25 Thread Qing Zhao via Gcc-patches



> On Aug 25, 2020, at 9:05 AM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> 
> 
>> On Aug 25, 2020, at 1:41 AM, Uros Bizjak  wrote:
>> 
 
>> (The other side of the coin is how much this helps prevent exploitation;
>> numbers on that would be good to see, too.)
> 
> This can be well showed from the paper:
> 
> "Clean the Scratch Registers: A Way to Mitigate Return-Oriented 
> Programming Attacks"
> 
> https://urldefense.com/v3/__https://ieeexplore.ieee.org/document/8445132__;!!GqivPVa7Brio!JbdLvo54xB3ORTeZqpy_PwZsL9drNLaKjbg14bTKMOwxt8LWnjZ8gJWlqtlrFKPh$
>  
>   >
> 
> Please take a look at this paper.
 
 As I told you before, that isn't open information, I cannot reply to
 any of that.
>>> 
>>> A little confused here, what’s you mean by “open information”? Is the 
>>> information in a published paper not open information?
>> 
>> No, because it is behind a paywall.
> 
> Still don’t understand here:  this paper has been published in the proceeding 
> of “ 2018 IEEE 29th International Conference on Application-specific Systems, 
> Architectures and Processors (ASAP)”.
> If you want to read the complete version online, you need to pay for it.
> 
> However, it’s still a published paper, and the information inside it should 
> be “open information”. 
> 
> So, what’s the definition of “open information” you have?
> 
> I downloaded a PDF copy of this paper through my company’s paid account.  But 
> I am not sure whether it’s legal for me to attach it to this mailing list?

After consulting, it turned out that I was not allowed to further forward the 
copy I downloaded through my company’s account to this alias. 
There is some more information on this paper online though:

https://www.semanticscholar.org/paper/Clean-the-Scratch-Registers:-A-Way-to-Mitigate-Rong-Xie/6f2ce4fd31baa0f6c02f9eb5c57b90d39fe5fa13

All the figures and tables in this paper are available in this link. 

In which, Figure 1 is an illustration  of a typical ROP attack, please pay 
special attention on the “Gadgets”, which are carefully chosen machine 
instruction sequences that are already present in the machine's memory, Each 
gadget typically ends in a return instruction and is located in a subroutine 
within the existing program and/or shared library code. Chained together, these 
gadgets allow an attacker to perform arbitrary operations on a machine 
employing defenses that thwart simpler attacks.

The paper identified the important features of ROP attack as following:

"First, the destination of using gadget chains in usual is performing system 
call or system fucntion to perform malicious behaviour such as file access, 
network access and W ⊕ X disable. In most cases, the adversary would like to 
disable W ⊕ X. Because once W ⊕ X has been disabled, shellcode can be executed 
directly instead of rewritting shellcode to ROP chains which may cause some 
troubles for the adversary. 

Second, if the adversary performs ROP attacks using system call instruction, no 
matter on x86 or x64 architecture, the register would be used to pass 
parameter. Or if the adversary performs ROP attacks using system function such 
as “read” or “mprotect”, on x64 system, the register would still be used to 
pass parameters, as mentioned in subsection B and C.”
As a result, the paper proposed the idea to zeroing scratch registers that pass 
parameters at the “return” insns to mitigate the ROP attack. 

Table III, Table IV and Table V are the results of “zeroing scratch register 
mitigate ROP attack”. From the tables, zeroing scratch registers can 
successfully mitigate the ROP on all those benchmarks. 

Table VI is the performance overhead of their implementation, it looks like 
very high, average 16.2X runtime overhead.  However, this implementation is not 
use compiler to statically generate zeroing sequence, instead, it used "dynamic 
binary instrumentation at runtime “ to check every instruction to 
1. Set/unset flags to check which scratch registers are used in the routine;
2. Whether the instruction is return instruction or not, if it’s is return, 
insert the zeroing used scratch register sequence before the “return” insn. 

Due to the above run-time dynamic instrumentation method, the high runtime 
overhead is expecting, I think.

If we use GCC to statically check the “used” information and add zeroing 
sequence before return insn, the run-time overhead will be much smaller. 

I will provide run-time overhead information with the 2nd version of the patch 
by using CPU2017 applications.

thanks.

Qing


> Qing
> 
> 
>> 
>> Uros.



Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-25 Thread Qing Zhao via Gcc-patches



> On Aug 24, 2020, at 3:20 PM, Segher Boessenkool  
> wrote:
> 
> Hi!
> 
> On Mon, Aug 24, 2020 at 01:02:03PM -0500, Qing Zhao wrote:
>>> On Aug 24, 2020, at 12:49 PM, Segher Boessenkool 
>>>  wrote:
>>> On Wed, Aug 19, 2020 at 06:27:45PM -0500, Qing Zhao wrote:
> On Aug 19, 2020, at 5:57 PM, Segher Boessenkool 
>  wrote:
> Numbers on how expensive this is (for what arch, in code size and in
> execution time) would be useful.  If it is so expensive that no one will
> use it, it helps security at most none at all :-(
>>> 
>>> Without numbers on this, no one can determine if it is a good tradeoff
>>> for them.  And we (the GCC people) cannot know if it will be useful for
>>> enough users that it will be worth the effort for us.  Which is why I
>>> keep hammering on this point.
>> I can collect some run-time overhead data on this, do you have a 
>> recommendation on what test suite I can use
>> For this testing? (Is CPU2017 good enough)?
> 
> I would use something more real-life, not 12 small pieces of code.

There is some basic information about the benchmarks of CPU2017 in below link:

https://www.spec.org/cpu2017/Docs/overview.html#suites 


GCC itself is one of the benchmarks in CPU2017 (502.gcc_r). And 526.blender_r 
is even larger than 502.gcc_r. 
And there are several other quite big benchmarks as well (perlbench, xalancbmk, 
parest, imagick, etc).

thanks.

Qing

Re: [PATCH] lra: Canonicalize mult to shift in address reloads

2020-08-25 Thread Vladimir Makarov via Gcc-patches



On 2020-08-25 6:18 a.m., Alex Coplan wrote:

The motivation here is to be able to remove several redundant patterns
in the AArch64 backend. See the previous thread [1] for context.

Testing:
  * Bootstrapped and regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu.
  * New unit test which checks that we're using the shift pattern (rather
than the problematic mult pattern) on AArch64.

OK for master?

Yes. Thank you for working on this issue and the patch.

Thanks,
Alex

[0] : https://gcc.gnu.org/onlinedocs/gccint/Insn-Canonicalizations.html
[1] : https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552066.html

---

gcc/ChangeLog:

* lra-constraints.c (canonicalize_reload_addr): New.
(curr_insn_transform): Use canonicalize_reload_addr to ensure we
generate canonical RTL for an address reload.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/mem-shift-canonical.c: New test.





Re: [PATCH] libiberty: Add support for 'in' and 'in ref' storage classes.

2020-08-25 Thread Jeff Law via Gcc-patches
On Fri, 2020-08-07 at 13:32 +0200, Iain Buclaw via Gcc-patches wrote:
> Hi,
> 
> This patch adds support for 'in' as a first-class storage class with its
> own mangle symbol, of which also permits 'in ref'.  Previously, 'in' was
> an alias to 'const [scope]', which is a type constructor.
> 
> The mangle symbol repurposed for this is 'I', which was originally used
> by identifier types.  However, while TypeIdentifier is part of the
> grammar, it must be resolved to some other entity during the semantic
> passes, and so shouldn't appear anywhere in the mangled name.
> 
> OK for mainline?
> 
> Regards
> Iain.
> 
> ---
> libiberty/ChangeLog:
> 
>   * d-demangle.c (dlang_function_args): Handle 'in' and 'in ref'
>   parameter storage classes.
>   (dlang_type): Remove identifier type.
>   * testsuite/d-demangle-expected: Update tests.
OK
jeff
> 



Re: [PATCH] bb-reorder: Remove a misfiring micro-optimization (PR96475)

2020-08-25 Thread Jeff Law via Gcc-patches
On Fri, 2020-08-07 at 21:51 +, Segher Boessenkool wrote:
> When the compgotos pass copies the tail of blocks ending in an indirect
> jump, there is a micro-optimization to not copy the last one, since the
> original block will then just be deleted.  This does not work properly
> if cleanup_cfg does not merge all pairs of blocks we expect it to.
> 
> 
> v2: This also deletes the other use of single_pred_p, which has the same
> problem in principle, I just never have triggered it so far.
> 
> Tested on powerpc64-linux {-m32,-m64} like before.  Is this okay for
> trunk?
> 
> 
> Segher
> 
> 
> 2020-08-07  Segher Boessenkool  
> 
>   PR rtl-optimization/96475
>   * bb-reorder.c (maybe_duplicate_computed_goto): Remove single_pred_p
>   micro-optimization.
So this may have already been answered, but why didn't we merge the single pred
with its single succ?

Though I guess your patch wouldn't hurt, so OK.

jeff




Re: [PATCH] rs6000: Disable -fcaller-saves by default

2020-08-25 Thread Jeff Law via Gcc-patches
On Mon, 2020-08-10 at 11:11 -0500, Peter Bergner wrote:
> On 7/23/20 3:29 PM, Jeff Law wrote:
> > > > What's driving this change?
> > > 
> > > Peter noticed IRA allocates stuff to volatile registers while it is life
> > > through a call, and then LRA has to correct that, not optimal.
> > I can't recall if IRA or LRA handles the need to save them to the stack, 
> > but the
> > whole point is you can do much better sometimes by saving into the stack 
> > with the
> > caller-saves algorithm vs just giving up and spilling.
> > 
> > > > IRA will do a cost/benefit analysis to see using call clobbered 
> > > > registers like
> > > > this is profitable or not.  You're just turning the whole thing off.
> 
> Sorry for taking so long to reply.  I'm the guilty party for asking Pat
> to submit the patch. :-)  I was not aware IRA did that and just assumed
> it was a bug.  For now, consider the patch withdrawn.  That said, I
> will have to look at that cost computation, especially since when I
> last looked, IRA does not count potential prologue/epilogue save/restore
> insns if it were to assign a non-volatile register when computing reg
> costs.  That would seem to be an important consideration here.
No worries.  Yea, you want to count the prologue/epilogue, as well as the
saves/restores at the call points  (which need frequency scaling and accounting
for saves which don't need to happen because a prior save is sufficient to cover
more than one call), etc.

I think much of this code is still in caller-save.c.  It's been eons since I
worked on it, but I can probably get reacquainted with the implementation if you
have questions



> I'll note this all came about when I was looking at PR96017, which is
> due to not shrink-wrapping a pseudo.  That was due to it being live
> across a call.  I first I thought (for the 2nd test case, not the original
> one) split_live_ranges_for_shrink_wrap() was nicely splitting the pseudo
> for us, but it must have been the caller-saves algorithm you mention above.
> However, that code doesn't handle the original test case, which I think
> it should.
> 
> My thought for that bug was to introduce some extra splitting before RA
> (maybe as part of split_live_ranges_for_shrink_wrap?) where we split
> pseudos that are live across a call, but that have at least one path
> to the exit that doesn't cross a call.  However, if we can beef up
> the caller-saves cost computation, maybe that would work too?
I've gone back and forth on pre allocation splitting as well as post-allocating
splitting and re-allocation.   I could argue either side of that discussion --
given we've got a bit of special code for splitting to help shrink wrapping,
maybe that's the best place if we need to do splitting before RA since this was
triggered by digging into a shrink-wrapping problem.

I'd probably start by making sure the cost computation is sane though.

Jeff



Re: [PATCH] Treat { 0 } specially for structs with the designated_init attribute.

2020-08-25 Thread Asher Gordon via Gcc-patches
Hi Jeff,

Jeff Law  writes:

> On Mon, 2020-08-03 at 14:47 -0400, Asher Gordon via Gcc-patches wrote:
>> Is there any reason my patches haven't been applied yet? Is there
>> anything else I need to do?
> No, nothing you really need to do.  We're backed up more than usual, but 
> yours is
> definitely in the queue.

OK, thanks. Let me know if there's anything I could do to help.

Asher

-- 
I often quote myself; it adds spice to my conversation.
-- G. B. Shaw
   
I prefer to send and receive mail encrypted. Please send me your
public key, and if you do not have my public key, please let me
know. Thanks.

GPG fingerprint: 38F3 975C D173 4037 B397  8095 D4C9 C4FC 5460 8E68


signature.asc
Description: PGP signature


Re: [PATCH] C-SKY: Add -mbacktrace option.

2020-08-25 Thread Jeff Law via Gcc-patches
On Fri, 2020-08-21 at 14:18 +0800, Jojo R wrote:
> gcc/ChangeLog:
> 
>   * config/csky/csky.opt (TARGET_BACKTRACE): New.
>   * doc/invoke.texi (C-SKY Options): Document -mbacktrace.
ISTM you need an actual implementation.  All this does is add an option.  It's
impossible to know if this is a good idea without seeing implementation bits.

jeff



Re: [PATCH]Adjust testcase.

2020-08-25 Thread Jeff Law via Gcc-patches
On Tue, 2020-08-25 at 22:01 +0200, Jakub Jelinek wrote:
> On Tue, Aug 25, 2020 at 01:45:01PM -0600, Jeff Law via Gcc-patches wrote:
> > On Tue, 2020-08-18 at 15:18 +0800, Hongtao Liu via Gcc-patches wrote:
> > > Hi:
> > >   Rewriting testcase with cpp source file, then compare operator could
> > > be used directly for vector, this would avoid impact of vectorizer.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > PR target/96667
> > > * gcc.target/i386/avx512bw-pr96246-1.c: Moved to...
> > > * g++.target/i386/avx512bw-pr96246-1.C: ...here.
> > > * gcc.target/i386/avx512bw-pr96246-2.c: Moved to...
> > > * g++.target/i386/avx512bw-pr96246-2.C: ...here.
> > > * gcc.target/i386/avx512vl-pr96246-1.c: Moved to...
> > > * g++.target/i386/avx512vl-pr96246-1.C: ...here.
> > > * gcc.target/i386/avx512vl-pr96246-2.c: Moved to...
> > > * g++.target/i386/avx512vl-pr96246-2.C: ...here.
> > 
> > Shouldn't these go into g++.target/i386?
> 
> They should, but the patch is moving them there and the ChangeLog entry says
> that.
Good grief.  I missed that.  This patch is OK ;-)

jeff



Re: [PATCH]Adjust testcase.

2020-08-25 Thread Jakub Jelinek via Gcc-patches
On Tue, Aug 25, 2020 at 01:45:01PM -0600, Jeff Law via Gcc-patches wrote:
> On Tue, 2020-08-18 at 15:18 +0800, Hongtao Liu via Gcc-patches wrote:
> > Hi:
> >   Rewriting testcase with cpp source file, then compare operator could
> > be used directly for vector, this would avoid impact of vectorizer.
> > 
> > gcc/testsuite/ChangeLog:
> > PR target/96667
> > * gcc.target/i386/avx512bw-pr96246-1.c: Moved to...
> > * g++.target/i386/avx512bw-pr96246-1.C: ...here.
> > * gcc.target/i386/avx512bw-pr96246-2.c: Moved to...
> > * g++.target/i386/avx512bw-pr96246-2.C: ...here.
> > * gcc.target/i386/avx512vl-pr96246-1.c: Moved to...
> > * g++.target/i386/avx512vl-pr96246-1.C: ...here.
> > * gcc.target/i386/avx512vl-pr96246-2.c: Moved to...
> > * g++.target/i386/avx512vl-pr96246-2.C: ...here.
> 
> Shouldn't these go into g++.target/i386?

They should, but the patch is moving them there and the ChangeLog entry says
that.

Jakub



Re: [PATCH] libstdc++: Fix operator overload resolution for calendar types

2020-08-25 Thread Patrick Palka via Gcc-patches
On Tue, 25 Aug 2020, Patrick Palka wrote:

> My original patch that implemented the calendar type operations failed
> to enforce a constraint on some of the addition/subtraction operator
> overloads that take a 'months' argument:
> 
>   Constraints: If the argument supplied by the caller for the months
>   parameter is convertible to years, its implicit conversion sequence to
>   years is worse than its implicit conversion sequence to months
> 
> This constraint is relevant when adding/subtracting a duration to/from
> say a year_month when the given duration is convertible to both 'months'
> and to 'years'.  The correct behavior here in light of this constraint
> is to select the (more efficient) 'years'-taking overload, but we
> currently emit an ambiguous overload error.
> 
> This patch follows the approach taken in the 'date' library and defines
> the constrained 'months'-taking operator overloads as function
> templates, so that we break such a implicit-conversion tie by selecting
> the non-template 'years'-taking overload.
> 
> Tested on x86_64-pc-linux-gnu, does this look OK to commit?  (The below
> diff is generated with --ignore-space-change for easier review.  In the
> actual patch, the function templates are indented two extra spaces after
> the template parameter list.)
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/chrono (__detail::__unspecified_month_disambuator):
>   Define.

Whoops, this ChangeLog line should say
__detail::__months_years_conversion_disambiguator.

>   (year_month::operator+=): Turn the 'months'-taking overload
>   into a function template, so that the 'years'-taking overload is
>   selected in case of an equally-ranked implicit conversion
>   sequence to both 'months' and 'years' from the supplied argument.
>   (year_month::operator-=): Likewise.
>   (year_month::operator+): Likewise.
>   (year_month::operator-): Likewise.
>   (year_month_day::operator+=): Likewise.
>   (year_month_day::operator-=): Likewise.
>   (year_month_day::operator+): Likewise.
>   (year_month_day::operator-): Likewise.
>   (year_month_day_last::operator+=): Likewise.
>   (year_month_day_last::operator-=): Likewise.
>   (year_month_day_last::operator+): Likewise
>   (year_month_day_last::operator-): Likewise.
>   (year_month_day_weekday::operator+=): Likewise
>   (year_month_day_weekday::operator-=): Likewise.
>   (year_month_day_weekday::operator+): Likewise.
>   (year_month_day_weekday::operator-): Likewise.
>   (year_month_day_weekday_last::operator+=): Likewise
>   (year_month_day_weekday_last::operator-=): Likewise.
>   (year_month_day_weekday_last::operator+): Likewise.
>   (year_month_day_weekday_last::operator-): Likewise.
>   (testsuite/std/time/year_month/2.cc): New test.
>   (testsuite/std/time/year_month_day/2.cc): New test.
>   (testsuite/std/time/year_month_day_last/2.cc): New test.
>   (testsuite/std/time/year_month_weekday/2.cc): New test.
>   (testsuite/std/time/year_month_weekday_last/2.cc): New test.
> ---
>  libstdc++-v3/include/std/chrono   | 52 ++-
>  .../testsuite/std/time/year_month/2.cc| 40 ++
>  .../testsuite/std/time/year_month_day/2.cc| 40 ++
>  .../std/time/year_month_day_last/2.cc | 40 ++
>  .../std/time/year_month_weekday/2.cc  | 40 ++
>  .../std/time/year_month_weekday_last/2.cc | 40 ++
>  6 files changed, 250 insertions(+), 2 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/std/time/year_month/2.cc
>  create mode 100644 libstdc++-v3/testsuite/std/time/year_month_day/2.cc
>  create mode 100644 libstdc++-v3/testsuite/std/time/year_month_day_last/2.cc
>  create mode 100644 libstdc++-v3/testsuite/std/time/year_month_weekday/2.cc
>  create mode 100644 
> libstdc++-v3/testsuite/std/time/year_month_weekday_last/2.cc
> 
> diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
> index 3cc1438a7b6..0e272c3da58 100644
> --- a/libstdc++-v3/include/std/chrono
> +++ b/libstdc++-v3/include/std/chrono
> @@ -2046,6 +2046,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  
>  // YEAR_MONTH
>  
> +namespace __detail
> +{
> +  // [time.cal.ym], [time.cal.ymd], etc constrain the 'months'-taking
> +  // addition/subtraction operator overloads like so:
> +  //
> +  //   Constraints: if the argument supplied by the caller for the months
> +  //   parameter is convertible to years, its implicit conversion 
> sequence
> +  //   to years is worse than its implicit conversion sequence to months.
> +  //
> +  // We realize this constraint by defining the 'months'-taking 
> overloads as
> +  // function templates (with a dummy defaulted template parameter), so 
> that
> +  // overload resolution doesn't select the 'months'-taking overload 
> unless
> +  // the implicit conversion 

Re: [PATCH] RX add control register PC

2020-08-25 Thread Jeff Law via Gcc-patches
On Tue, 2020-08-18 at 16:05 +0300, Darius Galis wrote:
> Hello,
> 
> The following patch is adding the PC control register.
> It also updates the __builtin_rx_mvfc() function, since
> according to the documentation, the PC register cannot be specified
> as dest.
> 
> The regression was tested with the following command:
> 
> make -k check-gcc RUNTESTFLAGS=--target_board=rx-sim
> 
> There were no additionals failures.
> 
> Please let me know if this is OK, Thank you!
> Darius Galis
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index b834a2c..3436c25 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2020-08-17  Darius Galis  
> +
> + * config/rx/rx.md (CTRLREG_PC): Add.
> + * config/rx/rx.c (CTRLREG_PC): Add
> + (rx_expand_builtin_mvtc): Add warning: PC register cannot
> + be used as dest.
> +
>   2020-08-03  Jonathan Wakely  
>   
>   * doc/cpp.texi (Variadic Macros): Use the exact ... token in
> diff --git a/gcc/config/rx/rx.c b/gcc/config/rx/rx.c
> index 151ad39..1cc88d9 100644
> --- a/gcc/config/rx/rx.c
> +++ b/gcc/config/rx/rx.c
> @@ -639,6 +639,7 @@ rx_print_operand (FILE * file, rtx op, int letter)
> switch (INTVAL (op))
>   {
>   case CTRLREG_PSW:   fprintf (file, "psw"); break;
> + case CTRLREG_PC:fprintf (file, "pc"); break;
>   case CTRLREG_USP:   fprintf (file, "usp"); break;
>   case CTRLREG_FPSW:  fprintf (file, "fpsw"); break;
>   case CTRLREG_BPSW:  fprintf (file, "bpsw"); break;
> @@ -2474,6 +2475,14 @@ rx_expand_builtin_mvtc (tree exp)
> if (! REG_P (arg2))
>   arg2 = force_reg (SImode, arg2);
>   
> +  if (INTVAL(arg1) == 1/*PC*/)

We generally avoid comments on the same line as code.  And there should be a
space before the open paren of a function or macro argument.  So:

/* PC */
if (INTVAL (arg1) == 1)

With that change, this is OK to commit.  I can't recall if you have commit privs
or not...  If not, I can commit for you.


Thanks,
Jeff



[PATCH] libstdc++: Fix operator overload resolution for calendar types

2020-08-25 Thread Patrick Palka via Gcc-patches
My original patch that implemented the calendar type operations failed
to enforce a constraint on some of the addition/subtraction operator
overloads that take a 'months' argument:

  Constraints: If the argument supplied by the caller for the months
  parameter is convertible to years, its implicit conversion sequence to
  years is worse than its implicit conversion sequence to months

This constraint is relevant when adding/subtracting a duration to/from
say a year_month when the given duration is convertible to both 'months'
and to 'years'.  The correct behavior here in light of this constraint
is to select the (more efficient) 'years'-taking overload, but we
currently emit an ambiguous overload error.

This patch follows the approach taken in the 'date' library and defines
the constrained 'months'-taking operator overloads as function
templates, so that we break such a implicit-conversion tie by selecting
the non-template 'years'-taking overload.

Tested on x86_64-pc-linux-gnu, does this look OK to commit?  (The below
diff is generated with --ignore-space-change for easier review.  In the
actual patch, the function templates are indented two extra spaces after
the template parameter list.)

libstdc++-v3/ChangeLog:

* include/std/chrono (__detail::__unspecified_month_disambuator):
Define.
(year_month::operator+=): Turn the 'months'-taking overload
into a function template, so that the 'years'-taking overload is
selected in case of an equally-ranked implicit conversion
sequence to both 'months' and 'years' from the supplied argument.
(year_month::operator-=): Likewise.
(year_month::operator+): Likewise.
(year_month::operator-): Likewise.
(year_month_day::operator+=): Likewise.
(year_month_day::operator-=): Likewise.
(year_month_day::operator+): Likewise.
(year_month_day::operator-): Likewise.
(year_month_day_last::operator+=): Likewise.
(year_month_day_last::operator-=): Likewise.
(year_month_day_last::operator+): Likewise
(year_month_day_last::operator-): Likewise.
(year_month_day_weekday::operator+=): Likewise
(year_month_day_weekday::operator-=): Likewise.
(year_month_day_weekday::operator+): Likewise.
(year_month_day_weekday::operator-): Likewise.
(year_month_day_weekday_last::operator+=): Likewise
(year_month_day_weekday_last::operator-=): Likewise.
(year_month_day_weekday_last::operator+): Likewise.
(year_month_day_weekday_last::operator-): Likewise.
(testsuite/std/time/year_month/2.cc): New test.
(testsuite/std/time/year_month_day/2.cc): New test.
(testsuite/std/time/year_month_day_last/2.cc): New test.
(testsuite/std/time/year_month_weekday/2.cc): New test.
(testsuite/std/time/year_month_weekday_last/2.cc): New test.
---
 libstdc++-v3/include/std/chrono   | 52 ++-
 .../testsuite/std/time/year_month/2.cc| 40 ++
 .../testsuite/std/time/year_month_day/2.cc| 40 ++
 .../std/time/year_month_day_last/2.cc | 40 ++
 .../std/time/year_month_weekday/2.cc  | 40 ++
 .../std/time/year_month_weekday_last/2.cc | 40 ++
 6 files changed, 250 insertions(+), 2 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/std/time/year_month/2.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/year_month_day/2.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/year_month_day_last/2.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/year_month_weekday/2.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/year_month_weekday_last/2.cc

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 3cc1438a7b6..0e272c3da58 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -2046,6 +2046,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 // YEAR_MONTH
 
+namespace __detail
+{
+  // [time.cal.ym], [time.cal.ymd], etc constrain the 'months'-taking
+  // addition/subtraction operator overloads like so:
+  //
+  //   Constraints: if the argument supplied by the caller for the months
+  //   parameter is convertible to years, its implicit conversion sequence
+  //   to years is worse than its implicit conversion sequence to months.
+  //
+  // We realize this constraint by defining the 'months'-taking overloads 
as
+  // function templates (with a dummy defaulted template parameter), so 
that
+  // overload resolution doesn't select the 'months'-taking overload unless
+  // the implicit conversion sequence to 'months' is better than that to
+  // 'years'.
+  using __months_years_conversion_disambiguator = void;
+}
+
 class year_month
 {
 private:
@@ -2068,6 +2085,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   month() const noexcept
   { 

Re: [PATCH]Adjust testcase.

2020-08-25 Thread Jeff Law via Gcc-patches
On Tue, 2020-08-18 at 15:18 +0800, Hongtao Liu via Gcc-patches wrote:
> Hi:
>   Rewriting testcase with cpp source file, then compare operator could
> be used directly for vector, this would avoid impact of vectorizer.
> 
> gcc/testsuite/ChangeLog:
> PR target/96667
> * gcc.target/i386/avx512bw-pr96246-1.c: Moved to...
> * g++.target/i386/avx512bw-pr96246-1.C: ...here.
> * gcc.target/i386/avx512bw-pr96246-2.c: Moved to...
> * g++.target/i386/avx512bw-pr96246-2.C: ...here.
> * gcc.target/i386/avx512vl-pr96246-1.c: Moved to...
> * g++.target/i386/avx512vl-pr96246-1.C: ...here.
> * gcc.target/i386/avx512vl-pr96246-2.c: Moved to...
> * g++.target/i386/avx512vl-pr96246-2.C: ...here.

Shouldn't these go into g++.target/i386?

jeff



Re: [PATCH] x86: Detect Rocket Lake and Alder Lake

2020-08-25 Thread Jeff Law via Gcc-patches
On Sun, 2020-08-16 at 06:17 -0700, H.J. Lu via Gcc-patches wrote:
> From arch/x86/include/asm/intel-family.h on Linux kernel master branch:
> 
>  #define INTEL_FAM6_ROCKETLAKE   0xA7
>  #define INTEL_FAM6_ALDERLAKE0x97
> 
>   * common/config/i386/cpuinfo.h (get_intel_cpu): Detect Rocket
>   Lake and Alder Lake.
OK
jeff
> 



Re: [RISC-V] Add support for AddressSanitizer on RISC-V GCC

2020-08-25 Thread Jim Wilson
On Wed, Aug 19, 2020 at 1:02 AM Joshua via Gcc-patches
 wrote:
> * config/riscv/riscv.c (asan_shadow_offset): Implement the offset of 
> asan shadow memory for risc-v.
> (asan_shadow_offset): new macro definition.

When I try the patch, I get asan errors complaining about memory mappings.

==9422==Shadow memory range interleaves with an existing memory
mapping. ASan cannot proceed correctly. ABORTING.
==9422==ASan shadow was supposed to be located in the
[0x7fff7000-0x10007fff7fff] range.

I haven't gotten it working yet, but I haven't tried debugging it yet
either.  How exactly are you testing it?  It fails for me using a 4.15
kernel on hardware, and on a user mode qemu.  I haven't tried booting
a 5.x kernel in qemu, that would take some time to set up.  Likewise
using a 5.x kernel on hardware would take some time to set up.

Jim


Re: [PATCH] Treat { 0 } specially for structs with the designated_init attribute.

2020-08-25 Thread Jeff Law via Gcc-patches
On Mon, 2020-08-03 at 14:47 -0400, Asher Gordon via Gcc-patches wrote:
> Hello,
> 
> Asher Gordon  writes:
> 
> > My copyright assignment finally got finished, so you should be able to
> > apply my patches now.
> 
> Is there any reason my patches haven't been applied yet? Is there
> anything else I need to do?
No, nothing you really need to do.  We're backed up more than usual, but yours 
is
definitely in the queue.

jeff



Re: [PATCH v4] genemit.c (main): split insn-emit.c for compiling parallelly

2020-08-25 Thread Jeff Law via Gcc-patches
On Sat, 2020-08-01 at 13:38 +0200, Václav Haisman via Gcc-patches wrote:
> On 01. 08. 20 13:02, Jojo R wrote:
> > gcc/ChangeLog:
> > 
> > * genemit.c (main): Print 'split line'.
> > * Makefile.in (insn-emit.c): Define split count and file
> > 
> > ---
> >  gcc/Makefile.in | 11 +++
> >  gcc/genemit.c   | 87 -
> >  2 files changed, 60 insertions(+), 38 deletions(-)
> > 
> > diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> > index 2ba76656dbf..bc0b3e6d343 100644
> > --- a/gcc/Makefile.in
> > +++ b/gcc/Makefile.in
> > @@ -1253,6 +1253,13 @@ ANALYZER_OBJS = \
> >  # We put the *-match.o and insn-*.o files first so that a parallel make
> >  # will build them sooner, because they are large and otherwise tend to be
> >  # the last objects to finish building.
> > +
> > +insn-generated-split-num = $(shell nproc)
> > +
> > +insn-emit-split-c = $(foreach o, $(shell for i in 
> > {1..$(insn-generated-split-num)}; do echo $$i; done), insn-emit$(o).c)
> 
> I believe {1..10} is a bashism. Is it OK in this context to require Bash?
Nope.  It needs to be more portable than that.

jeff



Re: [PATCH] testsuite: Enable fenv effective-targets for C++

2020-08-25 Thread Jeff Law via Gcc-patches
On Thu, 2020-07-30 at 11:23 +0100, Jonathan Wakely via Gcc-patches wrote:
> I recently wanted to use { dg-require-effective-target fenv } in a
> libstdc++ test, but it uses -std=gnu99 which is only valid for C.
> 
> This allows C++ tests to use the fenv and fenv_exceptions
> effective-target keywords.
> 
> gcc/testsuite/ChangeLog:
> 
>   * lib/target-supports.exp (check_effective_target_fenv): Use
>   -std=gnu++11 instead of -std=gnu99 for C++ tests.
>   (check_effective_target_fenv_exceptions): Likewise.
> 
> 
> I don't actually need to use the 'fenv' target now, but does this seem
> like a reasonable change anyway?
It seems reasonable to me.  Selecting the right option automatically seems 
better
than any of the alternatives.

jeff



Re: [PATCH 3/3] MSP430: Simplify and extend shift instruction patterns

2020-08-25 Thread Jeff Law via Gcc-patches
On Tue, 2020-07-21 at 19:29 +0100, Jozef Lawrynowicz wrote:
> The implementation of define_expand and define_insn patterns to handle
> shifts in the MSP430 backend is inconsistent, resulting in missed
> opportunities to make best use of the architecture's features.
> 
> There's now a single define_expand used as the entry point for all valid
> shifts, and the decision to either use a helper function to perform the
> shift (often required for the 430 ISA), or fall through to the
> define_insn patterns can be made from that expander function.
> 
> Shifts by a constant amount have been grouped into one define_insn for
> each type of shift, instead of having different define_insn patterns for
> shifts by different amounts.
> 
> A new target option "-mmax-inline-shift=" has been added to allow tuning
> of the number of shift instructions to emit inline, instead of using
> a library helper function.
> 
> Successfully regtested on trunk for msp430-elf, ok to apply?
OK
jeff



Re: [PATCH] x86: Change CTZ_DEFINED_VALUE_AT_ZERO to return 0/2

2020-08-25 Thread Jeff Law via Gcc-patches
On Mon, 2020-07-13 at 06:42 -0700, H.J. Lu via Gcc-patches wrote:
> Change CTZ_DEFINED_VALUE_AT_ZERO/CTZ_DEFINED_VALUE_AT_ZERO to return 0/2
> to enable table-based clz/ctz optimization:
> 
>  -- Macro: CLZ_DEFINED_VALUE_AT_ZERO (MODE, VALUE)
>  -- Macro: CTZ_DEFINED_VALUE_AT_ZERO (MODE, VALUE)
>  A C expression that indicates whether the architecture defines a
>  value for 'clz' or 'ctz' with a zero operand.  A result of '0'
>  indicates the value is undefined.  If the value is defined for only
>  the RTL expression, the macro should evaluate to '1'; if the value
>  applies also to the corresponding optab entry (which is normally
>  the case if it expands directly into the corresponding RTL), then
>  the macro should evaluate to '2'.  In the cases where the value is
>  defined, VALUE should be set to this value.
> 
> gcc/
> 
>   PR target/95863
>   * config/i386/i386.h (CTZ_DEFINED_VALUE_AT_ZERO): Return 0/2.
>   (CLZ_DEFINED_VALUE_AT_ZERO): Likewise.
> 
> gcc/testsuite/
> 
>   PR target/95863
>   * gcc.target/i386/pr95863-1.c: New test.
>   * gcc.target/i386/pr95863-2.c: Likewise.
OK
jeff
> 



Re: [PATCH 1/2] IPA symver: allow multiple symvers for a definition

2020-08-25 Thread Jan Hubicka
> 
> gcc/ChangeLog:
> 
>   * cgraphunit.c (process_symver_attribute): Allow multiple
>   symver attributes for one symbol.
>   * doc/extend.texi: Document the change.
> 
> gcc/testsuite/ChangeLog:
> 
>   * lib/target-supports-dg.exp: Add dg-require-symver.
>   * lib/target-supports.exp: Likewise.
>   * gcc.dg/ipa/symver1.c: New test.
> ---
>  gcc/cgraphunit.c | 143 ---
>  gcc/doc/extend.texi  |  10 +-
>  gcc/testsuite/gcc.dg/ipa/symver1.c   |  11 ++
>  gcc/testsuite/lib/target-supports-dg.exp |  10 ++
>  gcc/testsuite/lib/target-supports.exp|  12 ++
>  5 files changed, 110 insertions(+), 76 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/symver1.c
> 

>  @smallexample
> -__attribute__ ((__symver__ ("foo@@VERS_2")))
> -__attribute__ ((alias ("foo_v1")))
> -int symver_foo_v1 (void);
> +__attribute__ ((__symver__ ("foo@@VERS_2"), ("foo@@VERS_3")))
> +int symver_foo_v1 (void)
So we can still write it as follows:
+__attribute__ ((__symver__ ("foo@@VERS_2"
+__attribute__ ((__symver__ ("foo@@VERS_3"
+int symver_foo_v1 (void)

I think we should support this syntax so the versions can be added
separately by macros mixed with other attributs

Honza
> +@{
> +@}
>  @end smallexample
>  
>  This example creates an alias of @code{foo_v1} with symbol name
> diff --git a/gcc/testsuite/gcc.dg/ipa/symver1.c 
> b/gcc/testsuite/gcc.dg/ipa/symver1.c
> new file mode 100644
> index 000..645de7ea259
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ipa/symver1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +
> +__attribute__ ((__symver__ ("foo@VER_2")))
> +__attribute__ ((__symver__ ("foo@VER_3")))
> +int foo()
> +{
> +  return 2;
> +}
> +
> +/* { dg-final { scan-assembler ".symver.*foo, foo@VER_2" } } */
> +/* { dg-final { scan-assembler ".symver.*foo, foo@VER_3" } } */
> diff --git a/gcc/testsuite/lib/target-supports-dg.exp 
> b/gcc/testsuite/lib/target-supports-dg.exp
> index 5bb99f4e8f9..4a03eaae9ce 100644
> --- a/gcc/testsuite/lib/target-supports-dg.exp
> +++ b/gcc/testsuite/lib/target-supports-dg.exp
> @@ -665,3 +665,13 @@ if { [info procs saved-dg-process-target] == [list] } {
>   return [dg-process-target-1 $selector]
>  }
>  }
> +
> +# If this target does not support the "symver" attribute, skip this
> +# test.
> +
> +proc dg-require-symver { args } {
> +if { ![ check_symver_available ] } {
> + upvar dg-do-what dg-do-what
> + set dg-do-what [list [lindex ${dg-do-what} 0] "N" "P"]
> +}
> +}
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index c24330a27ab..f3fc5b80aea 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -10445,3 +10445,15 @@ proc check_effective_target_large_initializer { } {
>  
>  return 1
>  }
> +# Returns 1 if the target toolchain supports extended
> +# syntax of .symver directive, 0 otherwise.
> +
> +proc check_symver_available { } {
> +return [check_no_compiler_messages symver_available object {
> + int foo(void) { return 0; }
> + int main (void) {
> + asm volatile (".symver foo,foo@VER_1, local");
> + return 0;
> + }
> + }]
> +}



Re: [PATCH 2/2] IPA symver: support visibility and static symbols.

2020-08-25 Thread Jan Hubicka
> 
> gcc/ChangeLog:
> 
>   * cgraphunit.c (process_symver_attribute): Remove checks that
>   are not needed now.
>   (cgraph_node::assemble_thunks_and_aliases): Change second
>   argument to decl.
>   * config/elfos.h (ASM_OUTPUT_SYMVER_DIRECTIVE): Add new
>   VISIBILITY parameter.
>   * doc/extend.texi: Document that .symver supports visibility.
>   * symtab.c (symtab_node::verify_base): Remove checks that
>   are not needed now.
>   * varasm.c (do_assemble_symver): Detect visibility .symver
>   directive argument.
>   * varpool.c (varpool_node::assemble_aliases): Change second
>   argument to decl.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/ipa/symver2.c: New test.
>   * gcc.dg/ipa/symver3.c: New test.

Thanks for looking into this!
>  void
> -do_assemble_symver (tree decl, tree target)
> +do_assemble_symver (tree decl, tree origin_decl)
>  {
> +  tree target = DECL_ASSEMBLER_NAME (origin_decl);
>tree id = DECL_ASSEMBLER_NAME (decl);
>ultimate_transparent_alias_target ();
>ultimate_transparent_alias_target ();
>  #ifdef ASM_OUTPUT_SYMVER_DIRECTIVE
> +  const char *visibility = NULL;
> +  if (!TREE_PUBLIC (origin_decl))
> +visibility = "remove";
> +  else if (DECL_VISIBILITY (origin_decl) == VISIBILITY_INTERNAL)
> +visibility = "local";
> +  else if (DECL_VISIBILITY (origin_decl) == VISIBILITY_HIDDEN)
> +visibility = "hidden";

What will happen here with protected visibility?

Otherwise patch makes sense to me.
Honza
>ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
>  IDENTIFIER_POINTER (target),
> -IDENTIFIER_POINTER (id));
> +IDENTIFIER_POINTER (id), visibility);
>  #else
>error ("symver is only supported on ELF platforms");
>  #endif
> diff --git a/gcc/varpool.c b/gcc/varpool.c
> index 458cdf1bf37..95d7844a398 100644
> --- a/gcc/varpool.c
> +++ b/gcc/varpool.c
> @@ -540,8 +540,7 @@ varpool_node::assemble_aliases (void)
>  {
>varpool_node *alias = dyn_cast  (ref->referring);
>if (alias->symver)
> - do_assemble_symver (alias->decl,
> - DECL_ASSEMBLER_NAME (decl));
> + do_assemble_symver (alias->decl, decl);
>else if (!alias->transparent_alias)
>   do_assemble_alias (alias->decl,
>  DECL_ASSEMBLER_NAME (decl));



Re: [PATCH 2/5] C front end support to detect out-of-bounds accesses to array parameters

2020-08-25 Thread Martin Sebor via Gcc-patches

Joseph, do you have any more comments on the rest of the most recent
revision of the patch?

https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552266.html

Martin

On 8/19/20 6:09 PM, Joseph Myers wrote:

On Wed, 19 Aug 2020, Martin Sebor via Gcc-patches wrote:


I think you need a while loop there, not just an if, to account for the
case of multiple consecutive cdk_attrs.  At least the GNU attribute syntax

 direct-declarator:
[...]
   ( gnu-attributes[opt] declarator )

should produce multiple consecutive cdk_attrs for each level of
parentheses with attributes inside.


I had considered a loop but couldn't find a way to trigger what you
describe (or a test in the testsuite that would do it) so I didn't
use one.  I saw loops like that in other places but I couldn't get
even those to uncover such a test case.  Here's what I tried:

   #define A(N) __attribute__ ((aligned (N), may_alias))
   int n;
   void f (int (* A (2) A (4) (* A (2) A (4) (* A (2) A (4) [n])[n])));

Sequences of consecutive attributes are all chained together.

I've added the loop here but I have no test for it.  It would be
good to add one if it really is needed.


The sort of thing I'm thinking of would be, where A is some attribute:

void f (int (A (A (A arg;

(that example doesn't involve an array, but it illustrates the syntax I'd
expect to produce multiple consecutive cdk_attrs).





Re: *PING**2 – Re: [Patch] OpenMP: Handle cpp_implicit_alias in declare-target discovery (PR96390)

2020-08-25 Thread Tobias Burnus

And another PING.

On 8/17/20 9:17 AM, Tobias Burnus wrote:

On 8/3/20 5:37 PM, Tobias Burnus wrote:

It turned out that the omp_discover_declare_target_tgt_fn_r
discovered all nodes – but as it tagged the C++ alias nodes
and not the streamed-out nodes, no device function was created
and one got link errors if offloading devices were configured.
(Only with -O0 as otherwise inlining happened.)

(Testcase is based on a sollve_vv testcase which in turn was
based on an LLVM bugreport.)

OK?

Tobias


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: [EXTERNAL] Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.

2020-08-25 Thread will schmidt via Gcc-patches
On Mon, 2020-08-24 at 14:39 -0700, Carl Love wrote:
> Segher:
> 
> On Wed, 2020-08-19 at 15:16 -0500, Segher Boessenkool wrote:
> > On Wed, Aug 19, 2020 at 02:19:12PM -0500, Peter Bergner wrote:
> > > On 8/14/20 7:42 PM, Segher Boessenkool wrote:
> > > > I think your current code is fine; I hadn't considered Bill's
> > > > upcoming
> > > > rewrite.  It is more important to make that go smoother than to
> > > > fix some
> > > > aesthetics right now.
> > > > 
> > > > Please check if the names for the builtins match the (future)
> > > > documentation, and then, approved for trunk.  Thank you!
> > > 
> > > This is also a bug in GCC 10, so this should be backported too.
> > > 
> > > Segher, is this ok for Carl to backport to GCC 10 after it has sat
> > > on
> > > trunk for a few days?
> > 
> > Yes.  Thanks guys.


Hi, 

> 
> I attempted to do the backport however the patch doesn't even come
> close to applying.  The names XVCVBF16SPN and XVCVSPBF16 are the only
> two builtin names that exist in GCC 10.  The other issue is there is no
> Power 10 builtin macro definitions in GCC 10.  So basically, I can fix
> the issue with XVCVBF16SPN and XVCVSPBF16 to be restricted to Power 10
> by adding the needed Power 10 macro definition.  
> 
> This is a whole new patch so I figure it needs to be reviewed to make
> sure we want to make this change to GCC 10.  I did run the regression
> tests again using a Power 9 machine to verify it complies and there are
> no regression test failures.  
> 


I recommend a pure and clean description of what the patch is doing in a 
paragraph, separate from the history.

"
This patch corrects the built-in definitions for cvcvspbf16 and xvcvbf16spn to 
restrict their use to P10 and beyond in a way that
is consistent with the P8 and P9 builtin naming conventions.

This is a subset of changes made to trunk ...
"


> Please let me know if this is OK for the GCC 10 tree.  Thanks.
> 
>   Carl Love
> 
> 
> [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Back port to 
> GCC 10.
> 
> gcc/ChangeLog
> 
> 2020-08-24  Carl Love  

whitespace/indentation

>   * config/rs6000/rs6000-builtin.def: (BU_P10V_VSX_1) New builtin macro 
> expansion.
>   (XVCVBF16SPN, XVCVSPBF16): Replace macro expansion BU_VSX_1 with 
> BU_P10V_VSX_1.

OK as long as it's clear 'replace' means the define being used, versus the 
macro expansions themselves being replaced.

>   * config/rs6000/rs6000-call.c: (VSX_BUILTIN_XVCVSPBF16, 
> VSX_BUILTIN_XVCVBF16SPN):
>   Replace with P10V_BUILTIN_XVCVSPBF16, P10V_BUILTIN_XVCVBF16SPN 
> respectively.

s/Replace with/Rename to/ ?


> ---
>  gcc/config/rs6000/rs6000-builtin.def | 14 --
>  gcc/config/rs6000/rs6000-call.c  |  4 ++--
>  2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin.def 
> b/gcc/config/rs6000/rs6000-builtin.def
> index 88f78cb0a15..512b7629a46 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -1014,6 +1014,16 @@
>| RS6000_BTC_BINARY),  \
>   CODE_FOR_ ## ICODE) /* ICODE */
> 
> +/* For builtins for power10 vector instructions that are encoded as altivec
> +   instructions, use __builtin_altivec_ as the builtin name.  */

That comment doesn't seem to apply to this change, update or drop ? 

> +
> +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE)\
> +  RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM,/* ENUM */  \
> + "__builtin_vsx_" NAME,  /* NAME */  \
> + RS6000_BTM_P10, /* MASK */  \
> + (RS6000_BTC_ ## ATTR/* ATTR */  \
> + | RS6000_BTC_UNARY),\
> + CODE_FOR_ ## ICODE) /* ICODE */
>  #endif
> 
>  
> @@ -2698,8 +2708,8 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, 
> "__builtin_cfstring", RS6000_BTM_ALWAYS,
> RS6000_BTC_MISC)
> 
>  /* POWER10 MMA builtins.  */
> -BU_VSX_1 (XVCVBF16SPN,   "xvcvbf16spn",  MISC, vsx_xvcvbf16spn)
> -BU_VSX_1 (XVCVSPBF16,"xvcvspbf16",   MISC, vsx_xvcvspbf16)
> +BU_P10V_VSX_1 (XVCVBF16SPN,  "xvcvbf16spn",  MISC, vsx_xvcvbf16spn)
> +BU_P10V_VSX_1 (XVCVSPBF16,   "xvcvspbf16",   MISC, vsx_xvcvspbf16)
> 
>  BU_MMA_1 (XXMFACC,   "xxmfacc",  QUAD, mma_xxmfacc)
>  BU_MMA_1 (XXMTACC,   "xxmtacc",  QUAD, mma_xxmtacc)
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 2cf78dfa5fe..fc1671e1bb2 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -13383,8 +13383,8 @@ builtin_function_type (machine_mode mode_ret, 
> machine_mode mode_arg0,
>  case P8V_BUILTIN_VGBBD:
>  case 

Re: [PATCH v3] libgcc: Use `-fasynchronous-unwind-tables' for LIB2_DIVMOD_FUNCS

2020-08-25 Thread Maciej W. Rozycki via Gcc-patches
Hi Kito,

> I just found the mail thread about div mod with -fnon-call-exceptions,
> I think keeping the default LIB2_DIVMOD_EXCEPTION_FLAGS unchanged
> should be the best way to go.
> 
> Non-call exceptions and libcalls
> https://gcc.gnu.org/legacy-ml/gcc/2001-06/msg01108.html
> 
> Non-call exceptions and libcalls Part 2
> https://gcc.gnu.org/legacy-ml/gcc/2001-07/msg00402.html

 Thank you for your input.  I believe I had a look at these commits before 
I posted my original proposal.  Please note however that they both predate 
the addition of `-fasynchronous-unwind-tables', so clearly the option 
could not have been considered at the time the changes were accepted into 
GCC.

 Please note that, as observed by Andreas and Richard here: 
 in no case we 
want to have full exception handling here, so we clearly need no 
`-fexceptions'; this libcall code won't itself ever call `throw'.

 Now it might be a bit unclear from documentation as to whether we want 
`-fnon-call-exceptions' or `-fasynchronous-unwind-tables', as it says that 
the former option makes GCC:

"Generate code that allows trapping instructions to throw
 exceptions.  Note that this requires platform-specific runtime
 support that does not exist everywhere.  Moreover, it only allows
 _trapping_ instructions to throw exceptions, i.e. memory references
 or floating-point instructions.  It does not allow exceptions to be
 thrown from arbitrary signal handlers such as 'SIGALRM'."

Note the observation that arbitrary signal handlers (invoked at more inner 
a frame level, and necessarily built with `-fexceptions') are still not 
allowed to throw exceptions.  For that, as far as I understand it, you 
actually need `-fasynchronous-unwind-tables', which makes GCC:

"Generate unwind table in DWARF format, if supported by target
 machine.  The table is exact at each instruction boundary, so it
 can be used for stack unwinding from asynchronous events (such as
 debugger or garbage collector)."

and therefore allows arbitrary signal handlers to throw exceptions, 
effectively making the option a superset of `-fexceptions'.  As libcall 
code can generally be implicitly invoked everywhere, we want people not to 
be restrained by it and let a exception thrown by e.g. a user-supplied 
SIGALRM handler propagate through the relevant libcall's stack frame, 
rather than just those exceptions the libcall itself might indirectly 
cause.

 Maybe I am missing something here, especially as `-fexceptions' mentions 
code generation, while `-fasynchronous-unwind-tables' only refers to 
unwind table generation, but then what would be the option to allow 
exceptions to be thrown from arbitrary signal handlers rather than those 
for memory references or floating-point instructions (where by a special 
provision integer division falls as well)?

 My understanding has been it is `-fasynchronous-unwind-tables', but I'll 
be gladly straightened out otherwise.  If I am indeed right, then perhaps 
the documentation could be clarified and expanded a bit.

 Barring evidence to the contrary I maintain the change I have proposed is 
correct, and not only removes the RISC-V `ld.so' build issue, but it fixes 
the handling of asynchronous events arriving in the middle of the relevant 
libcalls for all platforms as well.

 Please let me know if you have any further questions, comments or 
concerns.

  Maciej


Re: [committed] libstdc++: Make self-move well-defined for containers [PR 85828]

2020-08-25 Thread Jonathan Wakely via Gcc-patches

On 12/08/20 20:37 +0100, Jonathan Wakely wrote:

The C++ LWG recently confirmed that self-move assignment should not have
undefined behaviour for standard containers (see the proposed resolution
of LWG 2839). The result should be a valid but unspecified value, just
like other times when a container is moved from.

Our std::list, std::__cxx11::basic_string and unordered containers all
have bugs which result in undefined behaviour.

For std::list the problem is that we clear the previous contents using
_M_clear() instead of clear(). This means the _M_next, _M_prev and
_M_size members are not zeroed, and so after we "update" them (with
their existing values), we are left with dangling pointers and a
non-zero size, but no elements.

For the unordered containers the problem is similar. _Hashtable first
deallocates the existing contents, then takes ownership of the pointers
from the RHS object (which has just had its contents deallocated so the
pointers are dangling).

For std::basic_string it's a little more subtle. When the string is
local (i.e. fits in the SSO buffer) we use char_traits::copy to copy the
contents from this->data() to __rhs.data(). When &__rhs == this that
copy violates the precondition that the ranges don't overlap. We only
need to check for self-move for this case where it's local, because the
only other case that can be true for self-move is that it's non-local
but the allocators compare equal. In that case the data pointer is
neither deallocated nor leaked, so the result is well-defined.

This patch also makes a small optimization for std::deque move
assignment, to use the efficient move when is_always_equal is false, but
the allocators compare equal at runtime.

Finally, we need to remove all the Debug Mode checks which abort the
program when a self-move is detected, because it's not undefined to do
that.

Before PR 85828 can be closed we should also look into fixing
std::shuffle so it doesn't do any redundant self-swaps.

libstdc++-v3/ChangeLog:

PR libstdc++/85828
* include/bits/basic_string.h (operator=(basic_string&&)): Check
for self-move before copying with char_traits::copy.
* include/bits/hashtable.h (operator=(_Hashtable&&)): Check for
self-move.
* include/bits/stl_deque.h (_M_move_assign1(deque&&, false_type)):
Check for equal allocators.
* include/bits/stl_list.h (_M_move_assign(list&&, true_type)):
Call clear() instead of _M_clear().
* include/debug/formatter.h (__msg_self_move_assign): Change
comment.
* include/debug/macros.h (__glibcxx_check_self_move_assign):
(_GLIBCXX_DEBUG_VERIFY): Remove.
* include/debug/safe_container.h (operator=(_Safe_container&&)):
Remove assertion check for safe move and make it well-defined.
* include/debug/safe_iterator.h (operator=(_Safe_iterator&&)):
Remove assertion check for self-move.
* include/debug/safe_local_iterator.h
(operator=(_Safe_local_iterator&&)): Likewise.
* testsuite/21_strings/basic_string/cons/char/self_move.cc: New test.
* testsuite/23_containers/deque/cons/self_move.cc: New test.
* testsuite/23_containers/forward_list/cons/self_move.cc: New test.
* testsuite/23_containers/list/cons/self_move.cc: New test.
* testsuite/23_containers/set/cons/self_move.cc: New test.
* testsuite/23_containers/unordered_set/cons/self_move.cc: New test.
* testsuite/23_containers/vector/cons/self_move.cc: New test.



This removes all the existing tests that expect a debug mode failure
that no longer happens.

Tested powerpc64le-linux. Committed to trunk.

commit 24f2764521d8f27760f03f626a4f20f4c82b7c73
Author: Jonathan Wakely 
Date:   Tue Aug 25 16:30:45 2020

libstdc++: Remove tests for self-move debug assertions

I recently removed the debug mode checks for self-move assignment, which
means these tests now fail when _GLIBCXX_DEBUG is added to the options
or when the check-debug target is used. Remove all the tests.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/debug/iterator_self_move_assign_neg.cc: Removed.
* testsuite/21_strings/debug/self_move_assign_neg.cc: Removed.
* testsuite/23_containers/deque/debug/iterator_self_move_assign_neg.cc: Removed.
* testsuite/23_containers/deque/debug/self_move_assign_neg.cc: Removed.
* testsuite/23_containers/forward_list/debug/iterator_self_move_assign_neg.cc: Removed.
* testsuite/23_containers/forward_list/debug/self_move_assign_neg.cc: Removed.
* testsuite/23_containers/list/debug/iterator_self_move_assign_neg.cc: Removed.
* testsuite/23_containers/list/debug/self_move_assign_neg.cc: Removed.
* testsuite/23_containers/map/debug/iterator_self_move_assign_neg.cc: Removed.
* testsuite/23_containers/map/debug/self_move_assign_neg.cc: Removed.
   

Re: [Patch] OpenMP: Improve map-clause error message for array function parameter (PR96678)

2020-08-25 Thread Jakub Jelinek via Gcc-patches
On Tue, Aug 25, 2020 at 05:33:08PM +0200, Tobias Burnus wrote:
> Improve the error message. Currently code like:
> 
> void test (double src[100])
> {
>   #pragma omp target map(alloc:src[:])
> 
> fails with the surprising
>  "for pointer type length expression must be specified"
> as "double src[100]" is regarded as "double *src".
> 
> Thus, one cannot simply extract the "100" and internally
> replace src[:] by src[0:100].
> 
> However, by talking about "array function parameter",
> I think the error message is at least a bit clearer.
> 
> OK?

LGTM, thanks.

Jakub



[Patch] OpenMP: Improve map-clause error message for array function parameter (PR96678)

2020-08-25 Thread Tobias Burnus

Improve the error message. Currently code like:

void test (double src[100])
{
  #pragma omp target map(alloc:src[:])

fails with the surprising
 "for pointer type length expression must be specified"
as "double src[100]" is regarded as "double *src".

Thus, one cannot simply extract the "100" and internally
replace src[:] by src[0:100].

However, by talking about "array function parameter",
I think the error message is at least a bit clearer.

OK?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
OpenMP: Improve map-clause error message for array function parameter (PR96678)

gcc/c/ChangeLog:

	PR c/96678
	* c-typeck.c (handle_omp_array_sections_1): Talk about
	array function parameter in the error message.

gcc/cp/ChangeLog:

	PR c/96678
	* semantics.c (handle_omp_array_sections_1): Talk about
	array function parameter in the error message.

gcc/testsuite/ChangeLog:

	PR c/96678
	* c-c++-common/gomp/map-4.c: New test.
	* c-c++-common/gomp/depend-1.c: Update dg-error.
	* c-c++-common/gomp/map-1.c: Likewise.
	* c-c++-common/gomp/reduction-1.c: Likewise.
	* gcc/testsuite/g++.dg/gomp/depend-1.C: Likewise.
	* gcc/testsuite/g++.dg/gomp/depend-2.C: Likewise.

 gcc/c/c-typeck.c  |  9 +++--
 gcc/cp/semantics.c|  9 +++--
 gcc/testsuite/c-c++-common/gomp/depend-1.c|  2 +-
 gcc/testsuite/c-c++-common/gomp/map-1.c   |  2 +-
 gcc/testsuite/c-c++-common/gomp/map-4.c   | 29 +++
 gcc/testsuite/c-c++-common/gomp/reduction-1.c |  2 +-
 gcc/testsuite/g++.dg/gomp/depend-1.C  |  2 +-
 gcc/testsuite/g++.dg/gomp/depend-2.C  |  2 +-
 8 files changed, 48 insertions(+), 9 deletions(-)

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 0d639b60ea3..e158d236501 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -13298,8 +13298,13 @@ handle_omp_array_sections_1 (tree c, tree t, vec ,
 {
   if (length == NULL_TREE)
 	{
-	  error_at (OMP_CLAUSE_LOCATION (c),
-		"for pointer type length expression must be specified");
+	  if (C_ARRAY_PARAMETER (ret))
+	error_at (OMP_CLAUSE_LOCATION (c),
+		  "for array function parameter length expression "
+		  "must be specified");
+	  else
+	error_at (OMP_CLAUSE_LOCATION (c),
+		  "for pointer type length expression must be specified");
 	  return error_mark_node;
 	}
   if (length != NULL_TREE
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 3877a0e536a..7f861fde7d6 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5083,8 +5083,13 @@ handle_omp_array_sections_1 (tree c, tree t, vec ,
 {
   if (length == NULL_TREE)
 	{
-	  error_at (OMP_CLAUSE_LOCATION (c),
-		"for pointer type length expression must be specified");
+	  if (DECL_ARRAY_PARAMETER_P (ret))
+	error_at (OMP_CLAUSE_LOCATION (c),
+		  "for array function parameter length expression "
+		  "must be specified");
+	  else
+	error_at (OMP_CLAUSE_LOCATION (c),
+		  "for pointer type length expression must be specified");
 	  return error_mark_node;
 	}
   if (length != NULL_TREE
diff --git a/gcc/testsuite/c-c++-common/gomp/depend-1.c b/gcc/testsuite/c-c++-common/gomp/depend-1.c
index 8a5850e45fe..599031f2d90 100644
--- a/gcc/testsuite/c-c++-common/gomp/depend-1.c
+++ b/gcc/testsuite/c-c++-common/gomp/depend-1.c
@@ -40,7 +40,7 @@ foo (int g[3][10], int h[4][8], int i[2][10], int j[][9],
 ;
   #pragma omp task depend(out: f[1:10]) /* { dg-error "high bound \[^\n\r]* above array section size" } */
 ;
-  #pragma omp task depend(in: g[:][2:4]) /* { dg-error "for pointer type length expression must be specified" } */
+  #pragma omp task depend(in: g[:][2:4]) /* { dg-error "for array function parameter length expression must be specified" } */
 ;
   #pragma omp task depend(in: h[2:2][-1:]) /* { dg-error "negative low bound in array section" } */
 ;
diff --git a/gcc/testsuite/c-c++-common/gomp/map-1.c b/gcc/testsuite/c-c++-common/gomp/map-1.c
index 5dad7d6a9aa..508dc8d6b01 100644
--- a/gcc/testsuite/c-c++-common/gomp/map-1.c
+++ b/gcc/testsuite/c-c++-common/gomp/map-1.c
@@ -50,7 +50,7 @@ foo (int g[3][10], int h[4][8], int i[2][10], int j[][9],
 bar (e);
   #pragma omp target map(to: f[1:10]) /* { dg-error "high bound \[^\n\r]* above array section size" } */
 bar (f);
-  #pragma omp target map(from: g[:][0:10]) /* { dg-error "for pointer type length expression must be specified" } */
+  #pragma omp target map(from: g[:][0:10]) /* { dg-error "for array function parameter length expression must be specified" } */
 bar ([0][0]);
   #pragma omp target map(from: h[2:1][-1:]) /* { dg-error "negative low bound in array section" } */
 bar ([0][0]);
diff --git a/gcc/testsuite/c-c++-common/gomp/map-4.c b/gcc/testsuite/c-c++-common/gomp/map-4.c
new file mode 100644
index 

[COMMITTED] libstdc++: Fix debug-mode build failure in

2020-08-25 Thread Patrick Palka via Gcc-patches
libstdc++-v3/ChangeLog:

* include/std/chrono (year_month_weekday::ok): Fix assert.
---
 libstdc++-v3/include/std/chrono | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index bf0d7e7af76..3cc1438a7b6 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -2613,7 +2613,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
days __d = (_M_wdi.weekday()
- chrono::weekday{sys_days{_M_y / _M_m / 1}}
+ days((_M_wdi.index()-1)*7 + 1));
-   __glibcxx_assert(__d >= 1);
+   __glibcxx_assert(__d.count() >= 1);
return __d.count() <= unsigned{(_M_y / _M_m / last).day()};
   }
 
-- 
2.28.0.308.g675a4aaf3b



[committed] libstdc++: Adjust static assertions in futures and promises [LWG 3466]

2020-08-25 Thread Jonathan Wakely via Gcc-patches
Add a static_assertions to check the result type is destructible, as in
the proposed resolution for LWG 3466 (which supersedes 3458).

libstdc++-v3/ChangeLog:

* include/std/future (future, shared_future. promise): Add
is_destructible assertion (LWG 3466). Adjust string-literal for
!is_array and !is_function assertions.
* testsuite/30_threads/future/requirements/lwg3458.cc: Check
types with no accessible destructor. Adjust expected errors.
* testsuite/30_threads/promise/requirements/lwg3466.cc:
Likewise.
* testsuite/30_threads/shared_future/requirements/lwg3458.cc:
Likewise.

Tested powerpc64le-linux. Committed to trunk.

commit 71ed3c0c9a3458998bded8e2443c0a680c2eb8cd
Author: Jonathan Wakely 
Date:   Tue Aug 25 15:52:57 2020

libstdc++: Adjust static assertions in futures and promises [LWG 3466]

Add a static_assertions to check the result type is destructible, as in
the proposed resolution for LWG 3466 (which supersedes 3458).

libstdc++-v3/ChangeLog:

* include/std/future (future, shared_future. promise): Add
is_destructible assertion (LWG 3466). Adjust string-literal for
!is_array and !is_function assertions.
* testsuite/30_threads/future/requirements/lwg3458.cc: Check
types with no accessible destructor. Adjust expected errors.
* testsuite/30_threads/promise/requirements/lwg3466.cc:
Likewise.
* testsuite/30_threads/shared_future/requirements/lwg3458.cc:
Likewise.

diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index e0816c2f5e1..a7466a32e03 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -757,8 +757,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 3458. Is shared_future intended to work with arrays or function types?
-  static_assert(!is_array<_Res>{}, "result type is not an array");
-  static_assert(!is_function<_Res>{}, "result type is not a function");
+  static_assert(!is_array<_Res>{}, "result type must not be an array");
+  static_assert(!is_function<_Res>{}, "result type must not be a 
function");
+  static_assert(is_destructible<_Res>{},
+   "result type must be destructible");
 
   friend class promise<_Res>;
   template friend class packaged_task;
@@ -892,8 +894,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 3458. Is shared_future intended to work with arrays or function types?
-  static_assert(!is_array<_Res>{}, "result type is not an array");
-  static_assert(!is_function<_Res>{}, "result type is not a function");
+  static_assert(!is_array<_Res>{}, "result type must not be an array");
+  static_assert(!is_function<_Res>{}, "result type must not be a 
function");
+  static_assert(is_destructible<_Res>{},
+   "result type must be destructible");
 
   typedef __basic_future<_Res> _Base_type;
 
@@ -1049,8 +1053,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 3466: Specify the requirements for promise/future/[...] consistently
-  static_assert(!is_array<_Res>{}, "result type is not an array");
-  static_assert(!is_function<_Res>{}, "result type is not a function");
+  static_assert(!is_array<_Res>{}, "result type must not be an array");
+  static_assert(!is_function<_Res>{}, "result type must not be a 
function");
+  static_assert(is_destructible<_Res>{},
+   "result type must be destructible");
 
   typedef __future_base::_State_base   _State;
   typedef __future_base::_Result<_Res> _Res_type;
diff --git a/libstdc++-v3/testsuite/30_threads/future/requirements/lwg3458.cc 
b/libstdc++-v3/testsuite/30_threads/future/requirements/lwg3458.cc
index 2bc206c9450..f26e17bd799 100644
--- a/libstdc++-v3/testsuite/30_threads/future/requirements/lwg3458.cc
+++ b/libstdc++-v3/testsuite/30_threads/future/requirements/lwg3458.cc
@@ -26,9 +26,18 @@ std::future good;
 std::future good2;
 
 std::future bad; // { dg-error "here" }
-// { dg-error "result type is not an array" "" { target *-*-* } 0 }
+// { dg-error "result type must not be an array" "" { target *-*-* } 0 }
 // { dg-prune-output "function returning an array" }
 
 std::future bad2; // { dg-error "here" }
-// { dg-error "result type is not a function" "" { target *-*-* } 0 }
+// { dg-error "result type must not be a function" "" { target *-*-* } 0 }
 // { dg-prune-output "function returning a function" }
+
+struct Indestructible { ~Indestructible() = delete; };
+std::future bad3; // { dg-error "here" }
+// { dg-error "result type must be destructible" "" { target *-*-* } 0 }
+// { dg-prune-output {deleted function} }
+
+class PrivateDtor { public: PrivateDtor(); private: ~PrivateDtor(); };
+std::future 

Re: [PATCH] Implement no_stack_protect attribute.

2020-08-25 Thread Nick Desaulniers via Gcc-patches
This would solve a common pattern in the kernel where folks are using
`extern inline` with `gnu_inline` semantics or worse (empty `asm("");`
statements) in certain places where it would be much more preferable
to have this attribute.  Thank you very much Martin for writing it.

> is direct equivalent of Clang's no_stack_protector.
> Unlike Clang, I chose to name it no_stack_protect because we already
> have stack_protect attribute (used with -fstack-protector-explicit).

That's pretty easy for us to work around the differences in the
kernel, but one final plea for the users; it would simplify users'
codebases not to have to shim this for differences between compilers.
If I had a dollar for every time I had to implement something in LLVM
where a different identifier or flag would be more consistent with
another part of the codebase...I'm sure there's many examples of this
on LLVM's side too, but I would prefer to stop the proliferation of
subtle differences like this that harm toolchain portability when
possible and when we can proactively address.
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] hppa: PR middle-end/87256: Improved hppa_rtx_costs avoids synth_mult madness.

2020-08-25 Thread John David Anglin
On 2020-08-21 2:41 p.m., Roger Sayle wrote:
> 2020-08-21  Roger Sayle  
>
> gcc/ChangeLog
>   PR middle-end/87256
>   * config/pa/pa.c (hppa_rtx_costs_shadd_p): New helper function
>   to check for coefficients supported by shNadd and shladd,l.
>   (hppa_rtx_costs):  Rewrite to avoid using estimates based upon
>   FACTOR and enable recursing deeper into RTL expressions.
This change is also fine.

The gcc.target/hppa/shadd-2.c test now fails because there are two additional 
sh1add instructions.
However, the total number of instructions is the same as before.

I confirmed that change fixes PR middle-end/87256.  So, we can close this PR 
when the patch is installed.

Please install on trunk and gcc-10.  gcc-10 is the default compiler on Debian.

Thanks,
Dave

-- 
John David Anglin  dave.ang...@bell.net



Re: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

2020-08-25 Thread John David Anglin
On 2020-08-25 10:05 a.m., Jeff Law wrote:
> On Sat, 2020-08-22 at 09:52 +0100, Roger Sayle wrote:
>> Hi Dave,
>>
>> It's great to hear from you.  It's been a long while.
>>
>> Sorry, doh! yes, there's a mistake in my patch (that I introduced when I
>> renumbered
>> the operands in the shd's define_expand to be the more logical 0, 1, 2, 3,
>> then 4).
>> Sorry for the inconvenience [due to my lack of familiarity with PA-RISC
>> assembly].
>> Hopefully you should get much better mileage out of the attached revision.
> FWIW this version 3-staged in my tester and is currently running the 
> testsuite.
>  It should finish testing later today.
>
> If David doesn't have objections, then I suggest going forward with the patch
> independent of the others you've got queued up.
The patch is fine.  Testing completed this morning on hpux and there are no 
regressions.

Please install on trunk and gcc-10 branch.

Thanks very much,
Dave

-- 
John David Anglin  dave.ang...@bell.net



Re: [PATCH][Hashtable 5/6] Remove H1/H2 template parameters

2020-08-25 Thread Jonathan Wakely via Gcc-patches

On 17/08/20 19:13 +0200, François Dumont via Libstdc++ wrote:

Hi

    Here is the new proposal.

    As we can't remove template parameters I simply restore those that 
I tried to pass differently _H2 and _ExtractKey, so eventually I only 
remove usage of _Hash which I renamed in _Unused. Maybe I can keep the 
doc about it in hashtable.h and just add a remark saying that it is 
now unused.


    For _RangeHash, formerly _H2, and _ExtractKey I just stop 
maintaining any storage. When we need those I always use a value 
initialized instance. I kind of prefer the value initialization syntax 
because you can't confuse it with a function call but let me know if 
it is wrong and I should use _ExtractKey() or _RangeHash(). I also add 
some static assertions about those types regarding their noexcept 
qualifications.


    I also included in this patch the few changes left from [Hashtable 
0/6] which are mostly _M_insert_unique_node and _M_insert_multi_node 
signature cleanup as the key part can be extracted from the inserted 
node.


    Tested under Linux x86_64, ok to commit ?

François

On 06/08/20 11:27 am, Jonathan Wakely wrote:

On 06/08/20 08:35 +0200, François Dumont wrote:

On 17/07/20 1:35 pm, Jonathan Wakely wrote:

I really like the general idea of getting rid of some of the
complexity and not supporting infinite customization. But we can do
that without changing mangled names of the _Hashtable specialiations.



I didn't thought we need to keep abi compatibility for extensions.


These aren't extensions though, they're part of std::unordered_map
etc.

Just because something like _Vector_base is an internal type rather
than something defined in the standard doesn't mean we can just change
its ABI, because that would change the ABI of std::vector. It the same
here.

Changing _Hashtable affects all users of std::unordered_map etc.







diff --git a/libstdc++-v3/include/bits/hashtable.h 
b/libstdc++-v3/include/bits/hashtable.h
index 7b772a475e3..1ba32a3c7e2 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -311,35 +303,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
"Cache the hash code or qualify your functors involved"
" in hash code and bucket index computation with noexcept");

-  // When hash codes are cached local iterator inherits from H2 functor
-  // which must then be default constructible.
-  static_assert(__if_hash_cached>::value,
+  // To get bucket index we need _RangeHash not to throw.
+  static_assert(is_nothrow_default_constructible<_RangeHash>::value,
"Functor used to map hash code to bucket index"
-   " must be default constructible");
+   " is nothrow default constructible");


Please phrase this as "must be nothrow default constructible".


+  static_assert(noexcept(
+   std::declval()((std::size_t)0, (std::size_t)0)),
+   "Functor used to map hash code to bucket index is noexcept");


Same here, "must be noexcept".

Otherwise this looks great, thanks. Please push.




Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-25 Thread Jeff Law via Gcc-patches
On Tue, 2020-08-25 at 02:16 -0300, Alexandre Oliva wrote:
> On Aug 24, 2020, Richard Biener  wrote:
> 
> > since the option is quite elaborate on what (sub-)set of regs is
> > supposed to be cleared I'm not sure an implementation not involving
> > any target hook is possible?
> 
> I don't think this follows.  Machine-independent code has a pretty good
> notion of what registers are call-saved or call-clobbered, which ones
> could be changed in this regard for function-specific calling
> conventions, which ones may be used by a function to hold its return
> value, which ones are used within a function...
> 
> It *should* be possible to introduce this in machine-independent code,
> emitting insns to set registers to zero and regarding them as holding
> values to be returned from the function.  Machine-specific code could
> use more efficient insns to get the same result, but I can't see good
> reason to not have a generic fallback implementation with at least a
> best-effort attempt to offer the desired feature.
I think part of the problem here is you have to worry about stubs which can
change caller-saved registers.  Return path stubs aren't particularly common, 
but
they do exist -- 32 bit hpux for example :(

Jeff



Re: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

2020-08-25 Thread Jeff Law via Gcc-patches
On Sat, 2020-08-22 at 09:52 +0100, Roger Sayle wrote:
> Hi Dave,
> 
> It's great to hear from you.  It's been a long while.
> 
> Sorry, doh! yes, there's a mistake in my patch (that I introduced when I
> renumbered
> the operands in the shd's define_expand to be the more logical 0, 1, 2, 3,
> then 4).
> Sorry for the inconvenience [due to my lack of familiarity with PA-RISC
> assembly].
> Hopefully you should get much better mileage out of the attached revision.
FWIW this version 3-staged in my tester and is currently running the testsuite.
 It should finish testing later today.

If David doesn't have objections, then I suggest going forward with the patch
independent of the others you've got queued up.

jeff
> 



Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-25 Thread Qing Zhao via Gcc-patches



> On Aug 25, 2020, at 1:41 AM, Uros Bizjak  wrote:
> 
>>> 
> (The other side of the coin is how much this helps prevent exploitation;
> numbers on that would be good to see, too.)
 
 This can be well showed from the paper:
 
 "Clean the Scratch Registers: A Way to Mitigate Return-Oriented 
 Programming Attacks"
 
 https://urldefense.com/v3/__https://ieeexplore.ieee.org/document/8445132__;!!GqivPVa7Brio!JbdLvo54xB3ORTeZqpy_PwZsL9drNLaKjbg14bTKMOwxt8LWnjZ8gJWlqtlrFKPh$
  
 
 
 Please take a look at this paper.
>>> 
>>> As I told you before, that isn't open information, I cannot reply to
>>> any of that.
>> 
>> A little confused here, what’s you mean by “open information”? Is the 
>> information in a published paper not open information?
> 
> No, because it is behind a paywall.

Still don’t understand here:  this paper has been published in the proceeding 
of “ 2018 IEEE 29th International Conference on Application-specific Systems, 
Architectures and Processors (ASAP)”.
If you want to read the complete version online, you need to pay for it.

However, it’s still a published paper, and the information inside it should be 
“open information”. 

So, what’s the definition of “open information” you have?

I downloaded a PDF copy of this paper through my company’s paid account.  But I 
am not sure whether it’s legal for me to attach it to this mailing list?

Qing


> 
> Uros.



Re: [PATCH] libstdc++: Add more C++20 additions to

2020-08-25 Thread Jonathan Wakely via Gcc-patches

On 24/08/20 23:01 -0400, Patrick Palka via Libstdc++ wrote:

This patch adds the C++20 calendar types and their methods as defined in
[time.cal] (modulo the parsing/printing support).  This patch also
implements [time.hms] and [time.12], and a few more bits of
[time.clock].  The remaining C++20 additions to  from P0355 and
P1466 depend on [time.zone] and , so they will come later, as
will more optimized versions of some of the calendar algorithms.

The non-member operator overloads for the calendar types are defined as
namespace-scope functions in the standard, but here we instead define
each such operator overload as a hidden friend of the appropriate class.
This simplifies the implementation somewhat and lets us reap the
benefits of hidden friends for these routines.

The bulk of this work is based on a patch from Ed Smith-Rowland, which can
be found at the Git branch users/redi/heads/calendar.

Regression tested on x86_64-pc-linux-gnu, and also tested against the
testsuite for date.h of Howard Hinnant's 'date' library, i.e. the tests
at https://github.com/HowardHinnant/date/tree/master/test/date_test
(though some minor modifications to the tests are first needed to
account for the differences between the library API and the standard).


Looks good, please push (with the testsuite date fix mentioned on
IRC).

Thanks!




[OG10] Backporting of some patches

2020-08-25 Thread Tobias Burnus

OG10 = devel/omp/gcc-10

* Merged GCC 10 into branch

commit 68294a5e478a3215763530ae71dc0da4a0a9aa20
OpenMP/Fortran: Reject allocatable components in map clause
(cherry picked from commit ac70b20b1007da0d3d02a9cec3c0715145c4b593)
commit 575f1eb0711550adce9c61c04ff5597d71a1cc5c
OpenMP: Permit in Fortran omp target data without map
(cherry picked from commit d6cd139c1728fd37a2b6b2251029458cc2b7127e)

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


[PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-08-25 Thread Kewen.Lin via Gcc-patches
Hi Bin,

>>
>> For one particular case like:
>>
>> for (i = 0; i < SIZE; i++)
>>   y[i] = a * x[i] + z[i];
>>
>> we will mark reg_offset_p for IV candidates on x as below:
>>- (unsigned long) (x_18(D) + 8)// only mark this before.
>>- x_18(D) + 8
>>- (unsigned long) (x_18(D) + 24)
>>- (unsigned long) ((vector(2) double *) (x_18(D) + 8) + 
>> 18446744073709551600)
>>...
>>
>> Do you mind to have a review again?  Thanks in advance!
> I trust you with the change.

Thanks again!  Sorry for the late since it took some time to investigate
the exposed issues.

>>
>> Bootstrapped/regtested on powerpc64le-linux-gnu P8 and P9.
>>
>> SPEC2017 P9 performance run has no remarkable degradations/improvements.
> Is this run with unroll-loops?

Yes, the options what I used were: 

   -Ofast -mcpu=power9 -fpeel-loops -mrecip -funroll-loops

I also re-tested the newly attached patch, nothing changes for SPEC2017 data.

> Could you exercise the code with unroll-loops enabled when
> bootstrap/regtest please?  It doesn't matter if cases fail with
> unroll-loops, just making sure there is no fallout.  Otherwise it's
> fine with me.

Great idea!  With explicitly specified -funroll-loops, it's bootstrapped
but the regression testing did show one failure (the only one):

  PASS->FAIL: gcc.dg/sms-4.c scan-rtl-dump-times sms "SMS succeeded" 1

It exposes two issues:

1) Currently address_cost hook on rs6000 always return zero, but at least
from Power7, pre_inc/pre_dec kind instructions are cracked, it means we
have to take the address update into account (scalar normal operation).
Since IVOPTs reduces the cost_step for ainc candidates, it makes us prefer
ainc candidates.  In this case, the cand/group cost is -4 (minus cost_step),
with scaling up, the off becomes much.  With one simple hack on for pre_inc/
pre_dec in rs6000 address_cost, the case passed.  It should be handled in
one separated issue.

2) This case makes me think we should exclude ainc candidates in function
mark_reg_offset_candidates.  The justification is that: ainc candidate
handles step update itself and when we calculate the cost for it against
its ainc_use, the cost_step has been reduced. When unrolling happens,
the ainc computations are replicated and it doesn't save step updates 
like normal reg_offset_p candidates.

I've updated the patch to punt ainc_use candidates as below:

> + /* Skip AINC candidate since it contains address update itself,
> +the replicated AINC computations when unrolling still have
> +updates, unlike reg_offset_p candidates can save step updates
> +when unrolling.  */
> + if (cand->ainc_use)
> +   continue;
> +

For this new attached patch, it's bootstrapped and regress-tested without
explicit unrolling, while the only one failure has been identified as
rs6000 specific address_cost issue in bootstrapping and regression testing
with explicit unrolling.

By the way, with above simple hack of address_cost, I also did one
bootstrapping and regression testing with explicit unrolling, the above
sms-4.c did pass as I expected but had two failures instead:

  PASS->FAIL: gcc.dg/sms-compare-debug-1.c (test for excess errors)
  PASS->FAIL: gcc.dg/tree-ssa/ivopts-lt.c scan-tree-dump-times ivopts "PHI" 2

By further investigation, the 2nd one is expected due to the adddress_cost hook
hack, while the 1st one one exposed -fcompare-debug issue in sms.  The RTL
sequence starts to off from sms, just some NOTE_INSN_DELETED positions change.
I believe it's just exposed by this combination unluckily/luckily ;-) I will
send a patch separately for it once I got time to look into it, but it should
be unrelated to this patch series for sure.

In a word, bootstrapping/regress-testing with unroll-loops enabled shows this
patch looks fine.

BR,
Kewen
-
gcc/ChangeLog:

* tree-ssa-loop-ivopts.c (struct iv_cand): New field reg_offset_p.
(struct ivopts_data): New field consider_reg_offset_for_unroll_p.
(mark_reg_offset_candidates): New function.
(add_candidate_1): Set reg_offset_p to false for new candidate.
(set_group_iv_cost): Scale up group cost with estimate_unroll_factor if
consider_reg_offset_for_unroll_p.
(determine_iv_cost): Increase step cost with estimate_unroll_factor if
consider_reg_offset_for_unroll_p.
(tree_ssa_iv_optimize_loop): Call estimate_unroll_factor, update
consider_reg_offset_for_unroll_p.
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 1d2697ae1ba..4b58b620ddd 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -473,6 +473,9 @@ struct iv_cand
   struct iv *orig_iv;  /* The original iv if this cand is added from biv with
   smaller type.  */
   bool doloop_p;   /* Whether this is a doloop candidate.  */
+  bool reg_offset_p;   /* Whether this is available for an address type 

Re: [PATCH 2/5] Make sure that static data member constexpr isn't optimized away in test.

2020-08-25 Thread Jakub Jelinek via Gcc-patches
On Mon, Aug 24, 2020 at 05:38:28PM -0400, Jason Merrill via Gcc-patches wrote:
> > Shouldn't we instead do something like (untested) following patch?
> > I mean, for DWARF < 5 the static data members were using DW_TAG_member,
> > which has been always marked by the function, so IMHO we should also always
> > mark the DW_TAG_variables at the class scope that replaced those.
> 
> The earlier behavior seems like an accident, that happened because we always
> need to emit information about non-static data members.  I don't think we
> should take it as guidance.
> 
> In this case one reason we don't emit debug info is because (before C++17)
> there's no definition of 'b'.  After C++17 the in-class declaration of 'b'
> is a definition, but we don't have to give it a symbol, so there's still
> nothing for the debug info to describe.
> 
> This issue doesn't seem specific to class members; it also affects
> namespace-scope C++17 inline variables:
> 
> inline const int var = 42;
> int main() { return var; }
> 
> Compiling this testcase with -g doesn't emit any debug info for 'var' even
> though it's used.
> 
> Should we assume that if a variable with DW_AT_const_value is TREE_USED, we
> need to write out debug info for it?

I guess it is a question of how exactly the (default on)
-feliminate-unused-debug-symbols should behave with different kind of
entities.

One thing are the non-inline static data members without const/constexpr or
without initializer in the class.  Those need a definition if they are ever
used, so it is probably fine to only emit them in the class in the TU where
they are defined.  But note that e.g. with -fdebug-types-section we still
force them to be output in class and do no pruning (and the pruning actually
makes dwz less efficient unless dwz is tought to not only merge the DIEs
with the same children and attributes, but also reconstruct more complete
DIEs out of several less complete ones for the same class).

Another case is non-inline static const data member with initializer,
do we track non-odr uses e.g. during constant evaluation and if so, should
that result in the static data member appearing?  Because if the static
const data member with initializer is never odr used, it doesn't have to be
ever defined and so it might never appear in the debug info.

Another case are inline vars, shall we treat as being used just that they
were used in some constant expression, or do only odr uses count?

And, shall we make the DWARF-3/4 DW_TAG_member in class behave the same as
DW_TAG_variable in DWARF-5?

Jakub



[pushed] aarch64: Update feature macro name

2020-08-25 Thread Richard Sandiford
GCC used the name __ARM_FEATURE_SVE_VECTOR_OPERATIONS, but in the
final spec it was renamed to__ARM_FEATURE_SVE_VECTOR_OPERATORS.

Tested on aarch64-linux-gnu, aarch64-elf and aarch64_be-elf.
Pushed to trunk so far, will backport to GCC 10 soon.

Richard


gcc/
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Rename
__ARM_FEATURE_SVE_VECTOR_OPERATIONS to
__ARM_FEATURE_SVE_VECTOR_OPERATORS.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/attributes_1.c: Rename
__ARM_FEATURE_SVE_VECTOR_OPERATIONS to
__ARM_FEATURE_SVE_VECTOR_OPERATORS.
---
 gcc/config/aarch64/aarch64-c.c| 2 +-
 .../gcc.target/aarch64/sve/acle/general/attributes_1.c| 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 1a1f4ecef04..fd08be47570 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -149,7 +149,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
bits = 0;
   builtin_define_with_int_value ("__ARM_FEATURE_SVE_BITS", bits);
 }
-  aarch64_def_or_undef (TARGET_SVE, "__ARM_FEATURE_SVE_VECTOR_OPERATIONS",
+  aarch64_def_or_undef (TARGET_SVE, "__ARM_FEATURE_SVE_VECTOR_OPERATORS",
pfile);
   aarch64_def_or_undef (TARGET_SVE_I8MM,
"__ARM_FEATURE_SVE_MATMUL_INT8", pfile);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/attributes_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/attributes_1.c
index 6cd4f99911e..17acfc32e78 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/attributes_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/attributes_1.c
@@ -6,8 +6,8 @@
 #error "__ARM_FEATURE_SVE_BITS is not defined but should be"
 #endif
 
-#if __ARM_FEATURE_SVE_VECTOR_OPERATIONS != 1
-#error "__ARM_FEATURE_SVE_VECTOR_OPERATIONS should be equal to 1"
+#if __ARM_FEATURE_SVE_VECTOR_OPERATORS != 1
+#error "__ARM_FEATURE_SVE_VECTOR_OPERATORS should be equal to 1"
 #endif
 
 #ifndef __cplusplus


[pushed] aarch64: Tweaks to the handling of fixed-length SVE types

2020-08-25 Thread Richard Sandiford
This patch is really four things rolled into one, since separating
them seemed artificial:

- Update the mangling of the fixed-length SVE ACLE types to match
  the upcoming spec.  The idea is to mangle:

VLAT __attribute__((arm_sve_vector_bits(N)))

  as an instance __SVE_VLS of the template:

__SVE_VLS

- Give the fixed-length types their own TYPE_DECL.  This is needed
  to make the above mangling fix work, but should also be a minor
  QoI improvement for error reporting.  Unfortunately, the names are
  quite verbose, e.g.:

svint8_t __attribute__((arm_sve_vector_bits(512)))

  but anything shorter would be ad-hoc syntax and so might be more
  confusing.

- Improve the error message reported when arm_sve_vector_bits is
  applied to tuples, such as:

svint32x2_t __attribute__((arm_sve_vector_bits(N)))

  Previously we would complain that the type isn't an SVE type;
  now we complain that it isn't a vector type.

- Don't allow arm_sve_vector_bits(N) to be applied to existing
  fixed-length SVE types.

Tested on aarch64-linux-gnu, aarch64-elf and aarch64_be-elf.
Pushed to trunk so far, will backport to GCC 10 soon.
(Will also add a release note.)

Richard


gcc/
* config/aarch64/aarch64-sve-builtins.cc (add_sve_type_attribute):
Take the ACLE name of the type as a parameter and add it as fourth
argument to the "SVE type" attribute.
(register_builtin_types): Update call accordingly.
(register_tuple_type): Likewise.  Construct the name of the type
earlier in order to do this.
(get_arm_sve_vector_bits_attributes): New function.
(handle_arm_sve_vector_bits_attribute): Report a more sensible
error message if the attribute is applied to an SVE tuple type.
Don't allow the attribute to be applied to an existing fixed-length
SVE type.  Mangle the new type as __SVE_VLS.
Add a dummy TYPE_DECL to the new type.

gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/attributes_2.C: New test.
* g++.target/aarch64/sve/acle/general-c++/mangle_6.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_7.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_8.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_9.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_10.C: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_7.c: Check the
error messages reported when arm_sve_vector_bits is applied to
SVE tuple types or to existing fixed-length SVE types.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc| 130 +++---
 .../sve/acle/general-c++/attributes_2.C   |  66 +
 .../aarch64/sve/acle/general-c++/mangle_10.C  |  19 +++
 .../aarch64/sve/acle/general-c++/mangle_6.C   |  36 +
 .../aarch64/sve/acle/general-c++/mangle_7.C   |  19 +++
 .../aarch64/sve/acle/general-c++/mangle_8.C   |  19 +++
 .../aarch64/sve/acle/general-c++/mangle_9.C   |  19 +++
 .../aarch64/sve/acle/general/attributes_7.c   |   4 +
 8 files changed, 295 insertions(+), 17 deletions(-)
 create mode 100644 
gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/attributes_2.C
 create mode 100644 
gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_10.C
 create mode 100644 
gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_6.C
 create mode 100644 
gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_7.C
 create mode 100644 
gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_8.C
 create mode 100644 
gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_9.C

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 3150659bee9..e753966efba 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -564,15 +564,16 @@ static bool reported_missing_registers_p;
 
 /* Record that TYPE is an ABI-defined SVE type that contains NUM_ZR SVE vectors
and NUM_PR SVE predicates.  MANGLED_NAME, if nonnull, is the ABI-defined
-   mangling of the type.  */
+   mangling of the type.  ACLE_NAME is the  name of the type.  */
 static void
 add_sve_type_attribute (tree type, unsigned int num_zr, unsigned int num_pr,
-   const char *mangled_name)
+   const char *mangled_name, const char *acle_name)
 {
   tree mangled_name_tree
 = (mangled_name ? get_identifier (mangled_name) : NULL_TREE);
 
-  tree value = tree_cons (NULL_TREE, mangled_name_tree, NULL_TREE);
+  tree value = tree_cons (NULL_TREE, get_identifier (acle_name), NULL_TREE);
+  value = tree_cons (NULL_TREE, mangled_name_tree, value);
   value = tree_cons (NULL_TREE, size_int (num_pr), value);
   value = tree_cons (NULL_TREE, size_int (num_zr), value);
   TYPE_ATTRIBUTES (type) = tree_cons (get_identifier ("SVE type"), value,
@@ -3363,7 +3364,8 @@ register_builtin_types ()
   TYPE_ARTIFICIAL 

[pushed] aarch64: Update the mangling of single SVE vectors and predicates

2020-08-25 Thread Richard Sandiford
GCC was implementing an old mangling scheme for single SVE
vectors and predicates (based on the Advanced SIMD one).
The final definition instead put them in the vendor built-in
namespace via the "u" prefix.

Tested on aarch64-linux-gnu, aarch64-elf and aarch64_be-elf.
Pushed to trunk so far, will backport to GCC 10 soon.
(Will also add a release note.)

Richard


gcc/
* config/aarch64/aarch64-sve-builtins.cc (DEF_SVE_TYPE): Add a
leading "u" to each mangled name.

gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/mangle_1.C: Add a leading
"u" to the mangling of each SVE vector and predicate type.
* g++.target/aarch64/sve/acle/general-c++/mangle_2.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_3.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_5.C: Likewise.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc|  2 +-
 .../aarch64/sve/acle/general-c++/mangle_1.C   | 26 +--
 .../aarch64/sve/acle/general-c++/mangle_2.C   | 26 +--
 .../aarch64/sve/acle/general-c++/mangle_3.C   |  4 +--
 .../aarch64/sve/acle/general-c++/mangle_5.C   |  4 +--
 5 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index c49fcebcd43..3150659bee9 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -101,7 +101,7 @@ struct registered_function_hasher : nofree_ptr_hash 

 /* Information about each single-predicate or single-vector type.  */
 static CONSTEXPR const vector_type_info vector_types[] = {
 #define DEF_SVE_TYPE(ACLE_NAME, NCHARS, ABI_NAME, SCALAR_TYPE) \
-  { #ACLE_NAME, #ABI_NAME, #NCHARS #ABI_NAME },
+  { #ACLE_NAME, #ABI_NAME, "u" #NCHARS #ABI_NAME },
 #include "aarch64-sve-builtins.def"
 };
 
diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_1.C 
b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_1.C
index 1a171248585..36dab3c9b71 100644
--- a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_1.C
+++ b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_1.C
@@ -16,16 +16,16 @@ void f11(svfloat32_t) {}
 void f12(svfloat64_t) {}
 void f13(svbfloat16_t) {}
 
-/* { dg-final { scan-assembler "_Z2f110__SVBool_t:" } } */
-/* { dg-final { scan-assembler "_Z2f210__SVInt8_t:" } } */
-/* { dg-final { scan-assembler "_Z2f311__SVInt16_t:" } } */
-/* { dg-final { scan-assembler "_Z2f411__SVInt32_t:" } } */
-/* { dg-final { scan-assembler "_Z2f511__SVInt64_t:" } } */
-/* { dg-final { scan-assembler "_Z2f611__SVUint8_t:" } } */
-/* { dg-final { scan-assembler "_Z2f712__SVUint16_t:" } } */
-/* { dg-final { scan-assembler "_Z2f812__SVUint32_t:" } } */
-/* { dg-final { scan-assembler "_Z2f912__SVUint64_t:" } } */
-/* { dg-final { scan-assembler "_Z3f1013__SVFloat16_t:" } } */
-/* { dg-final { scan-assembler "_Z3f1113__SVFloat32_t:" } } */
-/* { dg-final { scan-assembler "_Z3f1213__SVFloat64_t:" } } */
-/* { dg-final { scan-assembler "_Z3f1314__SVBfloat16_t:" } } */
+/* { dg-final { scan-assembler "_Z2f1u10__SVBool_t:" } } */
+/* { dg-final { scan-assembler "_Z2f2u10__SVInt8_t:" } } */
+/* { dg-final { scan-assembler "_Z2f3u11__SVInt16_t:" } } */
+/* { dg-final { scan-assembler "_Z2f4u11__SVInt32_t:" } } */
+/* { dg-final { scan-assembler "_Z2f5u11__SVInt64_t:" } } */
+/* { dg-final { scan-assembler "_Z2f6u11__SVUint8_t:" } } */
+/* { dg-final { scan-assembler "_Z2f7u12__SVUint16_t:" } } */
+/* { dg-final { scan-assembler "_Z2f8u12__SVUint32_t:" } } */
+/* { dg-final { scan-assembler "_Z2f9u12__SVUint64_t:" } } */
+/* { dg-final { scan-assembler "_Z3f10u13__SVFloat16_t:" } } */
+/* { dg-final { scan-assembler "_Z3f11u13__SVFloat32_t:" } } */
+/* { dg-final { scan-assembler "_Z3f12u13__SVFloat64_t:" } } */
+/* { dg-final { scan-assembler "_Z3f13u14__SVBfloat16_t:" } } */
diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_2.C 
b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_2.C
index 6792b8a3133..ad4aaee291f 100644
--- a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_2.C
+++ b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_2.C
@@ -14,16 +14,16 @@ void f11(__SVFloat32_t) {}
 void f12(__SVFloat64_t) {}
 void f13(__SVBfloat16_t) {}
 
-/* { dg-final { scan-assembler "_Z2f110__SVBool_t:" } } */
-/* { dg-final { scan-assembler "_Z2f210__SVInt8_t:" } } */
-/* { dg-final { scan-assembler "_Z2f311__SVInt16_t:" } } */
-/* { dg-final { scan-assembler "_Z2f411__SVInt32_t:" } } */
-/* { dg-final { scan-assembler "_Z2f511__SVInt64_t:" } } */
-/* { dg-final { scan-assembler "_Z2f611__SVUint8_t:" } } */
-/* { dg-final { scan-assembler "_Z2f712__SVUint16_t:" } } */
-/* { dg-final { scan-assembler "_Z2f812__SVUint32_t:" } } */
-/* { dg-final { scan-assembler "_Z2f912__SVUint64_t:" } } */
-/* { dg-final { scan-assembler "_Z3f1013__SVFloat16_t:" } } */
-/* { dg-final { 

Re: [PATCH] x86: Use target("general-regs-only,baseline-isas-only") in

2020-08-25 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 25, 2020 at 2:13 PM H.J. Lu  wrote:
>
> On Mon, Aug 24, 2020 at 12:40 PM H.J. Lu  wrote:
> >
> > On Mon, Aug 24, 2020 at 12:25 PM Uros Bizjak  wrote:
> > >
> > > On Mon, Aug 24, 2020 at 6:17 PM H.J. Lu  wrote:
> > > >
> > > > On Mon, Aug 24, 2020 at 7:55 AM Uros Bizjak  wrote:
> > > > >
> > > > > On Mon, Aug 24, 2020 at 3:23 PM H.J. Lu  wrote:
> > > > >
> > > > > > > Speaking of pragmas, these should be added outside cpuid.h, like:
> > > > > > >
> > > > > > > #pragma GCC push_options
> > > > > > > #pragma GCC target("general-regs-only")
> > > > > > >
> > > > > > > #include 
> > > > > > >
> > > > > > > void cpuid_check ()
> > > > > > > ...
> > > > > > >
> > > > > > > #pragma GCC pop_options
> > > > > > >
> > > > > > > >footnote
> > > > > > >
> > > > > > > Nowadays, -march=native is mostly used outside generic target
> > > > > > > compilations, so for relevant avx512 targets, we still generate 
> > > > > > > spills
> > > > > > > to mask regs. In future, we can review the setting of the tuning 
> > > > > > > flag
> > > > > > > for a generic target in the same way as with SSE2 inter-reg moves.
> > > > > > >
> > > > > >
> > > > > > Florian raised an issue that we need to limit  to the 
> > > > > > basic ISAs.
> > > > > >  should be handled similarly to other intrinsic header 
> > > > > > files.
> > > > > > That is  should use
> > > > > >
> > > > > > #pragma GCC push_options
> > > > > > #ifdef __x86_64__
> > > > > > #pragma GCC target("arch=x86-64")
> > > > > > #else
> > > > > > #pragma GCC target("arch=i386")
> > > > > > ...
> > > > > > #pragma GCC pop_options
> > > > > >
> > > > > > Here is a patch.  OK for master?
> > > > >
> > > > > -ENOPATCH
> > > > >
> > > > > However, how will this affect inlining? Every single function in
> > > > > cpuid.h is defined as static __inline, and due to target flags
> > > > > mismatch, it won't be inlined anymore. These inline functions are used
> > > > > in some bit testing functions, and to keep them inlined, these should
> > > > > also use the same options to avoid non-basic ISAs. This is the reason
> > > > > cpuid.h should be #included after pragma, together with bit testing
> > > > > functions, as shown above.
> > > > >
> > > >
> > > > How about target("baseline-isas-only")? All CPUID functions are
> > > > inlined.
> > >
> > > No, I don't think this is a good idea. Now consider the situation that
> > > caller functions are compiled with e.g. -mgeneral-regs-only. Due to
> > > #pragmas, CPUID functions are compiled with a superset ISAs, so they
> > > again won't be inlined. ISAs of caller functions and CPUID should
> > > match, the best way is to include  after the #pragma. And
> > > IMO, general-regs-only target #pragma is an excellent setting for
> > > both: cpuid.h and caller bit testing functions.
> > >
> > > So, if we care about inlining, decorating cpuid.h with target pragmas
> > > is a bad idea.
> >
> > This can be done with #pragma in .
> >
>
> We just need to update ix86_can_inline_p to allow inline functions
> with baseline-isas-only and general-regs-only attributes if caller
> supports the same set of ISAs.
>
> Here is the updated patch.

I'm not against it, but I don't plan to approve the attached patch.

Uros.


Re: [PATCH v3] c++: Implement P1009: Array size deduction in new-expressions.

2020-08-25 Thread Marek Polacek via Gcc-patches
On Mon, Aug 24, 2020 at 10:49:38PM -0400, Jason Merrill wrote:
> On 8/24/20 5:44 PM, Marek Polacek wrote:
> > --- a/gcc/tree.c
> > +++ b/gcc/tree.c
> > @@ -2123,6 +2123,23 @@ build_constructor_from_list (tree type, tree vals)
> > return build_constructor (type, v);
> >   }
> > +/* Return a new CONSTRUCTOR node whose type is TYPE and whose values
> > +   are in a vector pointed to by VALS.  Note that the TREE_PURPOSE
> > +   fields in the constructor remain null.  */
> > +
> > +tree
> > +build_constructor_from_vec (tree type, const vec *vals)
> > +{
> > +  tree ctor = build_constructor (type, NULL);
> > +
> > +  unsigned int ix;
> > +  tree t;
> > +  FOR_EACH_VEC_SAFE_ELT (vals, ix, t)
> > +CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (ctor), NULL_TREE, t);
> 
> Either you need to build up the constructor_elt vec first and then pass it
> to build_constructor, or call recompute_constructor_flags after you add all
> the elements.

Ug.  I wanted to get rid of the vec temporary, moved
build_constructor, and forgot about recompute.  Fixed.

> And perhaps
> 
>   for (tree t : *vals)

Nice!  So nice to get rid of the ix temp, too.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch implements C++20 P1009, allowing code like

  new double[]{1,2,3}; // array bound will be deduced

Since this proposal makes the initialization rules more consistent, it is
applied to all previous versions of C++ (thus, effectively, all the way back
to C++11).

My patch is based on Jason's patch that handled the basic case.  I've
extended it to work with ()-init and also the string literal case.
Further testing revealed that to handle stuff like

  new int[]{t...};

in a template, we have to consider such a NEW_EXPR type-dependent.
Obviously, we first have to expand the pack to be able to deduce the
number of elements in the array.

Curiously, while implementing this proposal, I noticed that we fail
to accept

  new char[4]{"abc"};

so I've assigned 77841 to self.  I think the fix will depend on the
build_new_1 hunk in this patch.

The new tree.c function build_constructor_from_vec helps us morph
a vector into a CONSTRUCTOR more efficiently.

gcc/cp/ChangeLog:

PR c++/93529
* call.c (build_new_method_call_1): Use build_constructor_from_vec
instead of build_tree_list_vec + build_constructor_from_list.
* init.c (build_new_1): Handle new char[]{"foo"}.  Use
build_constructor_from_vec instead of build_tree_list_vec +
build_constructor_from_list.
(build_new): Deduce the array size in new-expression if not
present.  Handle ()-init.  Handle initializing an array from
a string literal.
* parser.c (cp_parser_new_type_id): Leave [] alone.
(cp_parser_direct_new_declarator): Allow [].
* pt.c (type_dependent_expression_p): In a NEW_EXPR, consider
array types whose dimension has to be deduced type-dependent.

gcc/ChangeLog:

PR c++/93529
* tree.c (build_constructor_from_vec): New.
* tree.h (build_constructor_from_vec): Declare.

gcc/testsuite/ChangeLog:

PR c++/93529
* g++.dg/cpp0x/sfinae4.C: Adjust expected result after P1009.
* g++.dg/cpp2a/new-array1.C: New test.
* g++.dg/cpp2a/new-array2.C: New test.
* g++.dg/cpp2a/new-array3.C: New test.
* g++.dg/cpp2a/new-array4.C: New test.

Co-authored-by: Jason Merrill 
---
 gcc/cp/call.c   |  4 +-
 gcc/cp/init.c   | 55 +--
 gcc/cp/parser.c | 13 +++--
 gcc/cp/pt.c |  4 ++
 gcc/testsuite/g++.dg/cpp0x/sfinae4.C|  8 ++-
 gcc/testsuite/g++.dg/cpp2a/new-array1.C | 70 +
 gcc/testsuite/g++.dg/cpp2a/new-array2.C | 22 
 gcc/testsuite/g++.dg/cpp2a/new-array3.C | 17 ++
 gcc/testsuite/g++.dg/cpp2a/new-array4.C | 10 
 gcc/tree.c  | 15 ++
 gcc/tree.h  |  1 +
 11 files changed, 209 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/new-array1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/new-array2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/new-array3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/new-array4.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 4726e57a30d..61bbb38bd2b 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -10297,8 +10297,8 @@ build_new_method_call_1 (tree instance, tree fns, 
vec **args,
  && !vec_safe_is_empty (user_args))
{
  /* Create a CONSTRUCTOR from ARGS, e.g. {1, 2} from <1, 2>.  */
- tree list = build_tree_list_vec (user_args);
- tree ctor = build_constructor_from_list (init_list_type_node, list);
+ tree ctor = build_constructor_from_vec (init_list_type_node,
+ user_args);
  CONSTRUCTOR_IS_DIRECT_INIT (ctor) = true;

[PATCH] x86: Use target("general-regs-only, baseline-isas-only") in

2020-08-25 Thread H.J. Lu via Gcc-patches
On Mon, Aug 24, 2020 at 12:40 PM H.J. Lu  wrote:
>
> On Mon, Aug 24, 2020 at 12:25 PM Uros Bizjak  wrote:
> >
> > On Mon, Aug 24, 2020 at 6:17 PM H.J. Lu  wrote:
> > >
> > > On Mon, Aug 24, 2020 at 7:55 AM Uros Bizjak  wrote:
> > > >
> > > > On Mon, Aug 24, 2020 at 3:23 PM H.J. Lu  wrote:
> > > >
> > > > > > Speaking of pragmas, these should be added outside cpuid.h, like:
> > > > > >
> > > > > > #pragma GCC push_options
> > > > > > #pragma GCC target("general-regs-only")
> > > > > >
> > > > > > #include 
> > > > > >
> > > > > > void cpuid_check ()
> > > > > > ...
> > > > > >
> > > > > > #pragma GCC pop_options
> > > > > >
> > > > > > >footnote
> > > > > >
> > > > > > Nowadays, -march=native is mostly used outside generic target
> > > > > > compilations, so for relevant avx512 targets, we still generate 
> > > > > > spills
> > > > > > to mask regs. In future, we can review the setting of the tuning 
> > > > > > flag
> > > > > > for a generic target in the same way as with SSE2 inter-reg moves.
> > > > > >
> > > > >
> > > > > Florian raised an issue that we need to limit  to the basic 
> > > > > ISAs.
> > > > >  should be handled similarly to other intrinsic header files.
> > > > > That is  should use
> > > > >
> > > > > #pragma GCC push_options
> > > > > #ifdef __x86_64__
> > > > > #pragma GCC target("arch=x86-64")
> > > > > #else
> > > > > #pragma GCC target("arch=i386")
> > > > > ...
> > > > > #pragma GCC pop_options
> > > > >
> > > > > Here is a patch.  OK for master?
> > > >
> > > > -ENOPATCH
> > > >
> > > > However, how will this affect inlining? Every single function in
> > > > cpuid.h is defined as static __inline, and due to target flags
> > > > mismatch, it won't be inlined anymore. These inline functions are used
> > > > in some bit testing functions, and to keep them inlined, these should
> > > > also use the same options to avoid non-basic ISAs. This is the reason
> > > > cpuid.h should be #included after pragma, together with bit testing
> > > > functions, as shown above.
> > > >
> > >
> > > How about target("baseline-isas-only")? All CPUID functions are
> > > inlined.
> >
> > No, I don't think this is a good idea. Now consider the situation that
> > caller functions are compiled with e.g. -mgeneral-regs-only. Due to
> > #pragmas, CPUID functions are compiled with a superset ISAs, so they
> > again won't be inlined. ISAs of caller functions and CPUID should
> > match, the best way is to include  after the #pragma. And
> > IMO, general-regs-only target #pragma is an excellent setting for
> > both: cpuid.h and caller bit testing functions.
> >
> > So, if we care about inlining, decorating cpuid.h with target pragmas
> > is a bad idea.
>
> This can be done with #pragma in .
>

We just need to update ix86_can_inline_p to allow inline functions
with baseline-isas-only and general-regs-only attributes if caller
supports the same set of ISAs.

Here is the updated patch.

-- 
H.J.
From 78eb1a4c4938494349032f0e10017ce553fb8fdd Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Fri, 21 Aug 2020 09:42:49 -0700
Subject: [PATCH] x86: Use target("general-regs-only,baseline-isas-only") in
 

Add -mbaseline-isas-only and target("baseline-isas-only") attribute to
support baseline ISAs, which include FXSR, MMX, SSE and SSE2 in 64-bit
mode.  Use only general registers and baseline ISAs to perform CPUID
check.  We can inline functions with general registers and baseline
ISAs attributes if caller supports the same set of ISAs.

gcc/

	PR target/96744
	* common/config/i386/i386-common.c (ix86_handle_option): Support
	-mbaseline-isas-only.
	* config/i386/cpuid.h: Add #pragma GCC
	target("general-regs-only,baseline-isas-only").
	* config/i386/i386-options.c (ix86_valid_target_attribute_inner_p):
	Handle baseline-isas-only.
	* config/i386/i386.c (ix86_can_inline_p): Allow inline functions
	with baseline-isas-only and general-regs-only attributes if caller
	supports the same set of ISAs.
	* config/i386/i386.h (TARGET_64BIT_BASELINE_ISAS): New.
	* config/i386/i386.opt: Add -mbaseline-isas-only.
	* doc/extend.texi: Document target("baseline-isas-only") function
	attribute.
	* doc/invoke.texi: Document -mbaseline-isas-only.

gcc/testsuite/

	PR target/96744
	* gcc.target/i386/avx512-check.h: Add #pragma GCC
	target("baseline-isas-only") for CPUID check.
	* gcc.target/i386/pr96744-10.c: New test.
	* gcc.target/i386/pr96744-11.c: Likewise.
	* gcc.target/i386/pr96744-12.c: Likewise.
	* gcc.target/i386/pr96744-12.c: Likewise.
	* gcc.target/i386/pr96744-14.c: Likewise.
	* gcc.target/i386/pr96744-15.c: Likewise.
---
 gcc/common/config/i386/i386-common.c | 28 +
 gcc/config/i386/cpuid.h  | 13 ++
 gcc/config/i386/i386-options.c   |  7 +++-
 gcc/config/i386/i386.c   | 34 ---
 gcc/config/i386/i386.h   |  4 ++
 gcc/config/i386/i386.opt |  6 ++-
 gcc/doc/extend.texi  |  4 ++
 

[PATCH] tree-optimization/96548 - fix failure to recompute RPO after CFG change

2020-08-25 Thread Richard Biener
This recomputes RPO after store-motion changes the CFG.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2020-08-25  Richard Biener  

PR tree-optimization/96548
PR tree-optimization/96760
* tree-ssa-loop-im.c (tree_ssa_lim): Recompute RPO after
store-motion.

* gcc.dg/torture/pr96548.c: New testcase.
* gcc.dg/torture/pr96760.c: Likewise.
---
 gcc/testsuite/gcc.dg/torture/pr96548.c | 20 
 gcc/testsuite/gcc.dg/torture/pr96760.c | 22 ++
 gcc/tree-ssa-loop-im.c |  4 
 3 files changed, 46 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr96548.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr96760.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr96548.c 
b/gcc/testsuite/gcc.dg/torture/pr96548.c
new file mode 100644
index 000..a0547427d29
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr96548.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+int c9, d3;
+
+void
+sg (int *rs, int f2)
+{
+  for (;;)
+{
+  if (*rs < 1)
+__builtin_unreachable ();
+
+  for (c9 = 0; c9 < 1; ++c9)
+while (f2 < 1)
+  ++c9;
+
+  if (d3)
+c9 += !!f2 ? 0 : d3;
+}
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr96760.c 
b/gcc/testsuite/gcc.dg/torture/pr96760.c
new file mode 100644
index 000..4f6bbe96faf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr96760.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+
+char a = 0, f = 0, c = 5;
+unsigned long d = 0;
+int g = 0;
+int *e = 
+
+int main() {
+  char  b = 0;
+  for (;;) {
+for (a = 0; a < 2; a++) { // no UB I believe
+  if (c) {
+if (d != 0)
+  __builtin_abort ();
+return 0;
+  }
+}
+f = (d++, *e);
+  }
+
+  return 1;
+}
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 8b7eb57514f..d8fb28ec5bd 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3114,6 +3114,10 @@ tree_ssa_lim (function *fun)
  out of the loops as well.  */
   do_store_motion ();
 
+  free (rpo);
+  rpo = XNEWVEC (int, last_basic_block_for_fn (fun));
+  n = pre_and_rev_post_order_compute_fn (fun, NULL, rpo, false);
+
   /* Move the expressions that are expensive enough.  */
   for (int i = 0; i < n; ++i)
 todo |= move_computations_worker (BASIC_BLOCK_FOR_FN (fun, rpo[i]));
-- 
2.26.2


Re: [PATCH] introduce attribute exalias

2020-08-25 Thread Iain Sandoe

Alexandre Oliva  wrote:


On Aug 15, 2020, Iain Sandoe  wrote:


Alexandre Oliva  wrote:



I'm pretty sure setting multiple labels at the same address is used
and relied on quite often.



That’s what’s currently disallowed (the assemblers all support .set).


I understand you mean it's disallowed for global labels.


Yes - or, in fact, linker-visible ones for Darwin because of the issue below.


 I meant it was often used for local labels.


… including for Darwin targets;
one has to take care with details sometimes, since a local label defines
something that is part of the atom delineated by the next preceding global
(or linker-visible) symbol (or section start if there are none).

When the local label does not really belong to that atom, it can cause
problems (e.g. apparently branching to code that is weak global) so we insert
a non-global (but linker-visible) symbol to allow a new atom to start.


Thinking aloud  - not thought through in any detail - I wonder if the
facilities of
C++20 modules are sufficient?


I'm really not sure what issue you're thinking of solving.


The one where one wanted to import the declaration of a function without
having to include every header needed to declare the types it uses.

Since a header unit should be self-contained and is a compiled artefact.


The one I'm working on is that of enabling the use of a uniform string
in Ada import statements (and also in alias("targets")), even when a
symbolic type that does not mangle uniformly is in use (think int64_t
mapping to long or long long).  Having per-target source files is quite
cumbersome in Ada, and there isn't a preprocessor to rely on for
conditionals and token pasting and whatnot.

I'm afraid I don't see how C++ modules could any offer in this regard.


perhaps not, it was only “thinking aloud”
(since we are agreed it would be nice to have a solution that the compiler
could manipulate/synthesize, rather than requiring source-level changes).

FAOD, the comments above are just continuation of discussion - not any
additional objection to the proposal.

thanks
Iain



Re: [PATCH PR96357][GCC][AArch64]: could not split insn UNSPEC_COND_FSUB with AArch64 SVE

2020-08-25 Thread Richard Sandiford
Przemyslaw Wirkus  writes:
> @@ -5234,21 +5234,21 @@ (define_insn_and_rewrite "*cond_sub_3_const"
>  
>  ;; Predicated floating-point subtraction from a constant, merging with an
>  ;; independent value.
> -(define_insn_and_rewrite "*cond_sub_any_const"
> +;; The subtraction predicate and the merge predicate are allowed to be
> +;; different.

Sorry for the micromanagement, but I think this is easier to read if it
flows as a single paragraph:

;; Predicated floating-point subtraction from a constant, merging with an
;; independent value.  The subtraction predicate and the merge predicate are
;; allowed to be different.

or is written as two separate paragraphs:

;; Predicated floating-point subtraction from a constant, merging with an
;; independent value.
;;
;; The subtraction predicate and the merge predicate are allowed to be
;; different.

Same for the second comment.

> @@ -5271,6 +5271,41 @@ (define_insn_and_rewrite "*cond_sub_any_const"
>[(set_attr "movprfx" "yes")]
>  )
>  
> +;; Predicated floating-point subtraction from a constant, merging with an
> +;; independent value.
> +;; The subtraction predicate and the merge predicate must be the same.
> +(define_insn_and_rewrite "*cond_sub_strict_const"
> +  [(set (match_operand:SVE_FULL_F 0 "register_operand" "=w, w, ?w")
> + (unspec:SVE_FULL_F
> +   [(match_operand: 1 "register_operand" "Upl, Upl, Upl")
> +(unspec:SVE_FULL_F
> +  [(match_dup 1)
> +   (const_int SVE_STRICT_GP)
> +   (match_operand:SVE_FULL_F 2 "aarch64_sve_float_arith_immediate")
> +   (match_operand:SVE_FULL_F 3 "register_operand" "w, w, w")]
> +  UNSPEC_COND_FSUB)
> +(match_operand:SVE_FULL_F 4 "aarch64_simd_reg_or_zero" "Dz, 0, w")]
> +   UNSPEC_SEL))]
> +  "TARGET_SVE && !rtx_equal_p (operands[3], operands[4])"
> +  "@
> +   movprfx\t%0., %1/z, %3.\;fsubr\t%0., %1/m, 
> %0., #%2
> +   movprfx\t%0., %1/m, %3.\;fsubr\t%0., %1/m, 
> %0., #%2
> +   #"
> +  "&& 1"
> +  {
> +if (reload_completed
> +&& register_operand (operands[4], mode)
> +&& !rtx_equal_p (operands[0], operands[4]))
> +  {
> + emit_insn (gen_vcond_mask_ (operands[0], operands[3],
> +  operands[4], operands[1]));
> + operands[4] = operands[3] = operands[0];
> +  }
> +else
> +  FAIL;

I should have realised this would be the case, sorry, but now that there's
only one rewrite, this should simply be:

  "&& reload_completed
   && register_operand (operands[4], mode)
   && !rtx_equal_p (operands[0], operands[4]))"
  {
emit_insn (gen_vcond_mask_ (operands[0], operands[3],
 operands[4], operands[1]));
operands[4] = operands[3] = operands[0];
  }

OK with those changes, thanks.

Richard


Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

2020-08-25 Thread Richard Sandiford
xiezhiheng  writes:
>> -Original Message-
>> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
>> Sent: Friday, August 21, 2020 5:02 PM
>> To: xiezhiheng 
>> Cc: Richard Biener ; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
>> emitted at -O3
>
> Cut...
>  
>> Looks like the saturating intrinsics might need a bit more thought.
>> Would you mind submitting the patch with just the other parts?
>> Those were uncontroversial and it would be a shame to hold them
>> up over this.
>
> Okay, I reorganized the existing patch and finished the first half of the 
> intrinsics
> except saturating intrinsics and load intrinsics.
>
> Bootstrapped and tested on aarch64 Linux platform.

I know this'll be frustrating, sorry, but could you post the
2020-08-17 patch without the saturation changes?  It's going to be
easier to track and review if each patch deals with similar intrinsics.
The non-saturating part of the 2020-08-17 patch was good because it was
dealing purely with arithmetic operations.  Loads should really be a
separate change.

BTW, for something like this, it's OK to test and submit several patches
at once, so separating the patches doesn't need to mean longer test cycles.
It's just that for review purposes, it's easier if one patch does one thing.

> For load intrinsics, I have one problem when I set FLAG_READ_MEMORY for them,
> some test cases like
> gcc.target/aarch64/advsimd-intrinsics/vld2_lane_p8_indices_1.c
>   #include 
>
>   /* { dg-do compile } */
>   /* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } } */
>
>   poly8x8x2_t
>   f_vld2_lane_p8 (poly8_t * p, poly8x8x2_t v)
>   {
> poly8x8x2_t res;
> /* { dg-error "lane 8 out of range 0 - 7" "" { target *-*-* } 0 } */
> res = vld2_lane_p8 (p, v, 8);
> /* { dg-error "lane -1 out of range 0 - 7" "" { target *-*-* } 0 } */
> res = vld2_lane_p8 (p, v, -1);
> return res;
>   }
> would fail in regression.  Because the first statement
>   res = vld2_lane_p8 (p, v, 8);
> would be eliminated as dead code in gimple phase but the error message is
> generated in expand pass.  So I am going to replace the second statement
>   res = vld2_lane_p8 (p, v, -1);
> with
>   res = vld2_lane_p8 (p, res, -1);
> or do you have any other suggestions?

The test is valid as-is, so it would be better not to change it.

I guess this means that we should leave the _lane loads and stores until
we implement the range checks in a different way.  This is somewhat
related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969 ,
although your example shows that the “dummy const function” approach
might not work.

So to start with, could you just patch the non-lane loads?

> And for test case gcc.target/aarch64/arg-type-diagnostics-1.c, I return the 
> result
> to prevent the statement
>   result = vrsra_n_s32 (arg1, arg2, a);
> from being eliminated by treated as dead code.

Hmm.  Here too I think the test is valid as-is.  I think we need
to ensure that the range check still happens even if the call is
dead code (similar to PR95969 above).

So I guess here too, it might be better to leave the _n forms to
a separate patch.

That doesn't mean we shouldn't fix the _lane and _n cases (or the
previous saturating cases).  It's just that each time we find a group
of functions that's awkward for some reason, it'd be better to deal
with those functions separately.

Thanks,
Richard


[Patch] OpenMP/Fortran: Handle polymorphic scalars in data-sharing FIRSTPRIVATE (PR86470)

2020-08-25 Thread Tobias Burnus

This patch adds support for polymorphic variables ("CLASS")
to OpenMP's data-sharing clause FIRSTPRIVATE.

While the patch should be okay, there is more follow-up
work required, but one has to make a start :-)

* PRIVATE – as used in the testcase of the PR is not yet supported,
  only FIRSTPRIVATE.
* polymorphic arrays are not supported (see 'sorry').
– For nonallocatable arrays, the decl passed to the function
  does contain much information; the LANG_SPECIFIC is non-NULL
  its the pointer components contain garbage :-(
– Handling noncharacter polymorphic arrays (hence: non-unlimited
  polymorphic) seems to be simpler; the current patch seems to
  work for some cases already, if the "sorry" is commented.
* ...

OK for mainline?

Tobias

PS: Supporting *map*ing of polymorphic variables is another matter,
which is unfortunately even harder.

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
[Patch] OpenMP/Fortran: Handle polymorphic scalars in data-sharing FIRSTPRIVATE (PR86470)

gcc/fortran/ChangeLog:

	PR fortran/86470
	* trans-expr.c (gfc_copy_class_to_class): Add unshare_expr.
	* trans-openmp.c (gfc_is_polymorphic_nonptr,
	gfc_is_unlimited_polymorphic_nonptr): New.
	(gfc_omp_clause_copy_ctor): Handle polymorphic variables.

libgomp/ChangeLog:

	PR fortran/86470
	* testsuite/libgomp.fortran/class-firstprivate-1.f90: New test.
	* testsuite/libgomp.fortran/class-firstprivate-2.f90: New test.
	* testsuite/libgomp.fortran/class-firstprivate-3.f90: New test.

gcc/testsuite/ChangeLog:

	PR fortran/86470
	* gfortran.dg/gomp/class-firstprivate-1.f90: New test.
	* gfortran.dg/gomp/class-firstprivate-2.f90: New test.
	* gfortran.dg/gomp/class-firstprivate-3.f90: New test.

 gcc/fortran/trans-expr.c   |   2 +-
 gcc/fortran/trans-openmp.c | 126 +
 gcc/testsuite/gfortran.dg/gomp/class-firstprivate-1.f90|  62 ++
 gcc/testsuite/gfortran.dg/gomp/class-firstprivate-2.f90|  54 ++
 gcc/testsuite/gfortran.dg/gomp/class-firstprivate-3.f90|  61 ++
 libgomp/testsuite/libgomp.fortran/class-firstprivate-1.f90 | 323 +++
 libgomp/testsuite/libgomp.fortran/class-firstprivate-2.f90 | 334 +
 libgomp/testsuite/libgomp.fortran/class-firstprivate-3.f90 | 334 +
 8 files changed, 1295 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 36ff9b5cbc6..b0c38e9f444 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -1451,7 +1451,7 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
 	{
 	  vec_safe_push (args, from_len);
 	  vec_safe_push (args, to_len);
-	  extcopy = build_call_vec (fcn_type, fcn, args);
+	  extcopy = build_call_vec (fcn_type, unshare_expr (fcn), args);
 	  tmp = fold_build2_loc (input_location, GT_EXPR,
  logical_type_node, from_len,
  build_zero_cst (TREE_TYPE (from_len)));
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 063d4c145e2..705cdc7749f 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -355,6 +355,51 @@ gfc_has_alloc_comps (tree type, tree decl)
   return false;
 }
 
+/* Return true if TYPE is polymorphic but not with pointer attribute.  */
+
+static bool
+gfc_is_polymorphic_nonptr (tree type)
+{
+  if (POINTER_TYPE_P (type))
+type = TREE_TYPE (type);
+  if (TREE_CODE (type) != RECORD_TYPE)
+return false;
+
+  tree field = TYPE_FIELDS (type);
+  if (!field || 0 != strcmp ("_data", IDENTIFIER_POINTER (DECL_NAME (field
+return false;
+  field = DECL_CHAIN (field);
+  if (!field || 0 != strcmp ("_vptr", IDENTIFIER_POINTER (DECL_NAME (field
+return false;
+
+  return true;
+}
+
+/* Return true if TYPE is unlimited polymorphic but not with pointer attribute;
+   unlimited means also intrinsic types are handled and _len is used.  */
+
+static bool
+gfc_is_unlimited_polymorphic_nonptr (tree type)
+{
+  if (POINTER_TYPE_P (type))
+type = TREE_TYPE (type);
+  if (TREE_CODE (type) != RECORD_TYPE)
+return false;
+
+  tree field = TYPE_FIELDS (type); /* _data */
+  if (!field)
+return false;
+  field = DECL_CHAIN (field); /* _vptr */
+  if (!field)
+return false;
+  field = DECL_CHAIN (field);
+  if (!field || 0 != strcmp ("_len", IDENTIFIER_POINTER (DECL_NAME (field
+return false;
+
+  return true;
+}
+
+
 /* Return true if DECL in private clause needs
OMP_CLAUSE_PRIVATE_OUTER_REF on the private clause.  */
 bool
@@ -740,6 +785,87 @@ gfc_omp_clause_copy_ctor (tree clause, tree dest, tree src)
   gcc_assert (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_FIRSTPRIVATE
 	  || OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_LINEAR);
 
+  /* TODO: implement support for polymorphic arrays; reject for now.  */
+  /* Void 

Re: [PATCH] fix a typo in rtl.def

2020-08-25 Thread Richard Sandiford
Wei Wentao  writes:
> Hi,
>
> This patch fix a typo in rtl.def.

Thanks, pushed.  Looks like it had gone unnoticed (or at least unfixed)
for almost 30 years. :-)

Richard

>
> Regards!
>
> Weiwt
>
> ---
>  gcc/rtl.def | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/rtl.def b/gcc/rtl.def
> index 9754333eafb..7ec94a95105 100644
> --- a/gcc/rtl.def
> +++ b/gcc/rtl.def
> @@ -380,7 +380,7 @@ DEF_RTL_EXPR(PC, "pc", "", RTX_OBJ)
>  
>  /* A register.  The "operand" is the register number, accessed with
> the REGNO macro.  If this number is less than FIRST_PSEUDO_REGISTER
> -   than a hardware register is being referred to.  The second operand
> +   then a hardware register is being referred to.  The second operand
> points to a reg_attrs structure.
> This rtx needs to have as many (or more) fields as a MEM, since we
> can change REG rtx's into MEMs during reload.  */


[PATCH] lra: Canonicalize mult to shift in address reloads

2020-08-25 Thread Alex Coplan
Hello,

Inside a (mem) RTX, it is canonical to write multiplications by powers
of two using a (mult) [0]. For example, given the following C function:

long f(long *p, long x)
{
return p[x];
}

AArch64 GCC generates the following RTL insn (in final):

(set (reg/i:DI 0 x0)
 (mem:DI (plus:DI (mult:DI (reg:DI 1 x1 [101])
 (const_int 8 [0x8]))
 (reg:DI 0 x0 [100])) [1 *_3+0 S8 A64]))

However, outside of a (mem), the canonical way to write multiplications
by powers of two is using (ashift). E.g. for the following C function:

long *g(long *p, long x)
{
return p + x;
}

AArch64 GCC generates:

(set (reg/i:DI 0 x0)
 (plus:DI (ashift:DI (reg:DI 1 x1 [100])
 (const_int 3 [0x3]))
 (reg:DI 0 x0 [99])))

Now I observed that LRA does not quite respect this RTL canonicalization
rule. When compiling gcc/testsuite/gcc.dg/torture/pr34330.c with -Os
-ftree-vectorize, the RTL in the dump "281r.ira" has the insn:

(set (reg:SI 111 [ MEM[base: b_25(D), index: ivtmp.9_35, step: 4, offset: 0B] ])
 (mem:SI (plus:DI (mult:DI (reg:DI 101 [ ivtmp.9 ])
 (const_int 4 [0x4]))
 (reg/v/f:DI 105 [ b ])) [2 MEM[base: b_25(D), index: ivtmp.9_35, 
step: 4, offset: 0B]+0 S4 A32]))

but LRA then proceeds to generate a reload, and we get the following
non-canonical insn in "282r.reload":

(set (reg:DI 7 x7 [121])
 (plus:DI (mult:DI (reg:DI 5 x5 [orig:101 ivtmp.9 ] [101])
 (const_int 4 [0x4]))
 (reg/v/f:DI 1 x1 [orig:105 b ] [105])))

This patch fixes LRA to ensure that we generate canonical RTL in this
case. After the patch, we get the following insn in "282r.reload":

(set (reg:DI 7 x7 [121])
(plus:DI (ashift:DI (reg:DI 5 x5 [orig:101 ivtmp.9 ] [101])
(const_int 2 [0x2]))
(reg/v/f:DI 1 x1 [orig:105 b ] [105])))

The motivation here is to be able to remove several redundant patterns
in the AArch64 backend. See the previous thread [1] for context.

Testing:
 * Bootstrapped and regtested on aarch64-none-linux-gnu,
   x86_64-pc-linux-gnu.
 * New unit test which checks that we're using the shift pattern (rather
   than the problematic mult pattern) on AArch64.

OK for master?

Thanks,
Alex

[0] : https://gcc.gnu.org/onlinedocs/gccint/Insn-Canonicalizations.html
[1] : https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552066.html

---

gcc/ChangeLog:

* lra-constraints.c (canonicalize_reload_addr): New.
(curr_insn_transform): Use canonicalize_reload_addr to ensure we
generate canonical RTL for an address reload.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/mem-shift-canonical.c: New test.

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 421c453997b..630a4f167cd 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -570,6 +570,34 @@ init_curr_insn_input_reloads (void)
   curr_insn_input_reloads_num = 0;
 }
 
+/* The canonical form of an rtx inside a MEM is not necessarily the same as the
+   canonical form of the rtx outside the MEM.  Fix this up in the case that
+   we're reloading an address (and therefore pulling it outside a MEM).  */
+static rtx canonicalize_reload_addr (rtx addr)
+{
+  const enum rtx_code code = GET_CODE (addr);
+
+  if (code == PLUS)
+{
+  rtx inner = XEXP (addr, 0);
+  if (GET_CODE (inner) == MULT && CONST_INT_P (XEXP (inner, 1)))
+   {
+ const HOST_WIDE_INT ci = INTVAL (XEXP (inner, 1));
+ const int pwr2 = exact_log2 (ci);
+ if (pwr2 > 0)
+   {
+ /* Rewrite this to use a shift instead, which is canonical
+when outside of a MEM.  */
+ XEXP (addr, 0) = gen_rtx_ASHIFT (GET_MODE (inner),
+  XEXP (inner, 0),
+  GEN_INT (pwr2));
+   }
+   }
+}
+
+  return addr;
+}
+
 /* Create a new pseudo using MODE, RCLASS, ORIGINAL or reuse already
created input reload pseudo (only if TYPE is not OP_OUT).  Don't
reuse pseudo if IN_SUBREG_P is true and the reused pseudo should be
@@ -4362,12 +4390,19 @@ curr_insn_transform (bool check_only_p)
{
  rtx addr = *loc;
  enum rtx_code code = GET_CODE (addr);
- 
+ bool align_p = false;
+
  if (code == AND && CONST_INT_P (XEXP (addr, 1)))
-   /* (and ... (const_int -X)) is used to align to X bytes.  */
-   addr = XEXP (*loc, 0);
+   {
+ /* (and ... (const_int -X)) is used to align to X bytes.  */
+ align_p = true;
+ addr = XEXP (*loc, 0);
+   }
+ else
+   addr = canonicalize_reload_addr (addr);
+
  lra_emit_move (new_reg, addr);
- if (addr != *loc)
+ if (align_p)
emit_move_insn (new_reg, gen_rtx_AND (GET_MODE (new_reg), 
new_reg, XEXP (*loc, 1)));

[PATCH 1/2] IPA symver: allow multiple symvers for a definition

2020-08-25 Thread Martin Liska

gcc/ChangeLog:

* cgraphunit.c (process_symver_attribute): Allow multiple
symver attributes for one symbol.
* doc/extend.texi: Document the change.

gcc/testsuite/ChangeLog:

* lib/target-supports-dg.exp: Add dg-require-symver.
* lib/target-supports.exp: Likewise.
* gcc.dg/ipa/symver1.c: New test.
---
 gcc/cgraphunit.c | 143 ---
 gcc/doc/extend.texi  |  10 +-
 gcc/testsuite/gcc.dg/ipa/symver1.c   |  11 ++
 gcc/testsuite/lib/target-supports-dg.exp |  10 ++
 gcc/testsuite/lib/target-supports.exp|  12 ++
 5 files changed, 110 insertions(+), 76 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/symver1.c

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 0b1009d0dea..fa3aec79a48 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -720,80 +720,81 @@ process_symver_attribute (symtab_node *n)
 {
   tree value = lookup_attribute ("symver", DECL_ATTRIBUTES (n->decl));
 
-  if (!value)
-return;
-  if (lookup_attribute ("symver", TREE_CHAIN (value)))
+  for (; value != NULL; value = TREE_CHAIN (value))
 {
-  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"multiple versions for one symbol");
-  return;
-}
-  tree symver = get_identifier_with_length
-		  (TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (value))),
-		   TREE_STRING_LENGTH (TREE_VALUE (TREE_VALUE (value;
-  symtab_node *def = symtab_node::get_for_asmname (symver);
+  /* Starting from bintuils 2.35 gas supports:
+	  # Assign foo to bar@V1 and baz@V2.
+	  .symver foo, bar@V1
+	  .symver foo, baz@V2
+  */
+
+  tree symver = get_identifier_with_length
+	(TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (value))),
+	 TREE_STRING_LENGTH (TREE_VALUE (TREE_VALUE (value;
+  symtab_node *def = symtab_node::get_for_asmname (symver);
+
+  if (def)
+	{
+	  error_at (DECL_SOURCE_LOCATION (n->decl),
+		"duplicate definition of a symbol version");
+	  inform (DECL_SOURCE_LOCATION (def->decl),
+		  "same version was previously defined here");
+	  return;
+	}
+  if (!n->definition)
+	{
+	  error_at (DECL_SOURCE_LOCATION (n->decl),
+		"symbol needs to be defined to have a version");
+	  return;
+	}
+  if (DECL_COMMON (n->decl))
+	{
+	  error_at (DECL_SOURCE_LOCATION (n->decl),
+		"common symbol cannot be versioned");
+	  return;
+	}
+  if (DECL_COMDAT (n->decl))
+	{
+	  error_at (DECL_SOURCE_LOCATION (n->decl),
+		"comdat symbol cannot be versioned");
+	  return;
+	}
+  if (n->weakref)
+	{
+	  error_at (DECL_SOURCE_LOCATION (n->decl),
+		"% cannot be versioned");
+	  return;
+	}
+  if (!TREE_PUBLIC (n->decl))
+	{
+	  error_at (DECL_SOURCE_LOCATION (n->decl),
+		"versioned symbol must be public");
+	  return;
+	}
+  if (DECL_VISIBILITY (n->decl) != VISIBILITY_DEFAULT)
+	{
+	  error_at (DECL_SOURCE_LOCATION (n->decl),
+		"versioned symbol must have default visibility");
+	  return;
+	}
 
-  if (def)
-{
-  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"duplicate definition of a symbol version");
-  inform (DECL_SOURCE_LOCATION (def->decl),
-	  "same version was previously defined here");
-  return;
-}
-  if (!n->definition)
-{
-  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"symbol needs to be defined to have a version");
-  return;
-}
-  if (DECL_COMMON (n->decl))
-{
-  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"common symbol cannot be versioned");
-  return;
-}
-  if (DECL_COMDAT (n->decl))
-{
-  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"comdat symbol cannot be versioned");
-  return;
-}
-  if (n->weakref)
-{
-  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"% cannot be versioned");
-  return;
+  /* Create new symbol table entry representing the version.  */
+  tree new_decl = copy_node (n->decl);
+
+  DECL_INITIAL (new_decl) = NULL_TREE;
+  if (TREE_CODE (new_decl) == FUNCTION_DECL)
+	DECL_STRUCT_FUNCTION (new_decl) = NULL;
+  SET_DECL_ASSEMBLER_NAME (new_decl, symver);
+  TREE_PUBLIC (new_decl) = 1;
+  DECL_ATTRIBUTES (new_decl) = NULL;
+
+  symtab_node *symver_node = symtab_node::get_create (new_decl);
+  symver_node->alias = true;
+  symver_node->definition = true;
+  symver_node->symver = true;
+  symver_node->create_reference (n, IPA_REF_ALIAS, NULL);
+  symver_node->analyzed = true;
 }
-  if (!TREE_PUBLIC (n->decl))
-{
-  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"versioned symbol must be public");
-  return;
-}
-  if (DECL_VISIBILITY (n->decl) != VISIBILITY_DEFAULT)
-{
-  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"versioned symbol must have default visibility");
-  return;
-}
-
-  /* Create new symbol table entry representing the version.  */
-  tree new_decl = copy_node (n->decl);
-
-  DECL_INITIAL (new_decl) = NULL_TREE;
-  if (TREE_CODE (new_decl) == 

[PATCH 2/2] IPA symver: support visibility and static symbols.

2020-08-25 Thread Martin Liska

gcc/ChangeLog:

* cgraphunit.c (process_symver_attribute): Remove checks that
are not needed now.
(cgraph_node::assemble_thunks_and_aliases): Change second
argument to decl.
* config/elfos.h (ASM_OUTPUT_SYMVER_DIRECTIVE): Add new
VISIBILITY parameter.
* doc/extend.texi: Document that .symver supports visibility.
* symtab.c (symtab_node::verify_base): Remove checks that
are not needed now.
* varasm.c (do_assemble_symver): Detect visibility .symver
directive argument.
* varpool.c (varpool_node::assemble_aliases): Change second
argument to decl.

gcc/testsuite/ChangeLog:

* gcc.dg/ipa/symver2.c: New test.
* gcc.dg/ipa/symver3.c: New test.
---
 gcc/cgraphunit.c   | 15 +--
 gcc/config/elfos.h | 20 +++-
 gcc/doc/extend.texi|  3 +--
 gcc/symtab.c   | 16 
 gcc/testsuite/gcc.dg/ipa/symver2.c |  9 +
 gcc/testsuite/gcc.dg/ipa/symver3.c | 13 +
 gcc/varasm.c   | 12 ++--
 gcc/varpool.c  |  3 +--
 8 files changed, 46 insertions(+), 45 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/symver2.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/symver3.c

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index fa3aec79a48..c85d3482c8b 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -765,18 +765,6 @@ process_symver_attribute (symtab_node *n)
 		"% cannot be versioned");
 	  return;
 	}
-  if (!TREE_PUBLIC (n->decl))
-	{
-	  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"versioned symbol must be public");
-	  return;
-	}
-  if (DECL_VISIBILITY (n->decl) != VISIBILITY_DEFAULT)
-	{
-	  error_at (DECL_SOURCE_LOCATION (n->decl),
-		"versioned symbol must have default visibility");
-	  return;
-	}
 
   /* Create new symbol table entry representing the version.  */
   tree new_decl = copy_node (n->decl);
@@ -2239,8 +2227,7 @@ cgraph_node::assemble_thunks_and_aliases (void)
 	 of buffering it in same alias pairs.  */
 	  TREE_ASM_WRITTEN (decl) = 1;
 	  if (alias->symver)
-	do_assemble_symver (alias->decl,
-DECL_ASSEMBLER_NAME (decl));
+	do_assemble_symver (alias->decl, decl);
 	  else
 	do_assemble_alias (alias->decl,
 			   DECL_ASSEMBLER_NAME (decl));
diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
index 74a3eafda6b..6c9b73320b9 100644
--- a/gcc/config/elfos.h
+++ b/gcc/config/elfos.h
@@ -248,15 +248,17 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 }	\
   while (0)
 
-#define ASM_OUTPUT_SYMVER_DIRECTIVE(FILE, NAME, NAME2)		\
-  do\
-{\
-  fputs ("\t.symver\t", (FILE));\
-  assemble_name ((FILE), (NAME));\
-  fputs (", ", (FILE));	\
-  assemble_name ((FILE), (NAME2));\
-  fputc ('\n', (FILE));	\
-}\
+#define ASM_OUTPUT_SYMVER_DIRECTIVE(FILE, NAME, NAME2, VISIBILITY)  \
+  do\
+{\
+  fputs ("\t.symver\t", (FILE));\
+  assemble_name ((FILE), (NAME));\
+  fputs (", ", (FILE));	\
+  assemble_name ((FILE), (NAME2));\
+  if (visibility != NULL)	\
+	fprintf ((FILE), ", %s", (VISIBILITY));			\
+  fputc ('\n', (FILE));	\
+}\
   while (0)
 
 /* The following macro defines the format used to output the second
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 97ae1ee0843..22e62f36714 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3729,8 +3729,7 @@ On ELF targets this attribute creates a symbol version.  The @var{name2} part
 of the parameter is the actual name of the symbol by which it will be
 externally referenced.  The @code{nodename} portion should be the name of a
 node specified in the version script supplied to the linker when building a
-shared library.  Versioned symbol must be defined and must be exported with
-default visibility.
+shared library.
 
 @smallexample
 __attribute__ ((__symver__ ("foo@@VERS_1"))) int
diff --git a/gcc/symtab.c b/gcc/symtab.c
index d7dfbb676df..51628fe625f 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -1178,22 +1178,6 @@ symtab_node::verify_base (void)
   error ("node is symver but not alias");
   error_found = true;
 }
-  /* Limitation of gas requires us to output targets of symver aliases as
- global symbols.  This is binutils PR 25295.  */
-  if (symver
-  && (!TREE_PUBLIC (get_alias_target ()->decl)
-	  || DECL_VISIBILITY (get_alias_target ()->decl) != VISIBILITY_DEFAULT))
-{
-  error ("symver target is not exported with default visibility");
-  error_found = true;
-}
-  if (symver
-  && (!TREE_PUBLIC (decl)
-	  || DECL_VISIBILITY (decl) != VISIBILITY_DEFAULT))
-{
-  error ("symver is not exported with default visibility");
-  

[PATCH 0/2] symver: extend functionality

2020-08-25 Thread Martin Liska
Hey.

Since the bintuils release 2.35, we can now support new .symver syntax added in:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=6914be53bd662eefd979d0c82d2e20e108c4ee66

Patch survives bootstrap and regression tests.
Thoughts?
Martin

Martin Liska (2):
  IPA symver: allow multiple symvers for a definition
  IPA symver: support visibility and static symbols.

 gcc/cgraphunit.c | 134 +++
 gcc/config/elfos.h   |  20 ++--
 gcc/doc/extend.texi  |  13 +--
 gcc/symtab.c |  16 ---
 gcc/testsuite/gcc.dg/ipa/symver1.c   |  11 ++
 gcc/testsuite/gcc.dg/ipa/symver2.c   |   9 ++
 gcc/testsuite/gcc.dg/ipa/symver3.c   |  13 +++
 gcc/testsuite/lib/target-supports-dg.exp |  10 ++
 gcc/testsuite/lib/target-supports.exp|  12 ++
 gcc/varasm.c |  12 +-
 gcc/varpool.c|   3 +-
 11 files changed, 144 insertions(+), 109 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/symver1.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/symver2.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/symver3.c

-- 
2.28.0



Re: BoF DWARF5 patches

2020-08-25 Thread Mark Wielaard
Hi,

On Mon, 2020-08-24 at 14:56 +0200, Mark Wielaard wrote:
> This afternoon there will be a BoF at the virtual Cauldron about
> DWARF5 and beyond. 
> https://linuxplumbersconf.org/event/7/contributions/746/
> 
> Here are some patches for GCC that I would like to discuss.
> I'll reply to them after the BoF with any comments people made.

Jeremy Bennett was nice enough to add the BoF notes to the GCC wiki: 
https://gcc.gnu.org/wiki/linuxplumbers2020#DWARF5.2FDWARF64_BoF

Cheers,

Mark


Re: [PATCH v3] libgcc: Use `-fasynchronous-unwind-tables' for LIB2_DIVMOD_FUNCS

2020-08-25 Thread Kito Cheng via Gcc-patches
Hi Maciej:

I just found the mail thread about div mod with -fnon-call-exceptions,
I think keeping the default LIB2_DIVMOD_EXCEPTION_FLAGS unchanged
should be the best way to go.

Non-call exceptions and libcalls
https://gcc.gnu.org/legacy-ml/gcc/2001-06/msg01108.html

Non-call exceptions and libcalls Part 2
https://gcc.gnu.org/legacy-ml/gcc/2001-07/msg00402.html


On Tue, Aug 25, 2020 at 9:41 AM Kito Cheng  wrote:
>
> Hi Maciej:
>
> Thanks for your patch, I tried to compile all target libraries with
> -O0 and got link error as you describe, it's really confusing about
> the error message, __pthread_mutex_lock and __pthread_mutex_unlock are
> undefined, it turns out the root cause is the __umoddi3 called
> _Unwind_Resume and pull-in lots of functions, so really appreciate you
> digging out this issue.
>
> After figure out what happen and reproduce what you see, I spend times
> to dig out the reason why -fexceptions -fnon-call-exceptions is set
> for div and mod routines, it was introduce at many years ago[1] and no
> much comment on why doing this, but I guess it's because some ISAs
> might generate exception/trap when divide-by-0, and then the execution
> environment will convert that to an exception, so those files/routines
> need to compile with such flags.
>
> However RISC-V never causes a trap when divide-by-0, so I believe
> -fexceptions -fnon-call-exceptions could be removed safely for RISC-V,
> but I am not sure it's safe for all other targets or not.
>
> I would suggest you set LIB2_DIVMOD_EXCEPTION_FLAGS to empty or
> '-fasynchronous-unwind-tables' for RISC-V port and do not change the
> default one in libgcc/Makefile.in.
>
> [1] 
> https://github.com/gcc-mirror/gcc/commit/fc6aa0a98a0c7d10d39dd9d1485d0996b84b1860#diff-f98ad72e54bdf47d90acbfafa467d802R132
>
>
> On Fri, Aug 21, 2020 at 2:46 AM Maciej W. Rozycki via Gcc-patches
>  wrote:
> >
> > Complement commit b932f770f70d ("x86_64 frame unwind info"), SVN r46374,
> > , and replace
> > `-fexceptions -fnon-call-exceptions' with `-fasynchronous-unwind-tables'
> > in LIB2_DIVMOD_FUNCS compilation flags so as to provide unwind tables
> > for the affected functions while not pulling the unwinder proper, which
> > is not required here.
> >
> > Beyond saving program space it fixes a RISC-V glibc build error due to
> > unsatisfied `malloc' and `free' references from the unwinder causing
> > link errors with `ld.so' where libgcc has been built at -O0.
> >
> > gcc/
> > * testsuite/gcc.target/arm/div64-unwinding.c: Rename to...
> > * testsuite/gcc.dg/div64-unwinding.c: ... this.
> >
> > libgcc/
> > * Makefile.in [!LIB2_DIVMOD_EXCEPTION_FLAGS]
> > (LIB2_DIVMOD_EXCEPTION_FLAGS): Replace `-fexceptions
> > -fnon-call-exceptions' with `-fasynchronous-unwind-tables'.
> > ---
> > Hi,
> >
> >  No change from v2 except for the removal of the ARM parts; hence no need
> > to retest.  OK to apply?
> >
> >   Maciej
> >
> > Changes from v2:
> >
> > - Removal of the ARM overrides removed.
> >
> > Changes from v1:
> >
> > - ChangeLog entries added.
> > ---
> >  gcc/testsuite/gcc.dg/div64-unwinding.c |   25 
> > +
> >  gcc/testsuite/gcc.target/arm/div64-unwinding.c |   25 
> > -
> >  libgcc/Makefile.in |2 +-
> >  3 files changed, 26 insertions(+), 26 deletions(-)
> >
> > gcc-libgcc-divmod-asynchronous-unwind-tables.diff
> > Index: gcc/gcc/testsuite/gcc.dg/div64-unwinding.c
> > ===
> > --- /dev/null
> > +++ gcc/gcc/testsuite/gcc.dg/div64-unwinding.c
> > @@ -0,0 +1,25 @@
> > +/* Performing a 64-bit division should not pull in the unwinder.  */
> > +
> > +/* { dg-do run { target { { ! *-*-linux* } && { ! *-*-uclinux* } } } } */
> > +/* { dg-skip-if "load causes weak symbol resolution" { vxworks_kernel } } 
> > */
> > +/* { dg-options "-O0" } */
> > +
> > +#include 
> > +
> > +long long
> > +foo (long long c, long long d)
> > +{
> > +  return c/d;
> > +}
> > +
> > +long long x = 0;
> > +long long y = 1;
> > +
> > +extern int (*_Unwind_RaiseException) (void *) __attribute__((weak));
> > +
> > +int main(void)
> > +{
> > +  if (&_Unwind_RaiseException != NULL)
> > +abort ();;
> > +  return foo (x, y);
> > +}
> > Index: gcc/gcc/testsuite/gcc.target/arm/div64-unwinding.c
> > ===
> > --- gcc.orig/gcc/testsuite/gcc.target/arm/div64-unwinding.c
> > +++ /dev/null
> > @@ -1,25 +0,0 @@
> > -/* Performing a 64-bit division should not pull in the unwinder.  */
> > -
> > -/* { dg-do run { target { { ! *-*-linux* } && { ! *-*-uclinux* } } } } */
> > -/* { dg-skip-if "load causes weak symbol resolution" { vxworks_kernel } } 
> > */
> > -/* { dg-options "-O0" } */
> > -
> > -#include 
> > -
> > -long long
> > -foo (long long c, long long d)
> > -{
> > -  return c/d;

Re: [PATCH 1/5] Don't enable -gvariable-location-views by default for DWARF5.

2020-08-25 Thread Mark Wielaard
Hi Alex,

On Tue, 2020-08-25 at 01:05 -0300, Alexandre Oliva wrote:
> it would seem to
> make more sense to adopt and promote the proposed extension,
> implemented in =incompat5 in GCC, but it would need some
> implementation work in consumers to at least ignore the extension.

Is that what is described in:
http://www.fsfla.org/~lxoliva/papers/sfn/dwarf6-sfn-lvu.txt

Does it match the proposal in:
http://www.dwarfstd.org/ShowIssue.php?issue=170427.1

Should we try to introduce your extending loclists proposal:
http://www.dwarfstd.org/ShowIssue.php?issue=170427.2

The main issue is that there is no formal way of extending the
loclists.

Thanks,

Mark


Re: [PATCH 2/5] Make sure that static data member constexpr isn't optimized away in test.

2020-08-25 Thread Mark Wielaard
Hi,

On Mon, 2020-08-24 at 17:38 -0400, Jason Merrill wrote:
> > This looks incorrect to me, that is a workaround for a real GCC bug.
> > 
> > Shouldn't we instead do something like (untested) following patch?
> > I mean, for DWARF < 5 the static data members were using DW_TAG_member,
> > which has been always marked by the function, so IMHO we should also always
> > mark the DW_TAG_variables at the class scope that replaced those.
> 
> The earlier behavior seems like an accident, that happened because we 
> always need to emit information about non-static data members.  I don't 
> think we should take it as guidance.

Maybe the reason they got emitted this way was an accident on the GCC
side. But I don't think it is an accident on the GDB side. At least
looking at the various gdb testcases, the intention is to show a
struct/class type as defined to the user, which includes both the
static and non-static data members of a class.

> In this case one reason we don't emit debug info is because (before 
> C++17) there's no definition of 'b'.  After C++17 the in-class 
> declaration of 'b' is a definition, but we don't have to give it a 
> symbol, so there's still nothing for the debug info to describe.

But don't we describe other parts of a type that don't have a symbol as
part of the debug info?

> This issue doesn't seem specific to class members; it also affects 
> namespace-scope C++17 inline variables:
> 
> inline const int var = 42;
> int main() { return var; }
> 
> Compiling this testcase with -g doesn't emit any debug info for 'var' 
> even though it's used.

That is new in GCC11, don't have GCC10 here to compare against, but
GCC9 produces:

DWARF section [27] '.debug_info' at offset 0x30b5:
 [Offset]
 Compilation unit at offset 0:
 Version: 5, Abbreviation section offset: 0, Address size: 8, Offset size: 4
 Unit type: compile (1)
 [ c]  compile_unit abbrev: 1
   producer (strp) "GNU C++17 9.3.1 20200408 (Red Hat 
9.3.1-2) -mtune=generic -march=x86-64 -gdwarf-5 -O2 -std=gnu++17"
   language (data1) C_plus_plus_14 (33)
   name (strp) "p.cc"
   comp_dir (strp) "/tmp"
   ranges   (sec_offset) range list [ c]
   low_pc   (addr) 00
   stmt_list(sec_offset) 0
 [2a]variable abbrev: 2
 name (string) "var"
 decl_file(data1) p.cc (1)
 decl_line(data1) 1
 decl_column  (data1) 18
 type (ref4) [3f]
 external (flag_present) yes
 inline   (data1) declared_inlined (3)
 const_value  (data1) 42
 [38]base_typeabbrev: 3
 byte_size(data1) 4
 encoding (data1) signed (5)
 name (string) "int"
 [3f]const_type   abbrev: 4
 type (ref4) [38]
 [44]subprogram   abbrev: 5
 external (flag_present) yes
 name (strp) "main"
 decl_file(data1) p.cc (1)
 decl_line(data1) 2
 decl_column  (data1) 5
 type (ref4) [38]
 low_pc   (addr) 0x00401050 
 high_pc  (data8) 6 (0x00401056 <_start>)
 frame_base   (exprloc) 
  [ 0] call_frame_cfa
 call_all_calls   (flag_present) yes

> Should we assume that if a variable with DW_AT_const_value is TREE_USED, 
> we need to write out debug info for it?

I would say yes. If it is used a user might want to look up its value.

Cheers,

Mark


RE: [PATCH PR96357][GCC][AArch64]: could not split insn UNSPEC_COND_FSUB with AArch64 SVE

2020-08-25 Thread Przemyslaw Wirkus
Hi Richard,

Thank you for your comments.
I've attached updated  patch with changes reflecting your comments.

Kind regards,
Przemyslaw

> -Original Message-
> From: Richard Sandiford 
> Sent: 19 August 2020 11:32
> To: Przemyslaw Wirkus 
> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH PR96357][GCC][AArch64]: could not split insn
> UNSPEC_COND_FSUB with AArch64 SVE
> 
> Przemyslaw Wirkus  writes:
> > Hi,
> >
> > Problem is related to that operand 4 (In original pattern
> > *cond_sub_any_const) is no longer the same as operand 1, and so
> > the pattern doesn't match the split condition.
> >
> > Pattern *cond_sub_any_const is being split by this patch into
> > two separate patterns:
> > * Pattern *cond_sub_relaxed_const now matches const_int
> >   SVE_RELAXED_GP operand.
> > * Pattern *cond_sub_strict_const now matches const_int
> >   SVE_STRICT_GP operand.
> > * Remove aarch64_sve_pred_dominates_p condition from both patterns.
> 
> Thanks for doing this.
> 
> > @@ -5271,6 +5270,43 @@ (define_insn_and_rewrite
> "*cond_sub_any_const"
> >[(set_attr "movprfx" "yes")]
> >  )
> >
> > +;; Predicated floating-point subtraction from a constant, merging
> > +with an ;; independent value.
> 
> The previous pattern had the same comment.  Maybe add:
> 
>   The subtraction predicate and the merge predicate are allowed to be
>   different.
> 
> to the relaxed one and:
> 
>   The subtraction predicate and the merge predicate must be the same.
> 
> to this one.
> 
> > +(define_insn_and_rewrite "*cond_sub_strict_const"
> > +  [(set (match_operand:SVE_FULL_F 0 "register_operand" "=w, w, ?w")
> > +   (unspec:SVE_FULL_F
> > + [(match_operand: 1 "register_operand" "Upl, Upl, Upl")
> > +  (unspec:SVE_FULL_F
> > +[(match_dup 1)
> > + (const_int SVE_STRICT_GP)
> > + (match_operand:SVE_FULL_F 2
> "aarch64_sve_float_arith_immediate")
> > + (match_operand:SVE_FULL_F 3 "register_operand" "w, w, w")]
> > +UNSPEC_COND_FSUB)
> > +  (match_operand:SVE_FULL_F 4 "aarch64_simd_reg_or_zero" "Dz, 0,
> w")]
> > + UNSPEC_SEL))]
> > +  "TARGET_SVE
> > +   && !rtx_equal_p (operands[3], operands[4])"
> 
> Very minor, but the file generally puts conditions on a single line if 
> they'll fit.
> Same for the relaxed version.
> 
> > +  "@
> > +
> movprfx\t%0., %1/z, %3.\;fsubr\t%0., %1/m, %0.<
> Vetype>, #%2
> > +
> movprfx\t%0., %1/m, %3.\;fsubr\t%0., %1/m, %0.
> , #%2
> > +   #"
> > +  "&& 1"
> > +  {
> > +if (reload_completed
> > +&& register_operand (operands[4], mode)
> > +&& !rtx_equal_p (operands[0], operands[4]))
> > +  {
> > +   emit_insn (gen_vcond_mask_ (operands[0],
> operands[3],
> > +operands[4], operands[1]));
> > +   operands[4] = operands[3] = operands[0];
> > +  }
> > +else if (!rtx_equal_p (operands[1], operands[5]))
> > +  operands[5] = copy_rtx (operands[1]);
> 
> The last two lines are a hold-over from the relaxed version, where there were
> two predicates.  There's no operand 5 in this pattern, so we should just 
> delete
> the lines.
> 
> Thanks,
> Richard


rb13404.patch
Description: rb13404.patch


Re: [PATCH] introduce attribute exalias

2020-08-25 Thread Alexandre Oliva
On Aug 15, 2020, Iain Sandoe  wrote:

> Alexandre Oliva  wrote:

>> I'm pretty sure setting multiple labels at the same address is used
>> and relied on quite often.

> That’s what’s currently disallowed (the assemblers all support .set).

I understand you mean it's disallowed for global labels.  I meant it was
often used for local labels.


> For function aliases, I think there’s a simple work-around and it’s just a
> question of time for me to make a patch etc.

Yeah, one possibility that comes to mind is to output additional text
symbols (akin to PLT entries) that just jump to the intended target.

> for general aliases to public symbols including data, not so easy.

*nod*

As far as I'm concerned, function aliases as above would be enough to
address the issue I was asked to address.  The RTTI aliases are only
used to import and catch C++ exception in Ada.  I suppose a mnemonic
would be just as welcome, but if it's not available, the mangled name
will have to do.


> Actually, I was thinking about folks who like template metaprogramming
> (not personally a fan) - and how they would arrange to get automatic
> export information to track that meta-progamming.

There's no (current) way to import C++ templates as Ada generics; the
best you can do is to import C++ template instantiations/specializations
as Ada records, procedures and functions.  This, and the lack of a
symbolic representation of generics, has driven me to introduce
user-chosen named aliases for specializations, rather than to the
generics.

I've considered enabling symbolic mnemonic template names to be
associated with templates, say:

template 
struct foo {
  static void __attribute__((__exalias__("FOO_%_%_static_method")))
  static_method () {}
};

replacing % and % with the mangling for the template arguments,
but that would bring us back the problem of varying mangled names as
mnemonic types like u64 get resolved to the mangling of their
target-dependent language types, defeating the purpose of referencing
cross-platform symbol names from Ada.

> Thinking aloud  - not thought through in any detail - I wonder if the
> facilities of
> C++20 modules are sufficient?

I'm really not sure what issue you're thinking of solving.

The one I'm working on is that of enabling the use of a uniform string
in Ada import statements (and also in alias("targets")), even when a
symbolic type that does not mangle uniformly is in use (think int64_t
mapping to long or long long).  Having per-target source files is quite
cumbersome in Ada, and there isn't a preprocessor to rely on for
conditionals and token pasting and whatnot.

I'm afraid I don't see how C++ modules could any offer in this regard.


>> Hmm, thanks, I will make sure there's some more verbose failure mode if
>> we can't find a way for something akin to an alias to be usable there.

> I imagine it will be easy to fix a diagnostic output.

*nod*

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer


Re: [committed] doc: Switch valgrind.com to https

2020-08-25 Thread Martin Liška

Hey Gerald.

I noticed your continual effort to change http:// links to https://.
I've written a simple script that can do that in automatic way.

What do you think about using it?
Martin
#!/usr/bin/env python3

import os
import re
import requests

urls = set()

def check_url(url):
try:
r = requests.get(url, timeout = 5)
print(r)
return True
except requests.exceptions.ConnectionError as e:
print(e)
return False

for root, dirs, files in os.walk('.'):
for f in files:
if f.endswith('.html'):
full = os.path.join(root, f)
for l in open(full).readlines():
m = re.search(r'"(http:\/\/[^"]*)"', l)
if m:
urls.add(m.group(1))

urls = list(sorted(list(urls)))
to_replace = []

for i, url in enumerate(urls):
https = url.replace('http://', 'https://')
print('%d/%d: %s' % (i, len(urls), https))
if (check_url(https)):
to_replace.append(url)

print()
print('Can use HTTPS for %d of links:' % (len(to_replace)))
for url in to_replace:
print(url)
0/329: https://CobolForGCC.sourceforge.net/
HTTPSConnectionPool(host='cobolforgcc.sourceforge.net', port=443): Max retries 
exceeded with url: / (Caused by 
NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection 
refused'))
1/329: https://annwm.lbl.gov/bench/
HTTPSConnectionPool(host='annwm.lbl.gov', port=443): Max retries exceeded with 
url: /bench/ (Caused by ConnectTimeoutError(, 'Connection to annwm.lbl.gov timed out. (connect 
timeout=5)'))
2/329: https://archive.adaic.com/standards/83lrm/html/ada_lrm.html
HTTPSConnectionPool(host='archive.adaic.com', port=443): Max retries exceeded 
with url: /standards/83lrm/html/ada_lrm.html (Caused by 
ConnectTimeoutError(, 'Connection to archive.adaic.com timed out. (connect 
timeout=5)'))
3/329: https://archive.adaic.com/standards/83rat/html/Welcome.html
HTTPSConnectionPool(host='archive.adaic.com', port=443): Max retries exceeded 
with url: /standards/83rat/html/Welcome.html (Caused by 
ConnectTimeoutError(, 'Connection to archive.adaic.com timed out. (connect 
timeout=5)'))
4/329: https://awards.acm.org/about/2015-technical-awards

5/329: https://blackfin.uclinux.org/gf/
HTTPSConnectionPool(host='blackfin.uclinux.org', port=443): Max retries 
exceeded with url: /gf/ (Caused by SSLError(SSLCertVerificationError("hostname 
'blackfin.uclinux.org' doesn't match either of '*.azurewebsites.net', 
'*.scm.azurewebsites.net', '*.azure-mobile.net', '*.scm.azure-mobile.net', 
'*.sso.azurewebsites.net'")))
6/329: https://buildbot.net/

7/329: https://c-faq.com/
HTTPSConnectionPool(host='c-faq.com', port=443): Max retries exceeded with url: 
/ (Caused by SSLError(SSLCertVerificationError("hostname 'c-faq.com' doesn't 
match either of '*.eskimo.com', 'eskimo.com'")))
8/329: https://chasm-interop.sourceforge.net/
HTTPSConnectionPool(host='chasm-interop.sourceforge.net', port=443): Max 
retries exceeded with url: / (Caused by 
NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection 
refused'))
9/329: https://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.1922

10/329: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.7180

11/329: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.2235

12/329: https://cobolforgcc.sourceforge.net/cobol_toc.html
HTTPSConnectionPool(host='cobolforgcc.sourceforge.net', port=443): Max retries 
exceeded with url: /cobol_toc.html (Caused by 
NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection 
refused'))
13/329: https://compilerconnection.com

14/329: https://deuce.doc.wustl.edu/Download.html

15/329: https://developer.axis.com/
HTTPSConnectionPool(host='developer.axis.com', port=443): Max retries exceeded 
with url: / (Caused by SSLError(SSLCertVerificationError("hostname 
'developer.axis.com' doesn't match either of 'www.axis.com', 'axis.com', 
'beta.www.axis.com'")))
16/329: https://developer.classpath.org/doc/
HTTPSConnectionPool(host='developer.classpath.org', port=443): Max retries 
exceeded with url: /doc/ (Caused by 
NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection 
refused'))
17/329: https://developer.classpath.org/mediation/ClasspathGraphicsImagesText
HTTPSConnectionPool(host='developer.classpath.org', port=443): Max retries 
exceeded with url: /mediation/ClasspathGraphicsImagesText (Caused by 
NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection 
refused'))
18/329: https://dwarfstd.org/
HTTPSConnectionPool(host='dwarfstd.org', port=443): Max retries exceeded with 
url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: 
CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate 
(_ssl.c:1123)')))
19/329: https://exactcode.com/opensource/openbench/

20/329: https://fastjar.sourceforge.net/
HTTPSConnectionPool(host='fastjar.sourceforge.net', port=443): Max retries 

Re: [PATCH] introduce attribute exalias

2020-08-25 Thread Alexandre Oliva
On Aug 15, 2020, Nathan Sidwell  wrote:

> 'alias' is also now a confusing term, because of the concept of 
> object-aliasing.

True, but it's also an established attribute name.  It seems thus
desirable to bring the conceptual framework of the alias attribute to
mind through the name of the attribute used in this new feature, but
with a twist that distinguishes it from the original attribute in a
meaningful way.


> The existing alias attribute is defined as:

>> The @code{alias} attribute causes the declaration to be emitted as an alias
>> for another symbol, which must have been previously declared with the same
>> type, and for variables, also the same size and alignment.  Declaring an 
>> alias
>> with a different type than the target is undefined and may be diagnosed.  As
>> an example, the following declarations:

> I.e. it is creating a declaration that is aliased to some other symbol
> (which has to also be emitted by the same TU due to the usual elf-like
> object file semantics).  Notice it says nothing about emitting a
> *symbol*.

It's an implied expectation that declarations of language entities get
symbols associated with them, and that attribute alias causes the symbol
name associated with one declaration to refer to the same entity denoted
by the named symbol, instead of introducing a separate entity.

> The new attribute is emitting a symbol that equates the declaration it
> is attached to (i.e. the other way round).

I.e., now we have a single declaration of a language entity, and we wish
additional symbol names to be associated with it.

> Its intent is to allow code written in another language to refer to
> this definition.  I imagine you'd commonly use the foreign language's
> mangling for the string provided.

What I have in mind are mnemonic, user-chosen names, rather than
machine-mangled ones.


> If we spell it 'X', consider:

> [[gnu::X ("other")]] int i;

> Most commonly, the assembly emitted would contain:
>   .globl other
>   .equiv other, i

> so, perhaps we should spell it 'equiv'?  That's using an existing term.

Uhh, my turn to find the term meaningless.  Not just because on my
machines, aliases use .set rather than .equiv.  It's missing about as
much context as 'aka' and 'nickname' in the relationship to alias.

The problem we face is that alias is a symmetric relationship, in that
if X aliases Y, then Y aliases X, but attribute alias is not symmetric:
its attribute must be placed in an undefined declaration, naming the
already-defined entity.

It would have made just as much sense to make it work backwards, namely,
attaching the attribute to the defined entity, naming any other
undefined declarations that ought to refer to the same entity.  That's
pretty much the attribute I propose.

Thus clearly, though "alias" is symmetric, our use thereof isn't.  Our
implementation uses the phrase "alias target" to refer to the
already-defined entity that a declaration holding an alias attribute
should alias; the declaration holding the alias attribute is referred to
as an alias.

Since attribute alias is already taken, it would thus make sense to look
for opposites of alias target to denote the new attribute.  Alias source
and alias origin are probably the most natural opposites of alias
target, but they don't suggest to me what we're looking for.  Alias lead
might work.  Alias bead might, too.

OTOH, since the alias attribute is associated with the alias
declaration, and it names an alias target; the opposite of that, with an
attribute in the alias target declaration, would best have the attribute
named alias_target (that's an attribute of the declaration, after all),
and then the named symbol would be the alias.  This would be
surprisingly consistent with the current use of attribute alias:

int attribute((alias_target("Y"))) X; // X is the alias target for Y
int attribute((alias("Y"))) Z; // Z is an alias for Y


Now, if we were to use "equiv", it would make sense to think of the
current alias attribute as "equiv_to" / "alias_target".


Another possibility that occurs to me is to reuse the existing attribute
alias, but naming the equivalent symbol as a second parameter to
attribute alias, suggesting a relation alias(X,Y), building on the
existing alias(X) in which the Y is implied, and introducing a variant
in which it is the X that is implied.  Alas, an explicit placeholder is
needed in this case.

Hmm, maybe such a two-argument variant of alias could be made more
readable by requiring the *second* argument to be number 2:

int attribute((alias("Y", 2))) X;

reading as imperative "alias Y to X", whereas existing uses:

int attribute((alias("Y"))) Z;

read as imperative "alias Y Z", nonsense, but taken as "alias Z to Y",
perhaps by analogy with how "give Y Z" is taken as "give Z to Y".


/me jokingly suggests inhalias and exhalias ;-)


This is enough bikeshedding for me ;-)

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU 

Re: [PATCH] Fix libstdc++ testsuite to handle VxWorks gthreads implementation

2020-08-25 Thread Jonathan Wakely via Gcc-patches

On 24/08/20 23:19 -0300, Alexandre Oliva wrote:

On Aug 24, 2020, Jonathan Wakely  wrote:


OK for trunk, thanks.



Given the approval and the lack of significant changes, I'll put this in
unless there are objections in the next 48 hours.  Thanks for the review!



Thanks. There's a new FAIL due to a bad merge.


Erhm...  Weird, I don't recall any manual adjustments to
packaged_task/cons/alloc.cc.  Indeed, the changes look exactly the same
that Corentin had proposed before.  Maybe the constraints were already
there, but hte unintended effects were not noticeable before the change
to the default?


You're right, it's not a bad merge (just a bad change I should have
noticed in the review :-)

My change to prevent it running for C++17 was done May 2019 with
9a0af7e3fb425ae2c0e044d044feb81ef493ce2c so that was already there
before Corentin's original patch.


Anyway, thanks for fixing the failure.  That said, I confess I don't get
how this test will (or should) ever be run in C++14 mode, being given an
explicit -std=gnu++11.  Is this really the right way to go about it?


The medium term plan is to run libstdc++ tests with several different
-std options, as is done for g++ tests. At that point we can drop most
(but not all) explicit -std options in the tests. But until then, this
test won't run at all if we don't put an explicit -std there, because
it gets skipped with the default -std=gnu++17.

Testing with -std=gnu++11 is better than not testing at all, and the
code being tested is identical for C++11 and C++14 so it's not a
problem that it isn't tested for C++14.

Now and then I manually remove all -std options from the tests and run
them with multiple -std permutations, so it does get tested with C++14
mode, just not automatically.



Re: [PATCH 1/5] Don't enable -gvariable-location-views by default for DWARF5.

2020-08-25 Thread Richard Biener via Gcc-patches
On Tue, Aug 25, 2020 at 6:05 AM Alexandre Oliva  wrote:
>
> On Aug 24, 2020, Jakub Jelinek  wrote:
>
> > On Mon, Aug 24, 2020 at 02:56:54PM +0200, Mark Wielaard wrote:
> >> DWARF5 makes it possible to read loclists tables without consulting
> >> the debuginfo tree by introducing a table header. Adding location views
> >> breaks this (at least for binutils and elfutils). So don't enable
> >> variable-location-views by default if DWARF5 or higher is selected.
>
> > This should be discussed with Alex, CCed.
> > I'd say elfutils/binutils should just show .debug_loclists independent of
> > .debug_info if there are no loc views and use .debug_info otherwise.
>
> I've suggested before that it made sense to me to start emitting
> locviews when there were concrete plans to implement support for them in
> debug info consumers.
>
> Without such plans, it would make more sense to just disable them
> altogether.
>
> Now, if there are any such plans, it is disabling them for the default
> debug format that doesn't make much sense to me; it would seem to make
> more sense to adopt and promote the proposed extension, implemented in
> =incompat5 in GCC, but it would need some implementation work in
> consumers to at least ignore the extension.
>
>
> Red Hat has had me involved in these efforts for over a decade, but I'm
> not aware of its plans any more, and I don't know of anyone else driving
> the implementation of locviews in consumers, so, given the little I
> know, this patch seems like a timid step in a reasonable direction:
> outputting locviews is no use if there are no consumers for it, more so
> when they actively disturb existing standard-compliant consumers.

But then leaving it enabled by default but not with -gdwarf-5 looks odd
and backward to me as well.  If the consensus is there's no use then
please disable them by default everywhere instead.

Richard.

> --
> Alexandre Oliva, happy hacker
> https://FSFLA.org/blogs/lxo/
> Free Software Activist
> GNU Toolchain Engineer


Re: [PATCH] gimple: Ignore *0 = {CLOBBER} in path isolation [PR96722]

2020-08-25 Thread Richard Biener
On Mon, 24 Aug 2020, Jakub Jelinek wrote:

> Hi!
> 
> Clobbers of MEM_REF with NULL address are just fancy nops, something we just
> ignore and don't emit any code for it (ditto for other clobbers), they just
> mark end of life on something, so we shouldn't infer from those that there
> is some UB.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2020-08-24  Jakub Jelinek  
> 
>   PR tree-optimization/96722
>   * gimple.c (infer_nonnull_range): Formatting fix.
>   (infer_nonnull_range_by_dereference): Return false for clobber stmts.
> 
>   * g++.dg/opt/pr96722.C: New test.
> 
> --- gcc/gimple.c.jj   2020-08-03 22:54:51.417531677 +0200
> +++ gcc/gimple.c  2020-08-24 13:23:22.082312349 +0200
> @@ -2917,8 +2917,8 @@ check_loadstore (gimple *, tree op, tree
>  bool
>  infer_nonnull_range (gimple *stmt, tree op)
>  {
> -  return infer_nonnull_range_by_dereference (stmt, op)
> -|| infer_nonnull_range_by_attribute (stmt, op);
> +  return (infer_nonnull_range_by_dereference (stmt, op)
> +   || infer_nonnull_range_by_attribute (stmt, op));
>  }
>  
>  /* Return true if OP can be inferred to be non-NULL after STMT
> @@ -2930,7 +2930,8 @@ infer_nonnull_range_by_dereference (gimp
>   non-NULL if -fdelete-null-pointer-checks is enabled.  */
>if (!flag_delete_null_pointer_checks
>|| !POINTER_TYPE_P (TREE_TYPE (op))
> -  || gimple_code (stmt) == GIMPLE_ASM)
> +  || gimple_code (stmt) == GIMPLE_ASM
> +  || gimple_clobber_p (stmt))
>  return false;
>  
>if (walk_stmt_load_store_ops (stmt, (void *)op,
> --- gcc/testsuite/g++.dg/opt/pr96722.C.jj 2020-08-24 13:24:45.357132323 
> +0200
> +++ gcc/testsuite/g++.dg/opt/pr96722.C2020-08-24 13:25:06.224836626 
> +0200
> @@ -0,0 +1,20 @@
> +// PR tree-optimization/96722
> +// { dg-do run }
> +// { dg-options "-O2" }
> +
> +struct S { int s; ~S () {} };
> +
> +void
> +foo (S *a)
> +{
> +  if (a)
> +return;
> +  a->~S ();
> +}
> +
> +int
> +main ()
> +{
> +  S s;
> +  foo ();
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] strlen: Fix handle_builtin_string_cmp [PR96758]

2020-08-25 Thread Richard Biener
On Mon, 24 Aug 2020, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled, because handle_builtin_string_cmp
> sees a strncmp call with constant last argument 4, where one of the strings
> has an upper bound of 5 bytes (due to it being an array of that size) and
> the other has a known string length of 1 and the result is used only in
> equality comparison.
> It is folded into __builtin_strncmp_eq (str1, str2, 4), which is
> incorrect, because that means reading 4 bytes from both strings and
> comparing that.  When one of the strings has known strlen of 1, we want to
> compare just 2 bytes, not 4, as strncmp shouldn't compare any bytes beyond
> the null.
> So, the last argument to __builtin_strncmp_eq should be the minimum of the
> provided strncmp last argument and the known string length + 1 (assuming
> the other string has only a known upper bound due to array size).
> 
> Besides that, I've noticed the code has been written with the intent to also
> support the case where we know exact string length of both strings (but not
> the string content, so we can't compute it at compile time).  In that case,
> both cstlen1 and cstlen2 are non-negative and both arysiz1 and arysiz2 are
> negative.  We wouldn't optimize that, cmpsiz would be either the strncmp
> last argument, or for strcmp the first string length, but varsiz would be
> -1 and thus cmpsiz would be never < varsiz.  The patch fixes it by using the
> correct length, in that case using the minimum of the two and for strncmp
> also the last argument.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> For backporting, I think we should keep the
> -  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
> +  if ((varsiz < 0 || cmpsiz < varsiz) && used_only_for_zero_equality (lhs))
> change outso that we don't introduce a new optimization and just fix the
> bug.

Agreed.

Richard.

> 2020-08-24  Jakub Jelinek  
> 
>   PR tree-optimization/96758
>   * tree-ssa-strlen.c (handle_builtin_string_cmp): If both cstlen1
>   and cstlen2 are set, set cmpsiz to their minimum, otherwise use the
>   one that is set.  If bound is used and smaller than cmpsiz, set cmpsiz
>   to bound.  If both cstlen1 and cstlen2 are set, perform the 
> optimization.
> 
>   * gcc.dg/strcmpopt_12.c: New test.
> 
> --- gcc/tree-ssa-strlen.c.jj  2020-07-28 15:39:10.078755265 +0200
> +++ gcc/tree-ssa-strlen.c 2020-08-24 11:31:05.457832003 +0200
> @@ -4485,11 +4485,19 @@ handle_builtin_string_cmp (gimple_stmt_i
>  ++cstlen2;
>  
>/* The exact number of characters to compare.  */
> -  HOST_WIDE_INT cmpsiz = bound < 0 ? cstlen1 < 0 ? cstlen2 : cstlen1 : bound;
> +  HOST_WIDE_INT cmpsiz;
> +  if (cstlen1 >= 0 && cstlen2 >= 0)
> +cmpsiz = MIN (cstlen1, cstlen2);
> +  else if (cstlen1 >= 0)
> +cmpsiz = cstlen1;
> +  else
> +cmpsiz = cstlen2;
> +  if (bound >= 0)
> +cmpsiz = MIN (cmpsiz, bound);
>/* The size of the array in which the unknown string is stored.  */
>HOST_WIDE_INT varsiz = arysiz1 < 0 ? arysiz2 : arysiz1;
>  
> -  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
> +  if ((varsiz < 0 || cmpsiz < varsiz) && used_only_for_zero_equality (lhs))
>  {
>/* If the known length is less than the size of the other array
>and the strcmp result is only used to test equality to zero,
> --- gcc/testsuite/gcc.dg/strcmpopt_12.c.jj2020-08-24 11:36:21.380357549 
> +0200
> +++ gcc/testsuite/gcc.dg/strcmpopt_12.c   2020-08-24 11:36:35.405158915 
> +0200
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/96758 */
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +int v = 1;
> +
> +int
> +main ()
> +{
> +  const char *s = v ? "a" : "b";
> +  char x[5];
> +  char y[5] = "a\0a";
> +  __builtin_memcpy (x, y, sizeof (y));
> +  if (__builtin_strncmp (x, s, 4) != 0)
> +__builtin_abort ();
> +  return 0;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[PATCH] debug/96690 - mangle symbols eventually used by late dwarf output

2020-08-25 Thread Richard Biener
The following makes sure to, at early debug generation time, mangle
symbols we eventually end up outputting during late finish.

[LTO] Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed
to trunk sofar.

Richard.

2020-08-24  Richard Biener  

PR debug/96690
* dwarf2out.c (reference_to_unused): Make FUNCTION_DECL
processing more consistent with respect to
symtab->global_info_ready.
(tree_add_const_value_attribute): Unconditionally call
rtl_for_decl_init to do all mangling early but throw
away the result if early_dwarf.

* g++.dg/lto/pr96690_0.C: New testcase.
---
 gcc/dwarf2out.c  | 15 +++
 gcc/testsuite/g++.dg/lto/pr96690_0.C | 17 +
 2 files changed, 24 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lto/pr96690_0.C

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 9deca031fc2..6b03ff85f99 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -19756,7 +19756,7 @@ reference_to_unused (tree * tp, int * walk_subtrees,
   /* ???  The C++ FE emits debug information for using decls, so
  putting gcc_unreachable here falls over.  See PR31899.  For now
  be conservative.  */
-  else if (!symtab->global_info_ready && VAR_OR_FUNCTION_DECL_P (*tp))
+  else if (!symtab->global_info_ready && VAR_P (*tp))
 return *tp;
   else if (VAR_P (*tp))
 {
@@ -19771,7 +19771,7 @@ reference_to_unused (tree * tp, int * walk_subtrees,
  optimizing and gimplifying the CU by now.
 So if *TP has no call graph node associated
 to it, it means *TP will not be emitted.  */
-  if (!cgraph_node::get (*tp))
+  if (!symtab->global_info_ready || !cgraph_node::get (*tp))
return *tp;
 }
   else if (TREE_CODE (*tp) == STRING_CST && !TREE_ASM_WRITTEN (*tp))
@@ -20295,12 +20295,11 @@ tree_add_const_value_attribute (dw_die_ref die, tree 
t)
  return true;
}
 }
-  if (! early_dwarf)
-{
-  rtl = rtl_for_decl_init (init, type);
-  if (rtl)
-   return add_const_value_attribute (die, rtl);
-}
+  /* Generate the RTL even if early_dwarf to force mangling of all refered to
+ symbols.  */
+  rtl = rtl_for_decl_init (init, type);
+  if (rtl && !early_dwarf)
+return add_const_value_attribute (die, rtl);
   /* If the host and target are sane, try harder.  */
   if (CHAR_BIT == 8 && BITS_PER_UNIT == 8
   && initializer_constant_valid_p (init, type))
diff --git a/gcc/testsuite/g++.dg/lto/pr96690_0.C 
b/gcc/testsuite/g++.dg/lto/pr96690_0.C
new file mode 100644
index 000..c49327720fc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr96690_0.C
@@ -0,0 +1,17 @@
+// { dg-lto-do assemble }
+// { dg-lto-options { { -flto -ffat-lto-objects -g } } }
+struct A { A (int); };
+template  class B { T f; };
+unsigned char *foo (int *, bool *, const int &);
+template  struct C {};
+struct D { B > d; };
+struct E { D e; };
+struct F {};
+struct G { static int bar (A, F, E, int); };
+
+void
+baz ()
+{
+  F f;
+  G::bar (0, f, E (), 0);
+}
-- 
2.26.2


Re: [PATCH 0/6] Parallelize Intra-Procedural Optimizations using the LTO Engine.

2020-08-25 Thread Richard Biener via Gcc-patches
On Mon, Aug 24, 2020 at 8:39 PM Giuliano Belinassi via Gcc-patches
 wrote:
>
> Ho, Josh.
>
> On 08/24, Josh Triplett wrote:
> > On Sat, Aug 22, 2020 at 06:04:48PM -0300, Giuliano Belinassi wrote:
> > > Hi, Josh
> > >
> > > On 08/21, Josh Triplett wrote:
> > > > On Thu, Aug 20, 2020 at 07:00:13PM -0300, Giuliano Belinassi wrote:
> > > > > This patch series add a new flag "-fparallel-jobs=" to control if the
> > > > > compiler should try to compile the current file in parallel.
> > > > [...]
> > > > > Bootstrapped and Regtested on Linux x86_64.
> > > > >
> > > > > Giuliano Belinassi (6):
> > > > >   Modify gcc driver for parallel compilation
> > > > >   Implement a new partitioner for parallel compilation
> > > > >   Implement fork-based parallelism engine
> > > > >   Add `+' for Jobserver Integration
> > > > >   Add invoke documentation
> > > > >   New tests for parallel compilation feature
> > > >
> > > > Very nice!
> > >
> > > Thank you for your interest in this :)
> > >
> > > >
> > > > I'm interested in testing this on a highly parallel system. What
> > > > baseline do these patches apply to?  They don't seem to apply to GCC
> > > > trunk.
> > >
> > > Hummm, this was supposed to work on trunk out of the box. However,
> > > there is a high probability that I messed up something while rebasing.
> > > I will post a version 2 of it when I get more comments and when I fix
> > > the Makefile issue that Joseph pointed out in other e-mail.
> > >
> > > If you want to test it on a high parallel system, I think it will be
> > > cool to see how it behaves also when --param=promote-statics=1, as it
> > > increases parallelism opportunity. :)
> >
> > I plan to try several variations, including that.
> >
> > I'd like to see how it affects the performance of Linux kernel builds.
>
> Well, I expect little to no impact on that.  I ran an experiment back
> on 2018 looking for parallelism bottleneck in Kernel, and what I found
> was that the developers did a good job on balancing the file sizes.
>
> This was run on a machine with 4x AMD Opteron CPUs, (64 cores in total)
> https://www.ime.usp.br/~belinass/64cores-kernel-experiment.svg
>
> As you can see from this image, the jobs ends almost at the same time.
>
> >
> > > > Also, I tried to bootstrap the current tip of the devel/autopar_devel
> > > > branch, but ended up with compiler segfaults that all look like this:
> > > > ../../gcc/zlib/compress.c:86:1: internal compiler error: Segmentation 
> > > > fault
> > > >86 | }
> > > >   | ^
> > >
> > > Well, there was once a bug in this branch when compiling with -flto that
> > > caused the assembler output file not to be properly initialized early
> > > enough, resulting in LTO LGEN stage writing into a invalid FILE pointer.
> > > I fixed this during rebasing but I forgot to push to the autopar_devel
> > > branch. In any case, I just pushed the recent changes to autopar_devel
> > > which fix this issue.
> >
> > That might explain the problem; I had tried to build gcc with the
> > bootstrap-lto configuration.
> >
> > > In any case, -fparallel-jobs= should NOT be used together with -flto.
> > > Although I used part of the LTO engine for development of this feature,
> > > they are meant for distinct things. I guess I should give a warning
> > > about that in next version :)
> >
> > Interesting. Is that something that could change in the future? I'd like
> > to be able to get some parallelism when creating the object files, and
> > then more parallelism when doing the final LTO link.
>
> Well, if by "final LTO link" you mean LTO's Whole Program Analysis,
> that is a quite challenging task to parallelize :)
>
> As for the "creating object files", you mean the LTO LGEN, I think
> it is not possible for now because -- as far as I understeand --, LTO
> object files are just containers for a intermediate language and
> does not support partial linking.

It was designed to allow partially linked LTO IR files but IIRC support
for this may have been rotten a bit.  But since most of the compile time
for LTO LGEN is spent in the frontends parsing the code (and that's
incredibly hard if not impossible to parallelize), splitting this task
is not going to bring much improvements.

> However, I would not expect LGEN bottlenecking compilation of any
> project. Most compilation time is spent in optimization, that is
> IPA and Intra-Procedural.

Indeed.

Btw, you can "emulate" what -fparallel-jobs=N does via

> gcc -c t.c -o t.il.o -flto -fno-fat-lto-objects
> gcc -o t.o t.il.o -r -flinker-output=nolto-rel -flto=N

with the twist that the partitioning done by the LTO link step
might not be exactly the same as the one done by -fparallel-jobs
(surprisingly we needed a different partitioner).  What -fparallel-jobs
improves over the manual -flto way above is that it completely
elides LTO IR streaming but otherwise it operates in the same
manner.

There's a regression with -fparallel-jobs when you use -g which
we still need to address since with 

Re: [PATCH 1/6] Modify gcc driver for parallel compilation

2020-08-25 Thread Richard Biener via Gcc-patches
On Mon, Aug 24, 2020 at 8:06 PM Giuliano Belinassi
 wrote:
>
> Hi, Richi.
>
> On 08/24, Richard Biener wrote:
> > On Fri, Aug 21, 2020 at 12:00 AM Giuliano Belinassi
> >  wrote:
> > >
> > > Update the driver for parallel compilation. This process work as
> > > follows:
> > >
> > > When calling gcc, the driver will check if the flag
> > > "-fparallel-jobs" was provided by the user. If yes, then we will
> > > check what is the desired output, and if it can be parallelized.
> > > There are the following cases, which is described:
> > >
> > > 1. -S or -E was provided: We can't run in parallel, as the output
> > >can not be easily merged together into one file.
> > >
> > > 2. -c was provided: When cc1* forks into multiple processes, it
> > >must tell the driver where it stored its generated assembler files.
> > >Therefore we pass a hidden "-fsplit-outputs=filename" to the compiler,
> > >and we check if "filename" was created by it. If yes, we open it,
> > >call assembler for each generated asm file
> > >(this file must not be empty), and link them together with
> > >partial linking to a single .o file. This process is done for each
> > >object file in the argument list.
> > >
> > > 3. -c was not provided, and the final product will be an binary: Here
> > >we proceed exactly as 2., but we avoid doing the partial
> > >linking, feeding the generated object files directly into the final 
> > > link.
> > >
> > > For that to work, we had to heavily modify how the "execute" function
> > > works, extracting common code which is used multiple times, and
> > > also detecting when the command is a call to a compiler or an
> > > assembler, as can be seen in append_split_outputs.
> > >
> > > Finally, we added some tests which reflects all cases found when
> > > bootstrapping the compiler, so development of further features to the
> > > driver get faster for now on.
> >
> > Few comments inline, Joseph may want to comment on the overall
> > structure as driver maintainer (CCed).
> >
> > I know I asked for the changes on the branch to be squashed but
> > the diff below is quite unreadable with the ChangeLog not helping
> > the overall refactoring much.  Is it possible to do some of the
> > factoring/refactoring without any functionality change to make the
> > actual diff easier to follow?
>
> Well, the refactoring is necessary, otherwise I would need to copy and
> paste a really huge amount of code.
>
> What I can do (and sounds reasonable to me) is to break this patch into
> two parts; one with just refactoring changes, and the other adding the
> parallelism engine.
>
> >
> > Thanks,
> > Richard.
> >
> > > gcc/ChangeLog
> > > 2020-08-20  Giuliano Belinassi  
> > >
> > > * common.opt (fsplit-outputs): New flag.
> > > (fparallel-jobs): New flag.
> > > * gcc.c (extra_arg_storer): New class.
> > > (have_S): New variable.
> > > (struct command): Move from execute.
> > > (is_compiler): New function.
> > > (is_assembler): New function.
> > > (get_number_of_args): New function.
> > > (get_file_by_lines): New function.
> > > (identify_asm_file): New function.
> > > (struct infile): New attribute temp_additional_asm.
> > > (current_infile): New variable.
> > > (get_path_to_ld): New function.
> > > (has_hidden_E): New function.
> > > (sort_asm_files): New function.
> > > (append_split_outputs): New function.
> > > (print_command): New function.
> > > (print_commands): New function.
> > > (print_argbuf): New function.
> > > (handle_verbose): Extracted from execute.
> > > (append_valgrind): Same as above.
> > > (async_launch_commands): Same as above.
> > > (await_commands_to_finish): Same as above.
> > > (split_commands): Same as above.
> > > (parse_argbuf): Same as above.
> > > (execute): Refator.
> > > (fsplit_arg): New function.
> > > (alloc_infile): Initialize infiles with 0.
> > > (process_command): Remember when -S was passed.
> > > (do_spec_on_infiles): Remember current infile being processed.
> > > (maybe_run_linker): Replace object files when -o is a executable.
> > > (finalize): Deinitialize temp_object_files.
> > >
> > > gcc/testsuite/ChangeLog:
> > > 20-08-2020  Giuliano Belinassi  
> > >
> > > * driver/driver.exp: New test.
> > > * driver/a.c: New file.
> > > * driver/b.c: New file.
> > > * driver/empty.c: New file.
> > > * driver/foo.c: New file.
> > > ---
> > >  gcc/common.opt  |4 +
> > >  gcc/gcc.c   | 1219 ---
> > >  gcc/testsuite/driver/a.c|6 +
> > >  gcc/testsuite/driver/b.c|6 +
> > >  gcc/testsuite/driver/driver.exp |   80 ++
> > >  gcc/testsuite/driver/empty.c|0
> > >  

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-25 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 24, 2020 at 10:43 PM Qing Zhao  wrote:
>
>
>
> > On Aug 24, 2020, at 3:20 PM, Segher Boessenkool 
> >  wrote:
> >
> > Hi!
> >
> > On Mon, Aug 24, 2020 at 01:02:03PM -0500, Qing Zhao wrote:
> >>> On Aug 24, 2020, at 12:49 PM, Segher Boessenkool 
> >>>  wrote:
> >>> On Wed, Aug 19, 2020 at 06:27:45PM -0500, Qing Zhao wrote:
> > On Aug 19, 2020, at 5:57 PM, Segher Boessenkool 
> >  wrote:
> > Numbers on how expensive this is (for what arch, in code size and in
> > execution time) would be useful.  If it is so expensive that no one will
> > use it, it helps security at most none at all :-(
> >>>
> >>> Without numbers on this, no one can determine if it is a good tradeoff
> >>> for them.  And we (the GCC people) cannot know if it will be useful for
> >>> enough users that it will be worth the effort for us.  Which is why I
> >>> keep hammering on this point.
> >> I can collect some run-time overhead data on this, do you have a 
> >> recommendation on what test suite I can use
> >> For this testing? (Is CPU2017 good enough)?
> >
> > I would use something more real-life, not 12 small pieces of code.
>
> Then, what kind of real-life benchmark you are suggesting?
>
> >
> >>> (The other side of the coin is how much this helps prevent exploitation;
> >>> numbers on that would be good to see, too.)
> >>
> >> This can be well showed from the paper:
> >>
> >> "Clean the Scratch Registers: A Way to Mitigate Return-Oriented 
> >> Programming Attacks"
> >>
> >> https://urldefense.com/v3/__https://ieeexplore.ieee.org/document/8445132__;!!GqivPVa7Brio!JbdLvo54xB3ORTeZqpy_PwZsL9drNLaKjbg14bTKMOwxt8LWnjZ8gJWlqtlrFKPh$
> >>   
> >>  >>  >
> >>
> >> Please take a look at this paper.
> >
> > As I told you before, that isn't open information, I cannot reply to
> > any of that.
>
> A little confused here, what’s you mean by “open information”? Is the 
> information in a published paper not open information?

No, because it is behind a paywall.

Uros.


[PATCH] Fortran : ICE for division by zero in declaration PR95882

2020-08-25 Thread Mark Eggleston

Second attempt, this time with the correct attachment.

OK to commit and backport?

[PATCH] Fortran  : ICE for division by zero in declaration PR95882

A length expression containing a divide by zero in a character
declaration will result in an ICE if the constant is anymore
complicated that a contant divided by a constant.

The cause was that char_len_param_value can return MATCH_YES
even if a divide by zero was seen.  Prior to returning check
whether a divide by zero was seen and if so set it to MATCH_ERROR.

2020-08-24  Mark Eggleston 

gcc/fortran

    PR fortran/95882
    * decl.c (char_len_param_value): Check gfc_seen_div0 and
    it is set return MATCH_ERROR.

2020-08-24  Mark Eggleston 

gcc/testsuite/

    PR fortran/95882
    * gfortran.dg/pr95882_1.f90: New test.
    * gfortran.dg/pr95882_2.f90: New test.
    * gfortran.dg/pr95882_3.f90: New test.
    * gfortran.dg/pr95882_4.f90: New test.
    * gfortran.dg/pr95882_5.f90: New test.

--
https://www.codethink.co.uk/privacy.html

>From e97963ec8edc58217d2ff225c58256ebd61c8e7c Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Fri, 21 Aug 2020 06:39:30 +0100
Subject: [PATCH] Fortran  : ICE for division by zero in declaration PR95882

A length expression containing a divide by zero in a character
declaration will result in an ICE if the constant is anymore
complicated that a contant divided by a constant.

The cause was that char_len_param_value can return MATCH_YES
even if a divide by zero was seen.  Prior to returning check
whether a divide by zero was seen and if so set it to MATCH_ERROR.

2020-08-24  Mark Eggleston  

gcc/fortran

	PR fortran/95882
	* decl.c (char_len_param_value): Check gfc_seen_div0 and
	it is set return MATCH_ERROR.

2020-08-24  Mark Eggleston  

gcc/testsuite/

	PR fortran/95882
	* gfortran.dg/pr95882_1.f90: New test.
	* gfortran.dg/pr95882_2.f90: New test.
	* gfortran.dg/pr95882_3.f90: New test.
	* gfortran.dg/pr95882_4.f90: New test.
	* gfortran.dg/pr95882_5.f90: New test.
---
 gcc/fortran/decl.c  | 3 +++
 gcc/testsuite/gfortran.dg/pr95882_1.f90 | 8 
 gcc/testsuite/gfortran.dg/pr95882_2.f90 | 6 ++
 gcc/testsuite/gfortran.dg/pr95882_3.f90 | 6 ++
 gcc/testsuite/gfortran.dg/pr95882_4.f90 | 7 +++
 gcc/testsuite/gfortran.dg/pr95882_5.f90 | 6 ++
 6 files changed, 36 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/pr95882_1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr95882_2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr95882_3.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr95882_4.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr95882_5.f90

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index d854b2a0307..c612b492f3e 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -1146,6 +1146,9 @@ char_len_param_value (gfc_expr **expr, bool *deferred)
   gfc_free_expr (e);
 }
 
+  if (gfc_seen_div0)
+m = MATCH_ERROR;
+
   return m;
 
 syntax:
diff --git a/gcc/testsuite/gfortran.dg/pr95882_1.f90 b/gcc/testsuite/gfortran.dg/pr95882_1.f90
new file mode 100644
index 000..c254bddf494
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95882_1.f90
@@ -0,0 +1,8 @@
+! { dg-do compile }
+
+module m
+   type t
+  character(((0)/0)) :: c  ! { dg-error "Division by zero" }
+   end type
+end
+
diff --git a/gcc/testsuite/gfortran.dg/pr95882_2.f90 b/gcc/testsuite/gfortran.dg/pr95882_2.f90
new file mode 100644
index 000..d308f0c3181
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95882_2.f90
@@ -0,0 +1,6 @@
+! { dg-do compile }
+
+module m
+   character(0/(0)) :: c = '123456789'  ! { dg-error "Division by zero" }
+end
+
diff --git a/gcc/testsuite/gfortran.dg/pr95882_3.f90 b/gcc/testsuite/gfortran.dg/pr95882_3.f90
new file mode 100644
index 000..bd849135480
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95882_3.f90
@@ -0,0 +1,6 @@
+! { dg-do compile }
+
+subroutine s(c)
+   character(((0)/0)) :: c  ! { dg-error "Division by zero" }
+end
+
diff --git a/gcc/testsuite/gfortran.dg/pr95882_4.f90 b/gcc/testsuite/gfortran.dg/pr95882_4.f90
new file mode 100644
index 000..52892d32b8b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95882_4.f90
@@ -0,0 +1,7 @@
+! { dg-do compile }
+
+program p
+   character(((0)/0)) :: c  ! { dg-error "Division by zero" }
+   common /x/ c
+end
+
diff --git a/gcc/testsuite/gfortran.dg/pr95882_5.f90 b/gcc/testsuite/gfortran.dg/pr95882_5.f90
new file mode 100644
index 000..dcdf5304052
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95882_5.f90
@@ -0,0 +1,6 @@
+! { dg-do compile }
+
+program p
+   character(0/(0)) :: c = '123456789'  ! { dg-error "Division by zero" }
+   common c
+end
-- 
2.11.0



Re: [PATCH] Fortran : ICE for division by zero in declaration PR95882

2020-08-25 Thread Mark Eggleston



On 25/08/2020 07:13, Mark Eggleston wrote:


On 24/08/2020 17:42, Thomas Koenig wrote:

Hi Mark,


OK to commit and backport?


The test cases mentioned in the ChangeLog are not in the
patch, instead there is the test case for PR 96624.

Could you correct that?

Whoops, yes I'll fix that.

It is actually the wrong attachment, I'll try again.


Best regards

Thomas


--
https://www.codethink.co.uk/privacy.html



Re: [PATCH] Fortran : ICE for division by zero in declaration PR95882

2020-08-25 Thread Mark Eggleston



On 24/08/2020 17:42, Thomas Koenig wrote:

Hi Mark,


OK to commit and backport?


The test cases mentioned in the ChangeLog are not in the
patch, instead there is the test case for PR 96624.

Could you correct that?

Whoops, yes I'll fix that.


Best regards

Thomas


--
https://www.codethink.co.uk/privacy.html