Re: [PATCH] libgcc: allocate extra space for morestack expansion

2021-02-18 Thread Rain Mark via Gcc-patches
On Wed, Feb 10, 2021 at 06:36:07AM -0800, Ian Lance Taylor wrote:
> Rain Mark  writes:
> 
> > After enabling -fsplit-stack, dynamic stack expansion of the coroutine
> > is realized, but calling functions without -fsplit-stack will directly
> > expand the space by 1M, which is too wasteful for a system with a large
> > number of coroutines running under the 128K stack size. We hope to give
> > users more control over the size of stack expansion to adapt to
> > more complex scenarios.
> >
> > Apply for more space each time the stack is expanded, which
> > can also reduce the frequency of morestack being called.
> > When calling the non-split function, we do not need additional
> > checks and apply for 1M space.  At this time, we can turn off
> > the conversion of linker to __morestack_non_split.  This is more friendly
> > to a large number of small stack coroutines running below 128K.
> >
> > Later we need to add an option to the gold linker to turn off
> > the __morestack_non_split conversion.
> 
> Why is the new variable thread local?
> 
> At first glance it seems like this might make more sense as a compiler
> option or compiler function attribute, rather than as a function to
> call.  When would you want to dynamically change the value?
> 
> Ian

Thank you very much, sorry for the late reply.

The compiler option is a better way, but considering that
multiple libraries provide different coroutine implementations.
A system may introduce multiple stackfull coroutine implementations.
Setting the thread local variable is mainly to reset this variable when the 
coroutine is switched.
Avoid this parameter conflict in multiple coroutine implementations.

Thanks
Rain


[PATCH v2 2/2] MIPS: add builtime option for -mcompact-branches

2021-02-18 Thread YunQiang Su
For R6+ target, it allows to configure gcc to use compact branches only.
---
 gcc/config.gcc   | 12 +++-
 gcc/doc/install.texi | 19 +++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 17fea83b2e4..047f5631067 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4743,7 +4743,7 @@ case "${target}" in
;;
 
mips*-*-*)
-   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1 
madd4"
+   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1 
madd4 compact-branches"
 
case ${with_float} in
"" | soft | hard)
@@ -4896,6 +4896,16 @@ case "${target}" in
exit 1
;;
esac
+
+   case ${with_compact_branches} in
+   never | always | optimal)
+   with_compact_branches=${with_compact_branches}
+   ;;
+   *)
+   echo "Unknown compact-branches policy used in 
--with-compact-branches" 1>&2
+   exit 1
+   ;;
+   esac
;;
 
nds32*-*-*)
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 4c38244ae58..865630826c6 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1464,6 +1464,25 @@ systems that support conditional traps).
 Division by zero checks use the break instruction.
 @end table
 
+@item --with-compact-branches=@var{policy}
+Specify how the compiler should generate code for checking for
+division by zero.  This option is only supported on the MIPS target.
+The possibilities for @var{type} are:
+@table @code
+@item optimal
+Cause a delay slot branch to be used if one is available in the
+current ISA and the delay slot is successfully filled. If the delay slot
+is not filled, a compact branch will be chosen if one is available.
+@item never
+Ensures that compact branch instructions will never be generated.
+@item always
+Ensures that a compact branch instruction will be generated if available.
+If a compact branch instruction is not available,
+a delay slot form of the branch will be used instead.
+This option is supported from MIPS Release 6 onwards.
+For pre-R6, this option is just same as never/optimal.
+@end table
+
 @c If you make --with-llsc the default for additional targets,
 @c update the --with-llsc description in the MIPS section below.
 
-- 
2.20.1



[PATCH v2 1/2] MIPS: Not trigger error for pre-R6 and -mcompact-branches=always

2021-02-18 Thread YunQiang Su
For MIPSr6, we may wish to use compact-branches only.
Currently, we have to use `always' option, while it is mark as conflict
with pre-R6.
  cc1: error: unsupported combination: ‘mips32r2’ -mcompact-branches=always
Just ignore -mcompact-branches=always for pre-R6.

This patch also defines
__mips_compact_branches_never
__mips_compact_branches_always
__mips_compact_branches_optimal
predefined macros
---
 gcc/config/mips/mips.c|  8 +--
 gcc/config/mips/mips.h| 22 ---
 gcc/doc/invoke.texi   |  1 +
 .../gcc.target/mips/compact-branches-1.c  |  2 +-
 .../gcc.target/mips/compact-branches-8.c  | 10 +
 .../gcc.target/mips/compact-branches-9.c  | 10 +
 gcc/testsuite/gcc.target/mips/mips.exp|  4 +---
 7 files changed, 38 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/compact-branches-8.c
 create mode 100644 gcc/testsuite/gcc.target/mips/compact-branches-9.c

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 8bd2d29552e..9a75dd61031 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -20107,13 +20107,7 @@ mips_option_override (void)
   target_flags |= MASK_ODD_SPREG;
 }
 
-  if (!ISA_HAS_COMPACT_BRANCHES && mips_cb == MIPS_CB_ALWAYS)
-{
-  error ("unsupported combination: %qs%s %s",
- mips_arch_info->name, TARGET_MICROMIPS ? " -mmicromips" : "",
- "-mcompact-branches=always");
-}
-  else if (!ISA_HAS_DELAY_SLOTS && mips_cb == MIPS_CB_NEVER)
+  if (!ISA_HAS_DELAY_SLOTS && mips_cb == MIPS_CB_NEVER)
 {
   error ("unsupported combination: %qs%s %s",
  mips_arch_info->name, TARGET_MICROMIPS ? " -mmicromips" : "",
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index b4a60a55d80..b8399fe1b0d 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -103,11 +103,9 @@ struct mips_cpu_info {
 #define TARGET_RTP_PIC (TARGET_VXWORKS_RTP && flag_pic)
 
 /* Compact branches must not be used if the user either selects the
-   'never' policy or the 'optimal' policy on a core that lacks
+   'never' policy or the 'optimal' / 'always' policy on a core that lacks
compact branch instructions.  */
-#define TARGET_CB_NEVER (mips_cb == MIPS_CB_NEVER  \
-|| (mips_cb == MIPS_CB_OPTIMAL \
-&& !ISA_HAS_COMPACT_BRANCHES))
+#define TARGET_CB_NEVER (mips_cb == MIPS_CB_NEVER || !ISA_HAS_COMPACT_BRANCHES)
 
 /* Compact branches may be used if the user either selects the
'always' policy or the 'optimal' policy on a core that supports
@@ -117,10 +115,11 @@ struct mips_cpu_info {
 && ISA_HAS_COMPACT_BRANCHES))
 
 /* Compact branches must always be generated if the user selects
-   the 'always' policy or the 'optimal' policy om a core that
-   lacks delay slot branch instructions.  */
-#define TARGET_CB_ALWAYS (mips_cb == MIPS_CB_ALWAYS\
-|| (mips_cb == MIPS_CB_OPTIMAL \
+   the 'always' policy on a core support compact branches,
+   or the 'optimal' policy on a core that lacks delay slot branch 
instructions.  */
+#define TARGET_CB_ALWAYS ((mips_cb == MIPS_CB_ALWAYS \
+&& ISA_HAS_COMPACT_BRANCHES) \
+|| (mips_cb == MIPS_CB_OPTIMAL   \
 && !ISA_HAS_DELAY_SLOTS))
 
 /* Special handling for JRC that exists in microMIPSR3 as well as R6
@@ -655,6 +654,13 @@ struct mips_cpu_info {
builtin_define ("__mips_no_lxc1_sxc1"); \
   if (!ISA_HAS_UNFUSED_MADD4 && !ISA_HAS_FUSED_MADD4)  \
builtin_define ("__mips_no_madd4"); \
+   \
+  if (TARGET_CB_NEVER) \
+   builtin_define ("__mips_compact_branches_never");   \
+  else if (TARGET_CB_ALWAYS)   \
+   builtin_define ("__mips_compact_branches_always");  \
+  else \
+   builtin_define ("__mips_compact_branches_optimal"); \
 }  \
   while (0)
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c00514a6306..63f9b85d7ab 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -25227,6 +25227,7 @@ instruction is not available, a delay slot form of the 
branch will be
 used instead.
 
 This option is supported from MIPS Release 6 onwards.
+If it is used for pre-R6 target, it will be just ignored.
 
 The @option{-mcompact-branches=optimal} option will cause a delay slot
 branch to be used if one is available in the current ISA and the delay
diff --git a/gcc/testsuite/gcc.target/mips/compact-branches-1.c 

Re: [PATCH 2/3] MIPS: add builtime option for -mcompact-branches

2021-02-18 Thread YunQiang Su
Maciej W. Rozycki  于2021年2月17日周三 上午3:45写道:
>
> On Tue, 16 Feb 2021, Jeff Law via Gcc-patches wrote:
>
> > > diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> > > index 4c38244ae58..6b9520569ba 100644
> > > --- a/gcc/doc/install.texi
> > > +++ b/gcc/doc/install.texi
> > > @@ -1464,6 +1464,29 @@ systems that support conditional traps).
> > >  Division by zero checks use the break instruction.
> > >  @end table
> > >
> > > +@item --with-compact-branches=@var{policy}
> > > +Specify how the compiler should generate code for checking for
> > > +division by zero.  This option is only supported on the MIPS target.
> > > +The possibilities for @var{type} are:
> > Is this really correct -- I would expect that the change affects
> > branches in general, not just checks for division by zero.
>
>  I wonder if we need to multiply the options here at all.  The original
> change:  was
> discussed here: 
> in this respect:
>
> On Mon, 17 Aug 2015, Matthew Fortune wrote:
>
> > > > Compact branches are used based on a branch policy. The polices are:
> > > >
> > > > never: Only use delay slot branches
> > > > optimal: Do whatever is best for the current architecture.  This will
> > > >  generally mean that delay slot branches will be used if the 
> > > > delay
> > > >  slot gets filled but otherwise a compact branch will be used. A
> > > >  special case here is that JAL and J will not be used in R6 code
> > > >  regardless of whether the delay slot could be filled.
> > > > always: Never emit a delay slot form of a branch if a compact form 
> > > > exists.
> > > > This policy cannot apply 100% as FP branches (and MSA branches 
> > > > when
> > > > committed) only have delay slot forms.
> > > >
> > > > These user choices are combined with the features available in the 
> > > > chosen
> > > > architecture and, in particular, the optimal form will get handled like
> > > > 'never' when there are no compact branches available and will get 
> > > > handled
> > > > like 'always' when there are no delay slot branches available.
> > > >
> > >
> > > Why did you choose to make this a user-selectable option?  Why not always 
> > > generated
> > > optimal?
> > > I don't have a strong opinion about it, but the options seem to 
> > > complicate things and I'm
> > > interested in your rationale.
> >
> > This is down to micro-architecture decisions that different implementers 
> > may make.
> > Honestly, I have not absorbed all of the rationale behind choosing one form 
> > over
> > the other but our arch team have made enough comments about this to mean 
> > the support
> > in the compiler is worth the extra bit of effort. I can attempt a write-up 
> > of some
> > of the pipeline possibilities if you would like more detail but I'd 
> > probably have to
> > refresh my mind on this with our hardware teams.
>
>  My understanding therefore is that the original assumption that `optimal'
> will serve people best is no longer true.
>

I guess that `optimal' can still produce the best performance, while
the delay slot
make MIPS quite differnent with other architectures.
And the hardware engineers seems hate it also.

And we expect that MIPS can have as few as possible differnece delta
with other major architectures,
to ultily all of new framworks of community.

>  First, I think it would be good if we knew why.  I find proliferating
> variants of defaults, especially for the less obvious cases, will cause
> user confusion as one won't know what code model to expect, especially as
> (please correct me if I am wrong) we don't actually provide a way to dump
> the compiler's overridable configuration defaults.
>

So, should we provide a predefined macro for it?

>  Second, I wonder if it makes sense to just keep things simple, and rather
> than adding `prefer' (to stand for "`always' if available"), and possibly
> `avoid' (to stand for "`never' if available"), whether we shouldn't just
> relax the checks for `always' and `never', and let them through whether
> the architecture selected provides for the option chosen or not.
>

relax the `always' is what I would like to do first.
But I was afread to break some complatiable.

>  Please note that in the discussion quoted Catherine has already raised a
> concern I agree with of adding a complication here, and now we would
> complicate it even further by not only adding a fourth choice, but another
> overridable configuration default as well.

I am still concern about whether we should just set the `always' as default.
My short team plan is to set it default for Debian r6 Port.
So, at least, I wish that we can provide a buildtime option for other need.

>
>   Maciej


Re: [PATCH v2] libiberty(argv.c): Fix memory leak in expandargv.

2021-02-18 Thread Jeff Law via Gcc-patches



On 2/18/21 5:08 AM, Ayush Mittal via Gcc-patches wrote:
> libiberty/ChangeLog:
>
>   * argv.c (expandargv): free allocated buffer if read fails.
I went ahead and committed this, even though we're in stage4 as other
projects use libiberty and it's a trivial enough change that they
shouldn't have to wait for the GCC development cycle to re-enter stage1.

Thanks,
Jeff



Re: [pushed] c++: Tuple of self-dependent classes [PR96926]

2021-02-18 Thread Jason Merrill via Gcc-patches

On 2/18/21 9:22 PM, Jason Merrill wrote:

When compiling this testcase, trying to resolve the initialization for the
tuple member ends up recursively considering the same set of tuple
constructor overloads, and since two of them separately depend on
is_constructible, the one we try second fails to instantiate
is_constructible because we're still in the middle of instantiating it the
first time.

Fixed by implementing an optimization that someone suggested we were already
doing: if we see a non-template candidate that is a perfect match for all
arguments, we can skip considering template candidates at all.  It would be
enough to do this only when LOOKUP_DEFAULTED, but it shouldn't hurt in other
cases.
>From d909ead68214042e9876a8df136d0835273d4b86 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Thu, 18 Feb 2021 21:27:37 -0500
Subject: [PATCH] c++: Tweak PR969626 patch
To: gcc-patches@gcc.gnu.org

It occurred to me that other types of conversions use rvaluedness_matches_p,
but those uses don't affect overload resolution, so we shouldn't look at the
flag for them.  Fixing that made decltype64.C compile successfully, because
the non-template candidate was a perfect match, so we now wouldn't consider
the broken template.  Changing the argument to const& makes it no longer a
perfect match (because of the added const), so we again get the infinite
recursion.

This illustrates the limited nature of this optimization/recursion break; it
works for most copy/move constructors because the constructor we're looking
for is almost always a perfect match.  If it happens to help improve compile
time for other calls, that's just a bonus.

gcc/cp/ChangeLog:

	PR c++/96926
	* call.c (perfect_conversion_p): Limit rvalueness
	test to reference bindings.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/decltype64.C: Change argument to const&.
---
 gcc/cp/call.c   | 14 --
 gcc/testsuite/g++.dg/cpp0x/decltype64.C |  2 +-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index bc369c68c5a..0ba0e19ae08 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -5864,12 +5864,14 @@ perfect_conversion_p (conversion *conv)
 {
   if (CONVERSION_RANK (conv) != cr_identity)
 return false;
-  if (!conv->rvaluedness_matches_p)
-return false;
-  if (conv->kind == ck_ref_bind
-  && !same_type_p (TREE_TYPE (conv->type),
-		   next_conversion (conv)->type))
-return false;
+  if (conv->kind == ck_ref_bind)
+{
+  if (!conv->rvaluedness_matches_p)
+	return false;
+  if (!same_type_p (TREE_TYPE (conv->type),
+			next_conversion (conv)->type))
+	return false;
+}
   return true;
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype64.C b/gcc/testsuite/g++.dg/cpp0x/decltype64.C
index 46d18594c94..0cd614cceeb 100644
--- a/gcc/testsuite/g++.dg/cpp0x/decltype64.C
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype64.C
@@ -5,7 +5,7 @@ template
 struct index
 {};
 
-constexpr int recursive_impl(index<0u>)
+constexpr int recursive_impl(const index<0u>&)
 {
   return 0;
 }
-- 
2.27.0



Re: [PATCH] improve warning suppression for inlined functions (PR 98465, 98512)

2021-02-18 Thread Jeff Law via Gcc-patches



On 1/19/21 11:58 AM, Martin Sebor via Gcc-patches wrote:
> std::string tends to trigger a class of false positive out of bounds
> access warnings for code GCC cannot prove is unreachable because of
> missing aliasing constrains, and that ends up expanded inline into
> user code.  Simply inserting the contents of a constant char array
> does that.  In GCC 10 these false positives are suppressed due to
> -Wno-system-headers, but in GCC 11, to help detect calls rendered
> invalid by user code passing in either incorrect or insufficiently
> constrained arguments, -Wno-system-header no longer has this effect
> on invalid access warnings.
>
> To solve the problem without at least partially reverting the change
> and going back to the GCC 10 way of things for the affected subset
> of calls (just memcpy and memmove), the attached patch enhances
> the #pragma GCC diagnostic machinery to consider not just a single
> location for inlined code but all locations at which an expression
> and its callers are inlined all the way up the stack.  This gives
> each author of a function involved in inlining the ability to
> control a warning issued for the code, not just the user into whose
> code all the calls end up inlined.  To resolve PR 98465, it lets us
> suppress the false positives selectively in std::string rather
> than across the board in GCC.
>
> The solution is to provide a new pair of overloads for warning
> functions that, instead of taking a single location argument, take
> a tree node from which the location(s) are determined.  The tree
> argument is indirect because the diagnostic machinery doesn't (and
> cannot without more intrusive changes) at the moment depend on
> the various tree definitions.  A nice feature of these overloads
> is that they do away with the need for the %K directive (and in
> the future also %G, with another enhancement to accept a gimple*
> argument).
>
> This patch depends on the fix for PR 98664 (already approved but
> not yet checked in).  I've tested it on x86_64-linux.
>
> To avoid fallout I tried to keep the changes to a minimum, and
> so the design isn't as robust as I'd like it ultimately to be.
> I plan to enhance it in stage 1.
I'd lean towards deferring to gcc12 stage1 given the libstdc++ hack is
in place.  That does mean that glibc will need to work around the
instance they've stumbled over in ppc's rawmemchr.

If there are other BZs where this would help our ability to "cleanly"
suppress diagnosics with our pragmas, then we probably should link them
together with 98512/98465 and defer them all to gcc-12.

Jeff



Re: [PATCH] clear VLA bounds from all arguments (PR 97172)

2021-02-18 Thread Jeff Law via Gcc-patches



On 2/18/21 7:23 PM, Martin Sebor wrote:
> The fix for PR 97172 that removes non-constant VLA bounds from
> attribute access is incomplete: it inadvertently removes the bounds
> corresponding to just the first VLA argument, and not from subsequent
> arguments.
>
> The attached change removes the vestigial condition that causes this
> bug.  Since it's behind a build failure in a Fedora package (cava?)
> I'll commit it under the "obvious" clause tomorrow if there are no
> concerns.
>
> Martin
>
>
> gcc-97172-2.diff
>
> PR c/97172 - ICE: tree code 'ssa_name' is not supported in LTO streams
>
> gcc/ChangeLog:
>
>   * attribs.c (init_attr_rdwr_indices): Guard vblist use.
>   (attr_access::free_lang_data): Remove a spurious test.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.dg/pr97172.c: Add test cases.
OK. Thanks.

Jeff



Re: [PATCH] PR 99133, Mark xxspltiw, xxspltidp, and xxsplti32x as being prefixed

2021-02-18 Thread Michael Meissner via Gcc-patches
On Wed, Feb 17, 2021 at 06:09:39PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Feb 17, 2021 at 12:17:30PM -0500, Michael Meissner wrote:
> > I noticed that the power10 xxspltiw, xxspltidp, and xxsplti32dx
> > instructions are not flagged as prefixed instructions, which means the
> > instruction length is not set to 12 bytes.  This patch sets these
> > instructions to be prefixed.  It also ensures that a leading 'p' is not
> > emitted before the instruction.
> > 
> > I checked this patch by doing a bootstrap build/check on a little endian 
> > power8
> > server system.  There were no regressions.
> 
> Why test a p10 patch on a p8?

Well it is just a code gen patch, so I can test it on any system, even an
x86_64 with a cross compiler.  Given that right now xxsplatiw is only created
via a built-in, The main consequence of not setting the prefixed attribute is
that the insn length is wrong.  Normal functions probably won't even notice,
but it is certainly possible that some function is large enough and it uses
xxsplatiw (and friends) and the assembler would complain that conditional jumps
are too far.

At the moment given it is only generated as a built-in function, I doubt people
would run into it.  As we add support in GCC 12 to use these instructions to
load up constants into the vector registers, it is more likely that people will
run into issues if the length is wrong.

> > In addition, I debugged cc1, and I
> > put a breakpoint on the get_attr_length function and I verified the insns 
> > now
> > have length 12.
> 
> You can just use -dp; the generated assembler output has lines like
>   pla 3,.LC0@pcrel # 6[c=4 l=12]  *pcrel_local_addr
> (c is cost, l is length).
> 
>   fprintf (asm_out_file, "[c=%d",
>insn_cost (debug_insn, optimize_insn_for_speed_p ()));
>   if (HAVE_ATTR_length)
> fprintf (asm_out_file, " l=%d",
>  get_attr_length (debug_insn));
>   fprintf (asm_out_file, "]  ");

Sure, but it is just as simple to verify it with a debugger.

> > +   Some prefixed instructions (xxspltiw, xxspltidp, xxsplti32dp, etc.) do 
> > not
> > +   have a leading 'p'.  Setting the prefix attribute to special does not 
> > the 'p'
> > +   prefix.  */
> 
> (grammar)
> 
> "special" is the *normal* case.  *Most* prefixed insns will be like
> this.  They aren't right now, but they will be.
> 
> It sounds like you should make a new attribute, "always_prefixed" or
> something, that then the code that sets "prefixed" can use.

Yes agreed.

> >  void
> >  rs6000_final_prescan_insn (rtx_insn *insn, rtx [], int)
> >  {
> > -  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
> > +  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO
> > + && get_attr_prefixed (insn) != PREFIXED_SPECIAL);
> >return;
> >  }
> 
> So you set next_insn_prefixed_p when exactly?  The original code was
> correct, as far as I can see?

rs6000_final_prescan_insn is called in the FINAL_PRESCAN_INSN target hook, and
it is the only place we can set the flag for ASM_OUTPUT_OPCODE to print the
initial 'p'.  As you know, during the development of prefixed support, I went
through various different methods of generating the prefixed instruction
(i.e. pld instead of ld).  The current method works for normal load, store and
add instructions that have a normal form and a prefixed form.  But it doesn't
work for other instructions that we need to start dealing with. 
> 
> There are four kinds of insns now:
> 
> 1) Never prefixed.
> 2) Always prefixed, like xxspltiw but many more in the future.
> 3) Sometimes prefixed, and they get a "p" mnemonic prefix then.  Those
> are the majority currently, since we mostly have load/store insns that
> are prefixed currently, and those usually have a non-prefixed form we
> handled already.
> 4) Sometimes prefixed, and they get a "pm" prefix then.  We don't handle
> those yet (those are the MMA GER insns).
> 
> The "prefixed" attribute should just mean if the instruction ended up as
> some prefixed form (8 bytes).
> 
> So, for insns like xxspltiw you should just set "prefixed" to "yes",
> because that makes sense, is not confusing.  The code that prefixes "p"
> to the mnemonic should change, instead.  It can look at some new
> attribute, but it could also just use
>   prefixed_load_p (insn) || prefixed_store_p (insn)
>   || prefixed_paddi_p (insn)
> or similar (perhaps make a helper function for that then?)

Yes.  Pat Haugen will be taking over the PR.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


RE: [PATCH 1/1] libiberty(argv.c): Fix memory leak in expandargv.

2021-02-18 Thread Maninder Singh via Gcc-patches
Hi Martin,

>> Dynamic memory referenced by 'buffer' was allocated using xmalloc but fails 
>> to free it
>> when jump to 'error' label.
>> 
>> Issue as per static analysis tool.
> 
>The change looks okay to me although I can't approve it.  Since GCC
>is a regression fixing stage, unless the leak is a recent regression
>the fix is (strictly speaking) out of scope for GCC 11.  Either
>a libiberty or a global maintainer might be comfortable approving it 
>regardless.

OK

>That said, rather than adding another call to free, I would suggest
>to consider initializing buffer to null and moving the existing call
>to free the buffer under the error: label.
> 

We thought same, but then it will add un-necessary call to free in case
of earlier 3 goto cases and thus impact code's readability (going for free 
without allocating).

  if (fseek (f, 0L, SEEK_END) == -1)
goto error;
  pos = ftell (f);
  if (pos == -1)
goto error;
  if (fseek (f, 0L, SEEK_SET) == -1)
goto error;

thats why we tried with current change.
Other option was to create one more label in case of free.

Thanks,
Maninder Singh

>Martin
> 
>> 
>> Signed-off-by: Ayush Mittal 
>> Signed-off-by: Maninder Singh 
>> ---
>   libiberty/ChangeLog | 4 
>   libiberty/argv.c| 5 -
>   2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
> index e472553..96cacba 100644
> --- a/libiberty/ChangeLog
> +++ b/libiberty/ChangeLog
> @@ -1,3 +1,7 @@
> +2021-02-18  Ayush Mittal  
> +
> +* argv.c (expandargv): Fix memory leak for buffer allocated to read 
> file contents.
> +
>   2021-02-01  Martin Sebor  
>   
>   * dyn-string.c (dyn_string_insert_cstr): Use memcpy instead of 
> strncpy
> diff --git a/libiberty/argv.c b/libiberty/argv.c
> index cd97f90..7199c7d 100644
> --- a/libiberty/argv.c
> +++ b/libiberty/argv.c
> @@ -442,7 +442,10 @@ expandargv (int *argcp, char ***argvp)
>due to CR/LF->CR translation when reading text files.
>That does not in-and-of itself indicate failure.  */
> && ferror (f))
> -goto error;
> +{
> +  free(buffer);
> +  goto error;
> +}
> /* Add a NUL terminator.  */
> buffer[len] = '\0';
> /* If the file is empty or contains only whitespace, buildargv would
> 
 


Re: [PATCH] rs6000: Use rldimi for vec init instead of shift + ior

2021-02-18 Thread Kewen.Lin via Gcc-patches
Hi Segher & Will,

Thanks for your review comments!

on 2021/2/19 上午2:33, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Feb 03, 2021 at 02:37:05PM +0800, Kewen.Lin wrote:
>> This patch merges the previously approved one[1] and its relied patch
>> made by Segher here[2], it's to make unsigned int vector init go with
>> rldimi to merge two integers instead of shift and ior.
> 
>> gcc/ChangeLog:
>>
>> 2020-02-03  Segher Boessenkool  
>>  Kewen Lin  
>>
>>  * config/rs6000/rs6000.md (*rotl3_insert_3): Renamed to...
>>  (rotl3_insert_3): ...this.
>>  (plus_ior_xor): New code_iterator.
>>  (define_split for GPR rl*imi): New splitter.
>>  * config/rs6000/vsx.md (vsx_init_v4si): Use gen_rotldi3_insert_3
>>  for integer merging.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/powerpc/vec-init-10.c: New test.
> 
> Is there a PR you should mention here?

I thought this is trivial so didn't file one upstream PR.  I did a
searching in upstream bugzilla, PR93453 looks similar but different.
I confirmed that the current patch can't make it pass (just two insns
instead of three insns).

Do you happen to have one related in mind?  If no, I will commit it
without PR.

> 
>> +/* { dg-final { scan-assembler-not "sldi" } } */
>> +/* { dg-final { scan-assembler-not "or" } } */
>> +/* { dg-final { scan-assembler-times {\mrldimi\M} 4 } } */
> 
> /* { dg-final { scan-assembler-not {\msldi\M} } } */
> /* { dg-final { scan-assembler-not {\mor\M} } } */
> /* { dg-final { scan-assembler-times {\mrldimi\M} 4 } } */
> 
> Okay for trunk with that tweak.  Thanks!
> 

Will fix it, thanks!


BR,
Kewen


[committed] jit: fix ICE on BUILT_IN_TRAP [PR99126]

2021-02-18 Thread David Malcolm via Gcc-patches
I tried several approaches to fixing this; this seemed the
least invasive.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7288-gb258e263e0d74ca1f76aeaac5f4d1abef6b13707.

gcc/jit/ChangeLog:
PR jit/99126
* jit-builtins.c
(gcc::jit::builtins_manager::get_builtin_function_by_id):
Update assertion to reject BUILT_IN_NONE.
(gcc::jit::builtins_manager::ensure_optimization_builtins_exist):
New.
* jit-builtins.h
(gcc::jit::builtins_manager::ensure_optimization_builtins_exist):
New decl.
* jit-playback.c (gcc::jit::playback::context::replay): Call it.
Remove redundant conditional on bm.

gcc/testsuite/ChangeLog:
PR jit/99126
* jit.dg/test-trap.c: New test.
---
 gcc/jit/jit-builtins.c   | 14 +++-
 gcc/jit/jit-builtins.h   |  3 ++
 gcc/jit/jit-playback.c   | 11 +++---
 gcc/testsuite/jit.dg/test-trap.c | 59 
 4 files changed, 82 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-trap.c

diff --git a/gcc/jit/jit-builtins.c b/gcc/jit/jit-builtins.c
index 18e477cc907..1ea96f4e025 100644
--- a/gcc/jit/jit-builtins.c
+++ b/gcc/jit/jit-builtins.c
@@ -162,7 +162,7 @@ builtins_manager::get_builtin_function (const char *name)
 recording::function *
 builtins_manager::get_builtin_function_by_id (enum built_in_function 
builtin_id)
 {
-  gcc_assert (builtin_id >= 0);
+  gcc_assert (builtin_id > BUILT_IN_NONE);
   gcc_assert (builtin_id < END_BUILTINS);
 
   /* Lazily build the functions, caching them so that repeated calls for
@@ -600,6 +600,18 @@ builtins_manager::make_ptr_type (enum jit_builtin_type,
   return base_type->get_pointer ();
 }
 
+/* Ensure that builtins that could be needed during optimization
+   get created ahead of time.  */
+
+void
+builtins_manager::ensure_optimization_builtins_exist ()
+{
+  /* build_common_builtin_nodes does most of this, but not all.
+ We can't loop through all of the builtin_data array, we don't
+ support all types yet.  */
+  (void)get_builtin_function_by_id (BUILT_IN_TRAP);
+}
+
 /* Playback support.  */
 
 /* A builtins_manager is associated with a recording::context
diff --git a/gcc/jit/jit-builtins.h b/gcc/jit/jit-builtins.h
index b9f008dd4e2..c5e2b2dd600 100644
--- a/gcc/jit/jit-builtins.h
+++ b/gcc/jit/jit-builtins.h
@@ -127,6 +127,9 @@ public:
   tree
   get_attrs_tree (enum built_in_attribute attr);
 
+  void
+  ensure_optimization_builtins_exist ();
+
   void
   finish_playback (void);
 
diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
index 152ef250949..c6136301243 100644
--- a/gcc/jit/jit-playback.c
+++ b/gcc/jit/jit-playback.c
@@ -2949,6 +2949,11 @@ replay ()
   /* Replay the recorded events:  */
   timevar_push (TV_JIT_REPLAY);
 
+  /* Ensure that builtins that could be needed during optimization
+ get created ahead of time.  */
+  builtins_manager *bm = m_recording_ctxt->get_builtins_manager ();
+  bm->ensure_optimization_builtins_exist ();
+
   m_recording_ctxt->replay_into (this);
 
   /* Clean away the temporary references from recording objects
@@ -2957,13 +2962,11 @@ replay ()
  refs.  Hence we must stop using them before the GC can run.  */
   m_recording_ctxt->disassociate_from_playback ();
 
-  /* The builtins_manager, if any, is associated with the recording::context
+  /* The builtins_manager is associated with the recording::context
  and might be reused for future compiles on other playback::contexts,
  but its m_attributes array is not GTY-labeled and hence will become
  nonsense if the GC runs.  Purge this state.  */
-  builtins_manager *bm = get_builtins_manager ();
-  if (bm)
-bm->finish_playback ();
+  bm->finish_playback ();
 
   timevar_pop (TV_JIT_REPLAY);
 
diff --git a/gcc/testsuite/jit.dg/test-trap.c b/gcc/testsuite/jit.dg/test-trap.c
new file mode 100644
index 000..4eb65cd14c1
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-trap.c
@@ -0,0 +1,59 @@
+#include 
+#include 
+#include 
+
+#include "libgccjit.h"
+
+#include "harness.h"
+
+void
+create_code (gcc_jit_context *ctxt, void *user_data)
+{
+  /* Let's try to inject the equivalent of:
+
+ void
+ test_trap (void)
+ {
+   *((int *)0) = 42;
+ }
+  */
+  gcc_jit_type *void_type
+= gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_VOID);
+  gcc_jit_type *int_type
+= gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
+  gcc_jit_type *int_ptr_type
+= gcc_jit_type_get_pointer (int_type);
+
+  /* Build the test_fn.  */
+  gcc_jit_function *func
+= gcc_jit_context_new_function (ctxt, NULL,
+   GCC_JIT_FUNCTION_EXPORTED,
+   void_type,
+   "test_trap",
+   0, NULL,
+   0);
+
+  gcc_jit_block *initial = gcc_jit_function_new_block (func, "initial");

[PATCH] clear VLA bounds from all arguments (PR 97172)

2021-02-18 Thread Martin Sebor via Gcc-patches

The fix for PR 97172 that removes non-constant VLA bounds from
attribute access is incomplete: it inadvertently removes the bounds
corresponding to just the first VLA argument, and not from subsequent
arguments.

The attached change removes the vestigial condition that causes this
bug.  Since it's behind a build failure in a Fedora package (cava?)
I'll commit it under the "obvious" clause tomorrow if there are no
concerns.

Martin

PR c/97172 - ICE: tree code 'ssa_name' is not supported in LTO streams

gcc/ChangeLog:

	* attribs.c (init_attr_rdwr_indices): Guard vblist use.
	(attr_access::free_lang_data): Remove a spurious test.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr97172.c: Add test cases.

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 81322d40f1d..60933d20810 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -2124,7 +2124,7 @@ init_attr_rdwr_indices (rdwr_map *rwm, tree attrs)
 		  if (*m == '$')
 		{
 		  ++m;
-		  if (!acc.size)
+		  if (!acc.size && vblist)
 			{
 			  /* Extract the list of VLA bounds for the current
 			 parameter, store it in ACC.SIZE, and advance
@@ -2252,10 +2252,6 @@ attr_access::free_lang_data (tree attrs)
   if (!vblist)
 	continue;
 
-  vblist = TREE_VALUE (vblist);
-  if (!vblist)
-	continue;
-
   for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist))
 	{
 	  tree *pvbnd = _VALUE (vblist);
diff --git a/gcc/testsuite/gcc.dg/pr97172.c b/gcc/testsuite/gcc.dg/pr97172.c
index ab5b2e9e7e9..8ae6342db7f 100644
--- a/gcc/testsuite/gcc.dg/pr97172.c
+++ b/gcc/testsuite/gcc.dg/pr97172.c
@@ -30,21 +30,52 @@ void fnp1_np1_np1 (int a[n + 1][n + 1][n + 1]);
 
 void gn (int a[n]) { fn (a); }
 void gnp1 (int a[n + 1]) { fnp1 (a); }
+void gnd2p1 (int a[n / 2 + 1]) { fnp1 (a); }
 
 void gx_n (int a[][n]) { fx_n (a); }
 void gx_np1 (int a[][n + 1]) { fx_np1 (a); }
+void gx_nd2p1 (int a[][n / 2 + 1]) { fx_np1 (a); }
 
 void g2_n (int a[2][n]) { f2_n (a); }
 void g2_np1 (int a[2][n + 1]) { f2_np1 (a); }
+void g2_nd2p1 (int a[2][n / 2 + 1]) { f2_np1 (a); }
 
 void gn_3 (int a[n][3]) { fn_3 (a); }
 void gnp1_3 (int a[n + 1][3]) { fnp1_3 (a); }
+void gnd2p1_3 (int a[n / 2 + 1][3]) { fnp1_3 (a); }
 
 void gn_n (int a[n][n]) { fn_n (a); }
 void gn_np1 (int a[n][n + 1]) { fn_np1 (a); }
 void gnp1_np1 (int a[n + 1][n + 1]) { fnp1_np1 (a); }
+void gnd2p1_nd2p1 (int a[n / 2 + 1][n / 2 + 1]) { fnp1_np1 (a); }
 
 void gn_n_n (int a[n][n][n]) { fn_n_n (a); }
 void gn_n_np1 (int a[n][n][n + 1]) { fn_n_np1 (a); }
 void gn_np1_np1 (int a[n][n + 1][n + 1]) { fn_np1_np1 (a); }
 void gnp1_np1_np1 (int a[n + 1][n + 1][n + 1]) { fnp1_np1_np1 (a); }
+void gnd2p1_nd2p1_nd2p1 (int a[n / 2 + 1][n / 2 + 1][n / 2 + 1])
+{ fnp1_np1_np1 (a); }
+
+
+void fna3_1 (int n,
+	 int a[n / 2 + 1],
+	 int b[n / 2 + 1],
+	 int c[n / 2 + 1]);
+
+void gna3_1 (int n,
+	 int a[n / 2 + 1],
+	 int b[n / 2 + 1],
+	 int c[n / 2 + 1]) { fna3_1 (n, a, b, c); }
+
+void fna3_2_3_4 (int n,
+		 int a[n / 2 + 1][n / 2 + 2],
+		 int b[n / 2 + 1][n / 2 + 2][n / 2 + 3],
+		 int c[n / 2 + 1][n / 2 + 2][n / 2 + 3][n / 2 + 4]);
+
+void gna3_2_3_4 (int n,
+		 int a[n / 2 + 1][n / 2 + 2],
+		 int b[n / 2 + 1][n / 2 + 2][n / 2 + 3],
+		 int c[n / 2 + 1][n / 2 + 2][n / 2 + 3][n / 2 + 4])
+{
+  fna3_2_3_4 (n, a, b, c);
+}


[pushed] c++: Tuple of self-dependent classes [PR96926]

2021-02-18 Thread Jason Merrill via Gcc-patches
When compiling this testcase, trying to resolve the initialization for the
tuple member ends up recursively considering the same set of tuple
constructor overloads, and since two of them separately depend on
is_constructible, the one we try second fails to instantiate
is_constructible because we're still in the middle of instantiating it the
first time.

Fixed by implementing an optimization that someone suggested we were already
doing: if we see a non-template candidate that is a perfect match for all
arguments, we can skip considering template candidates at all.  It would be
enough to do this only when LOOKUP_DEFAULTED, but it shouldn't hurt in other
cases.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/96926
* call.c (perfect_conversion_p): New.
(perfect_candidate_p): New.
(add_candidates): Ignore templates after a perfect non-template.

gcc/testsuite/ChangeLog:

PR c++/96926
* g++.dg/cpp0x/overload4.C: New test.
---
 gcc/cp/call.c  |  90 +++--
 gcc/testsuite/g++.dg/cpp0x/overload4.C | 174 +
 2 files changed, 252 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/overload4.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 186feef6fe3..bc369c68c5a 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -5853,6 +5853,47 @@ prep_operand (tree operand)
   return operand;
 }
 
+/* True iff CONV represents a conversion sequence which no other can be better
+   than under [over.ics.rank]: in other words, a "conversion" to the exact same
+   type (including binding to a reference to the same type).  This is stronger
+   than the standard's "identity" category, which also includes reference
+   bindings that add cv-qualifiers or change rvalueness.  */
+
+static bool
+perfect_conversion_p (conversion *conv)
+{
+  if (CONVERSION_RANK (conv) != cr_identity)
+return false;
+  if (!conv->rvaluedness_matches_p)
+return false;
+  if (conv->kind == ck_ref_bind
+  && !same_type_p (TREE_TYPE (conv->type),
+  next_conversion (conv)->type))
+return false;
+  return true;
+}
+
+/* True if CAND represents a perfect match, i.e. all perfect conversions, so no
+   other candidate can be a better match.  Since the template/non-template
+   tiebreaker comes immediately after the conversion comparison in
+   [over.match.best], a perfect non-template candidate is better than all
+   templates.  */
+
+static bool
+perfect_candidate_p (z_candidate *cand)
+{
+  if (cand->viable < 1)
+return false;
+  int len = cand->num_convs;
+  for (int i = 0; i < len; ++i)
+if (!perfect_conversion_p (cand->convs[i]))
+  return false;
+  if (conversion *conv = cand->second_conv)
+if (!perfect_conversion_p (conv))
+  return false;
+  return true;
+}
+
 /* Add each of the viable functions in FNS (a FUNCTION_DECL or
OVERLOAD) to the CANDIDATES, returning an updated list of
CANDIDATES.  The ARGS are the arguments provided to the call;
@@ -5920,6 +5961,18 @@ add_candidates (tree fns, tree first_arg, const 
vec *args,
 /* Delay creating the implicit this parameter until it is needed.  */
 non_static_args = NULL;
 
+  /* If there's a non-template perfect match, we don't need to consider
+ templates.  So check non-templates first.  This optimization is only
+ really needed for the defaulted copy constructor of tuple and the like
+ (96926), but it seems like we might as well enable it more generally.  */
+  bool seen_perfect = false;
+  enum { templates, non_templates, either } which = either;
+  if (template_only)
+which = templates;
+  else /*if (flags & LOOKUP_DEFAULTED)*/
+which = non_templates;
+
+ again:
   for (lkp_iterator iter (fns); iter; ++iter)
 {
   fn = *iter;
@@ -5928,6 +5981,10 @@ add_candidates (tree fns, tree first_arg, const 
vec *args,
continue;
   if (check_list_ctor && !is_list_ctor (fn))
continue;
+  if (which == templates && TREE_CODE (fn) != TEMPLATE_DECL)
+   continue;
+  if (which == non_templates && TREE_CODE (fn) == TEMPLATE_DECL)
+   continue;
 
   tree fn_first_arg = NULL_TREE;
   const vec *fn_args = args;
@@ -5967,7 +6024,7 @@ add_candidates (tree fns, tree first_arg, const vec *args,
fn,
ctype,
explicit_targs,
-   fn_first_arg, 
+   fn_first_arg,
fn_args,
return_type,
access_path,
@@ -5975,17 +6032,26 @@ add_candidates (tree fns, tree first_arg, const 
vec *args,
flags,
strict,
complain);
-  else if (!template_only)
-   add_function_candidate (candidates,
-   

Re: [PATCH] Fix producer string memory leaks

2021-02-18 Thread Martin Sebor via Gcc-patches

On 2/18/21 2:22 AM, Richard Biener wrote:

On Tue, Feb 16, 2021 at 5:09 PM Martin Sebor  wrote:


On 2/15/21 7:39 AM, Richard Biener wrote:

On Mon, Feb 15, 2021 at 2:46 PM Martin Liška  wrote:


On 2/12/21 5:56 PM, Martin Sebor wrote:

On 2/12/21 2:09 AM, Richard Biener via Gcc-patches wrote:

On Thu, Feb 11, 2021 at 6:41 PM Martin Liška  wrote:


Hello.

This fixes 2 memory leaks I noticed.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?


OK.


Thanks,
Martin

gcc/ChangeLog:

   * opts-common.c (decode_cmdline_option): Release werror_arg.
   * opts.c (gen_producer_string): Release output of
   gen_command_line_string.
---
 gcc/opts-common.c | 1 +
 gcc/opts.c| 7 +--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/opts-common.c b/gcc/opts-common.c
index 6cb5602896d..5e10edeb4cf 100644
--- a/gcc/opts-common.c
+++ b/gcc/opts-common.c
@@ -766,6 +766,7 @@ decode_cmdline_option (const char *const *argv, unsigned 
int lang_mask,
   werror_arg[0] = 'W';

   size_t warning_index = find_opt (werror_arg, lang_mask);
+  free (werror_arg);


Sorry to butt in here, but since we're having a discussion on this
same subject in another review of a fix for a memory leak, I thought
I'd pipe up: I would suggest a better (more in line with C++ and more
future-proof) solution is to replace the call to xstrdup (and the need
to subsequently call free) with std::string.


Hello.

To be honest, I like the suggested approach using std::string. On the other 
hand,
I don't want to mix existing C API (char *) with std::string.


That's one of the main problems.


I'm sending a patch that fixed the remaining leaks.


OK.




   if (warning_index != OPT_SPECIAL_unknown)
   {
 const struct cl_option *warning_option
diff --git a/gcc/opts.c b/gcc/opts.c
index fc5f61e13cc..24bb64198c8 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -3401,8 +3401,11 @@ char *
 gen_producer_string (const char *language_string, cl_decoded_option 
*options,
unsigned int options_count)
 {
-  return concat (language_string, " ", version_string, " ",
-gen_command_line_string (options, options_count), NULL);
+  char *cmdline = gen_command_line_string (options, options_count);
+  char *combined = concat (language_string, " ", version_string, " ",
+  cmdline, NULL);
+  free (cmdline);
+  return combined;
 }


Here, changing gen_command_line_string() to return std::string instead
of a raw pointer would similarly avoid having to remember to free
the pointer (and forgetting).  The function has two other callers,
both in gcc/toplev.c, and both also leak the memory for the same
reason as here.


Btw. have we made a general consensus that using std::string is fine in the
GCC internals?


No, we didn't.  But if Martin likes RAII adding sth like


It's not so much what I like as about using established C++ idioms
to help us avoid common memory management mistakes (leaks, dangling
pointers, etc.)

With respect to the C++ standard library, the GCC Coding Conventions
say:

Use of the standard library is permitted.  Note, however, that it
is currently not usable with garbage collected data.

So as a return value of a function, or as a local variable, using
std::string seems entirely appropriate, and I have been using it
that way.

Richard, if that's not in line with your understanding of
the intent of the text in the conventions or with your expectations,


The conventions were written / changed arbitrarily without real consent.


Let's try to fix that then.



Using std::string just because it implements the RAII part you like
but then still needing to interface with C string APIs on _both_ sides
makes std::string a bad fit.

GCCs code-base is a mess of C/C++ mix already, throwing std::string
inbetween a C string API sandwitch isn't going to improve things.


Very little C++ code has the luxury of being pure C++, without
having to interface with C code at some level.  Most projects that
started out as C and transitioned to C++ also are a mix of C and
C++ APIs and idioms as the transition is usually slow, piecemeal,
and commonly never fully complete.  That doesn't mean they can't
or shouldn't use std::string, or other components from the C++
standard library, or powerful C++ idioms.

But since we're making no progress here let me start a broader
discussion about this topic.  Maybe closer to when we're done with
the release.   If it turns out that there is consensus against using
std::string in GCC I volunteer to update the coding conventions and
contribute the class I prototyped.  Sharing experiences and even
code with other projects like GDB might be worth considering.

Martin


please propose a change for everyone's consideration.  If a consensus
emerges not to use std::string in GCC (or any other part of the C++
library) let's update 

[committed] [PR96264] LRA: Check output insn hard regs when updating available rematerialization insns

2021-02-18 Thread Vladimir Makarov via Gcc-patches

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96264

The patch was successfully bootstrapped and tested on ppc64le.


commit e0d3041c9caece8b48be016fa515747eb2746d35
Author: Vladimir Makarov 
Date:   Thu Feb 18 22:40:54 2021 +

[PR96264] LRA: Check output insn hard regs when updating available rematerialization after the insn

 Insn for rematerialization can contain a clobbered hard register.  We
can not move such insn through another insn setting up the same hard
register.  The patch adds such check.

gcc/ChangeLog:

PR rtl-optimization/96264
* lra-remat.c (reg_overlap_for_remat_p): Check also output insn
	hard regs.

gcc/testsuite/ChangeLog:

PR rtl-optimization/96264
* gcc.target/powerpc/pr96264.c: New.

diff --git a/gcc/lra-remat.c b/gcc/lra-remat.c
index 8bd9ffa..d983731 100644
--- a/gcc/lra-remat.c
+++ b/gcc/lra-remat.c
@@ -651,7 +651,11 @@ calculate_local_reg_remat_bb_data (void)
 
 
 
-/* Return true if REG overlaps an input operand of INSN.  */
+/* Return true if REG overlaps an input operand or non-input hard register of
+   INSN.  Basically the function returns false if we can move rematerialization
+   candidate INSN through another insn with output REG or dead input REG (we
+   consider it to avoid extending reg live range) with possible output pseudo
+   renaming in INSN.  */
 static bool
 reg_overlap_for_remat_p (lra_insn_reg *reg, rtx_insn *insn)
 {
@@ -675,10 +679,11 @@ reg_overlap_for_remat_p (lra_insn_reg *reg, rtx_insn *insn)
 	 reg2 != NULL;
 	 reg2 = reg2->next)
   {
-	if (reg2->type != OP_IN)
-	  continue;
-	unsigned regno2 = reg2->regno;
 	int nregs2;
+	unsigned regno2 = reg2->regno;
+
+	if (reg2->type != OP_IN && regno2 >= FIRST_PSEUDO_REGISTER)
+	  continue;
 
 	if (regno2 >= FIRST_PSEUDO_REGISTER && reg_renumber[regno2] >= 0)
 	  regno2 = reg_renumber[regno2];
diff --git a/gcc/testsuite/gcc.target/powerpc/pr96264.c b/gcc/testsuite/gcc.target/powerpc/pr96264.c
new file mode 100644
index 000..e89979b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr96264.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-Os -fno-forward-propagate -fschedule-insns -fno-tree-ter -Wno-psabi" } */
+/* { dg-require-effective-target p8vector_hw } */
+
+typedef unsigned char __attribute__ ((__vector_size__ (64))) v512u8;
+typedef unsigned short u16;
+typedef unsigned short __attribute__ ((__vector_size__ (64))) v512u16;
+typedef unsigned __int128 __attribute__ ((__vector_size__ (64))) v512u128;
+
+v512u16 d;
+v512u128 f;
+
+v512u8
+foo (u16 e)
+{
+  v512u128 g = f - -e;
+  d = (5 / (d + 1)) < e;
+  return (v512u8) g;
+}
+
+int
+main (void)
+{
+  v512u8 x = foo (2);
+  for (unsigned i = 0; i < sizeof (x); i++)
+if (x[i] != (i % 16 ? 0 : 2))
+  __builtin_abort ();
+}


[PATCH] constrain writing one too many bytes" warning (PR 97631)

2021-02-18 Thread Martin Sebor via Gcc-patches

The "writing one too many bytes" form of -Wstringop-overflow is
designed to trigger for strcpy(d, s) calls into allocated destinations
whose size is the result of (or depends on) strlen(s).  But the warning
is in code that's also called from handlers for bounded functions like
memcpy and strncpy, and the code doesn't differentiate between the two
kinds of callers, causing false positives.

The attached patch corrects both the warning routine and its callers
to properly distinguish these two classes of callers.  In addition,
it corrects a mistake where -Wstringop-overflow is being issued for
destinations of unknown size instead of the more appropriate
-Wstringop-truncation.

Tested on x86_64-linux.

The bug is a P2 10/11 regression and I'm looking to commit this change
both into the trunk and 10-branch.

Martin

PR middle-end/97631 - bogus "writing one too many bytes" warning for memcpy with strlen argument

gcc/ChangeLog:

	PR middle-end/97631
	* tree-ssa-strlen.c (maybe_warn_overflow): Test rawmem.
	(handle_builtin_stxncpy_strncat): Rename locals.  Determine
	destination size from allocation calls.  Issue a more appropriate
	kind of warning.
	(handle_builtin_memcpy): Pass true as rawmem to maybe_warn_overflow.
	(handle_builtin_memset): Same.

gcc/testsuite/ChangeLog:

	PR middle-end/97631
	* c-c++-common/Wstringop-overflow.c: Remove unexpected warnings.
	Add an xfail.
	* c-c++-common/Wstringop-truncation.c: Add expected warnings.
	* gcc.dg/Wstringop-overflow-10.c: Also enable -Wstringop-truncation.
	* gcc.dg/Wstringop-overflow-65.c: New test.

diff --git a/gcc/testsuite/c-c++-common/Wstringop-overflow.c b/gcc/testsuite/c-c++-common/Wstringop-overflow.c
index 53f5166f30a..5757a23141e 100644
--- a/gcc/testsuite/c-c++-common/Wstringop-overflow.c
+++ b/gcc/testsuite/c-c++-common/Wstringop-overflow.c
@@ -115,28 +115,31 @@ void test_strncpy (char **d, const char* s, int i)
   T (d, "123", sizeof "123");
   T (d, ar, sizeof ar);
 
-  T (d, s, strlen (s));   /* { dg-warning "\\\[-Wstringop-overflow=]" } */
+  /* There is no overflow in the following calls but they are diagnosed
+ by -Wstringop-truncation.  Verify that they aren'y also diagnosed
+ by -Wstringop-overflow.  */
+  T (d, s, strlen (s));
 
   {
-int n = strlen (s);   /* { dg-message "length computed here" } */
-T (d, s, n);  /* { dg-warning "\\\[-Wstringop-overflow=]" } */
+int n = strlen (s);
+T (d, s, n);
   }
 
   {
-unsigned n = strlen (s);   /* { dg-message "length computed here" } */
-T (d, s, n);   /* { dg-warning "\\\[-Wstringop-overflow=]" } */
+unsigned n = strlen (s);
+T (d, s, n);
   }
 
   {
 size_t n;
-n = strlen (s);   /* { dg-message "length computed here" } */
-T (d, s, n);  /* { dg-warning "\\\[-Wstringop-overflow=]" } */
+n = strlen (s);
+T (d, s, n);
   }
 
   {
 size_t n;
-n = strlen (s) - 1;   /* { dg-message "length computed here" } */
-T (d, s, n);  /* { dg-warning "\\\[-Wstringop-overflow=]" } */
+n = strlen (s) - 1;
+T (d, s, n);
   }
 
   {
@@ -148,11 +151,8 @@ void test_strncpy (char **d, const char* s, int i)
 
   {
 /* This use of strncpy is certainly dubious and it could well be
-   diagnosed by -Wstringop-truncation but it isn't.  That it is
-   diagnosed with -Wstringop-overflow is more by accident than
-   by design.  -Wstringop-overflow considers any dependency of
-   the bound on strlen(s) a potential bug.  */
-size_t n = i < strlen (s) ? i : strlen (s);   /* { dg-message "length computed here" } */
-T (d, s, n);  /* { dg-message ".strncpy\[^\n\r]* specified bound depends on the length of the source argument" } */
+   diagnosed by -Wstringop-truncation but it isn't.  */
+size_t n = i < strlen (s) ? i : strlen (s);   /* { dg-message "length computed here" "note" { xfail *-*-* } } */
+T (d, s, n);  /* { dg-message ".strncpy\[^\n\r]* specified bound depends on the length of the source argument" "pr?" { xfail *-*-* } } */
   }
 }
diff --git a/gcc/testsuite/c-c++-common/Wstringop-truncation.c b/gcc/testsuite/c-c++-common/Wstringop-truncation.c
index f29eee29e85..114837b2b64 100644
--- a/gcc/testsuite/c-c++-common/Wstringop-truncation.c
+++ b/gcc/testsuite/c-c++-common/Wstringop-truncation.c
@@ -226,19 +226,18 @@ void test_strncpy_ptr (char *d, const char* s, const char *t, int i)
   }
 
   {
-/* The following is likely buggy but there's no apparent truncation
-   so it's not diagnosed by -Wstringop-truncation.  Instead, it is
-   diagnosed by -Wstringop-overflow (tested elsewhere).  */
+/* The following truncates the terminating nul.  The warning should
+   say that but doesn't.  */
 int n;
 n = strlen (s) - 1;
-CPY (d, s, n);
+CPY (d, s, n);  /* { dg-warning "\\\[-Wstringop-truncation" } */
   }
 
   {
 /* Same as above.  */
 size_t n;
 

c++: header-unit build capability [PR 99023]

2021-02-18 Thread Nathan Sidwell
This defect really required building header-units and include 
translation of pieces of the standard library.  This adds smarts to	the 
modules test harness to do that -- accept .X files as the source file, 
but provide '-x c++-system-header $HDR' in the options.  The .X file 
will be considered by the driver to be a linker script and ignored (with 
a warning).


Using this we can add 2	tests that end up building list_initializer and 
iostream, along with a test that iostream's build include-translates 
list_initializer's #include. That discovered a set of issues with	the 
-flang-info-include-translate=HDR handling, also fixed and documented here.


PR c++/99023
gcc/cp/
* module.cc (canonicalize_header_name): Use
cpp_probe_header_unit.
(maybe_translate_include): Fix note_includes comparison.
(init_modules): Fix note_includes string termination.
libcpp/
* include/cpplib.h (cpp_find_header_unit): Rename to ...
(cpp_probe_header_unit): ... this.
* internal.h (_cp_find_header_unit): Declare.
* files.c (cpp_find_header_unit): Break apart to ..
(test_header_unit): ... this, and ...
(_cpp_find_header_unit): ... and, or and ...
(cpp_probe_header_unit): ... this.
* macro.c (cpp_get_token_1): Call _cpp_find_header_unit.
gcc/
* doc/invoke.texi (flang-info-include-translate): Document header
lookup behaviour.
gcc/testsuite/
* g++.dg/modules/modules.exp: Bail on cross-testing.  Add support
for .X files.
* g++.dg/modules/pr99023_a.X: New.
* g++.dg/modules/pr99023_b.X: New.

--
Nathan Sidwell
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index 064e57a29c4..e801c52069e 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -19091,7 +19091,7 @@ canonicalize_header_name (cpp_reader *reader, location_t loc, bool unquoted,
   buf[len] = 0;
 
   if (const char *hdr
-	  = cpp_find_header_unit (reader, buf, str[-1] == '<', loc))
+	  = cpp_probe_header_unit (reader, buf, str[-1] == '<', loc))
 	{
 	  len = strlen (hdr);
 	  str = hdr;
@@ -19185,19 +19185,11 @@ maybe_translate_include (cpp_reader *reader, line_maps *lmaps, location_t loc,
   else if (note_include_translate_no && xlate == 0)
 note = true;
   else if (note_includes)
-{
-  /* We do not expect the note_includes vector to be large, so O(N)
-	 iteration.  */
-  for (unsigned ix = note_includes->length (); !note && ix--;)
-	{
-	  const char *hdr = (*note_includes)[ix];
-	  size_t hdr_len = strlen (hdr);
-	  if ((hdr_len == len
-	   || (hdr_len < len && IS_DIR_SEPARATOR (path[len - hdr_len - 1])))
-	  && !memcmp (hdr, path + len - hdr_len, hdr_len))
-	note = true;
-	}
-}
+/* We do not expect the note_includes vector to be large, so O(N)
+   iteration.  */
+for (unsigned ix = note_includes->length (); !note && ix--;)
+  if (!strcmp ((*note_includes)[ix], path))
+	note = true;
 
   if (note)
 inform (loc, xlate
@@ -19570,7 +19562,7 @@ init_modules (cpp_reader *reader)
 	0, !delimed, hdr, len);
 	char *path = XNEWVEC (char, len + 1);
 	memcpy (path, hdr, len);
-	path[len+1] = 0;
+	path[len] = 0;
 
 	(*note_includes)[ix] = path;
   }
diff --git c/gcc/doc/invoke.texi w/gcc/doc/invoke.texi
index e8baa545eee..c00514a6306 100644
--- c/gcc/doc/invoke.texi
+++ w/gcc/doc/invoke.texi
@@ -3382,7 +3382,12 @@ is used when building the C++ library.)
 @itemx -flang-info-include-translate=@var{header}
 @opindex flang-info-include-translate
 @opindex flang-info-include-translate-not
-Diagnose include translation events.
+Diagnose include translation events.  The first will note accepted
+include translations, the second will note declined include
+translations.  The @var{header} form will inform of include
+translations relating to that specific header.  If @var{header} is of
+the form @code{"user"} or @code{} it will be resolved to a
+specific user or system header using the include path.
 
 @item -stdlib=@var{libstdc++,libc++}
 @opindex stdlib
diff --git c/gcc/testsuite/g++.dg/modules/modules.exp w/gcc/testsuite/g++.dg/modules/modules.exp
index c17120f2c00..38654caf7b9 100644
--- c/gcc/testsuite/g++.dg/modules/modules.exp
+++ w/gcc/testsuite/g++.dg/modules/modules.exp
@@ -39,6 +39,11 @@ set MOD_STD_LIST { 17 2a }
 
 dg-init
 
+if {[is_remote host]} {
+# remote testing not functional here :(
+return
+}
+
 global module_do
 global module_cmis
 
@@ -274,6 +279,9 @@ proc module-init { src } {
 return $option_list
 }
 
+# cleanup any detritus from previous run
+cleanup_module_files [find $DEFAULT_REPO *.gcm]
+
 # not grouped tests, sadly tcl doesn't have negated glob
 foreach test [prune [lsort [find $srcdir/$subdir {*.[CH]}]] \
 		  "$srcdir/$subdir/*_?.\[CH\]"] {
@@ -282,6 +290,7 @@ foreach test [prune [lsort [find $srcdir/$subdir {*.[CH]}]] \
 
 	set std_list [module-init $test]
 	foreach std $std_list {
+	global 

[PATCH] PR fortran/99147 - Sanitizer detects heap-use-after-free in gfc_add_flavor

2021-02-18 Thread Harald Anlauf via Gcc-patches
Dear all,

the PR reports an issue detected with an ASAN instrumented compiler,
which can also be verified with valgrind.  It appears that the state
of gfc_new_block could be such that it should not be dereferenced.
Reversing the order of condition evaluation helped.

I failed to find out why this should happen, but then other places
in the code put dereferences of gfc_new_block behind other checks.
Simple things like initializing gfc_new_block with NULL in decl.c
did not help.

Regtested on x86_64-pc-linux-gnu.  No testcase added since the issue
can be found only with an instrumented compiler or valgrind.

I consider the patch to be obvious and trivial, but post it here
in case somebody wants to dig deeper.

OK for master?

Thanks,
Harald


PR fortran/99147 - Sanitizer detects heap-use-after-free in gfc_add_flavor

Reverse order of conditions to avoid invalid read.

gcc/fortran/ChangeLog:

* symbol.c (gfc_add_flavor): Reverse order of conditions.

diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c
index 3b988d1be22..e982374d9d1 100644
--- a/gcc/fortran/symbol.c
+++ b/gcc/fortran/symbol.c
@@ -1772,8 +1772,8 @@ gfc_add_flavor (symbol_attribute *attr, sym_flavor f, const char *name,
   /* Copying a procedure dummy argument for a module procedure in a
  submodule results in the flavor being copied and would result in
  an error without this.  */
-  if (gfc_new_block && gfc_new_block->abr_modproc_decl
-  && attr->flavor == f && f == FL_PROCEDURE)
+  if (attr->flavor == f && f == FL_PROCEDURE
+  && gfc_new_block && gfc_new_block->abr_modproc_decl)
 return true;

   if (attr->flavor != FL_UNKNOWN)


Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-18 Thread Martin Sebor via Gcc-patches

On 2/18/21 11:03 AM, Jakub Jelinek wrote:

On Thu, Feb 18, 2021 at 07:00:52PM +0100, Jakub Jelinek wrote:

The size of the VLA is zero regardless of its bound and accessing
it is invalid so the warning is expected.


Yes, some warning, but not the one you are giving, that is nonsensical.
Array subscript 0 is not outside of array bounds of struct S[n], a[anything]
will still be zero sized and will not be problematic.


The warning is designed for ordinary arrays of nonzero size.  There's
no point in putting an effort into coming up with a special warning
just for those because they serve no purpose in these contexts (as
complete objects).



Scalar objects with zero size will always have that zero size,
similarly arrays thereof (constant or variable sized).
So the warning should be simply if eltsize == 0,
check if the access is before or after the object and complain
that a memory access is done before or after a zero sized object %qD.

Jakub


No, I don't think making this exception would be helpful.  Zero length
arrays are a non-standard extension meant to be used as struct members,
before flexible array members were added to C.  In other contexts, they
are almost certainly unintended and so likely bugs.  There's no valid
use case for such arrays, and diagnosing accesses to them helps find
such bugs.

That said, I also don't think a fix for the ICE should be held up
because we disagree on this vanishingly unimportant corner case.
The ICE effectively prevents using such arrays (VLAs) and since no
bug reports have been raised for it since it was introduced in GCC
9 it's unlikely that any code relies on it.  (I suspect the bug
itself was the result of fuzzing.)

Martin


c++: Remove obsolete dg-module-headers [PR 99023]

2021-02-18 Thread Nathan Sidwell


PR99023's testcase is highlighting some missing functionality of the 
modules test harness.  I did have some partial support, but it's only 
use in one place for a now-obsolete test.  This patch expunges that 
support so I can add better functionality now I understand better what 
is necessary.


PR c++/99023
gcc/testsuite/
* modules.exp: Remove dg-module-headers support
* alias-2_a.H: Delete.
* sys/alias-2_a.H: Delete.

--
Nathan Sidwell
diff --git c/gcc/testsuite/g++.dg/modules/alias-2_a.H w/gcc/testsuite/g++.dg/modules/alias-2_a.H
deleted file mode 100644
index 1befe85cf49..000
--- c/gcc/testsuite/g++.dg/modules/alias-2_a.H
+++ /dev/null
@@ -1,9 +0,0 @@
-// { dg-additional-options "-fmodule-header -isystem [srcdir]/sys" }
-// { dg-module-cmi {} }
-// { dg-module-headers test sys/alias-2_a.H }
-#ifndef ALIAS_2_A
-#define ALIAS_2_A
-
-int frob ();
-
-#endif
diff --git c/gcc/testsuite/g++.dg/modules/modules.exp w/gcc/testsuite/g++.dg/modules/modules.exp
index 28d627dcb86..c17120f2c00 100644
--- c/gcc/testsuite/g++.dg/modules/modules.exp
+++ w/gcc/testsuite/g++.dg/modules/modules.exp
@@ -41,7 +41,6 @@ dg-init
 
 global module_do
 global module_cmis
-global module_headers
 
 set DEFAULT_REPO "gcm.cache"
 
@@ -132,39 +131,6 @@ proc module_cmi_p { src ifs } {
 return $res
 }
 
-# Append required header unit names to module_headers var
-proc dg-module-headers { args } {
-if { [llength $args] != 3 } {
-	error "[lindex $args 0]: wrong number of arguments"
-	return
-}
-}
-
-proc do_module_headers { srcdir subdir std flags} {
-global module_headers
-foreach header $module_headers {
-	set kind [lindex $header 0]
-	set hdr [lindex $header 1]
-	verbose "Header $hdr $std" 1
-	switch $kind {
-	test {
-		global module_cmis
-		set module_cmis {}
-		dg-test -keep-output $srcdir/$subdir/$hdr "$std" $flags
-		global mod_files
-		lappend mod_files [module_cmi_p $subdir/$hdr $module_cmis]
-	}
-	system -
-	user {
-		# FIXME
-	}
-	default {
-		error "$kind unknown header"
-	}
-	}
-}
-}
-
 # link and maybe run a set of object files
 # dg-module-do WHAT WHEN
 proc dg-module-do { args } {
@@ -277,8 +243,6 @@ proc srcdir {} {
 proc module-init { src } {
 set tmp [dg-get-options $src]
 set option_list {}
-global module_headers
-set module_headers {}
 set have_std 0
 set std_prefix "-std=c++"
 
@@ -295,12 +259,6 @@ proc module-init { src } {
 		set have_std 1
 		}
 	}
-	"dg-module-headers" {
-		set kind [lindex $op 2]
-		foreach header [lindex $op 3] {
-		lappend module_headers [list $kind $header]
-		}
-	}
 	}
 }
 
@@ -324,7 +282,6 @@ foreach test [prune [lsort [find $srcdir/$subdir {*.[CH]}]] \
 
 	set std_list [module-init $test]
 	foreach std $std_list {
-	do_module_headers $srcdir $subdir $std $DEFAULT_MODFLAGS
 	set module_cmis {}
 	verbose "Testing $nshort $std" 1
 	dg-test $test "$std" $DEFAULT_MODFLAGS
@@ -347,7 +304,6 @@ foreach src [lsort [find $srcdir/$subdir {*_a.[CH}]] {
 	global module_do
 	set module_do {"compile" "P"}
 	set asm_list {}
-	do_module_headers $srcdir $subdir $std $DEFAULT_MODFLAGS
 	foreach test $tests {
 		if { [lindex $module_do 1] != "N" } {
 		set module_cmis {}
diff --git c/gcc/testsuite/g++.dg/modules/sys/alias-2_a.H w/gcc/testsuite/g++.dg/modules/sys/alias-2_a.H
deleted file mode 100644
index 5a058089725..000
--- c/gcc/testsuite/g++.dg/modules/sys/alias-2_a.H
+++ /dev/null
@@ -1,9 +0,0 @@
-// { dg-additional-options "-fmodule-header -isystem [srcdir]/sys" }
-// { dg-module-cmi {} }
-
-#ifndef ALIAS_2_A_SYS
-#define ALIAS_2_A_SYS
-
-int frob (int);
-
-#endif


c++: Remove large abi-specific tests [PR 99150]

2021-02-18 Thread Nathan Sidwell

Remove the two large and incorrectly abi-specific testcases I added.
Replacement tests will be forthcoming.

PR c++/99150
gcc/testsuite/
* g++.dg/modules/pr99023_a.H: Delete.
* g++.dg/modules/pr99023_b.H: Delete.

No patch because it exceeds the ml size limit :(

--
Nathan Sidwell


Re: [PATCH, rs6000] Add Power10 scheduling description

2021-02-18 Thread Pat Haugen via Gcc-patches
Ping2.

On 1/26/21 11:30 AM, Pat Haugen via Gcc-patches wrote:
> Ping.
> 
> On 11/13/20 4:04 PM, Pat Haugen via Gcc-patches wrote:
>> Add Power10 scheduling description.
>>
>> This patch adds the Power10 scheduling description. Since power10.md was 
>> pretty much a complete rewrite (existing version of power10.md is mostly 
>> just a copy of power9.md), I diffed power10.md with /dev/null so that the 
>> full contents of the file are shown as opposed to a diff. This should make 
>> it easier to read. This patch will not apply on current trunk do to that 
>> reason.
>>  
>> Bootstrap/regtest on powerpc64le (Power8/Power10) with no new regressions. 
>> Ok for trunk?
>>
>> -Pat
>>
>>
>> 2020-11-13  Pat Haugen  
>>
>> gcc/
>>  * config/rs6000/rs6000.c (struct processor_costs): New.
>>  (rs6000_option_override_internal): Set Power10 costs.
>>  (rs6000_issue_rate): Set Power10 issue rate.
>>  * config/rs6000/power10.md: Rewrite for Power10.
>>
> 



Re: [PATCH, rs6000] Update "prefix" attribute for Power10

2021-02-18 Thread Pat Haugen via Gcc-patches
Ping2.

On 1/26/21 11:28 AM, Pat Haugen via Gcc-patches wrote:
> Ping.
> 
> On 12/10/20 3:32 PM, Pat Haugen via Gcc-patches wrote:
>> Update prefixed attribute for Power10.
>>
>>
>> This patch was broken out from my larger patch to update various attributes 
>> for
>> Power10, in order to make the review process hopefully easier. This patch 
>> only
>> updates the prefix attribute for various new instructions. Changes in this
>> version include missed updates to rs6000_insn_cost and
>> rs6000_adjust_insn_length. I stayed with the new 'always' keyword but added
>> additional commentary so hopefully is more clear.
>>
>> Bootstrap/regtest on powerpc64le (Power8/Power10) with no new regressions. 
>> Ok for trunk?
>>
>> -Pat
>>
>>
>> 2020-11-10  Pat Haugen  
>>
>> gcc/
>>  * config/rs6000/altivec.md (xxspltiw_v4si, xxspltiw_v4sf_inst,
>>  xxspltidp_v2df_inst, xxsplti32dx_v4si_inst, xxsplti32dx_v4sf_inst,
>>  xxblend_, xxpermx_inst, xxeval): Mark prefixed "always".
>>  * config/rs6000/mma.md (mma_, mma_,
>>  mma_, mma_, mma_, mma_,
>>  mma_, mma_, mma_, mma_):
>>  Likewise.
>>  * config/rs6000/rs6000.c (rs6000_insn_cost): Update test for prefixed
>>  insn.
>>  (next_insn_prefixed_p): Rename to prefix_next_insn_p.
>>  (rs6000_final_prescan_insn): Only add 'p' for PREFIXED_YES.
>>  (rs6000_asm_output_opcode): Adjust.
>>  (rs6000_adjust_insn_length): Update test for prefixed insns.
>>  * config/rs6000/rs6000.md (define_attr "prefixed"): Add 'always'
>>  and update commentary.
>>
> 



Re: [PATCH, rs6000] Update "size" attribute for Power10

2021-02-18 Thread Pat Haugen via Gcc-patches
Ping2

On 1/26/21 11:27 AM, Pat Haugen via Gcc-patches wrote:
> Ping
> 
> On 12/8/20 3:46 PM, Pat Haugen via Gcc-patches wrote:
>> Update size attribute for Power10.
>>
>>
>> This patch was broken out from my larger patch to update various attributes 
>> for
>> Power10, in order to make the review process hopefully easier. This patch 
>> only
>> updates the size attribute for various new instructions. There were no 
>> changes
>> requested to this portion of the original patch, so nothing is new here.
>>
>> Bootstrap/regtest on powerpc64le (Power8/Power10) with no new regressions. 
>> Ok for trunk?
>>
>> -Pat
>>
>>
>> 2020-11-08  Pat Haugen  
>>
>> gcc/
>>  * config/rs6000/dfp.md (extendddtd2, trunctddd2, *cmp_internal1,
>>  floatditd2, ftrunc2, fixdi2, dfp_ddedpd_,
>>  dfp_denbcd_, dfp_dxex_, dfp_diex_,
>>  *dfp_sgnfcnc_, dfp_dscli_, dfp_dscri_): Update size
>>  attribute for Power10.
>>  * config/rs6000/mma.md (*movoo): Likewise.
>>  * config/rs6000/rs6000.md (define_attr "size"): Add 256.
>>  (define_mode_attr bits): Add DD/TD modes.
>>  * config/rs6000/sync.md (load_quadpti, store_quadpti, load_lockedpti,
>>  store_conditionalpti): Update size attribute for Power10.
>>
> 



Re: [PATCH v7] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-18 Thread H.J. Lu via Gcc-patches
On Thu, Feb 18, 2021 at 7:15 AM Richard Biener
 wrote:
>
> On Thu, Feb 18, 2021 at 4:01 PM H.J. Lu  wrote:
> >
> > On Thu, Feb 18, 2021 at 2:25 AM Richard Biener
> >  wrote:
> > >
> > > On Thu, Feb 18, 2021 at 5:15 AM H.J. Lu via Gcc-patches
> > >  wrote:
> > > >
> > > > On Wed, Feb 17, 2021 at 12:50 PM H.J. Lu  wrote:
> > > > >
> > > > > On Wed, Feb 17, 2021 at 12:14 PM Jakub Jelinek  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Feb 17, 2021 at 12:04:34PM -0800, H.J. Lu wrote:> >
> > > > > > > +  /* For -fretain-used-symbol, a "used" attribute also implies 
> > > > > > > "retain".  */
> > > > > >
> > > > > > s/-symbol/-symbols/
> > > > >
> > > > > Fixed.
> > > > >
> > > > > > > +  if (flag_retain_used_symbols
> > > > > > > +  && attributes
> > > > > > > +  && (TREE_CODE (*node) == FUNCTION_DECL
> > > > > > > +   || (VAR_P (*node) && TREE_STATIC (*node))
> > > > > > > +   || (TREE_CODE (*node) == TYPE_DECL))
> > > > > >
> > > > > > What will "retain" do on TYPE_DECLs?  It only makes sense to me
> > > > > > for FUNCTION_DECL or non-automatic VAR_DECLs, unless for types
> > > > > > it means the types with that result in vars with those types 
> > > > > > automatically
> > > > > > getting "retain", but there is no code for that and I don't see 
> > > > > > "used"
> > > > > > handled that way.
> > > > >
> > > > > Fixed.
> > > > >
> > > > > > > +  if (SUPPORTS_SHF_GNU_RETAIN
> > > > > > > +  && (TREE_CODE (node) == FUNCTION_DECL
> > > > > > > +   || (VAR_P (node) && TREE_STATIC (node))
> > > > > > > +   || (TREE_CODE (node) == TYPE_DECL)))
> > > > > >
> > > > > > Likewise.
> > > > >
> > > > > Fixed.
> > > > >
> > > > > > > +;
> > > > > > > +  else
> > > > > > > +{
> > > > > > > +  warning (OPT_Wattributes, "%qE attribute ignored", name);
> > > > > > > +  *no_add_attrs = true;
> > > > > > > +}
> > > > > > > +
> > > > > > > +  return NULL_TREE;
> > > > > > > +}
> > > > > > > +
> > > > > > >  /* Handle a "externally_visible" attribute; arguments as in
> > > > > > > struct attribute_spec.handler.  */
> > > > > > >
> > > > > > > diff --git a/gcc/common.opt b/gcc/common.opt
> > > > > > > index c75dd36843e..70842481d4d 100644
> > > > > > > --- a/gcc/common.opt
> > > > > > > +++ b/gcc/common.opt
> > > > > > > @@ -2404,6 +2404,10 @@ frerun-loop-opt
> > > > > > >  Common Ignore
> > > > > > >  Does nothing.  Preserved for backward compatibility.
> > > > > > >
> > > > > > > +fretain-used-symbols
> > > > > > > +Common Var(flag_retain_used_symbols)
> > > > > > > +Make the used attribute to imply the retain attribute if 
> > > > > > > supported.
> > > > > >
> > > > > > English is not my native language, but I think the "to " doesn't 
> > > > > > belong
> > > > > > here.
> > > > >
> > > > > Fixed.
> > > > >
> > > > > > > +@item -fretain-used-symbols
> > > > > > > +@opindex fretain-used-symbols
> > > > > > > +On systems with recent GNU assembler and linker, the compiler 
> > > > > > > makes
> > > > > >
> > > > > > I think we should avoid recent here, new/old/recent etc. are 
> > > > > > relative terms.
> > > > > > Either use exact version (is it 2.36 or later) or say GNU assembler 
> > > > > > and
> > > > > > linker that supports the @code{.retain} directive or something 
> > > > > > similar.
> > > > >
> > > > > Fixed.
> > > > >
> > > > > > >flags |= SECTION_NAMED;
> > > > > > > -  if (SUPPORTS_SHF_GNU_RETAIN
> > > > > > > -  && decl != nullptr
> > > > > > > +  if (decl != nullptr
> > > > > > >&& DECL_P (decl)
> > > > > > > -  && DECL_PRESERVE_P (decl))
> > > > > > > +  && DECL_PRESERVE_P (decl)
> > > > > > > +  && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > > > > > >  flags |= SECTION_RETAIN;
> > > > > > >if (*slot == NULL)
> > > > > > >  {
> > > > > >
> > > > > > I think none of the varasm.c code should use DECL_PRESERVE_P when 
> > > > > > it means
> > > > > > retain, just lookup_attribute.
> > > > >
> > > > > Fixed.
> > > > >
> > > > > > > @@ -487,7 +487,8 @@ resolve_unique_section (tree decl, int reloc 
> > > > > > > ATTRIBUTE_UNUSED,
> > > > > > >if (DECL_SECTION_NAME (decl) == NULL
> > > > > > >&& targetm_common.have_named_sections
> > > > > > >&& (flag_function_or_data_sections
> > > > > > > -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> > > > > > > +   || (DECL_PRESERVE_P (decl)
> > > > > > > +   && lookup_attribute ("retain", DECL_ATTRIBUTES 
> > > > > > > (decl)))
> > > > > > > || DECL_COMDAT_GROUP (decl)))
> > > > > > >  {
> > > > > > >targetm.asm_out.unique_section (decl, reloc);
> > > > > > > @@ -1227,7 +1228,8 @@ get_variable_section (tree decl, bool 
> > > > > > > prefer_noswitch_p)
> > > > > > >  vnode->get_constructor ();
> > > > > > >
> > > > > > >if (DECL_COMMON (decl)
> > > > > > > -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> > > > > > > +  && !(DECL_PRESERVE_P (decl)
> > > > > > > +&& 

Re: [PATCH] RISC-V: Zihintpause: add __builtin_riscv_pause

2021-02-18 Thread Jim Wilson
On Thu, Jan 7, 2021 at 12:50 AM Kito Cheng  wrote:

> My point is tracking info and consistent behavior/scheme with other
> extensions, so personally I strongly prefer it should be guarded with
> -march.
>

It is a hint.  We should allow it even if the architecture extension is not
enabled.

For comparison, I suggest you look at the aarch64 port.  They have 3 kinds
of hints: branch protection (bti), pointer authentication, and speculation
control.  They deliberately allow you to emit the instructions even if the
hardware doesn't support the feature because they are hints, and execute as
nops if the hardware support is missing.  The rationale is that the code
will work with or without the hardware support, but will work better with
the hardware support, so it is best to always allow it.  We should do the
same with RISC-V hints.

I agree that we need to include LLVM folks in the discussion to make sure
that GCC and LLVM handle this the same way.

Jim


Re: [PATCH] c: Fix ICE with -fexcess-precision=standard [PR99136]

2021-02-18 Thread Joseph Myers
On Thu, 18 Feb 2021, Jakub Jelinek via Gcc-patches wrote:

> Hi!
> 
> The following testcase ICEs on i686-linux, because c_finish_return wraps
> c_fully_folded retval back into EXCESS_PRECISION_EXPR, but when the function
> return type is void, we don't call convert_for_assignment on it that would
> then be fully folded again, but just put the retval into RETURN_EXPR's
> operand, so nothing removes it anymore and during gimplification we
> ICE as EXCESS_PRECISION_EXPR is not handled.
> 
> This patch fixes it by not adding that EXCESS_PRECISION_EXPR in functions
> returning void, the return value is ignored and all we need is evaluate any
> side-effects of the expression.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH V2 3/5] CTF/BTF testsuites

2021-02-18 Thread Jose E. Marchesi via Gcc-patches
This commit adds a new testsuite for the CTF debug format.

2021-02-18  Indu Bhagat  
David Faust  

gcc/testsuite/

* gcc.dg/debug/btf/btf-1.c: New test.
* gcc.dg/debug/btf/btf-2.c: Likewise.
* gcc.dg/debug/btf/btf-anonymous-struct-1.c: Likewise.
* gcc.dg/debug/btf/btf-anonymous-union-1.c: Likewise.
* gcc.dg/debug/btf/btf-array-1.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-1.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-2.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-3.c: Likewise.
* gcc.dg/debug/btf/btf-cvr-quals-1.c: Likewise.
* gcc.dg/debug/btf/btf-enum-1.c: Likewise.
* gcc.dg/debug/btf/btf-forward-1.c: Likewise.
* gcc.dg/debug/btf/btf-function-1.c: Likewise.
* gcc.dg/debug/btf/btf-function-2.c: Likewise.
* gcc.dg/debug/btf/btf-int-1.c: Likewise.
* gcc.dg/debug/btf/btf-pointers-1.c: Likewise.
* gcc.dg/debug/btf/btf-struct-1.c: Likewise.
* gcc.dg/debug/btf/btf-typedef-1.c: Likewise.
* gcc.dg/debug/btf/btf-union-1.c: Likewise.
* gcc.dg/debug/btf/btf-variables-1.c: Likewise.
* gcc.dg/debug/btf/btf.exp: Likewise.
* gcc.dg/debug/ctf/ctf-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-anonymous-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-anonymous-union-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-attr-mode-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-attr-used-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-complex-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-enum-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-enum-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-file-scope-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-float-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-forward-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-forward-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-func-index-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-functions-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-int-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-objt-index-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-pointers-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-pointers-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-preamble-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-5.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-6.c: Likewise.
* gcc.dg/debug/ctf/ctf-str-table-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-array-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-pointer-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-pointer-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-union-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-variables-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-variables-2.c: Likewise.
* gcc.dg/debug/ctf/ctf.exp: Likewise.
* gcc.dg/debug/dwarf2-ctf-1.c: Likewise.
---
 gcc/testsuite/gcc.dg/debug/btf/btf-1.c|  6 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-2.c| 10 +++
 .../gcc.dg/debug/btf/btf-anonymous-struct-1.c | 23 ++
 .../gcc.dg/debug/btf/btf-anonymous-union-1.c  | 23 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-array-1.c  | 31 +++
 .../gcc.dg/debug/btf/btf-bitfields-1.c| 34 
 .../gcc.dg/debug/btf/btf-bitfields-2.c| 26 ++
 .../gcc.dg/debug/btf/btf-bitfields-3.c| 43 ++
 .../gcc.dg/debug/btf/btf-cvr-quals-1.c| 52 
 

[PATCH V2 5/5] Enable BTF generation in the BPF backend

2021-02-18 Thread Jose E. Marchesi via Gcc-patches
This patch changes the BPF GCC backend in order to use the DWARF debug
hooks and therefore enables the user to generate BTF debugging
information with -gbtf.  Generating BTF is crucial when compiling BPF
programs, since the CO-RE (compile-once, run-everwhere) mechanism
used by the kernel BPF loader relies on it.

Note that since in eBPF it is not possible to unwind frames due to the
restrictive nature of the target architecture, we are disabling the
generation of CFA in this target.

2021-01-22  David Faust 

* config/bpf/bpf.c (bpf_expand_prologue): Do not mark insns as
frame related.
(bpf_expand_epilogue): Likewise.
* config/bpf/bpf.h (DWARF2_FRAME_INFO): Define to 0.
Do not define DBX_DEBUGGING_INFO.
---
 gcc/config/bpf/bpf.c |  4 
 gcc/config/bpf/bpf.h | 12 ++--
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/gcc/config/bpf/bpf.c b/gcc/config/bpf/bpf.c
index 126d4a2798d..e635f9edb40 100644
--- a/gcc/config/bpf/bpf.c
+++ b/gcc/config/bpf/bpf.c
@@ -349,7 +349,6 @@ bpf_expand_prologue (void)
  hard_frame_pointer_rtx,
  fp_offset - 8));
  insn = emit_move_insn (mem, gen_rtx_REG (DImode, regno));
- RTX_FRAME_RELATED_P (insn) = 1;
  fp_offset -= 8;
}
}
@@ -364,7 +363,6 @@ bpf_expand_prologue (void)
 {
   insn = emit_move_insn (stack_pointer_rtx,
 hard_frame_pointer_rtx);
-  RTX_FRAME_RELATED_P (insn) = 1;
 
   if (size > 0)
{
@@ -372,7 +370,6 @@ bpf_expand_prologue (void)
 gen_rtx_PLUS (Pmode,
   stack_pointer_rtx,
   GEN_INT (-size;
- RTX_FRAME_RELATED_P (insn) = 1;
}
 }
 }
@@ -412,7 +409,6 @@ bpf_expand_epilogue (void)
  hard_frame_pointer_rtx,
  fp_offset - 8));
  insn = emit_move_insn (gen_rtx_REG (DImode, regno), mem);
- RTX_FRAME_RELATED_P (insn) = 1;
  fp_offset -= 8;
}
}
diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
index 9e2f5260900..ef0123b56a6 100644
--- a/gcc/config/bpf/bpf.h
+++ b/gcc/config/bpf/bpf.h
@@ -235,17 +235,9 @@ enum reg_class
 
 / Debugging Info /
 
-/* We cannot support DWARF2 because of the limitations of eBPF.  */
+/* In eBPF it is not possible to unwind frames. Disable CFA.  */
 
-/* elfos.h insists in using DWARF.  Undo that here.  */
-#ifdef DWARF2_DEBUGGING_INFO
-# undef DWARF2_DEBUGGING_INFO
-#endif
-#ifdef PREFERRED_DEBUGGING_TYPE
-# undef PREFERRED_DEBUGGING_TYPE
-#endif
-
-#define DBX_DEBUGGING_INFO
+#define DWARF2_FRAME_INFO 0
 
 / Stack Layout and Calling Conventions.  */
 
-- 
2.25.0.2.g232378479e



[PATCH V2 4/5] CTF/BTF documentation

2021-02-18 Thread Jose E. Marchesi via Gcc-patches
This commit documents the new command line options introduced by the
CTF and BTF debug formats.

2021-02-18  Indu Bhagat  

* doc/invoke.texi: Document the CTF and BTF debug info options.
---
 gcc/doc/invoke.texi | 20 
 1 file changed, 20 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e8baa545eee..31c88c5be6f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -460,6 +460,7 @@ Objective-C and Objective-C++ Dialects}.
 @item Debugging Options
 @xref{Debugging Options,,Options for Debugging Your Program}.
 @gccoptlist{-g  -g@var{level}  -gdwarf  -gdwarf-@var{version} @gol
+-gbtf -gctf  -gctf@var{level} @gol
 -ggdb  -grecord-gcc-switches  -gno-record-gcc-switches @gol
 -gstabs  -gstabs+  -gstrict-dwarf  -gno-strict-dwarf @gol
 -gas-loc-support  -gno-as-loc-support @gol
@@ -9642,6 +9643,25 @@ other DWARF-related options such as
 @option{-fno-dwarf2-cfi-asm}) retain a reference to DWARF Version 2
 in their names, but apply to all currently-supported versions of DWARF.
 
+@item -gbtf
+@opindex gbtf
+Request BTF debug information.
+
+@item -gctf
+@itemx -gctf@var{level}
+@opindex gctf
+Request CTF debug information and use level to specify how much CTF debug
+information should be produced.  If -gctf is specified without a value for
+level, the default level of CTF debug information is 2.
+
+Level 0 produces no CTF debug information at all.  Thus, -gctf0 negates -gctf.
+
+Level 1 produces CTF information for tracebacks only.  This includes callsite
+information, but does not include type information.
+
+Level 2 produces type information for entities (functions, data objects etc.)
+at file-scope or global-scope only.
+
 @item -gstabs
 @opindex gstabs
 Produce debugging information in stabs format (if that is supported),
-- 
2.25.0.2.g232378479e



[PATCH V2 0/5] Support for the CTF and BTF debug formats

2021-02-18 Thread Jose E. Marchesi via Gcc-patches
[Changes from V1:
- Added support for BTF: implementation, documentation and tests.
- The previous code has been refactored to accommodate BTF better.
- The command-line option -gt has been renamed to -gctf.]

Hi people!

Last year we submitted a first patch series introducing support for
the CTF debugging format in GCC [1].  We got a lot of feedback that
prompted us to change the approach used to generate the debug info,
and this patch series is the result of that.

This series also add support for the BTF debug format, which is needed
by the BPF backend (more on this below.)

This implementation works, but there are several points that need
discussion and agreement with the upstream community, as they impact
the way debugging options work.  We are also proposing a way to add
additional debugging formats in the future.  See below for more
details.

Finally, a patch makes the BPF GCC backend to use the DWARF debug
hooks in order to make -gbtf available to it.

[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2019-05/msg01297.html

About CTF
=

CTF is a debugging format designed in order to express C types in a
very compact way.  The key is compactness and simplicity.  For more
information see:

- CTF specification
  http://www.esperi.org.uk/~oranix/ctf/ctf-spec.pdf

- Compact C-Type support in the GNU toolchain (talk + slides)
  https://linuxplumbersconf.org/event/4/contributions/396/

- On type de-duplication in CTF (talk + slides)
  https://linuxplumbersconf.org/event/7/contributions/725/

About BTF
=

BTF is a debugging format, similar to CTF, that is used in the Linux
kernel as the debugging format for BPF programs.  From the kernel
documentation:

"BTF (BPF Type Format) is the metadata format which encodes the debug
 info related to BPF program/map. The name BTF was used initially to
 describe data types. The BTF was later extended to include function
 info for defined subroutines, and line info for source/line
 information."

Supporting BTF in GCC is important because compiled BPF programs
(which GCC supports as a target) require the type information in order
to be loaded and run in diverse kernel versions.  This mechanism is
known as CO-RE (compile-once, run-everywhere) and is described in the
"Update of the BPF support in the GNU Toolchain" talk mentioned below.

The BTF is documented in the Linux kernel documentation tree:
- linux/Documentation/bpf/btf.rst

CTF in the GNU Toolchain


During the last year we have been working in adding support for CTF to
several components of the GNU toolchain:

- binutils support is already upstream.  It supports linking objects
  with CTF information with full type de-duplication.

- GDB support is to be sent upstream very shortly.  It makes the
  debugger capable to use the CTF information whenever available.
  This is useful in cases where DWARF has been stripped out but CTF is
  kept.

- GCC support is being discussed and submitted in this series.

Overview of the Implementation
==

  dwarf2out.c

The enabled debug formats are hooked in dwarf2out_early_finish.

  dwarf2ctf.c

Code that tranform the internal GCC DWARF DIEs into CTF container
structures.  This file is included in dwarf2out.c.

  ctfc.c
  ctfc.h

These two files implement the "CTF container", which is shared
among CTF and BTF, due to the many similarities between both
formats.

  ctfout.c

Code that emits assembler with the .ctf section data, from the CTF
container.

  btfout.c

Code that emits assembler with the .BTF section data, from the CTF
container.

>From debug hooks to debug formats
=

Our first attempt in adding CTF to GCC used the obvious approach of
adding a new set of debug hooks as defined in gcc/debug.h.

During our first interaction with the upstream community we were told
to _not_ use debug hooks, because these are to be obsoleted at some
point.  We were suggested to instead hook our handlers (which
processed type TREE nodes producing CTF types from them) somewhere
else.  So we did.

However at the time we were also facing the need to support BTF, which
is another type-related debug format needed by the BPF GCC backend.
Hooking here and there doesn't sound like such a good idea when it
comes to support several debug formats.

Therefore we thought about how to make GCC support diverse debugging
formats in a better way.  This led to a proposal we tried to discuss
at the GNU Tools Track in LPC2020:

- Update of the BPF support in the GNU Toolchain
  https://linuxplumbersconf.org/event/7/contributions/724/

Basically, the current situation in terms of diversity of debugging
formats in GCC can be summarized in the following like:

 tree --+  +--> dwarf2out
 rtl  --+  +--> dbxout
+--> debug_hooks --+--> vmsdbgout
 backends --+  +--> xcoffout
 lto  --+   

[PATCH V2 1/5] Add new function lang_GNU_GIMPLE

2021-02-18 Thread Jose E. Marchesi via Gcc-patches
2021-02-18  Indu Bhagat  

* langhooks.c (lang_GNU_GIMPLE): New Function.
* langhooks.h: New Prototype.
---
 gcc/langhooks.c | 9 +
 gcc/langhooks.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/gcc/langhooks.c b/gcc/langhooks.c
index 2354386f7b4..0082ee9f350 100644
--- a/gcc/langhooks.c
+++ b/gcc/langhooks.c
@@ -929,3 +929,12 @@ lang_GNU_OBJC (void)
 {
   return strncmp (lang_hooks.name, "GNU Objective-C", 15) == 0;
 }
+
+/* Returns true if the current lang_hooks represents the GNU GIMPLE
+   frontend.  */
+
+bool
+lang_GNU_GIMPLE (void)
+{
+  return strncmp (lang_hooks.name, "GNU GIMPLE", 10) == 0;
+}
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 842e605c439..f9c82eff0cb 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -638,5 +638,6 @@ extern bool lang_GNU_C (void);
 extern bool lang_GNU_CXX (void);
 extern bool lang_GNU_Fortran (void);
 extern bool lang_GNU_OBJC (void);
+extern bool lang_GNU_GIMPLE (void);
 
 #endif /* GCC_LANG_HOOKS_H */
-- 
2.25.0.2.g232378479e



Re: [PATCH] rs6000: Use rldimi for vec init instead of shift + ior

2021-02-18 Thread Segher Boessenkool
Hi!

On Wed, Feb 03, 2021 at 02:37:05PM +0800, Kewen.Lin wrote:
> This patch merges the previously approved one[1] and its relied patch
> made by Segher here[2], it's to make unsigned int vector init go with
> rldimi to merge two integers instead of shift and ior.

> gcc/ChangeLog:
> 
> 2020-02-03  Segher Boessenkool  
>   Kewen Lin  
> 
>   * config/rs6000/rs6000.md (*rotl3_insert_3): Renamed to...
>   (rotl3_insert_3): ...this.
>   (plus_ior_xor): New code_iterator.
>   (define_split for GPR rl*imi): New splitter.
>   * config/rs6000/vsx.md (vsx_init_v4si): Use gen_rotldi3_insert_3
>   for integer merging.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/vec-init-10.c: New test.

Is there a PR you should mention here?

> +/* { dg-final { scan-assembler-not "sldi" } } */
> +/* { dg-final { scan-assembler-not "or" } } */
> +/* { dg-final { scan-assembler-times {\mrldimi\M} 4 } } */

/* { dg-final { scan-assembler-not {\msldi\M} } } */
/* { dg-final { scan-assembler-not {\mor\M} } } */
/* { dg-final { scan-assembler-times {\mrldimi\M} 4 } } */

Okay for trunk with that tweak.  Thanks!


Segher


Re: [PATCH] split, i386, v5: Fix up df uses in i386 splitters [PR99104]

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 18, 2021 at 06:29:06PM +, Richard Sandiford wrote:
> Yeah, agree that copying 1KB isn't great, but I think we should keep any
> optimisations of the copy in recog.[hc].  So IMO it would be better &
> safer to go with a plain copy for now and leave optimising it as a nice
> future improvement.  (Or even better, we could try to remove the
> reliance on a single global.)

I've committed the version with the 1KB copying in the meantime.
If it shows up on the compile time measurably, we can do something with it.

Jakub



Re: [PATCH] rs6000: Use rldimi for vec init instead of shift + ior

2021-02-18 Thread Segher Boessenkool
On Thu, Feb 18, 2021 at 11:58:59AM -0600, will schmidt wrote:
> On Wed, 2021-02-03 at 14:37 +0800, Kewen.Lin via Gcc-patches wrote:
> > This patch merges the previously approved one[1] and its relied patch
> 
> I don't see the review for [1] in the archives.  

I said:
  Can you make a different testcase perhaps?  This patch looks fine
  otherwise.

> > made by Segher here[2], it's to make unsigned int vector init go with
> > rldimi to merge two integers instead of shift and ior.
> > 
> > Segher's patch in [2] is required to make the test case pass,
> > otherwise the costing for new pseudo-to-pseudo copies and the folding
> > with nonzero_bits in combine will make the rl*imi pattern become
> > compact and split into ior and shift unexpectedly.
> > 
> > The commit log of Segher's patch describes it in more details:
> > 
> > "An rl*imi is usually written as an IOR of an ASHIFT or similar, and an
> > AND of a register with a constant mask.  In some cases combine knows
> > that that AND doesn't do anything (because all zero bits in that mask
> > correspond to bits known to be already zero), and then no pattern
> > matches.  This patch adds a define_split for such cases.  It uses
> > nonzero_bits in the condition of the splitter, but does not need it
> > afterwards for the instruction to be recognised.  This is necessary
> > because later passes can see fewer nonzero_bits.
> > 
> > Because it is a splitter, combine will only use it when starting with
> > three insns (or more), even though the result is just one.  This isn't
> > a huge problem in practice, but some possible combinations still won't
> > happen."
> > 
> > Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
> > powerpc64-linux-gnu P8, also SPEC2017 build/run passed on P9.
> > 
> > Is it ok for trunk?
> > 
> > BR,
> > Kewen
> > 
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562407.html
> > [2] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563526.html
> > 
> > 
> > gcc/ChangeLog:
> > 
> > 2020-02-03  Segher Boessenkool  
> > Kewen Lin  
> > 
> > * config/rs6000/rs6000.md (*rotl3_insert_3): Renamed to...
> > (rotl3_insert_3): ...this.
> 
> ok
> 
> > (plus_ior_xor): New code_iterator.
> > (define_split for GPR rl*imi): New splitter.
> 
> As described above, these two changes appear to identical to what was
> posted by Segher in [1].

But with testcase :-)

> +/* { dg-final { scan-assembler-not "or" } } */
> 
> As is, it looks OK to me.  Per other reviews i've gotten, you may
> get a request to wrap the "or" with \m \M .

Right, because it is very easy to end up with "or" as a substring of
something else.

> Some existing cases with
> scan-assembler-not have a leading whitespace/tab qualifier too. 
> i.e.
> /* { dg-final { scan-assembler-not "\[ \t\]or "  } } */

Yeah, but the \m in {\mor\M} will do the same already.

Thanks,


Segher


Re: [PATCH] split, i386, v5: Fix up df uses in i386 splitters [PR99104]

2021-02-18 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> On Wed, Feb 17, 2021 at 10:30:06AM +, Richard Sandiford wrote:
>> Hmm.  I think that just means that the optimisation performed by
>> the copy constructor isn't valid in practice (even if it should be
>> in principle).  Guess this is the curse of manipulating data structures
>> directly rather than through well-defined interfaces :-)
>> 
>> TBH, since the obvious thing didn't work, I think we should keep it
>> simple and take the hit of the full copy.  It seems less error-prone
>> and I'd be surprised if the overhead was noticeable in practice.
>> It also means that if we do manage to optimise the copy in future,
>> all callers would automatically benefit.
>> 
>> I.e. my preference would be to go for your original patch without
>> the recog.[hc] bits.
>
> It is more than 1KB there and back each time we call one of those
> splitter conditions.
> When we know only the recog_data.{operand,insn} part is needed...
> Only copying that would be 240B there, back + clear of 8 bytes.

Yeah, agree that copying 1KB isn't great, but I think we should keep any
optimisations of the copy in recog.[hc].  So IMO it would be better &
safer to go with a plain copy for now and leave optimising it as a nice
future improvement.  (Or even better, we could try to remove the
reliance on a single global.)

Thanks,
Richard

> Anyway, don't feel strongly about that.
>
>   Jakub


Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 18, 2021 at 07:00:52PM +0100, Jakub Jelinek wrote:
> > The size of the VLA is zero regardless of its bound and accessing
> > it is invalid so the warning is expected.
> 
> Yes, some warning, but not the one you are giving, that is nonsensical.
> Array subscript 0 is not outside of array bounds of struct S[n], a[anything]
> will still be zero sized and will not be problematic.

Scalar objects with zero size will always have that zero size,
similarly arrays thereof (constant or variable sized).
So the warning should be simply if eltsize == 0,
check if the access is before or after the object and complain
that a memory access is done before or after a zero sized object %qD.

Jakub



Re: [PATCH] rs6000: Convert the vector element register to SImode [PR98914]

2021-02-18 Thread Segher Boessenkool
Hi!

On Thu, Feb 18, 2021 at 10:46:11AM -0600, will schmidt wrote:
> On Wed, 2021-02-03 at 03:01 -0600, Xionghu Luo via Gcc-patches wrote:
> > v[k] will also be expanded to IFN VEC_SET if k is long type when
> > built
> > with -Og.  -O0 didn't exposed the issue due to v is TREE_ADDRESSABLE,
> > -O1 and above also didn't capture it because of v[k] is not optimized
> > to
> > VIEW_CONVERT_EXPR(v)[k_1].
> > vec_insert defines the element argument type to be signed int by
> > ELFv2
> > ABI, so convert it to SImode if it wasn't for Power target
> > requirements.
> 
> The intro paragraph seems to start mid sentence.  Did something get cut
> off?
> The description here is specific to the reported testcase failure. 
> This should describe the patch behavior instead.  Something like 
> "When
> expanding a vector with a variable rtx, the rtx type needs to be SI"
> ...

mode, not type :-)

> > gcc/ChangeLog:
> 
> 
> Reference "PR target/98914" somewhere in here.

Exactly like that, and in *both* changelogs (gcc/ and gcc/testsuite/),
before all entries.

> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -7000,8 +7000,6 @@ rs6000_expand_vector_set_var_p9 (rtx target,
> > rtx val, rtx idx)
> > 
> >gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
> > 
> > -  gcc_assert (GET_MODE (idx) == E_SImode);
> > -
> >machine_mode inner_mode = GET_MODE (val);
> > 
> >rtx tmp = gen_reg_rtx (GET_MODE (idx));
> 
> 
> This needs a changelog blurb.

Yup...  Just "Remove assert." will do.

> > @@ -7144,7 +7140,7 @@ rs6000_expand_vector_set (rtx target, rtx val,
> > rtx elt_rtx)
> >machine_mode mode = GET_MODE (target);
> >machine_mode inner_mode = GET_MODE_INNER (mode);
> >rtx reg = gen_reg_rtx (mode);
> > -  rtx mask, mem, x;
> > +  rtx mask, mem, x, elt_si;
> >int width = GET_MODE_SIZE (inner_mode);
> >int i;
> > 
> > @@ -7154,16 +7150,23 @@ rs6000_expand_vector_set (rtx target, rtx
> > val, rtx elt_rtx)
> >  {
> >if (!CONST_INT_P (elt_rtx))
> > {
> > + /* elt_rtx should be SImode from ELFv2 ABI.  */
> > + elt_si = gen_reg_rtx (E_SImode);

Declare it only here, not earlier please.

> > + if (GET_MODE (elt_rtx) != E_SImode)
> > +   convert_move (elt_si, elt_rtx, 0);
> > + else
> > +   elt_si = elt_rtx;

Just always do the convert_move?  So:

  rtx elt_si = gen_reg_rtx (SImode); // no need for that E_ stuff
  convert_move (elt_si, elt_rtx, 0);

This code runs at expand time, so we have many later passes that all can
get rid of that extra move no problem.

> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr98914.c
> > @@ -0,0 +1,11 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target powerpc_p8vector_ok } */
> > +/* { dg-options "-Og -mvsx" } */

Please use
  -Og -mdejagnu-cpu=power8
instead.  Was this tested on BE configs?  Please make sure you do.

Please fix (also all Will's other comments, thanks as always!) and
repost (after testing again, of course).  Thanks!


Segher


Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 18, 2021 at 09:24:28AM -0700, Martin Sebor via Gcc-patches wrote:
> > Consider a different (GNU C, in C++ struct S has non-zero size) testcase:
> > void f (void*);
> > 
> > void g (int n)
> > {
> >struct S {} a[n];
> >((int*)a)[0] = 0;
> >f (a);
> > }
> > yyy.c:6:12: warning: array subscript 0 is outside array bounds of ‘struct 
> > S[]’ [-Warray-bounds]
> >  6 |   ((int*)a)[0] = 0;
> >|   ~^~~
> > yyy.c:5:15: note: while referencing ‘a’
> >  5 |   struct S {} a[n];
> >|   ^
> > I bet that means you are really complaining about the VLA bound rather than
> > the [0] bound even in the first case, because the wording is otherwise the
> > same.  And for g (154) the array subscript 0 is certainly not a problem,
> > so the warning would need to be worded differently in that case.
> 
> I'm not sure I follow what you're saying here either.  The vrp1 dump
> has this:
> 
> void g (int n)
> {
>   struct S a[0:D.1951];
> 
>[local count: 1073741824]:
>   MEM[(int *)] = 0;
>   f ();
>   a ={v} {CLOBBER};
>   return;
> 
> }
> 
> The size of the VLA is zero regardless of its bound and accessing
> it is invalid so the warning is expected.

Yes, some warning, but not the one you are giving, that is nonsensical.
Array subscript 0 is not outside of array bounds of struct S[n], a[anything]
will still be zero sized and will not be problematic.

> VLAs of zero-lengthg arrays are without a doubt rare, pathological
> cases.  We could special case the warning for them and print
> a different message but  I see very little value in complicating
> the code just for them.  Do you consider this special casing
> a requirement for approving the fix for the ICE?

Yes.

Jakub



Re: [PATCH] rs6000: Use rldimi for vec init instead of shift + ior

2021-02-18 Thread will schmidt via Gcc-patches
On Wed, 2021-02-03 at 14:37 +0800, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 

Hi,


> This patch merges the previously approved one[1] and its relied patch


I don't see the review for [1] in the archives.  


> made by Segher here[2], it's to make unsigned int vector init go with
> rldimi to merge two integers instead of shift and ior.
> 
> Segher's patch in [2] is required to make the test case pass,
> otherwise the costing for new pseudo-to-pseudo copies and the folding
> with nonzero_bits in combine will make the rl*imi pattern become
> compact and split into ior and shift unexpectedly.
> 
> The commit log of Segher's patch describes it in more details:
> 
> "An rl*imi is usually written as an IOR of an ASHIFT or similar, and an
> AND of a register with a constant mask.  In some cases combine knows
> that that AND doesn't do anything (because all zero bits in that mask
> correspond to bits known to be already zero), and then no pattern
> matches.  This patch adds a define_split for such cases.  It uses
> nonzero_bits in the condition of the splitter, but does not need it
> afterwards for the instruction to be recognised.  This is necessary
> because later passes can see fewer nonzero_bits.
> 
> Because it is a splitter, combine will only use it when starting with
> three insns (or more), even though the result is just one.  This isn't
> a huge problem in practice, but some possible combinations still won't
> happen."
> 
> Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
> powerpc64-linux-gnu P8, also SPEC2017 build/run passed on P9.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562407.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563526.html
> 
> 
> gcc/ChangeLog:
> 
> 2020-02-03  Segher Boessenkool  
>   Kewen Lin  
> 
>   * config/rs6000/rs6000.md (*rotl3_insert_3): Renamed to...
>   (rotl3_insert_3): ...this.

ok

>   (plus_ior_xor): New code_iterator.
>   (define_split for GPR rl*imi): New splitter.

As described above, these two changes appear to identical to what was
posted by Segher in [1].

(
>   * config/rs6000/vsx.md (vsx_init_v4si): Use gen_rotldi3_insert_3
>   for integer merging.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/vec-init-10.c: New test.

+/* { dg-final { scan-assembler-not "or" } } */

As is, it looks OK to me.  Per other reviews i've gotten, you may
get a request to wrap the "or" with \m \M .
Some existing cases with
scan-assembler-not have a leading whitespace/tab qualifier too. 
i.e.
/* { dg-final { scan-assembler-not "\[ \t\]or "  } } */

Thanks,
-Will

> 
> -
> 



[RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-02-18 Thread Qing Zhao via Gcc-patches
(CC’ing Kees Cook on this topic)

Hi, 

This is the first version of the complete patch for the new security feature 
for GCC: 

Initialize automatic variables with new first class option 
-ftrivial-auto-var-init=[uninitialized|pattern|zero] 
and  a new variable attribute “uninitialized” to exclude some variables from 
automatical initialization to
Control runtime overhead.



'-ftrivial-auto-var-init=CHOICE'
 Initialize automatic variables with either a pattern or with zeroes
 to increase program security by preventing uninitialized memory
 disclosure and use.

 The three values of CHOICE are:

* 'uninitialized' doesn't initialize any automatic variables.
  This is C and C++'s default.

* 'pattern' Initialize automatic variables with values which
  will likely transform logic bugs into crashes down the line,
  are easily recognized in a crash dump and without being values
  that programmers can rely on for useful program semantics.
  The values used for pattern initialization might be changed in
  the future.

* 'zero' Initialize automatic variables with zeroes.

 The default is 'uninitialized'.

 You can control this behavior for a specific variable by using the
 variable attribute 'uninitialized' (*note Variable Attributes::).

'uninitialized'
 This attribute, attached to a variable with automatic storage,
 means that the variable should not be automatically initialized by
 the compiler when the option '-ftrivial-auto-var-init' presents.

 With the option '-ftrivial-auto-var-init', all the automatic
 variables that do not have explicit initializers will be
 initialized by the compiler.  These additional compiler
 initializations might incur run-time overhead, sometimes
 dramatically.  This attribute can be used to mark some variables to
 be excluded from such automatical initialization in order to reduce
 runtime overhead.

 This attribute has no effect when the option
 '-ftrivial-auto-var-init' does not present.


**This implementation has the following features:

1.  A new internal function .DEFERRED_INIT is introduced;
2.  During gimplification phase, calls to .DEFERRED_INIT will be inserted for 
the initialization;
3.  During expand phase, calls to .DEFERRED_INIT are expanded to real “zero” or 
“pattern” initialization;
4.  In order to maintain the -Wuninitialized warning,  uninitialized warning 
phase is adjusted to specially
 handling the calls to .DEFERRED_INIT;
5.  In order to reduce the stack usage per strength reduction transformation, 
the strength reduction phase
 Is adjusted to specially handle the calls to .DEFERRED_INIT;
6.  Point to analysis also is adjusted to specially handle the calls to 
.DEFERRED_INIT;
7.  VLA auto variables are initialized specially, for “zero” initialization, 
add a call to MEMSET to initialize it;
 for “pattern” initialization, add a loop of calls to MEMCPY to initialize 
it;
8.  For “pattern” initialization, used the similar patten as LLVM 
implementation for integers and pointers:
+  /* The following value is a guaranteed unmappable pointer value and has a
+ repeated byte-pattern which makes it easier to synthesize.  We use it for
+ pointers as well as integers so that aggregates are likely to be
+ initialized with this repeated value.  */
+  uint64_t largevalue = 0xull;
+  /* For 32-bit platforms it's a bit trickier because, across systems, only the
+ zero page can reasonably be expected to be unmapped, and even then we need
+ a very low address.  We use a smaller value, and that value sadly doesn't
+ have a repeated byte-pattern.  We don't use it for integers.  */
+  uint32_t smallvalue = 0x00AA;

Use quiet NAN for real type. 

**testing cases:

1. In c-c++-common, verification at gimple level for all types, all combination 
of options, new attributes;
2. In gcc.target, (for x86 and aarch64) verification at RTL or assembly level 
for all types, all combination of options, new attributes;
3. In gcc.dg and g++.dg, verification of the new option still keep the 
-Wuninitialized warnings.  Copy the existing uninit-*.c to
   auto-init-uninit-*.c, add the new option -ftrivial-auto-var-init=zero to it.


**NOTE for the review:

In this version implementation, I still keep the implementation to the 
following two approaches for your references:

A. Adding real initialization during gimplification, not maintain the 
uninitialized warnings.
D. Adding .DEFFERED_INIT during gimplification, expand the .DEFFERED_INIT 
during expand to
real initialization. Adjusting uninitialized pass with the new refs with 
“.DEFFERED_INIT”

A is the approach that I will delete in the next version. D will be the one we 
choose. 

(The reason I kept both implementation is, please check the implementation for 
both A and D 
 since the performance, code size, stack 

Re: [PATCH 1/1] libiberty(argv.c): Fix memory leak in expandargv.

2021-02-18 Thread Martin Sebor via Gcc-patches

On 2/18/21 3:58 AM, Ayush Mittal via Gcc-patches wrote:

Dynamic memory referenced by 'buffer' was allocated using xmalloc but fails to 
free it
when jump to 'error' label.

Issue as per static analysis tool.


The change looks okay to me although I can't approve it.  Since GCC
is a regression fixing stage, unless the leak is a recent regression
the fix is (strictly speaking) out of scope for GCC 11.  Either
a libiberty or a global maintainer might be comfortable approving it 
regardless.


That said, rather than adding another call to free, I would suggest
to consider initializing buffer to null and moving the existing call
to free the buffer under the error: label.

Martin



Signed-off-by: Ayush Mittal 
Signed-off-by: Maninder Singh 
---
  libiberty/ChangeLog | 4 
  libiberty/argv.c| 5 -
  2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index e472553..96cacba 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,7 @@
+2021-02-18  Ayush Mittal  
+
+   * argv.c (expandargv): Fix memory leak for buffer allocated to read 
file contents.
+
  2021-02-01  Martin Sebor  
  
  	* dyn-string.c (dyn_string_insert_cstr): Use memcpy instead of strncpy

diff --git a/libiberty/argv.c b/libiberty/argv.c
index cd97f90..7199c7d 100644
--- a/libiberty/argv.c
+++ b/libiberty/argv.c
@@ -442,7 +442,10 @@ expandargv (int *argcp, char ***argvp)
 due to CR/LF->CR translation when reading text files.
 That does not in-and-of itself indicate failure.  */
  && ferror (f))
-   goto error;
+   {
+ free(buffer);
+ goto error;
+   }
/* Add a NUL terminator.  */
buffer[len] = '\0';
/* If the file is empty or contains only whitespace, buildargv would





Re: [PATCH] ipa: Fix resolving speculations through cgraph_edge::set_call_stmt (PR 98078)

2021-02-18 Thread Martin Jambor
Hi,

ping, please.

Martin


On Thu, Jan 21 2021, Martin Jambor wrote:
> Hi,
>
> in the PR 98078 testcase, speculative call-graph edges which were
> created by IPA-CP are confirmed during inlining but
> cgraph_edge::set_call_stmt does not take it very well.
>
> The function enters the update_speculative branch and updates the edges
> in the speculation bundle separately (by a recursive call), but when it
> processes the first direct edge, most of the bundle actually ceases to
> exist because it is devirtualized.  It nevertheless goes on to attempt
> to update the indirect edge (that has just been removed), which
> surprisingly gets as far as adding the edge to the call_site_hash, the
> same devirtualized edge for the second time, and that triggers an
> assert.
>
> Fixed by this patch which makes the function aware that it is about to
> resolve a speculation and do so instead of updating components of
> speculation.  Also, it does so before dealing with the hash because
> the speculation resolution code needs the hash to point to the first
> speculative direct edge and also cleans the hash up by calling
> update_call_stmt_hash_for_removing_direct_edge.
>
> I don't have a testcase, at least not yet, the one in BZ does not link
> when it does not ICE.  I can try to find some time to make it link it is
> deemed very important.
>
> Bootstrapped and tested on x86_64-linux, also profile-LTO-bootstrapped
> on the same system.  OK for trunk?  What about gcc10, where we cannot
> trigger it but I suppose the bug is there?
>
> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2021-01-20  Martin Jambor  
>
>   PR ipa/98078
>   * cgraph.c (cgraph_edge::set_call_stmt): Do not update all
>   corresponding speculative edges if we are about to resolve
>   sepculation.  Make edge direct (and so resolve speculations) before
>   removing it from call_site_hash.
>   (cgraph_edge::make_direct): Relax the initial assert to allow calling
>   the function on speculative direct edges.
> ---
>  gcc/cgraph.c | 37 ++---
>  1 file changed, 22 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index db038306e19..80140757d16 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -789,9 +789,22 @@ cgraph_edge::set_call_stmt (cgraph_edge *e, gcall 
> *new_stmt,
>  {
>tree decl;
>  
> +  cgraph_node *new_direct_callee = NULL;
> +  if ((e->indirect_unknown_callee || e->speculative)
> +  && (decl = gimple_call_fndecl (new_stmt)))
> +{
> +  /* Constant propagation and especially inlining can turn an indirect 
> call
> +  into a direct one.  */
> +  new_direct_callee = cgraph_node::get (decl);
> +  gcc_checking_assert (new_direct_callee);
> +}
> +
>/* Speculative edges has three component, update all of them
>   when asked to.  */
> -  if (update_speculative && e->speculative)
> +  if (update_speculative && e->speculative
> +  /* If we are about to resolve the speculation by calling make_direct
> +  below, do not bother going over all the speculative edges now.  */
> +  && !new_direct_callee)
>  {
>cgraph_edge *direct, *indirect, *next;
>ipa_ref *ref;
> @@ -821,6 +834,9 @@ cgraph_edge::set_call_stmt (cgraph_edge *e, gcall 
> *new_stmt,
>return e_indirect ? indirect : direct;
>  }
>  
> +  if (new_direct_callee)
> +e = make_direct (e, new_direct_callee);
> +
>/* Only direct speculative edges go to call_site_hash.  */
>if (e->caller->call_site_hash
>&& (!e->speculative || !e->indirect_unknown_callee)
> @@ -831,16 +847,6 @@ cgraph_edge::set_call_stmt (cgraph_edge *e, gcall 
> *new_stmt,
>(e->call_stmt, cgraph_edge_hasher::hash (e->call_stmt));
>  
>e->call_stmt = new_stmt;
> -  if (e->indirect_unknown_callee
> -  && (decl = gimple_call_fndecl (new_stmt)))
> -{
> -  /* Constant propagation (and possibly also inlining?) can turn an
> -  indirect call into a direct one.  */
> -  cgraph_node *new_callee = cgraph_node::get (decl);
> -
> -  gcc_checking_assert (new_callee);
> -  e = make_direct (e, new_callee);
> -}
>  
>function *fun = DECL_STRUCT_FUNCTION (e->caller->decl);
>e->can_throw_external = stmt_can_throw_external (fun, new_stmt);
> @@ -1279,14 +1285,15 @@ cgraph_edge::speculative_call_for_target (cgraph_node 
> *target)
>return NULL;
>  }
>  
> -/* Make an indirect edge with an unknown callee an ordinary edge leading to
> -   CALLEE.  Speculations can be resolved in the process and EDGE can be 
> removed
> -   and deallocated.  Return the edge that now represents the call.  */
> +/* Make an indirect or speculative EDGE with an unknown callee an ordinary 
> edge
> +   leading to CALLEE.  Speculations can be resolved in the process and EDGE 
> can
> +   be removed and deallocated.  Return the edge that now represents the
> +   call.  */
>  
>  cgraph_edge *
>  cgraph_edge::make_direct (cgraph_edge *edge, 

Re: Patch for PR analyzer/98797

2021-02-18 Thread brian.sobulefsky via Gcc-patches
Hi David,

Please find my revised patch attached. You should find all items addressed. I
have bootstrapped and rerun the testsuite (gcc and g++ only).  Also, you
should have seen my legal paperwork go in.

1) The commit message has been updated to reflect the nomenclature in the
testsuite, as well as include the required parentheses.

2) The commit message has been updated to reflect the other changes you
requested below.

3) My macros and inline code for "bit_byte_conversion" have been replaced with
the subroutine you requested. It works by modulo and division rather than
bitwise-and and bitshift, as requested. Without any specific instructions, I
put the routine in analyzer.cc and the prototype in analyzer.h, as it provides
a generic helper service. You can move it anywhere else you might want it.

4) I added the check for the field to be sizeof (char). This is actually a
cause to reject getting a char from a string constant for any case, so I
added the check near the outside of the nested ifs. Also, as a side note,
it looks to me like your example:

void test_4 ()
{
  const char *s = "ABCD";
  __analyzer_eval (*(short *)s == 'A');
}

is unrelated to my patch, as I remember a char pointer to a string constant is
handled as a special case elsewhere and does not end up in
maybe_fold_sub_svalue.

5) I updated the modification to get_offset_region so that my new code is only
run in the case of a zerop. This was really the failure issue anyway. I still
have it building a new constant svalue of zero, because I do not know for sure
if the zero pointer type tree and zero integer type tree are different enough to
cause confusion down the line when getting the character.

6) My new subroutine get_parent_cast_region takes any svalues as an argument
and does its own checks for svalue_kind correctness, returning NULL otherwise.
My reason for this is so get_binding_recursive would remain absolutely clean,
the way its recursive call is. Basically, get_binding_recursive can just call
get_parent_cast_region as an assignment within a conditional, just like
the recursion step is, and the returned NULLs, whether by incorrect type or by
not finding a binding for the correct type, causes it to bypass returning there.

7) The if guards were updated as requested.

8) I used your get_field_at_bit_offset routine as requested. To stress again,
this was a static function to region.cc. I had to add a forward declaration
(I again used analyzer.h because it is a "generic helper" function, but that
is easy for you to change if you do not like it there) *and* I had to update
your definition to extern rather than static. Otherwise, it seems to work.

9) The compound conditionals should all be to your liking now.

10) After all the back and forth, I think I am using "approved" helper
functions and macros rather than accessing the tree fields directly. It seems
they all work correctly.

11) All preexisting deja-gnu tests are on the original lines they were on
before. All new tests are appended. Blank lines are left where the dg-bogus
calls were.


Thank you,
Briancommit 7f0e3685900b527b64b81c3b44af36fd9e99bea3
Author: Brian Sobulefsky 
Date:   Tue Feb 9 16:25:43 2021 -0800

Address the bug found in test file gcc/testsuite/gcc.dg/analyzer/casts-1.c, as
recorded by the XFAIL, and some simpler and more complex versions found during
the investigation of it. PR analyzer/98797 was opened for these bugs.

The simplest manifestation of this bug can be replicated with:

char arr[] = {'A', 'B', 'C', 'D'};
char *pt = ar;
__analyzer_eval(*(pt + 0) == 'A');
__analyzer_eval(*(pt + 1) == 'B');
//etc

The result of the eval call is "UNKNOWN" because the analyzer is unable to
determine the value at the pointer. Note that array access (such as arr[0]) is
correctly handled. The code responsible for this is in file
region-model-manager.cc, function region_model_manager::maybe_fold_sub_svalue.
The relevant section is commented /* Handle getting individual chars from
STRING_CST */. This section only had a case for an element_region. A case
needed to be added for an offset_region.

Additionally, when the offset was 0, such as in *pt or *(pt + 0), the function
region_model_manager::get_offset_region was failing to make the needed offset
region at all. This was due to the test for a constant 0 pointer that was then
returning get_cast_region. A special case is needed for when PARENT is of type
array_type whose type matches TYPE. In this case, get_offset_region is allowed
to continue to normal conclusion.

The original bug noted in gcc/testsuite/gcc.dg/analyzer/casts-1.c was for the
case:

struct s1
{
  char a;
  char b;
  char c;
  char d;
};

struct s2
{
  char arr[4];
};

struct s2 x = {{'A', 'B', 'C', 'D'}};
struct s1 *p = (struct s1 *)
__analyzer_eval (p->a == 'A');
//etc

Re: [PATCH] rs6000: Convert the vector element register to SImode [PR98914]

2021-02-18 Thread will schmidt via Gcc-patches
On Wed, 2021-02-03 at 03:01 -0600, Xionghu Luo via Gcc-patches wrote:

Hi,

> v[k] will also be expanded to IFN VEC_SET if k is long type when
> built
> with -Og.  -O0 didn't exposed the issue due to v is TREE_ADDRESSABLE,
> -O1 and above also didn't capture it because of v[k] is not optimized
> to
> VIEW_CONVERT_EXPR(v)[k_1].
> vec_insert defines the element argument type to be signed int by
> ELFv2
> ABI, so convert it to SImode if it wasn't for Power target
> requirements.

The intro paragraph seems to start mid sentence.  Did something get cut
off?
The description here is specific to the reported testcase failure. 
This should describe the patch behavior instead.  Something like 
"When
expanding a vector with a variable rtx, the rtx type needs to be SI"
...
(I defer to any other suggestions of better or improved wording).


> 
> gcc/ChangeLog:


Reference "PR target/98914" somewhere in here.


> 
> 2021-02-03  Xionghu Luo  
> 
>   * config/rs6000/rs6000.c (rs6000_expand_vector_set): Convert
>   elt_rtx to SImode if it wasn't.

s/if it wasn't//


> 
> gcc/testsuite/ChangeLog:
> 
> 2021-02-03  Xionghu Luo  
> 
>   * gcc.target/powerpc/pr98914.c: New test.
> ---
>  gcc/config/rs6000/rs6000.c | 17 ++---
>  gcc/testsuite/gcc.target/powerpc/pr98914.c | 11 +++
>  2 files changed, 21 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr98914.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index ec068c58aa5..9f7f8da56c6 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -7000,8 +7000,6 @@ rs6000_expand_vector_set_var_p9 (rtx target,
> rtx val, rtx idx)
> 
>gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
> 
> -  gcc_assert (GET_MODE (idx) == E_SImode);
> -
>machine_mode inner_mode = GET_MODE (val);
> 
>rtx tmp = gen_reg_rtx (GET_MODE (idx));


This needs a changelog blurb.


> @@ -7047,8 +7045,6 @@ rs6000_expand_vector_set_var_p8 (rtx target,
> rtx val, rtx idx)
> 
>gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
> 
> -  gcc_assert (GET_MODE (idx) == E_SImode);
> -
>machine_mode inner_mode = GET_MODE (val);
>HOST_WIDE_INT mode_mask = GET_MODE_MASK (inner_mode);


Same.

> 
> @@ -7144,7 +7140,7 @@ rs6000_expand_vector_set (rtx target, rtx val,
> rtx elt_rtx)
>machine_mode mode = GET_MODE (target);
>machine_mode inner_mode = GET_MODE_INNER (mode);
>rtx reg = gen_reg_rtx (mode);
> -  rtx mask, mem, x;
> +  rtx mask, mem, x, elt_si;
>int width = GET_MODE_SIZE (inner_mode);
>int i;
> 
> @@ -7154,16 +7150,23 @@ rs6000_expand_vector_set (rtx target, rtx
> val, rtx elt_rtx)
>  {
>if (!CONST_INT_P (elt_rtx))
>   {
> +   /* elt_rtx should be SImode from ELFv2 ABI.  */
> +   elt_si = gen_reg_rtx (E_SImode);
> +   if (GET_MODE (elt_rtx) != E_SImode)
> + convert_move (elt_si, elt_rtx, 0);
> +   else
> + elt_si = elt_rtx;
> +

ok.



> /* For V2DI/V2DF, could leverage the P9 version to generate
> xxpermdi
>when elt_rtx is variable.  */
> if ((TARGET_P9_VECTOR && TARGET_POWERPC64) || width == 8)
>   {
> -   rs6000_expand_vector_set_var_p9 (target, val, elt_rtx);
> +   rs6000_expand_vector_set_var_p9 (target, val, elt_si);
> return;
>   }
> else if (TARGET_P8_VECTOR && TARGET_DIRECT_MOVE_64BIT)
>   {
> -   rs6000_expand_vector_set_var_p8 (target, val, elt_rtx);
> +   rs6000_expand_vector_set_var_p8 (target, val, elt_si);
> return;
>   }
>   }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr98914.c
> b/gcc/testsuite/gcc.target/powerpc/pr98914.c
> new file mode 100644
> index 000..e4d78e3e6b3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr98914.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-Og -mvsx" } */
> +
> +vector int
> +foo (vector int v)
> +{
> +  for (long k = 0; k < 1; ++k)
> +v[k] = 0;
> +  return v;
> +}
ok

thanks
-Will



RE: [AArch64] PR98657: Fix vec_duplicate creation in SVE's 3

2021-02-18 Thread Kyrylo Tkachov via Gcc-patches
Hi Andre,

> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 17 February 2021 14:17
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Sandiford
> 
> Subject: [AArch64] PR98657: Fix vec_duplicate creation in SVE's
> 3
> 
> Hi,
> 
> This patch prevents generating a vec_duplicate with illegal predicate.
> 
> Regression tested on aarch64-linux-gnu.
> 
> OK for trunk?
> 
> gcc/ChangeLog:
> 2021-02-17  Andre Vieira  
> 
>      PR target/98657
>      * config/aarch64/aarch64-sve.md: Use 'expand_vector_broadcast'
> to emit vec_duplicate's
>      in '3' pattern.

This entry should be
* config/aarch64/aarch64-sve.md (3'): Use 
expand_vector_broadcast

Ok with the ChangeLog fixed.
Thanks,
Kyrill

> 
> gcc/testsuite/ChangeLog:
> 2021-02-17  Andre Vieira  
> 
>      PR target/98657
>      * gcc.target/aarch64/sve/pr98657.c: New test.


Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Jeff Law via Gcc-patches



On 2/18/21 3:19 AM, Richard Biener via Binutils wrote:
> On Wed, Feb 17, 2021 at 12:25 PM Jozef Lawrynowicz
>  wrote:
>> On Tue, Feb 16, 2021 at 05:45:47PM +0100, Jakub Jelinek via Gcc-patches 
>> wrote:
>>> On Mon, Feb 15, 2021 at 02:35:07PM -0800, H.J. Lu via Gcc-patches wrote:
 When building Linux kernel, ld in bninutils 2.36 with GCC 11 generates
 thousands of

 ld: warning: orphan section `.data.event_initcall_finish' from 
 `init/main.o' being placed in section `.data.event_initcall_finish'
 ld: warning: orphan section `.data.event_initcall_start' from 
 `init/main.o' being placed in section `.data.event_initcall_start'
 ld: warning: orphan section `.data.event_initcall_level' from 
 `init/main.o' being placed in section `.data.event_initcall_level'

 Since these sections are marked with SHF_GNU_RETAIN, they are placed in
 separate sections.  They become orphan sections since they aren't expected
 in the Linux kernel linker script. But orphan sections normally don't work
 well with the Linux kernel linker script and the resulting kernel crashed.

 Add -fgnu-retain to disable SHF_GNU_RETAIN for Linux kernel build with
 -fno-gnu-retain.
>>> I'd say this shows that the changing of meaning of "used" attribute wasn't
>>> really a good idea, the Linux kernel certainly won't be the only problem
>>> and people use "used" attribute for many reasons and don't necessarily want
>>> the different sections or different behavior of those sections if they use
>>> it.
>>>
>>> So, can't we instead:
>>> 1) restore the old behavior of "used" by default
>>> 2) add "retain" attribute that implies the SHF_GNU_RETAIN stuff and fails
>>>if the configured assembler/linker doesn't support those
>>> 3) add a non-default option through which one could opt in for "used"
>>>attribute to imply retain attribute too for projects that want that?
>>>
>> In previous discussions, it seemed to me that there was general support
>> for the "used" attribute implying SHF_GNU_RETAIN and saving symbols from
>> linker garbage collection[1]. Of course, the current implementation
>> results in undesirable behavior - the thought that all linker scripts
>> not supporting uniquely named sections would need to be updated is quite
>> alarming.
>>
>> It's a shame that all this extra complication is required, just because
>> we cannot have a ".retain ", directive.
>>
>> My preferred vision for this functionality was:
>>   - SHF_GNU_RETAIN section flag indicates a section should be saved
>> from linker garbage collection.
>>   - ".retain " assembler directive applies SHF_GNU_RETAIN
>> to the section containing .
>>   - GCC "used" attribute emits a ".retain " directive for
>> the symbol declaration is is applied to.  Applying the "used"
>> attribute to a symbol declaration does not change the structure of
>> the object file, beyond applying SHF_GNU_RETAIN to the section
>> containing that symbol.
>>
>> H.J. vetoed ".retain "[2], since it fails the predicate:
>>   Assembler directive referring to a symbol must operate on the symbol
>>   table.
>>
>> So because of this veto, we have compromised on "code quality" so far,
>> since any linker script not supporting named sections, used to link an
>> application containing "used" symbols (now put into their own section) has
>> potential undefined behavior.
>>
>> With the new proposed change to use a "retain" attribute, we now
>> compromise on functionality, since the "used" attribute saving symbols
>> from linker garbage collection is disabled by default.
>>
>> All because we cannot introduce an exception to the above predicate.
>>
>> I would like to re-open discussions on whether a ".retain 
>> directive is acceptable. This would enable "used" to imply
>> SHF_GNU_RETAIN, without changing the structure of the object file,
>> thereby allowing the new functionality to be used without linker script
>> modifications.
>>
>> If a Binutils global maintainer could side one way or the other on
>> ".retain " (an opinion was never given by the Binutils
>> maintainers when we had the discussions on that mailing list, although
>> Jeff did support .retain in the GCC discussions[3]), then it will be
>> easier to proceed knowing definitively that ".retain" is rejected and we
>> have no choice but to go this route of having a separate "retain"
>> attribute.
> So I then propose, for GCC 11, to rip out / disable any use of SHF_GNU_RETAIN,
> effectively restoring GCC 10 beahvior and re-consider for GCC 12.
FWIW concerns WRT SHF_GNU_RETAIN are the primary reason why we didn't
include binutils-2.36 into Fedora 33.  If we included binutils-2.36,
then GCC would have started using SHF_GNU_RETAIN and we were concerned
that there could easily be fallout.

Or to put it another way, I'd fully support Richi's proposal -- I fear
that folks using the combination of gcc-11 + binutils-2.36 are likely
going to stumble over real problems in this 

Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-18 Thread Martin Sebor via Gcc-patches

On 2/18/21 2:30 AM, Jakub Jelinek wrote:

On Wed, Feb 17, 2021 at 02:11:43PM -0700, Martin Sebor wrote:

On 2/17/21 1:47 PM, Jakub Jelinek wrote:

On Wed, Feb 17, 2021 at 01:27:55PM -0700, Martin Sebor wrote:

Not in this patch, but I've looked at what maxobjsize is and wonder why
the roundtrip tree -> HOST_WIDE_INT -> offset_int:
const offset_int maxobjsize = tree_to_shwi (max_object_size ());
Can't it be
const offset_int maxobjsize = wi::to_offset (max_object_size ());
?


Yes, that's what it is elsewhere so this should do the same.  I've
changed it.


Ok.


Doesn't arrbounds[1] == 0 mean there will be warnings for any accesses?
For eltsize == 0 I think you shouldn't warn when nelts isn't known,
instead of always warning, arr[1] will have the same address as
arr[0] ...


This branch is entered for VLAs of zero-length arrays where we want
to warn, like this:

void f (void*);

void g (int n)
{
   int a[n][0];
   ((int*)a)[0] = 0;
   f (a);
}


For this you do want to warn, but not the way you warn with the patch:
xxx.c: In function ‘g’:
xxx.c:6:12: warning: array subscript 0 is outside array bounds of 
‘int[][0]’ [-Warray-bounds]
 6 |   ((int*)a)[0] = 0;
   |   ~^~~
xxx.c:5:7: note: while referencing ‘a’
 5 |   int a[n][0];
   |   ^

The message doesn't make it clear which of the two subscripts is out of
bounds, yes, for [0] it would be outside of bounds, but for the VLA index
no index < n would be outside of bounds.


There's only one subscript in '((int*)a)[0] = 0;' and in the vrp1 IL
and that's the one the warning mentions.  The size of the VLA is zero
so it doesn't matter what the index is, all accesses to it are out of
bounds.  I see nothing wrong here.



Consider a different (GNU C, in C++ struct S has non-zero size) testcase:
void f (void*);

void g (int n)
{
   struct S {} a[n];
   ((int*)a)[0] = 0;
   f (a);
}
yyy.c:6:12: warning: array subscript 0 is outside array bounds of ‘struct 
S[]’ [-Warray-bounds]
 6 |   ((int*)a)[0] = 0;
   |   ~^~~
yyy.c:5:15: note: while referencing ‘a’
 5 |   struct S {} a[n];
   |   ^
I bet that means you are really complaining about the VLA bound rather than
the [0] bound even in the first case, because the wording is otherwise the
same.  And for g (154) the array subscript 0 is certainly not a problem,
so the warning would need to be worded differently in that case.


I'm not sure I follow what you're saying here either.  The vrp1 dump
has this:

void g (int n)
{
  struct S a[0:D.1951];

   [local count: 1073741824]:
  MEM[(int *)] = 0;
  f ();
  a ={v} {CLOBBER};
  return;

}

The size of the VLA is zero regardless of its bound and accessing
it is invalid so the warning is expected.

VLAs of zero-lengthg arrays are without a doubt rare, pathological
cases.  We could special case the warning for them and print
a different message but  I see very little value in complicating
the code just for them.  Do you consider this special casing
a requirement for approving the fix for the ICE?

Martin


[PATCH v2] libiberty(argv.c): Fix memory leak in expandargv.

2021-02-18 Thread Ayush Mittal via Gcc-patches
libiberty/ChangeLog:

* argv.c (expandargv): free allocated buffer if read fails.

Signed-off-by: Ayush Mittal 
Signed-off-by: Maninder Singh 
---
v1 -> v2: corrected formatting and changelog commit

 libiberty/argv.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libiberty/argv.c b/libiberty/argv.c
index cd97f90..48dcd10 100644
--- a/libiberty/argv.c
+++ b/libiberty/argv.c
@@ -442,7 +442,10 @@ expandargv (int *argcp, char ***argvp)
 due to CR/LF->CR translation when reading text files.
 That does not in-and-of itself indicate failure.  */
  && ferror (f))
-   goto error;
+   {
+ free (buffer);
+ goto error;
+   }
   /* Add a NUL terminator.  */
   buffer[len] = '\0';
   /* If the file is empty or contains only whitespace, buildargv would
-- 
1.9.1



Re: [PATCH] profiling: fix streaming of TOPN counters

2021-02-18 Thread Richard Biener via Gcc-patches
On Thu, Feb 18, 2021 at 12:35 PM Martin Liška  wrote:
>
> On 2/18/21 11:02 AM, Richard Biener wrote:
> > On Thu, Feb 18, 2021 at 10:46 AM Martin Liška  wrote:
> >>
> >> On 2/18/21 10:31 AM, Richard Biener wrote:
> >>> On Wed, Feb 17, 2021 at 2:16 PM Martin Liška  wrote:
> 
>  On 2/17/21 9:36 AM, Martin Liška wrote:
> > I'll write it even more robust...
> 
>  This is more elegant approach I've just tested on the instrumented clang.
> 
>  Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
>  Ready to be installed?
> >>>
> >>> Isn't it still too complicated?  We're asked to write N counters so why
> >>> don't we just write N counters?
> >>
> >> Well, we are asked to stream N counters. But each has a variable length,
> >> which makes it funny :)
> >>
> >> Thus, you already do
> >>>
> >>>  for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
> >>> -  node != NULL; node = node->next)
> >>> +  node != NULL; node = node->next, j++)
> >>>   {
> >>> gcov_write_counter (node->value);
> >>> gcov_write_counter (node->count);
> >>> +
> >>> + /* Stop when we reach expected number of items.  */
> >>> + if (j + 1 == list_sizes[i])
> >>> +   break;
> >>>   }
> >>>
> >>> why isn't this then only thing you need (just using pair_count as gathered
> >>> previously)?  And just count pair_count to zero as IV?  No need to
> >>> do the node != NULL check?
> >>
> >> We can't do that, because that will lead in a corrupted profile.
> >> We can have 2 problematic situations:
> >>
> >> 1) linked list is shorter, e.g.
> >>
> >> 10 2 1 10
> >>
> >> Here 2 represents 2 tuples, but we stream only one {1: 10}. That 
> >> happens in the instrumented clang.
> >>
> >> 2) linked list is longer, e.g.
> >> 10 1 1 5 22 5
> >>
> >> Here 1 represents 1 tuple, be we stream 2 tuples {1: 10} and {22: 
> >> 5}.
> >
> > I guess the problem is that for whatever reason the stream includes
> >
> > unsigned disk_size = GCOV_TOPN_DISK_COUNTERS * counters + 2 * 
> > pair_total;
> > gcov_write_tag_length (GCOV_TAG_FOR_COUNTER (t_ix),
> >   GCOV_TAG_COUNTER_LENGTH (disk_size));
> >
> > which has the sum of the list lengths, but ...
>
> Yes, GCOV tags always contain size and it's very handy in verification
> that a .gcda is in a consistent state.

So is the observed issue that the above written length is too small because
we end up streaming more counters?  As said, I think including 2 * pair_total
above only causes complication ... but I guess if that's how gcov records are
structured then so be it and your patch storing the individual lengths away
first is necessary (including obtaining temporary storage, and no, the stack
is not a good place here I think).

> >
> > -  gcov_type pair_count = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 
> > 1];
> > gcov_write_counter (ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i]);
> > -  gcov_write_counter (pair_count);
> >
> > we stream something like that again.  What are the two different counters?
> > You save one, but the other you take async again?
>
> ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 0] == total_number_of_executions
> ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1] == length of the linked list
> ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2] == pointer to the first item
> in a linked list
>
> >
> > The disk format here is sub-optimal at least and optimizing the read side
> > which has locking in place for making the write-side racy doesn't look good.
>
> Yes, we slighly miss some space. Per-file locking should not block streaming
> parallel of profiles.
>
> >
> >>>
> >>> Note architectures with less nice memory ordering guarantees might
> >>> eventually see partially updated pointers and counters so I think
> >>> we at least want atomic_read ()s of the values with the weakest
> >>> consistency possible.  (but that can be done as followup if we agree on 
> >>> that)
> >>
> >> Can we really see a partially updated pointer?
> >
> > Not on x86 for aligned data.  Note the same applies to the counter values.
> >
> > We've seen allocating memory from libgcov is problematic, so this is why I'm
> > asking.
>
> Sure.
>
> Martin
>
> >
> >> Thanks,
> >> Martin
> >>
> >>>
> >>> Richard.
> >>>
>  Thanks,
>  Martin
> >>
>


Re: [PATCH v7] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-18 Thread Richard Biener via Gcc-patches
On Thu, Feb 18, 2021 at 4:01 PM H.J. Lu  wrote:
>
> On Thu, Feb 18, 2021 at 2:25 AM Richard Biener
>  wrote:
> >
> > On Thu, Feb 18, 2021 at 5:15 AM H.J. Lu via Gcc-patches
> >  wrote:
> > >
> > > On Wed, Feb 17, 2021 at 12:50 PM H.J. Lu  wrote:
> > > >
> > > > On Wed, Feb 17, 2021 at 12:14 PM Jakub Jelinek  wrote:
> > > > >
> > > > > On Wed, Feb 17, 2021 at 12:04:34PM -0800, H.J. Lu wrote:> >
> > > > > > +  /* For -fretain-used-symbol, a "used" attribute also implies 
> > > > > > "retain".  */
> > > > >
> > > > > s/-symbol/-symbols/
> > > >
> > > > Fixed.
> > > >
> > > > > > +  if (flag_retain_used_symbols
> > > > > > +  && attributes
> > > > > > +  && (TREE_CODE (*node) == FUNCTION_DECL
> > > > > > +   || (VAR_P (*node) && TREE_STATIC (*node))
> > > > > > +   || (TREE_CODE (*node) == TYPE_DECL))
> > > > >
> > > > > What will "retain" do on TYPE_DECLs?  It only makes sense to me
> > > > > for FUNCTION_DECL or non-automatic VAR_DECLs, unless for types
> > > > > it means the types with that result in vars with those types 
> > > > > automatically
> > > > > getting "retain", but there is no code for that and I don't see "used"
> > > > > handled that way.
> > > >
> > > > Fixed.
> > > >
> > > > > > +  if (SUPPORTS_SHF_GNU_RETAIN
> > > > > > +  && (TREE_CODE (node) == FUNCTION_DECL
> > > > > > +   || (VAR_P (node) && TREE_STATIC (node))
> > > > > > +   || (TREE_CODE (node) == TYPE_DECL)))
> > > > >
> > > > > Likewise.
> > > >
> > > > Fixed.
> > > >
> > > > > > +;
> > > > > > +  else
> > > > > > +{
> > > > > > +  warning (OPT_Wattributes, "%qE attribute ignored", name);
> > > > > > +  *no_add_attrs = true;
> > > > > > +}
> > > > > > +
> > > > > > +  return NULL_TREE;
> > > > > > +}
> > > > > > +
> > > > > >  /* Handle a "externally_visible" attribute; arguments as in
> > > > > > struct attribute_spec.handler.  */
> > > > > >
> > > > > > diff --git a/gcc/common.opt b/gcc/common.opt
> > > > > > index c75dd36843e..70842481d4d 100644
> > > > > > --- a/gcc/common.opt
> > > > > > +++ b/gcc/common.opt
> > > > > > @@ -2404,6 +2404,10 @@ frerun-loop-opt
> > > > > >  Common Ignore
> > > > > >  Does nothing.  Preserved for backward compatibility.
> > > > > >
> > > > > > +fretain-used-symbols
> > > > > > +Common Var(flag_retain_used_symbols)
> > > > > > +Make the used attribute to imply the retain attribute if supported.
> > > > >
> > > > > English is not my native language, but I think the "to " doesn't 
> > > > > belong
> > > > > here.
> > > >
> > > > Fixed.
> > > >
> > > > > > +@item -fretain-used-symbols
> > > > > > +@opindex fretain-used-symbols
> > > > > > +On systems with recent GNU assembler and linker, the compiler makes
> > > > >
> > > > > I think we should avoid recent here, new/old/recent etc. are relative 
> > > > > terms.
> > > > > Either use exact version (is it 2.36 or later) or say GNU assembler 
> > > > > and
> > > > > linker that supports the @code{.retain} directive or something 
> > > > > similar.
> > > >
> > > > Fixed.
> > > >
> > > > > >flags |= SECTION_NAMED;
> > > > > > -  if (SUPPORTS_SHF_GNU_RETAIN
> > > > > > -  && decl != nullptr
> > > > > > +  if (decl != nullptr
> > > > > >&& DECL_P (decl)
> > > > > > -  && DECL_PRESERVE_P (decl))
> > > > > > +  && DECL_PRESERVE_P (decl)
> > > > > > +  && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > > > > >  flags |= SECTION_RETAIN;
> > > > > >if (*slot == NULL)
> > > > > >  {
> > > > >
> > > > > I think none of the varasm.c code should use DECL_PRESERVE_P when it 
> > > > > means
> > > > > retain, just lookup_attribute.
> > > >
> > > > Fixed.
> > > >
> > > > > > @@ -487,7 +487,8 @@ resolve_unique_section (tree decl, int reloc 
> > > > > > ATTRIBUTE_UNUSED,
> > > > > >if (DECL_SECTION_NAME (decl) == NULL
> > > > > >&& targetm_common.have_named_sections
> > > > > >&& (flag_function_or_data_sections
> > > > > > -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> > > > > > +   || (DECL_PRESERVE_P (decl)
> > > > > > +   && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > > > > > || DECL_COMDAT_GROUP (decl)))
> > > > > >  {
> > > > > >targetm.asm_out.unique_section (decl, reloc);
> > > > > > @@ -1227,7 +1228,8 @@ get_variable_section (tree decl, bool 
> > > > > > prefer_noswitch_p)
> > > > > >  vnode->get_constructor ();
> > > > > >
> > > > > >if (DECL_COMMON (decl)
> > > > > > -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> > > > > > +  && !(DECL_PRESERVE_P (decl)
> > > > > > +&& lookup_attribute ("retain", DECL_ATTRIBUTES (decl
> > > > > >  {
> > > > > >/* If the decl has been given an explicit section name, or 
> > > > > > it resides
> > > > > >in a non-generic address space, then it isn't common, and 
> > > > > > shouldn't
> > > > > > @@ -7761,12 +7763,14 @@ switch_to_section (section *new_section, 
> > > > > > tree decl)
> > 

Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Jozef Lawrynowicz
On Thu, Feb 18, 2021 at 03:38:46PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Thu, Feb 18, 2021 at 02:22:35PM +, Jozef Lawrynowicz wrote:
> > I think it is a enhancement, and true to the spirit of the attribute,
> > for "used" to save a symbol from linker garbage collection.
> > 
> > Why should "used" mean:
> >   Save this symbol from compiler optimization, but allow the linker to
> >   remove it.
> > 
> > I can't currently think of a use case where a "used" symbol should be
> > kept by the compiler but removed by the linker.
> > 
> > Often we see KEEP directives in linker scripts accompanying
> > "used" in the source code. libgcc/crtstuff.c is a good example. GNU
> > linker scripts need many specific KEEP directives to handle all the
> > "used" symbols.
> > 
> > If a developer marks a symbol as "used" in the source code, I think the
> > intuitive thing is for that symbol to be present in the linked output
> > file.
> 
> There are many reasons why a symbol is marked "used", one of the very often
> seen ones is e.g. that the symbol is referenced from inline asm.
> You don't need to guard such symbols against GC.

I suppose, for the cases where you cannot use extended ASM and therefore
cannot specify the symbol as an operand to the asm statement. I wonder
if that really could not be worked around though, since it would only be
required for statics, and if it would even have a negative impact in
practice.

I am just conflicted because when I first proposed this new
functionality, (as a separate "retain" attribute in fact), the feedback
from a few different people was that it would be better to leverage
"used" instead of creating a separate attribute. It seemed like a good
idea to me.

At this point, a separate "retain" attribute seems like an OK
compromise.

Thanks,
Jozef


[PATCH v7] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-18 Thread H.J. Lu via Gcc-patches
On Thu, Feb 18, 2021 at 2:25 AM Richard Biener
 wrote:
>
> On Thu, Feb 18, 2021 at 5:15 AM H.J. Lu via Gcc-patches
>  wrote:
> >
> > On Wed, Feb 17, 2021 at 12:50 PM H.J. Lu  wrote:
> > >
> > > On Wed, Feb 17, 2021 at 12:14 PM Jakub Jelinek  wrote:
> > > >
> > > > On Wed, Feb 17, 2021 at 12:04:34PM -0800, H.J. Lu wrote:> >
> > > > > +  /* For -fretain-used-symbol, a "used" attribute also implies 
> > > > > "retain".  */
> > > >
> > > > s/-symbol/-symbols/
> > >
> > > Fixed.
> > >
> > > > > +  if (flag_retain_used_symbols
> > > > > +  && attributes
> > > > > +  && (TREE_CODE (*node) == FUNCTION_DECL
> > > > > +   || (VAR_P (*node) && TREE_STATIC (*node))
> > > > > +   || (TREE_CODE (*node) == TYPE_DECL))
> > > >
> > > > What will "retain" do on TYPE_DECLs?  It only makes sense to me
> > > > for FUNCTION_DECL or non-automatic VAR_DECLs, unless for types
> > > > it means the types with that result in vars with those types 
> > > > automatically
> > > > getting "retain", but there is no code for that and I don't see "used"
> > > > handled that way.
> > >
> > > Fixed.
> > >
> > > > > +  if (SUPPORTS_SHF_GNU_RETAIN
> > > > > +  && (TREE_CODE (node) == FUNCTION_DECL
> > > > > +   || (VAR_P (node) && TREE_STATIC (node))
> > > > > +   || (TREE_CODE (node) == TYPE_DECL)))
> > > >
> > > > Likewise.
> > >
> > > Fixed.
> > >
> > > > > +;
> > > > > +  else
> > > > > +{
> > > > > +  warning (OPT_Wattributes, "%qE attribute ignored", name);
> > > > > +  *no_add_attrs = true;
> > > > > +}
> > > > > +
> > > > > +  return NULL_TREE;
> > > > > +}
> > > > > +
> > > > >  /* Handle a "externally_visible" attribute; arguments as in
> > > > > struct attribute_spec.handler.  */
> > > > >
> > > > > diff --git a/gcc/common.opt b/gcc/common.opt
> > > > > index c75dd36843e..70842481d4d 100644
> > > > > --- a/gcc/common.opt
> > > > > +++ b/gcc/common.opt
> > > > > @@ -2404,6 +2404,10 @@ frerun-loop-opt
> > > > >  Common Ignore
> > > > >  Does nothing.  Preserved for backward compatibility.
> > > > >
> > > > > +fretain-used-symbols
> > > > > +Common Var(flag_retain_used_symbols)
> > > > > +Make the used attribute to imply the retain attribute if supported.
> > > >
> > > > English is not my native language, but I think the "to " doesn't belong
> > > > here.
> > >
> > > Fixed.
> > >
> > > > > +@item -fretain-used-symbols
> > > > > +@opindex fretain-used-symbols
> > > > > +On systems with recent GNU assembler and linker, the compiler makes
> > > >
> > > > I think we should avoid recent here, new/old/recent etc. are relative 
> > > > terms.
> > > > Either use exact version (is it 2.36 or later) or say GNU assembler and
> > > > linker that supports the @code{.retain} directive or something similar.
> > >
> > > Fixed.
> > >
> > > > >flags |= SECTION_NAMED;
> > > > > -  if (SUPPORTS_SHF_GNU_RETAIN
> > > > > -  && decl != nullptr
> > > > > +  if (decl != nullptr
> > > > >&& DECL_P (decl)
> > > > > -  && DECL_PRESERVE_P (decl))
> > > > > +  && DECL_PRESERVE_P (decl)
> > > > > +  && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > > > >  flags |= SECTION_RETAIN;
> > > > >if (*slot == NULL)
> > > > >  {
> > > >
> > > > I think none of the varasm.c code should use DECL_PRESERVE_P when it 
> > > > means
> > > > retain, just lookup_attribute.
> > >
> > > Fixed.
> > >
> > > > > @@ -487,7 +487,8 @@ resolve_unique_section (tree decl, int reloc 
> > > > > ATTRIBUTE_UNUSED,
> > > > >if (DECL_SECTION_NAME (decl) == NULL
> > > > >&& targetm_common.have_named_sections
> > > > >&& (flag_function_or_data_sections
> > > > > -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> > > > > +   || (DECL_PRESERVE_P (decl)
> > > > > +   && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > > > > || DECL_COMDAT_GROUP (decl)))
> > > > >  {
> > > > >targetm.asm_out.unique_section (decl, reloc);
> > > > > @@ -1227,7 +1228,8 @@ get_variable_section (tree decl, bool 
> > > > > prefer_noswitch_p)
> > > > >  vnode->get_constructor ();
> > > > >
> > > > >if (DECL_COMMON (decl)
> > > > > -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> > > > > +  && !(DECL_PRESERVE_P (decl)
> > > > > +&& lookup_attribute ("retain", DECL_ATTRIBUTES (decl
> > > > >  {
> > > > >/* If the decl has been given an explicit section name, or it 
> > > > > resides
> > > > >in a non-generic address space, then it isn't common, and 
> > > > > shouldn't
> > > > > @@ -7761,12 +7763,14 @@ switch_to_section (section *new_section, tree 
> > > > > decl)
> > > > >  {
> > > > >if (in_section == new_section)
> > > > >  {
> > > > > -  if (SUPPORTS_SHF_GNU_RETAIN
> > > > > -   && (new_section->common.flags & SECTION_NAMED)
> > > > > +  if ((new_section->common.flags & SECTION_NAMED)
> > > > > && decl != nullptr
> > > > > && DECL_P (decl)
> > > > > 

Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 18, 2021 at 02:22:35PM +, Jozef Lawrynowicz wrote:
> I think it is a enhancement, and true to the spirit of the attribute,
> for "used" to save a symbol from linker garbage collection.
> 
> Why should "used" mean:
>   Save this symbol from compiler optimization, but allow the linker to
>   remove it.
> 
> I can't currently think of a use case where a "used" symbol should be
> kept by the compiler but removed by the linker.
> 
> Often we see KEEP directives in linker scripts accompanying
> "used" in the source code. libgcc/crtstuff.c is a good example. GNU
> linker scripts need many specific KEEP directives to handle all the
> "used" symbols.
> 
> If a developer marks a symbol as "used" in the source code, I think the
> intuitive thing is for that symbol to be present in the linked output
> file.

There are many reasons why a symbol is marked "used", one of the very often
seen ones is e.g. that the symbol is referenced from inline asm.
You don't need to guard such symbols against GC.

Jakub



Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Jozef Lawrynowicz
On Thu, Feb 18, 2021 at 01:08:54PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Thu, Feb 18, 2021 at 12:00:59PM +, Jozef Lawrynowicz wrote:
> > If we can add ".retain " to GAS, then I agree, current GCC
> > SHF_GNU_RETAIN behavior should be removed. For GCC 12 we leverage
> > .retain to implement the functionality where "used" saves symbols form
> > linker garbage collection, without modifying section names or the
> > structure of the object file.
> 
> Even in that case I think it is a bad idea to change the "used" attribute
> behavior, not everyone using "used" attribute needs or wants that behavior,
> so it should be a functionality provided by a separate new attribute.

I think it is a enhancement, and true to the spirit of the attribute,
for "used" to save a symbol from linker garbage collection.

Why should "used" mean:
  Save this symbol from compiler optimization, but allow the linker to
  remove it.

I can't currently think of a use case where a "used" symbol should be
kept by the compiler but removed by the linker.

Often we see KEEP directives in linker scripts accompanying
"used" in the source code. libgcc/crtstuff.c is a good example. GNU
linker scripts need many specific KEEP directives to handle all the
"used" symbols.

If a developer marks a symbol as "used" in the source code, I think the
intuitive thing is for that symbol to be present in the linked output
file.

Jozef


Re: c++: Macros need to be GTY-reachable [PR 99023]

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 17, 2021 at 01:46:37PM -0500, Nathan Sidwell wrote:
> I'd missed that   macros were allocated from GC storage, and that they can
> become unattached from an identifier, and therefore not   GC-reachable.
> And then bad things happen.   Fixed by making the module machinery's
> reference vector a GC root.
> 
>   PR c++/99023
> gcc/cp/
> * module.cc (struct macro_export): Add GTY markers.
> (macro_exports): Likewise, us a   va_gc Vector.
> gcc/testsuite/
> * g++.dg/modules/pr99023_a.H: New.
> * g++.dg/modules/pr99023_b.H: New.

I must say I don't know much about modules, but seeing the second set
of > 100KB g++.dg/modules/ testcases, I need to wonder, do we really need that
large tests?
Can't cvise (or creduce or multi-delta) be used to minimize that, use
a checking script that uses unfixed and fixed compilers and minimizes
into something that is accepted without errors by the fixed compiler
and still ICEs with the unfixed one?

Jakub



Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Jozef Lawrynowicz
On Thu, Feb 18, 2021 at 11:22:42PM +1030, Alan Modra via Binutils wrote:
> On Wed, Feb 17, 2021 at 11:23:12AM +, Jozef Lawrynowicz wrote:
> > In previous discussions, it seemed to me that there was general support
> > for the "used" attribute implying SHF_GNU_RETAIN and saving symbols from
> > linker garbage collection[1]. Of course, the current implementation
> > results in undesirable behavior - the thought that all linker scripts
> > not supporting uniquely named sections would need to be updated is quite
> > alarming.
> > 
> > It's a shame that all this extra complication is required, just because
> > we cannot have a ".retain ", directive.
> 
> Is that true?  Isn't the problem really that retained sections are
> named as if -ffunction-sections/-fdata-sections were in force?  And
> that is due to wanting separate sections so that garbage collection
> works, since SHF_GNU_RETAIN is all about linker garbage collection.
> 
> I don't see how having a ".retain " would help much.

In the early GCC implementations, there were issues getting the default
"unnamed" section names for symbols in GCC[1]. This prevented us from
being able to simply emit a .section directive, replacing a .text/.data
etc. directive.

H.J. improved upon my initial attempt for doing this, but then the
approach was changed to always put "used" symbols in uniquely named
sections, and I don't know why. Hopefully H.J. can clarify why "used"
symbols had to be put in sections with unique names.

With a ".retain " directive, GCC doesn't need to work out the
section associated with a symbol, alleviating the above issues, so we
don't need to change the section a symbol is in to apply SHF_GNU_RETAIN
to that section.

The bugs we are seeing now (e.g. PR 99113) are because some linker
scripts aren't set up to handle uniquely named sections. With
".retain ", the linker scripts will work without issues because
we have not changed the layout of sections.

> 
> > My preferred vision for this functionality was:
> >   - SHF_GNU_RETAIN section flag indicates a section should be saved
> > from linker garbage collection.
> >   - ".retain " assembler directive applies SHF_GNU_RETAIN
> > to the section containing .
> >   - GCC "used" attribute emits a ".retain " directive for
> > the symbol declaration is is applied to.  Applying the "used"
> > attribute to a symbol declaration does not change the structure of
> > the object file, beyond applying SHF_GNU_RETAIN to the section
> > containing that symbol.
> 
> That description seems to say that a ".retain foo" would mean
> everything in foo's section is kept.  If foo's section was the usual
> .data, you've kept virtually everything from garbage collection.

Yes, sort of. Everything from .data in the input object file - only
the .data containing foo. Unused .data sections from other object files
will be garbage collected, as required.

The user has marked "foo" as used after all, so the linker is
treating it as if it is linked to the entry point of the program in some
way. Since garbage collection only works at the section level, we can't
do any better - unless -ffunction/fdata-sections is used.

> Surely you don't expect ".retain foo" to create a separate .data
> section for foo?  If you do, I'm strongly against that idea.

No, I don't want ".retain" to change anything about the structure or
contents of a section, beyond applying SHF_GNU_RETAIN to it.

> 
> Note that gas indeed supports multiple sections named .data that can
> serve the same purpose as -fdata-sections.  See the gas doc for the
> optional .section field "unique".  That might be the best way to avoid
> an under-the-hood -ffunction-sections/-fdata-sections.

As mentioned above, there were issues getting the default "unnamed"
section names for symbols in GCC.

If we can reliably get the name of an unnamed section, then we could
potentially do away with .retain. "unique" would allow finer granularity
of garbage collection, but it is changing the structure of the object
file, which I think we should try to avoid, as it leads to issues like
we are seeing now.

Thanks,
Jozef

[1] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557914.html
> 
> -- 
> Alan Modra
> Australia Development Lab, IBM


Re: [r11-7271 Regression] FAIL: g++.dg/modules/pr99023_a.H (test for excess errors) on Linux/x86_64

2021-02-18 Thread Nathan Sidwell

On 2/18/21 3:03 AM, Jakub Jelinek wrote:

On Wed, Feb 17, 2021 at 02:19:11PM -0800, sunil.k.pandey via Gcc-patches wrote:

On Linux/x86_64,




Yeah:
FAIL: g++.dg/modules/pr99023_a.H (test for excess errors)
Excess errors:
../../..//src/libstdc++-v3/libsupc++/initializer_list:47:11: fatal error: definition of 
'class std::initializer_list<_E>' does not match '#include '
compilation terminated.



sigh, I shouldn't have picked a header that is somewhat ABI-specific

nathan
--
Nathan Sidwell


Re: [PATCH] c++: Fix -std=c++20 ICE on virtual method call [PR99132]

2021-02-18 Thread Nathan Sidwell

On 2/18/21 3:30 AM, Jakub Jelinek wrote:

Hi!

On the following testcase we ICE in C++20 mode during cp_get_callee_fndecl
-> constexpr evaluation.
It is only in C++20 mode on this testcase because virtual methods can't
be constexpr in C++17 and earlier and so potential_constant_expression_1
rejects it earlier.
And the ICE is caused by genericization changing the h PARM_DECL from having
B type to B & DECL_BY_REFERENCE and the constexpr evaluation
not being able to deal with that.
I think this just shows that we shouldn't do the constexpr evaluation during
genericization and later, and other spots e.g. during gimplification
also don't call cp_get_callee_fndecl but cp_get_callee_fndecl_nofold.


Agreed, we've run into this kind of thing before.


After all, cp_fold has already been run and it did the folding if there
was any opportunity to do so.  And furthermore, what that cp_genericize_r
spot does is check for any left-over immediate function calls (which can be
ATM just std::source_location::current() call) and immediate functions
outside of immediate functions can't have addresses leaked into the IL,
so it will be always a direct call anyway.  And immediate functions
themselves don't make it into genericization/gimplification.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


ok, thanks


2021-02-18  Jakub Jelinek  

PR c++/99132
* cp-gimplify.c (cp_genericize_r) : Use
cp_get_callee_fndecl_nofold instead of cp_get_callee_fndecl to check
for immediate function calls.

* g++.dg/cpp2a/constexpr-virtual18.C: New test.

--- gcc/cp/cp-gimplify.c.jj 2021-02-17 16:14:42.621294819 +0100
+++ gcc/cp/cp-gimplify.c2021-02-17 16:48:30.343291226 +0100
@@ -1386,7 +1386,7 @@ cp_genericize_r (tree *stmt_p, int *walk
  break;
}
  
-  if (tree fndecl = cp_get_callee_fndecl (stmt))

+  if (tree fndecl = cp_get_callee_fndecl_nofold (stmt))
if (DECL_IMMEDIATE_FUNCTION_P (fndecl))
  {
gcc_assert (source_location_current_p (fndecl));
--- gcc/testsuite/g++.dg/cpp2a/constexpr-virtual18.C.jj 2021-02-17 
16:48:18.136421425 +0100
+++ gcc/testsuite/g++.dg/cpp2a/constexpr-virtual18.C2021-02-17 
16:49:55.929378398 +0100
@@ -0,0 +1,13 @@
+// PR c++/99132
+// { dg-do compile { target c++11 } }
+
+template  struct A { T c; };
+template  struct B {
+  A d;
+  constexpr T operator-> () { return d.c; }
+  B (B &&);
+};
+struct C {
+  virtual void foo ();
+  void bar (B h) { h->foo (); }
+};

Jakub




--
Nathan Sidwell


Re: [PATCH] middle-end/99122 - Issues with VLA parameter inlining

2021-02-18 Thread Richard Biener
On Thu, 18 Feb 2021, Jakub Jelinek wrote:

> On Thu, Feb 18, 2021 at 01:37:29PM +0100, Richard Biener wrote:
> > The following instructs IPA not to inline calls with VLA parameters
> > and adjusts inlining not to create invalid view-converted VLA
> > parameters on mismatch and makes the error_mark paths with debug
> > stmts actually work.
> > 
> > The first part avoids the ICEs with the testcases already.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, does it
> > look sane?
> > 
> > Thanks,
> > Richard.
> > 
> > 2021-02-18  Richard Biener  
> > 
> > PR middle-end/99122
> > * ipa-fnsummary.c (analyze_function_body): Set
> > CIF_FUNCTION_NOT_INLINABLE for VLA parameter calls.
> > * tree-inline.c (insert_init_debug_bind): Pass NULL for
> > error_mark_node values.
> > (force_value_to_type): Do not build V_C_Es for WITH_SIZE_EXPR
> > values.
> > (setup_one_parameter): Delay force_value_to_type until when
> > it's needed.
> > 
> > * gcc.dg/pr99122-1.c: New testcase.
> > * gcc.dg/pr99122-2.c: Likewise.
> 
> LGTM, but for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99122#c12
> I guess we still need something like Martin's patch.

Yeah.

> Of course it can be fixed incrementally.

Indeed, so I pushed the patch and will work on the followup, but
likely only tomorrow.

Richard.


Re: [PATCH] middle-end/99122 - Issues with VLA parameter inlining

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 18, 2021 at 01:37:29PM +0100, Richard Biener wrote:
> The following instructs IPA not to inline calls with VLA parameters
> and adjusts inlining not to create invalid view-converted VLA
> parameters on mismatch and makes the error_mark paths with debug
> stmts actually work.
> 
> The first part avoids the ICEs with the testcases already.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, does it
> look sane?
> 
> Thanks,
> Richard.
> 
> 2021-02-18  Richard Biener  
> 
>   PR middle-end/99122
>   * ipa-fnsummary.c (analyze_function_body): Set
>   CIF_FUNCTION_NOT_INLINABLE for VLA parameter calls.
>   * tree-inline.c (insert_init_debug_bind): Pass NULL for
>   error_mark_node values.
>   (force_value_to_type): Do not build V_C_Es for WITH_SIZE_EXPR
>   values.
>   (setup_one_parameter): Delay force_value_to_type until when
>   it's needed.
> 
>   * gcc.dg/pr99122-1.c: New testcase.
>   * gcc.dg/pr99122-2.c: Likewise.

LGTM, but for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99122#c12
I guess we still need something like Martin's patch.
Of course it can be fixed incrementally.

Jakub



Re: [PATCH] match.pd: Restrict clz cmp 0 replacement by single_use, PR99142

2021-02-18 Thread Richard Biener via Gcc-patches
On Thu, Feb 18, 2021 at 1:21 PM Hans-Peter Nilsson  wrote:
>
> > From: Richard Biener via Gcc-patches 
> > Date: Thu, 18 Feb 2021 11:12:26 +0100
>
> > On Thu, Feb 18, 2021 at 12:35 AM Hans-Peter Nilsson via Gcc-patches
> >  wrote:
> > >
> > > If we're not going to eliminate the clz, it's better for the
> > > comparison to use that result than its input, so we don't
> > > extend the lifetime of the input.  Also, an additional use
> > > of the result is more likely cheaper than a compare of the
> > > input, in particular considering that the clz may have made
> > > available a non-zero condition matching the original use.
> >
> > Of course the non-zero compare can execute concurrently
> > with the clz if we use the input which can result in faster
> > operation.  It's one of the many cases where local folding
> > doesn't tell us enough.  One of the usual issues with
> > single_use checks is that the IL is not fully DCEd while
> > we fold it and thus dead uses can make sth !single_use
> > even though it really is single_use.
> >
> > Also if there's a use of the input live at the point of the
> > compare we want to replace it's likely better to do the
> > transform.
>
> Yeah, all depending on the code context.  I guess these
> simple match.pd transformations will grow some conditions.
>
> > > The "s" modifier doesn't stop this situation, as the
> > > transformation wouldn't result in "an expression with more
> > > than one operator"; a gating single_use condition on the
> > > result is necessary.
> >
> > Note that :s was done this way for the use in VN where
> > we can this way CSE c != 0 with an earlier a >= 0.
> >
> > It might be an interesting idea to allow match.pd consumers
> > to "switch" behavior of :s according to their needs.
>
> Like, as a heuristic for a non-ooo-target, to mean single_use?
> Yes, good idea!

More like have VN the existing semantics but fold_stmt treat
it as single_use for example.

> > > Regtested cris-elf and x86_64-linux.
> > > Ok to commit?
> >
> > OK.
>
> Done as a2ef38b1f94d, thanks for the review.
>
> > gcc/testsuite:
> > PR tree-optimization/99142
> > * testsuite/gcc.dg/tree-ssa/pr99142.c: New test.
>
> JFTR; oops: ^ Spurious "testsuite/" removed.
>
> brgds, H-P


[PATCH 2/2] IBM Z: Fix long double <-> DFP conversions

2021-02-18 Thread Ilya Leoshkevich via Gcc-patches
When switching the s390 backend to store long doubles in vector
registers, the patterns for long double <-> DFP conversions were
forgotten.  This did not cause observable problems so far, because
libdfp calls are emitted instead of pfpo.  However, when building
libdfp itself, this leads to infinite recursion.

gcc/ChangeLog:

* config/s390/vector.md (trunctf2_vr): New
pattern.
(trunctf2): Likewise.
(trunctdtf2_vr): Likewise.
(trunctdtf2): Likewise.
(extendtf2_vr): Likewise.
(extendtf2): Likewise.
(extendtftd2_vr): Likewise.
(extendtftd2): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/long-double-from-decimal128.c: New test.
* gcc.target/s390/vector/long-double-from-decimal32.c: New test.
* gcc.target/s390/vector/long-double-from-decimal64.c: New test.
* gcc.target/s390/vector/long-double-to-decimal128.c: New test.
* gcc.target/s390/vector/long-double-to-decimal32.c: New test.
* gcc.target/s390/vector/long-double-to-decimal64.c: New test.
---
 gcc/config/s390/vector.md | 72 +++
 .../s390/vector/long-double-from-decimal128.c | 20 ++
 .../s390/vector/long-double-from-decimal32.c  | 20 ++
 .../s390/vector/long-double-from-decimal64.c  | 20 ++
 .../s390/vector/long-double-to-decimal128.c   | 19 +
 .../s390/vector/long-double-to-decimal32.c| 19 +
 .../s390/vector/long-double-to-decimal64.c| 19 +
 7 files changed, 189 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-from-decimal128.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-from-decimal32.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-from-decimal64.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-to-decimal128.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-to-decimal32.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-to-decimal64.c

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index e48c965db00..bc52211c55e 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -2480,6 +2480,42 @@
   "HAVE_TF (trunctfsf2)"
   { EXPAND_TF (trunctfsf2, 2); })
 
+(define_expand "trunctf2_vr"
+  [(match_operand:DFP_ALL 0 "nonimmediate_operand" "")
+   (match_operand:TF 1 "nonimmediate_operand" "")]
+  "TARGET_HARD_DFP
+   && GET_MODE_SIZE (TFmode) > GET_MODE_SIZE (mode)
+   && TARGET_VXE"
+{
+  rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  emit_insn (gen_tf_to_fprx2 (fprx2, operands[1]));
+  emit_insn (gen_truncfprx22 (operands[0], fprx2));
+  DONE;
+})
+
+(define_expand "trunctf2"
+  [(match_operand:DFP_ALL 0 "nonimmediate_operand" "")
+   (match_operand:TF 1 "nonimmediate_operand" "")]
+  "HAVE_TF (trunctf2)"
+  { EXPAND_TF (trunctf2, 2); })
+
+(define_expand "trunctdtf2_vr"
+  [(match_operand:TF 0 "nonimmediate_operand" "")
+   (match_operand:TD 1 "nonimmediate_operand" "")]
+  "TARGET_HARD_DFP && TARGET_VXE"
+{
+  rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  emit_insn (gen_trunctdfprx22 (fprx2, operands[1]));
+  emit_insn (gen_fprx2_to_tf (operands[0], fprx2));
+  DONE;
+})
+
+(define_expand "trunctdtf2"
+  [(match_operand:TF 0 "nonimmediate_operand" "")
+   (match_operand:TD 1 "nonimmediate_operand" "")]
+  "HAVE_TF (trunctdtf2)"
+  { EXPAND_TF (trunctdtf2, 2); })
+
 ; load lengthened
 
 (define_insn "extenddftf2_vr"
@@ -2511,6 +2547,42 @@
   "HAVE_TF (extendsftf2)"
   { EXPAND_TF (extendsftf2, 2); })
 
+(define_expand "extendtf2_vr"
+  [(match_operand:TF 0 "nonimmediate_operand" "")
+   (match_operand:DFP_ALL 1 "nonimmediate_operand" "")]
+  "TARGET_HARD_DFP
+   && GET_MODE_SIZE (mode) < GET_MODE_SIZE (TFmode)
+   && TARGET_VXE"
+{
+  rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  emit_insn (gen_extendfprx22 (fprx2, operands[1]));
+  emit_insn (gen_fprx2_to_tf (operands[0], fprx2));
+  DONE;
+})
+
+(define_expand "extendtf2"
+  [(match_operand:TF 0 "nonimmediate_operand" "")
+   (match_operand:DFP_ALL 1 "nonimmediate_operand" "")]
+  "HAVE_TF (extendtf2)"
+  { EXPAND_TF (extendtf2, 2); })
+
+(define_expand "extendtftd2_vr"
+  [(match_operand:TD 0 "nonimmediate_operand" "")
+   (match_operand:TF 1 "nonimmediate_operand" "")]
+  "TARGET_HARD_DFP && TARGET_VXE"
+{
+  rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  emit_insn (gen_tf_to_fprx2 (fprx2, operands[1]));
+  emit_insn (gen_extendfprx2td2 (operands[0], fprx2));
+  DONE;
+})
+
+(define_expand "extendtftd2"
+  [(match_operand:TD 0 "nonimmediate_operand" "")
+   (match_operand:TF 1 "nonimmediate_operand" "")]
+  "HAVE_TF (extendtftd2)"
+  { EXPAND_TF (extendtftd2, 2); })
+
 ; test data class
 
 (define_expand "signbittf2_vr"
diff --git a/gcc/testsuite/gcc.target/s390/vector/long-double-from-decimal128.c 
b/gcc/testsuite/gcc.target/s390/vector/long-double-from-decimal128.c
new file mode 100644
index 000..3cd2c68f5c6
--- /dev/null
+++ 

[PATCH 1/2] IBM Z: Improve FPRX2 <-> TF conversions

2021-02-18 Thread Ilya Leoshkevich via Gcc-patches
gcc/ChangeLog:

* config/s390/vector.md (*fprx2_to_tf): Rename to fprx2_to_tf,
add memory alternative.
(tf_to_fprx2): New pattern.
---
 gcc/config/s390/vector.md | 36 +++-
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 0e3c31f5d4f..e48c965db00 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -616,12 +616,23 @@
vlvgp\t%v0,%1,%N1"
   [(set_attr "op_type" "VRR,VRX,VRX,VRI,VRR")])
 
-(define_insn "*fprx2_to_tf"
-  [(set (match_operand:TF   0 "nonimmediate_operand" "=v")
-   (subreg:TF (match_operand:FPRX2 1 "general_operand"   "f") 0))]
+(define_insn_and_split "fprx2_to_tf"
+  [(set (match_operand:TF   0 "nonimmediate_operand" "=v,AR")
+   (subreg:TF (match_operand:FPRX2 1 "general_operand"   "f,f") 0))]
   "TARGET_VXE"
-  "vmrhg\t%v0,%1,%N1"
-  [(set_attr "op_type" "VRR")])
+  "@
+   vmrhg\t%v0,%1,%N1
+   #"
+  "!(MEM_P (operands[0]) && MEM_VOLATILE_P (operands[0]))"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (match_dup 5))]
+{
+  operands[2] = simplify_gen_subreg (DFmode, operands[0], TFmode, 0);
+  operands[3] = simplify_gen_subreg (DFmode, operands[1], FPRX2mode, 0);
+  operands[4] = simplify_gen_subreg (DFmode, operands[0], TFmode, 8);
+  operands[5] = simplify_gen_subreg (DFmode, operands[1], FPRX2mode, 8);
+}
+  [(set_attr "op_type" "VRR,*")])
 
 (define_insn "*vec_ti_to_v1ti"
   [(set (match_operand:V1TI   0 "nonimmediate_operand" 
"=v,v,R,  v,  v,v")
@@ -753,6 +764,21 @@
   "vpdi\t%V0,%v1,%V0,5"
   [(set_attr "op_type" "VRR")])
 
+(define_insn_and_split "tf_to_fprx2"
+  [(set (match_operand:FPRX20 "nonimmediate_operand" "=f,f")
+   (subreg:FPRX2 (match_operand:TF 1 "general_operand"   "v,AR") 0))]
+  "TARGET_VXE"
+  "#"
+  "!(MEM_P (operands[1]) && MEM_VOLATILE_P (operands[1]))"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (match_dup 5))]
+{
+  operands[2] = simplify_gen_subreg (DFmode, operands[0], FPRX2mode, 0);
+  operands[3] = simplify_gen_subreg (DFmode, operands[1], TFmode, 0);
+  operands[4] = simplify_gen_subreg (DFmode, operands[0], FPRX2mode, 8);
+  operands[5] = simplify_gen_subreg (DFmode, operands[1], TFmode, 8);
+})
+
 ; vec_perm_const for V2DI using vpdi?
 
 ;;
-- 
2.29.2



[PATCH 0/2] IBM Z: Fix long double <-> DFP conversions

2021-02-18 Thread Ilya Leoshkevich via Gcc-patches
This series fixes PR99134.  Patch 1 is factored out from the pending
[1], patch 2 is the actual fix.  Bootstrapped and regtested on
s390x-redhat-linux.  Ok for master?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564380.html

Ilya Leoshkevich (2):
  IBM Z: Improve FPRX2 <-> TF conversions
  IBM Z: Fix long double <-> DFP conversions

 gcc/config/s390/vector.md | 108 +-
 .../s390/vector/long-double-from-decimal128.c |  20 
 .../s390/vector/long-double-from-decimal32.c  |  20 
 .../s390/vector/long-double-from-decimal64.c  |  20 
 .../s390/vector/long-double-to-decimal128.c   |  19 +++
 .../s390/vector/long-double-to-decimal32.c|  19 +++
 .../s390/vector/long-double-to-decimal64.c|  19 +++
 7 files changed, 220 insertions(+), 5 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-from-decimal128.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-from-decimal32.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-from-decimal64.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-to-decimal128.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-to-decimal32.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-to-decimal64.c

-- 
2.29.2



Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Alan Modra via Gcc-patches
On Wed, Feb 17, 2021 at 11:23:12AM +, Jozef Lawrynowicz wrote:
> In previous discussions, it seemed to me that there was general support
> for the "used" attribute implying SHF_GNU_RETAIN and saving symbols from
> linker garbage collection[1]. Of course, the current implementation
> results in undesirable behavior - the thought that all linker scripts
> not supporting uniquely named sections would need to be updated is quite
> alarming.
> 
> It's a shame that all this extra complication is required, just because
> we cannot have a ".retain ", directive.

Is that true?  Isn't the problem really that retained sections are
named as if -ffunction-sections/-fdata-sections were in force?  And
that is due to wanting separate sections so that garbage collection
works, since SHF_GNU_RETAIN is all about linker garbage collection.

I don't see how having a ".retain " would help much.

> My preferred vision for this functionality was:
>   - SHF_GNU_RETAIN section flag indicates a section should be saved
> from linker garbage collection.
>   - ".retain " assembler directive applies SHF_GNU_RETAIN
> to the section containing .
>   - GCC "used" attribute emits a ".retain " directive for
> the symbol declaration is is applied to.  Applying the "used"
> attribute to a symbol declaration does not change the structure of
> the object file, beyond applying SHF_GNU_RETAIN to the section
> containing that symbol.

That description seems to say that a ".retain foo" would mean
everything in foo's section is kept.  If foo's section was the usual
.data, you've kept virtually everything from garbage collection.
Surely you don't expect ".retain foo" to create a separate .data
section for foo?  If you do, I'm strongly against that idea.

Note that gas indeed supports multiple sections named .data that can
serve the same purpose as -fdata-sections.  See the gas doc for the
optional .section field "unique".  That might be the best way to avoid
an under-the-hood -ffunction-sections/-fdata-sections.

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH] middle-end/99122 - Issues with VLA parameter inlining

2021-02-18 Thread Richard Biener
The following instructs IPA not to inline calls with VLA parameters
and adjusts inlining not to create invalid view-converted VLA
parameters on mismatch and makes the error_mark paths with debug
stmts actually work.

The first part avoids the ICEs with the testcases already.

Bootstrapped and tested on x86_64-unknown-linux-gnu, does it
look sane?

Thanks,
Richard.

2021-02-18  Richard Biener  

PR middle-end/99122
* ipa-fnsummary.c (analyze_function_body): Set
CIF_FUNCTION_NOT_INLINABLE for VLA parameter calls.
* tree-inline.c (insert_init_debug_bind): Pass NULL for
error_mark_node values.
(force_value_to_type): Do not build V_C_Es for WITH_SIZE_EXPR
values.
(setup_one_parameter): Delay force_value_to_type until when
it's needed.

* gcc.dg/pr99122-1.c: New testcase.
* gcc.dg/pr99122-2.c: Likewise.
---
 gcc/ipa-fnsummary.c  |  8 +++-
 gcc/testsuite/gcc.dg/pr99122-1.c | 13 +
 gcc/testsuite/gcc.dg/pr99122-2.c | 21 +
 gcc/tree-inline.c| 21 +
 4 files changed, 54 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr99122-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr99122-2.c

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index e32e69cd3ad..c3a25c52214 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -2775,7 +2775,13 @@ analyze_function_body (struct cgraph_node *node, bool 
early)
 (gimple_call_arg (stmt, i));
}
}
-
+ /* We cannot setup VLA parameters during inlining.  */
+ for (unsigned int i = 0; i < gimple_call_num_args (stmt); ++i)
+   if (TREE_CODE (gimple_call_arg (stmt, i)) == WITH_SIZE_EXPR)
+ {
+   edge->inline_failed = CIF_FUNCTION_NOT_INLINABLE;
+   break;
+ }
  es->call_stmt_size = this_size;
  es->call_stmt_time = this_time;
  es->loop_depth = bb_loop_depth (bb);
diff --git a/gcc/testsuite/gcc.dg/pr99122-1.c b/gcc/testsuite/gcc.dg/pr99122-1.c
new file mode 100644
index 000..5dfc0a85ad4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr99122-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -w" } */
+
+void f ()
+{
+  int n = 5;
+  struct { char a[n]; } x;
+  g (x, x);
+}
+int g (double x, double y)
+{
+  return __builtin_isgreater (x, y);
+}
diff --git a/gcc/testsuite/gcc.dg/pr99122-2.c b/gcc/testsuite/gcc.dg/pr99122-2.c
new file mode 100644
index 000..2b105428233
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr99122-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -w" } */
+
+static int foo ();
+
+int
+bar (int n)
+{
+  struct S { char a[n]; } x;
+  __builtin_memset (x.a, 0, n);
+  return foo (n, x);
+}
+
+static inline int
+foo (int n, struct T { char a[n]; } b)
+{
+  int r = 0, i;
+  for (i = 0; i < n; i++)
+r += b.a[i];
+  return r;
+}
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index a710fa59027..01a08cf27be 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3352,7 +3352,10 @@ insert_init_debug_bind (copy_body_data *id,
base_stmt = gsi_stmt (gsi);
 }
 
-  note = gimple_build_debug_bind (tracked_var, unshare_expr (value), 
base_stmt);
+  note = gimple_build_debug_bind (tracked_var,
+ value == error_mark_node
+ ? NULL_TREE : unshare_expr (value),
+ base_stmt);
 
   if (bb)
 {
@@ -3418,7 +3421,9 @@ force_value_to_type (tree type, tree value)
  Still if we end up with truly mismatched types here, fall back
  to using a VIEW_CONVERT_EXPR or a literal zero to not leak invalid
  GIMPLE to the following passes.  */
-  if (!is_gimple_reg_type (TREE_TYPE (value))
+  if (TREE_CODE (value) == WITH_SIZE_EXPR)
+return error_mark_node;
+  else if (!is_gimple_reg_type (TREE_TYPE (value))
   || TYPE_SIZE (type) == TYPE_SIZE (TREE_TYPE (value)))
 return fold_build1 (VIEW_CONVERT_EXPR, type, value);
   else
@@ -3434,15 +3439,9 @@ setup_one_parameter (copy_body_data *id, tree p, tree 
value, tree fn,
 {
   gimple *init_stmt = NULL;
   tree var;
-  tree rhs = value;
   tree def = (gimple_in_ssa_p (cfun)
  ? ssa_default_def (id->src_cfun, p) : NULL);
 
-  if (value
-  && value != error_mark_node
-  && !useless_type_conversion_p (TREE_TYPE (p), TREE_TYPE (value)))
-rhs = force_value_to_type (TREE_TYPE (p), value);
-
   /* Make an equivalent VAR_DECL.  Note that we must NOT remap the type
  here since the type of this decl must be visible to the calling
  function.  */
@@ -3501,6 +3500,12 @@ setup_one_parameter (copy_body_data *id, tree p, tree 
value, tree fn,
   if (TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (p)))
 TREE_READONLY (var) = 0;
 
+  tree rhs = value;
+  if (value
+  

Re: [PATCH] match.pd: Restrict clz cmp 0 replacement by single_use, PR99142

2021-02-18 Thread Hans-Peter Nilsson via Gcc-patches
> From: Richard Biener via Gcc-patches 
> Date: Thu, 18 Feb 2021 11:12:26 +0100

> On Thu, Feb 18, 2021 at 12:35 AM Hans-Peter Nilsson via Gcc-patches
>  wrote:
> >
> > If we're not going to eliminate the clz, it's better for the
> > comparison to use that result than its input, so we don't
> > extend the lifetime of the input.  Also, an additional use
> > of the result is more likely cheaper than a compare of the
> > input, in particular considering that the clz may have made
> > available a non-zero condition matching the original use.
> 
> Of course the non-zero compare can execute concurrently
> with the clz if we use the input which can result in faster
> operation.  It's one of the many cases where local folding
> doesn't tell us enough.  One of the usual issues with
> single_use checks is that the IL is not fully DCEd while
> we fold it and thus dead uses can make sth !single_use
> even though it really is single_use.
> 
> Also if there's a use of the input live at the point of the
> compare we want to replace it's likely better to do the
> transform.

Yeah, all depending on the code context.  I guess these
simple match.pd transformations will grow some conditions.

> > The "s" modifier doesn't stop this situation, as the
> > transformation wouldn't result in "an expression with more
> > than one operator"; a gating single_use condition on the
> > result is necessary.
> 
> Note that :s was done this way for the use in VN where
> we can this way CSE c != 0 with an earlier a >= 0.
> 
> It might be an interesting idea to allow match.pd consumers
> to "switch" behavior of :s according to their needs.

Like, as a heuristic for a non-ooo-target, to mean single_use?
Yes, good idea!

> > Regtested cris-elf and x86_64-linux.
> > Ok to commit?
> 
> OK.

Done as a2ef38b1f94d, thanks for the review.

> gcc/testsuite:
> PR tree-optimization/99142
> * testsuite/gcc.dg/tree-ssa/pr99142.c: New test.

JFTR; oops: ^ Spurious "testsuite/" removed.

brgds, H-P


[PATCH] C++: target attribute - local decl

2021-02-18 Thread Martin Liška

We crash when target attribute get_function_versions_dispatcher is called
for a function that is not registered in call graph. This fixes that
by emitting a new error.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/cp/ChangeLog:

PR c++/99108
* call.c (get_function_version_dispatcher): Do not parse
target attribute for a function with a missing declaration.

gcc/testsuite/ChangeLog:

PR c++/99108
* g++.target/i386/pr99108.C: New test.
---
 gcc/cp/call.c   |  8 +++-
 gcc/testsuite/g++.target/i386/pr99108.C | 18 ++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr99108.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 186feef6fe3..844853e504e 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8386,8 +8386,14 @@ get_function_version_dispatcher (tree fn)
  && DECL_FUNCTION_VERSIONED (fn));
 
   gcc_assert (targetm.get_function_versions_dispatcher);

-  dispatcher_decl = targetm.get_function_versions_dispatcher (fn);
+  if (cgraph_node::get (fn) == NULL)
+{
+  error_at (DECL_SOURCE_LOCATION (fn), "missing declaration "
+   "for a multiversioned function");
+  return NULL;
+}
 
+  dispatcher_decl = targetm.get_function_versions_dispatcher (fn);

   if (dispatcher_decl == NULL)
 {
   error_at (input_location, "use of multiversioned function "
diff --git a/gcc/testsuite/g++.target/i386/pr99108.C 
b/gcc/testsuite/g++.target/i386/pr99108.C
new file mode 100644
index 000..b0c4ffa2688
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr99108.C
@@ -0,0 +1,18 @@
+/* PR c++/99108 */
+/* { dg-do compile { target c++20 } } */
+/* { dg-require-ifunc "" }  */
+
+struct A {
+  void foo(auto);
+};
+void A::foo(auto)
+{
+  int f(void) __attribute__((target("default")));
+  int f(void) __attribute__((target("arch=atom"))); /* { dg-error "missing 
declaration for a multiversioned function" } */
+  int b = f();
+}
+void bar(void)
+{
+  A c;
+  c.foo(7);
+}
--
2.30.0



Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 18, 2021 at 12:00:59PM +, Jozef Lawrynowicz wrote:
> If we can add ".retain " to GAS, then I agree, current GCC
> SHF_GNU_RETAIN behavior should be removed. For GCC 12 we leverage
> .retain to implement the functionality where "used" saves symbols form
> linker garbage collection, without modifying section names or the
> structure of the object file.

Even in that case I think it is a bad idea to change the "used" attribute
behavior, not everyone using "used" attribute needs or wants that behavior,
so it should be a functionality provided by a separate new attribute.

Jakub



Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Jozef Lawrynowicz
On Thu, Feb 18, 2021 at 11:19:50AM +0100, Richard Biener via Binutils wrote:
> On Wed, Feb 17, 2021 at 12:25 PM Jozef Lawrynowicz
>  wrote:
> >
> > On Tue, Feb 16, 2021 at 05:45:47PM +0100, Jakub Jelinek via Gcc-patches 
> > wrote:
> > > On Mon, Feb 15, 2021 at 02:35:07PM -0800, H.J. Lu via Gcc-patches wrote:
> > > > When building Linux kernel, ld in bninutils 2.36 with GCC 11 generates
> > > > thousands of
> > > >
> > > > ld: warning: orphan section `.data.event_initcall_finish' from 
> > > > `init/main.o' being placed in section `.data.event_initcall_finish'
> > > > ld: warning: orphan section `.data.event_initcall_start' from 
> > > > `init/main.o' being placed in section `.data.event_initcall_start'
> > > > ld: warning: orphan section `.data.event_initcall_level' from 
> > > > `init/main.o' being placed in section `.data.event_initcall_level'
> > > >
> > > > Since these sections are marked with SHF_GNU_RETAIN, they are placed in
> > > > separate sections.  They become orphan sections since they aren't 
> > > > expected
> > > > in the Linux kernel linker script. But orphan sections normally don't 
> > > > work
> > > > well with the Linux kernel linker script and the resulting kernel 
> > > > crashed.
> > > >
> > > > Add -fgnu-retain to disable SHF_GNU_RETAIN for Linux kernel build with
> > > > -fno-gnu-retain.
> > >
> > > I'd say this shows that the changing of meaning of "used" attribute wasn't
> > > really a good idea, the Linux kernel certainly won't be the only problem
> > > and people use "used" attribute for many reasons and don't necessarily 
> > > want
> > > the different sections or different behavior of those sections if they use
> > > it.
> > >
> > > So, can't we instead:
> > > 1) restore the old behavior of "used" by default
> > > 2) add "retain" attribute that implies the SHF_GNU_RETAIN stuff and fails
> > >if the configured assembler/linker doesn't support those
> > > 3) add a non-default option through which one could opt in for "used"
> > >attribute to imply retain attribute too for projects that want that?
> > >
> >
> > In previous discussions, it seemed to me that there was general support
> > for the "used" attribute implying SHF_GNU_RETAIN and saving symbols from
> > linker garbage collection[1]. Of course, the current implementation
> > results in undesirable behavior - the thought that all linker scripts
> > not supporting uniquely named sections would need to be updated is quite
> > alarming.
> >
> > It's a shame that all this extra complication is required, just because
> > we cannot have a ".retain ", directive.
> >
> > My preferred vision for this functionality was:
> >   - SHF_GNU_RETAIN section flag indicates a section should be saved
> > from linker garbage collection.
> >   - ".retain " assembler directive applies SHF_GNU_RETAIN
> > to the section containing .
> >   - GCC "used" attribute emits a ".retain " directive for
> > the symbol declaration is is applied to.  Applying the "used"
> > attribute to a symbol declaration does not change the structure of
> > the object file, beyond applying SHF_GNU_RETAIN to the section
> > containing that symbol.
> >
> > H.J. vetoed ".retain "[2], since it fails the predicate:
> >   Assembler directive referring to a symbol must operate on the symbol
> >   table.
> >
> > So because of this veto, we have compromised on "code quality" so far,
> > since any linker script not supporting named sections, used to link an
> > application containing "used" symbols (now put into their own section) has
> > potential undefined behavior.
> >
> > With the new proposed change to use a "retain" attribute, we now
> > compromise on functionality, since the "used" attribute saving symbols
> > from linker garbage collection is disabled by default.
> >
> > All because we cannot introduce an exception to the above predicate.
> >
> > I would like to re-open discussions on whether a ".retain 
> > directive is acceptable. This would enable "used" to imply
> > SHF_GNU_RETAIN, without changing the structure of the object file,
> > thereby allowing the new functionality to be used without linker script
> > modifications.
> >
> > If a Binutils global maintainer could side one way or the other on
> > ".retain " (an opinion was never given by the Binutils
> > maintainers when we had the discussions on that mailing list, although
> > Jeff did support .retain in the GCC discussions[3]), then it will be
> > easier to proceed knowing definitively that ".retain" is rejected and we
> > have no choice but to go this route of having a separate "retain"
> > attribute.
> 
> So I then propose, for GCC 11, to rip out / disable any use of SHF_GNU_RETAIN,
> effectively restoring GCC 10 beahvior and re-consider for GCC 12.

It depends if the Binutils maintainers would accept a patch implementing
".retain " in principle.

If we can add ".retain " to GAS, then I agree, current GCC
SHF_GNU_RETAIN behavior should be removed. For GCC 12 we leverage
.retain 

Re: [PATCH] profiling: fix streaming of TOPN counters

2021-02-18 Thread Martin Liška

On 2/18/21 11:02 AM, Richard Biener wrote:

On Thu, Feb 18, 2021 at 10:46 AM Martin Liška  wrote:


On 2/18/21 10:31 AM, Richard Biener wrote:

On Wed, Feb 17, 2021 at 2:16 PM Martin Liška  wrote:


On 2/17/21 9:36 AM, Martin Liška wrote:

I'll write it even more robust...


This is more elegant approach I've just tested on the instrumented clang.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?


Isn't it still too complicated?  We're asked to write N counters so why
don't we just write N counters?


Well, we are asked to stream N counters. But each has a variable length,
which makes it funny :)

Thus, you already do


 for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
-  node != NULL; node = node->next)
+  node != NULL; node = node->next, j++)
  {
gcov_write_counter (node->value);
gcov_write_counter (node->count);
+
+ /* Stop when we reach expected number of items.  */
+ if (j + 1 == list_sizes[i])
+   break;
  }

why isn't this then only thing you need (just using pair_count as gathered
previously)?  And just count pair_count to zero as IV?  No need to
do the node != NULL check?


We can't do that, because that will lead in a corrupted profile.
We can have 2 problematic situations:

1) linked list is shorter, e.g.

10 2 1 10

Here 2 represents 2 tuples, but we stream only one {1: 10}. That happens in 
the instrumented clang.

2) linked list is longer, e.g.
10 1 1 5 22 5

Here 1 represents 1 tuple, be we stream 2 tuples {1: 10} and {22: 5}.


I guess the problem is that for whatever reason the stream includes

unsigned disk_size = GCOV_TOPN_DISK_COUNTERS * counters + 2 * pair_total;
gcov_write_tag_length (GCOV_TAG_FOR_COUNTER (t_ix),
  GCOV_TAG_COUNTER_LENGTH (disk_size));

which has the sum of the list lengths, but ...


Yes, GCOV tags always contain size and it's very handy in verification
that a .gcda is in a consistent state.



-  gcov_type pair_count = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1];
gcov_write_counter (ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i]);
-  gcov_write_counter (pair_count);

we stream something like that again.  What are the two different counters?
You save one, but the other you take async again?


ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 0] == total_number_of_executions
ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1] == length of the linked list
ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2] == pointer to the first item
in a linked list



The disk format here is sub-optimal at least and optimizing the read side
which has locking in place for making the write-side racy doesn't look good.


Yes, we slighly miss some space. Per-file locking should not block streaming
parallel of profiles.





Note architectures with less nice memory ordering guarantees might
eventually see partially updated pointers and counters so I think
we at least want atomic_read ()s of the values with the weakest
consistency possible.  (but that can be done as followup if we agree on that)


Can we really see a partially updated pointer?


Not on x86 for aligned data.  Note the same applies to the counter values.

We've seen allocating memory from libgcov is problematic, so this is why I'm
asking.


Sure.

Martin




Thanks,
Martin



Richard.


Thanks,
Martin






Re: [PATCH 1/1] libiberty(argv.c): Fix memory leak in expandargv.

2021-02-18 Thread Prathamesh Kulkarni via Gcc-patches
On Thu, 18 Feb 2021 at 16:32, Ayush Mittal via Gcc-patches
 wrote:
>
> Dynamic memory referenced by 'buffer' was allocated using xmalloc but fails 
> to free it
> when jump to 'error' label.
>
> Issue as per static analysis tool.
>
> Signed-off-by: Ayush Mittal 
> Signed-off-by: Maninder Singh 
> ---
>  libiberty/ChangeLog | 4 
>  libiberty/argv.c| 5 -
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
> index e472553..96cacba 100644
> --- a/libiberty/ChangeLog
> +++ b/libiberty/ChangeLog
> @@ -1,3 +1,7 @@
> +2021-02-18  Ayush Mittal  
> +
> +   * argv.c (expandargv): Fix memory leak for buffer allocated to read 
> file contents.
AFAIK, the policy has changed to not include ChangeLog in patches anymore.
Instead the ChangeLog bits should be written in commit messages, and
the ChangeLog file
is updated automatically based on that.
> +
>  2021-02-01  Martin Sebor  
>
> * dyn-string.c (dyn_string_insert_cstr): Use memcpy instead of strncpy
> diff --git a/libiberty/argv.c b/libiberty/argv.c
> index cd97f90..7199c7d 100644
> --- a/libiberty/argv.c
> +++ b/libiberty/argv.c
> @@ -442,7 +442,10 @@ expandargv (int *argcp, char ***argvp)
>  due to CR/LF->CR translation when reading text files.
>  That does not in-and-of itself indicate failure.  */
>   && ferror (f))
> -   goto error;
> +   {
> + free(buffer);
> + goto error;
> +   }
Sorry to nitpick for formatting issues - IIRC, the convention is to
use free (buffer) with space
between function name and "(". As a general note, please run patches
thru contrib/check_GNU_style.py.

The patch looks OK to me, but I can't approve.

Thanks,
Prathamesh
>/* Add a NUL terminator.  */
>buffer[len] = '\0';
>/* If the file is empty or contains only whitespace, buildargv would
> --
> 1.9.1
>


[PATCH 1/1] libiberty(argv.c): Fix memory leak in expandargv.

2021-02-18 Thread Ayush Mittal via Gcc-patches
Dynamic memory referenced by 'buffer' was allocated using xmalloc but fails to 
free it
when jump to 'error' label.

Issue as per static analysis tool.

Signed-off-by: Ayush Mittal 
Signed-off-by: Maninder Singh 
---
 libiberty/ChangeLog | 4 
 libiberty/argv.c| 5 -
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index e472553..96cacba 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,7 @@
+2021-02-18  Ayush Mittal  
+
+   * argv.c (expandargv): Fix memory leak for buffer allocated to read 
file contents.
+
 2021-02-01  Martin Sebor  
 
* dyn-string.c (dyn_string_insert_cstr): Use memcpy instead of strncpy
diff --git a/libiberty/argv.c b/libiberty/argv.c
index cd97f90..7199c7d 100644
--- a/libiberty/argv.c
+++ b/libiberty/argv.c
@@ -442,7 +442,10 @@ expandargv (int *argcp, char ***argvp)
 due to CR/LF->CR translation when reading text files.
 That does not in-and-of itself indicate failure.  */
  && ferror (f))
-   goto error;
+   {
+ free(buffer);
+ goto error;
+   }
   /* Add a NUL terminator.  */
   buffer[len] = '\0';
   /* If the file is empty or contains only whitespace, buildargv would
-- 
1.9.1



Re: [PATCH] match.pd: Restrict clz cmp 0 replacement by single_use, PR99142

2021-02-18 Thread Richard Biener via Gcc-patches
On Thu, Feb 18, 2021 at 11:22 AM Jakub Jelinek  wrote:
>
> On Thu, Feb 18, 2021 at 11:12:26AM +0100, Richard Biener via Gcc-patches 
> wrote:
> > On Thu, Feb 18, 2021 at 12:35 AM Hans-Peter Nilsson via Gcc-patches
> >  wrote:
> > >
> > > If we're not going to eliminate the clz, it's better for the
> > > comparison to use that result than its input, so we don't
> > > extend the lifetime of the input.  Also, an additional use
> > > of the result is more likely cheaper than a compare of the
> > > input, in particular considering that the clz may have made
> > > available a non-zero condition matching the original use.
> >
> > Of course the non-zero compare can execute concurrently
> > with the clz if we use the input which can result in faster
> > operation.  It's one of the many cases where local folding
> > doesn't tell us enough.  One of the usual issues with
> > single_use checks is that the IL is not fully DCEd while
> > we fold it and thus dead uses can make sth !single_use
> > even though it really is single_use.
>
> And on the other side, the IL might not be fully CSEd yet and
> so we can see single use, but after SCCVN it wouldn't be single use.

Yes, the issue is definitely complex.  The IL before SCCVN can also
be not fully DCEd and so VN could CSE an expression for one that
is dead (and now becomes live).

Richard.

> Jakub
>


Re: [PATCH v6] Add retain attribute to place symbols in SHF_GNU_RETAIN section

2021-02-18 Thread Richard Biener via Gcc-patches
On Thu, Feb 18, 2021 at 5:15 AM H.J. Lu via Gcc-patches
 wrote:
>
> On Wed, Feb 17, 2021 at 12:50 PM H.J. Lu  wrote:
> >
> > On Wed, Feb 17, 2021 at 12:14 PM Jakub Jelinek  wrote:
> > >
> > > On Wed, Feb 17, 2021 at 12:04:34PM -0800, H.J. Lu wrote:> >
> > > > +  /* For -fretain-used-symbol, a "used" attribute also implies 
> > > > "retain".  */
> > >
> > > s/-symbol/-symbols/
> >
> > Fixed.
> >
> > > > +  if (flag_retain_used_symbols
> > > > +  && attributes
> > > > +  && (TREE_CODE (*node) == FUNCTION_DECL
> > > > +   || (VAR_P (*node) && TREE_STATIC (*node))
> > > > +   || (TREE_CODE (*node) == TYPE_DECL))
> > >
> > > What will "retain" do on TYPE_DECLs?  It only makes sense to me
> > > for FUNCTION_DECL or non-automatic VAR_DECLs, unless for types
> > > it means the types with that result in vars with those types automatically
> > > getting "retain", but there is no code for that and I don't see "used"
> > > handled that way.
> >
> > Fixed.
> >
> > > > +  if (SUPPORTS_SHF_GNU_RETAIN
> > > > +  && (TREE_CODE (node) == FUNCTION_DECL
> > > > +   || (VAR_P (node) && TREE_STATIC (node))
> > > > +   || (TREE_CODE (node) == TYPE_DECL)))
> > >
> > > Likewise.
> >
> > Fixed.
> >
> > > > +;
> > > > +  else
> > > > +{
> > > > +  warning (OPT_Wattributes, "%qE attribute ignored", name);
> > > > +  *no_add_attrs = true;
> > > > +}
> > > > +
> > > > +  return NULL_TREE;
> > > > +}
> > > > +
> > > >  /* Handle a "externally_visible" attribute; arguments as in
> > > > struct attribute_spec.handler.  */
> > > >
> > > > diff --git a/gcc/common.opt b/gcc/common.opt
> > > > index c75dd36843e..70842481d4d 100644
> > > > --- a/gcc/common.opt
> > > > +++ b/gcc/common.opt
> > > > @@ -2404,6 +2404,10 @@ frerun-loop-opt
> > > >  Common Ignore
> > > >  Does nothing.  Preserved for backward compatibility.
> > > >
> > > > +fretain-used-symbols
> > > > +Common Var(flag_retain_used_symbols)
> > > > +Make the used attribute to imply the retain attribute if supported.
> > >
> > > English is not my native language, but I think the "to " doesn't belong
> > > here.
> >
> > Fixed.
> >
> > > > +@item -fretain-used-symbols
> > > > +@opindex fretain-used-symbols
> > > > +On systems with recent GNU assembler and linker, the compiler makes
> > >
> > > I think we should avoid recent here, new/old/recent etc. are relative 
> > > terms.
> > > Either use exact version (is it 2.36 or later) or say GNU assembler and
> > > linker that supports the @code{.retain} directive or something similar.
> >
> > Fixed.
> >
> > > >flags |= SECTION_NAMED;
> > > > -  if (SUPPORTS_SHF_GNU_RETAIN
> > > > -  && decl != nullptr
> > > > +  if (decl != nullptr
> > > >&& DECL_P (decl)
> > > > -  && DECL_PRESERVE_P (decl))
> > > > +  && DECL_PRESERVE_P (decl)
> > > > +  && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > > >  flags |= SECTION_RETAIN;
> > > >if (*slot == NULL)
> > > >  {
> > >
> > > I think none of the varasm.c code should use DECL_PRESERVE_P when it means
> > > retain, just lookup_attribute.
> >
> > Fixed.
> >
> > > > @@ -487,7 +487,8 @@ resolve_unique_section (tree decl, int reloc 
> > > > ATTRIBUTE_UNUSED,
> > > >if (DECL_SECTION_NAME (decl) == NULL
> > > >&& targetm_common.have_named_sections
> > > >&& (flag_function_or_data_sections
> > > > -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> > > > +   || (DECL_PRESERVE_P (decl)
> > > > +   && lookup_attribute ("retain", DECL_ATTRIBUTES (decl)))
> > > > || DECL_COMDAT_GROUP (decl)))
> > > >  {
> > > >targetm.asm_out.unique_section (decl, reloc);
> > > > @@ -1227,7 +1228,8 @@ get_variable_section (tree decl, bool 
> > > > prefer_noswitch_p)
> > > >  vnode->get_constructor ();
> > > >
> > > >if (DECL_COMMON (decl)
> > > > -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> > > > +  && !(DECL_PRESERVE_P (decl)
> > > > +&& lookup_attribute ("retain", DECL_ATTRIBUTES (decl
> > > >  {
> > > >/* If the decl has been given an explicit section name, or it 
> > > > resides
> > > >in a non-generic address space, then it isn't common, and 
> > > > shouldn't
> > > > @@ -7761,12 +7763,14 @@ switch_to_section (section *new_section, tree 
> > > > decl)
> > > >  {
> > > >if (in_section == new_section)
> > > >  {
> > > > -  if (SUPPORTS_SHF_GNU_RETAIN
> > > > -   && (new_section->common.flags & SECTION_NAMED)
> > > > +  if ((new_section->common.flags & SECTION_NAMED)
> > > > && decl != nullptr
> > > > && DECL_P (decl)
> > > > && (!!DECL_PRESERVE_P (decl)
> > > > -   != !!(new_section->common.flags & SECTION_RETAIN)))
> > > > +   != !!(new_section->common.flags & SECTION_RETAIN))
> > > > +   && (lookup_attribute ("retain",
> > > > + DECL_ATTRIBUTES (new_section->named.decl))
> > > > +   || 

Re: [PATCH] match.pd: Restrict clz cmp 0 replacement by single_use, PR99142

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 18, 2021 at 11:12:26AM +0100, Richard Biener via Gcc-patches wrote:
> On Thu, Feb 18, 2021 at 12:35 AM Hans-Peter Nilsson via Gcc-patches
>  wrote:
> >
> > If we're not going to eliminate the clz, it's better for the
> > comparison to use that result than its input, so we don't
> > extend the lifetime of the input.  Also, an additional use
> > of the result is more likely cheaper than a compare of the
> > input, in particular considering that the clz may have made
> > available a non-zero condition matching the original use.
> 
> Of course the non-zero compare can execute concurrently
> with the clz if we use the input which can result in faster
> operation.  It's one of the many cases where local folding
> doesn't tell us enough.  One of the usual issues with
> single_use checks is that the IL is not fully DCEd while
> we fold it and thus dead uses can make sth !single_use
> even though it really is single_use.

And on the other side, the IL might not be fully CSEd yet and
so we can see single use, but after SCCVN it wouldn't be single use.

Jakub



Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Richard Biener via Gcc-patches
On Wed, Feb 17, 2021 at 12:25 PM Jozef Lawrynowicz
 wrote:
>
> On Tue, Feb 16, 2021 at 05:45:47PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > On Mon, Feb 15, 2021 at 02:35:07PM -0800, H.J. Lu via Gcc-patches wrote:
> > > When building Linux kernel, ld in bninutils 2.36 with GCC 11 generates
> > > thousands of
> > >
> > > ld: warning: orphan section `.data.event_initcall_finish' from 
> > > `init/main.o' being placed in section `.data.event_initcall_finish'
> > > ld: warning: orphan section `.data.event_initcall_start' from 
> > > `init/main.o' being placed in section `.data.event_initcall_start'
> > > ld: warning: orphan section `.data.event_initcall_level' from 
> > > `init/main.o' being placed in section `.data.event_initcall_level'
> > >
> > > Since these sections are marked with SHF_GNU_RETAIN, they are placed in
> > > separate sections.  They become orphan sections since they aren't expected
> > > in the Linux kernel linker script. But orphan sections normally don't work
> > > well with the Linux kernel linker script and the resulting kernel crashed.
> > >
> > > Add -fgnu-retain to disable SHF_GNU_RETAIN for Linux kernel build with
> > > -fno-gnu-retain.
> >
> > I'd say this shows that the changing of meaning of "used" attribute wasn't
> > really a good idea, the Linux kernel certainly won't be the only problem
> > and people use "used" attribute for many reasons and don't necessarily want
> > the different sections or different behavior of those sections if they use
> > it.
> >
> > So, can't we instead:
> > 1) restore the old behavior of "used" by default
> > 2) add "retain" attribute that implies the SHF_GNU_RETAIN stuff and fails
> >if the configured assembler/linker doesn't support those
> > 3) add a non-default option through which one could opt in for "used"
> >attribute to imply retain attribute too for projects that want that?
> >
>
> In previous discussions, it seemed to me that there was general support
> for the "used" attribute implying SHF_GNU_RETAIN and saving symbols from
> linker garbage collection[1]. Of course, the current implementation
> results in undesirable behavior - the thought that all linker scripts
> not supporting uniquely named sections would need to be updated is quite
> alarming.
>
> It's a shame that all this extra complication is required, just because
> we cannot have a ".retain ", directive.
>
> My preferred vision for this functionality was:
>   - SHF_GNU_RETAIN section flag indicates a section should be saved
> from linker garbage collection.
>   - ".retain " assembler directive applies SHF_GNU_RETAIN
> to the section containing .
>   - GCC "used" attribute emits a ".retain " directive for
> the symbol declaration is is applied to.  Applying the "used"
> attribute to a symbol declaration does not change the structure of
> the object file, beyond applying SHF_GNU_RETAIN to the section
> containing that symbol.
>
> H.J. vetoed ".retain "[2], since it fails the predicate:
>   Assembler directive referring to a symbol must operate on the symbol
>   table.
>
> So because of this veto, we have compromised on "code quality" so far,
> since any linker script not supporting named sections, used to link an
> application containing "used" symbols (now put into their own section) has
> potential undefined behavior.
>
> With the new proposed change to use a "retain" attribute, we now
> compromise on functionality, since the "used" attribute saving symbols
> from linker garbage collection is disabled by default.
>
> All because we cannot introduce an exception to the above predicate.
>
> I would like to re-open discussions on whether a ".retain 
> directive is acceptable. This would enable "used" to imply
> SHF_GNU_RETAIN, without changing the structure of the object file,
> thereby allowing the new functionality to be used without linker script
> modifications.
>
> If a Binutils global maintainer could side one way or the other on
> ".retain " (an opinion was never given by the Binutils
> maintainers when we had the discussions on that mailing list, although
> Jeff did support .retain in the GCC discussions[3]), then it will be
> easier to proceed knowing definitively that ".retain" is rejected and we
> have no choice but to go this route of having a separate "retain"
> attribute.

So I then propose, for GCC 11, to rip out / disable any use of SHF_GNU_RETAIN,
effectively restoring GCC 10 beahvior and re-consider for GCC 12.

Richard.

> Thanks,
> Jozef
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557097.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558123.html
> [3] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558398.html


Re: [PATCH v2] Add -fgnu-retain to place used symbols in SHF_GNU_RETAIN section

2021-02-18 Thread Richard Biener via Gcc-patches
On Wed, Feb 17, 2021 at 3:33 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Tue, Feb 16, 2021 at 11:59:21AM -0800, H.J. Lu wrote:
> >   PR target/99113
> >   * common.opt: Add -fgnu-retain.
>
> I'm not sure -fgnu-retain as the option name.
> Wouldn't say -fretain-used-vars be better?

-fretain-used-annotated-vars?  Because 'used' has a meaning in user terms
and I would not associate it with only 'used' attribute annotated variables.

> > @@ -1666,6 +1666,10 @@ floop-unroll-and-jam
> >  Common Var(flag_unroll_jam) Optimization
> >  Perform unroll-and-jam on loops.
> >
> > +fgnu-retain
> > +Common Var(flag_gnu_retain)
> > +Use SHF_GNU_RETAIN on used symbols if supported by the assembler and the 
> > linker.
>
> on variables with the used attribute?
>
> > diff --git a/gcc/toplev.c b/gcc/toplev.c
> > index d8cc254adef..119cd7c0432 100644
> > --- a/gcc/toplev.c
> > +++ b/gcc/toplev.c
> > @@ -1761,6 +1761,13 @@ process_options (void)
> >if (flag_large_source_files)
> >  line_table->default_range_bits = 0;
> >
> > +  if (flag_gnu_retain && !SUPPORTS_SHF_GNU_RETAIN)
> > +{
> > +  warning_at (UNKNOWN_LOCATION, 0, "%qs is not supported for this 
> > target",
> > +   "-fgnu-retain");
> > +  flag_gnu_retain = 0;
> > +}
> > +
> >/* Please don't change global_options after this point, those changes 
> > won't
> >   be reflected in optimization_{default,current}_node.  */
> >  }
> > diff --git a/gcc/varasm.c b/gcc/varasm.c
> > index 29478ab0d8d..4e0e30abee5 100644
> > --- a/gcc/varasm.c
> > +++ b/gcc/varasm.c
> > @@ -297,7 +297,7 @@ get_section (const char *name, unsigned int flags, tree 
> > decl,
> >slot = section_htab->find_slot_with_hash (name, htab_hash_string (name),
> >   INSERT);
> >flags |= SECTION_NAMED;
> > -  if (SUPPORTS_SHF_GNU_RETAIN
> > +  if (flag_gnu_retain
> >&& decl != nullptr
> >&& DECL_P (decl)
> >&& DECL_PRESERVE_P (decl))
> > @@ -487,7 +487,7 @@ resolve_unique_section (tree decl, int reloc 
> > ATTRIBUTE_UNUSED,
> >if (DECL_SECTION_NAME (decl) == NULL
> >&& targetm_common.have_named_sections
> >&& (flag_function_or_data_sections
> > -   || (SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl))
> > +   || (flag_gnu_retain && DECL_PRESERVE_P (decl))
> > || DECL_COMDAT_GROUP (decl)))
> >  {
> >targetm.asm_out.unique_section (decl, reloc);
>
> I'm not convinced this will work properly with LTO, the option from
> -flto -c compilation would be ignored.
> For functions, we usually mark the option Optimization and make it
> saved/restored on function switches, but this is for variables, so it will
> not work for those.
> I'd think better would be to add when seeing "used" attribute some
> artificial attribute (unless we have "retain" attribute to mean that, say
> "retain ") when the flag is on in handle_used_attribute and use that
> attribute in varasm instead of the option?

Yeah, that's very much better and more likely to work.

> > @@ -1227,7 +1227,7 @@ get_variable_section (tree decl, bool 
> > prefer_noswitch_p)
> >  vnode->get_constructor ();
> >
> >if (DECL_COMMON (decl)
> > -  && !(SUPPORTS_SHF_GNU_RETAIN && DECL_PRESERVE_P (decl)))
> > +  && !(flag_gnu_retain && DECL_PRESERVE_P (decl)))
> >  {
> >/* If the decl has been given an explicit section name, or it resides
> >in a non-generic address space, then it isn't common, and shouldn't
> > @@ -7761,7 +7761,7 @@ switch_to_section (section *new_section, tree decl)
> >  {
> >if (in_section == new_section)
> >  {
> > -  if (SUPPORTS_SHF_GNU_RETAIN
> > +  if (flag_gnu_retain
> > && (new_section->common.flags & SECTION_NAMED)
> > && decl != nullptr
> > && DECL_P (decl)
>
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -16168,6 +16168,11 @@ DSOs; if your program relies on reinitialization 
> > of a DSO via
> > @code{dlclose} and @code{dlopen}, you can use
> > @option{-fno-gnu-unique}.
> >
> > +@item -fgnu-retain
> > +@opindex fgnu-retain
> > +On systems with recent GNU assembler and linker, the compiler places
> > +used symbols in separate SHF_GNU_RETAIN sections.
>
> Again, variables with the used attribute.
>
> Jakub
>


Re: [PATCH] match.pd: Restrict clz cmp 0 replacement by single_use, PR99142

2021-02-18 Thread Richard Biener via Gcc-patches
On Thu, Feb 18, 2021 at 12:35 AM Hans-Peter Nilsson via Gcc-patches
 wrote:
>
> If we're not going to eliminate the clz, it's better for the
> comparison to use that result than its input, so we don't
> extend the lifetime of the input.  Also, an additional use
> of the result is more likely cheaper than a compare of the
> input, in particular considering that the clz may have made
> available a non-zero condition matching the original use.

Of course the non-zero compare can execute concurrently
with the clz if we use the input which can result in faster
operation.  It's one of the many cases where local folding
doesn't tell us enough.  One of the usual issues with
single_use checks is that the IL is not fully DCEd while
we fold it and thus dead uses can make sth !single_use
even though it really is single_use.

Also if there's a use of the input live at the point of the
compare we want to replace it's likely better to do the
transform.

> The "s" modifier doesn't stop this situation, as the
> transformation wouldn't result in "an expression with more
> than one operator"; a gating single_use condition on the
> result is necessary.

Note that :s was done this way for the use in VN where
we can this way CSE c != 0 with an earlier a >= 0.

It might be an interesting idea to allow match.pd consumers
to "switch" behavior of :s according to their needs.

> Regtested cris-elf and x86_64-linux.
> Ok to commit?

OK.

Thanks,
Richard.

> gcc:
> PR tree-optimization/99142
> * match.pd (clz cmp 0): Gate replacement on single_use of clz result.
>
> gcc/testsuite:
> PR tree-optimization/99142
> * testsuite/gcc.dg/tree-ssa/pr99142.c: New test.
> ---
>  gcc/match.pd|  4 ++--
>  gcc/testsuite/gcc.dg/tree-ssa/pr99142.c | 14 ++
>  2 files changed, 16 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr99142.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index e14f69744d7e..760f773cf1b1 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -6315,8 +6315,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (for op (eq ne)
>cmp (lt ge)
>(simplify
> -   (op (clz:s @0) INTEGER_CST@1)
> -   (if (integer_zerop (@1))
> +   (op (clz:s@2 @0) INTEGER_CST@1)
> +   (if (integer_zerop (@1) && single_use (@2))
>  /* clz(X) == 0 is (int)X < 0 and clz(X) != 0 is (int)X >= 0.  */
>  (with { tree stype = signed_type_for (TREE_TYPE (@0));
> HOST_WIDE_INT val = 0;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr99142.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr99142.c
> new file mode 100644
> index ..1781a89de32a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr99142.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not " >= 0\\)" "optimized" } } */
> +int f(int a, int *b, int *d)
> +{
> +  int c = __builtin_clz(a);
> +
> +  *b = c;
> +
> +  if (c != 0)
> +*d = c;
> +
> +  return c;
> +}
> --
> 2.11.0
>


Re: [PATCH] profiling: fix streaming of TOPN counters

2021-02-18 Thread Richard Biener via Gcc-patches
On Thu, Feb 18, 2021 at 10:46 AM Martin Liška  wrote:
>
> On 2/18/21 10:31 AM, Richard Biener wrote:
> > On Wed, Feb 17, 2021 at 2:16 PM Martin Liška  wrote:
> >>
> >> On 2/17/21 9:36 AM, Martin Liška wrote:
> >>> I'll write it even more robust...
> >>
> >> This is more elegant approach I've just tested on the instrumented clang.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >
> > Isn't it still too complicated?  We're asked to write N counters so why
> > don't we just write N counters?
>
> Well, we are asked to stream N counters. But each has a variable length,
> which makes it funny :)
>
>Thus, you already do
> >
> > for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
> > -  node != NULL; node = node->next)
> > +  node != NULL; node = node->next, j++)
> >  {
> >gcov_write_counter (node->value);
> >gcov_write_counter (node->count);
> > +
> > + /* Stop when we reach expected number of items.  */
> > + if (j + 1 == list_sizes[i])
> > +   break;
> >  }
> >
> > why isn't this then only thing you need (just using pair_count as gathered
> > previously)?  And just count pair_count to zero as IV?  No need to
> > do the node != NULL check?
>
> We can't do that, because that will lead in a corrupted profile.
> We can have 2 problematic situations:
>
> 1) linked list is shorter, e.g.
>
> 10 2 1 10
>
> Here 2 represents 2 tuples, but we stream only one {1: 10}. That happens 
> in the instrumented clang.
>
> 2) linked list is longer, e.g.
> 10 1 1 5 22 5
>
> Here 1 represents 1 tuple, be we stream 2 tuples {1: 10} and {22: 5}.

I guess the problem is that for whatever reason the stream includes

   unsigned disk_size = GCOV_TOPN_DISK_COUNTERS * counters + 2 * pair_total;
   gcov_write_tag_length (GCOV_TAG_FOR_COUNTER (t_ix),
 GCOV_TAG_COUNTER_LENGTH (disk_size));

which has the sum of the list lengths, but ...

-  gcov_type pair_count = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1];
   gcov_write_counter (ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i]);
-  gcov_write_counter (pair_count);

we stream something like that again.  What are the two different counters?
You save one, but the other you take async again?

The disk format here is sub-optimal at least and optimizing the read side
which has locking in place for making the write-side racy doesn't look good.

> >
> > Note architectures with less nice memory ordering guarantees might
> > eventually see partially updated pointers and counters so I think
> > we at least want atomic_read ()s of the values with the weakest
> > consistency possible.  (but that can be done as followup if we agree on 
> > that)
>
> Can we really see a partially updated pointer?

Not on x86 for aligned data.  Note the same applies to the counter values.

We've seen allocating memory from libgcov is problematic, so this is why I'm
asking.

> Thanks,
> Martin
>
> >
> > Richard.
> >
> >> Thanks,
> >> Martin
>


Re: [PATCH] profiling: fix streaming of TOPN counters

2021-02-18 Thread Martin Liška

On 2/18/21 10:31 AM, Richard Biener wrote:

On Wed, Feb 17, 2021 at 2:16 PM Martin Liška  wrote:


On 2/17/21 9:36 AM, Martin Liška wrote:

I'll write it even more robust...


This is more elegant approach I've just tested on the instrumented clang.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?


Isn't it still too complicated?  We're asked to write N counters so why
don't we just write N counters?


Well, we are asked to stream N counters. But each has a variable length,
which makes it funny :)

  Thus, you already do


for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
-  node != NULL; node = node->next)
+  node != NULL; node = node->next, j++)
 {
   gcov_write_counter (node->value);
   gcov_write_counter (node->count);
+
+ /* Stop when we reach expected number of items.  */
+ if (j + 1 == list_sizes[i])
+   break;
 }

why isn't this then only thing you need (just using pair_count as gathered
previously)?  And just count pair_count to zero as IV?  No need to
do the node != NULL check?


We can't do that, because that will lead in a corrupted profile.
We can have 2 problematic situations:

1) linked list is shorter, e.g.

10 2 1 10

Here 2 represents 2 tuples, but we stream only one {1: 10}. That happens in 
the instrumented clang.

2) linked list is longer, e.g.
10 1 1 5 22 5

Here 1 represents 1 tuple, be we stream 2 tuples {1: 10} and {22: 5}.



Note architectures with less nice memory ordering guarantees might
eventually see partially updated pointers and counters so I think
we at least want atomic_read ()s of the values with the weakest
consistency possible.  (but that can be done as followup if we agree on that)


Can we really see a partially updated pointer?

Thanks,
Martin



Richard.


Thanks,
Martin




Re: [PATCH] profiling: fix streaming of TOPN counters

2021-02-18 Thread Richard Biener via Gcc-patches
On Wed, Feb 17, 2021 at 2:16 PM Martin Liška  wrote:
>
> On 2/17/21 9:36 AM, Martin Liška wrote:
> > I'll write it even more robust...
>
> This is more elegant approach I've just tested on the instrumented clang.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Isn't it still too complicated?  We're asked to write N counters so why
don't we just write N counters?  Thus, you already do

   for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
-  node != NULL; node = node->next)
+  node != NULL; node = node->next, j++)
{
  gcov_write_counter (node->value);
  gcov_write_counter (node->count);
+
+ /* Stop when we reach expected number of items.  */
+ if (j + 1 == list_sizes[i])
+   break;
}

why isn't this then only thing you need (just using pair_count as gathered
previously)?  And just count pair_count to zero as IV?  No need to
do the node != NULL check?

Note architectures with less nice memory ordering guarantees might
eventually see partially updated pointers and counters so I think
we at least want atomic_read ()s of the values with the weakest
consistency possible.  (but that can be done as followup if we agree on that)

Richard.

> Thanks,
> Martin


Re: [PATCH] i386: Avoid C++ global constructors in every object that includes i386.h

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 18, 2021 at 09:42:00AM +0100, Richard Biener wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> OK.  Can you quickly try whether GCC 4.8 is happy with it?

Yes, at least non-bootstrap build with GCC 4.8.5 is just fine with it.

Jakub



Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 17, 2021 at 02:11:43PM -0700, Martin Sebor wrote:
> On 2/17/21 1:47 PM, Jakub Jelinek wrote:
> > On Wed, Feb 17, 2021 at 01:27:55PM -0700, Martin Sebor wrote:
> > 
> > Not in this patch, but I've looked at what maxobjsize is and wonder why
> > the roundtrip tree -> HOST_WIDE_INT -> offset_int:
> >const offset_int maxobjsize = tree_to_shwi (max_object_size ());
> > Can't it be
> >const offset_int maxobjsize = wi::to_offset (max_object_size ());
> > ?
> 
> Yes, that's what it is elsewhere so this should do the same.  I've
> changed it.

Ok.

> > Doesn't arrbounds[1] == 0 mean there will be warnings for any accesses?
> > For eltsize == 0 I think you shouldn't warn when nelts isn't known,
> > instead of always warning, arr[1] will have the same address as
> > arr[0] ...
> 
> This branch is entered for VLAs of zero-length arrays where we want
> to warn, like this:
> 
> void f (void*);
> 
> void g (int n)
> {
>   int a[n][0];
>   ((int*)a)[0] = 0;
>   f (a);
> }

For this you do want to warn, but not the way you warn with the patch:
xxx.c: In function ‘g’:
xxx.c:6:12: warning: array subscript 0 is outside array bounds of 
‘int[][0]’ [-Warray-bounds]
6 |   ((int*)a)[0] = 0;
  |   ~^~~
xxx.c:5:7: note: while referencing ‘a’
5 |   int a[n][0];
  |   ^

The message doesn't make it clear which of the two subscripts is out of
bounds, yes, for [0] it would be outside of bounds, but for the VLA index
no index < n would be outside of bounds.

Consider a different (GNU C, in C++ struct S has non-zero size) testcase:
void f (void*);

void g (int n)
{
  struct S {} a[n];
  ((int*)a)[0] = 0;
  f (a);
}
yyy.c:6:12: warning: array subscript 0 is outside array bounds of ‘struct 
S[]’ [-Warray-bounds]
6 |   ((int*)a)[0] = 0;
  |   ~^~~
yyy.c:5:15: note: while referencing ‘a’
5 |   struct S {} a[n];
  |   ^
I bet that means you are really complaining about the VLA bound rather than
the [0] bound even in the first case, because the wording is otherwise the
same.  And for g (154) the array subscript 0 is certainly not a problem,
so the warning would need to be worded differently in that case.

Jakub



Re: [PATCH] Fix producer string memory leaks

2021-02-18 Thread Richard Biener via Gcc-patches
On Tue, Feb 16, 2021 at 5:09 PM Martin Sebor  wrote:
>
> On 2/15/21 7:39 AM, Richard Biener wrote:
> > On Mon, Feb 15, 2021 at 2:46 PM Martin Liška  wrote:
> >>
> >> On 2/12/21 5:56 PM, Martin Sebor wrote:
> >>> On 2/12/21 2:09 AM, Richard Biener via Gcc-patches wrote:
>  On Thu, Feb 11, 2021 at 6:41 PM Martin Liška  wrote:
> >
> > Hello.
> >
> > This fixes 2 memory leaks I noticed.
> >
> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >
> > Ready to be installed?
> 
>  OK.
> 
> > Thanks,
> > Martin
> >
> > gcc/ChangeLog:
> >
> >   * opts-common.c (decode_cmdline_option): Release werror_arg.
> >   * opts.c (gen_producer_string): Release output of
> >   gen_command_line_string.
> > ---
> > gcc/opts-common.c | 1 +
> > gcc/opts.c| 7 +--
> > 2 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/opts-common.c b/gcc/opts-common.c
> > index 6cb5602896d..5e10edeb4cf 100644
> > --- a/gcc/opts-common.c
> > +++ b/gcc/opts-common.c
> > @@ -766,6 +766,7 @@ decode_cmdline_option (const char *const *argv, 
> > unsigned int lang_mask,
> >   werror_arg[0] = 'W';
> >
> >   size_t warning_index = find_opt (werror_arg, lang_mask);
> > +  free (werror_arg);
> >>>
> >>> Sorry to butt in here, but since we're having a discussion on this
> >>> same subject in another review of a fix for a memory leak, I thought
> >>> I'd pipe up: I would suggest a better (more in line with C++ and more
> >>> future-proof) solution is to replace the call to xstrdup (and the need
> >>> to subsequently call free) with std::string.
> >>
> >> Hello.
> >>
> >> To be honest, I like the suggested approach using std::string. On the 
> >> other hand,
> >> I don't want to mix existing C API (char *) with std::string.
> >
> > That's one of the main problems.
> >
> >> I'm sending a patch that fixed the remaining leaks.
> >
> > OK.
> >
> >>>
> >   if (warning_index != OPT_SPECIAL_unknown)
> >   {
> > const struct cl_option *warning_option
> > diff --git a/gcc/opts.c b/gcc/opts.c
> > index fc5f61e13cc..24bb64198c8 100644
> > --- a/gcc/opts.c
> > +++ b/gcc/opts.c
> > @@ -3401,8 +3401,11 @@ char *
> > gen_producer_string (const char *language_string, cl_decoded_option 
> > *options,
> >unsigned int options_count)
> > {
> > -  return concat (language_string, " ", version_string, " ",
> > -gen_command_line_string (options, options_count), 
> > NULL);
> > +  char *cmdline = gen_command_line_string (options, options_count);
> > +  char *combined = concat (language_string, " ", version_string, " ",
> > +  cmdline, NULL);
> > +  free (cmdline);
> > +  return combined;
> > }
> >>>
> >>> Here, changing gen_command_line_string() to return std::string instead
> >>> of a raw pointer would similarly avoid having to remember to free
> >>> the pointer (and forgetting).  The function has two other callers,
> >>> both in gcc/toplev.c, and both also leak the memory for the same
> >>> reason as here.
> >>
> >> Btw. have we made a general consensus that using std::string is fine in the
> >> GCC internals?
> >
> > No, we didn't.  But if Martin likes RAII adding sth like
>
> It's not so much what I like as about using established C++ idioms
> to help us avoid common memory management mistakes (leaks, dangling
> pointers, etc.)
>
> With respect to the C++ standard library, the GCC Coding Conventions
> say:
>
>Use of the standard library is permitted.  Note, however, that it
>is currently not usable with garbage collected data.
>
> So as a return value of a function, or as a local variable, using
> std::string seems entirely appropriate, and I have been using it
> that way.
>
> Richard, if that's not in line with your understanding of
> the intent of the text in the conventions or with your expectations,

The conventions were written / changed arbitrarily without real consent.

Using std::string just because it implements the RAII part you like
but then still needing to interface with C string APIs on _both_ sides
makes std::string a bad fit.

GCCs code-base is a mess of C/C++ mix already, throwing std::string
inbetween a C string API sandwitch isn't going to improve things.

> please propose a change for everyone's consideration.  If a consensus
> emerges not to use std::string in GCC (or any other part of the C++
> library) let's update the coding conventions.  FWIW, I have prototyped
> a simple string class over the weekend (based on auto_vec) that I'm
> willing to contribute if std::string turns out to be out of favor.
>
> > template 
> > class auto_free
> > {
> > auto_free (T *ptr) : m_ptr (ptr) {};
> >~auto_free () { m_ptr->~T 

Re: [PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.

2021-02-18 Thread Martin Liška

On 2/16/21 10:17 PM, Qing Zhao wrote:

Hello,

What’s the status of this patch now? Is there any major technical issue with 
the patch?

Our company has been waiting for this patch for almost one year, we need it for 
our important application.


Hello.

You are right, we've been waiting for quite some time.



Could this one be approved and committed to gcc11?


@richi: ?

Thanks,
Martin



Thanks.

Qing


On Feb 5, 2021, at 3:34 AM, Martin Liška  wrote:

Hello.

Based on discussion with Richi, I'm re-sending the patch. Note that the patch
has been waiting for a review for almost one year and I would like to see
it in GCC 11.1.

Thank you,
Martin
<0001-Change-semantics-of-frecord-gcc-switches-and-add-fre.patch>






Re: [PATCH v2 2/2] ada: add 128bit operation to MIPS N32 and N64

2021-02-18 Thread Arnaud Charlet
> > > For MIPS N64 and N32:
> > >   add GNATRTL_128BIT_PAIRS to LIBGNAT_TARGET_PAIRS
> > >   add GNATRTL_128BIT_OBJS to EXTRA_GNATRTL_NONTASKING_OBJS
> > >
> > > gcc/ada/ChangeLog:
> > >   PR ada/98996
> > >   * Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
> > >   : add 128Bit operation file to MIPS N64 and N32.
> >
> > Note that the ChangeLog is generated automatically these days.
> 
> Thank you for your instruction. I noticed the contrib/mklog.py.
> It generate the template of ChangeLog.
> Should I resend a new version of patch?

No need to as far as I'm concerned, you can go ahead and commit your change.
There are commit hooks checking the proper format of the commit log these
days anyway.

Arno


Re: [PATCH v2 2/2] ada: add 128bit operation to MIPS N32 and N64

2021-02-18 Thread YunQiang Su
Arnaud Charlet  于2021年2月18日周四 下午3:38写道:
>
> > For MIPS N64 and N32:
> >   add GNATRTL_128BIT_PAIRS to LIBGNAT_TARGET_PAIRS
> >   add GNATRTL_128BIT_OBJS to EXTRA_GNATRTL_NONTASKING_OBJS
> >
> > gcc/ada/ChangeLog:
> >   PR ada/98996
> >   * Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
> >   : add 128Bit operation file to MIPS N64 and N32.
>
> Note that the ChangeLog is generated automatically these days.

Thank you for your instruction. I noticed the contrib/mklog.py.
It generate the template of ChangeLog.
Should I resend a new version of patch?

>
> The change is OK, thanks.
>
> > ---
> >  gcc/ada/ChangeLog|  6 ++
> >  gcc/ada/Makefile.rtl | 12 
> >  2 files changed, 18 insertions(+)
> >
> > diff --git a/gcc/ada/ChangeLog b/gcc/ada/ChangeLog
> > index 52faefaa2ae..3565a32c5ac 100644
> > --- a/gcc/ada/ChangeLog
> > +++ b/gcc/ada/ChangeLog
> > @@ -1,3 +1,9 @@
> > +2021-02-18  YunQiang Su  
> > +
> > + PR ada/98996
> > + * Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
> > + : add 128Bit operation file to MIPS N64 and N32.
> > +
> >  2021-02-12  Arnaud Charlet  
> >
> >   * repinfo.ads, repinfo.adb (*SO_Ref*): Restore.
> > diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
> > index 35faf13ea46..d86eb8acbf3 100644
> > --- a/gcc/ada/Makefile.rtl
> > +++ b/gcc/ada/Makefile.rtl
> > @@ -2311,6 +2311,18 @@ ifeq ($(strip $(filter-out mips% 
> > linux%,$(target_cpu) $(target_os))),)
> >s-tpopsp.adb >system.ads >
> > +  ifeq ($(strip $(filter-out mips64% mipsisa64%,$(target_cpu))),)
> > +ifneq ($(strip $(MULTISUBDIR)),/32)
> > +  LIBGNAT_TARGET_PAIRS += $(GNATRTL_128BIT_PAIRS)
> > +  EXTRA_GNATRTL_NONTASKING_OBJS += $(GNATRTL_128BIT_OBJS)
> > +endif
> > +  else
> > +ifeq ($(strip $(filter-out /64 /n32,$(MULTISUBDIR))),)
> > +  LIBGNAT_TARGET_PAIRS += $(GNATRTL_128BIT_PAIRS)
> > +  EXTRA_GNATRTL_NONTASKING_OBJS += $(GNATRTL_128BIT_OBJS)
> > +endif
> > +  endif
> > +
> >TOOLS_TARGET_PAIRS = indepsw.adb >
> >EXTRA_GNATRTL_TASKING_OBJS=s-linux.o
> > --
> > 2.20.1


Re: [PATCH] i386: Avoid C++ global constructors in every object that includes i386.h

2021-02-18 Thread Richard Biener
On Wed, 17 Feb 2021, Jakub Jelinek wrote:

> Hi!
> 
> When looking at recog.o when working on the recog.[ch] changes to make sure
> I have not introduced runtime construction of recog_data variable, I have
> noticed that at least in unoptimized build, every single *.o file that
> included i386.h has lots of runtime constructors for all the PTA_*
> variables.
> 
> As we now require C++11, the following patch makes those constexpr so that
> they don't need runtime initialization.
> I've verified that ~ 8276 bytes long 
> _Z41__static_initialization_and_destruction_0ii
> at -O0 is gone from every *.o that included i386.h (and doesn't really need
> any global ctors anymore).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.  Can you quickly try whether GCC 4.8 is happy with it?

Thanks,
Richard.

> 2021-02-17  Jakub Jelinek  
> 
>   * wide-int-bitmask.h (wide_int_bitmask::wide_int_bitmask (),
>   wide_int_bitmask::wide_int_bitmask (uint64_t),
>   wide_int_bitmask::wide_int_bitmask (uint64_t, uint64_t),
>   wide_int_bitmask::operator ~ () const,
>   wide_int_bitmask::operator | (wide_int_bitmask) const,
>   wide_int_bitmask::operator & (wide_int_bitmask) const): Use constexpr
>   instead of inline.
>   * config/i386/i386.h (PTA_3DNOW, PTA_3DNOW_A, PTA_64BIT, PTA_ABM,
>   PTA_AES, PTA_AVX, PTA_BMI, PTA_CX16, PTA_F16C, PTA_FMA, PTA_FMA4,
>   PTA_FSGSBASE, PTA_LWP, PTA_LZCNT, PTA_MMX, PTA_MOVBE, PTA_NO_SAHF,
>   PTA_PCLMUL, PTA_POPCNT, PTA_PREFETCH_SSE, PTA_RDRND, PTA_SSE, PTA_SSE2,
>   PTA_SSE3, PTA_SSE4_1, PTA_SSE4_2, PTA_SSE4A, PTA_SSSE3, PTA_TBM,
>   PTA_XOP, PTA_AVX2, PTA_BMI2, PTA_RTM, PTA_HLE, PTA_PRFCHW, PTA_RDSEED,
>   PTA_ADX, PTA_FXSR, PTA_XSAVE, PTA_XSAVEOPT, PTA_AVX512F, PTA_AVX512ER,
>   PTA_AVX512PF, PTA_AVX512CD, PTA_NO_TUNE, PTA_SHA, PTA_PREFETCHWT1,
>   PTA_CLFLUSHOPT, PTA_XSAVEC, PTA_XSAVES, PTA_AVX512DQ, PTA_AVX512BW,
>   PTA_AVX512VL, PTA_AVX512IFMA, PTA_AVX512VBMI, PTA_CLWB, PTA_MWAITX,
>   PTA_CLZERO, PTA_NO_80387, PTA_PKU, PTA_AVX5124VNNIW, PTA_AVX5124FMAPS,
>   PTA_AVX512VPOPCNTDQ, PTA_SGX, PTA_AVX512VNNI, PTA_GFNI, PTA_VAES,
>   PTA_AVX512VBMI2, PTA_VPCLMULQDQ, PTA_AVX512BITALG, PTA_RDPID,
>   PTA_PCONFIG, PTA_WBNOINVD, PTA_AVX512VP2INTERSECT, PTA_PTWRITE,
>   PTA_AVX512BF16, PTA_WAITPKG, PTA_MOVDIRI, PTA_MOVDIR64B, PTA_ENQCMD,
>   PTA_CLDEMOTE, PTA_SERIALIZE, PTA_TSXLDTRK, PTA_AMX_TILE, PTA_AMX_INT8,
>   PTA_AMX_BF16, PTA_UINTR, PTA_HRESET, PTA_KL, PTA_WIDEKL, PTA_AVXVNNI,
>   PTA_X86_64_BASELINE, PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4,
>   PTA_CORE2, PTA_NEHALEM, PTA_WESTMERE, PTA_SANDYBRIDGE, PTA_IVYBRIDGE,
>   PTA_HASWELL, PTA_BROADWELL, PTA_SKYLAKE, PTA_SKYLAKE_AVX512,
>   PTA_CASCADELAKE, PTA_COOPERLAKE, PTA_CANNONLAKE, PTA_ICELAKE_CLIENT,
>   PTA_ICELAKE_SERVER, PTA_TIGERLAKE, PTA_SAPPHIRERAPIDS, PTA_ALDERLAKE,
>   PTA_KNL, PTA_BONNELL, PTA_SILVERMONT, PTA_GOLDMONT, PTA_GOLDMONT_PLUS,
>   PTA_TREMONT, PTA_KNM): Use constexpr instead of const.
> 
> --- gcc/wide-int-bitmask.h.jj 2021-01-04 10:25:38.611236338 +0100
> +++ gcc/wide-int-bitmask.h2021-02-16 15:55:55.848542878 +0100
> @@ -23,14 +23,14 @@ along with GCC; see the file COPYING3.
>  class wide_int_bitmask
>  {
>  public:
> -  inline wide_int_bitmask ();
> -  inline wide_int_bitmask (uint64_t l);
> -  inline wide_int_bitmask (uint64_t l, uint64_t h);
> +  constexpr wide_int_bitmask ();
> +  constexpr wide_int_bitmask (uint64_t l);
> +  constexpr wide_int_bitmask (uint64_t l, uint64_t h);
>inline wide_int_bitmask  &= (wide_int_bitmask);
>inline wide_int_bitmask  |= (wide_int_bitmask);
> -  inline wide_int_bitmask operator ~ () const;
> -  inline wide_int_bitmask operator & (wide_int_bitmask) const;
> -  inline wide_int_bitmask operator | (wide_int_bitmask) const;
> +  constexpr wide_int_bitmask operator ~ () const;
> +  constexpr wide_int_bitmask operator & (wide_int_bitmask) const;
> +  constexpr wide_int_bitmask operator | (wide_int_bitmask) const;
>inline wide_int_bitmask operator >> (int);
>inline wide_int_bitmask operator << (int);
>inline bool operator == (wide_int_bitmask) const;
> @@ -38,19 +38,19 @@ public:
>uint64_t low, high;
>  };
>  
> -inline
> +constexpr
>  wide_int_bitmask::wide_int_bitmask ()
>  : low (0), high (0)
>  {
>  }
>  
> -inline
> +constexpr
>  wide_int_bitmask::wide_int_bitmask (uint64_t l)
>  : low (l), high (0)
>  {
>  }
>  
> -inline
> +constexpr
>  wide_int_bitmask::wide_int_bitmask (uint64_t l, uint64_t h)
>  : low (l), high (h)
>  {
> @@ -72,25 +72,22 @@ wide_int_bitmask::operator |= (wide_int_
>return *this;
>  }
>  
> -inline wide_int_bitmask
> +constexpr wide_int_bitmask
>  wide_int_bitmask::operator ~ () const
>  {
> -  wide_int_bitmask ret (~low, ~high);
> -  return ret;
> +  return wide_int_bitmask (~low, ~high);
>  }
>  
> -inline wide_int_bitmask
> +constexpr wide_int_bitmask
>  

Re: [PATCH] array-bounds, v2: Fix up ICE on overaligned variables [PR99109]

2021-02-18 Thread Richard Biener
On Thu, 18 Feb 2021, Jakub Jelinek wrote:

> On Wed, Feb 17, 2021 at 02:38:04PM -0700, Martin Sebor via Gcc-patches wrote:
> > How does build_printable_array_type sound?
> 
> This adjusted version also works and has been successfully
> bootstrapped/regtested on x86_64-linux and i686-linux.
> 
> Ok for trunk?

OK.

Richard.

> 2021-02-18  Jakub Jelinek  
> 
>   PR middle-end/99109
>   * gimple-array-bounds.cc (build_zero_elt_array_type): Rename to ...
>   (build_printable_array_type): ... this.  Add nelts argument.  For
>   overaligned eltype, use TYPE_MAIN_VARIANT (eltype) instead.  If
>   nelts, call build_array_type_nelts.
>   (array_bounds_checker::check_mem_ref): Use build_printable_array_type
>   instead of build_zero_elt_array_type and build_array_type_nelts.
> 
>   * g++.dg/warn/Warray-bounds-17.C: New test.
> 
> --- gcc/gimple-array-bounds.cc.jj 2021-01-04 10:25:37.471249246 +0100
> +++ gcc/gimple-array-bounds.cc2021-02-17 23:30:34.168721374 +0100
> @@ -372,12 +372,23 @@ array_bounds_checker::check_array_ref (l
>return warned;
>  }
>  
> -/* Hack around the internal representation constraints and build a zero
> -   element array type that actually renders as T[0] in diagnostcs.  */
> +/* Wrapper around build_array_type_nelts that makes sure the array
> +   can be created at all and handles zero sized arrays specially.  */
>  
>  static tree
> -build_zero_elt_array_type (tree eltype)
> +build_printable_array_type (tree eltype, unsigned HOST_WIDE_INT nelts)
>  {
> +  if (TYPE_SIZE_UNIT (eltype)
> +  && TREE_CODE (TYPE_SIZE_UNIT (eltype)) == INTEGER_CST
> +  && !integer_zerop (TYPE_SIZE_UNIT (eltype))
> +  && TYPE_ALIGN_UNIT (eltype) > 1
> +  && wi::zext (wi::to_wide (TYPE_SIZE_UNIT (eltype)),
> +ffs_hwi (TYPE_ALIGN_UNIT (eltype)) - 1) != 0)
> +eltype = TYPE_MAIN_VARIANT (eltype);
> +
> +  if (nelts)
> +return build_array_type_nelts (eltype, nelts);
> +
>tree idxtype = build_range_type (sizetype, size_zero_node, NULL_TREE);
>tree arrtype = build_array_type (eltype, idxtype);
>arrtype = build_distinct_type_copy (TYPE_MAIN_VARIANT (arrtype));
> @@ -561,10 +572,7 @@ array_bounds_checker::check_mem_ref (loc
>   return false;
>  
>offset_int nelts = arrbounds[1] / eltsize;
> -  if (nelts == 0)
> - reftype = build_zero_elt_array_type (reftype);
> -  else
> - reftype = build_array_type_nelts (reftype, nelts.to_uhwi ());
> +  reftype = build_printable_array_type (reftype, nelts.to_uhwi ());
>  }
>else if (TREE_CODE (arg) == ADDR_EXPR)
>  {
> @@ -675,7 +683,7 @@ array_bounds_checker::check_mem_ref (loc
>/* Treat a reference to a non-array object as one to an array
>of a single element.  */
>if (TREE_CODE (reftype) != ARRAY_TYPE)
> - reftype = build_array_type_nelts (reftype, 1);
> + reftype = build_printable_array_type (reftype, 1);
>  
>/* Extract the element type out of MEM_REF and use its size
>to compute the index to print in the diagnostic; arrays
> --- gcc/testsuite/g++.dg/warn/Warray-bounds-17.C.jj   2021-02-16 
> 17:24:14.178813304 +0100
> +++ gcc/testsuite/g++.dg/warn/Warray-bounds-17.C  2021-02-16 
> 17:23:35.305251062 +0100
> @@ -0,0 +1,15 @@
> +// PR middle-end/99109
> +// { dg-do compile }
> +// { dg-options "-O2 -Warray-bounds" }
> +
> +typedef int A __attribute__((aligned (64)));
> +void foo (int *);
> +
> +void
> +bar (void)
> +{
> +  A b;   // { dg-message "while referencing" }
> +  int *p = 
> +  int *x = (p - 1);  // { dg-warning "outside array bounds" }
> +  foo (x);
> +}
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[PATCH] array-bounds, v2: Fix up ICE on overaligned variables [PR99109]

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 17, 2021 at 02:38:04PM -0700, Martin Sebor via Gcc-patches wrote:
> How does build_printable_array_type sound?

This adjusted version also works and has been successfully
bootstrapped/regtested on x86_64-linux and i686-linux.

Ok for trunk?

2021-02-18  Jakub Jelinek  

PR middle-end/99109
* gimple-array-bounds.cc (build_zero_elt_array_type): Rename to ...
(build_printable_array_type): ... this.  Add nelts argument.  For
overaligned eltype, use TYPE_MAIN_VARIANT (eltype) instead.  If
nelts, call build_array_type_nelts.
(array_bounds_checker::check_mem_ref): Use build_printable_array_type
instead of build_zero_elt_array_type and build_array_type_nelts.

* g++.dg/warn/Warray-bounds-17.C: New test.

--- gcc/gimple-array-bounds.cc.jj   2021-01-04 10:25:37.471249246 +0100
+++ gcc/gimple-array-bounds.cc  2021-02-17 23:30:34.168721374 +0100
@@ -372,12 +372,23 @@ array_bounds_checker::check_array_ref (l
   return warned;
 }
 
-/* Hack around the internal representation constraints and build a zero
-   element array type that actually renders as T[0] in diagnostcs.  */
+/* Wrapper around build_array_type_nelts that makes sure the array
+   can be created at all and handles zero sized arrays specially.  */
 
 static tree
-build_zero_elt_array_type (tree eltype)
+build_printable_array_type (tree eltype, unsigned HOST_WIDE_INT nelts)
 {
+  if (TYPE_SIZE_UNIT (eltype)
+  && TREE_CODE (TYPE_SIZE_UNIT (eltype)) == INTEGER_CST
+  && !integer_zerop (TYPE_SIZE_UNIT (eltype))
+  && TYPE_ALIGN_UNIT (eltype) > 1
+  && wi::zext (wi::to_wide (TYPE_SIZE_UNIT (eltype)),
+  ffs_hwi (TYPE_ALIGN_UNIT (eltype)) - 1) != 0)
+eltype = TYPE_MAIN_VARIANT (eltype);
+
+  if (nelts)
+return build_array_type_nelts (eltype, nelts);
+
   tree idxtype = build_range_type (sizetype, size_zero_node, NULL_TREE);
   tree arrtype = build_array_type (eltype, idxtype);
   arrtype = build_distinct_type_copy (TYPE_MAIN_VARIANT (arrtype));
@@ -561,10 +572,7 @@ array_bounds_checker::check_mem_ref (loc
return false;
 
   offset_int nelts = arrbounds[1] / eltsize;
-  if (nelts == 0)
-   reftype = build_zero_elt_array_type (reftype);
-  else
-   reftype = build_array_type_nelts (reftype, nelts.to_uhwi ());
+  reftype = build_printable_array_type (reftype, nelts.to_uhwi ());
 }
   else if (TREE_CODE (arg) == ADDR_EXPR)
 {
@@ -675,7 +683,7 @@ array_bounds_checker::check_mem_ref (loc
   /* Treat a reference to a non-array object as one to an array
 of a single element.  */
   if (TREE_CODE (reftype) != ARRAY_TYPE)
-   reftype = build_array_type_nelts (reftype, 1);
+   reftype = build_printable_array_type (reftype, 1);
 
   /* Extract the element type out of MEM_REF and use its size
 to compute the index to print in the diagnostic; arrays
--- gcc/testsuite/g++.dg/warn/Warray-bounds-17.C.jj 2021-02-16 
17:24:14.178813304 +0100
+++ gcc/testsuite/g++.dg/warn/Warray-bounds-17.C2021-02-16 
17:23:35.305251062 +0100
@@ -0,0 +1,15 @@
+// PR middle-end/99109
+// { dg-do compile }
+// { dg-options "-O2 -Warray-bounds" }
+
+typedef int A __attribute__((aligned (64)));
+void foo (int *);
+
+void
+bar (void)
+{
+  A b; // { dg-message "while referencing" }
+  int *p = 
+  int *x = (p - 1);// { dg-warning "outside array bounds" }
+  foo (x);
+}


Jakub



[PATCH] c: Fix ICE with -fexcess-precision=standard [PR99136]

2021-02-18 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs on i686-linux, because c_finish_return wraps
c_fully_folded retval back into EXCESS_PRECISION_EXPR, but when the function
return type is void, we don't call convert_for_assignment on it that would
then be fully folded again, but just put the retval into RETURN_EXPR's
operand, so nothing removes it anymore and during gimplification we
ICE as EXCESS_PRECISION_EXPR is not handled.

This patch fixes it by not adding that EXCESS_PRECISION_EXPR in functions
returning void, the return value is ignored and all we need is evaluate any
side-effects of the expression.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-18  Jakub Jelinek  

PR c/99136
* c-typeck.c (c_finish_return): Don't wrap retval into
EXCESS_PRECISION_EXPR in functions that return void.

* gcc.dg/pr99136.c: New test.

--- gcc/c/c-typeck.c.jj 2021-02-04 23:41:24.200308485 +0100
+++ gcc/c/c-typeck.c2021-02-17 19:43:30.348054337 +0100
@@ -10740,7 +10740,9 @@ c_finish_return (location_t loc, tree re
  retval = TREE_OPERAND (retval, 0);
}
   retval = c_fully_fold (retval, false, NULL);
-  if (semantic_type)
+  if (semantic_type
+ && valtype != NULL_TREE
+ && TREE_CODE (valtype) != VOID_TYPE)
retval = build1 (EXCESS_PRECISION_EXPR, semantic_type, retval);
 }
 
--- gcc/testsuite/gcc.dg/pr99136.c.jj   2021-02-17 19:42:17.253844514 +0100
+++ gcc/testsuite/gcc.dg/pr99136.c  2021-02-17 19:42:09.684926337 +0100
@@ -0,0 +1,9 @@
+/* PR c/99136 */
+/* { dg-do compile } */
+/* { dg-options "-w -fexcess-precision=standard" } */
+
+void
+foo (double x)
+{
+  return 1.0 / x;
+}

Jakub



[PATCH] c++: Fix -std=c++20 ICE on virtual method call [PR99132]

2021-02-18 Thread Jakub Jelinek via Gcc-patches
Hi!

On the following testcase we ICE in C++20 mode during cp_get_callee_fndecl
-> constexpr evaluation.
It is only in C++20 mode on this testcase because virtual methods can't
be constexpr in C++17 and earlier and so potential_constant_expression_1
rejects it earlier.
And the ICE is caused by genericization changing the h PARM_DECL from having
B type to B & DECL_BY_REFERENCE and the constexpr evaluation
not being able to deal with that.
I think this just shows that we shouldn't do the constexpr evaluation during
genericization and later, and other spots e.g. during gimplification
also don't call cp_get_callee_fndecl but cp_get_callee_fndecl_nofold.
After all, cp_fold has already been run and it did the folding if there
was any opportunity to do so.  And furthermore, what that cp_genericize_r
spot does is check for any left-over immediate function calls (which can be
ATM just std::source_location::current() call) and immediate functions
outside of immediate functions can't have addresses leaked into the IL,
so it will be always a direct call anyway.  And immediate functions
themselves don't make it into genericization/gimplification.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-18  Jakub Jelinek  

PR c++/99132
* cp-gimplify.c (cp_genericize_r) : Use
cp_get_callee_fndecl_nofold instead of cp_get_callee_fndecl to check
for immediate function calls.

* g++.dg/cpp2a/constexpr-virtual18.C: New test.

--- gcc/cp/cp-gimplify.c.jj 2021-02-17 16:14:42.621294819 +0100
+++ gcc/cp/cp-gimplify.c2021-02-17 16:48:30.343291226 +0100
@@ -1386,7 +1386,7 @@ cp_genericize_r (tree *stmt_p, int *walk
  break;
}
 
-  if (tree fndecl = cp_get_callee_fndecl (stmt))
+  if (tree fndecl = cp_get_callee_fndecl_nofold (stmt))
if (DECL_IMMEDIATE_FUNCTION_P (fndecl))
  {
gcc_assert (source_location_current_p (fndecl));
--- gcc/testsuite/g++.dg/cpp2a/constexpr-virtual18.C.jj 2021-02-17 
16:48:18.136421425 +0100
+++ gcc/testsuite/g++.dg/cpp2a/constexpr-virtual18.C2021-02-17 
16:49:55.929378398 +0100
@@ -0,0 +1,13 @@
+// PR c++/99132
+// { dg-do compile { target c++11 } }
+
+template  struct A { T c; };
+template  struct B {
+  A d;
+  constexpr T operator-> () { return d.c; }
+  B (B &&);
+};
+struct C {
+  virtual void foo ();
+  void bar (B h) { h->foo (); }
+};

Jakub



Re: [r11-7271 Regression] FAIL: g++.dg/modules/pr99023_a.H (test for excess errors) on Linux/x86_64

2021-02-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 17, 2021 at 02:19:11PM -0800, sunil.k.pandey via Gcc-patches wrote:
> On Linux/x86_64,
> 
> d8889c99aab4b599aa7ceb7079e69a9766171336 is the first bad commit
> commit d8889c99aab4b599aa7ceb7079e69a9766171336
> Author: Nathan Sidwell 
> Date:   Wed Feb 17 10:43:21 2021 -0800
> 
> c++: Macros need to be GTY-reachable [PR 99023]
> 
> caused
> 
> FAIL: g++.dg/modules/pr99023_a.H module-cmi ,/pr99023_a.H 
> (gcm.cache/,/pr99023_a.H.gcm)
> FAIL: g++.dg/modules/pr99023_a.H (test for excess errors)
> 
> with GCC configured with
> 
> ../../gcc/configure 
> --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-7271/usr
>  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
> --enable-libmpx x86_64-linux --disable-bootstrap
> 
> To reproduce:
> 
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="modules.exp=g++.dg/modules/pr99023_a.H 
> --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="modules.exp=g++.dg/modules/pr99023_a.H 
> --target_board='unix{-m32\ -march=cascadelake}'"
> 
> (Please do not reply to this email, for question about this report, contact 
> me at skpgkp2 at gmail dot com)

Yeah:
FAIL: g++.dg/modules/pr99023_a.H (test for excess errors)
Excess errors:
../../..//src/libstdc++-v3/libsupc++/initializer_list:47:11: fatal error: 
definition of 'class std::initializer_list<_E>' does not match '#include 
'
compilation terminated.

Jakub