Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Aldy Hernandez
BTW, I'm not opposed to this patch.  Thank you for tracking this down,
and feel free to commit as is if y'all PMs agree it's OK.  I just
wanted to know if there's a better way going forward.  I can certainly
put it on my TODO list once stage1 opens again.

And no, there probably isn't an obstack for those classes, but I
wonder if we should have a class local one, as we do for the rest of
the classes.

Aldy

On Mon, Apr 8, 2024 at 7:47 PM Richard Biener
 wrote:
>
>
>
> > Am 08.04.2024 um 18:40 schrieb Aldy Hernandez :
> >
> > On Mon, Apr 8, 2024 at 6:29 PM Richard Biener  wrote:
> >>
> >>
> >>
>  Am 08.04.2024 um 18:09 schrieb Aldy Hernandez :
> >>>
> >>> On Mon, Apr 8, 2024 at 5:54 PM Jakub Jelinek  wrote:
> 
>  On Mon, Apr 08, 2024 at 05:40:23PM +0200, Aldy Hernandez wrote:
> >>   PR middle-end/114604
> >>   * gimple-range.cc (enable_ranger): Initialize the global
> >>   bitmap obstack.
> >>   (disable_ranger): Release it.
> >> ---
> >> gcc/gimple-range.cc | 4 
> >> 1 file changed, 4 insertions(+)
> >>
> >> diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
> >> index c16b776c1e3..4d3b1ce8588 100644
> >> --- a/gcc/gimple-range.cc
> >> +++ b/gcc/gimple-range.cc
> >> @@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool 
> >> use_imm_uses)
> >> {
> >>  gimple_ranger *r;
> >>
> >> +  bitmap_obstack_initialize (NULL);
> >> +
> >>  gcc_checking_assert (!fun->x_range_query);
> >>  r = new gimple_ranger (use_imm_uses);
> >>  fun->x_range_query = r;
> >> @@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
> >>  gcc_checking_assert (fun->x_range_query);
> >>  delete fun->x_range_query;
> >>  fun->x_range_query = NULL;
> >> +
> >> +  bitmap_obstack_release (NULL);
> >
> > Are you not allowed to initialize/use obstacks unless
> > bitmap_obstack_initialize(NULL) is called?
> 
>  You can use it with some other obstack, just not the default one.
> 
> > If so, wouldn't it be
> > better to lazily initialize it downstream (bitmap_alloc, or whomever
> > needs it initialized)?
> 
>  No, you still need to decide where is the safe point to release it.
>  Unlike the non-default bitmap_obstack_initialize/bitmap_obstack_release,
>  the default one can nest (has associated nesting counter).  So, the above
>  patch just says that ranger starts using the default obstack in
>  enable_ranger and stops using it in disable_ranger and anything ranger
>  associated in the obstack can be freed at that point.
> >>>
> >>> I thought ranger never used the default one:
> >>>
> >>> $ grep bitmap_obstack_initialize *value* *range*
> >>> value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
> >>> value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
> >>> gimple-range-cache.cc:  bitmap_obstack_initialize (_bitmaps);
> >>> gimple-range-gori.cc:  bitmap_obstack_initialize (_bitmaps);
> >>> gimple-range-infer.cc:  bitmap_obstack_initialize (_bitmaps);
> >>> gimple-range-phi.cc:  bitmap_obstack_initialize (_bitmaps);
> >>>
> >>> or even:
> >>>
> >>> $ grep obstack.*NULL *value* *range*
> >>> value-range-storage.cc:obstack_free (_obstack, NULL);
> >>> value-relation.cc:  obstack_free (_chain_obstack, NULL);
> >>> value-relation.cc:  obstack_free (_chain_obstack, NULL);
> >>> gimple-range-infer.cc:  obstack_free (_list_obstack, NULL);
> >>> value-range-storage.cc:obstack_free (_obstack, NULL);
> >>>
> >>> I'm obviously missing something here.
> >>
> >> Look for BITMAP_ALLOC (NULL) in the backtrace in the PR
> >
> > Ahh!  Thanks.
> >
> > A few default obstack uses snuck in while I wasn't looking.
> >
> > $ grep BITMAP_ALLOC.*NULL *range*
> > gimple-range-cache.cc:  m_propfail = BITMAP_ALLOC (NULL);
> > gimple-range-cache.h:  inline ssa_lazy_cache () { active_p =
> > BITMAP_ALLOC (NULL); }
> > gimple-range.cc:  m_pop_list = BITMAP_ALLOC (NULL);
> >
> > I wonder if it would be cleaner to just change these to use named obstacks.
>
> I didn’t find any obvious obstack to use, but sure.  This was the easiest fix 
> ;)
>
> Richard
>
> > Andrew, is there a reason we were using the default obstack for these?
> > For reference, they are  class update_list used in the ranger cache,
> > ssa_lazy_cache, and dom_ranger.
> >
> > Aldy
> >
>



Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Kewen.Lin
on 2024/4/9 11:20, Peter Bergner wrote:
> On 4/8/24 9:37 PM, Kewen.Lin wrote:
>> on 2024/4/8 21:21, Peter Bergner wrote:
>> I prefer to remove it completely, that is:
>>
>>> -mdirect-move
>>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
>>
>> The reason why you still kept it is to keep a historical record here?
> 
> I believe we've never completely removed an option before.  I think the

By checking the history, we did remove some options for SPE, paired single,
xilinx-fpu etc., which can be taken as gone with feature removal, but also
-maltivec={le,be} and -misel={yes,no}.

> thought was, if some software package explicitly used the option, then
> they shouldn't see an 'unrecognized command-line option' error, but
> rather either a warning that the option was removed or just silently
> ignore it.  Ie, we don't want to make a package that used to build with
> an old compiler now break its build because the option doesn't exist
> anymore.

I understand, but an argument is that no errors (even no warnings) can imply
some option still takes effect and cause some misunderstanding easily.  For
the release in which we remove the support of an option, we can still mark
it as WarnRemoved, but after a release or so, users should be aware of this
change and modify their build scripts if need, it's better to emit errors
for them to avoid the false appearance that it's still supported.

> 
>> Segher pointed out to me that this kind of option complete removal should be
>> stage 1 stuff, so let's defer to make it in a separated patch next release
>> (including some other options like mfpgpr you showed below etc.). :)
> 
> If we're going to completely remove it, then for sure, it's a stage1 thing.
> I'd like to hear Segher's thoughts on whether we should completely remove
> it or just silently ignore it.
> 
> 
> 
>> For the original patch,
>>
>>> +mno-direct-move
>>> +Target Undocumented WarnRemoved
>>
>> s/WarnRemoved/Ignore/ to match some other existing practice, there is no
>> warning now if specifying -mno-direct-move and it would be good to keep
>> the same behavior for users.
> 
> If we want to silently ignore -mdirect-move and -mno-direct-move, then we
> just need to do:
> 
> mdirect-move
> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
> +Target Undocumented Ignore
> 

Since removing it completely is a stage1 thing, I prefer to keep mdirect-move
and -mno-direct-move handlings as before, WarnRemoved and Ignore separately.

> There's no need to mention -mno-direct-move at all then.  It was only in the
> case I thought we wanted to warn against it's use that I added 
> -mno-direct-move.
> 
> 

Not to mention it is fine too, just keep the handlings and defer it to stage 1. 
:)

BR,
Kewen



Re: [PATCH/RFC] On the use of -funreachable-traps to deal with PR 109627

2024-04-08 Thread Jeff Law




On 4/8/24 5:04 PM, Iain Sandoe wrote:

Hi

PR 109627 is about functions that have had their bodies completely elided, but 
still have the wrappers for EH frames (either .cfi_xxx or LFSxx/LFExx).

These are causing issues for some linkers because such functions result in FDEs 
with a 0 code extent.

The simplest representation of this is (from PR109527)

void foo () { __builtin_unreachable (); }
With the possibility of sounding like a broken record, I think 
__builtin_unreachable is fundamentally flawed.   It generates no code 
and just lets the program continue if ever "reached".  This is a 
security risk and (IMHO) just plain silly.  We're in a situation that is 
never supposed to happen, so continuing to execute code is just asking 
for problems.


If it were up to me, I'd have __builtin_unreachable emit a trap or 
similar construct that should (in general) halt execution.


Jeff



Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Peter Bergner
On 4/8/24 9:37 PM, Kewen.Lin wrote:
> on 2024/4/8 21:21, Peter Bergner wrote:
> I prefer to remove it completely, that is:
> 
>> -mdirect-move
>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
> 
> The reason why you still kept it is to keep a historical record here?

I believe we've never completely removed an option before.  I think the
thought was, if some software package explicitly used the option, then
they shouldn't see an 'unrecognized command-line option' error, but
rather either a warning that the option was removed or just silently
ignore it.  Ie, we don't want to make a package that used to build with
an old compiler now break its build because the option doesn't exist
anymore.



> Segher pointed out to me that this kind of option complete removal should be
> stage 1 stuff, so let's defer to make it in a separated patch next release
> (including some other options like mfpgpr you showed below etc.). :)

If we're going to completely remove it, then for sure, it's a stage1 thing.
I'd like to hear Segher's thoughts on whether we should completely remove
it or just silently ignore it.



> For the original patch,
> 
>> +mno-direct-move
>> +Target Undocumented WarnRemoved
> 
> s/WarnRemoved/Ignore/ to match some other existing practice, there is no
> warning now if specifying -mno-direct-move and it would be good to keep
> the same behavior for users.

If we want to silently ignore -mdirect-move and -mno-direct-move, then we
just need to do:

mdirect-move
-Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
+Target Undocumented Ignore

There's no need to mention -mno-direct-move at all then.  It was only in the
case I thought we wanted to warn against it's use that I added -mno-direct-move.



>> That said, it's not what we've done with
>> other options, but maybe those just need to be changed too?
> 
> Yes, I think they need to be changed too (next release).

If that's the consensus with Segher, sure, we can plan on that in stage1.

Peter




Re: [PATCH] c++: Keep DECL_SAVED_TREE of destructor instantiations in modules [PR104040]

2024-04-08 Thread Jason Merrill

On 4/4/24 07:27, Nathaniel Shead wrote:

On Wed, Apr 03, 2024 at 11:18:01AM -0400, Jason Merrill wrote:

On 4/2/24 20:57, Nathaniel Shead wrote:

On Tue, Apr 02, 2024 at 01:18:17PM -0400, Jason Merrill wrote:

On 3/28/24 23:21, Nathaniel Shead wrote:

- && !(modules_p () && DECL_DECLARED_INLINE_P (fn)))
+ && !(modules_p ()
+  && (DECL_DECLARED_INLINE_P (fn)
+  || DECL_TEMPLATE_INSTANTIATION (fn


How about using vague_linkage_p?



Right, of course.  How about this?
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

A template instantiation still needs to have its DECL_SAVED_TREE so that
its definition is emitted into the CMI. This way it can be emitted in
the object file of any importers that use it, in case it doesn't end up
getting emitted in this TU.

PR c++/104040

gcc/cp/ChangeLog:

* semantics.cc (expand_or_defer_fn_1): Keep DECL_SAVED_TREE for
all vague linkage functions.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr104040_a.C: New test.
* g++.dg/modules/pr104040_b.C: New test.

Signed-off-by: Nathaniel Shead 
Reviewed-by: Jason Merrill 
---
   gcc/cp/semantics.cc   |  5 +++--
   gcc/testsuite/g++.dg/modules/pr104040_a.C | 14 ++
   gcc/testsuite/g++.dg/modules/pr104040_b.C |  8 
   3 files changed, 25 insertions(+), 2 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/modules/pr104040_a.C
   create mode 100644 gcc/testsuite/g++.dg/modules/pr104040_b.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index adb1ba48d29..03800a20b26 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -5033,9 +5033,10 @@ expand_or_defer_fn_1 (tree fn)
 /* We don't want to process FN again, so pretend we've written
 it out, even though we haven't.  */
 TREE_ASM_WRITTEN (fn) = 1;
-  /* If this is a constexpr function, keep DECL_SAVED_TREE.  */
+  /* If this is a constexpr function, or the body might need to be
+exported from a module CMI, keep DECL_SAVED_TREE.  */
 if (!DECL_DECLARED_CONSTEXPR_P (fn)
- && !(modules_p () && DECL_DECLARED_INLINE_P (fn)))
+ && !(modules_p () && vague_linkage_p (fn)))


Also, how about module_maybe_has_cmi_p?  OK with that change.


Using 'module_maybe_has_cmi_p' doesn't seem to work.  This is for two
reasons, one of them fixable and one of them not (easily):

- It seems that header modules don't count for 'module_maybe_has_cmi_p';
   I didn't notice this initially, and maybe they should for the
   no-linkage decls too?


I think so; they could similarly be referred to by an importer.


 But even accounting for this,

- For some reason only clearing it if the module might have a CMI causes
   crashes in importers for some testcases.  I'm not 100% sure why yet,
   but I suspect it might be some duplicate-decls thing where the type
   inconsistently has DECL_SAVED_TREE applied, since this is also called
   on streamed-in declarations.


Clearing if the module might have a CMI sounds backwards, I'd expect 
that to be the case where we want to leave it alone.  Is that the 
problem, or just a typo?



Out of interest, what was the reason that it was cleared at all in the
first place?  I wasn't able to find anything with git blame; is it just
for performance reasons in avoiding excess lowering later?


That change goes back to the LTO merge, I believe it was to reduce 
unnecessary LTO streaming.


But now that I think about it some more, I don't see why handling 
modules specially here is necessary at all; the point of this code is 
that after we build the destructor clones, the DECL_SAVED_TREE of the 
cloned function is no longer useful.  Why would modules care about the 
maybe-in-charge function?


Jason



Re: [PATCH] i386: Fix aes/vaes patterns [PR114576]

2024-04-08 Thread Hongtao Liu
On Thu, Apr 4, 2024 at 4:42 PM Jakub Jelinek  wrote:
>
> On Wed, Apr 19, 2023 at 02:40:59AM +, Jiang, Haochen via Gcc-patches 
> wrote:
> > > >  (define_insn "aesenc"
> > > > -  [(set (match_operand:V2DI 0 "register_operand" "=x,x")
> > > > -   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x")
> > > > -  (match_operand:V2DI 2 "vector_operand" "xBm,xm")]
> > > > +  [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
> > > > +   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> > > > +  (match_operand:V2DI 2 "vector_operand"
> > > > + "xBm,xm,vm")]
> > > >   UNSPEC_AESENC))]
> > > > -  "TARGET_AES"
> > > > +  "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
> > > >"@
> > > > aesenc\t{%2, %0|%0, %2}
> > > > +   vaesenc\t{%2, %1, %0|%0, %1, %2}
> > > > vaesenc\t{%2, %1, %0|%0, %1, %2}"
> > > > -  [(set_attr "isa" "noavx,avx")
> > > > +  [(set_attr "isa" "noavx,aes,avx512vl")
> > > Shouldn't it be vaes_avx512vl and then remove " || (TARGET_VAES &&
> > > TARGET_AVX512VL)" from condition.
> >
> > Since VAES should not imply AES, we need that "|| (TARGET_VAES &&
> > TARGET_AVX512VL)"
> >
> > And there is no need to add vaes_avx512vl since the last alternative will 
> > only
> > be hit when there is no aes. When there is no aes, the pattern will need 
> > vaes
> > and avx512vl both or we could not use this pattern. avx512vl here is just 
> > like
> > a placeholder.
>
> As the following testcase shows, the above change was incorrect.
>
> Using aes isa for the second alternative is obviously wrong, aes is enabled
> whenever -maes is, regardless of -mavx or -mno-avx, so the above change
> means that for -maes -mno-avx RA can choose, either it matches the first
> alternative with the dup operand, or it matches the second one (but that
> is of course wrong because vaesenc VEX encoded insn needs AES & AVX CPUID).
>
> The big question is if "Since VAES should not imply AES" is the case or not.
> Looking around at what LLVM does on godbolt, seems since clang 6 which added
> -mvaes support -mvaes there implies -maes, but GCC treats those two
> independent.
>
> Now, if we'd take the LLVM path of making -mvaes imply -maes and -mno-aes
> imply -mno-vaes, then we should probably just revert the above patch and
> tweak common/config/i386/ to do the implications (+ add the testcase from
> this patch).
>
> If we keep the current behavior, where AES and VAES are completely
> independent extensions, then we need to do more changes as the following
> patch attempts to do.
> We should use the aesenc etc. insns for noavx as before, we know at that
> point that TARGET_AES must be true because (TARGET_VAES && TARGET_AVX512VL)
> won't be true when !TARGET_AVX - TARGET_AVX512VL implies TARGET_AVX.
> For the second alternative, i.e. the AVX AES VEX encoded case, the patch
> uses aes_avx isa which requires both.  Now, for the third one we can't
> use avx512vl isa attribute, because one could compile with
> -maes -mavx512vl -mno-vaes and in that case we want VEX encoded vaesenc
> which can't use %xmm16+ (nor EGPRs), so we need vaes_avx512vl isa to
> ensure it is enabled only for -mvaes -mavx512vl.  And there is another
> problem, with -mno-aes -mvaes -mavx512vl we could emit VEX encoded vaesenc
> which requires AES and AVX ISAs rather than the VAES and AVX512VL which
> are enabled.  So the patch uses the {evex} prefix for those cases.
> And similarly for the vaes*_ instructions, if they aren't 128-bit
> or use %xmm16+ registers, the current case is fine, but if they are 128-bit
> and use only %xmm0-15 registers, assembler would again emit VEX encoded insn
> which needs AES & AVX CPUID, rather than the EVEX encoded ones which need
> VAES & AVX512VL CPUIDs.
> Still, I wonder if -mvaes shouldn't imply at least -mavx512f and
> -mno-avx512f shouldn't imply -mno-vaes, because otherwise can't see how
> it could use 512-bit registers (this part not done in the patch).
>
> The following patch has been successfully bootstrapped/regtested on
> x86_64-linux and i686-linux.
>
> 2024-04-04  Jakub Jelinek  
>
> PR target/114576
> * config/i386/i386.md (isa): Remove aes, add aes_avx, vaes_avx512vl.
> (enabled): Remove aes isa check, add aes_avx and vaes_avx512vl.
> * config/i386/sse.md (aesenc, aesenclast, aesdec, aesdeclast): Add
> 4th alternative, emit {evex} prefix for the third one, use
> noavx,aes_avx,vaes_avx512vl,vaes_avx512vl isa attribute, use jm
> rather than m constraint on the 2nd and 3rd alternative input.
> (vaesdec_, vaesdeclast_, vaesenc_,
> vaesenclast_): Add second alternative with x instead of v
> and jm instead of m.
>
> * gcc.target/i386/aes-pr114576.c: New test.
>
> --- gcc/config/i386/i386.md.jj  2024-03-18 22:15:43.165839479 +0100
> +++ gcc/config/i386/i386.md 2024-04-04 00:48:46.575511556 +0200
> @@ -568,13 +568,14 @@ 

Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Kewen.Lin
Hi Peter,

on 2024/4/8 21:21, Peter Bergner wrote:
> On 4/8/24 3:55 AM, Kewen.Lin wrote:
>> on 2024/4/6 06:28, Peter Bergner wrote:
>>> +mno-direct-move
>>> +Target Undocumented WarnRemoved
>>> +
>>>  mdirect-move
>>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
>>> +Target Undocumented WarnRemoved
>>
>> When reviewing my previous patch to "neuter option -mpower{8,9}-vector",
>> Segher mentioned that we don't need to keep such option warning all the
>> time and can drop it like in a release later as users should be aware of
>> this information then, I agreed and considering that patch disabling
>> -m[no-]direct-move was r8-7845-g57f108f5a1e1b2, I think we can just remove
>> m[no-]direct-move here?  What do you think?
> 
> 
> I'm fine with that if that is what we want.  So something like the following?
> 
> +;; This option existed in the past, but now is always silently ignored.
> mdirect-move
> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
> +Target Undocumented Ignore

I prefer to remove it completely, that is:

> -mdirect-move
> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved

The reason why you still kept it is to keep a historical record here?

Segher pointed out to me that this kind of option complete removal should be
stage 1 stuff, so let's defer to make it in a separated patch next release
(including some other options like mfpgpr you showed below etc.). :)

For the original patch,

> +mno-direct-move
> +Target Undocumented WarnRemoved

s/WarnRemoved/Ignore/ to match some other existing practice, there is no
warning now if specifying -mno-direct-move and it would be good to keep
the same behavior for users.

OK for trunk and active branches with this tweaked, thanks!

> 
> 
> The above seems to silently ignore both -mdirect-move and -mno-direct-move
> which I think is what we want.  That said, it's not what we've done with
> other options, but maybe those just need to be changed too?

Yes, I think they need to be changed too (next release).

BR,
Kewen



Re: [PATCH v2] c++/modules: Track declarations imported from partitions [PR99377]

2024-04-08 Thread Jason Merrill

On 4/4/24 08:27, Nathaniel Shead wrote:

On Wed, Apr 03, 2024 at 02:16:25PM -0400, Jason Merrill wrote:

On 3/28/24 08:22, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

The testcase in comment 15 of the linked PR is caused because the
following assumption in depset::hash::make_dependency doesn't hold:

if (DECL_LANG_SPECIFIC (not_tmpl)
&& DECL_MODULE_IMPORT_P (not_tmpl))
  {
/* Store the module number and index in cluster/section,
   so we don't have to look them up again.  */
unsigned index = import_entity_index (decl);
module_state *from = import_entity_module (index);
/* Remap will be zero for imports from partitions, which
   we want to treat as-if declared in this TU.  */
if (from->remap)
  {
dep->cluster = index - from->entity_lwm;
dep->section = from->remap;
dep->set_flag_bit ();
  }
  }

This is because at least for template specialisations, we first see the
declaration in the header unit imported from the partition, and then the
instantiation provided by the partition itself.  This means that the
'import_entity_index' lookup doesn't report that the specialisation was
declared in the partition and thus should be considered as-if it was
part of the TU, and get exported.


I think "exported" is the wrong term here; IIUC template specializations are
not themselves exported, just the template itself.


Yes, sorry; I meant "emitted" here, in terms of whether the definition
is in the CMI (regardless of whether or not that means that importers
can name it).


But if the declaration or point of instantiation of the specialization is
within a module instantiation unit, it is reachable to any importers,
including the primary module interface unit importing the partition
interface unit.

Does this work differently if "check" is a separate module rather than a
partition?


Yes.  When in a non-partition module (say, Bar), then the instantiation
is emitted into Bar's CMI.  When a caller imports Foo, it transitively
streams in Bar as well when looking for the entity and imports its
definition from there.

However, partitions work differently.  In the testcase the instantiation
is emitted into Foo:check's CMI, but partition CMIs are only used within
that module: importers don't know that partitions exist, and only read
Foo's CMI.  And because make_dependency doesn't realise that the
instantiation came from a partition it hasn't emitted it into Foo's CMI
which means that importers don't see a definition for it at all.


To fix this, this patch allows, as a special case for installing an
entity from a partition, to overwrite the entity_map entry with the
(later) index into the partition so that this assumption holds again.


Rather than special-casing partitions, would it make sense to override a
declaration with a definition?


And so in this case I think that special-casing partitions is exactly
what needs to happen, because the special case is that it came from a
partition (rather than just it was a definition).

That said, on further reflection I don't think I like the way I did
this, since it means that for this instantiation, errors will refer to
it as belonging to Foo:check instead of pr99377-3_a.H, which seems
wrong (or at least inconsistent with existing behaviour).


Hmm, I don't think it's wrong; that's where the point of instantiation 
is, and this problem is about that same distinction.


So I think I prefer the first patch, just correcting the use of 
"exported" as discussed above.  OK with that change.


Jason



Re: [PATCH] testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

2024-04-08 Thread Kewen.Lin
on 2024/4/8 18:47, Richard Biener wrote:
> On Mon, Apr 8, 2024 at 11:23 AM Kewen.Lin  wrote:
>>
>> Hi,
>>
>> As PR114614 shows, the newly added test case gcov-20.c by
>> commit r14-9789-g08a52331803f66 failed on targets which do
>> not support atomic profile update, there would be a message
>> like:
>>
>>   warning: target does not support atomic profile update,
>>single mode is selected
>>
>> Since the test case adopts -fprofile-update=atomic, it
>> requires effective target check profile_update_atomic, this
>> patch is to add the check accordingly.
>>
>> Tested well on x86_64-redhat-linux, powerpc64-linux-gnu P8/P9
>> and powerpc64le-linux-gnu P9/P10.
>>
>> Is it ok for trunk?
> 
> OK

Thanks, pushed as r14-9851.

BR,
Kewen

> 
>> BR,
>> Kewen
>> -
>> PR testsuite/114614
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.misc-tests/gcov-20.c: Add effective target check
>> profile_update_atomic.
>> ---
>>  gcc/testsuite/gcc.misc-tests/gcov-20.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/gcc/testsuite/gcc.misc-tests/gcov-20.c 
>> b/gcc/testsuite/gcc.misc-tests/gcov-20.c
>> index 215faffc980..ca8c12aad2b 100644
>> --- a/gcc/testsuite/gcc.misc-tests/gcov-20.c
>> +++ b/gcc/testsuite/gcc.misc-tests/gcov-20.c
>> @@ -1,5 +1,6 @@
>>  /* { dg-options "-fcondition-coverage -ftest-coverage 
>> -fprofile-update=atomic" } */
>>  /* { dg-do run { target native } } */
>> +/* { dg-require-effective-target profile_update_atomic } */
>>
>>  /* Some side effect to stop branches from being pruned */
>>  int x = 0;
>> --
>> 2.43.0



Re: [PATCH] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-08 Thread Kewen.Lin
on 2024/4/8 18:47, Richard Biener wrote:
> On Mon, Apr 8, 2024 at 11:22 AM Kewen.Lin  wrote:
>>
>> Hi,
>>
>> As the comments in PR88309 show, there are two oversights
>> in rs6000_gimple_fold_builtin that pass align in bytes to
>> build_aligned_type but which actually requires align in
>> bits, it causes unexpected ICE or hanging in function
>> is_miss_rate_acceptable due to zero align_unit value.
>>
>> This patch is to fix them by converting bytes to bits, add
>> an assertion on positive align_unit value and notes function
>> build_aligned_type requires align measured in bits in its
>> function comment.
>>
>> Bootstrapped and regtested on x86_64-redhat-linux,
>> powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10.
>>
>> Is it (the generic part code change) ok for trunk?
> 
> OK

Thanks, pushed as r14-9850, is it also ok to backport after burn-in time?

BR,
Kewen

> 
>> BR,
>> Kewen
>> -
>> PR target/88309
>>
>> Co-authored-by: Andrew Pinski 
>>
>> gcc/ChangeLog:
>>
>> * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Fix
>> wrong align passed to function build_aligned_type.
>> * tree-ssa-loop-prefetch.cc (is_miss_rate_acceptable): Add an
>> assertion to ensure align_unit should be positive.
>> * tree.cc (build_qualified_type): Update function comments.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/powerpc/pr88309.c: New test.
>> ---
>>  gcc/config/rs6000/rs6000-builtin.cc|  4 ++--
>>  gcc/testsuite/gcc.target/powerpc/pr88309.c | 27 ++
>>  gcc/tree-ssa-loop-prefetch.cc  |  2 ++
>>  gcc/tree.cc|  3 ++-
>>  4 files changed, 33 insertions(+), 3 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr88309.c
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
>> b/gcc/config/rs6000/rs6000-builtin.cc
>> index 6698274031b..e7d6204074c 100644
>> --- a/gcc/config/rs6000/rs6000-builtin.cc
>> +++ b/gcc/config/rs6000/rs6000-builtin.cc
>> @@ -1900,7 +1900,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>> tree lhs_type = TREE_TYPE (lhs);
>> /* In GIMPLE the type of the MEM_REF specifies the alignment.  The
>>   required alignment (power) is 4 bytes regardless of data type.  */
>> -   tree align_ltype = build_aligned_type (lhs_type, 4);
>> +   tree align_ltype = build_aligned_type (lhs_type, 32);
>> /* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  
>> Create
>>the tree using the value from arg0.  The resulting type will match
>>the type of arg1.  */
>> @@ -1944,7 +1944,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>> tree arg2_type = ptr_type_node;
>> /* In GIMPLE the type of the MEM_REF specifies the alignment.  The
>>required alignment (power) is 4 bytes regardless of data type.  */
>> -   tree align_stype = build_aligned_type (arg0_type, 4);
>> +   tree align_stype = build_aligned_type (arg0_type, 32);
>> /* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  
>> Create
>>the tree using the value from arg1.  */
>> gimple_seq stmts = NULL;
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr88309.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr88309.c
>> new file mode 100644
>> index 000..c0078cf2b8c
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr88309.c
>> @@ -0,0 +1,27 @@
>> +/* { dg-require-effective-target powerpc_vsx_ok } */
>> +/* { dg-options "-mvsx -O2 -fprefetch-loop-arrays" } */
>> +
>> +/* Verify there is no ICE or hanging.  */
>> +
>> +#include 
>> +
>> +void b(float *c, vector float a, vector float, vector float)
>> +{
>> +  vector float d;
>> +  vector char ahbc;
>> +  vec_xst(vec_perm(a, d, ahbc), 0, c);
>> +}
>> +
>> +vector float e(vector unsigned);
>> +
>> +void f() {
>> +  float *dst;
>> +  int g = 0;
>> +  for (;; g += 16) {
>> +vector unsigned m, i;
>> +vector unsigned n, j;
>> +vector unsigned k, l;
>> +b(dst + g * 3, e(m), e(n), e(k));
>> +b(dst + (g + 4) * 3, e(i), e(j), e(l));
>> +  }
>> +}
>> diff --git a/gcc/tree-ssa-loop-prefetch.cc b/gcc/tree-ssa-loop-prefetch.cc
>> index bbd98e03254..70073cc4fe4 100644
>> --- a/gcc/tree-ssa-loop-prefetch.cc
>> +++ b/gcc/tree-ssa-loop-prefetch.cc
>> @@ -739,6 +739,8 @@ is_miss_rate_acceptable (unsigned HOST_WIDE_INT 
>> cache_line_size,
>>if (delta >= (HOST_WIDE_INT) cache_line_size)
>>  return false;
>>
>> +  gcc_assert (align_unit > 0);
>> +
>>miss_positions = 0;
>>total_positions = (cache_line_size / align_unit) * distinct_iters;
>>max_allowed_miss_positions = (ACCEPTABLE_MISS_RATE * total_positions) / 
>> 1000;
>> diff --git a/gcc/tree.cc b/gcc/tree.cc
>> index f801712c9dd..6f8400e6640 100644
>> --- a/gcc/tree.cc
>> +++ b/gcc/tree.cc
>> @@ -5689,7 +5689,8 @@ build_qualified_type (tree type, int type_quals 
>> MEM_STAT_DECL)
>>return t;
>> 

Re: [PATCH v2] x86: Define __APX_INLINE_ASM_USE_GPR32__

2024-04-08 Thread Hongtao Liu
On Tue, Apr 9, 2024 at 9:58 AM H.J. Lu  wrote:
>
> Define __APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.
> When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements
> should contain only instructions compatible with r16-r31.
Ok.
>
> gcc/
>
> PR target/114587
> * config/i386/i386-c.cc (ix86_target_macros_internal): Define
> __APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.
>
> gcc/testsuite/
>
> PR target/114587
> * gcc.target/i386/apx-3.c: Likewise.
> ---
>  gcc/config/i386/i386-c.cc | 2 ++
>  gcc/testsuite/gcc.target/i386/apx-3.c | 6 ++
>  2 files changed, 8 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-3.c
>
> diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
> index 226d277676c..07f4936ba91 100644
> --- a/gcc/config/i386/i386-c.cc
> +++ b/gcc/config/i386/i386-c.cc
> @@ -751,6 +751,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
>  def_or_undef (parse_in, "__AVX10_1_512__");
>if (isa_flag2 & OPTION_MASK_ISA2_APX_F)
>  def_or_undef (parse_in, "__APX_F__");
> +  if (ix86_apx_inline_asm_use_gpr32)
> +def_or_undef (parse_in, "__APX_INLINE_ASM_USE_GPR32__");
>if (TARGET_IAMCU)
>  {
>def_or_undef (parse_in, "__iamcu");
> diff --git a/gcc/testsuite/gcc.target/i386/apx-3.c 
> b/gcc/testsuite/gcc.target/i386/apx-3.c
> new file mode 100644
> index 000..1ba4ac036fc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/apx-3.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-mapx-inline-asm-use-gpr32" } */
> +
> +#ifndef __APX_INLINE_ASM_USE_GPR32__
> +# error __APX_INLINE_ASM_USE_GPR32__ not defined
> +#endif
> --
> 2.44.0
>


-- 
BR,
Hongtao


[PATCH v2] x86: Define __APX_INLINE_ASM_USE_GPR32__

2024-04-08 Thread H.J. Lu
Define __APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.
When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements
should contain only instructions compatible with r16-r31.

gcc/

PR target/114587
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.

gcc/testsuite/

PR target/114587
* gcc.target/i386/apx-3.c: Likewise.
---
 gcc/config/i386/i386-c.cc | 2 ++
 gcc/testsuite/gcc.target/i386/apx-3.c | 6 ++
 2 files changed, 8 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-3.c

diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
index 226d277676c..07f4936ba91 100644
--- a/gcc/config/i386/i386-c.cc
+++ b/gcc/config/i386/i386-c.cc
@@ -751,6 +751,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 def_or_undef (parse_in, "__AVX10_1_512__");
   if (isa_flag2 & OPTION_MASK_ISA2_APX_F)
 def_or_undef (parse_in, "__APX_F__");
+  if (ix86_apx_inline_asm_use_gpr32)
+def_or_undef (parse_in, "__APX_INLINE_ASM_USE_GPR32__");
   if (TARGET_IAMCU)
 {
   def_or_undef (parse_in, "__iamcu");
diff --git a/gcc/testsuite/gcc.target/i386/apx-3.c 
b/gcc/testsuite/gcc.target/i386/apx-3.c
new file mode 100644
index 000..1ba4ac036fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-3.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mapx-inline-asm-use-gpr32" } */
+
+#ifndef __APX_INLINE_ASM_USE_GPR32__
+# error __APX_INLINE_ASM_USE_GPR32__ not defined
+#endif
-- 
2.44.0



Re: [PATCH] x86: Define macros for APX options

2024-04-08 Thread Hongtao Liu
On Mon, Apr 8, 2024 at 11:44 PM H.J. Lu  wrote:
>
> Define following macros for APX options:
>
> 1. __APX_EGPR__: -mapx-features=egpr.
> 2. __APX_PUSH2POP2__: -mapx-features=push2pop2.
> 3. __APX_NDD__: -mapx-features=ndd.
> 4. __APX_PPX__: -mapx-features=ppx.
For -mapx-features=, we haven't decided to expose this option to users
yet, we want users to just use -mapxf, so I think __APXF__ should be
enough?
> 5. __APX_INLINE_ASM_USE_GPR32__: -mapx-inline-asm-use-gpr32.
I'm ok for this one.
>
> They can be used to make assembly codes compatible with APX options.
> Some use cases are:
>
> 1. When __APX_PUSH2POP2__ is defined, assembly codes should always align
> the outgoing stack to 16 bytes.
> 2. When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements
> should contain only instructions compatible with r16-r31.
>
> gcc/
>
> PR target/114587
> * config/i386/i386-c.cc (ix86_target_macros_internal): Define
> __APX_XXX__ for APX options.
>
> gcc/testsuite/
>
> PR target/114587
> * gcc.target/i386/apx-3a.c: New test.
> * gcc.target/i386/apx-3b.c: Likewise.
> * gcc.target/i386/apx-3c.c: Likewise.
> * gcc.target/i386/apx-3d.c: Likewise.
> * gcc.target/i386/apx-3e.c: Likewise.
> * gcc.target/i386/apx-4.c: Likewise.
> ---
>  gcc/config/i386/i386-c.cc  | 10 ++
>  gcc/testsuite/gcc.target/i386/apx-3a.c |  6 ++
>  gcc/testsuite/gcc.target/i386/apx-3b.c |  6 ++
>  gcc/testsuite/gcc.target/i386/apx-3c.c |  6 ++
>  gcc/testsuite/gcc.target/i386/apx-3d.c |  6 ++
>  gcc/testsuite/gcc.target/i386/apx-3e.c | 18 ++
>  gcc/testsuite/gcc.target/i386/apx-4.c  |  6 ++
>  7 files changed, 58 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-3a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-3b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-3c.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-3d.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-3e.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-4.c
>
> diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
> index 226d277676c..b8cfba90fdc 100644
> --- a/gcc/config/i386/i386-c.cc
> +++ b/gcc/config/i386/i386-c.cc
> @@ -751,6 +751,16 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
>  def_or_undef (parse_in, "__AVX10_1_512__");
>if (isa_flag2 & OPTION_MASK_ISA2_APX_F)
>  def_or_undef (parse_in, "__APX_F__");
> +  if (TARGET_APX_EGPR)
> +def_or_undef (parse_in, "__APX_EGPR__");
> +  if (TARGET_APX_PUSH2POP2)
> +def_or_undef (parse_in, "__APX_PUSH2POP2__");
> +  if (TARGET_APX_NDD)
> +def_or_undef (parse_in, "__APX_NDD__");
> +  if (TARGET_APX_PPX)
> +def_or_undef (parse_in, "__APX_PPX__");
> +  if (ix86_apx_inline_asm_use_gpr32)
> +def_or_undef (parse_in, "__APX_INLINE_ASM_USE_GPR32__");
>if (TARGET_IAMCU)
>  {
>def_or_undef (parse_in, "__iamcu");
> diff --git a/gcc/testsuite/gcc.target/i386/apx-3a.c 
> b/gcc/testsuite/gcc.target/i386/apx-3a.c
> new file mode 100644
> index 000..86d3ef2061d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/apx-3a.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-mapx-features=egpr" } */
> +
> +#ifndef __APX_EGPR__
> +# error __APX_EGPR__ not defined
> +#endif
> diff --git a/gcc/testsuite/gcc.target/i386/apx-3b.c 
> b/gcc/testsuite/gcc.target/i386/apx-3b.c
> new file mode 100644
> index 000..611727a389a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/apx-3b.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-mapx-features=push2pop2" } */
> +
> +#ifndef __APX_PUSH2POP2__
> +# error __APX_PUSH2POP2__ not defined
> +#endif
> diff --git a/gcc/testsuite/gcc.target/i386/apx-3c.c 
> b/gcc/testsuite/gcc.target/i386/apx-3c.c
> new file mode 100644
> index 000..52655b6cfa5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/apx-3c.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-mapx-features=ndd" } */
> +
> +#ifndef __APX_NDD__
> +# error __APX_NDD__ not defined
> +#endif
> diff --git a/gcc/testsuite/gcc.target/i386/apx-3d.c 
> b/gcc/testsuite/gcc.target/i386/apx-3d.c
> new file mode 100644
> index 000..9b91af1d377
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/apx-3d.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-mapx-features=ppx" } */
> +
> +#ifndef __APX_PPX__
> +# error __APX_PPX__ not defined
> +#endif
> diff --git a/gcc/testsuite/gcc.target/i386/apx-3e.c 
> b/gcc/testsuite/gcc.target/i386/apx-3e.c
> new file mode 100644
> index 000..7278428e5c4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/apx-3e.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-mapx-features=egpr,push2pop2,ndd,ppx" } */
> +
> +#ifndef __APX_EGPR__

RE: [PATCH] Another ICE after conflicting types of redeclaration [PR110682]

2024-04-08 Thread Andrew Pinski (QUIC)
> -Original Message-
> From: Andrew Pinski (QUIC) 
> Sent: Friday, March 22, 2024 10:50 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Andrew Pinski (QUIC) 
> Subject: [PATCH] Another ICE after conflicting types of redeclaration
> [PR110682]
> 
> This another one of these ICE after error issues with the gimplifier and a 
> fallout
> from r12-3278-g823685221de986af.
> The problem here is that STRIP_USELESS_TYPE_CONVERSION will leave
> around a NON_LVALUE_EXPR which is an error mark node.
> Since the gimplifier assumes non-lvalue expressions has been removed, there
> was an ICE.
> 
> This fixes the issue by checking if there is a NON_LVALUE_EXPR and that has an
> error operand, we handle it as the same as if it was an error operand.


Ping?

Thanks,
Andrew


> 
> gcc/ChangeLog:
> 
>   PR c/110682
>   * gimplify.cc (gimplify_expr): Add check if there is
>   a non-lvalue with an error operand.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/110682
>   * gcc.dg/redecl-27.c: New test.
> 
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/gimplify.cc  |  6 +-
>  gcc/testsuite/gcc.dg/redecl-27.c | 14 ++
>  2 files changed, 19 insertions(+), 1 deletion(-)  create mode 100644
> gcc/testsuite/gcc.dg/redecl-27.c
> 
> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index d64bbf3ffbd..001b4af68b9
> 100644
> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -17686,7 +17686,11 @@ gimplify_expr (tree *expr_p, gimple_seq
> *pre_p, gimple_seq *post_p,
>save_expr = *expr_p;
> 
>/* Die, die, die, my darling.  */
> -  if (error_operand_p (save_expr))
> +  if (error_operand_p (save_expr)
> +   /* The above strip useless type conversion might not strip out
> +  a conversion from an error so handle that case here.  */
> +   || (TREE_CODE (save_expr) == NON_LVALUE_EXPR
> +   && error_operand_p (TREE_OPERAND (save_expr, 0
>   {
> ret = GS_ERROR;
> break;
> diff --git a/gcc/testsuite/gcc.dg/redecl-27.c b/gcc/testsuite/gcc.dg/redecl-
> 27.c
> new file mode 100644
> index 000..93f577e64ff
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/redecl-27.c
> @@ -0,0 +1,14 @@
> +/* We used to ICE while gimplifying the body of f
> +   due to a NON_LVALUE_EXPR still being there.
> +   PR c/110682*/
> +/* { dg-do compile } */
> +/* { dg-options "" } */
> +
> +struct a {
> +  const signed char b;
> +};
> +
> +void f(volatile struct a *c) { /* { dg-note "" } */
> +  c - 0 % c->b;
> +  struct a c = {1}; /* { dg-error "redeclared as different kind of
> +symbol" } */ }
> --
> 2.43.0



Re: [PATCH/RFC] On the use of -funreachable-traps to deal with PR 109627

2024-04-08 Thread Andrew Pinski
On Mon, Apr 8, 2024 at 4:04 PM Iain Sandoe  wrote:
>
> Hi
>
> PR 109627 is about functions that have had their bodies completely elided, 
> but still have the wrappers for EH frames (either .cfi_xxx or LFSxx/LFExx).

I was thinking about how to fix this once and for all. The easiest
method I could think of was if __builtin_unreachable is the only thing
in the CFG expand it as __builtin_trap.
And then it should just work.

It should not to hard to add that check in expand_gimple_basic_block
and handle it that way.

What do you think of that? I can code this up for GCC 15 if you want.

Thanks,
Andrew Pinski

>
> These are causing issues for some linkers because such functions result in 
> FDEs with a 0 code extent.
>
> The simplest representation of this is (from PR109527)
>
> void foo () { __builtin_unreachable (); }
>
> The solution (so far) is to detect this case during final lowering and 
> replace the unreachable (which is expanded to nothing, at least for the 
> targets I’ve dealt with) by a trap; this results in two positive improvements 
> (1) the FDE is now finite-sized so the linker consumes it and (2) actually 
> the trap is considerably more user-friendly UB than falling through to some 
> other arbitrary place.
>
> I was looking into using -funreachable-traps to do this for aarch64 Darwin - 
> because the ad-hoc solutions that were applied to X86 and PPC are not easily 
> usable for aarch64.
>
> -funreachabe-traps was added for similar reasons (helping make missing 
> returns less unexpected) in r13-1204-gd68d3664253696 by Jason (and then there 
> have been further improvements resulting in the use of __builtin_unreachable 
> trap () from Jakub)
>
> As I read the commit message for r13-1204, I would expect -funreachable-traps 
> to work for the simple case above, but it does not.  I think that is because 
> the incremental patch below is needed.  however, I am not sure if there was 
> some reason this was not done at the time?
>
> PR 109627 is currently a show-stopper for the aarch64-darwin branch since 
> libgomp and libgm2 fail to bootstrap - and other workarounds (e.g. 
> -D__builtin_unreachable=__builtin_trap) do not work got m2 (since it does not 
> use the C preprocessor by default).
>
> Setting -funreachable-traps either per affected file, or globally for a 
> target resolves the issue in a neater manner.
>
> Any guidance / comments would be most welcome - if the direction seems sane, 
> I can repost this patch formally.
>
> (I have tested quite widely on Darwin and on a small number of Linux cases 
> too)
>
> thanks
> Iain
>
> * I will note that applying this does result in some regressions in several 
> contracts test cases - but they also regress for -fsanitize=undefined 
> -fsanitise-traps (not yet clear if that’s expected or we’ve uncovered a bug 
> in the contracts impl.).
>
> --
>
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index f8d94c4b435..e2d26e45744 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -5931,7 +5931,8 @@ expand_builtin_unreachable (void)
>  {
>/* Use gimple_build_builtin_unreachable or builtin_decl_unreachable
>   to avoid this.  */
> -  gcc_checking_assert (!sanitize_flags_p (SANITIZE_UNREACHABLE));
> +  gcc_checking_assert (!sanitize_flags_p (SANITIZE_UNREACHABLE)
> +  && !flag_unreachable_traps);
>emit_barrier ();
>  }
>
> @@ -10442,7 +10443,7 @@ fold_builtin_0 (location_t loc, tree fndecl)
>
>  case BUILT_IN_UNREACHABLE:
>/* Rewrite any explicit calls to __builtin_unreachable.  */
> -  if (sanitize_flags_p (SANITIZE_UNREACHABLE))
> +  if (sanitize_flags_p (SANITIZE_UNREACHABLE) || flag_unreachable_traps)
> return build_builtin_unreachable (loc);
>break;
>
> 


[PATCH/RFC] On the use of -funreachable-traps to deal with PR 109627

2024-04-08 Thread Iain Sandoe
Hi

PR 109627 is about functions that have had their bodies completely elided, but 
still have the wrappers for EH frames (either .cfi_xxx or LFSxx/LFExx).

These are causing issues for some linkers because such functions result in FDEs 
with a 0 code extent.

The simplest representation of this is (from PR109527)

void foo () { __builtin_unreachable (); }

The solution (so far) is to detect this case during final lowering and replace 
the unreachable (which is expanded to nothing, at least for the targets I’ve 
dealt with) by a trap; this results in two positive improvements (1) the FDE is 
now finite-sized so the linker consumes it and (2) actually the trap is 
considerably more user-friendly UB than falling through to some other arbitrary 
place.

I was looking into using -funreachable-traps to do this for aarch64 Darwin - 
because the ad-hoc solutions that were applied to X86 and PPC are not easily 
usable for aarch64.

-funreachabe-traps was added for similar reasons (helping make missing returns 
less unexpected) in r13-1204-gd68d3664253696 by Jason (and then there have been 
further improvements resulting in the use of __builtin_unreachable trap () from 
Jakub)

As I read the commit message for r13-1204, I would expect -funreachable-traps 
to work for the simple case above, but it does not.  I think that is because 
the incremental patch below is needed.  however, I am not sure if there was 
some reason this was not done at the time?

PR 109627 is currently a show-stopper for the aarch64-darwin branch since 
libgomp and libgm2 fail to bootstrap - and other workarounds (e.g. 
-D__builtin_unreachable=__builtin_trap) do not work got m2 (since it does not 
use the C preprocessor by default).

Setting -funreachable-traps either per affected file, or globally for a target 
resolves the issue in a neater manner.

Any guidance / comments would be most welcome - if the direction seems sane, I 
can repost this patch formally.

(I have tested quite widely on Darwin and on a small number of Linux cases too)

thanks
Iain

* I will note that applying this does result in some regressions in several 
contracts test cases - but they also regress for -fsanitize=undefined 
-fsanitise-traps (not yet clear if that’s expected or we’ve uncovered a bug in 
the contracts impl.).

--


diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index f8d94c4b435..e2d26e45744 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -5931,7 +5931,8 @@ expand_builtin_unreachable (void)
 {
   /* Use gimple_build_builtin_unreachable or builtin_decl_unreachable
  to avoid this.  */
-  gcc_checking_assert (!sanitize_flags_p (SANITIZE_UNREACHABLE));
+  gcc_checking_assert (!sanitize_flags_p (SANITIZE_UNREACHABLE)
+  && !flag_unreachable_traps);
   emit_barrier ();
 }
 
@@ -10442,7 +10443,7 @@ fold_builtin_0 (location_t loc, tree fndecl)
 
 case BUILT_IN_UNREACHABLE:
   /* Rewrite any explicit calls to __builtin_unreachable.  */
-  if (sanitize_flags_p (SANITIZE_UNREACHABLE))
+  if (sanitize_flags_p (SANITIZE_UNREACHABLE) || flag_unreachable_traps)
return build_builtin_unreachable (loc);
   break;
 


Re: [PATCH] c++: Fix up maybe_warn_for_constant_evaluated calls [PR114580]

2024-04-08 Thread Jason Merrill

On 4/5/24 14:47, Marek Polacek wrote:

On Fri, Apr 05, 2024 at 09:40:48AM +0200, Jakub Jelinek wrote:

Hi!

When looking at maybe_warn_for_constant_evaluated for the trivial
infinite loops patch, I've noticed that it can emit weird diagnostics
for if constexpr in templates, first warn that std::is_constant_evaluted()
always evaluates to false (because the function template is not constexpr)
and then during instantiation warn that std::is_constant_evaluted()
always evaluates to true (because it is used in if constexpr condition).
Now, only the latter is actually true, even when the if constexpr
is in a non-constexpr function, it will still always evaluate to true.

So, the following patch fixes it to call maybe_warn_for_constant_evaluated
always with IF_STMT_CONSTEXPR_P (if_stmt) as the second argument rather than
true if it is if constexpr with non-dependent condition etc.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-05  Jakub Jelinek  

PR c++/114580
* semantics.cc (finish_if_stmt_cond): Call
maybe_warn_for_constant_evaluated with IF_STMT_CONSTEXPR_P (if_stmt)
as the second argument, rather than true/false depending on if
it is if constexpr with non-dependent constant expression with
bool type.

* g++.dg/cpp2a/is-constant-evaluated15.C: New test.

--- gcc/cp/semantics.cc.jj  2024-04-03 09:58:33.407772541 +0200
+++ gcc/cp/semantics.cc 2024-04-04 12:11:36.203886572 +0200
@@ -1126,6 +1126,9 @@ tree
  finish_if_stmt_cond (tree orig_cond, tree if_stmt)
  {
tree cond = maybe_convert_cond (orig_cond);
+  maybe_warn_for_constant_evaluated (cond,
+/*constexpr_if=*/
+IF_STMT_CONSTEXPR_P (if_stmt));


I don't think we need the comment anymore since it's clear what the
argument does, and then the whole call can fit on a single line.

But either way, the patch looks good, thanks.


Agreed, OK with that change.

Jason



Re: [PATCH] c++, v2: Implement C++26 P2809R3 - Trivial infinite loops are not Undefined Behavior

2024-04-08 Thread Jason Merrill

On 4/5/24 03:57, Jakub Jelinek wrote:

Hi!

Here is a new version of the PR114462 P2809R3 patch.
As clarified on core, for trivially empty iteration statements
(where the condition isn't empty or INTEGER_CST, because those
loops can't contain std::is_constant_evaluated() in the condition
and the middle-end handles them right even with -ffinite-loops)
it attempts to fold_non_dependent_expr with manifestly_const_eval=true
the expression and if that returns integer_nonzerop, treats it
as trivial infinite loops, but keeps the condition as is.
Instead for -ffinite-loops it adds ANNOTATE_EXPR that the loop is
maybe infinite to override -ffinite-loops behavior for the particular loop.

And, it also emits maybe_warn_for_constant_evaluated warnings for loop
conditions.  For non-trivially empty loops or in immediate functions
or if !processing_template_decl and the condition of trivially empty loops
is not a constant expression or doesn't evaluate to integer non-zero
warns like for condition of non-constexpr if, and in non-constexpr functions
when the condition constant evaluates to true warns that the code has
weird behavior, that it evaluates to true when checking if the loop
is trivially infinite and to false at runtime.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-05  Jakub Jelinek  

PR c++/114462
gcc/
* tree-core.h (enum annot_expr_kind): Add
annot_expr_maybe_infinite_kind enumerator.
* gimplify.cc (gimple_boolify): Handle annot_expr_maybe_infinite_kind.
* tree-cfg.cc (replace_loop_annotate_in_block): Likewise.
(replace_loop_annotate): Likewise.  Move loop->finite_p initialization
before the replace_loop_annotate_in_block calls.
* tree-pretty-print.cc (dump_generic_node): Handle
annot_expr_maybe_infinite_kind.
gcc/cp/
* semantics.cc: Implement C++26 P2809R3 - Trivial infinite
loops are not Undefined Behavior.
(maybe_warn_for_constant_evaluated): Add trivial_infinite argument
and emit special diagnostics for that case.
(finish_if_stmt_cond): Adjust caller.
(finish_loop_cond): New function.
(finish_while_stmt): Use it.
(finish_do_stmt): Likewise.
(finish_for_stmt): Likewise.
gcc/testsuite/
* g++.dg/cpp26/trivial-infinite-loop1.C: New test.
* g++.dg/cpp26/trivial-infinite-loop2.C: New test.
* g++.dg/cpp26/trivial-infinite-loop3.C: New test.

--- gcc/tree-core.h.jj  2024-03-15 09:13:35.333541416 +0100
+++ gcc/tree-core.h 2024-04-04 12:39:42.707652592 +0200
@@ -983,6 +983,7 @@ enum annot_expr_kind {
annot_expr_no_vector_kind,
annot_expr_vector_kind,
annot_expr_parallel_kind,
+  annot_expr_maybe_infinite_kind,
annot_expr_kind_last
  };
  
--- gcc/gimplify.cc.jj	2024-03-11 12:36:18.494005912 +0100

+++ gcc/gimplify.cc 2024-04-04 12:40:44.619799465 +0200
@@ -4516,6 +4516,7 @@ gimple_boolify (tree expr)
case annot_expr_no_vector_kind:
case annot_expr_vector_kind:
case annot_expr_parallel_kind:
+   case annot_expr_maybe_infinite_kind:
  TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0));
  if (TREE_CODE (type) != BOOLEAN_TYPE)
TREE_TYPE (expr) = boolean_type_node;
--- gcc/tree-cfg.cc.jj  2024-02-15 09:51:34.608063945 +0100
+++ gcc/tree-cfg.cc 2024-04-04 12:42:38.271233376 +0200
@@ -297,6 +297,9 @@ replace_loop_annotate_in_block (basic_bl
  loop->can_be_parallel = true;
  loop->safelen = INT_MAX;
  break;
+   case annot_expr_maybe_infinite_kind:
+ loop->finite_p = false;
+ break;
default:
  gcc_unreachable ();
}
@@ -320,12 +323,12 @@ replace_loop_annotate (void)
  
for (auto loop : loops_list (cfun, 0))

  {
+  /* Push the global flag_finite_loops state down to individual loops.  */
+  loop->finite_p = flag_finite_loops;
+
/* Check all exit source blocks for annotations.  */
for (auto e : get_loop_exit_edges (loop))
replace_loop_annotate_in_block (e->src, loop);
-
-  /* Push the global flag_finite_loops state down to individual loops.  */
-  loop->finite_p = flag_finite_loops;
  }
  
/* Remove IFN_ANNOTATE.  Safeguard for the case loop->latch == NULL.  */

@@ -347,6 +350,7 @@ replace_loop_annotate (void)
case annot_expr_no_vector_kind:
case annot_expr_vector_kind:
case annot_expr_parallel_kind:
+   case annot_expr_maybe_infinite_kind:
  break;
default:
  gcc_unreachable ();
--- gcc/tree-pretty-print.cc.jj 2024-03-14 14:07:34.303423928 +0100
+++ gcc/tree-pretty-print.cc2024-04-04 12:39:35.369753709 +0200
@@ -3479,6 +3479,9 @@ dump_generic_node (pretty_printer *pp, t
case annot_expr_parallel_kind:
  pp_string (pp, ", parallel");
  break;
+   case annot_expr_maybe_infinite_kind:
+ 

Re: [committed] c++: Fix ICE with weird copy assignment operator [PR114572]

2024-04-08 Thread Jason Merrill

On 4/5/24 03:35, Jakub Jelinek wrote:

Hi!

While ctors/dtors don't return anything (undeclared void or this pointer
on arm) and copy assignment operators normally return a reference to *this,
it isn't invalid to return uselessly some class object which might need
destructing, but the OpenMP clause handling code wasn't expecting that.

The following patch fixes that.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk so
far.

2024-04-05  Jakub Jelinek  

PR c++/114572
* cp-gimplify.cc (cxx_omp_clause_apply_fn): Call build_cplus_new
on build_call_a result if it has class type.

* testsuite/libgomp.c++/pr114572.C: New test.

--- gcc/cp/cp-gimplify.cc.jj2024-04-02 13:07:58.540385210 +0200
+++ gcc/cp/cp-gimplify.cc   2024-04-04 16:41:50.781008370 +0200
@@ -2480,6 +2480,8 @@ cxx_omp_clause_apply_fn (tree fn, tree a
   TREE_PURPOSE (parm), fn,
   i - is_method, tf_warning_or_error);
t = build_call_a (fn, i, argarray);
+  if (MAYBE_CLASS_TYPE_P (TREE_TYPE (t)))
+   t = build_cplus_new (TREE_TYPE (t), t, tf_warning_or_error);


Maybe use build_cxx_call instead of build_call_a?

Jason



Re: [PATCH] rust: Add rust.install-dvi and rust.install-html rules

2024-04-08 Thread Thomas Schwinge
Hi Christophe!

On 2024-04-04T16:27:19+, Christophe Lyon  wrote:
> rust has the (empty) rust.dvi and rust.html rules, but lacks the
> (empty) rust.install-dvi and rust.install-html ones.

Thanks, looks good to me.


Grüße
 Thomas


> 2024-04-04  Christophe Lyon  
>
>   gcc/rust/
>   * Make-lang.in (rust.install-dvi, rust.install-html): New rules.
> ---
>  gcc/rust/Make-lang.in | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/rust/Make-lang.in b/gcc/rust/Make-lang.in
> index 4d646018792..4d73412739d 100644
> --- a/gcc/rust/Make-lang.in
> +++ b/gcc/rust/Make-lang.in
> @@ -342,6 +342,8 @@ selftest-rust-valgrind: $(RUST_SELFTEST_DEPS)
>  # should have dependencies on info files that should be installed.
>  rust.install-info:
>  
> +rust.install-dvi:
> +rust.install-html:
>  rust.install-pdf:
>  
>  # Install man pages for the front end. This target should ignore errors.


[PATCH] btf: improve btf-datasec-3.c test [PR 114642]

2024-04-08 Thread David Faust
This test failed on powerpc --target_board=unix'{-m32}' because two
variables were not placed in sections where the test silently (and
incorrectly) assumed they would be.

The important thing for the test is only that BTF_KIND_DATASEC entries
are NOT generated for the extern variable declarations without an
explicit section attribute.  Make the test more robust by placing the
non-extern variables in explicit sections, and invert the checks to
more accurately verify what we care about in this test.

Tested on x86_64-linux-gnu and x86_64-linux-gnu host for
powerpc64-linux-gnu and bpf-unkown-none targets.

OK?

gcc/testsuite/
PR testsuite/114642
* gcc.dg/debug/btf/btf-datasec-3.c: Make test more robust on different
architectures.
---
 gcc/testsuite/gcc.dg/debug/btf/btf-datasec-3.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-3.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-3.c
index 297340cabfa..6b127aa14da 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-3.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-3.c
@@ -7,22 +7,22 @@
 
 extern int VERSION __attribute__((section (".version")));
 
-extern int test_bss1;
-extern int test_data1;
+extern int ext1;
+extern int ext2;
 
-int test_bss2;
-int test_data2 = 2;
+int var1 __attribute__((section (".sec_a")));
+int var2 __attribute__((section (".sec_b"))) = 2;
 
 int
 foo (void)
 {
-  test_bss2 = VERSION;
-  return test_bss1 + test_data1 + test_data2;
+  ext2 = VERSION;
+  return ext1 + var1 + var2;
 }
 
 /* There should be 3 DATASEC entries total.  Of the extern decls, only VERSION
has a known section; entries are not created for the other two.  */
 /* { dg-final { scan-assembler-times "bts_type" 3 } } */
-/* { dg-final { scan-assembler-times "bts_type: \\(BTF_KIND_VAR 
'test_data2'\\)" 1 } } */
-/* { dg-final { scan-assembler-times "bts_type: \\(BTF_KIND_VAR 
'test_bss2'\\)" 1 } } */
 /* { dg-final { scan-assembler-times "bts_type: \\(BTF_KIND_VAR 'VERSION'\\)" 
1 } } */
+/* { dg-final { scan-assembler-not "bts_type: \\(BTF_KIND_VAR 'ext1'\\)" } } */
+/* { dg-final { scan-assembler-not "bts_type: \\(BTF_KIND_VAR 'ext2'\\)" } } */
-- 
2.43.0



Re: [PATCH] btf: emit symbol refs in DATASEC entries only for BPF [PR114608]

2024-04-08 Thread Indu Bhagat

On 4/8/24 12:26, David Faust wrote:

The behavior introduced in
   fa60ac54964 btf: Emit labels in DATASEC bts_offset entries.

is only fully correct when compiling for the BPF target with BPF CO-RE
enabled.  In other cases, depending on optimizations, it can result in
an incorrect symbol reference in the entry bts_offset field for a symbol
which may not be emitted at all, causing link-time undefined symbol
reference errors like in PR114608.

The offending bts_offset field of BTF_KIND_DATASEC entries is in reality
only currently useful to consumers of BTF information for BPF programs
anyway.  Correct the regression by only emitting symbol references in
these entries when compiling for the BPF target.  For other targets, the
behavior returns to that prior to fa60ac54964.

The underlying cause is related to PR 113566 "btf: incorrect
BTF_KIND_DATASEC entries for variables which are optimized out." A
complete fix for 113566 is more involved and unsuitable for stage 4,
but will be addressed in the near future.

Tested on x86_64-linux-gnu and on x86_64-linux-gnu host for
bpf-unknown-none target.

OK?
Thanks.



LGTM.

I think adding a comment in the bugzilla for 114431 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114431 indicating a further 
patch would not hurt either.


Thanks


gcc/
PR debug/114608
* btfout.cc (btf_asm_datasec_entry): Only emit a symbol reference when
generating BTF for BPF CO-RE target.

gcc/testsuite/
PR debug/114608
* gcc.dg/debug/btf/btf-datasec-1.c: Check bts_offset symbol references
only for BPF target.
* gcc.dg/debug/btf/btf-datasec-2.c: Likewise.
* gcc.dg/debug/btf/btf-pr106773.c: Likewise.
---
  gcc/btfout.cc  |  2 +-
  gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c | 10 ++
  gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c |  7 ---
  gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c  |  3 ++-
  4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 2e2b3524e83..4a8ec4d1ff0 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -1045,7 +1045,7 @@ btf_asm_datasec_entry (ctf_container_ref ctfc, struct 
btf_var_secinfo info)
  {
const char *symbol_name = get_name_for_datasec_entry (ctfc, info.type);
btf_asm_type_ref ("bts_type", ctfc, info.type);
-  if (symbol_name == NULL)
+  if (!btf_with_core_debuginfo_p () || symbol_name == NULL)
  dw2_asm_output_data (4, info.offset, "bts_offset");
else
  dw2_asm_output_offset (4, symbol_name, NULL, "bts_offset");
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c
index c8ebe5d07ca..15b76259218 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c
@@ -19,10 +19,12 @@
  /* { dg-final { scan-assembler-times "0xf03\[\t \]+\[^\n\]*btt_info" 2 } 
} */
  /* { dg-final { scan-assembler-times "0xf01\[\t \]+\[^\n\]*btt_info" 1 } 
} */
  
-/* The offset entry for each variable in a DATSEC should contain a label.  */

-/* { dg-final { scan-assembler-times 
"(?:(?:\\.4byte|\\.long|data4\\.ua|\\.ualong|\\.uaword|\\.dword|long|dc\\.l|\\.word)\[\t
 \]|\\.vbyte\t4,\[\t \]?)_?\[a-e\]\[\t \]+\[^\n\]*bts_offset" 5 } } */
-/* { dg-final { scan-assembler-times "my_cstruct\[\t \]+\[^\n\]*bts_offset" 1 
} } */
-/* { dg-final { scan-assembler-times "bigarr\[\t \]+\[^\n\]*bts_offset" 1 } } 
*/
+/* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bts_offset" 7 { target 
{ ! bpf-*-* } } } } */
+
+/* For BPF target the offset entry for each variable in a DATSEC should 
contain a label.  */
+/* { dg-final { scan-assembler-times ".4byte\[\t \]\[a-e\]\[\t 
\]+\[^\n\]*bts_offset" 5 { target bpf-*-* } } } */
+/* { dg-final { scan-assembler-times "my_cstruct\[\t \]+\[^\n\]*bts_offset" 1 
{ target bpf-*-* } } } */
+/* { dg-final { scan-assembler-times "bigarr\[\t \]+\[^\n\]*bts_offset" 1 { 
target bpf-*-* } } } */
  
  /* Check that strings for each DATASEC have been added to the BTF string table.  */

  /* { dg-final { scan-assembler-times "ascii \".data.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
index 857d830e446..a89a239a504 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
@@ -10,9 +10,10 @@
  /* { dg-final { scan-assembler-times " BTF_KIND_DATASEC 
'.bar_sec'\[\\r\\n\]+\[^\\r\\n\]*0xf02\[\t \]+\[^\n\]*btt_info" 1 } } */
  
  /* Function entries should have offset with a label and size of 0 at compile time.  */

-/* { dg-final { scan-assembler-times "chacha\[\t \]+\[^\n\]*bts_offset" 1 } } 
*/
-/* { dg-final { scan-assembler-times "bar\[\t \]+\[^\n\]*bts_offset" 1 } } */
-/* { dg-final { scan-assembler-times "foo\[\t \]+\[^\n\]*bts_offset" 1 } } */
+/* { dg-final { scan-assembler-times "chacha\[\t 

GCN: '--param=gcn-preferred-vectorization-factor=[default,32,64]' (was: GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]')

2024-04-08 Thread Thomas Schwinge
Hi!

On 2024-04-08T13:24:06+0100, Andrew Stubbs  wrote:
> On 08/04/2024 11:45, Thomas Schwinge wrote:
>> On 2024-03-28T08:00:50+0100, I wrote:
>>> On 2024-03-22T15:54:48+, Andrew Stubbs  wrote:
 This patch alters the default (preferred) vector size to 32 on RDNA 
 devices to
 better match the actual hardware.  64-lane vectors will continue to be
 used where they are hard-coded (such as function prologues).

 We run these devices in wavefrontsize64 for compatibility, but they 
 actually
 only have 32-lane vectors, natively.  If the upper part of a V64 is masked
 off (as it is in V32) then RDNA devices will skip execution of the upper 
 part
 for most operations, so this adjustment shouldn't leave too much 
 performance on
 the table.  One exception is memory instructions, so full wavefrontsize32
 support would be better.

 The advantage is that we avoid the missing V64 operations (such as permute 
 and
 vec_extract).

 Committed to mainline.
>>>
>>> In my GCN target '-march=gfx1100' testing, this commit
>>> "amdgcn: Prefer V32 on RDNA devices" does resolve (or, make latent?) a
>>> number of execution test FAILs (that is, regressions compared to earlier
>>> '-march=gfx90a' etc. testing).
>>>
>>> This commit also resolves (for my '-march=gfx1100' testing) one
>>> pre-existing FAIL (that is, already seen in '-march=gfx90a' earlier
>>> etc. testing):
>>>
>>>  PASS: gcc.dg/tree-ssa/scev-14.c (test for excess errors)
>>>  [-FAIL:-]{+PASS:+} gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts 
>>> "Overflowness wrto loop niter:\tNo-overflow"
>>>
>>> That means, this test case specifically (or, just its 'scan-tree-dump'?)
>>> needs to be adjusted for GCN V64 testing?
>>>
>>> This commit, as you'd also mentioned elsewhere, however also causes a
>>> number of regressions in 'gcc.target/gcn/gcn.exp', see list below.
>>>
>>> Those can be "fixed" with 'dg-additional-options -march=gfx90a' (or
>>> similar) in the affected test cases (let me know if you'd like me to
>>> 'git push' that), but I suppose something more elaborate may be in order?
>>> (Conditionalize those on 'target { ! gcn_rdna }', and add respective
>>> scanning for 'target gcn_rdna'?  I can help with effective-target
>>> 'gcn_rdna' (or similar), if you'd like me to.)
>>>
>>> And/or, have a '-mpreferred-simd-mode=v64' (or similar) to be used for
>>> such test cases, to override 'if (TARGET_RDNA2_PLUS)' etc. in
>>> 'gcn_vectorize_preferred_simd_mode'?
>> 
>> The latter I have quickly implemented, see attached
>> "GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'".  OK to
>> push to trunk branch?
>> 
>> (This '--param' will also be useful for another bug/regression I'm about
>> to file.)
>> 
>>> Best, probably, both these things, to properly test both V32 and V64?
>> 
>> That part remains to be done, but is best done by someone who actually
>> knowns "GCN" assembly/GCC back end -- that is, not me.
>
> I'm not sure that this is *best* solution to the problem (in general, 
> it's probably best to test the actual code that will be generated in 
> practice), but I think this option will be useful for testing 
> performance in each configuration and other correctness issues, and 
> these tests are not testing that feature.

ACK.

> However, "vector lane width" sounds like it's configuring the number of 
> bits in each lane. I think "vectorization factor" is unambigous.
>
> OK to commit, with the name change.

Thanks, changed, and pushed v2 version to trunk branch in
commit df7625c3af004a81c13d54bb8810e03932eeb59a
"GCN: '--param=gcn-preferred-vectorization-factor=[default,32,64]'", see
attached.


Grüße
 Thomas


>>>  PASS: gcc.target/gcn/cond_fmaxnm_1.c (test for excess errors)
>>>  [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-not 
>>> \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>>  [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times 
>>> smaxv64df3_exec 3
>>>  [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times 
>>> smaxv64sf3_exec 3
>>>  PASS: gcc.target/gcn/cond_fmaxnm_1_run.c (test for excess errors)
>>>  PASS: gcc.target/gcn/cond_fmaxnm_1_run.c execution test
>>>
>>>  PASS: gcc.target/gcn/cond_fmaxnm_2.c (test for excess errors)
>>>  [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-not 
>>> \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>>  [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times 
>>> smaxv64df3_exec 3
>>>  [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times 
>>> smaxv64sf3_exec 3
>>>  PASS: gcc.target/gcn/cond_fmaxnm_2_run.c (test for excess errors)
>>>  PASS: gcc.target/gcn/cond_fmaxnm_2_run.c execution test
>>>
>>>  PASS: gcc.target/gcn/cond_fmaxnm_3.c (test for excess errors)
>>>  [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-not 
>>> \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>> 

New effective-target 'asm_goto_with_outputs' (was: [PATCH] testsuite: Fix up lra effective target)

2024-04-08 Thread Thomas Schwinge
Hi!

On 2024-03-21T12:20:38+0100, I wrote:
> On 2024-02-16T10:48:53-0800, Mike Stump  wrote:
>> On Feb 16, 2024, at 2:16 AM, Jakub Jelinek  wrote:
>>> 
>>> There is one special case, NVPTX, which is a TARGET_NO_REGISTER_ALLOCATION
>>> target.  I think claiming for it that it is a lra target is strange (even
>>> though it effectively returns true for targetm.lra_p ()), unsure if it
>>> supports asm goto with outputs or not, if it does and we want to test it,
>>> perhaps we should introduce asm_goto_outputs effective target and use
>>> lra || nvptx-*-* for that?
>>
>> Since the port people have to maintain that code in general, I usually leave 
>> it to them to try and select a cheap, maintainable way to manage it.
>>
>> If people want to pave the way, I'd tend to defer to them, having thought 
>> about more than I.
>
> Here I am.  ;-)
>
> After commit e16f90be2dc8af6c371fe79044c3e668fa3dda62
> "testsuite: Fix up lra effective target", we get for nvptx target:
>
> -PASS: gcc.c-torture/compile/asmgoto-2.c   -O0  (test for excess errors)
> +ERROR: gcc.c-torture/compile/asmgoto-2.c   -O0 : no files matched glob 
> pattern "lra1020113.c.[0-9][0-9][0-9]r.reload" for " dg-do 2 compile { target 
> lra } "
>
> Etc.
>
> That is, the current effective-target 'lra' is not suitable for nvptx --
> which, I suppose, is OK, given that nvptx neither uses LRA nor doesn't
> use LRA.  ;-) (Therefore, effective-target 'lra' shouldn't get used in
> test cases that are active for nvptx.)
>
> However, nvptx appears to support 'asm goto' with outputs, including the
> new execution test case:
>
> PASS: gcc.dg/pr107385.c execution test
>
> I'm attaching "[WIP] New effective-target 'asm_goto_with_outputs'", which
> does address the effective-target check for nvptx, and otherwise does
> 's%lra%asm_goto_with_outputs'.  (I have not yet actually merged
> 'check_effective_target_lra' into
> 'check_effective_target_asm_goto_with_outputs'.)
>
> I have verified that all current effective-target 'lra' test cases
> actually use 'asm goto' with outputs, there is just one exception:
> 'gcc.dg/pr110079.c' (see
> 
> "bb-reorder: Fix -freorder-blocks-and-partition ICEs on aarch64 with asm goto 
> [PR110079]",
> 
> "ICE with -freorder-blocks-and-partition and inline-asm goto").  That
> test case, 'gcc.dg/pr110079.c', currently uses 'target lra', and uses
> 'asm goto' -- but not with outputs, so is 'asm_goto_with_outputs' not
> really applicable?  The test case does PASS for nvptx target (but I've
> not verified what it's actually doing/testing).  How to handle that one?

I've now pushed a v2 version to trunk branch in
commit 3fa8bff30ab58bd8b8018764d390ec2fcc8153bb
"New effective-target 'asm_goto_with_outputs'", see attached.


Grüße
 Thomas


>From 3fa8bff30ab58bd8b8018764d390ec2fcc8153bb Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 4 Mar 2024 16:04:11 +0100
Subject: [PATCH] New effective-target 'asm_goto_with_outputs'

After commit e16f90be2dc8af6c371fe79044c3e668fa3dda62
"testsuite: Fix up lra effective target", we get for nvptx target:

-PASS: gcc.c-torture/compile/asmgoto-2.c   -O0  (test for excess errors)
+ERROR: gcc.c-torture/compile/asmgoto-2.c   -O0 : no files matched glob pattern "lra1020113.c.[0-9][0-9][0-9]r.reload" for " dg-do 2 compile { target lra } "

Etc.

However, nvptx appears to support 'asm goto' with outputs, including the
new execution test case:

PASS: gcc.dg/pr107385.c execution test

Therefore, generally use new effective-target 'asm_goto_with_outputs' instead
of 'lra'.  One exceptions is 'gcc.dg/pr110079.c', which doesn't use 'asm goto'
with outputs, and continues using effective-target 'lra', with special-casing
nvptx target, to avoid ERROR for 'lra'.

	gcc/
	* doc/sourcebuild.texi (Effective-Target Keywords): Document
	'asm_goto_with_outputs'.  Add comment to 'lra'.
	gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_lra): Add
	comment.
	(check_effective_target_asm_goto_with_outputs): New.
	* gcc.c-torture/compile/asmgoto-2.c: Use it.
	* gcc.c-torture/compile/asmgoto-5.c: Likewise.
	* gcc.c-torture/compile/asmgoto-6.c: Likewise.
	* gcc.c-torture/compile/pr98096.c: Likewise.
	* gcc.dg/pr100590.c: Likewise.
	* gcc.dg/pr107385.c: Likewise.
	* gcc.dg/pr108095.c: Likewise.
	* gcc.dg/pr97954.c: Likewise.
	* gcc.dg/torture/pr100329.c: Likewise.
	* gcc.dg/torture/pr100398.c: Likewise.
	* gcc.dg/torture/pr100519.c: Likewise.
	* gcc.dg/torture/pr110422.c: Likewise.
	* gcc.dg/pr110079.c: Special-case nvptx target.
---
 gcc/doc/sourcebuild.texi|  6 ++
 gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c |  2 +-
 gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c |  2 +-
 gcc/testsuite/gcc.c-torture/compile/asmgoto-6.c |  3 +--
 gcc/testsuite/gcc.c-torture/compile/pr98096.c   |  2 +-
 gcc/testsuite/gcc.dg/pr100590.c |  2 +-
 gcc/testsuite/gcc.dg/pr107385.c   

[PATCH] Fortran: fix argument checking of intrinsics C_SIZEOF, C_F_POINTER [PR106500]

2024-04-08 Thread Harald Anlauf
Dear all,

the attached patch fixes argument checking of:

- C_SIZEOF - rejects-valid (see below) and ICE-on-valid
- C_F_POINTER - ICE-on-invalid

The interesting part is that C_SIZEOF was not well specified until
after F2018, where an interp request lead to an edit that actually
loosened restrictions and makes the checking much more straightforward,
since expressions and function results are now allowed.

I've added references to the relevant text and interp in the commit message.

While updating the checking code shared between C_SIZEOF and C_F_POINTER,
I figured that the latter missed a check preventing an ICE-on-invalid
when a function returning a pointer was passed.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 6f412a6399a7e125db835584d3d2489a52150c27 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 8 Apr 2024 21:43:24 +0200
Subject: [PATCH] Fortran: fix argument checking of intrinsics C_SIZEOF,
 C_F_POINTER [PR106500]

The interpretation of the F2018 standard regarding valid arguments to the
intrinsic C_SIZEOF(X) was clarified in an edit to 18-007r1:

  https://j3-fortran.org/doc/year/22/22-101r1.txt

loosening restrictions and giving examples.  The F2023 text has:

! F2023:18.2.3.8  C_SIZEOF (X)
!
!   X shall be a data entity with interoperable type and type parameters,
!   and shall not be an assumed-size array, an assumed-rank array that
!   is associated with an assumed-size array, an unallocated allocatable
!   variable, or a pointer that is not associated.

where

! 3.41 data entity
!   data object, result of the evaluation of an expression, or the
!   result of the execution of a function reference

Update the checking code for interoperable arguments accordingly, and extend
to reject functions returning pointer as FPTR argument to C_F_POINTER.

gcc/fortran/ChangeLog:

	PR fortran/106500
	* check.cc (is_c_interoperable): Fix checks for C_SIZEOF.
	(gfc_check_c_f_pointer): Reject function returning a pointer as FPTR.

gcc/testsuite/ChangeLog:

	PR fortran/106500
	* gfortran.dg/c_sizeof_6.f90: Remove wrong dg-error.
	* gfortran.dg/c_f_pointer_tests_9.f90: New test.
	* gfortran.dg/c_sizeof_7.f90: New test.
---
 gcc/fortran/check.cc  | 21 ++
 .../gfortran.dg/c_f_pointer_tests_9.f90   | 21 ++
 gcc/testsuite/gfortran.dg/c_sizeof_6.f90  |  2 +-
 gcc/testsuite/gfortran.dg/c_sizeof_7.f90  | 42 +++
 4 files changed, 76 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/c_f_pointer_tests_9.f90
 create mode 100644 gcc/testsuite/gfortran.dg/c_sizeof_7.f90

diff --git a/gcc/fortran/check.cc b/gcc/fortran/check.cc
index db74dcf3f40..b7f60575c67 100644
--- a/gcc/fortran/check.cc
+++ b/gcc/fortran/check.cc
@@ -5299,18 +5299,14 @@ is_c_interoperable (gfc_expr *expr, const char **msg, bool c_loc, bool c_f_ptr)
   return false;
 }

-  if (!c_loc && expr->rank > 0 && expr->expr_type != EXPR_ARRAY)
+  /* Checks for C_SIZEOF need to take into account edits to 18-007r1, see
+ https://j3-fortran.org/doc/year/22/22-101r1.txt .  */
+  if (!c_loc && !c_f_ptr && expr->rank > 0 && expr->expr_type == EXPR_VARIABLE)
 {
   gfc_array_ref *ar = gfc_find_array_ref (expr);
-  if (ar->type != AR_FULL)
+  if (ar->type == AR_FULL && ar->as->type == AS_ASSUMED_SIZE)
 	{
-	  *msg = "Only whole-arrays are interoperable";
-	  return false;
-	}
-  if (!c_f_ptr && ar->as->type != AS_EXPLICIT
-	  && ar->as->type != AS_ASSUMED_SIZE)
-	{
-	  *msg = "Only explicit-size and assumed-size arrays are interoperable";
+	  *msg = "Assumed-size arrays are not interoperable";
 	  return false;
 	}
 }
@@ -5475,6 +5471,13 @@ gfc_check_c_f_pointer (gfc_expr *cptr, gfc_expr *fptr, gfc_expr *shape)
   return false;
 }

+  if (attr.function)
+{
+  gfc_error ("FPTR at %L to C_F_POINTER is a function returning a pointer",
+		 >where);
+  return false;
+}
+
   if (fptr->rank > 0 && !is_c_interoperable (fptr, , false, true))
 return gfc_notify_std (GFC_STD_F2018, "Noninteroperable array FPTR "
 			   "at %L to C_F_POINTER: %s", >where, msg);
diff --git a/gcc/testsuite/gfortran.dg/c_f_pointer_tests_9.f90 b/gcc/testsuite/gfortran.dg/c_f_pointer_tests_9.f90
new file mode 100644
index 000..bb6d3281b02
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/c_f_pointer_tests_9.f90
@@ -0,0 +1,21 @@
+! { dg-do compile }
+!
+! A function returning a pointer cannot be interoperable
+! and cannot be used as FPTR argument to C_F_POINTER.
+
+subroutine s ()
+  use, intrinsic :: iso_c_binding
+  implicit none
+  type(c_ptr) :: cPtr
+  call c_f_pointer (cPtr, p0)! { dg-error "function returning a pointer" }
+  call c_f_pointer (cPtr, p1, shape=[2]) ! { dg-error "function returning a pointer" }
+contains
+  function p0 ()
+integer, pointer :: p0
+nullify (p0)
+  end
+  function p1 ()
+integer, pointer :: p1(:)
+nullify (p1)
+  end
+end
diff --git 

[PATCH] btf: emit symbol refs in DATASEC entries only for BPF [PR114608]

2024-04-08 Thread David Faust
The behavior introduced in
  fa60ac54964 btf: Emit labels in DATASEC bts_offset entries.

is only fully correct when compiling for the BPF target with BPF CO-RE
enabled.  In other cases, depending on optimizations, it can result in
an incorrect symbol reference in the entry bts_offset field for a symbol
which may not be emitted at all, causing link-time undefined symbol
reference errors like in PR114608.

The offending bts_offset field of BTF_KIND_DATASEC entries is in reality
only currently useful to consumers of BTF information for BPF programs
anyway.  Correct the regression by only emitting symbol references in
these entries when compiling for the BPF target.  For other targets, the
behavior returns to that prior to fa60ac54964.

The underlying cause is related to PR 113566 "btf: incorrect
BTF_KIND_DATASEC entries for variables which are optimized out." A
complete fix for 113566 is more involved and unsuitable for stage 4,
but will be addressed in the near future.

Tested on x86_64-linux-gnu and on x86_64-linux-gnu host for
bpf-unknown-none target.

OK?
Thanks.

gcc/
PR debug/114608
* btfout.cc (btf_asm_datasec_entry): Only emit a symbol reference when
generating BTF for BPF CO-RE target.

gcc/testsuite/
PR debug/114608
* gcc.dg/debug/btf/btf-datasec-1.c: Check bts_offset symbol references
only for BPF target.
* gcc.dg/debug/btf/btf-datasec-2.c: Likewise.
* gcc.dg/debug/btf/btf-pr106773.c: Likewise.
---
 gcc/btfout.cc  |  2 +-
 gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c | 10 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c |  7 ---
 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c  |  3 ++-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 2e2b3524e83..4a8ec4d1ff0 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -1045,7 +1045,7 @@ btf_asm_datasec_entry (ctf_container_ref ctfc, struct 
btf_var_secinfo info)
 {
   const char *symbol_name = get_name_for_datasec_entry (ctfc, info.type);
   btf_asm_type_ref ("bts_type", ctfc, info.type);
-  if (symbol_name == NULL)
+  if (!btf_with_core_debuginfo_p () || symbol_name == NULL)
 dw2_asm_output_data (4, info.offset, "bts_offset");
   else
 dw2_asm_output_offset (4, symbol_name, NULL, "bts_offset");
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c
index c8ebe5d07ca..15b76259218 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-1.c
@@ -19,10 +19,12 @@
 /* { dg-final { scan-assembler-times "0xf03\[\t \]+\[^\n\]*btt_info" 2 } } 
*/
 /* { dg-final { scan-assembler-times "0xf01\[\t \]+\[^\n\]*btt_info" 1 } } 
*/
 
-/* The offset entry for each variable in a DATSEC should contain a label.  */
-/* { dg-final { scan-assembler-times 
"(?:(?:\\.4byte|\\.long|data4\\.ua|\\.ualong|\\.uaword|\\.dword|long|dc\\.l|\\.word)\[\t
 \]|\\.vbyte\t4,\[\t \]?)_?\[a-e\]\[\t \]+\[^\n\]*bts_offset" 5 } } */
-/* { dg-final { scan-assembler-times "my_cstruct\[\t \]+\[^\n\]*bts_offset" 1 
} } */
-/* { dg-final { scan-assembler-times "bigarr\[\t \]+\[^\n\]*bts_offset" 1 } } 
*/
+/* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bts_offset" 7 { target 
{ ! bpf-*-* } } } } */
+
+/* For BPF target the offset entry for each variable in a DATSEC should 
contain a label.  */
+/* { dg-final { scan-assembler-times ".4byte\[\t \]\[a-e\]\[\t 
\]+\[^\n\]*bts_offset" 5 { target bpf-*-* } } } */
+/* { dg-final { scan-assembler-times "my_cstruct\[\t \]+\[^\n\]*bts_offset" 1 
{ target bpf-*-* } } } */
+/* { dg-final { scan-assembler-times "bigarr\[\t \]+\[^\n\]*bts_offset" 1 { 
target bpf-*-* } } } */
 
 /* Check that strings for each DATASEC have been added to the BTF string 
table.  */
 /* { dg-final { scan-assembler-times "ascii \".data.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
index 857d830e446..a89a239a504 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
@@ -10,9 +10,10 @@
 /* { dg-final { scan-assembler-times " BTF_KIND_DATASEC 
'.bar_sec'\[\\r\\n\]+\[^\\r\\n\]*0xf02\[\t \]+\[^\n\]*btt_info" 1 } } */
 
 /* Function entries should have offset with a label and size of 0 at compile 
time.  */
-/* { dg-final { scan-assembler-times "chacha\[\t \]+\[^\n\]*bts_offset" 1 } } 
*/
-/* { dg-final { scan-assembler-times "bar\[\t \]+\[^\n\]*bts_offset" 1 } } */
-/* { dg-final { scan-assembler-times "foo\[\t \]+\[^\n\]*bts_offset" 1 } } */
+/* { dg-final { scan-assembler-times "chacha\[\t \]+\[^\n\]*bts_offset" 1 { 
target { bpf-*-* } }} } */
+/* { dg-final { scan-assembler-times "bar\[\t \]+\[^\n\]*bts_offset" 1 { 
target { bpf-*-* } } } } */
+/* { dg-final { scan-assembler-times "foo\[\t \]+\[^\n\]*bts_offset" 1 { 
target { bpf-*-* } } 

Re: [Patch, fortran] PR114535 - [13/14 regression] ICE with elemental finalizer

2024-04-08 Thread Jerry D

On 4/8/24 2:45 AM, Paul Richard Thomas wrote:

Hi All,

This one is blazingly 'obvious'. I haven't had the heart to investigate why somebody thought that it is a good idea to check if unreferenced symbols are finalizable because, I suspect, that 
'somebody' was me. Worse, I tried a couple of other fixes before I hit on the 'obvious' one :-(


The ChangeLog says it all. OK for mainline and then backporting in a couple of 
weeks?

Paul

Fortran: Fix ICE in trans-stmt.cc(gfc_trans_call) [PR114535]

2024-04-08  Paul Thomas  mailto:pa...@gcc.gnu.org>>

gcc/fortran
PR fortran/114535
* resolve.cc (resolve_symbol): Remove last chunk that checked
for finalization of unreferenced symbols.

gcc/testsuite/
PR fortran/114535
* gfortran.dg/pr114535d.f90: New test.
* gfortran.dg/pr114535iv.f90: Additional source.



Yes, OK Paul.  Don't feel bad. It wasn't Tabs. LOL

Jerry


Re: [Patch] Fortran: List-directed read - accept again tab as alternative to space as separator [PR114304]

2024-04-08 Thread Jerry D

On 4/8/24 2:53 AM, Tobias Burnus wrote:

Jerry D wrote:

See attached updated patch.


It turned rather quickly out that this patch – committed as 
r14-9822-g93adf88cc6744a – caused regressions.

Namely, real-world code use tab(s) as separator instead of spaces.

[For instance, PR114304 which contains a named-list input file from SPEC CPU 
2017; that example uses tabs before the '=' sign, but the issue is more 
generic.]

I think the ISO Fortran standard only permits spaces, but as it feels natural 
and is widely supported, tabs are used and should remain supported.

It is not quite clear how '\r' are or should be handled, but as eat_spaces did 
use it, I thought I would add one testcase using them as well.

That test is not affected by my change; it did work before with GCC and still does – but it does fail with ifort/ifx/flang. I have not thought deeply whether it should be supported or not – and 
looking at the libgfortran source file, it often but (→ testcase) not consistently requires that an \n follows the \r.


OK for mainline? [And: When the previous patch gets backported, this surely 
needs to be included as well.]

Tobias


Good catch. I did not even think about tabs.

OK to commit and I will take care of it when I do the backport to 13.

Thanks!

Jerry


Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Richard Biener



> Am 08.04.2024 um 18:40 schrieb Aldy Hernandez :
> 
> On Mon, Apr 8, 2024 at 6:29 PM Richard Biener  wrote:
>> 
>> 
>> 
 Am 08.04.2024 um 18:09 schrieb Aldy Hernandez :
>>> 
>>> On Mon, Apr 8, 2024 at 5:54 PM Jakub Jelinek  wrote:
 
 On Mon, Apr 08, 2024 at 05:40:23PM +0200, Aldy Hernandez wrote:
>>   PR middle-end/114604
>>   * gimple-range.cc (enable_ranger): Initialize the global
>>   bitmap obstack.
>>   (disable_ranger): Release it.
>> ---
>> gcc/gimple-range.cc | 4 
>> 1 file changed, 4 insertions(+)
>> 
>> diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
>> index c16b776c1e3..4d3b1ce8588 100644
>> --- a/gcc/gimple-range.cc
>> +++ b/gcc/gimple-range.cc
>> @@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool 
>> use_imm_uses)
>> {
>>  gimple_ranger *r;
>> 
>> +  bitmap_obstack_initialize (NULL);
>> +
>>  gcc_checking_assert (!fun->x_range_query);
>>  r = new gimple_ranger (use_imm_uses);
>>  fun->x_range_query = r;
>> @@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
>>  gcc_checking_assert (fun->x_range_query);
>>  delete fun->x_range_query;
>>  fun->x_range_query = NULL;
>> +
>> +  bitmap_obstack_release (NULL);
> 
> Are you not allowed to initialize/use obstacks unless
> bitmap_obstack_initialize(NULL) is called?
 
 You can use it with some other obstack, just not the default one.
 
> If so, wouldn't it be
> better to lazily initialize it downstream (bitmap_alloc, or whomever
> needs it initialized)?
 
 No, you still need to decide where is the safe point to release it.
 Unlike the non-default bitmap_obstack_initialize/bitmap_obstack_release,
 the default one can nest (has associated nesting counter).  So, the above
 patch just says that ranger starts using the default obstack in
 enable_ranger and stops using it in disable_ranger and anything ranger
 associated in the obstack can be freed at that point.
>>> 
>>> I thought ranger never used the default one:
>>> 
>>> $ grep bitmap_obstack_initialize *value* *range*
>>> value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
>>> value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
>>> gimple-range-cache.cc:  bitmap_obstack_initialize (_bitmaps);
>>> gimple-range-gori.cc:  bitmap_obstack_initialize (_bitmaps);
>>> gimple-range-infer.cc:  bitmap_obstack_initialize (_bitmaps);
>>> gimple-range-phi.cc:  bitmap_obstack_initialize (_bitmaps);
>>> 
>>> or even:
>>> 
>>> $ grep obstack.*NULL *value* *range*
>>> value-range-storage.cc:obstack_free (_obstack, NULL);
>>> value-relation.cc:  obstack_free (_chain_obstack, NULL);
>>> value-relation.cc:  obstack_free (_chain_obstack, NULL);
>>> gimple-range-infer.cc:  obstack_free (_list_obstack, NULL);
>>> value-range-storage.cc:obstack_free (_obstack, NULL);
>>> 
>>> I'm obviously missing something here.
>> 
>> Look for BITMAP_ALLOC (NULL) in the backtrace in the PR
> 
> Ahh!  Thanks.
> 
> A few default obstack uses snuck in while I wasn't looking.
> 
> $ grep BITMAP_ALLOC.*NULL *range*
> gimple-range-cache.cc:  m_propfail = BITMAP_ALLOC (NULL);
> gimple-range-cache.h:  inline ssa_lazy_cache () { active_p =
> BITMAP_ALLOC (NULL); }
> gimple-range.cc:  m_pop_list = BITMAP_ALLOC (NULL);
> 
> I wonder if it would be cleaner to just change these to use named obstacks.

I didn’t find any obvious obstack to use, but sure.  This was the easiest fix ;)

Richard 

> Andrew, is there a reason we were using the default obstack for these?
> For reference, they are  class update_list used in the ranger cache,
> ssa_lazy_cache, and dom_ranger.
> 
> Aldy
> 


Re: [PATCH] ICF: Make ICF and SRA agree on padding

2024-04-08 Thread Martin Jambor
Hello,

On Sun, Apr 07 2024, Xi Ruoyao wrote:
> On Thu, 2024-04-04 at 23:19 +0200, Martin Jambor wrote:
>> The patch has been approved by Honza in Bugzilla. (I hope.  He did write
>> it looked reasonable.)  Together with the patch for PR 113907, it has
>> passed bootstrap, LTO bootstrap and LTO profiledbootstrap and testing on
>> x86_64-linux and bootstrap and LTO bootstrap on ppc64le-linux.  It also
>> passed normal bootstrap on aarch64-linux but there many testcases failed
>> because the compiler timed out.  The machine is old and slow and might
>> have been oversubscribed so my plan is to try again on gcc185 from
>> cfarm.  If that goes well, I intend to commit the patch and then start
>> working on backports.
>
> I've tried these two patches out on my own 24-core AArch64 machine. 
> Bootstrapped (but no LTO or PGO) and regtested fine.
>

Thank you very much, I have pushed th patches to upstream.

Martin


[PATCH v2] libstdc++: Fix infinite loop in std::istream::ignore(n, delim) [PR93672]

2024-04-08 Thread Jonathan Wakely
Patch v2.

I realised that it's not only negative delim values that cause the
problem, but also ones greater than CHAR_MAX. Calling ignore(n, 'a'+256)
will cause traits_type::find to match 'a' but then the eq_int_type
comparison will fail because (int)'a' != (int)('a' + 256).

This version of the patch calls to_int_type on the delim and if that
alters the value, it's never going to match so skip the loop that tries
to find it and just ignore up to n chars instead.

Tested x86_64linux and aarch64-linux.

-- >8 --

A negative delim value passed to std::istream::ignore can never match
any character in the stream, because the comparison is done using
traits_type::eq_int_type(sb->sgetc(), delim) and sgetc() never returns
negative values (except at EOF). The optimized version of ignore for the
std::istream specialization uses traits_type::find to locate the delim
character in the streambuf, which _can_ match a negative delim on
platforms where char is signed, but then we do another comparison using
eq_int_type which fails. The code then keeps looping forever, with
traits_type::find locating the character and traits_type::eq_int_type
saying it's not a match, so traits_type::find is used again and finds
the same character again.

A possible fix would be to check with eq_int_type after a successful
find, to see whether we really have a match. However, that would be
suboptimal since we know that a negative delimiter will never match
using eq_int_type. So a better fix is to adjust the check at the top of
the function that handles delim==eof(), so that we treat all negative
delim values as equivalent to EOF. That way we don't bother using find
to search for something that will never match with eq_int_type.

The version of ignore in the primary template doesn't need a change,
because it doesn't use traits_type::find, instead characters are
extracted one-by-one and always matched using eq_int_type. That avoids
the inconsistency between find and eq_int_type. The specialization for
std::wistream does use traits_type::find, but traits_type::to_int_type
is equivalent to an implicit conversion from wchar_t to wint_t, so
passing a wchar_t directly to ignore without using to_int_type works.

libstdc++-v3/ChangeLog:

PR libstdc++/93672
* src/c++98/istream.cc (istream::ignore(streamsize, int_type)):
Treat all negative delimiter values as eof().
* testsuite/27_io/basic_istream/ignore/char/93672.cc: New test.
* testsuite/27_io/basic_istream/ignore/wchar_t/93672.cc: New
test.
---
 libstdc++-v3/src/c++98/istream.cc |  13 ++-
 .../27_io/basic_istream/ignore/char/93672.cc  | 101 ++
 .../basic_istream/ignore/wchar_t/93672.cc |  34 ++
 3 files changed, 146 insertions(+), 2 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/93672.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_istream/ignore/wchar_t/93672.cc

diff --git a/libstdc++-v3/src/c++98/istream.cc 
b/libstdc++-v3/src/c++98/istream.cc
index 07ac739c26a..d1bff2b 100644
--- a/libstdc++-v3/src/c++98/istream.cc
+++ b/libstdc++-v3/src/c++98/istream.cc
@@ -112,8 +112,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 basic_istream::
 ignore(streamsize __n, int_type __delim)
 {
-  if (traits_type::eq_int_type(__delim, traits_type::eof()))
-   return ignore(__n);
+  {
+   // If conversion to int_type changes the value then __delim does not
+   // correspond to a value of type char_type, and so will never match
+   // a character extracted from the input sequence. Just use ignore(n).
+   const int_type chk_delim = traits_type::to_int_type(__delim);
+   const bool matchable = traits_type::eq_int_type(chk_delim, __delim);
+   if (__builtin_expect(!matchable, 0))
+ return ignore(__n);
+   // Now we know that __delim is a valid char_type value, so it's safe
+   // for the code below to use traits_type::find to search for it.
+  }
 
   _M_gcount = 0;
   sentry __cerb(*this, true);
diff --git a/libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/93672.cc 
b/libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/93672.cc
new file mode 100644
index 000..96737485b83
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/93672.cc
@@ -0,0 +1,101 @@
+// { dg-do run }
+
+#include 
+#include 
+#include 
+
+void
+test_pr93672() // std::basic_istream::ignore hangs if delim MSB is set
+{
+  std::istringstream in(".\xfc..\xfd...\xfe.");
+
+  // This should find '\xfd' even on platforms where char is signed,
+  // because the delimiter is correctly converted to the stream's int_type.
+  in.ignore(100, std::char_traits::to_int_type('\xfc'));
+  VERIFY( in.gcount() == 2 );
+  VERIFY( ! in.eof() );
+
+  // This should work equivalently to traits_type::to_int_type
+  in.ignore(100, (unsigned char)'\xfd');
+  VERIFY( in.gcount() == 3 );
+  VERIFY( ! in.eof() );
+
+  // This only 

[committed] libstdc++: Use char for _Utf8_view if char8_t isn't available [PR114519]

2024-04-08 Thread Jonathan Wakely
I recently disabled _Utf8_view for -fno-char8_t, but we can just make it
use char instead of char8_t. The existing uses of it in the library are
unaffected.

Tested x86_64-linux and aarch64-linux. Pushed to trunk.

-- >8 --

Instead of just omitting the definition of __unicode::_Utf8_view when
char8_t is disabled, we can make it use char instead.

libstdc++-v3/ChangeLog:

PR libstdc++/114519
* include/bits/unicode.h (_Utf8_view) [!__cpp_char8_t]: Define
using char instead of char8_t.
* testsuite/ext/unicode/view.cc: Use u8""sv literals to create
string views, instead of std::u8string_view.
---
 libstdc++-v3/include/bits/unicode.h| 3 +++
 libstdc++-v3/testsuite/ext/unicode/view.cc | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/unicode.h 
b/libstdc++-v3/include/bits/unicode.h
index 0e95c86a0b0..29813b743dc 100644
--- a/libstdc++-v3/include/bits/unicode.h
+++ b/libstdc++-v3/include/bits/unicode.h
@@ -581,6 +581,9 @@ namespace __unicode
 #ifdef __cpp_char8_t
   template
 using _Utf8_view = _Utf_view;
+#else
+  template
+using _Utf8_view = _Utf_view;
 #endif
   template
 using _Utf16_view = _Utf_view;
diff --git a/libstdc++-v3/testsuite/ext/unicode/view.cc 
b/libstdc++-v3/testsuite/ext/unicode/view.cc
index 79ea2bbc6b7..ee23b0b1d8a 100644
--- a/libstdc++-v3/testsuite/ext/unicode/view.cc
+++ b/libstdc++-v3/testsuite/ext/unicode/view.cc
@@ -10,7 +10,7 @@ using namespace std::string_view_literals;
 constexpr void
 test_utf8_to_utf8()
 {
-  const std::u8string_view s8 = u8"£🇬🇧 €🇪🇺 æбçδé ♠
♥♦♣ 🤡";
+  const auto s8 = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡"sv;
   uc::_Utf8_view v(s8);
   VERIFY( std::ranges::distance(v) == s8.size() );
   VERIFY( std::ranges::equal(v,  s8) );
@@ -19,7 +19,7 @@ test_utf8_to_utf8()
 constexpr void
 test_utf8_to_utf16()
 {
-  const std::u8string_view s8  = u8"£🇬🇧 €🇪🇺 æбçδé ♠
♥♦♣ 🤡";
+  const auto s8  = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡"sv;
   const std::u16string_view s16 = u"£🇬🇧 €🇪🇺 æбçδé ♠
♥♦♣ 🤡";
   uc::_Utf16_view v(s8);
   VERIFY( std::ranges::distance(v) == s16.size() );
-- 
2.44.0



[committed] libstdc++: Fix tests that fail with -fno-char8_t

2024-04-08 Thread Jonathan Wakely
Tested x86_64-linux and aarch64-linux. Pushed to trunk.

-- >8 --

Adjust expected errors or skip tests as UNSUPPORTED if -fno-char8_t is
used in the test flags.

libstdc++-v3/ChangeLog:

* testsuite/20_util/integer_comparisons/equal_neg.cc: Use
no-opts selector for errors that depend on -fchar8_t.
* testsuite/20_util/integer_comparisons/greater_equal_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/greater_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/in_range_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/less_equal_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/less_neg.cc: Likewise.
* testsuite/20_util/integer_comparisons/not_equal_neg.cc:
Likewise.
* testsuite/21_strings/basic_string/hash/hash_char8_t.cc: Skip
if -fno-char8_t is used.
* testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
Likewise.
* testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc:
Likewise.
* testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc:
Likewise.
* testsuite/27_io/filesystem/path/factory/u8path-depr.cc: Use
char for u8 literal if char8_t is not available.
* testsuite/27_io/headers/iosfwd/synopsis.cc: Check
__cpp_char8_t.
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
* testsuite/29_atomics/headers/atomic/types_std_c++20_neg.cc:
Remove check for _GLIBCXX_USE_CHAR8_T.
---
 .../testsuite/20_util/integer_comparisons/equal_neg.cc  | 4 ++--
 .../20_util/integer_comparisons/greater_equal_neg.cc| 4 ++--
 .../testsuite/20_util/integer_comparisons/greater_neg.cc| 4 ++--
 .../testsuite/20_util/integer_comparisons/in_range_neg.cc   | 6 --
 .../testsuite/20_util/integer_comparisons/less_equal_neg.cc | 4 ++--
 .../testsuite/20_util/integer_comparisons/less_neg.cc   | 4 ++--
 .../testsuite/20_util/integer_comparisons/not_equal_neg.cc  | 4 ++--
 .../testsuite/21_strings/basic_string/hash/hash_char8_t.cc  | 1 +
 .../21_strings/headers/cuchar/functions_std_cxx20.cc| 1 +
 .../27_io/basic_ostream/inserters_character/char/deleted.cc | 1 +
 .../basic_ostream/inserters_character/wchar_t/deleted.cc| 1 +
 .../testsuite/27_io/filesystem/path/factory/u8path-depr.cc  | 4 +++-
 libstdc++-v3/testsuite/27_io/headers/iosfwd/synopsis.cc | 2 +-
 .../testsuite/29_atomics/atomic_integral/wait_notify.cc | 2 ++
 .../29_atomics/headers/atomic/types_std_c++20_neg.cc| 2 --
 15 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc 
b/libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc
index 06cfd6ee8f4..7290e9e2417 100644
--- a/libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc
@@ -25,8 +25,8 @@ bool c = std::cmp_equal(2, L'2'); // { dg-error "here" }
 bool d = std::cmp_equal(L'2', 2); // { dg-error "here" }
 bool e = std::cmp_equal(true, 1); // { dg-error "here" }
 bool f = std::cmp_equal(0, false); // { dg-error "here" }
-bool g = std::cmp_equal(97, u8'a'); // { dg-error "here" }
-bool h = std::cmp_equal(u8'a', 97); // { dg-error "here" }
+bool g = std::cmp_equal(97, u8'a'); // { dg-error "here" "" { target { no-opts 
"-fno-char8_t" } } }
+bool h = std::cmp_equal(u8'a', 97); // { dg-error "here" "" { target { no-opts 
"-fno-char8_t" } } }
 bool i = std::cmp_equal(97, u'a'); // { dg-error "here" }
 bool j = std::cmp_equal(u'a', 97); // { dg-error "here" }
 bool k = std::cmp_equal(97, U'a'); // { dg-error "here" }
diff --git 
a/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc 
b/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
index b5f03b70733..cb60bfe9d07 100644
--- a/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
@@ -25,8 +25,8 @@ bool c = std::cmp_greater_equal(2, L'2'); // { dg-error 
"constexpr" }
 bool d = std::cmp_greater_equal(L'2', 2); // { dg-error "constexpr" }
 bool e = std::cmp_greater_equal(true, 1); // { dg-error "constexpr" }
 bool f = std::cmp_greater_equal(0, false); // { dg-error "constexpr" }
-bool g = std::cmp_greater_equal(97, u8'a'); // { dg-error "constexpr" }
-bool h = std::cmp_greater_equal(u8'a', 97); // { dg-error "constexpr" }
+bool g = std::cmp_greater_equal(97, u8'a'); // { dg-error "constexpr" "" { 
target { no-opts "-fno-char8_t" } } }
+bool h = std::cmp_greater_equal(u8'a', 97); // { dg-error "constexpr" "" { 
target { no-opts "-fno-char8_t" } } }
 bool i = std::cmp_greater_equal(97, u'a'); // { dg-error "constexpr" }
 bool j = std::cmp_greater_equal(u'a', 97); // { dg-error "constexpr" }
 bool k = std::cmp_greater_equal(97, U'a'); // { dg-error 

[committed] libstdc++: Combine two std::from_chars tests into one

2024-04-08 Thread Jonathan Wakely
Tested x86_64-linux and aarch64-linux. Pushed to trunk.

-- >8 --

We don't need separate tests for the C++17 and C++20 cases, we can just
have one test that uses __cpp_char8_t to adjust whether it tests char8_t
or not. This means the C++20 one doesn't fail if -fno-char8_t is used.

libstdc++-v3/ChangeLog:

* testsuite/20_util/from_chars/1_neg.cc: Add char8_t cases,
using a struct of that name if -fno-char8_t is active.
* testsuite/20_util/from_chars/1_c++20_neg.cc: Removed.
---
 .../20_util/from_chars/1_c++20_neg.cc | 43 ---
 .../testsuite/20_util/from_chars/1_neg.cc |  7 +++
 2 files changed, 7 insertions(+), 43 deletions(-)
 delete mode 100644 libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc

diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
deleted file mode 100644
index d246eefb469..000
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
+++ /dev/null
@@ -1,43 +0,0 @@
-// Copyright (C) 2017-2024 Free Software Foundation, Inc.
-//
-// This file is part of the GNU ISO C++ Library.  This library is free
-// software; you can redistribute it and/or modify it under the
-// terms of the GNU General Public License as published by the
-// Free Software Foundation; either version 3, or (at your option)
-// any later version.
-
-// This library is distributed in the hope that it will be useful,
-// but WITHOUT ANY WARRANTY; without even the implied warranty of
-// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-// GNU General Public License for more details.
-
-// You should have received a copy of the GNU General Public License along
-// with this library; see the file COPYING3.  If not see
-// .
-
-// { dg-do compile { target c++20 } }
-
-#include 
-
-void
-test01(const char* first, const char* last)
-{
-  wchar_t wc;
-  std::from_chars(first, last, wc); // { dg-error "no matching" }
-  std::from_chars(first, last, wc, 10); // { dg-error "no matching" }
-  char8_t c8;
-  std::from_chars(first, last, c8); // { dg-error "no matching" }
-  std::from_chars(first, last, c8, 10); // { dg-error "no matching" }
-  char16_t c16;
-  std::from_chars(first, last, c16); // { dg-error "no matching" }
-  std::from_chars(first, last, c16, 10); // { dg-error "no matching" }
-  char32_t c32;
-  std::from_chars(first, last, c32); // { dg-error "no matching" }
-  std::from_chars(first, last, c32, 10); // { dg-error "no matching" }
-  enum E { } e;
-  std::from_chars(first, last, e); // { dg-error "no matching" }
-  std::from_chars(first, last, e, 10); // { dg-error "no matching" }
-}
-
-// { dg-prune-output "enable_if" }
-// { dg-prune-output "cannot bind non-const lvalue reference" }
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
index 7538d9448ac..3f4cd59f6fc 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
@@ -36,6 +36,13 @@ test01(const char* first, const char* last)
   enum E { } e;
   std::from_chars(first, last, e); // { dg-error "no matching" }
   std::from_chars(first, last, e, 10); // { dg-error "no matching" }
+
+#ifndef __cpp_char8_t
+  struct char8_t { };
+#endif
+  char8_t c8;
+  std::from_chars(first, last, c8); // { dg-error "no matching" }
+  std::from_chars(first, last, c8, 10); // { dg-error "no matching" }
 }
 
 // { dg-prune-output "enable_if" }
-- 
2.44.0



Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Aldy Hernandez
On Mon, Apr 8, 2024 at 6:29 PM Richard Biener  wrote:
>
>
>
> > Am 08.04.2024 um 18:09 schrieb Aldy Hernandez :
> >
> > On Mon, Apr 8, 2024 at 5:54 PM Jakub Jelinek  wrote:
> >>
> >> On Mon, Apr 08, 2024 at 05:40:23PM +0200, Aldy Hernandez wrote:
> PR middle-end/114604
> * gimple-range.cc (enable_ranger): Initialize the global
> bitmap obstack.
> (disable_ranger): Release it.
>  ---
>  gcc/gimple-range.cc | 4 
>  1 file changed, 4 insertions(+)
> 
>  diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
>  index c16b776c1e3..4d3b1ce8588 100644
>  --- a/gcc/gimple-range.cc
>  +++ b/gcc/gimple-range.cc
>  @@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool 
>  use_imm_uses)
>  {
>    gimple_ranger *r;
> 
>  +  bitmap_obstack_initialize (NULL);
>  +
>    gcc_checking_assert (!fun->x_range_query);
>    r = new gimple_ranger (use_imm_uses);
>    fun->x_range_query = r;
>  @@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
>    gcc_checking_assert (fun->x_range_query);
>    delete fun->x_range_query;
>    fun->x_range_query = NULL;
>  +
>  +  bitmap_obstack_release (NULL);
> >>>
> >>> Are you not allowed to initialize/use obstacks unless
> >>> bitmap_obstack_initialize(NULL) is called?
> >>
> >> You can use it with some other obstack, just not the default one.
> >>
> >>> If so, wouldn't it be
> >>> better to lazily initialize it downstream (bitmap_alloc, or whomever
> >>> needs it initialized)?
> >>
> >> No, you still need to decide where is the safe point to release it.
> >> Unlike the non-default bitmap_obstack_initialize/bitmap_obstack_release,
> >> the default one can nest (has associated nesting counter).  So, the above
> >> patch just says that ranger starts using the default obstack in
> >> enable_ranger and stops using it in disable_ranger and anything ranger
> >> associated in the obstack can be freed at that point.
> >
> > I thought ranger never used the default one:
> >
> > $ grep bitmap_obstack_initialize *value* *range*
> > value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
> > value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
> > gimple-range-cache.cc:  bitmap_obstack_initialize (_bitmaps);
> > gimple-range-gori.cc:  bitmap_obstack_initialize (_bitmaps);
> > gimple-range-infer.cc:  bitmap_obstack_initialize (_bitmaps);
> > gimple-range-phi.cc:  bitmap_obstack_initialize (_bitmaps);
> >
> > or even:
> >
> > $ grep obstack.*NULL *value* *range*
> > value-range-storage.cc:obstack_free (_obstack, NULL);
> > value-relation.cc:  obstack_free (_chain_obstack, NULL);
> > value-relation.cc:  obstack_free (_chain_obstack, NULL);
> > gimple-range-infer.cc:  obstack_free (_list_obstack, NULL);
> > value-range-storage.cc:obstack_free (_obstack, NULL);
> >
> > I'm obviously missing something here.
>
> Look for BITMAP_ALLOC (NULL) in the backtrace in the PR

Ahh!  Thanks.

A few default obstack uses snuck in while I wasn't looking.

$ grep BITMAP_ALLOC.*NULL *range*
gimple-range-cache.cc:  m_propfail = BITMAP_ALLOC (NULL);
gimple-range-cache.h:  inline ssa_lazy_cache () { active_p =
BITMAP_ALLOC (NULL); }
gimple-range.cc:  m_pop_list = BITMAP_ALLOC (NULL);

I wonder if it would be cleaner to just change these to use named obstacks.

Andrew, is there a reason we were using the default obstack for these?
 For reference, they are  class update_list used in the ranger cache,
ssa_lazy_cache, and dom_ranger.

Aldy



[PATCH] build: Check for cargo when building rust language

2024-04-08 Thread pierre-emmanuel . patry
From: Pierre-Emmanuel Patry 

Hello,

The rust frontend requires cargo to build some of it's components,
it's presence was not checked during configuration.

Best regards,
Pierre-Emmanuel

--

Prevent rust language from building when cargo is
missing.

config/ChangeLog:

* acx.m4: Add a macro to check for rust
components.

ChangeLog:

* configure: Regenerate.
* configure.ac: Emit an error message when cargo
is missing.

Signed-off-by: Pierre-Emmanuel Patry 
---
 config/acx.m4 |  11 +
 configure | 117 ++
 configure.ac  |  18 
 3 files changed, 146 insertions(+)

diff --git a/config/acx.m4 b/config/acx.m4
index 7efe98aaf96..3c5fe67342e 100644
--- a/config/acx.m4
+++ b/config/acx.m4
@@ -424,6 +424,17 @@ else
 fi
 ])
 
+# Test for Rust
+# We require cargo and rustc for some parts of the rust compiler.
+AC_DEFUN([ACX_PROG_CARGO],
+[AC_REQUIRE([AC_CHECK_TOOL_PREFIX])
+AC_CHECK_TOOL(CARGO, cargo, no)
+if test "x$CARGO" != xno; then
+  have_cargo=yes
+else
+  have_cargo=no
+fi])
+
 # Test for D.
 AC_DEFUN([ACX_PROG_GDC],
 [AC_REQUIRE([AC_CHECK_TOOL_PREFIX])
diff --git a/configure b/configure
index 874966fb9f0..46e66e20197 100755
--- a/configure
+++ b/configure
@@ -714,6 +714,7 @@ PGO_BUILD_GEN_CFLAGS
 HAVE_CXX11_FOR_BUILD
 HAVE_CXX11
 do_compare
+CARGO
 GDC
 GNATMAKE
 GNATBIND
@@ -5786,6 +5787,104 @@ else
   have_gdc=no
 fi
 
+
+if test -n "$ac_tool_prefix"; then
+  # Extract the first word of "${ac_tool_prefix}cargo", so it can be a program 
name with args.
+set dummy ${ac_tool_prefix}cargo; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_prog_CARGO+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if test -n "$CARGO"; then
+  ac_cv_prog_CARGO="$CARGO" # Let the user override the test.
+else
+as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+ac_cv_prog_CARGO="${ac_tool_prefix}cargo"
+$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" 
>&5
+break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+fi
+fi
+CARGO=$ac_cv_prog_CARGO
+if test -n "$CARGO"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CARGO" >&5
+$as_echo "$CARGO" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+if test -z "$ac_cv_prog_CARGO"; then
+  ac_ct_CARGO=$CARGO
+  # Extract the first word of "cargo", so it can be a program name with args.
+set dummy cargo; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_prog_ac_ct_CARGO+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if test -n "$ac_ct_CARGO"; then
+  ac_cv_prog_ac_ct_CARGO="$ac_ct_CARGO" # Let the user override the test.
+else
+as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+ac_cv_prog_ac_ct_CARGO="cargo"
+$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" 
>&5
+break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+fi
+fi
+ac_ct_CARGO=$ac_cv_prog_ac_ct_CARGO
+if test -n "$ac_ct_CARGO"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_CARGO" >&5
+$as_echo "$ac_ct_CARGO" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+  if test "x$ac_ct_CARGO" = x; then
+CARGO="no"
+  else
+case $cross_compiling:$ac_tool_warned in
+yes:)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not 
prefixed with host triplet" >&5
+$as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" 
>&2;}
+ac_tool_warned=yes ;;
+esac
+CARGO=$ac_ct_CARGO
+  fi
+else
+  CARGO="$ac_cv_prog_CARGO"
+fi
+
+if test "x$CARGO" != xno; then
+  have_cargo=yes
+else
+  have_cargo=no
+fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking how to compare bootstrapped 
objects" >&5
 $as_echo_n "checking how to compare bootstrapped objects... " >&6; }
 if ${gcc_cv_prog_cmp_skip+:} false; then :
@@ -9099,6 +9198,24 @@ $as_echo "$as_me: WARNING: --enable-host-shared required 
to build $language" >&2
   ;;
 esac
 
+# Disable Rust if cargo is unavailable.
+case ${add_this_lang}:${language}:${have_cargo} in
+  yes:rust:no)
+# Specifically requested language; tell them.
+as_fn_error $? "cargo is required to build $language" "$LINENO" 5
+;;
+  all:rust:no)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cargo is 
required to build $language" 

Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Richard Biener



> Am 08.04.2024 um 18:09 schrieb Aldy Hernandez :
> 
> On Mon, Apr 8, 2024 at 5:54 PM Jakub Jelinek  wrote:
>> 
>> On Mon, Apr 08, 2024 at 05:40:23PM +0200, Aldy Hernandez wrote:
PR middle-end/114604
* gimple-range.cc (enable_ranger): Initialize the global
bitmap obstack.
(disable_ranger): Release it.
 ---
 gcc/gimple-range.cc | 4 
 1 file changed, 4 insertions(+)
 
 diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
 index c16b776c1e3..4d3b1ce8588 100644
 --- a/gcc/gimple-range.cc
 +++ b/gcc/gimple-range.cc
 @@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool use_imm_uses)
 {
   gimple_ranger *r;
 
 +  bitmap_obstack_initialize (NULL);
 +
   gcc_checking_assert (!fun->x_range_query);
   r = new gimple_ranger (use_imm_uses);
   fun->x_range_query = r;
 @@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
   gcc_checking_assert (fun->x_range_query);
   delete fun->x_range_query;
   fun->x_range_query = NULL;
 +
 +  bitmap_obstack_release (NULL);
>>> 
>>> Are you not allowed to initialize/use obstacks unless
>>> bitmap_obstack_initialize(NULL) is called?
>> 
>> You can use it with some other obstack, just not the default one.
>> 
>>> If so, wouldn't it be
>>> better to lazily initialize it downstream (bitmap_alloc, or whomever
>>> needs it initialized)?
>> 
>> No, you still need to decide where is the safe point to release it.
>> Unlike the non-default bitmap_obstack_initialize/bitmap_obstack_release,
>> the default one can nest (has associated nesting counter).  So, the above
>> patch just says that ranger starts using the default obstack in
>> enable_ranger and stops using it in disable_ranger and anything ranger
>> associated in the obstack can be freed at that point.
> 
> I thought ranger never used the default one:
> 
> $ grep bitmap_obstack_initialize *value* *range*
> value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
> value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
> gimple-range-cache.cc:  bitmap_obstack_initialize (_bitmaps);
> gimple-range-gori.cc:  bitmap_obstack_initialize (_bitmaps);
> gimple-range-infer.cc:  bitmap_obstack_initialize (_bitmaps);
> gimple-range-phi.cc:  bitmap_obstack_initialize (_bitmaps);
> 
> or even:
> 
> $ grep obstack.*NULL *value* *range*
> value-range-storage.cc:obstack_free (_obstack, NULL);
> value-relation.cc:  obstack_free (_chain_obstack, NULL);
> value-relation.cc:  obstack_free (_chain_obstack, NULL);
> gimple-range-infer.cc:  obstack_free (_list_obstack, NULL);
> value-range-storage.cc:obstack_free (_obstack, NULL);
> 
> I'm obviously missing something here.

Look for BITMAP_ALLOC (NULL) in the backtrace in the PR

Richard 


> Aldy
> 


Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Aldy Hernandez
On Mon, Apr 8, 2024 at 5:54 PM Jakub Jelinek  wrote:
>
> On Mon, Apr 08, 2024 at 05:40:23PM +0200, Aldy Hernandez wrote:
> > > PR middle-end/114604
> > > * gimple-range.cc (enable_ranger): Initialize the global
> > > bitmap obstack.
> > > (disable_ranger): Release it.
> > > ---
> > >  gcc/gimple-range.cc | 4 
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
> > > index c16b776c1e3..4d3b1ce8588 100644
> > > --- a/gcc/gimple-range.cc
> > > +++ b/gcc/gimple-range.cc
> > > @@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool 
> > > use_imm_uses)
> > >  {
> > >gimple_ranger *r;
> > >
> > > +  bitmap_obstack_initialize (NULL);
> > > +
> > >gcc_checking_assert (!fun->x_range_query);
> > >r = new gimple_ranger (use_imm_uses);
> > >fun->x_range_query = r;
> > > @@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
> > >gcc_checking_assert (fun->x_range_query);
> > >delete fun->x_range_query;
> > >fun->x_range_query = NULL;
> > > +
> > > +  bitmap_obstack_release (NULL);
> >
> > Are you not allowed to initialize/use obstacks unless
> > bitmap_obstack_initialize(NULL) is called?
>
> You can use it with some other obstack, just not the default one.
>
> > If so, wouldn't it be
> > better to lazily initialize it downstream (bitmap_alloc, or whomever
> > needs it initialized)?
>
> No, you still need to decide where is the safe point to release it.
> Unlike the non-default bitmap_obstack_initialize/bitmap_obstack_release,
> the default one can nest (has associated nesting counter).  So, the above
> patch just says that ranger starts using the default obstack in
> enable_ranger and stops using it in disable_ranger and anything ranger
> associated in the obstack can be freed at that point.

I thought ranger never used the default one:

$ grep bitmap_obstack_initialize *value* *range*
value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
value-relation.cc:  bitmap_obstack_initialize (_bitmaps);
gimple-range-cache.cc:  bitmap_obstack_initialize (_bitmaps);
gimple-range-gori.cc:  bitmap_obstack_initialize (_bitmaps);
gimple-range-infer.cc:  bitmap_obstack_initialize (_bitmaps);
gimple-range-phi.cc:  bitmap_obstack_initialize (_bitmaps);

or even:

$ grep obstack.*NULL *value* *range*
value-range-storage.cc:obstack_free (_obstack, NULL);
value-relation.cc:  obstack_free (_chain_obstack, NULL);
value-relation.cc:  obstack_free (_chain_obstack, NULL);
gimple-range-infer.cc:  obstack_free (_list_obstack, NULL);
value-range-storage.cc:obstack_free (_obstack, NULL);

I'm obviously missing something here.

Aldy



[pushed] aarch64: Fix expansion of svsudot [PR114607]

2024-04-08 Thread Richard Sandiford
Not sure how this happend, but: svsudot is supposed to be expanded
as USDOT with the operands swapped.  However, a thinko in the
expansion of svsudot meant that the arguments weren't in fact
swapped; the attempted swap was just a no-op.  And the testcases
blithely accepted that.

Tested on aarch64-linux-gnu and pushed to trunk.  I'll backport
in a few weeks if there's no fallout.

Richard


gcc/
PR target/114607
* config/aarch64/aarch64-sve-builtins-base.cc
(svusdot_impl::expand): Fix botched attempt to swap the operands
for svsudot.

gcc/testsuite/
PR target/114607
* gcc.target/aarch64/sve/acle/asm/sudot_s32.c: New test.
---
 gcc/config/aarch64/aarch64-sve-builtins-base.cc   | 2 +-
 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c | 8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 5be2315a3c6..0d2edf3f19e 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -2809,7 +2809,7 @@ public:
version) is through the USDOT instruction but with the second and third
inputs swapped.  */
 if (m_su)
-  e.rotate_inputs_left (1, 2);
+  e.rotate_inputs_left (1, 3);
 /* The ACLE function has the same order requirements as for svdot.
While there's no requirement for the RTL pattern to have the same sort
of order as that for dot_prod, it's easier to read.
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c
index 4b452619eee..e06b69affab 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c
@@ -6,7 +6,7 @@
 
 /*
 ** sudot_s32_tied1:
-** usdot   z0\.s, z2\.b, z4\.b
+** usdot   z0\.s, z4\.b, z2\.b
 ** ret
 */
 TEST_TRIPLE_Z (sudot_s32_tied1, svint32_t, svint8_t, svuint8_t,
@@ -17,7 +17,7 @@ TEST_TRIPLE_Z (sudot_s32_tied1, svint32_t, svint8_t, 
svuint8_t,
 ** sudot_s32_tied2:
 ** mov (z[0-9]+)\.d, z0\.d
 ** movprfx z0, z4
-** usdot   z0\.s, z2\.b, \1\.b
+** usdot   z0\.s, \1\.b, z2\.b
 ** ret
 */
 TEST_TRIPLE_Z_REV (sudot_s32_tied2, svint32_t, svint8_t, svuint8_t,
@@ -27,7 +27,7 @@ TEST_TRIPLE_Z_REV (sudot_s32_tied2, svint32_t, svint8_t, 
svuint8_t,
 /*
 ** sudot_w0_s32_tied:
 ** mov (z[0-9]+\.b), w0
-** usdot   z0\.s, z2\.b, \1
+** usdot   z0\.s, \1, z2\.b
 ** ret
 */
 TEST_TRIPLE_ZX (sudot_w0_s32_tied, svint32_t, svint8_t, uint8_t,
@@ -37,7 +37,7 @@ TEST_TRIPLE_ZX (sudot_w0_s32_tied, svint32_t, svint8_t, 
uint8_t,
 /*
 ** sudot_9_s32_tied:
 ** mov (z[0-9]+\.b), #9
-** usdot   z0\.s, z2\.b, \1
+** usdot   z0\.s, \1, z2\.b
 ** ret
 */
 TEST_TRIPLE_Z (sudot_9_s32_tied, svint32_t, svint8_t, uint8_t,
-- 
2.25.1



Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Jakub Jelinek
On Mon, Apr 08, 2024 at 05:40:23PM +0200, Aldy Hernandez wrote:
> > PR middle-end/114604
> > * gimple-range.cc (enable_ranger): Initialize the global
> > bitmap obstack.
> > (disable_ranger): Release it.
> > ---
> >  gcc/gimple-range.cc | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
> > index c16b776c1e3..4d3b1ce8588 100644
> > --- a/gcc/gimple-range.cc
> > +++ b/gcc/gimple-range.cc
> > @@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool use_imm_uses)
> >  {
> >gimple_ranger *r;
> >
> > +  bitmap_obstack_initialize (NULL);
> > +
> >gcc_checking_assert (!fun->x_range_query);
> >r = new gimple_ranger (use_imm_uses);
> >fun->x_range_query = r;
> > @@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
> >gcc_checking_assert (fun->x_range_query);
> >delete fun->x_range_query;
> >fun->x_range_query = NULL;
> > +
> > +  bitmap_obstack_release (NULL);
> 
> Are you not allowed to initialize/use obstacks unless
> bitmap_obstack_initialize(NULL) is called?

You can use it with some other obstack, just not the default one.

> If so, wouldn't it be
> better to lazily initialize it downstream (bitmap_alloc, or whomever
> needs it initialized)?

No, you still need to decide where is the safe point to release it.
Unlike the non-default bitmap_obstack_initialize/bitmap_obstack_release,
the default one can nest (has associated nesting counter).  So, the above
patch just says that ranger starts using the default obstack in
enable_ranger and stops using it in disable_ranger and anything ranger
associated in the obstack can be freed at that point.
If one uses this from a pass which has its own
bitmap_obstack_{initializer,release} (NULL) around this, then the above will
not do anything, just bump the nesting counter and decrease it later.
But if some pass doesn't, i.e. the pass itself doesn't use the default
obstack, only ranger does, then the above seems like the right change to me.

> Unless I'm misunderstanding something, it
> doesn't seem like ranger is the correct place to fix this.

Jakub



[PATCH] x86: Define macros for APX options

2024-04-08 Thread H.J. Lu
Define following macros for APX options:

1. __APX_EGPR__: -mapx-features=egpr.
2. __APX_PUSH2POP2__: -mapx-features=push2pop2.
3. __APX_NDD__: -mapx-features=ndd.
4. __APX_PPX__: -mapx-features=ppx.
5. __APX_INLINE_ASM_USE_GPR32__: -mapx-inline-asm-use-gpr32.

They can be used to make assembly codes compatible with APX options.
Some use cases are:

1. When __APX_PUSH2POP2__ is defined, assembly codes should always align
the outgoing stack to 16 bytes.
2. When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements
should contain only instructions compatible with r16-r31.

gcc/

PR target/114587
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__APX_XXX__ for APX options.

gcc/testsuite/

PR target/114587
* gcc.target/i386/apx-3a.c: New test.
* gcc.target/i386/apx-3b.c: Likewise.
* gcc.target/i386/apx-3c.c: Likewise.
* gcc.target/i386/apx-3d.c: Likewise.
* gcc.target/i386/apx-3e.c: Likewise.
* gcc.target/i386/apx-4.c: Likewise.
---
 gcc/config/i386/i386-c.cc  | 10 ++
 gcc/testsuite/gcc.target/i386/apx-3a.c |  6 ++
 gcc/testsuite/gcc.target/i386/apx-3b.c |  6 ++
 gcc/testsuite/gcc.target/i386/apx-3c.c |  6 ++
 gcc/testsuite/gcc.target/i386/apx-3d.c |  6 ++
 gcc/testsuite/gcc.target/i386/apx-3e.c | 18 ++
 gcc/testsuite/gcc.target/i386/apx-4.c  |  6 ++
 7 files changed, 58 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-3d.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-3e.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-4.c

diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
index 226d277676c..b8cfba90fdc 100644
--- a/gcc/config/i386/i386-c.cc
+++ b/gcc/config/i386/i386-c.cc
@@ -751,6 +751,16 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 def_or_undef (parse_in, "__AVX10_1_512__");
   if (isa_flag2 & OPTION_MASK_ISA2_APX_F)
 def_or_undef (parse_in, "__APX_F__");
+  if (TARGET_APX_EGPR)
+def_or_undef (parse_in, "__APX_EGPR__");
+  if (TARGET_APX_PUSH2POP2)
+def_or_undef (parse_in, "__APX_PUSH2POP2__");
+  if (TARGET_APX_NDD)
+def_or_undef (parse_in, "__APX_NDD__");
+  if (TARGET_APX_PPX)
+def_or_undef (parse_in, "__APX_PPX__");
+  if (ix86_apx_inline_asm_use_gpr32)
+def_or_undef (parse_in, "__APX_INLINE_ASM_USE_GPR32__");
   if (TARGET_IAMCU)
 {
   def_or_undef (parse_in, "__iamcu");
diff --git a/gcc/testsuite/gcc.target/i386/apx-3a.c 
b/gcc/testsuite/gcc.target/i386/apx-3a.c
new file mode 100644
index 000..86d3ef2061d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-3a.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mapx-features=egpr" } */
+
+#ifndef __APX_EGPR__
+# error __APX_EGPR__ not defined
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/apx-3b.c 
b/gcc/testsuite/gcc.target/i386/apx-3b.c
new file mode 100644
index 000..611727a389a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-3b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mapx-features=push2pop2" } */
+
+#ifndef __APX_PUSH2POP2__
+# error __APX_PUSH2POP2__ not defined
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/apx-3c.c 
b/gcc/testsuite/gcc.target/i386/apx-3c.c
new file mode 100644
index 000..52655b6cfa5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-3c.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mapx-features=ndd" } */
+
+#ifndef __APX_NDD__
+# error __APX_NDD__ not defined
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/apx-3d.c 
b/gcc/testsuite/gcc.target/i386/apx-3d.c
new file mode 100644
index 000..9b91af1d377
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-3d.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mapx-features=ppx" } */
+
+#ifndef __APX_PPX__
+# error __APX_PPX__ not defined
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/apx-3e.c 
b/gcc/testsuite/gcc.target/i386/apx-3e.c
new file mode 100644
index 000..7278428e5c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-3e.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mapx-features=egpr,push2pop2,ndd,ppx" } */
+
+#ifndef __APX_EGPR__
+# error __APX_EGPR__ not defined
+#endif
+
+#ifndef __APX_PUSH2POP2__
+# error __APX_PUSH2POP2__ not defined
+#endif
+
+#ifndef __APX_NDD__
+# error __APX_NDD__ not defined
+#endif
+
+#ifndef __APX_PPX__
+# error __APX_PPX__ not defined
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/apx-4.c 
b/gcc/testsuite/gcc.target/i386/apx-4.c
new file mode 100644
index 000..1ba4ac036fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-4.c
@@ -0,0 +1,6 @@
+/* { 

Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Aldy Hernandez
On Mon, Apr 8, 2024 at 11:50 AM Richard Biener  wrote:
>
> The following fixes ranger bitmap allocation when invoked from IPA
> context where the global bitmap obstack possibly isn't initialized.
> Instead of trying to use one of the ranger obstacks the following
> simply initializes the global bitmap obstack around an active ranger.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu with bitmap_alloc
> instrumentation (all ICEs gone, so ranger was the only offender).
>
> OK?
>
> Thanks,
> Richard.
>
> PR middle-end/114604
> * gimple-range.cc (enable_ranger): Initialize the global
> bitmap obstack.
> (disable_ranger): Release it.
> ---
>  gcc/gimple-range.cc | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
> index c16b776c1e3..4d3b1ce8588 100644
> --- a/gcc/gimple-range.cc
> +++ b/gcc/gimple-range.cc
> @@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool use_imm_uses)
>  {
>gimple_ranger *r;
>
> +  bitmap_obstack_initialize (NULL);
> +
>gcc_checking_assert (!fun->x_range_query);
>r = new gimple_ranger (use_imm_uses);
>fun->x_range_query = r;
> @@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
>gcc_checking_assert (fun->x_range_query);
>delete fun->x_range_query;
>fun->x_range_query = NULL;
> +
> +  bitmap_obstack_release (NULL);

Are you not allowed to initialize/use obstacks unless
bitmap_obstack_initialize(NULL) is called?  If so, wouldn't it be
better to lazily initialize it downstream (bitmap_alloc, or whomever
needs it initialized)?  Unless I'm misunderstanding something, it
doesn't seem like ranger is the correct place to fix this.

Aldy



Re: [PATCH v5] RISC-V: Implement TLS Descriptors.

2024-04-08 Thread Kito Cheng
Committed to trunk, thanks Tatsuyuki!

On Fri, Mar 29, 2024 at 2:32 PM Kito Cheng  wrote:
>
> Hi Tatsuyuki:
>
> Thanks for your hard work and keep updating, the patch set is LGTM, I
> plan to commit this next week if no further comments :)
>
> Hi MaskRay:
>
> Thanks for your review on the patchset! just put you into the cc list
> in case you have few more comments :)
>
>
> On Fri, Mar 29, 2024 at 1:53 PM Tatsuyuki Ishi  
> wrote:
> >
> > This implements TLS Descriptors (TLSDESC) as specified in [1].
> >
> > The 4-instruction sequence is implemented as a single RTX insn for
> > simplicity, but this can be revisited later if instruction scheduling or
> > more flexible RA is desired.
> >
> > The default remains to be the traditional TLS model, but can be configured
> > with --with-tls={trad,desc}. The choice can be revisited once toolchain
> > and libc support ships.
> >
> > [1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373.
> >
> > gcc/Changelog:
> > * config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
> > * config.gcc: Add --with-tls configuration option to change the
> > default TLS flavor.
> > * config/riscv/riscv.h: Add TARGET_TLSDESC determined from
> > -mtls-dialect and with_tls defaults.
> > * config/riscv/riscv-opts.h: Define enum riscv_tls_type for the
> > two TLS flavors.
> > * config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
> > * config/riscv/riscv.md: Add instruction sequence for TLSDESC.
> > * config/riscv/riscv.cc (riscv_symbol_insns): Add instruction
> > sequence length data for TLSDESC.
> > (riscv_legitimize_tls_address): Add lowering of TLSDESC.
> > * doc/install.texi: Document --with-tls for RISC-V.
> > * doc/invoke.texi: Document -mtls-dialect for RISC-V.
> > * testsuite/gcc.target/riscv/tls_1.x: Add TLSDESC GD test case.
> > * testsuite/gcc.target/riscv/tlsdesc.c: Same as above.
> > ---
> > No regression in gcc tests for rv32gcv and rv64gcv, tested alongside
> > the binutils and glibc implementation. Tested with --with-tls=desc.
> >
> > v2: Add with_tls configuration option, and a few readability improvements.
> > Added Changelog.
> > v3: Add documentation per Kito's suggestion.
> > Fix minor issues pointed out by Kito and Jeff.
> > Thanks Kito Cheng and Jeff Law for review.
> > v4: Add TLSDESC GD assembly test.
> > Rebase on top of trunk.
> > v5: Trivial rebase on top of trunk.
> >
> > I have recently addressed relaxation concerns on binutils and RVV
> > register save/restore on glibc, so I'm sending out a trivial rebase
> > with the hope that the full set can be merged soon.
> >
> >  gcc/config.gcc   | 15 ++-
> >  gcc/config/riscv/riscv-opts.h|  6 ++
> >  gcc/config/riscv/riscv-protos.h  |  5 +++--
> >  gcc/config/riscv/riscv.cc| 24 
> >  gcc/config/riscv/riscv.h |  9 +++--
> >  gcc/config/riscv/riscv.md| 20 +++-
> >  gcc/config/riscv/riscv.opt   | 14 ++
> >  gcc/doc/install.texi |  3 +++
> >  gcc/doc/invoke.texi  | 13 -
> >  gcc/testsuite/gcc.target/riscv/tls_1.x   |  5 +
> >  gcc/testsuite/gcc.target/riscv/tlsdesc.c | 12 
> >  11 files changed, 115 insertions(+), 11 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/tls_1.x
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/tlsdesc.c
> >
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 17873ac2103..1a5870672d2 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -2492,6 +2492,7 @@ riscv*-*-linux*)
> > # Force .init_array support.  The configure script cannot always
> > # automatically detect that GAS supports it, yet we require it.
> > gcc_cv_initfini_array=yes
> > +   with_tls=${with_tls:-trad}
> > ;;
> >  riscv*-*-elf* | riscv*-*-rtems*)
> > tm_file="elfos.h newlib-stdint.h ${tm_file} riscv/elf.h"
> > @@ -2534,6 +2535,7 @@ riscv*-*-freebsd*)
> > # Force .init_array support.  The configure script cannot always
> > # automatically detect that GAS supports it, yet we require it.
> > gcc_cv_initfini_array=yes
> > +   with_tls=${with_tls:-trad}
> > ;;
> >
> >  loongarch*-*-linux*)
> > @@ -4671,7 +4673,7 @@ case "${target}" in
> > ;;
> >
> > riscv*-*-*)
> > -   supported_defaults="abi arch tune riscv_attribute isa_spec"
> > +   supported_defaults="abi arch tune riscv_attribute isa_spec 
> > tls"
> >
> > case "${target}" in
> > riscv-* | riscv32*) xlen=32 ;;
> > @@ -4801,6 +4803,17 @@ case "${target}" in
> > ;;
> > esac
> > fi
> > +  

[PATCH 0/2] More condition coverage fixes

2024-04-08 Thread Jørgen Kvalsvik
Here are two more patches for the condition coverage.

Inlined conditions are actually counted this time, as the recorded
expression mapping uses the new, deep-copied gconds as keys and not the
pointers from the source function.

The reported UB is gone when the number of conditions are exactly 64
(sizeof uint64_t), by slightly restructuring the loop generating
constants.

This passes check-gcc RUNTESTFLAGS=gcov.exp in my ubsan instrumented
build:

configure --enable-languages=c,c++ --enable-host-shared
--enable-checking=release --disable-multilib
--with-build-config=bootstrap-ubsan

Jørgen Kvalsvik (2):
  Add tree-inlined gconds to caller cond->expr map
  Generate constant at start of loop, without UB

 gcc/testsuite/gcc.misc-tests/gcov-19.c | 37 ++
 gcc/tree-inline.cc | 43 ++
 gcc/tree-profile.cc| 20 +---
 3 files changed, 76 insertions(+), 24 deletions(-)

-- 
2.30.2



[PATCH 2/2] Generate constant at start of loop, without UB

2024-04-08 Thread Jørgen Kvalsvik
Generating the constants used for recording the edges taken for
condition coverage would trigger undefined behavior when an expression
had exactly 64 (== sizeof (1ULL)) conditions, as it would generate the
constant for the next iteration at the end of the loop body, even if there
was never a next iteration. By moving the check and constant generation
to the top of the loop and hoisting the increment flag there is no
opportunity for UB.

PR 114627

gcc/ChangeLog:

* tree-profile.cc (instrument_decisions): Generate constant
at the start of loop.
---
 gcc/tree-profile.cc | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 33ff550a7bc..e58f5c83472 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -1049,6 +1049,7 @@ instrument_decisions (array_slice expr, 
size_t condno,
 zerocounter[2] = zero;
 
 unsigned xi = 0;
+bool increment = false;
 tree rhs = build_int_cst (gcov_type_node, 1ULL << xi);
 for (basic_block current : expr)
 {
@@ -1057,7 +1058,14 @@ instrument_decisions (array_slice expr, 
size_t condno,
candidates.safe_push (zerocounter);
counters prev = resolve_counters (candidates);
 
-   int increment = 0;
+   if (increment)
+   {
+   xi += 1;
+   gcc_checking_assert (xi < sizeof (uint64_t) * BITS_PER_UNIT);
+   rhs = build_int_cst (gcov_type_node, 1ULL << xi);
+   increment = false;
+   }
+
for (edge e : current->succs)
{
counters next = prev;
@@ -1072,7 +1080,7 @@ instrument_decisions (array_slice expr, 
size_t condno,
tree m = build_int_cst (gcov_type_node, masks[2*xi + k]);
next[2] = emit_bitwise_op (e, prev[2], BIT_IOR_EXPR, m);
}
-   increment = 1;
+   increment = true;
}
else if (e->flags & EDGE_COMPLEX)
{
@@ -1085,11 +1093,13 @@ instrument_decisions (array_slice expr, 
size_t condno,
}
table.get_or_insert (e->dest).safe_push (next);
}
-   xi += increment;
-   if (increment)
-   rhs = build_int_cst (gcov_type_node, 1ULL << xi);
 }
 
+/* Since this is also the return value, the number of conditions, make sure
+   to include the increment of the last basic block.  */
+if (increment)
+   xi += 1;
+
 gcc_assert (xi == bitmap_count_bits (core));
 
 const tree relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
-- 
2.30.2



[PATCH 1/2] Add tree-inlined gconds to caller cond->expr map

2024-04-08 Thread Jørgen Kvalsvik
Properly add the condition -> expression mapping of inlined gconds from
the caller into the callee map. This is a fix for PR114599 that works
beyond fixing the segfault, as the previous fixed copied references to
the source gconds, not the deep copied ones that end up in the calle
body.

The new tests checks this, both in the case of a calle without
conditions (which triggered the segfault), and a test that shows that
conditions are properly mapped, and not mixed.

PR middle-end/114599

gcc/ChangeLog:

* tree-inline.cc (copy_bb): Copy cond_uids into callee.
(prepend_lexical_block): Remove outdated comment.
(add_local_variables): Remove bad cond_uids copy.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-19.c: New test.
---
 gcc/testsuite/gcc.misc-tests/gcov-19.c | 37 ++
 gcc/tree-inline.cc | 43 ++
 2 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-19.c 
b/gcc/testsuite/gcc.misc-tests/gcov-19.c
index b83a38531ba..78769fa57b8 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-19.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-19.c
@@ -1468,6 +1468,40 @@ mcdc026e (int a, int b, int c, int d, int e)
 return x + y;
 }
 
+__attribute__((always_inline))
+inline int
+mcdc027_inlined (int a)
+{
+if (a)
+   x = a;
+else
+   x = 1;
+
+return a + 1;
+}
+
+/* Check that conditions in a function inlined into a function without
+   conditionals works.  */
+void
+mcdc027a (int a) /* conditions(1/2) false(0) */
+/* conditions(end) */
+{
+mcdc027_inlined (a);
+}
+
+/* Check that conditions in a function inlined into a function with
+   conditionals works and that the conditions are not mixed up.  */
+void
+mcdc027b (int a) /* conditions(1/2) false(0) */
+/* conditions(end) */
+{
+int v = mcdc027_inlined (a);
+
+if (v > 10) /* conditions(1/2) true(0) */
+   /* condiitions(end) */
+   x = 10;
+}
+
 int main ()
 {
 mcdc001a (0, 1);
@@ -1721,6 +1755,9 @@ int main ()
 mcdc026e (1, 1, 1, 0, 1);
 mcdc026e (1, 1, 0, 0, 1);
 mcdc026e (1, 1, 1, 1, 1);
+
+mcdc027a (1);
+mcdc027b (1);
 }
 
 /* { dg-final { run-gcov conditions { --conditions gcov-19.c } } } */
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index b18917707cc..5f852885e7f 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -2049,6 +2049,17 @@ copy_bb (copy_body_data *id, basic_block bb,
 
   copy_gsi = gsi_start_bb (copy_basic_block);
 
+  unsigned min_cond_uid = 0;
+  if (id->src_cfun->cond_uids)
+{
+  if (!cfun->cond_uids)
+   cfun->cond_uids = new hash_map  ();
+
+  for (auto itr : *id->src_cfun->cond_uids)
+   if (itr.second >= min_cond_uid)
+ min_cond_uid = itr.second + 1;
+}
+
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
 {
   gimple_seq stmts;
@@ -2076,6 +2087,18 @@ copy_bb (copy_body_data *id, basic_block bb,
  if (gimple_nop_p (stmt))
  continue;
 
+ /* If -fcondition-coverage is used, register the inlined conditions
+in the cond->expression mapping of the caller.  The expression tag
+is shifted conditions from the two bodies are not mixed.  */
+ if (id->src_cfun->cond_uids && is_a  (orig_stmt))
+   {
+ gcond *orig_cond = as_a  (orig_stmt);
+ gcond *cond = as_a  (stmt);
+ unsigned *v = id->src_cfun->cond_uids->get (orig_cond);
+ if (v)
+   cfun->cond_uids->put (cond, *v + min_cond_uid);
+   }
+
  gimple_duplicate_stmt_histograms (cfun, stmt, id->src_cfun,
orig_stmt);
 
@@ -4659,8 +4682,7 @@ prepend_lexical_block (tree current_block, tree new_block)
   BLOCK_SUPERCONTEXT (new_block) = current_block;
 }
 
-/* Add local variables from CALLEE to CALLER.  If set for condition coverage,
-   copy basic condition -> expression mapping to CALLER.  */
+/* Add local variables from CALLEE to CALLER.  */
 
 static inline void
 add_local_variables (struct function *callee, struct function *caller,
@@ -4690,23 +4712,6 @@ add_local_variables (struct function *callee, struct 
function *caller,
  }
add_local_decl (caller, new_var);
   }
-
-  /* If -fcondition-coverage is used and the caller has conditions, copy the
- mapping into the caller but and the end so the caller and callee
- expressions aren't mixed.  */
-  if (callee->cond_uids)
-{
-  if (!caller->cond_uids)
-   caller->cond_uids = new hash_map  ();
-
-  unsigned dst_max_uid = 0;
-  for (auto itr : *callee->cond_uids)
-   if (itr.second >= dst_max_uid)
- dst_max_uid = itr.second + 1;
-
-  for (auto itr : *callee->cond_uids)
-   caller->cond_uids->put (itr.first, itr.second + dst_max_uid);
-}
 }
 
 /* Add to BINDINGS a debug stmt resetting SRCVAR if 

[committed] Restore daily bumps and ChangeLog regeneration

2024-04-08 Thread Jakub Jelinek
Hi!

The
commit r14-9805-g8057f9aa1f7e70490064de796d7a8d42d446caf8   

  
Author: Martin Uecker 

  
Date:   Fri Apr 5 12:14:56 2024 +0200   

  


  
Revert "Fix ICE with -g and -std=c23 related to incomplete types 
[PR114361]" 
 


  
This reverts commit 871bb5ad2dd56343d80b6a6d269e85efdce5  because it

  
breaks LTO and needs a bit more work. See PR 114574.

  
commit broke ChangeLog regeneration.  The
"This reverts commit ."
line is something that is parsed by the ChangeLog updates script, so can't
be ammended like the above.

I've committed the following patch as obvious to unbreak it, then ran
update_git_version by hand on sourceware and then committed the attached
patch to add the ChangeLog entries there by hand.

2024-04-08  Jakub Jelinek  

* gcc-changelog/git_update_version.py: Add
8057f9aa1f7e70490064de796d7a8d42d446caf8 to IGNORED_COMMITS.

--- contrib/gcc-changelog/git_update_version.py.jj  2024-01-05 
15:22:21.721686738 +0100
+++ contrib/gcc-changelog/git_update_version.py 2024-04-08 14:10:56.096421526 
+0200
@@ -38,7 +38,8 @@ IGNORED_COMMITS = (
 '86d8e0c0652ef5236a460b75c25e4f7093cc0651',
 'e4cba49413ca429dc82f6aa2e88129ecb3fdd943',
 '1957bedf29a1b2cc231972aba680fe80199d5498',
-'040e5b0edbca861196d9e2ea2af5e805769c8d5d')
+'040e5b0edbca861196d9e2ea2af5e805769c8d5d',
+'8057f9aa1f7e70490064de796d7a8d42d446caf8')
 
 FORMAT = '%(asctime)s:%(levelname)s:%(name)s:%(message)s'
 logging.basicConfig(level=logging.INFO, format=FORMAT,

Jakub
--- gcc/c/ChangeLog
+++ gcc/c/ChangeLog
@@ -1,3 +1,12 @@
+2024-04-05  Martin Uecker  
+
+   Revert:
+   2024-04-02  Martin Uecker  
+
+   PR c/114361
+   * c-decl.cc (finish_struct): Set TYPE_CANONICAL when completing
+   strucute types.
+
 2024-04-02  Martin Uecker  
 
PR c/114361
--- gcc/testsuite/ChangeLog
+++ gcc/testsuite/ChangeLog
@@ -138,6 +138,16 @@
PR tree-optimization/114566
* gcc.target/i386/avx512f-pr114566.c: New test.
 
+2024-04-05  Martin Uecker  
+
+   Revert:
+   2024-04-02  Martin Uecker  
+
+   PR c/114361
+   * gcc.dg/pr114361.c: New test.
+   * gcc.dg/c23-tag-incomplete-1.c: New test.
+   * gcc.dg/c23-tag-incomplete-2.c: New test.
+
 2024-04-05  Jakub Jelinek  
 
* gdc.dg/dg.exp: Prune gcov*.d from the list of tests to run.


Re: [PATCH][wwwdocs] Add NEON-SVE bridge intrinsics to changes.html

2024-04-08 Thread Richard Sandiford
Richard Ball  writes:
> Hi all,
>
> Adding the NEON-SVE bridge intrinsics that were missed
> in the last patch.
>
> Thanks,
> Richard

OK, thanks.

Richard

> diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
> index 
> 9fd224c1df3f05eadcedaaa41c0859e712b93b78..df63af48298564de9c35bab1dd35891c2581e3d6
>  100644
> --- a/htdocs/gcc-14/changes.html
> +++ b/htdocs/gcc-14/changes.html
> @@ -420,6 +420,12 @@ a work-in-progress.
>-march=armv8.2-a or higher to be specified.  Likewise, the
>intrinsics enabled by +memtag no longer require
>-march=armv8.5-a.
> +  Support for the
> +   href="https://github.com/ARM-software/acle/blob/main/main/acle.md#neon-sve-bridge;>
> +  NEON-SVE Bridge intrinsics.
> +  These are intrinsics that allow conversions between NEON and SVE 
> vectors,
> +  enabled through the inclusion of the 
> arm_neon_sve_bridge.h header.
> +  
>  
>The option -mtp= is now supported for changing the TPIDR
> register used for TLS accesses.  For more details please refer to the


Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Peter Bergner
On 4/8/24 3:55 AM, Kewen.Lin wrote:
> on 2024/4/6 06:28, Peter Bergner wrote:
>> +mno-direct-move
>> +Target Undocumented WarnRemoved
>> +
>>  mdirect-move
>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
>> +Target Undocumented WarnRemoved
> 
> When reviewing my previous patch to "neuter option -mpower{8,9}-vector",
> Segher mentioned that we don't need to keep such option warning all the
> time and can drop it like in a release later as users should be aware of
> this information then, I agreed and considering that patch disabling
> -m[no-]direct-move was r8-7845-g57f108f5a1e1b2, I think we can just remove
> m[no-]direct-move here?  What do you think?


I'm fine with that if that is what we want.  So something like the following?

+;; This option existed in the past, but now is always silently ignored.
mdirect-move
-Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
+Target Undocumented Ignore


The above seems to silently ignore both -mdirect-move and -mno-direct-move
which I think is what we want.  That said, it's not what we've done with
other options, but maybe those just need to be changed too?


Peter


;; This option existed in the past, but now is always off.
mno-mfpgpr
Target RejectNegative Undocumented Ignore

mmfpgpr
Target RejectNegative Undocumented WarnRemoved

[snip other examples]


[PATCH][wwwdocs] Add NEON-SVE bridge intrinsics to changes.html

2024-04-08 Thread Richard Ball
Hi all,

Adding the NEON-SVE bridge intrinsics that were missed
in the last patch.

Thanks,
Richarddiff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 
9fd224c1df3f05eadcedaaa41c0859e712b93b78..df63af48298564de9c35bab1dd35891c2581e3d6
 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -420,6 +420,12 @@ a work-in-progress.
   -march=armv8.2-a or higher to be specified.  Likewise, the
   intrinsics enabled by +memtag no longer require
   -march=armv8.5-a.
+  Support for the
+  https://github.com/ARM-software/acle/blob/main/main/acle.md#neon-sve-bridge;>
+  NEON-SVE Bridge intrinsics.
+  These are intrinsics that allow conversions between NEON and SVE vectors,
+  enabled through the inclusion of the arm_neon_sve_bridge.h 
header.
+  
 
   The option -mtp= is now supported for changing the TPIDR
register used for TLS accesses.  For more details please refer to the


Re: [PATCH] aarch64: Fix vld1/st1_x4 intrinsic test

2024-04-08 Thread Richard Sandiford
"Swinney, Jonathan"  writes:
> The test for this intrinsic was failing silently and so it failed to
> report the bug reported in 114521. This patch modifes the test to
> report the result.
>
> Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521
>
> Signed-off-by: Jonathan Swinney 
> ---
>  .../gcc.target/aarch64/advsimd-intrinsics/vld1x4.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)

Thanks, pushed to trunk.

Richard

>
> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c 
> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
> index 89b289bb21d..17db262a31a 100644
> --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
> @@ -3,6 +3,7 @@
>  /* { dg-skip-if "unimplemented" { arm*-*-* } } */
>  /* { dg-options "-O3" } */
>  
> +#include 
>  #include 
>  #include "arm-neon-ref.h"
>  
> @@ -71,13 +72,16 @@ VARIANT (float64, 2, q_f64)
>  VARIANTS (TESTMETH)
>  
>  #define CHECKS(BASE, ELTS, SUFFIX)   \
> -  if (test_vld1##SUFFIX##_x4 () != 0)\
> -fprintf (stderr, "test_vld1##SUFFIX##_x4");
> +  if (test_vld1##SUFFIX##_x4 () != 0) {  \
> +fprintf (stderr, "test_vld1" #SUFFIX "_x4 failed\n"); \
> +failed = true; \
> +  }
>  
>  int
>  main (int argc, char **argv)
>  {
> +  bool failed = false;
>VARIANTS (CHECKS)
>  
> -  return 0;
> +  return (failed) ? 1 : 0;
>  }


RE: [PATCH] i386: Fix aes/vaes patterns [PR114576]

2024-04-08 Thread Jiang, Haochen
> -Original Message-
> From: Jakub Jelinek 
> Sent: Monday, April 8, 2024 9:43 PM
> To: Jiang, Haochen 
> Cc: Hongtao Liu ; gcc-patches@gcc.gnu.org; Liu, Hongtao
> ; ubiz...@gmail.com
> Subject: Re: [PATCH] i386: Fix aes/vaes patterns [PR114576]
> 
> On Mon, Apr 08, 2024 at 12:33:39PM +, Jiang, Haochen wrote:
> > Sorry for the late response since I am on vacation for now.
> >
> > > As the following testcase shows, the above change was incorrect.
> > >
> > > Using aes isa for the second alternative is obviously wrong, aes is 
> > > enabled
> > > whenever -maes is, regardless of -mavx or -mno-avx, so the above change
> > > means that for -maes -mno-avx RA can choose, either it matches the first
> > > alternative with the dup operand, or it matches the second one (but that
> > > is of course wrong because vaesenc VEX encoded insn needs AES & AVX
> CPUID).
> >
> > When I wrote that patch, I suppose it will never match the second one when
> > AVX is not enabled because it will immediately drop to the first one so the
> > second one is automatically AES && AVX, which is tricky here.
> 
> Before the -mvaes changes the alternatives were noavx,avx isa and so clearly
> it was either the first alternative is the solely available, or the second,
> depending on TARGET_AVX.  But with noavx,aes on the first alternative is
> enabled only for !TARGET_AVX, but the second one whenever TARGET_AES, which
> is both if !TARGET_AVX and TARGET_AVX.  So, the RA is free to consider both
> alternatives, and because the first one is more restrictive (requires
> output matching input), if there is a match between those, it will use the
> first alternative, but if there isn't, it will happily use the second
> alternative.
> 

Aha, I see. Thanks for the explanation.

Thx,
Haochen


Re: [PATCH] i386: Fix aes/vaes patterns [PR114576]

2024-04-08 Thread Jakub Jelinek
On Mon, Apr 08, 2024 at 12:33:39PM +, Jiang, Haochen wrote:
> Sorry for the late response since I am on vacation for now.
> 
> > As the following testcase shows, the above change was incorrect.
> > 
> > Using aes isa for the second alternative is obviously wrong, aes is enabled
> > whenever -maes is, regardless of -mavx or -mno-avx, so the above change
> > means that for -maes -mno-avx RA can choose, either it matches the first
> > alternative with the dup operand, or it matches the second one (but that
> > is of course wrong because vaesenc VEX encoded insn needs AES & AVX CPUID).
> 
> When I wrote that patch, I suppose it will never match the second one when
> AVX is not enabled because it will immediately drop to the first one so the
> second one is automatically AES && AVX, which is tricky here.

Before the -mvaes changes the alternatives were noavx,avx isa and so clearly
it was either the first alternative is the solely available, or the second,
depending on TARGET_AVX.  But with noavx,aes on the first alternative is
enabled only for !TARGET_AVX, but the second one whenever TARGET_AES, which
is both if !TARGET_AVX and TARGET_AVX.  So, the RA is free to consider both
alternatives, and because the first one is more restrictive (requires
output matching input), if there is a match between those, it will use the
first alternative, but if there isn't, it will happily use the second
alternative.

> LLVM always had less restrictions on ISA under such circumstances, I would 
> like to
> stick to how SDM did when implementing that, which is a little conservative.
> 
> However, I am also ok with VAES implying AES if there is no real HW that has
> VAES w/o AES to reduce complexity in this scenario.

I'm fine with -mvaes not implying -maes, just want to mention that it is
fairly user visible thing and so we shouldn't be changing it after deciding
if we do it one way or another.  Now, I thought -mvaes was added in GCC 14,
but it has been around for a few years, so that means it is likely a bad
idea to change it now.

Jakub



RE: [PATCH] i386: Fix aes/vaes patterns [PR114576]

2024-04-08 Thread Jiang, Haochen
Hi Jakub,

Sorry for the late response since I am on vacation for now.

> As the following testcase shows, the above change was incorrect.
> 
> Using aes isa for the second alternative is obviously wrong, aes is enabled
> whenever -maes is, regardless of -mavx or -mno-avx, so the above change
> means that for -maes -mno-avx RA can choose, either it matches the first
> alternative with the dup operand, or it matches the second one (but that
> is of course wrong because vaesenc VEX encoded insn needs AES & AVX CPUID).

When I wrote that patch, I suppose it will never match the second one when
AVX is not enabled because it will immediately drop to the first one so the
second one is automatically AES && AVX, which is tricky here.

But this patch is buggy when "-maes -mavx512vl -mno-vaes" with %xmm16+ so
your change is needed, really appreciate that.

> 
> The big question is if "Since VAES should not imply AES" is the case or not.
> Looking around at what LLVM does on godbolt, seems since clang 6 which added
> -mvaes support -mvaes there implies -maes, but GCC treats those two
> independent.
> 
> Now, if we'd take the LLVM path of making -mvaes imply -maes and -mno-aes
> imply -mno-vaes, then we should probably just revert the above patch and
> tweak common/config/i386/ to do the implications (+ add the testcase from
> this patch).

LLVM always had less restrictions on ISA under such circumstances, I would like 
to
stick to how SDM did when implementing that, which is a little conservative.

However, I am also ok with VAES implying AES if there is no real HW that has
VAES w/o AES to reduce complexity in this scenario.

Thx,
Haochen


Re: [PATCH] s390: Fix s390_const_int_pool_entry_p and movdi peephole2 [PR114605]

2024-04-08 Thread Andreas Krebbel
On 4/8/24 13:43, Ilya Leoshkevich wrote:
> On Sat, 2024-04-06 at 18:58 +0200, Jakub Jelinek wrote:
>> Hi!
>>
>> The following testcase is miscompiled, because we have initially
>> a movti which loads the 0x3f803f80ULL TImode constant
>> from constant pool.  Later on we split it into a pair of DImode
>> loads.  Now, for the first load (why just that?, though not stage4
>> material) we trigger the peephole2 which uses
>> s390_const_int_pool_entry_p.
>> That function doesn't check at all the constant pool mode though,
>> sees
>> the constant pool at that address has a CONST_INT value and just
>> assumes
>> that is the value to return, which is especially wrong for big-
>> endian,
>> if it is a DImode load from offset 0, it should be loading 0 rather
>> than
>> 0x3f803f80ULL.
>> The following patch adds checks if we are extracing a MODE_INT mode,
>> if the constant pool has MODE_INT mode as well, punts if constant
>> pool
>> has smaller mode size than the extraction one (then it would be UB),
>> if it has the same mode as before keeps using what it did before,
>> if constant pool has a larger mode than the one being extracted, uses
>> simplify_subreg.  I'd have used avoid_constant_pool_reference
>> instead which can handle also offsets into the constant pool
>> constants,
>> but it can't handle UNSPEC_LTREF.
>>
>> Another thing is that once that is fixed, we ICE when we extract
>> constant
>> like 0, ior insn predicate require non-0 constant.  So, the patch
>> also
>> fixes the peephole2 so that if either 32-bit half is zero, it uses a
>> mere
>> load of the constant into register rather than a pair of such load
>> and ior.
>>
>> Bootstrapped/regtested on s390x-linux, ok for trunk?
> 
> Hi Jakub, thanks for the patch, it looks good to me.
> Since I'm not a maintainer, we need to wait for Andreas' opinion.

Ok. Thank you very much Jakub for fixing this!

Andreas

> 
>>
>> 2024-04-06  Jakub Jelinek  
>>
>>  PR target/114605
>>  * config/s390/s390.cc (s390_const_int_pool_entry_p): Punt
>>  if mem doesn't have MODE_INT mode, or pool constant doesn't
>>  have MODE_INT mode, or if pool constant mode is smaller than
>>  mem mode.  If mem mode is different from pool constant mode,
>>  try to simplify subreg.  If that doesn't work, punt, if it
>>  does, use the simplified constant instead of the constant
>> pool
>>  constant.
>>  * config/s390/s390.md (movdi from const pool peephole): If
>>  either low or high 32-bit part is zero, just emit move insn
>>  instead of move + ior.
>>
>>  * gcc.dg/pr114605.c: New test.



Re: GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'

2024-04-08 Thread Andrew Stubbs

On 08/04/2024 11:45, Thomas Schwinge wrote:

Hi!

On 2024-03-28T08:00:50+0100, I wrote:

On 2024-03-22T15:54:48+, Andrew Stubbs  wrote:

This patch alters the default (preferred) vector size to 32 on RDNA devices to
better match the actual hardware.  64-lane vectors will continue to be
used where they are hard-coded (such as function prologues).

We run these devices in wavefrontsize64 for compatibility, but they actually
only have 32-lane vectors, natively.  If the upper part of a V64 is masked
off (as it is in V32) then RDNA devices will skip execution of the upper part
for most operations, so this adjustment shouldn't leave too much performance on
the table.  One exception is memory instructions, so full wavefrontsize32
support would be better.

The advantage is that we avoid the missing V64 operations (such as permute and
vec_extract).

Committed to mainline.


In my GCN target '-march=gfx1100' testing, this commit
"amdgcn: Prefer V32 on RDNA devices" does resolve (or, make latent?) a
number of execution test FAILs (that is, regressions compared to earlier
'-march=gfx90a' etc. testing).

This commit also resolves (for my '-march=gfx1100' testing) one
pre-existing FAIL (that is, already seen in '-march=gfx90a' earlier
etc. testing):

 PASS: gcc.dg/tree-ssa/scev-14.c (test for excess errors)
 [-FAIL:-]{+PASS:+} gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts 
"Overflowness wrto loop niter:\tNo-overflow"

That means, this test case specifically (or, just its 'scan-tree-dump'?)
needs to be adjusted for GCN V64 testing?

This commit, as you'd also mentioned elsewhere, however also causes a
number of regressions in 'gcc.target/gcn/gcn.exp', see list below.

Those can be "fixed" with 'dg-additional-options -march=gfx90a' (or
similar) in the affected test cases (let me know if you'd like me to
'git push' that), but I suppose something more elaborate may be in order?
(Conditionalize those on 'target { ! gcn_rdna }', and add respective
scanning for 'target gcn_rdna'?  I can help with effective-target
'gcn_rdna' (or similar), if you'd like me to.)

And/or, have a '-mpreferred-simd-mode=v64' (or similar) to be used for
such test cases, to override 'if (TARGET_RDNA2_PLUS)' etc. in
'gcn_vectorize_preferred_simd_mode'?


The latter I have quickly implemented, see attached
"GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'".  OK to
push to trunk branch?

(This '--param' will also be useful for another bug/regression I'm about
to file.)


Best, probably, both these things, to properly test both V32 and V64?


That part remains to be done, but is best done by someone who actually
knowns "GCN" assembly/GCC back end -- that is, not me.


I'm not sure that this is *best* solution to the problem (in general, 
it's probably best to test the actual code that will be generated in 
practice), but I think this option will be useful for testing 
performance in each configuration and other correctness issues, and 
these tests are not testing that feature.


However, "vector lane width" sounds like it's configuring the number of 
bits in each lane. I think "vectorization factor" is unambigous.


OK to commit, with the name change.

Andrew




Grüße
  Thomas



 PASS: gcc.target/gcn/cond_fmaxnm_1.c (test for excess errors)
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-not 
\\tv_writelane_b32\\tv[0-9]+, vcc_..
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times 
smaxv64df3_exec 3
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times 
smaxv64sf3_exec 3
 PASS: gcc.target/gcn/cond_fmaxnm_1_run.c (test for excess errors)
 PASS: gcc.target/gcn/cond_fmaxnm_1_run.c execution test

 PASS: gcc.target/gcn/cond_fmaxnm_2.c (test for excess errors)
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-not 
\\tv_writelane_b32\\tv[0-9]+, vcc_..
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times 
smaxv64df3_exec 3
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times 
smaxv64sf3_exec 3
 PASS: gcc.target/gcn/cond_fmaxnm_2_run.c (test for excess errors)
 PASS: gcc.target/gcn/cond_fmaxnm_2_run.c execution test

 PASS: gcc.target/gcn/cond_fmaxnm_3.c (test for excess errors)
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-not 
\\tv_writelane_b32\\tv[0-9]+, vcc_..
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times 
movv64df_exec 3
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times 
movv64sf_exec 3
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times 
smaxv64sf3 3
 [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times 
smaxv64sf3 3
 PASS: gcc.target/gcn/cond_fmaxnm_3_run.c (test for excess errors)
 PASS: gcc.target/gcn/cond_fmaxnm_3_run.c execution test

 PASS: gcc.target/gcn/cond_fmaxnm_4.c (test for excess errors)
 [-PASS:-]{+FAIL:+} 

Re: [PATCH] rtl-optimization/101523 - avoid re-combine after noop 2->2 combination

2024-04-08 Thread Richard Sandiford
Segher Boessenkool  writes:
> Hi!
>
> On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote:
>> The following avoids re-walking and re-combining the instructions
>> between i2 and i3 when the pattern of i2 doesn't change.
>> 
>> Bootstrap and regtest running ontop of a reversal of 
>> r14-9692-g839bc42772ba7a.
>
> Please include that in the patch (or series, preferably).
>
>> It brings down memory use frmo 9GB to 400MB and compile-time from
>> 80s to 3.5s.  r14-9692-g839bc42772ba7a does better in both metrics
>> but has shown code generation regressions across acrchitectures.
>> 
>> OK to revert r14-9692-g839bc42772ba7a?
>
> No.
>
> The patch solved a very real problem.  How does your replacement handle
> that?  You don't say.  It looks like it only battles symptoms a bit,
> instead :-(
>
> We had this before: 3->2 combinations that leave an instruction
> identical to what was there before.  This was just a combination with
> context as well.  The only reason this wasn't a huge problem then
> already was because this is a 3->2 combination, even if it really is a
> 2->1 one it still is beneficial in all the same cases.  But in the new
> case it can iterate indefinitely -- well not quite, but some polynomial
> number of times, for a polynomial at least of degree three, possibly
> more :-(
>
> With this patch you need to show combine still is linear.  I don't think
> it is, but some deeper analysis might show it still is.
>
>   ~ - ~ - ~
>
> What should *really* be done is something that has been on the wish list
> for decades: an uncse pass.
>
> The things that combine no longer works on after my patch are actually
> 1->1 combinations (which we never do currently, although we probably
> should); or alternatively, an un-CSE followed by a 2->1 combination.
>
> We can do the latter of course, but we need to do an actual uncse first!
> Somewhere before combine, and then redo a CSE after it.  An actual CSE,
> not doing ten gazillion other things.

Can you give a specific example of a 2->2 combination that we still
want to apply after r14-9692-g839bc42772ba7a?

2->2 combinations as I understand them were added by
c4c5ad1d6d1e1e1fe7a1c2b3bb097cc269dc7306:

Author: Segher Boessenkool 
Date:   Mon Jul 30 15:18:17 2018 +0200

combine: Allow combining two insns to two insns

This patch allows combine to combine two insns into two.  This helps
in many cases, by reducing instruction path length, and also allowing
further combinations to happen.  PR85160 is a typical example of code
that it can improve.

This patch does not allow such combinations if either of the original
instructions was a simple move instruction.  In those cases combining
the two instructions increases register pressure without improving the
code.  With this move test register pressure does no longer increase
noticably as far as I can tell.

(At first I also didn't allow either of the resulting insns to be a
move instruction.  But that is actually a very good thing to have, as
should have been obvious).

PR rtl-optimization/85160
* combine.c (is_just_move): New function.
(try_combine): Allow combining two instructions into two if neither 
of
the original instructions was a move.

That patch didn't have a testcase, but one was added later in
81bdfc1e2940fc93bcd0bba4416daff47f04f3b3:

testcase for 2-2 combine

gcc/testsuite/
PR rtl-optimization/85160
* gcc.target/powerpc/combine-2-2.c: New testcase.

But this is the powerpc test that regresses with the recent patch (PR114518).

The patches reference aarch64 bug PR85160.  If I check out and build
c4c5ad1d6d1e above, I can see that it does indeed remove two mvns from
the PR85160 testcase.  The diff from c4c5ad1d6d1e~ is:

@@ -10,12 +10,10 @@
.cfi_startproc
ldr w3, [x2, w3, sxtw 2]
ldr w2, [x2, w4, sxtw 2]
-   mvn w3, w3
-   mvn w2, w2
-   and w4, w3, w1
-   and w1, w2, w1
-   and w3, w3, w0
-   and w2, w2, w0
+   bic w4, w1, w3
+   bic w3, w0, w3
+   bic w1, w1, w2
+   bic w2, w0, w2
asr w4, w4, 9
asr w1, w1, 7
orr w3, w4, w3, asr 7

(which is great).  But if I apply 839bc42772ba on top of c4c5ad1d6d1e
then the optimisation is undone.

Is that the intention?  I.e. are we effectively removing the kind of
2->2 combinations added in c4c5ad1d6d1e1e1fe?  If so, why not simply
revert c4c5ad1d6d1e1e1fe itself?

Or is there a specific testcase that is still optimised with the
combination of c4c5ad1d6d1e1e1fe and 839bc42772ba7a that would not
be optimised without c4c5ad1d6d1e1e1fe?  If so, can you say what it is?

Thanks,
Richard


Re: [PATCH 3/3] tree-optimization/114052 - niter analysis from undefined behavior

2024-04-08 Thread Richard Biener
On Mon, 8 Apr 2024, Richard Biener wrote:

> On Fri, 5 Apr 2024, Jan Hubicka wrote:
> 
> > > +   /* When there's a call that might not return the last iteration
> > > +  is possibly partial.  This matches what we check in invariant
> > > +  motion.
> > > +  ???  For the call argument evaluation it would be still OK.  */
> > > +   if (!may_have_exited
> > > +   && is_gimple_call (stmt)
> > > +   && gimple_has_side_effects (stmt))
> > > + may_have_exited = true;
> > 
> > I think you are missing here non-call EH, volatile asms and traps.
> >  We have stmt_may_terminate_function_p which tests there.
> 
> That returns true for all variable array accesses, I think we want
> to catch traps explicitly here.  I'm going to do
> 
>   if (!may_have_exited
>   && (gimple_has_side_effects (stmt)
>   || stmt_can_throw_external (cfun, stmt)))
> may_have_exited = true;
> 
> that should cover all but the generic trapping and not use IPA info
> to prove no side-effects.

Hum.  Maybe I'm a bit confused but we seem to "properly" take things
into account via maybe_lower_iteration_bound and are not directly using
the recorded bounds?  The function does a DFS walk though, not
reliably finding exits via calls early enough (it also lacks external
EH).  Oddly enough it seems to handle(?) gcc.dg/tree-ssa/cunroll-9.c
"correctly", but not gcc.dg/tree-ssa/cunroll-10.c which has the
number of iterations wrongly computed.

Maybe we should really record all the bounds properly but merge them
to a loop upper bound at one place?  gcc.dg/tree-ssa/cunroll-11.c
needs to see the g_x[i] bound is enforced in all paths to the latch
for example.

I'm most definitely defering this to GCC 15 now, I wonder if you
have any preferences here (and maybe want to pick this up also
for cleanup - it's mostly your code).

Richard.

> Richard.
> 
> > Honza
> > > +
> > > +   infer_loop_bounds_from_array (loop, stmt,
> > > + reliable && !may_have_exited);
> > >  
> > > -   if (reliable)
> > > +   if (reliable && !may_have_exited)
> > >  {
> > >infer_loop_bounds_from_signedness (loop, stmt);
> > >infer_loop_bounds_from_pointer_arith (loop, stmt);
> > >  }
> > >   }
> > > -
> > >  }
> > >  }
> > >  
> > > @@ -4832,7 +4855,7 @@ estimate_numbers_of_iterations (class loop *loop)
> > >   diagnose those loops with -Waggressive-loop-optimizations.  */
> > >number_of_latch_executions (loop);
> > >  
> > > -  basic_block *body = get_loop_body (loop);
> > > +  basic_block *body = get_loop_body_in_rpo (cfun, loop);
> > >auto_vec exits = get_loop_exit_edges (loop, body);
> > >likely_exit = single_likely_exit (loop, exits);
> > >FOR_EACH_VEC_ELT (exits, i, ex)
> > > -- 
> > > 2.35.3
> > 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] s390: Fix s390_const_int_pool_entry_p and movdi peephole2 [PR114605]

2024-04-08 Thread Ilya Leoshkevich
On Sat, 2024-04-06 at 18:58 +0200, Jakub Jelinek wrote:
> Hi!
> 
> The following testcase is miscompiled, because we have initially
> a movti which loads the 0x3f803f80ULL TImode constant
> from constant pool.  Later on we split it into a pair of DImode
> loads.  Now, for the first load (why just that?, though not stage4
> material) we trigger the peephole2 which uses
> s390_const_int_pool_entry_p.
> That function doesn't check at all the constant pool mode though,
> sees
> the constant pool at that address has a CONST_INT value and just
> assumes
> that is the value to return, which is especially wrong for big-
> endian,
> if it is a DImode load from offset 0, it should be loading 0 rather
> than
> 0x3f803f80ULL.
> The following patch adds checks if we are extracing a MODE_INT mode,
> if the constant pool has MODE_INT mode as well, punts if constant
> pool
> has smaller mode size than the extraction one (then it would be UB),
> if it has the same mode as before keeps using what it did before,
> if constant pool has a larger mode than the one being extracted, uses
> simplify_subreg.  I'd have used avoid_constant_pool_reference
> instead which can handle also offsets into the constant pool
> constants,
> but it can't handle UNSPEC_LTREF.
> 
> Another thing is that once that is fixed, we ICE when we extract
> constant
> like 0, ior insn predicate require non-0 constant.  So, the patch
> also
> fixes the peephole2 so that if either 32-bit half is zero, it uses a
> mere
> load of the constant into register rather than a pair of such load
> and ior.
> 
> Bootstrapped/regtested on s390x-linux, ok for trunk?

Hi Jakub, thanks for the patch, it looks good to me.
Since I'm not a maintainer, we need to wait for Andreas' opinion.

> 
> 2024-04-06  Jakub Jelinek  
> 
>   PR target/114605
>   * config/s390/s390.cc (s390_const_int_pool_entry_p): Punt
>   if mem doesn't have MODE_INT mode, or pool constant doesn't
>   have MODE_INT mode, or if pool constant mode is smaller than
>   mem mode.  If mem mode is different from pool constant mode,
>   try to simplify subreg.  If that doesn't work, punt, if it
>   does, use the simplified constant instead of the constant
> pool
>   constant.
>   * config/s390/s390.md (movdi from const pool peephole): If
>   either low or high 32-bit part is zero, just emit move insn
>   instead of move + ior.
> 
>   * gcc.dg/pr114605.c: New test.


Re: [PATCH 3/3] tree-optimization/114052 - niter analysis from undefined behavior

2024-04-08 Thread Richard Biener
On Fri, 5 Apr 2024, Jan Hubicka wrote:

> > + /* When there's a call that might not return the last iteration
> > +is possibly partial.  This matches what we check in invariant
> > +motion.
> > +???  For the call argument evaluation it would be still OK.  */
> > + if (!may_have_exited
> > + && is_gimple_call (stmt)
> > + && gimple_has_side_effects (stmt))
> > +   may_have_exited = true;
> 
> I think you are missing here non-call EH, volatile asms and traps.
>  We have stmt_may_terminate_function_p which tests there.

That returns true for all variable array accesses, I think we want
to catch traps explicitly here.  I'm going to do

  if (!may_have_exited
  && (gimple_has_side_effects (stmt)
  || stmt_can_throw_external (cfun, stmt)))
may_have_exited = true;

that should cover all but the generic trapping and not use IPA info
to prove no side-effects.

Richard.

> Honza
> > +
> > + infer_loop_bounds_from_array (loop, stmt,
> > +   reliable && !may_have_exited);
> >  
> > - if (reliable)
> > + if (reliable && !may_have_exited)
> >  {
> >infer_loop_bounds_from_signedness (loop, stmt);
> >infer_loop_bounds_from_pointer_arith (loop, stmt);
> >  }
> > }
> > -
> >  }
> >  }
> >  
> > @@ -4832,7 +4855,7 @@ estimate_numbers_of_iterations (class loop *loop)
> >   diagnose those loops with -Waggressive-loop-optimizations.  */
> >number_of_latch_executions (loop);
> >  
> > -  basic_block *body = get_loop_body (loop);
> > +  basic_block *body = get_loop_body_in_rpo (cfun, loop);
> >auto_vec exits = get_loop_exit_edges (loop, body);
> >likely_exit = single_likely_exit (loop, exits);
> >FOR_EACH_VEC_ELT (exits, i, ex)
> > -- 
> > 2.35.3
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

2024-04-08 Thread Richard Biener
On Mon, Apr 8, 2024 at 11:23 AM Kewen.Lin  wrote:
>
> Hi,
>
> As PR114614 shows, the newly added test case gcov-20.c by
> commit r14-9789-g08a52331803f66 failed on targets which do
> not support atomic profile update, there would be a message
> like:
>
>   warning: target does not support atomic profile update,
>single mode is selected
>
> Since the test case adopts -fprofile-update=atomic, it
> requires effective target check profile_update_atomic, this
> patch is to add the check accordingly.
>
> Tested well on x86_64-redhat-linux, powerpc64-linux-gnu P8/P9
> and powerpc64le-linux-gnu P9/P10.
>
> Is it ok for trunk?

OK

> BR,
> Kewen
> -
> PR testsuite/114614
>
> gcc/testsuite/ChangeLog:
>
> * gcc.misc-tests/gcov-20.c: Add effective target check
> profile_update_atomic.
> ---
>  gcc/testsuite/gcc.misc-tests/gcov-20.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/testsuite/gcc.misc-tests/gcov-20.c 
> b/gcc/testsuite/gcc.misc-tests/gcov-20.c
> index 215faffc980..ca8c12aad2b 100644
> --- a/gcc/testsuite/gcc.misc-tests/gcov-20.c
> +++ b/gcc/testsuite/gcc.misc-tests/gcov-20.c
> @@ -1,5 +1,6 @@
>  /* { dg-options "-fcondition-coverage -ftest-coverage 
> -fprofile-update=atomic" } */
>  /* { dg-do run { target native } } */
> +/* { dg-require-effective-target profile_update_atomic } */
>
>  /* Some side effect to stop branches from being pruned */
>  int x = 0;
> --
> 2.43.0


Re: [PATCH] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-08 Thread Richard Biener
On Mon, Apr 8, 2024 at 11:22 AM Kewen.Lin  wrote:
>
> Hi,
>
> As the comments in PR88309 show, there are two oversights
> in rs6000_gimple_fold_builtin that pass align in bytes to
> build_aligned_type but which actually requires align in
> bits, it causes unexpected ICE or hanging in function
> is_miss_rate_acceptable due to zero align_unit value.
>
> This patch is to fix them by converting bytes to bits, add
> an assertion on positive align_unit value and notes function
> build_aligned_type requires align measured in bits in its
> function comment.
>
> Bootstrapped and regtested on x86_64-redhat-linux,
> powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10.
>
> Is it (the generic part code change) ok for trunk?

OK

> BR,
> Kewen
> -
> PR target/88309
>
> Co-authored-by: Andrew Pinski 
>
> gcc/ChangeLog:
>
> * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Fix
> wrong align passed to function build_aligned_type.
> * tree-ssa-loop-prefetch.cc (is_miss_rate_acceptable): Add an
> assertion to ensure align_unit should be positive.
> * tree.cc (build_qualified_type): Update function comments.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/powerpc/pr88309.c: New test.
> ---
>  gcc/config/rs6000/rs6000-builtin.cc|  4 ++--
>  gcc/testsuite/gcc.target/powerpc/pr88309.c | 27 ++
>  gcc/tree-ssa-loop-prefetch.cc  |  2 ++
>  gcc/tree.cc|  3 ++-
>  4 files changed, 33 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr88309.c
>
> diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
> b/gcc/config/rs6000/rs6000-builtin.cc
> index 6698274031b..e7d6204074c 100644
> --- a/gcc/config/rs6000/rs6000-builtin.cc
> +++ b/gcc/config/rs6000/rs6000-builtin.cc
> @@ -1900,7 +1900,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> tree lhs_type = TREE_TYPE (lhs);
> /* In GIMPLE the type of the MEM_REF specifies the alignment.  The
>   required alignment (power) is 4 bytes regardless of data type.  */
> -   tree align_ltype = build_aligned_type (lhs_type, 4);
> +   tree align_ltype = build_aligned_type (lhs_type, 32);
> /* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  
> Create
>the tree using the value from arg0.  The resulting type will match
>the type of arg1.  */
> @@ -1944,7 +1944,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> tree arg2_type = ptr_type_node;
> /* In GIMPLE the type of the MEM_REF specifies the alignment.  The
>required alignment (power) is 4 bytes regardless of data type.  */
> -   tree align_stype = build_aligned_type (arg0_type, 4);
> +   tree align_stype = build_aligned_type (arg0_type, 32);
> /* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  
> Create
>the tree using the value from arg1.  */
> gimple_seq stmts = NULL;
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr88309.c 
> b/gcc/testsuite/gcc.target/powerpc/pr88309.c
> new file mode 100644
> index 000..c0078cf2b8c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr88309.c
> @@ -0,0 +1,27 @@
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-mvsx -O2 -fprefetch-loop-arrays" } */
> +
> +/* Verify there is no ICE or hanging.  */
> +
> +#include 
> +
> +void b(float *c, vector float a, vector float, vector float)
> +{
> +  vector float d;
> +  vector char ahbc;
> +  vec_xst(vec_perm(a, d, ahbc), 0, c);
> +}
> +
> +vector float e(vector unsigned);
> +
> +void f() {
> +  float *dst;
> +  int g = 0;
> +  for (;; g += 16) {
> +vector unsigned m, i;
> +vector unsigned n, j;
> +vector unsigned k, l;
> +b(dst + g * 3, e(m), e(n), e(k));
> +b(dst + (g + 4) * 3, e(i), e(j), e(l));
> +  }
> +}
> diff --git a/gcc/tree-ssa-loop-prefetch.cc b/gcc/tree-ssa-loop-prefetch.cc
> index bbd98e03254..70073cc4fe4 100644
> --- a/gcc/tree-ssa-loop-prefetch.cc
> +++ b/gcc/tree-ssa-loop-prefetch.cc
> @@ -739,6 +739,8 @@ is_miss_rate_acceptable (unsigned HOST_WIDE_INT 
> cache_line_size,
>if (delta >= (HOST_WIDE_INT) cache_line_size)
>  return false;
>
> +  gcc_assert (align_unit > 0);
> +
>miss_positions = 0;
>total_positions = (cache_line_size / align_unit) * distinct_iters;
>max_allowed_miss_positions = (ACCEPTABLE_MISS_RATE * total_positions) / 
> 1000;
> diff --git a/gcc/tree.cc b/gcc/tree.cc
> index f801712c9dd..6f8400e6640 100644
> --- a/gcc/tree.cc
> +++ b/gcc/tree.cc
> @@ -5689,7 +5689,8 @@ build_qualified_type (tree type, int type_quals 
> MEM_STAT_DECL)
>return t;
>  }
>
> -/* Create a variant of type T with alignment ALIGN.  */
> +/* Create a variant of type T with alignment ALIGN which
> +   is measured in bits.  */
>
>  tree
>  build_aligned_type (tree type, unsigned int align)
> --
> 2.43.0


GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]' (was: [committed] amdgcn: Prefer V32 on RDNA devices)

2024-04-08 Thread Thomas Schwinge
Hi!

On 2024-03-28T08:00:50+0100, I wrote:
> On 2024-03-22T15:54:48+, Andrew Stubbs  wrote:
>> This patch alters the default (preferred) vector size to 32 on RDNA devices 
>> to
>> better match the actual hardware.  64-lane vectors will continue to be
>> used where they are hard-coded (such as function prologues).
>>
>> We run these devices in wavefrontsize64 for compatibility, but they actually
>> only have 32-lane vectors, natively.  If the upper part of a V64 is masked
>> off (as it is in V32) then RDNA devices will skip execution of the upper part
>> for most operations, so this adjustment shouldn't leave too much performance 
>> on
>> the table.  One exception is memory instructions, so full wavefrontsize32
>> support would be better.
>>
>> The advantage is that we avoid the missing V64 operations (such as permute 
>> and
>> vec_extract).
>>
>> Committed to mainline.
>
> In my GCN target '-march=gfx1100' testing, this commit
> "amdgcn: Prefer V32 on RDNA devices" does resolve (or, make latent?) a
> number of execution test FAILs (that is, regressions compared to earlier
> '-march=gfx90a' etc. testing).
>
> This commit also resolves (for my '-march=gfx1100' testing) one
> pre-existing FAIL (that is, already seen in '-march=gfx90a' earlier
> etc. testing):
>
> PASS: gcc.dg/tree-ssa/scev-14.c (test for excess errors)
> [-FAIL:-]{+PASS:+} gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts 
> "Overflowness wrto loop niter:\tNo-overflow"
>
> That means, this test case specifically (or, just its 'scan-tree-dump'?)
> needs to be adjusted for GCN V64 testing?
>
> This commit, as you'd also mentioned elsewhere, however also causes a
> number of regressions in 'gcc.target/gcn/gcn.exp', see list below.
>
> Those can be "fixed" with 'dg-additional-options -march=gfx90a' (or
> similar) in the affected test cases (let me know if you'd like me to
> 'git push' that), but I suppose something more elaborate may be in order?
> (Conditionalize those on 'target { ! gcn_rdna }', and add respective
> scanning for 'target gcn_rdna'?  I can help with effective-target
> 'gcn_rdna' (or similar), if you'd like me to.)
>
> And/or, have a '-mpreferred-simd-mode=v64' (or similar) to be used for
> such test cases, to override 'if (TARGET_RDNA2_PLUS)' etc. in
> 'gcn_vectorize_preferred_simd_mode'?

The latter I have quickly implemented, see attached
"GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'".  OK to
push to trunk branch?

(This '--param' will also be useful for another bug/regression I'm about
to file.)

> Best, probably, both these things, to properly test both V32 and V64?

That part remains to be done, but is best done by someone who actually
knowns "GCN" assembly/GCC back end -- that is, not me.


Grüße
 Thomas


> PASS: gcc.target/gcn/cond_fmaxnm_1.c (test for excess errors)
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-not 
> \\tv_writelane_b32\\tv[0-9]+, vcc_..
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times 
> smaxv64df3_exec 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times 
> smaxv64sf3_exec 3
> PASS: gcc.target/gcn/cond_fmaxnm_1_run.c (test for excess errors)
> PASS: gcc.target/gcn/cond_fmaxnm_1_run.c execution test
>
> PASS: gcc.target/gcn/cond_fmaxnm_2.c (test for excess errors)
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-not 
> \\tv_writelane_b32\\tv[0-9]+, vcc_..
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times 
> smaxv64df3_exec 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times 
> smaxv64sf3_exec 3
> PASS: gcc.target/gcn/cond_fmaxnm_2_run.c (test for excess errors)
> PASS: gcc.target/gcn/cond_fmaxnm_2_run.c execution test
>
> PASS: gcc.target/gcn/cond_fmaxnm_3.c (test for excess errors)
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-not 
> \\tv_writelane_b32\\tv[0-9]+, vcc_..
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times 
> movv64df_exec 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times 
> movv64sf_exec 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times 
> smaxv64sf3 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times 
> smaxv64sf3 3
> PASS: gcc.target/gcn/cond_fmaxnm_3_run.c (test for excess errors)
> PASS: gcc.target/gcn/cond_fmaxnm_3_run.c execution test
>
> PASS: gcc.target/gcn/cond_fmaxnm_4.c (test for excess errors)
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-not 
> \\tv_writelane_b32\\tv[0-9]+, vcc_..
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times 
> movv64df_exec 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times 
> movv64sf_exec 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times 
> smaxv64sf3 3
> [-PASS:-]{+FAIL:+} 

gcc: docs: Fix documentation of two hooks

2024-04-08 Thread Matthew Malcomson
The `function_attribute_inlinable_p` hook documentation described it
returning the value if it is OK to inline the provided fndecl into "the
current function".  AFAICS This hook is only called when
`current_function_decl` is the same as the `fndecl` argument that the
hook is given, hence asking whether `fndecl` can be inlined into "the
current function" doesn't make sense.
Update the documentation to match this understanding.

The `unspec_may_trap_p` documentation mentioned applying to either
`unspec` or `unspec_volatile`.  AFAICS this hook is only used for
`unspec` codes since c84a808e493a, so I removed the mention of
`unspec_volatile`.

gcc/ChangeLog:

* doc/tm.texi (function_attribute_inlinable_p,
unspec_may_trap_p): Update documentation.
* target.def (function_attribute_inlinable_p,
unspec_may_trap_p): Update documentation.

--
N.b. not entirely sure who to ask for review, went with docs maintainers, but
if that's incorrect please do redirect me.

### Attachment also inlined for ease of reply###


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 
c8b8b126b2424b6552f824ba42ac329cfaf84d84..f0051f0ae1e9444d5d585135c90a68ca760c2fbd
 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10752,10 +10752,10 @@ attribute handlers.  So far this only affects the 
@var{noinit} and
 
 @deftypefn {Target Hook} bool TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P 
(const_tree @var{fndecl})
 @cindex inlining
-This target hook returns @code{true} if it is OK to inline @var{fndecl}
-into the current function, despite its having target-specific
-attributes, @code{false} otherwise.  By default, if a function has a
-target specific attribute attached to it, it will not be inlined.
+This target hook returns @code{false} if the target-specific attributes on
+@var{fndecl} always block it getting inlined, @code{true} otherwise.  By
+default, if a function has a target specific attribute attached to it, it
+will not be inlined.
 @end deftypefn
 
 @deftypefn {Target Hook} bool TARGET_OPTION_VALID_ATTRIBUTE_P (tree 
@var{fndecl}, tree @var{name}, tree @var{args}, int @var{flags})
@@ -12245,12 +12245,10 @@ allocation.
 @end deftypefn
 
 @deftypefn {Target Hook} int TARGET_UNSPEC_MAY_TRAP_P (const_rtx @var{x}, 
unsigned @var{flags})
-This target hook returns nonzero if @var{x}, an @code{unspec} or
-@code{unspec_volatile} operation, might cause a trap.  Targets can use
-this hook to enhance precision of analysis for @code{unspec} and
-@code{unspec_volatile} operations.  You may call @code{may_trap_p_1}
-to analyze inner elements of @var{x} in which case @var{flags} should be
-passed along.
+This target hook returns nonzero if @var{x}, an @code{unspec} might cause
+a trap.  Targets can use this hook to enhance precision of analysis for
+@code{unspec} operations.  You may call @code{may_trap_p_1} to analyze inner
+elements of @var{x} in which case @var{flags} should be passed along.
 @end deftypefn
 
 @deftypefn {Target Hook} void TARGET_SET_CURRENT_FUNCTION (tree @var{decl})
diff --git a/gcc/target.def b/gcc/target.def
index 
fdad7bbc93e2ad8aea30336d5cd4af67801e9c74..2b2a6c11807eff228788fae1cd1370e8971fbf3e
 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2314,10 +2314,10 @@ attribute handlers.  So far this only affects the 
@var{noinit} and\n\
 DEFHOOK
 (function_attribute_inlinable_p,
  "@cindex inlining\n\
-This target hook returns @code{true} if it is OK to inline @var{fndecl}\n\
-into the current function, despite its having target-specific\n\
-attributes, @code{false} otherwise.  By default, if a function has a\n\
-target specific attribute attached to it, it will not be inlined.",
+This target hook returns @code{false} if the target-specific attributes on\n\
+@var{fndecl} always block it getting inlined, @code{true} otherwise.  By\n\
+default, if a function has a target specific attribute attached to it, it\n\
+will not be inlined.",
  bool, (const_tree fndecl),
  hook_bool_const_tree_false)
 
@@ -4057,12 +4057,10 @@ allocation.",
FLAGS has the same meaning as in rtlanal.cc: may_trap_p_1.  */
 DEFHOOK
 (unspec_may_trap_p,
- "This target hook returns nonzero if @var{x}, an @code{unspec} or\n\
-@code{unspec_volatile} operation, might cause a trap.  Targets can use\n\
-this hook to enhance precision of analysis for @code{unspec} and\n\
-@code{unspec_volatile} operations.  You may call @code{may_trap_p_1}\n\
-to analyze inner elements of @var{x} in which case @var{flags} should be\n\
-passed along.",
+ "This target hook returns nonzero if @var{x}, an @code{unspec} might cause\n\
+a trap.  Targets can use this hook to enhance precision of analysis for\n\
+@code{unspec} operations.  You may call @code{may_trap_p_1} to analyze inner\n\
+elements of @var{x} in which case @var{flags} should be passed along.",
  int, (const_rtx x, unsigned flags),
  default_unspec_may_trap_p)
 



diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 

[Patch] Fortran: List-directed read - accept again tab as alternative to space as separator [PR114304] (was: [patch, libgfortran] PR114304 - [13/14 Regression] libgfortran I/O – bogus "Semicolon not a

2024-04-08 Thread Tobias Burnus

Jerry D wrote:

See attached updated patch.


It turned rather quickly out that this patch – committed as 
r14-9822-g93adf88cc6744a – caused regressions.


Namely, real-world code use tab(s) as separator instead of spaces.

[For instance, PR114304 which contains a named-list input file from SPEC 
CPU 2017; that example uses tabs before the '=' sign, but the issue is 
more generic.]


I think the ISO Fortran standard only permits spaces, but as it feels 
natural and is widely supported, tabs are used and should remain supported.


It is not quite clear how '\r' are or should be handled, but as 
eat_spaces did use it, I thought I would add one testcase using them as 
well.


That test is not affected by my change; it did work before with GCC and 
still does – but it does fail with ifort/ifx/flang. I have not thought 
deeply whether it should be supported or not – and looking at the 
libgfortran source file, it often but (→ testcase) not consistently 
requires that an \n follows the \r.


OK for mainline? [And: When the previous patch gets backported, this 
surely needs to be included as well.]


Tobias
Fortran: Accept again tab as alternative to space as separator [PR114304]

This fixes a side-effect of/regression caused by r14-9822-g93adf88cc6744a,
which was for the same PR.

	PR libfortran/114304

libgfortran/ChangeLog:

	* io/list_read.c (eat_separator): Accept tab as alternative to space.

gcc/testsuite/ChangeLog:

	* gfortran.dg/pr114304-2.f90: New test.

 gcc/testsuite/gfortran.dg/pr114304-2.f90 | 82 
 libgfortran/io/list_read.c   |  2 +-
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gfortran.dg/pr114304-2.f90 b/gcc/testsuite/gfortran.dg/pr114304-2.f90
new file mode 100644
index 000..5ef5874f528
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr114304-2.f90
@@ -0,0 +1,82 @@
+! { dg-do run }
+!
+! PR fortran/114304
+!
+! Ensure that '\t' (tab) is supported as separator in list-directed input
+! While not really standard conform, this is widely used in user input and
+! widely supported.
+!
+
+use iso_c_binding
+implicit none
+character(len=*,kind=c_char), parameter :: tab = C_HORIZONTAL_TAB
+
+! Accept '' as variant to ' ' as separator
+! Check that  and  are handled
+
+character(len=*,kind=c_char), parameter :: nml_str &
+   = ''//C_CARRIAGE_RETURN // C_NEW_LINE // &
+ 'first'//tab//'='//tab//' .true.'// C_NEW_LINE // &
+ ' , other'//tab//' ='//tab//'3'//tab//', 2'//tab//'/'
+
+! Check that  is handled,
+
+! Note: For new line, Unix uses \n, Windows \r\n but old Apple systems used '\r'
+!
+! Gfortran does not seem to support all \r, but the following is supported
+! since ages, ! which seems to be a gfortran extension as ifort and flang don't like it.
+
+character(len=*,kind=c_char), parameter :: nml_str2 &
+   = ''//C_CARRIAGE_RETURN // C_NEW_LINE // &
+ 'first'//C_NEW_LINE//'='//tab//' .true.'// C_CARRIAGE_RETURN // &
+ ' , other'//tab//' ='//tab//'3'//tab//', 2'//tab//'/'
+
+character(len=*,kind=c_char), parameter :: str &
+   = tab//'1'//tab//'2,'//tab//'3'//tab//',4'//tab//','//tab//'5'//tab//'/'
+character(len=*,kind=c_char), parameter :: str2 &
+   = tab//'1'//tab//'2;'//tab//'3'//tab//';4'//tab//';'//tab//'5'//tab//'/'
+logical :: first
+integer :: other(4)
+integer :: ints(6)
+namelist /inparm/ first , other
+
+other = 1
+
+open(99, file="test.inp")
+write(99, '(a)') nml_str
+rewind(99)
+read(99,nml=inparm)
+close(99, status="delete")
+
+if (.not.first .or. any (other /= [3,2,1,1])) stop 1
+
+other = 9
+
+open(99, file="test.inp")
+write(99, '(a)') nml_str2
+rewind(99)
+read(99,nml=inparm)
+close(99, status="delete")
+
+if (.not.first .or. any (other /= [3,2,9,9])) stop 2
+
+ints = 66
+
+open(99, file="test.inp", decimal='point')
+write(99, '(a)') str
+rewind(99)
+read(99,*) ints
+close(99, status="delete")
+
+if (any (ints /= [1,2,3,4,5,66])) stop 3
+
+ints = 77 
+
+open(99, file="test.inp", decimal='comma')
+write(99, '(a)') str2
+rewind(99)
+read(99,*) ints
+close(99, status="delete")
+
+if (any (ints /= [1,2,3,4,5,77])) stop 4
+end
diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index b56f2a4e6d6..5bbbef26c26 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -463,7 +463,7 @@ eat_separator (st_parameter_dt *dtp)
 
   dtp->u.p.comma_flag = 0;
   c = next_char (dtp);
-  if (c == ' ')
+  if (c == ' ' || c == '\t')
 {
   eat_spaces (dtp);
   c = next_char (dtp);


[PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Richard Biener
The following fixes ranger bitmap allocation when invoked from IPA
context where the global bitmap obstack possibly isn't initialized.
Instead of trying to use one of the ranger obstacks the following
simply initializes the global bitmap obstack around an active ranger.

Bootstrapped and tested on x86_64-unknown-linux-gnu with bitmap_alloc
instrumentation (all ICEs gone, so ranger was the only offender).

OK?

Thanks,
Richard.

PR middle-end/114604
* gimple-range.cc (enable_ranger): Initialize the global
bitmap obstack.
(disable_ranger): Release it.
---
 gcc/gimple-range.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index c16b776c1e3..4d3b1ce8588 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool use_imm_uses)
 {
   gimple_ranger *r;
 
+  bitmap_obstack_initialize (NULL);
+
   gcc_checking_assert (!fun->x_range_query);
   r = new gimple_ranger (use_imm_uses);
   fun->x_range_query = r;
@@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
   gcc_checking_assert (fun->x_range_query);
   delete fun->x_range_query;
   fun->x_range_query = NULL;
+
+  bitmap_obstack_release (NULL);
 }
 
 // 
-- 
2.35.3


Re: [pushed] aarch64: Fix bogus cnot optimisation [PR114603]

2024-04-08 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, Apr 5, 2024 at 3:52 PM Richard Sandiford
>> This isn't a regression on a known testcase.  However, it's a nasty
>> wrong code bug that could conceivably trigger for autovec code (although
>> I've not been able to construct a reproducer so far).  That fix is also
>> quite localised to the buggy operation.  I'd therefore prefer to push
>> the fix now rather than wait for GCC 15.
>
> wrong-code bugs (and also rejects-valid or ice-on-valid) are always exempt
> from the regression-only fixing.  In practice every such bug will be a
> regression,
> in this case to when the combining pattern was introduced (unless that was
> with the version with the initial introduction of the port of course).

Ah, thanks, hadn't realised that.  Makes sense though.

It's good news of a sort since unfortunately I've another SVE wrong-code
fix in the works...

Richard


[Patch, fortran] PR114535 - [13/14 regression] ICE with elemental finalizer

2024-04-08 Thread Paul Richard Thomas
Hi All,

This one is blazingly 'obvious'. I haven't had the heart to investigate why
somebody thought that it is a good idea to check if unreferenced symbols
are finalizable because, I suspect, that 'somebody' was me. Worse, I tried
a couple of other fixes before I hit on the 'obvious' one :-(

The ChangeLog says it all. OK for mainline and then backporting in a couple
of weeks?

Paul

Fortran: Fix ICE in trans-stmt.cc(gfc_trans_call) [PR114535]

2024-04-08  Paul Thomas  

gcc/fortran
PR fortran/114535
* resolve.cc (resolve_symbol): Remove last chunk that checked
for finalization of unreferenced symbols.

gcc/testsuite/
PR fortran/114535
* gfortran.dg/pr114535d.f90: New test.
* gfortran.dg/pr114535iv.f90: Additional source.
diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 43315a6a550..4cbf7186119 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -17069,15 +17069,6 @@ resolve_symbol (gfc_symbol *sym)
 
   if (sym->param_list)
 resolve_pdt (sym);
-
-  if (!sym->attr.referenced
-  && (sym->ts.type == BT_CLASS || sym->ts.type == BT_DERIVED))
-{
-  gfc_expr *final_expr = gfc_lval_expr_from_sym (sym);
-  if (gfc_is_finalizable (final_expr->ts.u.derived, NULL))
-	gfc_set_sym_referenced (sym);
-  gfc_free_expr (final_expr);
-}
 }
 
 
diff --git a/gcc/testsuite/gfortran.dg/pr114535d.f90 b/gcc/testsuite/gfortran.dg/pr114535d.f90
new file mode 100644
index 000..7ce178a1e30
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr114535d.f90
@@ -0,0 +1,42 @@
+! { dg-do compile }
+! { dg-compile-aux-modules "pr114535iv.f90" }
+! Contributed by Andrew Benson  
+!
+module d
+  implicit none
+contains
+  function en() result(dd)
+use :: iv
+implicit none
+type(vs) :: dd
+dd%i = 1
+  end function en
+end module d
+
+! Delete line 1 and all brands complain that 'vs' is an undefined type.
+! Delete lines 1 and line 2 recreates the original problem.
+module ni
+  implicit none
+contains
+  subroutine iss1()
+!use :: iv! line 1
+use :: d
+implicit none
+!type(vs) :: ans; ans = en(); ! line 2
+  end subroutine iss1
+  subroutine iss2()
+use :: d
+implicit none
+  end subroutine iss2
+end module ni ! Used to give an ICE: in gfc_trans_call, at fortran/trans-stmt.cc:400
+
+  use ni
+  use iv
+  type(vs) :: x
+  call iss1()
+  call iss1()
+  if ((ctr .eq. 0) .or. (ctr .ne. 6)) stop 1  ! Depends whether lines 1 & 2 are present
+  call iss2()
+  x = vs(42)
+  if ((ctr .eq. 1) .or. (ctr .ne. 7)) stop 2  ! Make sure destructor available here
+end
diff --git a/gcc/testsuite/gfortran.dg/pr114535iv.f90 b/gcc/testsuite/gfortran.dg/pr114535iv.f90
new file mode 100644
index 000..be629991023
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr114535iv.f90
@@ -0,0 +1,18 @@
+! Compiled with pr114535d.f90
+! Contributed by Andrew Benson  
+!
+module iv
+  type, public :: vs
+ integer :: i
+   contains
+ final :: destructor
+  end type vs
+  integer :: ctr = 0
+contains
+  impure elemental subroutine destructor(s)
+type(vs), intent(inout) :: s
+s%i = 0
+ctr = ctr + 1
+  end subroutine destructor
+end module iv
+


[PATCH] tree-optimization/114624 - fix use-after-free in SCCP

2024-04-08 Thread Richard Biener
We're inspecting the replaced PHI node after releasing it.

Bootstrapped and tested on x86-64-unknown-linux-gnu, pushed.

PR tree-optimization/114624
* tree-scalar-evolution.cc (final_value_replacement_loop):
Get at the PHI arg location before releasing the PHI node.

* gcc.dg/torture/pr114624.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr114624.c | 20 
 gcc/tree-scalar-evolution.cc|  4 ++--
 2 files changed, 22 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr114624.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr114624.c 
b/gcc/testsuite/gcc.dg/torture/pr114624.c
new file mode 100644
index 000..ae031356982
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr114624.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+int a, b;
+int main() {
+  int c, d = 1;
+  while (a) {
+while (b)
+  if (d)
+while (a)
+  ;
+for (; b < 2; b++)
+  if (b)
+for (c = 0; c < 8; c++)
+  d = 0;
+  else
+for (a = 0; a < 2; a++)
+  ;
+  }
+  return 0;
+}
diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index 25e3130e2f1..b0a5e09a77c 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -3877,6 +3877,7 @@ final_value_replacement_loop (class loop *loop)
 to a GIMPLE sequence or to a statement list (keeping this a
 GENERIC interface).  */
   def = unshare_expr (def);
+  auto loc = gimple_phi_arg_location (phi, exit->dest_idx);
   remove_phi_node (, false);
 
   /* Propagate constants immediately, but leave an unused initialization
@@ -3888,8 +3889,7 @@ final_value_replacement_loop (class loop *loop)
   gimple_seq stmts;
   def = force_gimple_operand (def, , false, NULL_TREE);
   gassign *ass = gimple_build_assign (rslt, def);
-  gimple_set_location (ass,
-  gimple_phi_arg_location (phi, exit->dest_idx));
+  gimple_set_location (ass, loc);
   gimple_seq_add_stmt (, ass);
 
   /* If def's type has undefined overflow and there were folded
-- 
2.35.3


[PATCH] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-08 Thread Kewen.Lin
Hi,

As the comments in PR88309 show, there are two oversights
in rs6000_gimple_fold_builtin that pass align in bytes to
build_aligned_type but which actually requires align in
bits, it causes unexpected ICE or hanging in function
is_miss_rate_acceptable due to zero align_unit value.

This patch is to fix them by converting bytes to bits, add
an assertion on positive align_unit value and notes function
build_aligned_type requires align measured in bits in its
function comment.

Bootstrapped and regtested on x86_64-redhat-linux, 
powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10.

Is it (the generic part code change) ok for trunk?

BR,
Kewen
-
PR target/88309

Co-authored-by: Andrew Pinski 

gcc/ChangeLog:

* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Fix
wrong align passed to function build_aligned_type.
* tree-ssa-loop-prefetch.cc (is_miss_rate_acceptable): Add an
assertion to ensure align_unit should be positive.
* tree.cc (build_qualified_type): Update function comments.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr88309.c: New test.
---
 gcc/config/rs6000/rs6000-builtin.cc|  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr88309.c | 27 ++
 gcc/tree-ssa-loop-prefetch.cc  |  2 ++
 gcc/tree.cc|  3 ++-
 4 files changed, 33 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr88309.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 6698274031b..e7d6204074c 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1900,7 +1900,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree lhs_type = TREE_TYPE (lhs);
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
  required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_ltype = build_aligned_type (lhs_type, 4);
+   tree align_ltype = build_aligned_type (lhs_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg0.  The resulting type will match
   the type of arg1.  */
@@ -1944,7 +1944,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree arg2_type = ptr_type_node;
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
   required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_stype = build_aligned_type (arg0_type, 4);
+   tree align_stype = build_aligned_type (arg0_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg1.  */
gimple_seq stmts = NULL;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88309.c 
b/gcc/testsuite/gcc.target/powerpc/pr88309.c
new file mode 100644
index 000..c0078cf2b8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88309.c
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2 -fprefetch-loop-arrays" } */
+
+/* Verify there is no ICE or hanging.  */
+
+#include 
+
+void b(float *c, vector float a, vector float, vector float)
+{
+  vector float d;
+  vector char ahbc;
+  vec_xst(vec_perm(a, d, ahbc), 0, c);
+}
+
+vector float e(vector unsigned);
+
+void f() {
+  float *dst;
+  int g = 0;
+  for (;; g += 16) {
+vector unsigned m, i;
+vector unsigned n, j;
+vector unsigned k, l;
+b(dst + g * 3, e(m), e(n), e(k));
+b(dst + (g + 4) * 3, e(i), e(j), e(l));
+  }
+}
diff --git a/gcc/tree-ssa-loop-prefetch.cc b/gcc/tree-ssa-loop-prefetch.cc
index bbd98e03254..70073cc4fe4 100644
--- a/gcc/tree-ssa-loop-prefetch.cc
+++ b/gcc/tree-ssa-loop-prefetch.cc
@@ -739,6 +739,8 @@ is_miss_rate_acceptable (unsigned HOST_WIDE_INT 
cache_line_size,
   if (delta >= (HOST_WIDE_INT) cache_line_size)
 return false;

+  gcc_assert (align_unit > 0);
+
   miss_positions = 0;
   total_positions = (cache_line_size / align_unit) * distinct_iters;
   max_allowed_miss_positions = (ACCEPTABLE_MISS_RATE * total_positions) / 1000;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index f801712c9dd..6f8400e6640 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -5689,7 +5689,8 @@ build_qualified_type (tree type, int type_quals 
MEM_STAT_DECL)
   return t;
 }

-/* Create a variant of type T with alignment ALIGN.  */
+/* Create a variant of type T with alignment ALIGN which
+   is measured in bits.  */

 tree
 build_aligned_type (tree type, unsigned int align)
--
2.43.0


[PATCH] testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

2024-04-08 Thread Kewen.Lin
Hi,

As PR114614 shows, the newly added test case gcov-20.c by
commit r14-9789-g08a52331803f66 failed on targets which do
not support atomic profile update, there would be a message
like:

  warning: target does not support atomic profile update,
   single mode is selected

Since the test case adopts -fprofile-update=atomic, it
requires effective target check profile_update_atomic, this
patch is to add the check accordingly.

Tested well on x86_64-redhat-linux, powerpc64-linux-gnu P8/P9
and powerpc64le-linux-gnu P9/P10.

Is it ok for trunk?

BR,
Kewen
-
PR testsuite/114614

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-20.c: Add effective target check
profile_update_atomic.
---
 gcc/testsuite/gcc.misc-tests/gcov-20.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-20.c 
b/gcc/testsuite/gcc.misc-tests/gcov-20.c
index 215faffc980..ca8c12aad2b 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-20.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-20.c
@@ -1,5 +1,6 @@
 /* { dg-options "-fcondition-coverage -ftest-coverage -fprofile-update=atomic" 
} */
 /* { dg-do run { target native } } */
+/* { dg-require-effective-target profile_update_atomic } */

 /* Some side effect to stop branches from being pruned */
 int x = 0;
--
2.43.0


Re: [PATCH 3/3] Add -mcpu=power11 tests

2024-04-08 Thread Kewen.Lin
Hi Mike,

on 2024/3/20 12:16, Michael Meissner wrote:
> This patch adds some simple tests for -mcpu=power11 support.  In order to run
> these tests, you need an assembler that supports the appropriate option for
> supporting the Power11 processor (-mpower11 under Linux or -mpwr11 under AIX).
> 
> I have tested this patch on a little endian power10 system and a big endian
> power9 system using the latest binutils which includes support for power11.
> There were no regressions, and the 3 power11 tests added ran on both systems.
> Can I check this patch into GCC 15 when it opens up for general patches?
> 
> 2024-03-18  Michael Meissner  
> 
> gcc/testsuite/
> 
>   * gcc.target/powerpc/power11-1.c: New test.
>   * gcc.target/powerpc/power11-2.c: Likewise.
>   * gcc.target/powerpc/power11-3.c: Likewise.
>   * lib/target-supports.exp (check_effective_target_power11_ok): Add new
>   effective target.
> ---
>  gcc/testsuite/gcc.target/powerpc/power11-1.c | 13 +
>  gcc/testsuite/gcc.target/powerpc/power11-2.c | 20 
>  gcc/testsuite/gcc.target/powerpc/power11-3.c | 10 ++
>  gcc/testsuite/lib/target-supports.exp| 17 +
>  4 files changed, 60 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/power11-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/power11-2.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/power11-3.c
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/power11-1.c 
> b/gcc/testsuite/gcc.target/powerpc/power11-1.c
> new file mode 100644
> index 000..6a2e802eedf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/power11-1.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target powerpc*-*-* } } */
> +/* { dg-require-effective-target power11_ok } */
> +/* { dg-options "-mdejagnu-cpu=power11 -O2" } */
> +
> +/* Basic check to see if the compiler supports -mcpu=power11.  */
> +
> +#ifndef _ARCH_PWR11
> +#error "-mcpu=power11 is not supported"
> +#endif
> +
> +void foo (void)
> +{
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/power11-2.c 
> b/gcc/testsuite/gcc.target/powerpc/power11-2.c
> new file mode 100644
> index 000..7b9904c1d29
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/power11-2.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile { target powerpc*-*-* } } */
> +/* { dg-require-effective-target power11_ok } */
> +/* { dg-options "-O2" } */
> +
> +/* Check if we can set the power11 target via a target attribute.  */
> +
> +__attribute__((__target__("cpu=power9")))
> +void foo_p9 (void)
> +{
> +}
> +
> +__attribute__((__target__("cpu=power10")))
> +void foo_p10 (void)
> +{
> +}
> +
> +__attribute__((__target__("cpu=power11")))
> +void foo_p11 (void)
> +{
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/power11-3.c 
> b/gcc/testsuite/gcc.target/powerpc/power11-3.c
> new file mode 100644
> index 000..9b2d643cc0f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/power11-3.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile { target powerpc*-*-* } }  */
> +/* { dg-require-effective-target power11_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2" }  */
> +
> +/* Check if we can set the power11 target via a target_clones attribute.  */
> +
> +__attribute__((__target_clones__("cpu=power11,cpu=power9,default")))
> +void foo (void)
> +{
> +}
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 467b539b20d..be80494be80 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -7104,6 +7104,23 @@ proc check_effective_target_power10_ok { } {
>  }
>  }
>  
> +# Return 1 if this is a PowerPC target supporting -mcpu=power11.
> +
> +proc check_effective_target_power11_ok { } {
> +if { ([istarget powerpc*-*-*]) } {
> + return [check_no_compiler_messages power11_ok object {
> + int main (void) {
> + #ifndef _ARCH_PWR11
> + #error "-mcpu=power11 is not supported"
> + #endif
> + return 0;
> + }
> + } "-mcpu=power11"]
> +} else {
> + return 0
> +}
> +}

Sorry that I didn't catch this before, this effective target looks useless
since its users power11-[123].c are all for compiling and the compilation
doesn't rely on assembler behavior.  power11-1.c has checked for _ARCH_PWR11,
maybe we want some cases with "dg-do assemble" to adopt this?

btw, the other two sub-patches in this series look good to me, as I know this
series has been on Segher's TODO list, I'll leave the approvals to him.

BR,
Kewen



Re: [PATCH v2] LoongArch: Enable switchable target

2024-04-08 Thread Yang Yujie
On Mon, Apr 08, 2024 at 04:49:58PM +0800, Xi Ruoyao wrote:
> On Mon, 2024-04-08 at 16:46 +0800, Yang Yujie wrote:
> > v1 -> v2:
> > Remove spaces from changelog.
> 
> I've rebuilt the base system with a GCC including this patch.  LTO+PGO
> bootstrap fine, regtested fine, and no issues observed.
> 
> I do usually include the optimization flags into LDFLAGS when I do LTO,
> so I don't really rely on this patch though.
> 
> -- 
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University

Thanks for the testing!

Respecting the options from compile time seems to be the convention though.
And it would be a natural consequence of implementing target attributes,
which is not here yet for LoongArch.

With this patch, we can have this build issue solved:
https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests/83



Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Kewen.Lin
Hi Peter,

on 2024/4/6 06:28, Peter Bergner wrote:
> This is a cleanup patch in preparation to fixing the real bug in PR101865.
> TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
> Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR
> and delete the now dead mask.
> 
> This passed bootstrap and retesting on powerpc64le-linux with no regressions.
> Ok for trunk?
> 
> Eventually we'll want to backport this along with the follow-on patch that
> actually fixes PR101865.
> 
> Peter
> 
> 
> gcc/
>   PR target/101865
>   * config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
>   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
>   OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete redundant
>   OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
>   (rs6000_opt_masks): Neuter the "direct-move" option.
>   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
>   OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
>   comment.
>   * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
>   OPTION_MASK_DIRECT_MOVE.
>   (OTHER_VSX_VECTOR_MASKS): Likewise.
>   (POWERPC_MASKS): Likewise.
>   * config/rs6000/rs6000.opt (mno-direct-move): New.
>   (mdirect-move): Remove Mask and Var.
> 
> 
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index 68bc45d65ba..77d045c9f6e 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -471,6 +471,8 @@ extern int rs6000_vector_align[];
>  #define TARGET_EXTSWSLI  (TARGET_MODULO && TARGET_POWERPC64)
>  #define TARGET_MADDLDTARGET_MODULO
>  
> +/* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that. 
>  */
> +#define TARGET_DIRECT_MOVE   TARGET_P8_VECTOR
>  #define TARGET_XSCVDPSPN (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
>  #define TARGET_XSCVSPDPN (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
>  #define TARGET_VADDUQM   (TARGET_P8_VECTOR && TARGET_POWERPC64)
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 6ba9df4f02e..c241371147c 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -3811,7 +3811,7 @@ rs6000_option_override_internal (bool global_init_p)
>   Testing for direct_move matches power8 and later.  */
>if (!BYTES_BIG_ENDIAN
>&& !(processor_target_table[tune_index].target_enable
> -& OPTION_MASK_DIRECT_MOVE))
> +& OPTION_MASK_P8_VECTOR))
>  rs6000_isa_flags |= ~rs6000_isa_flags_explicit & 
> OPTION_MASK_STRICT_ALIGN;
>  
>/* Add some warnings for VSX.  */
> @@ -3853,8 +3853,7 @@ rs6000_option_override_internal (bool global_init_p)
>&& (rs6000_isa_flags_explicit & (OPTION_MASK_SOFT_FLOAT
>  | OPTION_MASK_ALTIVEC
>  | OPTION_MASK_VSX)) != 0)
> -rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO
> -| OPTION_MASK_DIRECT_MOVE)
> +rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO)
>& ~rs6000_isa_flags_explicit);
>  
>if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
> @@ -3939,13 +3938,6 @@ rs6000_option_override_internal (bool global_init_p)
>rs6000_isa_flags &= ~OPTION_MASK_FPRND;
>  }
>  
> -  if (TARGET_DIRECT_MOVE && !TARGET_VSX)
> -{
> -  if (rs6000_isa_flags_explicit & OPTION_MASK_DIRECT_MOVE)
> - error ("%qs requires %qs", "-mdirect-move", "-mvsx");
> -  rs6000_isa_flags &= ~OPTION_MASK_DIRECT_MOVE;
> -}
> -
>if (TARGET_P8_VECTOR && !TARGET_ALTIVEC)
>  rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;
>  
> @@ -24429,7 +24421,7 @@ static struct rs6000_opt_mask const 
> rs6000_opt_masks[] =
>   false, true  },
>{ "cmpb",  OPTION_MASK_CMPB,   false, true  },
>{ "crypto",OPTION_MASK_CRYPTO, false, 
> true  },
> -  { "direct-move",   OPTION_MASK_DIRECT_MOVE,false, true  },
> +  { "direct-move",   0,  false, true  },
>{ "dlmzb", OPTION_MASK_DLMZB,  false, true  },
>{ "efficient-unaligned-vsx",   OPTION_MASK_EFFICIENT_UNALIGNED_VSX,
>   false, true  },
> diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
> index ce0b14a8d37..647f20de7f2 100644
> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -429,19 +429,7 @@ rs6000_target_modify_macros (bool define_p, 
> HOST_WIDE_INT flags)
>  rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
>if ((flags & OPTION_MASK_POPCNTD) != 0)
>  rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
> -  /* Note 

Re: [PATCH v2] LoongArch: Enable switchable target

2024-04-08 Thread Xi Ruoyao
On Mon, 2024-04-08 at 16:46 +0800, Yang Yujie wrote:
> v1 -> v2:
> Remove spaces from changelog.

I've rebuilt the base system with a GCC including this patch.  LTO+PGO
bootstrap fine, regtested fine, and no issues observed.

I do usually include the optimization flags into LDFLAGS when I do LTO,
so I don't really rely on this patch though.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2] LoongArch: Enable switchable target

2024-04-08 Thread Yang Yujie
v1 -> v2:
Remove spaces from changelog.



[PATCH v2] LoongArch: Enable switchable target

2024-04-08 Thread Yang Yujie
This patch fixes the back-end context switching in cases where functions
should be built with their own target contexts instead of the
global one, such as LTO linking and functions with target attributes (TBD).

PR target/113233

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_reg_init):
Reinitialize the loongarch_regno_mode_ok cache.
(loongarch_option_override): Same.
(loongarch_save_restore_target_globals): Restore target globals.
(loongarch_set_current_function): Restore the target contexts
for functions.
(TARGET_SET_CURRENT_FUNCTION): Define.
* config/loongarch/loongarch.h (SWITCHABLE_TARGET): Enable
switchable target context.
* config/loongarch/loongarch-builtins.cc (loongarch_init_builtins):
Initialize all builtin functions at startup.
(loongarch_expand_builtin): Turn assertion of builtin availability
into a test.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Define condition loongarch_sx_as.
* gcc.dg/lto/pr113233_0.c: New test.
---
 gcc/config/loongarch/loongarch-builtins.cc | 25 +++---
 gcc/config/loongarch/loongarch.cc  | 91 --
 gcc/config/loongarch/loongarch.h   |  2 +
 gcc/testsuite/gcc.dg/lto/pr113233_0.c  | 14 
 gcc/testsuite/lib/target-supports.exp  | 12 +++
 5 files changed, 127 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr113233_0.c

diff --git a/gcc/config/loongarch/loongarch-builtins.cc 
b/gcc/config/loongarch/loongarch-builtins.cc
index efe7e5e5ebc..fbe46833c9b 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -2512,14 +2512,11 @@ loongarch_init_builtins (void)
   for (i = 0; i < ARRAY_SIZE (loongarch_builtins); i++)
 {
   d = _builtins[i];
-  if (d->avail ())
-   {
- type = loongarch_build_function_type (d->function_type);
- loongarch_builtin_decls[i]
-   = add_builtin_function (d->name, type, i, BUILT_IN_MD, NULL,
-   NULL);
- loongarch_get_builtin_decl_index[d->icode] = i;
-   }
+  type = loongarch_build_function_type (d->function_type);
+  loongarch_builtin_decls[i]
+   = add_builtin_function (d->name, type, i, BUILT_IN_MD, NULL,
+ NULL);
+  loongarch_get_builtin_decl_index[d->icode] = i;
 }
 }
 
@@ -3105,15 +3102,21 @@ loongarch_expand_builtin (tree exp, rtx target, rtx 
subtarget ATTRIBUTE_UNUSED,
  int ignore ATTRIBUTE_UNUSED)
 {
   tree fndecl;
-  unsigned int fcode, avail;
+  unsigned int fcode;
   const struct loongarch_builtin_description *d;
 
   fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   fcode = DECL_MD_FUNCTION_CODE (fndecl);
   gcc_assert (fcode < ARRAY_SIZE (loongarch_builtins));
   d = _builtins[fcode];
-  avail = d->avail ();
-  gcc_assert (avail != 0);
+
+  if (!d->avail ())
+{
+  error_at (EXPR_LOCATION (exp),
+   "built-in function %qD is not enabled", fndecl);
+  return target;
+}
+
   switch (d->builtin_type)
 {
 case LARCH_BUILTIN_DIRECT:
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index c90b701a533..6b92e7034c5 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -7570,15 +7570,19 @@ loongarch_global_init (void)
loongarch_dwarf_regno[i] = INVALID_REGNUM;
 }
 
+  /* Function to allocate machine-dependent function status.  */
+  init_machine_status = _init_machine_status;
+};
+
+static void
+loongarch_reg_init (void)
+{
   /* Set up loongarch_hard_regno_mode_ok.  */
   for (int mode = 0; mode < MAX_MACHINE_MODE; mode++)
 for (int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
   loongarch_hard_regno_mode_ok_p[mode][regno]
= loongarch_hard_regno_mode_ok_uncached (regno, (machine_mode) mode);
-
-  /* Function to allocate machine-dependent function status.  */
-  init_machine_status = _init_machine_status;
-};
+}
 
 static void
 loongarch_option_override_internal (struct loongarch_target *target,
@@ -7605,20 +7609,92 @@ loongarch_option_override_internal (struct 
loongarch_target *target,
 
   /* Override some options according to the resolved target.  */
   loongarch_target_option_override (target, opts, opts_set);
+
+  target_option_default_node = target_option_current_node
+= build_target_option_node (opts, opts_set);
+
+  loongarch_reg_init ();
+}
+
+/* Remember the last target of loongarch_set_current_function.  */
+
+static GTY(()) tree loongarch_previous_fndecl;
+
+/* Restore or save the TREE_TARGET_GLOBALS from or to new_tree.
+   Used by loongarch_set_current_function to
+   make sure optab availability predicates are recomputed when necessary.  */
+
+static void
+loongarch_save_restore_target_globals (tree new_tree)
+{
+  if (TREE_TARGET_GLOBALS (new_tree))
+

Re: [PATCH v1] RISC-V: Refine the error msg for RVV intrinisc required ext

2024-04-08 Thread juzhe.zh...@rivai.ai
LGTM. Thanks.



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2024-04-08 16:09
To: gcc-patches
CC: juzhe.zhong; kito.cheng; Pan Li
Subject: [PATCH v1] RISC-V: Refine the error msg for RVV intrinisc required ext
From: Pan Li 
 
The RVV intrinisc API has sorts of required extension from both
the march or target attribute.  It will have error message similar
to below:
 
built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension
 
However, it is not accurate as we have many additional sub extenstion
besides v extension.  For example, zvbb, zvbk, zvbc ... etc.  This patch
would like to refine the error message with a friendly hint for the
required extension.  For example as below:
 
vuint64m1_t
__attribute__((target("arch=+v")))
test_1 (vuint64m1_t op_1, vuint64m1_t op_2, size_t vl)
{
  return __riscv_vclmul_vv_u64m1 (op_1, op_2, vl);
}
 
When compile with march=rv64gc and target arch=+v, we will have error
message as below:
 
error: built-in function '__riscv_vclmul_vv_u64m1(op_1,  op_2,  vl)'
  requires the 'zvbc' ISA extension
 
Then the end-user will get the point that the *zvbc* extension is missing
for the intrinisc API easily.
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins-shapes.cc (build_one): Pass
required_ext arg when invoke add function.
(build_th_loadstore): Ditto.
(struct vcreate_def): Ditto.
(struct read_vl_def): Ditto.
(struct vlenb_def): Ditto.
* config/riscv/riscv-vector-builtins.cc (function_builder::add_function):
Introduce new arg required_ext to fill in the register func.
(function_builder::add_unique_function): Ditto.
(function_builder::add_overloaded_function): Ditto.
(expand_builtin): Leverage required_extensions_specified to
check if the required extension is provided.
* config/riscv/riscv-vector-builtins.h (reqired_ext_to_isa_name): New
func impl to convert the required_ext enum to the extension name.
(required_extensions_specified): New func impl to predicate if
the required extension is well feeded.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: Adjust
the error message for v extension.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: Ditto.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c: New test.
 
Signed-off-by: Pan Li 
---
.../riscv/riscv-vector-builtins-shapes.cc | 18 +++--
gcc/config/riscv/riscv-vector-builtins.cc | 23 --
gcc/config/riscv/riscv-vector-builtins.h  | 75 ++-
.../riscv/rvv/base/intrinsic_required_ext-1.c | 10 +++
.../rvv/base/intrinsic_required_ext-10.c  | 11 +++
.../riscv/rvv/base/intrinsic_required_ext-2.c | 11 +++
.../riscv/rvv/base/intrinsic_required_ext-3.c | 11 +++
.../riscv/rvv/base/intrinsic_required_ext-4.c | 11 +++
.../riscv/rvv/base/intrinsic_required_ext-5.c | 11 +++
.../riscv/rvv/base/intrinsic_required_ext-6.c | 11 +++
.../riscv/rvv/base/intrinsic_required_ext-7.c | 11 +++
.../riscv/rvv/base/intrinsic_required_ext-8.c | 11 +++
.../riscv/rvv/base/intrinsic_required_ext-9.c | 11 +++
.../target_attribute_v_with_intrinsic-7.c |  2 +-
.../target_attribute_v_with_intrinsic-8.c |  2 +-
15 files changed, 210 insertions(+), 19 deletions(-)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c
 
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index c5ffcc1f2c4..7f983e82370 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -72,9 +72,10 @@ build_one (function_builder , const function_group_info 
,
   if 

[PATCH v1] RISC-V: Refine the error msg for RVV intrinisc required ext

2024-04-08 Thread pan2 . li
From: Pan Li 

The RVV intrinisc API has sorts of required extension from both
the march or target attribute.  It will have error message similar
to below:

built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension

However, it is not accurate as we have many additional sub extenstion
besides v extension.  For example, zvbb, zvbk, zvbc ... etc.  This patch
would like to refine the error message with a friendly hint for the
required extension.  For example as below:

vuint64m1_t
__attribute__((target("arch=+v")))
test_1 (vuint64m1_t op_1, vuint64m1_t op_2, size_t vl)
{
  return __riscv_vclmul_vv_u64m1 (op_1, op_2, vl);
}

When compile with march=rv64gc and target arch=+v, we will have error
message as below:

error: built-in function '__riscv_vclmul_vv_u64m1(op_1,  op_2,  vl)'
  requires the 'zvbc' ISA extension

Then the end-user will get the point that the *zvbc* extension is missing
for the intrinisc API easily.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-shapes.cc (build_one): Pass
required_ext arg when invoke add function.
(build_th_loadstore): Ditto.
(struct vcreate_def): Ditto.
(struct read_vl_def): Ditto.
(struct vlenb_def): Ditto.
* config/riscv/riscv-vector-builtins.cc 
(function_builder::add_function):
Introduce new arg required_ext to fill in the register func.
(function_builder::add_unique_function): Ditto.
(function_builder::add_overloaded_function): Ditto.
(expand_builtin): Leverage required_extensions_specified to
check if the required extension is provided.
* config/riscv/riscv-vector-builtins.h (reqired_ext_to_isa_name): New
func impl to convert the required_ext enum to the extension name.
(required_extensions_specified): New func impl to predicate if
the required extension is well feeded.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: 
Adjust
the error message for v extension.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: 
Ditto.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/riscv-vector-builtins-shapes.cc | 18 +++--
 gcc/config/riscv/riscv-vector-builtins.cc | 23 --
 gcc/config/riscv/riscv-vector-builtins.h  | 75 ++-
 .../riscv/rvv/base/intrinsic_required_ext-1.c | 10 +++
 .../rvv/base/intrinsic_required_ext-10.c  | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-2.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-3.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-4.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-5.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-6.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-7.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-8.c | 11 +++
 .../riscv/rvv/base/intrinsic_required_ext-9.c | 11 +++
 .../target_attribute_v_with_intrinsic-7.c |  2 +-
 .../target_attribute_v_with_intrinsic-8.c |  2 +-
 15 files changed, 210 insertions(+), 19 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index c5ffcc1f2c4..7f983e82370 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -72,9 +72,10 @@ build_one 

[pushed] Darwin: Sync coverage specs with gcc/gcc.cc.

2024-04-08 Thread Iain Sandoe
Tested on x86_64 and aarch64 Darwin, pushed to trunk, thanks,
Iain

--- 8< ---

The specs for coverage ere out of date leading to test fails for
fcondition-coverage cases. Fixed by updating to match the specs
in gcc/gcc.cc.

gcc/ChangeLog:

* config/darwin.h (LINK_COMMAND_SPEC_A): Update coverage
specs.

Signed-off-by: Iain Sandoe 
---
 gcc/config/darwin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 31019a0c49d..c09b9e9dc94 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -406,7 +406,7 @@ extern GTY(()) int darwin_ms_struct;
 %{!r:%{!nostdlib:%{!nodefaultlibs: " DARWIN_WEAK_CRTS "}}} \
 %o \
 %{!r:%{!nostdlib:%{!nodefaultlibs:\
-  %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
+  %{fprofile-arcs|fcondition-coverage|fprofile-generate*|coverage:-lgcov} \
   %{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*:%*} 1): \
%{static|static-libgcc|static-libstdc++|static-libgfortran: \
  libgomp.a%s; : -lgomp }} \
-- 
2.39.2 (Apple Git-143)



Re: [PATCH 0/2] Condition coverage fixes

2024-04-08 Thread Sam James
Jørgen Kvalsvik  writes:

> Hi,
>
> I propose these fixes for the current issues with the condition
> coverage.
>
> Rainer, I propose to simply delete the test with __sigsetjmp. I don't
> think it actually detects anything reasonable any more, I kept it around
> to prevent a regression. Since then I have built a lot of programs (with
> optimization enabled) and not really seen this problem.
>
> H.J., the problem you found with -O2 was really a problem of
> tree-inlining, which was actually caught earlier by Jan [1]. It probably
> warrants some more testing, but I could reproduce by tuning your test
> case to use always_inline and not -O2 and trigger the error.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648785.html

I couldn't find your BZ account, but FWIW:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114627.

Thanks.

>
> Thanks,
> Jørgen
>
> Jørgen Kvalsvik (2):
>   Remove unecessary and broken MC/DC compile test
>   Copy condition->expr map when inlining [PR114599]
>
>  gcc/testsuite/gcc.misc-tests/gcov-19.c   | 11 -
>  gcc/testsuite/gcc.misc-tests/gcov-pr114599.c | 25 
>  gcc/tree-inline.cc   | 20 +++-
>  3 files changed, 44 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-pr114599.c